Difference between revisions of "A Comparable Wikipedia Corpus: from Wiki Syntax to Pos Tagged Xml"

From Wikipedia Quality
(Adding infobox)
(Categories)
 
(One intermediate revision by one other user not shown)
Line 9:
  == Overview ==
  To build a comparable [[Wikipedia]] corpus of German, French, Italian, Norwegian, Polish and Hungarian for contrastive grammar research, the authors used a set of XSLT stylesheets to transform the MediaWiki annotations to XML. Furthermore, the data has been annotated with word class information using different taggers. The outcome is a corpus with rich metadata and linguistic annotation that can be used for [[multilingual]] research on various linguistic topics.
+
+ == Embed ==
+ === Wikipedia Quality ===
+ <code>
+ <nowiki>
+ Bubenhofer, Noah; Haupt, Stefanie; Schwinn, Horst. (2011). "[[A Comparable Wikipedia Corpus: from Wiki Syntax to Pos Tagged Xml]]".
+ </nowiki>
+ </code>
+
+ === English Wikipedia ===
+ <code>
+ <nowiki>
+ {{cite journal |last1=Bubenhofer |first1=Noah |last2=Haupt |first2=Stefanie |last3=Schwinn |first3=Horst |title=A Comparable Wikipedia Corpus: from Wiki Syntax to Pos Tagged Xml |date=2011 |url=https://wikipediaquality.com/wiki/A_Comparable_Wikipedia_Corpus:_from_Wiki_Syntax_to_Pos_Tagged_Xml}}
+ </nowiki>
+ </code>
+
+ === HTML ===
+ <code>
+ <nowiki>
+ Bubenhofer, Noah; Haupt, Stefanie; Schwinn, Horst. (2011). &quot;<a href="https://wikipediaquality.com/wiki/A_Comparable_Wikipedia_Corpus:_from_Wiki_Syntax_to_Pos_Tagged_Xml">A Comparable Wikipedia Corpus: from Wiki Syntax to Pos Tagged Xml</a>&quot;.
+ </nowiki>
+ </code>
+
+
+
+ [[Category:Scientific works]]
+ [[Category:German Wikipedia]]
+ [[Category:French Wikipedia]]
+ [[Category:Italian Wikipedia]]
+ [[Category:Polish Wikipedia]]
+ [[Category:Norwegian Wikipedia]]
+ [[Category:Hungarian Wikipedia]]

Latest revision as of 11:05, 27 November 2019


A Comparable Wikipedia Corpus: from Wiki Syntax to Pos Tagged Xml
Authors: Noah Bubenhofer, Stefanie Haupt, Horst Schwinn
Publication date: 2011
Links: Original

A Comparable Wikipedia Corpus: from Wiki Syntax to Pos Tagged Xml is a scientific work related to Wikipedia quality, published in 2011 and written by Noah Bubenhofer, Stefanie Haupt and Horst Schwinn.

Overview

To build a comparable Wikipedia corpus of German, French, Italian, Norwegian, Polish and Hungarian for contrastive grammar research, the authors used a set of XSLT stylesheets to transform the MediaWiki annotations to XML. Furthermore, the data has been annotated with word class information using different taggers. The outcome is a corpus with rich metadata and linguistic annotation that can be used for multilingual research on various linguistic topics.
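
The general idea of such a pipeline (wiki markup in, word-class-annotated XML out) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the original work used XSLT stylesheets and existing taggers, whereas this sketch uses Python regular expressions and a toy lookup table. The element names `article`, `heading`, `p` and `w`, the `pos` attribute, and the `TOY_LEXICON` table are all hypothetical, and only headings and internal links are handled.

```python
import re
import xml.etree.ElementTree as ET

# Matches wiki headings such as "== Overview ==" (same number of "=" on both sides).
HEADING = re.compile(r"^(=+)\s*(.*?)\s*\1$")
# Matches internal links: [[Target]] or [[Target|display text]].
LINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")

# Hypothetical toy lexicon standing in for a real part-of-speech tagger.
TOY_LEXICON = {"the": "DET", "corpus": "NOUN", "is": "VERB", "comparable": "ADJ"}


def add_tokens(parent, text):
    """Append one <w> element per whitespace-separated token."""
    for token in text.split():
        w = ET.SubElement(parent, "w")
        # Look the token up without case and trailing punctuation; unknown words get "UNK".
        w.set("pos", TOY_LEXICON.get(token.lower().strip(".,;:"), "UNK"))
        w.text = token


def wiki_to_xml(title, wikitext):
    """Convert a small subset of wiki markup into an XML element tree."""
    article = ET.Element("article", {"title": title})
    for line in wikitext.splitlines():
        line = line.strip()
        if not line:
            continue
        heading = HEADING.match(line)
        if heading:
            level = str(len(heading.group(1)))
            h = ET.SubElement(article, "heading", {"level": level})
            h.text = heading.group(2)
            continue
        p = ET.SubElement(article, "p")
        # Replace each link with its display text (or its target if none is given).
        plain = LINK.sub(lambda m: m.group(2) or m.group(1), line)
        add_tokens(p, plain)
    return article


if __name__ == "__main__":
    sample = "== Overview ==\nThe [[Wikipedia|corpus]] is comparable."
    print(ET.tostring(wiki_to_xml("Demo", sample), encoding="unicode"))
```

A real corpus build would need a full wiki markup parser (templates, tables, references) and a trained tagger per language; the sketch only shows how structural markup and token-level annotation can end up in one XML document.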

Embed

Wikipedia Quality

Bubenhofer, Noah; Haupt, Stefanie; Schwinn, Horst. (2011). "[[A Comparable Wikipedia Corpus: from Wiki Syntax to Pos Tagged Xml]]".

English Wikipedia

{{cite journal |last1=Bubenhofer |first1=Noah |last2=Haupt |first2=Stefanie |last3=Schwinn |first3=Horst |title=A Comparable Wikipedia Corpus: from Wiki Syntax to Pos Tagged Xml |date=2011 |url=https://wikipediaquality.com/wiki/A_Comparable_Wikipedia_Corpus:_from_Wiki_Syntax_to_Pos_Tagged_Xml}}

HTML

Bubenhofer, Noah; Haupt, Stefanie; Schwinn, Horst. (2011). &quot;<a href="https://wikipediaquality.com/wiki/A_Comparable_Wikipedia_Corpus:_from_Wiki_Syntax_to_Pos_Tagged_Xml">A Comparable Wikipedia Corpus: from Wiki Syntax to Pos Tagged Xml</a>&quot;.