Difference between revisions of "A Comparable Wikipedia Corpus: from Wiki Syntax to Pos Tagged Xml"
(A Comparable Wikipedia Corpus: from Wiki Syntax to Pos Tagged Xml - new page) |
(Categories) |
||
(3 intermediate revisions by 3 users not shown) | |||
Line 1: | Line 1: | ||
− | '''A Comparable Wikipedia Corpus: from Wiki Syntax to Pos Tagged Xml''' - scientific work related to Wikipedia quality published in 2011, written by Noah Bubenhofer, Stefanie Haupt and Horst Schwinn. | + | {{Infobox work |
+ | | title = A Comparable Wikipedia Corpus: from Wiki Syntax to Pos Tagged Xml | ||
+ | | date = 2011 | ||
+ | | authors = [[Noah Bubenhofer]]<br />[[Stefanie Haupt]]<br />[[Horst Schwinn]] | ||
+ | | link = https://ids-pub.bsz-bw.de/files/5189/Bubenhofer_Schwinn_Haupt-A_comparable_corpus-2011.pdf | ||
+ | }} | ||
+ | '''A Comparable Wikipedia Corpus: from Wiki Syntax to Pos Tagged Xml''' - scientific work related to [[Wikipedia quality]] published in 2011, written by [[Noah Bubenhofer]], [[Stefanie Haupt]] and [[Horst Schwinn]]. | ||
== Overview == | == Overview == | ||
− | To build a comparable Wikipedia corpus of German, French, Italian, Norwegian, Polish and Hungarian for contrastive grammar research, authors used a set of XSLT stylesheets to transform the mediawiki annotations to XML. Furthermore, the data has been annotated with word class information using different taggers. The outcome is a corpus with rich meta data and linguistic annotation that can be used for multilingual research in various linguistic topics. | + | To build a comparable [[Wikipedia]] corpus of German, French, Italian, Norwegian, Polish and Hungarian for contrastive grammar research, authors used a set of XSLT stylesheets to transform the mediawiki annotations to XML. Furthermore, the data has been annotated with word class information using different taggers. The outcome is a corpus with rich meta data and linguistic annotation that can be used for [[multilingual]] research in various linguistic topics. |
+ | |||
+ | == Embed == | ||
+ | === Wikipedia Quality === | ||
+ | <code> | ||
+ | <nowiki> | ||
+ | Bubenhofer, Noah; Haupt, Stefanie; Schwinn, Horst. (2011). "[[A Comparable Wikipedia Corpus: from Wiki Syntax to Pos Tagged Xml]]". | ||
+ | </nowiki> | ||
+ | </code> | ||
+ | |||
+ | === English Wikipedia === | ||
+ | <code> | ||
+ | <nowiki> | ||
+ | {{cite journal |last1=Bubenhofer |first1=Noah |last2=Haupt |first2=Stefanie |last3=Schwinn |first3=Horst |title=A Comparable Wikipedia Corpus: from Wiki Syntax to Pos Tagged Xml |date=2011 |url=https://wikipediaquality.com/wiki/A_Comparable_Wikipedia_Corpus:_from_Wiki_Syntax_to_Pos_Tagged_Xml}} | ||
+ | </nowiki> | ||
+ | </code> | ||
+ | |||
+ | === HTML === | ||
+ | <code> | ||
+ | <nowiki> | ||
+ | Bubenhofer, Noah; Haupt, Stefanie; Schwinn, Horst. (2011). &quot;<a href="https://wikipediaquality.com/wiki/A_Comparable_Wikipedia_Corpus:_from_Wiki_Syntax_to_Pos_Tagged_Xml">A Comparable Wikipedia Corpus: from Wiki Syntax to Pos Tagged Xml</a>&quot;. | ||
+ | </nowiki> | ||
+ | </code> | ||
+ | |||
+ | |||
+ | |||
+ | [[Category:Scientific works]] | ||
+ | [[Category:German Wikipedia]] | ||
+ | [[Category:French Wikipedia]] | ||
+ | [[Category:Italian Wikipedia]] | ||
+ | [[Category:Polish Wikipedia]] | ||
+ | [[Category:Norwegian Wikipedia]] | ||
+ | [[Category:Hungarian Wikipedia]] |
Latest revision as of 11:05, 27 November 2019
Authors | Noah Bubenhofer, Stefanie Haupt, Horst Schwinn |
---|---|
Publication date | 2011 |
Links | Original |
A Comparable Wikipedia Corpus: from Wiki Syntax to Pos Tagged Xml is a scientific work related to Wikipedia quality, published in 2011 and written by Noah Bubenhofer, Stefanie Haupt and Horst Schwinn.
Overview
To build a comparable Wikipedia corpus of German, French, Italian, Norwegian, Polish and Hungarian for contrastive grammar research, the authors used a set of XSLT stylesheets to transform the MediaWiki annotations into XML. The data was then annotated with word class (part-of-speech) information using different taggers. The outcome is a corpus with rich metadata and linguistic annotation that can be used for multilingual research on a range of linguistic topics.
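The two stages described above (wiki markup to XML, then word class annotation) can be illustrated with a small sketch. This is not the authors' pipeline: it swaps their XSLT stylesheets for Python regular expressions, handles only headings, links and bold/italic quotes, and uses a hypothetical lookup table (`TOY_LEXICON`) in place of a real tagger. The element names `article`, `section`, `p` and `w` and the `pos` attribute are likewise assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical toy lexicon standing in for a real part-of-speech tagger.
TOY_LEXICON = {"the": "DET", "corpus": "NOUN", "is": "VERB", "large": "ADJ"}

HEADING = re.compile(r"^(={2,6})\s*(.+?)\s*\1$")          # == Title ==
LINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")    # [[target|label]]
TOKEN = re.compile(r"\w+|[^\w\s]")                        # words and punctuation


def strip_inline_markup(text):
    """Replace links by their visible label and drop bold/italic quote runs."""
    text = LINK.sub(lambda m: m.group(2) or m.group(1), text)
    return re.sub(r"'{2,}", "", text)


def tag_tokens(parent, text):
    """Append one <w pos="..."> element per token; unknown words get UNK."""
    for tok in TOKEN.findall(text):
        w = ET.SubElement(parent, "w", pos=TOY_LEXICON.get(tok.lower(), "UNK"))
        w.text = tok


def wiki_to_tagged_xml(title, wikitext):
    """Convert a tiny subset of wiki syntax to XML with per-token tags.

    Sections are kept flat (all directly under <article>), a simplification.
    """
    article = ET.Element("article", title=title)
    section = article
    for line in wikitext.splitlines():
        line = line.strip()
        if not line:
            continue
        heading = HEADING.match(line)
        if heading:
            section = ET.SubElement(article, "section", title=heading.group(2))
            continue
        paragraph = ET.SubElement(section, "p")
        tag_tokens(paragraph, strip_inline_markup(line))
    return article


if __name__ == "__main__":
    sample = "== Intro ==\nThe '''corpus''' is [[size|large]]."
    print(ET.tostring(wiki_to_tagged_xml("Demo", sample), encoding="unicode"))
```

Keeping the structural conversion and the tagging in separate functions mirrors the order described in the overview, so a real tagger for each language could replace `tag_tokens` without touching the markup handling.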
Embed
Wikipedia Quality
Bubenhofer, Noah; Haupt, Stefanie; Schwinn, Horst. (2011). "[[A Comparable Wikipedia Corpus: from Wiki Syntax to Pos Tagged Xml]]".
English Wikipedia
{{cite journal |last1=Bubenhofer |first1=Noah |last2=Haupt |first2=Stefanie |last3=Schwinn |first3=Horst |title=A Comparable Wikipedia Corpus: from Wiki Syntax to Pos Tagged Xml |date=2011 |url=https://wikipediaquality.com/wiki/A_Comparable_Wikipedia_Corpus:_from_Wiki_Syntax_to_Pos_Tagged_Xml}}
HTML
Bubenhofer, Noah; Haupt, Stefanie; Schwinn, Horst. (2011). "<a href="https://wikipediaquality.com/wiki/A_Comparable_Wikipedia_Corpus:_from_Wiki_Syntax_to_Pos_Tagged_Xml">A Comparable Wikipedia Corpus: from Wiki Syntax to Pos Tagged Xml</a>".