Difference between revisions of "A New Approach for Building Domain-Specific Corpus with Wikipedia"
(+ wikilinks) |
(cats.) |
||
(One intermediate revision by one other user not shown) | |||
Line 1: | Line 1: | ||
+ | {{Infobox work | ||
+ | | title = A New Approach for Building Domain-Specific Corpus with Wikipedia | ||
+ | | date = 2013 | ||
+ | | authors = [[Xin Ye Zhang]]<br />[[Xiu Li]]<br />[[Zhi Jian Ruan]] | ||
+ | | doi = 10.4028/www.scientific.net/AMM.321-324.2319 | ||
+ | | link = https://www.scientific.net/AMM.321-324.2319 | ||
+ | }} | ||
'''A New Approach for Building Domain-Specific Corpus with Wikipedia''' - scientific work related to [[Wikipedia quality]] published in 2013, written by [[Xin Ye Zhang]], [[Xiu Li]] and [[Zhi Jian Ruan]]. | '''A New Approach for Building Domain-Specific Corpus with Wikipedia''' - scientific work related to [[Wikipedia quality]] published in 2013, written by [[Xin Ye Zhang]], [[Xiu Li]] and [[Zhi Jian Ruan]]. | ||
== Overview == | == Overview == | ||
Domain-specific corpus can be used to build domain [[ontology]], which is used in many areas such as IR, NLP and web Mining. Authors propose a multi-root based method to build a domain-specific corpus making use of [[Wikipedia]] resources. First authors select some top-level nodes (Wikipedia category articles) as root nodes and traverse the Wikipedia using BFS-like algorithm. After the traverse, authors get a directed Wikipedia graph (Wiki-graph). Then an algorithm mainly based on Kosaraju Algorithm is proposed to remove the cycles in the Wiki-graph. Finally, topological sort algorithm is used to traverse the Wiki-graph, and ranking and filtering is done during the process. When computing a node’s ranking score, the in-degree of itself and the out-degree of its parents are both considered. The experimental evaluation shows that method could get a high-quality domain-specific corpus | Domain-specific corpus can be used to build domain [[ontology]], which is used in many areas such as IR, NLP and web Mining. Authors propose a multi-root based method to build a domain-specific corpus making use of [[Wikipedia]] resources. First authors select some top-level nodes (Wikipedia category articles) as root nodes and traverse the Wikipedia using BFS-like algorithm. After the traverse, authors get a directed Wikipedia graph (Wiki-graph). Then an algorithm mainly based on Kosaraju Algorithm is proposed to remove the cycles in the Wiki-graph. Finally, topological sort algorithm is used to traverse the Wiki-graph, and ranking and filtering is done during the process. When computing a node’s ranking score, the in-degree of itself and the out-degree of its parents are both considered. The experimental evaluation shows that method could get a high-quality domain-specific corpus | ||
+ | |||
+ | == Embed == | ||
+ | === Wikipedia Quality === | ||
+ | <code> | ||
+ | <nowiki> | ||
+ | Zhang, Xin Ye; Li, Xiu; Ruan, Zhi Jian. (2013). "[[A New Approach for Building Domain-Specific Corpus with Wikipedia]]". Trans Tech Publications. DOI: 10.4028/www.scientific.net/AMM.321-324.2319. | ||
+ | </nowiki> | ||
+ | </code> | ||
+ | |||
+ | === English Wikipedia === | ||
+ | <code> | ||
+ | <nowiki> | ||
+ | {{cite journal |last1=Zhang |first1=Xin Ye |last2=Li |first2=Xiu |last3=Ruan |first3=Zhi Jian |title=A New Approach for Building Domain-Specific Corpus with Wikipedia |date=2013 |doi=10.4028/www.scientific.net/AMM.321-324.2319 |url=https://wikipediaquality.com/wiki/A_New_Approach_for_Building_Domain-Specific_Corpus_with_Wikipedia |journal=Trans Tech Publications}} | ||
+ | </nowiki> | ||
+ | </code> | ||
+ | |||
+ | === HTML === | ||
+ | <code> | ||
+ | <nowiki> | ||
+ | Zhang, Xin Ye; Li, Xiu; Ruan, Zhi Jian. (2013). &quot;<a href="https://wikipediaquality.com/wiki/A_New_Approach_for_Building_Domain-Specific_Corpus_with_Wikipedia">A New Approach for Building Domain-Specific Corpus with Wikipedia</a>&quot;. Trans Tech Publications. DOI: 10.4028/www.scientific.net/AMM.321-324.2319. | ||
+ | </nowiki> | ||
+ | </code> | ||
+ | |||
+ | |||
+ | |||
+ | [[Category:Scientific works]] |
Latest revision as of 09:22, 15 April 2021
Authors | Xin Ye Zhang, Xiu Li, Zhi Jian Ruan |
---|---|
Publication date | 2013 |
DOI | 10.4028/www.scientific.net/AMM.321-324.2319 |
Links | Original |
A New Approach for Building Domain-Specific Corpus with Wikipedia is a scientific work related to Wikipedia quality, published in 2013 and written by Xin Ye Zhang, Xiu Li and Zhi Jian Ruan.
Overview
A domain-specific corpus can be used to build a domain ontology, which is useful in many areas such as information retrieval (IR), natural language processing (NLP) and web mining. The authors propose a multi-root method for building a domain-specific corpus from Wikipedia resources. First, they select some top-level nodes (Wikipedia category articles) as root nodes and traverse Wikipedia using a BFS-like algorithm. The traversal yields a directed Wikipedia graph (Wiki-graph). Next, an algorithm based mainly on Kosaraju's algorithm is proposed to remove the cycles in the Wiki-graph. Finally, a topological sort is used to traverse the Wiki-graph, with ranking and filtering performed during the traversal. When computing a node’s ranking score, both the node's own in-degree and the out-degree of its parents are considered. The experimental evaluation shows that the method can produce a high-quality domain-specific corpus.
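The three stages described above (multi-root traversal, cycle removal, ranked topological traversal) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the overview does not give their exact procedure, so several parts are assumptions. The `links` dictionary stands in for Wikipedia's category links, `max_depth` is a hypothetical traversal limit, cycles are removed by collapsing each strongly connected component into one node, and the score (each parent shares its score equally among its children) is only a stand-in that happens to depend on a node's in-degree and its parents' out-degree.

```python
from collections import defaultdict, deque


def multi_root_bfs(links, roots, max_depth):
    """Breadth-first traversal from several roots; returns nodes and edges."""
    depth = {r: 0 for r in roots}
    queue = deque(roots)
    edges = set()
    while queue:
        node = queue.popleft()
        if depth[node] == max_depth:  # assumed depth limit, not from the paper
            continue
        for child in links.get(node, ()):
            edges.add((node, child))  # back-links are kept, so cycles can occur
            if child not in depth:
                depth[child] = depth[node] + 1
                queue.append(child)
    return set(depth), edges


def kosaraju_scc(nodes, edges):
    """Map every node to a representative of its strongly connected component."""
    fwd, rev = defaultdict(list), defaultdict(list)
    for u, v in sorted(edges):
        fwd[u].append(v)
        rev[v].append(u)
    # Pass 1: iterative DFS on the forward graph, recording finish order.
    seen, finished = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(fwd[start]))]
        while stack:
            node, successors = stack[-1]
            nxt = next((c for c in successors if c not in seen), None)
            if nxt is None:
                finished.append(node)
                stack.pop()
            else:
                seen.add(nxt)
                stack.append((nxt, iter(fwd[nxt])))
    # Pass 2: DFS on the reversed graph in decreasing finish order.
    comp = {}
    for start in reversed(finished):
        if start in comp:
            continue
        comp[start] = start
        stack = [start]
        while stack:
            node = stack.pop()
            for pred in rev[node]:
                if pred not in comp:
                    comp[pred] = start
                    stack.append(pred)
    return comp


def condense(edges, comp):
    """Assumed cycle removal: merge each component, drop edges inside it."""
    return {(comp[u], comp[v]) for u, v in edges if comp[u] != comp[v]}


def topo_rank(nodes, edges):
    """Kahn's topological sort with a hypothetical score computed on the way."""
    children, parents = defaultdict(list), defaultdict(list)
    for u, v in sorted(edges):
        children[u].append(v)
        parents[v].append(u)
    remaining = {n: len(parents[n]) for n in nodes}
    queue = deque(sorted(n for n in nodes if remaining[n] == 0))
    order, score = [], {}
    while queue:
        node = queue.popleft()
        order.append(node)
        if parents[node]:
            # One term per incoming edge, each divided by the parent's out-degree.
            score[node] = sum(score[p] / len(children[p]) for p in parents[node])
        else:
            score[node] = 1.0  # root nodes start with full score
        for child in children[node]:
            remaining[child] -= 1
            if remaining[child] == 0:
                queue.append(child)
    return order, score


# Toy category graph (made-up names) with a cycle between "A" and "B".
links = {"Root": ["A", "B", "D"], "A": ["B", "C"], "B": ["A"]}
nodes, edges = multi_root_bfs(links, ["Root"], max_depth=3)
comp = kosaraju_scc(nodes, edges)
order, score = topo_rank(set(comp.values()), condense(edges, comp))
```

Filtering could then be a simple threshold on `score`, for example keeping only nodes whose score exceeds a chosen cutoff; the cutoff itself would have to be tuned and is not taken from the paper.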
Embed
Wikipedia Quality
Zhang, Xin Ye; Li, Xiu; Ruan, Zhi Jian. (2013). "[[A New Approach for Building Domain-Specific Corpus with Wikipedia]]". Trans Tech Publications. DOI: 10.4028/www.scientific.net/AMM.321-324.2319.
English Wikipedia
{{cite journal |last1=Zhang |first1=Xin Ye |last2=Li |first2=Xiu |last3=Ruan |first3=Zhi Jian |title=A New Approach for Building Domain-Specific Corpus with Wikipedia |date=2013 |doi=10.4028/www.scientific.net/AMM.321-324.2319 |url=https://wikipediaquality.com/wiki/A_New_Approach_for_Building_Domain-Specific_Corpus_with_Wikipedia |journal=Trans Tech Publications}}
HTML
Zhang, Xin Ye; Li, Xiu; Ruan, Zhi Jian. (2013). "<a href="https://wikipediaquality.com/wiki/A_New_Approach_for_Building_Domain-Specific_Corpus_with_Wikipedia">A New Approach for Building Domain-Specific Corpus with Wikipedia</a>". Trans Tech Publications. DOI: 10.4028/www.scientific.net/AMM.321-324.2319.