A Positive-Unlabeled Learning Model for Extending a Vietnamese Petroleum Dictionary based on Vietnamese Wikipedia Data



Authors
Ngoc Trinh Vu
Quoc-Dat Nguyen
Tien-Dat Nguyen
Manh-Cuong Nguyen
Van-Vuong Vu
Quang-Thuy Ha
Publication date
2018
DOI
10.1007/978-3-319-75417-8_18

A Positive-Unlabeled Learning Model for Extending a Vietnamese Petroleum Dictionary based on Vietnamese Wikipedia Data is a scientific work related to Wikipedia quality, published in 2018 and written by Ngoc Trinh Vu, Quoc-Dat Nguyen, Tien-Dat Nguyen, Manh-Cuong Nguyen, Van-Vuong Vu and Quang-Thuy Ha.

Overview

This study presents a positive-unlabeled learning model for extending a Vietnamese petroleum dictionary based on Vietnamese Wikipedia data. It applies machine learning algorithms for positive and unlabeled data together with Google similarity distance and cosine similarity, used both separately and in combination. The data sources being integrated are an English-Vietnamese oil and gas dictionary and the Vietnamese Wikipedia. The results show a novel approach to data integration that achieves higher accuracy by combining algorithms. The first Vietnamese oil and gas ontology was built in Vietnam. This ontology is a useful tool for staff in the oil and gas industry for training, research and daily search.
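
The two similarity measures named above can be illustrated with a minimal Python sketch. It uses the standard definition of the normalized Google distance (computed from page counts) and the usual cosine similarity between term vectors. The function names, the `weight` parameter and the way the two scores are merged in `combined_score` are assumptions made for illustration only; the summary does not state how the study actually combines them.

```python
import math


def normalized_google_distance(fx, fy, fxy, n):
    """Normalized Google distance from page counts.

    fx, fy: number of pages containing each term,
    fxy: number of pages containing both terms,
    n: total number of pages in the collection.
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        # Terms that never occur (or never co-occur) are maximally distant.
        return float("inf")
    if n <= min(fx, fy):
        raise ValueError("n must exceed the smaller term count")
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def combined_score(ngd, cos, weight=0.5):
    """Hypothetical combination (an assumption, not the paper's method):
    map the distance to a similarity in [0, 1] and take a weighted average."""
    ngd_similarity = 1.0 / (1.0 + ngd)
    return weight * ngd_similarity + (1.0 - weight) * cos
```

In a positive-unlabeled setting, a score of this kind could serve as a feature for deciding whether an unlabeled Wikipedia term belongs with the known dictionary terms; that use is likewise an assumption rather than a description of the study's pipeline.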