Generative Pre-trained Transformer 3 (GPT-3) is an autoregressive language model developed by OpenAI. It is the third model in the GPT-n series and, with 175 billion parameters, was one of the largest language models at the time of its release. GPT-3 was described in a paper published in May 2020, with API access following in June 2020, and has since been widely recognized for its natural language processing (NLP) capabilities.

Training and Use of Wikipedia

GPT-3 was trained on a mix of internet text and books, including English-language Wikipedia. According to the GPT-3 paper, Wikipedia contributed about 3 billion tokens and made up roughly 3% of the training mix, alongside filtered Common Crawl, WebText2, and two book corpora. Wikipedia was treated as a high-quality source and sampled more often than its size alone would imply. Its extensive and well-structured content gave the model a reliable source of text from which to learn language patterns, context, and general knowledge.[1]

Including Wikipedia in the training dataset contributed to the model's ability to generate coherent and contextually appropriate text. Wikipedia's broad topic coverage, consistent formatting, and emphasis on verifiability and neutrality made it well suited to training a language model intended for a wide range of language tasks.
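
The sampling mix reported in the GPT-3 paper [1] can be illustrated with a short sketch. The weights below are the rounded percentages from that paper (they sum to 101). The function names `sample_sources` and `expected_share` are hypothetical and exist only for this illustration. This is not OpenAI's training code.

```python
import random
from collections import Counter

# Rounded sampling weights (percent of the training mix) as reported in the
# GPT-3 paper [1]. Because of rounding they sum to 101, not 100.
MIX_WEIGHTS = {
    "Common Crawl (filtered)": 60,
    "WebText2": 22,
    "Books1": 8,
    "Books2": 8,
    "Wikipedia": 3,
}


def expected_share(source):
    """Return the normalized share of the mix for one source (hypothetical helper)."""
    return MIX_WEIGHTS[source] / sum(MIX_WEIGHTS.values())


def sample_sources(n_draws, seed=0):
    """Draw n_draws source names in proportion to MIX_WEIGHTS and count them."""
    rng = random.Random(seed)  # seeded so the illustration is repeatable
    names = list(MIX_WEIGHTS)
    weights = list(MIX_WEIGHTS.values())
    return Counter(rng.choices(names, weights=weights, k=n_draws))


if __name__ == "__main__":
    counts = sample_sources(10_000)
    for name in MIX_WEIGHTS:
        print(f"{name}: drawn {counts[name]} times, "
              f"expected share {expected_share(name):.3f}")
```

In this sketch Wikipedia is drawn only about 3 times in every 100 samples, which shows that it was a small, deliberately weighted part of the mix rather than the bulk of the training data.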

Improving the Quality of Wikipedia

GPT-3's NLP capabilities could help improve the quality of Wikipedia articles in several ways:

  • Automated Content Generation. GPT-3 can assist in the creation of new Wikipedia articles and the expansion of existing ones by generating well-structured and informative content. This can be particularly useful for underrepresented topics or stub articles that require more information.
  • Editing and Proofreading. GPT-3 can help Wikipedia editors by suggesting edits, improving grammar, and ensuring consistency in writing style. Its ability to understand context and provide relevant corrections can enhance the overall readability and quality of Wikipedia articles.
  • Fact-Checking and Verifiability. GPT-3 can help editors spot claims that need support and can suggest candidate citations. Because the model can generate plausible but nonexistent references, editors need to check each suggestion against the actual source to maintain Wikipedia's standards of verifiability and reliability.
  • Multilingual Support. GPT-3's training data was predominantly English but included text in other languages, enabling it to contribute to non-English Wikipedia versions. This can help in translating articles, ensuring consistency across different language editions, and expanding Wikipedia's reach globally.[2]
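
As a rough illustration of the editing and proofreading use case, the sketch below assembles a few-shot prompt of the kind described in [1]: a task description, a handful of worked examples, and the sentence to be corrected. The function name, the prompt wording, and the example sentences are assumptions made for this illustration. The code only builds the prompt text and does not call any model or API.

```python
def build_proofreading_prompt(examples, sentence):
    """Assemble a few-shot proofreading prompt (hypothetical helper).

    examples: list of (original, corrected) sentence pairs shown to the model.
    sentence: the sentence the model is asked to correct.
    """
    lines = ["Correct the grammar and spelling of each sentence.", ""]
    for original, corrected in examples:
        lines.append(f"Original: {original}")
        lines.append(f"Corrected: {corrected}")
        lines.append("")  # blank line between worked examples
    # The final item is left open so the model completes the correction.
    lines.append(f"Original: {sentence}")
    lines.append("Corrected:")
    return "\n".join(lines)


# Made-up example pairs, for illustration only.
EXAMPLES = [
    ("The city have a population of 20,000.",
     "The city has a population of 20,000."),
    ("It was establish in 1901.",
     "It was established in 1901."),
]

if __name__ == "__main__":
    print(build_proofreading_prompt(EXAMPLES, "The river flow through three country."))
```

Any text a model returns for such a prompt would be a suggestion for a human editor to review, not an edit to apply automatically.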

References

  1. Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., et al. (2020). Language Models are Few-Shot Learners. Advances in Neural Information Processing Systems, 33, pp. 1877-1901.
  2. Lewoniewski, W., Węcel, K., Abramowicz, W. (2016). Quality and Importance of Wikipedia Articles in Different Languages. In International Conference on Information and Software Technologies, pp. 613-624. Springer International Publishing.