# Extraction and Recognition of Polish Multiword Expressions Using Wikipedia and Finite-State Automata

Authors | Pawel Chrzaszcz |
---|---|
Publication date | 2016 |
DOI | 10.18653/v1/W16-1815 |
Links | Original Preprint |

**Extraction and Recognition of Polish Multiword Expressions Using Wikipedia and Finite-State Automata** is a scientific work related to Wikipedia quality, written by Pawel Chrzaszcz and published in 2016.

## Overview

Linguistic resources for Polish are often missing multiword expressions (MWEs) – idioms, compound nouns and other expressions that have their own distinct meaning as a whole. This paper describes an effort to extract and recognize nominal MWEs in Polish text using Wikipedia, inflection dictionaries and finite-state automata. Wikipedia is used both as a lexicon of MWEs and as a corpus annotated with links to articles. The incoming links for each article are used to determine the inflection pattern of its headword, which helps eliminate invalid inflected forms. The goal is to recognize known MWEs and to find further expressions that share a similar grammatical structure and occur in similar contexts.
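To make the recognition step more concrete, the following is a minimal sketch of how a finite-state automaton over tokens could match known MWEs in text. It is an illustration only, not the paper's implementation: the names `State`, `build_automaton` and `recognize` are hypothetical, and the inflected forms are entered by hand here, whereas the paper derives them from incoming Wikipedia links and inflection dictionaries.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class State:
    """One automaton state: outgoing transitions keyed by token."""
    transitions: Dict[str, "State"] = field(default_factory=dict)
    headword: Optional[str] = None  # set only on accepting states


def build_automaton(forms: Dict[str, str]) -> State:
    """Build a token-level trie from {inflected form: headword} pairs."""
    root = State()
    for form, headword in forms.items():
        state = root
        for token in form.lower().split():
            state = state.transitions.setdefault(token, State())
        state.headword = headword
    return root


def recognize(tokens: List[str], root: State) -> List[Tuple[int, int, str]]:
    """Scan left to right, keeping the longest match at each position.

    Returns (start, end, headword) triples with `end` exclusive.
    """
    matches = []
    i = 0
    while i < len(tokens):
        state, best = root, None
        j = i
        while j < len(tokens) and tokens[j].lower() in state.transitions:
            state = state.transitions[tokens[j].lower()]
            j += 1
            if state.headword is not None:
                best = (i, j, state.headword)
        if best is not None:
            matches.append(best)
            i = best[1]  # continue after the matched expression
        else:
            i += 1
    return matches


# Assumed toy lexicon: nominative and genitive forms of "czarna dziura".
lexicon = {"czarna dziura": "czarna dziura", "czarnej dziury": "czarna dziura"}
automaton = build_automaton(lexicon)
tokens = "Astronomowie odkryli masę czarnej dziury".split()
print(recognize(tokens, automaton))  # [(3, 5, 'czarna dziura')]
```

Mapping every inflected form to its headword lets the matcher normalize what it finds. Restricting the stored forms to those actually attested in link text is one plausible way to avoid accepting invalid inflections.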

## Embed

### Wikipedia Quality

```
Chrzaszcz, Pawel. (2016). "[[Extraction and Recognition of Polish Multiword Expressions Using Wikipedia and Finite-State Automata]]". DOI: 10.18653/v1/W16-1815.
```

### English Wikipedia

```
{{cite journal |last1=Chrzaszcz |first1=Pawel |title=Extraction and Recognition of Polish Multiword Expressions Using Wikipedia and Finite-State Automata |date=2016 |doi=10.18653/v1/W16-1815 |url=https://wikipediaquality.com/wiki/Extraction_and_Recognition_of_Polish_Multiword_Expressions_Using_Wikipedia_and_Finite-State_Automata}}
```

### HTML

```
Chrzaszcz, Pawel. (2016). "<a href="https://wikipediaquality.com/wiki/Extraction_and_Recognition_of_Polish_Multiword_Expressions_Using_Wikipedia_and_Finite-State_Automata">Extraction and Recognition of Polish Multiword Expressions Using Wikipedia and Finite-State Automata</a>". DOI: 10.18653/v1/W16-1815.
```