About the multilingual parallel corpus of translations

The multilingual parallel corpus of translations is based on European Commission data.

The multilingual corpus contains EU acts in 22 official EU languages. However, not all texts in the corpus have been translated into every language, so the number of hits varies from language to language. Most of the texts are in English, which was the source language in most cases.

The users of this corpus should be aware that only European Community legislation printed in the paper edition of the Official Journal of the European Union is deemed authentic.

The multilingual corpus is especially useful for translators.

The corpus currently contains about 98 million words in 22 languages (not all data have been included yet); the language distribution is shown in the statistics.


Searching the corpus

Enter the search word(s) in the input field. Select one source language and one or more target languages (you can select several target languages by holding down the Control or Ctrl key while clicking the mouse). You can limit the search by specifying a full or partial Celex number.
The output can be limited to terminology or to corpus data, or it can contain both. Corpus output can be monolingual (KWIC - KeyWord In Context) or multilingual.
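
To illustrate what a KWIC (KeyWord In Context) listing looks like, here is a minimal Python sketch. It is not the program behind this corpus: the function name kwic, the width parameter and the sample sentence are all assumptions made for the example, and it matches whole words only, ignoring case.

```python
import re

def kwic(text, keyword, width=30):
    """Return one line per hit: left context, [keyword], right context."""
    lines = []
    # Whole-word, case-insensitive match (an assumption of this sketch).
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right:<{width}}")
    return lines

# Hypothetical sample text, not taken from the corpus.
sample = ("The Council shall adopt the regulation. "
          "The regulation enters into force on publication.")

for line in kwic(sample, "regulation", width=20):
    print(line)
```

Each hit is printed on its own line with the keyword in the same column, which makes it easy to compare the contexts in which a word occurs.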

When making a search, the following wildcards can be used:



The corpus was last updated in July 2008.

Please send any comments regarding the corpus to the author of the program.