Lexical Complexity Analysis with NT2Lex

This tool enables you to analyze the lexical difficulty of a text for foreign language learners.

CEFR Proficiency Level

For researchers or teachers: you can pick the specific proficiency level that interests you. For learners: you can use your own proficiency level.

CEFR-based Lexical Complexity

For each word in the text (except numerals and named entities), the tool computes a difficulty level on the CEFR scale. This level corresponds to the CEFR level at which the word is attested in the chosen resource: the tool assigns the first level at which the lexical unit is observed.

A different color is used to label each difficulty level:

A1  A2  B1  B2  C1  C2  UNKNOWN

Stop words and named entities are automatically ignored. Unknown words, i.e. words that are missing from the resource, are highlighted in red. Since they have not been observed in L2 materials, they are considered the most difficult.
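
The level attribution can be pictured as a lookup over a graded lexicon. The following minimal sketch (in Python) illustrates the "first attested level" strategy and the treatment of unknown words; the lexicon format, the example entries and all names are illustrative assumptions, not NT2Lex's actual data structures.

    CEFR_LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]

    # Hypothetical graded lexicon: frequency of each (lemma, part of speech)
    # pair per CEFR level. The entries are invented for illustration.
    LEXICON = {
        ("huis", "NOUN"): {"A1": 120, "A2": 95, "B1": 60, "B2": 40, "C1": 20, "C2": 10},
        ("niettemin", "ADV"): {"B1": 3, "B2": 12, "C1": 8, "C2": 5},
    }

    def attribute_level(lemma, pos):
        """Return the first CEFR level at which the unit is attested, or 'UNKNOWN'."""
        freqs = LEXICON.get((lemma, pos))
        if freqs is None:
            return "UNKNOWN"  # absent from the resource: treated as the most difficult
        for level in CEFR_LEVELS:
            if freqs.get(level, 0) > 0:
                return level
        return "UNKNOWN"

    print(attribute_level("huis", "NOUN"))        # A1: first level with a non-zero frequency
    print(attribute_level("niettemin", "ADV"))    # B1: not attested at A1 or A2
    print(attribute_level("blockchain", "NOUN"))  # UNKNOWN: not in the resource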

CEFR-based Complex Word Identification

For a given proficiency level, the tool flags as complex every word whose CEFR difficulty level lies beyond the proficiency level you have chosen. These complex words are highlighted.
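
Complex word identification then reduces to comparing levels on an ordinal scale, with unknown words ranked above C2. The sketch below continues the illustrative example from the previous section; the names and data are assumptions, not the tool's actual output format.

    # Ordinal ranking of the CEFR levels; unknown words count as the most difficult.
    CEFR_ORDER = {"A1": 1, "A2": 2, "B1": 3, "B2": 4, "C1": 5, "C2": 6, "UNKNOWN": 7}

    def is_complex(word_level, target_level):
        """A word is complex if its level lies beyond the chosen proficiency level."""
        return CEFR_ORDER[word_level] > CEFR_ORDER[target_level]

    annotated = [("huis", "A1"), ("niettemin", "B1"), ("blockchain", "UNKNOWN")]
    target = "A2"
    print([word for word, level in annotated if is_complex(level, target)])
    # ['niettemin', 'blockchain']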

You can click on any complex word to see its CEFR level, its lemma, its part of speech and its graded frequency distribution.

And finally ...

Once you have submitted your text and run the analysis, you can visualise the distribution of CEFR levels in your text and dynamically filter the words according to their level of difficulty.
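
The distribution shown after the analysis is essentially a count of words per CEFR level, which can be computed from the annotated words. The snippet below is again a minimal, illustrative sketch with invented data, not the tool's own output.

    from collections import Counter

    # Words annotated with the levels assigned in the earlier sketches (invented data).
    annotated = [("huis", "A1"), ("boek", "A1"), ("niettemin", "B1"), ("blockchain", "UNKNOWN")]

    distribution = Counter(level for _, level in annotated)
    print(distribution)  # Counter({'A1': 2, 'B1': 1, 'UNKNOWN': 1})

    # Filtering, e.g. keeping only the B1 words:
    print([word for word, level in annotated if level == "B1"])  # ['niettemin']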


