NT2Lex is a lexical database for Dutch as a foreign language (NT2) that includes frequency distributions of words observed in texts graded along the six-level scale of the Common European Framework of Reference for Languages. It is a receptive graded lexicon, with word frequencies observed in textbook reading activities and simplified readers targeting learners of Dutch.

More information can be found in the following paper. If you use the resource in your research or publications, please cite it.

Tack, A., François, T., Desmet, P. & Fairon, C. (2018). NT2Lex: A CEFR-Graded Lexical Resource for Dutch as a Foreign Language Linked to Open Dutch WordNet. In Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications (pp. 137-146).



Receptive lexicon
includes word frequencies observed in textbook reading activities and simplified readers


CEFR levels
A1 · A2 · B1 · B2 · C1


Lexical entries
lemma (word)
part of speech (tag) · simplified CGN tagset
sense number (sense_se-id) · Open Dutch WordNet
synset (sense_sy-id) · Open Dutch WordNet


Computed metrics
F · raw frequency
D · dispersion index
SFI · standard frequency index
U · normalized frequency (per 1 million words)
tf-idf · term frequency-inverse document frequency
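As an illustration of how two of these metrics relate, the sketch below derives a per-million frequency from a raw count and then a standard frequency index from it. It assumes plain normalization by corpus size and Carroll's usual definition SFI = 10 * (log10(U) + 4). The function names and the corpus size are hypothetical, and the resource itself may compute U differently (for instance, with a dispersion adjustment).

```python
import math


def per_million(raw_frequency, corpus_size):
    """Normalize a raw count to a frequency per 1 million words.

    Assumption: plain normalization, without any dispersion adjustment.
    """
    return raw_frequency * 1_000_000 / corpus_size


def standard_frequency_index(u):
    """Carroll's standard frequency index for a per-million frequency u > 0."""
    return 10 * (math.log10(u) + 4)


# Hypothetical figures: a word seen 50 times in a 200,000-word subcorpus.
u = per_million(50, 200_000)
sfi = standard_frequency_index(u)
print(u, round(sfi, 2))
```

Under this definition, a word occurring once per million words has an SFI of 40, and every tenfold increase in frequency adds 10 points.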


The resource is formatted as a tab-separated file (UTF-8 encoding) with one line per entry. Each entry has been automatically tagged with its lemma, part of speech, sense and synset (if applicable).

For each observed CEFR level, a number of frequency metrics (METRIC@LEVEL) are computed on the corresponding graded texts, as well as over all levels combined (METRIC@TOTAL). If an entry does not appear at a specific level, the corresponding columns contain a placeholder symbol (-).
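The file layout described above could be read roughly as follows. This is a minimal sketch using Python's standard `csv` module. The header spellings (`word`, `tag`, `F@A1`, ...) follow the naming pattern given on this page but are assumptions about the actual file, and the sample rows and their values are invented for illustration.

```python
import csv
import io

# Hypothetical excerpt: real column names, tags and values may differ.
SAMPLE = (
    "word\ttag\tF@A1\tF@B1\tF@TOTAL\n"
    "huis\tN\t12\t30\t42\n"
    "desalniettemin\tBW\t-\t3\t3\n"
)


def parse_metric(cell):
    """Map the placeholder symbol '-' to None, any other cell to a float."""
    return None if cell == "-" else float(cell)


def load_entries(handle):
    """Read tab-separated entries; only METRIC@LEVEL columns are converted."""
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    entries = []
    for row in reader:
        entries.append({
            column: parse_metric(cell) if "@" in column else cell
            for column, cell in row.items()
        })
    return entries


entries = load_entries(io.StringIO(SAMPLE))
for entry in entries:
    print(entry["word"], entry["F@A1"], entry["F@TOTAL"])
```

To read a downloaded copy instead of the inline sample, the same function can be given a file opened with `encoding="utf-8"` and `newline=""`.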


Search

The resource can be used to compare the frequency distribution of multiple words along the CEFR scale. An online query interface is available and can be accessed via the Search tab.

Analyse

The resource can also be used to analyze the complexity of words in a text, in particular to identify which words will be difficult at a given level. An online complexity analyzer is available and can be accessed via the Analyse tab.
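One simple way to approximate this kind of analysis offline is sketched below. It is not the method used by the online analyzer. It assumes a hypothetical heuristic: a word counts as difficult at a target level if its first observed level in the lexicon is higher than the target, or if it is absent from the lexicon. The lexicon structure, the `F@LEVEL` keys, and the sample values are illustrative assumptions, and the input is assumed to be already lemmatized.

```python
CEFR_ORDER = ["A1", "A2", "B1", "B2", "C1"]


def first_observed_level(entry):
    """Return the lowest level with a recorded frequency, or None."""
    for level in CEFR_ORDER:
        if entry.get(f"F@{level}") is not None:
            return level
    return None


def difficult_words(lemmas, lexicon, target_level):
    """Flag lemmas first observed above target_level or missing entirely."""
    target_rank = CEFR_ORDER.index(target_level)
    flagged = []
    for lemma in lemmas:
        entry = lexicon.get(lemma)
        level = first_observed_level(entry) if entry else None
        if level is None or CEFR_ORDER.index(level) > target_rank:
            flagged.append(lemma)
    return flagged


# Hypothetical lexicon keyed by lemma; None stands for the '-' placeholder.
lexicon = {
    "huis": {"F@A1": 12.0, "F@B1": 30.0},
    "desalniettemin": {"F@A1": None, "F@B2": 3.0},
}
print(difficult_words(["huis", "desalniettemin", "xyzzy"], lexicon, "A2"))
```

A frequency threshold per level could replace the first-observed rule if rare early occurrences should not count as known.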

Download

You can use NT2Lex in NLP tasks as well as for pedagogical and language assessment purposes.


Anaïs Tack
CENTAL, UCLouvain (BE)
ITEC, KU Leuven (BE)

Thomas François
CENTAL, UCLouvain (BE)

Piet Desmet
ITEC, KU Leuven (BE)

Cédrick Fairon
CENTAL, UCLouvain (BE)


Anne-Sophie Desmet
Corpus Annotation

Brayan Delmée
Logo Design

Damien De Meyere


This research was funded by an F.R.S.-FNRS research grant.