DAFlex is a receptive-vocabulary lexicon for German as a second or foreign language. It reports the normalized frequencies of words (lemmas) at each of the six levels of the CEFR (Common European Framework of Reference for Languages). The frequencies were estimated from a corpus of textbooks and simplified readers.



Receptive lexicon
includes word frequencies observed in textbook reading activities and simplified readers


CEFR levels
A1 · A2 · B1 · B2 · C1 · C2


Lexical entries
lemma (word)
part of speech (tag) · assigned by TreeTagger for German, using the STTS tagset (Deutsches Wortart-Tagset)


Computed metrics
level_freq · normalized frequency (per 1 million words) for each level of the CEFR
total_freq · total normalized frequency in the source corpus
nb_doc · document frequency (number of corpus documents in which the word occurs)
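
To make the metrics above concrete, here is a minimal sketch of how a per-million frequency and a document frequency could be derived for one CEFR level. It assumes a plain count-based normalization; the estimation procedure actually used for DAFlex is not described here and may differ. The function names (`per_million`, `level_metrics`) and the toy documents are hypothetical and are not taken from the resource.

```python
from collections import Counter


def per_million(count, corpus_size):
    """Scale a raw count to occurrences per 1 million tokens."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return count * 1_000_000 / corpus_size


def level_metrics(documents):
    """Compute (level_freq, nb_doc) for every (lemma, tag) entry.

    `documents` is a list of documents for a single CEFR level, each
    document being a list of (lemma, tag) pairs. This is an assumed
    input shape for illustration only.
    """
    counts = Counter()      # raw occurrences of each entry
    doc_counts = Counter()  # number of documents containing each entry
    total_tokens = 0
    for doc in documents:
        counts.update(doc)
        doc_counts.update(set(doc))  # count each entry once per document
        total_tokens += len(doc)
    return {
        entry: (per_million(count, total_tokens), doc_counts[entry])
        for entry, count in counts.items()
    }


# Invented toy documents (lemmatized, with STTS-style tags), not DAFlex data.
a1_docs = [
    [("Haus", "NN"), ("sein", "VAFIN"), ("groß", "ADJD"), ("Haus", "NN")],
    [("Haus", "NN"), ("sein", "VAFIN"), ("klein", "ADJD"), ("und", "KON")],
]
metrics = level_metrics(a1_docs)
print(metrics[("Haus", "NN")])
```

In this toy corpus of 8 tokens, "Haus" occurs 3 times in 2 documents, so the sketch yields a frequency of 375,000 per million and a document frequency of 2. Real corpora are far larger, which is why per-million values in a lexicon are much smaller.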


Search

The resource can be used to compare the frequency distribution of multiple words along the CEFR scale. An online query interface is available and can be accessed via the Search tab.
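
For readers working with an exported copy of the lexicon rather than the online interface, the comparison described above could be sketched as follows. The dictionary layout, the function names (`comparison_rows`, `format_table`) and all frequency values are assumptions made for illustration; they do not reflect the actual file format or data of DAFlex.

```python
LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]


def comparison_rows(lexicon, lemmas):
    """Return one row per lemma: [lemma, freq_A1, ..., freq_C2].

    Levels with no recorded frequency, and lemmas missing from the
    lexicon, are filled with 0.0.
    """
    rows = []
    for lemma in lemmas:
        freqs = lexicon.get(lemma, {})
        rows.append([lemma] + [freqs.get(level, 0.0) for level in LEVELS])
    return rows


def format_table(rows):
    """Render the rows as a tab-separated table with a header line."""
    lines = ["\t".join(["lemma"] + LEVELS)]
    for row in rows:
        lines.append("\t".join([row[0]] + [f"{value:.1f}" for value in row[1:]]))
    return "\n".join(lines)


# Invented frequencies (per million words), not actual DAFlex values.
toy_lexicon = {
    "Haus": {"A1": 900.0, "A2": 750.0},
    "Umwelt": {"B1": 120.0, "B2": 200.0},
}
rows = comparison_rows(toy_lexicon, ["Haus", "Umwelt"])
print(format_table(rows))
```

Laying the levels out as columns makes it easy to see at which level each word first appears and where its frequency peaks.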

Analyze

The resource can also be used to analyze the complexity of words in a text, in particular to identify which words are likely to be difficult for learners at a given level. An online complexity analyzer is available and can be accessed via the Analyze tab.
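
One simple way to approximate this kind of analysis offline is sketched below. It rests on an assumed heuristic: a lemma counts as difficult for a target level if the first level at which it has a non-zero frequency lies above that target, or if it is absent from the lexicon. The online analyzer may use a different criterion. The names `first_level` and `difficult_lemmas` and all frequency values are hypothetical.

```python
LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]


def first_level(freqs):
    """Return the lowest CEFR level with a non-zero frequency, or None."""
    for level in LEVELS:
        if freqs.get(level, 0.0) > 0:
            return level
    return None


def difficult_lemmas(lemmas, lexicon, target):
    """List the lemmas assumed to be difficult for a learner at `target`.

    Heuristic (an assumption, not the documented method): a lemma is
    flagged if it first appears above the target level or is unknown.
    """
    target_rank = LEVELS.index(target)
    flagged = []
    for lemma in lemmas:
        freqs = lexicon.get(lemma)
        level = first_level(freqs) if freqs else None
        if level is None or LEVELS.index(level) > target_rank:
            flagged.append(lemma)
    return flagged


# Invented frequencies (per million words), not actual DAFlex values.
toy_lexicon = {
    "Haus": {"A1": 900.0, "A2": 750.0},
    "Umwelt": {"B1": 120.0, "B2": 200.0},
    "Nachhaltigkeit": {"C1": 15.0},
}
text_lemmas = ["Haus", "Umwelt", "Nachhaltigkeit", "Quark"]
flagged = difficult_lemmas(text_lemmas, toy_lexicon, "A2")
print(flagged)
```

The input is expected to be lemmatized already, since the lexicon is indexed by lemma rather than by inflected form. Treating unknown lemmas as difficult is a conservative choice; a real tool might instead report them separately.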


Thomas François
CENTAL, UCLouvain (BE)

Patricia Kerres
CENTAL, UCLouvain (BE)

Damien De Meyere
CENTAL, UCLouvain (BE)

Ferran Suñer Muñoz


Camille Delaunoy, Chiara Fort, Mélanie Johanns and Lara Schmitz
Corpus collection

Damien De Meyere