There has been a lot of talk in recent days on Twitter and in the Facebook groups about word frequency. This has arisen from the release of the proposals for the new languages GCSE (read Steve Smith's blog for excellent in-depth analyses of the proposals), which mention a vocabulary of 1200 high-frequency words for Foundation Tier and 1700 for Higher Tier. I also mentioned word frequency during my presentation at Language World 2021, where I told the story of my new scheme of work for Key Stage 2 Spanish and how I checked the frequency of the vocabulary I included.
I have been using the Routledge Frequency dictionaries, which you can see in the picture above (the Spanish one proudly bearing the sticky note which marks the end of the numerical list and the beginning of the alphabetical list).
Some people were asking how these frequency lists were arrived at. The preface to the Spanish version (the one I have used the most) gives as sources for the Routledge lists:
- a corpus of 20 million words from fiction and non-fiction texts, the internet and transcripts of spoken language
- words from all 21 Spanish-speaking countries
- recent language (the words for this edition were collected in 2014-15)
Most telling, I think, are the subtitles of the dictionaries. The Spanish one is "A Frequency Dictionary of Spanish: Core Vocabulary for Learners", thus acknowledging that learners and users will want to use words outside this list of 5000 to say what they want to say, depending on their interests.
As well as the main numerical and alphabetical lists of the 5000 highest-frequency words in the language, the dictionaries also contain some useful lists of different groups of words, such as this list of French animal words:
Animals are of great interest to primary children, yet many of the animal words that we teach are relatively low frequency and indeed outside the top 5000. They are usually accompanied in speech and writing, however, by much higher frequency words, such as determiners and key verb forms.