An explorer for large-scale computational sociolinguistics

Welcome to Humboldt, a tool for large-scale sociolinguistics for everyone!

We provide an interface to search for lexical phenomena in several languages, and to get statistical analyses and maps of the results broken down by several demographic factors. You don't need any programming experience, training in statistical methods, knowledge of geographical information systems, or expertise in natural language processing.

This tool is named after the Humboldt brothers because, like them, it combines linguistic knowledge and scientific exploration.

We use data from various online sources, and are constantly expanding our coverage, the languages you can search, and the services we provide, so check back often!

Try searching for a single term, or compare two terms with each other.

Query syntax
  • The Kleene star, *, matches any sequence of characters (including none), so toll* in German covers toll, tolle, tolles, …
  • A ? matches at most one character, so toll? in German covers only toll and tolle
  • Space-separated terms return the union of their results, e.g., good better best matches any of the three comparison forms of good
  • Enclosing terms in double quotes (") searches for the exact phrase, e.g., "nice job" matches the two words in that order
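To make the rules above concrete, here is a minimal Python sketch that translates a query into a regular expression. It is an illustration only, not Humboldt's actual implementation: the function names (term_to_regex, compile_query, find_matches), the case-insensitive matching, and the restriction of wildcards to word characters are all assumptions. Following the toll? example, ? is treated as matching at most one character.

```python
import re


def term_to_regex(term: str) -> str:
    """Translate one wildcard term into regular-expression source (hypothetical helper)."""
    parts = []
    for ch in term:
        if ch == "*":
            parts.append(r"\w*")  # any run of word characters, possibly empty
        elif ch == "?":
            parts.append(r"\w?")  # at most one word character (assumed from the toll? example)
        else:
            parts.append(re.escape(ch))  # everything else is matched literally
    return "".join(parts)


def compile_query(query: str) -> re.Pattern:
    """Compile a whole query: quoted phrases stay together, other terms are alternatives."""
    alternatives = []
    # A token is either a double-quoted phrase or a run of non-space characters.
    for phrase, word in re.findall(r'"([^"]+)"|(\S+)', query):
        token = phrase or word
        # Words inside a phrase must appear in order, separated by whitespace.
        alternatives.append(r"\s+".join(term_to_regex(w) for w in token.split()))
    # Space-separated terms form a union, so join them with "|".
    # Case-insensitive matching is an assumption of this sketch.
    return re.compile(r"\b(?:" + "|".join(alternatives) + r")\b", re.IGNORECASE)


def find_matches(query: str, text: str) -> list[str]:
    """Return every stretch of text that the query matches, in order of appearance."""
    return [m.group(0) for m in compile_query(query).finditer(text)]


if __name__ == "__main__":
    print(find_matches("toll*", "toll tolle tolles Tier"))
    print(find_matches("toll?", "toll tolle tolles"))
    print(find_matches("good better best", "The best is better than good"))
    print(find_matches('"nice job"', "Nice job, not a job nice"))
```

In this sketch the word-boundary anchors ensure that toll? does not match the first five letters of tolles, and the quoted phrase only matches when both words appear in the given order.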
Currently available records
  • English/UK: 1,735,849
  • Dansk/Danmark: 1,196,033
  • Français/France: 819,233
  • Deutsch/Deutschland: 334,845
  • Nederlands/Nederland: 285,440
Not sure how to start? Try searching for træls in Danish or love in English.