Department of Cognitive Science

Nonword and Pseudohomophone Databases

If you already have a list of nonwords and want their properties, you can download both the nonword database and the pseudohomophone database, complete with the property information. Characteristics of each nonword in the databases are listed in fields; you can use Unix utilities such as Awk to extract the relevant properties for your list of nonwords.
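As an alternative to Awk, the same lookup can be sketched in Python. This is a minimal illustration, not part of the database distribution: it assumes each row holds one nonword with its fields separated by whitespace (the actual delimiter may differ), and the function name `lookup_nonwords` and the sample rows are made up for the example.

```python
from typing import Iterable, Iterator, List, Set


def lookup_nonwords(lines: Iterable[str], wanted: Set[str]) -> Iterator[List[str]]:
    """Yield the split fields of every row whose first field is in `wanted`."""
    for line in lines:
        fields = line.split()  # assumption: whitespace-separated fields
        if fields and fields[0] in wanted:
            yield fields


# Made-up rows, truncated to three fields, purely for illustration.
sample = [
    "blick 3 5",
    "frane 7 5",
    "splonk 0 6",
]

hits = list(lookup_nonwords(sample, {"frane", "splonk"}))
for row in hits:
    print(row[0], row[2])  # the nonword and its third field
```

With a downloaded copy of the database, an open file handle can be passed in place of `sample`, since the function only needs an iterable of lines.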

Be aware, though, that some of your own nonwords may not be in the database: the nonwords it contains were generated according to the strict rules described in our Quarterly Journal of Experimental Psychology paper.

Below is an explanation of what each field in the database represents:

1 - Nonword
2 - Number of Neighbours
3 - Number of letters
4 - Number of Body Friends
5 - Number of Body Enemies
6 - Number of Body Neighbours (4+5)
7 - Summed Frequency of Body Friends
8 - Summed Frequency of Body Enemies
9 - Summed Frequency of Body Neighbours (7+8)
10 - Summed Frequency of Neighbours
11 - Nonword Phonological Spelling
12 - Bigram Frequency (Position Non-Specific) Type
13 - Bigram Frequency (Position Non-Specific) Token
14 - Trigram Frequency (Position Non-Specific) Type
15 - Trigram Frequency (Position Non-Specific) Token
16 - Bigram Frequency (Position Specific) Type
17 - Bigram Frequency (Position Specific) Token
18 - Trigram Frequency (Position Specific) Type
19 - Trigram Frequency (Position Specific) Token
20 - Number of Phonological Neighbours
21 - Summed Frequency of Phonological Neighbours
22 - Number of Nonwords with same Onset - Type
23 - Summed Frequency of Nonwords with same Onset - Token
24 - Illegal Bigram marker (0 - Nonword contains no illegal bigrams, 1 - Nonword contains illegal bigrams)
25 - b, m or p (Morphological Status; both, monomorphemic, or polymorphemic)
26 - Not Applicable
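
The field numbers above are 1-based, so a script that splits a row into a list needs to subtract one when indexing. The sketch below shows one way to give a few fields names and to check the two derived fields (6 = 4 + 5 and 9 = 7 + 8). The short names in `FIELDS`, the helper functions, and the sample row are hypothetical, and the row is assumed to be whitespace-separated with numeric values in fields 4 to 9.

```python
import math
from typing import Dict, List

# 1-based field numbers from the list above, mapped to hypothetical short names.
FIELDS: Dict[str, int] = {
    "nonword": 1,
    "n_neighbours": 2,
    "n_letters": 3,
    "body_friends": 4,
    "body_enemies": 5,
    "body_neighbours": 6,
    "freq_body_friends": 7,
    "freq_body_enemies": 8,
    "freq_body_neighbours": 9,
}


def field(row: List[str], name: str) -> str:
    """Return a field by name, converting the 1-based number to a list index."""
    return row[FIELDS[name] - 1]


def derived_fields_consistent(row: List[str]) -> bool:
    """Check that field 6 equals 4 + 5 and that field 9 equals 7 + 8."""
    counts_ok = math.isclose(
        float(field(row, "body_friends")) + float(field(row, "body_enemies")),
        float(field(row, "body_neighbours")),
    )
    freqs_ok = math.isclose(
        float(field(row, "freq_body_friends")) + float(field(row, "freq_body_enemies")),
        float(field(row, "freq_body_neighbours")),
    )
    return counts_ok and freqs_ok


# Made-up row covering fields 1 to 9 only; the values are not from the database.
row = "zzz 0 3 2 1 3 10 5 15".split()
print(field(row, "n_letters"), derived_fields_consistent(row))
```

A check like this can help confirm that the delimiter and column order were read as intended before any further analysis.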