Semantic vector evaluation and human performance on a new vocabulary MCQ test


Vectors derived from patterns of word co-occurrence in large bodies of text are often used to represent aspects of word meaning. Generally, the distance between two such vectors is taken as a measure of the semantic similarity between the words they represent. One important way of evaluating these vectors has been to use them to answer vocabulary multiple choice questions (MCQs), in which the participant is asked to judge which of several choice words is closest in meaning to a stem word. The existing vocabulary MCQ tests used in this way have been very useful, but there are some practical problems in their use as general evaluation measures. Here, we discuss why such tests remain useful evaluation measures, introduce a new vocabulary test, evaluate several current sets of semantic vectors using the new test, and compare their performance to human data.
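
The MCQ evaluation described above can be illustrated with a minimal sketch. It assumes cosine similarity as the closeness measure, which is a common choice but is not stated in the abstract, and it uses a tiny made-up vector table. The function names (`cosine_similarity`, `answer_mcq`, `accuracy`) and the toy values are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_u * norm_v)


def answer_mcq(vectors, stem, choices):
    """Pick the choice word whose vector is most similar to the stem's."""
    stem_vec = vectors[stem]
    return max(choices, key=lambda w: cosine_similarity(stem_vec, vectors[w]))


def accuracy(vectors, items):
    """Fraction of (stem, choices, correct_answer) items answered correctly."""
    correct = sum(
        1 for stem, choices, key in items
        if answer_mcq(vectors, stem, choices) == key
    )
    return correct / len(items)


# Made-up 3-dimensional vectors, for illustration only (assumption).
toy_vectors = {
    "large": [0.9, 0.1, 0.0],
    "big":   [0.8, 0.2, 0.1],
    "small": [-0.7, 0.3, 0.0],
    "green": [0.0, 0.1, 0.9],
    "quick": [0.1, 0.9, 0.2],
}

toy_items = [("large", ["small", "big", "green", "quick"], "big")]

chosen = answer_mcq(toy_vectors, "large", ["small", "big", "green", "quick"])
score = accuracy(toy_vectors, toy_items)
```

The same accuracy figure can be computed for human participants from their chosen answers, which is what makes a direct comparison between vector sets and human data possible.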
