Publication

A computationally efficient measure for word semantic relatedness using time series

Date
2017
Abstract
Measuring the semantic relatedness of words plays an important role in a wide range of natural language processing and information retrieval applications, such as full-text search, summarization, classification, and clustering. In this paper, we propose an easy-to-implement, low-cost method for estimating word semantic relatedness. The proposed method uses the temporal footprints of words as found in publicly available corpora such as Google Books Ngrams (GBN) and knowledge bases such as Wikipedia. The extracted footprints are represented as time series; their similarity is measured using the Minkowski distance and averaged using a correlation-based weighting scheme to quantify the semantic relatedness of the words. The overall performance of the method and the quality of the two sources used for extracting the temporal footprints (i.e., GBN and Wikipedia) are evaluated using the MTurk-287 dataset and the standard measures of Pearson's r and Spearman's ρ.
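As a rough illustration of the pipeline the abstract outlines, the following Python sketch compares two words' time series with the Minkowski distance and combines the per-source results with a correlation-based weight. The abstract does not specify the normalization, the mapping from distance to similarity, or the exact weighting scheme, so `normalize`, the `1 / (1 + d)` similarity, and the `(1 + r) / 2` weight are all assumptions made for illustration, not the authors' formulation. The function names and toy counts are likewise hypothetical.

```python
from __future__ import annotations

import math
from typing import Mapping, Sequence


def normalize(series: Sequence[float]) -> list[float]:
    """Scale a frequency series so its values sum to 1 (assumed preprocessing)."""
    total = sum(series)
    if total == 0:
        return [0.0] * len(series)
    return [x / total for x in series]


def minkowski(a: Sequence[float], b: Sequence[float], p: float = 2.0) -> float:
    """Minkowski distance of order p between two equal-length series."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def relatedness(
    word_a: Mapping[str, Sequence[float]],
    word_b: Mapping[str, Sequence[float]],
    p: float = 2.0,
) -> float:
    """Weighted average of per-source similarities (hypothetical scheme).

    Each mapping goes from a source name (e.g. "gbn", "wikipedia") to that
    word's time series in the source. Only shared sources are compared.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for source in word_a.keys() & word_b.keys():
        a = normalize(word_a[source])
        b = normalize(word_b[source])
        similarity = 1.0 / (1.0 + minkowski(a, b, p))  # assumed mapping
        weight = (1.0 + pearson(a, b)) / 2.0           # assumed weight in [0, 1]
        weighted_sum += weight * similarity
        weight_total += weight
    return weighted_sum / weight_total if weight_total else 0.0


# Toy yearly counts, invented purely for illustration.
car = {"gbn": [1, 3, 5, 9], "wikipedia": [0, 2, 4, 8]}
automobile = {"gbn": [2, 3, 6, 8], "wikipedia": [1, 2, 5, 7]}
print(relatedness(car, automobile))
```

Scores produced this way for a set of word pairs could then be compared against human judgments such as MTurk-287 using Pearson's r and Spearman's ρ, as the abstract describes.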
Description
peer-reviewed
Publisher
IEEE Computer Society
Citation
2017 9th IEEE-GCC Conference and Exhibition (GCCCE)