University of Limerick

A computationally efficient measure for word semantic relatedness using time series

conference contribution
posted on 2021-01-28, 14:46 authored by Arash Joorabchi, Alaa Alahmadi, Michael English, Abdulhussain E. Mahdi
Measuring the semantic relatedness of words plays an important role in a wide range of natural language processing and information retrieval applications, such as full-text search, summarization, classification and clustering. In this paper, we propose an easy-to-implement and low-cost method for estimating word semantic relatedness. The proposed method is based on the temporal footprints of words as found in publicly available corpora such as Google Books Ngrams (GBN) and knowledge bases such as Wikipedia. The extracted footprints are represented as time series, their similarities are measured using the Minkowski distance, and the results are averaged using a correlation-based weighting scheme to quantify the semantic relatedness of the words. The overall performance of the method and the quality of the two sources used for extracting the temporal footprints (i.e., GBN and Wikipedia) are evaluated using the MTurk-287 dataset and the standard measures of Pearson's r and Spearman's ρ.
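
The two building blocks named in the abstract, the Minkowski distance between two time series and Pearson's r for comparing scores against human judgments, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice of order p, and the sample footprint values are all assumptions made for illustration, and the paper's normalization and correlation-based weighting scheme are not reproduced here because the abstract does not specify them.

```python
from math import sqrt


def minkowski_distance(a, b, p=2.0):
    """Minkowski distance of order p between two equal-length series.

    p=1 gives the Manhattan distance and p=2 the Euclidean distance.
    """
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    if p < 1:
        raise ValueError("p must be >= 1")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical temporal footprints: one value per time step for each word.
# These numbers are invented for illustration and are not real GBN data.
footprint_a = [0.10, 0.20, 0.40, 0.30]
footprint_b = [0.10, 0.25, 0.35, 0.30]

# A smaller distance suggests more similar temporal usage patterns.
distance = minkowski_distance(footprint_a, footprint_b, p=2.0)
```

A smaller distance between two footprints indicates more similar usage over time. How that distance is turned into a relatedness score, and how scores from several series are weighted and averaged, is described in the paper itself rather than in this sketch.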

Publication

2017 9th IEEE-GCC Conference and Exhibition (GCCCE)

Publisher

IEEE Computer Society

Note

peer-reviewed

Rights

© 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Language

English
