Learning Document Similarity Using Natural Language Processing

Paola Merlo; James Henderson; Gerold Schneider; Éric Wehrli


Open Access



Publication year - 2003

Publication title - linguistik online

Language(s) - English

Resource type - Journals

ISSN - 1615-3014

DOI - 10.13092/lo.17.788

Subject(s) - computer science, self organizing map, natural language processing, similarity (geometry), representation (politics), information retrieval, artificial intelligence, natural language, scale (ratio), artificial neural network, image (mathematics), physics, quantum mechanics, politics, political science, law

The recent considerable growth in the amount of easily available on-line text has brought to the foreground the need for large-scale natural language processing tools for text data mining. In this paper we address the problem of organizing documents into meaningful groups according to their content and of visualizing a text collection, providing an overview of the range of documents and of their relationships, so that they can be browsed more easily. We use Self-Organizing Maps (SOMs) (Kohonen 1984). Creating these maps poses considerable efficiency challenges. We study linguistically motivated ways of reducing the representation of a document to increase efficiency, as well as ways of disambiguating the words in the documents.
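To make the idea of placing documents on a Self-Organizing Map concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The bag-of-words representation, the 3x3 grid, the linear decay schedules, and every function name (`bag_of_words`, `best_matching_unit`, `train_som`) are assumptions chosen for illustration. The linguistically motivated document reduction and the word disambiguation studied in the paper are not modelled here.

```python
import math
import random
from collections import Counter


def bag_of_words(docs):
    """Turn each document into a length-normalised term-count vector."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    vectors = []
    for d in docs:
        counts = Counter(d.lower().split())
        vec = [float(counts[w]) for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def best_matching_unit(weights, vec):
    """Return the (row, col) of the map unit whose weights are closest to vec."""
    best, best_dist = None, float("inf")
    for pos, w in weights.items():
        dist = sum((a - b) ** 2 for a, b in zip(w, vec))
        if dist < best_dist:
            best, best_dist = pos, dist
    return best


def train_som(vectors, rows=3, cols=3, epochs=50, lr0=0.5, radius0=1.5, seed=0):
    """Train a small rectangular SOM; all hyperparameters are illustrative."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs          # decays linearly from 1 towards 0
        lr = lr0 * frac                      # learning rate for this epoch
        radius = max(radius0 * frac, 0.5)    # neighbourhood width, floored
        for vec in rng.sample(vectors, len(vectors)):
            br, bc = best_matching_unit(weights, vec)
            for (r, c), w in weights.items():
                # Units near the winner on the grid move more than distant ones.
                grid_dist2 = (r - br) ** 2 + (c - bc) ** 2
                influence = math.exp(-grid_dist2 / (2 * radius ** 2))
                for i in range(dim):
                    w[i] += lr * influence * (vec[i] - w[i])
    return weights


# Toy collection (made up for this sketch).
docs = [
    "the parser builds a syntax tree",
    "the parser reads a sentence",
    "stock prices fell sharply today",
    "stock markets fell today",
]
vocab, vectors = bag_of_words(docs)
som = train_som(vectors)
for doc, vec in zip(docs, vectors):
    print(best_matching_unit(som, vec), doc)
```

Each document is assigned to the grid cell of its best-matching unit, so documents with similar vocabulary tend to land on the same or neighbouring cells, which is what makes the map browsable. The inner loops visit every map unit and every vector dimension for each document, which suggests why shrinking the document representation matters for efficiency on large collections.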
