A Shallow Text Processing Core Engine
Author(s) - Neumann Günter, Piskorski Jakub
Publication year - 2002
Publication title - Computational Intelligence
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.353
H-Index - 52
eISSN - 1467-8640
pISSN - 0824-7935
DOI - 10.1111/0824-7935.00197
Subject(s) - parsing, computer science, word (group theory), natural language processing, set (abstract data type), artificial intelligence, german, core (optical fiber), domain (mathematical analysis), line (geometry), speech recognition, programming language, mathematics, telecommunications, geometry, archaeology, history, mathematical analysis
In this article we present SMES-SPPC, a high-performance system for intelligent extraction of structured data from free text documents. SMES-SPPC consists of a set of domain-adaptive shallow core components that are realized by means of cascaded weighted finite-state machines and generic dynamic tries. The system has been fully implemented for German; it includes morphological and on-line compound analysis, efficient POS-filtering, high-performance named-entity recognition, and chunk parsing based on a novel divide-and-conquer strategy. The whole approach proved to be very useful for processing free word order languages such as German. SMES-SPPC performs well (more than 6000 words per second on standard PC environments) and achieves high linguistic coverage, especially for the divide-and-conquer parsing strategy, where we obtained an F-measure of 87.14% on unseen data.
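
The abstract names tries as one of the underlying data structures but gives no detail. As a rough illustration only, the following is a minimal sketch of a character trie with longest-prefix lookup, the kind of operation a lexicon-driven compound analysis could rely on. It is not the authors' implementation: the class names, the `longest_prefix` method, and the toy lexicon entries are all hypothetical and chosen for this example.

```python
from typing import Dict, Optional, Tuple


class TrieNode:
    """One node of a character trie; `value` is set only where a word ends."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.value: Optional[str] = None


class Trie:
    """Hypothetical lexicon trie mapping word forms to a tag string."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, word: str, value: str) -> None:
        # Walk down the trie, creating nodes as needed, then store the value.
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.value = value

    def longest_prefix(self, text: str) -> Optional[Tuple[str, str]]:
        # Return the longest stored word that is a prefix of `text`,
        # together with its value, or None if no stored word matches.
        node = self.root
        best: Optional[Tuple[str, str]] = None
        for i, ch in enumerate(text):
            node = node.children.get(ch)
            if node is None:
                break
            if node.value is not None:
                best = (text[: i + 1], node.value)
        return best


# Toy lexicon (assumed entries, for illustration only).
lexicon = Trie()
lexicon.insert("haus", "NOUN")
lexicon.insert("haustür", "NOUN")
lexicon.insert("tür", "NOUN")
```

Calling `lexicon.longest_prefix("haustürschloss")` walks the input one character at a time and remembers the last node that carried a value, so lookup cost grows with the length of the input rather than the size of the lexicon.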
