User-Relevant Access to Textual Information through Flexible Identification of Terms: A Semi-Automatic Method and Software Based on a Combination of N-Grams and Surface Linguistic Filters
Author(s) - Ismaïl Biskri, Sylvain Delisle
Publication year - 2000
Language(s) - English
DOI - 10.5555/2856151.2856163
We present a semi-automatic method and software tool for multi-word term identification. Our approach is hybrid in that it combines numeric computations (N-grams) with linguistic filters. The software tool differs from most other term identification tools in that it is semi-automatic by design, i.e. it is interactive and constantly under the user's control. The software supports the work of the knowledge engineer, the (corpus) domain expert, or the linguist by helping them do their job more efficiently. We justify this semi-automatic approach by the need for a more flexible and customisable tool to perform certain term identification tasks. More specifically, in some applications we want to allow the user's perspective, knowledge and subjectivity to influence the results, within certain limits, of course. An example of such an application, on which we are currently working, is Web personalisation: to allow individuals to develop their own vision of the information universes of interest to them, we need flexible and customisable tools that can support them in such a challenging task, not tools that impose on them a pseudo-standardised vision of the world.
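The general idea of pairing N-gram counts with a surface filter and a user decision can be illustrated with a minimal sketch. It is not the authors' software: the abstract does not specify the N-gram unit, the filters, or the interface, so the word-level N-grams, the stop list, the frequency threshold, and the function names `candidate_terms` and `review` are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical stop list standing in for a "surface linguistic filter";
# the filters used by the actual tool are not described in the abstract.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "to", "and", "or",
              "for", "is", "are", "with", "that"}


def candidate_terms(text, n=2, min_count=2, stop_words=STOP_WORDS):
    """Return word n-grams that occur at least min_count times and that
    neither start nor end with a stop word, mapped to their counts.

    Assumption: n-grams are built over words and ignore sentence boundaries.
    """
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    counts = Counter(tuple(tokens[i:i + n])
                     for i in range(len(tokens) - n + 1))
    return {
        " ".join(gram): count
        for gram, count in counts.items()
        if count >= min_count
        and gram[0] not in stop_words
        and gram[-1] not in stop_words
    }


def review(candidates, accept):
    """Keep only the candidates the user accepts.

    `accept` is a callable standing in for the interactive step, where the
    user's own perspective decides which candidates count as terms.
    """
    return sorted(term for term in candidates if accept(term))
```

In this sketch the numeric step only proposes candidates; the final list depends on the `accept` callable, which is where a user's judgement would enter in an interactive setting.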