z-logo
open-access-imgOpen Access
Web Categorisation Using Distance-Based Decision Trees
Author(s) -
Vicente Domingo Estruch Fuster,
Cèsar Ferri,
José HernándezOrallo,
Marïa José Ramírez-Quintana
Publication year - 2006
Publication title -
electronic notes in theoretical computer science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.242
H-Index - 60
ISSN - 1571-0661
DOI - 10.1016/j.entcs.2005.12.043
Subject(s) - computer science , decision tree , web mining , data mining , machine learning , task (project management) , metric (unit) , decision tree learning , tree (set theory) , web page , artificial intelligence , information retrieval , mathematics , world wide web , mathematical analysis , operations management , management , economics
In Web classification, web pages are assigned to pre-defined categories mainly according to their content (content mining). However, the structure of the web site might provide extra information about their category (structure mining). Traditionally, both approaches have been applied separately, or are dealt with techniques that do not generate a model, such as Bayesian techniques. Unfortunately, in some classification contexts, a comprehensible model becomes crucial. Thus, it would be interesting to apply rule-based techniques (rule learning, decision tree learning) for the web categorisation task. In this paper we outline how our general-purpose learning algorithm, the so called distance based decision tree learning algorithm (DBDT), could be used in web categorisation scenarios. This algorithm differs from traditional ones in the sense that the splitting criterion is defined by means of metric conditions (“is nearer than”). This change allows decision trees to handle structured attributes (lists, graphs, sets, etc.) along with the well-known nominal and numerical attributes. Generally speaking, these structured attributes will be employed to represent the content and the structure of the web-site

The content you want is available to Zendy users.

Already have an account? Click here to sign in.
Having issues? You can contact us here
Accelerating Research

Address

John Eccles House
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom