Open Access
Issues in Large-Scale Hierarchical Classifications
Author(s) - M. Balaji Prasath, D. Manjula
Publication year - 2012
Publication title - International Journal of Computer Science and Informatics
Language(s) - English
Resource type - Journals
ISSN - 2231-5292
DOI - 10.47893/ijcsi.2012.1041
Subject(s) - hierarchy, computer science, relevance (law), class (philosophy), categorization, information retrieval, text categorization, class hierarchy, directory, scale (ratio), limit (mathematics), document classification, artificial intelligence, mathematics, geography, cartography, mathematical analysis, object oriented programming, political science, economics, law, market economy, programming language, operating system
Text documents on the web are organized in hierarchies, and the amount of content grows over the years. Classifying these documents requires class labels, but documents in a corpus often belong to more than one class or category. Most such corpora are large, for example Wikipedia and the Yahoo and ODP directories, so classifying these large-scale datasets requires multi-label categorization. As more documents are added to the hierarchy, a very high imbalance arises between classes at different levels of the hierarchy. This makes it difficult to assign documents to their actual classes, so a relevance measure is used to calculate the relevance of a text document to a class label and keep the hierarchy stable. Another issue is that as the number of unique labels increases, classification becomes unstable and slower. Limiting the number of unique labels therefore improves classification performance.
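
The idea of limiting the number of unique labels can be illustrated with a minimal sketch. The abstract does not say how the authors limit labels, so the frequency threshold used here is an assumption, and the function name `limit_labels`, the parameter `min_count`, and the toy label sets are all hypothetical.

```python
from collections import Counter


def limit_labels(doc_labels, min_count=2):
    """Keep only labels assigned to at least `min_count` documents.

    `doc_labels` is a list of label lists, one per document (multi-label).
    The frequency threshold is an illustrative assumption, not the
    criterion used in the paper.
    """
    # Count each label once per document it is assigned to.
    counts = Counter(label for labels in doc_labels for label in set(labels))
    kept = {label for label, n in counts.items() if n >= min_count}
    # Remove the rare labels from every document's label list.
    pruned = [[label for label in labels if label in kept] for labels in doc_labels]
    return pruned, kept


# Hypothetical toy corpus: each inner list holds one document's labels.
docs = [
    ["Science", "Physics"],
    ["Science", "Biology"],
    ["Arts", "Music"],
    ["Science", "Physics", "Optics"],
]
pruned, kept = limit_labels(docs, min_count=2)
print(sorted(kept))
print(pruned)
```

A document can lose all of its labels under this kind of pruning (the third document above does), so a real system would need a fallback, such as assigning the nearest retained ancestor in the hierarchy.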
