Exploring Text Classification for Messy Data: An Industry Use Case for Domain-Specific Analytics
Author(s) - Laura Kassner, Bernhard Mitschang
Publication year - 2016
Language(s) - English
DOI - 10.5441/002/edbt.2016.47
Industrial enterprise data present classification problems which are different from those problems typically discussed in the scientific community, with larger amounts of classes and with domain-specific, often unstructured data. We address one such problem through an analytics environment which makes use of domain-specific knowledge. Companies are beginning to use analytics on large amounts of text data which they have access to, but in day-to-day business, manual effort is still the dominant method for processing unstructured data. In the face of ever larger amounts of data, faster innovation cycles and higher product customization, human experts need to be supported in their work through data analytics. In cooperation with a large automotive manufacturer, we have developed a use case in the area of quality management for supporting human labor through text analytics: When processing damaged car parts for quality improvement and warranty handling, quality experts have to read text reports and assign error codes to damaged parts. We design and implement a system to recommend likely error codes based on the automatic recognition of error mentions in textual quality reports. In our prototypical implementation, we test several methods for filtering out accurate recommendations for error codes and develop further directions for applying this method to a competitive business intelligence use case.