A comparison of discriminative classifiers for web news content extraction
Author(s) -
Alex Spengler,
Antoine Bordes,
Patrick Gallinari
Publication year - 2010
Language(s) - English
DOI - 10.5555/1937055.1937099
Until now, approaches to web content extraction have focused on random field models, largely neglecting large margin methods. Structured large margin methods, however, have recently shown great practical success. We compare, for the first time, greedy and structured support vector machines with conditional random fields on a real-world web news content extraction task, showing that large margin approaches are indeed competitive with random field models.
Accelerating Research
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom
Address
John Eccles HouseRobert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom