Improved TF-IDF for We Media Article Keywords Extraction | Zendy

Xinxin Guan | Zendy; Yeli Li | Zendy; Hechen Gong | Zendy

Open Access

Improved TF-IDF for We Media Article Keywords Extraction

Author(s) - Xinxin Guan, Yeli Li, Hechen Gong

Publication year - 2019

Publication title - Journal of Physics: Conference Series

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.21

H-Index - 85

eISSN - 1742-6596

pISSN - 1742-6588

DOI - 10.1088/1742-6596/1302/3/032003

Subject(s) - tf–idf, computer science, keyword extraction, python (programming language), recall rate, sentiment analysis, precision and recall, artificial intelligence, word (group theory), recall, natural language processing, data mining, information retrieval, mathematics, linguistics, philosophy, physics, geometry, quantum mechanics, term (time), operating system

Keyword extraction is one of the tasks of text topic mining, and it underpins text analysis and public opinion analysis. The traditional TF-IDF algorithm weights keywords mainly by word frequency. It does not consider the importance of feature words that occur less often, nor the comments that readers leave below the article. To address these problems, this paper improves the traditional TF-IDF algorithm by adding part of speech and reader comments as impact factors and recalculating the TF-IDF weight, which improves the accuracy of the algorithm. The authors use Python to crawl we-media articles and to implement the improved algorithm. Experiments show that the improved TF-IDF algorithm significantly outperforms traditional TF-IDF in terms of accuracy, recall rate, F1, MacAvg_P, MacAvg_R and MacAvg_F1.
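The abstract does not give the paper's formula, so the following is only a minimal Python sketch of the general idea: a classic TF-IDF score scaled by a part-of-speech factor and a reader-comment factor, followed by precision, recall and F1 for one article. The names `POS_WEIGHT`, `COMMENT_BOOST`, `improved_scores`, `top_keywords` and `precision_recall_f1`, the multiplicative combination, the smoothed IDF, and all numeric weights are assumptions made for illustration, not values or definitions taken from the paper.

```python
import math
from collections import Counter

# Hypothetical part-of-speech factors; the abstract does not state the paper's values.
POS_WEIGHT = {"noun": 1.0, "verb": 0.6, "adj": 0.4}
DEFAULT_POS_WEIGHT = 0.2   # hypothetical factor for any other part of speech
COMMENT_BOOST = 0.5        # hypothetical strength of the reader-comment factor


def tf_idf(doc_tokens, corpus):
    """Classic TF-IDF: relative term frequency times a smoothed IDF."""
    counts = Counter(doc_tokens)
    total = len(doc_tokens)
    doc_sets = [set(doc) for doc in corpus]
    n_docs = len(doc_sets)
    scores = {}
    for word, count in counts.items():
        df = sum(1 for doc in doc_sets if word in doc)
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        scores[word] = (count / total) * idf
    return scores


def improved_scores(doc_tokens, corpus, pos_tags, comment_tokens):
    """Scale each TF-IDF score by an assumed POS factor and comment factor."""
    base = tf_idf(doc_tokens, corpus)
    comment_counts = Counter(comment_tokens)
    n_comment = max(len(comment_tokens), 1)
    scores = {}
    for word, score in base.items():
        pos_factor = POS_WEIGHT.get(pos_tags.get(word, ""), DEFAULT_POS_WEIGHT)
        # Words that readers repeat in the comments get a boost above 1.0.
        comment_factor = 1.0 + COMMENT_BOOST * comment_counts[word] / n_comment
        scores[word] = score * pos_factor * comment_factor
    return scores


def top_keywords(scores, k):
    """Return the k highest-scoring words; ties are broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]


def precision_recall_f1(predicted, gold):
    """Precision, recall and F1 of predicted keywords against a gold set."""
    hits = len(set(predicted) & set(gold))
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Toy data, invented for this example only.
article = ["phone", "phone", "release", "battery", "new"]
corpus = [article, ["phone", "price", "new"], ["weather", "new", "rain"]]
pos_tags = {"phone": "noun", "release": "verb", "battery": "noun", "new": "adj"}
comments = ["battery", "battery", "life"]

scores = improved_scores(article, corpus, pos_tags, comments)
keywords = top_keywords(scores, 2)
print(keywords, precision_recall_f1(keywords, ["phone", "battery", "release"]))
```

In this toy data, "battery" and "release" occur equally often in the article, so plain TF-IDF cannot separate them. The assumed factors rank "battery" higher because it is a noun and readers mention it in the comments. The names MacAvg_P, MacAvg_R and MacAvg_F1 in the abstract presumably denote macro-averages, which would be obtained by averaging these per-article values over all test articles.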
