The influence of the broadness of a query of a topic on its h‐index: Models and examples of the h‐index of n‐grams
Author(s) -
Egghe Leo,
Ravichandra Rao I.K.
Publication year - 2008
Publication title -
Journal of the American Society for Information Science and Technology
Language(s) - English
Resource type - Journals
eISSN - 1532-2890
pISSN - 1532-2882
DOI - 10.1002/asi.20843
Subject(s) - mathematics, combinatorics, index (typography), function (biology), scopus, measure (data warehouse), exponent, statistics, database, computer science, world wide web, linguistics, philosophy, medline, evolutionary biology, political science, law, biology
The article studies the influence of the query formulation of a topic on its h‐index. To generate purely random sets of documents, we used N‐grams of variable length N (strings of zeros, truncated at the end) to measure this influence. The databases used are Web of Science (WoS) and Scopus. The formula $h = T^{1/\alpha}$, proved in Egghe and Rousseau (2006), where T is the number of retrieved documents and α is Lotka's exponent, is confirmed as a concavely increasing function of T. We also give a formula for the relation between h and the length N of the N‐gram: $h = D \cdot 10^{-N/\alpha}$, where D is a constant. This is a convexly decreasing function of N, which is what we find in our experiments. Nonlinear regression on $h = T^{1/\alpha}$ gives an estimate of α, which can then be used to estimate the h‐index of the entire database (WoS and Scopus): $h = S^{1/\alpha}$, where S is the total number of documents in the database.
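The three formulas in the abstract can be illustrated with a minimal Python sketch. The function names and all numeric values below are assumptions made for illustration; the data points are synthetic, not the article's measurements. The article reports nonlinear regression on $h = T^{1/\alpha}$; this sketch swaps in a simpler log-log least-squares fit through the origin, which is not necessarily what the authors did.

```python
import math


def h_from_total(T, alpha):
    """h-index predicted from T retrieved documents: h = T**(1/alpha)."""
    return T ** (1.0 / alpha)


def h_from_ngram_length(N, alpha, D):
    """h-index predicted from N-gram length N: h = D * 10**(-N/alpha)."""
    return D * 10 ** (-N / alpha)


def estimate_alpha(pairs):
    """Estimate alpha from (T, h) pairs.

    Assumption: fits ln h = (1/alpha) * ln T by least squares through
    the origin, a stand-in for the nonlinear regression in the article.
    """
    num = sum(math.log(T) * math.log(h) for T, h in pairs)
    den = sum(math.log(T) ** 2 for T, _ in pairs)
    return den / num  # slope is num/den = 1/alpha


# Synthetic (T, h) pairs generated from the model itself with alpha = 2.
alpha_true = 2.0
synthetic = [(T, h_from_total(T, alpha_true)) for T in (100, 10_000, 1_000_000)]

alpha_hat = estimate_alpha(synthetic)

# Hypothetical database size S, used only to show the last formula.
S = 40_000_000
h_database = h_from_total(S, alpha_hat)
```

Under this model, each additional character in the N‐gram divides the predicted h‐index by the constant factor $10^{1/\alpha}$, which is why the decrease in N is convex.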