Measures for bibliometric size, impact, and concentration
Author(s) - Prathap Gangan
Publication year - 2015
Publication title - Journal of the Association for Information Science and Technology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.903
H-Index - 145
eISSN - 2330-1643
pISSN - 2330-1635
DOI - 10.1002/asi.23364
Subject(s) - citation, citation impact, library science, computer science, bibliometrics, information retrieval
Dear Sir,
Recently, Egghe (2014) suggested that the g-index (Egghe, 2006), normalized using the square root of the total number of citations C as s = g/⌊√C⌋, where ⌊∗⌋ denotes the floor function (⌊x⌋ is the largest integer smaller than or equal to x), is a “good” normalized measure of concentration. Rousseau (2014), in a thoughtful analysis, showed that Egghe’s s-measure is not an acceptable concentration measure, and neither is the g-index or any other h-type index. Further, the new measure s no longer serves as a measure of impact. It is meaningful to review all this in the light of the three-dimensional approach recently introduced by Prathap (2014).
Let c_k, k = 1 to P, represent the citation sequence of all P papers in a portfolio (Prathap, 2011). Note that Egghe (2014) and Rousseau (2014) use the notation T instead of P for the total number of papers. Then C = Σc_k, k = 1 to P, is the total number of citations. As a note of clarification, k is used here as the index of the citation sequence instead of i because i has historically served as the notation for impact. While P serves as a proxy for the quantity or size of the academic effort in the portfolio, the impact i = C/P is an empirical proxy for quality. P can be viewed as a performance indicator of the zeroth order. Then, i and P become two orthogonal components of a three-dimensional performance evaluation protocol. C = Pi can be considered a performance indicator of the first order (Prathap, 2011).
Prathap (2011) showed that it is possible to define second-order, energy-like terms E = Σc_k² and X = iC. The product X = iC = i²P becomes a higher-order measure. It is a robust second-order performance indicator (Prathap, 2011). Apart from X, the additional indicator defined by E = Σc_k² also appears as a second-order indicator. The coexistence of X and E allows us to introduce a third attribute that is neither quantity nor quality.
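As an illustration only, the zeroth-, first-, and second-order indicators defined above can be computed from a citation sequence in a few lines. The following is a minimal sketch, not code from the letter: the function name `portfolio_indicators` and the sample citation counts are hypothetical, and E is taken to be the sum of squared citation counts.

```python
from typing import Dict, Sequence


def portfolio_indicators(citations: Sequence[int]) -> Dict[str, float]:
    """Return P, C, i, X and E for a citation sequence c_k, k = 1..P.

    Hypothetical helper for illustration; the names follow the letter's symbols.
    """
    P = len(citations)                 # zeroth order: number of papers
    if P == 0:
        raise ValueError("the portfolio must contain at least one paper")
    C = sum(citations)                 # first order: total citations, C = P * i
    i = C / P                          # impact: citations per paper
    X = i * C                          # second order: X = iC = i^2 * P
    E = sum(c * c for c in citations)  # second order: E = sum of c_k^2
    return {"P": P, "C": C, "i": i, "X": X, "E": E}


# Made-up citation counts for a four-paper portfolio.
example = portfolio_indicators([10, 5, 3, 2])
print(example)  # {'P': 4, 'C': 20, 'i': 5.0, 'X': 100.0, 'E': 138}
```

In this made-up portfolio X is smaller than E, as it is for any non-uniform citation sequence; the two coincide only when every paper has the same number of citations.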
The simple ratio of X to E can be viewed as the third component of performance, namely, the consistency term η = X/E. Perfect consistency (η = 1, i.e., when X = E) is a case of absolutely uniform performance; that is, all papers in the set have the same number of citations, c_k = c = i. When the best work is concentrated in a very few papers of extraordinary impact, we have highly inconsistent performance; the inverse of consistency thus becomes a measure of concentration. It is possible to show that η is related to other popular measures of concentration like Simpson’s diversity index D (Simpson, 1949) or the Herfindahl–Hirschman index (Herfindahl, 1950; Hirschman, 1964). The same logic that applies to bibliometric variation or skew also applies to ecological diversity, where species richness and abundance are governed by identical sequences (Jost, 2010), and Simpson’s diversity index D is a measure of diversity and, indirectly, of the evenness ν of the distribution of the species through the relation D = Pν. A measure of market concentration, the Herfindahl–Hirschman Index (HHI), is used to describe the competitiveness of an industry by using the sum of the squares Σc_k² for k = 1 to P, where the market share c_k is expressed as a fraction. The measure is essentially equivalent to the Simpson diversity index used in ecology in the manner described by η = ν = 1/(HHI × P) = D/P.
The h-index, as originally proposed (Hirsch, 2005), is a purely heuristic construction that is sensitive to the form of the citation distribution (described by consistency η), in addition to the normal bibliometric indicators which sense quantity (or size), namely, the number of papers P, and the quality (or impact i) as measured by the ratio C/P, where C is the total number of citations received by the P papers.
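The relation η = 1/(HHI × P) = D/P can be checked numerically. The sketch below is illustrative and rests on assumptions not spelled out in the letter: the shares entering the HHI are taken as c_k/C, D is taken as the inverse form 1/HHI (which is what D = Pν requires), and the function name `concentration_measures` is hypothetical.

```python
import math
from typing import Dict, Sequence


def concentration_measures(citations: Sequence[int]) -> Dict[str, float]:
    """Return consistency eta, the HHI and the diversity D for a citation sequence."""
    P = len(citations)
    C = sum(citations)
    if P == 0 or C == 0:
        raise ValueError("need at least one paper and at least one citation")
    E = sum(c * c for c in citations)   # E = sum of c_k^2
    X = C * C / P                       # X = iC = C^2 / P
    eta = X / E                         # consistency; 1/eta measures concentration
    shares = [c / C for c in citations] # assumption: share of paper k is c_k / C
    hhi = sum(s * s for s in shares)    # Herfindahl-Hirschman Index on fractional shares
    D = 1.0 / hhi                       # assumption: inverse form of Simpson's index
    return {"eta": eta, "hhi": hhi, "D": D}


# Made-up, skewed portfolio: eta falls below 1.
skewed = concentration_measures([10, 5, 3, 2])
assert math.isclose(skewed["eta"], 1.0 / (skewed["hhi"] * 4))
assert math.isclose(skewed["eta"], skewed["D"] / 4)

# Perfectly uniform portfolio: eta equals 1.
uniform = concentration_measures([4, 4, 4])
assert math.isclose(uniform["eta"], 1.0)
```

With shares defined this way, HHI = E/C², so 1/(HHI × P) = C²/(EP) = X/E, which is the identity the assertions exercise.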
The h-index, the g-index, and other h-type indices are actually heuristic constructions that try to condense P, i, and η into a single number that has the same dimensions as the number of papers P (Prathap, 2012).
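For readers who want to experiment with these indices, here is a minimal sketch of the h-index, the g-index, and the normalized measure s. It makes three assumptions: the standard textbook definitions of h and g, a g-index capped at the number of papers P (no fictitious zero-citation papers are appended), and the reading s = g/⌊√C⌋ of the normalization discussed in the letter. The function names are hypothetical.

```python
import math
from typing import Sequence


def h_index(citations: Sequence[int]) -> int:
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)


def g_index(citations: Sequence[int]) -> int:
    """Largest g <= P such that the top g papers together have at least g^2 citations."""
    ranked = sorted(citations, reverse=True)
    g, running_total = 0, 0
    for rank, c in enumerate(ranked, start=1):
        running_total += c
        if running_total >= rank * rank:
            g = rank
    return g


def s_measure(citations: Sequence[int]) -> float:
    """Assumed form of the normalized g-index: s = g / floor(sqrt(C))."""
    C = sum(citations)
    if C == 0:
        raise ValueError("s is undefined when there are no citations")
    return g_index(citations) / math.isqrt(C)  # isqrt gives floor(sqrt(C)) exactly


skewed = [10, 5, 3, 2]   # made-up citation counts
uniform = [1, 1, 1, 1]   # made-up, perfectly uniform portfolio
print(h_index(skewed), g_index(skewed), s_measure(skewed))     # 3 4 1.0
print(h_index(uniform), g_index(uniform), s_measure(uniform))  # 1 1 0.5
```

Each function returns a single number, so none of them reports P, i, and η separately.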