Impact Factors and the Mismeasure of Quality in Research
Recently Henry Roediger III published an article in the APS Observer criticizing simplistic uses of Impact Factors (IFs) as measures of research quality. A journal’s IF in a given year is defined as the number of times papers published in the preceding two years have been cited during the given year, divided by the number of papers published in the preceding two years. Roediger’s main points are that IFs involve some basic abuses of statistical and number craft, and that they don’t measure what they are being used to measure. He points out that IFs are reported to three decimal places, which usually is spuriously precise. IFs also are means, yet the citations of papers in most journals are very strongly skewed, with many papers not being cited at all and a few being cited many times. In many such journals the mode and median number of citations per paper are both 0. Finally, he points out that other indicators of impact are ignored, such as whether research publications are used in textbooks or applied in real-world settings. Similar criticisms have been raised in the San Francisco Declaration on Research Assessment, which adds that IFs can be gamed by editorial policy and are strongly field-specific.
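To make the arithmetic concrete, here is a minimal Python sketch of the IF definition given above, applied to a skewed set of citation counts. The counts are invented purely for illustration and do not describe any real journal; the function name `impact_factor` is likewise just a label chosen for this example.

```python
from statistics import median, mode


def impact_factor(citations_this_year, papers_prev_two_years):
    """IF as defined above: citations received this year by papers from the
    preceding two years, divided by the number of those papers."""
    return citations_this_year / papers_prev_two_years


# Hypothetical citation counts for ten papers a journal published in the
# preceding two years (made-up numbers, chosen to be strongly skewed).
counts = [0, 0, 0, 0, 0, 0, 1, 2, 5, 40]

if_value = impact_factor(sum(counts), len(counts))  # this is simply the mean
typical = median(counts)
most_common = mode(counts)

print(f"IF (mean): {if_value:.3f}")
print(f"median:    {typical}")
print(f"mode:      {most_common}")
```

In this made-up case a single heavily cited paper produces an IF of 4.8 even though the median and modal paper were not cited at all, which is the skew problem Roediger describes.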
I agree with these points but I don’t think Roediger’s critique goes far enough. The chief problem with IFs is their usage in evaluating individual researchers, research departments, and universities. It should be blindingly obvious that the IF of a journal has almost nothing to do with the number of citations of your or my publications in that journal. The number of citations may be weakly driven by the journal’s reputation and focus, but it also strongly depends on how long your paper has been out there. Citation rates may be a bit more strongly connected with journal IFs than citation totals, but there still are many other factors influencing citation rates. These observations may be obvious, but they seem completely uncomprehended by those who would like to rule academics. Instead, all too often we researchers find ourselves in situations where our fates are being determined by an inappropriate metric that is mindlessly applied by people who know nothing about our research and who lack any modicum of numeracy. The use of IFs for judgments of individual researchers’ output quality should be junked.
Number of citations, the h-index, or the i10-index might seem reasonable measures, but there are difficulties with these alternatives too. Young academics can be the victims of the slow accrual of citations. My own case is a fairly extreme illustration. My citations currently number more than 2600. I got my PhD in 1976, so I am about 37 years out of my PhD. Nearly half of those citations (1290) have occurred in the past 5 1/2 years (since 2008). That’s right: 31 years versus less than 6 years for the same number of citations. In 2012 alone I had approximately 250 citations. This isn’t because I didn’t produce anything of impact until late. Two of my most cited works were published in 1987 and 1989 and have gained about half their citations since 2005 because, frankly, they were ahead of their time. There also seems to be confusion between number of citations and citation rates. The h and i10 indexes are based on numbers of citations, as is the list of your publications that Google Scholar provides. But the graph that Google Scholar presents of citations of your works by year conveys information about citation rates. You get a substantially different view of a publication’s impact if you measure it by citations per year since publication than if you do so by total number of citations. For instance, two of my works have nearly identical total numbers of citations (197 and 192), but the first has 7.6 citations/year whereas the second has 19.9 citations/year. Finally, like IFs, number of citations and citation rates depend on field-specific characteristics such as the number of people working in your area and achievable publication rates.
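The difference between totals and rates can be sketched in a few lines of Python. The h-index and i10-index functions below follow the standard definitions (the largest h such that h papers have at least h citations each; the number of papers with at least 10 citations). The citation list is hypothetical, and the publication ages of 26 and 9.65 years are assumptions back-calculated from the totals and rates quoted above, not figures stated in the text.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def i10_index(citations):
    """Number of papers with at least 10 citations."""
    return sum(1 for count in citations if count >= 10)


def citations_per_year(total, years_since_publication):
    """Citation rate: total citations divided by years since publication."""
    return total / years_since_publication


# Hypothetical per-paper citation totals for one researcher.
papers = [25, 12, 10, 3, 0]
print(h_index(papers), i10_index(papers))

# Totals from the text; the ages in years are assumed (back-calculated).
older = citations_per_year(197, 26)
newer = citations_per_year(192, 9.65)
print(round(older, 1), round(newer, 1))
```

Both indexes look only at totals, so the two works with 197 and 192 citations contribute to them almost identically, while their rates differ by a factor of more than two.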
Another thing while I’m on the soap-box: Books should count. My books account for slightly more than half of my citations, and occupy ranks 1, 2, 3, 4, 7, 10, 13, and 14 out of the 75 of my publications that have received any citations. My most widely cited work by a long chalk is a book (published in 1989, it now has more than 530 citations, approximately 22 per year, with about half of these since 2005). However, in Australian science departments, books don’t count. Of course, this isn’t limited to the Australian scene. ISI ignores books and book chapters, and some journals forbid authors to include books or book chapters in their reference lists. This is a purely socially constructed make-believe version of “quality” that sanctions blindness to literature that has real, measurable, and very considerable impact. I have colleagues who seriously consider rewriting highly cited book chapters and submitting them to journals so that they’ll count. This is sheer make-work of the most wasteful kind.
Finally, IFs and citation numbers or rates have no logical connection or demonstrated correlation with the quality of publications. Very bad works can garner high citation rates because numerous authors attack them. Likewise, useful but pedestrian papers can have high citation rates. Conversely, genuinely pioneering works may not be widely cited for a long time. There simply is no substitute for experts making careful assessments of the quality of research publications in their domains of expertise. Numerical indices can help, but they cannot supplant expert judgment. And yet, I claim that this is precisely the attraction that bureaucrats find in single-yardstick numerical measures. They don’t have to know anything about research areas or even basic number-craft to be able to rank-order researchers, departments, and/or entire universities by applying such a yardstick in a perfectly mindless manner. It’s a recipe for us to be ruled and controlled by folk who are massively ignorant and, worse still, meta-ignorant.