SciVal’s field-weighted citation impact (FWCI) is an article-level metric that takes the form of a simple ratio: actual citations to a given output divided by the expected rate for outputs of similar age, subject and publication type. FWCI has the dual merits of simplicity and ease of interpretation: a value of 2 indicates that an output has achieved twice the expected impact relative to the world literature. It is a really useful addition to the benchmarking toolkit.

The trouble is that, typically, the distribution of citations to outputs is highly skewed, with most outputs achieving minimal impact at one end and a small number of extreme statistical outliers at the other. Applying the arithmetic mean to data distributed like this, as does FWCI, is not ideal because the outliers can exert a strong leveraging effect, “inflating” the average for the whole set. This effect is likely to be more marked the smaller the sample size.
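To make the leveraging effect concrete, here is a minimal sketch using made-up numbers (the values below are hypothetical, not taken from SciVal or from the King's data): nineteen papers with a modest FWCI and a single highly cited outlier.

```python
import statistics

# Hypothetical FWCI values: 19 modest performers plus one extreme outlier.
typical = [0.8] * 19
with_outlier = typical + [40.0]

# Arithmetic mean without and with the outlier.
print(round(statistics.fmean(typical), 2))       # 0.8
print(round(statistics.fmean(with_outlier), 2))  # 2.76

# The median is unaffected by the single outlier.
print(statistics.median(with_outlier))           # 0.8
```

One paper out of twenty is enough to lift the mean from 0.8 to 2.76, while the median does not move.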

I explored this effect in a simple experiment. I downloaded SciVal FWCI values for 52,118 King’s College London papers published up until 2014. I then calculated mean FWCI and 95% confidence (or stability) intervals for the whole sample using the bootstrapping[1] feature in SPSS. Then I took progressively smaller random samples (99%, 98%, and so on to 1%, then 0.1%), recalculating mean FWCI and stability intervals each time.

The findings show how mean FWCI becomes less stable as sample size decreases. Highly cited outliers are relatively uncommon, but their chance inclusion or exclusion makes a big difference, especially as the number of outputs decreases. In this experiment, FWCI values range across four orders of magnitude, from 0.03 to 398.28.
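For readers without SPSS, the general shape of the experiment can be sketched in Python. This is not my Syntax file: it is a rough stand-in that uses a percentile bootstrap from the standard library, and it runs on synthetic, lognormally distributed values rather than real FWCI data. The function name `bootstrap_mean_ci`, the lognormal parameters and the sample fractions are all assumptions chosen for illustration.

```python
import random
import statistics


def bootstrap_mean_ci(values, n_resamples=1000, alpha=0.05, rng=None):
    """Return (mean, lower, upper) using a percentile bootstrap of the mean."""
    rng = rng or random.Random()
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return statistics.fmean(values), lower, upper


def demo():
    rng = random.Random(42)
    # Synthetic, highly skewed stand-in for FWCI values (assumed lognormal;
    # mu = -sigma^2 / 2 gives a theoretical mean of 1).
    population = [rng.lognormvariate(-0.72, 1.2) for _ in range(5000)]
    for fraction in (1.0, 0.1, 0.01):
        k = max(1, int(len(population) * fraction))
        sample = rng.sample(population, k)
        mean, lower, upper = bootstrap_mean_ci(sample, n_resamples=500, rng=rng)
        print(f"n={k:5d}  mean={mean:.2f}  95% interval=({lower:.2f}, {upper:.2f})")


if __name__ == "__main__":
    demo()
```

Running `demo()` prints the mean and its interval at each sample size; the interval should widen as the sample shrinks, which is the pattern described above. To try it on real data, replace `population` with your own list of FWCI values.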

*What does this mean for interpreting FWCI, especially when benchmarking?* The table below offers some guidance. It shows typical stability intervals around FWCI at different scales. The final column assumes that SciVal spits out a value of 2.20 and shows how that figure should be interpreted in terms of its stability.

It’s pretty clear from this analysis that you need to know when it’s time to stop when you are drilling down in SciVal! Another implication is that there is no sensible justification for quoting FWCI to two, let alone three, decimal places of precision. I’ve kept the second decimal place above simply for purposes of demonstration.

I am well aware that the guidance above is based on data from just one institution, and may not travel well. If you would like to replicate this experiment using your own data, I’m happy to share my SPSS Syntax file. It automates the whole thing, so you just have to load and go off on a short holiday! Just drop me an email.

*Ian Rowlands is a Research Information & Intelligence Specialist at King’s College London and a member of the LIS-Bibliometrics committee.*

[1] Bootstrapping entry in Wikipedia: https://en.wikipedia.org/wiki/Bootstrapping_(statistics)

Won’t the effect you describe be even stronger if you analyse the raw citations? I wonder how much of the described effect is actually caused by the specific characteristics of the FWCI and how much can be explained by inducing any population statistic from (too) small sample sizes.

A potential issue with the FWCI arises if the expected rate (in the corresponding subject field) were to vary strongly across years, and this variation were caused by outliers with many citations affecting the notoriously non-robust mean. According to a quick (and admittedly dirty) look at our data, that does not seem to be the case.

I didn’t look at raw citations as I was only concerned with how to interpret FWCI. You’re right that some of the effect is simply a function of small samples rather than the FWCI calculation per se, but I doubt that matters in practice. The key thing is to discourage over-interpretation of modest changes in FWCI because this could lead to some poor decisions.
