Six weeks is a long time in bibliometrics: Stability and Field-Weighted Citation Percentile

Dr Ian Rowlands, writing in a personal capacity

Let’s begin with a self-evident truth: publication counts aside, bibliometric indicators often change each time new material is added to a citation database. That’s a given, but it raises a couple of interesting questions: how much do indicators change, and how quickly? We know that even one highly cited paper can produce big swings in a journal’s annual impact factor (see, for example, Dimitrov, Kaveri & Bayry, 2010), but as far as I know, no one has looked at volatility specifically in the context of SciVal’s field-weighted citation percentiles (FWCPs), a popular indicator that is widely influential in research evaluation. Like many other citation-based indicators, FWCPs are relative, not absolute, measures of impact and in a sense are much like a stock market index, whose value reflects the sentiment of the market at a particular point in time.

This post presents the results of a simple exercise that puts some figures against those questions (how much? how quickly?) by comparing FWCPs over intervals of six weeks and two and a half years respectively.

I downloaded FWCPs for 15,834 journal articles (from a research-led UK institution) on 1 June 2017. The fixed test population is a broad mix of disciplines (Table 1):

| Journal articles published in: | 2014 | 2015 | 2016 | 2017 | Domain totals |
| --- | --- | --- | --- | --- | --- |
| Health sciences | 1,778 | 1,864 | 1,842 | 729 | 6,213 |
| Life sciences | 1,191 | 1,198 | 1,133 | 428 | 3,950 |
| Physical sciences | 736 | 816 | 806 | 298 | 2,656 |
| Social sciences & humanities | 877 | 879 | 894 | 365 | 3,015 |
| Publication year totals | 4,582 | 4,757 | 4,675 | 1,820 | 15,834 |

Table 1: Population of journal articles by domain and publication year
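The calculations throughout this post are simple enough to reproduce against your own data. Here is a minimal sketch in Python of how the fixed population could be assembled, assuming (hypothetically) that each SciVal export is a CSV keyed by Scopus EID with publication year, domain and FWCP columns; the file and column names are mine, not SciVal’s:

```python
import pandas as pd

# Hypothetical exports, one per census date; file and column names are assumptions
jun17 = pd.read_csv("fwcp_2017-06-01.csv", index_col="eid")
jul17 = pd.read_csv("fwcp_2017-07-10.csv", index_col="eid")
oct19 = pd.read_csv("fwcp_2019-10-16.csv", index_col="eid")

# Keep the population fixed: only articles present at every census date
common = jun17.index.intersection(jul17.index).intersection(oct19.index)
pop = jun17.loc[common]

# Reproduce the shape of Table 1: domain by publication year, with totals
print(pd.crosstab(pop["domain"], pop["pub_year"], margins=True))
```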

Change over six weeks

The first series of scatterplots (Figs 1-4), broken down by domain and publication year, compares the percentile values as they stood on 1 June 2017 with those on 10 July 2017. The dots show absolute values, not percentage change.

Figure 1: Change in FWCP over six weeks (for articles published in 2014)
Figure 2: Change in FWCP over six weeks (for articles published in 2015)
Figure 3: Change in FWCP over six weeks (for articles published in 2016)
Figure 4: Change in FWCP over six weeks (for articles published in 2017)

The initial impression is that the percentiles look pretty volatile, but remember that there are an awful lot of data points on each scatterplot (see Table 1), so there’s a danger that the eye is being deceived by a relatively small number of outliers. Against that, only 25.7% of the articles retained the same percentile value over the six-week period, and almost half (49.2%) moved, up or down, by more than one percentile point.
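Those stability shares are a one-liner each, carrying on from the hypothetical snapshots loaded above:

```python
# Absolute change in percentile between the two census dates
delta = (jul17.loc[common, "fwcp"] - pop["fwcp"]).abs()

same = (delta == 0).mean()   # share keeping exactly the same percentile
moved = (delta > 1).mean()   # share moving by more than one point
print(f"unchanged: {same:.1%}; moved by more than one point: {moved:.1%}")
```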

Since there are too many data points to see them clearly, it’s probably better to ask how closely the two distributions (1 June vs 10 July) correlate. If there were no volatility and every article retained the same value, we’d expect the correlation to be exactly 1.

| Journal articles published in: | 2014 | 2015 | 2016 | 2017 |
| --- | --- | --- | --- | --- |
| Health sciences | 0.987 (0.978 to 0.994) | 0.976 (0.966 to 0.983) | 0.909 (0.888 to 0.928) | 0.789 (0.649 to 0.884) |
| Life sciences | 0.987 (0.973 to 0.995) | 0.971 (0.954 to 0.984) | 0.953 (0.944 to 0.961) | 0.756 (0.628 to 0.860) |
| Physical sciences | 0.984 (0.964 to 0.995) | 0.967 (0.943 to 0.984) | 0.928 (0.906 to 0.946) | 0.733 (0.562 to 0.877) |
| Social sciences & humanities | 0.970 (0.946 to 0.986) | 0.946 (0.916 to 0.968) | 0.903 (0.862 to 0.935) | 0.874 (0.695 to 0.967) |

Table 2: Pearson correlations (with 95% confidence intervals) between 1 June 2017 and 10 July 2017 FWCP values
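For completeness, here is a sketch of one standard way to attach a 95% confidence interval to a Pearson correlation, the Fisher z-transformation. I can’t confirm this is exactly how the intervals in Tables 2 and 3 were derived, and the column names carry over from the hypothetical snapshots above:

```python
import numpy as np
from scipy.stats import pearsonr

def pearson_ci(x, y):
    """Pearson r with an approximate 95% CI via Fisher's z-transform."""
    r, _ = pearsonr(x, y)
    z = np.arctanh(r)              # Fisher transform of r
    se = 1 / np.sqrt(len(x) - 3)   # standard error of z
    lo, hi = np.tanh([z - 1.96 * se, z + 1.96 * se])
    return r, lo, hi

# One cell of Table 2: Health sciences articles published in 2014
mask = (pop["domain"] == "Health sciences") & (pop["pub_year"] == 2014)
r, lo, hi = pearson_ci(pop.loc[mask, "fwcp"], jul17.loc[common, "fwcp"][mask])
print(f"r = {r:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```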

In all cases the data correlate highly – but not perfectly – and the fit becomes increasingly loose for more recent articles. Put another way, a reading taken today for a recent paper is very unlikely to be an accurate predictor of its performance in six weeks’ time. This is precisely why responsible bibliometricians like things to settle and leave a healthy time gap before evaluating impact (thus putting themselves in conflict with users who want data almost in real time).

Change over two and a half years

This second analysis (Figs 5-8) assesses change over a much longer interval, nearly two and a half years, with FWCP values compared on 1 June 2017 and 16 October 2019.

This time only 9.3% of the articles retain the same percentile value, and a large majority (80.3%) have moved up or down by more than one percentile point.

Figure 5: Change in FWCP over two and a half years (for articles published in 2014)
Figure 6: Change in FWCP over two and a half years (for articles published in 2015)
Figure 7: Change in FWCP over two and a half years (for articles published in 2016)
Figure 8: Change in FWCP over two and a half years (for articles published in 2017)

Both the scatterplots and the correlations (Table 3) reveal much greater change than over six weeks, as one would expect.

| Journal articles published in: | 2014 | 2015 | 2016 | 2017 |
| --- | --- | --- | --- | --- |
| Health sciences | 0.879 (0.532 to 0.891) | 0.792 (0.733 to 0.810) | 0.488 (0.446 to 0.583) | 0.095 (−0.038 to 0.242) |
| Life sciences | 0.893 (0.876 to 0.907) | 0.803 (0.755 to 0.828) | 0.594 (0.548 to 0.638) | 0.400 (0.251 to 0.552) |
| Physical sciences | 0.839 (0.809 to 0.867) | 0.760 (0.724 to 0.792) | 0.460 (0.388 to 0.525) | 0.235 (0.023 to 0.443) |
| Social sciences & humanities | 0.751 (0.711 to 0.786) | 0.581 (0.528 to 0.633) | 0.296 (0.200 to 0.388) | 0.153 (−0.096 to 0.420) |

Table 3: Pearson correlations (with 95% confidence intervals) between 1 June 2017 and 16 October 2019 FWCP values

Conclusions and implications for practice

There are some obvious conclusions to be drawn, most of which we already “knew” in advance, but without numbers to challenge our assumptions.  Am I surprised to find that FWCP values change over time?  No, of course not.  Am I surprised by how much and how quickly? I have to say I am, although I’m cautious about making value judgements.

The findings raise some interesting challenges for professional practice. 

For example, we’ve become very enamoured of “cliff edge” metrics, such as the number of outputs in the top decile of world impact. The trouble is that the boundary is permeable (and arbitrary): sometimes papers are in, and sometimes they’re not. To make that point clearer, let’s see what happens at the 10% boundary over the two time intervals. We’ll exclude the 2017 articles from this analysis because they’re just too flaky, and focus on 2014-16 only.

On 1 June 2017, there were 3,151 articles in the top FWCP decile. This edged down slightly to 3,115 by 10 July, with 2,913 articles common to both census dates. So, over the six weeks, 238 papers dropped out of the top decile and 202 dropped in.

By 16 October 2019, the number of top-decile articles had grown substantially, to 4,273, with 2,649 common to both census dates: 502 articles dropped out and 1,624 dropped in!
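The churn itself is just set arithmetic on article identifiers. A sketch, again using the hypothetical snapshots from above and assuming “top decile” means a percentile of 90 or above:

```python
# Restrict to 2014-16 articles, as in the analysis above
early = pop[pop["pub_year"] <= 2016]
ids = early.index

top_jun17 = set(ids[early["fwcp"] >= 90])
top_oct19 = set(ids[oct19.loc[ids, "fwcp"] >= 90])

native = top_jun17 & top_oct19        # consistently top decile
dropped_out = top_jun17 - top_oct19   # in on the first date only
dropped_in = top_oct19 - top_jun17    # arrived by the second date
print(len(native), len(dropped_out), len(dropped_in))
```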

So, when we talk of a “top decile” paper, perhaps we’re conflating two populations: consistent, or “native”, top-decile papers and top-decile “visitors”? This seems an important distinction to me, given the churn in the data set.

We face another conundrum if we generate reports containing counts of outputs by FWCP band, perhaps as input into an institutional KPI. The first time we do this, everything is fine. But what about the update twelve months later? Should we simply paste in a new column for the most recent publication year’s metric, or should we recalculate the earlier years as well? I think we should recalculate, because doing so clearly adds to the robustness of the earlier readings. The trouble is, how do you explain to faculty that they have fewer 2016 “top papers” now than they did last year?
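If it helps, recalculating the whole report rather than appending a column is cheap. A sketch, with illustrative band edges rather than any institutional standard:

```python
# Recompute every publication year against the latest snapshot, rather
# than pasting one new column onto last year's report; bin edges are
# illustrative only
bands = pd.cut(oct19.loc[common, "fwcp"], bins=[0, 50, 75, 90, 100],
               include_lowest=True)
print(pd.crosstab(bands, pop["pub_year"]))
```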

The purpose of this post is not to denigrate FWCPs. They have many good features: they’re neat, intuitive, informative and much more appropriate than mean field-weighted citation impact (FWCI) when we’re dealing with small samples. It’s just that we need to reframe our understanding: they really are more akin to stock market indices than academic BAFTAs.

Reference

Dimitrov, J.D., Kaveri, S.V. & Bayry, J. (2010). Metrics: journal’s impact factor skewed by a single paper. Nature, 466, 179. https://doi.org/10.1038/466179b



Ian Rowlands is a Research Information & Intelligence Specialist at King’s College London and a member of the LIS-Bibliometrics committee.

 