Course summary: Statistics for Responsible Bibliometrics

On 26th March 2018, the Lis-Bibliometrics Forum launched a practical course, Statistics for Responsible Bibliometrics, which aimed to offer bespoke training focused exclusively on statistics for bibliometrics. In this post, Dr Abigail McBirnie gives a summary of what happened on the day.

If the growth of the Lis-Bibliometrics forum is anything to go by, an increasing number of librarians and research administrators around the world are finding themselves supporting or working with bibliometric data. However, when we ran a survey of the community back in 2017, we found that only 29% of respondents had received any bibliometrics training as part of their Library and Information Science course. So how are such individuals expected to do bibliometrics responsibly?

Well, the Lis-Bibliometrics forum provides a safe space for individuals to ask honest questions, and The Bibliomagician provides a permanent archive of some practitioner-focused posts and resources. But one thing we all felt was lacking was a proper statistical foundation for the part-time bibliometric practitioner (for that is what most of us are). To do bibliometrics responsibly, we need some understanding of how bibliometric data behave statistically so that we can better interpret our bibliometric analyses.

Unfortunately, though, many of us have found typical ‘introduction to statistics’ courses not as helpful as we had hoped, primarily because our bibliometric data are so different from the ‘normal’ data covered in such courses.  But what if we could offer a bespoke training event that focused exclusively on statistics for bibliometrics?  Would that be more useful for practitioners in our community?  The Lis-Bibliometrics Committee really wanted to find out, so with a couple of us—Dr Abigail McBirnie and Dr Ian Rowlands—taking the lead on developing a training event, soon our pilot Statistics for Responsible Bibliometrics course was born!
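To illustrate the point about bibliometric data differing from ‘normal’ data, here is a minimal sketch in Python (the course itself used R, and this example is not taken from the course materials). The citation counts are invented purely for illustration; they simply mimic the skew often seen in citation data, where one highly cited paper pulls the mean well above the typical paper.

```python
import statistics

# Hypothetical citation counts for ten papers (invented for illustration only).
citations = [0, 0, 1, 1, 2, 3, 4, 6, 9, 124]

mean = statistics.mean(citations)      # pulled upwards by the single highly cited paper
median = statistics.median(citations)  # closer to the "typical" paper in the set

# How many papers sit below the mean? In symmetric data we would expect about half.
below_mean = sum(c < mean for c in citations)

print(f"mean = {mean:.1f}, median = {median:.1f}")
print(f"{below_mean} of {len(citations)} papers are cited less than the mean")
```

With these made-up numbers, nine of the ten papers fall below the mean, which is one reason an average on its own can mislead when applied to data like these.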

Very early on, we approached Professor Mike Thelwall (University of Wolverhampton), world-renowned metrics expert and member of the UK Forum for Responsible Research Metrics, to ask if he would be willing to support and be a part of our endeavour.  Mike not only generously offered his advice and insight during the course preparation stage, but also led several sessions at the event itself and provided us with a teaching space and IT facilities at Wolverhampton complete with refreshments on the day!  His commitment, expertise, and practical help proved absolutely invaluable to the project.

Fitting the course content into a day, while aiming the event at those with little or no prior statistical knowledge, proved very challenging.  As a starting point, a basic question we had to address was ‘what level of bibliometric knowledge do we expect potential attendees to bring?’  We wouldn’t have time to teach both bibliometrics and statistics from scratch, so we had to compromise by assuming a reasonable familiarity with bibliometrics amongst attendees.  That left us free to concentrate on the statistics bit.  In relation to that, should the teaching focus be theoretical or practical?  A difficult choice: either way, we would have to leave something out.  Because we wanted attendees to be able to go away and apply what they had learned, we opted for a more practical approach, including hands-on work in the open statistical environment R.

After several drafts and some very helpful discussions with Dr Elizabeth Gadd (Chair of the LIS-Bibliometrics Committee), we settled on a course outline for the one-day pilot event.  Now, as any of you who may have been a part of these things will know, events don’t just set up, organise and run themselves!  We were very grateful to the Association of Research Managers & Administrators (ARMA) for hosting our web advertisement and co-ordinating bookings for us, and to the staff at the University of Wolverhampton.  It was thanks in large part to their behind-the-scenes effort that things went smoothly on the day.

And a full-on day it was!  The event on 26th March was booked to capacity and attendees arrived early, prepared for an intense day.  Abigail and Mike took the morning sessions, introducing key statistical concepts, leading attendees through hands-on exercises in R, and offering a brief review of bibliometric indicators and their calculation.  After lunch, Mike looked in detail at practical approaches for quantifying uncertainty and ways of inferring difference.  Around these sessions, Ian wove the statistical concepts from the day into the context of responsible metrics, highlighting amusing and worrying (!) real-life examples in which these do (or, more often than not, don’t) play out as they should.
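For readers wondering what ‘quantifying uncertainty’ can look like in practice, here is a minimal sketch of one common approach, a percentile bootstrap confidence interval for a mean citation count. It is written in Python rather than R and is not taken from the course slides; the function name `bootstrap_ci`, its parameters, and the citation counts are all hypothetical and chosen only for illustration.

```python
import random
import statistics


def bootstrap_ci(values, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`.

    Hypothetical helper for illustration; not part of any library.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(values)
    # Resample with replacement, recording the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    # Read off the lower and upper percentiles of the resampled means.
    tail = (1 - level) / 2
    lower_idx = int(tail * n_resamples)
    upper_idx = min(int((1 - tail) * n_resamples), n_resamples - 1)
    return means[lower_idx], means[upper_idx]


# Hypothetical citation counts for ten papers (invented for illustration only).
citations = [0, 0, 1, 1, 2, 3, 4, 6, 9, 124]

low, high = bootstrap_ci(citations)
print(f"mean = {statistics.fmean(citations):.1f}, "
      f"95% bootstrap interval = ({low:.1f}, {high:.1f})")
```

With so few papers and one extreme value, the interval comes out very wide, which is the kind of caution a single reported average hides. This sketch makes no claim about which interval method the course recommended.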

With the content we covered and the speed at which we covered it, no doubt everyone was exhausted by the end of the day!  But when the dust settled, how did it all go?  A key aim of the pilot course was to gather feedback from attendees.  What worked, what didn’t, what should we change for next time?  To this end, Lizzie was ready with a survey…

The survey results suggested that, as we predicted, the majority of attendees had basic or no prior statistical knowledge.  More surprisingly, about half of the respondents also claimed only entry-level competency in bibliometrics, a point to note given our assumptions around attendees having ‘reasonable familiarity’ with bibliometrics.  Indeed, as some of the respondents indicated, the level at which the content was pitched and the speed at which it was delivered were a challenge at times.  Yet, despite some of the more technical sessions being hard to follow, attendees replied that overall the course was helpful and appropriate to their work.  Perhaps most importantly, an overwhelming majority said that attending would help them be more responsible in their use of metrics in future.

And what changes would attendees like to see next time around?   The length of the course was definitely a matter for debate, with some wanting a two-day event to allow for a slower pace but others noting that a one-day course was less demanding in terms of time commitment.  It was a similar situation with the use of R:  some attendees wanted even more hands-on work while others felt the software was too complicated.  Most attendees did seem to agree, though, that more opportunity to apply the statistical content by walking through real-life bibliometrics scenarios on the day would have been helpful.

Our thanks again to all those who completed the survey.  Very useful feedback and lots for us to think about as we look to round two!


Statistics for Responsible Bibliometrics presentation slides:

  1. Statistical Starting Points
  2. Hands on with R
  3. How key normalised indicators are calculated
  4. Quantifying Uncertainty: How accurate are our indicators?
  5. Using statistics to promote responsible metrics
  6. Inferring Difference: When are scores different enough to matter?



Dr Abigail McBirnie is a UK-based information specialist.  As an analyst, she has worked in higher education and research.  A current LIS-Bibliometrics committee member, she is interested in the use of statistics in bibliometrics and the implications of this for responsible metrics.


Unless it states otherwise, the content of the Bibliomagician is licensed under a Creative Commons Attribution 4.0 International License.
