A review of the #AHSSmetrics conference

Stephanie Meece, Scholarly Communications Manager at University of the Arts London, summarised the Bibliometrics in the Arts, Humanities and Social Sciences conference at the end of the event. Here she offers her reflections on the event.

On Friday, 24 March 2017, I attended the meeting of the LIS-Bibliometrics group in London. Although I have been on the mailing list for this group for some time, I had never attended an event; I had feared it would be a day of maths and statistics, and was happy to find that this was not at all the case. I also wondered how anyone could spend a day talking about bibliometrics for AHSS when it’s very clear that much of AHSS scholarship is still print-only, and that the standard bibliometrics services index a relatively low proportion of AHSS journals. I found that there was indeed a lot to talk about, as these constraints have spurred originality and creative thinking in finding alternative ways to track the impact of research in non-STM fields.

By the end of the day, I had noticed that despite the wide variety of backgrounds among the speakers we heard, a few of the same ideas came up repeatedly, though expressed differently and shaped by each speaker’s own experiences and research interests.

The idea of ‘telling the story of our research’ (Jane Winters, Emily Rosamond) was attractive to many speakers. Researchers in arts and humanities can be suspicious of metrics, but when the same data are framed as narrative rather than metrics, they can be welcomed. Certainly most academics do like hearing the number of times their work has been downloaded. We are seeing the growth of an audit culture in academia – we should be critical of this, but also realistic: we are unlikely to see a reversal of this trend (Harzing). Academics aren’t the only people who feel anxious about having their reputations quantified (Emily Rosamond); ‘black box methodologies’ are being applied to us all, which is an understandable cause of anxiety. Tying these methodologies back to stories would perhaps ease some of the anxiety around metrics.

Also, again and again we saw the breadth of the methods to obtain relevant, interesting bibliometrics in arts and humanities, and the range and diversity of things we use metrics for.

Regarding methods, the importance of a pragmatic approach was stressed: universal informed peer review is impossible (Harzing).  An ideal peer review, with informed, dedicated, appropriate experts, will always be better than metrics, especially if metrics are thought of in a reductionist way. But an inclusive version of metrics is better than the likely reality of peer review: hurried, done by semi-experts, and potentially biased (Harzing).

The language of metrics can be off-putting to scholars of arts and humanities. Should we even use the word ‘metrics’, which implies quantification and a pretence of objectivity that arts and humanities scholars are sceptical of? Perhaps other words like ‘proxies’ (Sumi David), reputation, influence, and the diffusion of ideas make more sense in arts and humanities. In any case, I believe we created a neologism, ‘metrifying’: there was mention of ‘metrifying personality’ and ‘metrifying reputation’.

By the end of the day, several participants said they’d found inspiration that would be tangibly useful in their day-to-day work, ideas for a new project, or a new direction for their research. I think that’s the best indication that the day was a success.


SciVal’s Field weighted citation impact: Sample size matters!

SciVal’s field-weighted citation impact (FWCI) is an article-level metric that takes the form of a simple ratio: actual citations to a given output divided by the expected rate for outputs of similar age, subject and publication type.  FWCI has the dual merits of simplicity and ease of interpretation: a value of 2 indicates that an output has achieved twice the expected impact relative to the world literature.  It is a really useful addition to the benchmarking toolkit.

The trouble is that, typically, the distribution of citations to outputs is highly skewed, with most outputs achieving minimal impact at one end and a small number of extreme statistical outliers at the other.  Applying the arithmetic mean to data distributed like this, as does FWCI, is not ideal because the outliers can exert a strong leveraging effect, “inflating” the average for the whole set.  This effect is likely to be more marked the smaller the sample size.

I explored this effect in a simple experiment.  I downloaded SciVal FWCI values for 52,118 King’s College London papers published up until 2014.  I then calculated mean FWCI and 95% confidence (or stability) intervals for the whole sample using the bootstrapping[1] feature in SPSS.  Then I took progressively smaller random samples (99%, 98%, and so on to 1%, then 0.1%), recalculating mean FWCI and stability intervals each time.
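For readers without SPSS, the logic of the experiment can be sketched in Python. This is only an illustration: it uses synthetic lognormal values as a stand-in for the real FWCI data (the distribution parameters are assumptions chosen to give a heavy right tail), and numpy in place of SPSS’s bootstrapping feature, but it shows the key point: the bootstrap stability interval around the mean widens sharply as the sample shrinks.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for 52,118 FWCI values: a heavy-tailed lognormal
# distribution (illustrative parameters, not fitted to the real data).
population = rng.lognormal(mean=0.0, sigma=1.2, size=52_118)

def bootstrap_ci(values, n_boot=500, alpha=0.05):
    """Percentile bootstrap 95% confidence interval for the mean."""
    means = [rng.choice(values, size=len(values), replace=True).mean()
             for _ in range(n_boot)]
    lo, hi = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return lo, hi

# Recompute the interval for progressively smaller random samples:
# the interval width grows roughly as 1 / sqrt(sample size).
for frac in (1.0, 0.10, 0.01):
    sample = rng.choice(population, size=int(len(population) * frac),
                        replace=False)
    lo, hi = bootstrap_ci(sample)
    print(f"{frac:>4.0%}: n = {len(sample):>6}, CI width = {hi - lo:.3f}")
```

With the full sample the interval is narrow; at a 1% subsample it is roughly ten times wider, mirroring the instability described above.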

The findings show how mean FWCI becomes less stable as sample size decreases.  Highly cited outliers are relatively uncommon, but their chance inclusion or exclusion makes a big difference, especially as the number of outputs decreases.  In this experiment, FWCI values ranged across four orders of magnitude, from 0.03 to 398.28.

[Figure: mean FWCI with 95% stability intervals at progressively smaller sample sizes]

What does this mean for interpreting FWCI, especially when benchmarking? The table below offers some guidance.  It shows typical stability intervals around FWCI at different scales.  The final column assumes that SciVal spits out a value of 2.20 and shows how that figure should be interpreted in terms of its stability.

[Table: typical stability intervals around FWCI at different sample sizes, with the interpretation of a reported FWCI of 2.20]

It’s pretty clear from this analysis that you need to know when to stop when you are drilling down in SciVal!  Another implication is that there is no sensible justification for quoting FWCI to two, let alone three, decimal places of precision.  I’ve kept the second decimal place above simply for purposes of demonstration.

I am well aware that the guidance above is based on data from just one institution, and may not travel well. If you would like to replicate this experiment using your own data, I’m happy to share my SPSS Syntax file.  It automates the whole thing, so you just have to load and go off on a short holiday! Just drop me an email.

Ian Rowlands is a Research Information & Intelligence Specialist at King’s College London and a member of the LIS-Bibliometrics committee.


[1] Bootstrapping entry in Wikipedia: https://en.wikipedia.org/wiki/Bootstrapping_(statistics)


Measuring the magnificence of monographs

At Loughborough University we have recently been thinking about how we can use bibliometrics responsibly.  Not surprisingly our conversations tended to focus on journal and conference papers where the majority of citation databases focus.  However, as part of this process the question inevitably arose as to whether there were also ways we could measure the quality or visibility of non-journal outputs, in particular, monographs.  To this end we thought we should explore this question with senior staff in Art, English, Drama, History and Social Sciences.

During our conversation, the Associate Dean for Research in the School of Art, English and Drama said he felt that assessing the arts through numbers was like asking engineers to describe their work through dance!  Whilst he was not averse to exploring the value of publication indicators, I think this serves to highlight how alien such numbers are perceived to be by those working in creative fields.

So what do we know about monographs?  Well, we know that the monograph as a format is not going away, although some publishers are now also offering ‘short form’ monographs (between a book and journal article in size).  We also know that monographs are not covered by the commercial citation benchmarking tools which many of us rely on for analyses.  They are of course covered by Google Scholar – but there are known concerns with the robustness of Google data, and there is no way of easily benchmarking it.  However, the biggest problem with citation analysis in the field of monographs is not the lack of coverage in the benchmarking tools, but what a citation actually means in these fields.  In English & Drama for example, citations are often used to refute previous work in an effort to promote new ideas (“So-and-so thinks X but I have new evidence to think Y”). So the question remains: is it possible to measure the quality and impact of monographs in a different way?

Well, as part of our conversation we explored some alternatives which I’ll briefly run through here.

The Publisher

The most obvious choice of indicator is who actually published the work.  And we know that the Danish research evaluation system allocates publishers to tiers and those books published with top tier publishers are weighted more heavily than books published with lower tier publishers.  Whilst academics at Loughborough were not minded to formalise such a system internally, it was clear that they do make quality judgements based on publishers with comments such as: “if a university DIDN’T have at least a handful of monographs published by the Big 6, that would be a concern.”  So quality is assumed because the process of getting a contract with a top tier publisher is competitive, and the standard of peer review is very high. A bit like highly cited journals…

Book Reviews

Book reviews could serve to indicate quality–not only in terms of the content of those reviews, and how many there are, but where the reviews are published.  However, whilst there are book review indices, reviews can take a long time to come out.  Also, in the Arts & Humanities, it’s unusual for a book to get a negative review because the discipline areas are small and collegiate.  Essentially, if a book gets a review it means something, but if it doesn’t get reviewed, it doesn’t necessarily mean anything.  Just like citations…

Book sales

High book sales could be seen as an indicator of quality and the beauty of sales is that they are numerical indicators which bibliometricians like!  However, there is no publicly available source for book sales (that I’m aware of). Also, sales can be affected by market size. Thus books sold in the US will often outnumber those sold in the UK – an effect of population size.  Sales are also affected by the print run – i.e., whether the book comes out as a short-print-run hardback aimed at libraries, or a large-print-run paperback aimed at undergrads.  The former might be little sold but widely read; the latter might be widely sold, but never read!  So sales might be more an indicator of popularity than quality.  But the same could be said of citations….


Altmetrics

Many altmetric offerings cover books and provide a wide range of indicators.  One of particular relevance is the course syllabi on which books are listed – although this is probably more likely to favour textbooks than research monographs.  It is also possible to see the number of book reviews on such tools, as well as other social media and news mentions.  However, altmetric providers have never claimed to measure quality, but rather attention, visibility and possibly impact.  But, at the risk of repeating myself, the same could be said of citations…

The problem for us at Loughborough was that none of these indicators met our criteria for a usable indicator, which we defined as:

  • Normalisable – can we normalise for disciplinary differences (at least)?
  • Benchmarkable – is comprehensive data available at different entity levels (individual, university, discipline, etc.) to compare performance?
  • Obtainable – is it relatively simple for evaluators to get hold of the data, and for individuals to verify it?

So to summarise, whilst there are legitimate objections to the use of non-citation indicators to measure the magnificence of monographs, most of those objections could also apply to citations.  The key difference is that we do have normalisable, benchmarkable and accessible indicators for journal and conference papers: we don’t yet for books.  At Loughborough we concluded that measuring the magnificence of monographs can currently only be done reliably through peer review.  However, evidence of the sort presented here can be used to tell good stories about the quality and visibility of books at individual author and output level. And these stories can legitimately be told in some of the same places (job applications, funding bids, etc.) where we’d normally see citation stories.  Whether colleagues in the Arts, Humanities and Social Sciences will ever feel comfortable doing so is another question.


Elizabeth Gadd is the Research Policy Manager (Publications) at Loughborough University.  She has a background in Libraries and Scholarly Communication research.  She is the co-founder of the Lis-Bibliometrics Forum and is the ARMA Metrics Special Interest Group Champion.

Outputs from Bibliometrics in Arts, Humanities and Social Sciences conference

Here are the links to presentations given at the recent #AHSSmetrics conference at the University of Westminster, 24 March 2017. Many thanks to all the presenters, and to the participants, for a stimulating day. For those who missed the event, Karen Rowlett has helpfully created a Storify of the tweets at https://storify.com/karenanya/bibliometrics-for-the-arts-and-humanities.

10.00 Welcome – Martin Doherty – Head of Department, Dept of History, Sociology & Criminology, University of Westminster

10.10 Opening Panel: How appropriate is bibliometrics for Arts, Humanities and Social Sciences? (Chaired by Katie Evans, University of Bath) – Peter Darroch (Plum Analytics), Professor Jane Winters (School of Advanced Study and Senate House Library), Stephen Grace (London South Bank University)

10.40 Citation metrics across disciplines – Google Scholar, Scopus and the Web of Science: A cross-disciplinary comparison – Anne-Wil Harzing (Middlesex University)

11.20 Tea & Coffee

11.50 Impacts of reputation metrics and contemporary art practices – Emily Rosamond (Arts University Bournemouth)

12.20 Bibliometrics as a research tool: The international rise of Jürgen Habermas – Christian Morgner (University of Leicester) NB presentation in person only

1.00 Lunch (Kindly sponsored by Plum Analytics)

1.45 Workshop: Practice with PoP: How to use Publish or Perish effectively? (laptop with PoP software installed needed) – Anne-Wil Harzing

2.45 A funder’s perspective: bibliometrics and the arts and humanities – Sumi David (AHRC)

3.15 Bibliometric Competencies – Sabrina Petersohn (University of Wuppertal)

3.45 Tea & Coffee

4.00 Lightning talks:

4.30 Round Up by Stephanie Meece (University of the Arts London)

Job Opportunity: University of Sheffield is looking for a Library Scholarly Communications Manager!

Job Reference Number: UOS015654
Job Title: Library Scholarly Communications Manager

Salary: Grade 8, £39,324–£46,924 per annum, with potential to progress to £52,793 through sustained exceptional contribution

Closing Date: 31st March 2017

Copyright advisory and advocacy services form a critical component of the infrastructure necessary in advancing teaching and learning in the digital age. Specialist educative copyright and licensing services also benefit research within the context of more open publishing of scholarly communications.

Reporting to the Associate Director for Academic & Digital Services, you will develop and implement services and programmes that build an understanding of copyright and licensing within the scholarly communications landscape and publishing, across the university community. You will ensure compliance with the copyright legislation, university policy and licenses and develop a shared institutional understanding of both the opportunities and challenges associated with this field.

The post-holder will provide to the university community (Faculties and Professional Services) legally compliant, detailed interpretation and policy advice on copyright. You will actively coordinate advisory services, making available current and reliable information on the web and bringing together specialists in the areas of broadcast media, newspapers, music and other formats particularly where the university has agreed license schemes. Operationally, you will oversee the necessary information management processes and audit requirements.

You will be the key contact with the University’s Legal Panel Agreement on matters pertaining to the Copyright, Designs and Patents Act 1988 and subsequent statutory instruments, including the 2014 exceptions. You will engage with external bodies including the Copyright Licensing Agency, the UK Government Intellectual Property Office and other licence-issuing bodies.  Professionally, you will establish effective external networks concerned with copyright and intellectual property.

Educated to degree level (or with equivalent work experience), you will be able to think strategically as well as deliver operationally.  You will be a confident communicator, able to identify opportunities to innovate and change within the evolving regulatory framework. You will enjoy working with groups and individuals, including academic staff, researchers and students, as well as networking beyond the University.

Please see the Job Description & Person Specification for further details and apply using the online application form.

HEFCE: The road to the Responsible Research Metrics Forum – Guest post by Ben Johnson

On Wednesday 8th February 2017, Imperial College made headlines by announcing that it had signed the San Francisco Declaration on Research Assessment, meaning that Imperial will no longer consider journal-based metrics, such as journal impact factors, in decisions on the hiring and promotion of academic staff. The decision followed a long campaign by Stephen Curry, a professor of structural biology and a long-standing advocate of the responsible use of metrics.

At the end of last year, Loughborough University issued a statement on the responsible use of metrics in research assessment, building on the Leiden Manifesto.  This was followed two weeks ago by a statement on principles of research assessment and management from the University of Bath, building on the concept of responsible use of quantitative indicators. And, earlier in 2016, the Stern review of the Research Excellence Framework recognised clearly that “it is not currently feasible to assess research outputs in the REF using quantitative indicators alone”.

What these examples and others show is that the issue of metrics – in particular ‘responsible metrics’ – has risen up the agenda for many universities. As one of those closely involved in the HEFCE review of metrics (The Metric Tide), and secretary to the new UK Forum for Responsible Research Metrics, this of course is great to see.

Of course, the issue of metrics has been bubbling away for much longer than that, as the Metric Tide report set out. University administrators, librarians and academics themselves have taken a leading role in promoting the proper use of metrics, with forums like the ARMA metrics special interest group promising to play a key part in challenging attitudes and changing behaviours.

In addition, as we have seen with university responses to the government’s HE green paper and to the Stern review, the wider community is very alive to the risks of an over reliance on metrics. This was reflected in the outcomes of both exercises, with peer review given serious endorsement in both the draft legislation and the Stern report as being the gold standard for the assessment of research.

These developments are exactly the kinds of things that the new UK Forum for Responsible Research Metrics wants to see happening. This forum has been set up with the specific remit to advance the agenda of responsible metrics in UK research, but it’s clear that this is not something it can deliver alone – it is a substantial collective effort.

So what will the Forum do? Well, as the Metric Tide report states, many of the issues relate to metrics infrastructure, particularly around standards, openness and interoperability. The Forum will have a specific role in helping to address longstanding issues, particularly around the adoption of identifiers – an area of focus echoed by the Science Europe position statement on research information systems published at the end of 2016, which is itself a useful touchstone for thinking about these issues.

To support the Forum, Jisc is working hard on developing an action plan to address the specific recommendations of the Metric Tide report, with a particular focus on building effective links with other groups working in this area, e.g. the RCUK/Jisc-led Research Information Management (RIM) Coordination Group. This will be discussed when the Forum meets again in early May.

However, sorting out the ‘plumbing’ that underpins metrics is no good if people continue to misuse them. The Forum will therefore take a complementary look at the cultures and behaviours within institutions and elsewhere: firstly, to develop more granular evidence of how metrics are being used, and secondly, to look at making specific interventions to support greater responsibility among academics, administrators, funders, publishers and others involved in research.

With that in mind, Universities UK and the UCL Bibliometrics Group, under the auspices of the Forum, will shortly issue a joint survey of HEIs on the use of problematic metrics in university management and among academic groups. This will help to identify the scale of any (mis)use of measures such as the JIF, and to help us understand better why initiatives like DORA have not been more widely adopted in the UK.

Of course, metrics have much broader uses than just measuring outputs – they are also used to measure people, groups and institutions. This is a key finding of the Metric Tide report, but one that is often overlooked when focussing very narrowly on output metrics. The forum will also be focussing on this, seeking to bring people together across all domains.

To make a decisive contribution here, the Forum needs to have clout, and it is for this reason that the five partners (HEFCE, RCUK, Jisc, Wellcome and Universities UK) asked Professor David Price to convene and chair the Forum as a mixed group of metrics experts and people in positions of serious influence in their communities. This was a delicate balance to strike, and one that can only be successful if the Forum engages effectively with the various interested communities.

With that in mind, the Forum is planning to set up a number of ‘town hall’ meetings throughout 2017 to engage with specific communities on particular topics, and would very much welcome hearing from anyone interested in being involved in these or in engaging with the Forum in any other way. We will be announcing further details of these on the Forum’s web pages soon.

If you are interested in joining up with the work of the Forum throughout 2017, please contact me on b.johnson@hefce.ac.uk – I’d be delighted to hear from you.

Ben Johnson is a research policy adviser at the Higher Education Funding Council for England, is secretary to the UK Forum for Responsible Research Metrics and a member of the G7 expert group on open science.

He has responsibility for policy on open access, open data, research metrics, technical infrastructure and research sector efficiency within universities in England. In recent years, he co-authored The Metric Tide (a report on research metrics), developed and implemented a policy for open access in the UK Research Excellence Framework (REF), and supported Professor Geoffrey Crossick’s project and report to HEFCE on monographs and open access. He is a member of the UK’s open data forum and co-authored the forthcoming UK Open Research Data Concordat. In addition to this, he is currently part-seconded to the Department of Business, Energy and Industrial Strategy to work on reforming the research and innovation landscape.

REF consultation: Lis-Bibliometrics response

The four UK higher education funding bodies are consulting on proposals for the next Research Excellence Framework.  Thank you to all Lis-Bibliometrics members who have contributed their thoughts on this.  Here is a draft response the Lis-Bibliometrics Committee intends to submit on behalf of the group.  If you have any last minute comments please contact me or share via the list as soon as possible.  We’ve decided to respond only to consultation question 18:

Q.18 Do you agree with the proposal for using quantitative data to inform the assessment of outputs, where considered appropriate for the discipline? If you agree, have you any suggestions for data that could be provided to the panels at output and aggregate level?

We agree that quantitative data can support the assessment of outputs where considered appropriate by the discipline.  Any use of quantitative data should follow the principles for responsible use of metrics set out in the Metric Tide and the Leiden Manifesto.

  • Disciplinary difference, including citation patterns varying by output type, must be taken into account.
  • Data should only be used if it offers a high standard of coverage, quality and transparency. Providing data from a range of sources (e.g. Scopus, Web of Science, Google Scholar) would allow the panel to benefit from the strengths of each source whilst highlighting the limitations.
  • Known biases reflected by bibliometric indicators (e.g. around interdisciplinary research and gender) should be taken into account.
  • A range of data should be provided to avoid incentivizing undesirable side effects or gaming by focusing attention on a single indicator.
  • Given the skewed distribution of citations, and the ‘lumpiness’ of citations for recent publications in particular, we recommend measures of uncertainty be provided alongside any citation data. At the very least, false precision should be avoided.
  • In addition to citation indicators, panels should take into account the number of authors of the output.

Panels should receive training on understanding and interpreting the data and be supported by an expert bibliometric advisor.

We do not consider the field-weighted citation impact indicator appropriate for the assessment of individual outputs: as an indicator based on the arithmetic mean, it is too heavily skewed by small numbers of ‘unexpected’ citations.  Furthermore, its four-year citation window would not capture the full citation impact of outputs from early in the REF period.  The use of field-weighted citation percentiles (i.e. the percentile n such that the output is among the top n% most cited outputs worldwide for its subject area and year of publication) or percentile bands (as used in REF2014) is preferable.  Percentile-based indicators are more stable and easier to understand, as the “performance” of papers is scaled from 1 to 100, but they can be skewed by large numbers of uncited items.

Output level citation indicators are less useful for recent outputs.   Consequently, it might be tempting to look at journal indicators.  This temptation should be resisted!  Given the wide distribution of citations to outputs within a journal, and issues of unscrupulous ‘gaming’, journal metrics are a poor proxy for individual output quality.  Furthermore, use of journal metrics would incentivize the pursuit of a few ‘high impact’ journals to the detriment of timely, diverse and sustainable scholarly communications.

Use of aggregate level data raises the question of whether the analysis is performed only on the submitted outputs, or on the entire output from the institution during the census period. The latter would provide a more accurate picture of the institution’s performance within the discipline, but automatically mapping outputs to REF units of assessment is extremely challenging.  Furthermore it would be hard to disaggregate those papers written by staff who are not eligible for submission to REF.

Katie Evans, on behalf of the Lis-Bibliometrics Committee

Note: This replaces an earlier draft REF consultation response posted on 1st March 2017.