Lizzie Gadd encourages us to clarify what, why and how we are measuring before launching into any conversation about responsible metrics.
I love the poem, “The Blind Men and the Elephant” by John Godfrey Saxe. For those who’ve not read it, six blind men are confronted with an elephant and each grabs a part of it (trunk, leg, tail) and proceeds to define the elephant by the part they have (“it’s like a snake…a tree…a rope”). I keep coming back to this story when I reflect on some of our responsible metrics debates because all too often we seem to be hindered by the lack of a clear definition as to what ‘part of the elephant’ we are describing. Without this we can end up talking at cross-purposes, and get tangled up about when and where metrics might be appropriate. Let me give a couple of examples.
I was making the point at a conference that the availability of opportunities to engage with open research wasn’t yet at a place where we could use openness indicators to compare and assess individual job candidates. However, a discombobulated delegate feared I was throwing the baby out with the bath water, for openness indicators were an essential way of understanding the world of open research. He was right. But so was I. Just because the use of an indicator is inappropriate at the level of the individual, and for as important a context as comparing job candidates, does not mean that at a higher level of aggregation (assessing how universities engage with open research over time) it can’t be used.
Again, after another conference I was having a conversation with a publisher who was decrying the use of journal metrics and thought they should all be banned. I encouraged him to rewind on that statement because there are things that journal metrics can tell us. What’s at fault is not the existence of the metric, but what it is being used as a proxy for. A journal citation metric arguably tells us nothing about the quality of the journal; it tells us something about the citedness of the articles published by that journal.
“What’s at fault is not the existence of the metric, but what it is being used as a proxy for.”
I have been pondering all this and it seems to me that there are three main factors that affect whether and where indicators might be appropriate: 1) the size of the entity being measured, 2) the reason the entity is being measured and 3) whether the indicator is a legitimate proxy for the quality being measured. I’m not saying that there aren’t also other considerations when running an analysis, such as the age and discipline of the entities under scrutiny, just that these are secondary considerations after we’ve established what we’re measuring, why we’re measuring it and how we’re measuring it.
It occurs to me that for #1 (what) and #2 (why) some categories might aid our conversations. Of course in terms of size, our categories are quite straightforward: the entities might be individuals, groups, institutions or countries. (Journals might be another.) In terms of purposes, to my mind there are six main reasons for using metrics and I’ve outlined these below.
- Measure to understand. “Science of science” activities that study publication patterns and trends for the sole purpose of understanding them better.
- Measure to show off. “Pick me!” activities. The use of metrics to market an individual, group or university on promotional materials or grant applications.
- Measure to monitor. Plotting progress against an objective whether internally or externally set. This may include some comparison activity as outlined below.
- Measure to compare. The use of indicators to compare one entity with another. University rankings are an example of this.
- Measure to incentivise. The use of indicators to incentivise certain behaviours. Now I’m aware that once you start to measure anything it can act as an incentive to engage in the activity being measured. However, I’ve included this as a separate category because, as a backlash against rewarding unhelpful behaviours (e.g., cash awards for publishing in high-impact journals), we’ve observed a rise in the use of measurement solely as a means to incentivise. The measurement of open access content submitted to REF is an example of this.
- Measure to reward. Any activity that results in some kind of reward for the entity being measured, be this a job, promotion, grant, prize or award of any description.
Of course, no matter what combination of entity size and measurement purpose you are using, our third criterion (how) always needs to be considered, namely, does your measure measure what it claims to measure?
As I was contemplating these categories (what and why), it occurred to me that there is some kind of logical progression around the level of caution we need to apply when using metrics in these different contexts. (And by ‘caution’, I mean the level of care we need to take with our analysis, and the weight we put on the resulting outcome.) I had fun plotting these onto a chart (see fig 1) and RAG-rating (red-amber-green) them accordingly.
So for example, when using indicators to understand the publication activity of different countries over time, you might use any number of indicators to do so. The entities are very large and there will be no real impact on the countries under examination as a result of the measurement activity. However, when you start to examine smaller entities (individuals and groups) for these low-impact activities, or larger entities for medium impact activities (monitoring or comparison) you need to start being a bit careful about the indicators you use and how you interpret the outcomes. And when using any indicator for purposes that have rewards attached – especially when the entity is small – you should use metrics with extreme care, certainly in conjunction with expert judgement, and ensure you keep sense-checking whether your indicator is an adequate proxy for the quality you seek to measure.
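For readers who think in code, the caution heuristic described above can be sketched as a simple lookup. This is purely my illustration, not the actual chart in fig 1: the entity sizes, purpose ordering and RAG ratings below are assumptions drawn from the examples in the text.

```python
# Illustrative sketch only: a hypothetical encoding of the caution heuristic.
# The exact cell ratings in fig 1 may differ; these follow the examples given.

ENTITY_SIZES = ["individual", "group", "institution", "country"]  # small -> large
PURPOSES = ["understand", "show off", "monitor",
            "compare", "incentivise", "reward"]  # low -> high impact

def caution_rating(entity_size: str, purpose: str) -> str:
    """Return an assumed RAG (red/amber/green) caution level for a use of metrics."""
    size_rank = ENTITY_SIZES.index(entity_size)   # 0 = smallest entity
    impact_rank = PURPOSES.index(purpose)         # 0 = lowest-impact purpose
    # Reward-type uses, or high-impact uses on small entities, demand extreme care.
    if purpose == "reward" or (impact_rank >= 4 and size_rank <= 1):
        return "red"
    # Large entities measured for low-impact purposes need little caution.
    if size_rank >= 2 and impact_rank <= 1:
        return "green"
    # Everything in between warrants careful indicator choice and interpretation.
    return "amber"

print(caution_rating("country", "understand"))   # green
print(caution_rating("individual", "reward"))    # red
```

The design choice here is deliberate: caution is a function of both who is being measured and why, which is exactly the point of the chart.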
“When using any indicator for purposes that have rewards attached – especially when the entity is small – you should use metrics with extreme care.”
I’m aware that this is not a scientific approach to classifying our use of metrics, nor understanding the associated ‘risks’. And I’m definitely open to discussion as to whether these classifications are the right ones. However, I hope that for the bibliometric practitioner this chart might act as a useful heuristic as to when they might feel able to leave colleagues to their own measuring devices and when they might need to wade into the conversation. I also hope that these three questions (the what, why and how) might provide a useful sense-check in our conversations about metrics: to remind us that when we talk about responsible metrics, we might be talking about any number of “elephant parts”, and what’s true for one might not necessarily be true for another.
Elizabeth Gadd is the Research Policy Manager (Publications) at Loughborough University. She is the chair of the Lis-Bibliometrics Forum and co-Champions the ARMA Research Evaluation Special Interest Group. She also chairs the INORMS International Research Evaluation Working Group.
Unless it states otherwise, the content of the Bibliomagician is licensed under a Creative Commons Attribution 4.0 International License.