Lizzie Gadd spent some time with the Research Intelligence Network of the Netherlands, and explains why she’s a little bit in love with the Dutch…
A couple of weeks ago I was lucky enough to spend a few days in Amsterdam talking to research evaluation colleagues from across the Netherlands. It was the second meeting of their newly-formed Research Intelligence Network of the Netherlands (RINN), organised and chaired by Gert Goris of Erasmus University Rotterdam. I must admit that I’ve always been a little bit in love with the Dutch: with their liberal ways, their tolerance of diversity and their straightforward approach to seemingly everything. So I was delighted to see these qualities evident in their approach to research evaluation too – both at national level and at university level. Being confronted by a different world view is always a learning experience. So here, for your delectation and my personal record, are a few things that I learned about research evaluation from the Dutch.
1) Research done in a group should be evaluated in a group
We all know that research (in most disciplines) tends to be done in teams. However, research evaluation is often done on individuals, or in the case of the UK Research Excellence Framework (REF), on very large disciplinary groups where sub-disciplinary nuances can be lost. One of the most regular complaints I hear about the REF is that “there are no experts on the panel in my area, so how can the REF call itself an expert peer review exercise?” In this context I was delighted to learn about the Netherlands’ national research evaluation scheme, the ‘Standard Evaluation Protocol’ (SEP). All Dutch universities are tasked with running their own research evaluations once every six years, to their own timetable, according to a standard protocol involving institutional submissions reviewed by an external, international, expert panel. There is so much to like about this system, but one of the best things, in my view, is that the evaluation is done at the level of the research group, unit or team. For me this solves so many problems created by the REF and its demand that all academics submit a mean number of papers.
• No more internal tensions as co-authors arm-wrestle over which paper ‘they’ are going to submit
• No internal competition around the number of papers selected for submission by each academic. (Are you a 1-paper plebeian or a 5-paper superstar?)
• No ‘shame’ for early career academics whose work may not yet be mature enough to be rated as 4-star.
• No pressure on all academics to be paper writers.
OK, so most academics will have a hand in paper-writing, but the world might be better served if some academics did more science and less writing. Or more public engagement, or enterprise. The REF as it stands demands that everyone is a producer of publications. The Dutch system lets each research group decide who its writers are, who its entrepreneurs are and who its team leaders are, and it pays them to let people play to their strengths, as it is the group that is judged, not the individuals. We have a known, significant mental health problem in Higher Education. I can’t help thinking that a team-based approach to evaluation would go a long way to alleviating the stress currently placed on individuals to produce a REF-submittable portfolio of outputs. But of course, the main benefit is my first point: research is done in teams and therefore it should be evaluated as such. Responsible evaluation is just sensible evaluation.
2) Responsible research evaluation should look forward as well as back
Another glorious thing about the Dutch SEP is that groups are evaluated not only on past performance but on future potential. Indeed, what they call ‘viability’ is one of three pillars of the whole assessment (research quality and societal relevance being the other two). So in addition to providing the usual publication lists (along with impact evidence/indicators of their own choosing), PhD completions, grant income, and so on, each group is asked to provide a narrative around their future plans. This brought to mind the recent DORA webinar in which Sandra Schmid spoke about her novel recruitment approach at UT Southwestern. Fed up with publication metrics being used as a proxy for good candidates, Schmid instituted a new system in which Skype interviews were held with all candidates, who were asked just two questions:
i) Where will your research programme be in 5 years?
ii) How will UT Southwestern help you get there?
Because science is often done in teams, it can be hard to untangle an individual’s contribution from past efforts. And those at an early stage of their career won’t have had much time to prove themselves yet. But you can get a good sense of an individual or group’s passion and potential from their future plans.
3) Responsible research evaluation should reward societal relevance, not just impact
I could continue waxing lyrical about the SEP but instead I’ll highlight just one more ‘like’, and that is its focus on societal relevance rather than impact. Thed van Leeuwen and Ingeborg Meijer (CWTS Leiden) made an important point in this regard, namely, that extraordinary “world-leading” impact (as rewarded by the REF) is the exception, not the norm, and usually serendipitous. What is wrong with “normal” everyday impact? What is wrong with work that is “just” societally relevant? Do we actually impede progress by creating a culture in which only ‘extraordinary’ levels of impact are rewarded? A point to ponder, I think.
4) Responsible is as responsible does
Being an aficionado of responsible metrics statements, I was very keen to get a sense of whether and how the Dutch were engaging with these things. It was probably the first thing I asked the initial trickle of unsuspecting colleagues around the early morning coffee tables. “So, does your institution do responsible metrics?”. “Well, yes of course”. “So which statement has it signed?”. “We haven’t.” I checked the DORA website and the Association of Netherlands Universities (VSNU) is a signatory, but it doesn’t look like any individual Dutch institutions have signed (happy to be proved wrong on this). But it was clear from their presentations that responsible metrics were very much a consideration in everything they did. So it turns out you can do responsible metrics without signing up to a statement. Who’d have thought? I am still a believer in statements, as I think they provide a means by which institutions can be held accountable, both to their own employees and externally. But it was a reminder to me that ‘responsible is, as responsible does’. If you’re quietly getting on with responsible metrics within a responsible university culture, maybe a statement that shouts it from the rooftops is not necessary. And of course, having a statement might project a positive message to the world whilst masking some unsavoury research evaluation practices beneath.
5) If peer review is so great, why not peer review your metrics?
One of the genius ideas the RINN had was to provide an opportunity for individuals to bring a research evaluation project or initiative that they were working on and put it out there for peer review. Within the safe confines of a small group of colleagues, research intelligence types would present their work for 10 minutes, and then ask for responses to a particular question or challenge they were facing. Their peers would then offer their suggestions and comments. Brilliant! What a great way to learn from each other and take a shared approach to doing metrics responsibly and well. Definitely something I’d like to see happening in the UK.
6) The more indicators we have, the more nuance we can bring to our evaluations
A presentation by Wilfred Mijnhardt (Erasmus University Rotterdam) on the development of Dimensions, including the somewhat controversial RCR and FCR indicators, led to a very sensible question about whether the range of metrics was getting out of hand. Are there too many indicators on the market? Does this make things overly complicated for researchers? This was a timely question considering the recent publication of a preprint by Lutz Bornmann and Werner Marx at the Max Planck Institute, calling for a set of standardised field-normalised bibliometric indicators. Would this be a good thing? Others liked the variety. They suggested that the increasing range of products and indicators actually provided opportunity for greater nuance in bibliometric analysis (more tools in the toolbox). Indeed, in a later presentation, Thed van Leeuwen (CWTS Leiden) made the point that responsible metrics did not necessitate a universal standard, but should be responsive and context-specific. And we all know that universal “standards” such as the JIF and the h-index have got us into trouble in the past…
So. Lucky me, getting to spend two fruitful days discussing all things research evaluation with a bunch of dedicated, friendly professionals in the beautiful city of Amsterdam. If they learned anything from me, I reaped two-fold what I sowed. I hope that if and when other national research evaluation networks arise we will all be able to join forces on matters of global interest and continue to share experiences and expertise. At the risk of sounding like a beauty pageant contestant, I’m convinced that this is the route to making the world of research evaluation a better place.
Elizabeth Gadd is the Research Policy Manager (Publications) at Loughborough University. She is the founding chair of the Lis-Bibliometrics Forum and co-Champion of the ARMA Metrics Special Interest Group. She has a background in academic libraries and scholarly communication research.
Unless it states otherwise, the content of The Bibliomagician is licensed under a Creative Commons Attribution 4.0 International License.