Josh Nicholson writes about how to use scite, a platform for discovering and evaluating scientific articles via Smart Citations. Smart Citations allow users to see how a scientific paper has been cited by providing the context of the citation and a classification describing whether it provides supporting or contradicting evidence for the cited claim.
In December 2018, I published an op-ed entitled, “Solving the fake news problem in science.” I wasn’t talking about COVID-19, obviously, but about science in general and how we can tell which research is reliable. That question is especially pronounced today, with an avalanche of research on COVID-19 being published, much of it without peer review and much of it amplified directly to the public.
This problem has existed for years, but it’s always felt abstract or like an academic problem. Can’t reproduce a finding in cancer research? That’s bad, but it doesn’t directly impact me. Now, any scientific report on COVID-19 may influence what the entire world does. Each day we have conversations with our families and friends about new scientific findings, discussing the merits and shortcomings of studies as if we were tenured professors discussing the proposal of a Ph.D. student. Overnight we have all become armchair scientists.
It’s amazing that science is now at the forefront of everyone’s mind, but it can also have dangerous consequences. The President of the United States recently said, “HYDROXYCHLOROQUINE & AZITHROMYCIN, taken together, have a real chance to be one of the biggest game changers in the history of medicine.” Shortly thereafter, a couple ingested fish tank cleaner in an attempt to prevent COVID-19, because it contained a form of chloroquine; one of them died and the other was hospitalized.
Ivan Oransky and Adam Marcus, founders of Retraction Watch, a watchdog website for scientific fraud, have written that “much of the research that emerges in the coming weeks will turn out to be unreliable, even wrong. We’ll be OK if we remember that.”
But will we remember it? Will we be able to identify what turned out to be right and what turned out to be wrong? Can we even do that now?
This is something that I deal with daily as the co-founder and CEO of scite. To non-scientists and non-bibliometricians, scite is probably best described as Rotten Tomatoes, but for scientific articles. We use artificial intelligence so that anyone can see whether the claim made in a scientific paper has been supported or contradicted by subsequent research. To a bibliometrician, we are a citation index, and while we share many similarities with other citation indices, like Web of Science, Scopus, or Dimensions, we do things quite differently. Specifically, we show the citation context, or “citance,” surrounding each citation an individual research output receives, and we use deep learning to classify these citation statements by their rhetorical function. For example, does the citing article support or contradict the original article’s conclusion or recommendation? In other words, is the previous research confirmed or refuted by the new research? Our aim is to help advance citation indicators beyond quantitative data and metrics, to provide more qualitative and contextual data to the bibliometrics community, and to specifically offer useful guidance to non-scientists.
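To make the idea concrete for readers who like to tinker: scite’s actual classifier is a deep learning model trained on citation statements, but the underlying task can be sketched as a toy rule-based labeler. Everything below — the cue phrases, function name, and labels — is a hypothetical illustration of the task, not scite’s implementation.

```python
# Toy sketch of citance classification (NOT scite's deep learning model).
# The task: given the sentence surrounding a citation (the "citance"),
# assign a rhetorical label -- supporting, contradicting, or mentioning.

SUPPORTING_CUES = ("in accordance with", "consistent with", "confirms", "also found")
CONTRADICTING_CUES = ("in contrast to", "contradicts", "failed to replicate")

def classify_citance(citance: str) -> str:
    """Return a rough rhetorical label for a citation statement."""
    text = citance.lower()
    # Check contradiction cues first: they are rarer and more specific.
    if any(cue in text for cue in CONTRADICTING_CUES):
        return "contradicting"
    if any(cue in text for cue in SUPPORTING_CUES):
        return "supporting"
    # Default: the citation merely mentions the earlier work.
    return "mentioning"

print(classify_citance(
    "In accordance with earlier work, this study also found that "
    "the levels of IL-6 and IL-10 were associated with severity."
))  # supporting
```

A real system replaces the cue lists with a model that learns these patterns from labeled examples, which is what lets it generalize beyond fixed phrases.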
scite is not a perfect solution, and there is no perfect solution, but new tools such as scite, PubPeer, and Get The Research offer scientists and non-scientists a chance to consume, digest, and evaluate research outputs on their own; we argue this can be extremely helpful for assessing the veracity of research articles, including recent articles on COVID-19. For example, this article, posted on February 22, concludes: “Both IL-6 and IL-10 levels showed sustained increases in the severe group compared to the mild group,” suggesting that two specific signaling molecules are more prevalent in severe than in mild COVID-19 disease. Can we rely upon that finding? Have others tested it? Looking at the article alone, you can’t tell. With scite, you can see that only five days later another preprint appeared with supporting evidence for this claim. Its authors state, “In accordance with Wan S’s and Liu J’s study, this study also found that the levels of IL-6 and IL-10 were associated with the severity of COVID-19 pneumonia.”
To date, we have analyzed over 16 million scientific articles, including roughly 60,000 related to coronavirus, producing over 600 million classified citation statements like the quoted sentences above, and we are currently adding roughly 10 million more per day. However, to truly identify which research is reliable, we need access to every scientific article ever written. This is challenging given that most research is locked behind paywalls. Fortunately, leading academic publishers like Wiley, The British Medical Journal, Karger, Rockefeller University Press, and others have started to share their articles with scite. Some have even started to display scite information directly on their articles.
We’re excited about the possibilities of bibliometrics to help scientists and non-scientists understand science better and would like to invite researchers to use our data (for free) to perform their own studies.
Josh Nicholson is co-founder and CEO of scite (scite.ai). Previously, he was the founder and CEO of the Winnower and CEO of Authorea, two companies aimed at improving how scientists publish and collaborate. He holds a Ph.D. in cell biology from Virginia Tech, where his research focused on the effects of aneuploidy on chromosome segregation in cancer.
Unless it states otherwise, the content of the Bibliomagician is licensed under a Creative Commons Attribution 4.0 International License.