scite: more than just a citation index

Josh Nicholson writes about how to use scite, a platform for discovering and evaluating scientific articles via Smart Citations. Smart Citations allow users to see how a scientific paper has been cited by providing the context of the citation and a classification describing whether it provides supporting or contradicting evidence for the cited claim.

In December 2018, I published an op-ed entitled “Solving the fake news problem in science.” I wasn’t talking about COVID-19, obviously, but about science in general and how we can tell which research is reliable. That question is especially pronounced today, with an avalanche of research on COVID-19 being published, most of it without peer review and much of it amplified directly to the public.

This problem has existed for years, but it’s always felt abstract or like an academic problem. Can’t reproduce a finding in cancer research? That’s bad, but it doesn’t directly impact me. Now, any scientific report on COVID-19 may influence what the entire world does. Each day we have conversations with our families and friends about new scientific findings, discussing the merits and shortcomings of studies as if we were tenured professors discussing the proposal of a Ph.D. student. Overnight we have all become armchair scientists.

It’s amazing that science is now at the forefront of everyone’s mind, but it can also have dangerous consequences. The President of the United States recently said, “HYDROXYCHLOROQUINE & AZITHROMYCIN, taken together, have a real chance to be one of the biggest game changers in the history of medicine.”  Shortly thereafter, a couple ingested fish tank cleaner in an attempt to prevent COVID-19, because it contained a form of chloroquine; one of them died and the other was hospitalized. 

Ivan Oransky and Adam Marcus, founders of Retraction Watch, a watchdog website for scientific fraud, have written that “much of the research that emerges in the coming weeks will turn out to be unreliable, even wrong. We’ll be OK if we remember that.”

But will we remember it? Will we be able to identify what turned out to be right and what turned out to be wrong? Can we even do that now?

Screenshot of the scite platform with text explaining the citation classifications. Image © scite, Inc.

This is something that I deal with daily as the co-founder and CEO of scite. To non-scientists and non-bibliometricians, scite is probably best described as Rotten Tomatoes, but for scientific articles. We use artificial intelligence so that anyone can see whether the claim made in a scientific paper has been supported or contradicted by subsequent research. To a bibliometrician, we are a citation index, and while we share many similarities with other citation indices, such as Web of Science, Scopus, or Dimensions, we do things quite differently. Specifically, we show the citation context, or “citance,” surrounding the citations received by an individual research output, and we use deep learning to classify citation statements by their rhetorical function. For example, does the citation support or contradict the original article’s conclusion or recommendation? In other words, is the previous research confirmed or refuted by the new research? Our aim is to help advance citation indicators beyond quantitative data and metrics, to provide more qualitative and contextual data to the bibliometrics community, and to offer useful guidance to non-scientists.
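To make the classification task concrete: scite's actual system is a deep-learning model trained on labeled citances, but the core idea — mapping a citation statement to supporting, contrasting, or mentioning — can be sketched with a toy keyword heuristic. Everything below, including the cue phrases, is an illustrative assumption, not scite's implementation.

```python
# Toy sketch of rhetorical citation classification.
# scite uses a trained deep-learning model; this keyword heuristic only
# illustrates the *task*: assign each citation statement ("citance")
# one of three rhetorical classes. The cue phrases are made up.

SUPPORTING_CUES = ("in accordance with", "consistent with", "confirms", "supports")
CONTRASTING_CUES = ("in contrast to", "contradicts", "contrary to", "fails to replicate")

def classify_citance(citance: str) -> str:
    """Return 'supporting', 'contrasting', or 'mentioning' for a citance."""
    text = citance.lower()
    if any(cue in text for cue in SUPPORTING_CUES):
        return "supporting"
    if any(cue in text for cue in CONTRASTING_CUES):
        return "contrasting"
    # No evidential cue found: the citation merely mentions the work.
    return "mentioning"

print(classify_citance(
    "In accordance with Wan S's study, we also found elevated IL-6."
))  # supporting
print(classify_citance("Smith et al. [3] measured IL-10 levels."))  # mentioning
```

A real classifier has to handle far harder cases — as the comments below this post illustrate, a cue word like “however” can appear in a citance that is not contradictory at all, which is exactly why a learned model plus human flagging is needed rather than keyword rules.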

scite is not a perfect solution, and there is no perfect solution, but new tools such as scite, PubPeer, and Get The Research offer scientists and non-scientists a chance to consume, digest, and evaluate research outputs on their own; we argue they can be extremely helpful for gauging the veracity of research articles, including recent articles on COVID-19. For example, this article, posted on February 22, concludes: “Both IL-6 and IL-10 levels showed sustained increases in the severe group compared to the mild group,” suggesting that two specific signaling molecules are more prevalent in severe than in mild COVID-19 disease. Can we rely upon that finding? Have others tested it? Looking at the article alone, you can’t tell. With scite, you can see that only five days later, another preprint appeared with supporting evidence for this claim. The authors state, “In accordance with Wan S’s [7] and Liu J’s [8] study, this study also found that the levels of IL-6 and IL-10 were associated with the severity of COVID-19 pneumonia.”

Screenshot of the scite platform with text explaining the citation context. Image © scite, Inc.

To date, we have analyzed over 16 million scientific articles, around 60,000 of them related to coronavirus, producing over 600 million classified citation statements like the quoted sentences above, and we are currently adding roughly 10 million more per day. However, to truly identify which research is reliable, we need access to every scientific article ever written. This has been challenging given that most research is locked behind paywalls. Fortunately, leading academic publishers such as Wiley, The British Medical Journal, Karger, Rockefeller University Press, and others have started to share their articles with scite. Some have even started to display scite information directly on their articles.

We’re excited about the possibilities of bibliometrics to help scientists and non-scientists understand science better and would like to invite researchers to use our data (for free) to perform their own studies.

Josh Nicholson is co-founder and CEO of scite. Previously, he was the founder and CEO of the Winnower and CEO of Authorea, two companies aimed at improving how scientists publish and collaborate. He holds a Ph.D. in cell biology from Virginia Tech, where his research focused on the effects of aneuploidy on chromosome segregation in cancer.

Unless it states otherwise, the content of the Bibliomagician is licensed under a Creative Commons Attribution 4.0 International License.

7 Replies to “scite: more than just a citation index”

  1. Hi Josh – thanks for this really accessible & interesting piece! I guess the thing that concerns me most about the use of AI for evaluative purposes is the accuracy of the training data, uncertainty as to who moderates and checks this, and ultimately what impact it might have on scholarship and scholars. Scite currently (incorrectly) lists one of my papers as contradicting another of my own papers I *think* because I use the word “however” in the sentence that introduces it. What if the very first citation I’d received had been wrongly labelled as contradictory? Would this have a negative impact on how my paper was then perceived, used, cited? So my question is, do you have an ethics group that looks at these sorts of issues?


  2. (sorry to write anonymously but I don’t like giving my identity in these comments systems)

    A nice piece about an interesting new platform. However, my comments essentially echo those of Lizzie – how can you ensure the accuracy of what is classified as contradictory or supporting? I have looked at several documents today and found a large number of citations classified as “contradictory” that, if the broader context is taken into account, are not at all contradictory (e.g. one study wrote, to paraphrase, that they observed a higher estimate of a predictor variable than another study – the next sentence explained that this was because the two studies used different methodologies. The estimates were not contradictory, but in fact consistent and supportive of each other). I also checked my own publication record and found the vast majority of citations were “mentioning” my work, when in fact many of those citing papers supported my work. I guess what I am trying to say is that by relying on classification of just the text immediately surrounding the citation, the real context is lost, and giving citations classifications that are tantamount to “good” or “bad” on the basis of this text seems dangerous.


  3. Hi Lizzie,

    Thanks for taking the time to read the piece and for your comment. I think you bring up a very valid concern: how do users/researchers know how reliable the tool is at classifying citation statements, and what are the unintended consequences if we get it wrong?

    First, I think it is important to emphasize that we classify citation statements by rhetorical function, not sentiment: does the statement indicate that the authors provide supporting or contradicting evidence, not merely whether they express a positive or negative opinion. We have some more information and examples here:

    How reliable is our classifier?

    -This is an empirical question in that we can measure the precision of the classifier (precision is 0.8, 0.85, and 0.97 for supporting, contradicting, and mentioning, respectively), but it is also related to the point above about being clear enough on what supporting and contradicting mean. We are continually working on both and, in fact, will soon be changing the name from contradicting to disputing. We’ve also added onboarding and display an information link next to the classifications on the site. In addition, we are continually improving the model by adding new training data, improving sentence segmentation techniques, and trying new features and models, and we have come a long way since we first launched. However, the tool will never be perfect, nor do I expect it to be, given the nuances of human language and writing. Thus, we allow users to flag what they believe to be a misclassified citation statement. Each flag is reviewed independently and blindly by two members of the scite team, accepted or rejected, and the decision is communicated back to the user as quickly as we can (usually within 24 hours). To be accepted, it needs to be explicit somewhere in the text that the study provides supporting or contradicting evidence. We could make this process more automated and crowdsource input from other experts, captcha-style, but we just don’t have the resources, nor do we see this as a priority right now.
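    For readers unfamiliar with the metric: per-class precision is simply true positives divided by all positive predictions for that class. The counts below are invented purely to illustrate the arithmetic; the figures quoted above (0.8 / 0.85 / 0.97) come from scite's own evaluation.

```python
# Per-class precision = TP / (TP + FP): of all citances the model
# labeled with a class, the fraction that truly belong to it.
# The evaluation counts here are hypothetical, chosen only so the
# arithmetic reproduces the precision values quoted in the reply.

def precision(true_positives: int, false_positives: int) -> float:
    return true_positives / (true_positives + false_positives)

# Hypothetical (TP, FP) counts per class:
counts = {
    "supporting":  (80, 20),   # 80 / 100 = 0.80
    "contradicting": (85, 15),  # 85 / 100 = 0.85
    "mentioning":  (97, 3),    # 97 / 100 = 0.97
}

for label, (tp, fp) in counts.items():
    print(f"{label}: {precision(tp, fp):.2f}")
```

    Note that precision alone says nothing about recall — a class can be precise while still missing many true instances, which is one reason user flagging matters.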

    The last thing I will say about this is that we have been increasingly having conversations with researchers and bibliometricians and the obvious outcome is that we need to report this in a research paper, not just a comment on a blog post. I am starting to work on this now and will try to address your points in the paper beyond my response here.

    Can a misclassification have an unintended effect on how a paper is evaluated?

    -Yes, I think this can certainly happen, which is why I think it is critical for authors/users to be able to flag misclassified cites on scite. Indeed, most people first search for themselves and flag anything they see as misclassified, generally without any issue. We have already had people try to game the system by changing mentioning cites to supporting because the citing paper “overall supports” theirs; we rejected these because they could not point to explicit evidence in the text.

    Ultimately, we are trying to improve citations by displaying the context so anyone can read how something has been cited, not just how many times and providing various filters so people can better navigate these excerpts. We think this is the real value of scite–showing context, not just numbers.

    I do think that if a paper has a supporting citation and people can read the supporting statement, it builds trust in readers, who are perhaps then more likely to trust and cite the paper, and vice versa for a contradicting cite. To ensure that trust, we show the snippet so anyone can read it and judge for themselves, along with the model scores, to provide some transparency on how the model assessed the snippet.

    Hopefully, this answers your question and gives you some more context on our thinking. We’re certainly open to improving our approach and system and welcome feedback.



  4. This is a really nice invention and I hope it will develop further in the subsequent years.
    One more category I’d love to see alongside the three is ‘Builds upon’, or some other phrase suggesting that the cited paper was instrumental in the research that cites it (e.g. a paper describing an improved version of a previously published synthetic procedure). The citing context you describe above may cover that, though.

