About the Rich Citations API Alpha

Rich Citations is a PLOS Labs project that adds metadata to citation data in scientific articles and stores this information in a centralized database. This alpha API allows you to access and scrape the Rich Citations database.

Project Code Repository

Terminology

For the definitions below, assume that you are reading a paper A, that contains in-text citations and references to other papers, including paper B.

Identifier formats

Getting the reference information for a paper

GET http://api.richcitations.org/v0/papers?uri=http%3A%2F%2Fdx.doi.org%2F10.1371%252Fjournal.pone.0000000

This returns JSON describing the paper and its references:

{
    "uri": "http://dx.doi.org/10.1371%2Fjournal.pone.0000000",
    "word_count": 4567,
    "references": { … },
    "bibliographic": { … },
    "citation_groups": [ … ]
}

You can also request JSONP with the same protocol. In this case, an optional 'callback' parameter is accepted; it defaults to 'jsonpCallback'.
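The request URL shown earlier in this section can be built programmatically. The following is a minimal sketch, assuming the endpoint and the uri parameter behave as in the example request; the helper name paper_request_url is made up for illustration. It shows why the example contains %252F: the paper URI already contains %2F, and encoding it again as a query value turns the % into %25.

```python
from urllib.parse import quote

# Endpoint as it appears in the example request; not verified against a live server.
API_BASE = "http://api.richcitations.org/v0/papers"


def paper_request_url(paper_uri: str) -> str:
    """Return the GET URL for a paper, percent-encoding the whole paper URI."""
    # safe="" forces ":" and "/" to be encoded as well. A "%" that is already
    # part of the paper URI becomes "%25", which yields "%252F".
    return f"{API_BASE}?uri={quote(paper_uri, safe='')}"


url = paper_request_url("http://dx.doi.org/10.1371%2Fjournal.pone.0000000")
print(url)
```

The resulting URL can then be fetched with any HTTP client, for example urllib.request.urlopen(url), and the response body parsed with json.load.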

References

The references part of the JSON is a hash. Each key is the unique identifier for the paper as cited in another paper, and each value holds the fields for that reference.

"http://dx.doi.org/10.1371/journal.pone.0000000#pone.0000000-Doe1": {
    "uri": "http://dx.doi.org/10.1371/journal.pone.0000000#pone.0000000-Doe1",
    "citing_id": "http://dx.doi.org/10.1371/journal.pone.0000000",
    "cited_id": "http://dx.doi.org/10.1234/1",
    "index": 1,
    "self_citation": false,
    "original_citation": "Doe J. (2000) Morbi vitae lorem blandit. Duis in lorem interdum. 14: 11–18.",
    "bibliographic": { … },
    "citation_groups": [ … ]
}

We distinguish between a paper and a reference. A paper is identified by a URI, e.g. http://dx.doi.org/10.1371/journal.pone.0000000. A reference is identified by a different URI. For PLOS papers this is the URI of the citing paper A with an anchor appended, e.g. http://dx.doi.org/10.1371/journal.pone.0000000#pone.0000000-Doe1. In the reference metadata, the uri field identifies the reference, the citing_id field identifies the citing paper A, and the cited_id field identifies the cited paper B.
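For PLOS-style reference URIs, the citing paper and the anchor can be separated with the standard library. This is a small sketch under the assumption that the reference URI is the paper URI plus a fragment, as in the example above; the function name split_reference_uri is illustrative, and other publishers may use a different scheme.

```python
from urllib.parse import urldefrag


def split_reference_uri(reference_uri: str) -> tuple[str, str]:
    """Split a reference URI into (citing paper URI, anchor)."""
    # urldefrag separates the part after "#" from the rest of the URL.
    paper_uri, anchor = urldefrag(reference_uri)
    return paper_uri, anchor


paper_uri, anchor = split_reference_uri(
    "http://dx.doi.org/10.1371/journal.pone.0000000#pone.0000000-Doe1"
)
print(paper_uri)
print(anchor)
```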

Bibliographic metadata

We use citeproc-json as our format for bibliographic metadata, stored in the bibliographic fields above.

In addition to the fields defined by citeproc-json, we also include information about paper B's license in a license field. This field is a hash with a url field that links to information about the license.

{
    "url": "http://creativecommons.org/licenses/by/3.0/"
}
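Reading the license from a bibliographic hash could look like the sketch below. It assumes the license field may be missing for some references (the text above does not say whether it is always present), so the hypothetical helper license_url returns None in that case.

```python
from typing import Optional


def license_url(bibliographic: dict) -> Optional[str]:
    """Return the license URL from a bibliographic hash, or None if absent."""
    # Assumption: "license" may be missing or null for some references.
    license_info = bibliographic.get("license") or {}
    return license_info.get("url")


sample = {
    "title": "Quisque congue massa",
    "license": {"url": "http://creativecommons.org/licenses/by/3.0/"},
}
print(license_url(sample))
```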

Citation groups

A citation group describes a group of citations that appear together in the text of a paper. Each entry in its references array is a reference URI; for PLOS, this URI includes the author's last name after a hyphen. A citation group is of the form:

{
    "references": [
        "http://dx.doi.org/10.1371/journal.pone.0000000#pone.0000000-PLOS1",
        "http://dx.doi.org/10.1371/journal.pone.0000000#pone.0000000-PLOS3"
    ],
    "context": {
        "ellipses_before": true,
        "text_before": "non tempor nisi, sed blandit enim. Nam a tortor sapien",
        "citation": "[1, 2]",
        "text_after": ". Praesent felis lorem, dignissim ac diam quis, bibendum vehicula leo.",
        "ellipses_after": false
    },
    "section": "Introduction",
    "word_position": 23
}

Each paper containing reference metadata will have an array of citation groups in the citation_groups field.
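The context fields can be stitched back into a readable snippet. The sketch below is one way to do it; the function name format_context is illustrative. It assumes that text_before has had its trailing whitespace trimmed (so a single space is re-added before the citation marker) and that text_after already starts with whatever follows the marker.

```python
def format_context(context: dict) -> str:
    """Rebuild a one-line snippet from a citation group's context hash."""
    prefix = "…" if context.get("ellipses_before") else ""
    suffix = "…" if context.get("ellipses_after") else ""
    before = context.get("text_before", "")
    citation = context.get("citation", "")
    after = context.get("text_after", "")
    # Assumption: a single space separates text_before from the citation marker.
    return f"{prefix}{before} {citation}{after}{suffix}"


context = {
    "ellipses_before": True,
    "text_before": "non tempor nisi, sed blandit enim. Nam a tortor sapien",
    "citation": "[1, 2]",
    "text_after": ". Praesent felis lorem, dignissim ac diam quis, bibendum vehicula leo.",
    "ellipses_after": False,
}
snippet = format_context(context)
print(snippet)
```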

(Mostly) full example

GET http://api.richcitations.org/v0/papers?uri=http%3A%2F%2Fdx.doi.org%2F10.1371%2Fjournal.pone.0000000

{
    "uri": "http://dx.doi.org/10.1371/journal.pone.0000000",
    "word_count": 4567,
    "bibliographic": {
        "source": "CrossRef",
        "type": "journal-article",
        "title": "Quisque congue massa",
        "page": "1-8",
        "reference-count": 2,
        "container-title": "PLOS One",
        "author": [
            {
                "given": "John",
                "family": "Doe"
            }
        ],
        "issued": {
            "date-parts": [
                [
                     2013
                ]
            ]
        }
    },
    "references": {
        "http://dx.doi.org/10.1371/journal.pone.0000000#pone.0000000-Doe1": {
            "uri": "http://dx.doi.org/10.1371/journal.pone.0000000#pone.0000000-Doe1",
            "citing_id": "http://dx.doi.org/10.1371/journal.pone.0000000",
            "cited_id": "http://dx.doi.org/10.1234/1",
            "index": 1,
            "self_citation": false,
            "original_citation": "Doe J. (2000) Morbi vitae lorem blandit. Duis in lorem interdum. 14: 11–18.",
            "bibliographic": { … },
            "citation_groups": [ (see below) ]
        },
        "http://dx.doi.org/10.1371/journal.pone.0000000#pone.0000000-Roe1": {
            "uri": "http://dx.doi.org/10.1371/journal.pone.0000000#pone.0000000-Roe1",
            "citing_id": "http://dx.doi.org/10.1371/journal.pone.0000000",
            "cited_id": "http://dx.doi.org/10.1234/1",
            "index": 2,
            "self_citation": false,
            "original_citation": "Roe J. (2000) Maecenas imperdiet leo ut bibendum auctor. Vivamus mollis. 88: 1012–22.",
            "bibliographic": { … },
            "citation_groups": [ (see below) ]
        }
    },
    "citation_groups": [
        {
            "references": [
                "http://dx.doi.org/10.1371/journal.pone.0000000#pone.0000000-Doe1",
                "http://dx.doi.org/10.1371/journal.pone.0000000#pone.0000000-Roe1"
            ],
            "context": {
                "ellipses_before": true,
                "text_before": "non tempor nisi, sed blandit enim. Nam a tortor sapien",
                "citation": "[1, 2]",
                "text_after": ". Praesent felis lorem, dignissim ac diam quis, bibendum vehicula leo.",
                "ellipses_after": false
            },
            "section": "Introduction",
            "word_position": 23
        }
    ]
}
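Given a response shaped like the example above, a common task is counting how often each reference is cited in the text. The sketch below walks the top-level citation_groups array and tallies the reference URIs it finds; the function name citation_counts and the shortened sample data are illustrative only.

```python
from collections import Counter


def citation_counts(paper: dict) -> Counter:
    """Count in-text citations per reference URI for one paper."""
    # Start every known reference at zero so uncited ones still show up.
    counts = Counter({uri: 0 for uri in paper.get("references", {})})
    for group in paper.get("citation_groups", []):
        counts.update(group.get("references", []))
    return counts


base = "http://dx.doi.org/10.1371/journal.pone.0000000"
paper = {
    "uri": base,
    "references": {f"{base}#pone.0000000-Doe1": {}, f"{base}#pone.0000000-Roe1": {}},
    "citation_groups": [
        {"references": [f"{base}#pone.0000000-Doe1", f"{base}#pone.0000000-Roe1"]},
        {"references": [f"{base}#pone.0000000-Doe1"]},
    ],
}
counts = citation_counts(paper)
for uri, n in counts.items():
    print(n, uri)
```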

CSV

Rich Citations also supports output in CSV format as an experimental feature. To retrieve information about an article's references in CSV format, try:

http://api.richcitations.org/papers?random=1&format=csv

This will output one line for each citation in the paper, in the order in which each appears in the text. This data is highly redundant but is useful for doing analysis.

If you only want to retrieve the URIs of the citing papers, you can request fields=uri.
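The CSV requests described above can be assembled and parsed with the standard library. This is a sketch only: the helper names are made up, the column layout of the CSV output is not documented here, and so the parser returns raw rows without assuming any header names.

```python
import csv
import io
from urllib.parse import urlencode

# Base URL as it appears in the CSV example; not verified against a live server.
CSV_BASE = "http://api.richcitations.org/papers"


def csv_request_url(**params: str) -> str:
    """Build a CSV request URL; format=csv is always appended."""
    query = {**params, "format": "csv"}
    return f"{CSV_BASE}?{urlencode(query)}"


def parse_csv(text: str) -> list[list[str]]:
    """Parse CSV text into a list of rows without assuming column names."""
    return list(csv.reader(io.StringIO(text)))


print(csv_request_url(random="1"))
print(csv_request_url(random="1", fields="uri"))
```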

CSV Citation Graph

You can also retrieve a complete citation graph by accessing the following URL:

http://api.richcitations.org/papers?format=csv&fields=citegraph&all=t

Because this is a very large file (~400 MB and 4 million lines), you will need an API key to access it.
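A file of this size is best processed one row at a time rather than loaded into memory. The sketch below counts incoming citations per cited paper from any iterable of CSV lines, such as an open file. It assumes, without confirmation from this documentation, that the first two columns of each row are the citing URI and the cited URI; adjust the indices to the actual layout. How the API key is supplied is not covered here.

```python
import csv
from collections import Counter
from typing import Iterable


def count_incoming_citations(lines: Iterable[str]) -> Counter:
    """Count how often each cited URI appears, reading one row at a time."""
    counts: Counter = Counter()
    for row in csv.reader(lines):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        cited = row[1]  # assumed column order (citing, cited); not confirmed
        counts[cited] += 1
    return counts


# Hypothetical rows, for illustration only.
sample_lines = [
    "http://dx.doi.org/10.1371/journal.pone.0000000,http://dx.doi.org/10.1234/1\n",
    "http://dx.doi.org/10.1371/journal.pone.0000001,http://dx.doi.org/10.1234/1\n",
    "\n",
]
graph_counts = count_incoming_citations(sample_lines)
print(graph_counts.most_common(1))
```

With a downloaded copy, the same function can be applied to an open file handle, e.g. open(path, newline=""), so that only one row is held in memory at a time.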

API Key

Please write to the Rich Citations project to request an API key.