...
URL | HTTP Type | Description | Response Object | Notes |
---|---|---|---|---|
/entity/{id}/doi | PUT | Creates a DOI for the specified entity. The DOI will be associated with the most recent version where applicable. | Doi | Deprecate in favor of corresponding POST |
/entity/{id}/version/{versionNumber}/doi | PUT | Creates a new DOI for the specified entity version. | Doi | Deprecate, ditto |
/entity/{id}/doi | GET | Gets the DOI status for the specified entity. | Doi | Maintain, as this will not require a call to another service and should be relatively quick |
/entity/{id}/version/{versionNumber}/doi | GET | Gets the DOI status for the specified entity version. | Doi | Maintain, ditto |
Proposed API Changes
To be revised
Object changes:
- Creation of DoiMetadata object
  - This object abstracts DataCite's more complex and granular metadata API by including only fields that are required and cannot be automatically populated.
  - This object is easily extensible to further support DataCite's metadata schema to include optional fields or the introduction of new required fields.
- Doi and DoiMetadata are proposed to be uncoupled because
  - We store the data in Doi objects; it allows us to quickly identify if a DOI has been registered and report that to the client.
  - We do NOT store the data in the DoiMetadata objects; this is stored by the DOI provider and retrieved when necessary.
  - Caching this data seems unintuitive; retrieving this data should only be expected when a user considers updating it (see notes in table below), which requires the external service to be available anyways.
DataCite-imposed constraints:
- There cannot be more than 8000-10000 creators
- Publication year must be in 'YYYY' format (regex: /[\d]{4}/)
- The creators should be at least 1 character long
- The title should be at least one character long
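The constraints above could be checked before anything is sent to DataCite. The following is a minimal sketch, not the planned implementation: the function name, the error messages, and the use of 8000 as the creator limit (the lower end of the range quoted above) are all assumptions, and "at least 1 character long" is read here as applying to each creator name.

```python
import re
from typing import List

# Assumption: the limit is only known as a range (8000-10000),
# so this sketch uses the conservative lower bound.
MAX_CREATORS = 8000

# The whole string must be exactly four ASCII digits ('YYYY').
YEAR_PATTERN = re.compile(r"\d{4}", re.ASCII)


def validate_metadata(creators: List[str], title: str, publication_year: str) -> List[str]:
    """Return a list of constraint violations; an empty list means valid."""
    errors = []
    if len(creators) > MAX_CREATORS:
        errors.append("too many creators")
    if any(len(creator) < 1 for creator in creators):
        errors.append("each creator must be at least 1 character long")
    if len(title) < 1:
        errors.append("title must be at least 1 character long")
    if not YEAR_PATTERN.fullmatch(publication_year):
        errors.append("publication year must be in YYYY format")
    return errors
```

Returning a list of violations rather than raising on the first one would let the client show every problem at once; whether that is desirable is a design choice left open here.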
...
Note: the Doi object has a DoiStatus field. We need to evaluate how that field should be handled with asynchronous workers (we would probably just deprecate it in favor of using AsynchronousJobStatus).
Required Involvement and Timeline
...
Internal design notes
This section contains notes about how we plan to interface with DataCite, and what goes on under the hood to register/update a DOI.
API Choice
DataCite has two APIs that we can use: a standard "MDS" API that they recommend for users, and a new (but also seemingly temporary) EZ API that is designed for orgs like us who are transitioning from EZID. To avoid more work later, we are opting not to use the temporary EZ API, which is designed to mock the EZID API. Instead, we will use the standard MDS API, as we would need to transition to it eventually anyway.
HTTP Client
The current EZID client interfaces with EZID using now-deprecated implementations of Apache's HTTP client. We will replace this client with a new client that will use our SimpleHttpClient to make requests to the DataCite MDS API.
CRUD Workflows
With the MDS API, the basic workflow to create a DOI is to:
- (POST) Register metadata (including the DOI symbol e.g. 10.####/syn01234)
- (PUT) Register the DOI symbol and tie it to a URL (synapse.org/#!Synapse:syn01234)
Simply updating the metadata requires just step 1. Both of the above calls are idempotent, so we can combine create and update calls and simply treat the implementation as a create. This would simplify the implementation, though an unnecessary outgoing PUT call would be made when existing DOIs are updated.
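The combined create/update path described above can be sketched as follows. This is an illustration under stated assumptions rather than the actual client: the paths /metadata and /doi/{doi}, the doi=/url= body format, and the `send` callable (a stand-in for whatever wraps SimpleHttpClient) are assumptions and should be checked against the DataCite MDS documentation.

```python
from typing import Callable, List, Tuple

# A request is (HTTP method, path, body).
Request = Tuple[str, str, str]


def build_mint_requests(doi: str, target_url: str, metadata_xml: str) -> List[Request]:
    """Build the two calls of the workflow, in order."""
    return [
        # Step 1: register metadata (the XML includes the DOI symbol).
        ("POST", "/metadata", metadata_xml),
        # Step 2: register the DOI symbol and tie it to a URL.
        ("PUT", f"/doi/{doi}", f"doi={doi}\nurl={target_url}"),
    ]


def create_or_update_doi(
    send: Callable[[str, str, str], int],
    doi: str,
    target_url: str,
    metadata_xml: str,
) -> None:
    """Treat create and update identically: always issue both calls.

    `send` is a hypothetical transport that returns an HTTP status code.
    """
    for method, path, body in build_mint_requests(doi, target_url, metadata_xml):
        status = send(method, path, body)
        if not 200 <= status < 300:
            raise RuntimeError(f"{method} {path} failed with status {status}")
```

Because both calls are issued every time, an update costs one extra PUT, which is the trade-off noted above in exchange for a single code path.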
Retrieval and Conversion of Metadata
DataCite requires that new DOIs have associated metadata that adheres to a schema that they revise occasionally. The current version of their schema is v4.1 (Oct 2017). Metadata that we register through EZID is adherent to v2.2 (Jul 2011). It is unclear if/when DataCite will deprecate v2.2 and no longer accept it. In another attempt to future-proof our DOI minting service, we will only submit metadata adherent to v4.1.
As a result of this, we must be able to retrieve metadata adherent to schemas 2.2 and 4.1 in order for the client to update it. We can create a translator tool to convert data from both schemas to an intermediate object (see DoiMetadata above) that can hold the appropriate metadata. The client can retrieve and submit this object by interfacing with our API.
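One way the translator could work is to read elements by their local name, so that the same code handles both schema versions regardless of XML namespace. The sketch below is an assumption-laden illustration: the DoiMetadata fields shown are hypothetical (the real object's fields are not defined in this section), and it assumes both schemas expose creatorName, title, publicationYear, and a resourceType element with a resourceTypeGeneral attribute.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DoiMetadata:
    """Hypothetical intermediate object; field names are illustrative."""
    creators: List[str] = field(default_factory=list)
    titles: List[str] = field(default_factory=list)
    publication_year: Optional[str] = None
    resource_type_general: Optional[str] = None


def _local_name(tag: str) -> str:
    # ElementTree renders namespaced tags as '{namespace}name'.
    return tag.rsplit("}", 1)[-1]


def parse_datacite_xml(xml_text: str) -> DoiMetadata:
    """Convert DataCite XML of either schema version to DoiMetadata."""
    metadata = DoiMetadata()
    for element in ET.fromstring(xml_text).iter():
        name = _local_name(element.tag)
        text = (element.text or "").strip()
        if name == "creatorName" and text:
            metadata.creators.append(text)
        elif name == "title" and text:
            metadata.titles.append(text)
        elif name == "publicationYear" and text:
            metadata.publication_year = text
        elif name == "resourceType":
            metadata.resource_type_general = element.get("resourceTypeGeneral")
    return metadata
```

Fields missing from older metadata (for example, a resource type) are left as None, so the client can see which values must be supplied before resubmitting under v4.1.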
Required Involvement and Timeline
Outside of our control
DataCite has yet to approve us and give us a registration account. This should happen soon, at which point we will have ~3 months to shift to the new provider.
...
- Should we permit creating DOIs for any object? Or just entities?
- Shifting to support DOIs for non-entity objects is non-trivial, but it would be easier to support them sooner rather than later
- Schema enforcement
- Should we force users to provide required metadata to mint a DOI?
- Permit and submit no metadata (this is currently the only way to mint a DOI in Synapse)
- In DataCite, it is only possible to mint a DOI without metadata with the temporary EZID bridge API, i.e. it will likely not be possible to mint without metadata in the near future
- Should we allow users to not supply required metadata? We could fill required fields with "mock" data. (For example, permit submitting a blank author field, and then the backend can submit "(Author not available)" to DataCite as we currently do)
- One required metadata field is ResourceTypeGeneral, which has specified categories for the type of resource a DOI refers to. Should we omit categories of resources that are likely not in Synapse, like "Audiovisual" or "Physical Object"? There is no technical benefit to excluding these categories.
- Future feature expansion: which recommended/optional metadata fields should we permit or require?
- Synapse could theoretically support all metadata fields, but for scope/UX reasons, maybe we shouldn't. Input from UX, users, anyone would be helpful.
- Which fields should be immutable?
- DOI ID (this can actually be retrieved from the API call rather than the request body, so the client doesn't need to worry about this)
- Publisher: "Synapse"
- Publication Year?
- We can make these "immutable" by not including these fields in the DoiMetadata body. The user can neither see nor modify them, and they are handled entirely by the backend.
- Do we hide these from the client, or just automatically overwrite them if they try to change them?
Mockups
TBD
...
Misc. Notes
When we retrieve DOI data for existing Synapse entities (published to DataCite through EZID), the metadata is compliant with DataCite Schema 2.2 (the most recent version is 4.1). We can leave these alone, and force users to supply the metadata required to be compliant with 4.1 if they want to update the info. This way, we avoid running into issues if/when the old schema is deprecated.
...