DOI API Update [WIP] (Proposal) (2018)


Work in progress: nothing on this page should be considered final at this point.

Background & Motivation

See Meredith Slota's 1-pager and the epic issue:

  PLFM-4063

In summary, we should aim to do these now:

  1. Transition to a new DOI provider, DataCite (necessary because the old provider is discontinuing service)
  2. Overhaul how we handle asynchronous requests to manage a DOI to match how we handle other asynchronous requests
  3. Overhaul how we handle metadata submission to comply with the new provider's standards
    1. At the moment, we submit a lot of "dummy" information with no way to change it, which has caused practical issues for users who rely on DOIs

And prepare to do the following in the future:

  1. Be able to easily extend the metadata interface
    1. Submission of metadata can be extended to include optional fields
    2. DataCite occasionally updates their metadata schema. If we can easily adjust to newer schemas as they are released, then we can more easily avert the risk of using a deprecated schema.

Current API + Notes for change

We should consider deprecating the DOI creation calls. Until we decide to remove them, we could keep them working by having them make the new API calls internally.

| URL | HTTP Type | Description | Response Object | Notes |
| --- | --- | --- | --- | --- |
| /entity/{id}/doi | PUT | Creates a DOI for the specified entity. The DOI will be associated with the most recent version where applicable. | Doi | Deprecate in favor of corresponding POST |
| /entity/{id}/version/{versionNumber}/doi | PUT | Creates a new DOI for the specified entity version. | Doi | Deprecate (as above) |
| /entity/{id}/doi | GET | Gets the DOI status for the specified entity. | Doi | Maintain, as this will not require a call to another service and should be relatively quick |
| /entity/{id}/version/{versionNumber}/doi | GET | Gets the DOI status for the specified entity version. | Doi | Maintain (as above) |
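
One way the legacy PUT calls could be kept working is to have them forward to the new asynchronous path. The sketch below is only an illustration of that idea: `DoiAsyncService`, `LegacyDoiAdapter`, and their method names are hypothetical and not existing Synapse classes. It also leaves two questions open that the proposal has not settled: what metadata a legacy call would submit, and how the legacy Doi response would be produced from an asynchronous job.

```java
/** Hypothetical stand-in for the proposed asynchronous DOI service. */
interface DoiAsyncService {
    /** Starts a create-or-update job; a null version means "most recent version". */
    String startCreateOrUpdate(String entityId, Long versionNumber);
}

/** Hypothetical adapter: the deprecated PUT handlers forward to the new service. */
record LegacyDoiAdapter(DoiAsyncService service) {

    /** Legacy PUT /entity/{id}/doi. */
    @Deprecated
    String putDoi(String entityId) {
        return service.startCreateOrUpdate(entityId, null);
    }

    /** Legacy PUT /entity/{id}/version/{versionNumber}/doi. */
    @Deprecated
    String putDoi(String entityId, long versionNumber) {
        return service.startCreateOrUpdate(entityId, versionNumber);
    }
}
```

Keeping the forwarding in one adapter means the legacy handlers hold no DOI logic of their own, so removing them later should not affect the new path.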

Proposed API Changes

Object changes:
  • Creation of DoiMetadata object
    • This object abstracts DataCite's more complex and granular metadata API by including only fields that are required and cannot be automatically populated.
  • Extension of Doi object to include DoiMetadata
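
As a rough illustration of the object changes above, the sketch below shows one possible shape for DoiMetadata and the extended Doi. The field names are assumptions, not an agreed design: they follow DataCite's mandatory properties (creator, title, publication year, resource type), and leave out the identifier and publisher on the assumption that the backend populates them. The enum lists only a few of DataCite's resourceTypeGeneral values, and the other existing Doi fields are omitted.

```java
import java.util.List;

/** A few DataCite resourceTypeGeneral values; illustrative, not the full list. */
enum ResourceTypeGeneral { COLLECTION, DATASET, SOFTWARE, TEXT, OTHER }

/**
 * Hypothetical DoiMetadata: only required fields that the user must supply.
 * The DOI identifier and publisher are assumed to be filled in by the backend.
 */
record DoiMetadata(List<String> creators,
                   List<String> titles,
                   ResourceTypeGeneral resourceTypeGeneral,
                   int publicationYear) {
    DoiMetadata {
        // Defensive copies so later changes to the caller's lists do not leak in.
        creators = List.copyOf(creators);
        titles = List.copyOf(titles);
    }
}

/** Hypothetical extension of the Doi object; other existing fields are omitted. */
record Doi(String entityId, Long versionNumber, DoiMetadata metadata) {}
```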

In addition to the preexisting API:

| URL | HTTP Verb | Description | Request Object | Response Object | Notes |
| --- | --- | --- | --- | --- | --- |
| /entity/{id}/doi/async/start | POST | Asynchronously create or update a DOI. If the DOI does not exist, start a DOI creation job that attempts to register a DOI with the DOI provider using the supplied metadata. If the DOI already exists, update it with the supplied metadata. Note: The caller must have the ACCESS_TYPE.UPDATE permission on the Entity to make this call. | DoiMetadata (application/json) | AsyncJobId | Shift the work to an asynchronous worker queue (as we have been doing with other asynchronous services) |
| /entity/{id}/doi/async/get/{asyncToken} | GET | Asynchronously get the results of a DOI transaction started with POST /entity/{id}/doi/async/start. Note: When the result is not ready yet, this method returns a status code of 202 (ACCEPTED) and the response body is an AsynchronousJobStatus object. | None | AsynchronousJobStatus or Doi | |
| /entity/{id}/doi/metadata | GET | Get the metadata associated with a DOI object, if it exists. | None | DoiMetadata | Can be used to populate the metadata fields of an object that has a DOI, since that data is stored on DataCite |

Note: the Doi object has a DoiStatus field. We need to evaluate how that should be handled with asynchronous workers (we can probably just deprecate that field).
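
From a client's point of view, the start/get pair above implies a polling loop: start the job, then call the get endpoint until it stops returning 202. The sketch below shows that loop under stated assumptions. `DoiTransport`, `PollResponse`, and `DoiPoller` are hypothetical names, the transport is an interface so the HTTP layer is left out, and the attempt limit and delay are illustrative parameters rather than proposed values.

```java
/** Minimal view of an HTTP response: status code plus raw JSON body. */
record PollResponse(int status, String body) {}

/** Hypothetical transport; a real client would issue the HTTP calls named below. */
interface DoiTransport {
    /** POST /entity/{id}/doi/async/start; returns the async token. */
    String start(String entityId, String metadataJson);

    /** GET /entity/{id}/doi/async/get/{asyncToken}. */
    PollResponse get(String entityId, String asyncToken);
}

final class DoiPoller {
    private DoiPoller() {}

    /** Starts a DOI job and polls until the result is ready or attempts run out. */
    static String createOrUpdate(DoiTransport transport, String entityId,
                                 String metadataJson, int maxAttempts, long delayMillis) {
        String token = transport.start(entityId, metadataJson);
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            PollResponse response = transport.get(entityId, token);
            if (response.status() != 202) {
                // Anything other than 202 is final: either the Doi or an error.
                if (response.status() < 200 || response.status() >= 300) {
                    throw new IllegalStateException("DOI job failed: HTTP " + response.status());
                }
                return response.body();
            }
            if (attempt < maxAttempts) {
                sleep(delayMillis); // 202 (ACCEPTED): still pending, wait and retry
            }
        }
        throw new IllegalStateException("DOI job still pending after " + maxAttempts + " attempts");
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for DOI job", e);
        }
    }
}
```

A caller would pass the serialized DoiMetadata as `metadataJson` and deserialize the returned body as a Doi.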


Required Involvement and Timeline

Who needs to do what and when

Outside of our control

DataCite has yet to approve us and give us a registration account. This should happen soon, at which point we will have ~3 months to shift to the new provider.

Platform

  • Create, test, and implement a DataCite Java client that simplifies creating/updating DOIs and their metadata.
    • Should begin as soon as we agree upon the API
    • This can be done without coordination if we preserve existing behavior, but it would be much easier if we create this client intending to only support proposed and agreed-upon behavior.
    • Implementation can use test credentials until we are ready to switch to DataCite in prod
  • Create and route new API changes

Clients

  • Support asynchronous API + metadata submission
    • Can begin as soon as we agree upon the API
    • Can implement as soon as it is tested and implemented on the backend

UX

  • User-facing design of DOI minting process and metadata submission

Questions that need Input

  • Should we permit creating DOIs for any object, or just entities?
    • Supporting DOIs for non-entity objects is non-trivial, but it would be easier to add that support sooner rather than later
  • Schema enforcement
    • Should we force users to provide required metadata to mint a DOI?
      • Option: permit minting with no metadata and submit none (this is currently the only way to mint a DOI in Synapse)
        • In DataCite, minting a DOI without metadata is only possible through the temporary EZID bridge API, i.e., it will likely not be possible to mint without metadata in the near future
      • Should we allow users to omit required metadata? We could fill required fields with "mock" data. (For example, permit submitting a blank author field, and have the backend submit "(Author not available)" to DataCite, as we currently do.)
      • One required metadata field is ResourceTypeGeneral, which specifies a category for the type of resource a DOI refers to. Should we omit categories of resources that are unlikely to be in Synapse, such as "Audiovisual" or "Physical Object"? There is no technical benefit to excluding these categories.
    • Future feature expansion: which recommended/optional metadata fields should we permit or require?
      • Synapse could theoretically support all metadata fields, but for scope/UX reasons, maybe we shouldn't. Input from UX, users, anyone would be helpful.
  • Which fields should be immutable?
    • DOI ID
    • Publisher: "Synapse"
    • Publication Year?
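
To make the schema-enforcement and immutability questions above more concrete, the sketch below contrasts the two author policies under discussion (substitute the current placeholder, or reject the request) and shows a simple check for an immutable field. `DoiMetadataPolicy` and its method names are hypothetical. The placeholder text and the "Synapse" publisher value come from this page; nothing here is a decided behavior.

```java
/** Hypothetical helper contrasting the enforcement options under discussion. */
final class DoiMetadataPolicy {
    static final String AUTHOR_PLACEHOLDER = "(Author not available)";
    static final String PUBLISHER = "Synapse";

    private DoiMetadataPolicy() {}

    /** Lenient option: a blank author is replaced with the placeholder. */
    static String authorOrPlaceholder(String author) {
        return (author == null || author.isBlank()) ? AUTHOR_PLACEHOLDER : author.trim();
    }

    /** Strict option: a blank author is rejected before anything is submitted. */
    static String requireAuthor(String author) {
        if (author == null || author.isBlank()) {
            throw new IllegalArgumentException("Author is required to mint a DOI");
        }
        return author.trim();
    }

    /** Immutable-field check: the publisher, if supplied, must stay "Synapse". */
    static void checkPublisher(String requestedPublisher) {
        if (requestedPublisher != null && !PUBLISHER.equals(requestedPublisher)) {
            throw new IllegalArgumentException("Publisher cannot be changed from " + PUBLISHER);
        }
    }
}
```

Whichever policy is chosen, keeping it in one place would let the backend switch from lenient to strict without changes to the client-facing objects.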

Mockups

TBD


Internal Design/Implementation Notes

When we retrieve DOI data for existing Synapse entities (published to DataCite through EZID), the metadata is compliant with DataCite Schema 2.2 (the most recent version is 4.1). We can leave these records alone and require users to supply the metadata needed for 4.1 compliance if they want to update the information. This way, we avoid running into issues if/when the old schema is deprecated.


