
...

We should consider deprecating the DOI creation calls. Until we decide to deprecate them, we could pseudo-maintain them by having them make the new API calls internally.


| URL | HTTP Type | Description | Response Object | Notes |
| --- | --- | --- | --- | --- |
| /entity/{id}/doi | PUT | Creates a DOI for the specified entity. The DOI will be associated with the most recent version where applicable. | Doi | Deprecate in favor of the corresponding POST |
| /entity/{id}/version/{versionNumber}/doi | PUT | Creates a new DOI for the specified entity version. | Doi | Deprecate, ditto |
| /entity/{id}/doi | GET | Gets the DOI status for the specified entity. | Doi | Maintain, as this will not require a call to another service and should be relatively quick |
| /entity/{id}/version/{versionNumber}/doi | GET | Gets the DOI status for the specified entity version. | Doi | Maintain, ditto |
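
One way to pseudo-maintain the legacy PUT calls is to have them delegate to the new create-or-update path with placeholder metadata. The following is a minimal sketch only: `legacy_put_doi`, `start_doi_job`, the metadata keys, and the title placeholder are hypothetical names chosen for illustration, not part of the existing API.

```python
import copy
from typing import Callable, Optional

# Hypothetical placeholder metadata for legacy callers, who supply none.
# Which values (if any) are appropriate is still an open question.
DEFAULT_METADATA = {
    "creators": ["(Author not available)"],
    "titles": ["(Title not available)"],
}


def legacy_put_doi(
    entity_id: str,
    start_doi_job: Callable[[str, Optional[int], dict], str],
    version: Optional[int] = None,
) -> str:
    """Handle a deprecated PUT by delegating to the new asynchronous call.

    `start_doi_job` stands in for whatever backs the new POST call and is
    assumed to return an asynchronous job id.
    """
    # Deep-copy so one request can never mutate the shared defaults.
    metadata = copy.deepcopy(DEFAULT_METADATA)
    return start_doi_job(entity_id, version, metadata)
```

With this shape, the legacy endpoints carry no registration logic of their own, so removing them later would not affect the new calls.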


Proposed API Changes

To be revised

Object changes:
  • Creation of DoiMetadata object (and other child objects contained in DoiMetadata)
    • This object is a direct mapping of the most recent version (v4.1) of DataCite's more complex and granular metadata schema, simplified to contain only the required fields and a small number of curated optional fields.
    • This object is easily extensible: if we wish to support more of DataCite's optional metadata fields, or if new required fields are introduced, we can extend the object without deprecating our API.
    • Similarly, this is likely to simplify future transitions if DataCite deprecates the schema version that we are configured to use.


  • We propose decoupling the existing Doi object from DoiMetadata because:
    • We store the data in Doi objects (DTOs); this allows us to quickly identify whether a DOI has been registered and report that to the client.
    • We do NOT store the data in the DoiMetadata objects; this is stored by the DOI provider and retrieved when necessary.
      • Caching this data seems unintuitive; this data should only need to be retrieved when a user considers updating it (see notes for GET in table below), which requires the external service to be available anyway.


DataCite-imposed constraints:

  • There cannot be more than 8000-10000 creators
  • Publication year must be in 'YYYY' format (regex: /[\d]{4}/)
  • There must be at least one creator
    • Each creator must have a creatorName that is at least 1 character long
    • nameIdentifier is not required, but if an identifier is provided, the scheme must also be provided
  • There must be at least one title
    • The title should be at least one character long
  • There must be a resourceTypeGeneral
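
The constraints above can be sketched as a small validator. This is an illustration under stated assumptions, not the proposed implementation: the class and field names (`Creator`, `DoiMetadata`, `validate`, and the snake_case attributes) are hypothetical, the publication year is checked as exactly four ASCII digits, and the upper bound on the number of creators is left out because its exact value is not settled above.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Creator:
    creator_name: str
    name_identifier: Optional[str] = None
    name_identifier_scheme: Optional[str] = None


@dataclass
class DoiMetadata:
    creators: List[Creator] = field(default_factory=list)
    titles: List[str] = field(default_factory=list)
    publication_year: str = ""
    resource_type_general: Optional[str] = None


def validate(metadata: DoiMetadata) -> List[str]:
    """Return human-readable violations; an empty list means valid."""
    errors = []
    if not metadata.creators:
        errors.append("at least one creator is required")
    for i, creator in enumerate(metadata.creators):
        if len(creator.creator_name) < 1:
            errors.append(f"creator {i}: creatorName must not be empty")
        # An identifier is optional, but it needs a scheme when present.
        if creator.name_identifier and not creator.name_identifier_scheme:
            errors.append(f"creator {i}: nameIdentifier requires a scheme")
    if not metadata.titles:
        errors.append("at least one title is required")
    for i, title in enumerate(metadata.titles):
        if len(title) < 1:
            errors.append(f"title {i}: must not be empty")
    # Assumption: 'YYYY' means exactly four ASCII digits, so the whole
    # string is matched rather than searched.
    if not re.fullmatch(r"[0-9]{4}", metadata.publication_year):
        errors.append("publicationYear must be in 'YYYY' format")
    if not metadata.resource_type_general:
        errors.append("resourceTypeGeneral is required")
    return errors
```

Collecting every violation, rather than failing on the first, lets the client report all problems to the user in a single round trip.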

In addition to the preexisting API:

| URL | HTTP Verb | Description | Request Object | Response Object | Notes |
| --- | --- | --- | --- | --- | --- |
| /entity/{id}/doi/async/start | POST | Asynchronously create or update a DOI. If the DOI does not exist, start a DOI creation job that will attempt to register a DOI with the DOI provider using the supplied metadata. If the DOI does exist, simply update the DOI with the supplied metadata. Note: The caller must have the ACCESS_TYPE.UPDATE permission on the Entity to make this call. | DoiMetadata (application/json) | AsyncJobId | Shift the work to an asynchronous worker queue (as we have been doing with other asynchronous services). We combine the create and update calls because they require the same information, the business logic required to register and to update a DOI with DataCite is similar, and both operations are idempotent. If no DoiMetadata object is provided, we may choose to submit "N/A" fields (pending discussion on whether this is appropriate). |
| /entity/{id}/version/{versionNumber}/doi/async/start | POST | Ditto; for a specific entity version | DoiMetadata | AsyncJobId | Ditto |
| /entity/{id}/doi/async/get/{asyncToken} | GET | Asynchronously get the results of a DOI transaction started with POST /entity/{id}/doi/async/start. Note: When the result is not ready yet, this method will return a status code of 202 (ACCEPTED) and the response body will be an AsynchronousJobStatus object. | None | AsynchronousJobStatus / Doi | After the job completes, this should be identical in function to the existing GET calls. |
| /entity/{id}/version/{versionNumber}/doi/async/get/{asyncToken} | GET | Ditto; for a specific entity version | None | AsynchronousJobStatus / Doi | Ditto |
| /entity/{id}/doi/metadata | GET | Get the metadata associated with a DOI object, if it exists. Note: The caller must have the ACCESS_TYPE.UPDATE permission on the Entity to make this call. | None | DoiMetadata | Can be used to populate the metadata fields of an object that has a DOI, since that data is stored on DataCite. We should restrict this to users that can update the data because it should only be used when considering updating the metadata; if an unprivileged user wants to retrieve the metadata for an object, they could use the DOI provider's public API. |
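
From the client's side, the start and get calls might be used together as follows. This is a minimal sketch, not a client implementation: the `post` and `get` callables are hypothetical stand-ins for an HTTP layer that returns a status code and parsed JSON body, the `token` field name on AsyncJobId is an assumption, and treating 200 as the "job complete" status is also an assumption (the table above only specifies 202 for a pending job).

```python
import time
from typing import Callable, Tuple

Response = Tuple[int, dict]  # (HTTP status code, parsed JSON body)


def create_or_update_doi(
    post: Callable[[str, dict], Response],
    get: Callable[[str], Response],
    entity_id: str,
    metadata: dict,
    max_polls: int = 30,
    poll_interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Start a DOI job, then poll until the Doi is available."""
    _, job = post(f"/entity/{entity_id}/doi/async/start", metadata)
    token = job["token"]  # assumed field name on AsyncJobId
    result_path = f"/entity/{entity_id}/doi/async/get/{token}"
    for _ in range(max_polls):
        status, body = get(result_path)
        if status == 202:
            # Not ready: body is an AsynchronousJobStatus. Wait and retry.
            sleep(poll_interval)
            continue
        if status == 200:  # assumed success status
            return body  # the Doi
        raise RuntimeError(f"DOI job {token} failed with HTTP {status}")
    raise TimeoutError(f"DOI job {token} not finished after {max_polls} polls")
```

Because create and update share one call and are both idempotent, a client that times out here can safely submit the same metadata again.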

...

  • Should we permit creating DOIs for any object? Or just entities?
    • Supporting DOIs for non-entity objects is non-trivial, but it would be easier to add that support sooner rather than later
  • Schema enforcement
    • Should we force users to provide required metadata to mint a DOI?
      • Permit and submit no metadata (this is currently the only way to mint a DOI in Synapse).
        • In DataCite, it is only possible to mint a DOI without metadata through the temporary EZID bridge API, i.e. it will likely not be possible to mint without metadata in the near future.
      • Should we allow users to not supply required metadata? We could fill required fields with "mock" data. (For example, permit submitting a blank author field, and then the backend can submit "(Author not available)" to DataCite, as we currently do.)
      • One required metadata field is ResourceTypeGeneral, which has specified categories for the type of resource a DOI refers to. Should we omit categories of resources that are unlikely to be in Synapse, like "Audiovisual" or "Physical Object"? There is no technical benefit to excluding these categories.
    • Future feature expansion: which recommended/optional metadata fields should we permit or require?
      • Synapse could theoretically support all metadata fields, but for scope/UX reasons, maybe we shouldn't. Input from UX, users, anyone would be helpful.
  • Which fields should be immutable? 
    • DOI ID (this can actually be retrieved from the API call rather than the request body, so the client doesn't need to worry about this)
    • Publisher: "Synapse"
    • Publication Year?
    • Do we hide these from the client, or just automatically overwrite them if they try to change them?
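
To make the "automatically overwrite" option concrete, here is a minimal sketch of how the backend could prepare client-supplied metadata before submitting it to the DOI provider. It is illustrative only: `prepare_for_submission` and the dictionary keys are hypothetical, creators are simplified to plain strings, and the placeholder fill shows just one of the options under discussion.

```python
from typing import Any, Dict

PUBLISHER = "Synapse"
AUTHOR_PLACEHOLDER = "(Author not available)"


def prepare_for_submission(client_metadata: Dict[str, Any], doi_id: str) -> Dict[str, Any]:
    """Return a copy of the metadata with server-controlled fields enforced."""
    submitted = dict(client_metadata)  # leave the caller's dict untouched
    # Immutable fields: whatever the client sent is silently overwritten.
    # The DOI ID comes from the API call, not from the request body.
    submitted["doi"] = doi_id
    submitted["publisher"] = PUBLISHER
    # "Mock" data option: replace a missing or blank author list with a
    # placeholder instead of rejecting the request.
    creators = [c for c in submitted.get("creators", []) if c and c.strip()]
    submitted["creators"] = creators or [AUTHOR_PLACEHOLDER]
    return submitted
```

Overwriting is simpler for clients than rejecting the request, but it can hide the fact that a supplied value was ignored, which is part of what the question above needs to settle.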

...

When we retrieve DOI data for existing Synapse entities (published to DataCite through EZID), the metadata adheres to schemas as old as DataCite Schema 2.2 (the most recent version is 4.1). We can leave these alone, and require users to supply the metadata needed for compliance with 4.1 if they want to update the info. This way, we avoid running into issues if/when the old schema is deprecated.

...