TypeScript SDK Generation Using the Synapse OpenAPI Definition
The Synapse backend repository now publishes an OpenAPI definition that describes how a programmatic client interacts with the Synapse REST API. We aim to use the definition to generate TypeScript type definitions for the object models used by Synapse, as well as an HTTP client that issues requests to the REST API endpoints using the generated models. This is valuable because it reduces the amount of boilerplate code that client developers have to write, maintain, and fix when errors are inadvertently introduced.
Summary
While it is possible to generate a TypeScript SDK using the Synapse OpenAPI definition, key issues prevent the SDK from being more useful than manually writing and maintaining the mostly-boilerplate code required to programmatically interact with the Synapse REST API. The usability concerns relate to enhancing code quality and developer ease-of-use. Certain solutions to these issues may require changes to our OpenAPI translator, while other required changes may include curating unique identifiers for our backend controller methods.
Background
The OpenAPI Specification describes how an OpenAPI definition document should be written. When a document properly follows the specification, there are many tools that can be used to process the definition for various purposes, such as validation, documentation generation, and client SDK generation. This document shares findings from attempts to use the definition to generate a TypeScript SDK to use in our frontend web applications.
The Synapse REST OpenAPI definition is generated using a custom translator that builds the definition by referencing the Spring Controller implementations. Now that the definition is published, client developers can build and use tools that consume the definition. For more information about the Synapse OpenAPI definition, see PLFM-7768.
Choosing an SDK Generator
There are many SDK generators for the OpenAPI specification. A list can be found here. In general, my criteria for choosing an SDK generator are:
Regularly maintained
Support for more recent versions of the OpenAPI Specification
Prefer software that does not require sign-up or payment, avoid SaaS offerings
One of the most popular and most versatile options in that list is the OpenAPI Generator, which is highly configurable, supports many languages, has powerful template customization, and has an active GitHub community. This investigation used the typescript-fetch generator from that project.
OpenAPI Generator also has an NPM shim that makes it easy to write a package in the synapse-web-monorepo that handles the entire codegen pipeline, and publish our generated client to the NPM repository.
Some generators that I tried and opted against using:
Kiota - Microsoft’s OpenAPI SDK generator does not yet support oneOf/allOf
Autorest - Also developed by Microsoft. Autorest was developed for Azure and it seems the general focus of maintenance will move to Kiota. Additionally, it seemed to be less configurable, so it would be more difficult to overcome certain challenges (outlined later) compared to the OpenAPI Generator.
NSwag - Popular C#/TypeScript client generator, challenging to configure on non-Windows machines and no official Docker image
Issues with Generated SDK
The following are some of the issues I have encountered, with a description of how each issue affects the usability of the generated SDK. I will indicate a potential solution for each issue, and link to Jira issues that track relevant work.
Definition does not indicate required fields
Required fields are not indicated, even when they are semantically required. For example, org.sagebionetworks.repo.model.principal.PrincipalAliasResponse always returns a principalId. However, principalId is not indicated as required in the specification.
The generated TypeScript interface indicates that the principalId field may be of type number or undefined:
export interface PrincipalAliasResponse {
    /**
     *
     * @type {number}
     * @memberof PrincipalAliasResponse
     */
    principalId?: number;
}
When client code handles a PrincipalAliasResponse object, we have to do an unnecessary null check on principalId to use it as a number. This is a common issue across many defined object types in the specification.
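To illustrate the cost, here is a minimal sketch that contrasts the shape generated today with the shape we would get if principalId were marked as required. The interface names and helper functions are hypothetical and exist only for this comparison.

```typescript
// Shape emitted today: principalId is optional because the definition
// does not list it as required.
interface PrincipalAliasResponseGenerated {
  principalId?: number;
}

// Shape we would expect if the definition marked principalId as required.
interface PrincipalAliasResponseRequired {
  principalId: number;
}

// Hypothetical caller: the check exists only to satisfy the compiler,
// since the service always returns a principalId.
function principalIdOrThrow(response: PrincipalAliasResponseGenerated): number {
  if (response.principalId === undefined) {
    throw new Error('principalId missing from PrincipalAliasResponse');
  }
  return response.principalId;
}

// With the required field, the value can be used directly.
function principalIdDirect(response: PrincipalAliasResponseRequired): number {
  return response.principalId;
}
```

Every call site that reads an optional field needs a guard like the first function, which is where the extra boilerplate comes from.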
I think the only way to solve this is to manually (and gradually) add required properties to our schema models in lib-auto-generated, and ensure the required designation is included in the translated OpenAPI definition.
This may be more challenging to solve for types used in both requests and responses where certain fields may not be provided in requests to create a resource, but may always be required/provided in other contexts (e.g. object IDs generated by the system).
Without a solution, the generated TypeScript models would be more challenging to use than our manually curated models.
Excessively long method names for API calls
The generator creates a method for each operation that sends the corresponding request to the server, and names the method after the operation's operationId. Each operationId currently consists of the HTTP method and the full path (e.g. GET /repo/v1/entity/{id}), which causes the generator to create request methods with excessively long names. For example, GET /repo/v1/entity/{id}/table/transaction/async/get/{asyncToken} has the operationId get-/repo/v1/entity/{id}/table/transaction/async/get/{asyncToken}, and the generator creates the corresponding method getRepoV1EntityIdTableTransactionAsyncGetAsyncToken. We cannot strip the /(repo|file|auth)/v1/ path prefix from all operationIds because removing that substring leads to some collisions.
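The naming and collision problem can be sketched as follows. The toMethodName function only approximates the generator's camel-casing and is not its actual implementation. The two example operationIds at the end are hypothetical and were chosen to show how a collision arises; they are not taken from the Synapse definition.

```typescript
// Approximates how an operationId becomes a camelCase method name:
// split on non-alphanumeric characters and capitalize every part but the first.
function toMethodName(operationId: string): string {
  const parts = operationId.split(/[^A-Za-z0-9]+/).filter((p) => p.length > 0);
  return parts
    .map((p, i) =>
      (i === 0 ? p.charAt(0).toLowerCase() : p.charAt(0).toUpperCase()) + p.slice(1),
    )
    .join('');
}

// Removes the /(repo|file|auth)/v1/ prefix from an operationId.
function stripServicePrefix(operationId: string): string {
  return operationId.replace(/\/(repo|file|auth)\/v1\//, '/');
}

// Groups operationIds by their shortened method name and returns only
// the names that more than one operationId maps to.
function findCollisions(operationIds: string[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  operationIds.forEach((id) => {
    const name = toMethodName(stripServicePrefix(id));
    groups.set(name, (groups.get(name) ?? []).concat(id));
  });
  const collisions = new Map<string, string[]>();
  groups.forEach((ids, name) => {
    if (ids.length > 1) collisions.set(name, ids);
  });
  return collisions;
}

// Hypothetical operationIds that differ only by service prefix.
const hypotheticalOperationIds = [
  'get-/repo/v1/example/{id}',
  'get-/file/v1/example/{id}',
];
const collisions = findCollisions(hypotheticalOperationIds);
```

A check like findCollisions could also serve as a uniqueness test for any new naming scheme before it is adopted.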
A few possible solutions:
Identify a new programmatic scheme for generating unique operationId values
Manually curate operationId values in the controllers and include the value in the translator. This is not feasible to do all at once, so these would be incrementally curated over time. The translator should also check the operationIds for uniqueness
The TypeScript client may be usable without a solution to this problem, but attempts to solve this problem will almost certainly result in breaking changes for generated client code.
Generated instanceOf... methods are not reliable
Generated instanceOf<Model> methods do not respect concreteType. The methods created by the typescript-fetch generator currently only check for fields that are required.
As a partial solution, I have been able to override a template in the typescript-fetch generator so that the generated methods also validate the values of required enumerations.
The OpenAPI definition would require these changes:
The concreteType property must be listed as required in every object that includes it
The concreteType definition must be defined as an enumeration, where the concrete type value is the sole valid enumeration value, e.g. the org.sagebionetworks.repo.model.Foo schema should contain:
{ "properties": { "concreteType": { "type": "string", "enum": ["org.sagebionetworks.repo.model.Foo"] } } }
The OpenAPI specification (prior to version 3.1) does not allow the JSON Schema keyword const, which may seem like a more natural fit here. A single enum value is equivalent to a const value, and is permitted by the OpenAPI specification.
Solving this issue is necessary to use the generated client, because our client code MUST be able to easily identify the concrete type of a data object, especially for polymorphic types.
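As a rough sketch of the intended behavior, a type guard for the Foo example could compare concreteType against its single permitted value. This is hand-written for illustration and is not the output of the overridden template; the Foo interface and its name field are hypothetical.

```typescript
// The sole valid enumeration value for Foo's concreteType.
const FOO_CONCRETE_TYPE = 'org.sagebionetworks.repo.model.Foo';

// Hypothetical model: concreteType is required and limited to one value.
interface Foo {
  concreteType: typeof FOO_CONCRETE_TYPE;
  name?: string; // illustrative field, not from the Synapse definition
}

// Returns true only when the value is an object whose concreteType
// matches Foo's enumeration value, rather than merely being present.
function instanceOfFoo(value: unknown): value is Foo {
  return (
    typeof value === 'object' &&
    value !== null &&
    (value as { concreteType?: unknown }).concreteType === FOO_CONCRETE_TYPE
  );
}
```

Because the guard checks the value and not just the presence of the field, an object of a sibling type that also carries a concreteType is rejected.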
Polymorphic types are missing the discriminator keyword
Synapse uses polymorphism for various types across the system. One example of this is the FileHandle interface. The current FileHandle definition is as follows:
{
  "org.sagebionetworks.repo.model.file.FileHandle": {
    "type": "object",
    "properties": {
      "id": { "type": "string" },
      "etag": { "type": "string" },
      "createdBy": { "type": "string" },
      "createdOn": { "type": "string" },
      "modifiedOn": { "type": "string" },
      "concreteType": { "type": "string" },
      "contentType": { "type": "string" },
      "contentMd5": { "type": "string" },
      "fileName": { "type": "string" },
      "storageLocationId": {
        "type": "integer",
        "format": "int32"
      },
      "contentSize": {
        "type": "integer",
        "format": "int32"
      },
      "status": { "type": "string" }
    },
    "description": "The FileHandle interface defines all of the fields that are common to all implementations.",
    "oneOf": [
      {
        "$ref": "#/components/schemas/org.sagebionetworks.repo.model.file.ExternalObjectStoreFileHandle"
      },
      {
        "$ref": "#/components/schemas/org.sagebionetworks.repo.model.file.GoogleCloudFileHandle"
      },
      {
        "$ref": "#/components/schemas/org.sagebionetworks.repo.model.file.ProxyFileHandle"
      },
      {
        "$ref": "#/components/schemas/org.sagebionetworks.repo.model.file.ExternalFileHandle"
      },
      {
        "$ref": "#/components/schemas/org.sagebionetworks.repo.model.file.S3FileHandle"
      }
    ]
  }
}
When deserializing JSON fetched from the API, the generated client does not know which implementation of FileHandle to use. The generated TypeScript code attempts to include properties from all potential implementations. This is acceptable in TypeScript, but may not work for other languages.
If we add the discriminator property to the model schema, with concreteType as its propertyName (i.e. "discriminator": { "propertyName": "concreteType" }), generators should be able to identify implementation schemas based on concreteType.
The discriminator property may also include a mapping object that maps discriminator values to model IDs. This may be unnecessary for the Synapse REST API because the concreteType discriminator values are identical to the IDs of the corresponding model schemas.
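The following sketch shows the kind of code a discriminator on concreteType would make possible: a discriminated union that TypeScript can narrow, and a deserializer that selects the implementation from the concreteType value. It is hand-written, covers only two of the five implementations, and the implementation-specific fields (bucketName, key, externalURL) are assumptions for illustration rather than generator output.

```typescript
interface S3FileHandle {
  concreteType: 'org.sagebionetworks.repo.model.file.S3FileHandle';
  id?: string;
  bucketName?: string; // assumed field, for illustration
  key?: string; // assumed field, for illustration
}

interface ExternalFileHandle {
  concreteType: 'org.sagebionetworks.repo.model.file.ExternalFileHandle';
  id?: string;
  externalURL?: string; // assumed field, for illustration
}

// A discriminated union: concreteType identifies the implementation.
type FileHandle = S3FileHandle | ExternalFileHandle;

// Picks the implementation based on the discriminator value and
// rejects values that do not correspond to a known schema.
function fileHandleFromJSON(json: unknown): FileHandle {
  const concreteType = (json as { concreteType?: unknown } | null)?.concreteType;
  switch (concreteType) {
    case 'org.sagebionetworks.repo.model.file.S3FileHandle':
    case 'org.sagebionetworks.repo.model.file.ExternalFileHandle':
      return json as FileHandle;
    default:
      throw new Error(`Unknown FileHandle concreteType: ${String(concreteType)}`);
  }
}

// The compiler narrows the type inside each case, so only the fields of
// the matching implementation are accessible.
function describeFileHandle(fileHandle: FileHandle): string {
  switch (fileHandle.concreteType) {
    case 'org.sagebionetworks.repo.model.file.S3FileHandle':
      return `s3://${fileHandle.bucketName}/${fileHandle.key}`;
    case 'org.sagebionetworks.repo.model.file.ExternalFileHandle':
      return `external: ${fileHandle.externalURL}`;
  }
}
```

Without the discriminator, the generator has no basis for producing the switch in fileHandleFromJSON, which is why it falls back to merging the properties of every implementation.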
I do not think that this issue must be solved to use a generated TypeScript client, but that may change as we attempt to use the client in more complicated scenarios. Resolution of this issue may be required to generate client code for other languages.
Resolved issues
This section describes issues I encountered for which I identified solutions that do not rely on changes to the OpenAPI definition provided by the translator.
Excessively Long Model Interfaces
The current model names lead to excessively long TypeScript interface names. Models use the Java canonical name, such as org.sagebionetworks.repo.model.RestrictionInformationRequest. The generator creates a corresponding model called OrgSagebionetworksRepoModelRestrictionInformationRequest.
One possible solution is to use openapi-generator’s --model-name-mappings option to replace the model names with the shortest unique name for each model. The shortest unique name can be determined programmatically in the project that runs the generator. For example, if the system contained the following three models, we could map them to the following model names:
Canonical name | Unique model name mapping result |
---|---|
org.sagebionetworks.model.abc.Foo | Foo |
org.sagebionetworks.model.abc.Bar | abc.Bar |
org.sagebionetworks.model.xyz.Bar | xyz.Bar |
This mapping should be sufficient for client SDK usage, and no change would be needed in the API or controller-to-specification translator.
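One way to compute such a mapping is sketched below: for each canonical name, take the shortest dot-separated suffix that no other model shares. The function name is hypothetical, and this is only one possible scheme.

```typescript
// Lists the dot-separated suffixes of a name, shortest first:
// "a.b.Bar" -> ["Bar", "b.Bar", "a.b.Bar"].
function suffixesOf(canonicalName: string): string[] {
  const parts = canonicalName.split('.');
  return parts.map((_, i) => parts.slice(parts.length - 1 - i).join('.'));
}

// Maps each canonical name to its shortest suffix that is unique across
// all models. Falls back to the full name if no suffix is unique.
function shortestUniqueNames(canonicalNames: string[]): Map<string, string> {
  // Count how many models produce each suffix.
  const suffixCounts = new Map<string, number>();
  canonicalNames.forEach((name) => {
    suffixesOf(name).forEach((suffix) => {
      suffixCounts.set(suffix, (suffixCounts.get(suffix) ?? 0) + 1);
    });
  });

  const result = new Map<string, string>();
  canonicalNames.forEach((name) => {
    const unique = suffixesOf(name).find((s) => suffixCounts.get(s) === 1);
    result.set(name, unique ?? name);
  });
  return result;
}

const modelNameMapping = shortestUniqueNames([
  'org.sagebionetworks.model.abc.Foo',
  'org.sagebionetworks.model.abc.Bar',
  'org.sagebionetworks.model.xyz.Bar',
]);
```

The dotted results mirror the table above. Since a dot is not valid in a TypeScript identifier, a real mapping would still need to convert values such as abc.Bar into a valid name.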
Retry logic for Errors
Our manually-curated Synapse TypeScript client has built-in refetching logic based on the status code returned in the HTTP response. For example, if the service returns a 400 (Bad Request) error, we do not retry the request. If the service returns a 429 (Too Many Requests), 502 (Bad Gateway), 503 (Service Unavailable), or 504 (Gateway Timeout), then we retry the request with exponential backoff.
The generated client allows overriding the fetch implementation with any API-compatible drop-in. Our retry logic is already written to work this way, so we can use it in the client configuration generated at runtime.
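The sketch below shows the general shape of such a drop-in. It is not our existing implementation: the function names, the default retry count, and the default delay are assumptions, and the fetch signature is simplified to a URL and an options object so the example stays self-contained.

```typescript
// Status codes that should be retried, per the rules described above.
const RETRYABLE_STATUS_CODES = [429, 502, 503, 504];

function shouldRetry(status: number): boolean {
  return RETRYABLE_STATUS_CODES.indexOf(status) !== -1;
}

// Exponential backoff: the delay doubles with each attempt (0-based).
function backoffDelayMs(attempt: number, baseDelayMs: number): number {
  return baseDelayMs * Math.pow(2, attempt);
}

// Simplified stand-in for the fetch signature.
type FetchLike<R extends { status: number }> = (url: string, init?: object) => Promise<R>;

// Wraps a fetch implementation so that retryable responses are re-requested
// up to maxRetries times. Other responses, including 400, are returned as-is.
function withRetry<R extends { status: number }>(
  fetchImpl: FetchLike<R>,
  maxRetries = 5, // assumed default
  baseDelayMs = 1000, // assumed default
): FetchLike<R> {
  return async (url, init) => {
    let response = await fetchImpl(url, init);
    for (let attempt = 0; attempt < maxRetries && shouldRetry(response.status); attempt++) {
      const delay = backoffDelayMs(attempt, baseDelayMs);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
      response = await fetchImpl(url, init);
    }
    return response;
  };
}
```

In the typescript-fetch runtime, a wrapper like this would be supplied through the fetchApi option of the generated Configuration.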
Changing the endpoint to staging, dev, or local instances of the backend
For the client to work with other backend stacks (staging, dev, local), the client must be configurable so requests can be sent to other endpoints.
The runtime configuration allows overriding a property called basePath, which provides this functionality.
Asynchronous job polling
The web client often has to monitor the status of asynchronous jobs, such as table queries, table updates, creation of DOIs, and many other features in Synapse. The web client accomplishes this by polling services that provide the status of an asynchronous job.
We can likely adapt our existing logic for polling asynchronous job status so that it calls the generated client instead of our existing implementation.
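A generic polling helper along these lines could wrap any generated request method. This is a sketch, not our existing polling logic: the PollResult type, the function name, and the default interval and attempt limit are all assumptions, and the check callback stands in for a call to a generated method that fetches the job result.

```typescript
// Result of a single status check: either the job finished with a value,
// or it is still processing.
type PollResult<T> = { done: true; value: T } | { done: false };

// Calls check repeatedly until it reports completion, waiting intervalMs
// between attempts, and gives up after maxAttempts checks.
async function pollUntilDone<T>(
  check: () => Promise<PollResult<T>>,
  intervalMs = 1000, // assumed default
  maxAttempts = 60, // assumed default
): Promise<T> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const result = await check();
    if (result.done) {
      return result.value;
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Job did not complete after ${maxAttempts} attempts`);
}
```

Keeping the helper independent of any particular endpoint means the same function could serve table queries, table updates, and DOI creation, with only the check callback differing.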