The goal of this document is to detail the style guide and best practices for development of GA4GH specifications at the technical level.
We draw from experience developing several mature APIs across a range of GA4GH workstreams (mainly Discovery, Cloud, and Data Use and Researcher Identity), including Beacon, Beacon Network, Matchmaker Exchange, Search, Service Info, Service Registry, Workflow Execution Service, Data Repository Service, Tool Registry Service, Task Execution Service, Consent Codes, Automatable Discovery and Access Matrix, legacy Core APIs, and others.
TL;DR: REST.
First, you need to decide on the style of your API. People typically consider REST and gRPC. While there are valid reasons to choose gRPC, with performance being the most commonly cited one, we recommend you choose REST.
The main reasons include:
- Compatibility. The vast majority of GA4GH specifications are REST APIs, e.g. Beacon, Beacon Network, Service Info, Service Registry, Workflow Execution Service, Data Repository Service, Tool Registry Service, Task Execution Service.
- Familiarity. REST is a well-established API style and virtually every developer is familiar with it.
- Ease of use. Supporting tooling and libraries are available across all technology stacks.
Standards only make sense if they're well adopted, and familiarity and ease of use are critical in supporting developer adoption. We've learned this the hard way across a range of older GA4GH products, such as Beacon.
TL;DR: OpenAPI 3.
Four formats have been used to specify REST APIs in GA4GH: OpenAPI, JSON Schema, Protocol Buffers, and Avro IDL.
Some projects have gone through several of these formats over time. For example, Beacon started with Avro IDL, evolved to Protocol Buffers, and finally landed on OpenAPI. The major reasons for moving away from Avro IDL and Protocol Buffers were that they are poorly suited to specifying REST APIs (they are awkward for describing endpoints and requests/responses, and a REST API does not leverage the binary formats these technologies offer) and that they are not developer-friendly (steep learning curve, small user base). On top of that, JSON Schema and OpenAPI generally provide better tooling. As such, we advise against using Avro IDL or Protocol Buffers nowadays.
The choice between OpenAPI and JSON Schema comes down to the scope of your specification. JSON Schema is not good for specifying APIs, but excels at specifying data models. In fact, OpenAPI leverages JSON Schema for this reason. If your specification involves endpoints, use OpenAPI. If your specification contains only data models, use JSON Schema. The vast majority of GA4GH specifications use OpenAPI.
Once you decide to use OpenAPI, you have a choice between version 3 and the older version 2. The majority of GA4GH specifications use version 3, and we recommend you do the same. However, it should be noted that even though OpenAPI 3 was released in mid-2017, as of mid-2019 this version of the specification sometimes has only experimental support in common tooling (validation, compliance testing etc.). In general, that has not been a big issue for us, but keep the tooling in mind when making the decision.
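To make the difference between the two versions concrete, here is a minimal sketch of how a script might tell them apart. It relies on the fact that OpenAPI 3 documents carry a top-level `openapi` field (e.g. `3.0.2`), while version 2 documents carry `swagger: "2.0"`. The function name `detect_spec_version` and the sample documents are hypothetical, and the documents are parsed from JSON only to keep the example free of third-party dependencies.

```python
import json


def detect_spec_version(document: dict) -> str:
    """Return 'openapi-3', 'swagger-2', or 'unknown' for a parsed specification."""
    # OpenAPI 3.x documents carry a top-level "openapi" field such as "3.0.2".
    version = document.get("openapi")
    if isinstance(version, str) and version.startswith("3."):
        return "openapi-3"
    # Version 2 documents use a top-level "swagger" field whose value is "2.0".
    if document.get("swagger") == "2.0":
        return "swagger-2"
    return "unknown"


# Hypothetical sample documents (illustration only).
v3_doc = json.loads(
    '{"openapi": "3.0.2", "info": {"title": "Example", "version": "0.1.0"}, "paths": {}}'
)
v2_doc = json.loads(
    '{"swagger": "2.0", "info": {"title": "Example", "version": "0.1.0"}, "paths": {}}'
)

print(detect_spec_version(v3_doc))
print(detect_spec_version(v2_doc))
```

A check like this could help when auditing which repositories still need migrating, but it is not a substitute for proper validation.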
TL;DR: YAML.
An OpenAPI document that conforms to the specification is itself a JSON object, which may be represented either in JSON or YAML format. YAML is a superset of JSON, is generally considered more human-readable, and has good extra features such as commenting, aliasing and anchoring. We consider readability and commenting very useful when specifying APIs, and recommend using YAML over JSON.
XML is considered legacy and is not recommended.
TL;DR: openapi.yaml.
Most likely, your specification is going to consist of a single file. We recommend you name it openapi.yaml. This is the common default name, and would allow you to run tools with their default setting.
If your specification consists of multiple files, prefer domain-specific names.
TL;DR: Travis CI with OAS Validator.
With the OAS specification being the main artifact you're delivering, it's important to test it continuously. We recommend setting up a CI solution and triggering builds on pull requests. Travis CI is the most common choice in GA4GH due to its ease of use.
At minimum, you should make sure your specification is valid. Several GA4GH specifications use OAS Validator, and we recommend you do the same.
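To illustrate what "valid" means at the most basic level, the sketch below checks only the fields that OpenAPI 3.0 requires at the top level (`openapi`, `info`, `paths`) and inside `info` (`title`, `version`). It is a hypothetical helper written for this guide, not the OAS Validator and not a replacement for it: a real validator checks the whole document against the specification. The function name and sample documents are assumptions.

```python
# Fields that OpenAPI 3.0 marks as required (later versions may differ).
REQUIRED_TOP_LEVEL = ("openapi", "info", "paths")
REQUIRED_INFO = ("title", "version")


def find_structural_problems(document: dict) -> list:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    for field in REQUIRED_TOP_LEVEL:
        if field not in document:
            problems.append(f"missing top-level field: {field}")
    info = document.get("info")
    if isinstance(info, dict):
        for field in REQUIRED_INFO:
            if field not in info:
                problems.append(f"missing info field: {field}")
    elif "info" in document:
        problems.append("info must be an object")
    return problems


# Hypothetical documents (illustration only).
good_doc = {"openapi": "3.0.2", "info": {"title": "Example", "version": "0.1.0"}, "paths": {}}
bad_doc = {"openapi": "3.0.2", "info": {"title": "Example"}}

print(find_structural_problems(good_doc))
print(find_structural_problems(bad_doc))
```

In a CI build, a non-empty result would typically be turned into a non-zero exit code so that the pull request is marked as failing.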
TL;DR: GitHub.
You should use a public repository on GitHub.
TL;DR: feature, bug, task, wontfix.
You'll probably need at least 3 kinds of labels: issue type, status and requester.
For issue types, we recommend starting with 3 basic labels: feature (new feature or request), bug (something isn't working), and task (a task not requiring code changes).
For statuses, we recommend starting with a single wontfix label (this will not be worked on), to distinguish closed issues that were rejected from those that were resolved. Later on, you might choose to add more labels for finer-grained status notation, e.g. in progress.
Optionally, you might want to label your issues based on who requested them through various channels; these labels would typically reflect the names of driver projects.
Consistency is good, and developers often contribute to many GA4GH repositories. You should use consistent issue names, descriptions and colour codes. See e.g. service-info labels.
TL;DR: SemVer releases, 1.0.0.
Having milestones that reflect releases as per semantic versioning is a good practice. You'll probably want to target 1.0.0 as the first major release with a backward compatibility commitment, capturing the specification at the point where it will be reviewed by the PRC. To pass product review, you'll need several implementations from the driver projects, which will need a stable version of the specification to develop against throughout. Start with the 0.1 milestone for your first usable spec and go from there as per semantic versioning.
TL;DR: Apache 2.0.
While GA4GH does not prescribe a particular license, practically all our projects use Apache 2.0. This will most likely suit your needs.
Include the license text in your repository like this.
TL;DR: SemVer.
Semantic versioning is used across GA4GH. It's a good practice to use the 1.0.0 release as the target for PRC approval.
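As a rough illustration of what the backward compatibility commitment at 1.0.0 means under semantic versioning, the sketch below parses MAJOR.MINOR.PATCH strings and decides whether a client written against one version may expect a later version to keep working. The names `Version`, `parse_version` and `is_backward_compatible` are hypothetical, and pre-release and build metadata suffixes are deliberately not handled.

```python
import re
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


# Core MAJOR.MINOR.PATCH form only; numeric parts must not have leading zeros.
_CORE = re.compile(r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$")


def parse_version(text: str) -> Version:
    match = _CORE.match(text)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    return Version(*(int(part) for part in match.groups()))


def is_backward_compatible(old: Version, new: Version) -> bool:
    """True when clients written against `old` may expect `new` to keep working."""
    if new == old:
        return True
    if old.major == 0 or new.major == 0:
        # SemVer treats 0.y.z as initial development: anything may change.
        return False
    # Within one major version, only newer minor/patch releases are compatible.
    return new.major == old.major and new > old


print(is_backward_compatible(parse_version("1.0.0"), parse_version("1.4.2")))
print(is_backward_compatible(parse_version("1.4.2"), parse_version("2.0.0")))
print(is_backward_compatible(parse_version("0.1.0"), parse_version("0.2.0")))
```

This is why pre-1.0.0 milestones suit early iteration with driver projects, while 1.0.0 marks the point from which implementers can rely on stability.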
Tools you might find useful for your specifications:
This style guide has not gone through any formal approval process and is not mandatory in any way. Having said that, this style guide reflects what we consider good development practices, which have been validated over the years in the context of several GA4GH products. Consistency and interoperability are important - developers often work on several specifications across a range of workstreams, and GA4GH products live in the same ecosystem, where they need to interoperate. If you're looking to implement your standard in a consistent and interoperable way, please consider this document. For ease of use, we recommend linking to this README.md from the README.md in the repository of your specification.
- Documentation
- GitFlow
- Contributing rules
- Apache voting
- Property naming
- Styleguide hierarchy
- URI/URN/ID/contact
- Swagger tools
- Badges
- CI validation
- Commit message convention
- Default repository facets enabled
- Repository naming
- API extensions
- Service types and API referral
- Service info
- Error representation