Introduction
Commitspark is a set of tools to manage structured data with Git through a GraphQL API.
This library provides the GraphQL API that allows reading and writing structured data (entries) from and to a Git repository.
Queries and mutations offered by the API are determined by a standard GraphQL type definition file (schema) inside the Git repository.
Entries (data) are stored using plain YAML text files in the same Git repository. No other data store is needed.
Installation
There are two common ways to use this library:
- By making GraphQL calls directly to the library as a code dependency in your own JavaScript / TypeScript / Node.js application.
  To do this, simply install the library with
  npm i @commitspark/graphql-api
- By making GraphQL calls over HTTP to this library wrapped in a webserver or Lambda function.
  You can find an example Node.js Express web server implementation here.
Installing Git provider support
This library is agnostic to where a Git repository is stored and relies on separate adapters for repository access. To access a Git repository, use one of the pre-built adapters listed below or build your own using the interfaces in this repository.
| Adapter | Description | Install with |
|---|---|---|
| GitHub | Provides support for Git repositories hosted on github.com | npm i @commitspark/git-adapter-github |
| GitLab (SaaS) | Provides support for Git repositories hosted on gitlab.com | npm i @commitspark/git-adapter-gitlab |
| Filesystem | Provides read-only access to files on the filesystem level | npm i @commitspark/git-adapter-filesystem |
Building your GraphQL API
Commitspark builds a GraphQL data management API with create, read, update, and delete (CRUD) functionality that is solely driven by data types you define in a standard GraphQL schema file in your Git repository.
Commitspark achieves this by extending the types in your schema file at runtime with queries, mutations, and additional helper types.
Let's assume you want to manage information about rocket flights and have already defined the following simple GraphQL schema in your Git repository:

    # commitspark/schema/schema.graphql

    directive @Entry on OBJECT

    type RocketFlight @Entry {
      id: ID!
      vehicleName: String!
      payloads: [Payload!]
    }

    type Payload {
      weight: Int!
    }

At runtime, when sending a GraphQL request to Commitspark, these are the queries, mutations, and helper types that are added by Commitspark to your schema for the duration of request execution:

    schema {
      query: Query
      mutation: Mutation
    }

    type Query {
      everyRocketFlight: [RocketFlight!]
      RocketFlight(id: ID!): RocketFlight
      _typeName(id: ID!): String
    }

    type Mutation {
      createRocketFlight(id: ID!, data: RocketFlightInput!, commitMessage: String): RocketFlight
      updateRocketFlight(id: ID!, data: RocketFlightInput!, commitMessage: String): RocketFlight
      deleteRocketFlight(id: ID!, commitMessage: String): ID
    }

    input RocketFlightInput {
      vehicleName: String!
      payloads: [PayloadInput!]
    }

    input PayloadInput {
      weight: Int!
    }

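To make the naming pattern of the generated operations concrete, here is a small TypeScript sketch that derives the operation names for an `@Entry` type name. The pattern is inferred only from the `RocketFlight` example above; the helper `operationNamesFor` and the interface `GeneratedOperationNames` are hypothetical and not part of the library.

```typescript
// Hypothetical illustration: names of the operations generated for an @Entry
// type, following the pattern visible in the RocketFlight example.
interface GeneratedOperationNames {
  queryAll: string
  queryById: string
  create: string
  update: string
  delete: string
  inputType: string
}

function operationNamesFor(typeName: string): GeneratedOperationNames {
  return {
    queryAll: `every${typeName}`,
    queryById: typeName,
    create: `create${typeName}`,
    update: `update${typeName}`,
    delete: `delete${typeName}`,
    inputType: `${typeName}Input`,
  }
}

const rocketFlightOperations = operationNamesFor('RocketFlight')
console.log(rocketFlightOperations)
```

A helper like this can be handy when generating client code or documentation for several entry types, but the authoritative list of operations is always the extended schema itself.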
Making GraphQL calls
Let's now assume your repository is located on GitHub and you want to query for a single rocket flight.
The code to do so could look like this:

    import { createAdapter } from '@commitspark/git-adapter-github'
    import { createClient } from '@commitspark/graphql-api'

    const gitHubAdapter = createAdapter({
      repositoryOwner: process.env.GITHUB_REPOSITORY_OWNER,
      repositoryName: process.env.GITHUB_REPOSITORY_NAME,
      accessToken: process.env.GITHUB_ACCESS_TOKEN,
    })

    const client = await createClient(gitHubAdapter)

    const response = await client.postGraphQL(
      process.env.GIT_BRANCH ?? 'main',
      {
        query: `query ($rocketFlightId: ID!) {
          rocketFlight: RocketFlight(id: $rocketFlightId) {
            vehicleName
            payloads {
              weight
            }
          }
        }`,
        variables: {
          rocketFlightId: 'VA256',
        }
      },
    )

    const rocketFlight = response.data.rocketFlight
    // ...

Technical documentation
createClient()
This function is used to create a Commitspark GraphQL API client instance.
The gitAdapter argument expects a Commitspark Git adapter instance, which the client then uses to access the
adapter's Git repository.
Client
postGraphQL()
This function is used to make GraphQL requests.
Request execution is handled by ApolloServer behind the scenes.
Argument request expects a conventional GraphQL query and supports query variables as well as introspection.
getSchema()
This function allows retrieving the GraphQL schema extended by Commitspark as a string.
Compared to schema data obtained through GraphQL introspection, the schema returned by this function also includes directive declarations and annotations, allowing for development of additional tools that require this information.
Picking from the Git tree
As Commitspark is Git-based, all GraphQL requests support traversing the Git commit tree by setting the ref argument
in library calls to a
- commit hash,
- branch name, or
- tag name (lightweight or annotated)
This enables great flexibility, e.g. to use branches in order to enable data (entry) development workflows, to retrieve a specific (historic) commit where it is guaranteed that entries are immutable, or to retrieve entries by tag such as one that marks the latest reviewed and approved version in a repository.
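As an illustration of ref-based access, the following TypeScript sketch runs the same request against several refs and collects the responses. The interfaces `GraphQLRequestBody` and `RefQueryClient` are assumptions that only mirror how `client.postGraphQL(ref, request)` is called earlier in this document; they are not the library's published type definitions, and `queryAtRefs` is a hypothetical helper.

```typescript
// Assumed request shape, mirroring the object passed to `postGraphQL()` in
// the query example earlier in this document.
interface GraphQLRequestBody {
  query: string
  variables?: Record<string, unknown>
}

// Assumed minimal client shape; a real Commitspark client is expected to be
// structurally compatible, but this is not verified here.
interface RefQueryClient {
  postGraphQL(ref: string, request: GraphQLRequestBody): Promise<unknown>
}

// Runs the same request against several refs (branch names, tag names, or
// commit hashes), one after another, and returns the responses keyed by ref.
async function queryAtRefs(
  client: RefQueryClient,
  refs: string[],
  request: GraphQLRequestBody,
): Promise<Map<string, unknown>> {
  const responses = new Map<string, unknown>()
  for (const ref of refs) {
    responses.set(ref, await client.postGraphQL(ref, request))
  }
  return responses
}

// Example usage (the branch and tag names are placeholders):
// const responses = await queryAtRefs(client, ['main', 'approved'], {
//   query: '{ everyRocketFlight { id } }',
// })
```

Comparing the response for a branch with the response for a tag is one way to see what has changed since the tagged version.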
Writing data
Mutation operations work on branch names only and (when successful) each append a new commit on HEAD in the given branch.
To guarantee deterministic results, mutations in calls with multiple mutations are processed sequentially (see the official GraphQL documentation for details).
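A mutation request can be assembled in the same way as the query shown earlier. The sketch below builds a request for the `createRocketFlight` mutation from the generated schema above. The helper `buildCreateRocketFlightRequest`, both interfaces, and the example values (`VA257`, the vehicle name, the weight, the commit message) are illustrative assumptions, not part of the library.

```typescript
// Assumed request shape, mirroring the object passed to `postGraphQL()` in
// the query example earlier in this document.
interface MutationRequestBody {
  query: string
  variables: Record<string, unknown>
}

// Input data matching `RocketFlightInput` from the generated schema.
interface RocketFlightInputData {
  vehicleName: string
  payloads?: { weight: number }[]
}

// Hypothetical helper: builds a request that creates one rocket flight entry.
function buildCreateRocketFlightRequest(
  id: string,
  data: RocketFlightInputData,
  commitMessage?: string,
): MutationRequestBody {
  return {
    query: `mutation ($id: ID!, $data: RocketFlightInput!, $commitMessage: String) {
      createRocketFlight(id: $id, data: $data, commitMessage: $commitMessage) {
        id
        vehicleName
      }
    }`,
    // `commitMessage` is optional in the schema, so null is sent when omitted.
    variables: { id, data, commitMessage: commitMessage ?? null },
  }
}

const createRequest = buildCreateRocketFlightRequest(
  'VA257',
  { vehicleName: 'Ariane 5', payloads: [{ weight: 6000 }] },
  'Add flight VA257',
)

// The request must be sent to a branch name, for example:
// await client.postGraphQL('main', createRequest)
```

Passing the entry data through variables rather than string interpolation avoids having to escape values inside the GraphQL document.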
Data model
The data model (i.e. schema) is defined in a single GraphQL type definition text file using the GraphQL type system.
The schema file must be located at commitspark/schema/schema.graphql inside the Git repository (unless otherwise
configured in your Git adapter).
Commitspark currently supports the following GraphQL types:
- type
- union
- enum
Data entries
To denote which data is to be given a unique identity for referencing, Commitspark expects type annotation with
directive @Entry:

    directive @Entry on OBJECT # Important: You must declare this for your schema to be valid

    type MyType @Entry {
      id: ID! # Important: Any type annotated with `@Entry` must have such a field
      # ...
    }

Note: As a general guideline, you should only apply @Entry to data types that meet one of the following
conditions:
- You want to independently create and query instances of this type
- You want to reference or link to an instance of such a type from multiple other entries
This keeps the number of entries low and performance up.
Entry storage
Entries, i.e. instances of data types annotated with @Entry, are stored as YAML text files with the .yaml extension in
the folder commitspark/entries/ of the given Git repository (unless otherwise configured in your Git adapter).
The filename (excluding file extension) constitutes the entry ID.
Entry files have the following structure:

    metadata:
      type: MyType # name of type as defined in your schema
      referencedBy: [ ] # array of entry IDs that hold a reference to this entry
    data:
      # ... fields of the type as defined in your schema

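For tooling that reads entry files directly, the structure above can be described with a TypeScript type, and the filename rule can be expressed as a pair of small functions. This is a sketch under assumptions: it uses the default folder and extension named in this section, assumes a flat folder without subfolders, and all names (`EntryFileContent`, `entryIdFromPath`, `entryPathFromId`) are hypothetical rather than taken from the library.

```typescript
// Structure of an entry file after YAML parsing, following the layout above.
interface EntryFileContent {
  metadata: {
    type: string // name of the type as defined in the schema
    referencedBy: string[] // IDs of entries that hold a reference to this entry
  }
  data: Record<string, unknown> // fields of the type as defined in the schema
}

// Default folder and file extension named in this section.
const ENTRIES_FOLDER = 'commitspark/entries/'
const ENTRY_EXTENSION = '.yaml'

// Hypothetical helper: the filename without extension is the entry ID.
// Assumption of this sketch: entries are not stored in nested subfolders.
function entryIdFromPath(filePath: string): string | undefined {
  if (!filePath.startsWith(ENTRIES_FOLDER) || !filePath.endsWith(ENTRY_EXTENSION)) {
    return undefined
  }
  const id = filePath.slice(ENTRIES_FOLDER.length, -ENTRY_EXTENSION.length)
  return id.length > 0 && id.indexOf('/') === -1 ? id : undefined
}

// Hypothetical helper: the inverse mapping from entry ID to file path.
function entryPathFromId(id: string): string {
  return `${ENTRIES_FOLDER}${id}${ENTRY_EXTENSION}`
}

const exampleEntry: EntryFileContent = {
  metadata: { type: 'MyType', referencedBy: [] },
  data: {},
}
console.log(entryPathFromId('VA256'), exampleEntry.metadata.type)
```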
Serialization / Deserialization
References
References to types annotated with @Entry are serialized using a sub-field id.
For example, consider this variation of our rocket flight schema above:

    directive @Entry on OBJECT

    type RocketFlight @Entry {
      id: ID!
      operator: Operator
    }

    type Operator @Entry {
      id: ID!
      fullName: String!
    }

An entry YAML file for a RocketFlight with ID VA256 referencing an Operator with ID Arianespace will look
like this:

    # commitspark/entries/VA256.yaml
    metadata:
      type: RocketFlight
      referencedBy: [ ]
    data:
      operator:
        id: Arianespace

The YAML file of referenced Operator with ID Arianespace will then look like this:

    # commitspark/entries/Arianespace.yaml
    metadata:
      type: Operator
      referencedBy:
        - VA256
    data:
      fullName: Arianespace SA

When this data is deserialized, Commitspark transparently resolves references to other @Entry instances, allowing for
retrieval of complex, linked data in a single query such as this one:

    query {
      RocketFlight(id: "VA256") {
        id
        operator {
          fullName
        }
      }
    }

This returns the following data:

    {
      "id": "VA256",
      "operator": {
        "fullName": "Arianespace SA"
      }
    }

Unions
In our rocket example, let's assume we want to store information about a rocket's stages. Assuming there are two
different types of rocket motors for a rocket stage, a stage could be modeled as a GraphQL union type Stage, allowing
different concrete types LiquidRocketMotor or SolidRocketMotor to be added to a rocket's stages list:

    directive @Entry on OBJECT

    type Rocket @Entry {
      id: ID!
      stages: [Stage!]!
    }

    union Stage =
      | LiquidRocketMotor
      | SolidRocketMotor

    type LiquidRocketMotor {
      fuelTemperature: Int!
    }

    type SolidRocketMotor {
      fuelMass: Int!
    }

During serialization, concrete type instances are represented through an additional nested level of data, using the concrete instance's type name as field name:

    # commitspark/entries/VA256.yaml
    metadata:
      type: Rocket
      referencedBy: [ ]
    data:
      stages:
        - LiquidRocketMotor:
            fuelTemperature: 21
        - SolidRocketMotor:
            fuelMass: 200000

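The mapping between this nested storage form and a flat object that carries its type name can be sketched in TypeScript as follows. This illustrates the idea only and is not the library's actual implementation; the type and function names are hypothetical, and using `__typename` as the type marker on the flat object is an assumption borrowed from GraphQL's standard meta field.

```typescript
// A serialized union value: exactly one key (the concrete type name) that
// wraps the fields of the concrete type instance.
type SerializedUnionValue = Record<string, Record<string, unknown>>

// A flat union value that records its concrete type in `__typename`.
type FlatUnionValue = { __typename: string } & Record<string, unknown>

// Removes the extra nesting level:
// { TypeName: { ...fields } } becomes { ...fields, __typename: 'TypeName' }.
function unwrapUnionValue(serialized: SerializedUnionValue): FlatUnionValue {
  const typeNames = Object.keys(serialized)
  if (typeNames.length !== 1) {
    throw new Error('Expected exactly one concrete type name as key')
  }
  const typeName = typeNames[0]
  return { ...serialized[typeName], __typename: typeName }
}

// Adds the nesting level back for storage.
function wrapUnionValue(value: FlatUnionValue): SerializedUnionValue {
  const { __typename, ...fields } = value
  return { [__typename]: fields }
}

const liquidStage = unwrapUnionValue({ LiquidRocketMotor: { fuelTemperature: 21 } })
console.log(liquidStage, wrapUnionValue(liquidStage))
```

Keying the stored value by type name keeps each list item self-describing, so no separate type marker field is needed in the YAML file.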
We can now query for a rocket and its stages like this:

    query {
      Rocket(id: "VA256") {
        id
        stages {
          __typename
          ... on LiquidRocketMotor {
            fuelTemperature
          }
          ... on SolidRocketMotor {
            fuelMass
          }
        }
      }
    }

This returns the following schema-conformant result data where the additional level of nesting has been transparently removed:

    {
      "id": "VA256",
      "stages": [
        {
          "__typename": "LiquidRocketMotor",
          "fuelTemperature": 21
        },
        {
          "__typename": "SolidRocketMotor",
          "fuelMass": 200000
        }
      ]
    }

Error handling
Instead of throwing errors, this library catches known error cases and returns error information for GraphQL calls via
the errors response field. The type of error is indicated in error field extensions.code, with additional
information in error field extensions.commitspark (where available). This allows API callers to determine the cause of
errors and take appropriate action.
Example GraphQL response with error:

    {
      "errors": [
        {
          "message": "No entry with ID \"SOME_UNKNOWN_ID\" exists.",
          "extensions": {
            "code": "NOT_FOUND",
            "commitspark": {
              "argumentName": "id",
              "argumentValue": "SOME_UNKNOWN_ID"
            }
          }
        }
      ]
    }

The following error codes are returned together with error codes of Git adapters as documented here:
| Error code | Description |
|---|---|
| BAD_USER_INPUT | Invalid input data provided by the caller |
| NOT_FOUND | Requested resource (entry, type, etc.) does not exist |
| BAD_REPOSITORY_DATA | Data in the repository is malformed or invalid according to schema |
| BAD_SCHEMA | Schema definition is malformed or invalid |
| IN_USE | Entry cannot be deleted because it is referenced by other entries |
| INTERNAL_ERROR | Internal processing error |
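On the caller side, the `errors` field can be inspected to decide how to react. The TypeScript sketch below shows one possible approach. The interfaces only follow the shape of the example response above and are assumptions rather than the library's published types; `firstErrorCode` and `describeAction` are hypothetical helpers, and the suggested actions are examples only.

```typescript
// Shape of one entry in the `errors` response field, following the example
// response above (assumed, not the library's published types).
interface GraphQLErrorLike {
  message: string
  extensions?: {
    code?: string
    commitspark?: Record<string, unknown>
  }
}

interface GraphQLResponseLike {
  data?: unknown
  errors?: GraphQLErrorLike[]
}

// Hypothetical helper: returns the code of the first error, if any.
function firstErrorCode(response: GraphQLResponseLike): string | undefined {
  return response.errors?.[0]?.extensions?.code
}

// Hypothetical helper: maps an error code to an example caller-side action.
// Codes not handled explicitly (including Git adapter codes) use the default.
function describeAction(code: string | undefined): string {
  switch (code) {
    case undefined:
      return 'proceed'
    case 'NOT_FOUND':
      return 'tell the user the entry does not exist'
    case 'BAD_USER_INPUT':
      return 'ask the user to correct the input'
    case 'IN_USE':
      return 'remove references before deleting'
    default:
      return 'report the failure'
  }
}

const notFoundResponse: GraphQLResponseLike = {
  errors: [
    {
      message: 'No entry with ID "SOME_UNKNOWN_ID" exists.',
      extensions: {
        code: 'NOT_FOUND',
        commitspark: { argumentName: 'id', argumentValue: 'SOME_UNKNOWN_ID' },
      },
    },
  ],
}
console.log(describeAction(firstErrorCode(notFoundResponse)))
```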
License
The code in this repository is licensed under the permissive ISC license (see LICENSE).