Data Contracts in Action: Tools
Some people have asked me “Are data contracts really a thing?”, usually followed by “What tools are available for working with data contracts?”
To address these questions, let’s first take a step back and explore the potential use cases for data contracts. Once we understand their applications, we can dive into the tools that support them.
Use Cases
When you use data contracts as your source of truth, the use cases shown above become standardized patterns across your data pipelines. By adopting a contract-first approach, you ensure that every aspect of your data pipeline is captured within the data contract. Any change to the pipeline, whether it relates to schema, security, stakeholders, physical location (e.g., data stored in a specific Postgres table), data quality, SLAs, or other elements, must go through the data contract.
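To make the idea concrete, here is a minimal sketch of a contract acting as the source of truth for validation. The `CONTRACT` layout, its field names, and the `validate_record` helper are all hypothetical and invented for illustration; they are not the ODCS format or the format of any particular tool.

```python
from typing import Any

# Hypothetical, simplified contract. The keys below are assumptions made
# for this sketch and do not follow ODCS or any specific tool's format.
CONTRACT: dict[str, Any] = {
    "name": "orders",
    "owner": "orders-team",
    "location": {"type": "postgres", "table": "public.orders"},
    "schema": {
        "order_id": {"type": "string", "required": True},
        "amount": {"type": "number", "required": True, "minimum": 0},
        "note": {"type": "string", "required": False},
    },
}

# Map contract type names to the Python types accepted for them.
PYTHON_TYPES = {"string": str, "number": (int, float), "boolean": bool}


def validate_record(record: dict[str, Any], contract: dict[str, Any]) -> list[str]:
    """Return a list of violations; an empty list means the record is valid."""
    errors: list[str] = []
    fields = contract["schema"]
    for name, spec in fields.items():
        if record.get(name) is None:
            if spec.get("required", False):
                errors.append(f"{name}: missing required field")
            continue
        value = record[name]
        # bool is a subclass of int, so reject it explicitly for numbers.
        wrong_type = not isinstance(value, PYTHON_TYPES[spec["type"]])
        if spec["type"] == "number" and isinstance(value, bool):
            wrong_type = True
        if wrong_type:
            errors.append(f"{name}: expected {spec['type']}")
            continue
        if "minimum" in spec and value < spec["minimum"]:
            errors.append(f"{name}: {value} is below minimum {spec['minimum']}")
    # Fields that the contract does not define count as violations too.
    for name in record:
        if name not in fields:
            errors.append(f"{name}: field not defined in contract")
    return errors


if __name__ == "__main__":
    print(validate_record({"order_id": "a1", "amount": 10.5}, CONTRACT))
    print(validate_record({"amount": -1, "extra": 1}, CONTRACT))
```

Because the checks are driven entirely by the contract, adding a column or tightening a quality rule means editing the contract rather than the validation code.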
Another key advantage of a contract-first approach is that it abstracts your data infrastructure from specific technologies. Whether you’re using a particular database, cloud provider, file format, message format, ETL tool, orchestrator, or data catalog, you can generate the necessary artifacts directly from the data defined in the data contract.
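As a rough illustration of generating artifacts from a contract, the sketch below derives both a Postgres `CREATE TABLE` statement and a JSON Schema document from one definition. The contract layout, the type mapping, and the function names `to_postgres_ddl` and `to_json_schema` are assumptions made for this example; real tools such as Data Contract CLI have their own formats and exporters.

```python
import json
from typing import Any

# Hypothetical, simplified contract. The keys are assumptions for this
# sketch and do not follow ODCS or any specific tool's format.
CONTRACT: dict[str, Any] = {
    "name": "orders",
    "location": {"type": "postgres", "table": "public.orders"},
    "schema": {
        "order_id": {"type": "string", "required": True},
        "amount": {"type": "number", "required": True},
        "note": {"type": "string", "required": False},
    },
}

# Assumed mapping from contract types to Postgres column types.
SQL_TYPES = {"string": "TEXT", "number": "NUMERIC", "boolean": "BOOLEAN"}


def to_postgres_ddl(contract: dict[str, Any]) -> str:
    """Render the contract's schema as a Postgres CREATE TABLE statement."""
    columns = []
    for name, spec in contract["schema"].items():
        null_clause = " NOT NULL" if spec.get("required", False) else ""
        columns.append(f"    {name} {SQL_TYPES[spec['type']]}{null_clause}")
    body = ",\n".join(columns)
    return f"CREATE TABLE {contract['location']['table']} (\n{body}\n);"


def to_json_schema(contract: dict[str, Any]) -> dict[str, Any]:
    """Render the same schema as a JSON Schema object definition."""
    fields = contract["schema"]
    return {
        "title": contract["name"],
        "type": "object",
        "properties": {name: {"type": spec["type"]} for name, spec in fields.items()},
        "required": [name for name, spec in fields.items() if spec.get("required", False)],
    }


if __name__ == "__main__":
    print(to_postgres_ddl(CONTRACT))
    print(json.dumps(to_json_schema(CONTRACT), indent=2))
```

Swapping Postgres for another database, or JSON Schema for another message format, would then mean adding another generator function while the contract itself stays unchanged.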
But how do we make this a reality? Words are cheap, so let’s jump straight into the action and see what tools we can use to take this concept to production.
Tools
Generate, Export and Validate Contracts
- Data Contract CLI — import and export to different data sources
- Data Contract Playground — website to play around with creating, exporting or validating data contracts via datacontract-cli
Data Catalog
- Datahub — catalog of data contracts
- Data Mesh Manager — data contracts applied to a data mesh architecture
Data Quality
Testing
ETL
Schema Registry
Standard
The Open Data Contract Standard (ODCS) has introduced a standardized data contract format designed to provide a common framework for data practitioners. Similar to how the OpenAPI specification simplified life for API developers, ODCS aims to streamline many aspects of working with data.
In an ideal world, with a standard data contract format, we could pool our efforts into building, updating and maintaining a common toolset that everyone can benefit from. So why not? Who is holding you back?
Conclusion
Using data contracts with the right tools helps keep data accurate and consistent across your organisation. Whether it’s schema details, access permissions, or SLAs provided by data producers, all this information is centralised in one place. This creates a single source of truth, allowing teams to collaborate more effectively, avoid errors, and make informed decisions. As technology evolves, adopting data contracts helps your organisation stay adaptable and efficient, regardless of the tools or tech stack you use.
Other data contract articles I’ve written can be found here.
Feel free to let me know if there are any other tools I’ve missed that utilise data contracts, or if you can think of additional use cases where data contracts could be applied.
Thanks for reading!