The Astronomer Blog: Data Orchestration Insights & Guides

The latest insights from our team of Apache Airflow® experts.

Better together: Astro CLI and Astro IDE

What is Data Observability?

A guide to understanding data observability, why it matters, and how modern teams are using it to deliver trusted, reliable data at scale.

Best practices for writing Airflow Dags with AI

Learn how to effectively use AI tools like the Astro IDE to write maintainable Airflow DAGs by providing proper context, using structured prompting techniques, and enforcing governance standards rather than relying on "vibe coding."

Astro IDE in Action: How Data Teams Are Accelerating With AI

Since launching in private preview earlier this year and over the course of the public preview, design partners across industries – from automotive to travel, logistics, education, and sports – have been putting Astro IDE to work.

Orchestrating Data Quality with Airflow

Here we dive into why and how the Data Team built scalable, maintainable data quality checks into the developer experience and made high data quality an attainable outcome.

Meet the Astronomer Data Team

We’re Astronomer’s in-house data team. We center around two main goals: make data valuable and reliable at Astronomer, all with Airflow!

Update: Astro Observe is Now In Public Preview

Today, we’re excited to announce Astro Observe, which brings new and more robust data observability capabilities to users on Astro, OSS Airflow, Amazon Managed Workflows for Apache Airflow (MWAA), and Google Cloud Composer (GCC).

Introducing Apache Airflow 2.10

The Airflow 2.10 release brings greater flexibility and expansion of some of the most widely used Airflow features.

What’s New in the Astro Platform Release

We are excited to announce the new release of the Astro Platform, introducing exciting new features designed to enhance your data orchestration experience.

Understanding Airflow Trigger Rules: A Comprehensive Visual Guide

Discover the intricacies of Airflow trigger rules with visual examples and practical applications. Learn how to define and use various trigger rules to optimize your DAGs efficiently in Airflow. Essential reading for Airflow users working with version 2.9.2.

Introducing Apache Airflow® 2.9

The Airflow 2.9 release brings significant enhancements to user-favorite features like data-aware scheduling, dynamic task mapping, and object storage.

What’s new in the Astro Platform Release, Q1 2024

Welcome to the latest Astro Platform release — we’re thrilled to introduce enhancements aimed at bolstering governance at scale and across environments, fortifying the security of your data platform, and accelerating innovation.

Reliable Data Orchestration for AI Applications

Discover how Dosu leverages Astronomer to streamline data orchestration for AI applications, ensuring reliable pipelines and boosting productivity. Learn how this partnership enhances AI development and supports open-source communities.

Welcome to our new New York City headquarters!

Astronomer has moved! At the start of this year, we relocated our headquarters to the heart of New York City at 50 West 23rd Street to support our growing business and customer base.

Standardizing your Astro projects with Cookiecutter and Cruft

A demonstration of how a platform team can develop a template Astro project for bootstrapping Astro projects for development teams. We demonstrate how to use Cookiecutter for developing a template project and Cruft for synchronizing generated projects with changes in the template project.

Introducing the Astronomer Champions Program for Apache Airflow®

Today, we're thrilled to announce the launch of the Astronomer Champions Program for Apache Airflow®, a global initiative designed to recognize and empower outstanding data practitioners who are dedicated advocates of this powerful open-source orchestration tool.

Introducing Airflow 2.8

The latest minor Airflow release includes new features and improvements such as the Airflow ObjectStore, Listener hook for Datasets, enhanced logging capabilities, and more.

Introducing the Astro Platform Release, Q4 2023

Unveiling Astro's latest features for streamlined connectivity, confident upgrades, and cost-efficient scaling. In this article, we’ll dive into these key features and explore how they can benefit your organization.

Introducing Apache Airflow® on Astro – an Azure Native ISV Service

Introducing Apache Airflow® on Astro, an Azure Native ISV Service. This partnership with Microsoft seamlessly embeds Apache Airflow® into the Azure ecosystem, offering a unified environment for scalable, secure, and easy-to-manage mission-critical data pipelines.

3 Key Takeaways from Airflow Summit 2023

Last week, the first-ever in-person Airflow Summit occurred in Toronto, Canada. Over 500 attendees from 20+ countries came together for all things Airflow, orchestration, and open source.

Advanced Airflow CDC Implementation

Dive deeper into Airflow CDC implementation. Explore advanced use cases, best practices, and handle schema evolution & log-based sync effectively.

Introducing Airflow 2.7

The latest minor release includes several new features, such as automatic setup/teardown of tasks, built-in OpenLineage support, cluster activity view, fail-stop functionality, and more.

Test Airflow Upgrades with the Astro CLI

The Local Upgrade Test command in the Astro CLI eliminates upgrade pains and ensures safe upgrades, allowing users to confidently identify and resolve compatibility issues and DAG import errors.

Advantages of Hosted Airflow for Your ETL Workflows

Maximize ETL efficiency by running hosted Apache Airflow® on Astro instead of self-hosting the open-source version. Benefit from simplified infrastructure management, scalable elasticity, and dedicated support for your workloads.

Three ways to use Airflow with MotherDuck and DuckDB

Use Apache Airflow® with DuckDB and MotherDuck in three different ways. Access the DuckDB Python package directly, leverage the DuckDB Airflow provider, and use DuckDB with the Astro Python SDK.

The Top 7 Alternatives to Google Cloud Composer

Picking the right tools for your data stack depends on your exact business and engineering needs, and the choice may seem daunting. Thankfully, there are several popular tools, each with thousands of users, all with a unique approach for managing data pipelines.

Introducing Airflow 2.6

Apache Airflow® 2.6 contains over 500 commits from over 130 contributors, adding up to 35 new features, 50 general improvements, and 27 bug fixes.

The Airflow Year in Review 2022

Find out how Airflow has been optimized in 2022. Learn about major updates, including data-driven scheduling, dynamic task mapping, and UI enhancements.

5 Ways to View and Manage DAGs in Airflow

Find out what the most popular and useful DAG views in the Airflow UI are. Learn about the Airflow Graph View, Grid View, Calendar View, and Browse Tab.

Introducing the Astro Cloud IDE

Discover the Astro Cloud IDE, a notebook-inspired tool for writing data pipelines. See how to define tasks and connections without knowing Apache Airflow®.

What’s New in Apache Airflow® 2.5

Check out what’s new in Apache Airflow® 2.5. Learn more about improvements to Airflow’s dynamic task mapping and data-dependent scheduling features.

A Short History of DAG Writing

Learn about Airflow & its updated features. Get to know how users can benefit from Taskflow API, Custom XCom Backends, Astro SDK, and the Astro Cloud IDE.

Introducing Astronomer Providers

Find out more about Astronomer Providers, a set of Apache 2.0-licensed Airflow providers with async functionality, created and maintained by Astronomer experts.

Airflow and dbt, Hand in Hand

Learn how to use Airflow and dbt together to advance data orchestration and data transformation projects and facilitate collaboration across data teams.

Letter from the CEO: Our Story So Far

Hear how Joe Otto reflects on Astronomer’s history and looks to a future powered by the combination of orchestration, lineage, and observability.

Airflow Best Practices

Master Apache Airflow® with these 10 best practices. Learn how to optimize your data pipelines, improve efficiency, and avoid common pitfalls.

Machine Learning Pipeline Orchestration

Learn about machine learning orchestration, machine learning pipelines, and their components. See why Apache Airflow® is the top ML data orchestration tool.

How to Build a Modern Data Stack

Breaking down what a modern data stack means in practice. We discuss four core components, five reasons to set it up, and how to orchestrate it.

Apache Airflow® vs. Apache Beam: A Comparative Guide

Explore the differences and similarities between Apache Beam and Airflow. Understand their capabilities, programming models, and ideal use cases to make the right choice for your data management needs.

Apache NiFi vs. Apache Airflow®

Get to know the major benefits and limitations of Apache NiFi and Apache Airflow, and see which of the two popular ETL tools is better for data management.

How to Build an ETL Process?

Learn what an ETL process is and how to build it. Find out how Apache Airflow® can help you create, scale, and manage ETL pipelines more effectively.

Airflow Summit 2021 Highlights

Learn what the Airflow community got up to in 2021, in this recap of the biggest international Airflow event. Get ready for the next Airflow Summit!

The New KubernetesExecutor

Learn more about the KubernetesExecutor and its upgrade to version 2.0. See new features redesigned with Airflow admins and data engineers in mind.

Announcing the Astronomer Registry

Learn about the discovery-and-distribution hub for Airflow integrations. See how to bridge the gap between the Airflow community and the data ecosystem.

Airflow 2.0 TaskFlow API and Its Features

Learn more about the TaskFlow API and read about its features. Get to know how TaskFlow API in Airflow 2.0 enables a better DAG authoring experience.

Near-Real-Time CDC with Airflow and GCP

Learn how to implement near-real-time Change Data Capture (CDC) in Airflow using a scheduled GCP CloudSQL export approach for data pipelines.

The Airflow 2.0 Scheduler

Explore the features of the updated Apache Airflow® 2.0 Scheduler. Learn how the Airflow Scheduler enables quick and seamless initiation of tasks.

Introducing Airflow 2.0

Get to know the highlights of Apache Airflow® 2.0 and see hundreds of new features it includes. Have a look at how Airflow 2.0 compares to Airflow 1.10.

Introducing KEDA for Airflow

Explore the possibilities of the Kubernetes Event-Driven Autoscaler. See how KEDA helps users improve the efficiency of their Apache Airflow® deployments.

Why Airflow?

Discover Apache Airflow® and explore its workflow-management capabilities. See which global companies use Airflow to solve data engineering challenges.

The Next Generation of Astronomer Cloud

Learn how Astronomer Cloud supports the latest version of Apache Airflow®. See the features included in the newly released, next-generation data platform.

Astronomer v0.10

Learn more about Astronomer v0.10 and its key updated functionalities. See how the new Astronomer Platform supports the latest version of Apache Airflow®.

Astronomer v0.8.0 Release Notes

Discover the newly launched Astronomer v0.8.0 and the features it includes. Find out what’s been fixed, improved, and added to the Astronomer Platform.

Astronomer v0.4.1 Release

Check out the highlights of the Astronomer v0.4.1 release. See the full summary of upgrades and learn more about the Astronomer Platform's new features.

Astronomer v0.3.2 Release

Learn more about Astronomer v0.3.2 and its new, updated functionalities. Find out what’s been changed, fixed, and added to the Astronomer Platform.

Announcing Astronomer v0.3

Find out more about Astronomer v0.3 and its great benefits. Get to know what features are included in the newly released next-generation data platform.

Announcing Astronomer SpaceCamp

Discover Astronomer SpaceCamp and see how it gets data teams up and running with Airflow in no time. See the benefits of different SpaceCamp versions.

Announcing The Airflow Podcast

Learn about Astronomer's podcast focused on the future potential of Apache Airflow®, as seen by top players in the data engineering space.

What Exactly Is a DAG?

Learn what a DAG is and how it's used in data pipelines. Explore benefits, real-world examples, and FAQs in this comprehensive guide.

Our Open Source Philosophy

Get to know how the open-source approach helps drive growth and innovation. Learn why it’s worth investing in open-source components like Apache Airflow®.

Why Is My Data Playing Hard to Get?

Learn more about the different types and properties of hard-to-reach data with great potential. Find out how to access, organize, and store it effectively.

Airflow at Astronomer

Learn why Astronomer needed a unified scheduling system to extract and monitor all types of data pipelines. Find out why Apache Airflow® was our answer.

Lessons Learned Writing Data Pipelines

See how to simplify the data pipeline writing process with the right tools. Learn what Astronomer experts do to make data pipelines less challenging.

An Almost Acquisition Story

Coming out of AngelPad’s 2015 Demo Day, we found ourselves vacillating between an acquisition and Series A, though we were arguably too early for either.

A Logo Story

Astronomer's Head of Design, Chris Hendrixson, explains how he created the design aesthetic to encompass data, futurism, and a little bit of fun.

Setting Up Your Redshift Cluster

Redshift is popular, but you still need to know what you're doing when spinning up your first cluster. In this tutorial, we walk you through the process.

