An opinionated Python toolkit for air quality data analysis.
Documentation · GitHub · PyPI
Aeolus provides a simple, unified, opinionated workflow for downloading and working with air quality data from multiple sources. Access over 28 billion station-hours of monitoring data from 145,000+ locations across 100+ countries through a single consistent API.
Aeolus distinguishes between two types of data source:
- Networks are discrete monitoring networks with a known set of sites (e.g. the UK's AURN/SAQN or Breathe London). You can list all sites and download data directly.
- Portals are global data aggregators (e.g. OpenAQ). With hundreds of thousands of sites worldwide, you search first, then download.
| Type | Source | Coverage | API Key |
|---|---|---|---|
| Network | AURN, SAQN, WAQN, NI, AQE | UK regulatory networks | No |
| Network | LAQN | London Air Quality Network (~250 sites) | No |
| Network | EEA | European Environment Agency (40+ countries, 7,000+ stations) | No |
| Network | Breathe London | London low-cost sensors | Yes |
| Network | AirQo | African cities (200+ sensors) | Yes |
| Network | Sensor.Community | Global citizen science (35,000+) | No |
| Network | EPA AirNow | USA, Canada, Mexico | Yes |
| Network | Sonitus | Smart Dublin, Ireland | No |
| Portal | OpenAQ | Global (100+ countries) | Yes |
| Portal | PurpleAir | Global low-cost sensors (30,000+) | Yes |
Installation
Requires Python 3.11 or later.
Quick Start
import aeolus from datetime import datetime # Download data from the UK's national network data = aeolus.download( "AURN", sites=["MY1", "KC1"], # Marylebone Road, North Kensington start_date=datetime(2024, 1, 1), end_date=datetime(2024, 3, 31) ) print(data.head())
site_code date_time measurand value units source_network
0 MY1 2024-01-01 00:00:00 NO2 42.3 ug/m3 AURN
1 MY1 2024-01-01 00:00:00 PM2.5 18.7 ug/m3 AURN
2 MY1 2024-01-01 00:00:00 PM10 24.1 ug/m3 AURN
...
Data Sources
Aeolus connects to 12 monitoring networks and 2 global data portals, providing access to an estimated 28 billion station-hours of air quality data. This spans reference-grade government networks (AURN, EEA, AirNow), low-cost sensor networks (Sensor.Community, PurpleAir, Breathe London), and global aggregation portals (OpenAQ). All sources are normalised to a common 8-column schema, so data from a DEFRA reference monitor in London and a citizen science sensor in Kampala arrive in the same format.
UK Regulatory Networks
These networks provide quality-assured data from reference-grade monitors operated by UK government bodies:
| Network | Description | Coverage |
|---|---|---|
| AURN | Automatic Urban and Rural Network | England, Wales, Scotland, N. Ireland |
| SAQN | Scottish Air Quality Network | Scotland |
| WAQN | Welsh Air Quality Network | Wales |
| NI | Northern Ireland Network | Northern Ireland |
| AQE | Air Quality England | England (local authorities) |
| LAQN | London Air Quality Network | Greater London (~250 sites) |
# Get metadata for all AURN sites sites = aeolus.networks.get_metadata("AURN") # Download from multiple UK networks data = aeolus.download( { "AURN": ["MY1", "KC1"], "SAQN": ["GLA4", "ED3"] }, start_date=datetime(2024, 1, 1), end_date=datetime(2024, 1, 31) )
Breathe London
High-density sensor network across London, operated by Imperial College London's Environmental Research Group.
# Get Breathe London site metadata sites = aeolus.networks.get_metadata("BREATHE_LONDON") # Download data data = aeolus.download( "BREATHE_LONDON", sites=["BL0001", "BL0002"], start_date=datetime(2024, 1, 1), end_date=datetime(2024, 1, 31) )
Requires API key: Set BL_API_KEY in your environment. Get a free key at breathelondon.org/developers.
AirQo (Africa)
Air quality monitoring network focused on African cities, operated by Makerere University. Provides PM2.5 and PM10 data from 200+ low-cost sensors across 16+ cities.
# Get AirQo site metadata sites = aeolus.networks.get_metadata("AIRQO") # Filter to a specific country uganda_sites = sites[sites["country"] == "Uganda"] # Download data data = aeolus.download( "AIRQO", sites=uganda_sites["site_code"].head(5).tolist(), start_date=datetime(2024, 1, 1), end_date=datetime(2024, 1, 31) )
Requires API key: Set AIRQO_API_KEY in your environment. Get a free key at analytics.airqo.net.
Sensor.Community (Global)
Global citizen science network (formerly luftdaten.info) with 35,000+ low-cost sensors worldwide. Provides PM2.5, PM10, temperature, humidity, and pressure data. No API key required.
# Find sensors in a geographic area from aeolus.sources.sensor_community import fetch_sensor_community_metadata sites = fetch_sensor_community_metadata( area=(51.5074, -0.1278, 50) # lat, lon, radius_km ) # Download data using the standard interface data = aeolus.download( "SENSOR_COMMUNITY", sites=sites["site_code"].head(5).tolist(), start_date=datetime(2024, 1, 1), end_date=datetime(2024, 1, 7) )
Rate limiting: Aeolus includes built-in rate limiting (10 requests/minute by default) to be respectful of the community-run infrastructure. You can configure this:
from aeolus.sources.sensor_community import set_rate_limiting # Adjust rate limits set_rate_limiting(max_requests=5, period=60, min_delay=2.0) # Disable (not recommended) set_rate_limiting(enabled=False)
Note: Data is marked as Unvalidated since this is citizen science data without formal QA/QC processes.
EPA AirNow (USA)
Real-time air quality data from the US EPA's AirNow system, covering the United States, Canada, and parts of Mexico. Provides O3, PM2.5, PM10, NO2, SO2, and CO data from thousands of monitoring stations.
# Get current air quality at a location from aeolus.sources.airnow import fetch_airnow_current current = fetch_airnow_current( latitude=34.05, longitude=-118.24, distance=25 # miles ) # Find monitoring sites in a bounding box # bbox format: (min_lon, min_lat, max_lon, max_lat) - same as GeoJSON/shapely sites = aeolus.networks.get_metadata( "AIRNOW", bbox=(-118.5, 33.7, -117.5, 34.3) # LA area ) # Download historical data (up to ~45 days) data = aeolus.download( "AIRNOW", sites=sites["site_code"].head(3).tolist(), start_date=datetime(2024, 1, 1), end_date=datetime(2024, 1, 7) )
Requires API key: Set AIRNOW_API_KEY in your environment. Get a free key at docs.airnowapi.org.
Note: AirNow provides provisional (real-time) data with approximately 45 days of history. For verified historical data going back years, use EPA AQS (via pyaqsapi or OpenAQ).
OpenAQ
Global air quality portal aggregating measurements from 100+ countries.
# Search for monitoring locations locations = aeolus.portals.find_sites("OPENAQ", country="GB") # Download data using site codes site_codes = locations["site_code"].head(5).tolist() data = aeolus.portals.download( "OPENAQ", sites=site_codes, start_date=datetime(2024, 1, 1), end_date=datetime(2024, 1, 31) )
Requires API key: Set OPENAQ_API_KEY in your environment. Get a free key at openaq.org.
PurpleAir (Global)
Global network of 30,000+ low-cost air quality sensors, popular with researchers and citizen scientists. PurpleAir sensors use dual laser counters for improved accuracy and measure PM1, PM2.5, PM10, temperature, humidity, and pressure.
# Search for PurpleAir sensors in a bounding box (e.g., London) # bbox format: (min_lon, min_lat, max_lon, max_lat) - same as GeoJSON/shapely sites = aeolus.portals.find_sites( "PURPLEAIR", bbox=(-0.5, 51.3, 0.3, 51.7), location_type=0 # 0 = outdoor only ) # Download data from specific sensors data = aeolus.portals.download( "PURPLEAIR", sites=["131075", "131076"], # Sensor indices from map.purpleair.com start_date=datetime(2024, 1, 1), end_date=datetime(2024, 1, 31) )
Requires API key: Set PURPLEAIR_API_KEY in your environment. Get a free key (includes 1M API points) at develop.purpleair.com.
Note: PurpleAir sensors have dual laser counters (A and B channels). Aeolus automatically applies literature-based QA/QC and flags data quality:
Validated: Both channels agree (±10 µg/m³ for low concentrations, ±10% for high)Channel Disagreement: Both channels valid but disagree beyond thresholdsSingle Channel (A/B): Only one channel had valid dataBelow Detection Limit: Value below 0.3 µg/m³ (sensor noise floor)Sensor Saturation: Value above 1000 µg/m³
Working with the Data
Standardised Format
All data sources return pandas DataFrames with a consistent schema:
| Column | Description |
|---|---|
site_code |
Unique site identifier |
date_time |
Measurement timestamp |
measurand |
Pollutant (NO2, PM2.5, PM10, O3, etc.) |
value |
Measured concentration |
units |
Units (typically µg/m³) |
source_network |
Data source |
Data Transformations
Aeolus includes composable transformation functions for data processing:
from aeolus.transforms import pipe, filter_rows, select_columns # Filter to NO2 measurements above 40 µg/m³ exceedances = pipe( data, filter_rows(lambda df: df["measurand"] == "NO2"), filter_rows(lambda df: df["value"] > 40), select_columns("site_code", "date_time", "value") )
Combining Sources
Download from multiple sources in a single call:
data = aeolus.download( { "AURN": ["MY1"], "BREATHE_LONDON": ["BL0001"], "OPENAQ": ["2178"] }, start_date=datetime(2024, 1, 1), end_date=datetime(2024, 1, 31) ) # All data in one DataFrame with source_network column data.groupby("source_network").size()
Configuration
Environment Variables
Create a .env file or set these in your environment:
# Required for OpenAQ OPENAQ_API_KEY=your_key_here # Required for Breathe London BL_API_KEY=your_key_here # Required for AirQo AIRQO_API_KEY=your_key_here # Required for PurpleAir PURPLEAIR_API_KEY=your_key_here # Required for EPA AirNow AIRNOW_API_KEY=your_key_here
Using with dotenv
from dotenv import load_dotenv load_dotenv() import aeolus # API keys are now available
API Reference
Top-Level Functions
# Download data (smart routing to appropriate source) aeolus.download(sources, sites, start_date, end_date) # Date range shorthand (alternative to explicit dates) aeolus.download("AURN", ["MY1"], last="30d") # also: "2w", "6m", "1y" # List all available sources aeolus.list_sources() # Find monitoring sites across sources aeolus.find_sites("AURN", near=(51.5074, -0.1278), radius_km=20) # Get near-real-time readings (UK regulatory networks) aeolus.get_current("AURN", sites=["MY1", "KC1"]) # Quick data overview aeolus.summarise(data) # Get information about a source aeolus.get_source_info("AURN")
Networks (UK regulatory, Breathe London)
# List available networks aeolus.networks.list_networks() # Get site metadata aeolus.networks.get_metadata("AURN") # Download data aeolus.networks.download("AURN", ["MY1"], start_date, end_date)
Portals (OpenAQ, PurpleAir)
# List available portals aeolus.portals.list_portals() # Search for monitoring locations (filters required) aeolus.portals.find_sites("OPENAQ", country="GB") aeolus.portals.find_sites("OPENAQ", city="London") # bbox format: (min_lon, min_lat, max_lon, max_lat) aeolus.portals.find_sites("PURPLEAIR", bbox=(-0.5, 51.3, 0.3, 51.7)) # Download data aeolus.portals.download("OPENAQ", sites, start_date, end_date) aeolus.portals.download("PURPLEAIR", sites, start_date, end_date)
Examples
Annual Regulatory Statistics
import aeolus from aeolus import metrics from datetime import datetime # Download a full year of data data = aeolus.download( "AURN", sites=["MY1"], start_date=datetime(2023, 1, 1), end_date=datetime(2023, 12, 31) ) # Regulatory statistics (annual mean, exceedances, data capture) stats = metrics.aq_stats(data) print(stats) # Time averaging with data capture thresholds daily = metrics.time_average(data, freq="D", data_thresh=0.75) # Trend analysis requires multi-year data (≥6 months) multi_year = aeolus.download( "AURN", sites=["MY1"], start_date=datetime(2020, 1, 1), end_date=datetime(2023, 12, 31) ) result = metrics.trend(multi_year, pollutant="NO2") print(f"NO2 trend: {result.slope:.2f} µg/m³/year (p={result.p_value:.4f})")
Compare Sites Across Networks
# Download from multiple networks data = aeolus.download( { "AURN": ["MY1", "KC1"], "SAQN": ["GLA4"] }, start_date=datetime(2024, 1, 1), end_date=datetime(2024, 6, 30) ) # Monthly NO2 by site monthly = ( data[data["measurand"] == "NO2"] .set_index("date_time") .groupby(["site_code", pd.Grouper(freq="M")])["value"] .mean() .unstack(level=0) )
Export to CSV
data = aeolus.download("AURN", ["MY1"], start_date, end_date) data.to_csv("marylebone_road_2024.csv", index=False)
Find Sites Near a Location
# Find all free-source sites within 10km of central London sites = aeolus.find_sites(near=(51.5074, -0.1278), radius_km=10) print(sites[["site_code", "site_name", "source_network", "distance_km"]]) # Download from the nearest site nearest = sites.iloc[0]["site_code"] data = aeolus.download("AURN", [nearest], last="7d")
Near-Real-Time Data
# Get the latest readings from UK regulatory monitors latest = aeolus.get_current("AURN", sites=["MY1", "KC1"]) print(latest[["site_code", "date_time", "measurand", "value"]])
Air Quality Indices
Aeolus includes a comprehensive metrics module for calculating air quality indices from downloaded data.
Supported Indices
| Index | Country/Region | Scale | Description |
|---|---|---|---|
| UK_DAQI | UK | 1-10 | Daily Air Quality Index |
| US_EPA | USA | 0-500 | EPA AQI with NowCast |
| CHINA | China | 0-500 | China AQI |
| WHO | Global | Pass/Fail | WHO 2021 Guidelines |
| EU_CAQI_ROADSIDE | EU | 1-6 | European AQI (traffic) |
| EU_CAQI_BACKGROUND | EU | 1-6 | European AQI (background) |
| INDIA_NAQI | India | 0-500 | National AQI |
Quick Example
import aeolus from aeolus import metrics from datetime import datetime # Download data data = aeolus.download( "AURN", sites=["MY1"], start_date=datetime(2024, 1, 1), end_date=datetime(2024, 12, 31) ) # Calculate UK DAQI summary summary = metrics.aqi_summary(data, index="UK_DAQI") print(summary) # Monthly breakdown monthly = metrics.aqi_summary(data, index="UK_DAQI", freq="M") # Check WHO guideline compliance compliance = metrics.aqi_check_who(data) print(compliance[["pollutant", "meets_guideline", "exceedance_ratio"]])
Summary Options
# Get overall AQI only (no per-pollutant breakdown) simple = metrics.aqi_summary(data, index="UK_DAQI", overall_only=True) # Wide format output (one row per period) wide = metrics.aqi_summary(data, index="UK_DAQI", freq="M", format="wide") # Different aggregation frequencies daily = metrics.aqi_summary(data, index="UK_DAQI", freq="D") weekly = metrics.aqi_summary(data, index="UK_DAQI", freq="W") monthly = metrics.aqi_summary(data, index="UK_DAQI", freq="M") yearly = metrics.aqi_summary(data, index="UK_DAQI", freq="Y")
WHO Guidelines
The WHO module checks compliance against the 2021 Air Quality Guidelines and interim targets:
from aeolus import metrics # Check against the AQG (strictest target) compliance = metrics.aqi_check_who(data, target="AQG") # Check against interim targets for progressive improvement it1 = metrics.aqi_check_who(data, target="IT-1") # Least strict it4 = metrics.aqi_check_who(data, target="IT-4") # More strict
Unit Conversion
The metrics module automatically converts units where needed (e.g., ppb to µg/m³) and warns you when conversions are applied.
Acknowledgements
Aeolus wouldn't be possible without the work of many organisations and individuals. See REFERENCES.md for full citations and methodology sources.
Code Contributors
- Dr Ruaraidh Dobson — Project creator, architecture, documentation
- Claude (Anthropic) — Code implementation, including data source integrations, AQI calculations, QA/QC methodology, and test suites
Data Providers
- OpenAQ — Open, global air quality data portal and API
- Breathe London — Imperial College London's Environmental Research Group (Open Government Licence v3.0)
- AirQo — Makerere University's air quality monitoring network for African cities
- PurpleAir — Global network of low-cost sensors
- Sensor.Community — Global citizen science sensor network (formerly luftdaten.info)
- EPA AirNow — US Environmental Protection Agency real-time air quality data
- UK regulatory bodies (DEFRA, SEPA, Natural Resources Wales, DAERA) — Reference-grade monitoring networks
Standards and Methodologies
- US EPA — Air Quality Index and NowCast algorithm
- DEFRA/COMEAP — UK Daily Air Quality Index
- WHO — 2021 Air Quality Guidelines
- CITEAIR Project — EU Common Air Quality Index
- CPCB India — National Air Quality Index
- China MEE — HJ 633-2012 AQI Standard
- PurpleAir Community — QA/QC methodology for dual-channel sensors
Software
- openair — David Carslaw and Karl Ropkins' R package, which provides the data files for UK regulatory networks. If you use Aeolus with UK data, please cite: Carslaw, D.C. and K. Ropkins (2012) openair — an R package for air quality data analysis. Environmental Modelling & Software 27-28, 52-61.
- purpleair-api — Carlos Santos' Python wrapper for the PurpleAir API
Contributing
Contributions are welcome. The codebase is designed to be extensible — see src/aeolus/sources/ for examples of how data sources are implemented.
Licence
GNU General Public License v3.0 or later. See LICENCE for details.
Contact
Ruaraidh Dobson — ruaraidh.dobson@gmail.com
Issues and feature requests: GitHub Issues