GitHub - spacegauch0/flights-scraper-effect

7 min read Original article ↗

✈️ Google Flights Scraper (Effect + Protocol Buffers)

A high-performance Google Flights scraper built with TypeScript Effect and Protocol Buffers, inspired by fast-flights.

🚀 Features

Core Capabilities

  • One-way, Round-trip, Multi-city flights
  • All cabin classes (Economy, Premium Economy, Business, First)
  • Multiple passengers (Adults, Children, Infants in seat/on lap)
  • Advanced filtering (Price, Duration, Airlines, Stops)
  • Flexible sorting (Price, Duration, Airline)
  • Price indicator (Low/Typical/High)
  • Detailed flight info (Departure, Arrival, Duration, Stops, Delays)

Production Features 🎯

  • Response caching with TTL (15 min default, 300x faster)
  • Rate limiting (10 req/min, protects against blocking)
  • Retry logic with exponential backoff (3 attempts default)
  • Enhanced error messages with troubleshooting guides

Technical Advantages

  • 5x faster than Puppeteer (HTTP requests vs browser automation)
  • 💰 4x less memory usage
  • 🌐 Edge-compatible (No browser required!)
  • 🔒 Type-safe with Effect error handling
  • 📦 Lightweight dependencies
  • 🔄 Production-ready with built-in reliability features

📦 Installation

bun install
# or
npm install

🎯 Quick Start

Using the CLI

The easiest way to get started is using the command-line interface:

# Launch interactive TUI
bun run start

# Or use CLI directly
bun run cli --from JFK --to LHR --depart-date 2026-01-19

Using as a Library

You can also use the scraper as a library in your own code:

import { Effect } from "effect"
import { ScraperService, ScraperProtobufLive } from "./src"

const program = Effect.gen(function* () {
  const scraper = yield* ScraperService
  
  const result = yield* scraper.scrape(
    "JFK",              // From
    "LHR",              // To
    "2026-01-19",       // Depart date
    "one-way",          // Trip type
    undefined,          // Return date (for round-trip)
    "price-asc",        // Sort by price ascending
    { limit: 10 },      // Filters
    "economy",          // Seat class
    { adults: 1, children: 0, infants_in_seat: 0, infants_on_lap: 0 }, // Passengers
    ""                  // Currency (empty = default)
  )
  
  console.log(`Found ${result.flights.length} flights`)
  console.log(`Price level: ${result.current_price}`)
})

Effect.runPromise(program.pipe(Effect.provide(ScraperProtobufLive)))

📁 Project Structure

flights-scraper-effect/
├── src/
│   ├── cli/              # Command-line interface
│   │   └── index.ts      # CLI implementation with argument parsing
│   ├── tui/              # Terminal User Interface
│   │   └── index.ts      # Interactive TUI implementation
│   ├── domain/           # Types, schemas, and errors
│   │   ├── types.ts      # FlightOption, Result, filters, etc.
│   │   ├── errors.ts     # ScraperError and error helpers
│   │   ├── validation.ts # Validation utilities
│   │   └── index.ts
│   ├── services/         # Service interface and implementations
│   │   ├── scraper.ts    # Service interface definition
│   │   ├── scraper-protobuf.ts    # HTTP-based implementation
│   │   ├── scraper-production.ts  # Production with cache/retry/rate-limit
│   │   └── index.ts
│   ├── utils/            # Utility modules
│   │   ├── cache.ts      # Response caching
│   │   ├── rate-limiter.ts  # Rate limiting
│   │   ├── retry.ts      # Retry with exponential backoff
│   │   ├── protobuf.ts   # Protocol buffer encoding
│   │   └── index.ts
│   ├── cli.ts            # Main entry point (routes to CLI or TUI)
│   └── index.ts          # Library exports
├── docs/                 # Documentation
│   ├── MIGRATION.md      # Puppeteer → Protobuf migration
│   ├── PRODUCTION.md     # Production features guide
│   ├── IMPLEMENTATION_STATUS.md  # Feature comparison
│   ├── EFFECT_BEST_PRACTICES.md  # Effect best practices guide
│   └── SUMMARY.md        # Implementation summary
├── package.json
├── tsconfig.json
└── README.md

🔧 Configuration

Trip Types

  • "one-way" - One-way flight
  • "round-trip" - Round-trip flight (requires returnDate)
  • "multi-city" - Multi-city (not fully implemented)

Seat Classes

  • "economy" - Economy class
  • "premium-economy" - Premium Economy
  • "business" - Business class
  • "first" - First class

Passengers

{
  adults: number,           // Ages 18+
  children: number,         // Ages 2-11
  infants_in_seat: number,  // Under 2, with seat
  infants_on_lap: number    // Under 2, on lap
}

Filters

{
  maxPrice?: number,             // Maximum price
  minPrice?: number,             // Minimum price
  maxDurationMinutes?: number,   // Maximum duration
  airlines?: string[],           // Filter by airlines
  nonstopOnly?: boolean,         // Only nonstop flights
  max_stops?: number,            // Max stops (0, 1, 2)
  limit?: number | "all"         // Result limit
}

Sort Options

  • "price-asc" - Price: low to high
  • "price-desc" - Price: high to low
  • "duration-asc" - Duration: shortest to longest
  • "duration-desc" - Duration: longest to shortest
  • "airline" - Airline: alphabetical
  • "none" - No sorting

📊 Output Format

{
  current_price?: "low" | "typical" | "high",
  flights: [
    {
      is_best?: boolean,
      name: string,              // Airline name
      departure: string,         // Departure time
      arrival: string,           // Arrival time
      arrival_time_ahead?: string, // "+1 day" if next day
      duration: string,          // "12 hr 30 min"
      stops: number,             // Number of stops
      delay?: string,            // Delay info
      price: string              // "ARS 299,733"
    }
  ]
}

🧪 Running

Terminal User Interface (TUI)

Launch the interactive TUI (default when no arguments provided):

bun run start
# or
bun run tui
# or explicitly
bun run src/cli.ts --tui

Command-Line Interface (CLI)

Use the CLI for programmatic access and automation:

# Basic CLI usage
bun run cli --from JFK --to LHR --depart-date 2026-01-19

# Production mode (with caching, rate limiting, retry)
bun run production --from LAX --to NRT --depart-date 2026-01-19

# JSON output
bun run cli --from AEP --to SCL --depart-date 2026-01-19 --json

# Full example with filters
bun run cli \
  --from JFK \
  --to LHR \
  --depart-date 2026-01-19 \
  --return-date 2026-01-26 \
  --trip-type round-trip \
  --seat business \
  --adults 2 \
  --max-stops 1 \
  --limit 20 \
  --currency USD

CLI Options

Required:

  • --from, -f <code> - Origin airport code (e.g., JFK)
  • --to, -t <code> - Destination airport code (e.g., LHR)
  • --depart-date, -d <date> - Departure date (YYYY-MM-DD)

Optional:

  • --return-date, -r <date> - Return date for round-trip (YYYY-MM-DD)
  • --trip-type <type> - Trip type: one-way, round-trip, multi-city (default: one-way)
  • --sort, -s <option> - Sort: price-asc, price-desc, duration-asc, duration-desc, airline, none (default: price-asc)
  • --seat <class> - Seat class: economy, premium-economy, business, first (default: economy)
  • --adults, -a <number> - Number of adults (default: 1)
  • --children, -c <number> - Number of children (default: 0)
  • --infants-in-seat <number> - Infants in seat (default: 0)
  • --infants-on-lap <number> - Infants on lap (default: 0)
  • --max-price <number> - Maximum price filter
  • --min-price <number> - Minimum price filter
  • --max-duration <minutes> - Maximum duration in minutes
  • --max-stops <0|1|2> - Maximum number of stops (default: 2)
  • --nonstop-only - Only show nonstop flights
  • --airlines <list> - Comma-separated list of airlines
  • --limit, -l <number|all> - Limit number of results (default: 10)
  • --currency <code> - Currency code (e.g., USD, EUR)
  • --production, -p - Use production mode (caching, rate limiting, retry)
  • --json, -j - Output results as JSON
  • --help, -h - Show help message

🖥️ Terminal User Interface (TUI)

The project includes an interactive terminal UI built with OpenTUI:

TUI Screenshot

Features:

  • Interactive form with airport inputs
  • Trip type selection (One-way / Round-trip)
  • Seat class selection (Economy, Premium Economy, Business, First)
  • Max stops filter
  • Real-time flight search results
  • Sortable table view with keyboard navigation
  • Color-coded price level indicators
  • Mouse support
  • Direct links to Google Flights booking pages

Controls:

  • Enter - Search for flights (form mode) / Open selected flight (table mode)
  • Tab / Shift+Tab - Navigate between form fields
  • Ctrl+R - Focus results table
  • ↑/↓ - Navigate table rows (table mode)
  • ←/→ - Navigate table columns (table mode)
  • Space - Sort by selected column (table mode)
  • Esc - Exit table mode, return to form
  • Ctrl+C - Exit application

📝 Example Configurations

Business Class Round-trip for Family

const result = yield* scraper.scrape(
  "LAX", "NRT", "2026-01-19", "round-trip", "2026-01-26",
  "price-asc",
  { max_stops: 1, limit: 20 },
  "business",
  { adults: 2, children: 1, infants_in_seat: 0, infants_on_lap: 1 },
  "USD"
)

Budget Economy with Filters

const result = yield* scraper.scrape(
  "ORD", "CDG", "2026-01-19", "one-way", undefined,
  "price-asc",
  { 
    maxPrice: 500, 
    maxDurationMinutes: 12 * 60,
    airlines: ["United", "American"],
    limit: 15
  },
  "economy",
  { adults: 1, children: 0, infants_in_seat: 0, infants_on_lap: 0 },
  "USD"
)

📚 Tech Stack

🤝 Credits

  • Inspired by fast-flights by @AWeirdDev
  • Reverse engineering insights from the Python implementation

📄 License

MIT

🐛 Known Limitations

  1. JavaScript Data Extraction: Some flight details may be incomplete depending on Google's response format
  2. Rate Limiting: Google may rate-limit excessive requests
  3. Price Currency: Prices are returned in the currency Google serves (may vary by region)
  4. Multi-city: Not fully tested yet