GitHub - dream-horizon-org/datagen: Generate coherent, synthetic data at scale

datagen is a tool that generates coherent, synthetic data from models expressed in a simple, declarative DSL.

Salient features:

  • A declarative DSL for defining data models with Go-like syntax
  • High performance through transpilation to native Go code
  • Multiple output formats (CSV, JSON, XML, stdout)
  • Database integration with direct loading to MySQL
  • Model relationships via cross-references using self.datagen
  • Tag-based filtering for selective data generation
  • Built-in functions for common data items

Install

See the Installation Guide for detailed installation instructions.

Usage

To try datagen out, create a model file and generate data from it:

# Create a simple model file
cat > user.dg << 'EOF'
model user {
  metadata { count: 100 }
  fields {
    id() int
    name() string
  }
  gens {
    func id() { return iter + 1 }
    func name() { return Name() }
  }
}
EOF

# Generate data
datagenc gen user.dg -f csv -o ./output

This generates a user.csv file in the output directory containing 100 user records.

More information

Contributing

Refer to CONTRIBUTING.md

License

MIT License, see LICENSE.