Show HN: A real world streaming data generator in Python

github.com

1 points by ashishbagri 7 months ago · 0 comments · 1 min read

I built GlassGen to solve a common problem: generating real-time synthetic data for testing, demos, and ML datasets. While Faker is great for individual data points, GlassGen adds:

- Configurable data publishing (CSV, Kafka, Webhooks)

- Precise rate control (records/second)

- Controlled data duplication

- Extensible architecture for custom generators and sinks
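
To give a rough idea of what rate control and controlled duplication involve, here is a minimal plain-Python sketch. It is not GlassGen's API: `stream_records`, `make_record`, `rps`, and `duplicate_ratio` are hypothetical names chosen for illustration, and the record factory stands in for a Faker-backed generator.

```python
import random
import time
from typing import Callable, Dict, Iterator, List


def stream_records(
    make_record: Callable[[], Dict],
    rps: float,
    total: int,
    duplicate_ratio: float = 0.0,
    seed: int = 0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[Dict]:
    """Yield `total` records at roughly `rps` records per second.

    With probability `duplicate_ratio`, a previously emitted record is
    re-emitted instead of a fresh one (hypothetical duplication model).
    """
    if rps <= 0:
        raise ValueError("rps must be positive")
    rng = random.Random(seed)
    history: List[Dict] = []
    interval = 1.0 / rps
    start = clock()
    for i in range(total):
        # Pace against an absolute schedule so timing drift does not accumulate.
        delay = (start + i * interval) - clock()
        if delay > 0:
            sleep(delay)
        if history and rng.random() < duplicate_ratio:
            record = dict(rng.choice(history))  # copy of an earlier record
        else:
            record = make_record()
            history.append(record)
        yield record


if __name__ == "__main__":
    counter = iter(range(1_000_000))
    for rec in stream_records(lambda: {"id": next(counter)}, rps=5, total=10,
                              duplicate_ratio=0.3):
        print(rec)
```

The `sleep` and `clock` parameters are injectable only so the pacing can be tested without waiting. Note that this sketch keeps every emitted record in memory; a long-running stream would need a bounded history.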

Key features:

- Built on top of Faker for reliable data generation

- Simple JSON/YAML configuration

- Support for complex data relationships

- Real-time data streaming to Kafka

- Custom sink implementations
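
For readers wondering what a custom sink could look like, the sketch below shows one possible shape using only the standard library. The `Sink` base class, `CsvSink`, `ListSink`, and `run` are assumptions made for illustration and do not reflect GlassGen's real interfaces; see the docs for those.

```python
import csv
import io
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List, TextIO


class Sink(ABC):
    """Hypothetical sink interface: receives one record at a time."""

    @abstractmethod
    def publish(self, record: Dict) -> None:
        ...

    def close(self) -> None:
        """Release resources; a no-op by default."""


class CsvSink(Sink):
    """Writes records as CSV rows to any text stream."""

    def __init__(self, stream: TextIO, fieldnames: List[str]) -> None:
        self._writer = csv.DictWriter(stream, fieldnames=fieldnames)
        self._writer.writeheader()

    def publish(self, record: Dict) -> None:
        self._writer.writerow(record)


class ListSink(Sink):
    """Collects records in memory, which is handy in tests."""

    def __init__(self) -> None:
        self.records: List[Dict] = []

    def publish(self, record: Dict) -> None:
        self.records.append(record)


def run(records: Iterable[Dict], sinks: List[Sink]) -> None:
    """Fan each record out to every sink, then close all sinks."""
    try:
        for record in records:
            for sink in sinks:
                sink.publish(record)
    finally:
        for sink in sinks:
            sink.close()


if __name__ == "__main__":
    buffer = io.StringIO()
    run([{"id": 1, "name": "a"}], [CsvSink(buffer, ["id", "name"])])
    print(buffer.getvalue())
```

A Kafka or webhook sink would follow the same pattern: implement `publish` with the relevant client call and use `close` to flush and disconnect.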

GitHub: https://github.com/glassflow/glassgen

Docs: https://glassgen.glassflow.dev/

Would love feedback from the community, especially on:

1. Additional sink types that would be useful

2. Performance optimization opportunities

3. Ideas for handling more complex data relationships
