GitHub - nihaljn/datahawk: Viewer for text datasets in formats like HuggingFace, JSONL, etc.


A lightweight app that makes browsing and analyzing text data a breeze.

Key Features

🔍 Intuitive Navigation: Effortlessly browse local or remote data in formats such as HuggingFace datasets and JSONL.
⚡ Efficient Browsing: Stream large local or remote datasets without loading them fully into memory or downloading them in full.
🚀 Powerful Analysis: Easily filter and sort data for better insights.
💻 Pretty-Print Code: Human-friendly visualization of code embedded in your data.

Experience seamless data browsing and analysis with Datahawk 🦅!
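
To illustrate what streaming a dataset without loading it into memory can look like, here is a minimal Python sketch for a local JSONL file. It is not Datahawk's implementation. The helper names `stream_jsonl` and `first_matching` are hypothetical and chosen only for this example, and the file is assumed to hold one JSON object per line.

```python
import json
from itertools import islice
from typing import Any, Callable, Dict, Iterator, List


def stream_jsonl(path: str) -> Iterator[Dict[str, Any]]:
    """Yield one parsed record per non-empty line, reading the file lazily.

    Hypothetical helper for illustration; not part of Datahawk.
    """
    with open(path, "r", encoding="utf-8") as f:
        for line in f:  # the file object yields one line at a time
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


def first_matching(
    path: str,
    predicate: Callable[[Dict[str, Any]], bool],
    limit: int = 10,
) -> List[Dict[str, Any]]:
    """Return up to `limit` records that satisfy `predicate`.

    Reading stops as soon as `limit` matches are found, so the rest of
    the file is never parsed.
    """
    return list(islice(filter(predicate, stream_jsonl(path)), limit))
```

Because `stream_jsonl` is a generator, only the current line is held in memory at any time, which is what keeps browsing and filtering practical on files too large to load at once.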

Alternatives include: Lilac, HuggingFace Dataset Viewer.

Instructions

Install

Installation requires Python 3.8 or later.

Run

Launch the app from anywhere as:

This will start the application at localhost:5009.

Specify a custom port number as:

This will start the application at localhost:PORT.

Usage

Usage is quite intuitive! You can find on-screen instructions by hovering over the information icons ℹ️.

License

Datahawk has an MIT license, as found in the LICENSE file.

Acknowledgements