Show HN: 518K Vietnamese legal documents (1924–2026)

huggingface.co

3 points by th1nhng0 3 days ago · 0 comments · 1 min read

I scraped and open-sourced a corpus of 518,255 Vietnamese legal documents (laws, decrees, circulars, and decisions) spanning a century of legislation. It includes metadata plus full Markdown text, about 3.6 GB as Parquet, under CC BY 4.0. Vietnamese legal text is nearly absent from existing NLP datasets, even though Vietnam has one of the more prolific legislative systems in Southeast Asia.
