GitHub - michaelteter/docgen: capture your entire project as a single text file, then drop that file in an AI chat and go

docgen

docgen - your entire project in one text file

A fast, intelligent project documentation generator that creates a single, comprehensive text file containing your entire project structure and source code. Perfect for sharing context with LLMs - just drop one file in an AI chat and discuss your entire project.

Recommended: add docgen to your git pre-commit process.

Features

  • 🌳 Visual directory tree of your project structure
  • 📄 Concatenated file contents with clear separators
  • 🎯 Smart filtering: respects .gitignore and custom exclusions
  • 🚫 Automatic binary detection: skips images, executables, PDFs, etc.
  • ⚙️ Configurable: via a .docgen_ignore file
  • 🔄 Git-aware: uses git check-ignore for accurate filtering
  • 📊 Size limits: skips files over 2MB by default

Installation

From Source

# Clone the repository
git clone https://github.com/michaelteter/docgen.git
cd docgen

# Initialize Go module (skip this step if go.mod already exists; go mod init fails in that case)
go mod init docgen

# Build and install
make install

This installs docgen to /usr/local/bin/ so you can use it from anywhere.

Manual Build

make build      # Creates ./docgen executable
./docgen        # Run from current directory

Usage

Run docgen from your project's root directory:

# Generate documentation (creates project_doc.txt)
docgen

# Preview which files will be included
docgen --files-list

# See the directory tree structure
docgen --graphic-tree    # With symbols
docgen --plain-tree      # Plain text

# Custom output location
docgen --output docs/my_project.txt

Configuration

Create a .docgen_ignore file in your project root to customize which files are included or excluded. When the file is present, it takes precedence over the built-in defaults, so be sure to list .git/ among your exclusions.

The file follows .gitignore syntax:

# Lines starting with ! are ALWAYS included (even if gitignored)
!.gitignore
!some_important_non_binary_file.foo

# All other patterns are EXCLUDED (in addition to .gitignore rules)
.git/
log/
scratch.txt
junk/

Note: .gitignore rules are respected by default. .docgen_ignore adds further exclusions on top of git's ignore rules, and its ! patterns force inclusion even for gitignored files.
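As an illustration of the file format described above, here is a minimal sketch of how a .docgen_ignore file could be split into "always include" and "exclude" patterns. The Rules type and ParseIgnore function are hypothetical names invented for this example. They are not taken from docgen's source, and the real parser may behave differently.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Rules holds the two kinds of patterns a .docgen_ignore file can contain.
// Hypothetical type for illustration; not docgen's actual data structure.
type Rules struct {
	Include []string // patterns from lines starting with "!"
	Exclude []string // every other non-blank, non-comment line
}

// ParseIgnore splits ignore-file text into include and exclude patterns.
func ParseIgnore(text string) Rules {
	var r Rules
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		switch {
		case line == "" || strings.HasPrefix(line, "#"):
			continue // blank line or comment
		case strings.HasPrefix(line, "!"):
			r.Include = append(r.Include, strings.TrimPrefix(line, "!"))
		default:
			r.Exclude = append(r.Exclude, line)
		}
	}
	return r
}

func main() {
	r := ParseIgnore("# keep this\n!.gitignore\n\n.git/\nlog/\n")
	fmt.Println("include:", r.Include)
	fmt.Println("exclude:", r.Exclude)
}
```

Keeping the two lists separate makes it straightforward to apply exclusions first and then let the include list override them.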

Default Behavior (without .docgen_ignore)

If no .docgen_ignore exists, docgen uses these defaults:

Always included:

  • .gitignore

Always excluded:

  • .git/ directory
  • .DS_Store

Automatic Exclusions

Regardless of configuration, docgen always excludes:

  • Binary files: detected via file extensions and content analysis (images, PDFs, executables, archives, etc.)
  • Previous docgen output: any file that begins with the # DOCGEN-OUTPUT: pragma

Pattern Matching Rules

Patterns follow gitignore conventions:

  • *.ext - matches files with that extension anywhere
  • dirname/ - matches directory and all its contents
  • !pattern - always include (negation)
  • # - comments
  • Blank lines are ignored
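The rules above can be approximated with Go's standard path.Match. The sketch below covers only a simplified subset of gitignore semantics: single-component directory patterns such as junk/, bare-name globs such as *.ext, and patterns containing a slash matched against the whole path. The Matches function is a hypothetical helper for illustration, and docgen's own matcher may differ.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// Matches reports whether a slash-separated relative path matches one pattern.
// Simplified: no "**", no leading-slash anchoring, no multi-level directory patterns.
func Matches(pattern, relPath string) bool {
	if strings.HasSuffix(pattern, "/") {
		// Directory pattern: true if any parent directory component matches.
		dir := strings.TrimSuffix(pattern, "/")
		parts := strings.Split(relPath, "/")
		for _, p := range parts[:len(parts)-1] {
			if ok, _ := path.Match(dir, p); ok {
				return true
			}
		}
		return false
	}
	if !strings.Contains(pattern, "/") {
		// Bare name or glob: compare against the file name at any depth.
		ok, _ := path.Match(pattern, path.Base(relPath))
		return ok
	}
	// Pattern with a slash: compare against the full relative path.
	ok, _ := path.Match(pattern, relPath)
	return ok
}

func main() {
	fmt.Println(Matches("*.log", "a/b/debug.log"))
	fmt.Println(Matches("junk/", "junk/x/y.txt"))
	fmt.Println(Matches("junk/", "junk"))
}
```

Negation (!pattern), comments, and blank lines are assumed to be handled before this function is called, so it only ever sees a plain pattern.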

How It Works

  1. Directory Walk: Recursively scans all files in the project
  2. Extension Filter: Quickly skips known binary extensions (.pdf, .jpg, .exe, etc.)
  3. Content Detection: Reads the first 8KB of each remaining file to detect:
    • Binary content (null bytes, magic bytes like %PDF, non-printable characters)
    • Previously generated docgen files (via # DOCGEN-OUTPUT: pragma)
  4. Git Integration: Uses git check-ignore to respect .gitignore rules
  5. Custom Filtering: Applies .docgen_ignore patterns on top of git rules
  6. Always Include: Forces inclusion of specified files even if gitignored
  7. Output Generation: Creates a file with:
    • Directory tree visualization
    • Each file's complete contents with clear headers
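
The content detection step can be sketched as follows. This is an illustrative approximation, not docgen's actual code: the function names are hypothetical, only the %PDF magic prefix is checked, and the 10% threshold for non-printable bytes is an assumption made for this example.

```go
package main

import (
	"bytes"
	"fmt"
)

const (
	sniffLen = 8 * 1024           // inspect at most the first 8KB
	pragma   = "# DOCGEN-OUTPUT:" // marker at the start of generated files
)

// LooksBinary applies three heuristics to the head of a file:
// a known magic prefix, any null byte, or many control characters.
func LooksBinary(head []byte) bool {
	if len(head) > sniffLen {
		head = head[:sniffLen]
	}
	if bytes.HasPrefix(head, []byte("%PDF")) {
		return true
	}
	if bytes.IndexByte(head, 0) >= 0 {
		return true
	}
	nonPrintable := 0
	for _, b := range head {
		if b < 0x20 && b != '\n' && b != '\r' && b != '\t' {
			nonPrintable++
		}
	}
	// Assumed threshold: more than 10% control bytes means binary.
	return len(head) > 0 && nonPrintable*10 > len(head)
}

// IsDocgenOutput reports whether the file starts with the docgen pragma.
func IsDocgenOutput(head []byte) bool {
	return bytes.HasPrefix(head, []byte(pragma))
}

func main() {
	fmt.Println(LooksBinary([]byte("%PDF-1.7\n")))
	fmt.Println(LooksBinary([]byte("package main\n")))
	fmt.Println(IsDocgenOutput([]byte("# DOCGEN-OUTPUT: generated")))
}
```

Checking only a fixed-size prefix keeps the cost per file bounded, which is why the extension filter and this check can run before the slower git lookup.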

Output Format

The generated project_doc.txt contains:

# DOCGEN-OUTPUT: This file is generated by docgen. Do not include in next generation.

>>>> PROJECT FILE TREE: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
.
├── main.go
├── README.md
└── go.mod

<<<< END OF FILE TREE: <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

>>>> FILE CONTENTS: main.go >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
package main
...

<<<< END OF FILE: main.go <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

>>>> FILE CONTENTS: README.md >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# My Project
...

<<<< END OF FILE: README.md <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
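
Because the separators are regular, the output is easy to split back into individual files. The sketch below is a hypothetical helper, not part of docgen, that relies only on the marker prefixes shown above. It assumes file names do not end in > or a space, and that no file contains a line beginning with the end marker.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	openPrefix  = ">>>> FILE CONTENTS: "
	closePrefix = "<<<< END OF FILE: "
)

// SplitDoc maps each file name in a docgen output to its captured contents.
func SplitDoc(doc string) map[string]string {
	files := make(map[string]string)
	var name string
	var body []string
	inFile := false
	for _, line := range strings.Split(doc, "\n") {
		switch {
		case !inFile && strings.HasPrefix(line, openPrefix):
			// Strip the prefix and the trailing run of '>' padding.
			name = strings.TrimRight(strings.TrimPrefix(line, openPrefix), "> ")
			body = body[:0]
			inFile = true
		case inFile && strings.HasPrefix(line, closePrefix):
			files[name] = strings.Join(body, "\n")
			inFile = false
		case inFile:
			body = append(body, line)
		}
	}
	return files
}

func main() {
	doc := ">>>> FILE CONTENTS: main.go >>>>>>>>\npackage main\n\n<<<< END OF FILE: main.go <<<<<<<<\n"
	files := SplitDoc(doc)
	fmt.Println(len(files))
	fmt.Printf("%q\n", files["main.go"])
}
```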

Development

# Build optimized binary
make build

# Cross-compile for multiple platforms
make cross-compile

# Clean build artifacts
make clean

# Run tests
make test

Use Cases

  • 📤 Share with LLMs: Give AI assistants complete project context
  • 👀 Code Reviews: Easy-to-read project snapshot
  • 📚 Documentation: Auto-generated project overview
  • 🔍 Audits: Quick scan of entire codebase
  • 📊 Diffs: Track project evolution via git diffs of the output

Why Commit project_doc.txt?

We recommend committing the generated file to git because:

  • Visibility: See how your project evolves over time through diffs
  • Convenience: Anyone cloning the repo has immediate access
  • History: Track documentation changes alongside code changes

The # DOCGEN-OUTPUT: pragma at the top of the generated file keeps an older copy from being pulled into the next run, so regenerating never nests one output inside another.

Future/TODO

See FUTURE.md

Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

License

MIT License - see LICENSE file for details.

Author

Michael Teter