MaskWise - Enterprise PII Anonymization Platform
Maskwise by VerifyWise is a data privacy platform built to detect, redact, mask, and anonymize sensitive information across unstructured text, images, and structured data within LLM training datasets. It automatically identifies and classifies PII, payment data, health records, and other regulated content.
The system supports 50+ document and file formats, applies anonymization while preserving original structure and formatting, and generates full compliance audit trails for traceability and verification.
Join the Discord server for discussions.
Overview
- Microsoft Presidio integration with 15+ compliance entity types (SSN, Credit Cards, HIPAA, GDPR etc)
- RBAC with comprehensive audit trails
- Full Office Suite Support (Word, Excel, PowerPoint, PDF) with format preservation
- Batch Processing for enterprise-scale volumes
- OCR Integration for scanned documents
- Policy-driven Processing with customizable business rules
- Format-preserving Anonymization maintaining document usability
- Multiple Strategies (redact, mask, replace, encrypt)
- Original + Anonymized Downloads for audit workflows
- On-premise & Docker Installation
- RESTful API for existing system integration
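To make the policy-driven strategies in the list above more concrete, here is a minimal sketch of how a policy could map entity types to a strategy and apply it to detected spans. It is not Maskwise's actual API: the `Strategy`, `Policy`, `applyStrategy`, and `anonymize` names are hypothetical, and only the span shape (`entity_type`, `start`, `end`, `score`) follows Presidio's analyzer results. The `encrypt` strategy is left out because it depends on a key-management choice, and spans are assumed not to overlap.

```typescript
// Hypothetical types for illustration only; not Maskwise's real interfaces.
type Strategy = "redact" | "mask" | "replace";

// Span shape mirrors a Presidio analyzer result.
interface Detection {
  entity_type: string;
  start: number; // inclusive character offset
  end: number; // exclusive character offset
  score: number; // detection confidence, 0..1
}

// A policy maps entity types to a strategy; unlisted types fall back to "redact".
type Policy = Partial<Record<string, Strategy>>;

function applyStrategy(value: string, entityType: string, strategy: Strategy): string {
  switch (strategy) {
    case "redact":
      return ""; // remove the value entirely
    case "mask": {
      // Keep the last 4 characters only when the value is longer than 4.
      const visible = value.length > 4 ? 4 : 0;
      return "*".repeat(value.length - visible) + value.slice(value.length - visible);
    }
    case "replace":
      return `<${entityType}>`; // swap in a typed placeholder
  }
}

function anonymize(text: string, detections: Detection[], policy: Policy, minScore = 0.5): string {
  // Work right-to-left so earlier offsets stay valid after each edit.
  // Assumes spans do not overlap.
  const ordered = detections
    .filter((d) => d.score >= minScore)
    .sort((a, b) => b.start - a.start);
  let out = text;
  for (const d of ordered) {
    const strategy: Strategy = policy[d.entity_type] ?? "redact";
    const original = out.slice(d.start, d.end);
    out = out.slice(0, d.start) + applyStrategy(original, d.entity_type, strategy) + out.slice(d.end);
  }
  return out;
}

const sample = "Card 4111111111111111, SSN 078-05-1120";
const found: Detection[] = [
  { entity_type: "CREDIT_CARD", start: 5, end: 21, score: 0.99 },
  { entity_type: "US_SSN", start: 27, end: 38, score: 0.85 },
];
const policy: Policy = { CREDIT_CARD: "mask", US_SSN: "replace" };
const result = anonymize(sample, found, policy);
// result: "Card ************1111, SSN <US_SSN>"
```

Applying edits from the end of the text backwards is a deliberate choice here: replacements change the string length, and working right-to-left means the remaining offsets never need to be recomputed.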
Roadmap:
- Vault
- Single Sign-On Ready (Active Directory, SAML, OIDC) (in the works)
You can deploy Maskwise in 24 hours and reduce PII exposure risk by 95%. Maskwise can process thousands of documents per hour.
Quick Deploy with Docker (Recommended)
Deploy Maskwise in under 5 minutes using pre-built images:
# 1. Clone repository
git clone https://github.com/bluewave-labs/maskwise.git
cd maskwise
# 2. Configure environment (required)
cp .env.production.example .env
# Edit .env: Set POSTGRES_PASSWORD and JWT_SECRET
# 3. Deploy all services
docker-compose -f docker-compose.production.yml up -d
# 4. Initialize database (one-time setup)
docker-compose -f docker-compose.production.yml exec api npx prisma migrate deploy
docker-compose -f docker-compose.production.yml exec api npx prisma db seed
# 5. Access Maskwise
# Frontend: http://localhost:3000
# Login: admin@maskwise.com / admin123
Ready-to-use Docker images available:
- ghcr.io/bluewave-labs/maskwise-api:latest
- ghcr.io/bluewave-labs/maskwise-worker:latest
- ghcr.io/bluewave-labs/maskwise-web:latest
Maskwise use cases for AI and LLMs
1. Safe training data curation
LLM training datasets often contain sensitive information like PII or confidential business data. Maskwise detects and anonymizes this content before ingestion, preventing models from memorizing or leaking private details.
2. Fine-tuning on proprietary data
When fine-tuning LLMs with internal corpora such as customer conversations or documents, regulated data may slip through. Maskwise redacts or masks sensitive fields while preserving structure, enabling safe and compliant fine-tuning.
3. Prompt and response anonymization
Prompts and outputs collected for evaluation or reinforcement learning can include sensitive content. Maskwise anonymizes these logs before they're stored or shared, reducing exposure and ensuring privacy.
4. Synthetic dataset generation
To expand training data safely, Maskwise anonymizes real records and replaces them with synthetic placeholders. This preserves realism for model training while protecting user privacy.
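One way to picture the placeholder approach is a mapping that gives each distinct value a stable placeholder, so the same email address becomes the same token across records and the relationships in the data survive. The sketch below is an illustration under stated assumptions, not how Maskwise implements it: `PlaceholderMap` and `pseudonymizeEmails` are hypothetical names, and the email pattern is deliberately simplified to keep the example self-contained.

```typescript
// Hypothetical helper for illustration; not part of Maskwise's API.
class PlaceholderMap {
  private readonly seen = new Map<string, string>();
  private readonly counters = new Map<string, number>();

  // Returns the same placeholder every time the same (type, value) pair is seen.
  placeholderFor(entityType: string, value: string): string {
    const key = `${entityType}:${value}`;
    const existing = this.seen.get(key);
    if (existing !== undefined) return existing;
    const next = (this.counters.get(entityType) ?? 0) + 1;
    this.counters.set(entityType, next);
    const placeholder = `<${entityType}_${next}>`;
    this.seen.set(key, placeholder);
    return placeholder;
  }
}

// Simplified email pattern; a real detector would be far more thorough.
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Replace every email in every record, sharing one map across all records.
function pseudonymizeEmails(records: string[], placeholders: PlaceholderMap): string[] {
  return records.map((record) =>
    record.replace(EMAIL, (match) => placeholders.placeholderFor("EMAIL_ADDRESS", match)),
  );
}

const placeholders = new PlaceholderMap();
const cleaned = pseudonymizeEmails(
  ["From jane@example.com to bob@example.com", "Reply to jane@example.com"],
  placeholders,
);
// cleaned[0]: "From <EMAIL_ADDRESS_1> to <EMAIL_ADDRESS_2>"
// cleaned[1]: "Reply to <EMAIL_ADDRESS_1>"
```

Sharing one map across the whole dataset is what keeps the placeholders consistent; creating a new map per record would hide the fact that two records mention the same person.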
Architecture
This is a monorepo containing:
- apps/web - Next.js frontend with shadcn/ui
- apps/api - NestJS backend API
- apps/worker - Background job processor
- packages/shared - Shared utilities and helpers
- packages/types - TypeScript type definitions
- packages/database - Database schemas and migrations
Tech Stack
- Frontend: Next.js 14 (App Router) + TypeScript + shadcn/ui + TailwindCSS
- Backend: NestJS + TypeScript + PostgreSQL + Redis
- Processing: Microsoft Presidio + Apache Tika + Tesseract OCR
- Deployment: Docker Compose
Screenshots
Main dashboard
Project view
Datasets view
Jobs overview
Anonymization workflow
Policies
Settings
Quick Start
Option 1: Docker Images (Recommended)
Zero-build deployment with pre-built images from GitHub Container Registry:
Prerequisites:
- Docker and Docker Compose installed
- 4GB+ RAM available
Quick Deploy:
# Clone and configure
git clone https://github.com/bluewave-labs/maskwise.git
cd maskwise
cp .env.production.example .env
# Edit .env: Set POSTGRES_PASSWORD, JWT_SECRET
# Deploy all services instantly
docker-compose -f docker-compose.production.yml up -d
# One-time database setup
docker-compose -f docker-compose.production.yml exec api npx prisma migrate deploy
docker-compose -f docker-compose.production.yml exec api npx prisma db seed
Access Application:
- Web UI: http://localhost:3000
- API: http://localhost:3001
- Admin Login: admin@maskwise.com / admin123
Service Status Check:
# Verify all services are healthy
docker-compose -f docker-compose.production.yml ps
Pre-built Docker Images (No Build Required):
- ghcr.io/bluewave-labs/maskwise-api:latest - Backend API service
- ghcr.io/bluewave-labs/maskwise-worker:latest - Background job processor
- ghcr.io/bluewave-labs/maskwise-web:latest - Frontend web application
Features:
- Multi-platform support (linux/amd64, linux/arm64)
- Security-optimized Alpine Linux base
- Automated health checks and restart policies
- Production-ready with resource limits
See DOCKER.md for complete Docker deployment guide.
Option 2: Development Setup
For development with live code changes:
Prerequisites:
- Docker and Docker Compose installed and running
- Node.js 18+ and npm installed
- PostgreSQL client (optional, for direct database access)
Quick Setup Script (Alternative)
# Automated setup of infrastructure and database
./start-dev.sh
Then follow the terminal instructions to start the three application services.
Installation Steps (Manual)
- Clone and Install Dependencies
  git clone https://github.com/bluewave-labs/maskwise.git
  cd maskwise
  npm install
- Start Infrastructure Services
  # Start PostgreSQL, Redis, Presidio, Tika, and Tesseract
  docker-compose up -d postgres redis presidio-analyzer presidio-anonymizer tika tesseract
  # Wait for services to be healthy (about 30-60 seconds)
  docker-compose ps
- Set Up Database
  # Navigate to database package
  cd packages/database
  # Generate Prisma client
  npx prisma generate
  # Run migrations
  npx prisma migrate deploy
  # Seed database with admin user and policies
  npx prisma db seed
  # Return to project root
  cd ../..
- Start Application Services
  Open 3 separate terminals and run:
  Terminal 1 - API Server:
  cd apps/api
  JWT_SECRET=maskwise_jwt_secret_dev_only \
  DATABASE_URL=postgresql://maskwise:maskwise_dev_password@localhost:5436/maskwise \
  REDIS_URL=redis://localhost:6379 \
  npm run dev
  Terminal 2 - Worker Service:
  cd apps/worker
  npm run dev
  Terminal 3 - Web Frontend:
  cd apps/web
  npx next dev -p 3005
- Access the Application
  - Frontend: http://localhost:3005
  - API: http://localhost:3001
  - Default Admin: admin@maskwise.com / admin123
Verification
# Check Docker services are healthy
docker-compose ps
# Test API is responding
curl http://localhost:3001/health
# All services should show as running/healthy
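Because the services can take a little while to come up, a single curl call may fail on the first try. The sketch below shows one way to poll the health endpoint with retries from a Node.js 18+ script. The `waitForHealthy` and `apiIsHealthy` helpers are hypothetical names, the attempt count and delay are arbitrary defaults, and treating any 2xx response as healthy is an assumption about how the endpoint behaves.

```typescript
// Generic retry loop. The probe is injected so the loop can be exercised
// without a running API.
async function waitForHealthy(
  check: () => Promise<boolean>,
  attempts = 10,
  delayMs = 3000,
): Promise<boolean> {
  for (let i = 1; i <= attempts; i++) {
    try {
      if (await check()) return true;
    } catch {
      // A refused connection or similar error counts as "not ready yet".
    }
    // Pause between attempts, but not after the last one.
    if (i < attempts) await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  }
  return false;
}

// Probe for the API health endpoint used in the verification step.
// Relies on the global fetch available in Node.js 18+.
async function apiIsHealthy(url = "http://localhost:3001/health"): Promise<boolean> {
  const response = await fetch(url);
  return response.ok; // assumption: a 2xx status means healthy
}
```

Calling `waitForHealthy(() => apiIsHealthy())` resolves to `true` as soon as one probe succeeds and to `false` once all attempts are used up, which makes it easy to gate a later step such as seeding or smoke tests.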
Troubleshooting
- Port conflicts: Change ports in the commands above if needed
- Docker issues: Run docker-compose down and restart
- Database connection: Ensure PostgreSQL container is healthy before starting API
- Missing dependencies: Run npm install in individual app directories if needed
Production Deployment
Production Docker Deployment (Recommended)
Deploy Maskwise to production using battle-tested Docker images:
# 1. Setup production environment
git clone https://github.com/bluewave-labs/maskwise.git
cd maskwise
cp .env.production.example .env
# 2. Configure secure production values
# Edit .env with:
# - Strong POSTGRES_PASSWORD (use a password manager)
# - Secure JWT_SECRET (32+ random characters)
# - External database URLs if using managed services
# - Custom ports if needed
# 3. Deploy instantly with pre-built images
docker-compose -f docker-compose.production.yml up -d
# 4. Initialize database (first-time only)
docker-compose -f docker-compose.production.yml exec api npx prisma migrate deploy
docker-compose -f docker-compose.production.yml exec api npx prisma db seed
# 5. Verify deployment
docker-compose -f docker-compose.production.yml ps
curl -f http://localhost:3001/health
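A quick sanity check on the .env values before deploying can catch an empty password or a short secret early. The following is a minimal sketch of such a check, not something shipped with Maskwise: `parseDotEnv` and `validateProductionEnv` are hypothetical helpers, the parser handles only plain KEY=VALUE lines (no quoting or variable expansion), and the 32-character threshold is taken from the comment in the steps above.

```typescript
// Hypothetical pre-deploy check; variable names come from the steps above.
type Env = Record<string, string | undefined>;

// Parse plain KEY=VALUE lines, skipping blank lines and comments.
// Quoting and variable expansion are intentionally not handled.
function parseDotEnv(contents: string): Env {
  const env: Env = {};
  for (const line of contents.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === "" || trimmed.startsWith("#")) continue;
    const eq = trimmed.indexOf("=");
    if (eq <= 0) continue; // no key before "=", or no "=" at all
    env[trimmed.slice(0, eq).trim()] = trimmed.slice(eq + 1).trim();
  }
  return env;
}

// Returns a list of human-readable problems; an empty list means the check passed.
function validateProductionEnv(env: Env): string[] {
  const problems: string[] = [];
  const password = env["POSTGRES_PASSWORD"] ?? "";
  const secret = env["JWT_SECRET"] ?? "";
  if (password.trim() === "") problems.push("POSTGRES_PASSWORD is not set");
  if (secret.length < 32) problems.push("JWT_SECRET should be at least 32 characters");
  return problems;
}

const exampleEnv = parseDotEnv("# sample\nPOSTGRES_PASSWORD=\nJWT_SECRET=short\n");
const problems = validateProductionEnv(exampleEnv);
// problems: ["POSTGRES_PASSWORD is not set", "JWT_SECRET should be at least 32 characters"]
```

Returning a list of problems instead of throwing on the first one lets a deploy script report everything that needs fixing in a single pass.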
Production Features:
- Zero build time - pre-built images ready to deploy
- Multi-platform - works on amd64/arm64 (Apple Silicon, AWS Graviton)
- Security hardened - Alpine Linux with non-root users
- Auto-healing - health checks with automatic restart
- Resource optimized - memory limits and CPU controls
- High availability - separate API, Worker, and Web services
Option 2: Build from Source
- Copy environment template
  cp .env.example .env
  # Edit .env with your production values
- Deploy
Option 3: Kubernetes with Helm (Enterprise)
- Prerequisites
  - Kubernetes cluster (1.19+)
  - Helm 3.0+
  - kubectl configured
- Configure values
  # Edit production values
  cp k8s/values-production.yaml k8s/values-production-custom.yaml
  # Update image registry, domains, secrets, etc.
- Deploy
  # One-command deployment
  make k8s-deploy
  # Or manual
  ./k8s/deploy.sh
- Access
  # Port forward for local access
  make k8s-port
  # Or configure ingress for external access
  # Update ingress.hosts in values file
Kubernetes Features
- Auto-scaling: HPA based on CPU/memory
- High Availability: Multi-replica deployments
- Rolling Updates: Zero-downtime deployments
- Monitoring: Prometheus integration ready
- Security: Pod security contexts, network policies
- Storage: Persistent volumes for data
Development
Starting Development Environment
Follow the installation steps above to run in development mode.
Building and Testing
# Build individual packages (subshells keep each cd from affecting the next command)
(cd apps/api && npm run build)
(cd apps/web && npm run build)
(cd apps/worker && npm run build)
# Run linting
(cd apps/api && npm run lint)
(cd apps/web && npm run lint)
# Type checking
(cd apps/api && npm run type-check)
(cd apps/web && npm run type-check)
# Run tests
(cd apps/api && npm test)
Database Operations
cd packages/database
# Reset database (careful!)
npx prisma migrate reset
# Apply new migrations
npx prisma migrate dev
# View data in browser
npx prisma studio