One of the biggest challenges in AI-native infrastructure is the tension between model quality and operational cost. A large, powerful model produces better search results — but it also costs more to run at scale. When you’re processing billions of log lines per day, that cost adds up fast.
Today we’re announcing TinyThedex — a compact version of our log search model purpose-built for high-throughput production ingest.
The Numbers
| Spec | Full Model | TinyThedex |
|---|---|---|
| Model size | 570 MB | 87 MB |
| Parameters | 149 million | 22 million |
| Throughput (per CPU core) | ~200 logs/sec | ~1,000-2,500 logs/sec |
| Quality retention | Baseline | 96.4% of full model |
| Vector storage per log | 3,072 bytes | 1,536 bytes |
Roughly 7x smaller. 5-12x faster. 50% less vector storage. 96.4% of the quality.
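Per-core throughput figures like these are straightforward to sanity-check. Here is a minimal single-threaded timing harness; it's a sketch rather than our actual benchmark code, and `encode` is a stand-in for either model's embedding call:

```python
import time

def logs_per_second(encode, sample_logs: list[str], warmup: int = 100) -> float:
    """Measure single-threaded encode throughput over a batch of log lines."""
    for line in sample_logs[:warmup]:
        encode(line)  # warm caches and lazy initialization before timing
    start = time.perf_counter()
    for line in sample_logs:
        encode(line)
    elapsed = time.perf_counter() - start
    return len(sample_logs) / elapsed
```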
Why This Matters for Customers
Enterprise log volumes are growing at 35-50% year over year. A mid-market company with 50 microservices generates around 500 million to 1 billion log lines per day. The AI model that processes those logs at ingest time is the single largest compute cost in the system.
With TinyThedex, we can process 1.7 billion logs per day on a single 8-core ARM server, no GPU required (2,500 logs/sec per core × 8 cores × 86,400 seconds/day ≈ 1.7 billion). That translates directly to infrastructure cost, as the table and the quick calculation after it show:
| Configuration | Logs/day capacity | Monthly infra cost |
|---|---|---|
| Full model on GPU | 864M | ~$500 |
| Full model on CPU (3 nodes) | 430M | ~$400 |
| TinyThedex on CPU (1 node) | 1.7B | ~$200 |
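The capacity figures follow directly from the per-core throughput numbers in the spec table; the node sizes come from the text above. A quick check:

```python
SECONDS_PER_DAY = 86_400

def daily_capacity(logs_per_sec_per_core: float, cores_per_node: int, nodes: int = 1) -> float:
    """Daily ingest capacity in log lines for a given deployment."""
    return logs_per_sec_per_core * cores_per_node * nodes * SECONDS_PER_DAY

# TinyThedex at the top of its ~1,000-2,500 logs/sec/core range, one 8-core node
print(f"{daily_capacity(2_500, cores_per_node=8):,.0f}")         # 1,728,000,000 ≈ 1.7B

# Full model at ~200 logs/sec/core across three 8-core nodes
print(f"{daily_capacity(200, cores_per_node=8, nodes=3):,.0f}")  # 414,720,000, close to the ~430M above
```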
A single TinyThedex node handles roughly 4x the daily throughput of three full-model nodes (1.7B vs. 430M logs) at half the monthly cost. No GPU required.
What This Means for Pricing
Lower infrastructure costs mean we can offer more competitive pricing without sacrificing margins. Here’s what the unit economics look like:
At 100 GB/day (typical mid-market customer):
| Metric | Legacy APM | Thedex (Full Model) | Thedex (TinyThedex) |
|---|---|---|---|
| Monthly cost | $5,000-15,000 | $1,200 | $1,200 |
| Search capability | Standard search only | AI-native search | AI-native search |
| Infrastructure required | Large clusters | 3 CPU nodes | 1 CPU node |
TinyThedex doesn’t change the customer’s price — it dramatically reduces the infrastructure required to deliver the service. That means we can scale to more customers without proportional infrastructure growth, invest more in product development, and pass savings on to customers over time.
At 500 GB/day (large enterprise):
| Metric | Full Model | TinyThedex |
|---|---|---|
| Infrastructure needed | 3 CPU nodes | 1 CPU node |
| Infrastructure cost | $1,200/month | $200/month |
6x lower infrastructure cost for the same workload. That's the efficiency gain that lets us offer enterprise-grade AI search at a fraction of legacy pricing.
Two Models, Best of Both Worlds
TinyThedex is not a replacement for our full model — it’s a complement. We use a two-model architecture:
Ingest path (TinyThedex, speed priority): Every log line that enters the system is processed by TinyThedex. It's fast enough to keep up with production log volumes on standard CPU hardware, and its compact representation captures 96.4% of the full model's understanding of log semantics.
Query path (Full model, quality priority): When a user runs a search query, we process it with the full model for maximum precision. Query volume is orders of magnitude lower than ingest volume (a handful of searches per second versus millions of log lines), so the full model's throughput is more than sufficient. The user gets full-quality results searched against TinyThedex-encoded data.
The result: customers get the speed of a compact model for ingest and the quality of a full model for search. No compromise on either dimension.
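Here's a minimal sketch of what this split looks like in code. The encoder interfaces are placeholders, and the projection that maps the full model's query vector into the compact model's vector space is our assumption for illustration; the post above doesn't specify how the two spaces are aligned:

```python
import numpy as np

class LogSearchIndex:
    """Two-model architecture: compact model on ingest, full model on query."""

    def __init__(self, tiny_encoder, full_encoder, project_full_to_tiny):
        self.tiny = tiny_encoder             # high-throughput ingest encoder
        self.full = full_encoder             # high-precision query encoder
        self.project = project_full_to_tiny  # assumed cross-space alignment
        self.vectors: list[np.ndarray] = []
        self.logs: list[str] = []

    def ingest(self, log_line: str) -> None:
        # Ingest path: every log line is embedded by the compact model.
        v = self.tiny(log_line)
        self.vectors.append(v / np.linalg.norm(v))
        self.logs.append(log_line)

    def search(self, query: str, k: int = 5) -> list[str]:
        # Query path: the query is embedded by the full model, then mapped
        # into the compact model's space and compared by cosine similarity.
        q = self.project(self.full(query))
        q = q / np.linalg.norm(q)
        scores = np.stack(self.vectors) @ q
        return [self.logs[i] for i in np.argsort(scores)[::-1][:k]]
```

In production the brute-force scan would be a real vector index, but the routing is the point: the cheap encoder sits on the high-volume write path, the expensive encoder on the low-volume read path.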
How We Built It
TinyThedex was created through knowledge distillation: our full model (the “teacher”) trains a smaller model (the “student”) to produce similar representations for log data.
The process (sketched in code after the list):
- Encode 500,000 log messages with the full model, capturing its understanding of each message as a high-dimensional vector
- Train the compact model to reproduce those same representations, learning to compress the teacher’s knowledge into fewer parameters
- Validate quality by measuring how closely the compact model’s output correlates with the teacher’s output across thousands of test cases
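In code, a standard embedding-distillation loop along these lines might look like the following PyTorch sketch. The model classes, the `encode`/`encode_batch` methods, the dimensions, and the MSE objective are all assumptions on our part; the post doesn't publish training details:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

TEACHER_DIM, STUDENT_DIM = 768, 384  # hypothetical: the post reports vector bytes, not dimensions

def distill(teacher, student, log_messages: list[str], epochs: int = 3):
    """Train the student to reproduce the teacher's log embeddings."""
    teacher.eval()
    # Step 1: encode the corpus once with the frozen teacher.
    with torch.no_grad():
        targets = torch.stack([teacher.encode(m) for m in log_messages])
    # The two models have different dimensions, so learn a projection of the
    # teacher's vectors into the student's space alongside the student itself.
    proj = nn.Linear(TEACHER_DIM, STUDENT_DIM)
    opt = torch.optim.AdamW(list(student.parameters()) + list(proj.parameters()), lr=2e-5)
    loader = DataLoader(list(zip(log_messages, targets)), batch_size=64, shuffle=True)
    for _ in range(epochs):
        for msgs, tgt in loader:
            # Step 2: the student learns to match the teacher's representation.
            loss = nn.functional.mse_loss(student.encode_batch(list(msgs)), proj(tgt))
            opt.zero_grad()
            loss.backward()
            opt.step()
```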
The key metrics:
| Metric | Score | What It Means |
|---|---|---|
| Pearson correlation | 0.964 | Outputs track the full model's almost linearly (r = 0.964) |
| Spearman rank correlation | 0.954 | Rank-orders logs nearly identically to the full model (ρ = 0.954) |
| Variance retained | 96.9% | 96.9% of the information in the teacher's representations is preserved |
This is above our 90% quality threshold. The compact model captures essentially all of the full model’s log-specific knowledge — operational equivalence, causal chain awareness, severity understanding — in a package that runs 7x faster.
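These are standard agreement statistics and easy to reproduce with scipy. A sketch, assuming both models score the same set of test pairs and that "variance retained" uses the usual explained-variance definition (the post doesn't spell out its exact formula):

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

def agreement(teacher_scores: np.ndarray, student_scores: np.ndarray) -> dict:
    """Agreement between the two models' scores on the same test cases."""
    r, _ = pearsonr(teacher_scores, student_scores)      # linear agreement
    rho, _ = spearmanr(teacher_scores, student_scores)   # rank agreement
    # Explained-variance definition: 1 - Var(residual) / Var(teacher)
    retained = 1 - np.var(teacher_scores - student_scores) / np.var(teacher_scores)
    return {"pearson": r, "spearman": rho, "variance_retained": retained}
```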
The Infrastructure Advantage
Most AI-powered enterprise tools require GPU infrastructure to run their models. GPUs are expensive ($500-5,000/month per GPU) and add operational complexity (driver management, CUDA versions, GPU scheduling).
TinyThedex runs on standard ARM CPUs. The same commodity servers that run web applications and databases. No special hardware, no GPU drivers, no CUDA toolkit.
This is a structural cost advantage:
- Legacy APM tools: No AI models in the ingest path. Fast ingest but basic search.
- AI-powered competitors: Require GPU infrastructure. Better search but expensive.
- Thedex with TinyThedex: AI-native search on commodity CPU hardware. Better search AND lower cost.
We believe this is the right architecture for enterprise log intelligence: AI quality at CPU cost.
What’s Next
TinyThedex is deployed in our production environment and available to all design partners. As we onboard customers and learn from their real-world log patterns, both the full model and TinyThedex will improve through our data flywheel — each customer’s data makes the models better for everyone.
If you’re processing 100+ GB/day of logs and interested in AI-native search at a fraction of legacy costs, we’d love to talk.