The Growing Shadow AI Problem
Over 14,000 Ollama server instances are publicly accessible on the internet right now. A recent Cisco analysis found that 20% of these actively host models susceptible to unauthorized access. Separately, BankInfoSecurity reported discovering more than 10,000 Ollama servers with no authentication layer—the result of hurried AI deployments by developers under pressure.
This is the new shadow IT: developers spinning up local LLM servers for productivity, unaware they’ve exposed sensitive infrastructure to the internet. And Ollama is just one of dozens of AI serving platforms proliferating across enterprise networks.
The security question is no longer “are we running AI?” but “where is AI running that we don’t know about?”
What is LLM Service Fingerprinting?
LLM service fingerprinting identifies what **server software** is running on a network endpoint—not which AI model generated text, but which infrastructure is serving it.
The LLM security space spans multiple tool categories, each answering a different question: some tools detect whether text was AI-generated, others probe the models themselves for weaknesses like prompt injection, and a third category identifies the infrastructure serving the models.
Julius answers that third question: during a penetration test or attack surface assessment, you’ve found an open port. Is it Ollama? vLLM? A Hugging Face deployment? Some enterprise AI gateway? Julius tells you in seconds.
Julius follows the Unix philosophy: do one thing and do it well. It doesn’t port scan. It doesn’t vulnerability scan. It identifies LLM services—nothing more, nothing less.
This design enables Julius to slot into existing security toolchains rather than replace them.
The Praetorian Guard Security Pipeline
In Praetorian’s continuous offensive security platform, Julius occupies a critical position in the multi-stage scanning pipeline: upstream port scanners discover open services, Julius identifies which of them are LLM infrastructure, and downstream tooling handles vulnerability assessment.
Why Existing Detection Methods Fall Short
Manual Detection is Slow and Error-Prone
Each LLM platform has different API signatures, default ports, and response patterns:
- Ollama: port 11434, /api/tags returns {"models": [...]}
- vLLM: port 8000, OpenAI-compatible /v1/models
- LiteLLM: port 4000, proxies to multiple backends
- LocalAI: port 8080, /models endpoint
Manually checking each possibility during an assessment wastes time and risks missing services.
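To make the tedium concrete, manual verification looks something like the following, working through each platform's default port and endpoint in turn (the target host is illustrative; the ports and paths are the defaults listed above):

```bash
# Hypothetical target; substitute the host under assessment
TARGET=10.0.0.5

# Ollama: default port 11434, /api/tags lists installed models
curl -s "http://$TARGET:11434/api/tags"

# vLLM: default port 8000, OpenAI-compatible model listing
curl -s "http://$TARGET:8000/v1/models"

# LiteLLM: default port 4000, also OpenAI-compatible
curl -s "http://$TARGET:4000/v1/models"

# LocalAI: default port 8080, /models endpoint
curl -s "http://$TARGET:8080/models"
```

Multiply this by every host in scope and every platform you might encounter, and the approach quickly stops scaling.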
Shodan Queries Have Limitations
Cisco’s study found roughly 1,100 Ollama instances indexed on Shodan, a fraction of the 14,000+ exposed servers noted above. Replicating the research also requires a Shodan license.
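For reference, exposed Ollama instances are typically hunted by searching for the banner the server returns on its root path. The query below is illustrative; exact filter support depends on your Shodan plan:

```
"Ollama is running" port:11434
```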
Introducing Julius
Julius is an open-source LLM service fingerprinting tool that detects 17+ AI platforms through active HTTP probing. Built in Go, it compiles to a single binary with no external dependencies.
Julius vs Alternatives
How Julius Works
Julius uses a probe-and-match architecture optimized for speed and accuracy.
Architectural Decisions
Julius is designed for performance in large-scale assessments; the single dependency-free binary and the HTTP response caching described below both serve that goal.
Detection Process
1. Target Normalization: Validates and normalizes input URLs
2. Probe Selection: Prioritizes probes matching the target’s port (if :11434, Ollama probes run first)
3. HTTP Probing: Sends requests to service-specific endpoints
4. Rule Matching: Compares responses against signature patterns
5. Specificity Scoring: Ranks results 1-100 by most specific match
6. Model Extraction: Optionally retrieves deployed models via JQ expressions
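Conceptually, the loop looks like the Go sketch below. This is an illustrative reimplementation of the flow, not Julius’s actual internals; the type names, fields, and rule format are all simplified assumptions.

```go
// Illustrative reimplementation of the detection flow; types, fields,
// and the rule format are simplified assumptions, not Julius's internals.
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"sort"
	"strings"
	"time"
)

// Probe pairs a service-specific endpoint with a body signature and a
// specificity score used to rank competing matches.
type Probe struct {
	Service     string
	Port        string // preferred default port, e.g. "11434"
	Path        string // endpoint to request, e.g. "/api/tags"
	BodyMatch   string // substring expected in the response body
	Specificity int    // 1-100; higher means a more specific signature
}

var probes = []Probe{
	{"ollama", "11434", "/api/tags", `"models"`, 85},
	{"openai-compatible", "", "/v1/models", `"object"`, 30},
}

func fingerprint(target string) ([]Probe, error) {
	// Step 1: target normalization.
	u, err := url.Parse(target)
	if err != nil {
		return nil, err
	}

	// Step 2: probe selection; probes matching the target's port run first.
	ordered := append([]Probe(nil), probes...)
	sort.SliceStable(ordered, func(i, j int) bool {
		return ordered[i].Port == u.Port() && ordered[j].Port != u.Port()
	})

	client := &http.Client{Timeout: 5 * time.Second}
	var matches []Probe
	for _, p := range ordered {
		// Step 3: HTTP probing of the service-specific endpoint.
		resp, err := client.Get(u.Scheme + "://" + u.Host + p.Path)
		if err != nil {
			continue
		}
		body, _ := io.ReadAll(io.LimitReader(resp.Body, 1<<20))
		resp.Body.Close()

		// Step 4: rule matching against the signature pattern.
		if strings.Contains(string(body), p.BodyMatch) {
			matches = append(matches, p)
		}
	}

	// Step 5: specificity scoring; most specific match reported first.
	sort.Slice(matches, func(i, j int) bool {
		return matches[i].Specificity > matches[j].Specificity
	})
	return matches, nil
}

func main() {
	results, err := fingerprint("http://127.0.0.1:11434")
	if err != nil {
		panic(err)
	}
	for _, m := range results {
		fmt.Printf("%s (specificity %d)\n", m.Service, m.Specificity)
	}
}
```

Step 6, model extraction, would then parse the matched response body (Julius uses JQ expressions for this, per the list above) to enumerate deployed models.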
Specificity Scoring: Eliminating False Positives
Many LLM platforms implement OpenAI-compatible APIs. If Julius detects both “OpenAI-compatible” (specificity: 30) and “LiteLLM” (specificity: 85) on the same endpoint, it reports LiteLLM first.
This prevents the generic “OpenAI-compatible” match from obscuring the actual service identity.
Match Rule Engine
Julius uses six rule types for fingerprinting.
All rules support negation with not: true, which is crucial for distinguishing similar services. For example, “has /api/tags endpoint” AND “does NOT contain LiteLLM” ensures Ollama detection doesn’t match LiteLLM proxies, as the probe fragment below illustrates.
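Expressed in probe form, that Ollama rule might look like this fragment. The field names are illustrative assumptions, not Julius’s actual schema:

```yaml
# Illustrative rule fragment; real field names may differ
matchers:
  - type: body_contains
    value: '"models"'        # /api/tags response lists models
  - type: body_contains
    value: 'LiteLLM'
    not: true                # must NOT look like a LiteLLM proxy
```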
Julius also caches HTTP responses during a scan, so multiple probes targeting the same endpoint don’t result in duplicate requests. You can write 100 probes that check / for different signatures without overloading the target. Julius fetches the page once and evaluates all matching rules against the cached response.
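In Go, such a cache can be as small as a mutex-guarded map keyed by URL, filled on first use. The sketch below is a conceptual illustration under that assumption, not Julius’s actual implementation:

```go
// Package fingerprint: conceptual sketch of a per-scan response cache,
// not Julius's actual code.
package fingerprint

import (
	"io"
	"net/http"
	"sync"
)

// responseCache memoizes GET responses so that many probes hitting the
// same endpoint trigger only one request against the target.
type responseCache struct {
	mu    sync.Mutex
	byURL map[string][]byte
}

func newResponseCache() *responseCache {
	return &responseCache{byURL: make(map[string][]byte)}
}

// get returns the cached body for url, fetching it only on first use.
// Holding the lock across the fetch serializes requests; fine for a sketch.
func (c *responseCache) get(client *http.Client, url string) ([]byte, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if body, ok := c.byURL[url]; ok {
		return body, nil
	}
	resp, err := client.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(io.LimitReader(resp.Body, 1<<20))
	if err != nil {
		return nil, err
	}
	c.byURL[url] = body
	return body, nil
}
```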
Julius prioritizes precision over breadth. Each probe includes specificity scoring to avoid false positives. An Ollama instance should be identified as Ollama, not just “something OpenAI-compatible.” The generic OpenAI-compatible probe exists as a fallback, but specific service detection always takes precedence.
Probes Included in Initial Release
Self-Hosted LLM Servers
Proxy & Gateway Services
Enterprise Cloud Platforms
ML Demo Platforms
RAG Platforms
Chat Frontends
Generic Detection
Extending Julius with Custom Probes
Adding support for a new LLM service requires ~20 lines of YAML, with no code changes.
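As a sketch, a probe for a made-up service (“acme-llm”, with invented endpoints) might look like the following. The field names are assumptions modeled on the concepts above, so mirror the existing probe files in the repository for the real schema:

```yaml
# probes/acme-llm.yaml -- hypothetical service and illustrative schema
name: acme-llm
description: Acme LLM Server
specificity: 80
requests:
  - path: /acme/v1/models
    matchers:
      - type: status_code
        value: 200
      - type: body_contains
        value: '"acme_models"'
    extract:
      models: '.acme_models[].name'   # JQ expression for model extraction
```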
Once written, validate the probe against a test instance before opening a pull request.
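A run along these lines, with the new probe loaded against a local test deployment, confirms the signature matches. The flag shown is a hypothetical stand-in; the repository’s README documents the real interface:

```bash
# --probe is a hypothetical flag; see the README for the actual way
# to load custom probe definitions
julius --probe probes/acme-llm.yaml http://localhost:9000
```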
Real-World Usage
Single Target Assessment
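The simplest invocation points Julius at one endpoint found during recon. The exact argument form below is illustrative; consult the README for the real interface:

```bash
# Fingerprint a single suspected LLM endpoint (illustrative usage)
julius http://10.0.0.5:11434
```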
Scan Multiple Targets From a File
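For larger assessments, feed Julius a file of discovered endpoints. The flag name is a hypothetical stand-in; check the README for the actual option:

```bash
# targets.txt: one URL per line, e.g. from a port-scan export
# --input is a hypothetical flag name; check julius --help
julius --input targets.txt
```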
JSON Output for Automation
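For piping results into other tooling, a machine-readable mode is the natural fit. The flag below is a hypothetical stand-in, so verify the real output options and schema against the README:

```bash
# --json is a hypothetical flag; verify against the README
julius --json http://10.0.0.5:11434 | jq '.'
```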
What's Next
Julius is the first release in our “The 12 Caesars” open-source tool campaign, in which we will release one open-source tool per week for the next 12 weeks. Julius focuses on HTTP-based fingerprinting of known LLM services. We’re already working on expanding its capabilities while maintaining the lightweight, fast execution that makes it practical for large-scale reconnaissance.
On our roadmap: additional probes for cloud-hosted LLM services, smarter detection of custom integrations, and the ability to analyze HTTP traffic patterns to identify LLM usage that doesn’t follow standard API conventions. We’re also exploring how Julius can work alongside AI agents to autonomously discover LLM infrastructure across complex environments.
Contributing & Community
Julius is available now under the Apache 2.0 license at https://github.com/praetorian-inc/julius
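Getting a build is straightforward with a recent Go toolchain; the steps below are standard Go build commands rather than project-specific instructions:

```bash
# Clone and build from source (assumes a working Go toolchain)
git clone https://github.com/praetorian-inc/julius
cd julius
go build ./...
```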
We welcome contributions from the community. Whether you’re adding probes for services we haven’t covered, reporting bugs, or suggesting new features, check the repository’s CONTRIBUTING.md for guidance on probe definitions and development workflow.
Ready to start? Clone the repository, experiment with Julius in your environment, and join the discussion on GitHub. We’re excited to see how the security community uses this tool in real-world reconnaissance workflows. Star the project if you find it useful, and let us know what LLM services you’d like to see supported next.