# Indexify
Compute Engine for Building Data Platforms
Indexify is a compute engine for building data platforms in Python. Create large-scale data processing workflows and agentic applications with durable execution—functions automatically retry on failure, and workflows seamlessly scale across machines. Upon deployment, each application gets a unique URL that can be called from any system.
> **Note**: Indexify is the open-source core that powers Tensorlake Cloud—a serverless platform for document processing, media pipelines, and agentic applications.
## ✨ Features
| Feature | Description |
|---|---|
| 🐍 Python Native | Define workflows as Python functions with type hints—no DSLs, YAML, or config files |
| 🔄 Durable Execution | Functions automatically retry on failure with persistent state across restarts |
| 📊 Distributed Map/Reduce | Parallelize functions over sequences across machines with automatic data shuffling |
| ⚡ Request Queuing | Automatically queue and batch invocations to maximize GPU utilization |
| 🌐 Multi-Cloud | Run across multiple clouds, datacenters, or regions with minimal configuration |
| 📈 Autoscaling | Server automatically redistributes work when machines come and go |
## 🚀 What Can You Build?
### Large-Scale Data Processing Workflows
Build production-grade data pipelines entirely in Python with automatic parallelization, fault tolerance, and distributed execution:
- Document Processing — Extract tables, images, and text from PDFs at scale; build knowledge graphs; implement RAG pipelines
- Media Pipelines — Transcribe and summarize video/audio content; detect and describe objects in images
- ETL & Data Transformation — Process millions of records with distributed map/reduce operations
### Agentic Applications
Build durable AI agents that reliably execute multi-step workflows:
- Tool-Calling Agents — Orchestrate LLM tool calls with automatic state management and retry logic (a minimal sketch follows this list)
- Multi-Agent Systems — Coordinate multiple agents with durable message passing
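To give a flavor of the pattern, here is a minimal, hypothetical sketch of a tool-calling step built from the same `@application()` and `@function()` decorators used in the Quick Start below. The `pick_tool` helper and `get_weather` tool are illustrative stand-ins, not part of the SDK; a real agent would put an LLM client behind `pick_tool` and loop until the model stops requesting tools:

```python
from tensorlake.applications import application, function

@function()
def get_weather(city: str) -> str:
    """Illustrative tool; a real tool would call an external API."""
    return f"Sunny in {city}"

def pick_tool(task: str) -> tuple[str, str]:
    """Hypothetical stand-in for an LLM tool-selection call."""
    return ("get_weather", "Berlin")

@application()
@function(description="Minimal tool-calling agent")
def agent(task: str) -> str:
    # Each @function call is a durable step: if a worker crashes,
    # the engine retries that step instead of the whole request.
    tool, arg = pick_tool(task)
    if tool == "get_weather":
        return get_weather(arg)
    return "no matching tool"
```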
📖 Explore the Cookbooks → for complete examples and tutorials.
## 📦 Installation
Using pip:
```bash
pip install indexify tensorlake
```
## 🎯 Quick Start
### Define Your Application
Create applications using the `@application()` and `@function()` decorators. Each function runs in its own isolated sandbox with durable execution—if a function crashes, it automatically restarts from where it left off.
```python
from typing import List

from pydantic import BaseModel
from tensorlake.applications import application, function, Image, run_local_application

# Define container image with dependencies
embedding_image = Image(base_image="python:3.11-slim", name="embedding_image").run(
    "pip install sentence-transformers langchain-text-splitters chromadb"
)

class TextChunk(BaseModel):
    chunk: str
    page_number: int

class ChunkEmbedding(BaseModel):
    text: str
    embedding: List[float]

@function(image=embedding_image)
def chunk_text(text: str) -> List[TextChunk]:
    """Split text into chunks for embedding."""
    from langchain_text_splitters import RecursiveCharacterTextSplitter

    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=20)
    texts = splitter.create_documents([text])
    return [
        TextChunk(chunk=chunk.page_content, page_number=i)
        for i, chunk in enumerate(texts)
    ]

@function(image=embedding_image)
def embed_chunks(chunks: List[TextChunk]) -> List[ChunkEmbedding]:
    """Embed text chunks using sentence transformers."""
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")
    return [
        ChunkEmbedding(text=chunk.chunk, embedding=model.encode(chunk.chunk).tolist())
        for chunk in chunks
    ]

@function(image=embedding_image)
def write_to_vectordb(embeddings: List[ChunkEmbedding]) -> str:
    """Write embeddings to ChromaDB."""
    import chromadb
    import uuid

    client = chromadb.PersistentClient("./chromadb_data")
    collection = client.get_or_create_collection("documents")
    for emb in embeddings:
        collection.upsert(
            ids=[str(uuid.uuid4())],
            embeddings=[emb.embedding],
            documents=[emb.text],
        )
    return f"Indexed {len(embeddings)} chunks"

@application()
@function(description="Text embedding pipeline")
def text_embedder(text: str) -> str:
    """Main application: chunks text, embeds it, and stores in vector DB."""
    chunks = chunk_text(text)
    embeddings = embed_chunks(chunks)
    result = write_to_vectordb(embeddings)
    return result
```
### Deploy to Tensorlake Cloud (Fastest Way to Get Started)
Tensorlake Cloud lets you test and deploy your applications with no infrastructure setup. Get an API key and deploy in seconds:
```bash
# Set your API key
export TENSORLAKE_API_KEY="your-api-key"

# Deploy the application
tensorlake deploy workflow.py
# => Deployed! URL: https://api.tensorlake.ai/namespaces/default/applications/text_embedder
```
Invoke your application using the SDK or call the URL directly:
```python
from tensorlake.applications import run_remote_application

request = run_remote_application(text_embedder, "Your document text here...")
result = request.output()
print(result)
```
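Since every deployed application is exposed at a plain HTTP URL, you can also call it from any language. The snippet below is a sketch only: the payload shape and auth header are assumptions for illustration, not the documented API contract.

```python
import os

import requests

# Assumed request format: the JSON body and bearer-auth header are
# illustrative guesses; check the Tensorlake API docs for the actual contract.
resp = requests.post(
    "https://api.tensorlake.ai/namespaces/default/applications/text_embedder",
    headers={"Authorization": f"Bearer {os.environ['TENSORLAKE_API_KEY']}"},
    json={"text": "Your document text here..."},
)
print(resp.status_code, resp.text)
```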
### Self-Host with Indexify
If you prefer to self-host or need on-premise deployment, you can run the Indexify server locally:
```bash
# Terminal 1: Start the server
docker run -p 8900:8900 tensorlake/indexify-server

# Terminal 2: Start an executor (repeat for more parallelism)
indexify-cli executor
```
Set the API URL and deploy:
```bash
export TENSORLAKE_API_URL=http://localhost:8900
tensorlake deploy workflow.py
# => Deployed! URL: http://localhost:8900/namespaces/default/applications/text_embedder
```
Run your application:
```python
from tensorlake.applications import run_remote_application

request = run_remote_application(text_embedder, "Your document text here...")
result = request.output()
print(result)
```
### Test Locally (No Server Required)
For quick iteration during development, run applications locally without any infrastructure:
```python
if __name__ == "__main__":
    request = run_local_application(text_embedder, "Your document text here...")
    result = request.output()
    print(result)
```
## 🏗️ Production Self-Hosted Deployment
For production self-hosted deployments, see `operations/k8s` for Kubernetes deployment manifests and Helm charts.
## ☁️ Tensorlake Cloud vs Self-Hosted Indexify
Start with Tensorlake Cloud to build and test your applications without infrastructure overhead. When you're ready for self-hosting or need on-premise deployment, Indexify provides the same runtime you can run anywhere.
| Feature | Tensorlake Cloud | Indexify (Self-Hosted) |
|---|---|---|
| Setup Time | Instant—just get an API key | Deploy server + executors |
| Image Building | Automatic image builds when you deploy | Build and manage container images yourself |
| Auto-Scaling | Dynamic container scaling with scale-to-zero | Manual executor management |
| Security | Secure sandboxes (gVisor, Linux containers, virtualization) | Standard container isolation |
| Secrets | Built-in secret management for applications | Manage secrets externally |
| Observability | Logging, tracing, and observability built-in | Bring your own logging/tracing |
| Testing | Interactive playground to invoke applications | Local development only |
Get started with Tensorlake Cloud →
## 🤝 Contributing
We welcome contributions! See CONTRIBUTING.md for guidelines.
## 📄 License
Indexify is licensed under the Apache 2.0 License.