Deep Research is an open source library for conducting deep, multi-hop research with reasoning capabilities. It performs focused web searches with recursive exploration to provide comprehensive, evidence-backed answers to complex questions.
Perplexity's and OpenAI's Deep Research offerings are gated and closed source, so we decided to build the opposite: an open, fully customizable deep research framework.
## Key Features
Deep Research is designed to be your comprehensive solution for AI-powered research:
- Advanced multi-hop reasoning for complex questions
- Real-time web search with recursive exploration
- Automatic subquery generation for comprehensive coverage
- Intelligent depth and breadth control for research thoroughness
- Evidence-based report generation with proper citations
- Automatic bibliography generation with source tracking
- Iterative research cycles for deeper understanding
- Multi-model support with specialized reasoning capabilities
- Flexible configuration for customizing research parameters
- Scalable from simple inquiries to complex research problems
## Core Concepts
| Concept | Description |
|---|---|
| Deep Thinking | The system breaks down a question into logical parts, reasons through them independently, and synthesizes an answer. |
| Deep Research | The system performs multi-hop, focused web searches, compares the findings, and composes an evidence-backed answer. |
## Installation
```bash
npm i deep-research
# or
yarn add deep-research
# or
bun i deep-research
```
## Quick Start
### Basic Usage
```typescript
import { createDeepResearch } from "deep-research";

// Create instance using the factory function with default settings
const deepResearch = createDeepResearch({
  OPENAI_API_KEY: process.env.OPENAI_API_KEY,
  GEMINI_API_KEY: process.env.GEMINI_API_KEY,
  OPENROUTER_API_KEY: process.env.OPENROUTER_API_KEY,
  JIGSAW_API_KEY: process.env.JIGSAW_API_KEY,
});

// Research prompt
const prompt = "What are the recent developments in quantum computing?";

// Generate research report
const result = await deepResearch.generate(prompt);

console.log(result.data.text);
console.log(result.data.bibliography);
```
### Advanced Usage
```typescript
import { createDeepResearch } from "deep-research";
import { createGoogleGenerativeAI } from "@ai-sdk/google";
import { createOpenRouter } from "@openrouter/ai-sdk-provider";
import { createOpenAI } from "@ai-sdk/openai";

// Initialize AI providers
const gemini = createGoogleGenerativeAI({
  apiKey: process.env.GEMINI_API_KEY,
});

const openaiProvider = createOpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

const openRouterProvider = createOpenRouter({
  apiKey: process.env.OPENROUTER_API_KEY!,
});

// Get model instances
const geminiModel = gemini("gemini-2.0-flash");
const deepseekModel = openRouterProvider("deepseek-ai/DeepSeek-R1");
const openaiModel = openaiProvider("gpt-4o");

// Create instance with custom configuration
const deepResearch = createDeepResearch({
  max_output_tokens: 30000, // hard upper limit on report length
  target_output_tokens: 10000, // target report length
  max_depth: 4, // how many research iterations to perform
  max_breadth: 3, // how many subqueries to generate per iteration
  models: {
    default: openaiModel, // custom models from the AI SDK
    reasoning: deepseekModel,
    output: geminiModel,
  },
  logging: {
    enabled: true, // enable console logging
  },
});

// Research prompt
const prompt = "What are the recent developments in quantum computing?";

// Generate research report
const result = await deepResearch.generate(prompt);

console.log(result.data.text);
console.log(result.data.bibliography);
```
## Configuration Options for Deep Research
| Category | Option | Type | Default | Description |
|---|---|---|---|---|
| max_depth | - | Number | 3 | Controls how many iterations of research the system will perform. Higher values allow for more thorough, multi-hop research. The system will continue researching until it has a complete answer or reaches this limit. |
| max_breadth | - | Number | 3 | Controls how many subqueries are generated for each research iteration. Higher values enable wider exploration of the topic. Determines how many parallel search paths are pursued. |
| max_output_tokens | - | Number | 32000 | Hard upper limit on the length of the final report. Must be greater than target_output_tokens. |
| target_output_tokens | - | Number | none (optional) | The ideal length for the generated report. The system will try to produce a report of approximately this length. |
| models | default | LanguageModelV1 | GPT-4o | The primary model used for most operations. |
| models | reasoning | LanguageModelV1 | DeepSeek-R1 | Model used for reasoning about search results. |
| models | output | LanguageModelV1 | GPT-4o | Model used for generating the final report. |
| logging | enabled | Boolean | false | When set to true, enables detailed console logging. Helpful for debugging and understanding the research process. |
| API Keys | JIGSAW_API_KEY | String | required | For accessing the JigsawStack API for web searches. |
| API Keys | OPENAI_API_KEY | String | optional if custom models provided | For OpenAI model access. |
| API Keys | DEEPINFRA_API_KEY | String | optional if custom models provided | For DeepInfra model access. |
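One documented constraint is that `max_output_tokens` must be greater than `target_output_tokens`. As a minimal sketch of how a caller might validate options before a run (the `ResearchConfig` interface and `validateConfig` helper are illustrative, not the library's actual API):

```typescript
// Illustrative shape of the options above; not the library's actual types.
interface ResearchConfig {
  max_depth?: number;
  max_breadth?: number;
  max_output_tokens?: number;
  target_output_tokens?: number;
}

// Defaults taken from the table above.
const DEFAULTS = { max_depth: 3, max_breadth: 3, max_output_tokens: 32000 };

// Merge user options with defaults and enforce the documented constraint
// that max_output_tokens is strictly greater than target_output_tokens.
function validateConfig(config: ResearchConfig) {
  const merged = { ...DEFAULTS, ...config };
  if (
    merged.target_output_tokens !== undefined &&
    merged.max_output_tokens <= merged.target_output_tokens
  ) {
    throw new Error("max_output_tokens must be greater than target_output_tokens");
  }
  return merged;
}
```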
## How It Works
### 1. Research Planning & Analysis
- Creates a DeepResearch instance with user-provided configuration
- Analyzes the input prompt to understand requirements
- Generates a comprehensive research plan
- Breaks the plan down into focused sub-queries using LLMs
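The planning step above can be sketched roughly as follows. `buildSubqueryPrompt` and `parseSubqueries` are hypothetical helpers, assuming the LLM returns one sub-query per line; the library's actual prompts and parsing may differ.

```typescript
// Hypothetical sketch of step 1: turning a research prompt into
// at most `maxBreadth` focused sub-queries via an LLM.
function buildSubqueryPrompt(question: string, maxBreadth: number): string {
  return [
    `You are planning research for the question: "${question}"`,
    `Generate at most ${maxBreadth} focused sub-queries, one per line,`,
    `that together cover the question comprehensively.`,
  ].join("\n");
}

// Parse a newline-separated LLM response into a bounded list of sub-queries.
function parseSubqueries(response: string, maxBreadth: number): string[] {
  return response
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .slice(0, maxBreadth);
}
```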
### 2. Data Collection & Processing
- Executes AI-powered web searches for each sub-query via JigsawStack API
- Gathers and validates relevant sources
- Generates context from search results
- Deduplicates URLs to ensure unique sources
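The deduplication step can be illustrated with a small sketch. The `SearchResult` shape and trailing-slash normalization are assumptions for this example, not the library's internals:

```typescript
// Sketch of the deduplication step: keep only the first result seen per URL.
interface SearchResult {
  url: string;
  title: string;
  snippet: string;
}

function dedupeByUrl(results: SearchResult[]): SearchResult[] {
  const seen = new Set<string>();
  const unique: SearchResult[] = [];
  for (const result of results) {
    // Normalize trivially so "https://a.com" and "https://a.com/" collapse.
    const key = result.url.replace(/\/$/, "");
    if (!seen.has(key)) {
      seen.add(key);
      unique.push(result);
    }
  }
  return unique;
}
```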
### 3. Analysis & Synthesis
- Processes gathered information through reasoning models
- Analyzes and synthesizes the findings
- Evaluates information sufficiency
- Determines if additional research is needed
- Performs iterative research within configured depth limits if needed
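The iterate-until-sufficient behavior described above can be sketched as a loop bounded by `max_depth`. Here `search` and `isSufficient` are stand-in callbacks for the library's internal LLM-driven search and sufficiency-evaluation steps:

```typescript
// Sketch of step 3's control flow: keep researching until the findings
// look sufficient or the configured depth limit is reached.
function researchLoop(
  query: string,
  maxDepth: number,
  search: (q: string, depth: number) => string[],
  isSufficient: (findings: string[]) => boolean
): { findings: string[]; depthUsed: number } {
  const findings: string[] = [];
  let depthUsed = 0;
  for (let depth = 1; depth <= maxDepth; depth++) {
    findings.push(...search(query, depth));
    depthUsed = depth;
    if (isSufficient(findings)) break; // stop early when coverage is adequate
  }
  return { findings, depthUsed };
}
```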
### 4. Report Generation & Citations
- Creates comprehensive final report
- Iteratively generates content until complete
- Maps sources to reference numbers
- Generates bibliography with citations
- Formats output according to target length requirements
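The source-to-reference-number mapping can be sketched as below. The `buildBibliography` helper and its `[n] Title - URL` format are illustrative assumptions, not the library's actual output format:

```typescript
// Sketch of step 4's citation mapping: assign each unique source URL a
// reference number and render a simple numbered bibliography.
function buildBibliography(sources: { url: string; title: string }[]): {
  refNumbers: Map<string, number>;
  bibliography: string;
} {
  const refNumbers = new Map<string, number>();
  const lines: string[] = [];
  for (const source of sources) {
    if (refNumbers.has(source.url)) continue; // one number per unique URL
    const n = refNumbers.size + 1;
    refNumbers.set(source.url, n);
    lines.push(`[${n}] ${source.title} - ${source.url}`);
  }
  return { refNumbers, bibliography: lines.join("\n") };
}
```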
## JigsawStack
This project is part of JigsawStack, a suite of powerful, developer-friendly APIs for a wide range of use cases at low cost. Sign up here for free!
## Contributing
Contributions are welcome! Please feel free to submit a PR :)
