Project Name
Cohere Weaviate Wikipedia Retrieval using LangChain
Description
A backend API for searching Wikipedia using LangChain, Cohere, and Weaviate.
Getting Started
Prerequisites
To use this project, you will need to have the following installed on your machine:
- Python 3.8 or above
- pip
- virtualenv
Installing
To install and run this project on your local machine, follow these steps:
- Clone the repository onto your machine using the following command:
git clone https://github.com/menloparklab/cohere-weviate-wikipedia-retrieval
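- Create a virtual environment for the project, for example with the following command (the environment name venv is an arbitrary choice):
virtualenv venv
- Activate the virtual environment, for example with the following command (Linux or macOS, assuming the environment name above):
source venv/bin/activate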
- Install the project dependencies using the following command:
pip install -r requirements.txt
- Create a .env file in the root directory of the project and add your API keys. You can use the .env.copy file as a template. The Weaviate API key and URL are intentionally left in the template; they are read-only credentials provided by Weaviate for demo purposes.
- To test your output and results, use the provided Jupyter notebook. You can easily run it in Colab as well.
- Start the API routes by running the Flask application.
The endpoints are listed below, with examples of how to call them.
- /retrieve
This endpoint generates an answer to a query using retrieval-based QA. To use this endpoint, send a POST request to http://<host>/retrieve with the following JSON payload:
{ "query": "<your query>", "language": "<language>" }
The query field should contain the query for which you want to generate an answer. The language field is optional and should be set to the language of the query. If the language field is not set, the default language is English.
Example JSON:
{ "query": "What is the capital of France?", "language": "english" }
- /retrieve-list
This endpoint returns a list of the embeddings most similar to the query, using the vector store. To use this endpoint, send a POST request to http://<host>/retrieve-list with the following JSON payload:
{ "query": "<your query>", "k": <k> }
The query field should contain the query to search for. The k field is optional and should be set to the number of most similar embeddings you want to retrieve. If the k field is not set, the default value is 4.
Example JSON:
{ "query": "What is the capital of France?", "k": 4 }
- /retrieve-compr
This endpoint generates an answer to a query using Contextual Compression. To use this endpoint, send a POST request to http://<host>/retrieve-compr with the following JSON payload:
{ "query": "<your query>", "k": <k>, "top_n": <top_n>, "language": "<language>" }
The query field should contain the query for which you want to generate an answer. The k and top_n fields are optional and should be set to the number of most similar embeddings you want to retrieve and the number of compressed documents you want to consider, respectively. If the k and top_n fields are not set, the default values are 9 and 3, respectively. The language field is optional and should be set to the language of the query. If the language field is not set, the default language is English.
Example JSON:
{ "query": "What is the capital of France?", "k": 9, "top_n": 3, "language": "english" }
- /retrieve-compr-list
This endpoint returns a list of the embeddings most similar to the query, using Contextual Compression. To use this endpoint, send a POST request to http://<host>/retrieve-compr-list with the following JSON payload:
{ "query": "<your query>", "k": <k>, "top_n": <top_n> }
The query field should contain the query to search for. The k and top_n fields are optional and should be set to the number of most similar embeddings you want to retrieve and the number of compressed documents you want to consider, respectively. If the k and top_n fields are not set, the default values are 9 and 3, respectively.
- /chat-no-history
This route allows the user to chat with the application without any historical chat context. It accepts the following parameters in a JSON request body:
query: The user's query. Required.
k: An integer value for the number of results to retrieve from the model. Optional, defaults to 9.
top_n: An integer value for the number of top search results to consider for generating an answer. Optional, defaults to 3.
The route then uses the compression function to retrieve the top k results from the model, and constructs a prompt using the user's query. The prompt is passed to the machine learning model, and the output is parsed using a parser object. If a language is detected in the output, it is used for subsequent queries; otherwise the default is English. The RetrievalQA class is used to generate a response using the qa object, and the search result is returned as a JSON response.
Example JSON:
{ "query": "What is the capital of France?", "k": 5, "top_n": 2 }
Example Response:
{ "search_result": "Paris is the capital of France." }
- /chat-with-history
This route allows the user to chat with the application using historical chat context. It accepts the same parameters as the previous route:
query: The user's query. Required.
k: An integer value for the number of results to retrieve from the model. Optional, defaults to 9.
top_n: An integer value for the number of top search results to consider for generating an answer. Optional, defaults to 3.
In addition, this route maintains a memory of past conversations using the ConversationBufferMemory class, and generates responses using the ConversationalRetrievalChain class. The memory key for this route is set to "chat_history". The search result is returned as a JSON response.
Example JSON:
{ "query": "What is the capital of Spain?", "k": 3, "top_n": 1 }
Example Response:
{ "search_result": "The capital of Spain is Madrid." }
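Calling the endpoints from Python
The payloads described above can be built and sent from a small client. The following is a minimal sketch, not part of this project: the names ENDPOINT_OPTIONS, build_payload and post_json are hypothetical helpers, and only the Python standard library is used. Optional fields are sent only when you set them, so the server applies its own defaults (for example k = 4 for /retrieve-list, or k = 9 and top_n = 3 for the compression routes).

```python
import json
import urllib.request

# Optional fields each endpoint accepts, taken from the descriptions above.
# "query" is always required and is therefore not listed here.
ENDPOINT_OPTIONS = {
    "/retrieve": {"language"},
    "/retrieve-list": {"k"},
    "/retrieve-compr": {"k", "top_n", "language"},
    "/retrieve-compr-list": {"k", "top_n"},
    "/chat-no-history": {"k", "top_n"},
    "/chat-with-history": {"k", "top_n"},
}


def build_payload(endpoint, query, **options):
    """Build the JSON body for an endpoint (hypothetical helper).

    Only the options passed by the caller are included; anything left out
    falls back to the server-side default.
    """
    if endpoint not in ENDPOINT_OPTIONS:
        raise ValueError(f"unknown endpoint: {endpoint}")
    unsupported = set(options) - ENDPOINT_OPTIONS[endpoint]
    if unsupported:
        raise ValueError(f"{endpoint} does not accept: {sorted(unsupported)}")
    return {"query": query, **options}


def post_json(base_url, endpoint, payload, timeout=30):
    """POST a JSON payload to base_url + endpoint and decode the JSON reply."""
    request = urllib.request.Request(
        base_url.rstrip("/") + endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the API running, a call such as post_json("http://localhost:5000", "/chat-no-history", build_payload("/chat-no-history", "What is the capital of France?", k=5, top_n=2)) would send the example request shown above. The address http://localhost:5000 is an assumption; use the host and port where your Flask application is served. For /chat-with-history, send successive requests to the same running server, since the conversation memory is kept on the server side.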