Implementation of Graph of Agents (GoA) based on the Hugging Face pipeline.
## Usage

### Setup
Refer to `environment.yml` for the setup.
Create a new env:

```shell
conda env create -f environment.yml
```

Update the existing env:

```shell
conda env update -f environment.yml
```
The required data will be downloaded from Hugging Face datasets. Export `HF_TOKEN` in your environment.
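For example (the token value below is a placeholder; substitute your own Hugging Face access token):

```shell
# Export a Hugging Face access token so gated datasets can be downloaded.
# Replace the placeholder value with your own token.
export HF_TOKEN="hf_your_token_here"
```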
### Inference
The inference code generates predictions on samples, which are used later for evaluation. The example below runs GoA inference on LongBench. The model name convention is `{model_family}-{context_window}-{temperature}`.
```shell
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python3 eval_longbench.py --model_name llama3_8b-2000-0.1 --seed 42 --goa
```
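To illustrate the naming convention, a model name such as `llama3_8b-2000-0.1` splits into family, context window, and temperature. The helper below is purely illustrative (not part of the repo):

```python
def parse_model_name(name: str):
    """Split a '{model_family}-{context_window}-{temperature}' name.

    The family itself may contain hyphens or underscores, so split from
    the right: the last two fields are context window and temperature.
    """
    family, ctx, temp = name.rsplit("-", 2)
    return family, int(ctx), float(temp)

print(parse_model_name("llama3_8b-2000-0.1"))  # ('llama3_8b', 2000, 0.1)
```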
GoA and the baselines can be tested by passing the following arguments:
- `--goa`: test Graph of Agents (with k=4)
- `--coa`: test Chain-of-Agents
- `--rag`: test RAG
- none of the above: test the vanilla model
The number of GoA clusters can be changed by passing `--goa_cluster_size 2`. Refer to the scripts in `inference_scripts` for more details.
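As a rough illustration of what a cluster count controls, the sketch below partitions a list of document chunks into k contiguous clusters, one per worker group. This is a hypothetical sketch; the repo's actual GoA clustering logic may differ.

```python
def make_clusters(chunks, k=4):
    """Partition a list of text chunks into at most k contiguous clusters.

    Hypothetical illustration of grouping chunks for worker agents;
    not the repo's actual clustering implementation.
    """
    size = (len(chunks) + k - 1) // k  # ceiling division
    return [chunks[i:i + size] for i in range(0, len(chunks), size)]

clusters = make_clusters([f"chunk{i}" for i in range(10)], k=4)
# 10 chunks, k=4 -> clusters of sizes 3, 3, 3, 1
```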
### Evaluation
Run `result_longbench.py`, passing a proper model name `$MODEL_NAME` (`qwen_8b`, `llama3_8b`). Note that the context window and temperature are omitted, as opposed to the inference code. This will evaluate all matching models in `$DIRECTORY`.
```shell
python3 result_longbench.py --model qwen_8b --directory DIRECTORY
```
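Since the model name omits the context window and temperature, evaluation presumably matches every run sharing that prefix. The sketch below shows one way such prefix matching could work; the helper and file names are hypothetical, not the actual `result_longbench.py` logic.

```python
from pathlib import Path
import tempfile

def find_results(directory: str, model: str):
    """Collect files whose names start with '{model}-', so 'qwen_8b'
    matches runs like 'qwen_8b-2000-0.1' regardless of context window
    or temperature. Hypothetical sketch only."""
    return sorted(p.name for p in Path(directory).glob(f"{model}-*"))

# Demo on a throwaway directory with fake prediction files.
with tempfile.TemporaryDirectory() as d:
    for name in ["qwen_8b-2000-0.1.json", "qwen_8b-4000-0.1.json",
                 "llama3_8b-2000-0.1.json"]:
        (Path(d) / name).touch()
    matches = find_results(d, "qwen_8b")
```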
### Qualitative analysis

Passing `--debug` will save the worker outputs and the manager input/output in `save_dir`.
## Reference

A large part of the code is adapted from an implementation of Chain-of-Agents: https://github.com/rudrankriyam/Chain-of-Agents.