Show HN: LLMWare – Small Specialized Function Calling 1B LLMs for Multi-Step RAG
github.com

Hi, I was a corporate lawyer for many years, working with a lot of financial services and insurance companies. In practicing law, I noticed that many of the tasks I was working on, even as a highly paid attorney, were repetitive and could be automated.
I wanted to solve the problem of dealing with a lot of information and data in a practical way, using AI. This motivated me to start AI Bloks/LLMWare with my husband, who has a deep background in software and is a very early adopter of AI.
We have been on this journey with our open source project LLMWare for the past 4 months, producing a RAG framework on GitHub and about 50 models on Hugging Face. https://huggingface.co/llmware
Our latest models, SLIMs, are designed to re-imagine the way we use small specialized models in multi-step RAG workflows. I would love for you to check them out and give us some feedback. Thank you!

I've been building upon the LLMWare project - https://github.com/llmware-ai/llmware - for the past 3 months. The ability to run these models locally on standard consumer CPUs, along with the abstraction provided to chop and change between models and different processes, is really cool. I think these SLIM models are the start of something powerful for automating internal business processes and enhancing the use cases of LLMs. It still kind of blows my mind that this all runs on my 3900X, and also on a bog-standard Hetzner server with no GPU.

Don't forget to give some credit to llama.cpp, which actually runs the models here and does all the things you're praising it for. This project is more about building a platform on top of it, with RAG and function calling.

Oh yeah, 100%! llama.cpp and the open source community in general are truly awesome at getting AI models into the hands of as many people as possible. I think these platforms are the key to inspiring people and getting them to see the power of local LLMs in just a few minutes. Can't wait to see what other open source platforms crop up in 2024 as well.

Absolutely! We credit Georgi Gerganov and llama.cpp for the amazing advancements in quite a few of our YT videos. He is truly a hero. Thank you so much for the awesome feedback!

This looks very interesting, as the majority of the world is moving towards bigger "do everything for you" LLMs. Just a few preliminary questions after I glanced over the repo and blogs:

- there are also many small models that try to "do everything", like Phi, Mistral, etc. Do you find llmware's models have better quality and performance?
- how do the RAG-related features compare with other tools like LlamaIndex?
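For anyone curious what the function-calling workflow looks like in practice, below is a minimal sketch of a local SLIM call. It is based on llmware's ModelCatalog interface and the "slim-sentiment-tool" model listed on the Hugging Face page; the exact method names and return format are assumptions to verify against the current repo.

```python
# A minimal sketch, assuming llmware's ModelCatalog interface and the
# "slim-sentiment-tool" model name from the Hugging Face page -- check
# the repo for the current API before relying on this.
from llmware.models import ModelCatalog

# Load a small quantized SLIM; it runs locally on CPU (via llama.cpp),
# so no GPU is needed.
model = ModelCatalog().load_model("slim-sentiment-tool")

text = ("The quarterly results missed guidance and the stock dropped 12%, "
        "though management remains optimistic about next year.")

# SLIMs are function-calling models: instead of free-form chat they return
# structured output, which is what makes them composable in pipelines.
response = model.function_call(text)
print(response["llm_response"])  # e.g. {'sentiment': ['negative']}
```

The design idea this illustrates: each SLIM does one narrow job and emits structured output, so several of them can be chained programmatically, which is the multi-step RAG workflow the post describes.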