ChatGPT-J: The Privacy-First, Self-Hosted Chatbot Built on GPT-J's Powerful AI (colab.research.google.com)
"Privacy-First", but also working in a Colab notebook, meaning running on someone else's machine? That doesn't seem very private.
Download the notebook and run locally?
Yes, the GitHub repo has the Jupyter .ipynb notebook, which can be run locally: https://github.com/jarrellmark/chatgpt-j
And even in Colab, it's privacy-first in the sense that neither user input nor model output is sent anywhere else. The data stays local to your Colab session.
GitHub repo: https://github.com/jarrellmark/chatgpt-j
Can this be run locally without beefy GPUs by any chance?
ggml (https://github.com/ggerganov/ggml) has a GPT-J example; the 6B-parameter model runs happily on a CPU with 16 GB of RAM and 8 cores, at a couple of words per second, no GPU necessary.
    gptj_model_load: ggml ctx size = 13334.86 MB
    gptj_model_load: memory_size = 1792.00 MB, n_mem = 57344
    gptj_model_load: model size = 11542.79 MB / num tensors = 285
    main: number of tokens in prompt = 12
    main: mem per token = 16179460 bytes
    main:     load time =  7463.20 ms
    main:   sample time =     3.24 ms
    main:  predict time =  4887.26 ms / 232.73 ms per token
    main:    total time = 13203.91 ms

There have been CPU implementations of LLaMA (7B parameters, comparable in size) with very impressive performance.
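As a sanity check on "a couple of words per second", here is the throughput implied by the timings above. The words-per-token ratio is an assumption for English text, not something the log reports:

    # Back-of-the-envelope throughput from the ggml log above.
    # Assumption: roughly 0.75 English words per token (not from the log).
    ms_per_token = 232.73                        # "predict time ... per token"
    tokens_per_second = 1000 / ms_per_token      # ~4.3 tokens/s
    words_per_second = tokens_per_second * 0.75  # ~3.2 words/s
    print(f"{tokens_per_second:.1f} tokens/s ~= {words_per_second:.1f} words/s")
    # -> 4.3 tokens/s ~= 3.2 words/s, i.e. "a couple of words per second"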
I haven't used this yet, but I am currently running GPT-J on my Mac Studio, so I suspect so.
It should work with about 12 GB of GPU RAM.
I got it to load on a GTX 1070 with 8 GB of GPU RAM, but then it crashed before it could generate a response.
It needs less RAM than regular GPT-J because the weights are converted to 8-bit; int8 weights take half the space of the usual fp16 weights.
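For reference, here's a minimal sketch of loading GPT-J with 8-bit weights, assuming the transformers library with accelerate and bitsandbytes installed. This is one way to do it, not necessarily how the notebook itself does:

    # Minimal sketch: GPT-J 6B with int8 weights via transformers + bitsandbytes.
    # Assumes: pip install transformers accelerate bitsandbytes
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "EleutherAI/gpt-j-6B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",   # spread layers across available GPU/CPU memory
        load_in_8bit=True,   # store weights as int8, halving the fp16 footprint
    )

    # Hypothetical chat-style prompt for illustration.
    prompt = "Human: What is GPT-J?\nAssistant:"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.8)
    print(tokenizer.decode(output[0], skip_special_tokens=True))

This would also be consistent with the 1070 crash above: loading can succeed because the int8 weights fit, but generation still needs extra memory for activations and the KV cache on top of the weights.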