PasLLM: An Object Pascal inference engine for LLM models

github.com

4 points by nor-and-or-not 2 months ago · 1 comment

nor-and-or-not (OP) 2 months ago

PasLLM runs models such as Llama 3.x, Qwen 2.5 and 3, Phi-3, Mixtral, Gemma 1, DeepSeek R1, and others locally, with no external dependencies for inference. It currently runs only on the CPU; GPU acceleration is planned.

The inference engine uses its own custom 4-bit quantization formats, which balance precision against model size; larger bit widths are also supported. Models need to be converted into these formats with the provided tools, but pre-quantized models are available for download. Details about the optimized formats can be found here: https://github.com/BeRo1985/pasllm/blob/master/docs/quant_4b...
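To give a feel for the idea behind block-wise 4-bit quantization (this is a generic sketch, not PasLLM's actual format — see the linked docs for that), here is a minimal Python illustration: weights are split into fixed-size blocks, each block stores one float scale plus a 4-bit signed code per weight.

```python
import numpy as np

def quantize_4bit(weights, block_size=32):
    # Split into blocks; each block gets its own scale so outliers in one
    # block don't destroy precision everywhere else.
    w = weights.reshape(-1, block_size)
    # Map each block's values into the signed 4-bit range [-7, 7].
    scales = np.abs(w).max(axis=1, keepdims=True) / 7.0
    scales[scales == 0] = 1.0  # avoid division by zero for all-zero blocks
    codes = np.clip(np.round(w / scales), -7, 7).astype(np.int8)
    return codes, scales

def dequantize_4bit(codes, scales, shape):
    # Reconstruct approximate weights: code * per-block scale.
    return (codes.astype(np.float32) * scales).reshape(shape)

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)
codes, scales = quantize_4bit(w)
w_hat = dequantize_4bit(codes, scales, w.shape)
max_err = float(np.abs(w - w_hat).max())  # bounded by half a scale per block
```

A real engine would additionally pack two 4-bit codes per byte and pick block sizes and scale encodings to trade accuracy against storage; the per-block scale is what keeps the rounding error proportional to each block's magnitude rather than to the whole tensor's.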

It supports both Delphi and Free Pascal on all major operating systems. A CLI version is included, along with example programs for several Pascal GUI frameworks (FMX, VCL, LCL).

PasLLM is licensed under the AGPL 3.0 and may be integrated as a Pascal unit directly into third-party Object Pascal projects.
