Instant AI Response

chatjimmy.ai

38 points by hochmartinez a month ago · 14 comments

personalcompute a month ago

This is a demo of Taalas inference ASIC hardware. Prior discussion @ https://news.ycombinator.com/item?id=47086181

pella a month ago

- https://news.ycombinator.com/item?id=47086181

- https://taalas.com/the-path-to-ubiquitous-ai/

- https://www.nextplatform.com/2026/02/19/taalas-etches-ai-mod...

nacs a month ago

What model and hardware powers this?

Is this a Google T5-based model?

OutOfHere a month ago

Impressive, but this particular underlying LLM is objectively weak. I'd like to see it done with a larger, newer model.

alansaber a month ago

I love seeing optimised SLM inference. Is there a current use-case for this? Edge CNNs make sense to me but not edge SLMs (yet).

Kuyawa a month ago

If this is possible, why don't all online AI engines work like this?

  • yomismoaqui a month ago

    This is a specific model (Llama 3.1 8B) baked into hardware. You can only use this one model, but you get "low" power consumption and crazy speed.

    If you want to run a different model, you need new hardware for that model.

    • sbrother a month ago

      Do we understand how to scale up the hardware to the point where it can run a frontier model? Because this is insane. It would be a game changer for agent systems making 10-100+ calls (see the sketch after this thread).

    • sixtyj a month ago

      It really is a crazy speed: 15k tokens/second.

      • sixtyj a month ago

        I have tried it again. This is the future of chat UI, imho.

        Generated in 0,074s • 15 754 tok/s
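
A quick back-of-the-envelope sketch in Python, tying the quoted figures to sbrother's agent-chain point. Assumptions not from the thread: the 0.074 s is treated as pure generation time, and the 2 s per-call latency used as a conventional-API baseline is a hypothetical round number.

    # Figures quoted from the demo UI above.
    gen_time_s = 0.074         # "Generated in 0,074s"
    throughput_tok_s = 15_754  # "15 754 tok/s"

    # Tokens implied for that single reply: generation time * throughput.
    tokens_generated = gen_time_s * throughput_tok_s
    print(f"~{tokens_generated:.0f} tokens in that reply")  # ~1166 tokens

    # Sequential agent chains pay per-call latency on every hop,
    # so wall-clock time scales linearly with the number of calls.
    calls = 100
    baseline_latency_s = 2.0   # hypothetical conventional hosted-LLM call
    print(f"{calls} calls at {gen_time_s:.3f} s each: {calls * gen_time_s:.1f} s")              # 7.4 s
    print(f"{calls} calls at {baseline_latency_s:.1f} s each: {calls * baseline_latency_s:.0f} s")  # 200 s

Under those assumptions, a 100-step sequential chain finishes in well under ten seconds at the demo's latency, versus a few minutes at a typical hosted-API latency.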

notronic a month ago

Imagine a model like Opus 4.6 at that speed; that would be insane.
