Heap-overflowing Llama.cpp to RCE

248 points by retr0reg 9 months ago · 60 comments

This is really incredible work. And the fact that you are 15 is blowing my mind. You have a really bright future ahead of you, and your parents must be really proud (at least I would be if you were my kid.) Hit me up if you want a summer internship finding security vulnerabilities at a hotel software startup (access control, property management, etc)

rzk 9 months ago

This is amazing—made even more impressive by the fact that the author is just 15 years old!

Also, it's nice to see this mentioned:

> For this 10k-word write-up, I spent around a month finishing up the main parts, and refining/editing it took an extra while. Writing this is indeed a painful process. I spent the entire day on the weekend and 4-5 hours during the rest of the week working on it for around two weeks.

It's the kind of behind-the-scenes effort that often goes unspoken.

Sheeny96 9 months ago

10k words was the word count for my 3rd year undergraduate dissertation in the UK. Typically, this is tirelessly worked on over months. The quality of this far exceeds anything I produced during that time and anything I saw from my peers.
asveikau 9 months ago

I'm also detecting hints of non-native English in his writing which may make it even more effort. Though his Twitter account says he's based in Connecticut.
Edit: Wow, shocked that I'm being received so negatively about this, non native English is not "bad". It isn't meant as judgement but praise for what he's accomplished.
- johnisgood 9 months ago
  
  He's 15. Many adults whose native language is English can't write it correctly, or what are you referring to if not grammar errors? Can you give me examples?
  - asveikau 9 months ago
    
    The way grammar works is that it is hard to describe the rules of your native language. I am parsing this as non native. My older kid is a few years younger than this guy, and her text messages to me sound more native than this corpus. No I can't describe it concretely. It's just how I parse word choice and sentence structure.
    
    johnisgood 9 months ago
    
    I read it again. I can see where you are coming from though. I didn't downvote you.
    
    asveikau 9 months ago
    
    Yeah, I removed the first part and made it into an edit of the original comment, since it wasn't directed at you.
    
    johnisgood 9 months ago
    
    I know, thank you.

yamrzou 9 months ago

I tried to execute the PoC by running the following:

  git clone https://github.com/ggml-org/llama.cpp.git && cd llama.cpp
  git checkout c0d4843225eed38903ea71ef302a02fa0b27f048 # Checkout a revision prior to the exploit fix in 1d20e53c40c3cc848ba2b95f5bf7c075eeec8b19
  mkdir build-rpc && cd build-rpc
  cmake .. -DGGML_RPC=ON
  cmake --build . --config Release
  cd bin/
  ./rpc-server -p 50052

In a second terminal:

  nc -lvp 1337

Then running the exploit code in a third terminal (from llama.cpp/build-rpc/bin directory):

  pip install pwntools
  python exp.py # From https://gist.github.com/retr0reg/d13de3fde8f9d138fe1af48e59e630a9

It failed at Stage Three: Bypass boundary check via libggml and raised an EOFError. The RPC server exited with Segmentation fault. Any idea why?

retr0regOP 9 months ago

It could be either an unsuccessful leak or a wrong `libggml-base` offset. We're building a fake `ggml_backend_buffer` table from the leaked base + offset (the hard-coded `libggml-base` offset should be adjusted to match the compiled release). However, this exploit is not actually `libggml-base` version dependent: the partial-write space is always one byte, and after a successful leak you can identify the `libggml-base` version if you build every release's `libggml-base` and map the last two bytes to each version.
I'm happy you read it and liked it, and even more glad you tried it yourself :D
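
The base + offset arithmetic described above can be sketched roughly as follows. This is a minimal illustration, not the exploit's actual code: the build names, the offsets, and the leaked pointer are all hypothetical, and real offsets would have to be read from the specific `libggml-base` you compiled (for example with `nm` or `readelf`). The sketch also assumes the library is mapped page-aligned, so it matches on the low 12 bits of the leak, which is a narrower comparison than the two bytes mentioned in the comment.

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages, library mapped page-aligned

# Hypothetical per-build offsets of the leaked symbol inside libggml-base.
# These values are made up for illustration only.
LEAK_OFFSETS = {
    "build-a": 0x1A2B0,
    "build-b": 0x1B3C8,
}


def matching_builds(leaked_ptr, offsets=LEAK_OFFSETS):
    """Return the builds whose symbol offset agrees with the leak's low bits."""
    low = leaked_ptr % PAGE_SIZE
    return [name for name, off in offsets.items() if off % PAGE_SIZE == low]


def library_base(leaked_ptr, offset):
    """Recover the load address; reject offsets that give an unaligned base."""
    base = leaked_ptr - offset
    if base % PAGE_SIZE != 0:
        raise ValueError("offset does not fit this build: base is not page-aligned")
    return base


# Pretend leak: a made-up load address plus the "build-b" offset.
leaked = 0x7F3C5A400000 + 0x1B3C8
builds = matching_builds(leaked)
base = library_base(leaked, LEAK_OFFSETS[builds[0]])
print(builds, hex(base))
```

If the hard-coded offset belongs to a different build than the one running, `library_base` raises instead of returning a bogus address, which is one plausible way a mismatch could be caught before the later stages run.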

rboyd 9 months ago

sheesh. the visual aesthetics and script behavior on your blog are so tastefully executed. great job!

krackers 9 months ago

The smudges on the screenshots got me.
andrewSC 9 months ago

I'd honestly love to know what framework, theme, or stack is being used here! Looks incredible--great job!
- evannotfound 9 months ago
  
  Hi! I am the developer of Retr0's portfolio. I used nextjs for the framework, with framer motion + gsap for animation. The blog is powered by hashnode headless api with serverside rendering.
  - andrewSC 9 months ago
    
    Awesome! Thank you for the follow up and great work!
evannotfound 9 months ago

thank you for your support!

zaphod420 9 months ago

Can anyone tl;dr this? Does this mean that it's possible for a maliciously crafted LLM to execute arbitrary code via an exploit in llama.cpp?

krackers 9 months ago

Summary at https://github.com/ggml-org/ggml/pull/1103
cadamsdotcom 9 months ago

Thanks for adding value today.

behnamoh 9 months ago

prodigies are amazing, but I often wonder what they end up doing later in life when the intelligence gap between them and their peers converges to zero.

miki123211 9 months ago

If they're lucky, the gap converges to zero because they surround themselves with more and more intelligent people as time passes. That's a recipe for success.
If they're unlucky, the gap converges to zero because they get used to not having to do much work, "fail upwards" because of the raw intelligence, and then can't keep up when surrounded by similarly intelligent people who actually do the work.
Failing at something you were told you were extremely good at, and hence based your entire identity around, is extremely difficult and demoralizing. Some people can never really recover from that, and AFAIK depression / suicide isn't unheard of.
Definitely not a problem for this particular kid though, "lack of hard work" and "coasting" is evidently not what this person is about.
The middle scenario is kids that do the work, but stay in their community for economic / political / class / "born in the wrong place" reasons. Their talents are mostly squandered, but they might end up doing something very significant for the communities they're part of.
This used to be extremely common, a medieval peasant or ancient slave would most likely stay in their village, regardless of how much of a genius they were. The modern world made it much less so, and that's something worth celebrating.
- msp26 9 months ago
  
  Agreed, you put it well. It's really hard to develop a work ethic or learn how to study properly so late.
  You lack all the foundational habits and are just used to things working out naturally. There are zero consequences, only positive outcomes despite doing the bare minimum. And then it all goes to shit.
pram 9 months ago

In my teens I was in an IRC community full of various hacker/script kiddie miscreants. Some of these people I would call actual geniuses.
The trajectory of everyone ranged from early Facebook employees, a CMU CS PhD, to one literally going to prison for an exploit lol. You can never tell where life will take you.
pragmatic8 9 months ago

Why do you presume that the intelligence gap would converge to zero?
- ziddoap 9 months ago
  
  Eventually everyone dies, thus becoming equally intelligent!
- behnamoh 9 months ago
  
  easy: how many genius people do you know who were also prodigies? early intelligence only gets you so far, the rest depends on hard work, passion, etc.
  - FeepingCreature 9 months ago
    
    If your later work overshadows your earlier, you're not generally remembered as a prodigy.
    
    behnamoh 9 months ago
    
    It's just that I don't know of any example of an adult genius who used to be a prodigy. The most obvious counterexample was Einstein.
    
    lolinder 9 months ago
    
    "Prodigy" and "genius" both lack rigorous definitions, so your problem might simply be one of cherry picking. Going with my own gut on each definition, here are a few examples to add to those already provided by sibling comments:
    * Magnus Carlsen (chess grandmaster by 14, world champion for 10 years as an adult)
    * Frédéric Chopin (concert pianist at 7, one of the top composers for piano)
    * Blaise Pascal (rediscovered Euclid on his own at 12 with no training, went on to become one of the most famous mathematicians of all time).
    * John von Neumann (could divide two 8-digit numbers in his head at age 6, learned calculus by 8, went on to be a founding figure in computer science).
    Shall I go on, or is this enough?
    
    bee_rider 9 months ago
    
    Gauss and Von Neumann are the two that immediately come to mind.
    
    sph 9 months ago
    
    Mozart?
subscribed 9 months ago

Some do wonderfully well, for example lcamtuf or Joanna Rutkowska.
If they're self-driven like the original author, they'll be good; they don't necessarily need the gradient.

om8 9 months ago

Not surprising, llama.cpp code is a mess.

It's sad that hacked things that emerge first are way more popular than properly done projects that come later.

retr0regOP 9 months ago

In fact, the llama.cpp codebase is well-developed and actively maintained. It has undergone iterative security hardening, and intensive low-level security checks have been implemented in both the core inference engine and the RPC components.
This standard of security is what made the exploitation so challenging and rewarding.
- vlovich123 9 months ago
  
  It’s actively maintained but I wouldn’t classify it as a clean codebase: not the abstractions it has within ggml, not the structure of llama.cpp, not its use of modern C++, etc. It can’t even really make up its mind as to whether it should be C++ or C, and there’s a lot of dirt because of that. Heck, instead of using a submodule they’re copying ggml between projects, making it very difficult to keep track of what’s actually happening where and what the ground truth is. It’s sloppy engineering. Parts are better designed for sure.
  None of that is meant to take away from your effort or the success of llama.cpp, but I have spent quite a bit of time reading and working with the internals across layers and have a good eye for quality c++ patterns.
- PartiallyTyped 9 months ago
  
  Thanks for the writeup! Was a very interesting read! I've subscribed and I am looking forward to your next exploits! ^_^
qskousen 9 months ago

Is there a comparable open source thing "done properly"?
- tuveson 9 months ago
  
  llama.rs, of course /s
