Llama.cpp now has a web interface

github.com

328 points by xal 2 years ago · 53 comments

mk_stjames 2 years ago

I'm always wondering about the, I don't even know what to call this, etiquette? of proposing PRs that add a feature or a demo or whatnot to the main branch of a very focused project, something that is very different in interface, language, set and setting, etc.

So in this case, Tobi made this awesome little web interface that uses minimal HTML and JS so as to stay in line with llama.cpp's stripped-down-ness. But it is still a completely different mode of operation; it's a 'new venue', essentially.

What if GG didn't want such a thing? When is something like this better for a separately maintained repo and not a main merge? How do you know when it is OK to submit a PR to add something like this without overstepping (or is it always?)

I see this with a few projects on github that really 'blow up' and that everyone starts working on. They get a million PRs from people hacking things on in their domain of knowledge, expanding the complexity (and potentially the difficulty of maintaining quality). Sometimes it feels weird watching from the outside, at least (I'm not a maintainer of any public FOSS).

Just curious what others think because those are my thoughts that came to mind when I saw this.

  • ggerganov 2 years ago

    My POV is that llama.cpp is primarily a playground for adding new features to the core ggml library and, in the long run, an interface for efficient LLM inference. The purpose of the examples in the repo is to demonstrate how to use the ggml library and the LLM interface. The examples are decoupled from the primary code - i.e. you can delete all of them and the project will continue to function and build properly. So we can afford to expand them more freely as long as people find them useful and there is enough help for maintaining them. Still, we try to keep the 3rd party dependencies to a minimum so that the build process stays simple and accessible.

    There was a similar "dilemma" about the GPU support - initially I didn't envision adding GPU support to the core library, as I thought things would become very entangled and hard to maintain. But eventually, we found a way to extend the library with different GPU backends in a relatively well decoupled way. So now we have various developers maintaining and contributing to the backends in a nice independent way. Each backend can be deleted and you will still be able to build the project and use it.

    So I guess we are optimizing for how easy it is to delete things :)
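    The "delete a backend and everything still builds" property can be sketched as a registration pattern. This is an illustrative sketch only (ggml's real backends are C/C++ and its actual mechanism differs); `registerBackend` and `pickBackend` are hypothetical names:

```javascript
// Illustrative sketch (not ggml's real API) of the "optimize for
// deletability" pattern: backends self-register into a table, and the
// core only ever queries the table, so deleting a backend's source file
// deletes its entry and nothing else breaks.
const backends = [];

function registerBackend(name, isAvailable, run) {
  backends.push({ name, isAvailable, run });
}

// Each hypothetical backend would live in its own file in a real build.
registerBackend('cpu', () => true, (xs) => xs.map((v) => v * 2));
registerBackend('cuda', () => false, (xs) => xs); // pretend no GPU here

// The core picks whatever is available; it never names a backend directly.
function pickBackend() {
  return backends.find((b) => b.isAvailable());
}

console.log(pickBackend().name);
```

    The point of the pattern is that the core's only dependency is on the table's shape, not on any particular backend existing.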

    Note that the project is still pretty much a "big hack" - it supports just LLaMA models and derivatives, therefore it is easy atm. The more "general purpose" it becomes, the more difficult things become to design and maintain. This is the main challenge I'm thinking about how to solve, but for sure keeping stuff minimalistic and small has been a great help so far.

    > What if GG didn't want such a thing? When is something like this better for a separately maintained repo and not a main merge? How do you know when it is OK to submit a PR to add something like this without overstepping (or is it always?)

    I try to explain my vision for the project in the issues and the discussions. I think most of the developers are very well aligned with it and can already tell what is a good addition or not.

    • aidenn0 2 years ago

      Thank you for the ggml library, by the way. It let me play around with whisper in a sane manner. To run the CUDA torch versions, I needed to shut down X to free enough GPU memory for the medium model, and the small model might require me to quit firefox. With ggml, I can use cublas and run even the large model with a huge speedup compared to CPU only torch.

    • mk_stjames 2 years ago

      Thanks for replying to me directly! I'm finding it fascinating to follow this project. Good luck with your company Georgi.

    • vitaminka 2 years ago

      i’m curious, what’s the approach for keeping the various gpu backends maintainable and decoupled?

  • LawnGnome 2 years ago

    I generally think it's fine to do this sort of thing for your own benefit and open a PR as long as you're really 100% fine with "no, I'm not interested in merging this" being the answer.

    Where the problems tend to arise (in my experience, at least) is when people hack on something expecting that it will be merged, get invested in it, and then get upset when the maintainer(s) aren't interested.

    Checking in before starting to work on something is important if your goal is to have it merged, not just to do the work. The problem is that a lot of people start in the first category, but then move into the second category as they get invested in their project.

  • grepLeigh 2 years ago

    Some tips/tricks/tidbits:

    * Open a draft PR early in the process with a Request for Comment [RFC] tag. Explain your goal/approach in words, then follow up with code.

    * Be succinct.

    * Provide minimal viable examples and build more complex concepts from these.

    * Accept feedback with grace, and execute promptly.

    * Don't take personal offense if your work isn't merged, or even responded to.

    * Single-maintainer open-source looks very different than consortium & working group FOSS.

  • version_five 2 years ago

    A nice thing about llama.cpp is that it's well organized to accept a feature like this without really disturbing any other part or potentially stepping on someone's toes. There is the core repo, and then this lives in examples/server (as do various other "example" features). This organization feels like it would make it much easier to accept a PR like this than if doing the same thing required wider changes.

  • xalOP 2 years ago

    GG would just say no and that's that. No hard feelings, that's what makes open source so great.

  • renewiltord 2 years ago

    You just saw it happen in OP. Just do as others do and don't sweat this stuff. The principle is code sharing and an offer of a thing you've done.

    Don't overthink it.

    This is fantastic. I love the way he handles his project. Just great for adoption and contribution.

  • bfuller 2 years ago

    My view is you get so many people blatantly ripping off your code to try and pass it off as their own that you actually appreciate when you come across someone doing something novel or interesting that users like. That was my view anyway.

  • kaliqt 2 years ago

    Well that's why you open an issue first, an RFC. Get some discussion going, if everyone's happy, then you proceed.

vdfs 2 years ago

Good to see a CEO still hacking around!

Shopify did change my life in a way i could never imagine; it will always hold a special place in my heart

  • samstave 2 years ago

    Can you give a detailed account on specifics?

  • Xenoamorphous 2 years ago

    A billionaire co-founder at that!

    • swyx 2 years ago

      actually did tobi even have any other cofounders? i just realized i only ever hear about him and harvey finkelstein

      edit: google says scott lake was actually the founding CEO. TIL! https://www.linkedin.com/mwlite/profile/in/scottlake?origina...

      • jonny_eh 2 years ago

        > harvey finkelstein

        Harley Finkelstein, he's a nice guy. Not a co-founder but very important to the history and running of Shopify.

      • Xenoamorphous 2 years ago

        Just going by the Google snippet I got when searching for “shopify ceo”. Not sure if the downvotes I got are because someone thinks the co-founding aspect is wrong or because they think him being a billionaire is irrelevant, but IMO it makes the fact that he’s opening PRs even cooler.

      • bhouston 2 years ago

        There were 3 cofounders. Scott Lake left relatively early, but Daniel Weinand (designer) stayed on for a while.

steren 2 years ago

And the PR is from Shopify's CEO!

mteam88 2 years ago

Are these comments about the CEO bots?

  • mliker 2 years ago

    I think it's just impressive to folks that the CEO of a public company is still coding and putting out useful PRs.

  • generalizations 2 years ago

    I think this has been seen before, where it turned out to be a bunch of employees. But that time it was a product launch, not a CEO's PR.

  • isanjay 2 years ago

    Seemed like something fishy. But the accounts were created ages ago

    • Kiro 2 years ago

      Not really. I also think it's remarkable and would have posted something similar if it hadn't already been pointed out.

  • Cyph0n 2 years ago

    People are just trying to point out who the author is for those not familiar with him.

  • mrtranscendence 2 years ago

    Were you aware that the author is Shopify's CEO? How deliciously absurd!

isoprophlex 2 years ago

> I tried to match the spirit of llama.cpp and used minimalistic js dependencies and went with the ozempic css style of ggml.ai.

Ozempic? The anti-diabetes drug? That's either a glorious typo or an interesting new adjective...

ynniv 2 years ago

> i'm importing from js cdns instead of adding them here

FWIW this seems counter to llama.cpp's philosophy.
  • xalOP 2 years ago

    I agree, I ended up getting rid of it. Only one dependency is downloaded (via bash script) and everything is baked into the binary now.

  • mhh__ 2 years ago

    The philosophy in practice seems to be dirty deeds done dirt cheap rather than genuine simplicity (it only looks simple compared to the python ecosystem)

eclectic29 2 years ago

Can someone shed light on how a CEO with 3 kids gets time to hack on something like this? Some might even argue that all the time spent doing this would’ve been better spent on CEO activities, but thankfully this is HN and people have hobbies, so that’s that; still, this can be a very time-consuming hobby.

  • xalOP 2 years ago

    I've tried to eliminate language around being "too busy" from my vocabulary and attempt to replace it with "can't prioritize". That's sometimes a bit awkward, but really trains better habits. Sometimes I prioritize hobbies if I feel I need it, and doing this seemed fun and useful besides!

sebastiennight 2 years ago

I'm wondering whether rendering markdown through a regex that only deals with H3, strong, em and code is enough, since this implementation specifically seems to ask the bot to reply with markdown? [0]

Other than that, it's cool to see this web interface happen. It's little things like this that make new tools easier to grasp for less technical users.

I believe people are more likely to put up with a difficult setup procedure (knowing the UX/UI will be easy to use afterwards) than with an easy setup where daily use is too hard.

[0]: https://github.com/ggerganov/llama.cpp/pull/1998/commits/c19...

  • xalOP 2 years ago

    I had an entire implementation that used a proper state machine, but I couldn’t get it simple and small enough for this single-file thing. The regexp chain is dumb but clear.
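    A regex-chain renderer of the kind being discussed can be sketched like this (a hypothetical sketch, not the PR's actual code; it covers only the H3/strong/em/code cases mentioned above, and `renderMarkdown` is an assumed name):

```javascript
// Hypothetical sketch of a regex-chain markdown renderer limited to
// H3 headings, bold, italics and inline code.
function renderMarkdown(md) {
  return md
    // escape raw HTML first so model output can't inject tags
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/^### (.*)$/gm, '<h3>$1</h3>')           // "### heading"
    .replace(/\*\*(.*?)\*\*/g, '<strong>$1</strong>') // "**bold**" (before *em*)
    .replace(/\*(.*?)\*/g, '<em>$1</em>')             // "*emphasis*"
    .replace(/`(.*?)`/g, '<code>$1</code>');          // "`inline code`"
}

console.log(renderMarkdown('### Title\n**bold**, *em*, `x < y`'));
```

    Ordering matters in such a chain: the `**` rule has to run before the single-`*` rule, and the HTML escaping has to come first, which is part of why a regex chain stays readable while a proper state machine grows.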

zoklet-enjoyer 2 years ago

Wow. Every comment pointing out the author but not the content

creekcreek 2 years ago

If only all tech CEOs were as code-savvy as the magnificent CEO of Shopify! What an amazing accomplishment from a CEO who cares about development.

wiihack 2 years ago

Nice simple interface! And great to see a CEO keeping up with recent technological developments :)

mliker 2 years ago

very cool to see the CEO of Shopify putting together this interface.

seydor 2 years ago

didn't people run llama.cpp with oobabooga et al.?
