Ollama-Swift (nshipster.com)
Beautiful article.
Off-topic:
The Nominate app is exactly what’s missing from today’s UIs. Most user interfaces could benefit from “sparkles of AI helpfulness” instead of requiring a separate “AI App.” For example, macOS file renaming should provide AI-powered predictive suggestions when you rename a file that doesn’t have a useful name. Another example: when creating a GitHub issue, the UI should use AI to predict which labels are most likely relevant and bring them to the top for selection.
It seems many “AI products” attempt to replace entire workflows instead of enhancing the existing ones. GitHub puts a lot of effort into building Copilot features like Copilot Code Reviews, but it doesn’t appear interested in using AI to make existing code reviews more powerful and useful.
>AI-powered predictive suggestions when renaming a file
This is genius. Extend it to photos as well: no need for DSC_0001 when you can quickly analyze the contents and metadata and group things into folders with intelligent names. Maybe combine it with Spotlight tags. There's a lot of similar untapped potential: a feature that looks through your Downloads folder and groups things into folders, separating the stuff that's "junk" (e.g. zip files, dmgs, etc.) from the stuff that could be valuable. And importantly, in all cases it's OK if there are some errors, since nothing is actually destroyed; it's just misclassified but still searchable.
In fact, go one better: Spotlight is great, but sometimes you don't remember the exact wording, and you don't even need a full LLM to do semantic search. And if you're going there, you might as well go all the way and add a DEVONthink-like feature to show related documents. There's so much untapped potential to truly make the computer a "bicycle for the mind" that I'm surprised Apple hasn't bought DEVONthink, integrated it with an LLM, and shipped it (DEVONthink could be integrated into Finder, DEVONsphere is a souped-up Spotlight, and DEVONagent would be perfect if they ever launched a search engine).
My Photos collection is a tens-of-gigs hideous heap of horrors and I am faithfully(?) waiting for Apple to give me a button to push to sort it all out.
I wonder if they’re interested in doing that since all of those images mean you have to buy their iCloud subscription forever.
But the AI would be a fantastically effective lock-in long-term methinks
That’s OK - until the big end of town work this out, you can use LLMs to code this yourself and have an advantage.
For instance - have your LLM create a tool for your framework of choice that pulls commits that are new in your local branch vs. the remote, to gather them up for a code review.
Then by adding that tool and supplying the right prompt, your LLM can give you a code review before you push it. No SaaS in sight, and no git-push in your submit/code-review cycle, let alone a human reviewer.
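Roughly, the local loop could look something like this in Swift, talking to Ollama's HTTP API directly rather than through any particular package. The model name, the prompt wording, and `origin/main` as the comparison branch are placeholder assumptions; swap in whatever you actually use.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif

// Gather the changes that exist locally but not on the remote tracking branch.
// "origin/main" is a placeholder; use whatever branch you actually track.
func localDiff() throws -> String {
    let git = Process()
    git.executableURL = URL(fileURLWithPath: "/usr/bin/env")
    git.arguments = ["git", "diff", "origin/main...HEAD"]
    let pipe = Pipe()
    git.standardOutput = pipe
    try git.run()
    let data = pipe.fileHandleForReading.readDataToEndOfFile()
    git.waitUntilExit()
    return String(decoding: data, as: UTF8.self)
}

// POST the diff to a local Ollama instance (default port 11434) and return the review text.
func review(diff: String) throws -> String {
    var request = URLRequest(url: URL(string: "http://localhost:11434/api/generate")!)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    let body: [String: Any] = [
        "model": "llama3.2",   // any model you've pulled locally
        "stream": false,
        "prompt": "You are a careful code reviewer. Point out bugs, risks, and style issues in this diff:\n\n\(diff)"
    ]
    request.httpBody = try JSONSerialization.data(withJSONObject: body)

    var reviewText = ""
    let semaphore = DispatchSemaphore(value: 0)
    URLSession.shared.dataTask(with: request) { data, _, _ in
        defer { semaphore.signal() }
        if let data = data,
           let json = try? JSONSerialization.jsonObject(with: data) as? [String: Any],
           let response = json["response"] as? String {
            reviewText = response
        }
    }.resume()
    semaphore.wait()
    return reviewText
}

print(try review(diff: localDiff()))
```

Wire something like that into a pre-push hook and the review happens automatically before anything leaves your machine.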
If you want to get more heavyweight, you could deploy a Lambda somewhere that responds to a GitHub webhook and posts that same review as a comment on each PR.
Sky’s the limit & you can build this stuff as easily as anyone.
Trust me! I'm having a ton of fun building things with LLMs. I'm just pointing out that most products make the LLM input/output the centerpiece of the user experience, when in reality it's more useful to enhance existing UIs with LLMs.
This.
AI is a feature, not a product.
NSHipster played such a pivotal role in me becoming the engineer I am today. It’s heartwarming and hugely nostalgic to see a mattt post.
Curious about RAG. The article made it look like it's just a few additional parameters (context) that you pass to the LLM. Somehow I was under the impression RAG required training.
All I want is an LLM front-end to a local Wikipedia dump.
The article starts off by creating in-memory vector embeddings for a list of "documents" (aka chunks of text) using the nomic model:
https://www.nomic.ai/blog/posts/nomic-embed-text-v1
They then use cosine similarity to compare the user's query against those embeddings, retrieve the "top N" matches (which point back to the doc chunks), and prepend those chunks to the query that is sent off to the LLM. RAG is just a means of injecting "relevant" docs into your input context; no training or special LLMs are required.
To do what you're asking, you'd need to fetch a recent Wikipedia dump, spend some time normalizing/sanitizing it, and, perhaps most importantly, figure out how to divide each article (some of which would far exceed the embedding model's input limit) into chunks you can generate embeddings from. Then you'd need to store them in a vector database (unlike the article, which doesn't persist them). I personally use Qdrant; there's also PostgreSQL (with the pgvector extension), LanceDB, etc.
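To make the retrieval step concrete, here's a minimal sketch in Swift, assuming the embeddings have already been generated (the `Chunk` type and the function names are made up for illustration):

```swift
import Foundation

// A chunk of text plus its precomputed embedding
// (e.g. from nomic-embed-text via Ollama).
struct Chunk {
    let text: String
    let embedding: [Double]
}

// Cosine similarity: dot product divided by the product of the magnitudes.
func cosineSimilarity(_ a: [Double], _ b: [Double]) -> Double {
    precondition(a.count == b.count, "embeddings must have the same dimension")
    let dot = zip(a, b).map(*).reduce(0, +)
    let magA = sqrt(a.map { $0 * $0 }.reduce(0, +))
    let magB = sqrt(b.map { $0 * $0 }.reduce(0, +))
    guard magA > 0, magB > 0 else { return 0 }
    return dot / (magA * magB)
}

// Rank every chunk against the query embedding and keep the top N.
func topChunks(for queryEmbedding: [Double], in chunks: [Chunk], limit: Int = 3) -> [Chunk] {
    chunks
        .map { (score: cosineSimilarity($0.embedding, queryEmbedding), chunk: $0) }
        .sorted { $0.score > $1.score }
        .prefix(limit)
        .map { $0.chunk }
}

// Build the augmented prompt: retrieved chunks first, then the question.
func augmentedPrompt(question: String, context: [Chunk]) -> String {
    """
    Answer the question using only the context below.

    Context:
    \(context.map(\.text).joined(separator: "\n---\n"))

    Question: \(question)
    """
}
```

That's the whole trick: the "augmentation" is just string assembly before the prompt reaches the model.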
No training needed, but you need to generate embeddings for all the content, store it in a vector DB, and wire it up.
“Can we take a moment to appreciate the name? Apple Intelligence. AI. That’s some S-tier semantic appropriation. On the level of jumping on “podcast” before anyone knew what else to call that.”
Apple didn’t invent the term Podcast though!
This looks great! I’ve been using OllamaKit for prototyping, but it’s not Linux compatible. While not explicitly documented as Linux compatible, it looks like this package is.
It’s an unfortunate aspect of the server-side Swift ecosystem that most Swift developers don’t really know what it takes to be Linux compatible (very little, in most cases), so many packages end up accidentally not Linux compatible.
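For anyone wondering what “very little” usually means in practice: on Linux, URLSession lives in the separate FoundationNetworking module of swift-corelibs-foundation, so the typical fix is nothing more than a conditional import. A minimal sketch:

```swift
import Foundation

// On Linux, URLSession and related types live in FoundationNetworking,
// which ships with swift-corelibs-foundation but is a separate module.
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif

// With that import guard in place, the same networking code compiles on macOS and Linux.
func fetch(_ url: URL) -> Data? {
    let semaphore = DispatchSemaphore(value: 0)
    var result: Data?
    URLSession.shared.dataTask(with: url) { data, _, _ in
        result = data
        semaphore.signal()
    }.resume()
    semaphore.wait()
    return result
}
```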
> The simplest way to use a language model is to generate text from a prompt (followed by Python code)
For the initiated, this is not the simplest way.
You can just `ollama run` in your Terminal and get a chat REPL.
> port 11431 (leetspeak for llama)
No. It's 11434...