The emergence of AI engineering

ignorance.ai

21 points by charlierguo 3 years ago · 6 comments

swyx 3 years ago

(author of https://www.latent.space/p/ai-engineer here)

this is a fantastic follow-up post spelling out the toolkit of the AI Engineer. I have a version of this in my notes but didn't want to publish it for fear of being too biased by omission, plus it's generally good to leave others to fill in the blanks on some things. the Projects list is particularly great - one could build a course around these few things and be reasonably confident that whoever finished it would have the base skills of a competent "AI Engineer" (someone that, for example, any PM or founder could ask to do an AI project, and they'd be well equipped to advise/make technical decisions).

don't miss Andrej Karpathy + Jim Fan's take on the AI Engineer role: https://twitter.com/karpathy/status/1674873002314563584

for those who don't like Twitter, i cleaned up the audio of our AI Engineer conversation with Jared Palmer, Alex Graveley, Joseph Nelson, and other self-identifying AI Engineers here: https://mixtape.swyx.io/episodes/the-rise-of-the-ai-engineer...

it's a fun time for those who want to carve out a space for themselves specializing in this stuff.

ReadEvalPost 3 years ago

"Build, don't train" is poor advice for a prospective "AI Engineer" title. It should be "Know when to reach for a new model architecture, know when to reach for fine-tuning / LoRA training, know when to use an API." Only relying on an API will drastically reduce your product differentiation, to say nothing of the fact that any AI Engineer worth that title should know how to build and train models.

  • charlierguoOP 3 years ago

    Fair point! I think my main idea was "prefer building with an API over training your own model" but that isn't as pithy.

    The jury's still out on how much training and fine-tuning are going to matter in the long run - my belief is that many great products can exist without needing a new model architecture, or without owning the model at all.
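
    To make "prefer building with an API" concrete, here's a minimal sketch of the API-first path, assuming the OpenAI Python client (the model name and prompts are placeholders; any hosted provider works similarly):

      # Minimal "build with an API" example: no training, just a hosted model.
      # Assumes `pip install openai` and OPENAI_API_KEY set in the environment.
      from openai import OpenAI

      client = OpenAI()  # reads OPENAI_API_KEY from the environment

      response = client.chat.completions.create(
          model="gpt-4o-mini",  # placeholder; use whatever hosted model fits
          messages=[
              {"role": "system", "content": "You are a product-copy assistant."},
              {"role": "user", "content": "Write a one-line tagline for a note-taking app."},
          ],
      )
      print(response.choices[0].message.content)

    The point being: the whole "model layer" is a few lines, and the differentiation lives in the product around it.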

    • ReadEvalPost 3 years ago

      That advice makes sense if we're talking about 800B+ parameter models that require a gigantic investment of capital and time. For models that fit on a consumer GPU, you're leaving chips on the table by not taking advantage of training / fine-tuning. It's just too easy and powerful not to.
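
      For scale, here's roughly what the LoRA path costs in code, assuming Hugging Face transformers + peft (the model name and hyperparameters are illustrative, not a recommendation):

        # Sketch: wrap a small causal LM with LoRA adapters so only a tiny
        # fraction of weights train. Assumes `pip install transformers peft`.
        from transformers import AutoModelForCausalLM
        from peft import LoraConfig, get_peft_model

        model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")  # illustrative small model

        lora = LoraConfig(
            r=8,                  # adapter rank: the main capacity knob
            lora_alpha=16,        # scaling factor for the adapter updates
            target_modules=["q_proj", "v_proj"],  # attention projections to adapt
            lora_dropout=0.05,
            task_type="CAUSAL_LM",
        )
        model = get_peft_model(model, lora)
        model.print_trainable_parameters()  # typically well under 1% of weights
        # from here it's a normal transformers Trainer loop on your own dataset

      The marginal effort over a pure API call is small once you have the data; that's the chips-on-the-table part.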

fswd 3 years ago

We're calling it LLMOps

  • swyx 3 years ago

    1. it's unpronounceable

    2. ops is boring. get in losers, we're generating all the things
