GitHub - zckly/ai-engineer-roadmap: The most comprehensive free guide for becoming an AI Engineer in 2024

The fastest, most comprehensive way to become an AI Engineer in 2024

Welcome to the AI Engineer Roadmap! This guide offers a project-based approach to mastering AI engineering, whether you're a beginner or looking to expand your skills. Each section includes practical projects to apply your knowledge, build real-world AI applications, and develop crucial problem-solving skills ᕙ( •̀ ᗜ •́ )ᕗ

Web/App Development

It helps to be able to code your own interfaces, but it's also 100% possible to build AI products without knowing how to program. It's up to you whether you go down the coding (full-stack) route or the no-code (Webflow, Zapier, etc.) route.

Full-stack Route (recommended)

Front-end: Learn React for building interactive user interfaces
Back-end: Master Node.js/Next.js for server-side development
Database: Understand and implement Postgres for data storage

There are tons of roadmaps out there for learning web development. One of my favorites is Scrimba. I also have a bootcamp on YouTube that covers full-stack web dev + building AI apps.

No-code Route

Website Builder: Explore Webflow for creating professional websites without coding
Workflow Builder: Use Zapier to automate processes and integrate applications
Database: Leverage Firebase or Airtable for easy-to-use, scalable data storage solutions

Beginner Text Generation

Understanding Large Language Models (LLMs)
- Watch 3Blue1Brown's YouTube series on LLMs/Transformers as an entry point
- (Bonus) Watch Karpathy's video on building GPT from scratch
Proprietary LLMs
- OpenAI's GPT models
- Anthropic's Claude 3 family
- Google's Gemini
Open-source LLMs
- Meta's Llama 3
- Cohere's Command R
Prompt Engineering
- Study Anthropic's Prompting Guide
Basic Chatbots
- Explore the Vercel AI SDK documentation
- Project: Create a poem generator
Handling Structured Output
- Learn techniques for generating and parsing structured data from LLMs
- Check out Instructor (a library for getting typed, validated output from LLMs) or parse the model's text reply yourself
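
For the string-parsing route, here is a minimal sketch of pulling a JSON object out of a model reply that wraps it in extra prose. The `extract_json` helper and the sample reply are made up for illustration; a real app would take the reply from whichever LLM API you use and would likely validate the fields too.

```python
import json
import re


def extract_json(reply: str) -> dict:
    """Return the JSON object embedded in a model reply.

    Assumption: the reply contains exactly one top-level JSON object,
    possibly surrounded by chatty text.
    """
    # Greedy match from the first "{" to the last "}" in the reply.
    match = re.search(r"\{.*\}", reply, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in reply")
    return json.loads(match.group(0))


# Hypothetical model reply for a poem generator.
reply = (
    "Sure! Here is your poem: "
    '{"title": "Dawn", "lines": ["Light spills", "over the hill"]} '
    "Hope you like it."
)

poem = extract_json(reply)
print(poem["title"])
print(len(poem["lines"]))
```

Hand-rolled parsing like this is brittle once replies get messy, which is the main reason to reach for a library that validates against a schema.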

Advanced Text Generation

Function Calling and Tool Usage
- Implement LLM-powered tools and integrate external functions
- Project: Build a personal assistant that can interact with your calendar, email, and task list
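
The core of function calling is small: the model emits the name of a tool plus its arguments, and your code looks the tool up and runs it. Below is a rough sketch of that dispatch step only. The tool names, the in-memory task list, and the `{"name": ..., "arguments": ...}` shape are assumptions for illustration; each provider has its own wire format for tool calls.

```python
import json
from typing import Callable

# In-memory stand-in for a real task list (hypothetical).
TASKS: list[str] = []


def add_task(title: str) -> str:
    TASKS.append(title)
    return f"added task: {title}"


def list_tasks() -> str:
    return ", ".join(TASKS) if TASKS else "no tasks"


# Registry mapping tool names the model may request to Python functions.
TOOLS: dict[str, Callable[..., str]] = {
    "add_task": add_task,
    "list_tasks": list_tasks,
}


def run_tool_call(raw_call: str) -> str:
    """Parse a JSON tool call and run the matching function."""
    call = json.loads(raw_call)
    name = call["name"]
    if name not in TOOLS:
        # Send this back to the model instead of crashing.
        return f"unknown tool: {name}"
    return TOOLS[name](**call.get("arguments", {}))


# Pretend these strings came back from the model.
print(run_tool_call('{"name": "add_task", "arguments": {"title": "Book dentist"}}'))
print(run_tool_call('{"name": "list_tasks"}'))
```

In a full assistant, the string returned by `run_tool_call` is sent back to the model as the tool result so it can write the final answer.
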
Web-browsing Capabilities
- Learn about techniques for scraping and summarizing web content
- Project: Build an open-source version of Perplexity (like morph.so)
Fine-tuning LLMs
- Techniques for adapting pre-trained models to specific tasks
- Project: Fine-tune a model on a specific domain (e.g., medical terminology, legal jargon)
Embeddings and Vector Databases
- Understand and implement vector representations of text
- Explore vector database solutions for efficient similarity search (e.g. Chroma, Supabase, Weaviate)
- Project: Build a semantic search engine for a large corpus of documents
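
To make similarity search concrete, here is a toy sketch that ranks documents by cosine similarity to a query vector. The three-dimensional vectors are invented by hand for illustration; in a real project they would come from an embedding model and live in a vector database rather than a dict.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical documents with made-up embeddings.
DOCS: dict[str, list[float]] = {
    "How to reset your password": [0.9, 0.1, 0.0],
    "Quarterly revenue report": [0.0, 0.2, 0.9],
    "Account login troubleshooting": [0.8, 0.3, 0.1],
}


def search(query_vec: list[float], k: int = 2) -> list[str]:
    """Return the titles of the k documents most similar to the query."""
    ranked = sorted(
        DOCS,
        key=lambda title: cosine_similarity(query_vec, DOCS[title]),
        reverse=True,
    )
    return ranked[:k]


# A made-up query embedding that points toward the account-related docs.
results = search([1.0, 0.2, 0.0])
print(results)
```

A vector database does the same ranking, but with indexes that avoid comparing the query against every stored vector.
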
Retrieval Augmented Generation (RAG)
- Learn about different RAG architectures and when to use them
- Project: Develop a "Chat with PDF" application
AI Agents
- Study projects like OpenDevin to understand autonomous AI systems
- Project: Build an autonomous research agent

Speech

Text-to-Speech (TTS)
- Implement TTS using services like ElevenLabs and OpenAI
- Project: Create an audiobook generator from text input
Speech-to-Text (STT)
- Utilize models like OpenAI's Whisper for transcription
- Project: Create a job interview coach application
Speech Analysis
- Explore emotion and intent analysis using tools like Hume AI or Google Gemini 1.5 Pro
- Project: Create an AI Therapist with emotion detection
- Learn about prosody analysis and its applications in understanding speaker intent

Image Generation

Prompt Engineering for Image Generation
- Read up on art history and photography terminology to craft effective prompts
- Join the Midjourney Discord to study how experts prompt image models
- Project: Create a series of images that tell a story, using consistent style and characters
Proprietary Image Generation Models
- Explore capabilities of models like OpenAI's DALL·E 3, Midjourney, and Google's Imagen
- Project: Children's coloring/story book generator
- Learn about image-to-image transformations (style transfer, inpainting, outpainting)
Open-source Image Generation Models
- Experiment with Stable Diffusion and other accessible models
- Project: Build a custom image generation UI with fine-grained controls

Computer Vision

Image Analysis
- Leverage models like Claude or GPT-4o for comprehensive image understanding
- Project: Develop an app that can analyze and describe the contents of photos
- Learn about object detection, segmentation, and classification techniques
Video Analysis
- Explore advanced capabilities with models like Google Gemini 1.5 Pro
- Project: Generate narration for a video
- Study techniques for tracking objects and analyzing motion in videos
- Project: Create a sports analysis tool that can break down player movements and tactics

Happy learning and building!

Zack

my twitter
