
AI RESEARCH FROM META
Introducing Meta Segment Anything Model 3 (SAM 3)
With SAM 3 you can use text and visual prompts to precisely identify, segment, and follow any object in images or videos—coming soon to Instagram Edits and Vibes on the Meta AI app.
SAM 3 CAPABILITIES
Advanced features, simple prompts
Using open vocabulary text or visual prompts, SAM 3 can detect, segment and track all matching objects in images and videos.

Text prompts
You can prompt SAM 3 with words and short phrases to mask all objects matching the text description.

Exemplar prompts
With exemplar prompts, you can simply draw a box around an example of the object you want to segment, and SAM 3 will mask all objects matching the outlined example.

Visual prompts
SAM 3 retains all the capabilities of SAM 2, so you can still segment objects using positive and negative clicks.

Interactivity
If SAM 3 ever misses an object or makes a mistake, you can easily add follow-up prompts to help further guide the model.
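To make the prompt types above concrete, here is a minimal Python sketch of how a client might represent them. Every name in it (TextPrompt, ExemplarPrompt, ClickPrompt, PromptSession) is hypothetical and chosen only for illustration. This is not the SAM 3 API, and the sketch performs no segmentation. It only shows how an initial prompt and later follow-up prompts could accumulate in one session.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union

# Hypothetical types for illustration only; these are NOT part of SAM 3.


@dataclass(frozen=True)
class TextPrompt:
    phrase: str  # a word or short phrase describing the objects to mask


@dataclass(frozen=True)
class ExemplarPrompt:
    # Box drawn around one example object: (x_min, y_min, x_max, y_max)
    box: Tuple[float, float, float, float]


@dataclass(frozen=True)
class ClickPrompt:
    x: float
    y: float
    positive: bool  # True marks a point to include, False a point to exclude


Prompt = Union[TextPrompt, ExemplarPrompt, ClickPrompt]


@dataclass
class PromptSession:
    """Collects an initial prompt and any follow-up prompts for one image or video."""

    prompts: List[Prompt] = field(default_factory=list)

    def add(self, prompt: Prompt) -> "PromptSession":
        self.prompts.append(prompt)
        return self  # returning self allows chained follow-up prompts

    def summary(self) -> dict:
        counts = {"text": 0, "exemplar": 0, "positive_clicks": 0, "negative_clicks": 0}
        for p in self.prompts:
            if isinstance(p, TextPrompt):
                counts["text"] += 1
            elif isinstance(p, ExemplarPrompt):
                counts["exemplar"] += 1
            elif p.positive:
                counts["positive_clicks"] += 1
            else:
                counts["negative_clicks"] += 1
        return counts


# Start with a text prompt, then refine with an exemplar box and two clicks.
session = (
    PromptSession()
    .add(TextPrompt("striped cat"))
    .add(ExemplarPrompt((40.0, 60.0, 180.0, 220.0)))
    .add(ClickPrompt(120.0, 90.0, positive=True))
    .add(ClickPrompt(300.0, 310.0, positive=False))
)
print(session.summary())
```

Keeping each prompt type as its own small immutable record mirrors the idea that prompts can be mixed freely: a follow-up prompt is simply appended to the session rather than replacing what came before.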
BENCHMARKS
State-of-the-art performance
SAM 3 is state-of-the-art across all text and visual segmentation tasks in both images and videos. The model also retains the full performance and functionality of SAM 2.
Designed for real-world applications
Edits is the new video creation app by Instagram that helps creators make great videos on their phones. Creators will soon be able to use SAM 3 in Edits to quickly apply effects to people or objects in their videos, helping their creations stand out.

ENHANCED CAPABILITIES
Evolution of SAM
The Segment Anything models build on each other, offering increasingly advanced capabilities that developers and researchers can use to create, experiment with, and improve media workflows.

SAM 3
Detect, segment and track every example of any object category in an image or video, using text or examples
Segment an object from a click
Track segmented objects in videos
Refine prediction with follow-up clicks
Detect and segment matching instances from text
Refine detection with visual examples

SAM 2
Segment and track any object in any image or video using click, box or mask prompts
Segment an object from a click
Track segmented objects in videos
Refine prediction with follow-up clicks

SAM 1
Segment any object in any image with as little as a single click
Segment an object from a click
Refine prediction with follow-up clicks
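
The three feature lists above nest: each model keeps what its predecessor offered and adds to it. The short Python sketch below encodes the listed features as sets to make that relationship explicit. The set names and the added_in helper are illustrative assumptions, and the strings are just the feature labels from the lists.

```python
# Feature labels taken from the lists above; the set names are illustrative.
SAM1 = {
    "Segment an object from a click",
    "Refine prediction with follow-up clicks",
}
SAM2 = SAM1 | {
    "Track segmented objects in videos",
}
SAM3 = SAM2 | {
    "Detect and segment matching instances from text",
    "Refine detection with visual examples",
}


def added_in(newer: set, older: set) -> list:
    """Return the features present in `newer` but not in `older`, sorted by name."""
    return sorted(newer - older)


# Each model's feature set strictly contains the previous one.
assert SAM1 < SAM2 < SAM3

print("New in SAM 2:", added_in(SAM2, SAM1))
print("New in SAM 3:", added_in(SAM3, SAM2))
```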
Try SAM 3 today
Experiment with SAM 3 in the Segment Anything Playground.

OUR APPROACH
New unified architecture
SAM 3 is built as a unified, promptable model that enables segmentation with language, exemplars and visual prompts across images and videos. It leverages a large-scale, diverse training dataset and a powerful perception encoder backbone to achieve state-of-the-art performance in segmentation and tracking using open-vocabulary short text phrases and visual prompts.

