INTRODUCING DINOV3
Self-supervised learning for vision at unprecedented scale
DINOv3 scales self-supervised learning (SSL) for images to produce our strongest universal vision backbones, enabling breakthrough performance across diverse domains.
DINOV3 OVERVIEW
Cutting-edge image representations, trained without human supervision
We scaled self-supervised training to 7B-parameter models on a dataset of 1.7B images, using a fraction of the compute required by weakly-supervised methods. Even with the backbones kept frozen during evaluation, they achieve absolute state-of-the-art performance across diverse domains.
Exceptional performance across visual domains
SSL unlocks domains where annotations are scarce or costly. DINOv3 backbones deliver state-of-the-art results on tasks ranging from object detection in web imagery to canopy height mapping in satellite and aerial imagery.
Versatile backbone with powerful dense image features
High-resolution dense features from a single DINOv3 backbone enable leading performance across vision tasks, including object detection, depth estimation, and segmentation, without any finetuning.
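To make the "no finetuning" point concrete, here is a minimal sketch of how frozen dense patch features can drive a segmentation-style task via nearest-neighbor lookup against a small labeled reference set. The feature arrays, shapes, and the `knn_segment` helper are hypothetical stand-ins; real features would come from a DINOv3 backbone, which stays frozen throughout.

```python
import numpy as np

def knn_segment(patch_feats, ref_feats, ref_labels, k=3):
    """Assign each query patch a label by similarity-weighted k-NN
    lookup in a labeled reference set of frozen patch features."""
    # Normalize so dot products are cosine similarities.
    q = patch_feats / np.linalg.norm(patch_feats, axis=1, keepdims=True)
    r = ref_feats / np.linalg.norm(ref_feats, axis=1, keepdims=True)
    sims = q @ r.T                          # (num_patches, num_refs)
    nn = np.argsort(-sims, axis=1)[:, :k]   # indices of top-k references
    out = []
    for i in range(len(q)):
        votes = np.zeros(ref_labels.max() + 1)
        for j in nn[i]:
            votes[ref_labels[j]] += sims[i, j]  # weight vote by similarity
        out.append(int(votes.argmax()))
    return np.array(out)

# Toy stand-ins for DINOv3 patch features (shapes are illustrative only).
rng = np.random.default_rng(0)
ref_feats = rng.normal(size=(100, 64))
ref_labels = rng.integers(0, 5, size=100)
patch_feats = ref_feats[:10] + 0.01 * rng.normal(size=(10, 64))
print(knn_segment(patch_feats, ref_feats, ref_labels))
```

Because no gradients touch the backbone, adapting to a new task reduces to building the reference set, which is what makes strong frozen features so practical.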
Efficient model sizes and architectures
We release a comprehensive model suite addressing a wide range of use cases, including broad coverage of ViT sizes and efficient ConvNeXt models for on-device deployment.
PERFORMANCE
Evaluating DINOv3's Performance
DINOv3 sets a new standard in vision foundation models. For the first time, a model trained with SSL outperforms weakly-supervised models on a broad range of probing tasks, from fine-grained image classification to semantic segmentation to object tracking in video.

APPROACH
Self-supervised pre-training unlocks simple task adaptation
Pre-training data is curated from a large unlabeled dataset. During pre-training, the model learns general-purpose visual representations by matching features between different augmented views of the same image. In post-training, the large model is distilled into smaller, more efficient models.
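The feature-matching idea in DINO-style SSL can be sketched as a cross-entropy between a teacher's sharpened output distribution on one augmented view and a student's distribution on another view. The sketch below is a simplified illustration, not the actual DINOv3 training code: the logits are random stand-ins, the centering is reduced to a batch mean, and the temperatures are assumed values.

```python
import numpy as np

def softmax(x, temp):
    # Temperature-scaled softmax over the last axis (numerically stable).
    z = x / temp
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def dino_loss(student_out, teacher_out, center,
              student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy between the teacher's centered, sharpened
    distribution and the student's distribution on another view."""
    targets = softmax(teacher_out - center, teacher_temp)  # sharpened targets
    probs = softmax(student_out, student_temp)
    return -(targets * np.log(probs + 1e-9)).sum(axis=-1).mean()

rng = np.random.default_rng(0)
view_a = rng.normal(size=(4, 16))                    # student logits, view A
view_b = view_a + 0.01 * rng.normal(size=(4, 16))    # teacher logits, view B
center = view_b.mean(axis=0)  # running center, simplified to a batch mean
loss = dino_loss(view_a, view_b, center)
print(float(loss))
```

Minimizing this loss pushes the student to produce the same distribution for every augmented view of an image, which is what yields augmentation-invariant, general-purpose features.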
A pre-trained DINOv3 model can be easily tailored by training a lightweight adapter on a small amount of annotated data.
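A lightweight adapter can be as simple as a linear classifier trained on frozen features. The sketch below fits such a probe with plain gradient descent on softmax cross-entropy; the features are a separable toy stand-in rather than real DINOv3 outputs, and `train_linear_probe` is a hypothetical helper, not part of any released API.

```python
import numpy as np

def train_linear_probe(feats, labels, num_classes, lr=0.5, steps=200):
    """Fit a linear classifier on frozen features by gradient descent
    on softmax cross-entropy; the backbone itself is never updated."""
    n, d = feats.shape
    W = np.zeros((d, num_classes))
    onehot = np.eye(num_classes)[labels]
    for _ in range(steps):
        logits = feats @ W
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)
        grad = feats.T @ (p - onehot) / n  # gradient of the cross-entropy
        W -= lr * grad
    return W

# Toy stand-in features: two well-separated clusters (not real DINOv3 output).
rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(-2, 1, size=(50, 8)),
                   rng.normal(2, 1, size=(50, 8))])
labels = np.array([0] * 50 + [1] * 50)
W = train_linear_probe(feats, labels, num_classes=2)
acc = (np.argmax(feats @ W, axis=1) == labels).mean()
print(acc)  # should be near 1.0 on this separable toy data
```

Because only the small weight matrix is learned, a few hundred annotated examples can suffice, which is the practical payoff of a strong frozen backbone.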
DINO Evolution
DINOv3 marks a new milestone in self-supervised training at scale. It builds on the scaling progress of DINOv2, increasing model size 6x and training data 12x.
Explore additional resources