MLXDINOv3
A native Swift implementation of Meta’s DINOv3 using MLX Swift.
DINOv3 is a family of self-supervised vision foundation models from Meta AI. The models produce high-quality dense visual features that, according to Meta, outperform specialized models on a range of vision tasks without fine-tuning. This package provides an implementation that runs on-device on Apple silicon and is validated numerically against PyTorch reference outputs.
Installation
Add MLXDINOv3 to your Swift Package Manager dependencies:
    dependencies: [
        .package(url: "https://github.com/vincentamato/MLXDINOv3.git", from: "1.0.0")
    ]
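For context, here is a sketch of how that dependency entry might sit inside a complete `Package.swift`. The package and target name `MyApp` is hypothetical, the product name `MLXDINOv3` is assumed to match the module name used in the import, and the platform requirement is a guess; check the package's own manifest for the actual values.

```swift
// swift-tools-version: 5.9
import PackageDescription

let package = Package(
    name: "MyApp",  // hypothetical package name
    platforms: [
        .macOS(.v14)  // assumption: adjust to the minimum the dependency requires
    ],
    dependencies: [
        .package(url: "https://github.com/vincentamato/MLXDINOv3.git", from: "1.0.0")
    ],
    targets: [
        .executableTarget(
            name: "MyApp",  // hypothetical target name
            dependencies: [
                // Assumption: the library product is named after the module.
                .product(name: "MLXDINOv3", package: "MLXDINOv3")
            ]
        )
    ]
)
```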
Then import it:
    import MLXDINOv3
Converting Hugging Face weights to MLX format
Convert Hugging Face weights to MLX format using the conversion CLI in Xcode:
- Open the package in Xcode:
    xed .
- Select the Convert scheme from the scheme selector
- Edit the scheme (Product → Scheme → Edit Scheme)
- Under "Run" → "Arguments", add:
    facebook/dinov3-vits16-pretrain-lvd1689m ./Models/dinov3-vits16-mlx
- Run the scheme (Cmd+R)
Note: Currently, only the ViT models are supported.
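After the conversion finishes, it can help to confirm that the output directory was actually populated. The sketch below is not part of this package: `listModelFiles` is a hypothetical helper built only on Foundation, and it makes no assumption about which file names the converter writes. It lists whatever regular files are present, with their sizes.

```swift
import Foundation

/// Hypothetical helper: lists the regular files in a directory with their sizes in bytes.
func listModelFiles(at directory: URL) throws -> [(name: String, bytes: Int)] {
    let urls = try FileManager.default.contentsOfDirectory(
        at: directory,
        includingPropertiesForKeys: [.fileSizeKey, .isRegularFileKey]
    )
    var files: [(name: String, bytes: Int)] = []
    for url in urls {
        let values = try url.resourceValues(forKeys: [.fileSizeKey, .isRegularFileKey])
        // Skip subdirectories and other non-file entries.
        guard values.isRegularFile == true else { continue }
        files.append((name: url.lastPathComponent, bytes: values.fileSize ?? 0))
    }
    return files.sorted { $0.name < $1.name }
}

// The output path is the one passed to the Convert scheme above.
let modelDirectory = URL(fileURLWithPath: "./Models/dinov3-vits16-mlx")
if let files = try? listModelFiles(at: modelDirectory) {
    for file in files {
        print("\(file.name): \(file.bytes) bytes")
    }
} else {
    print("No converted model found at \(modelDirectory.path)")
}
```

The relative path resolves against the process's working directory, which under Xcode depends on the scheme settings, so an absolute path may be more reliable.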
Example Usage
    import AppKit
    import MLX
    import MLXDINOv3

    // Load a pretrained model
    let model = try loadPretrained(modelPath: "Models/dinov3-vits16-mlx")

    // Preprocess an image
    let image = NSImage(contentsOfFile: "image.jpg")!
    let processor = ImageProcessor()
    let inputs = try processor(image)

    // Run inference
    let outputs = model(inputs)
    print("Pooler output shape:", outputs.poolerOutput.shape)
    print("Last hidden state shape:", outputs.lastHiddenState.shape)
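To interpret the printed shapes, it may help to estimate how many tokens the last hidden state should contain. The sketch below assumes the usual ViT layout of one class token, a fixed number of register tokens, and one token per image patch. The specific values (224×224 input, 16-pixel patches, 4 register tokens) are assumptions for the ViT-S/16 checkpoint; the size this package's `ImageProcessor` resizes to is not stated here, so verify against the actual output. `expectedTokenCount` is a hypothetical helper, not part of MLXDINOv3.

```swift
/// Hypothetical helper: number of tokens a ViT produces for a square input,
/// counted as 1 class token + register tokens + one token per patch.
func expectedTokenCount(imageSize: Int, patchSize: Int, registerTokens: Int) -> Int {
    let patchesPerSide = imageSize / patchSize
    return 1 + registerTokens + patchesPerSide * patchesPerSide
}

// Assumed values for ViT-S/16: 224x224 input, 16-pixel patches, 4 register tokens.
let tokens = expectedTokenCount(imageSize: 224, patchSize: 16, registerTokens: 4)
print("Expected sequence length:", tokens)  // 1 + 4 + 14 * 14 = 201
```

Under these assumptions, the second dimension of the last hidden state would be 201, and the patch tokens would start after the class and register tokens.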
Testing
Run all tests from Xcode. MLX depends on a compiled Metal shader library (metallib), which the SwiftPM command line does not build.
Step 1: Convert the test model
- Open the package in Xcode:
    xed .
- Select the Convert scheme
- Edit the scheme (Product → Scheme → Edit Scheme)
- Under "Run" → "Arguments", add:
    facebook/dinov3-vits16-pretrain-lvd1689m Tests/MLXDINOv3Tests/Resources/Model
- Run (Cmd+R)
Step 2: Run tests
Run tests with Cmd+U or Product → Test
Tests automatically download PyTorch reference outputs from the Hugging Face Hub for validation.
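Numerical validation of this kind usually means comparing the model's output to the reference element by element, within a tolerance. The sketch below illustrates that idea on plain `[Float]` arrays. It is not the package's test code: the function names, the tolerance of `1e-4`, and the sample values are all placeholders chosen for illustration.

```swift
/// Largest absolute element-wise difference between two equally sized arrays.
func maxAbsoluteDifference(_ a: [Float], _ b: [Float]) -> Float {
    precondition(a.count == b.count, "Arrays must have the same length")
    return zip(a, b).map { abs($0 - $1) }.max() ?? 0
}

/// True when every element of `output` is within `tolerance` of `reference`.
/// The default tolerance is an illustrative placeholder, not the package's setting.
func matchesReference(_ output: [Float], _ reference: [Float], tolerance: Float = 1e-4) -> Bool {
    maxAbsoluteDifference(output, reference) <= tolerance
}

// Placeholder data standing in for flattened model and reference outputs.
let reference: [Float] = [0.125, -0.5, 0.75]
let output: [Float] = [0.125, -0.5, 0.75]
print("Matches reference:", matchesReference(output, reference))
```

A maximum-absolute-difference check is strict about outliers; a relative tolerance may suit outputs with a wide dynamic range better.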
References
- DINOv3 Paper: DINOv3
- DINOv3 Repository: facebookresearch/dinov3
License
This package is released under the MIT License.
Note: The pretrained DINOv3 weights and original model architecture are released under Meta’s DINOv3 License.