GitHub - DebarghaG/forte: Official Implementation for Forte: Finding Outliers using Representation Typicality Estimation (ICLR 2025)


Forte: Finding Outliers Using Representation Typicality Estimation

License: MIT

Why OOD?

Out-of-Distribution (OOD) detection is one of the most important problems for safe, deployable ML:

  1. Provides the first line of defense by preventing silent failures in critical ML systems
  2. Bounds AI capabilities by recognizing the limits of a model's knowledge
  3. Allows safe fallback and enables human oversight when needed

Why Forte?

Forte takes a novel approach to OOD detection with several key advantages:

  1. Utilizes self-supervised representations to capture semantic features
  2. Incorporates manifold estimation to account for local topology
  3. Minimizes deployment overhead; eliminates additional model training requirements
  4. Requires no class labels, no exposure to OOD data during training, and no restrictions on the architecture of predictive or generative models
  5. Strong domain generalization – evaluated on tasks such as synthetic-image detection and medical (MRI) imaging

Key Innovation

Forte treats OOD detection as middleware in deployments. The approach is designed to be plug-and-play, requiring minimal setup and configuration.
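To make the "middleware" idea concrete, here is a minimal, hypothetical sketch of gating a model's predictions behind an OOD score. The `ood_score` function, the threshold, and the toy model are illustrative stand-ins, not the repo's actual API:

```python
# Hypothetical sketch: OOD detection as middleware that wraps a model
# and abstains on atypical inputs, enabling a safe fallback path.
from typing import Any, Callable


def ood_middleware(model: Callable[[Any], Any],
                   ood_score: Callable[[Any], float],
                   threshold: float) -> Callable[[Any], dict]:
    """Return a wrapped model that abstains when the input looks atypical."""
    def guarded(x):
        if ood_score(x) > threshold:
            # Atypical input: abstain and defer to a fallback / human review.
            return {"prediction": None, "abstained": True}
        return {"prediction": model(x), "abstained": False}
    return guarded


# Toy usage: the "score" is distance from the training region around 0.
classify = ood_middleware(model=lambda x: "cat" if x < 0.5 else "dog",
                          ood_score=lambda x: abs(x),
                          threshold=2.0)

print(classify(0.2))  # in-distribution input -> normal prediction
print(classify(9.0))  # atypical input -> abstains
```

Because the wrapper only needs a scoring function, the underlying predictive model never has to be retrained, which matches the plug-and-play framing above.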

Quick Start

# Clone the repository
git clone https://github.com/DebarghaG/forte.git
cd forte

# Create and activate a virtual environment
python3 -m venv env
source env/bin/activate

# Install dependencies
pip install scikit-learn numpy scipy transformers torch torchvision pillow tqdm

Basic Usage

Simply provide your data folders:

python main.py --id_images_directories '../data/imagenet_1k' \
    --id_images_names imagenet1k \
    --ood_images_directories '../data/inaturalist_images' \
    --ood_images_names inaturalist_images \
    --batch_size 512 \
    --device cuda:0 \
    --embedding_dir ../embeddings/ \
    --num_seeds 5 \
    --run_baselines False

Technical Approach

Forte combines representation learning with statistical estimation:

  1. Uses self-supervised models to extract semantic features from images
  2. Estimates typical sets using nearest neighbor statistics
  3. Fits a density estimator (KDE, OCSVM, or GMM) to the statistics of in-distribution data
  4. Evaluates samples using precision, recall, density, and coverage metrics
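The steps above can be sketched end-to-end on synthetic data. This is an illustrative approximation, not the repository's implementation: random vectors stand in for self-supervised embeddings, the mean k-nearest-neighbor distance is used as a simple typicality statistic, and a GMM (one of the estimators named above) is fit to those statistics:

```python
# Illustrative Forte-style pipeline on synthetic "embeddings" (not the repo's code).
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Step 1 stand-in: ID embeddings form a tight cluster; OOD embeddings are shifted.
id_embeddings = rng.normal(0.0, 1.0, size=(500, 16))
ood_embeddings = rng.normal(4.0, 1.0, size=(100, 16))

# Step 2: capture local topology via k-nearest-neighbor statistics
# computed against the in-distribution reference set.
knn = NearestNeighbors(n_neighbors=5).fit(id_embeddings)

def knn_features(x):
    """Mean distance to the k nearest ID neighbors, a simple typicality proxy."""
    dists, _ = knn.kneighbors(x)
    return dists.mean(axis=1, keepdims=True)

# Step 3: fit a density estimator (here a GMM) on the ID typicality statistics.
gmm = GaussianMixture(n_components=2, random_state=0).fit(knn_features(id_embeddings))

# Step 4: score held-out samples; lower log-likelihood means more OOD-like.
id_scores = gmm.score_samples(knn_features(id_embeddings))
ood_scores = gmm.score_samples(knn_features(ood_embeddings))
print(id_scores.mean() > ood_scores.mean())  # ID samples look more typical
```

In this sketch the separation comes entirely from the kNN statistics: OOD points sit far from every ID neighbor, so their typicality features fall outside the density fitted on ID data.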

The method achieves state-of-the-art (SoTA) performance across various benchmarks and real-world applications.

Figures: Unsupervised Methods Comparison; Supervised Methods Comparison.

Citation

@inproceedings{ganguly2025forte,
  title={Forte : Finding Outliers with Representation Typicality Estimation},
  author={Debargha Ganguly and Warren Richard Morningstar and Andrew Seohwan Yu and Vipin Chaudhary},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025},
  url={https://openreview.net/forum?id=7XNgVPxCiA}
}

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Research supported by ICICLE AI Institute.