Overview
Django Neural Feed (DNF) is a production-ready Django application designed to build intelligent, personalized content feeds powered by semantic vector embeddings. It leverages PostgreSQL's pgvector extension to compute vector similarity at the database level, combined with customizable content freshness and popularity metrics—all evaluated in a single optimized SQL query.
With its object-oriented architecture, DNF moves your configuration logic into dedicated Feed classes. It tracks user interactions non-intrusively via Django signals and dispatches background work to Celery, falling back automatically to synchronous background threads if the broker is offline.
Core Features
- 🧠 Object-Oriented Feed Configuration: Define isolated, multi-tenant recommendation feeds by subclassing a unified BaseNeuralFeed class.
- ⚡ Bulletproof Asynchronous Pipeline: Offload embedding generation and vector aggregation to Celery. Features an automated synchronous thread fallback system.
- 📊 Dedicated Multi-Feed User Profiles: Stores vector profiles in an isolated UserFeedProfile model partitioned by feed_id, keeping your core Auth User table clean.
- 🎯 Hybrid Multi-Criteria Scoring: Merges semantic similarity (pgvector cosine distance), content recency, and custom popularity expressions into a single database-level annotation.
- 🚀 Non-Invasive Integration: Attach recommendation behavior to existing content models with minimal migrations, leaving your interaction tables (Likes/Dislikes) completely untouched.
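The queue-with-fallback behavior described above can be pictured with a small, framework-free sketch. It is not DNF's internal code: `dispatch`, `enqueue`, and `offline_broker` are hypothetical names, and the choice to join the thread (so the caller waits for the result) is an assumption about what "synchronous" means here.

```python
import threading
from typing import Callable

Task = Callable[[], None]


def dispatch(task: Task, enqueue: Callable[[Task], None]) -> str:
    """Try the queue first; if the broker is unreachable, run the task in a local thread."""
    try:
        enqueue(task)
        return "queued"
    except ConnectionError:
        # Hypothetical fallback: run locally and wait for the task to finish.
        worker = threading.Thread(target=task)
        worker.start()
        worker.join()
        return "thread"


def offline_broker(task: Task) -> None:
    """Stand-in for a Celery call made while the broker is down."""
    raise ConnectionError("broker is offline")


results: list[str] = []
mode = dispatch(lambda: results.append("embedded"), offline_broker)
print(mode, results)
```

The task still runs even though the stand-in broker refuses the job, which is the property the fallback is meant to guarantee.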
Requirements
- Python: 3.10+
- Django: 4.2, 5.0, 6.0+
- PostgreSQL: 12+ (with pgvector extension installed)
- NumPy: 2.0.0+
- pgvector: 0.4.0+
Installation
1. Install the Package
pip install django-neural-feed
The built-in DefaultVectorEncoder uses sentence-transformers (which pulls in
torch). It is an optional extra, so install it only if you rely on the default
local encoder:
pip install "django-neural-feed[local]"
If you configure a custom ENCODER_CLASS (OpenAI, Cohere, a hosted inference API,
etc.) you can skip it and avoid the torch download entirely.
2. Add to Django Settings
To ensure Django's migration engine can cleanly generate and compile PostgreSQL-specific vector indices, make sure to add both django.contrib.postgres and django_neural_feed to your settings:
    INSTALLED_APPS = [
        # ...your existing apps
        'django.contrib.postgres',  # Required for robust PG index migration compilation
        'django_neural_feed',
    ]
3. Initialize PostgreSQL Extension
Ensure pgvector is enabled in your database instance:
CREATE EXTENSION IF NOT EXISTS vector;
Quick Start
Step 1: Configure Your Content Model
Inherit from NeuralRecommendMixin to inject a vector embedding column into your target content table.
    from django.conf import settings
    from django.db import models
    from django_neural_feed.mixins import NeuralRecommendMixin


    class Post(NeuralRecommendMixin, models.Model):
        # NOTE: NeuralRecommendMixin must come BEFORE models.Model!
        title = models.CharField(max_length=255)
        content = models.TextField()
        created_at = models.DateTimeField(auto_now_add=True)
        likes = models.ManyToManyField(settings.AUTH_USER_MODEL, related_name="liked_posts")

        def get_ready_text(self) -> str:
            return f"{self.title} {self.content}"
Prepare and apply your migrations:
    python manage.py makemigrations
    python manage.py migrate
Step 2: Define a Custom Feed Class
Create a dedicated feeds.py configuration to encapsulate tracking thresholds, model fields, math scoring expressions, and hybrid weights.
    from django.db.models import Count, ExpressionWrapper, F, FloatField, Value
    from django.db.models.functions import Cast, Extract, Ln, Now

    from django_neural_feed.feeds import BaseNeuralFeed
    from your_app.models import Post


    class PostFeed(BaseNeuralFeed):
        # 1. Core Feed Identity
        feed_id = "posts_main"
        parent_feed = None  # Optional: reference to a parent feed class for inheritance hierarchy

        # 2. Target Django Models Configuration
        content_django_model = Post
        interaction_django_model = Post.likes.through

        # 3. Interaction Tracking Pipelines
        mode = "m2m"  # Use "m2m" for ManyToMany fields, or "model" for explicit through models
        user_field_name = "user"  # Field pointing to User model (not needed if mode is "m2m")
        content_field_name = "post"  # Field pointing to Content model (not needed if mode is "m2m")

        # 4. Model & Pipeline Thresholds
        embedding_model_name = "paraphrase-multilingual-MiniLM-L12-v2"  # Overrides global setting
        user_likes_limit = 20  # Max number of liked items sampled for vector profile aggregation

        # 5. Hybrid Scoring Global Weights (should ideally sum to 1.0)
        weight_similarity = 0.6
        weight_freshness = 0.2
        weight_popularity = 0.2

        # 6. Popularity: natural-log scaling keeps viral jumps balanced.
        # Dividing by Ln(Value(1000.0)) makes the score reach roughly 1.0 at about 1,000 likes.
        popularity_expression = ExpressionWrapper(
            Ln(Cast(Count("likes"), FloatField()) + Value(1.0)) / Ln(Value(1000.0)),
            output_field=FloatField(),
        )

        # 7. Freshness: time-decay function based on post age in hours.
        # Subtracts timestamps inside the database and converts the interval to hours.
        freshness_expression = ExpressionWrapper(
            Value(1.0) / (
                Value(1.0) + (Extract(Now() - F("created_at"), "epoch") / 3600.0)
            ),
            output_field=FloatField(),
        )
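To get a feel for the two scoring curves before running them in SQL, the same arithmetic can be written in plain Python. This is only a sketch that mirrors the formulas in the feed class above; `popularity_score`, `freshness_score`, and the `saturation` parameter are illustrative names, not part of DNF.

```python
import math


def popularity_score(like_count: int, saturation: float = 1000.0) -> float:
    """ln(likes + 1) / ln(saturation), mirroring popularity_expression."""
    return math.log(like_count + 1.0) / math.log(saturation)


def freshness_score(age_hours: float) -> float:
    """1 / (1 + age in hours), mirroring freshness_expression."""
    return 1.0 / (1.0 + age_hours)


for likes in (0, 9, 999):
    print(f"{likes} likes -> popularity {popularity_score(likes):.3f}")

for hours in (0, 1, 24):
    print(f"{hours} h old -> freshness {freshness_score(hours):.3f}")
```

A post with no likes scores 0, one with 999 likes scores 1.0, and the score keeps growing slowly beyond that, so it is not capped. Freshness halves after the first hour and drops to 0.04 after a day, so this particular decay favors very recent posts strongly.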
Step 3: Register Feed in Settings
Register the string path to your custom feed configuration within the DJANGO_NEURAL_FEED["FEEDS"] list inside your settings.py.
    DJANGO_NEURAL_FEED = {
        "FEEDS": [
            "your_app.feeds.PostFeed",  # DNF hooks up all model and M2M signals automatically
        ],
    }
Step 4: Fetch Personalized Feed Results
Call your feed's .get_feed() method to obtain an optimized queryset ordered by the hybrid score.
    from django.http import JsonResponse

    from your_app.feeds import PostFeed
    from your_app.models import Post


    def user_feed_view(request):
        # Gather IDs of posts the user has already liked to exclude them from the feed
        excluded_ids = Post.objects.filter(likes=request.user).values_list("id", flat=True)

        # Generate personalized recommendations directly via your Feed class
        feed_queryset = PostFeed.get_feed(
            user=request.user,
            queryset=Post.objects.all(),
            excluded_ids=excluded_ids,
            limit=20,
        )

        # A view must return an HttpResponse, so serialize the ranked posts
        results = [{"id": post.id, "title": post.title} for post in feed_queryset]
        return JsonResponse({"results": results})
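Conceptually, the ordering combines three per-item signals using the configured weights. The sketch below assumes a plain weighted sum, which is an assumption on our part: the SQL that DNF actually generates may normalize or combine the terms differently. All names here (`cosine_similarity`, `hybrid_score`, the sample candidates) are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity; pgvector's cosine distance is 1 minus this value."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def hybrid_score(
    similarity: float,
    freshness: float,
    popularity: float,
    weights: tuple[float, float, float] = (0.6, 0.2, 0.2),
) -> float:
    """Assumed combination rule: a weighted sum of the three signals."""
    w_sim, w_fresh, w_pop = weights
    return w_sim * similarity + w_fresh * freshness + w_pop * popularity


user_vector = [1.0, 0.0]
# name -> (embedding, freshness, popularity), all made-up values
candidates = {
    "similar_but_old": ([1.0, 0.0], 0.0, 0.0),
    "unrelated_but_fresh": ([0.0, 1.0], 1.0, 1.0),
}
scores = {
    name: hybrid_score(cosine_similarity(user_vector, vec), fresh, pop)
    for name, (vec, fresh, pop) in candidates.items()
}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

With the default 0.6 / 0.2 / 0.2 weights, a perfect semantic match with no freshness or popularity (0.6) still outranks an unrelated item that maxes out both other signals (0.4). Shifting weight away from similarity reverses that.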
Enabling High-Performance HNSW Indexing
To scale beyond roughly 10,000 items, build an Approximate Nearest Neighbor (ANN) index in PostgreSQL. An HNSW index can cut query times from 100 ms or more to under 1 ms by avoiding expensive sequential scans.
Step 1: Set Up your Django Model with HNSW Mixin
Inherit from NeuralHnswMixin in your content model. To resolve meta-class conflicts cleanly in static type checkers (like Pylance/Mypy), explicitly merge the internal Meta classes:
    from django.conf import settings
    from django.db import models
    from django_neural_feed.mixins import NeuralRecommendMixin, NeuralHnswMixin


    class Post(NeuralRecommendMixin, NeuralHnswMixin, models.Model):
        title = models.CharField(max_length=255)
        content = models.TextField()
        created_at = models.DateTimeField(auto_now_add=True)
        likes = models.ManyToManyField(settings.AUTH_USER_MODEL, related_name="liked_posts")

        def get_ready_text(self) -> str:
            return f"{self.title} {self.content}"

        # Explicitly inherit Meta options to merge the dynamic HNSW index
        class Meta(NeuralRecommendMixin.Meta, NeuralHnswMixin.Meta):
            pass
Step 2: Turn on HNSW in Settings
Configure the HNSW runtime parameters inside your settings.py. DNF will automatically run SET LOCAL hnsw.ef_search on incoming queries to tune search precision dynamically.
    DJANGO_NEURAL_FEED = {
        "FEEDS": [
            "your_app.feeds.PostFeed",
        ],
        "HNSW": {
            "ENABLED": True,     # Toggles HNSW-optimized query isolation
            "EF_SEARCH": 40,     # Size of the dynamic candidate list during query phase
            "SEARCH_POOL": 500,  # Pre-retrieval pool size divided between similarity/freshness/popularity
        },
    }
Step 3: Compile the Database Graph
Run the migrations to physically construct the HNSW index on your database server.
    python manage.py makemigrations
    python manage.py migrate
Note: Building a vector index on a large dataset is resource-intensive. Expect CPU usage to reach 100% or more while migrate runs and Postgres builds the graph layers.
Step 4: Verify HNSW is Running
Ensure your queries are actually utilizing the newly built index. Run a database analysis via Django Shell:
    # python manage.py shell
    from django.contrib.auth import get_user_model

    from your_app.feeds import PostFeed
    from your_app.models import Post

    user = get_user_model().objects.first()

    # Analyze query execution structure
    print(PostFeed.get_feed(user=user, limit=20).explain(analyze=True))
If HNSW is configured correctly, the query plan shows an Index Scan using ..._hnsw_idx rather than a Seq Scan (sequential scan).
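If you want to automate this check, for example in a smoke test, a small helper can scan the plan text for the index. This is a sketch under assumptions: `uses_hnsw_index` is a hypothetical helper, and the two plan strings are hand-written stand-ins with placeholder index and table names, not real EXPLAIN output from DNF.

```python
def uses_hnsw_index(plan: str, index_suffix: str = "_hnsw_idx") -> bool:
    """Return True if any plan line is an index scan on an index with the given suffix."""
    return any(
        "Index Scan" in line and index_suffix in line
        for line in plan.splitlines()
    )


# Hand-written examples of the two plan shapes; names are placeholders.
indexed_plan = "Limit\n  ->  Index Scan using example_hnsw_idx on example_table"
sequential_plan = "Limit\n  ->  Sort\n        ->  Seq Scan on example_table"

print(uses_hnsw_index(indexed_plan))
print(uses_hnsw_index(sequential_plan))
```

In practice you would pass the string returned by `.explain()` on the feed queryset instead of the placeholder plans.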
Configuration Reference
Set global defaults (weights, limits, and the embedding backend) through the DJANGO_NEURAL_FEED dictionary in your settings.py:
    DJANGO_NEURAL_FEED = {
        "MODEL_NAME": "paraphrase-multilingual-MiniLM-L12-v2",
        "VECTOR_DIMENSION": 384,
        "CELERY_ENABLED": True,
        "WEIGHT_SIMILARITY": 0.6,
        "WEIGHT_FRESHNESS": 0.2,
        "WEIGHT_POPULARITY": 0.2,
        "HNSW": {
            "ENABLED": True,
            "EF_SEARCH": 40,
            "SEARCH_POOL": 500,
        },
    }
| Global Config Key | Type | Default | Purpose |
|---|---|---|---|
| MODEL_NAME | str | paraphrase-multilingual-MiniLM-L12-v2 | Hugging Face SentenceTransformer model used for embeddings. |
| VECTOR_DIMENSION | int | 384 | Dimensionality of the embedding vectors. |
| ENCODER_CLASS | str/type | django_neural_feed.encoders.DefaultVectorEncoder | Dotted path to (or class object of) the vector encoder. |
| WEIGHT_SIMILARITY | float | 0.6 | Default proportional weight of cosine similarity scoring. |
| WEIGHT_FRESHNESS | float | 0.2 | Default proportional weight of item creation recency. |
| WEIGHT_POPULARITY | float | 0.2 | Default proportional weight of user interaction counts. |
| USER_LIKES_LIMIT | int | 20 | Maximum number of a user's latest interactions sampled for vector aggregation. |
| CELERY_ENABLED | bool | False | Toggles routing tasks to background Celery workers. |
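Because the three weights are meant to sum to 1.0, it can be worth checking them before deploying. The helper below is a hypothetical sketch and not something DNF provides; it only reads the three weight keys listed in the table.

```python
import math

config = {
    "WEIGHT_SIMILARITY": 0.6,
    "WEIGHT_FRESHNESS": 0.2,
    "WEIGHT_POPULARITY": 0.2,
}


def weights_sum_to_one(cfg: dict, tolerance: float = 1e-9) -> bool:
    """Check that the three scoring weights add up to 1.0 within a tolerance."""
    total = (
        cfg["WEIGHT_SIMILARITY"]
        + cfg["WEIGHT_FRESHNESS"]
        + cfg["WEIGHT_POPULARITY"]
    )
    # isclose avoids false alarms from floating-point rounding
    return math.isclose(total, 1.0, abs_tol=tolerance)


print(weights_sum_to_one(config))
```

A check like this could run in a test or a startup hook, depending on how strict you want to be.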
Advanced Settings Overriding
Each of the attributes below can be set on your BaseNeuralFeed subclass, so different models can use different configurations (for example, separate metric weights for ArticlesFeed and VideoFeed).
| Feed Class Attribute | Type | Default Value / Fallback | Purpose |
|---|---|---|---|
| feed_id | str | "default_feed" | Unique identifier for partitioning user vector profiles. |
| mode | str | Required ("m2m" / "model") | Toggles the internal signal tracking pipeline architecture. |
| embedding_model_name | str | settings.MODEL_NAME | Overrides the text-embedding engine for this specific feed. |
| user_likes_limit | int | settings.USER_LIKES_LIMIT | Overrides how many of a user's latest interactions are sampled for this feed. |
| weight_similarity | float | settings.WEIGHT_SIMILARITY | Fine-tunes semantic similarity importance for this feed. |
| weight_freshness | float | settings.WEIGHT_FRESHNESS | Fine-tunes time-decay metric importance for this feed. |
| weight_popularity | float | settings.WEIGHT_POPULARITY | Fine-tunes interaction count importance for this feed. |
| popularity_expression | Expression | Value(1.0) | Custom Django ORM expression that computes the popularity score. |
| freshness_expression | Expression | Value(1.0) | Custom Django ORM expression that computes the time-decay score. |
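The fallback column in the table reads as "use the class attribute if set, otherwise the global setting". A minimal sketch of that lookup, using plain classes as stand-ins for BaseNeuralFeed subclasses, might look like the following. `GLOBAL_DEFAULTS` and `resolve` are hypothetical names, and DNF's own resolution logic may differ.

```python
# Stand-in for the values DNF would read from DJANGO_NEURAL_FEED
GLOBAL_DEFAULTS = {
    "weight_similarity": 0.6,
    "weight_freshness": 0.2,
    "weight_popularity": 0.2,
    "user_likes_limit": 20,
}


class ArticlesFeed:
    """Overrides all three weights, keeps the global likes limit."""
    weight_similarity = 0.8
    weight_freshness = 0.1
    weight_popularity = 0.1


class VideoFeed:
    """Overrides only the likes limit, keeps the global weights."""
    user_likes_limit = 50


def resolve(feed_cls: type, name: str):
    """Prefer the class attribute; otherwise fall back to the global default."""
    return getattr(feed_cls, name, GLOBAL_DEFAULTS[name])


for feed in (ArticlesFeed, VideoFeed):
    print(feed.__name__, {key: resolve(feed, key) for key in GLOBAL_DEFAULTS})
```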
Architecture Mechanics
- Content Structuring: When an instance of a model that subclasses NeuralRecommendMixin is saved, the post_save signal fires and DNF calls get_ready_text() to compute a dense float vector for it.
- Preference Profiling: When a tracked interaction changes, an isolated worker fetches the user's latest interaction rows, averages the corresponding content vectors into an L2-normalized mean vector, and stores it in UserFeedProfile.
- Query Engine Generation: Calling Feed.get_feed() combines pgvector operations with standard score normalization in a single query, avoiding redundant lookups.
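The preference-profiling step can be illustrated without NumPy. The sketch below assumes the vectors are averaged first and the mean is then L2-normalized; whether DNF normalizes before or after averaging is not stated here, and `mean_l2_normalized` is a hypothetical name.

```python
import math


def mean_l2_normalized(vectors: list[list[float]]) -> list[float]:
    """Average equal-length vectors, then scale the mean to unit length."""
    if not vectors:
        return []
    dim = len(vectors[0])
    mean = [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in mean))
    if norm == 0.0:
        # A zero mean has no direction to normalize
        return mean
    return [x / norm for x in mean]


# Two liked items pointing in different directions
liked_vectors = [[1.0, 0.0], [0.0, 1.0]]
profile = mean_l2_normalized(liked_vectors)
print(profile)
```

The resulting profile points halfway between the two liked items and has unit length, so cosine comparisons against content vectors depend only on direction.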
Custom Vector Encoders (Advanced)
By default, DNF uses local sentence-transformers via DefaultVectorEncoder. If you prefer to use external cloud APIs (like OpenAI, Cohere, or custom microservices) for generating embeddings, you can implement a custom encoder.
1. Subclass BaseVectorEncoder
Create a custom encoder class anywhere in your Django project and implement the text_to_vector method. You can optionally override average_vectors if you want to replace NumPy-based processing.
    import requests

    from django_neural_feed.encoders import BaseVectorEncoder


    class OpenAIAppEncoder(BaseVectorEncoder):
        @classmethod
        def text_to_vector(cls, text: str, model_name: str) -> list[float]:
            """Fetch embeddings via custom third-party cloud API endpoint."""
            if not text.strip():
                return []

            # Example API execution blueprint
            response = requests.post(
                "https://api.openai.com/v1/embeddings",
                headers={"Authorization": "Bearer YOUR_API_KEY"},
                json={"input": text, "model": model_name},
                timeout=30,  # Avoid hanging a worker on a slow API call
            )
            response.raise_for_status()  # Surface HTTP errors instead of a KeyError below
            return response.json()["data"][0]["embedding"]
2. Register Custom Encoder in Settings
Set ENCODER_CLASS in the global configuration dictionary to the dotted import path of your class:
    DJANGO_NEURAL_FEED = {
        "ENCODER_CLASS": "your_app.encoders.OpenAIAppEncoder",
        "MODEL_NAME": "text-embedding-3-small",  # Passed down as model_name argument
        "VECTOR_DIMENSION": 1536,  # Adjust to match your custom API provider
    }
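A mismatch between VECTOR_DIMENSION and what the provider returns is an easy mistake when switching encoders. A guard like the one below could be called inside a custom text_to_vector before returning. It is a hypothetical helper, not part of DNF, and it lets the empty list through because the example encoder returns [] for blank text.

```python
def check_dimension(vector: list[float], expected: int) -> list[float]:
    """Return the vector unchanged, or raise if its length does not match the setting."""
    if vector and len(vector) != expected:
        raise ValueError(
            f"encoder returned {len(vector)} dimensions, expected {expected}"
        )
    return vector


# A 384-dimensional vector checked against a 1536-dimensional setting
try:
    check_dimension([0.0] * 384, expected=1536)
except ValueError as error:
    print(error)
```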
Testing
DNF maintains full test coverage. Run the suite with coverage reporting:
pytest --cov=src/django_neural_feed
License
Distributed under the terms of the MIT License.