# News Rationalizer

**Measuring Emotional Valence in News Coverage**

A demonstration web application that analyzes news articles to measure authors' emotional valence patterns and identify balanced vs. biased reporting across topics.
## Core Concept: The Conjugate Principle
Just as multiplying a complex number by its conjugate yields a real number:

(a + bi)(a − bi) = a² + b²
We can rationalize news coverage by blending stories from authors with complementary emotional valences—pairing positive-leaning and negative-leaning perspectives on the same topic to approach more neutral, comprehensive coverage.
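A toy numerical sketch of the idea: two hypothetical articles with roughly opposite valences blend to a near-neutral reading list. The values below are invented for illustration; the real pipeline derives them from a sentiment model.

```python
# Toy illustration of the "conjugate" idea: pairing articles whose
# emotional valences are roughly opposite yields a near-neutral blend.

def blended_valence(valences):
    """Mean valence of a reading list; closer to 0 means more neutral."""
    return sum(valences) / len(valences)

positive_take = 0.62   # hypothetical optimistic article on a topic
negative_take = -0.58  # hypothetical critical article on the same topic

print(round(blended_valence([positive_take, negative_take]), 2))  # 0.02
```

The pair's residual valence (0.02) is far closer to neutral than either article alone, which is the sense in which opposing perspectives "rationalize" each other.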
## Features
- **Article Collection**: Automatically scrapes articles from 8+ diverse news sources via RSS feeds
- **Topic Categorization**: Classifies articles into 5 key categories (Nuclear Energy, Data Centers, Healthcare, Immigration, Technology)
- **Sentiment Analysis**: Measures emotional valence using a pre-trained RoBERTa model
- **Author Profiling**: Calculates balance scores and ranks authors by consistency across topics
- **Interactive Dashboard**: Tufte-inspired visualizations showing the emotional spectrum and complementary author pairs
## Quick Start

### Prerequisites
- Python 3.11+
- uv package manager
### Installation

```bash
# Clone the repository
git clone <your-repo-url>
cd rationalizer

# Install dependencies with uv
uv sync

# Run database migrations
uv run python manage.py migrate

# Create admin user (optional)
uv run python manage.py createsuperuser
```
### Running the Analysis

```bash
# Collect and analyze articles (takes 5-10 minutes)
uv run python scripts/run_analysis.py

# With options
uv run python scripts/run_analysis.py --months 3 --max-per-source 50

# Use ML-based categorization (slower but more accurate)
uv run python scripts/run_analysis.py --use-ml
```
### Running the Dashboard

```bash
# Start development server
uv run python manage.py runserver

# Visit http://localhost:8000/
```
## Project Structure

```
news_rationalizer/
├── analysis/              # Analysis pipeline modules
│   ├── collector.py       # Article fetching from RSS feeds
│   ├── categorizer.py     # Topic classification
│   ├── sentiment.py       # Emotional valence scoring
│   └── profiler.py        # Author analysis & balance metrics
├── dashboard/             # Django web application
│   ├── models.py          # Database models
│   ├── views.py           # Dashboard views
│   ├── templates/         # HTML templates (Tufte-inspired)
│   └── admin.py           # Django admin configuration
├── config/                # Django settings
│   ├── settings.py        # Project settings
│   ├── urls.py            # URL routing
│   └── wsgi.py            # WSGI configuration
├── data/                  # Database and data files
│   └── analysis_results.db
├── scripts/               # Utility scripts
│   └── run_analysis.py    # Main analysis pipeline script
└── manage.py              # Django management script
```
## How It Works

### 1. Data Collection
Articles are collected from diverse news sources:
- BBC News, Reuters, The Guardian, NPR
- Al Jazeera, The Hill, Axios, TechCrunch
The collector extracts:
- Title, author, publication date
- Full article content (or summary)
- Source publication and domain
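As a sketch of this extraction step, here is a minimal, standard-library-only illustration of pulling those fields out of a single RSS `<item>`. The actual collector uses feedparser against live feeds; the element names and sample values here are illustrative assumptions.

```python
# Sketch of the fields the collector extracts from one RSS item.
import xml.etree.ElementTree as ET

RSS_ITEM = """<item>
  <title>Reactor restart approved</title>
  <author>Jane Doe</author>
  <pubDate>Mon, 02 Jun 2025 10:00:00 GMT</pubDate>
  <description>Regulators approved the restart of the plant.</description>
</item>"""

def extract_fields(item_xml):
    """Return the title/author/date/summary fields for one feed item."""
    item = ET.fromstring(item_xml)
    return {
        "title": item.findtext("title", ""),
        "author": item.findtext("author", "unknown"),
        "published": item.findtext("pubDate", ""),
        "summary": item.findtext("description", ""),
    }

print(extract_fields(RSS_ITEM)["title"])  # Reactor restart approved
```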
### 2. Topic Categorization
Each article is classified into one or more categories:
- Nuclear Energy: Nuclear power, reactors, atomic technology
- Data Centers: Cloud infrastructure, server farms, edge computing
- Healthcare: Medical systems, treatment, public health
- Immigration: Border policy, refugee matters, asylum
- Technology Industry: AI, software, startups, big tech
Methods:
- Keyword-based (fast): Pattern matching with curated keyword lists
- ML-based (accurate): Zero-shot classification using BART-large-MNLI
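A minimal sketch of the keyword-based method described above. The keyword lists are abbreviated examples, not the project's curated lists, and an article may match more than one category.

```python
# Keyword-based categorization (the "fast" method): an article is assigned
# every category whose keyword list matches its text.
KEYWORDS = {
    "Nuclear Energy": ["nuclear", "reactor", "uranium", "atomic"],
    "Data Centers": ["data center", "server farm", "cloud infrastructure"],
    "Healthcare": ["hospital", "public health", "treatment"],
}

def categorize(text):
    """Return every category with at least one keyword hit in the text."""
    text = text.lower()
    return [cat for cat, words in KEYWORDS.items()
            if any(w in text for w in words)]

print(categorize("A new reactor design promises cheaper nuclear power"))
# ['Nuclear Energy']
```

This is also where the misclassification noted under Limitations comes from: a passing mention of "hospital" in an otherwise unrelated article still triggers the Healthcare category.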
### 3. Sentiment Analysis

Emotional valence is measured using `cardiffnlp/twitter-roberta-base-sentiment-latest`:
- Title valence: Sentiment of headline (-1 to +1)
- Content valence: Sentiment of article body (-1 to +1)
- Overall valence: Weighted combination (30% title, 70% content)
Negative values indicate critical/pessimistic tone; positive values indicate optimistic/favorable tone.
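A sketch of how the scores above could be combined. It assumes the model's three class probabilities are collapsed to `p(positive) − p(negative)`; that mapping is a plausible convention, not one confirmed by the project.

```python
# Collapse a 3-class sentiment output into a single valence in [-1, 1],
# then apply the stated 30% title / 70% content weighting.

def valence(probs):
    """probs: dict of 'negative'/'neutral'/'positive' scores summing to 1."""
    return probs["positive"] - probs["negative"]

def overall_valence(title_probs, content_probs):
    return 0.3 * valence(title_probs) + 0.7 * valence(content_probs)

title = {"negative": 0.7, "neutral": 0.2, "positive": 0.1}    # critical headline
content = {"negative": 0.2, "neutral": 0.5, "positive": 0.3}  # milder body

print(round(overall_valence(title, content), 2))  # -0.11
```

Note how the 70% content weight pulls the overall score well above the headline's −0.6: a harsh headline on a mild article reads as only mildly negative overall.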
### 4. Author Profiling
For each author with 3+ articles per category:
- Average valence per category: Mean emotional tone in each topic
- Valence variance: Consistency within each category
- Cross-category variance: How much tone varies between topics
- Balance score: `1 / (1 + cross_category_variance × 10)`
Authors are ranked by:
- Balance: Highest balance scores (most consistent across topics)
- Polarity: Emotional spectrum within each category
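The balance score can be sketched directly from the formula above; perfectly consistent authors score 1.0 and the score falls as tone swings between topics. The sample valences are invented.

```python
# Balance score: cross-category variance fed through 1 / (1 + variance * 10).
from statistics import pvariance

def balance_score(category_valences):
    """category_valences: mean valence per category for one author."""
    return 1 / (1 + pvariance(category_valences) * 10)

consistent = [0.10, 0.12, 0.08]   # similar tone across topics
polarized  = [0.80, -0.70, 0.10]  # tone swings sharply by topic

print(round(balance_score(consistent), 3))  # 0.997
print(round(balance_score(polarized), 3))   # 0.21
```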
### 5. Complementary Pairing
The system identifies author pairs with opposing valences in the same category:
- Complementarity score: Measures how well two authors "cancel out"
- Higher when valences are opposite and similar in magnitude
- Ideal for "rationalized" reading—consume both perspectives
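One plausible scoring function consistent with the description above; the exact formula the project uses is not shown here, so treat this as an assumption. It rewards pairs whose valences nearly cancel and penalizes pairs where little tone actually offsets.

```python
# Complementarity: high when two authors' valences are opposite in sign
# and similar in magnitude, zero when they lean the same way.
def complementarity(v1, v2):
    if v1 * v2 >= 0:                   # same-signed tones don't cancel
        return 0.0
    magnitude = min(abs(v1), abs(v2))  # how much tone actually cancels
    mismatch = abs(v1 + v2)            # leftover imbalance after pairing
    return magnitude / (1 + mismatch)

print(round(complementarity(0.6, -0.6), 2))  # 0.6  (perfect cancellation)
print(round(complementarity(0.6, -0.1), 2))  # 0.07 (weak, lopsided pair)
```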
## Dashboard Views

### Landing Page
- Summary statistics and date range
- Overview of all topic categories
- Sample complementary author pairs
- Explanation of the conjugate principle
### Category Views
- Author ranking by emotional valence
- Top complementary pairs for balanced reading
- Sample articles from positive and negative extremes
### Author Profiles
- Valence scores across all categories
- Historical trend over time
- Most positive and negative articles
- Balance score and rank
### Balance Rankings
- All authors ranked by consistency across topics
- Most balanced vs. most polarized
- Comparative analysis
## Methodology & Limitations

### Known Limitations

- **Model Bias**: The sentiment model was trained on Twitter data, which may not generalize perfectly to formal news writing.
- **Sample Size**: Authors need multiple articles per category for meaningful metrics. Results based on fewer than 3 articles should be viewed skeptically.
- **Topic Misclassification**: Keyword-based categorization can incorrectly classify articles that merely mention a topic in passing.
- **Sentiment ≠ Bias**: Negative valence doesn't mean "wrong" or "biased." Critical coverage can be accurate; positive coverage can be misleading.
- **Missing Context**: This measures tone, not truthfulness, depth, sourcing quality, or argumentation strength.
### Falsifiability
To test if the system is working:
- Click through to individual articles and verify sentiment classifications make sense
- Check sample sizes; ignore authors with fewer than 3 articles in a category
- Look for high variance scores (low confidence in classification)
- Compare keyword vs. ML categorization methods
- Manually review complementary pairs—do they actually cover the same events?
### What This Is NOT
- ❌ A truth detector or fact-checker
- ❌ A measure of journalistic quality
- ❌ A political bias detector
- ❌ A replacement for critical thinking
### What This IS
- ✅ An experimental heuristic for exploring emotional framing
- ✅ A tool to surface opposing perspectives on topics
- ✅ A demonstration of sentiment analysis applied to news
- ✅ A starting point for thinking about media consumption patterns
## Deployment

See [DEPLOYMENT.md](DEPLOYMENT.md) for detailed instructions on deploying to:
- Render (recommended)
- Railway
- PythonAnywhere
Quick deploy to Render:
```bash
# Push to GitHub
git init
git add .
git commit -m "Initial commit"
git push -u origin main

# Deploy via Render dashboard (auto-detects render.yaml)
```
## Development

### Running Tests

```bash
uv run python manage.py test
```

### Database Migrations

```bash
# Create migrations after model changes
uv run python manage.py makemigrations

# Apply migrations
uv run python manage.py migrate
```
### Admin Interface

Access the Django admin at `http://localhost:8000/admin/` to:
- Browse all articles, authors, and categories
- Manually verify sentiment classifications
- Edit analysis run records
### Re-running Analysis

```bash
# Re-analyze existing data without fetching new articles
uv run python scripts/run_analysis.py --skip-collection

# Fetch fresh data
uv run python scripts/run_analysis.py --months 1
```
## Dependencies
Core libraries:
- Django 5.2+: Web framework
- pandas: Data processing
- transformers: Hugging Face ML models
- torch: PyTorch for model inference
- feedparser: RSS feed parsing
- beautifulsoup4: HTML parsing
- gunicorn: Production server
- whitenoise: Static file serving
## Contributing
This is a demonstration project. Potential improvements:
- Better categorization: Fine-tune a classifier on news article data
- Entity-level sentiment: Analyze sentiment toward specific entities/topics within articles
- Temporal analysis: Track how author sentiment changes over time or in response to events
- Source diversity: Add more news sources, especially international and niche publications
- User feedback: Allow users to flag incorrect classifications
- Alternative metrics: Explore other measures of balance beyond variance
## License
MIT
## Acknowledgments

- Sentiment model: `cardiffnlp/twitter-roberta-base-sentiment-latest`
- Zero-shot classification: `facebook/bart-large-mnli`
- News sources: BBC, Reuters, The Guardian, NPR, Al Jazeera, The Hill, Axios, TechCrunch
## Citation

If you use this project in academic work, please cite:

```bibtex
@software{news_rationalizer,
  title  = {News Rationalizer: Emotional Valence Analysis in Journalism},
  author = {Michael Frank Martin},
  year   = {2025},
  url    = {https://github.com/riemannzeta/news-rationalizer}
}
```
## Contact
For questions, issues, or feedback, please open an issue on GitHub.
**Disclaimer**: This is an experimental tool for educational and research purposes. Results should be interpreted with caution and skepticism. Always verify sentiment classifications and consider multiple sources when evaluating news coverage.