Show HN: System that rediscovers physics laws from raw data autonomously

1 point by strujillo 3 months ago · 2 comments · 1 min read


ProtoScience is a deterministic pipeline that takes raw numerical data and autonomously discovers governing equations.

It does not use LLMs for discovery — only sparse regression, power-law fitting, and statistical validation.

Results so far:

- Kepler's Third Law (P² = a³ / M) from 3,519 NASA exoplanets: R² = 0.998
- Sun’s ~27-day rotation period from solar wind plasma data: 93% accuracy
- Power law T ~ v^3.40 in solar wind (NOAA/NASA spacecraft data)
- 5/5 General Relativity predictions from simulated black hole observables: all R² = 1.000
- Chirp mass relationship from 219 LIGO gravitational wave events: R² = 0.998
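
For readers wondering what power-law fitting looks like for something like the Kepler entry above, here is a minimal sketch. It is not taken from the ProtoScience repo: `fit_power_law` is a hypothetical helper, and the data are synthetic (semi-major axes in AU, periods in years for a 1-solar-mass star), not the NASA exoplanet catalog. It fits y = c * x^k by ordinary least squares in log-log space.

```python
import math

def fit_power_law(x, y):
    """Fit y = c * x**k by least squares on (log x, log y). Needs x, y > 0."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    k = sxy / sxx                      # exponent = slope in log-log space
    log_c = my - k * mx                # log of the prefactor = intercept
    ss_res = sum((b - (log_c + k * a)) ** 2 for a, b in zip(lx, ly))
    ss_tot = sum((b - my) ** 2 for b in ly)
    r2 = 1.0 - ss_res / ss_tot         # R² measured in log-log space
    return math.exp(log_c), k, r2

# Synthetic data obeying P² = a³ / M with M = 1, i.e. P = a**1.5
a = [0.39, 0.72, 1.0, 1.52, 5.2, 9.58]
P = [v ** 1.5 for v in a]

c, k, r2 = fit_power_law(a, P)
print(f"P ≈ {c:.3f} * a^{k:.3f}, R² = {r2:.4f}")
```

On noiseless synthetic input the exponent comes back as 3/2, which is the same statement as P² ∝ a³. Real catalog data would add measurement scatter and a stellar-mass term.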

It also detects when no meaningful law exists — Bitcoin daily prices returned R² = 0.00.
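
One way such a "no law here" check could work, sketched under assumptions: fit the candidate relation, compute R², and reject anything below a cutoff. The post does not say how ProtoScience decides this, so the threshold `R2_THRESHOLD` and the helper names are hypothetical, and the data are seeded random numbers rather than Bitcoin prices.

```python
import random

def r_squared_linear(x, y):
    """R² of the ordinary least-squares line y = intercept + slope * x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot

R2_THRESHOLD = 0.9  # hypothetical acceptance cutoff, not from the post

def has_law(x, y, threshold=R2_THRESHOLD):
    """Accept a fitted relation only if it explains most of the variance."""
    return r_squared_linear(x, y) >= threshold

rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(2000)]
noise = [rng.gauss(0, 1) for _ in range(2000)]        # unrelated to x
law = [3.0 * v + 0.1 * e for v, e in zip(x, noise)]   # y = 3x plus small noise

print("structured data accepted:", has_law(x, law))
print("pure noise accepted:", has_law(x, noise))
```

The structured series clears the cutoff and the pure-noise series does not, so a pipeline built this way reports "nothing found" instead of forcing a fit.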

Pipeline:

raw data → feature extraction → candidate law generation → fitting → verification
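
A toy version of the candidate → fitting → verification stages, to make the flow concrete. This is a sketch under assumptions, not the repo's code: the three candidate forms, the even/odd holdout split, and every name here (`CANDIDATES`, `discover`, etc.) are made up for illustration, and feature extraction is skipped.

```python
import math

def identity(t):
    return t

# Candidate law generation: each form is linear after transforming x and y.
# name -> (transform of x, transform of y, inverse transform of y)
CANDIDATES = {
    "linear": (identity, identity, identity),       # y = b + m*x
    "power_law": (math.log, math.log, math.exp),    # y = c * x**k
    "exponential": (identity, math.log, math.exp),  # y = c * exp(k*x)
}

def ols(u, v):
    """Least-squares slope and intercept of v against u."""
    n = len(u)
    mu = sum(u) / n
    mv = sum(v) / n
    slope = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / sum((a - mu) ** 2 for a in u)
    return slope, mv - slope * mu

def r_squared(y, y_hat):
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def discover(x, y):
    """Fit every candidate on half the points, score it on the other half."""
    x_fit, y_fit = x[0::2], y[0::2]   # fitting set
    x_val, y_val = x[1::2], y[1::2]   # held-out verification set
    results = []
    for name, (fx, fy, inv) in CANDIDATES.items():
        slope, intercept = ols([fx(v) for v in x_fit], [fy(v) for v in y_fit])
        y_hat = [inv(intercept + slope * fx(v)) for v in x_val]
        results.append((r_squared(y_val, y_hat), name, slope, intercept))
    return max(results)               # best held-out R² wins

# Synthetic positive data following y = 2 * x**1.5
x = [float(i) for i in range(1, 21)]
y = [2.0 * v ** 1.5 for v in x]

score, name, slope, intercept = discover(x, y)
print(name, round(slope, 3), round(score, 4))
```

Scoring on points that were not used for fitting is the "verification" part: a candidate that only fits by coincidence tends to lose there. The log transforms mean this sketch only handles strictly positive data.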

An LLM (Claude) is only used at the end to interpret results in natural language — it is never involved in the discovery step.

All experiments are fully reproducible.

Code: https://github.com/SaulVanCode/protoscience-nasa-experiments

unhappychoice 3 months ago

The BH series is wild. Rediscovering GR relations from observables alone with R²=1.000. No physics priors, just raw data. That's the part that got me.

strujillo (OP) 3 months ago

Exactly — that was the surprising part for me too.
The system has no notion of “physics” at all — it’s just searching for compressible structure in the data.
The fact that GR relations emerge from observables suggests that a lot of what we call “laws” might just be the simplest compressions of measurement space.
Still early, but curious how far this goes.
