Towards Data Science

  • How does decision-gravity dictate this gap?

  • We have the document clusters, and it’s time to unlock their true potential! Let’s explore…

Latest

  • Learn about function approximation and the different choices for approximation functions

  • A local, zero-cost project that cleans, structures, and summarizes your reading automatically

  • Learn how to get the most out of Claude Code

  • More variables don’t make a better scoring model. Stable variables do. Here’s how to find them.

  • A practical pipeline for classifying messy free-text data into meaningful categories using a locally hosted…

  • Mario asked me why 18% of his shipments were late when every team hit their…

  • The silent gaps in synthetic data that only show up when your model is already…

  • It’s simpler than you think.

  • Turning free-to-use data into a hypothesis-ready dataset

Editor’s Picks

  • A short intro to scientific methodology to combat “prompt in, slop out”

  • And what does it tell us?

  • Why it tickles your brain to use an LLM, and what that means for the…

  • Git worktrees, parallel agentic coding sessions, and the setup tax you should be aware of

  • How I turned my eight-year weekly visualization habit into a reusable AI workflow

  • Architectures, pitfalls, and patterns that work

  • Inside MareNostrum V: SLURM schedulers, fat-tree topologies, and scaling pipelines across 8,000 nodes in a…

  • The upstream decision no model or LLM can fix once you get it wrong

  • Machine learning models can be confident even when they shouldn’t be. This article introduces Deep…

The Variable Newsletter

  • Authors can now benefit from updated earning tiers and a higher article cap

  • Sorting through the good, bad, and ambiguous aspects of vibe coding

Deep Dives

  • Learn how Propensity Score Matching uncovers true causality in observational data. By finding “statistical twins,”…

  • How you can build your own Thompson Sampling Algorithm object in Python and apply it…

  • For any data scientist who works in a team, being able to undo Git actions…

  • The hidden cost of probabilistic outputs in systems that demand reliability

  • Conceptual overview and practical guidance

  • Generating Minecraft Worlds with Vector Quantized Variational Autoencoders (VQ-VAE) and Transformers