
Bio
I’m an entrepreneur and open source software developer focusing on analytical computing. I’m currently a Principal Architect at Posit PBC.
I co-founded Voltron Data and now serve as its Chief Scientist. I created or co-created the pandas, Apache Arrow, and Ibis projects. I am a Member of The ASF and I have authored three editions of Python for Data Analysis.
In the past, I was with Ursa Computing, Ursa Labs (with Posit's help), Two Sigma Investments, Cloudera, DataPad, and AQR Capital Management. I received my S.B. in mathematics from MIT in 2007. I grew up in Knoxville, TN and Akron, OH, and currently live in Nashville, TN.
Projects
- Python for Data Analysis: Book authored for O’Reilly Media. Three editions (2012, 2017, 2022) and translations into many languages.
- Apache Arrow: I am a co-creator and PMC member, focusing on the C++ and Python implementations. I helped design the core Arrow format and helped start the Flight, Nanoarrow, and ADBC subprojects.
- pandas: Python data analytics toolkit. I created the project in 2008 and turned it over to its open source community in 2013. I remain the “Benevolent Dictator for Life”, but this is mostly a ceremonial role.
- Ibis: A Python DSL toolkit bringing together the best ideas of data frames and SQL. Created in 2015 at Cloudera.
- Apache Parquet: I am a PMC member and a principal author of the C++ implementation and Python bindings.
Recent Talks & Media
Podcast (Co-host) The Test Set (Posit) (Remote, Dec 2025)
Marco Gorelli: Narwhals, ecosystem glue, and the value of boring work
Episode
Podcast (Co-host) The Test Set (Posit) (Remote, Dec 2025)
Kelly Bodwin: Quarto hacks, AI in the classroom, and why R should stay weird
Episode
Podcast Tech on the Rocks (Remote, Dec 2025)
From pandas to Arrow: The Future of Data Infrastructure
Transcript · Episode
Podcast (Co-host) The Test Set (Posit) (Remote, Nov 2025)
James Blair Part 2: Solutions engineering, critical thinking, and staying human
Episode
Podcast (Co-host) The Test Set (Posit) (Remote, Nov 2025)
James Blair Part 1: Portfolios, practice, and staying curious
Episode
Podcast (Co-host) The Test Set (Posit) (Remote, Oct 2025)
Julia Silge Part 2: Glue work, licensing, and open source in the age of LLMs
Episode
Talk EARL UK 2025 (Brighton, UK, Oct 2025)
Building Data Science Tools in an AI-Native World
Transcript · Slides
Podcast (Co-host) The Test Set (Posit) (Remote, Oct 2025)
Julia Silge Part 1: Positron, pineapple pizza, and the art of iteration
Episode
Podcast (Co-host) The Test Set (Posit) (Remote, Sep 2025)
Michael Chow: From psychology and Python to constrained creativity
Episode
Talk Small Data SF (San Francisco, CA, Sep 2025)
Retooling for a Smaller Data Era