Laurent Luce's Blog

Towns unemployment, sunshine and housing prices relationship

Regression analysis can be used to search for relationships among observation variables. I wanted to find out if unemployment rate and housing prices are related. I also wanted to find out if the number of yearly sunshine hours and housing prices are related. I found two data files on the French government OpenData platform to... » read more
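
As a rough illustration of the kind of regression the excerpt above refers to, here is a minimal ordinary least-squares fit of a straight line in plain Python. The function name `fit_line` and the data points are assumptions made up for this sketch (the points lie exactly on y = 2x + 1); they are not the French OpenData figures and not the method used in the post.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of the x/y cross-deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic points, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
```

With real data the points would not lie on a line, and the quality of the fit would have to be checked before concluding that two variables are related.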

Python deque implementation

Python deque is a double-ended queue. You can append to both ends and pop from both ends. The complexity of those operations amortizes to constant time. We are going to look at the Python 3 internal implementation of deques. It uses a linked list of blocks of 64 pointers to objects. This reduces memory overhead... » read more
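
The operations named in the excerpt above can be tried with `collections.deque` from the standard library. This is only a usage sketch; the values are arbitrary and it does not show the internal block structure.

```python
from collections import deque

d = deque([2, 3])
d.append(4)          # add an element at the right end
d.appendleft(1)      # add an element at the left end
right = d.pop()      # remove and return the rightmost element
left = d.popleft()   # remove and return the leftmost element
remaining = list(d)  # elements still in the deque, left to right
```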

Least frequently used cache eviction scheme with complexity O(1) in Python

This post describes a Python implementation of a “Least Frequently Used” (LFU) cache eviction scheme with O(1) complexity. The algorithm is described in a paper written by Prof. Ketan Shah, Anirban Mitra and Dhruv Matani. The naming in the implementation follows the naming in the paper. The LFU cache eviction scheme is useful for... » read more
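
To give an idea of how constant-time LFU eviction can work, here is a simplified sketch. It is not the linked-list structure from the paper and does not follow the paper's naming: it swaps in one `OrderedDict` per access count plus a running minimum count, which is an assumption of this example. The class name `LFUCache` and its attributes are hypothetical.

```python
from collections import OrderedDict, defaultdict


class LFUCache:
    """Simplified LFU cache; get and put run in O(1) on average."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.values = {}                         # key -> cached value
        self.freq_of = {}                        # key -> access count
        self.buckets = defaultdict(OrderedDict)  # access count -> keys, oldest first
        self.min_freq = 0                        # smallest count currently in use

    def _touch(self, key):
        # Move the key from its current count bucket to the next one.
        freq = self.freq_of[key]
        del self.buckets[freq][key]
        if not self.buckets[freq]:
            del self.buckets[freq]
            if self.min_freq == freq:
                self.min_freq = freq + 1
        self.freq_of[key] = freq + 1
        self.buckets[freq + 1][key] = None

    def get(self, key):
        if key not in self.values:
            return None
        self._touch(key)
        return self.values[key]

    def put(self, key, value):
        if self.capacity <= 0:
            return
        if key in self.values:
            self.values[key] = value
            self._touch(key)
            return
        if len(self.values) >= self.capacity:
            # Evict the oldest key among the least frequently used ones.
            evicted, _ = self.buckets[self.min_freq].popitem(last=False)
            if not self.buckets[self.min_freq]:
                del self.buckets[self.min_freq]
            del self.values[evicted]
            del self.freq_of[evicted]
        self.values[key] = value
        self.freq_of[key] = 1
        self.buckets[1][key] = None
        self.min_freq = 1


cache = LFUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" has now been used more often than "b"
cache.put("c", 3)  # the cache is full, so "b" is evicted
```

Ties between keys with the same access count are broken here by evicting the oldest one, which is a design choice of this sketch.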

Cambridge city geospatial statistics

Using the Cambridge (Massachusetts) GIS data, we can compute some interesting geospatial statistics. Subway stations: Using the address points (20844 points) and the subway stations data files, we can find out how many Cambridge addresses are within a certain radius of a Cambridge subway station. More particularly, we want to find the percentage of Cambridge... » read more

Massachusetts Census 2010 Towns maps and statistics using Python

Using the Massachusetts Census 2010 Towns data source, we generated the following maps and statistics. The software used relies on technologies such as Python and PostGIS. Note: The data source has been generated and published by MassGIS. Population change: We can see some towns in Cape Cod and Western Massachusetts not growing as fast (few... » read more

Python, Twitter statistics and the 2012 French presidential election

This post describes how Pytolab was designed to process tweets related to the 2012 French presidential election in real time. This post also goes over some of the statistics computed over a period of nine months. Note: I presented this project at EuroSciPy 2012. Contents: Architecture, Statistics. Architecture: The posts are received from the Twitter streaming API... » read more

Twitter sentiment analysis using Python and NLTK

This post describes the implementation of sentiment analysis of tweets using Python and the natural language toolkit NLTK. The post also describes the internals of NLTK related to this implementation. Background: The purpose of the implementation is to be able to automatically classify a tweet as positive or negative in sentiment. The classifier... » read more

Python dictionary implementation

This post describes how dictionaries are implemented in the Python language. Dictionaries are indexed by keys and they can be seen as associative arrays. We add 3 key/value pairs to a dictionary and then access the values by key. Accessing a key that does not exist, such as ‘d’, raises a KeyError exception. Hash tables: Python dictionaries... » read more
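
The dictionary operations mentioned in the excerpt above can be sketched as follows. The keys ‘a’, ‘b’ and ‘c’ and their values are assumptions chosen for illustration; only the missing key ‘d’ comes from the excerpt.

```python
d = {}
d['a'] = 1
d['b'] = 2
d['c'] = 3

# Values are looked up by key.
value_a = d['a']

# Looking up a key that was never inserted raises KeyError.
try:
    d['d']
except KeyError as exc:
    missing_key = exc.args[0]
```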

Python string objects implementation

This article describes how string objects are managed by Python internally and how string search is done. Contents: PyStringObject structure, New string object, Sharing string objects, String search. PyStringObject structure: A string object in Python is represented internally by the structure PyStringObject. “ob_shash” is the hash of the string if calculated. “ob_sval” contains the string of... » read more

Python integer objects implementation

This article describes how integer objects are managed by Python internally. An integer object in Python is represented internally by the structure PyIntObject. Its value is an attribute of type long. To avoid allocating a new integer object each time one is needed, Python allocates a block of free unused integer objects... » read more