It is now 2025, and you rarely see people online discussing whether Python dictionaries are ordered. Since Python 3.7, released in 2018, formally made insertion order part of the language spec, people have gradually gotten used to ordered dictionaries. The old unruly, unordered dict is as much a thing of the past as Python 2.7, mentioned mostly when old-timers get nostalgic.
Back when dictionaries did not preserve order, what did we use if we needed an ordered dictionary? The answer was collections.OrderedDict.
Now that the built-in dictionary is ordered, OrderedDict seems far less necessary. Even so, as of Python 3.14, it is still in the standard library module collections, mainly for these reasons:
- Backward compatibility: existing code that depends on it can keep working unchanged.
- Different behavior: OrderedDict considers key order when testing equality; the built-in dict does not.
- Extra features: OrderedDict provides methods such as move_to_end.
>>> from collections import OrderedDict
>>> d = OrderedDict([('a', 1), ('b', 2), ('c', 3)])
>>> d
OrderedDict([('a', 1), ('b', 2), ('c', 3)])
>>> d.move_to_end('a')
>>> d
OrderedDict([('b', 2), ('c', 3), ('a', 1)]) # 1
1. move_to_end() moves a key to the end of the dictionary.
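Both differences are easy to check in an interpreter. The following is a small sketch using only the public collections.OrderedDict API; the variable names are illustrative.

```python
from collections import OrderedDict

a = OrderedDict([("x", 1), ("y", 2)])
b = OrderedDict([("y", 2), ("x", 1)])

# Two OrderedDicts with the same items in a different order are not equal.
same_as_ordered = (a == b)            # False
# Plain dicts ignore order when comparing.
same_as_plain = (dict(a) == dict(b))  # True
# Comparing an OrderedDict with a plain dict is also order-insensitive.
mixed = (a == dict(b))                # True

# move_to_end() can also move a key to the front with last=False.
a.move_to_end("y", last=False)
front_moved = list(a)                 # ['y', 'x']

# popitem(last=False) removes and returns the first (oldest) pair.
oldest = a.popitem(last=False)        # ('y', 2)
```

Note that the order-sensitive comparison only applies when both operands are OrderedDict instances.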
This article looks inside OrderedDict to see what it takes to implement an ordered dictionary in Python.
Note: the standard library OrderedDict has both C and Python implementations for different runtime environments. Their designs are similar. This article focuses on the Python implementation.
A Doubly Linked List and Another Dictionary
OrderedDict is an ordered dictionary. Like a regular dictionary, it supports key-value operations, but it also preserves key order. Two ideas make this work:
- It inherits from dict, so it automatically gets all built-in dictionary operations. All key-value pairs are stored on the OrderedDict object itself. In other words, self is already a {}.
- It adds an extra ordered data structure that acts as external bookkeeping for key order.
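The first point can be verified from the outside. This short sketch calls the unbound dict methods directly on an OrderedDict instance, which works because the key-value pairs live in the object itself; the variable names are illustrative.

```python
from collections import OrderedDict

od = OrderedDict(a=1, b=2)

# An OrderedDict is a dict, so isinstance checks against dict succeed.
is_dict = isinstance(od, dict)              # True

# The plain dict methods can operate on it directly, because the
# key-value pairs are stored in the object itself.
value_via_dict = dict.__getitem__(od, "a")  # 1
size_via_dict = dict.__len__(od)            # 2
```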
Many data structures could be used to preserve order, but which one fits best? A dictionary is a high-performance hash table. Its strength is reading and writing key-value pairs in O(1) time. So whatever extra structure OrderedDict uses to store key order must meet the same performance bar: maintaining order must not slow down the dictionary's core operations.
To achieve that, OrderedDict uses two data structures at the same time: a doubly linked list and another dictionary.
- Doubly linked list: the ordered structure. Given a node, members can be inserted into or removed from the list in O(1) time. Each node stores an OrderedDict key.
- Another dictionary: looking up a node in a linked list normally requires scanning the list in order, which takes O(n) time on average. That is too slow, so OrderedDict adds another dictionary as an index. Given a key, it can retrieve the corresponding linked-list node in O(1) time.
The full structure looks like this:
Figure: Internal structure of `OrderedDict`, with three core data structures: `self` (the dictionary that stores key-value pairs), `self.root...` (the ordered doubly linked list), and `self._map` (the dictionary that indexes linked-list nodes)
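To see why this combination keeps every operation at O(1), here is a deliberately simplified sketch of the order-tracking part alone. The names KeyOrder, _Node, and _index are made up for illustration and are not the standard library's. The sketch also uses ordinary strong references everywhere, unlike the real implementation discussed later.

```python
class _Node:
    """One linked-list node holding a single key (hypothetical helper)."""
    __slots__ = ("prev", "next", "key")


class KeyOrder:
    """Toy order tracker: circular doubly linked list plus a key -> node index."""

    def __init__(self):
        # The sentinel root points at itself while the list is empty.
        self._root = root = _Node()
        root.prev = root.next = root
        root.key = None
        self._index = {}  # key -> node, so no list scan is ever needed

    def append(self, key):
        # O(1): link the new node in just before the sentinel (the tail).
        node = _Node()
        last = self._root.prev
        node.prev, node.next, node.key = last, self._root, key
        last.next = node
        self._root.prev = node
        self._index[key] = node

    def remove(self, key):
        # O(1): find the node through the index, then unlink it.
        node = self._index.pop(key)
        node.prev.next = node.next
        node.next.prev = node.prev

    def __iter__(self):
        node = self._root.next
        while node is not self._root:
            yield node.key
            node = node.next


order = KeyOrder()
for key in ("a", "b", "c"):
    order.append(key)
order.remove("b")          # no scan: the index hands us the node directly
remaining = list(order)    # ['a', 'c']
```

Without the index, remove() would first have to walk the list to find the node for "b", which is the O(n) step the second dictionary eliminates.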
Using __setitem__ as an example, here is how OrderedDict writes a key-value pair:
def __setitem__(self, key, value,
                dict_setitem=dict.__setitem__, proxy=_proxy, Link=_Link):
    'od.__setitem__(i, y) <==> od[i]=y'
    if key not in self:
        self.__map[key] = link = Link()  # 1
        root = self.__root
        last = root.prev
        link.prev, link.next, link.key = last, root, key  # 2
        last.next = link
        root.prev = proxy(link)  # 3
    dict_setitem(self, key, value)  # 4
1. Create a new linked-list node and store it in self.__map, so the node can later be retrieved quickly by key.
2. Update the new node link so it points to its previous and next neighbors, inserting it right before root, which means appending it to the tail of the list.
3. Update the other two affected nodes, last (the old tail) and root (the sentinel root). That completes the linked-list update.
4. Update the key-value pair in the dictionary itself.
Suppose we run d["aa"] = 4 to insert a new member. The full data structure changes like this:
Figure: Internal changes in `OrderedDict` after inserting the key-value pair `"aa": 4`
The doubly linked list, the dictionary that indexes list nodes, and the OrderedDict object itself all need to process the new member "aa": 4.
Like __setitem__(), methods such as __delitem__() (delete a member) and pop() (remove and return a member) must update not only the dictionary itself, but also the linked list and the index dictionary for the affected key. I will not repeat those details here.
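The effect of these methods is visible from the outside. Because __setitem__() only creates a linked-list node when the key is new, assigning to an existing key leaves its position alone, while deleting and re-inserting a key sends it to the end. A short sketch with the public API (variable names are illustrative):

```python
from collections import OrderedDict

d = OrderedDict([("a", 1), ("b", 2), ("c", 3)])

# Existing key: only the value changes, no new list node is created.
d["a"] = 10
after_update = list(d.items())   # [('a', 10), ('b', 2), ('c', 3)]

# Delete and re-insert: the old node is unlinked, a new one joins the tail.
del d["a"]
d["a"] = 1
after_reinsert = list(d)         # ['b', 'c', 'a']

# pop() removes the pair and the matching list node in one call.
popped = d.pop("b")              # 2
after_pop = list(d)              # ['c', 'a']
```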
To make OrderedDict return keys in order during iteration, __iter__ also needs custom logic:
def __iter__(self):
    'od.__iter__() <==> iter(od)'
    root = self.__root
    curr = root.next
    while curr is not root:
        yield curr.key
        curr = curr.next
As you can see, iterating over an OrderedDict is really just iterating over its internal doubly linked list. A while loop walks through the nodes and yields each key through a generator, preserving order.
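Because the list is doubly linked, the same walk can run in the opposite direction by following prev instead of next. From the outside this shows up as support for reversed(), as in this small sketch:

```python
from collections import OrderedDict

d = OrderedDict([("a", 1), ("b", 2), ("c", 3)])

forward = list(d)             # ['a', 'b', 'c']
backward = list(reversed(d))  # ['c', 'b', 'a']
```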
Summary
By adding extra data structures, OrderedDict achieves ordering. The combination of a doubly linked list and an index dictionary minimizes the cost of maintaining order during reads and writes. It does use extra memory, but it still preserves good access performance.
Interesting Details
While reading the OrderedDict implementation, I noticed a few interesting details.
1. Using weakref
Python's garbage collection relies mainly on reference counting. The algorithm is simple and efficient, but it does not handle reference cycles well. Consider this example. When inserting a new node at the tail of a doubly linked list, you need to:
- Set the new node's next pointer to the root node (link.next = root)
- Set the root node's previous pointer to the new node (root.prev = link)
That creates a reference cycle between link and root. Each object keeps the other's reference count above zero, so reference counting alone can never free them. They have to wait for the cyclic garbage collector, which delays reclamation.
To avoid that, OrderedDict uses the weakref module in situations like this:
link.prev, link.next, link.key = last, root, key # 1
last.next = link
root.prev = proxy(link) # 2
1. link and root first create a reference in one direction through link.next.
2. root then creates the reverse reference through root.prev, but this time link is wrapped by proxy(...), where proxy comes from the weakref module.
Once an object is wrapped by weakref, referencing it does not increase its reference count. That prevents a strong reference cycle and lets the GC reclaim memory sooner.
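The behavior of a weak proxy can be observed in isolation. In this sketch, Link is a hypothetical stand-in for the node class, not the library's own. It assumes CPython, where an object is freed as soon as its last strong reference disappears; other interpreters may reclaim it later.

```python
import weakref


class Link:
    # Hypothetical stand-in for a list node. With __slots__, the
    # '__weakref__' slot is required for instances to be weakly referenceable.
    __slots__ = ("prev", "next", "key", "__weakref__")


link = Link()
link.key = "a"

# The proxy forwards attribute access but does not keep the node alive.
weak = weakref.proxy(link)
key_via_proxy = weak.key  # 'a'

# Drop the only strong reference; on CPython the node is freed immediately.
del link

try:
    weak.key
    expired = False
except ReferenceError:
    # Accessing a proxy whose target is gone raises ReferenceError.
    expired = True
```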
2. Passing object() as the Default Value
Like the built-in dictionary, OrderedDict also supports pop. The pop method removes a value by key and returns it. If the key does not exist, it returns the default value passed by the caller.
>>> d = {"a": 1}
>>> d.pop("a", 42)
1
>>> d.pop("c", 42)
42 # "c" does not exist, so return the default value 42
For OrderedDict, pop needs to do two things: pop from the dictionary itself and update the doubly linked list. The core code looks like this:
class OrderedDict(dict):
    __marker = object()

    def pop(self, key, default=__marker):
        marker = self.__marker
        result = dict.pop(self, key, marker)
        if result is not marker:
            # The same as in __delitem__().
            # Linked-list update code omitted ...
Notice that dict.pop(self, key, marker) passes marker as the default value when the key does not exist. marker is not special. It is simply a fresh object() created when the class is defined.
Why use object() as the default? Because here the code must distinguish strictly between two cases based on the return value of pop(...): "the key exists" and "the key does not exist." A brand-new object() instance that can never appear in a user's dictionary is the ideal default value for that job.
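The same sentinel pattern is useful in everyday code, especially when None is a legitimate stored value and so cannot signal "missing". Here is a minimal sketch; the names _MISSING and pop_or_default are made up for illustration.

```python
_MISSING = object()  # unique sentinel; no caller can pass this by accident


def pop_or_default(mapping, key, default=_MISSING):
    """Pop key from mapping; raise KeyError only if no default was given."""
    result = mapping.pop(key, _MISSING)
    if result is not _MISSING:
        return result          # the key existed, even if its value is None
    if default is _MISSING:
        raise KeyError(key)    # key missing and caller gave no default
    return default


store = {"a": None}
first = pop_or_default(store, "a", 0)   # None: the key existed
second = pop_or_default(store, "a", 0)  # 0: the key is gone now
```

The checks use `is` rather than `==`, so even a stored value with an unusual __eq__ cannot be mistaken for the sentinel.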
Update: Changed the title from "Why is Python's OrderedDict ordered?" to "How Does Python’s OrderedDict Maintain Order?"