Tell HN: Python's print function is thread-unsafe
A few days ago I found that Python's print function is thread-unsafe by default, because the underlying sys.stdout is a TextIOWrapper, which is not thread-safe. That means that for the code below:
from threading import Thread

def target(x):
    for _ in range(100000):
        print(x)

Thread(target=target, args=['a' * 10]).start()
Thread(target=target, args=['b' * 10]).start()
Python could print out not only interleaved bytes (aaabbbabab), but also null bytes and uninitialized memory, thanks to non-synchronized buffering. A survey of some other languages:
C printf: safe (glibc documents it as "MT-Safe locale").
C++ std::cout: safe, unless you call sync_with_stdio(false).
JVM System.out.println: safe in common JVMs.
C# Console.WriteLine: safe.
Go fmt.Printf: safe.
Rust println!: safe.
Ruby puts: safe.
So it seems that Python is the outlier here.
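One common workaround is to serialize every call to print behind a single lock. The sketch below is only an illustration: safe_print and demo are hypothetical names, not standard-library functions, and the lock only helps if every thread that writes to the stream goes through the wrapper. The demo writes to an in-memory StringIO so that the result can be checked.

```python
import io
import threading

# Hypothetical helper: one module-wide lock shared by all callers.
_print_lock = threading.Lock()


def safe_print(*args, **kwargs):
    """Call print() while holding the lock, so whole lines stay intact."""
    with _print_lock:
        print(*args, **kwargs)


def demo(n_threads=4, n_lines=1000):
    """Write from several threads into one buffer and return its lines."""
    buf = io.StringIO()

    def target(text):
        for _ in range(n_lines):
            safe_print(text, file=buf)

    threads = [
        threading.Thread(target=target, args=(chr(ord("a") + i) * 10,))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return buf.getvalue().splitlines()
```

In the original example, replacing print(x) with safe_print(x) should be enough, assuming nothing else writes to sys.stdout without taking the same lock.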
std::cout is thread-safe, but because of its concatenation-based API (chained << calls), parts of a message may interleave. C++23 std::print is safe (and its output doesn't interleave).
I was able to reproduce your failure (using Python 3.10) after a couple of runs:
% grep -v '^aaaaaaaaaa$' kerneloops.out | grep -v '^bbbbbbbbbb$' | od -x
0000000 8130 0db3 0001 0000 bd30 0dad 0001 0000
*
0006420 000a
0006421
I was surprised, as I expected possible interleaved bytes but not output other than "a"s, "b"s and newlines; I thought the GIL would come to the rescue.
See this example:
https://superfastpython.com/thread-safe-print-in-python/
Python's logging module is thread-safe, if you want to use it for output.
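A minimal sketch of that approach, assuming a bare "%(message)s" format so the output looks like print. The names make_logger, run and "thread_output_demo" are made up for illustration. Each logging handler holds its own lock while it emits a record, so complete lines should not be split between threads.

```python
import io
import logging
import threading


def make_logger(stream):
    """Return a logger that writes bare messages, one per line, to `stream`."""
    logger = logging.getLogger("thread_output_demo")  # hypothetical name
    logger.setLevel(logging.INFO)
    logger.propagate = False   # keep records away from the root logger
    logger.handlers.clear()    # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    return logger


def run(logger, n_lines=1000):
    """Log the same two strings as the original example from two threads."""

    def target(text):
        for _ in range(n_lines):
            logger.info(text)

    threads = [
        threading.Thread(target=target, args=(text,))
        for text in ("a" * 10, "b" * 10)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

For console output, pass sys.stdout (or sys.stderr) to make_logger instead of an in-memory buffer.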
Why Threads Are a Bad Idea: https://news.ycombinator.com/item?id=17297325
The problem with threads: https://ieeexplore.ieee.org/document/1631937
Given that threads are here to stay, a thread-unsafe print is a really bad idea. I couldn’t even imagine that it could crash instead of just interleaving, in Python of all things.