Python Serialization Performance

kurtsp.com

27 points by kurtbuilds 10 years ago · 19 comments

viraptor 10 years ago

> Disable Schematics Validation ... There's a cool Python trick to create an object while skipping the __init__ function.

Yes, but now every single time you update the library that provides the base class, you need to re-verify that __init__ doesn't do anything new. It may be worth the tradeoff, but it really should be noted.
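
For context, the trick works roughly like this (a minimal sketch; Model and validate are hypothetical stand-ins, not the article's code):

    def validate(data):
        return data  # stand-in for whatever expensive validation __init__ does

    class Model:
        def __init__(self, data):
            self.data = validate(data)   # normal path: validate on every construction

        @classmethod
        def from_trusted(cls, data):
            obj = cls.__new__(cls)       # allocate the instance, skipping __init__
            obj.data = data              # ...and the validation along with it
            return obj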

Igglyboo 10 years ago

Seems like a good use case for protocol buffers [1].

[1] https://developers.google.com/protocol-buffers/?hl=en
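
A hedged sketch of what that might look like (assuming a hypothetical user.proto compiled with protoc --python_out=. into a user_pb2 module):

    from user_pb2 import User  # hypothetical generated module

    u = User(name="kurt", signup_date="2015-10-16")
    blob = u.SerializeToString()  # compact, typed wire format
    v = User.FromString(blob)     # a decode step is needed before inspecting fields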

  • kyzyl 10 years ago

    It might be that they consider it important to be able to interrogate the in-memory or on-disk representations without the help of a decoding step. Protobufs are great for getting objects into a nice compact format to throw on the wire, but god help you if you end up with multiple actors filling up a queue with inconsistently encoded objects.

melted 10 years ago

TL;DR: Things get faster if you disable validation and parse dates using more specialized code. Duh.

It's humorous when someone who presumably cares about performance tells you they use Python. Python is a wonderful language, but performance is not what it is designed for. Basically _anything_ that does not require an interpreter to run will be 10-30x faster on the same hardware, and most of it will also consume less RAM and use more than one core efficiently. It used to be that Python's lack of performance didn't matter because disks and networks were so slow things were IO bound. In more and more cases that's just not true anymore. You could easily be reading at 1 GB+/sec and pushing 10-20 Gbps to NICs, depending on the hardware.

  • viraptor 10 years ago

    It depends on what "using Python" means, though. Cython is pretty good at optimizing basic code. Numpy will process your matrices and vectors using specialised libraries, faster than most manual C approaches. Shedskin will give you a nice code framework which you can optimise in the parts that matter. (Insert other specialised examples.)
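
    To make the Numpy point concrete (a toy comparison, not a benchmark):

        import numpy as np

        xs = np.random.rand(10_000_000)

        slow = sum(x * x for x in xs)  # pure-Python loop through the interpreter
        fast = float(xs @ xs)          # one vectorized dot product in C/BLAS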

    CPython is slow as an interpreter, true. "Programming in Python" may or may not be many times slower than compiling comparable code in another language. It depends on what you're doing and how you're doing it.

    Also, I care about performance in any language to some extent. If I can write a backup bash script that takes 2h, or write one that takes 20min, I do care about performance and will choose the second one. Why shouldn't I?

    • melted 10 years ago

      Even Cython will be several times slower than a carefully tuned C/C++, Java, C# or Go program for most practical problems. And at least in the case of C/C++ it'll also likely use several times more RAM. Now for a company like Uber it may not matter if something is 3x slower and uses 3x the RAM, just throw more hardware at the problem, but if you're going to introduce typing into Python, you might as well go with a language where typing is not optional, and which, more importantly, has been used in production by thousands of teams over the past decade or more. Of the languages I listed, only C++ is not really viable for most programmers.

      As to caring about perf, you shouldn't care about it until you have to. Take that 2h vs 20min example, for instance. If you only need to run it a few times and there's plenty of time available, who cares how long it takes; if the 2h one is easier to write, that's by all means what you should do. OTOH, if you're under severe time constraints and need to run it every hour, then obviously the 2h script won't do the job. Or, alternatively, if the 20min script takes the same time to write as the 2h one, then of course you should go with it. All too often I see people optimizing things that don't matter one iota, simply because they like things to be fast. Something gets executed once a day and runs for 5 minutes? Let's spend two weeks making it complete in 30 seconds. As long as the employer is paying, why not.

      • williamstein 10 years ago

        Cython is the same speed as a carefully tuned C/C++ program; carefully tuned, Cython maps directly to C/C++. I've done many benchmarks of low-level numerical code while implementing SageMath (which uses Cython for several hundred thousand lines of code).
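
        For illustration, "carefully tuned" here mostly means adding C types so the hot loop compiles to plain C. A minimal sketch in Cython's pure-Python mode (runs unchanged under CPython if the cython package is installed):

            import cython

            def dot(xs: cython.double[:], ys: cython.double[:]) -> cython.double:
                total: cython.double = 0.0
                i: cython.Py_ssize_t
                for i in range(xs.shape[0]):
                    total += xs[i] * ys[i]  # with types, this compiles to a plain C loop
                return total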

        • melted 10 years ago

          As someone who has spent almost 20 years (on and off) writing C and C++, I don't believe it. C/C++ lets you go as close to hardware as you would possibly want. Want SIMD? Easy. Want custom memory allocation (a big deal if you allocate/deallocate a ton)? Sure, why not. Want to profile and optimize cache locality and memory layout? Knock yourself out. Memory alignment? Yup. Branch hints? Of course. I could continue with this, but as someone who has written performance-sensitive code you already know most if not all of this. It's not a coincidence that e.g. high performance linear algebra libs are written in C.

          And it baffles me that anyone would even consider writing a 100KLOC+ project in something as lax as Python. That's just asking for trouble.

          • BuckRogers 10 years ago

            > It used to be that Python's lack of performance didn't matter because disks and networks were so slow things were IO bound. In more and more cases that's just not true anymore.

            That's generally a valid statement, and it's why PyPy and Nuitka exist. Pyston and Pyjion are up-and-coming in this area.

            > And it baffles me that anyone would even consider writing a 100KLOC+ project in something as lax as Python. That's just asking for trouble.

            That's why Python 3.5 has type annotations. It increases the number of sane use cases for Python going forward.
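
            For example (PEP 484 annotations, checked by external tools such as mypy and ignored at runtime):

                from typing import List

                def total_size(chunks: List[bytes]) -> int:
                    return sum(len(c) for c in chunks)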

            Let me add that I'm generally with you, though. The DB/IO bottleneck myth needs to die. We do need more performance, on the order of Elixir's. CPU performance at that level or above removes a whole class of application issues. I'm not sure C's performance is needed, though, unless you're doing systems programming.

            I just happen to like Python as a language, so I'm happy to fly the flag for the various solutions that give it the performance it genuinely needs.

          • cyphar 10 years ago

            > As someone who has spent almost 20 years (on and off) writing C and C++, I don't believe it. C/C++ lets you go as close to hardware as you would possibly want. Want SIMD? Easy. Want custom memory allocation (a big deal if you allocate/deallocate a ton)? Sure, why not. Want to profile and optimize cache locality and memory layout? Knock yourself out. Memory alignment? Yup. Branch hints? Of course. I could continue with this, but as someone who has written performance-sensitive code you already know most if not all of this. It's not a coincidence that e.g. high performance linear algebra libs are written in C.

            Which you can use in Python. I'm a C guy myself, but this bullshit "do it all in C" attitude needs to be calmed down a bit. Python is a great language for many reasons, and you can make up for its downsides by using Python libraries that are implemented in C.

          • Lofkin 10 years ago

            Have you seen numba? It compiles numerical Python, approaching Fortran speeds.
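
            Roughly like this (a minimal sketch, assuming numba is installed):

                import numpy as np
                from numba import njit

                @njit
                def sum_squares(xs):
                    total = 0.0
                    for x in xs:  # this loop is JIT-compiled to machine code
                        total += x * x
                    return total

                sum_squares(np.random.rand(10**6))  # first call triggers compilation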

        • srean 10 years ago

          ...unless you have to call into Numpy or the Python C-API frequently in a hot loop. That's a bit of a bummer. I would rather write array indexing in Numpy notation than do that error-prone indexing by hand (in Cythonic C). I think there was something in the works to deal with this problem, but I'm not up to date on it.

    • kyzyl 10 years ago

      Is Shedskin still actively maintained? I haven't looked in a long time, but last I saw it hadn't gotten any updates in a couple of years (maybe I have corrupted memory though ;)

      Anyhow, the relatively new Nuitka project is aiming to tackle the Python-to-C++ compiler problem, and it seems to have a lot of promise: really good compatibility, apparently decent speedups, and cross-platform support. Works with Python 3 too. I have a lot of hope!

      • viraptor 10 years ago

        I don't think it is. But that doesn't mean it doesn't still work :) Either way, Nuitka is probably a more interesting target for new code.

  • dr_zoidberg 10 years ago

    On my first job we had to calculate a lot of MD5s for integrity checks after some downloads (it was completely useless, but part of the protocol). They used the so-called "fastest C-implemented MD5 program for Windows". I outperformed it by 20% with a Python script that used hashlib.md5() and read the file in 1 MiB chunks. Turns out the C program was reading in small (can't remember if it was 1, 2 or 4) KiB chunks, apparently an artifact from older times, while Python handled the disk reads a lot better. I was hailed as a hero by my coworkers for speeding up the checks with the "slow" programming language.
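
    The approach was essentially this (a reconstruction, not the original script):

        import hashlib

        def md5sum(path, chunk_size=1 << 20):  # 1 MiB chunks
            h = hashlib.md5()                  # the hashing itself runs in C
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(chunk_size), b""):
                    h.update(chunk)
            return h.hexdigest()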

    TL;DR: things get faster if you know the machine you're working on and how to use it best, regardless of the programming language.

  • cyphar 10 years ago

    When I'm doing data analysis for my research project, I use Python (even though I mostly program in C, Go and a handful of other languages) because all of the important performance stuff (fast Fourier transforms, literally all of numpy) is done in C with on-the-metal performance. Why should I have to worry about memory allocation when I'm trying to run a bunch of statistical tests on data? Not to mention that I can do endless monkeypatching in Python safely, while in C you'd be trying to modify code at runtime (which modern kernels don't like).

    Most performance problems come from bad algorithms, not your programming language. I had a piece of code that had to do some complicated "image" masking with a 2D array, with feathering and some quite complicated statistical modelling to compute the offset. It took 20 minutes to run on a medium-sized data set. After sitting down with it for a few hours, I got it down to 30 seconds. If I had switched to C, improving the performance would've taken far too long. Python runs in a fairly well-optimised VM anyway, so it's definitely "good enough".

    The source is on my GitHub, but it probably won't be useful to anyone: https://github.com/cyphar/keplerk2-halo.

  • collyw 10 years ago

    One of the most basic principles of my data structures and algorithms course years ago was to choose the correct algorithm. A language may run 10-30 times faster, but if you have a crap algorithm it's still going to be slow as you scale.
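
    A toy illustration of that principle (hypothetical, not from the thread):

        def intersect_slow(a, b):
            return [x for x in a if x in b]      # O(len(a) * len(b)): scans b per item

        def intersect_fast(a, b):
            b_set = set(b)
            return [x for x in a if x in b_set]  # O(len(a) + len(b)): hash lookups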

  • ju-st 10 years ago

    > You could easily be reading at 1 GB+/sec and pushing 10-20 Gbps to NICs, depending on the hardware.

    Python is easily able to push this much data, but I have the impression that the problem is the performance-hogging libraries. In my case I had to write my own HTTP client implementation for Python to get such speeds. Python is not the problem; you just need to avoid unnecessary LoCs.
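
    A hedged sketch of that idea (not ju-st's actual client): skip the high-level library and pull the response in large recv() calls, with no error handling or chunked-encoding support:

        import socket

        def fetch(host, path="/", port=80, bufsize=1 << 20):
            request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            with socket.create_connection((host, port)) as s:
                s.sendall(request.encode())
                chunks = []
                while True:
                    data = s.recv(bufsize)  # large reads keep per-call overhead low
                    if not data:
                        break
                    chunks.append(data)
            return b"".join(chunks)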
