Show HN: ZFS Implementation in Python

411 points by alcari 7 years ago · 96 comments

Note: this was implemented without referencing any ZFS source code and should not be subject to the CDDL.

josteink 7 years ago

So we port this Python to something not slow, and all the kernel-people can shut up about ZFS being terrible ;)
- aidenn0 7 years ago
  
  I can port from Python -> Common Lisp at a rate of ~100 LoC per hour, and that's a pretty friendly port (the yield expression is the only thing that can't just be done line-by-line; truthiness is the only real "gotcha", as there are a lot of "false" values in Python, but only 1.5 in Lisp[1]).
  1: I say 1.5 because there is only one false value, but it has 2 idiomatic meanings: nil (equivalent to python's None) and the empty list.
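To make the truthiness gotcha concrete, here is a minimal Python sketch. The list of values is standard Python behavior; the helper `is_missing` is a hypothetical name introduced only for illustration, showing how a port might separate "no value" from "empty".

```python
# Values that Python treats as false in a boolean context.
FALSY_VALUES = [None, False, 0, 0.0, 0j, "", b"", [], (), {}, set(), range(0)]


def is_missing(value):
    """Hypothetical helper: only None counts as 'missing', not every falsy value."""
    return value is None


# Every entry is falsy, so `if not x:` cannot tell them apart.
all_falsy = all(not v for v in FALSY_VALUES)

# An explicit identity check singles out None, the closest analogue of nil.
only_none_missing = [v for v in FALSY_VALUES if is_missing(v)]
```

When porting line by line, a bare `if not x:` in the Python source is the spot to check which of these values the original author actually meant.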
- j88439h84 7 years ago
  
  Don't forget, PyPy is fast.
  - cure 7 years ago
    
    PyPy is faster than Python, yes. But Go, C and many other (compiled) languages are way faster than PyPy. Plus, if you use a language like Go or Rust then you avoid Python's GIL and you'll have much more reasonable memory usage. Best of all, deploying is a matter of copying a binary, rather than having to deal with the absolute disaster that is Python packaging.
    
    deaddodo 7 years ago
    
    > Plus, if you use a language like Go or Rust then you avoid Python's GIL
    No, but then you run into Go's GC and green threads. File systems fit squarely in the realm of "systems programming" (old definition [1], not new). That means languages like Ada, Pascal, C/C++, Rust and D (without GC).
    [1] - https://en.wikipedia.org/wiki/System_programming_language
    
    klyrs 7 years ago
    
    Filesystem with a GIL, what could possibly go wrong? /s
    
    emmelaich 7 years ago
    
    A lot less will go wrong than a filesystem without a GIL.
    GIL is for safety and correctness, not speed.
    
    y4mi 7 years ago
    
    Uh, no?
    Python's global interpreter lock was added for single-threaded speed and C library integrations, which often can't be used multithreaded.
    There was some talk recently about removing it to improve Python's multithreaded performance, and Guido said something along the lines of
    > "I'll remove it as long as single threaded performance doesn't suffer"
    Which nobody has succeeded in doing.
    
    nine_k 7 years ago
    
    Go? A GC'd language in kernel? (Well, yes, this has been done, from Lua to Haskell, but only experimentally.)
    
    hu3 7 years ago
    
    I wouldn't advise writing low level stuff in Go but people do enjoy a challenge from time to time: https://news.ycombinator.com/item?id=18399389
    
    pjmlp 7 years ago
    
    I wouldn't consider the workstations sold by Xerox, TI, Connecting Machines, the OS research department at ETHZ or the Microsoft’s natural language search service for the West Coast and Asia, just experiments.
    
    weberc2 7 years ago
    
    Python is also a GC’d language...
    
    twa927 7 years ago
    
    CPython is mostly reference-counted.
    
    int_19h 7 years ago
    
    With a synchronous garbage collector for cycles. Which is like the worst of both worlds, since you get the constant overhead of refcounting, plus unpredictable interruptions of unspecified duration that can happen every time a new object that might contain references to other objects is created.
    To be fair, the GC can be disabled. But it's only safe to do so when you know there are no cycles, and even when such guarantee can be had for your own code, I've never seen a library guarantee that to API clients.
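A small sketch of the behavior described above, assuming CPython: reference counting alone never frees a cycle, so with the cycle collector disabled the objects stay alive until `gc.collect()` is called explicitly. The `Node` class and `make_cycle` function are hypothetical names used only for this illustration.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point at another Node."""

    def __init__(self):
        self.other = None


def make_cycle():
    # Two objects that reference each other; once this function returns,
    # nothing outside the cycle holds a strong reference to either one.
    a, b = Node(), Node()
    a.other, b.other = b, a
    return weakref.ref(a)


gc.disable()  # turn off the automatic cycle collector
ref = make_cycle()
alive_before = ref() is not None  # refcounts never reach zero, so the cycle survives

gc.collect()  # an explicit collection still works while automatic GC is disabled
alive_after = ref() is not None

gc.enable()
```

This is why disabling the collector is only safe when nothing in the process, including third-party libraries, creates cycles.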
    
    BlackFingolfin 7 years ago
    
    And reference counting is a form of GC
    
    fnord123 7 years ago
    
    >PyPy is faster than Python, yes.
    Python is only slow if you use it wrong:
    https://apenwarr.ca/diary/2011-10-pycodeconf-apenwarr.pdf
  - GuB-42 7 years ago
    
    Maybe plenty fast for most applications but a filesystem is not one of these IMHO, especially for something as naturally resource hungry as ZFS.
    A good filesystem implementation requires tight memory management and good control of what happens at the OS level. I am not saying it can't be done in Python, but it clearly isn't the right tool for the job.
    I meant that for a production implementation. Python is perfectly fine for a proof of concept, in fact, it may be better than jumping straight down to C. But keeping it for production is foolish IMHO.
  - spullara 7 years ago
    
    Faster than CPython doesn't mean it is fast.
    
    twa927 7 years ago
    
    I was trying to speed up a log processing service running on PyPy by rewriting it in Java. I was surprised that the result was about twice as slow (I know Java quite well and I didn't see obvious optimizations; most of the time was spent in GC). So it can be quite fast even in more absolute terms (VM languages), at least for some types of code.
    
    spullara 7 years ago
    
    If more than 1% of your time is in GC you are doing something very wrong.
    
    michaelmrose 7 years ago
    
    The fact that a singular implementation was better than java says less about the languages and more about the particular software.
    
    tigershark 7 years ago
    
    I know that it’s asking a lot, but any chance that you can post a minimum reproducible sample? From what I know it is quite smelly...
    
    twa927 7 years ago
    
    I don't have access to this codebase now but I'll try to write some benchmark.
    
    tigershark 7 years ago
    
    If you manage to do it, that would be awesome; otherwise thanks a lot anyway for the effort :) I’m just curious to understand why it happens because it’s exactly the opposite of what I would expect. The only explanation that comes to my mind is excessive GC, as someone else already mentioned, but it would be interesting to see the original code.
    
    twa927 7 years ago
    
    I started doing it but yes, it's too much effort to get two full benchmarks, I'm sorry :). But I think it came down to the inefficiency of String.split(): https://stackoverflow.com/questions/37007189/string-split-te... and, more generally, Java's built-in String methods not being GC-friendly: https://stackoverflow.com/questions/20336459/garbage-friendl.... I'm guessing that when such parts can be coded in a non-VM environment (CPython/PyPy runtime) they can be made much faster, and Java to this day has the standard library coded in pure Java?
    
    tigershark 7 years ago
    
    Yes, Java has nothing like C# Span to avoid this kind of problem, but I thought that Python would also be affected in a similar way... Anyway, thanks for sharing.
    
    auscompgeek 7 years ago
    
    CPython and PyPy are both VMs, similarly to the JVM.
    
    twa927 7 years ago
    
    I meant the inside of a VM (its implementation), which is coded in C/RPython. Java's VM is coded in C++, but I don't think C++ is used for any regular library functions, while C is used heavily for Python's stdlib.
    
    pjmlp 7 years ago
    
    Not all JVMs are implemented in C++.
    It is a specification with multiple implementations, some of them are even bootstrapped in Java.
    
    d0mine 7 years ago
    
    I would have expected it: https://stackoverflow.com/questions/9371238/why-is-reading-l...
    
    int_19h 7 years ago
    
    I had a binary parser written in Python that took around 30 seconds on typical input on CPython. PyPy took that down to about 10 seconds. Rewriting it in C# took it down to 200 ms.
    
    twa927 7 years ago
    
    If this was using a loop processing a single byte in an iteration I would expect a greater speedup on PyPy. I've seen 100x speedup in such cases.
    
    int_19h 7 years ago
    
    Not single byte, but individual fields (float32/int32/string etc). Yes, I expected a much more significant speed-up as well. It's probably because a lot of that code was driven by reflection-type techniques.
    Curiously, IronPython did better than anything (but still slow). Haven't tried Jython.
    Compiling the whole thing with Cython was less effective than PyPy.
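For readers wondering what field-by-field binary parsing looks like without reflection, here is a minimal sketch using the standard `struct` module. The record layout (little-endian int32, float32, 8-byte NUL-padded string) and the function name `parse_records` are assumptions made up for this example, not the parser discussed above.

```python
import struct

# Assumed layout: little-endian int32, float32, then an 8-byte NUL-padded string.
RECORD = struct.Struct("<if8s")


def parse_records(buf):
    """Decode a buffer of back-to-back fixed-size records into tuples."""
    return [
        (number, value, raw.rstrip(b"\x00").decode("ascii"))
        for number, value, raw in RECORD.iter_unpack(buf)
    ]


# Build two sample records and parse them back.
data = RECORD.pack(1, 1.5, b"abc") + RECORD.pack(2, -2.0, b"zfs")
records = parse_records(data)
```

Precompiling the format once and unpacking whole records keeps most of the per-field work inside the C implementation of `struct`, which tends to matter more on CPython than on PyPy.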
    
    twa927 7 years ago
    
    Yes, reflection voids many JIT paths in PyPy, AFAIK. Maybe it was worth rewriting the Python code to get rid of the reflection?
  - mehrdadn 7 years ago
    
    I routinely fail to get speedup on PyPy. In fact I frequently get slowdowns. I imagine it's only fast if your code is slower than it needs to be to begin with.
    
    twa927 7 years ago
    
    It works well for tight loops processing much data, or heavy object-orientation (multiple levels of class hierarchies). It probably won't work well for regular Django webapps or scripts. Also, real-world Python numerical/AI code uses numpy/ML libs so there's not much to optimize in Python...
  - tanilama 7 years ago
    
    Only when comparing to CPython
  - fragmede 7 years ago
    
    pypy is fast, but even were this written in C, there's still the kernel-userland boundary to contend with.
    
    moonbug 7 years ago
    
    Fuse
    
    yjftsjthsd-h 7 years ago
    
    If you're going to run the file system in user space, there's no reason not to just use normal ZFS. The problem with ZFS licensing is only in combining CDDL+GPL in one unit. If you're working across the kernel/userspace boundary, there's already no problem. ZoL even already ships a fuse version that works fine.
    
    Conan_Kudo 7 years ago
    
    > ZoL even already ships a fuse version that works fine.
    This is not true. A FUSE implementation is wanted though: https://github.com/zfsonlinux/zfs/issues/8
    
    yjftsjthsd-h 7 years ago
    
    Oops; I didn't realize the existing fuse version wasn't built from ZoL. Thanks for pointing that out.
newnewpdro 7 years ago

Would the CDDL matter for a python implementation that will never become part of the kernel?
- CaliforniaKarl 7 years ago
  
  You'd want a reverse-engineering lawyer, to be certain. But my (IANAL) guess is: If this is a proper reverse-engineered implementation, you could then convert _this_ implementation to C, and contribute _that_ into the kernel.
  Except, it seems this is BSD-licensed, so I'm not sure how that would work in the kernel (which is GPLv2).
  - loeg 7 years ago
    
    BSDL code is fine in the GPLv2 kernel. E.g., most of the DRM drivers are dual BSD-GPL licensed.
  - lunixbochs 7 years ago
    
    BSD is a subset of GPL's restrictions, so you can include BSD-licensed code in a GPL work.
    
    loeg 7 years ago
    
    Subset isn't quite accurate, but "GPL compatible" might be a good way to describe it.
- dr0verride 7 years ago
  
  It's clear to me that the solution is to integrate python into the kernel.
  - yjftsjthsd-h 7 years ago
    
    I assume you mean this as a joke, but I would point out that at least one of the BSD family has gone and baked lua into their kernel. Granted, lua is rather meant for that kind of thing and python isn't, but it is entertaining to point out an interpreted language that has been stuck into a unix kernel:)
    
    riffraff 7 years ago
    
    there was lua in linux too, with lunatik https://github.com/lunatik-ng/lunatik-ng
lelf 7 years ago

Bad title. It’s not a ZFS implementation, so hold your horses.
- AHTERIX5000 7 years ago
  
  What do you mean? Can't you mount & read ZFS filesystems with this one?

Pierre Menard, author of The Filesystem.

But I'm surprised this is possible without a specification - how can you test a filesystem through hexdumps? The effects of some operations are going to be pretty far-reaching, surely?

randrus 7 years ago

Might be interesting/useful to aim for zfs send/receive compatibility :)
And thanks for the Borges callout.
cryptonector 7 years ago

There are lots of blog posts, lots of docs. There's ZFS code in GRUB that is GPL, etc.
aerovistae 7 years ago

One of my favorite short stories, and nobody else has ever read it. So glad to see someone else reference it.
- hueving 7 years ago
  
  Link?
  - iiv 7 years ago
    
    The original title is "Pierre Menard, Author of the Quixote": http://www.coldbacon.com/writing/borges-quixote.html

mfsch 7 years ago

Does someone know whether it would be legal for someone to go through the ZFS code and write a specification of the features this author hasn’t figured out yet? I.e. could someone write a detailed description of the missing functionality that doesn’t include any details about the implementation so other people can implement it in non-CDDL code?

ummonk 7 years ago

Edit: yeah that is how you avoid copyright infringement https://en.wikipedia.org/wiki/Clean_room_design
Original comment: I could swear this was actually the standard practice for writing an implementation of an unknown file format or interface without infringing on copyright. But I don't remember the term for it.
atomicwrites 7 years ago

That's called a clean room implementation and was the standard way to make x-compatible products (like for example, the bios on an IBM PC clone). Not sure what the current legal standing of that method is.
EDIT: Ninjad because I left the reply in a tab without posting.
- zymhan 7 years ago
  
  Reverse engineering is legal in the US, but you had better have detailed records proving no one who knew the insides of the original product ever influenced the clone. And be prepared to explain that in court.
Someone 7 years ago

The on-disk format is available (http://www.giis.co.in/Zfs_ondiskformat.pdf)
That PDF says "Unless otherwise licensed, use of this software is authorized pursuant to the terms of the license found at: http://developers.sun.com/berkeley_license.html". That link is broken, but it seems that's the Berkeley license (whatever that means for a specification, and which variant?).
According to http://open-zfs.org/wiki/Developer_resources, it's outdated, but still useful.
I think I would use that, rather than spend months diffing disk images.

AlexanderDhoore 7 years ago

What?! How?! Why?!
This is the greatest thing ever. I wish I could just write code for the fun of it. Every time I wonder whether people will use it and give up before I even get started.

js2 7 years ago

I wrote this to scratch my own itch but mostly just for the fun of it and some people ended up using it.
https://github.com/jaysoffian/eap_proxy
This was a really simple project but tickled all my fancies: Python, low-level, networking, reverse engineering, system administration.
Just do it! Who cares if people use it?
Alternatively, contribute something to some open source project you use. I’ve done that too. Just small stuff here and there but that’ll guarantee someone uses your code if that’s what’s important to you. It only takes 39 commits to get on this page:
https://github.com/git/git/graphs/contributors
:-)

max0563 7 years ago

I had this problem too. I’ve been able to get over it by coming up with a scenario, even if it’s completely fabricated, where what I am doing can be useful. I also make sure that I incorporate something new that I want to learn in the project, whether it’s a language, a library, whatever. Then I give myself a date I can quit. Normally it’s about two months. This makes me really consider whether I want to take something on, because if I do I force myself to dedicate two months of time to it. If I still enjoy it at the end of two months then I continue, otherwise I move on to another idea. At least in that time I become a little better at whatever I was trying to do. That’s the real goal anyway.

mirceal 7 years ago

why not? writing code can fall into one of a few buckets. one of them is play.

anon4242 7 years ago

Exactly! Write code that you find interesting and/or need for something and then share it. If someone uses your stuff, then great, if not, at least you've become a slightly better programmer! It's a win-win!

ehsankia 7 years ago

Absolutely, some of the most fun I've had coding was reverse engineering/implementing known protocols. Although something this big may be a little overboard :)

craftyguy 7 years ago

> What?! How?! Why?!
The readme literally answers this..

lelf 7 years ago

No ARC/L2ARC?
Edit: of course not. This is actually just a ZFS user-facing ”front-end”, not a ZFS implementation.

lunixbochs 7 years ago

It's capable of doing IO against a real ZFS array without any other code. ARC is an implementation detail and not necessary for correctness. If you removed ARC from ZoL it would still work, just slower. ARC is far from the most interesting milestone for a reimplementation effort because an ARC implementation doesn't need to be anything like the Sun version internally, as long as it offers similar performance.
This project is cool not because you're going to run the Python in your kernel today, but because someone can use it as a documented reference implementation of all of the data structures and transactions that is not covered by the CDDL, so another implementation based on this can live in the Linux kernel without problem.
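The point that a cache affects speed but not correctness can be sketched in a few lines. This is not ARC: it swaps in a plain LRU cache from `functools.lru_cache`, and `BlockDevice` and `make_cached_reader` are hypothetical names used only to show that cached and uncached reads return the same data.

```python
from functools import lru_cache


class BlockDevice:
    """Hypothetical in-memory stand-in for a disk; counts physical reads."""

    def __init__(self, blocks):
        self._blocks = blocks
        self.reads = 0

    def read(self, n):
        self.reads += 1
        return self._blocks[n]


def make_cached_reader(dev, maxsize=128):
    # Plain LRU, not ARC: enough to show the cache is purely an optimization.
    @lru_cache(maxsize=maxsize)
    def read(n):
        return dev.read(n)

    return read


dev_plain = BlockDevice([b"a", b"b", b"c"])
dev_cached = BlockDevice([b"a", b"b", b"c"])
cached_read = make_cached_reader(dev_cached)

pattern = [0, 1, 0, 2, 0, 1]
plain = [dev_plain.read(n) for n in pattern]
cached = [cached_read(n) for n in pattern]
```

Both access paths yield identical bytes; only the number of device reads differs, which is the sense in which a reimplementation is free to pick any caching policy.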

hnlmorg 7 years ago

If the GPLv2 GRUB ZFS code[1] wasn't enough to get someone started then I doubt this will make any difference in porting ZFS to GPL, given there would be more work involved in turning this into a usable kernel driver.
Not taking anything away from the work that the author has done though. It's a nice project. I just think a little pragmatism is needed before we get carried away with the ZFS GPL comments.
[1] https://blogs.oracle.com/solaris/zfs-under-gplv2-already-exi...

chasil 7 years ago

It would certainly be wonderful if this led to a Linux kernel module that was free of Oracle.

lelf 7 years ago

You don’t need CDDL to use ZFS. Which is all the library does. It does not implement ZFS.

4oo4 7 years ago

This is awesome! Do you plan on blogging anything about how you went about reverse-engineering?

PaulHoule 7 years ago

i love userspace implementations of filesystems.
note that the issues are entirely different from those with a kernel implementation since you don't have to think about the page cache et al.

burmecia 7 years ago

There is a userspace filesystem implementation, ZboxFS: https://github.com/zboxfs/zbox.

gaze 7 years ago

Hell yeah, dude. This is awesome.

foxhop 7 years ago

Hey alcari!
I know you from ICV - we used to hang out online on the forums and IRC.
Nice work on this project, I'm looking forward to diving into the codebase!

rashkov 7 years ago

Just curious, what is / was ICV?

foxhop 7 years ago

It was / is a community of people who formed around a book called 1337 h4x0r h4ndb00k by tapeworm.
ICV was the name of the forum for the book, now defunct (icodeviruses.com)
https://www.amazon.com/1337-h4x0r-h4ndb00k-tapeworm/dp/06723...

RantyDave 7 years ago

This is all kinds of funny. I'm awash with awe and admiration.

adamnemecek 7 years ago

There's also TFS, a ZFS-inspired FS in Rust: https://github.com/redox-os/tfs

garmaine 7 years ago

Simultaneously fucking awesome (that you pulled it off at all), and fucking useless (performance...).
Thanks for sharing though. Maybe could be useful in making a suite of zfs inspection tools?
Is the OP here? How difficult would it be to create a zfs reshaping tool, allowing for the offline expansion of a vdev?
