Mapping Python to LLVM

blog.exaloop.io

121 points by arshajii 3 years ago · 33 comments

fernly 3 years ago

A major type incompatibility not mentioned in the linked blog post is this, from [1]:

* Strings: Codon currently uses ASCII strings unlike Python's unicode strings.

Judas priest, after all the effin' grief we went through to learn how to handle Unicode strings in Python 3, and to finally begin to realize their value, you take this step backward? Forget the i64 limits, the lack of native Unicode strings is a flat deal-breaker. (For example, will Codon warn if it sees an "open(encoding='UTF8')" call? Or a normal open(mode='rt') if the default local encoding is UTF8?)
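To make the concern concrete, here is a minimal sketch in plain CPython of how a code-point view of a string differs from a byte-oriented view. It does not show Codon's actual behavior; `byte_reverse` and `is_valid_utf8` are hypothetical helpers that only simulate what byte-wise handling does to non-ASCII text.

```python
# Python 3 str is a sequence of Unicode code points, not bytes.
text = "naïve café"
utf8_bytes = text.encode("utf-8")

print(len(text))        # number of code points
print(len(utf8_bytes))  # number of bytes; "ï" and "é" take two bytes each


def byte_reverse(s: str) -> bytes:
    """Hypothetical helper: reverse a string byte by byte, as a
    byte-oriented string type would, ignoring code point boundaries."""
    return s.encode("utf-8")[::-1]


def is_valid_utf8(b: bytes) -> bool:
    """Hypothetical helper: report whether b decodes as UTF-8."""
    try:
        b.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Reversing by code point keeps the text well-formed;
# reversing by byte splits the multi-byte sequences.
print(text[::-1])
print(is_valid_utf8(byte_reverse(text)))
```

Any operation that indexes, slices, or measures length is where the two models diverge for text outside ASCII.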

It doesn't help that the same doc also mentions that

* Dictionaries: Codon's dictionary type is not sorted internally, unlike Python's

Current Python dicts are not "sorted"; rather they "preserve insertion order, meaning that keys will be produced in the same order they were added sequentially over the dictionary."[2]
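A short sketch of the distinction in CPython 3.7+, using made-up keys: iteration follows insertion order, which is generally not the same as sorted order.

```python
# Keys are inserted in a deliberately non-alphabetical order.
d = {}
for key in ["pear", "apple", "fig"]:
    d[key] = len(key)

insertion_order = list(d)   # order in which keys were added
sorted_order = sorted(d)    # alphabetical order, computed separately

print(insertion_order)
print(sorted_order)

# Deleting and re-adding a key moves it to the end, which shows
# that the dict tracks insertion, not any ordering of the keys.
del d["apple"]
d["apple"] = 5
after_reinsert = list(d)
print(after_reinsert)
```

Code that relies on this ordering (for example, serializing keys in the order they were set) would need checking on any implementation that does not preserve it.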

This is new functionality added only recently (3.7) so its lack would not inconvenience a lot of existing code. OTOH, why did they not plan to reproduce this useful feature from the start?

Possibly they were thinking of the pypi package SortedContainers[3]?

[1] https://docs.exaloop.io/codon/general/differences

[2] https://docs.python.org/3/reference/datamodel.html#index-30

[3] https://pypi.org/project/sortedcontainers/

  • HybridCurve 3 years ago

    Breaking compatibility from the current spec/functionality of python should be a definite no-no for any implementation. That being said, I can still appreciate that they didn't try to write their own busted unicode implementation since many other ones have contributed to security issues.

    • maxerickson 3 years ago

      I don't understand why you say that. Like, it's gonna cost them in adoption to diverge, they don't need a lecture to understand that, they are doing what meets their needs and sharing it.

    • Ultimatt 3 years ago

      The implementation is optimised for genomics, that's why.

debatem1 3 years ago

The type conversion assumptions here are real problematic. "64 bits ought to be enough for anybody"-style statements ignore integers as bitfields, large constants (eg Avogadro's number), any kind of math with large intermediate terms, all kinds of stuff.
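As a rough illustration of the cases listed, the sketch below uses CPython's arbitrary-precision `int` to show values that do not fit in a signed 64-bit integer. `wrap_i64` is a hypothetical helper that simulates two's-complement wraparound; it is an assumption for illustration and says nothing about what Codon actually does on overflow.

```python
import math

I64_MAX = (1 << 63) - 1


def wrap_i64(n: int) -> int:
    """Hypothetical helper: reduce n to a signed 64-bit two's-complement value."""
    n &= (1 << 64) - 1
    return n - (1 << 64) if n >= (1 << 63) else n


# Avogadro's number as an exact integer (6.02214076e23).
avogadro = 602_214_076_000_000_000_000_000

# A large intermediate term: 21! already exceeds the i64 range, 20! does not.
fact20 = math.factorial(20)
fact21 = math.factorial(21)

print(avogadro > I64_MAX)             # the constant does not fit
print(fact20 <= I64_MAX < fact21)     # overflow begins at 21!
print(wrap_i64(I64_MAX + 1))          # simulated wraparound to the minimum value
print(wrap_i64(avogadro) == avogadro) # the wrapped value is no longer the constant
```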

Makes me very suspect about the rest of this project when they try to glide past all of these issues with nary a mention.

  • zamadatix 3 years ago

    > There are many things we took for granted here, like how we determine the data types to begin with, or how we put the source code in a format that’s suitable for code generation. These, among other things, will be topics of future posts in this series. Stay tuned!

    I don't feel they are trying to glide past anything. It's the first post in the series about a product in 0.x state, it's gotta start somewhere other than perfection and they seem to know that.

  • scaredginger 3 years ago

    https://docs.exaloop.io/codon/general/differences

    Looks like you can use bigger integers and they're very explicit about it not being a drop-in replacement for Python

maximilianburke 3 years ago

I recall that Google had a project to compile Python to LLVM (Unladen Swallow @ https://code.google.com/archive/p/unladen-swallow/), but work stopped on it a long time ago.

If I recall, it really wasn't that much faster than CPython given the overhead, but it's been a long time; if it had been faster, I assume it wouldn't have been abandoned.

  • killingtime74 3 years ago

    I think the main difference is that this doesn’t purport to be a drop-in replacement. Also, they seem to be doing some multithreading.

  • orra 3 years ago

    Quite. Unladen Swallow was unfortunately a failure, in part because LLVM at the time was quite buggy, and in part because LLVM wasn't (isn't?) magic enough to speed up a dynamic language.

    The blog post here mentions they do their own optimization passes, before handing over to LLVM. I imagine that's pretty important.

    • maximilianburke 3 years ago

      LLVM really wasn't that buggy at the time (circa 2009); the project I was using it for at the time, a .NET compiler that targeted video game consoles, was quite stable from a code generation point of view, and we were shipping games with it.

toxik 3 years ago

Codon is very impressive, it feels a lot like Python without being slow like Python.

Don’t think of it as a Python compiler, it is its own language. (Esp. re choice of int == i64, this saves SO MUCH computation for the CPU.)

I will say though that I’m not sure where to use it yet, since it’s too immature for important projects and also aims at the “we need a nuclear bomb” level performance.

eyegor 3 years ago

> How can I use Codon for production or commercial use?

> Please reach out to... to inquire about a production-use license.

Having "contact us" pricing with several incompatibilities makes this pretty hard to consider in a commercial environment. I wish they had a public pricing structure.

From their faq: https://docs.exaloop.io/codon/general/faq.

  • DangitBobby 3 years ago

    I agree, I'm curious enough to try it but not gonna bother if I have to email someone just to get an idea of how much it costs to use in real code.

dragonwriter 3 years ago

> int can become an LLVM i64

You can be efficient without sacrificing correctness; using LLVM shouldn’t mean “throw out arbitrary-precision semantics”.
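One way to keep both properties is a native fast path guarded by an overflow check, with a fallback to arbitrary precision. The sketch below only models that control flow in Python; `add_with_promotion` is a hypothetical name, and this is not a description of how Codon, PyPy, or any other implementation actually does it.

```python
I64_MIN, I64_MAX = -(1 << 63), (1 << 63) - 1


def fits_i64(n: int) -> bool:
    """True if n is representable as a signed 64-bit integer."""
    return I64_MIN <= n <= I64_MAX


def add_with_promotion(a: int, b: int):
    """Hypothetical model of overflow-checked addition.

    Returns (result, used_fast_path). A compiled implementation would do
    the add in a machine register and test the overflow flag; here Python's
    exact integers stand in for both paths, so only the decision is modeled.
    """
    total = a + b  # always exact in Python
    fast = fits_i64(a) and fits_i64(b) and fits_i64(total)
    return total, fast


print(add_with_promotion(1, 2))        # stays on the fast path
print(add_with_promotion(I64_MAX, 1))  # exact result, but needs the slow path
```

The result is always correct; the second element only records whether the cheap path would have sufficed, which is the common case the optimization bets on.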

  • wyldfire 3 years ago

    I don't know how Codon does this. But I always supposed that existing optimized Pythons like PyPy map integer operations to native types and promote them to arbitrary precision when they encounter overflow. It's IMO a similar problem to "but what if someone decided to overwrite int.__add__ with some other function?" - arguably these are weird/bad things to do but AFAIK permitted by the language semantics. So to fix problems like these you just make it work for the paranoid case and implement optimizations that rely on that not being common. When the weird behavior is detected you fall back to the slower path.

TazeTSchnitzel 3 years ago

How much does this differ from PyPy's RPython in terms of how the language is restricted?

  • mattip 3 years ago

    While RPython is a restricted version of Python used to build the PyPy Python interpreters, the interpreters themselves are not restricted. Any deviation from CPython behavior, intended or not, is considered a bug. So Codon should be compared to the PyPy Python interpreter, not to RPython. The advantage of writing the interpreter in RPython rather than C (CPython), or pre-compiling Python code to LLVM IR and from there creating an executable (Codon), is that RPython comes with a metaJIT (which can generate a JIT) and a mark-and-sweep garbage collector for any interpreter built on top of it.

julienfr112 3 years ago

Kind of like Numba, isn't it?

maxloh 3 years ago

How good is Codon's performance compared to Cython and mypyc?

mypalmike 3 years ago

Very well written article. Delves into some details of things like exception handling semantics without going too far in the weeds. Thanks for sharing.

lowbloodsugar 3 years ago

Have to assign variables to a bit of memory on the stack because SSA??

edelsohn 3 years ago

How does Codon leverage the lessons learned from Unladen Swallow?

  • baq 3 years ago

    Looks like it does it by not being Python, but rather Python-ish.

amir734jj 3 years ago

The LinkedIn href is wrong on the exaloop.io website

sakesun 3 years ago

A very clean and beautiful blog design.
