Node.js and Python Interoperability
fridgerator.github.io
I was once trying to do something even crazier: exposing full python accessibility to node.js using proxies, so that you can use python objects and functions as native javascript objects and functions. It is feasible because the languages and runtimes share a number of common designs.
There were several tough things, however. The first was error handling. Yes, you can catch exceptions in native code and convert them to exceptions in the other language, but it's not straightforward to keep stack traces. The second was circular references between runtimes. Since references across boundaries are global, the garbage collector on either side could not reclaim circularly referenced objects. Although this could be resolved by manually breaking the cycle, it would be better to have weak references. (Or maybe other utilities, idk what would be more elegant.) The third was that js has no operator overloading, so I had to use .__add__() for example to call the python add operator.
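To make the weak-reference idea concrete, here is a minimal sketch on the Python side only. It is an analogy, not the pyjs implementation: `gc.disable()` stands in for a collector that cannot trace across the runtime boundary, `Wrapper` is a hypothetical stand-in for a cross-runtime proxy, and the result relies on CPython's reference counting.

```python
import gc
import weakref


class Wrapper:
    """Hypothetical stand-in for an object that has a peer in the other runtime."""

    def __init__(self):
        self.peer = None


def survives_without_cycle_gc(use_weak_backref):
    """Return True if the first wrapper is still alive after all names are dropped."""
    gc.disable()  # mimic a collector that cannot see the cycle
    try:
        a, b = Wrapper(), Wrapper()
        a.peer = b
        # The back-reference is what closes the cycle; make it weak or strong.
        b.peer = weakref.ref(a) if use_weak_backref else a
        probe = weakref.ref(a)
        del a, b
        return probe() is not None
    finally:
        gc.enable()


leaks_with_strong_ref = survives_without_cycle_gc(use_weak_backref=False)
leaks_with_weak_ref = survives_without_cycle_gc(use_weak_backref=True)
```

With a strong back-reference the pair keeps itself alive once the cycle collector is out of the picture; with a weak one, plain reference counting is enough to reclaim both objects.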
One-line example: https://github.com/swordfeng/pyjs/blob/master/test/jsobject.... It's a toy project I did years ago and it doesn't even compile now. Also, I was wondering if anyone really needs to do things this way, given there are a bunch of popular and stable RPC libraries. But I was happy to learn something about the underlying cpython and v8 from it.
> The third was that js has no operator overloading, so I had to use .__add__() for example to call the python add operator.
I expect this wouldn't have worked in the long run, as these methods are often just part of the protocol, e.g. even `a == b` will try `type(a).__eq__(a, b)` then fall back to `type(b).__eq__(b, a)` (~~and then it'll do some weird stuff with type names IIRC~~[0]).
And most operators are not considered symmetric so the fallback is not the same as the initial (even `+` has `__add__` and `__radd__`, also `__radd__` might be called first depending on the relationship between type(a) and type(b)).
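A small sketch of why calling `.__add__()` directly is not the same as `+`. The `Meters` class is hypothetical, made up for illustration; `operator.add` is the standard-library function that runs the full protocol, which is probably closer to what a bridge would want to expose.

```python
import operator


class Meters:
    """Hypothetical value type that supports `Meters + Meters` and `number + Meters`."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        return NotImplemented  # let Python try the other operand

    def __radd__(self, other):
        if isinstance(other, (int, float)):
            return Meters(other + self.value)
        return NotImplemented


# Calling the dunder directly only runs the first half of the protocol:
direct = (1).__add__(Meters(2))    # int does not know Meters

# operator.add (i.e. `1 + Meters(2)`) also tries Meters.__radd__:
full = operator.add(1, Meters(2))
```

Here `direct` is the `NotImplemented` sentinel rather than a result, while `full` is a `Meters` produced by the reflected method.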
And then there's the "operations" which fall back to entirely different protocols, e.g. `in` will first try to use `__contains__`; if that doesn't exist it uses `iter()`, which tries to use `__iter__`, but if that doesn't exist either it falls back to calling `__getitem__` with sequential non-negative integers (0, 1, 2, ...).
Which is why sometimes you define `__getitem__` for a pseudo-mapping convenience and then you get weird blowups that it's been called with `0` (you only ever expected string-keys). Because someone somewhere used `in` on your object and you hadn't defined `__iter__` let alone `__contains__`.
Good times.
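The `__getitem__` blowup described above is easy to reproduce. A minimal sketch, with a hypothetical `Settings` class that only expects string keys:

```python
class Settings:
    """Hypothetical pseudo-mapping: defines __getitem__ but not __iter__ or __contains__."""

    def __init__(self, **values):
        self._values = values
        self.seen_keys = []  # record what __getitem__ is actually called with

    def __getitem__(self, key):
        self.seen_keys.append(key)
        return self._values[key]


settings = Settings(debug=True)

error = None
try:
    # No __contains__ and no __iter__, so `in` iterates via __getitem__(0), (1), ...
    found = "debug" in settings
except KeyError as exc:
    error = exc
```

The membership test never looks up `"debug"`. It asks for index `0`, and because the lookup raises `KeyError` rather than `IndexError`, the iteration does not stop quietly and the exception propagates to the caller.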
[0] I misremembered: it's for ordering (not equality) in Python 2[1]. `a < b` will first invoke `type(a).__lt__(a, b)`, then if that's not implemented fall back to the reflected `type(b).__gt__(b, a)`, and if that's not implemented either it'll fall back to a few hard-coded cases (e.g. None is smaller than everything) and finally to `(type(a).__name__, id(a)) < (type(b).__name__, id(b))`. That is, the order of independent types with no ordering defined is the lexicographic order of their type names, and if they're of the same type it's their position in memory.
[1] where there's always an ordering relationship between two objects, one of the things I'm most grateful Python 3 removed, even if it's sometimes inconvenient
This is great. I wanted to do something similar so that I could render Django templates in native JSX.
Any idea which of those RPC libraries could help achieve this? I'd love a Django model instance that could be used seamlessly in a JSX template.
I recently did something similar with Tcl (instead of Python) and Duktape (instead of V8): https://rkeene.dev/js-repl/
All the Tcl runtime is in the "runtime" object, so like "runtime.puts('Hello World')" or "runtime.expr('2128')", etc
It was a lot of fun
Tcl is one of those languages that feel special once you get to know it. In a good way! It’s like a command prompt for actual code. I don’t know how to explain it any better.
I'm not sure I got to know Tcl properly. I needed to use it to interact with an electronics design tool in order to extract information about a design, and so most of what I learned was from the tool's documentation about how to use its API, and some random Stack Overflow for specific questions.
Anyway, Tcl felt very much like an 80's interpreted language to me. It seemed like every data structure was built on top of strings, and every statement was being run through eval().
I haven't used Tcl in years, but you are right about strings, actually the goal is that every data type is implicitly marshalled into string and vice versa.
As for eval, you need to keep some discipline: there is a difference between using [], {}, and "".
The magic of the language is that it has no statements. For example, are you missing try/catch? You can implement it yourself: http://code.activestate.com/recipes/68396-try-catch-finally/
>It seemed like every data structure was built on top of strings, and every statement was being run through eval().
That's the case, and that's the whole magic...
We needed to solve this as part of using pydata/GPU service calls from our node app. While we do have a couple of native modules, they're a PITA. Our solution for more generic code is async HTTP service calls passing typed Apache Arrow tables ( https://arrow.apache.org/docs/js/ ), which gives a cleaner path to maintenance, observability, packaging, distribution, low overhead etc.
The current trick we're looking at is making this zero-copy when same-node, esp for GPU code, so happy to chat with folks about that!
Very cool idea! For all of you thinking of writing native modules for node, please remember to use napi, not nan. napi is fully abi stable, while nan is not!
https://nodejs.org/api/n-api.html
We also have a c++ wrapper: https://github.com/nodejs/node-addon-api
I need to work with both Node and Python now. I write my webapps in node but have data analytics scripts written in python that need to be invoked from them. I'm planning to use [python-shell - npm](https://www.npmjs.com/package/python-shell)
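For the Python side of that kind of setup, a minimal sketch of a worker that speaks line-delimited JSON over stdin/stdout. This assumes the Node side writes one JSON document per line and reads one per line back (check the python-shell documentation for how its message modes map onto this). The `handle` function and the `values`/`mean` fields are hypothetical.

```python
import json
import sys


def handle(request):
    """Hypothetical analytics step: return the mean of request["values"]."""
    values = request["values"]
    return {"id": request.get("id"), "mean": sum(values) / len(values)}


def serve(stdin=sys.stdin, stdout=sys.stdout):
    """Read one JSON request per line and write one JSON reply per line."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            reply = handle(json.loads(line))
        except Exception as exc:
            # Report failures as data so the Node side never has to parse a traceback.
            reply = {"error": f"{type(exc).__name__}: {exc}"}
        stdout.write(json.dumps(reply) + "\n")
        stdout.flush()  # flush per message so the parent process is not left waiting


# A real worker script would end with a call to serve().
```

Flushing after every reply matters here: without it, output to a pipe is block-buffered and the caller can appear to hang.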
Have you considered GraalVM? You can run both languages inside the same managed runtime (JVM), sharing the same data structures, inlining across language boundaries, and debug both with Chrome Dev Tools.
"Analytics" generally means "CPython only". The numpy dep ends up being a problem.
Generally if you want python/java interaction, we maintain a tool called javacpp that handles this, we even bundle cpython: https://github.com/bytedeco/javacpp-presets
GraalVM itself also depends on javacpp for a portion of its features (LLVM wrapper): https://github.com/oracle/graal/blob/315f5dcf69c2e73fd13a5f8...
I'd be happy to answer questions about the overlap of the 2. I can say I happily execute python scripts from our embedded python and even point the python execution at an anaconda distribution.
How much have you used GraalVM? I'm curious to know if it really works like that in practice. I've read what it says on the tin a few times and been impressed, but haven't actually sat down and played with it yet. The skeptic in me thinks that in a real-world production scenario, such a thing is surely a house of cards that falls down all the time in unexpected and difficult to diagnose ways. It'd be great to hear if there's people who can vouch and say it actually delivers on its promises of a polyglot utopia :)
Graal itself still has limitations with many libraries, the biggest one being any library that heavily uses reflection. I'd look at some of the work Red Hat is doing with some of its Java libraries and Quarkus: https://developers.redhat.com/blog/2019/03/07/quarkus-next-g...
IIRC you also end up relying on a graal-provided version of python that may or may not be kept up-to-date by a project for which it's just a secondary target. Been there with IronPython and Jython, it's no fun.
Thanks for the input.
I remember coming across GraalVM. However, as a solo developer, I have found learning another new thing to be detrimental to my ability to get things done. I have wasted way too many hours learning new frameworks, languages, platforms, etc.
I'm doing my best to avoid learning something new except when a strong compatibility with my current stack exists and is quickly implementable.
There's also python-bridge: https://github.com/Submersible/node-python-bridge
I don't recall now why I chose it over the much more popular python-shell. Sigh - the exploratory-spike log is sparse, and then "that all works, so why revisit low-commitment choices". I used it last year for a python helper, to offload some opencv and tensorflow optical tracking from electron.
(Both exist, and work, in javascript. But opencv had painfully subtle issues even with py3 vs py2, so I paid in complexity to be closer to the then-mainstream center, py2. And then moved tf over to work around chromium's excellent video latency and load (for passthrough AR) becoming much less so when also touched by the cpu.)
Also: https://github.com/bobpepin/pyduktape (interop between the Duktape JS interpreter - https://duktape.org/ - and Python).
I've worked the opposite way: calling Javascript from Python, using PyV8. It worked okay enough, but it felt super fragile. We only used it to run a tiny bit of client-side code on the server. I wouldn't want to have a large project where Javascript and Python interact heavily.
I have just taken to running a server and talking over http whenever I want two languages to interface. It's slower but it's pretty much bulletproof.
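A minimal sketch of that approach using only the Python standard library. The `/sum` path and the `values`/`sum` fields are made up for illustration, and the Python `call` helper stands in for whatever the other language (for example a Node `fetch`) would send.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body and reply with a JSON result.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"sum": sum(payload["values"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def call(url, values):
    """Stand-in for the client in the other language."""
    request = urllib.request.Request(
        url,
        data=json.dumps({"values": values}).encode(),
        headers={"Content-Type": "application/json"},
    )
    # Bypass any proxy configured in the environment for this local call.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return json.load(response)


server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
result = call(f"http://127.0.0.1:{server.server_port}/sum", [1, 2, 3])
server.shutdown()
server.server_close()
```

Everything crossing the boundary is plain JSON, so neither side needs to know anything about the other's object model or garbage collector.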