Settings

Theme

Show HN: V8serialize – Read/write V8-serialized JavaScript values from Python

github.com

3 points by h4l a year ago · 2 comments · 2 min read

Reader

v8serialize is a Python library that reads and writes V8's value serialization format[1][2], which is how V8 moves JS values between contexts when using postMessage(value), storing data in IndexedDB and Deno uses it to store values in its Deno KV database. Node.js and Deno have a v8.serialize() API to read/write the format. The format supports most of the conventional JavaScript types, like Map, Set, BigInt, Date, ArrayBuffer, Error, undefined etc; but not object prototypes or functions. It's a bit like JSON with more types and reference cycles.

The strength of the format is that it lets JavaScript programs serialize values without thinking about reducing types down to the subset that JSON supports. A weakness is that all the quirks of JavaScript's types need to be exposed by another language deserializing it, so it's certainly not ideal as a general-purpose format, just when ease of use from JavaScript is a priority.

This library lets Python programs work with JavaScript values by reading and writing this serialization format. As well as encoding/decoding the format itself, the library implements Python types that replicate the behaviour of JavaScript's Object, Array, Map, Set, etc, so that values can be round-tripped faithfully and quirks like JavaScript's sparse arrays don't cause Python to allocate massive contiguous lists.

I made this in an unplanned yak shave when setting out to write a client for Deno's KV database in Python. I'd assumed it was just storing JSON-encoded values, but I quickly realised it could round-trip all the regular JavaScript types, like Date & undefined, and realised it was using V8's value serialization format to store values at rest.

I'm now back to implementing a Deno KV client using this library. It can imagine it could also be used for other things, such as having a Python program act like a JavaScript Web Worker, receiving JavaScript values from postMessage()-like calls. Or perhaps could be used to preserve full JavaScript types in something using Python in the browser (in conjunction with a pure-js v8.serialize() implementation, like https://github.com/worker-tools/v8-value-serializer).

1: https://h4l.github.io/v8serialize/en/latest/explanation/v8_s... 2: https://github.com/v8/v8/blob/main/src/objects/value-seriali...

ekzhang a year ago

This is really neat! It looks like you implemented the parser and encoder entirely in Python, how does that compare to an approach where you load an extension module in C or Rust? (Possibly with the actual V8 source)

  • h4lOP a year ago

    Thanks! That's right, it's pure Python. It's certainly a lot slower than C or Rust, but I've not done any benchmarking to get a good sense. I believe Deno have a Rust implementation which they use, so in principle a Python binding could make use of that. I think V8's own C++ implementation is too integrated with V8 itself to be re-usable, at least without refactoring.

    My immediate use case is very IO-bound, and won't use huge message sizes, so decoding/encoding performance probably isn't a huge problem. My hunch is It should be fast enough for event handling small messages, or it'd also be fine for passing binary buffers between Python and JS. (E.g using an array library like numpy and shipping an array over as a buffer, with some other JS objects for extra metadata.) (More so if I implemented reading/writing an mmap.)

Keyboard Shortcuts

j
Next item
k
Previous item
o / Enter
Open selected item
?
Show this help
Esc
Close modal / clear selection