1.4 Extending JS at once with a WASM language

More Than A Lisp in The Browser

A coherent companion for JavaScript, in 3.5 MB of WebAssembly

(Check the playground.)

Modern web applications routinely load tens of megabytes of dependencies: frameworks, utility libraries, polyfills, parsers, validators, and internationalization tables, assembled from hundreds of npm packages whose collective behavior nobody fully understands. When something breaks, when a security advisory hits, or when a maintainer abandons a project, the cascade of consequences can paralyze a team for days.

LispE WebAssembly proposes a different approach. A single binary of approximately 3.5 MB (3.3 MB for the WASM core plus the supporting JavaScript files lispe.js and lispe_functions.js) provides more than 450 native functions covering text processing, statistical computation, pattern matching, regular expressions, linear algebra, date manipulation, and much more. One file. One author. One coherent design. No transitive dependencies.

To put 3.5 MB in perspective: it's smaller than the hero image on many marketing websites, a fraction of what a typical React single-page application weighs once bundled, and roughly half the size of Pyodide's core distribution, which does not include the scientific libraries that make Python actually useful. For an entire programming environment with industrial-grade text processing inherited from thirty years of natural language processing work at Xerox and Naver Labs Europe, the size proposition is, frankly, modest.

Why a Companion Rather Than an Alternative

JavaScript is the unavoidable language of the browser. Its integration with the DOM, its event loop, its asynchronous primitives, and its vast ecosystem of UI frameworks cannot and should not be replaced. JavaScript is excellent at what JavaScript is excellent at.

But there are things JavaScript does poorly, or doesn't do natively at all:

Numerical computation. JavaScript's Math object covers only elementary functions (trigonometry, logarithms, roots, rounding). There is no native gamma, no erf, no statistical distributions, no linear algebra. Anything beyond the trivial requires assembling several libraries (mathjs, jStat, simple-statistics, ml-matrix), each with its own conventions and dependency tree.
Multilingual text processing. Despite recent improvements (Intl.Segmenter, Unicode property escapes in regular expressions), proper tokenization of mixed-language text remains painful. Handling French apostrophes, em-dashes, Greek and CJK character classes, and configurable domain-specific rules typically requires libraries that don't approach the maturity of industrial NLP toolchains.
Pattern matching. JavaScript developers have been waiting for native pattern matching since the early TC39 proposals. The current Stage 1 proposal will eventually deliver something, but it will be a syntactic addition bolted onto an existing language.
Coherent dependency management. Even when libraries exist, assembling them creates a dependency graph that nobody audits. The left-pad incident, the event-stream compromise, the ua-parser-js malware injection are not isolated accidents but structural consequences of the small-modules philosophy.

LispE WASM addresses these gaps as a companion to JavaScript rather than a replacement. The intent is not to abandon JavaScript but to delegate to LispE the tasks where it excels, while keeping JavaScript for what JavaScript does best.

The Bridge: Typed Boundary, Async Discipline

A companion language is only as useful as its boundary with the host. The integration with JavaScript is therefore the central design question, not an implementation detail. LispE WASM's bridge has two distinguishing properties: it is typed at the boundary, and it treats asynchrony as explicit and confined.

A typed boundary, not a string firehose

The naive design for a WASM language bridge funnels everything through strings: send LispE source code as a string, receive results as a stringified value, parse it back into JavaScript types. This works, but it negates most of the performance benefit of running native code: a million-element vector serialized as "(0.123 0.456 ...)" and then re-parsed in JavaScript is no faster than computing it in JavaScript to begin with.

LispE's bridge avoids this. The exported entry points are typed:

// Returns a JavaScript Float64Array, not a string to be parsed
const samples = callEvalLispEToFloats(0, "(normal_distribution 1000000 0 1)");

// Returns a JavaScript Int32Array
const indices = callEvalLispEToInts(0, "(range 0 100 1)");

// Returns an array of JavaScript strings
const tokens = callEvalLispEToStrings(0, '(segment "Le café Αθήνα serves 中華 cuisine")');

// Scalar variants for single values
const pi2 = callEvalLispEAsFloat(0, "(* 2 _pi)");
const count = callEvalLispEAsInt(0, "(size mylist)");

Symmetrically, sending data into LispE is done by passing typed arrays directly:

// Pass a Float64Array into a LispE 'numbers' variable
const data = new Float64Array(1000000);
// ... fill data ...
callSetqFloats(0, "input", data, data.length);

// Then operate on it in LispE, zero-copy from there onward
callEvalLispE(0, "(setq normalized (/ input (max input)))");

The data crosses the WASM/JS boundary once, copied between the WASM heap and the JavaScript-side TypedArray. After that, all operations on the LispE side run zero-copy on contiguous memory: LispE's specialized numbers and integers types map directly to the same native representation that Float64Array and Int32Array use in the WASM heap. This is the performance discipline that makes a million-element vector operation actually fast in the browser, rather than nominally fast and practically constrained by serialization.

A subtle but real engineering choice underlies this: when the WASM heap grows (for instance during a _malloc), the ArrayBuffer behind every existing typed array view is detached, so the bridge re-derives its views from the current buffer on each call rather than caching them. This is documented in the source. For developers auditing the integration, the discipline is visible.
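The detachment behavior can be observed without LispE at all. The sketch below uses only the standard WebAssembly.Memory API; the helper name freshFloatView is hypothetical and is not part of the LispE bridge. It shows a cached view going stale once the memory grows, while a view re-derived from the current buffer still sees the data.

```javascript
// A growable WASM memory: 1 page (64 KiB) initially, up to 4 pages.
const memory = new WebAssembly.Memory({ initial: 1, maximum: 4 });

// A view created (and cached) before the heap grows.
const staleView = new Float64Array(memory.buffer, 0, 4);
staleView[0] = 3.14;

// Growing the memory detaches the previous ArrayBuffer.
memory.grow(1);
const staleLength = staleView.length; // 0: the cached view is unusable

// Hypothetical helper: always build the view from the current buffer.
function freshFloatView(mem, byteOffset, length) {
    return new Float64Array(mem.buffer, byteOffset, length);
}

const freshView = freshFloatView(memory, 0, 4);
console.log(staleLength, freshView[0]); // the data survived, the old view did not
```

Re-deriving a view is cheap (no data is copied), which is why doing it on every call is a reasonable default.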

Furthermore, the bridge explicitly handles the encoding asymmetry between JavaScript (UTF-16) and LispE (UTF-32). On input, JavaScript characters arrive as Int32 slots carrying UTF-16 code units, and surrogate pairs are combined into single UTF-32 code points for LispE. On output, astral-plane characters and emoji are split back into UTF-16 surrogate pairs for JavaScript. A bridge that skips this conversion silently loses information on emoji, CJK extended characters, and other non-BMP code points. The team's WMT19 Robustness Task win in machine translation, owed in part to correct emoji and Unicode handling that competing systems botched, is the reason this discipline exists at the boundary. For applications that need to ship complicated strings across the boundary without worrying about any of this, LispE also provides btoa and atob for base64 round-tripping.
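The conversion described above can be sketched in plain JavaScript. This is an illustration of the surrogate-pair arithmetic only, not the bridge's actual code; the function names utf16UnitsToCodePoints and codePointsToUtf16Units are hypothetical.

```javascript
// Combine UTF-16 code units into Unicode code points (the UTF-32 view).
function utf16UnitsToCodePoints(units) {
    const codePoints = [];
    for (let i = 0; i < units.length; i++) {
        const unit = units[i];
        const next = units[i + 1];
        const isHigh = unit >= 0xD800 && unit <= 0xDBFF;
        const isLow = next >= 0xDC00 && next <= 0xDFFF;
        if (isHigh && isLow) {
            codePoints.push(0x10000 + ((unit - 0xD800) << 10) + (next - 0xDC00));
            i++; // the low surrogate has been consumed
        } else {
            codePoints.push(unit); // BMP character or unpaired surrogate, kept as-is
        }
    }
    return codePoints;
}

// Split code points above the BMP back into UTF-16 surrogate pairs.
function codePointsToUtf16Units(codePoints) {
    const units = [];
    for (const cp of codePoints) {
        if (cp >= 0x10000) {
            const offset = cp - 0x10000;
            units.push(0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF));
        } else {
            units.push(cp);
        }
    }
    return units;
}

const text = "a\u{1F600}"; // one BMP letter followed by one emoji
const units = Array.from({ length: text.length }, (_, i) => text.charCodeAt(i));
const codePoints = utf16UnitsToCodePoints(units);
const roundTrip = String.fromCharCode(...codePointsToUtf16Units(codePoints));
console.log(units.length, codePoints.length, roundTrip === text);
```

The emoji occupies two UTF-16 code units but a single code point, which is exactly the difference a bridge has to reconcile in both directions.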

The typed entry points coerce LispE's internal types into a uniform output format. A function returning a numbers, integers, floats, shorts, generic list, or even a linked llist is normalized into the same Float64Array for the JavaScript caller. The developer does not need to know which internal LispE type a given function produces: the boundary handles the dispatch.

Errors travel through the same channel via a sign convention: a negative size on the result indicates that the buffer contains an error message rather than data. The convention is uniform across all typed entry points, which makes both the JavaScript wrappers and any future binding language straightforward to write.
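A wrapper built on such a sign convention might look like the sketch below. The article does not specify the buffer layout, so everything here is an assumption made for illustration: the name decodeIntResult is hypothetical, and the error message is assumed to be stored as one Unicode code point per 32-bit slot.

```javascript
// Hypothetical decoder for a "negative size means error" convention.
// Assumption: `slots` holds either `size` result values or, when size < 0,
// `-size` Unicode code points spelling an error message.
function decodeIntResult(slots, size) {
    if (size < 0) {
        const message = Array.from(
            slots.subarray(0, -size),
            (cp) => String.fromCodePoint(cp)
        ).join("");
        throw new Error(message);
    }
    // Copy the values out so the caller owns its own buffer.
    return Int32Array.from(slots.subarray(0, size));
}

// Success: three values in a four-slot buffer.
const ok = decodeIntResult(Int32Array.of(1, 2, 3, 0), 3);

// Failure: the same kind of buffer carries a message, flagged by the sign.
const errorSlots = Int32Array.from("Oops", (ch) => ch.codePointAt(0));
let caught = null;
try {
    decodeIntResult(errorSlots, -errorSlots.length);
} catch (e) {
    caught = e.message;
}
console.log(Array.from(ok), caught);
```

The appeal of the convention is that one check on the sign is enough for every typed entry point, whatever the element type.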

evaljs: synchronous evaluation back into JavaScript

In the other direction, LispE can call into JavaScript. The evaljs function takes a JavaScript expression as a string, evaluates it synchronously in the browser context, and returns the result:

(setq result (evaljs "Math.PI * 2"))
(setq selected (evaljs "document.getElementById('user-input').value"))
(setq today (evaljs "new Date().toISOString()"))

evaljs is synchronous because nothing asynchronous happens: the JavaScript code executes in the current event loop tick. It is the right tool for reading or modifying DOM elements, querying browser APIs, accessing JavaScript objects, or injecting LispE results into JavaScript visualization libraries.

asyncjs: explicit, callback-based asynchrony

The asyncjs function handles JavaScript code that returns a Promise or contains await. It does not block the LispE interpreter waiting for the result. Instead, it takes a callback function name and arguments, and arranges for that callback to be invoked when the asynchronous JavaScript work completes:

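; Callback invoked when the asynchronous JavaScript call completes.
; process_for stands for application code defined elsewhere.
(defun handle_response (response context)
   (println "LLM said:" response)
   (process_for context response))

; Example values; system and context must be bound before they are used.
(setq url "http://localhost:1234")
(setq system "You are a concise assistant")
(setq prompt "Explain transducer fusion")
(setq context "demo")
; call_lm_studio is a JavaScript function provided by the host page.
(setq query (f_ `call_lm_studio("{url}", "llama-3", "{system}", "{prompt}");`))
(asyncjs query 'handle_response context)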

The architecture is deliberately callback-based rather than blocking. The C++ side of asyncjs launches an immediately-invoked async JavaScript function and returns immediately to LispE. When the JavaScript Promise resolves, JavaScript calls back into LispE through an exported C function, which invokes the registered callback with the result.
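The JavaScript half of this pattern can be sketched as follows. The names launchAsync and invokeLispECallback are hypothetical stand-ins (the latter for the exported C function that re-enters LispE), and the sketch only records the callback instead of calling a real interpreter.

```javascript
// Hypothetical stand-in for the exported C function that re-enters LispE.
// Here it only records the call; a real bridge would invoke the LispE
// function named `callbackName` with the result and the context.
const delivered = [];
function invokeLispECallback(callbackName, result, context) {
    delivered.push({ callbackName, result, context });
}

// Launch the asynchronous work and return at once, without awaiting it.
function launchAsync(work, callbackName, context) {
    (async () => {
        try {
            const result = await work();
            invokeLispECallback(callbackName, String(result), context);
        } catch (err) {
            invokeLispECallback(callbackName, "error: " + err.message, context);
        }
    })();
}

launchAsync(() => Promise.resolve("42"), "handle_response", "demo");
const deliveredSynchronously = delivered.length; // still 0: nothing blocked

// By the time a timer fires, the promise has resolved and the callback ran.
setTimeout(() => console.log(delivered), 0);
```

The caller regains control before the promise settles, which is what lets the interpreter stay synchronous while the host does the waiting.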

This avoids the need for ASYNCIFY (Emscripten's mechanism for suspending the WASM stack), which would roughly double the binary size and slow down execution. More importantly, it embodies a precise architectural philosophy: LispE itself remains synchronous and deterministic; asynchrony is explicit and confined to the boundary. The JavaScript side is treated as an asynchronous bus, and the LispE program orchestrates events rather than blocking on them.

The combination of evaljs and asyncjs means that any JavaScript capability (DOM manipulation, fetch, IndexedDB, Web Audio, WebGL) is reachable from LispE. The runtime does not duplicate browser APIs; it composes with them.

What's Inside

The 3.5 MB binary includes a coherent collection of capabilities organized around domains that JavaScript handles poorly:

Text processing with full Unicode support (UTF-32 internally), customizable rule-based tokenization, and a segment function that handles multilingual text correctly out of the box. Two regular expression engines: POSIX-compatible (prgx_*) for portability, and a LispE-native engine (rgx_*) with concise syntax and multilingual character classes (%a for accented letters, %h for Greek, %H for CJK).
Mathematics and statistics including special functions (erf, gamma, lgamma), numerically stable variants (expm1, log1p), 18 statistical distributions, and linear algebra (LU decomposition, matrix inversion, linear system solving, determinants, N-dimensional tensor manipulation à la APL).
Pattern matching with three flavors: defpat for pure dispatch, defpred for predicate functions with backtracking, defprol for collecting all matching solutions à la Prolog. Algebraic data types (data@) with structural validation. A macro system (defmacro) that reuses the same pattern matching engine for compile-time code transformation.
Functional programming with automatic transducer fusion: chained map, filter, zipwith operations are compiled into single loops without intermediate allocations.
Date and time manipulation with proper formatting and high-precision timing (chrono).
Object-oriented programming (class@) implemented as private namespaces with cascading method lookup.
JavaScript interoperability via evaljs and asyncjs, as described above.
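For readers unfamiliar with the term, the idea behind transducer fusion from the list above can be illustrated in JavaScript. This is a conceptual sketch of what fusing a map and a filter means, written by hand; it is not LispE's implementation, and the function names are hypothetical.

```javascript
const square = (x) => x * x;
const isEven = (x) => x % 2 === 0;

// Unfused: map allocates a full intermediate array before filter runs.
function chained(values) {
    return values.map(square).filter(isEven);
}

// Fused: a single pass and a single output array, no intermediate.
function fused(values) {
    const out = [];
    for (const v of values) {
        const s = square(v);
        if (isEven(s)) out.push(s);
    }
    return out;
}

const input = [1, 2, 3, 4, 5, 6];
console.log(chained(input), fused(input)); // both produce the same values
```

A runtime that performs this rewrite automatically lets the programmer keep the chained style while paying only for the fused loop.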

A developer who needs all of these capabilities through npm would assemble something like Luxon for dates, mathjs and jStat for math and distributions, compromise or natural for tokenization, ml-matrix for linear algebra, ts-pattern for pattern matching, plus their transitive dependencies. Few applications need every domain at once, of course. The point is not that LispE replaces all of these in every project, but that whichever subset an application needs, LispE provides them as a single coherent runtime rather than an assembly with version skews and duplicated subdependencies.

Maintenance Surface and When It Pays Off

For organizations subject to security or compliance regulations, the difference between auditing 800 transitively trusted npm packages and auditing one C++ source tree is concrete. Software Bill of Materials (SBOM) requirements become trivial: one component, one source, one audit. The C++ source compiles in less than a minute. The author has a thirty-year industrial track record at Xerox Research Centre Europe and Naver Labs Europe; the institutional sponsor (Naver) provides continuity beyond any individual contributor.

This is not a claim that LispE is intrinsically more secure than mainstream alternatives. It is a claim that the surface area to verify is dramatically smaller, and that small surfaces are easier to keep secure than large ones.

The cost-benefit profile of adopting LispE WASM follows from this. It pays off when an application has substantial logic that benefits from a more coherent, more expressive runtime, typically:

Numerical or statistical computation client-side (financial dashboards, scientific visualizations, statistical analysis tools)
Structured text processing (parsing, tokenization, validation, transformation)
Local-first AI agents that orchestrate LLM calls without backend infrastructure: asyncjs plus a local LM Studio or Ollama server is enough
Embedded domain-specific languages where users define rules at runtime
Replacements for collections of npm packages that have become unmaintained or risky

It is overkill for simple form validation, briefly-visited pages, applications already well-served by their existing JavaScript stack, or teams unfamiliar with Lisp who would face a learning curve disproportionate to the benefit. The honest framing is not "LispE instead of JavaScript" but "LispE for the parts of your application where it's the better tool."

A Concrete Example: The LispE Playground

The LispE WASM playground is a complete worked example of integration. The HTML page is roughly 350 lines and demonstrates the architectural patterns any LispE WASM application would follow.

See: playground code

The page maintains two interpreters:

An interpreter at index 0, created automatically when the WASM module loads, dedicated to the user's code
A second interpreter, created lazily when needed, dedicated to running the Python-to-LispE transpiler

This separation matters. The Python transpiler involves loading several thousand lines of LispE code (the BNF parser and the transpilation rules). Running it in a separate interpreter ensures that the user's code can be reset (via the RESET/RUN button) without losing the transpiler state.

The function that runs the user's code is straightforward:

function run_code() {
    var code_text = document.getElementById('get_code');
    var res;
    try {
        res = callEvalLispE(0, code_text.value);
    } catch (e) {
        res = e;
    }
    appendOutput(res);
}

The Python transpilation flow uses lazy initialization:

var pythonInterpreterIdx = -1;  // -1 until the transpiler interpreter is created

function run_python() {
    if (pythonInterpreterIdx < 0) {
        pythonInterpreterIdx = callCreateLispE();
        callEvalLispE(pythonInterpreterIdx, basic_code);
        callEvalLispE(pythonInterpreterIdx, transpiler_code);
    }
    var code_text = document.getElementById('get_code').value;
    var transpiled = callEvalLispE(
        pythonInterpreterIdx,
        '(compilepython «' + code_text + '»)'
    );
    if (transpiled.indexOf('ERROR>>>') === -1) {
        callResetLispE(0);
        var res = callEvalLispE(0, transpiled);
        appendOutput(res);
    } else {
        appendOutput("Transpilation error:\n" + transpiled);
    }
}

The pattern is illustrative: JavaScript handles UI events and DOM manipulation, LispE handles the substantive computation (here, parsing and transpiling Python), and the result is integrated back into the DOM through standard JavaScript APIs. For a Python-to-LispE transpiler running entirely in the browser, including the BNF grammar parser, the AST construction, and the pattern-matching transpilation rules, the responsiveness on modern hardware is essentially instantaneous.

Wrapping LispE for JavaScript Developers

Developers who want LispE's capabilities without writing Lisp code can encapsulate calls behind a JavaScript façade. The crucial design choice is to use the typed entry points for data-heavy operations rather than the generic string-based one:

const lispe = {
    // Numerical operations: return Float64Array, not stringified vectors
    normalDistribution: (n, mean, std) =>
        callEvalLispEToFloats(0, `(normal_distribution ${n} ${mean} ${std})`),

    matrixSolve: (A_flat, n, b) => {
        callSetqFloats(0, "A", A_flat, A_flat.length);
        callSetqFloats(0, "b", b, b.length);
        return callEvalLispEToFloats(0, `(solve (reshape A '(${n} ${n})) b)`);
    },

    // Text processing: return arrays of strings
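    tokenize: (text) => {
        // Escape backslashes first, then double quotes, so the text
        // survives being embedded in a LispE string literal
        const escaped = text.replace(/\\/g, '\\\\').replace(/"/g, '\\"');
        return callEvalLispEToStrings(0, `(segment "${escaped}")`);
    },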

    // Generic evaluation for power users
    eval: (code) => callEvalLispE(0, code)
};

const samples = lispe.normalDistribution(1000, 0, 1);    // Float64Array
const tokens = lispe.tokenize("Le café Αθήνα serves 中華 cuisine"); // string[]

This pattern allows progressive adoption. A team can introduce LispE for one specific need (proper multilingual tokenization, say) without committing to writing Lisp code throughout their application. If the experience proves valuable, more capabilities can be exposed through additional wrapper functions. If not, the dependency can be removed without disturbing the rest of the application. This is the "companion" model in its most pragmatic form.

A Concrete Pipeline: From LispE Sampling to JavaScript Consumers

The typed boundary becomes most visible in pipelines where LispE produces a large numerical result and JavaScript needs to consume it efficiently. Statistical sampling is a representative case: generating a million samples from a distribution is exactly where string-based serialization would negate the benefit of running native code, and where the typed bridge preserves it.

// One million samples from a standard normal distribution
const samples = callEvalLispEToFloats(0,
    "(normal_distribution 1000000 0 1)");
// samples is a Float64Array, ready to feed directly into:

// ... a Chart.js histogram
new Chart(ctx, {
    type: 'bar',
    data: histogram(samples, 50)
});

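// ... a WebGL buffer for GPU-side rendering
// (WebGL vertex attributes are 32-bit floats, so convert first)
gl.bufferData(gl.ARRAY_BUFFER, new Float32Array(samples), gl.STATIC_DRAW);

// ... a Web Worker via transferable, zero-copy
// (do this last: the transfer detaches samples on this thread)
worker.postMessage({ data: samples }, [samples.buffer]);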

The last case is particularly worth noting. The Float64Array.buffer property exposes the underlying ArrayBuffer, which is directly transferable to a Web Worker via the second argument of postMessage. The transfer is zero-copy: ownership of the buffer moves to the Worker, with no duplication. After the transfer, samples is detached on the sending thread (its length becomes 0), so any main-thread use must happen before the postMessage call. The complete pipeline is therefore:

LispE allocates and fills the vector in the WASM heap (zero-copy on the LispE side)
The bridge copies once into a JavaScript Float64Array
The Float64Array is transferred to a Web Worker zero-copy

For an application that wants to run statistical computation on the main thread and visualization or further processing on a Worker, this is a clean architecture: one copy across the WASM/JS boundary, none after.
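The ownership move can be checked without spawning a Worker. The sketch below assumes a runtime that provides structuredClone with a transfer list (recent browsers, Node 17 or later); structuredClone applies the same transfer rules as postMessage, so it serves here as a stand-in for the Worker hand-off.

```javascript
// A small stand-in for the array returned by the bridge.
const samples = Float64Array.of(0.5, -1.25, 2.0);

// Transfer the underlying buffer instead of copying it.
const received = structuredClone(samples, { transfer: [samples.buffer] });

const senderLength = samples.length;     // 0: the sender's view is detached
const receiverLength = received.length;  // 3: the receiver owns the data
console.log(senderLength, receiverLength, received[1]);
```

This is the practical consequence of zero-copy transfer: the data exists in exactly one place at a time, so the sender must be finished with it first.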

The same pattern applies to any LispE function that produces a numerical vector: uniform or Poisson sampling, Monte Carlo simulations, random permutations, matrix operations, time-series transformations. The callEvalLispEToFloats and callEvalLispEToInts entry points return values that are directly consumable by Chart.js, D3, IndexedDB, Web Workers, and any other JavaScript API that accepts typed arrays. WebGL and the Web Audio API work in 32-bit floats, so they need one conversion to Float32Array. There is no intermediate parsing step and, for consumers that accept 64-bit data, no second copy.
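The histogram helper used in the Chart.js call earlier is not defined in the article. A minimal sketch, assuming equal-width bins and Chart.js's usual data shape (a labels array plus a datasets array), could be:

```javascript
// Hypothetical helper: bin a typed array into `binCount` equal-width bins
// and return an object shaped like Chart.js bar-chart data.
function histogram(samples, binCount) {
    if (samples.length === 0) {
        return { labels: [], datasets: [{ label: "count", data: [] }] };
    }
    let min = Infinity;
    let max = -Infinity;
    for (const x of samples) {
        if (x < min) min = x;
        if (x > max) max = x;
    }
    // Fall back to a width of 1 when all samples are identical.
    const width = (max - min) / binCount || 1;
    const counts = new Array(binCount).fill(0);
    for (const x of samples) {
        // Clamp so that the maximum value lands in the last bin.
        const bin = Math.min(binCount - 1, Math.floor((x - min) / width));
        counts[bin] += 1;
    }
    // Label each bin with its lower bound.
    const labels = counts.map((_, i) => (min + i * width).toFixed(2));
    return { labels, datasets: [{ label: "count", data: counts }] };
}

const demo = histogram(Float64Array.of(0, 1, 2, 3, 4), 2);
console.log(demo.labels, demo.datasets[0].data);
```

The helper iterates over the Float64Array directly, so the samples never need to be converted to a regular array before charting.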

This is the property that distinguishes a serious WASM language bridge from one that merely works: data flows through the boundary in the form the consumer actually wants, not as a string the consumer must reparse.

Performance

LispE was designed with performance discipline. Specialized typed list classes (numbers, floats, integers) bypass generic dispatch for arithmetic operations. Object pools eliminate fragmentation for frequently allocated types. Symbol lookup uses a custom 16-bit hash table optimized for an interpreter's access patterns. Reference counting is deterministic, with no garbage collection pauses.

A documented benchmark on stochastic gradient descent, implemented identically in both languages, shows LispE running in 9 ms native versus 45 ms for naive Python (no NumPy). With 32-bit floats, LispE drops to 3 ms. The factor of 5 to 15 advantage holds because LispE's specialized types operate directly on contiguous memory, while Python's interpreter must dereference a PyObject for every floating-point value.

In WebAssembly, LispE pays a 1.5x to 2x penalty against the native build, but any language runtime compiled to WASM pays a comparable penalty. In practice LispE WASM remains substantially faster than Pyodide on equivalent numerical workloads, while occupying half the binary size.

Internationalization Is Not An Afterthought

LispE's text processing descends directly from the XIP parser, developed at Xerox Research Centre Europe over two decades and used in industrial NLP pipelines. This experience is encoded in concrete design choices:

Internal representation in UTF-32, guaranteeing uniform handling of all Unicode code points across operating systems
The segment function recognizes accented characters, em-dashes, typographic apostrophes, and mixed-language text without configuration
Built-in regex character classes for Greek (%h), CJK (%H), and accented Latin (%a)
Tokenization rules are runtime-modifiable, so domain-specific conventions (currency notation, scientific identifiers) can be added without recompilation
Three decimal notation conventions (Anglo-Saxon, French, permissive) are available as parameters

For applications processing text in multiple languages (a common requirement in 2026), these capabilities are immediately useful. There is no equivalent in the JavaScript standard library, and assembling them from third-party packages produces an inconsistent patchwork.

Conclusion

LispE WASM is not a revolution. It is a mature interpreter built with thirty years of accumulated discipline, brought into the browser through WebAssembly, occupying 3.5 MB of binary that includes capabilities normally distributed across hundreds of npm packages.

The proposition is straightforward: where JavaScript is strong (UI, DOM, async events, ecosystem integration), continue to use JavaScript. Where JavaScript is weak (numerical computation, multilingual text processing, pattern matching, algebraic data structures, coherent integration), consider delegating to a companion runtime. The bridge between them is typed where it needs to be (numerical data in Float64Array and Int32Array, with a single copy at the boundary and zero-copy on each side) and asynchronous where the host demands it (asyncjs with explicit callbacks, no ASYNCIFY tax).

The cost is a 3.5 MB download, paid once and cached afterward. For applications that exercise the runtime substantively, that cost is amortized within the first few seconds of use.

The playground at naver.github.io/lispe is open. The source code is on GitHub. The wiki documents the language in depth. For some applications, LispE WASM will be the right answer; for others it will not. The honest invitation is to evaluate it on its merits, in concrete use cases, against the actual alternatives.