Debugging compilers in Clojure
(jpmonettas.github.io)
This is a nitpick, but I think this is a bad mental model of most lisps:
> Since the compilation unit of most Lisps is a form instead of a file like on most other languages, the core of the ClojureScript compiler can be seen as a program that will take a string representing a Clojure form as input, read it, recursively parse it into a tree of expressions also known as an AST (abstract syntax tree), and then walks down the tree emitting strings containing JavaScript code.
There might be a string representation of the code, but one thing that’s unique about a lot of lisps is that the input to compile/eval is not text: there is usually a function called something like “read” that turns the textual format into normal lisp objects and then eval can take any lisp object and evaluate it. Typical lisp objects are “self-quoting”, but what characterizes a lisp is that the domain and codomain of eval are the same type.
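To make that concrete, here is a rough Clojure sketch of the read/eval split (a hypothetical REPL session; the second half assumes ClojureScript is on the classpath):

    ;; read-string turns text into ordinary Clojure data; eval takes data, not text.
    (def form (read-string "(+ 1 2)")) ; => the list (+ 1 2), built from symbols and numbers
    (type form)                        ; => clojure.lang.PersistentList
    (eval form)                        ; => 3
    (eval (list '+ 1 2))               ; => 3, no text involved at all

    ;; Assuming ClojureScript is on the classpath, the compiler core is roughly
    ;; analyze (form -> AST) followed by emit (AST -> JavaScript string):
    (require '[cljs.analyzer :as ana]
             '[cljs.compiler :as comp])
    (comp/emit-str (ana/analyze (ana/empty-env) '(+ 1 2)))
    ;; => a JavaScript string along the lines of "((1) + (2))"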
Your core point is absolutely true about how Lisp is special in that it usually provides a read procedure to turn a textual representation into a native object that can be evaluated (this is a side effect of homoiconicity, so any homoiconic language will have this property too), but I have one additional nitpick to make on top of yours:
> [...] eval can take any lisp object and evaluate it.
eval cannot be generalized to accepting any Lisp object, only specifically symbolic expressions (symbols, or lists (potentially nested) of symbols). I discovered this because I thought Chibi Scheme was wrongly warning about valid code[0] that injected a value into an expression passed to eval, but Marc helped me understand that the warning was correct, because Scheme only specifies what eval does for symbolic values.
Scheme might be different here, I'd have to read the standard, but it's true of Common Lisp at least. You have to distinguish here, I think, between what's valid input to eval and the evaluation rules for given inputs.
Anyways, here's CL's spec for eval[1]:

    eval form => result*

Form is defined in the glossary as "form n. 1. any object meant to be evaluated. 2. a symbol, a compound form, or a self-evaluating object."[2] And you can see this is an exhaustive categorization of lisp objects from the definition of self-evaluating object: "self-evaluating object n. an object that is neither a symbol nor a cons. If a self-evaluating object is evaluated, it yields itself as its only value."[3]
IMO, the useful definition of homoiconicity is "the domain and codomain of eval are the same type". One of the "results" of eval might be that the evaluation routine signals an error, if the object has a special evaluation rule.
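For what it's worth, Clojure's eval behaves analogously to that categorization; a small sketch with hypothetical REPL values:

    (eval 42)       ; => 42       self-evaluating object: yields itself
    (eval "hello")  ; => "hello"  likewise for strings
    (def x 10)
    (eval 'x)       ; => 10       a symbol evaluates to the value it names
    (eval '(+ x 1)) ; => 11       a compound form (a list) is evaluated as a call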
EDIT: I just read that GitHub thread more carefully, and I guess Scheme's eval is stricter and so, IMO, Scheme is less "homoiconic" :) FWIW, my experience is that Scheme pioneered a lot of things that are more similar to non-lisps than to the more lispy lisps: e.g. Schemes typically seem to prefer batch-style compilation to REPLy environments, source-as-files over image-based development, and a standard that defines the language in terms of text rather than data structures.
[1] http://www.lispworks.com/documentation/HyperSpec/Body/f_eval...
[2] http://www.lispworks.com/documentation/HyperSpec/Body/26_glo...
[3] http://www.lispworks.com/documentation/HyperSpec/Body/26_glo...
> Scheme might be different here [...]
Yep, I meant to specify that a subset of your original claim is true for all Lisps, not that your claim was not true for certain ones like CL.
> IMO, the useful definition of homoiconicity is "the domain and codomain of eval are the same type".
I do not think that is a useful definition. Under that definition, any language can be made homoiconic by providing a library with an eval procedure that accepts an AST where nodes in the tree can be non-syntax objects whose values are accepted as-is for evaluation.
The most useful definition of homoiconic IMO is that the internal representation of the language (the AST) is the same as the external representation of the language (the literal text you write). In that sense, CL and Scheme are equally homoiconic, but Python, C, etc. which you could provide an evaluator for with the same domain and codomain types, are not homoiconic.
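To put that in Clojure terms (a sketch with hypothetical names and REPL output): the "AST" the reader hands back is exactly the lists, symbols, and vectors you typed, not some separate node type.

    (def form (read-string "(defn inc2 [x] (+ x 2))"))
    form            ; => (defn inc2 [x] (+ x 2))
    (map type form) ; => (clojure.lang.Symbol clojure.lang.Symbol
                    ;     clojure.lang.PersistentVector clojure.lang.PersistentList)
    (= form '(defn inc2 [x] (+ x 2))) ; => true: the literal data equals what was read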
“The most useful definition of homoiconic IMO is that the internal representation of the language (the AST) is the same as the external representation of the language”
Except this is never true, unless your AST is a string. And Racket’s internal representation (syntax objects) is much more complicated than what is apparent from the text. So, in my opinion, defining homoiconicity in terms of the textual syntax is basically impossible. (And I’m sympathetic to Shriram Krishnamurthi’s claim that homoiconicity doesn’t mean anything).
The interesting property, in my opinion, is that languages like Common Lisp are not specified in terms of the textual syntax and so the textual representation is irrelevant to the semantics of the program: a visual tool that produces lisp forms is as valid a “syntax” of Common Lisp as the standardized textual representation produced by READ.
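A small Clojure sketch of that point (hypothetical names): a form can be built entirely as data, by any tool, and handed to eval without a textual representation ever existing.

    ;; Build (defn double-it [x] (* x 2)) as plain data, no source text involved.
    (def definition (list 'defn 'double-it (vector 'x) (list '* 'x 2)))
    (eval definition) ; defines the function
    (double-it 21)    ; => 42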
> textual representation is irrelevant to the semantics of the program
It's not completely irrelevant, since Common Lisp has the idea of a text-based file with source code and a compilation mode for it (-> COMPILE-FILE).
But generally what you say is true. We see that especially in two places: 1) the Lisp interpreter, which runs non-text s-expressions and 2) special development environments, like Interlisp-D/Medley where a prominent way to edit Lisp code is a structure editor, which manipulates s-expressions in memory.
"Homoiconic" means that code definitions are stored in the original textual form, or possibly in a tokenized form from which the text is easily recovered. Thus the program's definitions may be edited, without referring to external source code.
Bash is homoiconic because you can type "set" with no arguments, and see all the function definitions in their original syntax (albeit reformatted and without comments). You can copy and paste these definitions, editing them in between.
In Common Lisp, the perhaps little-known ed function supports homoiconicity. Function definitions whose original definition is available may be edited. (Like Bash's set, ed will serve you up a reformatted function without comments.)
Homoiconicity is mostly stupid and irrelevant, and an earmark of immature, naively implemented languages. A Lisp which compiles every definition that is fed into it, throwing away the original nested lists, cannot support a function like ed, and fails to be homoiconic, yet has all the oft discussed advantages of Lisp.
A feature that is close to homoiconicity that is useful is a REPL with history recall. In a REPL with history recall, we can write definitions. Even if those definitions are compiled, so the language image has no record of the code, the REPL has the original text in memory. So by history recall, we satisfy the use case of the programmer who wants to recall a definition, edit it and replace it. Or some of those use cases. Obviously, a program doesn't come with a REPL history of its definitions; only one that we made in the REPL.
Mainly, what is wrong with the statement is that such a program/function which reads forms from a file and compiles them is window dressing around a compiler, not its core.
I recently joined a team where I need to sift through copious amounts of very confusing code. The company is a startup, and many elements were designed hastily. The FlowStorm debugger has proven to be an invaluable tool; my life would have been significantly more difficult without it. I highly recommend it. Also, Juan is an incredibly friendly person; he is clearly very knowledgeable and always eager to help. There's a #flow-storm channel in Clojurians Slack.
Awesome work, super pumped for this. I wish the Clojure devs would release a first-party debugger; almost every other lisp has one built in.
I doubt that will happen, due to the focus on keeping core libraries and tooling platform agnostic and deferring users to host tooling (and libraries) whenever possible.
The difference between debugging and profiling Clojure and ClojureScript is night and day. e.g. async-profiler and VisualVM are much better than whatever I've seen in the Node.js and browser ecosystems. But tools like FlowStorm seem to be helping close this gap, if you decide to use the custom compilers designed for it.
As someone whose day job is developing with ClojureScript, I've been keeping a close eye on developments in FlowStorm, and I always send this video to get people interested: https://www.youtube.com/watch?v=4VXT-RHHuvI
If this works you can have all my money. Stellar work in advance. Ohhh, do I dare even hope?
Edit: sorry, I realize my comment is confusing. I got very emotional when I saw FlowStorm http://www.flow-storm.org/
ClojureScript debugging... oh my god, I can't even imagine.