Settings

Theme

Show HN: Minimal Lisp in ~70 lines of Haskell

gist.github.com

76 points by die_sekte 15 years ago · 23 comments

Reader

tel 15 years ago

I think this might be some of the most beautiful code I've seen in a long while

    value :: Parser Value
    value =
          List <$> (char8 '(' *> sepBy value (takeWhile1 isSpace_w8) <* char8 ')')
      <|> Number . fst . fromJust . B.readInteger <$> takeWhile1 isDigit_w8
      <|> Symbol <$> takeWhile1 (inClass "A-Za-z\\-")
The power of parser combinators representing the elegant lisp syntax.
  • snprbob86 15 years ago

    I'm familiar with parser combinators, but not the Attoparsec or other libraries. Furthermore, I have limited Haskel experience, so I'm struggling to read this. In particular, the use of symbols and overall terseness are hard to get through. I'm also really uncomfortable with the order of operations of all these operators.

    Now, I know that Hoogle exists, so I'm able to look these things up, but it's pretty tough to untangle. Particularly when searching for

      *>
    
    I found something that reads "This module describes a structure intermediate between a functor and a monad: it provides pure expressions and sequencing, but no binding. (Technically, a strong lax monoidal functor.)" Which is so full of jargon, my head hurts. And I even know what most of that jargon means!

    So as far as I can tell, here's what's going on....

      Foo <$>
    
    Defines a grammar production using the left argument as a constructor on the matched value of the right argument.

      <|>
    
    Separates productions as alternatives; in a PEG Grammer style. Both the <$> and <|> operators combine functions to result in a big master function that takes input and has the result type that is the big `one of A, B, C` algebraic type that is defined earlier: "Value".

      char8
    
    Matches a literal character (passed as an argument).

      char8 '(' *>
    
    Reads a parenthesis and throws it out of the result set. Similar for the close parenthesis later on.

      takeWhile1 isSpace_w8
    
    Produces a parser that consumes whitespace.

      sepBy value (takeWhile1 isSpace_w8)
    
    Combines to produce a parser which reads a list of values (recursive here) and presumably eliminates the separators from the result.

      takeWhile1 isDigit_w8
    
    Similar story, reads an integer.

      Number . fst . fromJust . B.readInteger
    
    Is a composed constructor that may or may not parse an integer, forcibly assumes it will succeed to parse one (which we know because we're reading digits). That integer parse stops when it hits a non-digit, and the unconsumed characters are returned as the second element of a tuple, so we call fst to take the first element of the tuple, discarding the unread characters (the main parser will consume those).

      takeWhile1 (inClass "A-Za-z\\-")
    
    Similar story again to the whitespace and numbers.

    OK, yeah, so.... amazing. Incredibly brilliant little bit of code. But holy hell was that tough to read. And I have no idea if I'd be able to write it in only a few hours. I'm far from certain I'd ever learn to write it as fast or faster than something more verbose.

    Please let me know if I've misunderstood something....

    • palish 15 years ago

      "I'm familiar with parser combinators, but not the Attoparsec or other libraries. Furthermore, I have limited Haskel experience, so I'm struggling to read this. In particular, the use of symbols and overall terseness are hard to get through. I'm also really uncomfortable with the order of operations of all these operators."

      This is why I've never identified with "code should comment itself".

      I'm not afraid to admit it: I am a dumb programmer. As a dumb programmer, the only way for me to be a good programmer is to be effective.

      To be effective, I need to minimize the "time spent understanding code" phase of software development. Furthermore, I will likely be working within a team.

      The opposite of that is to be writing "prototype" code --- code which is written to explore the problem space and to gain a better understanding of specific patterns to best accomplish a goal.

      But once the prototype phase is over, all of that code is deleted, and is replaced with commented code. (If possible, I try to explain each step of the algorithm in English, first, and then write the code. This is slower, but results in far fewer bugs and a more solid foundation.)

      I look at it this way: when my life ends, either I will have gone as far as my own brain was able to take me --- or I will have gone as far as my team's brains were able to take us. I'm willing bet my life that teams of "less brilliant" people are more effective than individuals of excessive brilliance. I write my code accordingly.

      -------------------

      That said, there is no excuse for laziness. Even if I am merely "competent", it's important for me to strive to be brilliant, even if I'll never attain it.

      • ezyang 15 years ago

        In my opinion, this code comments itself, even more so than a manually, written-out parser would.

        Here's how I would think about it. What would a grammar for Lisp look like?

            value := list | number | symbol
            list := '(' value + ')'
            number := [0-9]+
            symbol := [A-Za-z-]+
        
        OK, great. Now how do I look at this code?

            value :: Parser Value
            value =
                  List <$> (char8 '(' *> sepBy value (takeWhile1 isSpace_w8) <* char8 ')')
              <|> Number . fst . fromJust . B.readInteger <$> takeWhile1 isDigit_w8
              <|> Symbol <$> takeWhile1 (inClass "A-Za-z\\-")
        
        Even if I don't understand the combinators, I can blank them out for now and focus on the recognizable semantic bits. I see List, Number and Symbol, ok, so my guess is that <|> does something like a | might in my grammar, and each line corresponds to a different way a value can be expressed. For list, I see some quoted '(' and ')', so I guess that 'char8' means "match this literal." In fact, I can just read all of that off, and it makes sense. Nevermind what > and < are doing, I'll ignore that for now. And so forth.

        Suppose I wanted to make a superficial change, like make $ a recognized symbol in the language. That's super easy. I don't even need to look at the docs. If I want to introduce a new syntactic construct? A little harder; I'll have to go check the attoparsec docs. But you'd have to look up the docs for a library in any language, anyway.

        All's not well; in particular, this code conflates the tokenizing and parsing steps (notice that I don't say anything about whitespace in my grammar, but there's some line-noise here dealing with it.) That decreases the readability a little, but you gain efficiency with it.

        Maybe there's a tradeoff: the use of a library and symbolic combinators makes it harder to tell precisely how the code manages to actually do any parsing. That's true of any abstraction. But what's really great about this is that I can easily tell what the big picture of the code is. A page or two of state machine would not do that for me!

    • aristidb 15 years ago

      The syntax elements that you struggled with are explained here: http://learnyouahaskell.com/functors-applicative-functors-an...

      They are so incredibly general that you can use them in a lot of places, so the investment to learn their meaning pays off quickly. Learn You A Haskell actually explains them _before_ explaining Monads.

    • sfvisser 15 years ago

      When you've worked with Haskell a bit this code is actually very readable. Admittedly some experience is required.

      With a two named helper functions and a bit of additional alignment everyone that has ever written a parser before in Haskell immediately recognizes the structure, thanks to use of applicative functor syntax (<|> and <$>):

          value :: Parser Value
          value =  List   <$> parenthesis (sepBy value (takeWhile1 isSpace_w8))
               <|> Number <$> parseNumber
               <|> Symbol <$> takeWhile1 (inClass "A-Za-z\\-")
          
            where parenthesis p = char8 '(' *> p <* char8 ')'
                  parseNumber   = fst . fromJust . B.readInteger <$> takeWhile1 isDigit_w8
Peaker 15 years ago

The content of Fun is actually the type: Context -> Value -> (Context, Value).

If you flip the arguments, you get: Value -> Context -> (Context, Value). The (Context -> (Context, Value)) is actually the State type:

    State s a = s -> (s, a)
I modified your code to use this type. During the process, I encountered some peculiar things (and factored out some things), see comments in code.

https://gist.github.com/950445

Funny that it turns out slightly longer when re-using more code due to syntactic artifacts. Monad comprehensions (recently restored to GHC via an extension) would resolve that.

  • die_sekteOP 15 years ago

    The eval throwing away context does not make any sense in retrospect. I think I made it that way because of some ghetto-scoping ideas.

    You broke car and cdr somehow (these don't evaluate their arguments anymore), but I have no idea how you did this.

    This was mostly a training exercise, this is why there is no significant use of monads in there: I simply haven't learned enough about them to feel comfortable using them.

    • Peaker 15 years ago

      Can you give the example that breaks car/cdr? They seem to be evaluating their args here:

          fst $ repl "(car (cons 1 (quote ())))"
          Number 1
      • die_sekteOP 15 years ago

        Sorry, that was my fault. I didn't make lambdas evaluate their arguments and I didn't notice this in my code. I.e. car/cdr work correctly, fun doesn't.

stiff 15 years ago

I do not know Haskell enough to fully judge this, so maybe someone can correct me, but that seems _very_ minimal. From what I understand, it doesn't even have arithmetic operations, it "executes" incorrect programs without any error (try repl "())") and also it doesn't have any kind of scoping, for example in the following code a function argument is bound in the global (and only) environment:

Main> repl "(begin ((fun (x y) y) 1 2) y)"

([("x",Number 1),("y",Number 2),("begin",Fun),("car",Fun),("cdr",Fun),("cons",Fun),("cond",Fun),("def",Fun),("eval",Fun),("fun",Fun),("t",Symbol "t"),("quote",Fun)],Number 2)

I don't think it's fair to call this a Lisp at all at this point. There is a nice writeup of implementing a Lisp in Python by Peter Norvig, where at least the most basic things are implemented correctly (and the code is documented): http://norvig.com/lispy.html

Also, I get errors even with some very simple statements that theoretically seem to be implemented like:

Main> repl "(cons 1 2)"

([("begin",Fun),("car",Fun),("cdr",Fun),("cons",Fun),("cond",Fun),("def",Fun),("eval",Fun),("fun",Fun),("t",Symbol "t"),("quote",Fun)],List

[Exception: lisp.hs:48:8-48: Irrefutable pattern failed for pattern (ctx', [v', (Main.List vs')])

Am I doing something wrong here?

kleiba 15 years ago

Cool, only 70 lines!!

(plus a ton of libraries)

j/k ;-) Actually, I like this - it shows how cool Haskell really is! No, wait - it shows how cool Lisp really is! No wait... I'm confused... :-)

  • GregBuchholz 15 years ago

    You might also like: Lisprolog http://stud1.tuwien.ac.at/~e0225855/lisprolog/lisprolog.html

    "Some online books show how to implement a simple "Prolog" engine in Lisp. They typically assume a representation of Prolog programs that is convenient from a Lisp perspective, and can't even parse a single proper Prolog term. Instead, they require you to manually translate Prolog programs to Lisp forms that are no longer valid Prolog syntax. With this approach, implementing a simple "Lisp" in Prolog is even easier ("Lisp in Prolog in zero lines"): Manually translate each Lisp function to a Prolog predicate with one additional argument to hold the original function's return value. Done. This is possible since a function is a special case of a relation, and functional programming is a restricted form of logic programming.

    Here is a bit beyond that: lisprolog.pl

    These 162 lines of Prolog code give you an interpreter for a simple Lisp, including a parser to let you write Lisp code in its natural form."

  • eru 15 years ago

    If you like this, there's a more involved version: http://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_H...

  • magoghm 15 years ago

    "Actually, I like this - it shows how cool Haskell really is! No, wait - it shows how cool Lisp really is! No wait... I'm confused... :-)" <- +1

guard-of-terra 15 years ago

See also http://www.defmacro.org/ramblings/lisp-in-haskell.html

nkassis 15 years ago

Now if someone will just write a 70 line Haskell interpreter in lisp we can have lispkell all the way down.

meric 15 years ago

How long did it take to write?

  • die_sekteOP 15 years ago

    About 2-3 hours. Lots of breaks though; I started about 16 hours ago and finished about 4 hours ago.

    • yason 15 years ago

      So it's really 6-12 hours. You're working on the code in the background; you probably couldn't just have written it three hours straight. Or more likely, you were sorting out the details within the 12-hour period but you had been toying around with the idea for days if not weeks.

      Just guessing and certainly not trying to look down your work, it's beautiful. It's beautiful even if it had taken a week to write. The time to write it is just interesting because it is exactly this that makes programming hard to measure.

      How does any explain his productivity to a boss if you get seemingly nothing done for four days and then write a nearly complete solution on the fifth day? How much is one actually working after all, counting all the time that affects the work? Does it count as work if you go biking on a Saturday but you kind of subconsciously think about your work and then you write great stuff on Monday, thanks to that?

      One thing is for sure: you're much better off measuring a programmer's efforts by his accomplishments instead of the time to do them. This is backed up by the fact that an entrepreneur programmer gets rewarded much more fairly than a salaried programmer.

Keyboard Shortcuts

j
Next item
k
Previous item
o / Enter
Open selected item
?
Show this help
Esc
Close modal / clear selection