Gerbil – An opinionated dialect of Scheme designed for systems programming

github.com

178 points by cmpitg 8 years ago · 79 comments

xfer 8 years ago

Here is some context on Racket vs Gerbil by a well-known Common Lisp programmer, François-René Rideau: https://fare.livejournal.com/188429.html

  • e12e 8 years ago

    Some excellent comments there.

    Looking around a bit, I came across a couple of implementations that might be of interest for those wanting to mash up "some kind of system programming" and "scheme": Bigloo (interpret / compile to executables, Java bytecode, or experimentally .NET) and Larceny (interpret / direct compile to machine code / optionally via C):

    http://www-sop.inria.fr/mimosa/fp/Bigloo/

    http://www.larcenists.org/

    • metaobject 8 years ago

      I used Bigloo years ago to build Scheme bindings to a few of our C applications and libraries. It's so much more fun to script your applications - similar to how you can/could script the GIMP with SIOD (Scheme in One Defun) Scheme.

    • cat199 8 years ago

      Chicken Scheme

      https://www.call-cc.org/

      is another good one - compiles to binary, many libraries (eggs)

    • WindowsFon4life 8 years ago

      Try building a modern project in either. Anything non-trivial will highlight where Gerbil stands out.

  • DigitalJack 8 years ago

    Thank you for posting this.

kovrik 8 years ago

Can anyone explain why there are so many implementations of Scheme written in Scheme? What is the point of doing that (apart from learning purposes)?

I know, for example, that people want the Racket VM to be implemented in Chez Scheme because Chez is super fast. But what about all the other implementations?

Also, as I'm currently writing an R5RS/Clojure hybrid in Kotlin, can anyone please share any _simple_ standard algorithm for implementing the R5RS macro system and macro expander?

The only thing I could find is https://www.cs.indiana.edu/chezscheme/syntax-case/

  • jbclements 8 years ago

    I'm afraid I literally laughed out loud at your request for a simple standard algorithm for r5rs macro systems. However, it's a sympathetic laugh. Macro Hygiene is still very hard, principally because (I claim) there's not yet a clean and widely accepted model for it. Matthew Flatt's "sets of scopes" model is (IMNSHO) the current leader. Time will tell whether he or anyone else comes up with a simpler and more widely accepted model. But yes: as an implementor, it "feels" very heavy, and you keep thinking that there must be a simpler solution to this problem (aside from just throwing the problem out and giving up on hygiene and language composability).
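
    To make the problem concrete, here is the classic swap! example in plain R5RS syntax-rules; this is just an illustration of what hygiene buys you, nothing implementation-specific:

      (define-syntax swap!
        (syntax-rules ()
          ((_ a b)
           (let ((tmp a))
             (set! a b)
             (set! b tmp)))))

      ;; Hygiene keeps the macro's tmp distinct from the caller's tmp,
      ;; so this still swaps correctly instead of silently misbehaving:
      (let ((tmp 1) (b 2))
        (swap! tmp b)
        (list tmp b))   ; => (2 1)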

    • kovrik 8 years ago

      Yeah, you are right. I've already implemented everything from R5RS but the macro system. And I don't know where to start, just can't wrap my head around it.

      And people say that Scheme has a very minimalist design and is easy to implement...

      • groovy2shoes 8 years ago

        You may find this paper helpful [1]. As noted there, syntax-rules really consists of two different parts: (1) the hygiene-preserving macro expander and (2) the pattern-matching facility. Once you wrap your head around the two pieces, it's actually pretty straightforward to implement them both. It's very common to write some "low-level" macro facility as a stepping-stone to syntax-rules (such as explicit-renaming macros or syntactic closures): the implementation of syntax-rules then becomes a composition of the pattern-matcher and the low-level facility. A tutorial implementation of an appropriate pattern-matcher can be found at [2] and [3].
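
        To make part (2) concrete, here is a minimal sketch of the pattern-matching half, with ellipsis patterns deliberately left out (representing bindings as an alist is just one arbitrary choice):

          ;; Match FORM against PATTERN.  Symbols listed in LITERALS must
          ;; match themselves; every other symbol is a pattern variable.
          ;; Returns an alist of bindings, or #f on failure.
          (define (match-pattern pattern form literals)
            (cond ((and (symbol? pattern) (memq pattern literals))
                   (and (eq? pattern form) '()))
                  ((symbol? pattern)
                   (list (cons pattern form)))
                  ((and (pair? pattern) (pair? form))
                   (let ((car-b (match-pattern (car pattern) (car form) literals)))
                     (and car-b
                          (let ((cdr-b (match-pattern (cdr pattern) (cdr form) literals)))
                            (and cdr-b (append car-b cdr-b))))))
                  (else (and (equal? pattern form) '()))))

          ;; (match-pattern '(_ a b) '(swap! x y) '())
          ;;   => ((_ . swap!) (a . x) (b . y))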

        There's lots of good reading to be found at the ReadScheme Library [4]. Most if not all of the references in [1] can be found there.

        Tangentially: yeah, syntax-rules kinda flies in the face of the oft-mentioned minimalism of Scheme, but there's a reason for it: at the time syntax-rules was standardized, there was no consensus on which (if any) low-level facility should be standardized (and, really, there still isn't any such consensus). The reason syntax-rules operates as a pattern-template rewriting system rather than as a procedural system is that such a system is able to guarantee that the macros it produces are hygienic in a (comparably) simple and intuitive way. The idea was to standardize on a high-level system so that Scheme could have a standard macro facility, whilst leaving the low-level systems open for further exploration and experimentation. The two main contenders are still, after all these years, syntax-case and syntactic closures: the latter being easier to implement and arguably easier to grok, the former being potentially more powerful (in that an implementation of syntactic closures within syntax-case is known, but not vice-versa (last I checked)).

        --

        [1]: http://mumble.net/~jar/pubs/scheme-of-things/easy-macros.pdf

        [2]: http://blog.theincredibleholk.org/blog/2013/02/11/matching-p...

        [3]: http://blog.theincredibleholk.org/blog/2013/02/12/patterns-w...

        [4]: http://library.readscheme.org/

        • kovrik 8 years ago

          Hey, thank you so much for that! Haven't seen some of those papers. Looks promising. Cheers!

  • noelwelsh 8 years ago

    Scheme has a strong tradition in programming pedagogy---it is the language used in SICP and other classic books---and writing Scheme in Scheme is what many of these books build up to. So writing your own Scheme implementation is a natural thing for many Scheme programmers to do.

    As for Racket, IIUC the plan is to migrate the current C VM to a Scheme one building on Chez Scheme. It won't be that Racket will be running on Chez. The Racket fork of Chez and Racket itself will be one and the same thing.

  • huntie 8 years ago

    I think it's usually to provide extra features. For example, Gerbil has a more advanced macro and module system than most Scheme implementations. It's also very easy to do because you're basically given an AST for your new Scheme that you transform into the base Scheme.

    Gerbil's docs have an example of this: https://github.com/vyzo/gerbil/blob/master/doc/tutorial/lang...

    • kovrik 8 years ago

      Can these 'more advanced' features be implemented in, say, Racket/Guile/Chez?

      • huntie 8 years ago

        I don't know why they couldn't be. I'm pretty sure Gerbil's macro/module systems are heavily inspired by Racket.

  • xfer 8 years ago

    The well-known algorithm for macro expansion is mark-and-rename, although the Racket implementation was later changed to scope-sets (http://www.cs.utah.edu/plt/scope-sets/index.html).
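
    As a rough intuition only (this is not the full algorithm): each expansion step stamps the identifiers a macro introduces with a fresh mark, and later resolution compares those marks, so a macro-introduced tmp can never collide with a user's tmp. A toy representation might look like this:

      ;; Toy illustration only: represent a renamed identifier as a
      ;; (symbol . mark) pair.  A real expander tracks sets of marks and
      ;; adds/cancels them around each expansion step.
      (define fresh-mark
        (let ((counter 0))
          (lambda ()
            (set! counter (+ counter 1))
            counter)))

      (define (rename identifier mark) (cons identifier mark))

      ;; (rename 'tmp (fresh-mark)) => (tmp . 1)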

  • Johnny_Brahms 8 years ago

    The simplest way would probably be to implement explicit and implicit renaming macro transformers (a rough sketch follows at the end of this comment). They are found in many Schemes and are quite a bit simpler than the syntax-case you linked to. Then you can just glue parts of ashinn's match.scm on top of it for the pattern matching.

    There is also an SRFI for an improved-hygiene, low-level macro facility that is rather elegant. Can't remember the number though. 70-something.
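
    For reference, an explicit-renaming version of the usual swap! example looks roughly like this (written against Chicken's er-macro-transformer; other Schemes spell the hook slightly differently):

      (define-syntax swap!
        (er-macro-transformer
         (lambda (form rename compare)
           (let ((a (cadr form))
                 (b (caddr form))
                 (%let  (rename 'let))
                 (%set! (rename 'set!))
                 (%tmp  (rename 'tmp)))
             ;; The rename calls keep let, set! and tmp hygienic even if
             ;; the caller has rebound those names.
             `(,%let ((,%tmp ,a))
                (,%set! ,a ,b)
                (,%set! ,b ,%tmp))))))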

  • cat199 8 years ago

    A single unifying scheme standard?

    You must be new here..

  • nathancahill 8 years ago

    HN uses * instead of Markdown's _ for emphasis.

  • ClFromEmacs 8 years ago

    Can someone explain these sorts of questions?

WindowsFon4life 8 years ago

Gerbil is a recent addition to the charts at https://ecraven.github.io/r7rs-benchmarks/ and is doing quite well.

  • nextos 8 years ago

    Yes, Chez & Stalin are the fastest in these benchmarks, which seems to match what the community usually answers when asked about quick implementations.

    Sadly, Stalin is unmaintained. Its whole-program optimization techniques were really advanced. I remember it even got my ivory-tower professors, who were big in the static analysis field, excited.

    Now, a tricky question. I'm mostly unfamiliar with Scheme for writing real-world code. Will the ongoing merger of Chez with Racket make the latter a clear winner in the Scheme camp? How are libraries and FFI?

    A problem with Scheme is excessive fragmentation. Having a clear winner would be cool for library support. Racket is great due to multiple paradigms and DSLs [1]. I hope it eventually becomes a very practical Lisp with all Mozart/Oz semantic goodies.

    I am trying to do all my projects in a Lisp. Lately this is either Clojure or SBCL. Both have good libraries and decent FFIs. Clasp (Common Lisp on LLVM) [2] has gotten me excited, as good interfacing with C++ will be great to access a lot of quick numerics code.

    [1] https://beautifulracket.com/appendix/domain-specific-languag...

    [2] https://github.com/drmeister/clasp

    • baldfat 8 years ago

      > Will the ongoing merger of Chez with Racket make the latter a clear winner in the Scheme camp?

      Academically they have all the super giants of Scheme. It is only within the past few years, with the renaming to Racket, that the success story has started unfolding, after over 20 years of development. I think Racket will end up being the clear leader not just of Scheme but of Lisp. Reminds me of when R just took off 5 or 6 years ago.

      • nextos 8 years ago

        That would be amazing. I think the issue with Scheme, and in general most Lisps, has been excessive fragmentation.

    • kovrik 8 years ago

      > A problem with Scheme is excessive fragmentation.

      Yes. And not only that, but also the fact that each Scheme implementation is slightly (at best) different from all the others. I guess that is because the Scheme standards are not that strict (compare with the Java specs, for example).

      • dleslie 8 years ago

        They're fairly strict; if you code to RnRS and use only SRFIs then you'll find your code to be fairly portable... And nigh-useless.

        The problem with the Scheme standards is that the committee refuses to be opinionated about implementation details, and so the standard is defined in terms of itself with little to no consideration for the environment in which Scheme will operate. In practical terms, this means that if you want to communicate with other libraries, or virtually any aspect of the system on which you're running, then you're venturing outside of the Scheme specification and into implementation-specific territory.

        Scheme is a toy language. The implementations are not, but then they aren't Scheme so much as they are Scheme with useful extensions.

        • kovrik 8 years ago

          I don't know; every time I check different common implementations (Racket, Guile, Chicken, Chez, Kawa), they all behave very differently. Very often even a very simple code snippet works fine in one implementation, but not in the other(s).

          And while writing my own Scheme, I literally had to choose "Ok, I will implement this Racket-way; This I will implement Guile-way" etc.

          • Johnny_Brahms 8 years ago

            I have found that you can fairly easily write portable R6RS code, but whenever you step outside R6RS (for stuff that isn't standardized, like networking) it becomes an exhibition of cond-expand abuse.
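
            The kind of glue you end up writing looks something like this (feature identifiers and module names are from memory and will vary per implementation):

              (cond-expand
                (chicken (import (chicken tcp)))   ; Chicken 5's TCP module
                (gauche  (import (gauche net)))    ; Gauche's socket library
                (else (error "no known networking module for this Scheme")))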

    • ClFromEmacs 8 years ago

      R7RS ftw.

      • Johnny_Brahms 8 years ago

        I feel that not standardising a low-level FFI for R7RS-large is a mistake. They should at least recommend some reasonably low-level stuff that can be used to build abstractions.

        • cat199 8 years ago

          Agree with the sentiment, but at the same time understand the rationale - FFI implies a lot of GC/memory internal interfacing stuff, and to some extent expects a C-based implementation (not e.g. JVM, CLR, etc.).

          If R7RS and its library interface are widely adopted, some flavors of FFI could evolve within the library/SRFI process and gradually become de facto standards.

          Probably hoping too much.. but in any event..

          • dleslie 8 years ago

            That didn't happen with R5RS, why would it happen now?

            Face it, by refusing to be opinionated they ensured Scheme will remain a toy.

            • Johnny_Brahms 8 years ago

              Because R7RS-large is an effort that is probably bigger than Common Lisp. It is very much not a toy language.

              • dleslie 8 years ago

                If it doesn't have an ffi it will remain a toy that relies on implementations to make it useful.

          • Johnny_Brahms 8 years ago

            I think that the best way to go about it (at least if you want to be successful) would be to write an SRFI that could easily be implemented using the FFIs that are already out there.

            Managing that, but still being flexible enough to be useful, is a huge amount of work, at least if you want to do more than just "call this c function".

      • ClFromEmacs 8 years ago

        http://snow-fort.org/link/ has very portable packages that are quite complete. R7RS provides for portable code.

jbclements 8 years ago

(Mostly just hoisting Noel Welsh's comment to the top level) How will the expected upcoming move of Racket to the use of Chez affect the comparison between Gerbil and Racket? I'm trying not to set my expectations too high. BTW: RacketCon! This weekend!

  • baldfat 8 years ago

    I am super excited for Chez. If we see that kind of speed, we are looking at a possible doubling in speed.

    There is certainly a reason why he has stuck with Gambit and not my beloved Racket. This seems perfect for using Racket's macros. I just don't know Gambit's macros well enough to compare them to Racket's.

    Here are the benchmarks for Chez, Racket, and Gambit: https://ecraven.github.io/r7rs-benchmarks/

  • 616c 8 years ago

    No way!!! Holy crap balls will this get interesting. To be clear, is this what mflatt's racket7 repo is all about!? This is what I found from quick Googling.

    https://github.com/racket/racket7

    Is this research work like Pycket (Racket on PyPy), or is this a blessed project that Racket's official implementation will cut over to?

nemoniac 8 years ago

Already a handful of "how does this compare...?" postings, but the standard for a systems-programming Scheme is scsh, a fine abstraction layer over POSIX dating from the nineties.

https://scsh.net/

  • Naac 8 years ago

    I think maybe there is a different interpretation of the words "systems programming"?

    I took it to mean a Scheme used for developing close-to-the-hardware programs, whereas maybe others are taking it to mean a language which can be used to write common scripts and tools used by systems engineers.

    • xfer 8 years ago

      Close-to-the-hardware work can also be done with Scheme, and indeed has been done in the past. Look at all the smartcards running a JVM. For Scheme, it just requires some engineering effort, and there is not much demand for it. Btw, there is picobit (https://github.com/stamourv/picobit), which can run on micro-controllers.

      • mschaef 8 years ago

        PreScheme serves the same sort of role within the Scheme48 ecosystem. It's been a while since I've looked at Scheme48, but if I remember right, the VM is written in PreScheme, which is usually compiled down to either C or machine code. However... you can also run PreScheme code within Scheme48, so that it's possible to test/update the VM semantics without actually compiling down to C for each dev cycle.

      • cat199 8 years ago

        That wasn't the point of parent -

        scsh is purely interpreted and interfaces well with user-level programs, but it was being put forth as a systems-programming tool. While this works for some definitions of systems programming, it doesn't for others, and it is this distinction that the parent was pointing out.

  • WindowsFon4life 8 years ago

    I think you mean "shell scripting".

zitterbewegung 8 years ago

How does this compare to PreScheme from Scheme48?

http://www.s48.org

https://en.wikipedia.org/wiki/PreScheme

ClFromEmacs 8 years ago

The link should have been to http://cons.io (see Freenode #gerbil-scheme for active development and help). So far it is very nice and about twice as fast as SBCL.

roma1n 8 years ago

For the uninitiated, how does this compare to Guile?

  • ratboy666 8 years ago

    Based on Gambit-C: a Scheme-to-C (or JavaScript... and I think there may be other targets) compiler. It comes with a Scheme->target compiler; you then compile that code and deploy. So, you write in Scheme, and deliver in C...

    Comes with an interpreter, and supports macros, etc. Gambit also supports an easy C FFI, an option for infix syntax, massive threading, and a full numeric tower.
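
    The C FFI in particular is pleasantly terse; a small sketch from memory (check the Gambit manual for the exact c-lambda/c-declare details):

      (c-declare "#include <math.h>")

      ;; Expose the C library's sqrt as a Scheme procedure.
      (define c-sqrt
        (c-lambda (double) double "sqrt"))

      (display (c-sqrt 2.0))   ; => 1.4142135623730951
      (newline)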

    Easy to hack on (in my opinion), and performs very well. Gambit-C has been my "goto" Scheme for the past decade.

    Could be used as an "extension language", but not its main purpose. Mostly, a way to deliver Scheme code that runs fast.

    Racket has a better IDE -- but I like the simplicity of Gambit-C.

  • ClFromEmacs 8 years ago

    Supports R7RS, networking, LevelDB, LMDB, MySQL, and is very fast.

jblow 8 years ago

Why do people keep thinking they can use a garbage-collected language for “systems programming”?

  • flavio81 8 years ago

    >Why do people keep thinking they can use a garbage-collected language for “systems programming”?

    Why do you think it can't be done?

    It was already used for systems programming in the early 80s; whole machines were programmed in Lisp at a low level: TI Explorer, Xerox workstations, and others.

    • rfreytag 8 years ago

      I distinctly remember waiting quite a long while with my Symbolics locked up while it completed a GC.

      • cat199 8 years ago

        I distinctly remember waiting quite a long while with my PC locked up while it completed a Win32 load.

      • flavio81 8 years ago

        Can't help but recall this quote:

        "Bad response time doesn't bother the Real Programmer -- it gives him a chance to catch a little sleep between compiles. "

        -- from "Real Programmers Don't use Pascal"

        http://web.mit.edu/humor/Computers/real.programmers

      • TeMPOraL 8 years ago

        GCs are better these days. And you're much more likely to wait on IO anyway, at every granularity level in the system.

  • tom_mellior 8 years ago

    Why do people keep thinking that "systems programming" is a term with a well-defined meaning?

    I've seen it used for anything from operating system kernels to database systems to anything that talks to the network to command-line utilities like ls.

    Garbage-collected languages are fine for all of these except maybe kernels. But even there, you might use a system that lets you do critical sections in a mode where the GC is disabled, or in a sublanguage that is guaranteed to be GC-free.

  • tokenrove 8 years ago

    Well, it's been happening since the '80s when several OSes were written in ostensibly garbage-collected languages. So it might be a little late to complain.

  • e12e 8 years ago

    Because of Lisp machines, Smalltalk/the Alto, the popularity of Java, and the popularity of golang? As well as a disagreement about what "systems programming" means?

    But when you can do real-time video processing in an (at the time) young alternative implementation of Python[1] - I'm not sure what we're arguing about anymore..?

    [1] https://morepypy.blogspot.no/2011/07/realtime-image-processi...

  • CoolGuySteve 8 years ago

    I always wonder exactly what kind of "systems" these people are programming, clearly not any with real-time constraints. It almost seems like "systems programming" got co-opted to mean "not interpreted". At least that's what I noticed with Go's marketing.

    There are a few HFT firms, most notably Virtu, that use Java. But my understanding from having interviewed people that worked there is that it's so convoluted to avoid GC pauses that you might as well be using C++.

    • seertaak 8 years ago

      > I always wonder exactly what kind of "systems" these people are programming, clearly not any with real-time constraints.

      Well Niklaus Wirth for one was able to design an entire OS using a language (Oberon) that had garbage collection.

      > There are a few HFT firms, most notably Virtu, that use Java.

      The hedge fund where I worked used Java. We didn't have problems with latency, although admittedly we weren't doing the really low-latency stuff. There are patterns that you can use in Java that basically emulate manual memory management. In Java this is a little harder than in C# because there are no value types, but java.nio basically lets you do whatever you want, so you can always allocate everything up-front.

      In fact, just for fun I did this myself in an audio setting. I had an audio engine with a ring buffer to pass messages to the event loop, and with a little care I was able to ensure that there was literally no garbage that made it past the first phase (i.e. locals). Those are essentially free, so basically the GC wasn't getting any pressure at all. It wasn't that hard to write, and with the excellent profiling tools available on the JVM it's easy to see whether you're making use of longer-term GC sweeps or not.

      Finally, I would add that there are scenarios where C++ moves into automatic memory management. std::shared_ptr is an example of (shitty) automatic memory management, but there have been efforts, notably by Herb Sutter, to provide precise GC-collection as a library. For some non-blocking multithreaded algorithms, GC schemes are actually necessary since allocation becomes one of the prime vectors through which blocking occurs.

      In conclusion, while it's certainly the case that most systems programming is done in languages with manual memory management, it's a little less binary than you suggest. That said: I'm writing a DAW (zenaud.io) and for that I am using C++, mainly for memory management :)

    • kazinator 8 years ago

      Many operating systems that use manual memory allocation and/or reference counting techniques are nevertheless unsuitable for real time.

      Modern desktop and mobile device operating systems often exhibit embarrassing lulls in responsiveness that resemble pauses in a rudimentary garbage collector.

      Real-time operation can be "bolted on" to a non-real time operating system as a small specialized kernel which has higher priority access to the CPU and its own, separate resource management.

    • geofft 8 years ago

      I don't tend to think of hard realtime as "systems programming". It's related in that a lot of the good languages for one are good for the other, but they don't seem like the same fundamental problem space - note that almost no OS kernels and even fewer base userspace layers (libc or equivalent, coreutils or equivalent, etc.) are hard realtime.

    • naasking 8 years ago

      Real-time GCs with microsecond latencies do exist.

    • mschaef 8 years ago

      > But my understanding from having interviewed people that worked there is that it's so convoluted to avoid GC pauses that you might as well be using C++.

      I've done a very minimal amount of this... the gist is that you avoid GC pauses by avoiding allocation. This translates into reusing objects using pools, etc.... and the assorted complexities that come from having to explicitly manage object lifecycles. In critical applications you often want to avoid dynamic memory in C/C++, so it may not be all that different.
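
      The pattern itself is language-agnostic; here is a deliberately simplified free-list pool, sketched in Scheme to stay with the thread's theme (no error handling, single-threaded):

        ;; Pre-allocate n buffers up front, then check them out and return
        ;; them instead of allocating in the hot path.
        (define (make-pool n size)
          (let loop ((i 0) (free '()))
            (if (= i n)
                (list free)   ; one-element list used as a mutable box
                (loop (+ i 1) (cons (make-bytevector size) free)))))

        (define (pool-acquire pool)
          (let ((free (car pool)))
            (if (null? free)
                (error "pool exhausted")   ; or fall back to a fresh allocation
                (begin (set-car! pool (cdr free))
                       (car free)))))

        (define (pool-release pool buf)
          (set-car! pool (cons buf (car pool))))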

    • astrobe_ 8 years ago

      It seems to me that systems (with an 's') programming was a term introduced by Go, and I always understood it as something with a wider scope than system programming: "intermediate level" applications for which a GC is acceptable both time-wise and space-wise. Yet for system (without an 's') programmers (kernel, driver, or embedded systems devs, for instance) a GC is still a no-go.

      • cat199 8 years ago

        > introduced by Go

        no. further popularized perhaps, but no.

        perhaps a shorthand for 'distributed systems' programming, idk.

        Definitely heard the phrase applied in this area 10+ years ago, by people who had been around for 10+ years (see also the Go authors).

  • yawaramin 8 years ago

    Because they keep being able to do it: https://mirage.io/

  • inetsee 8 years ago

    A search on "non-blocking garbage collection" turns up almost a million hits.

  • kazinator 8 years ago

    My guess would be because it's been done since the dawn of electronic computing.

  • tentaTherapist 8 years ago

    Because it's a term that means nothing but makes for good marketing.

dogruck 8 years ago

Can anyone briefly summarize the points of debate and optimization? I'm not familiar with the competing kit or the primary goals.
