MIT Scheme on Apple Silicon
I've been engaged in a 1.5-year-long (so far) project to port the entirety of GJS's "scmutils" package over to Clojure, and the erratic behavior of MIT Scheme under Rosetta has been a pain I've resigned myself to for months. I keep an old machine handy for when I need to test functions that can't work on the M1.
I am SO HAPPY to see this work! Major timesaver for me and anyone looking to run the executable versions of Functional Differential Geometry[1] and Structure and Interpretation of Classical Mechanics[2] in the original language.
[0] https://github.com/sicmutils/sicmutils
[1] https://github.com/sicmutils/fdg-book
[2] https://github.com/sicmutils/sicm-book
I have to ask, how far along is this project? Is it good enough to run all of the examples in the books? I have run into the issue where the provided compiled version of scmutils from GJS' website doesn't run on recent versions of MIT Scheme (version 11 and up) and there's not much info on compiling it yourself.
It is good enough! Almost all code forms from the book live in the tests (see the FDG directory[0], for example), and there are a few nice environments like Nextjournal[1] where everything from the books works in the browser.
The Clojure port is quite fast, faster than the original for all benchmarks GJS has sent me, and more fleshed out. (That will change, as I've been pushing bugfixes and performance improvements back upstream as I go, as a meager gift to GJS for making this huge, amazing library in the first place.)
I actually wrote to GJS this morning asking for instructions on how to compile the original "scmutils", since I have the same problem. He responded saying he'll get back to me this afternoon, so I'll post here once I have details.
If you are still interested in getting the books going with MIT-Scheme, I put a decent amount of work into the exercises using the original codebase here[2], including a dockerized version of mit-scheme[3] and the scmutils package[4] that might be useful.
- [0] https://github.com/sicmutils/sicmutils/tree/main/test/sicmut...
- [1] https://nextjournal.com/try/samritchie/sicmutils/
- [2] https://github.com/sicmutils/sicm-exercises
Awesome resources, thank you!
Both of the compilation errors identified in this article were just fixed in the master branch of MIT/GNU Scheme five hours ago: https://git.savannah.gnu.org/cgit/mit-scheme.git/commit/?id=..., https://git.savannah.gnu.org/cgit/mit-scheme.git/commit/?id=.... So, if you grab the current master branch, it should just build for x86 without any fixes needed.
Ok, but how do you compile it? When you pull the repo you need to run autoconf to create the configure script but that tells me "This script needs an existing MIT/GNU Scheme installation to function".
What's stopping people from just compiling Scheme for ARM? The website has a separate aarch64 download it seems, so why not patch that instead of relying on Rosetta2?
The vfork/fork issue and the compiler upgrade issue don't seem to be too problematic to work around, so there must be some kind of ARM limitation that's preventing Scheme from working, but what?
macOS on the M1 processor is the first to enforce W^X memory protection, meaning that pages of memory are either writable or executable, but never both at once. MIT Scheme's front page says this is fundamentally incompatible with their design, and therefore it won't build. When running under the Rosetta emulator, this requirement is relaxed for compatibility reasons.
There is an escape hatch for writing JIT compilers (essentially what MIT Scheme is in this case), described here: https://developer.apple.com/documentation/apple-silicon/port... although it's fairly cumbersome and would almost certainly require a lot of extra, macOS-specific code. I assume that's why no one has bothered to port it so far.
From perusing the source of the MIT/GNU Scheme compiler, I suspect that “only” two changes are needed to support W^X:
- Compiled code needs to be allocated separately from Scheme objects. It can still be garbage collected and such; they will probably need to make a separate set of allocation functions for code vs. data. The closure/function objects can be made to point to the code, or, if they don't need to be written often, simply allocated wholly from the "code" pages.
- Before modifying any of the code (e.g. to patch addresses after GC relocation), a system-specific hook function will need to be called to set the permissions to RW. They already call an I-cache flush function after each modification, so this shouldn't be too bad.
Some of the necessary changes are already sketched out in cmpint.txt. And, sooner or later, they’re going to have to make these changes: OpenBSD already enforces W^X (but provides a workaround), and MIT/GNU Scheme already applies a paxctl workaround to gain W|X on NetBSD.
Wow, really? A common intro-to-security exercise (think CTFs and university courses) is to write increasingly complicated C programs that leverage W&X memory (pages both writable and executable). Classic buffer-overflow-into-the-stack kind of stuff. On M1 it's now impossible to exploit even a self-compiled toy in this way?
Basically, yeah. In addition, the usual way to bypass W^X memory, using ROP chains, is also mitigated by the pointer authentication the M1 implements. It's not bulletproof, but it prevents most of the old exploit methods from working at all. You'd need to spin up a VM on an M1 Mac to learn much this way (although that'd be ideal anyway, to get an environment without other protections like ASLR).
I know at least OpenBSD also enforces W^X protection universally, anyone else? I know Linux can with the right SELinux policies, but not sure any distro ships with those by default.
Windows has had this enabled by default for a long time: https://docs.microsoft.com/en-us/windows/win32/memory/data-e...
There's a per program exception list to handle legacy programs though.
Windows DEP only applies W^X (more accurately, !X) to the default stack and heap; programs can still freely allocate new memory as PAGE_EXECUTE_READWRITE if they want RWX memory.
macOS W^X on Apple Silicon, however, bans RWX memory outright, making it impossible to have a page in memory that is simultaneously writable and executable. Instead, if you want to be able to write instructions to a page and later execute them (e.g. for JIT compilation), you have to (1) have a special entitlement (or opt out of the Hardened Runtime), (2) map your memory with the special MAP_JIT flag, and (3) call special mprotect-like functions to toggle the protection between RW and RX every time you want to modify the code.
There does, however, seem to be a bit of a loophole: the JIT protection flags are applied per thread, meaning that in principle one thread could have the page RW while another has it RX.
On M1 CPUs you cannot ever have simultaneously writable and executable memory. Windows just makes default allocations non-executable; you have to explicitly request RWX, which is what every other OS has been doing basically since x86 actually added support for non-executable memory :)
Pointer authentication isn’t in 3rd party processes though, only system ones. (or maybe it’s available but optional, I forget)
> Pointer authentication isn’t in 3rd party processes though
Still isn’t, because the arm64e ABI isn’t stable. As such, any binaries not bundled with the OS, including Apple applications, use the arm64 ABI without pointer authentication.
You can use -arm64e_preview_abi as a boot argument to enable arm64e support for non-OS bundled processes.
Note, however, that the arm64e binaries you compile might not work on future macOS releases.
System libraries are more than happy to use some parts of pointer authentication, such as return address signing.
System libraries use the full pointer authentication schema, because they’re updated as a whole with the entire system, so ABI changes don’t impact them (as much).
Offtopic, but can you recommend such a course?
Hey sorry, didn't check back on this comment for a while. I can't recommend any _courses_ in particular (unless you're a Georgia Tech student, in which case "CS 6265: Information Security Lab" is absolutely incredible).
One really fun way to hone your skills is https://microcorruption.com/, a ctf-style simulated hacking game originally made by Square and Matasano.
You can do it in a Linux VM, though; the restriction is enforced for macOS processes, not by the hardware.
Ohh that makes more sense, thanks.
It is possible using Rosetta 2
Ah, thank you. That explains the problem quite well.
I suppose the wait is on for someone to rewrite the JIT engine to be compatible with Apple's implementation of ARM.
I mean if they pass the correct flags they get memory that can be toggled rapidly between X and W mode - or is MIT Scheme mixing data and code in the heap and so actually requiring RWX?
I find it a bit disingenuous to say that it runs on Apple Silicon if you need to modify the source code. Also, just because it compiles and starts doesn't mean it's fully functional.
The main MIT Scheme page says that it's not possible and would need significant effort, so I'd be curious to hear why one side claims it's impossible while the other shows that it compiles and starts.
Are the original authors simply set against M1/Apple and justifying themselves? Or does compiling on M1 sort of work until you hit more complex features that crash or misbehave?
It's also a bit disingenuous to say it's "on Apple Silicon" when you're running it through a translation layer that won't exist in a few years' time. I'd wager the reason the GNU folks say it doesn't run on ARM is because... it doesn't. Running it as an x86 program is mandatory, apparently.
Well, they do say that they need Rosetta just to compile, and that once compiled it "works", though I agree it's still too shallow an article.
Apple Silicon has a (mostly) hardware translation layer, which this software is running on.
There's a special aarch64 build of the software available, so it clearly runs on ARM. Perhaps there's some kind of issue specifically on macOS that makes the existing ARM port incompatible with Apple's ARM implementation?
> Apple Silicon has a (mostly) hardware translation layer…
I can’t imagine what you mean by this; Rosetta 2 is a binary translation system implemented in software, based on QuickTransit. There are a few features implemented in Apple Silicon to make translation easier and more efficient, such as supporting Intel memory ordering, but that’s about it.
I think it’s reasonable to worry about how long Rosetta 2 will be available. The first version, which allowed Intel Macs to run PowerPC binaries, was available for 5 years. Having said that, there’s no guarantee versions of macOS beyond 5 years’ time will run on today’s M1 anyway (though M1-compatible versions will likely still get updates beyond then).
I can't say what Apple will do, but I'm really hoping they'll keep Rosetta 2 around for longer than Rosetta 1.
For starters, the Mac became a lot more popular in the Intel era than it ever was while on PPC, so there's a much larger quantity of legacy software that Apple would be cutting off. Secondly, the overall user experience of running apps via Rosetta 2 seems to be a lot better than Rosetta 1. And for Apple, Rosetta 2 was developed in-house and doesn't require continuous licensing fees to keep around (not that I'm particularly sympathetic to Apple's pocketbook.)
> And for Apple, Rosetta 2 was developed in-house and doesn't require continuous licensing fees to keep around (not that I'm particularly sympathetic to Apple's pocketbook.)
I don't think any of those things matter; Apple will stop supporting Rosetta 2 as quickly as they can. They announced the transition to Apple Silicon will be two years and unless something unforeseen happens, that's what it's going to be.
I suspect that Rosetta 2 won't be available for new Apple Silicon Macs running macOS five years from now.
Of course, no matter how many years in advance Apple warns that a particular technology is going to be deprecated, that never stops people from complaining vociferously when it happens.
A great example is 32-bit apps, where Apple gave something like an 8-year heads-up that 32-bit apps were going away, which happened a few years ago but it's not hard to find threads on HN where people are still complaining about it.
> A great example is 32-bit apps, where Apple gave something like an 8-year heads-up that 32-bit apps were going away, which happened a few years ago but it's not hard to find threads on HN where people are still complaining about it.
But actually, I personally believe that the actual reason Apple killed 32-bit support was because they didn't want to build it into Rosetta. (And they didn't want Intel computers to be able to run anything their new Apple Silicon computers could not.)
Before Apple Silicon was on the horizon, it was no problem for Apple to keep 32-bit and Carbon libraries around for eight years, because they might as well; it wasn't doing any harm.
(I'm also one of the people who was/is mad about 32 bit support, but I acknowledge that my opinion on the matter has no bearing on what Apple will decide to do.)
> But actually, I personally believe that the actual reason Apple killed 32-bit support was because they didn't want to build it into Rosetta. (And they didn't want Intel computers to be able to run anything their new Apple Silicon computers could not.)
I doubt it; that’s not how Apple rolls. They’re not like Microsoft which keeps legacy technologies around for backwards compatibility for several years after a technology is no longer mainstream.
Reasonable people can disagree but Apple is about the present and the future, not the past. Sure, they could have kept Carbon around or pick your favorite framework from the past but that’s generally not their thing.
Occasionally something from their past reappears, like the QuickDraw GX font format from the 90s that became the basis for today’s variable fonts on the web.
Apple has always been fine with some software not making the leap to the next operating system or processor architecture.
We’ve seen this going back to 68K to PowerPC then to Intel and now ARM.
Being able to run x86 operating systems (Windows) natively on Intel Macs was a huge selling point not that long ago, and now it’s an afterthought that current buyers (mostly) don’t care about. Microsoft could bring Windows for ARM to Apple Silicon, and so far they haven’t.
And while this is all going on, Macs have never been more popular.
x86-32 was removed because it’s significantly less secure and performant than x86-64 and there were unfixable issues with the ABI like fragile ObjC superclasses. Don’t need any secret projects to explain that. The only people who seem to still have a problem are video game developers, who should maybe try writing clean code.
32-bit isn’t completely gone, it’s still on watchOS.
I would say it's more that Apple doesn't want to keep supporting legacy code for backward compatibility. Keeping 32-bit libraries around for the apps that still need them also uses up space on your drive.
Microsoft might want to do that, but it would be a huge risk since it would alienate their enterprise customers.
> But actually, I personally believe that the actual reason Apple killed 32bit support was because they didn't want to build it into Rosetta.
Rosetta 2 contains functionality to correctly emulate 32-bit Intel code.
It's not the 32-bit code though; it's all the old libraries (Carbon) which happen to also be 32-bit.
Right. The 32 bit emulation is only really useful to CrossOver.
For Rosetta (1), QuickTransit was bought up by IBM. Rosetta disappeared not very long after that.
Rosetta 2 has nothing to do with Rosetta 1 (other than the name), nor any other company’s software.
Looks like QuickTransit was a JIT engine which was the basis of Rosetta 1. Rosetta 2 is AOT translation.
They’re both based on QuickTransit, but Rosetta 2 has an AOT mode as well as JIT. I’m sure the R2 engine is more advanced than the original engine, Apple employed several engineers from the original team, but it still uses and is based on licensed tech.
My understanding is that Rosetta 2 is based on LLVM.
Even the company that made QuickTransit is gone now, having been bought out by IBM a decade ago.
My understanding is that Apple hired a lot of the staff of Transitive, and had a more-or-less do-whatever-you-want license with source access; and that the AOT mode is based on LLVM, but the JIT piece is still pretty core to the design (hence why other JITs run well on top of it).
It’s not based on LLVM, that’s way too heavy for the use case.
The same backend is used for AoT and JIT compilation. It uses a custom lightweight IR that’s really close to x86 itself, with a big focus towards reducing translation times.
Rosetta 2/Cambria was fully written in-house.
So Java, Groovy, Scala, Kotlin and Clojure aren't running on x86, nor ARM, nor Apple Silicon?
In a lot of ways, yes. Their runtimes are so massive that saying they "run" on any of those architectures is a stretch of what is actually happening at a lower level.
In "a lot of ways", sure, but definitely not by the most common meaning of "program x runs on y architecture", and not the one being used by most people in this thread.
If you ask any random programmer if "Java runs on x86", 99% of them will say either "yes" or "I don't know what x86 is". Similarly, if you ask them "does Kotlin run on the SPARC architecture", they'll say "I don't know" and, if you give them some time to find [1], they'll amend to "no".
To be precise: the meaning being used by most programmers (and here) is "either the compiled binaries, the virtual machine, or the interpreter runs directly on the given architecture" - which clearly excludes MIT Scheme running on Rosetta, just as (to take a less controversial example) the fact that you might be able to run the JVM on qemu on SPARC doesn't mean that the JVM runs on SPARC.
Thinking about the various levels of abstraction of VM's and interpreters is a fun exercise in general, but I don't think it's constructive in this particular situation.
No, it's not a stretch at all. This is just nerd contrarianism.
Well, let's see. Is there a way for me to run a Java program as native machine code? Or is the code that I'm executing still a runtime that interprets a program?
Yes there is, although it's not in widespread use yet. [1]
For Scala, there is, separately, Scala Native. [2]
Both have ahead-of-time (AOT) compilers that compile down to the target architecture.
[1] https://www.graalvm.org/reference-manual/native-image/ [2] https://github.com/scala-native/scala-native
The vast majority of the programs in the world are written in Javascript, and there's no way for you to "run" them if by that you mean "natively".
> The vast majority of the programs in the world are written in Javascript
Where is this coming from? What about the literal thousands and thousands of programs on your OS right now? Or the thousands of systems that power large corporations that predate JavaScript popularity?
Sure, the language is popular right now, but software development has a much longer history than the last 10 years.
They’re talking about the web.
I'm comfortable making that distinction. JavaScript isn't executed like normal, native code, so... I still agree.
When the majority of programs in the world can't be "run" according to your definition, you might want to reconsider your definition.
Since we're enjoying a nitpick picnic: you can compile JS ahead of time to Java bytecode with Rhino and then to native code with GCJ.
You're redefining the terms; native and interpreted programs are differentiated by the execution layer, and it has nothing to do with where the majority of programs live.
> Is there a way for me to run a Java program as native machine code?
Yes [1].
Depending on the code, quite a large fraction of your Java code is run as x86 machine code at any given time. It hardly gets more native than that.
Is it not running on Apple Silicon?
I got MIT Scheme running on my M1 MacBook Pro about 6 months ago when I bought the book "Software Design for Flexibility" and although I can't find my notes for that, I think I remember building from source natively, not via Rosetta - but I may remember incorrectly.
I also remember it taking a while to get Gerbil Scheme running on M1.
How did you like the book?
I like it, but I have only worked about 1/3 of the way through it.
What would someone who has worked through SICP learn from it (based on the first third that you've read)?
I've also gotten through about the first 1/3. Based on this, and the table of contents, it goes into much more depth (both in terms of implementation and non-trivial illustrative examples) into a number of topics that are either only touched upon in SICP, or not discussed at all. These include combinators, generic functions, pattern matching, etc. There's a chapter on propagators, which didn't even exist when the last edition of SICP was written, though I think SICP does discuss related (but simpler) ideas on constraint propagation.
Thank you.
The Racket fork of Chez Scheme runs natively on Apple ARM (AFAIK these changes have not yet been merged into the main branch of Chez Scheme)
Just fyi in dark mode on this site, the code snippets are almost unreadable
Thanks for the heads up. I blame our site's Chief CSS Officer (me). It has been fixed!
Rule of thumb: don't take advice from people who can't explain why they suggest you comment code out.
As the OP here, I could not agree more.
Very cool how much interest there is in mit-scheme and sicm. On the csail website I think the '.com' binary for scmutils only works with v10 and was released about a year ago. Does anyone know where there are instructions for finding/compiling a version that works with the latest version of mit scheme?
Nice. FYI, I've been using Fennel on Monterey (via brew), and it's also great for that extra LISPy feeling.
I wish I had known this before I recently installed Racket because I'm currently reading SICP.
That MIT Scheme includes an emacs clone that uses Scheme instead of elisp is a nice touch.
There’s a fork of Emacs itself that can run Scheme, by replacing the internal Lisp engine with Guile (which can run both Scheme and elisp). It doesn’t seem to have gotten a lot of love in the last few years, but did mostly work at one point.
Just a UI comment, the white highlighting of white text on a black/grey background is pretty unreadable in my browser
Just turn off dark mode. So many sites have css that claims to support dark mode but doesn’t. The other direction seems less common.
We should really expect better from someone whose about page says they are an interface designer, though.
We should expect better, and you deserve better! Luckily, the issue has been fixed :)