Ask HN: Testing AST or assembly output for a compiler
Hi Hacker News,
I'm working on a C compiler from scratch and am in a bit of a dead zone figuring out how to test the generated AST and assembly output. I'm specifically having a hard time finding an approach that is viable for a one-person project and also useful.
I did some research on Clang and saw that they use a custom tool called FileCheck. This looks incredible for a production-grade compiler, but for mine I'm not sure I want to put in all of that effort (especially because my host language, F#, doesn't have a FileCheck library and I would have to re-create it).
Same with the AST: the best I can think of is constructing the expected nodes by hand in my host language, which is verbose.
What have you done to test and check your compiler output? Any good recommendations for me? I'm happy to research or read anything. Please keep in mind I'm going for a good effort-to-reward ratio.
I go straight to having an examples/ directory with a ton of little test cases, and check stdout and stderr to see whether it went pear-shaped. It's an end-to-end test harness of sorts. If you care about locking down your AST and IR output, I'd recommend having a printer of sorts that writes to stdout and checking against that, e.g. against a SHA-1 hash or an expected output.txt; see example [1].
You can try creating an interpreter for the AST and any other IR forms you use. This can also free you from testing for specific generated ASTs and IR, so long as they're equivalent (i.e. they produce the same results when executed). This becomes more helpful once you start adding things like optimization passes.
Honestly, I think the biggest win is just having a solid test harness that can compare AST snapshots across versions. It's not glamorous, but it catches regressions early and gives you confidence when you refactor the optimizer. Maybe throw in some fuzzing on the AST nodes and see what breaks; it's surprisingly fun.
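To make the snapshot and interpreter suggestions above a bit more concrete, here is a rough sketch. It's in Python rather than F# just to keep it short, and everything in it is made up for illustration: the toy `Num`/`BinOp` nodes, the `dump` printer, the `evaluate` interpreter, the `fold` pass and the `check_snapshot` helper are hypothetical names, not part of any real library or of the asker's compiler.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path

# Hypothetical toy AST; a real C compiler would have many more node types.
@dataclass
class Num:
    value: int

@dataclass
class BinOp:
    op: str
    left: object
    right: object

def dump(node) -> str:
    """Render the AST as an S-expression so it can be diffed as plain text."""
    if isinstance(node, Num):
        return str(node.value)
    if isinstance(node, BinOp):
        return f"({node.op} {dump(node.left)} {dump(node.right)})"
    raise TypeError(f"unknown node: {node!r}")

def evaluate(node) -> int:
    """Tiny interpreter: lets tests compare behaviour instead of exact tree shape."""
    if isinstance(node, Num):
        return node.value
    ops = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}
    return ops[node.op](evaluate(node.left), evaluate(node.right))

def fold(node):
    """Assumed example pass: fold a BinOp whose operands are both constants."""
    if isinstance(node, BinOp):
        left, right = fold(node.left), fold(node.right)
        folded = BinOp(node.op, left, right)
        if isinstance(left, Num) and isinstance(right, Num):
            return Num(evaluate(folded))
        return folded
    return node

def check_snapshot(path: Path, actual: str, update: bool = False) -> bool:
    """Compare `actual` with the stored snapshot; record it if missing or update=True."""
    if update or not path.exists():
        path.write_text(actual)
        return True
    return path.read_text() == actual

if __name__ == "__main__":
    tree = BinOp("+", Num(1), BinOp("*", Num(2), Num(3)))
    with tempfile.TemporaryDirectory() as tmp:
        snap = Path(tmp) / "add_mul.ast.txt"
        print(check_snapshot(snap, dump(tree)))        # True: first run records the snapshot
        print(check_snapshot(snap, dump(tree)))        # True: unchanged output matches
        print(check_snapshot(snap, dump(fold(tree))))  # False: the tree shape changed
    print(evaluate(tree) == evaluate(fold(tree)))      # True: behaviour is preserved
```

The last two lines show the trade-off: the snapshot flags the folded tree as a change you have to review and re-record, while the interpreter-based check only cares that both trees still evaluate to the same value.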