Pedagogical Downsides of Haskell (ciobaca.substack.com)

I find its syntax & idiomatic style incredibly difficult to follow, in a way nearly no other languages have been for me, including some functional languages (OCaml doesn't seem nearly as bad to me, for instance).
It's sometimes implied that those who trip over Haskell just aren't big-brained enough to understand various important concepts related to it, but I've found they're usually very easy to grasp, provided the explanation's not using Haskell examples. If all programming were Haskell, I probably never would have become a programmer in the first place. Would have taken me too long to figure any of it out, probably would have concluded I wasn't smart enough to be a programmer.
I do wonder if there are some shared experiences or common patterns to who tends to love Haskell, and those who don't. I also feel nigh-dyslexic trying to read math formulas. Human language and broadly C-family programming languages, on the other hand, seemed easy and natural to me, almost effortless to pick up. Wonder if there's a "mathy"-person versus "languagey"-person divide on finding Haskell legible.
I'm not sure it's the whole thing, but I think I've also figured out that I find algorithm-type reasoning far easier to follow and work with than equations or proofs. Like, the only way I can begin to get traction with an unfamiliar equation is to break down what each term and operation "does" to something "moving through" it—it's tedious as hell. Might be something there.
I find the terminology that Haskell uses quite misleading for software engineering. It borrows concepts from category theory with quaint names such as "monad", "endofunctor", "catamorphism", etc. The problem is that, instead of a "monad", we can say "brrrdogcogfog" and nothing will change -- the name is absolutely irrelevant to the problem being solved. Given that a monad is an interface for sequential computation, a much better name would be something like "Seq", "SeqComp", or something like that.
> Given that a monad is an interface for sequential computation, a much better name would be something like "Seq", "SeqComp", or something like that.
Just because you can look at something as describing a computation doesn't mean you always should. For example:
```haskell
data BinaryTree x
  = Leaf x
  | Node (BinaryTree x) (BinaryTree x)

-- (Functor and Applicative instances, required by Monad, omitted for brevity.)
instance Monad BinaryTree where
  return :: a -> BinaryTree a
  return x = Leaf x

  (>>=) :: BinaryTree a -> (a -> BinaryTree b) -> BinaryTree b
  Leaf x   >>= f = f x                      -- replace a leaf with the result of calling f on its label
  Node l r >>= f = Node (l >>= f) (r >>= f) -- traverse down the tree, ultimately replacing all the leaf nodes with new subtrees
```

You can choose to interpret a binary tree as describing a nondeterministic computation where you have two choices at every step, but I rarely do. Most of the time trees are just trees.

The tree is a tree, but the Monad instance is sequencing modifications to the tree.
Sure, Kleisli arrows `a -> m b` are generally best interpreted in a computational sense. But with something like `IO`, the actual objects `m b` are computations as well, and this intuition is not as broadly applicable.
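For instance (an illustrative sketch; `safeDiv` and `halve` are made-up names, not from the thread), Kleisli arrows compose into a pipeline that reads as "do this step, then that step":

```haskell
import Control.Monad ((>=>))

-- Two Kleisli arrows of shape a -> Maybe b:
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

halve :: Int -> Maybe Int
halve n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

-- Kleisli composition: run safeDiv 100, then feed its result to halve.
divideThenHalve :: Int -> Maybe Int
divideThenHalve = safeDiv 100 >=> halve
```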
As I understand it, monads help solve the problem of sequential computation in Haskell, but the concept is not limited to that. For example, how would you consider the monadic properties of data types like Maybe or Either to be (exclusively?) interpreted through the lens of sequential computation? What about commutative monads where the order doesn’t matter?
The `Monad` instance for `Maybe` and `Either` is precisely for doing sequential computation!
Consider the following:
```haskell
myBigSubroutine :: Maybe Int -> Maybe Int -> Maybe Bool
myBigSubroutine ma mb = do
  a <- ma
  b <- mb
  return (a > b)
```

Here we are sequencing the "effect" of optionality. `ma` must be evaluated before `mb` and if it returns a `Nothing` then we short circuit and do not evaluate `mb`.

I think the commenter who mentioned syntax is on to something. If I write this
```rust
fn my_big_subroutine(ma: Option<isize>, mb: Option<isize>) -> Option<bool> {
    match (ma, mb) {
        (Some(a), Some(b)) => Some(a > b),
        (_, _) => None,
    }
}
```

it’s clearer to me what the intent is. I’m not sure why the other syntax is so hard for me but it feels hard to understand for some reason.

The `do` block I showed is more like this:
```rust
fn my_big_subroutine(ma: Option<isize>, mb: Option<isize>) -> Option<bool> {
    match ma {
        Some(a) => match mb {
            Some(b) => Some(a > b),
            _ => None,
        },
        _ => None,
    }
}
```

Which is equivalent in this example, however I'm trying to stress the sequencing. Imagine `mb` had some very expensive computation in it; then it will remain an unevaluated thunk if we short-circuit on `ma`.

> it’s clearer to me what the intent is. I’m not sure why the other syntax is so hard for me but it feels hard to understand for some reason.
We can write `myBigSubroutine` with case matching:
```haskell
case ma of
  Nothing -> Nothing
  Just a -> case mb of
    Nothing -> Nothing
    Just b -> Just (a > b)
```

In fact, the `do` notation version desugars to something equivalent to this snippet.

The motivation for using the `Monad` instance (and thus `do` notation) is that it allows us to be polymorphic over the effect described by `>>=` (and thus `do`).
This lets us have a customized version of sequencing computations specialized to whatever "effect" we need, not just casing on optional values.
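To make that concrete, here is a small sketch (`myBigSubroutineM` and the example values are my own, not from the earlier comments): the same `do` block given a type polymorphic over the monad, with the caller choosing the effect.

```haskell
-- Hypothetical generalisation of myBigSubroutine: same body, any monad.
myBigSubroutineM :: Monad m => m Int -> m Int -> m Bool
myBigSubroutineM ma mb = do
  a <- ma
  b <- mb
  return (a > b)

-- The "effect" sequenced by >>= depends on which monad the caller picks:
withMaybe :: Maybe Bool
withMaybe = myBigSubroutineM (Just 3) Nothing          -- Nothing: short-circuits on the missing value

withEither :: Either String Bool
withEither = myBigSubroutineM (Right 3) (Left "no b")  -- Left "no b": carries the error along

withList :: [Bool]
withList = myBigSubroutineM [1, 2] [0, 3]              -- [True,False,True,False]: all combinations
```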
> `ma` must be evaluated before `mb`
No - either one can be evaluated first, with the other being short-circuited. If you swap the order of those lines, the function is exactly the same (in terms of inputs and outputs, at least).
This `do` syntax desugars to binds like:
```haskell
ma >>= \a -> mb >>= \b -> return (a > b)
```

We can then inline the definition of `>>=` and `return` to get:
```haskell
case ma of
  Nothing -> Nothing
  Just a -> case mb of
    Nothing -> Nothing
    Just b -> Just (a > b)
```

Imagine that `mb` is actually a really expensive computation that we don't want to perform unless `ma` returns a value. Sequencing our case statements in this way allows us to do that. `mb` will remain an unevaluated thunk until `ma` evaluates to a `Just a` value.
> monads help solve the problem of sequential computation in Haskell

It probably confuses people because this is a problem Haskell created for itself.
F# calls them Computation Expressions which is far more approachable imo
That sounds almost as vague to me as “object” does in OOP. Don’t non-monadic functions also consist of expressions that describe computations?
Everything is vague. Hell, Haskell functions are not really functions in the mathematical sense.
To me, “computation expression” is significantly more vague than “function” is in Haskell. “Seq” as was suggested by someone else here seems clearer. But I would genuinely be happy to understand the F# point of view on this better.
> Given that a monad is an interface for sequential computation
And what the hell is an interface for sequential computation? I think I understand what these Maybe types are and what they accomplish but "interface for sequential computation" sounds a lot like those buzzwords people mix together that could mean anything.
It's an API for ensuring work gets done in a specified order.
Like opening the jar before sticking a knife into the peanut butter.
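A throwaway sketch of what I mean (the actions are obviously made up):

```haskell
-- The jar must be open before the knife goes in: IO's Monad instance
-- guarantees these actions run in exactly this order.
makeSandwich :: IO ()
makeSandwich = do
  putStrLn "open the jar"
  putStrLn "stick the knife into the peanut butter"
```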
Yeah, well, so is every programming language ever created. When I write any code at all, it's to ensure the instructions are laid out in the correct order for the computer to execute.
Composition/SeqComp comes for free. A monad is how to wrap something up, apply a function to the wrapped up thing, and how to unwrap it.
Composition is not sequential in a pure, lazy language with a graph reduction runtime like Haskell.
A monad is just a monoid in the category of endofunctors. Would most people know what "brrrdogcogfog" means? Couldn't we make that argument about literally any word? I don't see why it applies more here than elsewhere. For people who have experienced it before, it's straightforward and easy to work with, for those who haven't, then there's a learning curve. No one would likely have encountered "brrrdogcogfog" before and everyone would have to go through the learning curve.
Like a Javascript ArrayBuffer? No, a Stream?
tl;dr -> I agree that the terminology is probably not something that enhances cohesion amongst devs using Haskell, and certainly can be distracting in a pedagogical setting.
I think the fact is that your experience leads you to believe that a monad is an interface for sequential computation. A monad is often used for ordering computations, but Haskell’s monads can also be commutative (like the Reader instance of Monad), and those do not order anything.
The real issue is that a naming convention where a typeclass is named after a concept in category theory means wildly different things to different developers. For instance, I would expect a type/typeclass named for some categorical construct to behave in the way the categorical construct behaves, and that would be the extent of what I use it for. However, some developer may see a particular usage of the same construct and extrapolate that said construct is intrinsically tied to that algorithmic pattern of usage.
So the problem is controlling expectations and managing consistency throughout the dev community. I doubt Haskell will ever get away from the category-theory-inspired libraries and the subsequent naming conventions. See the relatively lively development of the profunctor-optics-based work. But I can certainly see how it may distract or confuse newcomers.
My original degree was in Math and I can definitely 'feel' a difference when reading Haskell vs other languages.
Writing/Reading Haskell gives me a feeling more similar to doing proofs than to programming.
Even other functional programming languages don't give me that 'in the math class' feeling that Haskell does.
I use generators, list comprehensions and a lot of lambda stuff with Python.
It's a bit fun because it's very short to write, it's concise and it helps a lot when working only with dicts and tuples etc.
Not sure if it's faster, but it's always a bit longer to write and think about, and I'm not sure it's easier to read and understand.
Sometimes it feels a bit like code golfing, because you can do a lot of things with very few lines.
It's immensely better to remove 99% of side effects, the code is shorter and more compartmentalized, so it's just easier to deal with.
Although I'm doing this alone, and I'm not confident that I could enforce this sort of software design in a team.
I'm in the exact same boat. Haskell code feels more like abstract maths and I feel more at home when I can just easily track the data flow. The language and community use relatively abstract terminology due to its roots and it's just a bit too cryptic to me.
Though I'm glad newer languages are starting to adopt more features from the functional territory for the situations where it just makes more sense.
> I'm not sure it's the whole thing, but I think I've also figured out that I find algorithm-type reasoning far easier to follow and work with than equations or proofs.
For me it's the opposite. Once I figure out what an expression is, I do not want it to change on the next clock cycle.
I suspect you are right that there's a type of person Haskell feels very intuitive to. I think if your mind works that way you might have a hard time appreciating the degree of confusion "regular" programmers face when trying to decipher the mess of symbolic soup.
There is no reason for Haskell to be a "mess of symbolic soup".
You can make Haskell about as human-readable as Ruby if you choose to.
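For example (my own toy illustration, not from the thread): the same function written point-free with operators, and then in a wordier style that reads closer to Ruby or Python.

```haskell
-- Symbol-heavy, point-free style:
totalWithTax :: [Double] -> Double
totalWithTax = sum . map (* 1.2)

-- The same function, spelled out with named helpers:
totalWithTaxReadable :: [Double] -> Double
totalWithTaxReadable prices = sum pricesWithTax
  where
    pricesWithTax = map addTax prices
    addTax price  = price * 1.2
```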
Syntactically Haskell and OCaml are extremely close. A lot of the surface difference between the two languages has nothing to do with syntax and has so much more to do with how things are named by default in the standard libraries. (There are massive differences in the type system and all sorts of subtle differences below the surface, of course.)
Regular programmers were pretty happy with Perl soup.
The pedagogical downside of Haskell is that it ignores the physical reality of the machine. Physically, a computer is imperative, has mutating state, and is filled with all kinds of possible race conditions. Even after you apply the operating system, allowing processes to live together (and giving you space to define new ones), very few constraints are placed on your program and process space.
Instead of building on this reality, Haskell asserts that the starting point is not physical reality, but rather a mathematical formalism called "The Lambda Calculus"; the physical machine is looked at with disdain and pity, its limitations to be worked around to provide the one true abstraction. This is the original sin of Haskell, because it is an attitude that isn't driven by a need to make a thing, but by aesthetics and a peculiar intellectual dogma around building that ultimately becomes a stumbling block.
In my view, you have to respect the machine. Abstractions can be beautiful, but they are ephemeral, changeable, unreal. The danger is that these illusions become a siren song to makers who are always looking for better tools, and to these makers the abstractions become realer than the machine. Haskell's power users famously don't actually make anything with it (modulo pandoc and Hakyll), and my guess is because either they find that 90% of real-world things you want to do are "ugly" from Haskell's point of view, and so are left as distasteful "exercises for the reader", or they get so distracted by the beauty of their tools they never finish.
In any event, Haskell is a road less traveled for good reason.
I'm sorry to break it to you but every single programming language in existence ignores the physical reality of the machine. That's the point of abstractions such as programming languages.
There are abstractions which build on the inherently stateful nature of computers with their instruction pointers, registers, memory and peripheral devices, and there are abstractions which coerce you into framing any computational problem like a mathematical formalism.
Haskell's abstractions build on the inherently stateful nature of computers – how do you think Haskell compilers do their job?
(Not to mention that Haskell and its base libraries have plenty of abstractions useful also for the programmer to deal with the inherently stateful nature of computers, e.g. the IO type, STM, State (it's in the name!), Channels, etc.)
Yes, this is essentially it. There's a shape to the causal connections in the real machine that must be respected at higher levels of abstraction. In particular, the shape I mean is that basic mechanism of computation where you have a program counter, instructions and data in mutable memory (von Neumann), and a CPU with registers that "starts on the upper left" of memory, and leaves interesting shaped smears behind when it's done.
On top of this machine shape the OS adds a process abstraction, and a method to speak to devices. It is not coincidence that this process shape looks like the machine shape: lines of source correspond to instructions, declared structures correspond to main memory.... And from here we programmers pick a coordinate system and begin to build. But whatever coords we pick, the space, the degrees of freedom, is always the same: as vast as Turing could fathom. The interesting part of coordinate systems is the kinds of shapes you get for the constraints you picked. But Haskell seems to be a coordinate system with some valid constraint ideas (clear division between purity and side-effect, immutability), but an invalid sense of its identity as merely one coordinate system within this larger structure.
I've never had to worry about what's a register in Python, and barely about pointers and memory (those are highly abstracted away, exactly to the same extent as they are in Haskell).
These days even binary instructions ignore the reality of the physical hardware, as far as I know (I make a javascript lol). The output of e.g. assembler is an instruction set for a virtual machine that doesn't exist, that the CPU translates into actual execution. At least on the intel superscalar side, ARM may be a simpler setup.
Also true. Their point is that Haskell's system ignorance goes much deeper than its peers.
I mean yes, it's a higher level language. Python's system ignorance goes deeper than, say, Ada's, which in turn goes deeper than C++'s, which goes deeper than C's, which goes deeper than many of its predecessors, which go deeper than x86, which goes deeper than PDP-11, which goes deeper than logic gates, which go deeper than transistors.
But what's the point? Which languages do we reject because they are sufficiently dissimilar to transistors? Should we all start writing code in VHDL?
(jokingly) yes, in Haskell: https://clash-lang.org
> Abstractions can be beautiful, but they are ephemeral, changeable, unreal
Where can we find 'no abstractions' these days? Even if you write in ASM, there will be tons of abstractions. Instructions will run out of order. Memory is abstracted. Even the ASM you write will be translated to microcode.
The closest you'll ever find to 'the physical reality of a machine' are microcontrollers (and even then, only some of them) and machines from the 80s. I have one sitting right next to me that I can tell you exactly how many cycles every CPU instruction takes.
Everything else is an abstraction. C abstracts a machine that doesn't exist (it was closer to machines that did exist at the time it was created). Even something as simple as a short-circuit expression in your IF statement is an abstraction. Even in C you have to sometimes fight the abstractions when you are trying to, say, use caches effectively.
In a bunch of key ways Haskell is closer to the machine than modern languages like JavaScript: it doesn't depend on a complicated JIT system at runtime, it lets you explicitly control details like boxing and unboxing, it exposes various low-level C and machine types (fixed-size ints/etc), it has primops for SIMD...
You can write relatively low-level Haskell a lot more easily than you can write low-level JavaScript. You just don't have to.
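To give a feel for it (a rough sketch of my own; the `Header` type is invented for illustration), "relatively low-level Haskell" can mean fixed-width machine types, strict unpacked fields, and C-style bit twiddling:

```haskell
{-# LANGUAGE BangPatterns #-}

import Data.Bits (shiftL, (.|.))
import Data.Word (Word8, Word32)

-- Strict, unpacked fields: GHC stores these as raw machine words
-- rather than as pointers to boxed heap objects.
data Header = Header
  { hVersion :: {-# UNPACK #-} !Word8
  , hLength  :: {-# UNPACK #-} !Word32
  }

-- Plain bit manipulation on fixed-width integers, much as you would in C.
packVersionFlags :: Word8 -> Word8 -> Word32
packVersionFlags !version !flags =
  (fromIntegral version `shiftL` 8) .|. fromIntegral flags
```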
I don't think that this is right. A programming language is useful to programmers if it's oriented around the structure of the _problem_ and not just the structure of the _physical machine_. For some tasks these coincide (especially if you care about performance) but I frequently find myself in situations where functional code is simple and the machine is irrelevant.
One of the more surprising aspects of GHC Haskell is that it is possible to write very high-level code with performance matching or exceeding code written in a low-level language, thus honoring the machine. Stream fusion, for example. Not sure if there is any other language with a higher abstraction/performance ratio.
JavaScript comes to mind. Its benchmarks are a wonderful testament to the immense engineering resources poured into V8.
V8 is very impressive. But in exchange for its speed, it needs more memory. JS code optimized for speed tends to use more memory with Node.js than optimized Haskell or OCaml code:
https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
Unfortunately, one space leak (extremely easy to accidentally create in Haskell, much harder to debug than in other languages) cancels out all those benefits.
I find so many things about this line of reasoning wrong that I don't know where to start. So let's just pick one thing: Haskell does not ignore the physical reality of the machine. It's one of few languages that explicitly recognise it.
There are more facilities in Haskell to deal with this reality than in almost any other language you can think of.
Do you think Haskell recognizes the physical reality of the machine more than C does? If so, how specifically does it do so?
Technically, within the language, yes. C itself simply delegates a lot of logic to the "physical machine"[1] by leaving it unspecified or implementation-defined. In contrast, Haskell actually tries to model these differences within the type system and standard libraries, with IORefs, STM, and what have you.
(In fact, I just checked the IORef documentation and it actually references the x86/64 architecture manual to explain some of the behaviour that can be expected. I would be surprised if any part of the C standard did that.)
[1]: I mean, if we're using an x86 derivative we're still talking about a very fancy PDP-11 emulator.
I see. The language spec explicitly talks about the machine. That's not nothing.
In practice, though, when I have some piece of memory-mapped hardware attached, and I want to talk to it, in C I can say:
```c
*(uint32_t*)0xF00BA4 = 0x0102ABCD;
```

or whatever I need to flip the bits. C lets me actually control the whole machine. Whereas Haskell... I don't know, but I suspect it lets me actually use the physical machine a lot less.

There might be a cleaner way of doing it, but
```haskell
do let ptr = nullPtr `plusPtr` 0xF00BA4 :: Ptr Word32
   poke ptr 0x0102ABCD
```

should have you covered.

> C lets me actually control the whole machine
Really? Can you run micro-ops?
You know you can write C inside Haskell, right?
Like, there's literally nothing stopping you. You can use FFI, and you can also write C inline.
You can have the best of both, if you want.
Well, in my defense I did offer a "line of reasoning" and not just a flat contradiction with no support.
Also, I'm sorry for any discomfort. To use an analogy, if your friend starts dating a girl that you know is bad for him, you can't just tell him that. You'll get punched. Especially early on when he's totally in love. It doesn't matter if you're right or wrong about her, there's no argument that is going to win against love, and to say anything ill of her is only going to cause pain and harm your relationship with your friend. And love is love, this applies to a person or a software tool.
I'm sorry for the discomfort, but I'm telling the truth as I see it and am not trying to hurt you. But Haskell, I think she's bad for you.
Your language should not care about the physical reality of the machine. That's the compiler's job, and the CPU microcode's job. And thankfully, every programming language ignores physical reality, including Assembly.
The goal of a programming language is to allow a human to express a sufficiently rigorous solution to a problem. From there, every step along the chain of execution is allowed to make 'unobservable' (for various definitions of the word) changes to execution. Your compiler might unroll your loops, or eliminate some unneeded intermediate variable, or even replace your entire function with a lookup table. Your CPU's microcode might do some weird fuckery with predictive execution. You shouldn't care, as long as the solution is, as far as you can observe, identical to your given one.
Whether functional programming is a better expression of computation than imperative programming is its own problem, but it's both silly and wrong to assert that imperative is better because it matches the behavior of the machine.
You can apply your argument to almost _any_ language. Haskell's semantics not matching the underlying machine has little to do with any of the issues in the article.
To the contrary, the simplicity of Haskell allows you to understand through simply rewriting expressions according to the rules/definitions you define. You don't have to worry about memory/effects/so many other things that have nothing to do with the _logic_ of what you are trying to do.
Of course, programs in reality often need to be changed to improve performance, but this isn't relevant when teaching.
Abstracting over the physical reality of the machine is part of the point. The physical reality of the machine isn't the focus in programming language design or theory, and it certainly isn't the focus of making maintainable code with properties like referential transparency, type correctness, and parallelizability. Abstractions, in short, allow us to make anything worth making.
The machine has no types. The machine has no variables. The machine has no functions, procedures, scoping, or information hiding. The machine has no assembly language. The machine has no machine code. The machine, ignoring the physical reality and focusing on an abstraction which could still potentially be in the realm of software and not physics, has a certain number of bits in flip-flops perturbed by other bits coming in on pins.
> Haskell's power users famously don't actually make anything with it (modulo pandoc and Hakyll)
Self-contradiction is self-negation. You've destroyed your own argument, such as it was.
> Haskell's power users famously don't actually make anything with it
This is a lie that you're perpetuating.
Myself and many of my friends, colleagues, and associates make a living writing Haskell.
What kind of things do you use it for?
I write software for the reinsurance industry at Supercede[0].
One of the benefits of the Curry-Howard isomorphism is that people like myself who never make anything useful can use computers too.
> Physically, a computer is imperative, has mutating state, and is filled with all kinds of possible race conditions.
Those are too difficult for compiler writers to reason about. While you're mutating the finite set of registers in your high-level C code - just like a real computer does - clang is swapping those out for operations on an infinite number of immutable registers.
>"The Lambda Calculus", the physical machine is looked at with disdain and pity, its limitations to be worked around to provide the one true abstraction.
That isn't true. There are graph reduction machines whose natural model of computation is lambda calculus and they are generally very efficient compared to sequential processors implementing Turing machines.
Screw the machine. As long as you can transform one formalism to another, why encumber the human mind with needlessly complicated ones?
Brilliant write up.
> There is also a school of thought that you should start Haskell by teaching the IO monad first, but I am not convinced: in my experience, if someone gets exposed to IO early on, they will contaminate all their functions with IO. They will essentially end up writing Java in Haskell.
I don't think this is such a bad starting place. Crawling before walking. Purifying an (unnecessarily-) IO function into an ordinary function is a good exercise.
Trying to enforce non-IO from the start would be like enforcing 'no new keyword & factories only' in another language.
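For instance (a toy example of my own, not from the article): start with a function that is needlessly in IO, then pull the logic out and leave only a thin IO wrapper.

```haskell
import Data.Char (toUpper)

-- Needlessly-IO version: the logic is tangled up with the printing.
shoutGreetingIO :: String -> IO String
shoutGreetingIO name = do
  let greeting = map toUpper ("hello, " ++ name)
  putStrLn greeting
  return greeting

-- Purified core: same logic, no IO anywhere.
shoutGreeting :: String -> String
shoutGreeting name = map toUpper ("hello, " ++ name)

-- IO pushed to a thin wrapper at the edge of the program.
main :: IO ()
main = putStrLn (shoutGreeting "world")
```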
One of the reasons I liked the Haskell Wikibook [1] when trying to learn Haskell was that it didn't concern the reader with the IO monad until much later. It just presented 2 forms of using the language, a) normal functional style, b) an "imperative" "do" style, and then showed how they could be used together and when.
That was enough to do most basic tasks and only later was it explained why they can't be mixed directly.
I feel like the hard part is that, if you dive in early on with imperative-style code, it's really easy to try and do everything else the imperative "style" too...until you can't, or you run into some weird behavior stemming from how IO works, at which point you just end up super confused.
Starting without IO makes sure that you actually start to "get" how the language functions, so that once you jump into IO, the weird parts and how to mix it in with the logic written elsewhere makes a lot more sense.
> Purifying an (unnecessarily-) IO function into an ordinary function is a good exercise.
Agree! And I would add that you can "purify" a monadic function without having to rewrite it in non-monadic style. You can make it polymorphic over all monads and relegate the "impurity" to monadic functions that you pass as arguments/dependencies. A trivial example:
```haskell
twice :: IO ()
twice = do
  putStrLn "foo"
  putStrLn "foo"

twice' :: forall m. Monad m => m () -> m ()
twice' action = do
  action
  action
```

This is not that different to having a Spring bean that doesn't perform any effect directly—say, a direct invocation to "Instant.now()"—but instead receives a "Clock" object through dependency injection.

Haskell lets you express the idea of "program logic that only has effects through its dependencies" by being polymorphic over all monads.
I agree. Haskell is a really good imperative language if that's what you want to use it for.
And allowing beginners to write actual meaningful programs is a huge pedagogical benefit.
This is great and a lot of it rings true to my experience writing a book to teach Rust. It's basically a giant topological sorting exercise to find the optimal order to introduce syntax so that you steer clear of rabbit holes. Or you just end up drawing the owl.
For example, to implement a simple "hello world" program in Rust you have to use a macro (println!), so you can't even look for a function signature in the standard library docs to help. So you can either just say "don't worry about this for now, just trust me" or spend a whole chapter diving into macro syntax. The number of concepts you need to implement a basic program is pretty large and you could easily spend a chapter going into any of them.
Personally I'm not a fan of the approach in this post to just "lie" to people but I do find myself showing a non-optimal implementation because that's all the syntax I've introduced up to that point. Then later I show how to do it better. I know some readers just want the final answer up front though.
I provide a dependency diagram so students can work out where to apply most effort and how to catch up if they miss something.
I also show likely dependencies from the course assessment to the various topics. For instance, there is a strong dependency on the IO monad, but a weaker/optional dependency on (general) monads.
In terms of presentation order, I tend to over-simplify early in the course and circle back and make things more precise later.
(I'm teaching a 2nd year university course on Functional Programming with Haskell for the first time, so I found the OP fascinating. Thanks!)
I wonder if there could be a (or already is) a "teaching" Prelude designed for this purpose.
One of the reasons the standard Prelude includes partial functions and specialize versions of `map` and `filter` is to support the pedagogical use-case (as far as I understand the situation). Most production applications will use a custom Prelude of some kind in order to prevent programmers from using foot-guns like `head` or make things more general in the case of `map` and `filter`.
Turns out using linked-lists for everything isn't the best idea but a lot of Haskell applications will use them because it's in Prelude.
Bit of a balancing act supporting both use cases.
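To make the "custom Prelude" idea concrete, here's a minimal sketch (the module name and contents are illustrative, not an existing package): re-export base's Prelude but hide the partial `head` and provide a total replacement.

```haskell
module MyPrelude
  ( module Prelude  -- re-export base's Prelude, minus what we hide below
  , headMay
  ) where

import Prelude hiding (head)

-- Total replacement for the partial Prelude.head foot-gun.
headMay :: [a] -> Maybe a
headMay []      = Nothing
headMay (x : _) = Just x
```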
I don't know if there is one already, because the Haskell community generally heads in the other direction with its alternate Preludes.
But the effort to fix up the fixable issues mentioned in the post is about the same as writing the post was. Getting it distributed to the students may be a bit harder, depending on the local setup.
But it's definitely fixable with Haskell as it is today.
Linked lists are particularly tricky in Haskell, because as a data structure manifested in memory, they really stink. But as a lazy data structure traversed exactly once and thus just serving as a mechanism for providing "the next thunk", they're fine. Haskell and its laziness completely conflates the two of these, so it ends up being easy to think you have one and end up with the other.
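A small sketch of the distinction (my own example): consumed once, the list is just a stream of "next" thunks and never fully exists in memory; shared between two traversals, the whole structure has to be kept around.

```haskell
-- Consumed once: the list acts as a lazy producer of elements.
sumOfSquares :: Int -> Int
sumOfSquares n = sum (map (^ 2) [1 .. n])

-- Shared between two traversals: the whole list is retained in memory
-- between the sum and the length.
meanOfSquares :: Int -> Double
meanOfSquares n =
  let xs = map (^ 2) [1 .. n]
  in fromIntegral (sum xs) / fromIntegral (length xs)
```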
Definitely. Linked lists are great for pedagogy and useful in many applications. I think it’s a bit of a sign that the struggle between pedagogy and practice can lead to suboptimal outcomes for both parties.
I think Elm is second to none as a tool for learning FP.
It compiles quickly, the guidance offered in error messages are best in class, it's small, and the mental model is consistent.
In fact I think it's far easier to learn Elm (and also perhaps web UI development wouldn't be such a shitshow if programmers earlier in their career used Elm to build their mental model) than it is to learn:
- React
- Redux
- Immutable.js
- Lodash/Ramda
- ES${CURRENT_YEAR}
- Webpack/Parcel/Grunt/Groan/Whatever
- etc…
I've seen so many early programmers go through some React course thinking they've learned FP, and yet struggle to solve basic problems by applying functions to values.
Code World[1] is a great project that addresses a number of the problems from the article, with an eye towards using Haskell to teach children basic math and programming simultaneously. Code World directly addresses a number of the obstacles outlined in this article:
1. Using an online editor with a rich built-in library removes any toolchain problems.
2. A custom standard library simplifies pedagogically unnecessary details like Foldable
3. The custom standard library also avoids currying (f(a, b) for functions rather than f a b)
4. Custom error messages improve the feedback students get from the compiler
I would highly recommend Code World to anybody looking to teach programming with Haskell. If you want to teach Haskell in a way that fits the existing ecosystem, it's also possible to run Code World without the custom standard library[2].
[1]: https://code.world/#
I find the go pattern absurd. Which of these is easier to read:
```haskell
foldr k z = go
  where
    go [] = z
    go (y:ys) = y `k` go ys
```

or

```haskell
foldr k z = foldr_k_z
  where
    foldr_k_z [] = z
    foldr_k_z (y:ys) = y `k` foldr_k_z ys
```

The first.
Do you have insight you can share into why you find it that way?
I do. The name foldr_k_z doesn't say what the function is doing. It's just syntactic punning on a function call with two additional arguments. That's actually negative for comprehensibility. Names should be semantic, not syntactic. And that name doesn't say a thing about its meaning. The most it tells you is that it's related to foldr and its k and z parameters. But the details? Well, you have to look for those. When you look at the definition, you discover that it's the foldr worker that closes over k and z. You could name it foldrWorkerThatClosesOverKandZ, I suppose. But does that name contain any information that isn't present in the context? Does it help you actually understand anything?
I'd argue "of course not". You already know that it's the foldr worker because it's a local recursive definition inside foldr. And you already know it closes over k and z because it uses them without defining them locally. Nothing in that name provides additional semantic value.
You could still use it anyway, on the argument that a little redundancy can help aid reading. But the more Haskell code you read and write, the less that redundancy helps you with anything. On the other hand, the proliferation of names that contain almost no semantic content starts to drag on you. And so an idiom was developed for naming recursive workers that do the core job of what the parent's name promises: just name it "go". Nothing to think about. It's reduced down to a level that communicates exactly that it's not clever. It's just doing the thing it has to do. And it's standardized. If you see it, you know exactly what it's doing. There's no need to waste time mapping a new name into your existing set of well-known patterns.
So... As to the original argument's point? I think it probably is awkward for pedagogy. But it's absolutely better for actively using the language.
How about 'folding'? I've settled on that kind of name for looping/recursing helper functions.
Scheme has a bit of syntactic sugar called "named let" which makes this internal-helper pattern more concise/direct.
I think I prefer this:
```haskell
foldr _ z [] = z
foldr k z (x:xs) = k x $ foldr k z xs
```

I suspect we all prefer that, but the point of abstracting out a closure that captures k and z is for performance.
Eh, the performance isn't from abstracting out a closure. It's from making the definition non-recursive so that it can be inlined. Then the compiler can see and inline the k and z parameters into the "go" block to eliminate indirect references. It's really all about inlining.
If it was just about making it non-recursive so it could be inlined, then the following would be sufficient:
```haskell
foldr k z = foldr' k z
  where
    foldr' k z [] = z
    foldr' k z (y:ys) = y `k` foldr' k z ys
```

That's obviously not sufficient, so it must have something to do with the nature of the closure. In this case I presume that it's because the closure captures k and z, although if you have any evidence to the contrary that would be interesting to see.

That's a reasonable question. It comes down to being transparent with the compiler. Not redefining k and z at every step is what allows their values to be inlined. You could make an argument about a sufficiently advanced compiler and partial evaluation, but the fact is that partial evaluation is far too slow to rely on for things you could just make explicit in the code instead. When the definition closes over the names, they trivially refer back to the same thing every time. So when the definition of go is in the same scope as what k and z refer to (which is usually the case after inlining foldr), k and z can be inlined into go.
When this happens, note that it's actually no longer constructing a closure at runtime. It has essentially closed over the values at compile time, using some very trivial transformations. If you use a definition that is too complex for those trivial transformations, you're getting in the way of the compiler doing its job. I always prefer to write my code with sympathy for the compiler. The less magic it needs to do, the better it does its job.
Point 11 surprised me. Not the “go” thing but the “where” syntax – I wish more languages had it!
Yes. Where is often lovely -- I want to delegate details, and not think about them yet, but keep that delegation scoped to the function that needs the relevant details.
But calling auxiliary functions "go" is almost always bad naming.
"go" is a fantastic name for communicating that all you're doing is exactly what the containing named definition promises. It's a lot better than adding "Worker" or "Impl" as a suffix of the same name as the parent. It contains no additional information because there's no additional information to contain - the parent name already says it all. So you might as well make it short and a standard idiom.
You're not doing what the parent definition promises though -- if you were you'd leave out the parent definition and the where, and just write the go definition with the true name.
Go is instead doing something similar to the parent that is easily transformed to the right thing (i.e. accumulated in reverse or something), or something more general that does the right thing when called with specific arguments. Communicating how and why the function does what it does and works in conjunction with the top-level wrapper actually matters.
Yes, you're doing what the parent promises. You're setting up some initial values for internal accumulators and closing over values that don't change in preparation for the loop. Then maybe you do a bit of cleanup after the loop.
But it's no more interesting than a "for" or "while" loop that takes up most of the body of a function in C or Java. People don't demand descriptive names for those, because they realize such a name would contain no useful information. That's equally true in functional programming.
> But calling auxiliary functions "go" is almost always bad naming.
Calling auxiliary functions "go" is like calling loop variables "i".
No, it's really not. An index variable has very little details to it -- there's nothing to communicate.
Auxiliary functions are complex behavior that should be reflected in the name.
They get their name from the function they are auxiliary to.
PureScript might be worth considering; a few of the downsides listed here aren't present in PS. For example: Int/Number primitives aren't overloaded, evaluation is strict, the various tools like package management are easy, and the explicit Prelude means you are free to import foldl from Array.
Of course PureScript has its own downsides not apparent in GHC.
PureScript also has the huge advantage that it's trivial to build "something". When teaching Haskell, I'm never sure what to build as an example. CLI tools aren't attractive, making a webserver is complex, and so is making a native UI. Of course you can use GHCJS, but at that point, why not just teach PureScript in the first place?
Or use gloss and make "something" arguably more easily than in PureScript?
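For what it's worth (assuming I'm remembering the gloss API correctly), a complete gloss "something" is about this small:

```haskell
import Graphics.Gloss

-- Open a 400x400 window and draw a circle in it.
main :: IO ()
main = display (InWindow "Hello" (400, 400) (100, 100)) white (Circle 80)
```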
Had good experience at https://exercism.org/tracks/haskell
I don't think this article is helpful for beginners.
I think this article's audience is teachers of beginners, not beginners themselves. At least the author is writing about their experience as a teacher.
Don't know why you thought it would be an article for beginners, but good on you for linking a resource regardless.
The article is an introduction to the basic concepts of Haskell, thus beginners may be considered a target audience. However, the style and the content bring to my mind the dreaded monad tutorials. I'm not convinced the article is about pedagogical downsides of specifically Haskell. It mostly reads like a collection of random purported gotchas/differences from someone with experience with other languages.
I'm taking my cue from the title of the article and the intro - seems pretty certain
For someone not familiar with functional programming (but familiar with OOP/procedural), this was not easy or intuitive for me to follow.
I generally try to avoid single-paradigm languages that are trying to show me the one and only "true" way. I see no business benefits coming out of their use.