Diminishing returns of static typing
blog.merovius.de
There are 3 main areas of interest in the discussion of benefits of static vs dynamic typing.
- Quality (How many bugs)
- Dev time (How fast to develop)
- Maintainability (how easy to maintain and adapt for years, by others than the authors)
The argument is often that there is no formal evidence for static typing one way or the other. Proponents of dynamic typing often argue that Quality is not demonstrably worse, while dev time is shorter. Few of these formal studies however look at software in the longer perspective (10-20 years). They look at simple defect rates and development hours.
So too much focus is spent on the first two (which might not even be two separate items as the quality is certainly related to development speed and time to ship). But in my experience those two factors aren't even important compared to the third. For any code base that isn't a throwaway like a one-off script or similar, say 10 or 20 years maintenance, then the ability to maintain/change/refactor/adapt the code far outweighs the other factors. My own experience says it's much (much) easier to make quick and large scale refactorings in static code bases than dynamic ones. I doubt there will ever be any formal evidence of this, because you can't make good experiments with those time frames.
> For any code base that isn't a throwaway like a one-off script or similar, say 10 or 20 years maintenance
I think one of our problems is that people have downgraded the importance of this. Much code nowadays (rightly or wrongly) is considered "disposable" - people think that the likelihood of any given piece of code they are writing as surviving more than a few years is negligible. It is a natural assumption when you see the deluge of new technologies, hype cycles, etc. It is further reinforced by the fact that people's empirical experience is that a huge amount of their software work is abandoned, rewritten, outdated, obsoleted, etc.
I think these views are horribly mistaken, because at a deeper level even if 90% of code gets abandoned, the quality of the 10% that survives is still going to determine your maintenance cost. And half the reason we keep throwing code away is because it was created without consciousness of maintainability - it is so easy to say that the last person's code was garbage, so we are going to rewrite it because that is faster than understanding and then fixing the bugs in what they wrote.
I observe this in myself: my favorite language to code in is Groovy - a dynamic, scripting language with all kinds of fancy tricks. But my favorite language to decode is Java. Because it is so simple, boring, there is almost nothing clever it can do. Every type declared, exception thrown, etc. is completely visible in front of me.
> people think that the likelihood of any given piece of code they are writing as surviving more than a few years is negligible
Well, most code is written by relatively inexperienced developers, who have not had to retire a system or support a legacy one, and don't know what should be sought out & what should be avoided when designing a system. Thus, they make decisions with limited information to solve the problem at hand, and only later find out the implications of those decisions when someone wants to (say) deploy it as a dockerized service on k8s.
It's one thing to read The Mythical Man Month, and another to write a replacement system that stops providing business value after 30 months and needs to be rewritten to support the current needs.
> it is so easy to say that the last person's code was garbage, so we are going to rewrite it because that is faster than understanding and then fixing the bugs in what they wrote
There's no black and white answer here: sometimes the code is so convoluted (or in the wrong language) that it has to be rewritten; sometimes the design of the system strongly resists changes in behaviour & so much of it needs to be made more flexible that an incremental improvement would cost about the same as a full rewrite.
> And half the reason we keep throwing code away is because it was created without consciousness of maintainability.
Well, this has nothing to do with static vs dynamic typing. You can write unmaintainable code in static languages very easily. In startups, developers often overlook maintainability, I completely agree but that's because everyone knows that the code you are writing today might not be needed 2 years down the line, you are mostly iterating to find PMF.
It's not totally divorced from the static vs dynamic argument. One of the main arguments people deploy for dynamic typing is that most type declarations are boilerplate and take time to code but don't add any value. But the reality is that they do add value because they enhance the readability and maintainability of the code (and this is one reason I'm not even a great fan of type inference in many situations). So it comes back to the value of maintainability vs getting your first iteration of the code to work.
It's also caused by the fact that nowadays a lot (most?) of the code that gets written is for the web or mobile, where technology changes incredibly fast.
In my experience with growing companies, even company-critical code bases get rewritten within 3-4 years to account for flexibility that the previous strongly-typed system just can't handle. A well designed system uses strong types for the "knowns" but allows changes via dynamic types for the "unknowns". Those are the systems that last.
> I observe this in myself: my favorite language to code in is Groovy - a dynamic, scripting language with all kinds of fancy tricks. But my favorite language to decode is Java. Because it is so simple, boring, there is almost nothing clever it can do. Every type declared, exception thrown, etc. is completely visible in front of me.
one of my favorite things about groovy is that it's easy to start strongly typing things as your code shapes up, because it allows for totally dynamic types, but it also allows for strong static typing. haven't really had the chance to use groovy since 2012, though.
Static typing was grafted onto Apache Groovy in 2012, but no-one really uses it. I'm not sure about its reliability -- its use never took off on the Android platform, and none of the Groovy codebase itself has ever been rewritten in static Groovy.
Groovy's still great for scripting on the JVM though, for stuff like those 10-liner build scripts for Gradle, glue code, and mock testing. Just don't use Groovy for building systems -- use a language based on static typing from the ground up, like Java, Scala, or Kotlin.
> Static typing ... no-one really uses it. I'm not sure about its reliability -- its use never took off on the Android platform, and none of the Groovy codebase itself has ever been rewritten in static Groovy
You keep saying this repeatedly but it just isn't true:
https://github.com/grails/grails-core/blob/master/grails-cor...
https://github.com/groovy/groovy-core/blob/master/src/main/g...
Both your examples use very simple logic. The Apache Groovy codebase example is of some peripheral functionality, i.e. a builder. All the methods in your Grails codebase example are, at most, 1 line long. I can't be bothered re-investigating what proportion of the core Groovy codebase really uses static compilation -- it certainly wasn't much only 2 years ago. As for Grails, virtually no-one has upgraded from v.2 to Grails 3 since it was released 2.5 yrs ago, or started many new projects with it.
i was just saying i personally found type declarations useful as the couple of small groovy codebases i worked on progressed over the short period (maybe a year?) i worked on them. thinking about how i might declare types made me decompose things a bit differently, which made the logic simpler in some places, which allowed me to do things like get rid of tests where i checked the behavior in a case where a method was missing on a function parameter, because now i knew the parameter was of a certain type (and thus would have that method).
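A rough Python analogue of that last point, as a sketch rather than anything from the Groovy code being described: once a parameter is annotated with a type that guarantees the method, a static checker such as mypy can reject callers that pass something without it, so the "method is missing" test has nothing left to cover. The names `Shape`, `Square`, and `total_area` are made up for illustration.

```python
from typing import Iterable, Protocol


class Shape(Protocol):
    """Anything with an area() method (hypothetical example type)."""

    def area(self) -> float: ...


class Square:
    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side * self.side


def total_area(shapes: Iterable[Shape]) -> float:
    # A static checker verifies that every argument has area(), so no
    # hasattr() guard (or test for the missing-method case) is needed.
    return sum(shape.area() for shape in shapes)


result = total_area([Square(2.0), Square(3.0)])
```

Passing an object without `area()` still fails at runtime in Python; the annotation only moves the report earlier if a checker is actually run.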
>I think one of our problems is that people have downgraded the importance of this. Much code nowadays (rightly or wrongly) is considered "disposable" - people think that the likelihood of any given piece of code they are writing as surviving more than a few years is negligible.
I think younger devs think this. Once you get a decade or more experience, you grow wiser and realise that code never dies, and especially the code you wish would die is particularly tenacious. And this is pure speculation, but I would wager that the number of lines of legacy code that is kept alive with maintenance is much greater than the number of lines of code that gets abandoned/rewritten/obsoleted.
I agree that the third point is important, but it's not clear that it's static typing that is important, and not type annotations. One reason why I can still fairly easily read and understand Eiffel code that I wrote decades ago is Design by Contract. And there's normally nothing static about DbC, it's about assertions that are checked at runtime and that by convention are part of a class's interface.
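As a minimal sketch of contracts as runtime assertions, here is a Python approximation. It is not Eiffel and leaves out class invariants and contract inheritance; the `Account` class is hypothetical, and plain `assert` statements are skipped when Python runs with `-O`.

```python
class Account:
    """Toy account with contracts written as runtime assertions."""

    def __init__(self, balance: int = 0) -> None:
        assert balance >= 0, "precondition: opening balance must be >= 0"
        self.balance = balance

    def withdraw(self, amount: int) -> None:
        # Preconditions: part of the documented interface.
        assert amount > 0, "precondition: amount must be positive"
        assert amount <= self.balance, "precondition: insufficient funds"
        old_balance = self.balance
        self.balance -= amount
        # Postcondition: checked every time the method runs.
        assert self.balance == old_balance - amount
```

The assertions double as documentation of what callers must guarantee, and they cannot silently drift away from the code because they execute with it.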
What both type annotations and DbC are is self-enforcing documentation (of an interface) that doesn't go out of sync with the actual code. But for that, you don't necessarily need static type checks. Now, type checking of type annotations that happens exclusively at runtime is an option that hasn't been explored much (after all, if you already have type annotations, why not let the compiler make use of them?), but an option that has sometimes been used successfully is having a mixture of static and dynamic type checks. You can often greatly simplify a type system by delaying (some) type checking until runtime (examples: for covariance or to have simpler generics).
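To make "type annotations checked exclusively at runtime" concrete, here is a small Python sketch. The `checked` decorator and the `label` function are hypothetical, and the check only understands plain classes such as `str` or `int`.

```python
import functools
import inspect
from typing import get_type_hints


def checked(func):
    """Check plain-class annotations on each call (sketch only)."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            # Only plain classes such as str or int are handled here.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(f"{name} must be {expected.__name__}")
        result = func(*args, **kwargs)
        expected = hints.get("return")
        if isinstance(expected, type) and not isinstance(result, expected):
            raise TypeError(f"return value must be {expected.__name__}")
        return result

    return wrapper


@checked
def label(name: str, count: int) -> str:
    return f"{name}: {count}"
```

Without the decorator, `label("apples", "3")` would run and return a string; with it, the mismatch is reported at the call boundary instead of somewhere downstream.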
I think one disadvantage of runtime type checking and DbC is that the compiler can't aid you in refactoring.
For example, if you add a case to a variant or sum type, or change the parameter or return type of some function, in a static type system, the compiler can tell you all the locations you need to change. In a runtime system, you have to find them yourself, or wait till you see an error at runtime.
Now, this is still better than the alternative of having the error propagate until it crashes 10 functions down, but the compiler finding all the places that need to be changed is something I've found to be really useful, especially in early development when there's a lot of refactoring happening. Presumably, this is probably useful in later stages as well, when the system is large enough that you can't expect to find all the uses of a function or type manually.
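A sketch of the "add a case and let the checker find every use" workflow, written in Python with mypy in mind. The `Shape` union and the local `assert_never` helper are illustrative; the helper is defined here so the example does not depend on a particular Python version.

```python
import math
from dataclasses import dataclass
from typing import NoReturn, Union


@dataclass
class Circle:
    radius: float


@dataclass
class Rect:
    width: float
    height: float


# Hypothetical sum type; adding a third member here is the refactoring
# described above.
Shape = Union[Circle, Rect]


def assert_never(value: NoReturn) -> NoReturn:
    # Reached at runtime only if a case was missed.
    raise AssertionError(f"unhandled case: {value!r}")


def area(shape: Shape) -> float:
    if isinstance(shape, Circle):
        return math.pi * shape.radius ** 2
    if isinstance(shape, Rect):
        return shape.width * shape.height
    # If Shape gains a new member, a checker such as mypy reports an
    # error on this line in every function that uses the pattern.
    assert_never(shape)
```

With only runtime checks, the same mistake surfaces as the `AssertionError`, and only when an unhandled value actually reaches `area`.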
IDEs can still substantially aid in refactoring with runtime type checking and return type annotations - see WebStorm and PHPStorm for two good examples of this. It isn't perfect - but certainly for things like changing the return type of functions, it will usually get you at least 90% of the way there.
Now, whether you consider that's actually helping solve the refactoring, or actually introducing new bugs, well - that's another issue :)
Completely agree. I've had a team member able to quickly contribute a change to a project he didn't work on due to static typing keeping his code within the guard rails. It's very valuable.
4. Performance. There is software that can't be slow.
Right - I was trying to avoid runtime considerations and keep it on language, but it's true there are some concerns that extend into language. The border is becoming fuzzier when you consider that many (most?) languages these days can run in a browser after some transformation. Some even have 3 or more runtimes including js, a managed runtime, or native.
Incredulous that people would downvote this.
Seriously with the downvotes? Dynamic languages are obviously slower than static languages in general. Some special cases aside.
Performance is a concern for some projects - you wouldn't write an OS, database kernel, or mainstream game engine in a dynamic language.
How is that not a valid concern in the dynamic vs static typing argument? The parent comment has a legitimate point.
Technically, this is a strong/weak type distinction. Dynamic but strongly typed languages like Julia and erlang can be quite performant when given strong fences around the types their functions are passed.
No. Not at all. It has more to do with implementations.
Python is strongly typed, but dynamic. But slow. JavaScript and PHP are weakly typed and dynamic as they will coerce types in strange ways during operations and comparisons.
Lua is dynamically and strongly typed like Python, but LuaJIT can sometimes produce code on par with or even slightly faster than native code - because it's really JIT compiling the hot path to native code with some guards and offramps to interpreted code for special/unexpected cases.
But there are limits to those techniques and it's doubtful that dynamic languages will ever perform at the same level as static languages because the compiler simply has more information and doesn't have to be as pessimistic or insert as many runtime guards.
> it's doubtful that dynamic languages will ever perform at the same level as static language
I think these timing benchmarks come after the jit warmup procedure, so the presumption is that the compilation cost is amortized over lots of runs in an HPC-type setting:
I’m in the static camp myself, but there are dynamic language JIT runtimes that meet or exceed the performance of many static languages.
In special cases. It's not nearly as reliable. There are a lot of ways to get poor performance out of such JIT compilers, and writing performant code for them is a bit of a black art that varies from version to version. Just read a little about the "fun" people have had with V8 over the years.
Including the Java runtime - the JVM is basically dynamic.
Java is a static language. While there are dynamic languages on the JVM like Jython, JRuby, Groovy, they don't get anywhere near the performance of the static languages.
Oh, sure it's a statically typed language with all things being constantly cast to and from Object. (Have you ever used collections? Have you ever heard of the term "type erasure"?)
There is code that doesn't `constantly cast to and from Object`; most code won't need to, anyway. Generic information is kept during type-checking and then discarded, which can be considered an optimization. There are some warts that appear because of erasure, but I believe you're wrong in implying that most developers care (most Java developers, at least; maybe Scala people have more issues because of erasure).
That it doesn't have reified generics like C# does not make it a dynamic language. Go doesn't have generics at all, but it's still a static language.
These are good points, but what about considering replaceability as an alternative to maintainability?
I personally find dynamic languages allow for easy replaceability, as there are fewer explicit references to types. However, this is highly dependent on the system being somewhat modular, I suppose.
This is basically why erlang's hot code reloading would be impossible as a general solution in a statically typed language
15 commits 4 years ago, 121 commits 8 months ago
If these things are so good why does no one use them? EVERYONE using Erlang is using the same hotswapping facility. This sort of dynamism is just fighting against the language in an environment like Haskell.
Because Haskell isn't used in the same domains as Erlang, and hot patching full Haskell semantics either isn't needed, or the program is likely already using a DSL for the parts needing well defined dynamism, like the Yi editor, and so hot patching the runtime isn't needed.
Why would you need to? GHCi supports dynamic code reloading which most people do during development. During runtime there’s not so much of a use case though for most people.
Replying to myself. There are some use cases though, and in fact my comment points to one of them. Xmonad configurations are themselves Haskell code, and Xmonad does dynamic code reloading to make configuration changes without having to logout. Xmonad rolled their own solution, but "dyre" is a reusable generalization inspired by what Xmonad does. Using Haskell DSLs for configuration of Haskell programs has some significant advantages and dynamic code reloading is essential if the program itself is long-lived or can tolerate no downtime. I don't mean to imply there are no uses ... but outside of configuration or development environments (or real-time routing algorithms in Erlang's case, something I wouldn't advise doing in Haskell) I really don't know what general use cases there are for it. If someone does though, I'd love to learn something new.
Hot code reloading isn't used in practice to much extent in Erlang codebases either. It exists, in my experience mostly for Dev or emergency patch but not for normal release upgrades. Dynamics with tagged values and pattern matching def make Erlang the easiest to hot reload.
As long as we are here, with experienced programmers of both "sides", here's a question I've been wondering for some time: is it possible to create a haskell-like type system on an erlang-like language?
Erlang has dialyzer, which is great but it's based on optimistic typing. Hot-reloading aside, what would be the issue with creating such a language? Maybe some issue with the process pids, which are quite dynamic?
There is no issue. Plenty of typed process calculi exist, and concurrent ML existed back in the late 90s and features channels and processes.
There is Cloud Haskell[1] which is reasonably close to a Haskell version of Erlang. It forces you to run the same code everywhere while in Erlang you could just hope that the code everywhere is compatible, though.
My first big love in languages was Turbo Pascal, because, it was the first one I learned. So many people do this, fall in love with the first language they understand.
My second big love in languages was Python. It's also the language in which I wrote my first major software product. It was this product that taught me to hate Python. Not because it was hard to create, or because quality was low. In fact, I was VERY FAST to produce 1.0. Took about a week. But after that, I had to work with other developers. That's where everything went to hell.
Then I got a new job a few months later where almost 100% of my time was spent doing maintenance on aging codebases written in Java, a language I never worked with before then. I won't say I fell in love with Java, but, I did fall in love with the ease of inspecting "the world" in each project. As soon as I had it setup in my IDE properly, it was so ridiculously easy to explore how everything related, and then to make refactoring changes? So much easier than it ever was in Python.
Now, at that point I didn't directly make the connection with the type system, but, in retrospect, I know that all of the value I derived from working in Java vs. Python came from having a descriptive, static type system. And frankly, I never once felt slowed down by the need to specify my types up front. In fact, the opposite is true. It taught me to put more thought into my data structures and vastly improved the quality of my software design before I even started writing logic.
Sadly now I'm moving into the data science/data engineering field and everything is Python and I don't know. I don't want to go back to this nightmare. It's like I spent the last decade in first class establishments with the best tools and now I'm going to have to work in the mud with sticks and shovels. I am interested in the field in terms of the capabilities it enables, and I have no problems working in Scala or whatever decently typed language is around, but, the reality is the lion's share of people in this field are doing everything in Python or R and I hate them both.
I figure I have two choices: help advance the capabilities of "better" platforms, or pursue some other direction in my career. It's too hard to know how much better life can be, then go back.
> Sadly now I'm moving into the data science/data engineering field and everything is Python and I don't know. I don't want to go back to this nightmare.
Now there are (optional) type annotations and mypy [0]. I've been using them in my latest projects and I found them useful/helpful.
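For anyone who hasn't tried it, a small sketch of what optional annotations buy you when a checker such as mypy is run over the code. The function names are made up for illustration.

```python
from typing import Dict, Optional


def find_age(ages: Dict[str, int], name: str) -> Optional[int]:
    """Return the recorded age, or None if the name is unknown."""
    return ages.get(name)


def age_next_year(ages: Dict[str, int], name: str) -> int:
    age = find_age(ages, name)
    # Without this check, mypy rejects `age + 1` because age may be None.
    if age is None:
        raise KeyError(name)
    return age + 1
```

The annotations change nothing at runtime; the benefit only appears if the checker is part of the workflow.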
I started out with that in mind when I was initially working on some PySpark code. It fell completely apart the moment I had to include boto3 libraries. In fact, I'd hold boto3 as the pinnacle example of searingly awful garbage that languages like Python promote. Completely impossible to do even the slightest static analysis of a library that's 100% dynamically generated at runtime.
The worst part is that versions of the API for other languages are fine. It's just the Python one they decided to go all "clever junior developer" on.
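To illustrate the general problem (this is a toy, not boto3's actual implementation): when methods are attached to an object at runtime from a data description, nothing in the class body tells a static tool which methods exist. `SERVICE_MODEL` and `DynamicClient` are hypothetical names.

```python
# Hypothetical service description, standing in for data that is only
# read when the program runs.
SERVICE_MODEL = {
    "list_widgets": "ListWidgets",
    "delete_widget": "DeleteWidget",
}


class DynamicClient:
    """Toy client whose methods exist only after __init__ has run."""

    def __init__(self, model):
        for method_name, operation in model.items():
            setattr(self, method_name, self._make_call(operation))

    @staticmethod
    def _make_call(operation):
        def call(**params):
            # A real client would send a request; this just echoes.
            return {"operation": operation, "params": params}

        return call


client = DynamicClient(SERVICE_MODEL)
response = client.list_widgets()
```

The call works, but `list_widgets` appears nowhere in the class definition, so a type checker or IDE looking only at the source has nothing to complete, check, or rename.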
> you can't make good experiments with those time frames
Multi-decadal longitudinal studies are not too uncommon in medicine, epidemiology and psychology. Why there is no will to conduct, or fund, this kind of research in computer science, I am not sure.
It's hard to find N projects that are comparable and live for that long. Not least because whatever effect you are seeking will be much less noticeable than e.g. differences in developer skill and experience.
Maybe because of "lifespans"?
People live for ~80 years... doing a 2-4 decade study isn't out of the realm of possibility.
Computers on the other hand... While there are a few mainframes that live to be 10 years old - the vast majority of the internet, program languages, apps, etc... Hell, even the iPhone just hit 10 years old.
How can you have a 20 year study when the majority of "code" is less than 10 years old?
10-20 years?! Holy Cow! Other than huge software projects (like Word or Mac OS - and even then...) is there really software that still has that kind of maintenance window? I've worked for a Fortune 150 company for nearly 2 decades. There is not a single piece of software at the company that has not been rewritten from scratch (usually due to business changes) at least once every 10 years. I can't even imagine something that would still be useful after 10 years (honestly, even 5 years seems like a stretch). Just think - software written 20 years ago would have been written when the WWW was still soiling its diapers.
I'm working for a healthcare insurer and I don't believe anything has been rewritten since they went from mainframe to .Net.
The previous application I worked on is over a decade old (and it shows). The current application I'm working on is about 8 years old.
Neither application shows any sign of being replaced. Which would be insane, as they both have roughly a decade of laws & regulations and business lessons embedded in them. Despite the state of especially the older application, I don't see how rewriting the entire application would fix anything.
At best parts would be rewritten. And the parts I'm thinking about wouldn't be rewritten because of technical reasons, but because of the way they work. The prime example is a part that only 1 person, a business user, understands.
I work in medical robotics and I can tell you that a lot of our code is quite old, and we have a culture that code you write will stay around. Some things don't change, for example optimal control algorithms, while other things are very difficult to change, such as network routing. So, while the applications get rewritten on a 4-10 year time frame, parts of the OS are 15+ years old.
> I can't even imagine something that would still be useful after 10 years
Ah the HN perception bubble.
Good code lasts longer than that. Bad code gets replaced.
Good code is replaceable. Bad code is hard to get rid of.
Some code sticks around because it's great at what it does. Some code sticks around because it works if you don't touch it and is impossible to delete due to various kinds of dependencies.
All code is replaceable. It's bad APIs that are hard to get rid of.
Most POSIX APIs, for instance, are confusing, obtuse and unnecessarily imperative but still good enough in spite of being 40 odd years old. There's way too much code that implements or calls them to justify making significant changes at this point.
We are talking years and decades, where "replacing" is applied to entire applications, not chunks of code.
Especially in SOA, it can be cheaper to replace a poorly written service than trying to rewrite all of it over time.
If it ain't broken, don't fix it.
> If it's broken but in familiar, known ways, with runbooks, and the cost of rewriting it is way higher than supporting it for 5 years & hopefully we'll move away from the business model which requires it, don't fix it.
I have started companies 15+ years ago that I sold which still use (a lot of) the same code. You think (I thought) that would never happen but I think this idea that every company rewrites everything is not all that common. Banks don't, but the small startups I work with don't either. Frontends get redone, there is refactoring and library updates but most (unless trivial tiny systems) just stays the same. You need to think that they run a business and that business is not software development usually. So if there is not a pressing reason to replace things, why would they allocate money for that?
I also consult for Fortune 500 companies on a regular basis, and most of them still have core business processes running on mainframe code bases well older than 10 years. No one is doing major greenfield development on mainframes, but they still exist all over the place.
I think IBM and their Z division would disagree. IBM Z Users: The 10 top insurers, 44 of the top 50 banks, 18 of the top 25 retailers, 90% of the largest airlines
That seems to agree with me? Mainframes are still in use everywhere. However, that statistic doesn't imply that those customers build their brand new greenfield capabilities on those mainframes.
I'd love to read more about what it's like doing mainframe consulting. Do you own or know of any blogs in this area?
I don't know of any. I don't consult for the mainframe systems themselves. Usually I get pulled in when the client realizes that their last mainframe developers are years away from retirement, and they cannot find any new mainframe developers to hire. That starts a mad dash to migrate/replace the mainframe solution without disrupting the entire business. Despite the existing codebase, these projects are very difficult because no one knows how they work anymore.
At one of my clients they had one mainframe developer left that knew their systems. She had already tried to retire, but they got her to agree to stay on for 5 years in return for bags filled with money. That meant they had 5 years to rewrite on a platform they could actually hire people for. 5 years to replace a system with decades of history.
IME most software will last that long, if it's remotely successful then it will at least make it to the 10 year mark. Business rarely changes drastically enough for a rewrite to make financial sense.
About the best you can hope for is a new "epoch" that forces a rewrite. In the MS world we went from classic VB and VC++ to .net, a lot of companies went through rewrites to keep up with that and some of that software is now nearing 20 years old. There have been a few other epoch-like changes: terminal -> GUI, c++ -> java, desktop -> web. Except for maybe the last one, it's been quite a while since a new epoch has begun.
Parasolid [1] is 30 years old and is the dominant B-rep solid modeling kernel powering Solidworks, Siemens NX and Solid Edge. It's very difficult to see it being replaced as it is so entrenched.
Parasolid (written in a C dialect) was a rewrite of Romulus (written in Fortran) and that goes back to 1974. And that was a rewrite of Build that originated from Ian Braid's PhD thesis. [2]
I know people who are still working on the same Parasolid code after 30 years. Some of them
Disclaimer: Parasolid dev 1989-1995
[1] https://en.wikipedia.org/wiki/Parasolid
[2] http://solidmodeling.org/awards/bezier-award/i-braid-a-graye...
I recently finished making some mods to a PHP CMS to make sure it works fine with PHP7.1. The base of this code is 17 years old and is still used every day.
The CSS/JS on the frontend rarely lasts more than a few years, usually changed due to design trends (flat, responsive, mobile-first etc).
Everything that controls hardware has tremendous maintenance windows. Trains, planes, industrial machines.
Most business critical software like SAP for example is also based on decade old codebases.
Wordpress is 14 years old.
It's probably a good reference project, when talking about maintenance (nightmares).
My business, https://www.filterforge.com/, is almost 12 years on the market since the release of v1.0 back in 2006. And if we also count the 6 years of initial development, that would be 18 years in total.
I also work at a similarly large company.
Some of our internal infrastructure systems are 10-20 years old - some could definitely do with a complete rewrite, but in the meantime, they're mission critical systems.
As for our products - some of them have even longer timeframes than 20 years.
I know of multiple finance companies that have Mainframe Assembler from the 70s.
I wonder how much older you are than the average javascript crowd :-)
Another little thing (from my own point of view) is that many applications don't live in a vacuum: they use JSON Schema or WSDL, and databases with types and constraints. So what the language does not "type", the context does.
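A small sketch of the context doing the typing, using SQLite from the Python standard library. The `users` table and its constraints are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE users (
        name TEXT NOT NULL,
        age  INTEGER NOT NULL CHECK (typeof(age) = 'integer' AND age >= 0)
    )
    """
)

conn.execute("INSERT INTO users VALUES (?, ?)", ("ann", 41))

try:
    # The Python code never declared a type for age, but the schema did.
    conn.execute("INSERT INTO users VALUES (?, ?)", ("bob", "unknown"))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

The check still happens at runtime, at the boundary with the database, so it complements rather than replaces whatever the language itself verifies.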
I would add fun.
I know "fun" is highly subjective but still important none the less.
performance at runtime is usually a factor in type systems as well.
Rather than conduct experiments I believe that existing data still holds an answer. There's one metric that hasn't been looked at. Many projects over a long period of time tend to get rewritten in a different pattern or a new language/framework. I would say dynamic languages tend to have this problem in greater proportion over say a typed language like java. This is a direct long term marker for the maintainability of a language.
Counterpoint: Java projects tend to be maintained rather than rewritten because the verbosity of the language makes it difficult to tell boilerplate from productive code. It's less dramatic to rewrite in a dynamic language because understanding the full system before and after is easier.
I think what's often missing from these arguments is that statically checking (or inferring) homogenous lists is probably one of the most superficial uses of the type system in Haskell (and indeed not the interesting feature most power-users of Haskell are interested in as far as I can tell).
What is interesting is using the type system to specify invariants about data structures and functions at the type level before they are implemented. This has two effects:
The developer is encouraged to think of the invariants before trying to prove that their implementation satisfies them. This approach to software development asks the programmer to consider side-effects, error cases, and data transformations before committing to writing an implementation. Writing the implementation proves the invariant if the program type checks.
(Of course Haskell's type system in its lowest-common denominator form is simply typed but with extensions it can be made to be dependently typed).
The second interesting property is that, given a sufficiently expressive type system (which means Haskell with a plethora of extensions... or just Idris/Lean/Agda), it is possible to encode invariants about complex data structures at the type level. I'm not talking about enforcing homogenous lists of record types. I'm talking about ensuring that Red-Black Trees are properly balanced. This gets much more interesting when embedding DSLs into such a programming language that compile down to more "unsafe" languages.
List typing isn't as superficial as it seems. The following has happened to me multiple times, perhaps in the last month:
I have a large code base. I want to replace a fundamental data structure to support more operations/invariants/performance guarantees. I change the type at the roots of the code base. My instance of ghcid notifies me of the first type error. I fix it. This repeats until the program compiles again. I run the tests. All the tests pass.
This is insane in Python/C/Ruby. I've had to do it in C and Python. In Haskell I do it with impunity.
The type system doesn't just check what my program does, it is the compass, map, and hiking gear that gets me through the wilderness.
Yup, love that about statically typed languages. This happens all the time for me in C#. I have several libraries I like that do a lot of code generation. When the project is young, directly handling the generated classes works well but as the project grows, I inevitably want to wrap the handling of the generated classes. It's awesome to be like, welp... it's time to handle this one type differently. Change the return type of an interface or the type of a container and just have the compiler tell you everything you broke.
I’d argue that code generation is an anti-pattern that is only necessary because of static typing. A dynamic language would let you change the implementation of all generated objects simultaneously.
Code generation is, in essence, the ultimate static construct since it allows you to compile any feature you would achieve through late binding and reflection, but with a static code path. Which in turn lets you leverage the compiler toolchain more and bring more errors towards compile time (where they're cheap to resolve).
Dynamic languages can always lean on runtime features, but that's also their peril. Late binding deprives you of leveraging the tools in favor of "trust me".
In both cases you can get a maintenance nightmare, of course. The point as I see it is to move things toward the runtime when the error case is not troublesome, and towards the compiler when automating in more safeguards would help.
Code generation by macros is an important feature of Lisps, which tend to be dynamically typed. Maybe it feels less like code generation when you don't also have to produce the correct type annotations, but a good inference engine can eliminate most of those from statically typed languages as well.
You can actually do a huge amount of metaprogramming in dependent type systems and still have the power of static checking. We're still figuring out how to improve the ergonomics to the level that it matches macros and other code-gen methods, but it's super exciting stuff.
That may be entirely true and everything you can achieve with Roslyn and T4 templates may be achievable through some more elegant construct in another language.
But I've never had the luxury of choosing a tech stack for its purity of design. So the feature is wonderful in my day to day regardless.
I'd love to build 3D experiences in a language like lisp or scheme. It'd be great fun to learn but I don't currently have the luxury of the time it would take to ramp.
And I certainly don't have the political capital to convince my entire dev team to change.
In principle, this should be possible with static typing with a robust enough type system, too.
A lot of generated code could be done with reflection or meta-object protocols.
There's a good reason people generate code instead (in statically and dynamically typed languages!): performance.
I find that people making these claims have seldom really worked in large Python codebases...
Personally, I find it pretty workable in Python with a big codebase (but you have to respect the rules like having a good test suite -- you change the time from compiling to running your test suite -- which you should have anyways)...
I find the current Python codebase I'm working on (15 years old, around 15k modules; started back in Python 2.4, currently on Python 2.7 and moving to 3.5) pretty good to work in. There's a whole class of refactoring that you do in typed languages to satisfy the compiler that you don't even need to care about in the Python world (I also work on a reasonably big Java codebase, and refactorings are needed much more there because of that).
I must say I'm usually happier in the Python realm, although we do use some patterns for types, such as the adapter protocol, quite a bit (so you ask for a type and expect to get it without needing a compiler to tell you it's correct, and I remember very few cases of errors related to having a wrong type -- I definitely had many more null pointer exceptions in Java than typing errors in Python), and we do have some utilities to specify an interface and check that another class implements all the needed methods (so you can do light type checking in Python without a fully statically checked language, and I don't really miss a compiler for doing that type checking...).
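For readers who haven't seen this style, here is a minimal sketch of what such an interface-checking utility might look like. The names `missing_methods`, `Shape`, `Circle` and `BrokenCircle` are hypothetical illustrations, not the utilities from the codebase described above, and the check only looks at whether methods exist, not at their signatures.

```python
import inspect


def missing_methods(cls, interface):
    """Return the public method names declared on `interface` that `cls` lacks."""
    required = [
        name
        for name, _member in inspect.getmembers(interface, inspect.isfunction)
        if not name.startswith("_")
    ]
    return [name for name in required if not callable(getattr(cls, name, None))]


class Shape:
    """Hypothetical interface: the methods only declare what must exist."""

    def area(self):
        raise NotImplementedError

    def perimeter(self):
        raise NotImplementedError


class Circle:
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return 3.14159 * self.radius ** 2

    def perimeter(self):
        return 2 * 3.14159 * self.radius


class BrokenCircle:
    # Deliberately incomplete: no perimeter().
    def area(self):
        return 0.0
```

Calling `missing_methods(BrokenCircle, Shape)` at import time or in a test reports the gap without any static checker involved; the trade-off is that the check only runs when that code runs.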
I do think about things such as immutability, etc., but feel like the 'we're all responsible grown ups' stance of Python is easier to work with... i.e.: if I prefix something with '_' you should not access it, the compiler doesn't have to scream that it's private and I don't need to clutter my code -- you do need good/responsible developers though (I see that as an advantage).
> I definitely had much more null pointer exceptions on java than typing errors on Python
Null pointers are a type error. The fact that several nominally "statically" typed languages don't differentiate between nullable and non-nullable types is a significant source of failure in their type systems. Using a modern language that properly identifies nullable values as a distinct type from non-nullable ones goes a long way towards eliminating a whole host of problems. It will be interesting to see what things look like in 10 years or so once Rust has had time to really displace a significant portion of the C and C++ code in the wild, and hopefully Kotlin has killed off Java (and if we're really lucky TypeScript has done the same more or less with JavaScript).
I'd worked in OpenStack for 4 years and with Python in general for about 10 years.
I can say from my experience it is definitely possible to maintain large codebases in Python. The type errors of the superficial variety that the OP refers to were usually caught before they made it to production (and were rare besides if you were experienced enough to avoid them). It requires discipline to maintain tests and write code in a way that avoids errors.
I've been learning OCaml and Haskell for a couple of years along with formal methods using TLA+ and Lean. I used to think type theory was the accounting of maths. I still think that's at least partly true but the power it brings you as a programmer is considerable.
I find working with Haskell or OCaml to be much more productive. Instead of stepping through a debugger or following tracebacks (a descriptive error) I get prescriptive errors as I make changes to a Haskell codebase. The propositions in the type system form a much better specification than unit tests alone.
I still like Python and C for many reasons and will continue using them where appropriate. However I think Haskell/OCaml offer quite enough power that everyone should at least consider what they bring to the table.
I do the exact same thing on large codebases of Ruby, but instead of the compiler type-checking errors, it's the test suite errors.
It's true that static type-checking proves the absence of an entire class of errors. But it doesn't prove that the code does the correct thing; it could be well-typed but completely wrong. On the other hand, tests prove that the code does the correct thing in certain cases. ...Of course, it's up to the developers to actually write a good test suite.
The faster we can all accept that there are pros and cons to both, the faster we can come up with a solution that takes advantage of the best of both worlds. That's the whole point of this OP.
I, personally, have always wondered about ways to dial in to the sweet-spot over time as a project matures. At the start of a project, shipping new features faster is often more important. But if the project survives, maintenance (by new developers) and backward compatibility become more and more of a priority.
> It's true that static type-checking proves the absence of an entire class of errors. But it doesn't prove that the code does the correct thing;
So prove it yourself. Proving things about programs that rely on dynamism is invariably much harder than proving things about programs that don't.
> This is insane in Python/C/Ruby.
It's only insane if you don't have test cases with good coverage, in which case you are very-very-very screwed, statically typed or not.
Testing is not the answer to the problem of knowing if I refactored everywhere necessary. In a statically typed language I don't even have to worry about what to do if my function is passed a var of the wrong type: it won't compile.
I don't really get the lumping of C in with Python and Ruby here. The C compiler picks up on that, too. All over this comment section people are calling C weakly typed, I don't get it. Is it because void* exists? Every language has something like that.
It's not just that void * exists, but that it's basically mandatory.
C's built-in arrays are super weak, so you need some library to do proper resizable arrays. Since C doesn't have generics, such a library will use void * as the type for putting values into the array and getting them back out again. You'll be casting at every point of use, and nothing will check to make sure you got the cast right, other than running the code and crashing.
> C's built-in arrays are super weak, so you need some library to do proper resizable arrays. Since C doesn't have generics, such a library will use void * as the type for putting values into the array and getting them back out again. You'll be casting at every point of use, and nothing will check to make sure you got the cast right, other than running the code and crashing.
There are other options though like macros and code generation. Code gen in particular can give you more options than generics without sacrificing any type safety.
I don't know why macro-based containers aren't more popular. You can have a quite usable interface, and even the implementation isn't that ugly. Example:
http://attractivechaos.github.io/klib/#Khash%3A%20generic%20...
This approach isn't just more strongly typed than using void * for everything, it's also typically more efficient. For an array, you can put larger structs directly in the array rather than being forced to use a pointer. For a hash table, the same applies, plus you can avoid expensive indirect calls to compute hashes. (There are alternative non-macro approaches that trade off that overhead for other types of overhead, but you can't do as well as with a specialized container.)
I guess you could ask, at that point why not just use C++? And a lot of people do, and the people left writing new C programs are often traditionalists who don't want to switch to new approaches. And to be fair, there are disadvantages to macro-based containers, like increasing build time. But I still think there's room for them to see more adoption.
I use a code generator. It has a great type system, it's composable, mature and it can even compile pretty fast with the right tooling. It's called C++!
Templates are good for some things but they only do a fraction of what code generators can do. With code generation you can generate types from database tables, web APIs, etc. You can do things like declaratively defining database views and generating huge chunks of an application. It can handle all sorts of boilerplate code that you can't handle with templates alone.
In this thread, a lot of people are conflating strong/weak typing with static/dynamic typing.
Static typing versus dynamic typing is fairly binary: if your types are checked at compile time they're probably static, while if they're checked at run time they are probably dynamic. Haskell and C are statically typed; Python and JavaScript are dynamically typed.
Strong/weak typing is more of a spectrum. A strong type system can check many properties of programs and accommodate many patterns as types. A weak type system, on the other hand, can't check many properties of programs, and has to be bypassed to accommodate common patterns. JavaScript has probably the weakest type system, because it checks almost nothing ("hi" + 42 returns "hi42" even though this is nonsensical, {}.foo returns undefined rather than throwing a type error). C is fairly weakly typed because you can add disparate types (int* + int returns int* even if you intended to add two integers) and the type system has to be bypassed with void* to do anything sizeable. Python, ironically, is slightly stronger, in that applying operators to objects of types with no defined relationship throws exceptions ("hi" + 42 errors). A spectrum from weakest to strongest might look something like: JavaScript, C, Go, Ruby, Python, Java, C#, OCaml, Haskell.
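To make the Python end of that comparison concrete, here is a small sketch. The helper name `try_add` is made up for illustration; it just applies `+` and reports whether Python raised a type error.

```python
def try_add(left, right):
    """Apply + and report either the result or the type error that was raised."""
    try:
        return ("ok", left + right)
    except TypeError as exc:
        return ("type error", type(exc).__name__)


# str + int has no defined meaning in Python, so the error surfaces at the
# exact operation instead of producing a value like "hi42".
mixed = try_add("hi", 42)

# Operands with a defined relationship still work as expected.
strings = try_add("hi", "42")
numbers = try_add(1, 2.5)
```

Note that Python does define some mixed-type operations (int plus float, for instance), so this is about types with no defined relationship, not about refusing every mix.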
My personal experience is that the difference between static and dynamic types isn't very important to my development process or code quality. I have to run and unit test my code to verify it, so the checks happen regardless of whether they happen at compile time or run time. But the difference between strong and weak typing is huge. Strong types catch more bugs, but perhaps more importantly, they catch those bugs where they occur. A type error when adding "hi" + 42 is far more useful for debugging than a mysterious unit test failure on a completely different function where it's returning "Hi42username" instead of "Hi Username" because you added the wrong variable. A segfault 30 lines later is harder to debug than an error when trying to add an int to the value at an int*.
> The C compiler picks up on that, too.
Except when it doesn't. Just five days ago I was debugging an error in my Erlang port driver that was caused by me passing the receiver (ErlDrvTerm, an int in disguise) in the place where I wanted the number of iterations. The funnier thing was that the declaration of the function had the arguments in the correct order (and that's what guided me), but the definition had them swapped. The compiler did not catch that bug, because, well, both are ints, so apparently the declaration and definition match, don't they?
That’s why people should use one-element structs instead of typedefs for that kind of use case. A struct is a distinct type that can’t be accidentally mixed up with random integers, but its memory representation and efficiency will generally be identical; and you can add one-liner conversion functions to minimize syntactic overhead when you do need to convert to/from raw integers. Same idea as ‘newtype’ in Haskell; it works pretty much just as well in C, at least for ‘ID number’ sorts of types where you’re usually just shuttling values from place to place rather than doing any arithmetic. (For types where you do need arithmetic, it gets pretty ugly in languages without operator overloading. Except Go, which doesn’t have operator overloading but does have builtin support for defining distinct versions of integer types.)
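The wrapper-type idea carries over to other languages too. Below is a rough Python sketch of it, with the important caveat that the check here happens at run time, whereas the C struct (or Haskell newtype) version is rejected by the compiler. `ReceiverId`, `IterationCount`, `send_repeated` and `call_is_rejected` are hypothetical names chosen to mirror the swapped-argument bug described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReceiverId:
    """Wraps a raw int so it cannot be mistaken for any other int."""
    value: int


@dataclass(frozen=True)
class IterationCount:
    value: int


def send_repeated(receiver, iterations):
    """Hypothetical function with two int-like parameters that are easy to swap."""
    if not isinstance(receiver, ReceiverId):
        raise TypeError("receiver must be a ReceiverId")
    if not isinstance(iterations, IterationCount):
        raise TypeError("iterations must be an IterationCount")
    return f"send to {receiver.value} x{iterations.value}"


def call_is_rejected(receiver, iterations):
    """Return True if send_repeated refuses the given arguments."""
    try:
        send_repeated(receiver, iterations)
    except TypeError:
        return True
    return False
```

With plain ints both argument orders would be accepted silently; with the wrappers, the swapped call fails at the boundary of `send_repeated` instead of somewhere downstream.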
In fact, make the structs opaque so that you retain full control over the data and its invariants.
void * is part of it, but you can also implicitly convert between integer types, and also between integers and enums. Think of passing an enum or an int into a function which takes a long as an argument.
I mean... I don't really think that's a strong case for calling C's type system weak.
I suspect most people who state that C has a weak type system are really talking about the fact that C has a weakly _enforced_ type system. You can break the type system's rules rather easily (or perhaps it's more accurate to say the type system is too permissive); either way, it doesn't provide you the same guarantees a stronger type system provides. At least that's my take on it.
This.
I would say it's mostly a matter of use: In C you deal with void* or typecasts all the time, whereas in higher level languages it's much less common, either because the type system is smarter, or the constraints that it does have are more strictly enforced. For example: you can happily compare a char* and an int in C, but other languages like Python might error at the thought.
C is incredibly permissive with regard to its types which are themselves very anemic. With the exception of the numeric primitives, C really only has a single type, the pointer, everything else is just syntactic sugar for various forms of pointer arithmetic. For instance arrays in C are just a shortcut for some pointer plus an offset multiplied by a constant determined at compile time based on what you've claimed is the underlying struct or primitive of the array. Importantly C is perfectly happy to take any random pointer into arbitrary memory and allow you to map any set of offsets into it. It's worth looking at for instance Rust that at least in theory allows the same thing to be done, but only by explicitly opting out of static checks via unsafe declarations. In normal safe code Rust will statically verify that a given reference (pointer more or less) is in fact referring to the type your code is expecting it to, rather than the C approach of simply assuming the program is correct. Looked at another way, as far as the C compiler is concerned nearly everything is a pointer, and one kind of pointer is entirely exchangeable with another kind of pointer (with at most a cast being required, but probably not even that if it's the entirely too common case of being a void pointer). This is in contrast to nearly every other statically typed language that will either at compile or runtime verify that any given reference is the appropriate type before dereferencing it. C++ nominally at least has a more powerful type system, but since it was designed (in theory at least) as a superset of C, C's permissivity blows a giant gaping hole in its type system.
/t/tmp.1q8r9dZAtX > cat test.c
int main() {
    char *test = "test";
    int i = 10;
    return test == i;
}
/t/tmp.1q8r9dZAtX > cc test.c
test.c: In function ‘main’:
test.c:4:14: warning: comparison between pointer and integer
    return test == i;
                ^~
$ python -c '"test" == 10'
$
> The developer is encouraged to think of the invariants before trying to prove that their implementation satisfies them. This approach to software development asks the programmer to consider side-effects, error cases, and data transformations before committing to writing an implementation. Writing the implementation proves the invariant if the program type checks.
I really wish more languages took this to the logical conclusion and implemented first-class contract support. It seems work on contracts stopped with Eiffel (although I've heard that Clojure spec is _kinda_ getting there).
Racket took the baton from Eiffel: https://docs.racket-lang.org/guide/contract-boundaries.html
Check out Dafny, Whiley, and Liquid Haskell.
Ada also supports contracts, as does .NET with code contracts.
Contracts are useless. They merely describe what you want your code to be, not what your code actually is. Just grab a pen and a piece of paper, and start proving things about your programs.
> Contracts are useless.
No, they are not.
1. If the code doesn't conform to the contract, it will fail on the contract boundary, with a well-defined error. If this is useless, then `assert` is also useless, which it is, of course, not.
2. With a sufficiently well-designed language and sufficiently smart compiler, you can move some contract checks to compile-time. See Racket.
3. If your language supports both static and dynamic typing, the contracts are a dual of static types, which lets you interface the static and dynamic parts of the code seamlessly and automatically (in both directions). Again, see Racket.
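To illustrate point 1 only, here is a minimal sketch of a run-time contract check in Python. This is not Racket's contract system and not any real library's API; `contract`, `ContractError`, `mean_of_squares` and `violates_contract` are hypothetical names made up for the example.

```python
import functools


class ContractError(AssertionError):
    """Raised when a call violates a declared precondition or postcondition."""


def contract(pre=None, post=None):
    """Wrap a function so its contract is checked at the call boundary."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if pre is not None and not pre(*args, **kwargs):
                raise ContractError(f"precondition of {fn.__name__} violated")
            result = fn(*args, **kwargs)
            if post is not None and not post(result):
                raise ContractError(f"postcondition of {fn.__name__} violated")
            return result
        return wrapper
    return decorate


@contract(pre=lambda xs: len(xs) > 0, post=lambda r: r >= 0)
def mean_of_squares(xs):
    return sum(x * x for x in xs) / len(xs)


def violates_contract(fn, *args):
    """Return True if calling fn(*args) trips a contract check."""
    try:
        fn(*args)
    except ContractError:
        return True
    return False
```

Without the decorator, an empty list would surface as a `ZeroDivisionError` from inside the function body; with it, the failure is reported at the boundary and names the violated condition, which is the "well-defined error" being argued for.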
Meta: I wonder, why it's mostly static-typing proponents who aggressively evangelize, insult the other side, are 110% sure they're right even though there is no scientific evidence and so on. Could it be the bondage&discipline approach of static typing just appeals to people with a certain mindset, who are statistically more probable to engage in such behaviors, no matter the subject?
0. Trapping the error doesn't make your program any less wrong. I agree that `assert` is equally useless.
1. With pencil and paper, you don't need to wait for a smart compiler - you can get started proving things about your programs today!
2. I can totally see what's coming next: “Being wrong is dual to being right, so being wrong is another possibility worth exploring”. Right?
> Meta: (slander)
No comment.
Static typing prevents bugs in code to the degree that the programmer can correctly encode the desired behavior of the program into the type system. Relatively little behavior can be encoded in inexpressive type systems, so there's a lot of room for bugs that have nothing to do with types. A lot more behavior (e.g. the sorts of invariants mentioned in agentultra's top level comment) can be encoded in a more expressive type system, but you then have the challenge of encoding it /correctly/. A lot of that kind of thinking is the same as the kind of thinking you'd have to do writing in a dynamic language, but you get more assurances when your type system gives you feedback about whether you're thinking about the problem right.
For my money, I work in a primarily dynamic language and I already have a set of practices that usually prevent relatively simple type mismatches so I very rarely see bugs slip into production that involve type mismatches that would be caught by a Go-level type system, and just that level of type information would add a lot of overhead to my code.
But if I were already using types, a more expressive system could probably catch a lot of invariance issues. So I feel like the sweet spot graph is more bimodal for me: the initial cost of switching to a basic static type system wouldn't buy me a lot in terms of effort-to-caught-bugs-ratio, but there's a kind of longer term payout that might make it worth it as the type system becomes more expressive.
> Static typing prevents bugs in code to the degree that the programmer can correctly encode the desired behavior of the program into the type system.
Exactly. The author of the article implicitly equates "statically verified code" with "bug-free code". But that's not correct. It's quite possible (and even, dare I say it, fairly common) to have code that expresses, in perfectly type-correct fashion, an algorithm that doesn't do what the user actually wants it to do. Static typing doesn't catch that.
It depends on your type system. In new languages like Idris or F* you can encode in the type the correctness of an algorithm and it will not compile if the compiler can not prove that correctness.
For example I can prove my string reverse works in Idris (https://www.stackbuilders.com/news/reverse-reverse-theorem-p...). Or I could prove that my function squares all elements in a list. Etc.
Now a big part of the problem is expressing with sufficient accuracy what the properties of the algorithm you want to prove are. For example for string reverse I may want to show more than that `reverse (reverse s) = s`. Since after all if reverse does nothing that would still be true. I would probably want to express that the first and last chars swap when I just call reverse xs.
> For example I can prove my string reverse works in Idris (https://www.stackbuilders.com/news/reverse-reverse-theorem-p...).
This article basically demonstrates GP point, though. It proves that `reverse` is self-inverse, but there are lots and lots of functions that are self-inverse (for example, `x -> x` is self-inverse. As would be the function that swaps any odd-index element with the one following it).
The claim was about the difficulty of encoding actual correctness into your type system. That this article doesn't actually encode the correctness of reverse seems like pretty good evidence for that difficulty.
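A quick sketch of that point, in Python and with sample-based checks rather than proofs (so it illustrates the specification gap, not the type-level encoding). The function names and the `SAMPLES` list are made up for illustration; `swap_pairs` swaps each character at an even 0-based index with the one following it.

```python
def identity(s):
    return s


def swap_pairs(s):
    """Swap each even-index character with the one following it."""
    chars = list(s)
    for i in range(0, len(chars) - 1, 2):
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def real_reverse(s):
    return s[::-1]


SAMPLES = ["", "a", "ab", "abc", "hello", "abcdef"]


def is_self_inverse(f):
    """Weak spec: applying f twice gives back the input."""
    return all(f(f(s)) == s for s in SAMPLES)


def mirrors_indices(f):
    """Stronger spec: character i of the output is character n-1-i of the input."""
    return all(
        len(f(s)) == len(s)
        and all(f(s)[i] == s[len(s) - 1 - i] for i in range(len(s)))
        for s in SAMPLES
    )
```

All three functions pass `is_self_inverse`, but only `real_reverse` passes `mirrors_indices`, so the self-inverse property alone does not pin down "reverse".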
This is precisely what I said in the next paragraph and even mentioned a way to correctly encode that "correctness". Where a _correct_ implementation of reverse has the property `strHead' s = strTail' (reverse s)` recursively.
It's important to understand that type systems can encode correctness to the level you can specify it. So the program is therefore bug-free to the accuracy of your requirements on it.
Most people do not work with type systems which can do this and are unfamiliar with formal verification. The author states directly (and argues throughout) that there is a correlation, not that bug-free code and static analysis are the same thing.
Sorry, you are right, of course. I fell victim to one of the internet's classic blunders: Skimming a long comment thread and not carefully reading what I reply to in the end :)
There's no such thing as "proving correctness". You can have bugs in the type definitions. You can have bugs in the English (or whatever your native language is) description of what you think the algorithm should be doing. You can prove a program does what the types say it should do but that is not what "correctness" means.
>Now a big part of the problem is expressing with sufficient accuracy what the properties of the algorithm you want to prove are. For example for string reverse I may want to show more than that `reverse (reverse s) = s`. Since after all if reverse does nothing that would still be true. I would probably want to express that the first and last chars swap when I just call reverse xs.
This is no different from writing tests in a dynamic language.
> This is no different from writing tests in a dynamic language.
Types and tests are not equivalent. This is a prevalent myth among dynamic typing enthusiasts. There is no unit test that can ensure, eg. race and deadlock freedom, but there are type systems that can do so. There are many such properties, and tests can't help you there.
Types verify stronger properties than tests will ever be able to, full stop. You don't always need the stronger properties of types, except when you do.
To add some color here on the difference between a type proof and a test: consider that you can never test all possible strings for reverse.
However, a type proof can show that reverse, reverses all possible strings.
It is possible to test that a function on 16 bit integers returns the correct value for all inputs. Doing so would be a proof by exhaustion.
Type based proofs let us prove things using other methods than exhaustion, which is the only possible way to prove things with tests. That is an important property.
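As a sketch of what "proof by exhaustion" looks like in practice, the following checks a bit-counting function against a straightforward reference on every 16-bit input. The function names are hypothetical, and the choice of popcount is just an example of a function whose trick-based implementation is easy to get subtly wrong.

```python
def popcount16(x):
    """Count set bits in a 16-bit value using the usual mask-and-add trick."""
    x = (x & 0x5555) + ((x >> 1) & 0x5555)
    x = (x & 0x3333) + ((x >> 2) & 0x3333)
    x = (x & 0x0F0F) + ((x >> 4) & 0x0F0F)
    x = (x & 0x00FF) + ((x >> 8) & 0x00FF)
    return x


def reference_popcount(x):
    """Obviously-correct reference: count the '1' digits in the binary string."""
    return bin(x).count("1")


def first_counterexample():
    """Check every 16-bit input; return the first mismatch, or None if none exists."""
    for x in range(1 << 16):
        if popcount16(x) != reference_popcount(x):
            return x
    return None
```

This works only because the domain has 65,536 elements. For strings, or even 64-bit integers, the same loop is out of reach, which is where proofs by other methods come in.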
Indeed, or to summarize as a soundbite: tests can only prove existential properties but types can prove universal properties.
> There's no such thing as "proving correctness". You can have bugs in the type definitions.
You can have bugs in type definitions and you can have bugs in tests as well and that'll be a problem until computers can read our minds. Type definitions are superior at checking what you specified though because the checking is exhaustive. Perfect is the enemy of good and all that.
> This is no different from writing tests in a dynamic language.
It is different. In that example, the type system will verify the stated property is true for all values of "s". Tests only check some specific examples whereas using types in this way tests all possible input and output pairs. It's like comparing a maths proof to checking an equation holds for a few examples you tried.
No, I’m pretty sure there’s a pretty large body of academic and industrial research on proving program correctness that you can’t just hand wave away with sophistic “but what if your type signature is wrong” nonsense. And there’s a huge difference between a test and a proof - a test can only tell you a program doesn’t do what you think it should for a particular case, a proof tells you that your program does exactly what it is supposed to.
Defining "correctness" in terms of types is the CS equivalent of defining "risk" in terms of volatility - it replaces a real and fundamentally unsolvable problem with a problem that, while it has the advantage of being tractable, isn't actually all that important to solve. Great for publishing papers, dangerous when people start confusing the fake problem and the real problem.
And you’re just doubling down on sophistry. Why should anyone take you seriously in this conversation?
Saying "you're doubling down on sophistry" instead of just "you're wrong" doesn't actually make you more convincing.
No, I’m saying sophistry contributes nothing to the conversation. I don’t care about convincing you about the subject at hand: you have the opinion of someone who’s invested too much into justifying their ignorance to actually pick up a textbook and learn the relevant material.
> There's no such thing as "proving correctness".
Given the caveats you mention, is there such thing as proving anything?
There are two meanings of prove. The one that type theorists are using is roughly "to derive your statement from axioms with pure logic." This sense of the term can never apply to things in the real world, like programs (of course, a program also exists as an abstraction about which things can be proved, but when you're talking about programs which are actually doing things in the real world, you can't treat them as being pure logic).
The second sense is the scientific "gather enough supporting evidence that you are reasonably sure". In this sense, you can prove a lot of things.
> There are two meanings of prove. The one that type theorists are using is roughly "to derive your statement from axioms with pure logic." This sense of the term can never apply to things in the real world...
I am not sure it's a meaningful distinction. There are no triangles in the "real" world -- if you look close under a microscope, there will be more than three sides -- but geometry proves to be of great practical value just the same.
> You can have bugs in the english (or whatever your native language is) description of what you think the algorithm should be doing.
According to the principle of separation of concerns, this isn't the programmer's problem.
> You can prove a program does what the types say it should do but that is not what "correctness" means.
Of course, the ultimate arbiter of what “correctness” means is the program specification.
> The author of the article implicitly equates "statically verified code" with "bug-free code".
Not at all. First, the statements as put here are discrete (boolean even) while I present both "statically verified code" and "bug-freedom" as living on a continuum. Secondly, I don't equate them. If anything, I assume a monotonic, positive relationship between them (strictly speaking not even that. I make pretty clear that the curves could also have whatever shape. But I yield that I am very suggestive in this because I do strongly believe it to be the case). In fact, one of the main points of the argument is that the two are not equal - otherwise, the blue curves I drew would all be straight lines from (0,0) to (1,1). And lastly, none of this is done implicitly. I mention all of this pretty explicitly :)
> It's quite possible (and even, dare I say it, fairly common) to have code that expresses, in perfectly type-correct fashion, an algorithm that doesn't do what the user actually wants it to do. Static typing doesn't catch that.
It's also possible to grab a knife with your hand on the blade edge and cut yourself, but that doesn't diminish the safety value of knife handles.
Or as Knuth pithily put it, "Beware of bugs in the above code; I have only proved it correct, not tried it."
The article was not trying to discuss how to make programmers smarter. No language is going to help with that so there is no point in talking about it. As far as the scope of the article is concerned, it's fair to say that statically verified code equals bug-free code.
I had a related reaction, which is that the problems you're mentioning can become more complex when using libraries outside the standard library with static languages.
E.g., my experience is that poor library design can sometimes be exacerbated in statically typed languages if the type logic is poor and doesn't match the problem domain. Dynamic languages sometimes inadvertently "correct" for this by smoothing over these sorts of issues.
I prefer static languages (or at least optionally typed ones) but there can be big downsides of the sort you're mentioning, that are exacerbated by third-party libraries.
> I already have a set of practices that usually prevent relatively simple type mismatches
Care to share those practices? I also primarily work (this year, at least) in dynamically typed languages.
I think it depends on language and context a lot so I can only give some vague advice. A lot of the idea is to write code that only does the one thing you know you need it to do, does it very simply and fails very obviously when it's not used correctly. In some cases your instinct is to make things as generic as possible at the first pass. Preemptive abstraction pays off more in statically typed languages, in my experience, because it saves you the time of refactoring a type system in place if you turn out to need flexibility later. But in dynamic languages it can introduce complexity that hides bugs.
Simple type-level errors come up the most when you have types that are easily conflated. That tends to happen when you have functions that accept more than one type of thing or output more than one type of thing - avoid that. Avoid polymorphism and OOP patterns that set up a complicated type hierarchy or override methods - you don't want any instance where you end up with something that looks a lot like one type of thing but isn't. Type hierarchies can often be factored out into behaviors provided by modules that supply functions that operate on plain data structures. For variables and parameters, stick to really simple types to whatever degree it's possible, e.g. language primitives, plain old data-container objects and "maybe" types (e.g. things that could be a primitive or null) when absolutely necessary (and check them whenever you might have one). Use union types extremely sparingly. Assignment/creation bottlenecks are useful: try to have only one source for objects of a certain type that always constructs them the same way (so you don't end up with missing fields).
A lot of programmers coming from a language with a stronger type system (especially when transitioning from OOP languages to functional or hybrid languages) tend to be nervous about writing functions without a guarantee about what kind of inputs they'll see, so they try to compensate for the lack of type safety by building functions that can cope with whatever is thrown at them. The idea is that this makes the function more robust but ironically, this tends to make bugs a lot harder to track down. In my experience it's better to write functions with specific expectations about their input that fast-fail if those aren't met, instead of trying to recover in some way - garbage-in-exceptions-out is better than garbage-in-garbage-out. If you send the wrong kind of thing to a function, you want it to throw an error then and there, and you'll likely catch it the first time you test that code.
A lot of the idea of this kind of advice is to shift the work that would be done by the compiler's type system to the very first pass of testing - if your program is basically a bunch of functions that only take one sort of thing in each argument slot, only emit one sort of thing as a result and fail fast when those expectations are violated, you'll typically see runtime type errors the first time those functions get executed, which is a lot like seeing them at compile time.
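As a rough illustration of the advice above (one creation bottleneck, one kind of input per argument, garbage-in-exceptions-out), here is a minimal Python sketch. The `User` type and the `make_user` and `birthday` functions are hypothetical names invented for this example, not something from the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    age: int


def make_user(name, age):
    """Single creation point for User: validate once and fail loudly."""
    if not isinstance(name, str) or not name:
        raise TypeError(f"name must be a non-empty str, got {name!r}")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(age, bool) or not isinstance(age, int):
        raise TypeError(f"age must be an int, got {age!r}")
    if age < 0:
        raise ValueError(f"age must be non-negative, got {age}")
    return User(name, age)


def birthday(user):
    """Takes exactly one kind of input and returns exactly one kind of output."""
    if not isinstance(user, User):
        raise TypeError(f"expected User, got {type(user).__name__}")
    return make_user(user.name, user.age + 1)
```

Calling `make_user("Ada", "36")` raises a `TypeError` on the spot instead of letting a string age travel further into the program, so the mistake shows up the first time that code path is exercised.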
> but you get more assurances when your type system gives you feedback about whether you're thinking about the problem right.
It really isn't as much about languages as it is about the people who use them. The key ability is to prove things about programs. Powerful type systems, especially those that have type inference, merely relieve the programmer from some of the most boring parts of the job. Sometimes.
> ... and just that level of type information would add a lot of overhead to my code.
Can you give an example of this overhead?
The biggest issue with claims like "there are only diminishing results when using a type system better than the one provided in my blub language" is that it assumes people keep writing the same style of code, regardless of the assurances a better type system gives you.
"I don't see the benefit of typed languages if I keep writing code as if it was PHP/JavaScript/Go" ... OF COURSE YOU DON'T!
This is missing most of the benefits, because the main benefits of a better type system isn't realized by writing the same code, the benefits are realized by writing code that leverages the new possibilities.
Another benefit of static typing is that it applies to other peoples' code and libraries, not only your own.
Being able to look at the signatures and be certain about what some function _can't_ do is a benefit that untyped languages lack.
I think the failure of "optional" typing in Clojure is a very educational example in this regard.
The failure of newer languages to retrofit nullability information onto Java is another one.
The article makes two main points: a) static typing has a cost and b) thus, any benefit it brings should be examined against that cost.
I am sorry, but I don't really see how you stating more benefits of static typing really counters either of them.
I recommend reading the article again. But this time, try not to read it as defending a specific language (I only mentioned my blub language so that it's a more specific and extensive reference in the cases where I use it - if you are not using my blub language, you should really just ignore everything I write about it specifically) and more as trying to talk on a meta-level about how we discuss these things. Because your comment is an excellent example of how not to do it and the kind of argument that prompted me to this writeup in the first place.
Those are not really 'points', though; they are far too trivial. Obviously, nothing counters them, because they are tautologies that could just as well apply to any subject.
The point is to explore a comparative difference in value, and that is realized through mastery of the tool, not merely living in a world where it exists.
> they are far too trivial.
You'd have thunk I didn't have to make them then. But I did, judging from literally every argument I had about this.
> This is missing most of the benefits, because the main benefits of a better type system isn't realized by writing the same code, the benefits are realized by writing code that leverages the new possibilities.
The inverse is also true; you don't really get the benefits of dynamic typing until you start doing things differently to take advantage of that difference. If you still code like you're in a static language, you'll miss the benefits of a dynamic one.
Optional typing has not failed in Clojure; it's growing with clojure.spec.
What amuses me in all "static typing versus..." discussions is that they are usually a comparison between two camps:
Camp A: Languages with mediocre static typing facilities, for example:
-- C (weakly typed)
-- C++ (weakly typed in parts, plus over-complicated type features)
-- TypeScript (the runtime is weakly typed, because it's Javascript all the way down)
Camp B: Languages with mediocre dynamic typing facilities, for example:
-- Javascript (weakly typed)
-- PHP 4/5 (weakly typed)
-- Python and Ruby (no powerful macro system to help you keep complexity well under control or take full advantage of dynamism)
Neither camp is the best example of static or dynamic typing. A good comparison would be between:
Camp C: Languages with very good static typing facilities, for example:
-- Haskell
-- ML
-- F#
Camp D: Languages with very good dynamic typing facilities, for example:
-- Common Lisp
-- Clojure
-- Scheme/Racket
-- Julia
-- Smalltalk
I think that as long as you stay in camp (A) or (B), you'll not be entirely satisfied, and you will get criticism from the other camp.
It's Camp D that I'm least familiar with here; outside of academic projects in lisp/scheme I've never used them for anything serious.
What exactly does it mean to have "good dynamic typing facilities"?
> What exactly does it mean to have "good dynamic typing facilities"?
To quote Peter Norvig on the difference between Python and Lisp (you could apply it to most other mainstream dynamic languages vs Lisp):
> Python is more dynamic, does less error-checking. In Python you won't get any warnings for undefined functions or fields, or wrong number of arguments passed to a function, or most anything else at load time; you have to wait until run time. The commercial Lisp implementations will flag many of these as warnings; simpler implementations like clisp do not. The one place where Python is demonstrably more dangerous is when you do self.feild = 0 when you meant to type self.field = 0; the former will dynamically create a new field. The equivalent in Lisp, (setf (feild self) 0) will give you an error. On the other hand, accessing an undefined field will give you an error in both languages.
Common Lisp has a (somewhat) sound, standardized language definition, and competing compiler/JIT implementations that are much faster than anything that could ever possibly come from the Python camp. Python is actually too dynamic and ill-defined ("Python is what CPython does"), and making it run fast while ensuring 100% compatibility with its existing ecosystem, without putting further constraints on the language, is akin to chasing a mirage.
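The attribute typo from the Norvig quote above is easy to reproduce. The sketch below shows the silent behaviour, plus one opt-in guard (`__slots__`) that is not mentioned in the quote and is included here only as an illustration; the class names `Loose` and `Strict` are made up.

```python
class Loose:
    def __init__(self):
        self.field = 1


class Strict:
    # __slots__ restricts instances to the attribute names listed here.
    __slots__ = ("field",)

    def __init__(self):
        self.field = 1


loose = Loose()
loose.feild = 0  # typo: silently creates a brand-new attribute
# loose.field is still 1; nothing warned us about the misspelling.

strict = Strict()
try:
    strict.feild = 0  # the same typo on a slotted class
    strict_error = None
except AttributeError as exc:
    strict_error = exc  # the assignment is rejected at run time
```

Note that even the slotted version only fails when the line actually runs, which is the point of the quote: the check happens at run time, not at load time.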
What does “somewhat sound” mean?
I think he refers to some of the usual criticisms of Common Lisp:
1. The language specification is very big. This is true, it is a very big specification. On the other hand, this is mostly caused because the language spec also includes the spec for its own "standard library", unlike what happens in C or Java, for example, where the Std. lib is specified elsewhere. CL's "standard library" is very big, because there are many, many features.
The other reason the spec is so big, is that this is a language with a lot of features - you can do high level programming, low level, complex OOP, design-your-own OOP, bitwise manipulation, arbitrary precision arithmetic, disassemble functions to machine language, redefine classes at runtime, etc etc etc.
Probably the extreme of the features is that there is a sql-like mini-programming language built in just for doing loops (!), "the LOOP macro". On the other hand, you can choose not to use it. And if you use it, it can help you write highly readable and concise code. More info:
http://cl-cookbook.sourceforge.net/loop.html
2. The "cruft": Common Lisp is basically the unification ("common") of at least two main Lisp dialects that were in use during the 70s. So there are some parts (mind you, just some) in which some naming or function parameter orders could have been more consistent; for example, here everything is consistent:
  ;; access a property list by property
  (getf plist property)
  ;; access an array by index
  (aref array index)
  ;; access an object's slot
  (slot-value object slot-name)
... but here the consistency is broken:
  ;; gethash: obtain the element from a hash table, by key
  (gethash key hash-table)
There are also some things that seem to be redundant, for example "setq", where "setf" can do everything you can do with "setq" (and more); or "defparameter" and "defvar", where in theory "setf" might be enough. But there are differences, and knowing such differences helps to write more readable and better code. And it's really nitpicking, for these are easy to overcome.
3. Because of the above, CL is often criticized because of being a language "designed by committee". But, unlike other "committee-designed languages", this one was designed by selecting, from older Lisps, features that were already proven to be a Good Thing, and incorporating them into CL without too many changes. So you can also consider it to be "a curated collection of good features from older Lisps..."
4. Scheme, the other main "Lisp dialect", has a much, much smaller and simpler spec, so it's easier to learn. But on the other hand this also means that many features are just absent, and will need to be implemented by the programmer (or by external libs), without any standardization. On the other hand, due to the extensive standardization, usually Common Lisp code is highly portable between implementations, and often code will run in various CL implementations, straight away, with zero change.
Historically, Scheme was more popular inside the academic community while Common Lisp was more popular with production systems (i.e. science, space, simulation, CAD/CAM, etc.) Thus, there used to be an animosity between Schemers and Lispers, although jumping from one language to other is rather easy...
ANSI CL is what, some 1100-something pages?
JavaScript is up to 885 pages. https://www.ecma-international.org/publications/files/ECMA-S...
The C++ 17 draft: 1623 pages. https://www.ecma-international.org/publications/files/ECMA-S...
C, which "is not a big language, and is not well served by a big book", according to Kernighan and Ritchie's 1988 introduction in the K&R2, is up to 683 pages in C11. Almost triple the size of C90's 230 pages.
How about something non-language? USB 3.2 spec (just released Sep 22): 100+ megabyte .zip file download. Up from 2.0's 73.
0. There is nothing wrong with big standard libraries, so long as they are not redundant and the core language is small.
1. This is a serious criticism, but it has nothing to do with soundness.
2. There is absolutely nothing wrong with a language being designed by a committee, so long as the committee's members are all competent.
3. Back to 0.
>2. There is absolutely nothing wrong with a language being designed by a committee
It sort of has a bad stigma, because two well-known, unloved languages were designed by committee: COBOL and PL/I.
Today, those roles are played by C++17 and C11.
If you Google, you can find tons of good stuff written in Lisps. Julia is an up-and-coming language, but it is already highly regarded by the data science community.
> What exactly does it mean to have "good dynamic typing facilities"?
The ability to change the structure of your program at runtime will be at the top of the list for me. You can't do that with Ruby/Python.
>What exactly does it mean to have "good dynamic typing facilities"?
Picking Common Lisp as an example:
(NOTE: Some of the features are also present in good statically typed languages as well, so what I advocate is to use good, well-featured languages, not really static vs dynamic.)
(NOTE 2: I'm sorry for being such a fanboy, but that thing is addictive like a hard drug...)
0. Code is a first class citizen, and it can be handled just as well as any other type of data. See "macros" below.
1. The system is truly dynamic: Functions can be redefined while the code is running. Objects can change class to a newer version (if you want to), while the code is running.
2. The runtime is very strong with regards to types. It will not allow any type mismatch at all.
3. The error handling system is exemplary: Not only designed to "catch" errors, but also to apply a potential correction and try running the function again. This is known as "condition and restarts", and sadly is not present in many programming languages.
4. The object oriented system (CLOS) allows multiple dispatch. This sometimes allows producing very short, easy to understand code, without having to resort to workarounds. The circle-ellipse problem is solved really easily here. (Note: You can argue that CLOS is in truth a statically typed system, and this is partly true -- the method's class(es) need to be specified statically, but the rest of arguments can be dynamic.)
5. The macro system reduces boilerplate code to exactly zero. And also allows you to reduce the length of your code, or have very explicit (clear to read) code at the high-level. This brings down the level of complexity of your code, and thus makes it easier to manage. It also reduces the need for conventional refactoring, since macros can do much more powerful, automatic, transformations to the existing code.
6. The type system is extensive -- i am not forced to convert 10/3 to 3.333333, because 10/3 can stay as 10/3 (fractional data type). A function that in some cases should return a complex number, will then return the complex number, if that should be the answer, rather than causing an error or (worse) truncating the result to a real. Arbitrary-length numbers are supported, so numbers usually do not overflow or get truncated. (Factorial of 100) / (factorial of 99) gives me the correct result (100), instead of overflowing or losing precision (and thus giving a wrong result).
So you feel safe, because the system will assign your data the data type that suits it the best, and afterwards will not try to attempt any further conversion.
7. The type system is very flexible. For example, i can (optionally) specify that a function's input type shall be an integer between 1 and 10, and the runtime will enforce this restriction.
8. There is an extensive, solid namespace system, so functions and classes are located precisely and names don't conflict with other packages. Symbols can be exported explicitly. This makes managing big codebases much easier, because the frontier between local (private) code versus "code that gets used outside", can be made explicit and enforced.
9. Namespaces for functions and variables (and keywords, and classes) are separate, so they don't clash. Naming things is thus easier; this makes programming a bit more comfortable and code easier to read.
10. Documentation is built into the system - function and class documentation is part of the language standard.
11. Development is interactive. The runtime and compiler are a "living" thing in constant interaction with the user. This allows, for example, for the compiler to immediately tell you where the definition of function "x" is, or which code is using such function. Errors are very explicit and descriptive. Functions are compiled immediately after definition, and they can also be disassembled (to machine language) with just a command.
12. Closures are available. And functions are first-class citizens.
13. Recursion can be used without too many worries -- most implementations allow tail call optimizations.
14. The language can be extended easily; the reader can also be extended if necessary, so new syntaxes can be introduced if you like. Then, they need to be explicitly enabled, of course.
15. There is a clear distinction between "read time", "compile time" and "run time", and you can write code that executes on any of those three times, as you need.
16. Function signatures can be expressed in many ways, including named parameters (which Python also has and IMO is a great way to avoid bugs caused by wrong parameters / wrong passing order.)
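For readers who don't know Lisp, points 6 and 7 in the list above can be approximated in Python, with the caveat that Python needs an explicit `Fraction` and a hand-written check where Common Lisp does this as part of its numeric tower and type declarations. The function `pick_level` is a hypothetical example, not part of any library.

```python
import cmath
from fractions import Fraction
from math import factorial

# Point 6: a ratio can stay a ratio instead of becoming 3.3333...
ratio = Fraction(10, 3)
tripled = ratio * 3  # exactly 10, with no rounding error

# Python integers are arbitrary precision, so this neither overflows
# nor loses digits on the way to the answer.
quotient = factorial(100) // factorial(99)

# A square root that "should" be complex comes back as a complex number.
root = cmath.sqrt(-1)


# Point 7: a run-time check standing in for "an integer between 1 and 10".
def pick_level(level):
    if isinstance(level, bool) or not isinstance(level, int):
        raise TypeError(f"level must be an int, got {level!r}")
    if level not in range(1, 11):
        raise ValueError(f"level must be between 1 and 10, got {level}")
    return level
```

The range check here is ordinary code inside the function body; it only resembles the declared, runtime-enforced type restriction described in point 7.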
> 3. The error handling system is exemplary: Not only designed to "catch" errors, but also to apply a potential correction and try running the function again. This is known as "condition and restarts", and sadly is not present in many programming languages.
I found this extremely weird and (or hence) interesting at the same time. Where can I read more about this?
You are welcome, sir! How about this chapter of the famous book "Practical Common Lisp", available online for free?
http://www.gigamonkeys.com/book/beyond-exception-handling-co...
googling for "common lisp condition system" turns stuff up. i looked at one point, and i don't remember finding an academic treatment of them.
they're (hand wave hand wave) basically a matter of stuffing the current continuation inside of the exception, whenever you throw an exception, and then making use of that to provide more options whenever the exception is caught.
Languages like TypeScript and Racket are also interesting because they let you shift camps partway through the project - you can start off untyped and then add typing annotations.
Perhaps it's because the languages you mentioned barely register in statistics (aside from Clojure maybe).
Maybe it's simply hard to have both a good type system and a friendly learning curve?
The other possibility is that the industry suffers from anti-intellectualism, so we keep reinventing the same 2 languages.
Do you believe that this is plausible?
I think this has happened for a long time. First (70s), features from Algol-68, like structured programming or better flexibility for data types, were ported to other languages. Then (80s-present), the features from Smalltalk and Lisp started slowly to be ported to other languages, many times in an incomplete or inelegant way.
We're still doing this. For example, the last production spec of the Java language finally incorporated a mechanism to pass functions as input parameters to a method. And the next version of Java (9) will attempt to have some interactivity, with a kind of REPL. This, coupled with the powerful facilities of good Java IDEs, will give Java developers of 2017 the level of interactivity and ease of development that Smalltalk and Lisp users have enjoyed since the late 70s. Sad but true.
Julia -an interesting language, by the way- borrows multiple dispatch from the Common Lisp Object System (CLOS), among other features. CLOS itself was a further evolution of the OOP brought to the table by Smalltalk, invented by a true genius: Alan Kay.
Rust is basically a "fixed C++", that is, a more usable, less annoying C++.
So it's difficult to say there are truly new things in programming languages. But not everything is limited to Smalltalk and Lisp -- Prolog, ML (and OCaml, F# and Haskell) do bring new concepts to the table, and are worth checking out.
You managed to pick the two things where Rust is actually not improving on C++, because it's both more annoying and less usable compared to C++.
There's a reason the expression "fighting the borrow checker" was coined.
You're confusing your own biases and preferences for facts.
I find this whole "fighting the borrow checker" thing a tad inflated. I personally don't "fight" it anymore, because it's a simple rule to anyone who's familiar with pointer arithmetic.
Also the compiler usually tells you exactly what you screwed up this time and how to get out of this mess, which cannot be said about C++.
In absence of a garbage collector, what people don't get is that it's really easy to screw up by creating race conditions or memory leaks.
If fighting the borrow checker is annoying, that's because you don't get memory safety otherwise.
The vast majority of vulnerabilities in the wild are created because of sloppy usage of C / C++, which is basically unavoidable in absence of expensive static analyzers that become as annoying as Rust, while not being as good.
>friendly learning curve
To be honest, i love Common Lisp, it might be the most powerful programming language out there, but it's not easy to learn at all. In part because, being a truly multi-paradigm language, you had better make sure you are well versed in most programming paradigms first, otherwise you won't leverage the full power of Lisp. Not to mention the paradigm of meta-programming and DSLs, something that is usually new to programmers foreign to Lisp.
However, languages like Clojure and Smalltalk can be rather easy to learn, and they are fairly powerful.
Smalltalk was designed to be taught to kids!!
>Not to mention the paradigm of meta-programming and DSLs, something that is usually new to programmers foreign to Lisp.
Is it really? What about templates(as in C++ templates), macros, CSS and HTML? These are two examples of metaprogramming and two DSLs respectively.
> However, languages like Clojure and Smalltalk can be rather easy to learn, and they are fairly powerful.
It's not like I don't believe you, but if this is true, then where's the popularity? Why isn't it there? I'm asking because I genuinely don't know.
EDIT: punctuation.
>Is it really? What about templates(as in C++ templates), macros, CSS and HTML?
"Lisp macros" go far, far beyond "C macros" ("preprocessor macros", and indeed go far beyond what you can do with C++ templates. You should take a look, but basically, explained in a few words:
In Lisp, code is data. Code is a first-class citizen. The functions and constructs that are there to manipulate data, also manipulate source code with the same ease. So your code can manipulate code very, very easily. Writing code that creates code "on the fly" -be it at compile time or at runtime- is not only possible, it is also very easy to do, and it is 95% similar to writing regular code.
Thus, Lisp is sometimes described as "the programmable programming language."
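For contrast, here is roughly what "code that manipulates code" looks like in a language where code is not ordinary data. This Python sketch uses the standard `ast` module to rewrite a function before compiling it; the `area` function and the `AddToMult` transformer are made-up examples. The amount of ceremony compared with a Lisp macro is arguably the commenter's point.

```python
import ast

# Source text with a deliberate bug: it adds where it should multiply.
source = "def area(w, h):\n    return w + h\n"

tree = ast.parse(source)


class AddToMult(ast.NodeTransformer):
    """Rewrite every `a + b` expression into `a * b`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # transform nested expressions first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node


new_tree = ast.fix_missing_locations(AddToMult().visit(tree))

# Compile the transformed tree and pull the generated function out.
namespace = {}
exec(compile(new_tree, filename="generated_code", mode="exec"), namespace)
area = namespace["area"]
```

After the rewrite, `area(3, 4)` multiplies its arguments, even though the original source text said `w + h`.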
>It's not like I don't believe you, but if this is true, then where's the popularity?
"Programming is pop-culture" -- Alan Kay.
The reasons a programming language gets highly popular are not always related to its quality. There are also other reasons. Consider Javascript for example. Before the ES6 specification, it was plainly a horrible programming language, full of pitfalls and missing features. You couldn't even be sure of the scope of the variable you just declared!! But it became popular, simply because it was the only programming language usable on all web browsers.
C, for example, was never a great programming language. But it ran efficiently on any hardware, so it started as a (very good) alternative to assembler. And then got more traction.
Then object-oriented programming got popular, because it allowed you to do nice stuff (on the Smalltalk language, where it was very well implemented). So somebody said: ok, i want C with object orientation, and C++ was invented, which wasn't a very good object oriented language, but since C was popular, and OOP was the next big thing, it got wildly popular.
C and C++ languages require you to manually manage memory, unlike in Smalltalk or Lisp, where the memory was automatically managed. So somebody at Sun said "ok, let's make a language with syntax similar to C++, but with automatic memory management", and Java was born, and thus, due to the small learning curve, and a LOT of marketing, went wildly popular, although many of the problems of C++ were present, plus it introduced limitations of its own. (I, as a student, loved Java when i learnt it, after having to use C++. How naive i was!!)
And the story goes on and on.
So it's more about riding the wave of popularity, rather than using the best tool for the job. It has also something to do with the triumph of UNIX over other operating systems. Otherwise, Smalltalk [what the groundbreaking Xerox machines used] and Lisp [what the groundbreaking Lisp Machines, and also the Xerox machines used] would be way more popular.
It also has something to do with speed -- Lisp (in the 60s) used to be a very slow language. Smalltalk (in the 70s and early 80s) used to run very slow as well. They also required a huge amount of memory. Nowadays they are not really memory hungry, and they can run very fast.
Some problems are much easier to express in Prolog, or Haskell, than Java or C++ or javascript; but they aren't popular languages. Popularity sometimes is harmful...
JavaScript was very nice before ES6. All ES6 did was to add syntax sugar/meta. var in JavaScript belongs to the function it is declared in. ES6 made the language way more complicated, and divided the community up into more dialects. The plan was to unite compile-to-JavaScript communities like CoffeeScript, but that didn't work, because there are more compile-to-JavaScript languages now than there have ever been.
The learning curve for Julia is very easy.
> Ruby
Ruby has some of the best meta-programming facilities out there. Yes you can't manipulate syntax in the same way as lisp, but the fact that all methods are message passing and first class blocks make tons of very powerful meta programming possible. Basic features that look otherwise first class are based on Ruby's meta programming facilities like `attr_reader` and friends. Funnily enough, the meta programming facilities of Ruby are precisely what turns a lot of people off. The wtfs per minute of using something like ActiveRecord are super high for people with only passing familiarity because there's so much that's defined through Ruby's meta programming facilities.
I say this as someone that has used and loved Ruby for more than ten years now:
Smalltalk has worlds better dynamic and metaprogramming.
That said, ruby does have a lot of power, but it's not of the same order as self/smalltalk etc.
>Ruby has some of the best meta-programming facilities out there.
Ruby has meta-programming facilities, but they pale compared to the easiness of doing meta-programming in Common Lisp. In Ruby, meta-programming is an advanced topic (see for example the implementation of the RoR ActiveRecord). In Lisp, meta-programming is your everyday bread&butter, and one of the first things a beginner learns. Because it isn't too different from regular programming!!
[The same comment applies, mostly, also to Clojure, Racket, Scheme, and the other Lisps]
So where do Java & C# fit in those categories?
Probably Camp A. They have some static typing, but it's pretty basic. Then again, according to the article posted, they may be in the "sweet spot" for many purposes.
Where's Rust in here?
Category C
How are the dynamic typing facilities in Smalltalk better than those in Ruby?
What makes you put Python and JS in the same basket here? Python is widely raised as a language that has a good dynamic typing system with strong types and good type error handling - arguably better than Lisps (nil punning). JS is infamous for the opposite. Macros are quite orthogonal to this.
> Macros are quite orthogonal to this.
You ain't gonna find any sane way to combine macros with a powerful type system in a way that doesn't make a 140+ IQ a requirement for any programmer touching the code using these features in a real world project...
Problem with programming language design is that the ideal/Nirvana solutions lie at the edge, or beyond, the limits of human intellect. If you want something that can be learnt and understood with reasonable effort (like in not making "5+ years experience" a requirement for even basic productivity on an advanced codebase), you're going to have to compromise heavily! The most obvious ways to compromise are throwing away unlimited abstraction freedom (aka "macros"), or type systems.
Sorry to break it to ya, but we're merely humans, and not that smart...
There is a programming language property called Restrictability - it means that you only need to know a subset of the features the language provides to become productive. The best languages have high restrictability without compromising on the high level features like powerful macros.
The point of having macros is that they allow you to solve problems that cannot be solved elegantly in any other way. But 95% of programmers don't need to solve such problems and can do very well without using macros.
> Restrictability - it means that you only need to know a subset of the features the language provides to become productive
Thanks, but.... NO THANKS! It's basically what makes languages like Scala or C++ horrible - a false sense of "you only need to know this subset of the language" and then you see that in real life: (1) nobody agrees on what that subset is and (2) you are going to have to hack your way through the most advanced frameworks and libraries (written by folk way smarter than you) and you are going to need to do it under unreasonable time pressure!
If a feature exists in the language you will be forced to understand it and become proficient at using it, whether you like it or not. Otherwise you're a "play pen programmer", only comfortable in his little patch of expertise.
I'm personally an "Expert Generalist" and like to be confident I can hack my way through anything this shitty life throws at me ;) This is kind of why I'm starting to love forcefully minimalistic, abstraction-wise-rigid, and intentionally "retarded" languages like Go nowadays :) (But yeah, when dynamic is the way to go, I'd prefer a Lisp with macros any time - one extreme or another, never the middle way, I'm not smart enough for it.)
> it means that you only need to know a subset of the features the language provides to become productive.
That only works when you work by yourself (or in a small team to whom you can dictate the language subset), and without any third party code.
> But 95% of programmers don't need to solve such problems and can do very well without using macros.
Languages that have great macro systems use them to bootstrap themselves. So when you use the standard, documented features, you're using macros.
E.g. if you're writing in Lisp and your file begins with (defun ..., you've just used a macro.
Sure, you can use them as a consumer all the time, but that doesn't mean that you need to write your own to program in Lisps, for instance. For the few that do need it, it's a worthy tool to have.
Rust has macros that seem well liked, and everyday stdlib constructs are implemented using them. Though I can easily believe that the average Rust macro is authored by a smart person.
>What makes you put Python and JS in the same basket here?
I'm very well versed in Python (i've delivered two financial software systems done in Python, written entirely by yours truly). However its features and facilities pale in comparison to the languages i listed in camp "D".
There's one huge benefit to static typing people often forget: self documentation.
While, yes, top-quality dynamic code will have documentation and test cases to make up for this deficiency, it's often still not good enough for me to get my answer without spelunking the source or StackOverflow.
I feel like I learned this the hard way over the years after having to deal with my own code. Without types, I spend nearly twice as long to familiarize myself with whatever atrocity I committed.
Many dynamically typed languages offer excellent runtime contract systems (Racket, Clojure) that serve as an implicit documentation at least as well as a statically-type language. Often more so, because you can express a lot of things in contracts that are not easily expressed in type systems.
> because you can express a lot of things in contracts that are not easily expressed in type systems.
Can you give one or more examples of this?
you can put arbitrary functions in a contract. with static typing that requires dependent types. and while i'm a fan, that's an enormous can of complexity to bust open.
say you've got a function that takes a list of numbers, and some bounds, and gives you back a number from the list that is within the bounds (and maybe meets other criteria, whatever). your contract for the function could require not only that the list be comprised of numbers, and the bounds are numeric, but also that the lower bound is <= the upper bound, and that the return value was actually present in the input list.
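A rough sketch of that contract in Python, using hand-written checks rather than Racket's or Clojure's actual contract systems. The names `pick_within`, `require` and `ContractError` are made up for illustration, and so are the "first match" selection rule and the choice to return `None` when nothing fits.

```python
from numbers import Real


class ContractError(Exception):
    """Raised when a pre- or postcondition does not hold (hypothetical name)."""


def require(condition, message):
    # Tiny helper standing in for a real contract library.
    if not condition:
        raise ContractError(message)


def pick_within(numbers, lower, upper):
    """Return the first element of `numbers` lying in [lower, upper], or None."""
    # Preconditions: checks on the arguments, including a relation between two of them.
    require(all(isinstance(n, Real) for n in numbers), "list must contain only numbers")
    require(isinstance(lower, Real) and isinstance(upper, Real), "bounds must be numeric")
    require(lower <= upper, "lower bound must be <= upper bound")

    # Hypothetical selection rule: take the first element within the bounds.
    result = next((n for n in numbers if lower <= n <= upper), None)

    # Postcondition: whatever comes back was actually present in the input list.
    require(result is None or result in numbers, "result must come from the input list")
    return result
```

The `lower <= upper` check and the postcondition are the parts that an ordinary static type system cannot state without something like dependent types.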
I consider reading the source code to see what something does a feature, if you can understand the code, that is. If the code is easy to understand, there will be fewer bugs.
Having programmed in languages ranging from Ruby to Coq, for web apps and games, I feel the sweet spot is somewhere in the neighborhood of Java/C#, i.e. include generics but maybe leave out stuff like higher kinds and super-advanced type inference (and null!).
The main use case of generics, making collections and datastructures convenient and readable, is more than enough to justify the feature in my view, since virtually all code deals with various kinds of "collections" almost all of the time. It's a very good place to spend a language's "complexity budget".
I wrote an appreciable amount of Go recently, with advice and reviews from several experienced Go users, and the experience pretty much cemented this view for me. An awful lot of energy was wasted memorizing various tricks and conventions to make do with loops, slices and maps where in other languages you'd just call a generic method. Simple concurrency patterns like a worker pool or a parallel map required many lines of error-prone channel boilerplate.
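For comparison, here is roughly what "just call a generic method" looks like for a parallel map, sketched in Python with the standard library's `concurrent.futures`. The helper name `parallel_map` and the default of four workers are made up for illustration; it is not a translation of any particular Go code.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_map(fn, items, workers=4):
    """Apply `fn` to every item on a pool of worker threads.

    Results come back in the same order as `items`.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map handles distributing work and collecting results,
        # which is the part that otherwise needs hand-written channel plumbing.
        return list(pool.map(fn, items))
```

The pool, the result ordering and the shutdown are all handled by the library call, so there is no per-use boilerplate to get wrong.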
> An awful lot of energy was wasted memorizing various tricks and conventions to make do with loops, slices and maps where in other languages you'd just call a generic method.
I feel the same way going from languages with HKTs back to Java/C#...
Not sure why you think they're not as useful, it sounds like you're making the same argument as OP but just moving the bar one notch over...
I am. I think the OP is fundamentally right about the sweet spot being pretty far from either extreme, I just disagree slightly about where exactly :)
Subjectively, I use ordinary generics all the time, but see the need for HKTs only occasionally. It's entirely possible I'm not experienced enough to see most of their possible use cases, but then I'd wager most programmers aren't.
In retrospect, HKTs are arguably Haskell's greatest innovation, enabling extremely general abstractions and huge amounts of code reuse.
In my subjective opinion, Haskell has taken abstraction way past the point of diminishing returns, at least for the problems I tend to work on.
A large portion of advanced Haskell type system features seem to be about emulating things you could do with side-effects. I guess I prefer Rust's approach to managing side-effects, or even just Scala's implied convention of: use 'var' very sparingly, and mostly locally. Yes, some guarantees get traded away, but so much simplicity is gained.
I'm not very experienced with Haskell, but I've written a fair bit of Scala and I've utterly failed to see the value in scalaz and similar libraries, despite trying them a few times. They always seem to add lots of complexity without a tangible benefit.
Coming at it from another angle, I just don't see many cases where I feel I have to repeat myself due to a shortcoming of, say, Java's or C#'s type system. If I could add one feature to either, it'd actually be support for variadic type parameters.
As a counterexample, C# needed expensive language extensions to accommodate both LINQ and Async/await. Both can be implemented in Haskell purely as a library, thanks to HKTs.
Both Java and C# tend to rely heavily on frameworks such as Spring to work around issues with the expressivity of the languages. This causes problems when one needs two frameworks (they don't in general compose). In Haskell, HKTs allow one to write polymorphic programs that are parametric with respect to certain behaviours and dependencies, no dependency injection framework needed.
Please don't judge Haskell using Scala and scalaz.
I'm not sure what Java expressivity problem Spring is meant to solve. XML configs are basically just a duplication of what would be done in a static initializer, except you lose type-checking and get to find your wiring mistakes at startup time instead of compile time. Autowiring annotations can be nice when you first use them, but become inscrutable magic once some other poor sap has to come along and make changes to the original project setup.
I just don't understand what is so horrible and inexpressive about a static initialization block.
The only possible purpose I see to Spring is if for some reason you really need to be able to change how your dependencies are injected at runtime. (90% of Spring apologists point to this, and 99% of them never use it in practice.) Even then, I don't see how a Spring XML config file (which I have seen run to 4000+ lines, to my horror) is better than just reading some settings out of a properties file to pick an implementation in your static initializer.
I guess passing parameters down manually through all the constructors gets too painful. The language is not expressive enough for a Reader monad! :)
Java's static initialiser blocks are too dangerous whenever one has threads.
Not sure about LINQ, I thought that was "just" syntactic sugar for a bunch of collection methods. Are you referring to extension methods as an unfortunate prerequisite?
But I think I get your general point: things like 'Control.Concurrent.Async' ('async'/'await') and 'Control.Monad.Coroutine' ('yield') are libraries that implement some very generic type classes: 'Functor', 'Applicative', 'Monad'. This then lets you use features that are generic over those type classes ('do' syntax, 'fmap', ...).
It's been many years since I had a proper look at Haskell. Maybe it just takes more practice than I had back then to fully "get it". But I still don't see those abstractions being that useful in everyday programming. They seem to have huge potential for hard to follow code as you need to mentally unpack and remember more layers of abstraction, and the gain is not clear to me. Even the features that have trickled down to C# are not _that_ crucial I feel. The way mainstream languages pick the most useful use cases of those abstractions seems pretty OK to me.
(Also, macros and compiler plugins are another interesting avenue towards very powerful abstractions, with a different set of problems.)
As for Spring and dependency injection, I don't follow how HKTs would help there. Could you give an example? Aren't DI frameworks mostly about looking things up with reflection magic to automate, and arguably just obfuscate, the task of wiring things up in 'main'?
>They seem to have huge potential for hard to follow code as you need to mentally unpack and remember more layers of abstraction
That's the beauty of abstraction without side effects, you don't need to unpack anything. If you know what the inputs are and the outputs are, you don't need to know how it works or what type classes are even used to transform certain things.
People use sequence all the time in Scala, not realizing it's only able to be implemented with HKTs of Applicative and Traverse. FYI, sequence flips a list of Futures to a Future of List, or a vector of Trys to a Try of Vector, etc.
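To make the "flip" concrete, here is a sketch in Python specialised to a single case: a list of values that may be `None` (a stand-in for Option, not for Future or Try) becomes either a full list or `None`. The name `sequence_optional` is made up for illustration.

```python
from typing import List, Optional, TypeVar

T = TypeVar("T")


def sequence_optional(items: List[Optional[T]]) -> Optional[List[T]]:
    """Turn a list of optional values into an optional list.

    If any element is missing, the whole result is missing.
    """
    result: List[T] = []
    for item in items:
        if item is None:
            return None
        result.append(item)
    return result
```

Python's type hints have no higher-kinded types, so this signature cannot be written once for "any outer collection and any inner effect". A version for futures or for Try-like results would be a separate function, which is the duplication the generic `sequence` avoids.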
Fair point about 'sequence'. There are probably a bunch of these I use regularly in Scala without realizing it. Though as a counterpoint, 'Future.sequence' wouldn't really lose _that_ much if it didn't return a collection of the same type. And I haven't yet felt the need for a generic `sequence`, which I'm sure scalaz has.
I don't buy your point about not needing to unpack side-effectless code, however. There are _always_ reasons to dig into code, be it bugs, surprising edge cases, poor documentation, insufficient performance, or even just curiosity. And those high-level abstractions tend to be visible in module interfaces too. I remember some Haskell libraries being very hard to figure out how to use if you didn't know your category theory :)
It's a pretty typical symptom I've seen a lot of hardcore FP developers exhibit: they forget how much time it took them to reach their level of mastery.
It's like spending ten years learning to speak Russian and then criticizing anyone who says that learning Russian is difficult.
Puzzling out scalaz code is difficult and requires an enormous investment in hours and practice, investment that a lot of people prefer to put into different learnings.
Mainstream developers forget just how much time they invest in learning the latest fad frameworks with new ad-hoc concepts and terminology. I guess "Hardcore FP developers" are fed up with this state of affairs and are looking towards mathematics to provide guidance and common patterns/names. At least any knowledge of mathematics will not become outdated!
Yea, puzzling out some scalaz code takes investment. On the other hand, the library is used for web apps, network servers, database based applications, streaming libraries etc.
It's incredibly multipurpose, more so than even Spring or Guava or LINQ, and these are things that developers regularly have to invest serious time in.
The argument is just that FP libraries (like Scalaz) have a bigger payoff in the investment.
At Verizon Labs we have 20+ microservices that I have touched/looked at. Some use Akka, some use Play, some use Jetty, some use Http4s but everyone makes use of Scalaz somehow.
> The argument is just that FP libraries (like Scalaz) have a bigger payoff in the investment.
It depends on the people, not everybody has the inclination to dive so deep into hard core FP and they will be more productive using a different approach.
Don't make the mistake of thinking you've found the only software silver bullet that exists and that people who don't use it "don't get it", which is another attitude I've seen a lot of hardcore FP advocates embrace.
Just like async/await, LINQ (and even enumerators) are tied to special syntax in the C# language. HKTs allow Haskell to provide very general reusable syntax, such as do notation.
What I meant by "polymorphic programs" as an alternate to DI, is something like this:
doStuff :: HasLogger m => Input -> m Output
The effectful function "doStuff" above is polymorphic with respect to which logging implementation is used, it could even be one that uses IO. All made possible with HKTs.
Ok, the point about special syntax is fair, but as I said, I'm happy with the use cases that have trickled down to mainstream, and I'd argue there aren't _that_ many truly useful ones. I realize this is very analogous to how the Go programmer is somehow happy with the few generic collections they are granted :)
Your DI example seems to be an example of my earlier point about "emulating things you could do with side-effects". No HKTs are needed when you just pass an impure side-effectful Logger object. Or, as discussed in another subthread, you could do side-effect management with Rust-style uniqueness typing, which results in a less elegant but arguably easier to use type system. It's debatable, but it seems people struggle less with the borrow checker than with advanced Haskell.
Looks like we reached maximum thread depth so replying here.
I agree, the side-effectful choices are either a global, some DI container, or just passing it down.
For loggers, I think global lookup from some (pluggable) logging library is justified because logging is probably the most ubiquitous cross-cutting concern ever. For pretty much everything else, I think passing as a parameter is actually the best option. It's explicit and simple, and you don't even need to explicitly pass it around _that_ much if you store it in a field of a class that plays the role of a module. Most uses of the dependency will be in non-static methods, lambdas, or inner classes.
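A small sketch of that arrangement in Python: the logger comes from a global lookup, while the other dependency is passed in once and kept in a field. `InMemoryStore` and `OrderService` are hypothetical names invented for the example.

```python
import logging

# Logging is the one dependency fetched by global lookup, via the stdlib's
# pluggable registry.
log = logging.getLogger("orders")


class InMemoryStore:
    """Hypothetical dependency; a real one might wrap a database."""

    def __init__(self):
        self._rows = {}

    def save(self, key, value):
        self._rows[key] = value

    def values(self):
        return list(self._rows.values())


class OrderService:
    """Plays the role of a module: the dependency lives in a field."""

    def __init__(self, store):
        self._store = store  # passed in once, then used by every method

    def place(self, order_id, amount):
        self._store.save(order_id, amount)
        log.info("placed order %s for %d", order_id, amount)

    def total(self):
        return sum(self._store.values())


def main():
    # All wiring happens here, explicitly; to see which store a call site
    # uses, climb the call chain up to this function.
    service = OrderService(InMemoryStore())
    service.place("a1", 30)
    service.place("b2", 12)
    return service.total()
```

Only the constructor takes the store, so the methods that use it do not need an extra parameter each.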
I dislike Reader because it's similar to a DI container (or a global) in that it's more work to figure out, for a given call site, what the last value written to it was. With parameters, you just climb the call chain.
I don't see how passing in a logger object explicitly is the same, this is what OO DI frameworks try to avoid, otherwise you'd also have to pass it down to other functions used inside. The example above works just like a Reader Monad, but we are not tied to any specific logging implementation.
I guess the side effecting version is just to use some global registry to look up the logging implementation to use. But such code does not compose.
Rust doesn't have any way to manage side-effects in types.
Don't mutable and immutable references with lifetimes count? Sure, one could argue whether the borrow checker is really part of the type system, but it's a compile-time check either way.
Yes, in the standard library, an immutable object can hide mutable state in e.g. a 'Mutex', and effects to the external system aren't wired through anything like monads or unique objects.
I see those as compromises Rust makes in the name of pragmatism and being a system'ey language. I don't necessarily like all of them, but I find the general uniqueness typing based approach interesting.
See articles comparing Clean and Haskell for an interesting historical perspective, including how both approaches could be used to model side-effects in a purely functional language. Haskell "won", possibly because it was seen as more generic and composable. I always felt Clean's approach had merit too, so I was really glad to see Rust bring the idea, or a closely related idea, to prominence.
Right, but you are addressing only part of the story to side-effects. IO is another story, which rust doesn't address.
The standard library doesn't, and most crates don't, but I'm pretty sure nothing prevents you from writing libraries in a style where all IO requires mutable access to some explicit unique "World" object, similar to Clean.
Passing a unique world object around is effectively the same as composing with the IO monad, and borrowing 'f(&mut world)' is basically equivalent to 'let world = f(world)'.
Maybe someone will one day write a standard library in that style.
It is not going to be convenient because Rust doesn't have higher kinds. I see people making this argument in other language contexts e.g. OCaml; but they have no typeclasses, which makes writing monadic style code extremely inconvenient.
> I think the OP is fundamentally right about the sweet spot being pretty far from either extreme, I just disagree slightly about where exactly :)
And I think an important take-away should be, that this perception is entirely subjective and colored by both of our experiences, preferences and the kinds of problems we work on :)
But dynamically typed languages give you generic collections and data structures for free. Why would you need static types at all?
They emphatically do not. Ignoring types doesn't give you a type system "for free"; much the same way that building a shelf doesn't make you a librarian.
Dynamically typed systems don't "ignore types", they just handle them at runtime.
I just don’t buy that go is some sort of sweet spot because it doesn’t have generics. Generics pretty much exist for maps and slices, because they are needed in real programs. The language designers just don’t let you make your own generic collections.
Yeah far from finding a sweet spot, Go exists in some kind of type system ghetto, because its type system is so crippled users have to resort to code generation (go generate).
Neither Python nor Java programmers have to do that.
Yeah Go is weird in that its static type system doesn't provide you with great static typing power but instead it's just there as a sort-of sanity checker. If there's logic, they say write it with data structures and functions. Have invariants? Enforce them yourself.
If Go is annoying with how little power it provides, that's fair, but other type systems can be just as annoying then, because when given the ability to, type astronauts will blast off into space, purely as a matter of honor or instinct.
Besides, code generation isn't all that bad. Java programmers will eventually find some kind of code generation in their build setup (serialization/schema tools).
There's nothing wrong if users independently choose to use code generation. However when a programming language starts to rely on it, it becomes a major problem.
We've been here before with the C preprocessor. There's nothing wrong with having a preprocessor, but in C it is necessary to use the preprocessor and that causes a lot of problems, like making it especially difficult to write tools.
Yeah, I've noticed that Go APIs are very stringly typed. The APIs are not very self documenting, and it is hard to figure out whether something is nullable or not. Libraries often require you to initialise data in a partially invalid state and the whole thing feels quite error-prone and flaky.
> Besides, code generation isn't all that bad.
It is the number one thing that makes C++ templates unusable: semantics defined by means of code generation.
The fact that maps/slices/channels already exist generically is what puts Go into the sweet spot. You have generic containers for the vast majority of use-cases, so the value-added consideration of being able to cover more use-cases with generic containers becomes a lot smaller.
In a hypothetical world where the designers never added the specific containers they did, you'd get a whole lot more value out of generics for containers. But it turns out, the designers used what seems on the surface like a kludge to get most of the benefits, while saving most of the cost. It's a perfect embodiment of the kinds of tradeoffs I'm talking about.
You have containers for all the use cases the designers thought of, but then you have it worse than Python for all the other use cases, and you are stuck doing code generation or type erasure. It is impractical to expect go’s designers to have foreseen the best trade off for every codebase.
Architecture astronauting can be prevented with best practices and code review, not with language limitations. It’s a fool’s errand to try, code generation allows you to get all the complexity and more of generics.
> It is impractical to expect go’s designers to have foreseen the best trade off for every codebase.
Which is not the argument made by anyone. Indeed, I explicitly acknowledge that there is a certain fraction of use-cases not covered by the builtins. So there isn't really any disagreement about this.
The question is how large this fraction is, how much it would benefit and how inconvenient/costly the existing workarounds are. Like all engineering questions, these are impossible to talk about when dealing in absolutes. And once you actually talk about these questions quantitatively, I achieved the goal I had with the post - to change the debate into a quantitative one explicitly acknowledging the tradeoffs involved.
> Architecture astronauting can be prevented with best practices and code review, not with language limitations.
I work at a company which has probably one of the highest standards in regards to code review in the industry. As such, I disagree with you that it is effective in addressing this.
> It’s a fool’s errand to try, code generation allows you to get all the complexity and more of generics.
If that's the case, where do the complaints come from about the lack of generics? It seems that Go really has generics then, in your opinion?
Of course, that's a strawman and a misrepresentation of your argument. But what makes this a strawman, the difference between the existing workarounds and actual generics, is just as effective an argument for your side as it is one for my side. Because codegen is made so inconvenient, people bias heavily towards using the builtins, away from custom data structures, if they can at all get away with it. Thus greatly reducing the overall complexity of the codebase.
So it would seem to me, that this argument is logically flawed. Either codegen is a poor replacement, thus leading to people using less generic code, thus there is an effective reduction in complexity. Or codegen has the same effect on complexity, which would mean it is used just as much, meaning it can't be that bad a workaround.
Code generation achieves the same effect with more work and without a standard abstraction in the language it takes more effort to understand. Using general purpose primitives that are blessed to be generic similarly takes more work to understand when it exposes too many underlying implementation details. A nice wrapper class would work much better.
I don’t agree that using piles of built in objects makes the code easier to understand. If I want a Tree<Node, Node, Value>, how is using lists of lists and integer pairs making my code easier to reason about? Or using code generation to make reams of classes that create a Tree for everything I want, and anyone who uses my functions? How is encouraging either of those things a positive?
In this thread: people will bring out the same tired arguments for or against static typing, without commenting on the actual content of the post, which was quite good!
I have come to see that type systems, like many pieces of computer science, can either be viewed as a math/research problem (in which generally more types = better) or as an engineering challenge, in which you're more concerned with understanding and balancing tradeoffs (bugs / velocity / ease of use / etc., as described in the post). These two mindsets are at odds and generally talk past each other because they don't fundamentally agree on which values are more important (like the great startups vs NASA example at the end).
I think this post was extremely hand wavy. It stated the same divide that is already known, but doesn’t actually make any arguments as to why Go or whatever lies on some part of the curve, because it assumes that the way you program at different points on the curve is roughly the same but with more type boilerplate. Higher kinded types offer entirely new ways to program, and stuff like optional typing in Python makes it all much more complex than just “how long do I spend writing and reading type declarations”. I was left with an impression that the author was content with go, and that’s pretty much it.
I agree. The graph of static checking vs. lines of code should really be factored into static checking vs. amount of annotations to achieve that level, amount of annotations to write vs. how much that slows you down, and amount of annotations that are already written (in your own code or libraries you use) vs. how much that speeds you up. And those will vary wildly depending both on the language and the programmer.
It has been interesting to see the to-ing and fro-ing of arguments for and against static typing in the discussions here.
Though I am not a type theorist (I only dabble in compilers and language design), I have noted that many people conflate static typing and dynamic typing with other additional ideas.
Static typing has certain benefits but also has certain disadvantages, dynamic typing has certain benefits but also has certain disadvantages.
What I find interesting is that few people fall into the soft typing arena, using static typing where applicable and advantageous and using dynamic typing where applicable and advantageous.
Static typing has a tendency in many languages to explode the amount of code required to get anything done, dynamic typing has a tendency to produce somewhat brittle code that will only be discovered at runtime. The implementation of static typing in many languages requires extensive type annotation which can be problematic.
But what is forgotten by most is that static typing is a dynamic runtime typing situation for the compiler even when the compiler is written in a static typed language.
Instead of falling into either camp, we need to develop languages that give us the best of both worlds. Many of the features people here have raised as being a part of the static typing framework have been rightly pointed out as being part of the language editors being used and are not specifically part of the static typing regime.
Many years ago a similar discussion was held on Lambda-the-Ultimate, and the sensible heads came to the conclusion that soft typing was the best goal to head for. Yet, in the intervening years, when watching language design aficionados at work, they head towards full static typing or full dynamic typing and rarely head in the direction of soft typing (taking advantage of both worlds).
So, the upshot: this discussion will continue to repeat itself for the foreseeable future and there will continue to NOT be a meeting of minds over the subject.
Maybe part of the problem is I can't picture what you're actually talking about with soft typing. I can tell you C#/.NET has the DLR which allows you to do dynamic types whenever you want. Outside of a few gimmicks, you rarely see these used. I've rarely even seen them for quick prototyping, because generally you mess around with using them for prototyping, then the first time they go bad, it's really obnoxious, and you realize you're compiling the code and writing function signatures anyway, might as well save the time later and do it right the first time.
Then there's the whole tooling aspect of trying to mix type systems. It's different lifestyles. Dynamic programmers aren't going to start compiling their code to run it, static programmers aren't going to switch to a language with weaker tooling around the IDE-ish features, which are mostly built on the type system.
My conclusion is this: New languages should all be statically typed, because we shouldn't need new languages at all. We should be fine. The reason we need new languages at all, is because the trifecta of C++/Java/C# basically encompassed the entire statically typed world, but they're all infected with this fully overblown OOP obsession, and the null pointer bug--which newer languages have fixed, through more static typing. Basically we need to replace those languages with similar ones and then just stop making languages for a few decades, until whatever we're doing now looks as dumb as OOP and null pointers. In the long run, Go/Swift/Kotlin/Rust will take over the statically typed world and it's going to be great.
Soft typing could be characterised by having the compiler do static type analysis where it can, but leave the type analysis to the runtime when it can't.
A simple example of this is a list. Now in statically typed languages, lists are homogeneous (this includes type unions). In dynamically typed languages, lists can be heterogeneous, essentially anything can be added at runtime.
In soft typing, we can indicate that a list is homogeneous and the compiler will ensure that this is true or we can specify no type checking (as such) and this will be done at runtime.
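Python's optional type hints give a rough approximation of this, sketched below. It is only an approximation: the hints are checked by an external tool such as mypy (assumed to be run separately), not by the language itself, and nothing enforces them at runtime. The function names are made up for illustration.

```python
from typing import Any, List


def total(xs: List[int]) -> int:
    # Declared homogeneous: a static checker can verify every caller
    # passes a list of ints before the program runs.
    return sum(xs)


def describe(xs: List[Any]) -> List[str]:
    # Declared heterogeneous: element types are only inspected at runtime.
    out = []
    for x in xs:
        if isinstance(x, (int, float)):
            out.append(f"number {x}")
        elif isinstance(x, str):
            out.append(f"text {x!r}")
        else:
            out.append(f"other {type(x).__name__}")
    return out
```

With a checker in place, a call like `total([1, "a"])` would be reported before running, while `describe([1, "a", None])` is accepted and sorted out element by element at runtime.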
Contrived, yes, but I regularly use other aggregates (tables and sets) that I do not want to be homogeneous.
One of the aspects that I like about functional languages is the polymorphism available, but in all that I have come across, there is no way to make a tree or list heterogeneous without declaring union types beforehand.
My problem with C#, C++, Java, and their ilk, is that code is multiplied with their generics.
How the IDE and compiler and type systems interact is a design function and is not inherent to any type system.
One of the reasons I don't use specific mainstream languages such as C#, C++ or Java is that they don't provide the specific programming features that I desire.
I have looked at Go, Swift and Rust and I am not at all impressed by the "relative stupidities" within those languages. For other programmers, what they consider to be "relative stupidities" is entirely up to their experience and outlook.
Our industry has not yet even scratched the surface of what types can offer: Types for enforcing architectures and controlling effects, types for checking correct use/free of scarce resources, types for verifying protocol implementations etc etc. Currently, half the industry is using schema-less json and dynamic languages; so really it is far too early to generally talk about any diminishing returns.
There's a lot of great things our industry doesn't use: contracts, proper fuzz testing, cleanroom, formal specification, constraint solvers, _checklists_. We might (not necessarily, but _might_) be in a place where types are diminishing returns with respect to other low-hanging fruit.
Yes it's true that retrofitting better type systems into existing languages may not be low-hanging fruit. But developers have shown a willingness to adopt new languages when they see clear benefits.
> Yes it's true that retrofitting better type systems into existing languages may not be low-hanging fruit.
Disagree here, actually! Javascript (Typescript) and Python (mypy) are both seeing pretty big benefits from adding gradual typing.
Glad to hear it!
When you speak of contracts, are you referring to run-time contracts i.e. Racket?
It's so funny how people argue for types everywhere, then use nosql databases and lack type checking on data validation.
I used to agree, but after seeing how easily versioning of schemas, procedures etc in conventional databases can turn into a clusterfuck I have changed my mind. I have begun to like the idea of putting all the schema info into compiled applications that can't easily be changed on the server. MySQL et al is the worst of all worlds.
In fact, the decades-old CSP model, upon which Go and Clojure's core.async are based, outlined compile-time assurance that there are no race conditions in your multi-threading. You are correct that these two modern implementations of CSP do not go there.
For data, schemas ala clojure.spec are a competing idea - it makes the "type system" much easier to metaprogram and apply selectively.
I'd argue that any schema is a type system of sorts.
Well said. This article is essentially dismissing a technique that is barely used, with the argument that the technique is not the entire solution. Of course it isn't, but that doesn't change the benefits that it can bring.
The industry has other issues: the constant cruft, tech debt and turnover. Lots of people make money through this, and they won't accept making better software if it makes them appear smaller and too cheap.
OP draws a false one-dimensional relationship between types vs tests in terms of code quality. Writing expressive types instead of tests does much more than affect a quality curve - it changes the way you approach the problem you are trying to solve. The classic Haskell example is understanding how IO being a monad allows you to push impurity to the edge of your system.
Start-ups decide not to write MVPs in languages like Haskell or Idris not because those languages aren't "rapid" enough, but because it's too difficult to find programmers experienced in those languages on the labor market. It's already difficult enough to find competent programmers - no founder wants to make their hiring woes even more difficult.
Sorry to contradict you, but we wrote an mvp in rails even though we have 3.5 experienced Haskell programmers on staff. We did this because we knew we could build some web stack apps with all the trimmings much faster in ror. So there is at least one counter example.
I don't think it's really a contradiction. In a startup you still have to choose the quickest path that you think will lead to success. It just depends on what your definition of success is. RoR can be a safe choice even for Haskell devs if they just want to build an off-the-shelf webapp with all the trimmings. But if your definition of success is that you want to create a formally-verified smart contract platform and cryptocurrency, you're going to use something like Haskell or OCaml: https://github.com/tezos/tezos
There's a point beyond which you spend more time proving things about your code than writing it, all the way up to the point where your ability to prove things about your code in your chosen type system starts to affect the kinds of solutions you can construct, and a different kind of complexity creeps in; representational complexity rather than implementation complexity. This can be a source of error, not just inefficiency.
Firstly, thank you for wanting to take an open-minded look into the issue, rather than simply defend a position that you have already committed to.
You write "Why then is it, that we don't all code in Idris, Agda or a similarly strict language?... The answer, of course, is that static typing has a cost and that there is no free lunch."
I take it that you wrote "of course" here through assuming that there must be some objective reason for the choice, and that it depends solely on strictness, but languages don't differ only in their strictness, so choices may be made objectively on the basis of their other differences, and we also know that choices are sometimes made on subjective or extrinsic grounds, such as familiarity. I don't know what proportion of professional programmers are familiar enough with Idris or Agda to be able to judge the value proposition of their strictness, but I would guess that it is rather small.
Now, to look at the sentences I elided in the above quote: "Sure, the graph above is suggestively drawn to taper off, but it's still monotonically increasing. You'd think that this implies more is better." As the graph is speculative, it cannot really be presented as evidence for the proposition you are making. I could just as well speculate that static program checking does not do much for program reliability until you are checking almost every aspect of program behavior, and that simple syntactical type checking is of limited value. That would be consistent with the fact that there is little empirical evidence for the benefit of this sort of checking, and explain why most people aren't motivated to take a close look at Idris or Agda. In this equally-speculative view of things, current language choices don't necessarily represent a global optimization, but might be due to a valley of much more work for little benefit between the status quo and the world of extensive-but-expensive static checking.
I think talking about a sweet spot is correct
I've been thinking about the trajectory of C++ language development recently and the emphasis has definitely been on making generics more and more powerful. You watch CppCon talks and see all this super expressive template spaghetti and see that while it's definitely a better way to write code - the syntax is just horrifying and hard to "get over"
Just like when "auto" took off and people started thinking about having "const by default" - I'm starting to think that generic by default is the way to go. The composability of generic code is incredibly powerful and needs to be more accessible
However the other end of the spectrum: dynamic code leaves a lot of performance on the table and leads to runtime errors
When I went from working at Apple to a language implementation group at another company, my views on Objective-C's duck typing + warnings for classes being useful and good were pretty heretical. It's nice to see other people agree with me.
Especially when it comes to GUI programming, I really don't care if a BlueButton.Click() got called instead of RedButton.Click().
These graphs really mean nothing. There is no data behind them. I might as well make a graph that conveys a non-descript correlation between how much an article bashes static typing & assertion and how high it is on HN.
They're just sketches. That's part of the point, and the article says that directly. The point isn't the exact shape or slope of the curves, but just their asymptotic behavior and the relationship of "correct features/day" to the other two. I.e. As long as the two curves have that general shape, then the "sweet spot" exists somewhere between 0-100%, the exact location of which depends on language, developer experience, and business priorities. The exact numbers are irrelevant to the article's point.
But even the asymptotes are an assumption derived from pure thought experiment.
More realistically, it's an educated guess based off the author's personal experience as well as their understanding of the experiences of other developers operating under different constraints.
The author makes it clear that the analysis is not perfectly rigorous. There is a very wide landscape between perfectly rigorous and completely useless.
Do you think the article fails to hint at any of the fundamental dynamics of how type systems affect software development? How so?
I'm not who you're replying to, but for me the charts didn't make sense either.
For one example, I don't think it's a given that the green line (velocity vs % type-checked) should have a negative slope. Maybe in some cases, for some projects or some people, but certainly not universally. At least part of it would have been positive on almost all projects that I've worked on, and I'm not doing rocket science.
Then, the combined chart just looks at the amount of bug-free output, completely ignoring the amount of bug-ridden output. That latter part doesn't just get discarded, it needs fixing, and bugs that were only discovered in production are expensive to discover, debug and fix.
This is in addition to pretty much every other top level comment in this thread, a lot of which bring up important points that are unaccounted for even conceptually in the charts.
Quote -- "The answer of course is simple (and I'm sure many of you have already typed it up in an angry response). The curves I drew above are completely made up."
Think again--since when do graphs depict only cold, dry data? Graphs have always been useful for depicting relationships--real or proposed. Line graphs in particular are often found depicting proposed relationships rather than real data (though often inferred from data,) since for all cases where the real data is discrete, this would result in a scatter plot rather than a line.
You beat me to the same comment. It's pseudoscience. I guess they're measuring the anxiety at HN that people's sunk-costs in stringly-typed runtimes won't keep guaranteeing obscene salaries.
I had the same experience, but I also have to say that the static type systems of some FP-languages feel really light-weight.
So yeah, static typing doesn't buy you much, but in some languages it's at least cheap.
> So yeah, static typing doesn't buy you much, but in some languages it's at least cheap.
I think this is key. The benefit of static types isn't that they provide safety, it's that they provide _low-cost_ safety. For a large class of problems, types are cheaper than tests are. For other classes, tests are cheaper than types. The main downside of nonstatic languages is that you have to use tests for everything, even that class where types are a better choice.
One of my favorite parts of Powershell is optional typing. Variables are a generic "Object" type by default, which can hold anything from a string to array to "Amazon.AWS.Model.EC2.Tag" or other custom types.
Or, type can be specified when setting the variable:
[String]$myString = "Hello World!"
This would generate a type error:
[Int]$myString = "Hello World!"
Often, typed and untyped variables will sit together:
[Int]$EmployeeID,[String]$FullName,$Address = $Input -split ","
Indeed! I think one of my favourites has to be:
[xml]$someXmlDocument = Get-Content "path\to\file.xml"
And you get a deserialized version of the XML text.
Also the fact that you can use types when declaring function arguments, removing the need to manually test if an object of the desired type was passed.
Powershell definitely strikes a good balance on type safety for a scripting language.
So when you call a function which takes a String as argument, do you need to cast the value manually?
If no, then what is the use of the typesystem?
If yes, isn't that cumbersome, since I suppose most library functions have typed arguments?
I'm converting a codebase of Javascript of about 200+ js files to Typescript today. I am about 5% complete... already found two places where the argument list was wrong and was being sent into a void. I also see the code that was making up for the fact that the third argument was being ignored (basically patching downstream because they thought the feature was broken).
Now this codebase was written with a high degree of quality (it's pretty good but not perfect), but the lack of compile-time (and of course runtime) checks has caused waste.
The second phase of my project is to convert all promises to RX Observables :)
Promises (representing the result of a single asynchronous operation) and Observables (representing an ongoing stream of emitted values) aren't really equivalent. I know you can create an Observable from a Promise, which will emit a single value when that promise is fulfilled and then be marked as completed or closed - but this is more for integration - such as being able to combine inputs from single-shot async calls into broader observable operations. If your Promises are discrete async calls, I'm curious why you would want to convert them?
If you're just rewriting these Promises because the syntax is too verbose, you might be interested in checking out async/await as another alternative; I just rewrote some Promises to that recently, and it's really, really nice. Of course, if you prefer RX Observables, go right ahead :)
Thanks for the note, I am looking into it right now. One area that may grind my head with async await however is that there is a lot of Promise.all work in this codebase. Would you still use async/await constructs when you need to do a lot of fork/join/merge stuff? (sorry for the derail HN)
If you're using Promise.all to run code in parallel, then async await can't really replace that, as far as I know. But you can still use `await Promise.all(...)`, which will free you from having callbacks everywhere; running parallel code will no longer have to look so different from running it sequentially, which is quite nice.
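The same fork/join shape exists outside JavaScript. As a rough Python analogue (not the TypeScript code under discussion), `asyncio.gather` plays the role of `Promise.all` and is awaited the same way. The `fetch` coroutine and its delays are invented for illustration.

```python
import asyncio


async def fetch(name, delay):
    """Stand-in for a single async call (hypothetical; it just sleeps)."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    # Both calls run concurrently; gather returns results in argument order,
    # regardless of which call finishes first.
    results = await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))
    return results
```

Running `asyncio.run(main())` drives both calls to completion; the calling code reads sequentially even though the work overlaps.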
Yes, of course !
>I'm converting a codebase of Javascript of about 200+ js files to Typescript today.
Pity you! I fear such tasks.
As mentioned, take advantage of async/await. Also, make sure you wrap everything in modules and access from outside through module exports.
The benefit of static typing isn't just reliability. Tooling is another major argument. Won't appeal to certain hardcore programmers who think that even notepad has too many features. But it is great for refactoring, finding all references to a function or a property or navigating through the code at design time. Basically all the features visual studio excels at for .net languages.
And I disagree with the barrier to entry argument. Static typing, by enabling rich tooling, helps a beginner (like it helped me) a lot more by giving live feedback on your code, telling you immediately where you have a problem and why, telling you through a drop down what other options are available from there, etc. Basically makes the language way more self-discoverable than having to RTFM to figure out what you can do on a class.
I think dynamic typing proponents get hung up on the auto-complete aspect. The real benefit is when you find someone writing a property with a common-ish name to a data structure and you want to know "who the hell uses this", you can answer that question pretty easy in statically typed languages. In dynamically typed languages you kind of just grep and hope the name is not too common.
> In dynamically typed languages you kind of just grep and hope the name is not too common.
For four days, I spent debugging a python production script because in one place I had typo'd ".recived=true" on an object and just couldn't understand why my state machine wouldn't work.
And very quickly, the whole team became fans of __slots__ in Python.
I still write 90% of my useful code in python, but that one week of debugging was exhausting & basically wouldn't have even compiled in a statically declared language. Even in python, the error is at runtime, after I got the __slots__ in place.
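A minimal sketch of the failure mode and the `__slots__` fix, using made-up class names (`Message`, `SlottedMessage`) rather than the original production code:

```python
class Message:
    """Plain class: assigning to a misspelled attribute silently creates it."""

    def __init__(self):
        self.received = False


class SlottedMessage:
    """With __slots__, only the listed attribute names can be assigned."""

    __slots__ = ("received",)

    def __init__(self):
        self.received = False


def set_typo(obj):
    """Attempt the misspelled assignment and report what Python did with it."""
    try:
        obj.recived = True  # deliberate typo for "received"
    except AttributeError:
        return "rejected"
    return "accepted"
```

The plain class accepts the typo and leaves `received` untouched; the slotted class raises `AttributeError`, but only when that line actually runs.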
Is that really a typing problem, though? If I tried to use a "recived" slot on a Lisp struct that only had a "received" slot defined, I'd get a read-time error, and there's zero typing involved there.
> For four days, I spent debugging a python production script because in one place I had typo'd ".recived=true" on an object and just couldn't understand why my state machine wouldn't work
Is that really a dynamic typing problem or a language that allows you to create instance members anywhere? It seems like that is a flaw in the declaration model of the language and not a static / dynamic issue.
TXR Lisp, a dialect I created:
$ txr
This is the TXR Lisp interactive listener of TXR 185.
Quit with :quit or Ctrl-D on empty line. Ctrl-X ? for cheatsheet.
1> (set a.b 3)
** warning: (expr-1:1) qref: symbol b isn't the name of a struct slot
** warning: (expr-1:1) unbound variable a
** (expr-1:1) unbound variable a
** during evaluation of form (slotset a 'b 3)
** ... an expansion of (set a.b 3)
** which is located at expr-1:1
Both warnings are static. If we put that into a function body and put that function into a file, and then load the file, we get the warnings.
The diagnostics after the warnings are then from evaluation.
Those are nothing; TXR Lisp will get better diagnostics over time. I'm just starting the background work for a compiler.
There is dynamic and then there is crap dynamic.
Don't confuse the two.
There is crap static too. Shall we use C as the strawman examples of static? Hey look, two argument function called with three arguments; and there's a buffer overrun ...
It seems to me that you've just invented a static type checker. (Combined with a run-time type checker.) Am I mistaken?
I mean, we can argue the semantics of what, exactly "static type checker" means, but...
Static checking doesn't make a "static language".
A "static language" occurs when we have a model of program execution that involves erasing all of the type info before run-time. Or most of it. (Some static languages support OOP, and so stuff some minimal type info into objects for dispatch.)
Note how above, my expression executes anyway; the checks produce only warnings. The warning for the lack of a binding for the a variable is confirmed upon execution; the non-existence of the slot isn't since evaluation doesn't get that far.
If we retain the type info, we have dynamic typing. There is no limit to how much checking we can do in a dynamic setting. The checking can be incomplete, and it can be placed in an advisory role: we are informed about interesting facts that we can act on if we want, yet we can run the program anyway as-is.
This really sounds like "semantics" to me (not PL semantics! :).
For example, these days it's quite possible to ask GHC to defer type errors to runtime. Does that mean that the GHC dialect of Haskell is dynamically typed? This is basically a command line switch away, btw.
Retention of type information does not "dynamic typing" make. As a trivial example, consider C++ RTTI.
You really have just reinvented static (type) checking and a good runtime. There's no shame in that, but let's not pretend that these are opposing forces.
C++ RTTI is only for class objects, and only useful when they are manipulated by pointer or reference. I believe I covered that sort of thing with my statement, "Some static languages support OOP, and so stuff some minimal type info into objects for dispatch.". It's a gadget which provides an alternative to the structure of doing everything via virtual functions on a base class reference.
Ok, fair enough, but what about e.g. "-fdefer-type-errors" for GHC?
I still think you're just arguing semantics.
EDIT: Incidentally, the statically typed crowd can even go the "other way", namely from runtime -> compile time. For example, it's quite possible to derive a static proof/type from a runtime value in e.g. Idris by pattern matching as long as you're meticulous about building up the proof.
Not sure it's really an invention, it's been around for a while. Check any decent Common Lisp implementation.
I'm pretty sure he doesn't mean invented in a literal sense. The phrasing implies a meaning of reinvented.
Correct, I was being a little bit facetious. All in good fun, obviously :).
>There is dynamic and then there is crap dynamic. (...) There is crap static too.
Excellent. My point exactly. I have no fear of using a static or dynamic language, as long as it is a good implementation of a statically (or dynamically) typed language.
What would you consider good implementations of either?
Good "statics": Haskell, ML family, maybe Rust.
Good "dynamics": All the Lisp-family languages. Smalltalk. Julia. Lua. Tcl.
Plus you can eliminate all typos in any language with a simple statistical linter that you can implement in like an hour or two.
Could you clarify - did you go over all classes in your codebase and add a __slots__ attribute? Or do you have some tooling to do that for you? I don't have any clear counterargument but that sounds like a wildly unpythonic thing to do.
> I still write 90% of my useful code in python, but that one week of debugging was exhausting & basically wouldn't have even compiled in a statically declared language.[...]
nothing stops you from using static typing with python3
I haven't used optional typing in python. However, the problem I see with optional typing is that while I may choose to use it my team mates might not. Even more importantly, the 3rd party dependencies I use probably don't. So the amount of code that actually uses the types would be very limited. Saying the choice is entirely up to you personally is misleading.
Strict use of type annotations has to be a team decision, yes. Checking for them can be integrated into a CI build, though.
The typing spec provides for supplementary "stub files" that can be provided for third-party dependencies without their own annotations. The typeshed project provides these for a fair and quickly increasing number of common dependencies, and is plugged into pycharm and mypy by default: https://github.com/python/typeshed
I think GP meant 'you' as in 'your team'. Anyway, Python has a type repository for popular libraries, just like TypeScript does. And mypy can still typecheck your code even without type annotations.
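For readers who haven't tried it, here is a small sketch of what annotated Python looks like. The `Message` class and `mark_received` function are hypothetical. The commented-out line is the kind of typo a static checker such as mypy is expected to flag before the code runs, since `Message` declares no such attribute.

```python
from dataclasses import dataclass


@dataclass
class Message:
    # The annotation declares the only attribute a checker should accept.
    received: bool = False


def mark_received(msg: Message) -> Message:
    """Flip the flag; the annotations let a checker verify callers and attributes."""
    msg.received = True
    # msg.recived = True  # a checker such as mypy should reject this line
    return msg
```

At runtime the annotations are not enforced, so the protection only exists if the checker actually runs, for example in a CI build.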
In fairness the hints available in Python 3 aren't really a static typing system. It would take a lot of extra work to make it one.
Well, nothing except almost the whole ecosystem of libraries not having type annotations, those type annotations that exist themselves being quite limited (IIRC), etc. etc.?
Can we stop pretending that adding type annotations to a 'dynamic' language solves the "good static typing" problem? It's just silly.
(About as silly as pretending that static types solve all problems.)
This seems like something a linter can catch though. ESLint certainly seems to give me warnings for use of undefined properties.
That's a kind of error caught by the popular static analysis tools. Better add that pylint and mypy to your travis.yml as soon as possible.
For me, what I miss when using dynamic languages is good, solid refactoring tools that can inline functions/methods, rename across source files, etc. In my experience, in both Python and Clojure, the refactoring tools simply don't come close to those of languages like C++, C# and Java. And I say this as a major Clojure fanboy...
Namespaced keywords can really help with refactoring your domain model.
Usually I use normal keywords for throwaway or glue code, but anything important (my actual domain entities) will be namespaced, allowing (relatively) pain free refactoring.
Cursive has a few good refactoring tools/shortcuts, but I would also like to see extract/inline/move for functions.
I'm not saying that I don't have facilities or tools (clj-refactor) to do refactoring, just that they are, in my opinion and experience, not as good or full featured as in the mainstream static languages (and this experience seems to translate between other dynamic languages, eg, Python: I have refactoring tools, but they're not as good as what I can do in, say, Java).
It's not a huge problem, just something I miss from the static languages.
This.
Not just, "who the hell uses this", but "where the hell is this defined" as well.
>Not just, "who the hell uses this", but "where the hell is this defined" as well.
In Common Lisp, a dynamic language, I can also get this answered instantly. I just press a key combination on a method call and I jump to the definition.
So this isn't exclusive to statically typed languages.
But if the compiler doesn't enforce that types are statically determinable, there will be cases where the tool will have to show you more potential definitions for function definitions than would be shown for a statically typed language.
Maybe good tools are able to perform some static analysis and rule out some of the methods with the same name but impossible types, but the language doesn't rule out situations where the best the tool will be able to do is show you all of the function definitions with the same name as the (dynamically dispatched) function call site you're looking at.
Let's say you have seven different type hierarchies having dynamically dispatched functions named "run", with 5 definitions in each hierarchy, for a total of 35 functions named "run". In a statically typed language, if the code compiles, it's possible to narrow down the type for a given call site to either one of the hierarchies or one definition, meaning you have to look at either 1 or 5 definitions. In a dynamically typed language, there are situations where legal code results in the tool having to throw up its hands and show you all 35 definitions.
The flip side is that if you really have a spot where you'll need to dispatch to any of the different hierarchies, then in a strong statically typed language, you'll need to either create an algebraic sum type covering all 7 hierarchies, or you'll need something like a typecase / typeswitch statement to enumerate out your possibilities.
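A scaled-down sketch of this argument in Python, with two invented hierarchies instead of seven. All class and function names here are hypothetical. The annotation on `start` is what lets a tool narrow `run` to one hierarchy; `start_anything` gives it nothing to go on; `start_either` shows the explicit union plus type switch that the flip side requires.

```python
from typing import Union


class Job:
    def run(self) -> str:
        return "job"


class BatchJob(Job):
    def run(self) -> str:
        return "batch job"


class Animal:
    def run(self) -> str:
        return "animal"


class Dog(Animal):
    ...


def start(job: Job) -> str:
    # Annotated: a tool only has to consider Job.run and its overrides.
    return job.run()


def start_anything(thing):
    # Unannotated: any class with a run method is a candidate definition.
    return thing.run()


# Dispatching across hierarchies needs the union spelled out explicitly.
Runnable = Union[Job, Animal]


def start_either(thing: Runnable) -> str:
    if isinstance(thing, Job):
        return "job hierarchy: " + thing.run()
    return "animal hierarchy: " + thing.run()
```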
> there will be cases where the tool will have to show you more potential definitions for function definitions than would be shown for a statically typed language.
And even in a statically-typed language there will be cases where the tooling can only determine fairly generic things statically.
I don't see anyone advocating for abandoning static typing over that occasional limitation. Yet I do see people proposing similarly-infrequent issues as cause to abandon dynamic typing.
When $problem happens to $language_I_prefer, it's an unavoidable difficulty that is easy to work around.
When $problem happens in $language_I_dislike, it's a clear sign that the language itself is inherently broken.
I had to do a decent amount of setup and wrote a quick script to run the tools which generate ctags to make this happen in CL using slimv. With top tier ides for static languages, this is built in from day 1 with no effort. You can't not have it. That said, a top tier repl for development is missing in languages like C#. I'd love to see that gap bridged.
I think it's telling that while there are Lisp aficionados in this thread telling us how great it is, none of these ideas have been implemented in Python or JavaScript or Ruby, the most common dynamically typed languages.
And that's really all I care about, I'm not about to start writing production code in Lisp.
You can even do this with Javascript, too. And without doc types. At least in vscode, anyways.
Well, usually. It gets a little confused when you start using dependency injection containers, or dynamic requires, or anything like that.
Let's be real, when we say dynamic languages, we aren't talking about niche languages like Lisp. We are talking about JS and Python almost exclusively.
>Let's be real, when we say dynamic languages, we aren't talking about niche languages like Lisp.
How can it be a niche language, if it's an ANSI standard, has more than 8 or 10 fully featured, standard-conformant implementations, runs on most CPU types, and has been proven to work in production systems for spaceship guiding, worldwide airfare reservation and credit card transaction verification?
You use Js and Python because you choose to use it, but it's not the only choice. Not only Common Lisp, you could also be working in Clojure with many benefits.
Python has jump to definition too.
Does it work in the presence and (ab)use of dynamic features? (e.g., when you have lots of decorated functions) Or is it just a best effort thing? (i.e., only works when your code would actually be expressible in a static way)
This is a continuum... in many statically typed languages there are regularly used paths that lead outside the realm of the language-defined type system as well. Consider dynamic class loading in Java, dlopen / reinterpret_cast in C++ etc.
> This is a continuum.
It is not. A language feature is either amenable to static analysis (not necessarily type checking) or it is not.
> Consider dynamic class loading in Java, dlopen / reinterpret_cast in C++ etc.
This actually emphasizes my point. Features that are not amenable to static analysis are problematic for tooling.
You mean there’s an IDE with that feature? It’s never as good as with any statically typed language.
Sublime Text has this feature on hover, but I am fairly certain it is based on text search of the modules in the python path.
Have you tried it? It's usually pretty good for Python.
Which IDE?
PyCharm and VSCode are both pretty good.
There have been references to Clojure, Elixir, Smalltalk, Common Lisp in the comments here, what makes you think just Javascript & Python?
It's worth noting that Elixir (by descent from Erlang) can emulate at least a basic static typing system pretty easily through pattern matching. You miss out on some features common in more traditionally-object-oriented languages (namely: subtypes), but tagged tuples and structs do provide a lot of the same safety benefits at runtime (and tools like Dialyzer can - last I checked - use such pattern matching as a basis for static verification).
Elixir and Erlang still lack the ability to typecheck any process behaviors because messages can take any type anywhere.
You can still pattern match when receiving, though (in fact, 'receive' in both Erlang and Elixir does pattern matching already in the same vein as 'case'); just match messages based on a tagged tuple or a record (Erlang) / struct (Elixir) signature or however else you want to define your "types".
Why would you say that? There are far more dynamically typed languages in widespread use than those 2!
IntelliJ/WebStorm is also pretty good at this (with JavaScript at least) by indexing a whole project and its dependencies. Refactoring doesn't work that well, though.
PyCharm gets it right 99% of the time. That 1% is hardly worth switching to a different language.
For someone paranoid about correctness, that's 1% of lingering doubt about every single operation in the IDE.
The 99% stat is quite likely a big exaggeration.
Refactoring is far, far harder to do reliably, and more of a strain, as it puts so much more responsibility on the programmer.
IME, add that 1% to another "1% isn't worth switching" argument with some other Python deficiency and then to another and so on, and reasonable case for choosing other languages can be made. I'm not saying Python should be abandoned for all projects, but we should be careful with "but if we make everyone use this one tool, and this one tool does it right almost all of the time, so we're good" arguments.
I'm reading Type-Driven Development with Idris right now exactly because I don't think Python is always the correct choice. I just don't find slight improvements in IDE features to be relevant.
What about the time accounted for in each of those buckets? If "that 1%" accounts for more than 1% of debugging, error chasing, and related time it starts to look more important, yes?
The module system makes it very easy to find definitions, because one namespace = one source file.
And the other way around: Statically typed languages without module systems, such as C, exhibit this problem too.
I work with an in-house framework written in Javascript.
You would think by looking at the code that the creators had a 30 word vocabulary, because 80% of the code uses the same six nouns and four verbs to pass data around and what you use those for depends on the context of who is calling it.
Oh, but the entire thing is written using promises, so most of your function calls have no context. It's hell, and I'm starting to worry that Node has dug itself a reputation hole it will never get out of.
I can feel your pain, but note that this isn't a fault of dynamic typing per se. This is bad coding (for example, no good use of modules) plus the problem of Javascript itself having many issues; not to mention the wildly different levels of quality within the JS library ecosystem.
I don't think Node is doomed; Typescript + async/await makes code make sense.
Actually, refactoring browsers were pioneered by Smalltalk people.
And Smalltalk had to leave trampolines in place to deal with renames that it couldn't finish deterministically. No thank you.
Nobody ever talks about that little horror when they bring up how Smalltalk could do refactoring JUST FINE without static types. It wasn't just fine, turns out.
I don't understand this argument.
In Smalltalk the parameter names are part of the function name to reduce the likelihood of name clashes. You still have the same problem when you have two functions with the same name and parameters.
You understand, someone will always claim that everything was invented by Smalltalk at some point.
What are your favorite tools for doing this in C/C++ (which are extremely statically typed, if not very strongly typed)?
I think the best thing I've found for this personally is coccigrep, which works but I've only used it a couple of times. I'd like something I'd reach for about as often as I reach for grep. (Also I think these days you'd want it to be based on clang or something.)
One thing that does seem to be true is that the requirement to name types means that a textual grep is way more reliable in C than it is in a dynamic language. If I want to find all places where a Python class is used in a large codebase, I might as well give up.
I imagine CLion is a popular answer here: https://www.jetbrains.com/clion/
Oooh, I've heard of people using it but for some reason I just thought it was an editor. I'll have to give it a try.
QtCreator
Here are the refactorings available : http://doc.qt.io/qtcreator/creator-editor-refactoring.html
Also they are backed by Clang and muuuuuuuuuuuch faster than in Visual Studio + whatever paid extension
A very nice feature it has, and that I haven't seen elsewhere, is optional case-preserving renaming, e.g. renaming Foo to Bar will also change foo to bar and FoO to BaR.
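For anyone wondering what the Foo/foo/FoO behaviour amounts to, here is a minimal text-level sketch in Python. It is not how QtCreator does it (a Clang-backed tool renames symbols, not strings); the function name `rename_preserving_case` and the rule for replacements longer than the match are my own assumptions.

```python
import re


def rename_preserving_case(text: str, old: str, new: str) -> str:
    """Replace whole-word, case-insensitive matches of `old` with `new`,
    copying the letter case of each match onto the replacement."""

    def apply_case(match: re.Match) -> str:
        found = match.group(0)
        out = []
        for i, ch in enumerate(new):
            if i < len(found) and found[i].isupper():
                out.append(ch.upper())
            elif i < len(found) and found[i].islower():
                out.append(ch.lower())
            else:
                # Assumption: characters with no counterpart in the match
                # keep the case they were written with in `new`.
                out.append(ch)
        return "".join(out)

    pattern = re.compile(r"\b" + re.escape(old) + r"\b", re.IGNORECASE)
    return pattern.sub(apply_case, text)


print(rename_preserving_case("Foo foo FoO", "Foo", "Bar"))
```

Being purely textual, this will happily rename unrelated identifiers that share the name, which is exactly the problem a type-aware refactoring tool avoids.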
And "what the hell is this?" when you see a data structure called "options"
And "why? oh, why?" when that data structure is unvalidated json and the only way to enumerate all options is to read all the code
I'm so happy to see the pendulum swing the other way, because when JavaScript was becoming popular, and Ruby and Python had a popularity renaissance (around mid to late 2000s), I had a lot of online arguments about the value of static typing.
People were complaining about static typing for the dumbest reasons (it's too 'wordy'). It started as a backlash against old-school enterprise Java development (which was fair, EJB2 world sucked) but then it went completely off-the-rails. Typing in Java could be better, sure, but even with its quirks it's way better than the nothing you get with dynamic languages. There are a class of bugs you just never need to worry about when the compiler does some compile-time checks for you ... like worrying that you passed the wrong type into a function, or the wrong number of arguments.
Thank God people are coming to their senses.
I've worked with statically typed languages, dynamically typed languages, and my day job involves a dynamically typed language with optional annotations (dynamically enforced) and static analysis tools.
In my experience, the cost of static typing feels roughly constant per line of code, while the benefits of static typing feel roughly O(N log N) in lines of code or O(N) in number of cross-type interactions. These are just wild guesses based on gut feelings, but they feel about right. The constant factors are affected by individual coder preference, experience, and ability, but also by specifics of the type systems involved and the strength of type inference in the tools being used.
In any case, I think often times dynamic typing proponents and static typing proponents have vastly different notions of what a large code base is, or at least the size code bases they typically use.
One problem is that many code bases start out where the advantages of static typing aren't readily apparent, but re-factoring into a statically typed language is often not realistic, even/especially when a project starts groaning under its own weight.
I'd love to see more mainstream use of gradual typing/optional typing/hybrid typing languages, especially something like a statically typed compiled subset of a dynamically typed interpreted language, where people could re-factor portions of their quick-and-dirty prototypes into nice statically typed libraries.
There are no large codebases - just insufficiently modular ones. :-)
> Typing in Java could be better, sure, but even with its quirks it's way better than the nothing you get with dynamic languages.
I disagree; that Java was the standard example of statically typed languages is what convinced me for a long time that I didn't want anything to do with it. Having to pollute my code with all that crap, deal with a lot of dumb restrictions and still have NPEs left me with a sour taste.
Only after I discovered Haskell (and more recently Idris), did I realize that static typing can actually be worthwhile.
Agreed, every time I had to patiently explain to javac that shockingly, my new ArrayList<T> was a List<T>, my new FooBar was a FooBar, and always would be, it drove me slowly mad. Still, I was thankful for the static types when trying to understand where this strange object came from and what it was supposed to do. I’m glad we have modern languages that have the potential to do an even better job of that without a lot of the verbosity.
Those things are indeed obnoxious, but they are truly _Java_ issues rather than static typing issues. For instance, C# will figure that stuff automatically in many cases using the _var_ keyword. The Lombok plugin can add something like that into Java, although it's only about 80% as good because of other Java limitations, primarily type erasure.
I agree. While I have never used C# in anger I have played with enough other (non-Java statically typed) languages to know that it doesn’t have to be this bad :D
While it may be frustrating to novice programmers, the distinction between interfaces (List) and implementation classes (ArrayList) is quite valuable, especially when dealing with huge code bases that have to be maintained over decades. Those declarations help to establish an internal design contract and clarify the developer's intent. In this particular case, a future maintenance programmer could switch from ArrayList to any other class which implements the List interface without breaking the program.
True, but there are more elegant solutions. For example, Rust's traits allow you to define a trait (somewhere in between an interface and an abstract class in Java), and then methods can be generic over any object that implements that trait. And structs and traits have a many-to-many relationship.
This allows you to implement patterns like duck typing which are idiomatic in languages like JavaScript, but impossible to implement in Java.
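To make "statically checked duck typing" concrete without switching to Rust, here is a rough Python analogue using `typing.Protocol` (Python 3.8+). It is only a sketch: `Quacks`, `Duck`, `Robot` and `chorus` are made-up names, and a protocol is structural typing rather than a Rust trait, so neither class declares that it implements anything.

```python
from typing import Iterable, Protocol


class Quacks(Protocol):
    """Anything with a matching quack() method satisfies this protocol."""

    def quack(self) -> str: ...


class Duck:
    def quack(self) -> str:
        return "Quack"


class Robot:
    # Never mentions Quacks; it conforms by shape alone.
    def quack(self) -> str:
        return "beep (quack)"


def chorus(animals: Iterable[Quacks]) -> str:
    # A static checker such as mypy can verify each element has quack().
    return ", ".join(animal.quack() for animal in animals)


print(chorus([Duck(), Robot()]))
```

Passing an object without a `quack` method still only fails at runtime unless a type checker is actually run over the code.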
Which happens exactly how often? I've been writing java code for the better part of two decades and basically never needed to swap my list/set/map types around. Also, one of the purported benefits of java's typing is your editor can find and refactor / swap all usages, so it doesn't seem to me like this distinction has much value. It instead makes me think it's a fetish for object orientedness...
In fact, the only cases I've ever swapped even a HashMap has been to trove, which implements data structures for builtin types rather than reference types. In which case using interfaces everywhere doesn't help.
It’s not about having the difference, it’s about having to explain to Javac what I am using because it’s incapable of doing meaningful type inference.
Only in the case of generic erasure, which is a Java wart that we will, sadly, never eliminate.
Generics in Java have well known limitations stemming from type erasure (a design decision made to preserve backwards compatibility). Things aren't quite as dire as you made them out to be and in the general case, they work quite well.
>Having to pollute my code with all that crap
What crap? Types?
>deal with a lot of dumb restrictions
Like what?
> What crap? Types?
Types, especially those that could easily be inferred and that provide no value to the programmer. Writing types down can be useful - Python programmers do it too. Having to tell the compiler every little thing is crap.
> Like what?
Type erasure. Having to treat primitives and arrays differently from other types. No first class functions or classes. I won't go on, the arguments are easy to find.
I won't deny that Java had some quirky design decision that the language is paying for (though some of those have been mitigated). But I can't find your criticisms very persuasive in a world where languages like JavaScript and PHP are popular. Even C and C++ have their own idiosyncrasies to contend with.
I am surprised though that you were completely ignorant of the benefits of static typing outside of Java. I would think that simply out of curiosity you would do a language survey just to get an idea of what else is out there.
> I can't find your criticisms very persuasive in a world where languages like JavaScript and PHP are popular.
Persuading you on the general demerits of Java wasn't really my intention; my point was that if you want to get more people on the static typing train, Java is a bad ambassador and might work against you.
> I am surprised though that you were completely ignorant of the benefits of static typing outside of Java. I would think that simply out of curiosity you would do a language survey just to get an idea of what else is out there.
Well, back then (early 2000s) I was a self-taught teenager with a poorer grasp of English (I'm not a native speaker), so while my curiosity did lead me to discover a few languages (JavaScript, PHP, Python, Lua and C), anything remotely approaching "academic" - Haskell, OCaml, Oberon, etc - was out of bounds for me.
Since C was the only other statically-typed language I knew, and I also knew it was much older and designed for much slower machines, I assumed static typing was mostly for efficiency, like manual memory management.
> Thank God people are coming to their senses.
There are no senses to come to. Static and dynamic typing each have their own benefits, and there are genuine tradeoffs to choosing one over the other. That we are even having this debate in 2017 shows that the world of typing is not a "solved problem" and there are still good reasons to pick either one, depending on the situation.
>Static and dynamic typing each have their own benefits,
I struggle to think of any benefits of dynamic typing on a reasonably sized code-base (e.g. 50k+ LOC).
Cheap and plentiful dev talent. Grab any random guy (and at these shops it's always guys) off the street and bang you've got a python dev.
There's always going to be an agility multiplier so long as VC is flowing and getting an mvp and a high head count is worth more than any kind of sustainable product.
>Grab any random guy (and at these shops it's always guys) off the street and bang you've got a python dev.
I expect any programmer to be able to transfer their skill set to a new programming language - especially when we're talking about mainstream languages with a GC.
>There's always going to be an agility multiplier so long as VC is flowing and getting an mvp and a high head count is worth more than any kind of sustainable product.
What kind of agility multiplier? Maybe it makes a difference during a hackathon where you have 24 hours to put something together or maybe if you're putting together shell scripts ... but taking something to MVP takes weeks or months - mainstream dynamic languages simply don't have any edge in development speed in those instances.
>I struggle to think of any benefits of dynamic typing on a reasonably sized code-base
Being able to define and initialize types at runtime offers more flexibility and it's quicker to develop in.
I'm frankly surprised nobody brings it up more often but the prototypical example of "static typing done right" - Haskell - is talked about nearly constantly but when I look for actual software I might use that's written using it.... there's so little it's almost embarrassing. One obscure window manager, one obscure VCS, facebook's spam filter and some tool for converting markup.
Given a sprinkling of asserts, a decent linter and a high level of test coverage I don't see much of a benefit to adding static typing.
>Being able to define and initialize types at runtime offers more flexibility and it's quicker to develop in.
Quicker if you're writing shell scripts or small single-purpose applications. Not quicker if you're adding to a codebase of any significant size.
Decent programmers compose code bases of "significant size" from many small, loosely coupled single purpose applications.
Yes, IME it's still quicker.
OTOH, if you're using "codebase of significant size" as code for "big ball of mud", static typing certainly helps, but integration tests are the real lifesaver.
>OTOH, if you're using "codebase of significant size" as code for "big ball of mud", static typing certainly helps, but integration tests are the real lifesaver
No. I mean when the codebase is of a certain size, new features require some thought and planning. Features may span multiple-modules. They may require partial or full rewrites or the refactoring of any number of sub-components to support the new behaviour. This means that you proceed carefully because you may not want to introduce regression bugs. This costs time. At that point, you're not limited by your typing speed as you may be putting in net 10 lines of code a day. There is just no benefit to dynamic typing at that point. Worse for dynamic languages, this is where improved tooling and static type constraints start paying extreme dividends.
>No. I mean when the codebase is of a certain size, new features require some thought and planning. Features may span multiple-modules. They may require partial or full rewrites or the refactoring of any number of sub-components to support the new behaviour. This means that you proceed carefully because you may not want to introduce regression bugs. This costs time.
Yes. And all equally true for statically typed languages.
Refactoring without tests is easier in a statically typed language, but still stupid. If you assume a decent body of tests, the benefit dissipates quickly.
If you prototype faster, as you usually will in a dynamically typed language, your design mistakes cost less. Since in large software systems I tend to find that the two biggest sources of bugs are a) errors in specification and b) poorly designed APIs, not obscure edge case bugs, quicker prototyping helps in large projects too.
And, if your design is solid, you can still achieve similar benefits as static typing by "locking down" your boundary code with asserts so that future development that interacts with that boundary code will fail quickly if it is interacted with in the wrong way.
>At that point, you're not limited by your typing speed as you may be putting in net 10 lines of code a day. There is just no benefit to dynamic typing at that point.
This is fallacious. You're never limited by your typing speed in any language at any point. The cost of "extra typing" is cognitive, not a finger speed limitation.
The benefit of dynamic typing is more flexibility in the code you write (especially useful for writing frameworks and such) and quicker turnaround when prototyping because you do not have to prespecify as much up front.
>Yes. And all equally true for statically typed languages.
>Refactoring without tests is easier in a statically typed language, but still stupid. If you assume a decent body of tests, the benefit dissipates quickly.
We're not arguing whether dynamic language+extensive integration/unit test is better than static typing and no tests.
>And, if your design is solid ...
Yes, if you have a very solid architecture, strict coding guidelines, extensive integration and unit test coverage, experienced developers (etc. etc. etc.) you will mitigate a lot of problems with dynamic typing. So if you do everything right, avoid the pitfalls, you can have something solid. A similar argument is made to me when I assert JavaScript is a terrible language. I don't disagree with either but it doesn't prove anything.
>The cost of "extra typing" is cognitive, not a finger speed limitation.
Exactly.
>We're not arguing whether dynamic language+extensive integration/unit test is better than static typing and no tests.
I am, because that's the way I'd work in any language, because I'm not a hack.
>Yes, if you have a very solid architecture, strict coding guidelines, extensive integration and unit test coverage, experienced developers (etc. etc. etc.) you will mitigate a lot of problems with dynamic typing.
Eliminate.
>A similar argument is made to me when I assert JavaScript is a terrible language.
No, javascript is different. The weird and fucked up implicit type conversions render even a high level of testing insufficient to achieve a high level of confidence in the code. There's way too many edge case behaviors where it should be throwing exceptions and it does something weird instead. C suffers from this problem too despite being statically typed.
"Quick to develop in" doesn't sound like a very good thing to me - because it usually means "slow and hard to maintain for the next guy".
> I struggle to think of any benefits of dynamic typing on a reasonably sized code-base (e.g. 50k+ LOC).
And yet, it's done all the time with great results.
The type systems of Java and C/C++ are the most commonly encountered ones and by far the most widespread in industry programming (as opposed to academia/research) and they are actually awful and really add a lot of friction and inertia to developing. Being free of that kind of type system when using a language like Python really does feel like a big upgrade.
There is a different problem with the more powerful and useful type systems in more modern statically typed languages though: learning curve. Haskell is dysfunctionally hard to learn and other languages do a little bit better but there's still friction in the learning curve that gets in the way of widespread adoption in projects that want to be able to hire rapidly.
I think Typescript is significantly easier to learn than JavaScript.
> EJB2.0 world sucked
This is the #1 reason I leaned away from static typing. Typescript and Swift have changed my mind. I now see typing as a helpful tool that can solve a lot of problems.
>>There are a class of bugs you just never need to worry about when the compiler does some compile-time checks for you ... like worrying that you passed the wrong type into a function, or the wrong number of arguments.
People always say this and it baffles me. Bugs like that should be caught immediately by your test cases. You shouldn't rely on the compiler to catch them for you.
Personal anecdote:
I worked on a small Python project some years ago and we had a type error in production despite having tests.
We traced it back to a call to a third party library. It was supposed to return a list of results, and all of the test cases around it worked and always got a list back. In production however we encountered an error because if there was only one value to return, the library would not return a list of one element, as we expected, but a scalar value. So the rest of the code was expecting a list and when it encountered a scalar it blew up.
You can blame it on us for having insufficient test cases, or not coding defensively enough, or not reading the source code of the library we used, or the author of the library for bad design, but ultimately, this bug would not have been possible in a statically typed language.
So just saying "have test cases" is not good enough. Your test cases may not be exhaustive, but a good static type system and type checker are.
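The failure mode in this anecdote is easy to reproduce. In the sketch below, `find_ids` is a hypothetical stand-in for the third-party call (the real library isn't named), and the `Union` return annotation is an assumption about how it could have been declared. With that annotation in place, a checker such as mypy should reject a bare `len(find_ids(...))`, because an `int` has no length.

```python
from typing import List, Union


def find_ids(prefix: str) -> Union[int, List[int]]:
    """Hypothetical library call: returns a bare int when exactly one
    name matches, and a list in every other case."""
    known = {"alpha": 1, "alphabet": 2, "beta": 3}
    ids = [i for name, i in known.items() if name.startswith(prefix)]
    return ids[0] if len(ids) == 1 else ids


def as_list(value: Union[int, List[int]]) -> List[int]:
    """Normalize both return shapes into a list."""
    return [value] if isinstance(value, int) else value


def count_ids(prefix: str) -> int:
    # Without as_list(), the single-match case raises TypeError at runtime.
    return len(as_list(find_ids(prefix)))


print(count_ids("al"), count_ids("b"), count_ids("z"))
```

Tests that only ever query prefixes with several matches would pass with or without `as_list`, which is how a bug like this reaches production.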
I cannot reply to cdoconnor (probably some downvotes), so I'll write here:
That's the whole point: as it's often said "type checking keeps you honest"
I stumbled time and time again upon badly designed libraries... With a static type system, the painfulness will be obvious and felt the first time you try to build your code.
With dynamic types, the pain might not be felt at all until a crazy bit of code is invoked, sometimes at the most unfortunate of times.
>or the author of the library for bad design
That's the one. That's a damned stupid decision.
Yeah and a static type system refuses to let you make such a stupid decision. When interoperating with code I didn't write, knowing that the code is guaranteed to conform to some specification is valuable.
Why would I write test-cases for something the compiler can catch for me? Yes, I need to write tests for all the correctly-typed cases, but there are a whole class of bugs that I don't need to test for any more because I can't even write the failing case.
>there are a whole class of bugs that I don't need to test for any more because I can't even write the failing case.
By adding sanity checking asserts in a dynamically typed language you can achieve more or less the same result.
If you are going to add a bunch of asserts, why not just use types?
I don't particularly see the need to extend this approach to every single variable in my code base. I usually just stick it on boundary code.
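For readers who haven't seen the "asserts on boundary code" style, a minimal Python sketch might look like this. `total_price`, `_sum_lines` and the item shape are invented for illustration, and note that Python strips `assert` statements when run with `-O`, so this is a development-time safety net rather than real input validation.

```python
def total_price(items):
    """Hypothetical boundary function: check shapes once, on the way in."""
    assert isinstance(items, list), f"items must be a list, got {type(items).__name__}"
    for item in items:
        assert isinstance(item, dict), "each item must be a dict"
        assert isinstance(item.get("price"), (int, float)), "price must be a number"
        assert isinstance(item.get("qty"), int), "qty must be an int"
    return _sum_lines(items)


def _sum_lines(items):
    # Internal helper: trusts its input because the boundary already checked it.
    return sum(item["price"] * item["qty"] for item in items)


print(total_price([{"price": 2.5, "qty": 2}, {"price": 1, "qty": 3}]))
```

The difference from a static type is when the failure shows up: these checks fire only when the call is actually executed with bad data, while a type checker reports the mismatch before the program runs.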
You don't write test cases specifically to check types. Your existing cases, if they are robust and thorough, will simulate your runtime and check those for you as a side effect.
My existing test cases, robust and thorough though they are, only test what happens when I feed in a correctly typed object. Because I can't write a test that exercises my code on an incorrectly typed input: the type system prevents that.
you're already writing the test cases to ensure correct behavior with typical input, and predictable exceptional input. putting in one more assert for predictable exceptional input (wrong type) doesn't really add a noticeable amount of overhead to writing the tests you were already writing.
>putting in one more assert for predictable exceptional input (wrong type) doesn't really add a noticeable amount of overhead
Yes. We call that 'typing'.
And it takes less typing (keyboard) than the assert, and you get free documentation inline with the code!
If you have to write a test for every line of code you write, the "chatty" argument against static typing goes away.
and not just that, but for every possible case for the line you can think of. Like the post above, testing the results of a function returns a list for 1 item, or many items...
All too often I see "Java bad, just look at how many lines you need to write to get hello world". Who cares? The IDE generates that for you. And if you don't understand static void main as a beginner, again, who cares? Just put it in and ignore it until you need to understand it. I also read that dynamic languages are so productive, but nobody mentions that you then need to write tests to detect what the compiler would catch for free. Yes, good code needs tests, but the compiler is, IMO, a damn good built-in test suite for catching many classes of errors.
> Bugs like that should be caught immediately by your test cases. You shouldn't rely on the compiler to catch them for you.
What's wrong with relying on the compiler? Bugs like that will be caught immediately by a good static type system, so you don't have to rely on your test cases to catch them for you.
Relying on a test suite to verify basic properties that could be enforced automatically through a type system means you're only checking some cases instead of all cases. It also means you're cluttering your test suite with boilerplate about language mechanics instead of writing high value tests that verify the real operational behaviour of your system.
There are pros and cons to static vs. dynamic typing in general, but in this particular respect, static typing is strictly more powerful, less verbose and more efficient.
Well if you don't need to write that particular test you have more time writing tests that are actually useful, like validating your logic (and dare I say: more fun to write).
Having a compiler catch this for you means having to write fewer test cases.
Yes, why do something automatically when you can do manual work!
I think the idea is that you have to write tests anyway.
No, you don't have to write tests for invariants enforced by the type system.
I think the bigger idea is that people who've never actually written serious code in a dynamically-typed language assume that people who do always write a bunch of extra unit tests to assert correct types, assert behavior on incorrect types, etc. etc., and that programs crashing due to type errors is a super frequent occurrence.
None of those things are true.
And yet there are dynamic language proponents in this very subthread suggesting that's exactly what everyone should do.
IMO the autocomplete argument is rather unconvincing. Every dynamic language I've worked deeply with has powerful and simple introspection capabilities, and they generally come with much more interactive development environments (shell/REPL), so I've never found API discoverability to be any worse than statically typed languages, just different.
In general though, the more you can formally reason about the program, the more you can automate program transformations (refactoring). Programmers in dynamic languages will argue that the amount of code is far less than the equivalent code in a mainstream statically-typed language, so while the cost of refactoring may be higher per unit, the number of units is less, so the overall cost is the same (or less).
I believe in using the best tool for the job...some use cases would benefit more from static typing, while others would benefit more from using a dynamic language. One of the most important factors is the team and its engineers' backgrounds, preferences, styles, etc.
Yeah, I agree: Two of the languages with the best autocomplete and introspection in general are Common Lisp and Smalltalk. Some of that's simple maturity, but it proves that those features can work very, very well in even the most dynamic languages around.
Sorry, but I'd contest that remark, at least for Smalltalk. It only has the best autocomplete/introspection for people who have never worked with modern statically typed languages/IDEs and/or significantly large projects. It just doesn't have the necessary information.
Does anyone believe in using the wrong tool for the job?
Yes! Many engineers use tools they like for reasons unrelated to the job they're doing or the product they're building.
It might still, at least arguably, be "the right tool for the job." If you have a huge team of expert C# developers C# might be "the right tool for the job" even if it would be a little easier to do in a different language, given the same pool of experts in that language.
Agree. My point is that, if I personally dislike C# (I don't), and I'm on a team of C# experts and C# is the best choice for shipping the product given those experts and all the other use cases, then C# is the best tool for the job and I'll peruse C# documentation. I try my best to focus objectively on the product, not my ego or subjective preferences...and I often fail :(.
I've been writing PHP for a couple months and I've been pleasantly surprised so I feel like I am moving beyond that.
To be fair, PHP 7.1 with Composer is a very different language to PHP 4, for the better of course.
Yeah, sure, and I was able to avoid the painful years. But C# pre-generics and pre-Linq is a way less appealing language too, you know? Most languages that are popular now look kind of rough to work in several versions ago.
Oh definitely. I think that’s the case for nearly any language that’s worth the code it’s written in. C++ is an interesting case study in how to do it both right and wrong; we’re spoilt for choice and it keeps getting better every day
At a place I worked at once, it was "use Microsoft for everything". So yeah.
Yeah but that's more a matter of the criteria to decide on the right tool. Nobody would say "I intend to use the wrong tool for the job."
> Nobody would say "I intend to use the wrong tool for the job."
...but there's plenty of sales charlatans out there who will say "I intend to get the manager of that IT dept to make its developers use my tool for the job, and I don't care if my tool is the right one or the wrong one."
> Does anyone believe in using the wrong tool for the job?
Yes. PHP developers.
:)
The thing about magical refactoring though is that it often isn't a good idea. You usually don't just change the name of something, but you change it conceptually. If you just let the IDE go and change the name everywhere, you create bugs because you never actually went and made sure the old code was updated for the new concept, rather than just the name.
That's not really true.
What swayed me towards statically typed languages was playing around with Haskell and being able to use the 'deriving' clause.
The realization that, with types, a smart enough compiler can implement interfaces for me was amazing.
Unfortunately, I am afraid that it will take a while for these techniques to go mainstream. AFAIK, the most a mainstream language can do is fill in the method stubs for you.
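For comparison, the closest thing in Python that I'm aware of is `dataclasses`, which generates equality, ordering, hashing and a printable representation from the declared fields. This is a loose analogue only: it is decorator-driven code generation, not Haskell's type class derivation, and the `Version` class here is a made-up example.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Version:
    # __init__, __repr__, __eq__, __lt__ (and friends) and __hash__ are
    # all generated from these two field declarations.
    major: int
    minor: int


releases = sorted([Version(1, 10), Version(1, 2), Version(0, 9)])
print(releases)
print(Version(1, 2) == Version(1, 2))
```

Roughly, this plays the role of `deriving (Eq, Ord, Show)`: ordering compares the fields in declaration order, so `Version(1, 2)` sorts before `Version(1, 10)`.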
Definitely. Static typing lets you turn the compiler into a hard-working friend that helps you refactor large projects without going insane.
There are no diminishing returns. Defining types is easy and enhances code readability.
Believing that any technology has no cost is poor engineering IMO. There are costs to that rigidity, and they're rather self-evident.
No one is saying there is no cost. The cost of using static types is (1) you have to think about type info when writing the code and (2) you have to fix compile-time type errors while you are developing. The poster is claiming that, over time, as code bases tend to grow large, this small investment yields increasing (not diminishing) returns; a claim I would agree with.
If you believe there are no diminishing returns, I'm interested to hear your reply to the author's question about why we don't all use Agda or Idris.
Well, the easy answer is that dependently-typed languages like Agda and Idris aren't very mature yet. They're still missing many commonly-needed libraries, compile times are slow, the tooling isn't great, etc etc.
Getting a language to the point where it's workable for serious projects is a lot of work. Rust is getting there with the backing of Mozilla, Haskell has made some decent strides too (but still has a way to go, and I think has some pretty fundamental flaws entirely apart from the type system). It'll be probably another decade at least before we see any dependently-typed languages getting a serious foothold, but I do think they're going to become a lot more common eventually.
It isn't just a matter of tooling. There exist hard limits on how much can be inferred about unannotated programs, and when you go past those limits, the price you have to pay is to embed (partial) proofs of correctness in your own code. For example, think about why GADTs don't play nicely with type inference.
IMO, machine assistance is useful to the extent it relieves us humans from work. In particular, types are useful to the extent they can be inferred. Beyond that, you still need to prove the correctness of your programs on your own, so there is no point to the ceremony of writing down those proofs in a machine-checkable format.
> Well, the easy answer is that dependently-typed languages like Agda and Idris aren't very mature yet.
It's also self-evidently wrong. Agda was first released in 1999, ten years before Go. If you use a wallclock interpretation of "maturity", Agda is twice as old as Go and Idris is roughly as old as Go. Both are used significantly less (by several orders of magnitude), though. Despite them having a far stronger type-system.
If, on the other hand, you are using a "developers' time" interpretation of maturity, you are making a circular argument, i.e. "Agda is seeing less use, because it has been used less", as resources invested in a language ecosystem tend to be strongly correlated with its usage.
Have you used them? Switching to a language like Agda or Idris is entering the realm of formal verification because the types are so expressive. It's completely different to what most programmers are used to.
Essentially, the types being used are so complex that the type checker cannot automatically decide whether two types are compatible, so you have to write maths proofs to help. The types used in mainstream languages are simple enough that the type checker never needs help like that.
The cost of formal verification right now is immense, but the benefit is close to bug-free code. Mainstream strongly, statically typed languages require nowhere near the same amount of effort and give clear benefits over dynamic types.
Defining types is easy and enhances code readability until you go too far. Some type declarations in Haskell or highly templated C++ are hard to read.
While this is a valid point, it's also worth remembering that if you have a data structure of sufficient complexity that writing out its type is cumbersome, then your data is still in that structure whether you choose to be explicit about it or not. Any code reading or modifying some element within that data still needs to correctly find that element, and if writing out the types is a burden then probably finding the correct location every time is also difficult. So if anything, when you have more complicated data structures flying around, and particularly if you work with several similar but different complicated structures, that could make having the types explicit much more useful for ensuring correctness and maintainability.
Absolutely. Sometimes the type is too complex not to be explicit about it.
Highly templated C++ is the highway to hell. In case you haven't heard, C++ templates are Turing complete, so template instantiation is subject to the halting problem.
I really don't think it does that. Most refactoring is not like that. It is about changing things at a deeper level than just the type.
I don't think they enhance code readability - I think they make it worse. I never look at the type when I am reading code, it just gets in the way.
Semi-automatic refactoring tools is just one part of what a type system enables. A much bigger benefit in my opinion is that it immediately highlights issues in your code while you are refactoring. Basically answer "What do I still need to change to finish the refactoring?". Unit testing also gives you some of this, but is typically much slower - both in execution time, and the additional time it takes to figure out where the issue is.
With statically typed languages, the amount of refactoring of that type is disproportionately large, so people notice how much the IDE helps without noticing that most of that help wouldn't even be necessary without the complexity added by static types.
My main programming language at the moment is Java, and the amount of assistance my IDE provides is astonishing (to the surprise of no one, it's IntelliJ). The confidence strong automatic refactors provide is of great help when managing large codebases.
I agree that static typing helps reading comprehension and that IDEs like PyCharm help getting into big code bases. That said, at the end of the day, when you know the code, both the IDE and static typing are getting in the way. Actually, I never saw anybody as quick as people using simple editors like emacs and vim. GUI is getting in the way of the programmer's intent. Static typing is a hindrance in front of refactoring. Unit tests is the only truth that matters, static typing or not.
You're kinda damning it with faint praise when you say that you can use dynamically typed languages on small projects that fit in your head (and are also probably written by a single developer).
You can pretty much use any language in that scenario. But the chickens come to roost around day 30+ or so.
A large project is a poorly decoupled set of small projects.
True, once my libraries get bigger than five or six assembly instructions, I tend to break them down to a more manageable size.
Maybe.
Meanwhile real-world non-trivial projects tend to be large projects.
Nice to be prepared for that instead of gambling on "surely we'll extract out smaller projects that fall on the right abstraction boundaries in the face of unknown future requirements."
Personally I'd only use a language like C or C++ or Java for a tiny puny baby child's toy program. They're fundamentally unfit for real-world codebases.
I'm probably being trolled, but I'll bite.
How big are the "real-world codebases" you're talking about, and how many programmers are working on the code? Once you hit 5-10 million lines of code and/or thousands of developers, static typing really helps manage complexity.
It's more that I get tired of this argument and decided to try to pre-emptively spoil it.
> Unit tests is the only truth that matters, static typing or not.
Well that's provably false, because there exist properties that you can type check that literally can't be verified via unit tests, even in principle. For instance, race and deadlock freedom.
> Static typing is a hindrance in front of refactoring.
Are you kidding? Python programming's my day job but Haskell is an order of magnitude easier to refactor.
There's not only tooling and reliability. Don't forget performance, too. Knowing at runtime that a memory word represents, for instance, an integer rather than a reference to a struct that contains both a type tag and a reference to the actual value, makes a huge difference. And I'm not even mentioning the inlining possibilities.
This is really true. I was a complete Java newbie and knew some Python when I joined Google. Yet I found working in an unfamiliar Java code base much, much easier here.
Large Python code bases are really hard to understand and work in (here).
Dynamic languages support this kind of tooling too - you can even have seamless data completion (eg. map/dictionary keys). The original Refactoring Browser was written for Smalltalk. Etc.
Of course there are cases where dynamic languages do worse, but it balances out I think.
> The original Refactoring Browser was written for Smalltalk.
Unless you think original means Bill Opdyke's thesis work, where the tooling was written for C++ and written in CLOS.
But how can you offer any refactoring or auto-complete inside a function if you don't know what type to expect as an argument?
You can, but you may start to run into some limitations. Many languages have some version of type inference, which allows them to figure out the type of a thing, even though the programmer didn’t specify it.
C# has a very weak version of this with the var keyword. Languages like Crystal take it much further by tracing the flow of data through the entire program. It generally works quite well, though there are a few edge cases that require explicit type annotations.
As for auto-completion, some languages feature designs that make it easy to offer auto completion even without type information. For instance, Elixir doesn’t have methods. You only have functions defined on modules, and it’s trivially easy to know what functions are defined on a module.
So it’s possible, but there are some limitations.
Type inference goes hand-in-hand with static typing. Those are not opposite things.
> C# .... Crystal
Both are fully statically typed languages with type inference. Those are unrelated to the argument the parent comment is making.
> Type inference goes hand-in-hand with static typing.
That would imply there's no type inference possible with dynamic languages - either at "compile" time or run time.
(Wouldn't it also imply that static typing and strong typing are synonymous?)
They still both allow generic arguments, in reduction, this means that they still have uncertainty.
I can and have built a totally untyped language within a fully statically typed language - nostrademons' Scheme-in-haskell exercise is a lot of fun.
You just have to define a Universal type and then back out of all that nasty compile-time nonsense. Everything is Univ and Univ is everything.
Yes! If your language allows you to put constraints on those types you can specify what specific types have access to. For example in Rust:
    fn foo<T: Positioned>(x: &T) -> Point { x.position() }
Note that this doesn't work in languages that use templates for generics, like C++, where templates work more like compile-time duck typing.
But, if I may, you are effectively statically typing your code.
Oh wait, I misread the comment you were responding to. Oops!
In dynamic languages functions aren't typically overloaded by argument type.
For OO languages and methods it does get a little guesswork-y, and tools often offer a wider range of guesses than is correct for completion. But you can show the class along with the offered completion, so it's not too bad.
If Visual Studio is your example, then you're making the same argument as the author: there's a sweet spot. From the python world, c# may look statically typed, but from the Haskell/Idris point of view, the java/c# type systems look really sloppy and ambiguous.
Agreed. Although other comments claim that their dynamically typed languages have this same property, in dynamic languages the auto-complete will inherently fail unless you effectively write code as if you were using static types.
I can do all of those things in Smalltalk, which is dynamically typed; these things are not byproducts of static typing, they're simply byproducts of mature tools. So no, these are not arguments in favor of static types.
>> finding all references to a function <<
Yes, you can find all references to a method in Smalltalk -- but those references are not separated-out from all the references to other methods that happen to have the same method name but are defined on a different class.
With type information for the receiver and method arguments, we can find just the references we're looking for.
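To make the ambiguity concrete, here is a minimal sketch in Python (chosen only because it is easy to run; the thread is about Smalltalk). It uses the standard `ast` module to search for calls by method name alone. The classes `Invoice` and `Chart` and the helper `find_method_calls` are made up for illustration.

```python
import ast

# Two unrelated classes that happen to share a method name.
SOURCE = '''
class Invoice:
    def render(self):
        return "invoice"

class Chart:
    def render(self):
        return "chart"

def show(invoice, chart):
    a = invoice.render()
    b = chart.render()
    return a, b
'''

def find_method_calls(source, method_name):
    """Return line numbers of every call of the form <anything>.method_name(...)."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == method_name):
            hits.append(node.lineno)
    return sorted(hits)

# A name-only search reports both call sites. Nothing in the syntax says
# which one refers to Invoice.render and which to Chart.render.
calls = find_method_calls(SOURCE, "render")
```

The search finds two call sites and cannot split them by receiver class. A tool that knows the receiver types (from declarations or inference) could keep only the one it was asked about, which is the point being made above.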
Which is why Smalltalk's refactoring browser has a manual intervention step to allow you to see what it proposes to do, and remove any steps you don't agree with, as well as the ability to scope your refactorings to a package or class to limit the scope to more relevant data. Either way, it's sufficient to get 99% of the benefits of automated refactorings without the 100% guarantee static typing provides; good enough for me.
Automated refactoring was invented in Smalltalk, claiming it's a benefit only static typing provides is to not know history.
> Automated refactoring was invented in Smalltalk, claiming it's a benefit only static typing provides is to not know history.
History: In Bill Opdyke's thesis work, the tooling was written for C++ and written in CLOS.
Ralph Johnson, a prominent Smalltalk'er, was advisor on that paper and was the creator of the first Smalltalk refactoring browser, and that paper is littered with references to how things are done in Smalltalk. I was unaware C++ was involved, but I still think Smalltalk had the first commercial refactoring browser. A research paper is not a product; however, thanks for the ref.
Do you think having "the first commercial refactoring browser" is the same as "[a]utomated refactoring was invented in Smalltalk" ;-)
Incidentally, what's your source for "Ralph Johnson… was the creator of the first Smalltalk refactoring browser" ?
Don't be a douche, obviously I don't think it's the same hence my comment. And I'm willing to bet, same as you, my source was Google to verify my memory of something I read long ago.
If I'd read it as obvious I wouldn't have made that comment: your name-calling is unhelpful.
We still don't know if you simply confabulated your other claim.
I didn't name call, saying don't be something isn't remotely the same as saying you are something. If someone tells you don't be a jerk, they're not calling you a jerk, they're warning you you're nearing that point. We're done, good will has exited the building, there's no point in continuing.
I can't speak to Smalltalk, but any namespaced Lisp system can figure out what references what. The key is that the searches happen through the REPL, not grep/ag. So if I tell CIDER to find all instances of a Clojure symbol, it can use the namespace to avoid false positives.
Are you saying that reduces the number of false positives or are you saying that eliminates false positives?
Assuming there's no ambiguity, it can eliminate false positives.
I can ask my Lisp IDE to find callers for methods based on the sub/class of arguments.
Does it work better if you use type-specifiers in your code :-)
Isn't static typing a requirement when writing an algorithm and data structures for "every bit and clock cycle counts" situations? How can a 0-compute overhead for dynamic typing exist in a dynamic typed environment?
I always thought dynamic typing is a feature for situations where the code needs an extreme amount of flexibility to adapt to a wide variety of data; even at the expense of performance.
For optimal performance, the absence of unnecessary dynamic checks is a requirement. The presence of static checks is not, although they are useful for your sanity's sake.
Dynamic languages can actually be quicker in some cases because of virtual machine optimizations based on how your code actually runs with live data.
Advanced compilation techniques can transform (input data, input program) into a statically typed program, so at the limit the answer is strictly "no". See partial evaluation, Futamura projections, etc., or JS JITs for more down-to-earth stuff (though sidestepping data transformations).
> Isn't static typing a requirement when writing an algorithm and data structures for "every bit and clock cycle counts" situations?
Not necessarily. The phrase "dynamic language" actually bundles together a lot of related but distinct ideas: some of those add performance overhead, others don't. "Dynamic" features which are most notable for performance overhead are tagged unions and dynamic dispatch.
Imagine a dynamically typed language which provides types like int, bool, list, functions, etc. and we're free to assign (and reassign) any of those values to any variable we like:
    x = 5              # An int
    y = true           # A bool
    y = 3              # An int
    z = square(x + y)  # An int
It's very common to store such values as a tagged union: a "union" meaning that the data could represent an int, or a bool, or whatever, and "tagged" meaning that there's some extra data which tells us which one it is. For example, we might store `5` as a pair of machine words: `1` to indicate that it's an int, and `5` to represent the data. The boolean `true` might be the pair `2` to indicate bool and `1` for the data. And so on.
This tagging adds overhead. We can reduce it in some cases, e.g. using "tagged pointers" where we use some of the bits in a word for the tag and some for the data. But it's still overhead.
However, there's nothing fundamental about using tagged unions. We could just as easily use 'unboxed' values, i.e. store the int `5` as the machine word `5`; the bool `true` as the machine word `1`; etc. There is no overhead, no metadata, etc. Whilst this is a perfectly reasonable implementation strategy, it means that we cannot tell what type a value is intended to have: e.g. if a value is stored as the machine word `1`, we have no way of knowing if that's the boolean `true`, or the integer `1`, or the character `SOH`, or whatever. This would probably lead to very buggy programs, since we have no type checker to enforce correct usage, and (since there's literally no difference between data of different types) we can't even do runtime assertions like `assert(isInt(x))`.
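A rough model of the two layouts, as a sketch only: Python tuples stand in for pairs of machine words, and the tag numbers `1` and `2` follow the example above. The helper names (`box_int`, `box_bool`, `is_int`) are invented for illustration. Python itself boxes every value, so this models the idea rather than the actual memory layout.

```python
# Hypothetical tags, following the example: 1 means int, 2 means bool.
TAG_INT, TAG_BOOL = 1, 2

def box_int(n):
    # Tagged representation: (tag word, payload word).
    return (TAG_INT, n)

def box_bool(b):
    return (TAG_BOOL, 1 if b else 0)

def is_int(value):
    # A runtime type test is possible only because the tag travels with the payload.
    return value[0] == TAG_INT

five = box_int(5)
true = box_bool(True)
one = box_int(1)

# Tagged: `true` and `one` share a payload, but the tags tell them apart.
assert true[1] == one[1] and true != one

# Unboxed: keep only the payload word and drop the tag.
unboxed_true, unboxed_one = true[1], one[1]

# Now the boolean true and the integer 1 are the same word,
# so nothing like is_int() can be written for them.
assert unboxed_true == unboxed_one
```

The tagged form pays for an extra word per value and buys the ability to ask "what is this?" at runtime. The unboxed form pays nothing and loses that ability, which is where a static checker would have to take over.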
Likewise, many dynamic languages use dynamic dispatch to choose which functions to call based on the type of data we have. In the above example, we might have `x + y` running an integer addition function when `x` and `y` are integers, or a string concatenation function when `x` and `y` are strings, and so on. That's what Python and Javascript both do. Yet, again, there is no fundamental reason to do this! We can have a dynamic language which uses static dispatch everywhere: consider that in PHP, `+` is only used for numerical addition, whilst `.` is only used for string concatenation. Dynamic dispatch adds overhead, since we need to chase pointers, etc. whilst static dispatch doesn't. This is simply a question of programmer convenience: do we want every function to have a distinct name, and write them out in full each time?
Note that both of these features: tagged unions and dynamic dispatch, can be used in static languages too. It's just that, historically, they tend to be default (and hence unavoidable) in dynamic languages, and something we must explicitly create in static languages. Hence when we compare static languages to dynamic ones, we tend to compare statements like `x + y` in Python to `x + y` in C, which are actually quite different semantically. A fairer comparison would be to compare `x + y` in Python with `callWith(lookUpSymbolFromType("+", lookUpType(x)), x, y)` in C (I've made up those function names, but the point is that the same functionality is there if we want it, but it will be just as slow as if we'd used Python)
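The difference between the two call styles can be sketched in a few lines of Python. The dispatch table and the function names (`add_int`, `concat_str`, `dynamic_plus`, `static_plus_int`) are hypothetical, in the same spirit as the made-up `callWith`/`lookUpSymbolFromType` names above. This is not how any particular interpreter implements `+`.

```python
def add_int(a, b):
    return a + b

def concat_str(a, b):
    return a + b

# Hypothetical dispatch table keyed by (operator symbol, runtime type).
DISPATCH = {
    ("+", int): add_int,
    ("+", str): concat_str,
}

def dynamic_plus(x, y):
    # Dynamic dispatch: inspect the runtime type, look up the
    # implementation, then call it. The lookup is the overhead.
    impl = DISPATCH[("+", type(x))]
    return impl(x, y)

def static_plus_int(x, y):
    # Static dispatch: the callee is fixed when the code is written,
    # so there is no lookup step at all.
    return add_int(x, y)
```

Usage: `dynamic_plus(2, 3)` and `dynamic_plus("a", "b")` go through the table, while `static_plus_int(2, 3)` names its callee directly, much like PHP's separate `+` and `.` operators.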
A good static type system will let you control the level of dynamism in various parts of your code. For example, you could use tagged unions to create mini dynamic type systems within parts of your codebase.
Not unless you think assembly language or machine code are statically typed.
Assembly is typed. The types are machine words and usually floating point types are also available. Basically, the machine types are the types of data that the register files can hold.
That's not a useful notion of type. If I cannot tell statically by inspecting a register name nor dynamically by inspecting a bit pattern whether a given register holds a pointer or an integer or a floating-point value, that's pretty much the definition of "untyped".
Dynamic type testing or introspection is not an essential feature of type systems. The fact is, your computing machine is typed. You cannot dereference a floating point register as an address, for instance.
> Dynamic type testing or introspection is not an essential feature of type systems.
I didn't say so. An essential feature of type systems is being able to determine the types of (many) values, statically or dynamically.
> You cannot dereference a floating point register as an address, for instance.
So you'd agree that architectures that don't make a distinction between integer and floating point registers are untyped? But sure, call this a type system if you must. It's just a very very weak one, so weak as to be almost entirely useless. (And the reason floating point registers are often separate from integer registers is not to provide this kind of "type safety", it's due to history and architecture.)
> It's just a very very weak one, so weak as to be almost entirely useless.
Weak and strong aren't meaningful terms. A machine ISA might have an inexpressive type system and/or an unsound type system (because it conflates addresses and integers).
> And the reason floating point registers are often separate from integer registers is not to provide this kind of "type safety", it's due to history and architecture.
No, the reason is performance. And we get performance by making statically known distinctions between datatypes, which is what the original poster asked about.
Intermingling the integer and floating point circuitry so they access the same register file would never improve performance over keeping them separate. You'd need longer wires to place them both near the same register files to minimize signal latency, and the added signal delay alone ensures lower performance.
>> It's for architectural reasons.
> No, it's for architectural reasons.
You win, I guess?
I'm not sure who you're quoting, but I never even used the word "architecture". I don't even know what that's supposed to mean.
You try to explain CPU architecture to me, then claim you don't know what CPU architecture is? OK, maybe you don't win, in any case I'll stop here.
"Architectural reasons" is nonsense. "Architecture" isn't a set of reasons justifying anything, rather it's the other way around: we design architecture for specific reasons. Your claims that we do things for "architectural reasons" don't have any clear meaning. I said the actual reasons are performance. I don't think I'm the one being unclear.
Yep. You can, of course, make a similar argument for Scheme. But you don't get a very interesting type system out of it.
Perhaps I should have written "... nontrivially statically typed."
Sure, but to bring it back to the original question, the fact that you can trivially distinguish floating point from word types already means statically typed languages are easier to optimize than dynamically typed languages for numerical programs.
And there are all sorts of optimizations like this that simply aren't available to dynamically typed languages. Tracing JIT can only take you so far.
OK, but that wasn't the original question.
(There are also all sorts of optimizations that can't be done statically, so I guess there's that.)
The original question asked whether static typing is needed when maximal performance is required. If we understand "maximal" to mean literally, "no more performance can possibly be squeezed out", then it simply is. Static typing might not be sufficient for this case, but it is necessary.
I think those few, crucial pieces of hand-written assembly language one sees from time to time disagree with you. (Modulo your observation that assembly languages are trivially statically typed.)
> I think those few, crucial pieces of hand-written assembly language one sees from time to time disagree with you.
I still don't think so. For example, plenty of silicon is spent on branch predictors simply because addresses and integers aren't distinguished, in general, thus permitting more expressive but costlier code.
Execution would be much faster if integers and addresses were forced to be distinct. Static typing pretty much always improves performance.
Live coding environments are amazing at this, as well. Often without the need for as extensive static typing, since it can just use reflection.
This isn't to say that static typing isn't good at this. Just, even with that, there is a lot of effort that goes into making the rich tooling. A lot of very smart and capable folks work hard to make Visual Studio.
>Tooling is another major argument.
vscode seems to figure out the types in javascript without any static typing.
>But it is great for refactoring
Searching for strings isn't that much worse. Also, when it comes to web development, you cross into the client-side and suddenly you can't refactor. So you can only refactor the server-side and end up with a mismatch.
>finding all references to a function or a property or navigating through the code at design time
You can do that without static typing in many cases as well.
>Basically all the features visual studio excels at for .net languages.
When I was working in c# on the server and javascript on the client, I really hated having to go back into c#.
>telling you through a drop down what other options are available from there
vscode seems to be able to figure this out most of the time as well.
I think static typing is necessary when you need performance because all of the fast languages are statically typed.
I think a lot of these things are about organisational complexity and making sure new and average programmers don't screw up the software. It is about large companies trying to manage their organisation, it isn't about the complexity of the code itself.
There are a ridiculous amount of tech companies that have used dynamic languages to go from nothing to the biggest companies in the world and only switched to static languages well and truly after that occurred.
> vscode seems to figure out the types in javascript without any static typing.
It is actually using TypeScript's engine for this. It uses TypeScript's type definitions where it is available (e.g. most popular libraries and built-ins), and some inference rules where it has to work with plain JavaScript.
While this does give you decent auto-complete in a lot of cases (and still getting better), it's not quite as good as using TypeScript directly.
> Searching for strings isn't that much worse.
I'm busy with a large refactoring project, and TypeScript has been amazing for this. After restructuring code, you can just keep on fixing things until there's no more red. This is not just about reducing bugs - it removes almost any thinking/mental overhead required for the refactoring process.
> Also, when it comes to web development, you cross into the client-side and suddenly you can't refactor. So you can only refactor the server-side and end up with a mismatch.
TypeScript also works very well for client-side. Of course, if you change the API between the client and server, that's a different story.
>vscode seems to figure out the types in javascript without any static typing.
There are limits to type inference. And if you're going to rely on type inference to prevent runtime bugs you might as well double-down on static typing since you're giving up on some of the 'features' (insanity) of dynamic languages anyway.
Dart 2.0 now mandates strict typing but will allow you syntactic shortcuts as long as the compiler can infer the type (if it can't, you get a compile-time error) - that's a great compromise. I wish more people would love Dart. It's such a great, well-designed, language.
>Searching for strings isn't that much worse.
There are very real limits with what you can do with 'strings'. And yes, it is that much worse.
>When I was working in c# on the server and javascript on the client, I really hated having to go back into c#.
I do not understand that view. You are an alien to me. C# is a beautiful language that fixes a lot of syntactic problems in Java. It is much more pleasurable to write C# code than JS code (outside of dinky 50 line programs).
>I think static typing is necessary when you need performance because all of the fast languages are statically typed.
That's not the only reason but it is one of them. JIT and AOT compilers can do more with strongly typed code.
>There are a ridiculous amount of tech companies that have used dynamic languages to go from nothing to the biggest companies in the world and only switched to static languages well and truly after that occurred.
Sure. PHP (pre-5) and JavaScript made a ton of money for a ton of people. Both languages were integral in the Web revolution. Doesn't change the fact that PHP was a terrible language and JavaScript is still a terrible language.
The vast majority of development time is spent finding and fixing bugs. Small startups got huge often because they hit the sweet spot of features before anyone else, not because their code was high quality. Once they get large installed bases (and large valuations) they get religion about the value of more strict typing.
The unknowable question is, would they have hit that sweet spot anyways by engineering their product more rigorously, and had less pain later? Or would it have impeded them in the exploratory phase of writing and rewriting their code until they hit that spot?
> vscode seems to figure out the types in javascript without any static typing.
Doesn't it do this by treating Javascript as a statically-typed language (Typescript) and using type inference?
No it works without using typescript. I just figures it out through... I have never thought about it. I mean, var p = new Cat();. I am sure it can find the cat definition easy and read the properties and so on. It probably can't prove things 100% but it can guess very well at what things are.
> I just figures it out through... I have never thought about it.
https://github.com/Microsoft/TypeScript/wiki/JavaScript-Lang...
> Visual Studio 2017 provides a powerful JavaScript editing experience right out of the box. Powered by a TypeScript based language service, Visual Studio delivers richer IntelliSense, support for modern JavaScript features, and improved productivity features such as Go to Definition, refactoring, and more.
It works off Typescript type annotations for packages and the Typescript compiler's type inference. VS Code's Javascript editing does not seem to be evidence for tooling around dynamic languages so much as it's evidence that if you have a lot of money and Anders Hejlsberg you can write a compiler for a statically typed language with type inference that looks like a specific dynamic language.
Our team (TypeScript) powers this experience.
>I just figures it out through... I have never thought about it.
Are there not definitions written by someone or a tool that vscode looks at?
That is what I saw using some npm packages with typescript: http://definitelytyped.org/
> There are a ridiculous amount of tech companies that have used dynamic languages to go from nothing to the biggest companies in the world and only switched to static languages well and truly after that occurred.
This is a biased sample though. You're saying 'ridiculous number', but the truth is most startups (using static or dynamic languages) fail and we don't actually know if their language choice had much impact on their success or failure.
Also, the article does not take into account the much higher cost of fixing a bug in production compared to development.
It does (though admittedly not very explicitly), it just rolls it into the "benefit" (i.e. not shipping broken code is a benefit) and the "weight given to stability". If a bug discovered in prod (or even qa/canary) has a significantly higher cost than a bug discovered early, that will influence the weight you are putting on stability. In that way, this factor is subsumed in one of the other graphs.
As you might've noticed, I have been intentionally light on the details and only talked pretty abstractly about the specific scales and factors involved.
Thanks, I take the point. It depends on the sort of project you're working on: a cellphone app is different from, say, a major telco's billing system.
I have been doing some elm, and refactoring is a MAJOR bonus. You can refactor your entire project, and when it compiles it usually works. It's funny how dynamic languages are seen as beginner friendly, while in reality they are not (ruby, js, python...).
A lot of the same refactoring is possible in dynamic languages as in static ones. I recommend reading up on Tern to see what's possible to do with JavaScript http://marijnhaverbeke.nl/blog/tern.html
I use Cursive https://cursive-ide.com/ for working with Clojure, and it can do safe refactoring for symbols by doing static analysis of the source. It can show all usages of a symbol, rename it, do automatic imports, and so on.
Another piece of tooling that's not available in any statically typed languages at the moment is REPL integration with the editor seen here http://vvvvalvalval.github.io/posts/what-makes-a-good-repl.h...
I find that the REPL driven workflow found in Lisps is simply unmatched. When you have tight integration between the editor and the application runtime, you can run any code you write within the context of the application immediately. This means that you never have to keep a lot of context in your head when you're working with the application. You always know what the code is doing because you can always run and inspect it.
Having the runtime available during development gives you feedback much faster than the compile/run cycle. I write a function, and I can run it immediately within the context of my application. I can see exactly what it's doing and why.
The main cost of static typing is that it restricts the ways you can express yourself. You're limited to the set of statements that can be verified by the type checker. This is necessarily a subset of all valid statements you could make in a dynamic language.
Finally, dynamic languages use different approaches to provide specification that have different trade offs from static typing. For example, Clojure has Spec that's used to provide runtime contracts. Just like static typing, Spec provides a specification for what the function should be doing, and it can be used to help guide the solution as seen here https://www.anthony-galea.com/blog/post/hello-parking-garage...
Spec also allows trivially specifying properties that are either difficult or impossible to encode using most type systems. Consider the sort function as an example. The constraints I care about are the following: I want to know that the elements are in their sorted order, and that the same elements that were passed in as arguments are returned as the result.
Typing it to demonstrate semantic correctness is impossible using most type systems. However, I can trivially do a runtime verification for it using Spec:
    (s/def ::sortable (s/coll-of number?))
    (s/def ::sorted #(or (empty? %) (apply <= %)))
    (s/fdef mysort
      :args (s/cat :s ::sortable)
      :ret ::sorted
      :fn (fn [{:keys [args ret]}]
            (and (= (count ret) (-> args :s count))
                 (empty? (difference (-> args :s set) (set ret))))))
The above code ensures that the function is doing exactly what was intended and provides me with a useful specification. Just like types I can use Spec to derive the solution, but unlike types I don't have to fight with it when I'm still not sure what the shape of the solution is going to be.
For someone only passingly familiar with Spec, what's the benefit of Spec over just using a property based testing framework like Haskell's QuickCheck (and I think Clojure's test.check)?
I can encode all those invariants as QuickCheck properties and have them automatically tested against random inputs on every test run. It's still all runtime verification, but with random inputs I actually have more confidence of hitting a corner case than with just asserting during regular program runs or hand written example tests.
Also, with enough heavy lifting you can actually encode all of that in the types in a dependently typed language like Idris [1]. And while a machine checked proof of your sorting algorithm is nice, it might be hitting the diminishing returns point the article mentions over just using property tests.
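The property-based approach described in this comment can be sketched without any testing library. This is a minimal illustration in Python, not QuickCheck or test.check themselves: `check_sort`, `is_sorted`, `same_elements` and `drops_duplicates` are hypothetical names, and the generator is just the standard `random` module. It checks the two properties from the Spec example (output is ordered, output holds the same elements as the input), here comparing multiplicities as well.

```python
import random
from collections import Counter


def is_sorted(xs):
    """True when every element is <= its successor."""
    return all(a <= b for a, b in zip(xs, xs[1:]))


def same_elements(xs, ys):
    """True when both lists hold the same elements with the same multiplicities."""
    return Counter(xs) == Counter(ys)


def check_sort(sort_fn, trials=500, seed=0):
    """Run sort_fn on random integer lists; return the first failing input, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        ys = sort_fn(list(xs))
        if not (is_sorted(ys) and same_elements(xs, ys)):
            return xs
    return None


def drops_duplicates(xs):
    """A deliberately wrong 'sort' that silently removes duplicates."""
    return sorted(set(xs))


print(check_sort(sorted))            # no counterexample expected
print(check_sort(drops_duplicates))  # prints an input that exposes the bug
```

The broken variant passes the ordering property and fails only the element-preservation one, which is the kind of corner case random inputs tend to surface quickly.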
Think of QuickCheck/test.check but with better integration into the language.
This makes it much more likely to be used but it's fundamentally the same set of ideas.
A really cool idea I'm playing with at the moment is using fuzzing/static analysis based generators to feed spec/test.check.
I think it will help get past the, imo, biggest issue with generators in that they can miss exceptional cases in the code.
E.g. if (x == "jack and Jill") { exceptional case } is unlikely to be triggered with standard generators but "easy" for static analysis tools to solve.
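A rough sketch of that idea, in Python rather than Clojure: random strings essentially never hit the magic value, while a very crude "static analysis" that harvests string literals from the source does. Everything here is hypothetical illustration (`handle`, `random_strings`, `harvested_literals`); real tools use far more capable techniques such as symbolic execution.

```python
import ast
import random
import string

# The function under test, kept as source text so it can be both run and analysed.
SOURCE = '''
def handle(x):
    if x == "jack and Jill":
        return "exceptional"
    return "normal"
'''

namespace = {}
exec(SOURCE, namespace)
handle = namespace["handle"]


def random_strings(n, rng):
    """A standard-style generator: random letters and spaces, length 0 to 20."""
    alphabet = string.ascii_letters + " "
    for _ in range(n):
        yield "".join(rng.choice(alphabet) for _ in range(rng.randint(0, 20)))


def harvested_literals(source):
    """Collect every string constant in the source as a candidate input."""
    return [node.value for node in ast.walk(ast.parse(source))
            if isinstance(node, ast.Constant) and isinstance(node.value, str)]


rng = random.Random(0)
random_hits = sum(handle(s) == "exceptional" for s in random_strings(10_000, rng))
seeded_hits = sum(handle(s) == "exceptional" for s in harvested_literals(SOURCE))
print(random_hits, seeded_hits)
```

Feeding the harvested literals into the generator's seed corpus is one plausible way to combine the two approaches.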
> Also, with enough heavy lifting you can actually encode all of that in the types in a dependently typed language like Idris [1]
In theory. In practice it is multiple orders of magnitude harder to prove properties in Idris than it is to spec them using property based testing.
To add to what sheepmullet said, I think the insertion sort example is exactly the problem with advanced type systems. It takes nearly 300 lines of code to provide the specification.
Somebody has to be able to read that specification and understand that it's correct in a semantic sense. Ultimately, the specification itself becomes a full blown program that the type checker executes. So, now you run a program to try and verify aspects of your original program, but how do you verify that the specification itself is correct?
At some point a human has to be able to read the code and decide that it matches the intent. This step can't be automated, and I certainly don't think the Idris example improves things. I'd argue that it's far easier to tell that this version is correct:
    fun insertionSort(arr, int n) {
      var i, key, j;
      for (i = 1; i < n; i++) {
        key = arr[i];
        j = i-1;
        while (j >= 0 && arr[j] > key) {
          arr[j+1] = arr[j];
          j = j-1;
        }
        arr[j+1] = key;
      }
    }
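For anyone who wants to run it, here is a sketch of the same loop in Python, checked against the built-in `sorted` on random inputs. The function name `insertion_sort` and the choice to return the list are assumptions made for the illustration.

```python
import random


def insertion_sort(arr):
    """In-place insertion sort mirroring the loop structure of the snippet above."""
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        # Shift larger elements one slot to the right to make room for key.
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key
    return arr


rng = random.Random(1)
for _ in range(200):
    data = [rng.randint(0, 99) for _ in range(rng.randint(0, 15))]
    assert insertion_sort(list(data)) == sorted(data)
```

Reading the twelve lines and running a quick randomized comparison is a different kind of assurance from a proof, which is the trade-off being argued here.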
I really enjoyed how the analysis shows that different developers can have different equally valid opinions on this topic. It's where you place your values and preferences of programming, modified by what you are programming. The failure state of a cat photo sharing web app likely isn't as dramatic or important as that of a financial system or driverless car code. Great article.
Static typing reduces the time you spend on debugging. Automatically reducing errors in code is not just for reducing errors in the resulting program. It also greatly reduces the time you spend on hunting bugs, especially if you have a poorly designed type system where errors are reported far from their origin. Null, interface{}, NaN etc. propagate errors and thus give you a stacktrace that is worthless when it finally fails. It's a waste of time.
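A small Python sketch of that "fails far from its origin" effect. The names (`USERS`, `find_name`, `build_greeting`, `render`) are made up for the illustration: the missing value is produced in one function and only blows up two calls later.

```python
import math
from typing import Optional

USERS = {1: "alice"}


def find_name(user_id: int) -> Optional[str]:
    """Returns None for unknown ids: this is where the problem originates."""
    return USERS.get(user_id)


def build_greeting(name):
    # Unannotated on purpose: nothing here objects to None.
    return {"name": name}


def render(greeting):
    # The failure surfaces here, two calls away from find_name.
    return "Hello, " + greeting["name"].upper()


try:
    render(build_greeting(find_name(2)))
except AttributeError as exc:
    failure = str(exc)
    print(failure)

# NaN behaves similarly: it flows through arithmetic without raising.
average = (float("nan") + 1.0) / 2
print(math.isnan(average))
```

If `build_greeting` declared `name: str`, a static checker such as mypy would be expected to complain at the call site that passes an `Optional[str]`, which is right next to the origin.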
In my experience, the time saved from writing in a statically typed language where the compiler catches the bugs for you is made up by having to work more closely with the compiler, typically write more code (type annotations and other things) and in general spend that same time on compile-time rather than run-time bug hunting. Dynamically typed languages typically involve a lot less code, which is time gained.
That both forms of languages are popular shows that there are benefits in overall productivity to each; they are just different benefits.
The thing is that errors at compile time get reported almost instantly, but errors at runtime might be reported hours after you started your program if you are unlucky.
That's entirely correct, and one of the tradeoffs.
However, in a statically-typed language, you must satisfy the type checker for everything, which adds development time. In reality, there might be a small percentage of functions in your code base for which errors (either compile-time or run-time) would likely crop up, yet you must pay that cost for 100% of them.
So that's really where the debate comes from.
Dynamically typed languages can get around this problem with generative testing (in Clojure's case), which allows very fine-tuned aspects of your system's requirements to be automatically tested before run-time without writing tests by hand, and which offers some of the same confidence as a compiler.
May I ask what your experience of statically-typed languages has been? I find that most people have had a common experience where they didn't get to work in tandem with a fast compiler and a succinct language which inferred most or all types for them, and so their perceptions are coloured by that.
Swift, C++, Haskell (a little), Elm (more than a little)
May I recommend giving ReasonML a try? Trust me when I say, you've never seen a faster compiler (except maybe C). Try writing a little experiment in ReasonReact and seeing the speed for yourself: https://reasonml.github.io/reason-react/
My theory is that there are different psychologies of developers. I always liked how C++ (now C#) checked a lot of stuff at compile time and I rely heavily on the compiler. On the other hand I know very good devs who hate this and prefer dynamic languages. Their whole style is geared towards dynamic languages where mine is geared towards as strict as possible typing.
I think the key is not to confuse both approaches and leverage the strengths of each to the max.
Also depends on your problem domain. If you have good test coverage but you're parsing strings found in the wild, you're going to spend a lot more time "debugging" your assumptions than AttributeErrors which would be caught by typing. Bug free code is not always the same as working code.
Disclaimer: Python user scarred by email header RFC violations
I think there are two kinds of statically typed languages: the ones where static typing is for helping the compiler (e.g. C) and the ones where it’s for helping the user (e.g. Typescript).
I think Go, with its lack of algebraic types, is more of the first kind, helping the compiler, so I wouldn’t use it as a good example of static typing.
Haskell, OCaml and Rust would make excellent case studies, but we have nothing to compare against.
So IMHO the best way to compare static typing vs dynamic typing is by comparing Typescript against JS. And in my experience the difference when writing code is huge. It completely eliminates the code-try-fix cycle during development.
The effort to fix a defect is proportional to the time between the introduction of the defect and its discovery.
This is a basic intuition behind all good practices, including CI, QA, etc.
Types allow one to discover program defects (even generalized ones, when using some programming languages) in (almost) the shortest possible amount of time.
Types also allow one to constrain effects of various kinds (again, use a good language for this), and that constraint can make code simpler, safer and, in the end, more performant.
Also, retaining dynamic types at runtime enables you to find type errors that the static type system could not discover, or that were worked around. Language implementations that discard dynamic types make it harder to find defects.
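A sketch of the "worked around" case in Python, where `typing.cast` silences the static checker but performs no conversion, and the runtime type tag is what finally exposes the problem. `parse_port` and `checked_port` are hypothetical helpers.

```python
from typing import cast


def parse_port(raw: object) -> int:
    # cast() only informs the static checker; it performs no conversion or check.
    return cast(int, raw)


port = parse_port("8080")  # statically an int, actually still a str


def checked_port(raw: object) -> int:
    """Use the runtime type tag to catch what the cast hid."""
    if not isinstance(raw, int):
        raise TypeError(f"expected int, got {type(raw).__name__}")
    return raw


print(type(port).__name__)
```

In an implementation that erased type tags at runtime, the second function could not be written this way.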
Algebraic data types allow you to get any amount of dynamism you need.
Have you familiarized yourself with Haskell?
The two languages I develop in are Javascript and Swift. Couldn't be more different in type safety.
I love everything about Swift except the compile times and occasionally inscrutable compile error messages.
I love the interactivity of Javascript, but despise the lack of types, it's like I'm sketching out the idea for a program instead of directly defining what it is. And the lack of types burns me occasionally.
What are the costs of statically typed languages? The author stated "thinking about the correct types" and "increases compile times" among some other, weaker (imo) costs. What is wrong with "thinking about the correct types"? You are thinking about the same things in a dynamic language, right? For example, say you need to know about things that are "thennable". Whether you are in a statically typed language or not, you are still checking for the same thing: does it have the then() method? The tradeoff is in reading vs implementing code. With a statically typed language, you can easily search for implementers of the Thennable interface and you are guaranteed to be shown every implementer. The downside is that you have to write a few more lines of code to satisfy the static typing. With a dynamically typed language, you have to find the implementers yourself, but you can just slap a then method on anything and it will work. I am biased toward static typing so I am interested to hear counter points.
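Both sides of that "thennable" comparison can be sketched in Python, which supports either style. The `Thennable` protocol and the `Immediate` class are hypothetical names for this illustration; `typing.Protocol` and `runtime_checkable` are standard library features (Python 3.8+).

```python
from typing import Callable, Protocol, runtime_checkable


@runtime_checkable
class Thennable(Protocol):
    """The written-down version of 'has a then() method'."""

    def then(self, callback: Callable[[object], object]) -> object: ...


class Immediate:
    """Hypothetical implementer: holds a value and applies callbacks right away."""

    def __init__(self, value):
        self.value = value

    def then(self, callback):
        return Immediate(callback(self.value))


def is_thennable_duck(obj) -> bool:
    # The dynamic-style check: does it have a callable `then`?
    return callable(getattr(obj, "then", None))


result = Immediate(2).then(lambda v: v + 1)
print(result.value, isinstance(result, Thennable), is_thennable_duck(42))
```

Note that a Protocol is structural, so `Immediate` never names `Thennable`. The "search for every implementer" benefit mentioned above comes from nominal interfaces, where each implementer has to declare the interface explicitly.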
One very simple and significant cost is developer time. It simply takes less time to write code in a dynamically-typed language. You don't have a compiler to please, you don't write extra code to massage types, annotate types, etc, and most dynamically typed languages are pretty elegant (e.g. Clojure), where you can pack a lot of punch in just a few characters.
So the trade off is: static typing gives you more compile-time certainty, but at a cost of spending more time developing your code. Dynamic typing gets you to a working product or prototype typically much much faster, but with added run-time debugging.
Each has its benefits and costs.
In my experience, there is no doubt that dynamically typed languages are faster-to-production than statically-typed. This doesn't mean that I don't admire static typing, though, because most developers appreciate some degree of purity in their work.
I like for example Refined
https://github.com/fthomas/refined
not only for the static checking,
scala> val i: Int Refined Positive = -5
<console>:22: error: Predicate failed: (-5 > 0).
val i: Int Refined Positive = -5
but the expressive descriptions of a domain model.
Sometimes I wonder if we're arguing the wrong thing, where we think we're arguing static vs dynamic typing but what we're _actually_ arguing is static vs no-static typing. Haskell is static and not dynamic. Ruby is dynamic but not static. Python, starting with 3.5, is sorta both. C# is definitely both.
All static typing means is that type information exists at compile time. All dynamic typing means is that type information exists at runtime. You generally need _at least_ one of the two, and the benefits each gives you are partially hobbled by the drawbacks of the other, so most dynamic languages choose not to have static typing. I also feel that dynamic languages don't really lean into dynamic typing benefits, though, which is why this becomes more "static versus no static".
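The "Python is sorta both" case makes the distinction easy to see. In this sketch (the function `scale` is a made-up example) the annotations are the information a static checker would read before the program runs, while `type()` reports what a value actually is while it runs. The two can disagree because the interpreter does not enforce annotations.

```python
from typing import get_type_hints


def scale(value: int, factor: int) -> int:
    return value * factor


static_view = get_type_hints(scale)    # what the annotations declare
result = scale("ab", 3)                # annotations are not enforced at runtime
runtime_view = type(result).__name__   # what the value actually is

print(static_view)
print(result, runtime_view)
```

A static checker would be expected to reject the `scale("ab", 3)` call, while the runtime happily repeats the string.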
One example of leaning in: J allows for some absolutely crazy array transformations. I don't really see how it could be easily statically-typed without losing almost all of its benefits.
Honestly, I think you've nailed it.
The key is balance. Pure static does create a lot of extra up front cruft in exchange for long term safety. Pure dynamic does create a much faster path to features at the expense of a lot of long term confusion.
The reason we have this conversation is because of web applications where everything is travelling over the wire as a string, consumed by the web server as a string, converted by whatever language the server is in into something that it can use, 9/10 times validated to make sure it reflects what we need, and then stuffed into a database.
In the case that you're using a SQL database, a huge number of people are enforcing types at the database layer and the validation layer. Since so much is "consume and store" followed by "read and return" the types at that server layer end up creating a ton of extra work that in many cases shows little to no benefit.
At the point that you're doing more in the server layer, suddenly it becomes a lot more useful. At the point you're working on desktop, mobile, embedded, console, computational and graphics...static is going to provide more value.
At the point you're working on web in front of a database, the value is much more questionable.
This is really one of the reasons I'm such a huge Elixir fan because IMO it strikes that perfect balance where I live...on the server in front of a database. You get static basic types with automatic checking via dialyzer and you can make it stricter as necessary.
There is one aspect to this debate that is worth pointing out. What about generative testing, which is possible in static or dynamically typed languages? The article mentions that testing is perhaps more important in a dynamically typed language since there is less compiler support. But for example, Clojure rolled out the very clever Clojure.spec library that allows you to precisely specify all details relating to function arguments, data structures, etc, in even more fine-tuned methodology than just types; you can specify that the second argument to a function must be larger than the first, or that a function should only return a value between 5 and 10, etc. These "specs" have the interesting property of being run-time checked or compile-time checked in the form of automatic tests, which can generate inputs based on the specs.
In such a case, the line between these two type environments narrows.
Clojure.spec is very clever, but it can be exactly duplicated in a statically-typed language by unit or property testing. It doesn't bring anything to the table that is totally a superset of static typing.
> In such a case, the line between these two type environments narrows.
Not really. Static types still offer you total proofs of the properties you encode as types, not just experimental results of tests.
Generative testing is just one application of Clojure.spec. It does more than just aid in testing. It doubles as a runtime contract system, a data coercion system, and some folks are using it for compile-time checks as well (not in the testing sense, though I haven't read up on how they are doing that).
It is not a proof-like system, but outside of dependent typing, static typing does not catch value-related bugs, but Clojure.spec can. In a static type system, how easy would it be to exactly specify and guarantee that a function's second parameter is of a higher value than its first, or that a function's output is an integer between 5 and 50, etc? Clojure.spec is just predicate functions composed together to define the flow of data in a program, and those compositions can be used in a variety of ways.
> ... static typing does not catch value-related bugs, but Clojure.spec can.
Can you provide an example?
> In a static type system, how easily would it be to exactly specify and guarantee that a function's second parameter is of a higher value than its first, or that a function's output is an integer between 5 and 50, etc?
Scala:
    def foo(param1: Int, param2: Int): Int = {
      require(param2 > param1, "Param2 must > param1")
      param2 - param1 ensuring { result => result >= 5 && result <= 50 }
    }
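For comparison, the same two constraints as plain runtime checks in Python. This is a sketch, not a port of any library: the exception types and messages are choices made for the illustration. As with `require` and `ensuring` in the Scala version, the checks run when the function is called rather than at compile time.

```python
def foo(param1: int, param2: int) -> int:
    # Precondition, analogous to Scala's `require`.
    if not param2 > param1:
        raise ValueError("param2 must be greater than param1")
    result = param2 - param1
    # Postcondition, analogous to `ensuring`.
    if not 5 <= result <= 50:
        raise AssertionError(f"result {result} outside [5, 50]")
    return result


print(foo(1, 10))
```

Either way, the value-level constraints are enforced dynamically. Getting them rejected before the program runs needs something like refinement or dependent types.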
Those line charts are totally made up, with arguments pulled out of thin air to support this line:
> "Go reaps probably upwards of 90% of the benefits you can get from static typing"
That 90% number is totally made up as well. I don't see evidence that the author actually worked with Haskell, or Idris, or Agda, these being the three static languages mentioned. Article is basically hyperbole.
If I am to pull numbers out of my ass, I would say that Go reaps only 10% of the benefits you get with static typing. This is an educated guess, because:
1. it gives you no way to turn a type name into a value (i.e. what you get with type classes or implicit parameters), therefore many abstractions are out of reach
2. no generics means you can't abstract over higher order functions without dropping all notions of type safety
3. goes without saying that it has no higher kinded types, meaning that expressing abstractions over M[_] containers is impossible even with code generation
So there are many abstractions that Go cannot express because you lose all type safety, therefore developers simply don't express those abstractions, resorting to copy/pasting and writing the same freaking for-loop over and over again.
This is a perfect example of the Blub paradox btw. The author cannot imagine the abstractions that are impossible in Go, therefore he reaches the conclusion that the instances in which Go code succumbs to interface{} usage are acceptable.
> "It requires more upfront investment in thinking about the correct types."
This is in general a myth. In dynamic languages you still think about the shape of the data all the time, except that you can't write it down, you don't have a compiler to check it for you, you don't have an IDE to help you, so you have to load it in your head and keep it there, which is a real PITA.
Of course, in OOP languages with manifest typing (e.g. Java, C#) you don't get full type inference, which does make you think about type names. But those are lesser languages, just like Go and if you want to see what a static type system can do, then the minimum should be Haskell or OCaml.
> "It increases compile times and thus the change-compile-test-repeat cycle."
This is true, but irrelevant.
With a good static language you don't need to test that often. With a good static type system you get certain guarantees, increasing your confidence in the process.
With a dynamic language you really, really need to run your code often, because remember, the shape of the data and the APIs are all in your head, there's no compiler to help, so you need to validate that what you have in your head is valid, for each new line of code.
In other words this is an unfair comparison. With a good static language you really don't need to run the code that often.
> "It makes for a steeper learning curve."
The actual learning is in fact the same, the curve might be steeper, but that's only because with dynamic languages people end up being superficial about the way they work, leading to more defects and effort.
In the long run with a dynamic language you have to learn best practices, patterns, etc. things that you don't necessarily need with a static type system because you don't have the same potential for shooting yourself in the foot.
> "And more often than we like to admit, the error messages a compiler will give us will decline in usefulness as the power of a type system increases."
This is absolutely false, the more static guarantees a type system provides, the more compile time errors you get, and a compile time error will happen where the mistake is actually made, whereas a runtime error can happen far away, like a freaking butterfly effect, sometimes in production instead of crashing your build. So whenever you have the choice, always choose compile-time errors.
The author addresses that point extensively in the second half of the article, beginning around this part:
Now if we are to accept all of this, that opens up a different question: If we are indeed searching for that sweet spot, how do we explain the vast differences in strength of type systems that we use in practice? The answer of course is simple (and I'm sure many of you have already typed it up in an angry response). The curves I drew above are completely made up. Given how hard it is to do empirical research in this space and to actually quantify the measures I used here, it stands to reason that their shape is very much up for interpretation.
>Those line charts are totally made up, with arguments pulled out of thin air to support this line
The line charts are there to illustrate the point, not as proof. Like not all arguments are axiomatic proofs, not all charts are plotting data.
>That 90% number is totally made up as well of course.
Yeah, we got that from reading TFA already.
>This is a perfect example of the Blub paradox btw. The author cannot imagine the abstractions that are impossible in Go, therefore he reaches the conclusion that the instances in which Go code succumbs to interface{} usage are acceptable
The author actually not only can imagine them, but plots them (e.g. how for some languages/use cases the sweet spot will be 100% type help from the language), and explains why he thinks that interface{} can be acceptable in some cases.
>This is true, but irrelevant.
Irrelevant for you maybe. For others (and for prototyping/early exploratory use cases in general) the quick feedback cycle beats the guarantees from Haskell like types. See Bret Victor.
>In the long run with a dynamic language you have to learn best practices, patterns, etc. things that you don't necessarily need with a static type system because you don't have the same potential for shooting yourself in the foot.
I'd say that most people's experiences with typed languages like C++, Java, C# etc run counter to that. Most that I've seen anyway. Same, or even more so, for Haskell -- there are literally tons of stuff to learn, to the point it throws people off.
It's far more useful to implement validation and type checking via introspection and interrogation of type, quantity, structure, size, or some other property at runtime in a dynamic programming language than to pedantically have to type all your variables. Most interesting types are far from the basics of different size numbers, string and objects anyway. It's better to trade a fast and quick runtime type error than a lengthy compile-time type checking process, because less code needs to be evaluated at run-time to expose the type error. See the "Worse is better" principle in language design.
Wouldn't it be great if we can use the computer to figure out what the types should be by a runtime evaluation of the code and save precious human time for things only humans can do?
I don't have to think or decorate my speech with types of noun, verb, pronoun, adjective etc. when I speak, but I'm still able to communicate very effectively, because your brain is automatically adding the correct type information based on context that helps you understand what I'm saying, even with words that have multiple types. Granted, natural language is different than programming language but there was once a trend to try and make programming languages more like human language, not less so.
> It's far more useful to implement validation and type checking via ... runtime in a dynamic programming language than to pedantically have to type all your variables.
How is that? I'm not seeing the increased utility.
> It's better to trade a fast and quick runtime type error....
What if the runtime type error crashes your app in production and loses your company money? What if it's something that slipped through your end-to-end integration testing because certain unlikely conditions never got covered, but they happened in production?
> ... than a lengthy compile-time type checking process,...
There are several modern compilers which are quite fast: D, OCaml, Java.
> ... because less code needs to be evaluated at run-time to expose the type error.
With static type checking, no code needs to be evaluated at runtime to expose a type error. Does dynamic typechecking offer a reduction over that?
> Wouldn't it be great if we can use the computer to figure out what the types should be by a runtime evaluation of the code and save precious human time for things only humans can do?
Wouldn't it be great if the computer would figure out the types at compile time and save us from having to manually input them? Well, the computer can do that, thanks to type inference. Several popular languages offer full, powerful type inference.
https://www.theatlantic.com/technology/archive/2017/09/savin...
Software failures are failures of understanding, and of imagination.
The problem is that programmers are having a hard time keeping up with their own creations.
dynamic typing simply doesn't scale.
Languages like F# give a nice sweet spot between static typing and dynamic typing. It has Type Providers that "generate" code on the fly as you are typing. You don't need to specify all the types, it will infer many types for you. So, you almost feel like you are writing in a dynamic language, but it tells you if you are writing something incorrectly.
I would not consider a language to be modern unless it has Type Providers; I consider this to be such an essential feature. I believe Idris and F# are the only languages that have it. People are trying to push TypeScript to add it - who knows if it will happen.
Many are saying that if you have a dynamic language you just need to be disciplined and write many tests. With good static typed languages like F# you can't even write tests on certain business logic since the way you write your code you make "impossible states impossible", see https://www.youtube.com/watch?v=IcgmSRJHu_8
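The "impossible states" idea is not specific to F# or Elm. Here is a rough sketch in Python, with hypothetical names (`Loading`, `Failed`, `Loaded`, `describe`): instead of one record with a `loading` flag plus optional `error` and `data` fields, where nonsense combinations can be constructed, each state is its own type carrying only the fields that make sense for it.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Loading:
    pass


@dataclass(frozen=True)
class Failed:
    message: str


@dataclass(frozen=True)
class Loaded:
    items: tuple


# A request is in exactly one of these states; "loading with an error and
# data at the same time" simply cannot be constructed.
RequestState = Union[Loading, Failed, Loaded]


def describe(state: RequestState) -> str:
    if isinstance(state, Loading):
        return "loading"
    if isinstance(state, Failed):
        return f"error: {state.message}"
    return f"{len(state.items)} items"


print(describe(Failed("timeout")))
```

There is no test to write for the invalid combinations because there is no way to build them. In Python that only holds by convention plus a checker; in F# or Elm the compiler enforces it.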
1. performance dominates (like 80:20)
2. tooling
3. doc (becomes crucial on large projects)
4. correctness
Formal correctness doesn't really matter. Anecdotally (since that's really all we have), I find in practice, very few bugs are caught by the type-checker.
Further, code is usually not typed as accurately as the language allows. i.e. the degree of type-checking is a function of the code; the language only provides a maximum. In a sense, every value has a type, even if it's not formally specified or even considered by the programmer, in the same sense that every program has a formal specification, even if it's not formally specified.
Upfront design is the price. Which is difficult to pay when the requirements are changing and/or not yet known.
What language in specific are you applying this to? I.e. what is the type checker that is catching few bugs?
Like other commenters, I disagree there are diminishing returns to static typing itself, but rather diminishing returns to proper engineering in certain cases (i.e. do something as perfectly as possible).
By adding types (and in the extreme, dependent types), you're allowing the compiler to prove more things about the code (to check correctness or generate more optimal code). If you actually need to prove more things, then it's better to leave that to a compiler rather than a human.
Of course, if you're writing e.g. a web scraping script, you don't need these guarantees and then you don't have to care about types. But the better engineering you want, the more static typing will help, and there are no diminishing returns.
It bothers me that types as representation of hardware constraints are mixed up with types as a machine readable subset of validation.
It makes the higher level types seem more transcendental than they are, and also seems to put actual validation on a second rate level. End of the day if an argument is the right scalar or interface you'll get the same result at runtime whether you hinted it -- for one's quality of life improvements -- or checked it with some boilerplate validation. Worst case scenario people will forgo encoding known stricter constraints after generally hinting the expected type.
I've generally felt that each shines in different areas. Static typing is best for lower-level infrastructure and shared APIs, while dynamic is better for gluing these all together toward the "top" of the stack, closer to the UI and biz logic. The problem is that languages tend to be all one or the other so that we have to make a choice. What's needed is a language (or language interface convention) that can straddle both. A given class or library can be "locked down" type-wise to various degrees as needed.
My 2 cents: dynamic typing works okay for library consumers. For libraries themselves though, or platform code, the disadvantages are real. It is harder to fix and extend code when you don't know who calls it, how they call it, what they get in return. Complex code becomes littered with 'black holes'. That is a big part of why facebook implemented Hack. I heard a talk by one of the developers. Even now there are PHP blackholes in the Facebook code base that they can't migrate to Hack.
100% statically-type-checked code != 100% bug-free code. That would require solving the halting problem. So you have to test everything anyway if you need high reliability.
This argument is incorrect. The "halting problem" is the problem of determining if an arbitrary program halts. It is not impossible to prove, and verify mechanically, that a particular program halts.
The state of the art is not up to proving every desirable property of every program that we would like to build. But that has nothing much to do with computability. And some extremely impressive things have been done, like the seL4 separation kernel, which has static proofs of, among other things, confidentiality, integrity, and timeliness, and a proof that its binary code is a correct translation of its source.
> It is not impossible to prove, and verify mechanically, that a particular program halts.
OK, let's put that to the test. Here is a particular program:
    let x = 6
    let y = 3
    while true:
      if y>x then halt
      if is_prime(y) and is_prime(x-y) then
        x = x + 2
        y = 3
      else
        y = y + 2
      endif
Can you tell me if it halts or not?
> The state of the art is not up to proving every desirable property of every program that we would like to build.
Isn't that exactly the same as what I said?
> But that has nothing much to do with computability.
What does it have to do with then?
> some extremely impressive things have been done
Yes, in some very particular cases. But note that even a proof of correctness is not a guarantee that the code is bug-free.
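The program in question searches for a counterexample to Goldbach's conjecture. A bounded version can actually be run. This is a sketch in Python: `is_prime`, `first_counterexample` and the `limit` parameter are additions for the illustration, and the original has no bound, which is the whole point of the question.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for the small numbers used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def first_counterexample(limit: int):
    """Return the first even x in [6, limit] that is not a sum of two odd primes, else None."""
    x, y = 6, 3
    while x <= limit:
        if y > x:
            return x           # the unbounded original would halt here
        if is_prime(y) and is_prime(x - y):
            x, y = x + 2, 3    # decomposition found; move to the next even number
        else:
            y += 2
    return None


print(first_counterexample(10_000))
```

With the bound in place, termination is obvious. Without it, the loop halts exactly when a counterexample exists, so deciding whether it halts amounts to settling the conjecture.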
I think you have missed my point. I am not saying that humans are able to solve the halting problem! Nor am I saying that static verification is always better than testing. I am saying that you don't need a halting oracle to express and verify arbitrary properties in a static type system, because a static type system can and will reject programs that would not have type errors dynamically.
If you write this program in a statically checked language:
    let x : int = 6
    let y : int = 3
    while true:
      if y>x then break
      if is_prime(y) and is_prime(x-y) then
        x = x + 2
        y = 3
      else
        y = y + 2
      endif
    x = "foobar"
I can tell you that it will not type check. And for the same reason, if you write the same program in a language that can express termination and claim that it terminates, the program will not type check until you have supplied a proof of (edit: the negation of!) Goldbach's conjecture in a form that the type system understands.
> you don't need a halting oracle to express and verify arbitrary properties in a static type system
Replace the word "arbitrary" with "some" and I'll agree with you. There are some things a static type system will tell you. Some of those things are even useful things to know. But there are some things a static type system will not tell you, and cannot tell you, and some of those things are useful things to know too.
Furthermore, the way static type systems are used in practice, they don't just tell you things. They will actually refuse to let you run the program unless it conforms to some preconceived notion of correctness that is built in to the type system. Personally, that's the part that rubs me the wrong way. It is sometimes useful to me to run a program even if I know that it has certain kinds of errors in it.
> it will not type check
I'm pretty sure it would. Why do you think it would not?
> I'm pretty sure it would. Why do you think it would not?
Languages like Coq require you to prove a function halts before it will compile. Yes, for an arbitrary function it can be arbitrarily difficult or impossible to prove termination. In most cases though, termination proofs aren't that complex (e.g. "it halts because the collection gets smaller each recursive call").
Besides, your argument is basically sounding like "because you can't prove all functions halt it's a waste of time proving any functions halt". See the seL4 OS for an impressive example of what formal proofs can do.
> Languages like Coq require you to prove a function halts before it will compile.
Well, that's incredibly stupid. That means you can't write, for example, a web server in Coq unless you intentionally introduce undesirable behavior to satisfy the compiler.
> because you can't prove all functions halt it's a waste of time proving any functions halt
No. That's obviously a straw man. Can you please consider the possibility that I might not be a complete idiot?
My argument is: because the halting problem is undecidable, there are an infinite number of properties of programs that are also undecidable. So there are only two possibilities:
1. None of the infinite undecidable properties of programs are things we will ever care about or
2. There are properties of interest that cannot be decided by static typing
Which of those is the case is an empirical question but I submit that #2 is much more likely to be the case. Therefore, static typing cannot obviate the need to be prepared for your program to exhibit unexpected behavior at run time except in the most trivial cases.
> Well, that's incredibly stupid. That means you can't write, for example, a web server in Coq unless you intentionally introduce undesirable behavior to satisfy the compiler.
There are ways around it (e.g. proving progress is always going to be made instead of termination) and there's a web server in Coq: http://coq-blog.clarus.me/pluto-a-first-concurrent-web-serve...
> 2. There are properties of interest that cannot be decided by static typing
>
> Which of those is the case is an empirical question but I submit that #2 is much more likely to be the case. Therefore, static typing cannot obviate the need to be prepared for your program to exhibit unexpected behavior at run time except in the most trivial cases.
Again, look at the seL4 project. It verifies the correctness of an entire OS, showing that formal verification is powerful, practical and useful. Google for all the algorithms that have been formally verified with Coq, Isabelle and other proof assistants.
Why do you think common properties of interest wouldn't be provable? Do you think mathematicians have this issue (there's not a lot of difference when you have expressive enough types)? You yourself must have an intuition about why the properties would be true, so you should be able to write a formal proof of that, although it can be very challenging currently.
> Why do you think common properties of interest wouldn't be provable?
Because proving all common properties of interest is tantamount to proving all interesting mathematical theorems.
> Do you think mathematicians have this issue
Yes, obviously. If they didn't they wouldn't have jobs.
> it can be very challenging currently
Yes indeed, and that is exactly my point. Humans just keep finding new and more complicated things to care about. Math doesn't converge.
I intended the program to clearly have a type error on the last line, where the string "foobar" is assigned to a variable that has been declared to be of integer type. (In hindsight, I guess it is ambiguous whether the imaginary pseudo-language we are communicating in types variables, as most static languages do, or values, as most dynamic languages do, and in the latter case it would type check. I should have done something that is a type error in either case, like `x = x / "foo"`.) My intent was to show that even though the desirable property 'does not encounter type errors in execution' is reducible to the halting problem in general, and in this particular case, is reducible to a famous conjecture, static type checkers can calmly and soundly verify that programs have this property! They do so by verifying a strictly stronger property, necessarily rejecting some programs that have the desirable property but accepting only programs that have it. A dependently typed language which can express properties like termination does the same thing, rejecting some programs that terminate but accepting only programs that terminate. In particular, they will accept only programs where YOU provide them with (at least an adequate sketch of) a formal proof of the property.
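To make the "verify a strictly stronger property" idea concrete, here is a toy checker in Python. It is only a sketch under loud assumptions: `find_type_errors` is a hypothetical helper that understands nothing except "a string literal is assigned to a name annotated as `int`", and the program is a Python transliteration of the pseudocode. It parses the source and never runs it, so its verdict does not depend on whether the loop terminates.

```python
import ast

# Python transliteration of the pseudocode; is_prime is never called
# because the source is only parsed, not executed.
SOURCE = '''
x: int = 6
y: int = 3
while True:
    if y > x:
        break
    if is_prime(y) and is_prime(x - y):
        x = x + 2
        y = 3
    else:
        y = y + 2
x = "foobar"
'''


def find_type_errors(source: str) -> list[str]:
    """Flag string literals assigned to names annotated as int."""
    tree = ast.parse(source)
    declared_int = set()
    # First pass: collect every name declared with an `int` annotation.
    for node in ast.walk(tree):
        if (isinstance(node, ast.AnnAssign)
                and isinstance(node.target, ast.Name)
                and isinstance(node.annotation, ast.Name)
                and node.annotation.id == "int"):
            declared_int.add(node.target.id)
    # Second pass: report plain assignments of a str constant to those names.
    errors = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if (isinstance(target, ast.Name)
                        and target.id in declared_int
                        and isinstance(node.value, ast.Constant)
                        and isinstance(node.value.value, str)):
                    errors.append(
                        f"line {node.lineno}: str assigned to int variable {target.id!r}"
                    )
    return errors
```

The checker rejects the program even though, if Goldbach's conjecture holds, the offending line is never reached. That is the conservative trade described above: some programs that would never hit the error at run time are rejected, and in exchange no accepted program can hit it.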
In general, when writing programs, we ought to develop at least a very informal argument for why they have the properties we want them to have. To the extent that they are correct, these informal arguments could be formalized. It's possible to imagine that with future technology, formalizing these arguments with the assistance of powerful tooling will actually be easier than reasoning about them informally, in the same way that you often find running and inspecting your lisp program easier than reasoning about it without assistance. As far as I know, neither computability theory nor any other theoretical obstacle rules this out; it is just (perhaps far) beyond the state of the art.
Perhaps you have mistaken me for an absolutist advocate of static typing or formal methods, which I guess is reasonable in the context of the thread. I'm not at all: I've experienced plenty of joy and pain (and bugs) in both static and dynamic languages, and have had more experience and success with advanced testing methods than with formal ones. At this moment, I'm writing a testing tool in a dynamic language! I just wanted to clear up a technical misconception, because I have seen fields held back before by widely misunderstood impossibility results.
Happy lisping!
Sorry, I missed the last line of your rewrite.
I think we actually agree here. Static typing can be useful. I just personally find the manner in which it is usually deployed to be unnecessarily annoying.
Any program that is non-trivial, meaning 100K+ lines of code and many developers over 2+ years of time, should be written in a statically typed language.
That really means nothing. 100K+ lines of code is an arbitrary number. For that many lines of C++, a similar Clojure solution to the same problem would be a small fraction of that. And many widely-used Clojure libraries have been in production all over the industry for many years.
That requires some sort of prophecy abilities. Why not play it (type-) safe?
So the article does praise Go, but how is Rust? Does it strike that sweetish spot? Is it a language a startup should use?
Rust's type system is much closer to Haskell than Go, and even advocates of the language will admit that it can sometimes be very difficult to convince the compiler that your program is valid. Compile speed isn't great either, although it's been improving. I would say that Rust is pretty much on the other end of the scale from the author's supposed "sweet spot".
I wonder if they would praise Java, especially ancient Java. It was a very similar language. Easy concurrency was a big selling point. Generics were a matter of casting to Object.
What’s old is new again, though one can hardly imagine cat-v touting the merits of Java.
I guess the only bit I don't really agree with is this:
> upfront investment in thinking about the correct types
being a cost. Surely you have to do this whether the compiler will check your work or not, and if you just don't do the thinking you'll end up with bugs? Isn't this a benefit?
Couldn't these discussions benefit from an inclusion of actual empirical evidence? Here's a list of some such studies: http://danluu.com/empirical-pl/
While the made up graphs might help understanding his reasoning, I think it's way too abstract/philosophical. It's like walking into a dark room making assumptions and arguments based on your belief of what color the walls are.
https://dl.acm.org/citation.cfm?id=2635922
Just ONE study, so don't take too much heed. That said, apparently:
* Strongly typed, statically compiled, functional, and managed memory is least buggy
* Perl is INVERSELY correlated with bugs. Interestingly, Python is positively correlated with bugs. There goes the theory about how Python code looks like running pseudo-code... Snake (Python's, to be more precise) oil?
* Interestingly, unmanaged memory languages (C/C++) have a high association with bugs across the board, rather than just memory bugs.
* Erlang and Go are more prone to concurrency bugs than Javascript ¯\_(ツ)_/¯. Lesson: if you ain't gonna do something well, just ban it.
All in all, interesting paper.
Question for all static or dynamic typing proponents: do you see your language/type-system as a great and scalable way to program large distributed systems in 10 years? 20 years?
Can't we have tools that automatically perform the static typing for us, perhaps in an interactive way?
(I'm not talking about systems which just infer types automatically).
> “And more often than we like to admit, the error messages a compiler will give us will decline in usefulness as the power of a type system increases.”
Can someone explain this?
I would like lots of static typing, even more than we have now, but an ability to turn it off for faster compile times during some parts of development.
In my experience with growing companies, even business-critical code bases get rewritten within 3-4 years to account for flexibility that the previous strongly-typed system just can't handle. A well designed system uses strong types for the "knowns" but allows changes via dynamic types for the "unknowns". Those are the systems that last.
Just a technical point that hints at a significant philosophical idea: The asymptote cannot reach 100% of program behavior in any finitary way. That would solve the halting problem. The x-axis should go off to infinity. Also, it's not a smooth progression. There are huge jumps in expressivity involved here. Going from Java-style types to Hindley-Milner to full System F are all massive jumps in expressivity. There are also incompatible features of type theories. Type theories are a fractal of utility and complexity.
A type system doesn't only describe the behavior of the program you write. It also informs you of how to write a program that does what you want. That's why functional programming pairs so well with static typing, and in my opinion why typed functional languages are gaining more traction than lisp.
How many ways are there to do something in lisp? Pose a feature request to 10 lispers and they'll come back with 11 macros. God knows how those macros compose together. On the other hand, once you have a good abstraction in ML or Haskell it's probably adhering to some simple, composable idea which can be reused again and again. In lisp, it's not so easy.
A static type system that's typing an inexpressive programming construct is kind of a pain because it just gets in the way of whatever simple thing you're trying to do. A powerful programming construct without a type system is difficult to compose because the user will have to understand its dynamics with no help from the compiler and no logical framework in which to reason about the construct.
So, a static type system should be molded to fit the power of what it's typing.
The fact that every Go programmer I talk to has something to say about their company's boilerplate factory for getting around the lack of generics tells me something. This is only a matter of taste to a point. In mathematics there is a vast space of abstract concepts that could be studied, but very few are. It's because there's some difficult-to-grasp idea of what is good, natural mathematics. The same holds in programming: there is a panoply of programming constructs that could be devised, but only some of them are worth investigating. Furthermore, for every programming construct you can think of there's only going to be a relatively small set of natural type systems for it in the whole space of possible type systems.
Generics are a natural type system for interfaces. The idea that interfaces can be abstracted over certain constituents is powerful even if your compiler doesn't support it. If it doesn't, it just means that you have to write your own automated tools for working with generics. It's not pretty.
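As a small illustration of an interface abstracted over one of its constituents, here is a sketch using Python's `typing` module. The `Stack` class is made up for this example; the element type `T` is the constituent being abstracted over.

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class Stack(Generic[T]):
    """A container written once and reused for any element type."""

    def __init__(self) -> None:
        self._items: list[T] = []

    def push(self, item: T) -> None:
        self._items.append(item)

    def pop(self) -> T:
        return self._items.pop()


ints: Stack[int] = Stack()
ints.push(1)
ints.push(2)
# A static checker would be expected to reject ints.push("three");
# the Python runtime itself does not enforce the annotation.
```

Without language support, the same reuse tends to come from code generation or from casting through a universal type, which is the boilerplate the comment above complains about.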
> On the other hand, once you have a good abstraction in ML or Haskell it's probably adhering to some simple, composable idea which can be reused again and again.
The catch there, as is often the case, is hidden in the word "good". Working with text data in Haskell is almost as painful as working with text data in C++, and for much the same reason: the original abstraction is far from ideal for most practical purposes, but became the least common denominator. Everyone and his brother has written a better string abstraction or more powerful regex library or whatever since then, but they're all different.
Consequently, even with the power of generics or typeclasses, you still often see developers just converting to and from the primitive default representation for interoperability. Static typing will at least stop you from screwing that up, which certainly is an advantage over dynamically-typed languages in some situations. However, it apparently hasn't made it any easier for the developer community as a whole to migrate to a better abstraction as the default.
In short, we often don't know what will turn out to be a good abstraction until we've gained a lot of experience, and in the face of changing requirements on most projects, we probably never can know from the start because what works as a useful abstraction might change over time. So while types are useful for checking whatever abstractions we have at any given time, until we've also got techniques for migrating from one to another much more smoothly and on much larger scales than anything I've yet encountered, I think we shouldn't oversell the benefits, particularly in terms of composability.
> Pose a feature request to 10 lispers and they'll come back with 11 macros.
What a ridiculous stereotype. The Clojure community typically maintains the belief that macros are the last resort for things that genuinely justify them. You really shouldn't spread hyperbole like this.
In all honesty that was reckless of me to include. I meant it as gentle ribbing between functional comrades. In truth I admire Scheme-like lisps very much.
> The asymptote cannot reach 100% of program behavior in any finitary way. That would solve the halting problem.
There are languages that enforce termination. They only accept programs that can be shown to terminate through syntactic reasoning (e.g., when processing lists, you only recurse on the tail), or where you can prove termination by other means.
Coq is like this, as is Isabelle, as is F*, as are others. They also provide different kinds of escape hatches if you really want non-terminating things, like processing infinite streams.
This "we can never be sure of anything, because the halting problem" meme is getting boring. Yes, you cannot write the Collatz function in Coq. No, that is not a limitation in the real world.
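A sketch of the two shapes being contrasted, written in Python purely for illustration (Python checks neither; the function names are made up). The first recurses only on the tail of its argument, which is the kind of syntactic pattern a termination checker can accept. The second has no known decreasing measure, so a checker that relies on one has nothing to go on.

```python
def total(xs: list[int]) -> int:
    """Sum a list by recursing only on its tail."""
    if not xs:
        return 0
    head, tail = xs[0], xs[1:]
    # len(tail) == len(xs) - 1, so the argument shrinks on every call
    # and the recursion must bottom out at the empty list.
    return head + total(tail)


def collatz_steps(n: int) -> int:
    """Count Collatz steps from n down to 1, for n >= 1.

    Nothing here is visibly getting smaller: n goes up as well as down,
    and whether the loop ends for every n is an open question.
    """
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps
```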
I'm aware of strongly normalizing systems and the escape hatch of coinductive programming. But when we're talking about the space of all programs, the fundamental limit of incompleteness is important. How else do we judge the merit of a type system except by seeing how it fits into the overall space of computable processes?
There are two ways to see type systems. In the first way you construct terms along with their types, this is called Church style. In the second way, the terms exist before their types and you use types to describe their behavior, this is called Curry style. In particular take System F. In Church style the terms of System F come with their types. In Curry style we see System F types as a way to describe the behavior of untyped lambda terms.
I used to think Church style was more important but lately I've been more partial to Curry style. Programs exist before you type them, type systems tell you how they behave. They also tell you how to construct programs but this is subordinate to the more fundamental descriptive capacity.
> But when we're talking about the space of all programs, the fundamental limit of incompleteness is important.
I agree with this. But I think that usually we are not really talking about all programs. We are talking about useful programs, and those are usually terminating. In theory, not always, see first-order theorem provers; but in practice, we always call a prover with a time limit because nontermination isn't useful.
> Programs exist before you type them, type systems tell you how they behave. They also tell you how to construct programs but this is subordinate to the more fundamental descriptive capacity.
That's an interesting point, and I agree in many respects. I think when programming in a dynamically typed language, I approach things in one style, and in statically typed languages in another style. But specifically for termination, I don't think so. I never want to write a nonterminating program; the termination property for my programs exists (as a requirement) before the program does.
> This "we can never be sure of anything, because the halting problem" meme is getting boring. Yes, you cannot write the Collatz function in Coq. No, that is not a limitation in the real world.
How about, say, a video game? That's something where we reasonably _want_ it to not terminate, because we're primarily interested in its side effects.
That's what the escape hatches are for. In Coq, for instance, you can also write functions processing infinite streams, provided that they also produce something regularly. For example, in a game, you could write a function from an infinite stream of inputs to an infinite stream of frame updates. You cannot write functions that would not terminate while computing the updates for some given frame.
Alternatively, you just write a function to iterate your game world update N times, where you choose N large enough to ensure that your game can run until the heat death of the universe. That's not a nonterminating function, but it's long-running enough for all practical purposes.
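Both suggestions can be sketched in Python, with the caveat that Python enforces none of this; the code only shows the shape. The names `step`, `run` and `run_bounded` and the toy one-dimensional "game" are assumptions made up for the example. Each call to `step` finishes, `run` turns an endless input stream into an endless frame stream, and `run_bounded` is the "iterate N times" variant.

```python
from itertools import cycle, islice
from typing import Iterable, Iterator


def step(position: int, key: str) -> int:
    """One frame update: a plain function that always returns."""
    if key == "left":
        return position - 1
    if key == "right":
        return position + 1
    return position


def run(inputs: Iterable[str], position: int = 0) -> Iterator[int]:
    """Map a (possibly endless) stream of inputs to a stream of frames."""
    for key in inputs:
        position = step(position, key)
        yield position  # something is produced for every input consumed


def run_bounded(inputs: Iterable[str], fuel: int, position: int = 0) -> int:
    """Iterate at most `fuel` times, then stop and return the final state."""
    for key in islice(inputs, fuel):
        position = step(position, key)
    return position


# The input stream never ends, but any finite prefix of frames is available.
frames = list(islice(run(cycle(["right", "right", "left"])), 6))
```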
>How many ways are there to do something in lisp?
Many. That's the whole point -- to let you choose "the way to do something" that applies the best to your circumstances (development time, performance, allowable complexity, etc.)
So you are limited by your own mind and skills -- not by the language.
Yeah, actually I'm gonna go ahead and roll my eyes at the idea that parametric polymorphism is on the wrong side of the "diminishing returns of static typing". Less than ONE percent of Go code would benefit from type-safe containers?
If I don't have to maintain the thing you can give me any Python, JS or Go you want!
This site has a strange fascination with hatred of static languages. I really don't get it. My only guess is that modern colleges teach dynamic languages to students and so they're more familiar with it. Perhaps their teachers even stress that static languages are inferior.
To me, it's right tool for the right job. I have no problem spinning up a static language for performance and outsourcing the scripting to a dynamic language like Python for the best of both worlds in terms of speed, and rapid development.
"I don't think it's particularly controversial, that static typing in general has advantages"
That's not really true, just a belief. I'll give you an example to start understanding these things: the exact same program written in a very high level and very expressive language, like Perl, instead of Go, is going to have at least 3 times less code and since defect rates per line of code are comparable, you would end up with at least 3 times fewer bugs. Suddenly the reliability argument of static typing doesn't make any sense. That's because in PL research there is a huge gap in understanding of how programmers actually think.
That's an argument for higher level languages over lower level ones rather than against dynamic typing.
And I'm not sure you should expect the number of bugs per line to remain constant across languages. Extra lines required because you have to do your indexing by hand as you're iterating over a list certainly increase the chances of an error, but the extra '}' required to end the block in some languages increases line count with very little chance of causing an error.
I think you are right, but you only cover producing code. For maintaining code, it is another story. If you have to take over an unknown code base, I think static typing will prevent bugs from being deployed, because the type system will detect errors you might not be aware of due to your incomplete knowledge of the code.
I was in favour of dynamic typing, but lean more and more towards static typing, like OCaml.
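A small Python illustration of the indexing point (both functions are made up for this example). The two compute the same sum, but the hand-indexed version has extra lines that each carry a real chance of a mistake, while the iterating version has nowhere to put an off-by-one error.

```python
def total_by_index(values: list[int]) -> int:
    total = 0
    i = 0
    while i < len(values):  # writing <= here would raise IndexError
        total += values[i]
        i += 1              # forgetting this line would loop forever
    return total


def total_by_iteration(values: list[int]) -> int:
    total = 0
    for value in values:    # no index to get wrong
        total += value
    return total
```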
Although I'm skeptical about the 1 to 3 ratio, let's run with it.
Given a million line codebase written in Perl vs a three million line codebase written in Go, which do you think most engineers would prefer?
Honestly the Ruby or Python one, but I've never seen them because you don't need a million lines in Ruby or Python to get something productive built.
"...and since defect rates per line of code are comparable..."
That's not really true, just a belief. A naive belief, if you ask me.
That claim is supported by more than one study.
Correct and useless programs are useless. Quite simple.
Why my favorite color is red not blue...
Time and time again I can make a well written functioning program in Java or C# at least twice as fast as using js and brothers. Sure it might have more "lines". Who freaking cares. My team and I square off all the time. "K, you use node I will use java" And the Java dev always wins. It's just so much faster, cleaner and mature. It's NO CONTEST.