Tulip – An untyped functional language
jneen.net
This might seem strange, but if there's one thing from Common Lisp that should receive wider adoption in other languages, it's hyphenated names. They are so much more readable than anything else (well, C with underscores comes close).
This is such a good idea I just added it to my current toy language:
https://github.com/TazeTSchnitzel/Firth/commit/7b9bf0b4c090e...
Thanks for the idea! :)
I've been playing around with creating a toy language which treats '-' as a name for a function. It means there always needs to be white space around a - (the syntax of the language isn't like LISP), but that increases readability at the cost of two extra key presses.
Also, note that, in the same breath practically, you can support true negative integer constants:
    -1234
which are distinguished because there is no whitespace. You can distinguish the unary operator being applied to 1234 from a true -1234 constant.
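A minimal sketch of that whitespace rule in Python. The token names and the `tokenize` helper are hypothetical and not taken from any of the languages mentioned in this thread; the point is only that a dash glued to digits becomes part of the constant, a dash inside a word becomes part of the name, and a dash on its own is the operator.

```python
import re

# Hypothetical token grammar. Order matters: the first alternative that matches wins.
TOKEN = re.compile(r"""
    (?P<INT>-?\d+)                                        # -1234 with no space is one constant
  | (?P<NAME>[A-Za-z_][A-Za-z0-9_]*(?:-[A-Za-z0-9_]+)*)   # foo-bar is one name
  | (?P<MINUS>-)                                          # a lone dash is the operator
  | (?P<WS>\s+)                                           # whitespace separates tokens
""", re.VERBOSE)

def tokenize(src):
    """Split src into (kind, text) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {src[pos]!r} at {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

print(tokenize("-1234"))      # one INT token
print(tokenize("- 1234"))     # MINUS, then INT
print(tokenize("foo - bar"))  # NAME, MINUS, NAME
print(tokenize("foo-bar"))    # one NAME token
```

With this grammar `1-2` lexes as the two constants `1` and `-2`, which is exactly why subtraction needs whitespace around the dash.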
This would be an easy text-transformation that you could do in vim. Upon opening the file, translate all dashes without spaces "foo-bar" to add spaces "foo - bar". Then convert all underscores to dashes. "foo_bar" to "foo-bar". On save, invert the process.
You'd have to actually run the language's parser in order to do the transformation to avoid changing strings, and even then it'd only work if the parser output kept track of the original line and character so that you could know where to make the change.
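For illustration, here is a naive regex version of the open/save round trip in Python, written without a parser. The function names `on_open` and `on_save` are made up for this sketch. It round-trips simple code, and it also shows the string problem raised above.

```python
import re

def on_open(text):
    # Step 1: space out dashes that sit directly between word characters.
    text = re.sub(r"(?<=\w)-(?=\w)", " - ", text)
    # Step 2: display underscores as dashes.
    return text.replace("_", "-")

def on_save(text):
    # Invert step 2 first: unspaced dashes go back to underscores.
    text = re.sub(r"(?<=\w)-(?=\w)", "_", text)
    # Then invert step 1: spaced dashes collapse again.
    return text.replace(" - ", "-")

source = "foo_bar = a-b"
shown = on_open(source)
print(shown)                      # foo-bar = a - b
print(on_save(shown) == source)   # True

# Without a parser, string literals get rewritten as well:
print(on_open('print("well-known")'))   # print("well - known")
```

The round trip is also lossy for files that already contain `a - b`, since saving collapses it to `a-b`.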
This sort of text-transformation is something I've long wished my text editor did. At a previous job the standard was three-spaces of indent, regardless of the language.
> This sort of text-transformation is something I've long wished my text editor did.
Emacs does this in some cases: specifically for camelCasedWords (http://www.masteringemacs.org/article/making-camelcase-reada...) and for the word 'lambda' which can be displayed as a symbol. There are many other modes which "overlay" some text over how it looked originally.
Easier to type too. No need to press the shift key.
Perl6, Clojure, Racket, Rebol, Red, Factor & Forth are some other languages that allow hyphenated names.
And I agree with you that hyphens are more readable. They're also good for adding extra semantic meaning - https://news.ycombinator.com/item?id=3978992
Add to that list, GNU Make. In gmake, you sometimes need things that look like paths to be variable names:
    VAR_$(PATH) := whatever
where PATH could be path/to/foo-parser.o, say.
If that sort of thing interests you, also check out my language 'tab' (https://bitbucket.org/tkatchev/tab).
Tab is a statically-typed, functional, type-inferred language that occupies a niche between bash and python.
It's also not Turing-complete but can compute almost everything you could ever think of.
(I wish more languages aimed for Turing-incompleteness -- unsurprisingly, it turns out Turing-incomplete languages have big benefits for performance and resource management.)
How is yours not Turing-complete and what benefits are there?
> I strongly dislike macros that can hide in code. I get really frustrated when I open a source file and see (foo ...) and can’t tell whether it’s a function or a macro until I read documentation.
Well... that's just, like, your opinion, man.
Seriously though. In Elixir, for example, much of the language itself is implemented via its own macros, which demonstrates a certain nice extensibility. If Elixir followed this same pattern, it would get really annoying really quickly, as even simple if statements would require a leading slash.
Also, I preferred "unf" ;)
Yep, it's my opinion, and that's why I put it into the design. Lots of language design comes from opinions. I hope it's borne out. FWIW it's the same approach Rust has taken, where macros have to end with a ! to make them visually distinct.
That might be because Rust might not eat its own dogfood in that department, and build some of its own functionality out of its macro system.
But I can see just "knowing" at a glance if it's a macro or not.
I think the answer would basically be determined by how much of the language itself uses its own macro system AND what type of macro system it actually is. If it's significant, having special syntax would just look weird.
Rust does eat its own dogfood with regard to macros, and over time has steadily replaced former language-level features like `log` and `panic` with macros. Syntactic distinction is a philosophical choice in service of making costs more explicit (and while it's true that functions can hide behavior, overuse of macros can trigger enormous code bloat, such as the `regex!` macro which compiles your regex into a state machine).
(There are also valid technical reasons for requiring syntactic distinction, as the sheer flexibility of Rust's macros in their ability to create new syntax runs the risk of making it a nightmare to parse if you remove the unambiguous ability of the compiler to drop into macro-parsing mode. These challenges aren't insurmountable, just very hairy.)
If you see (foo ...) but don't actually know what foo does, it doesn't matter all that much whether it is a function or an operator. Even if you know it's a function, that just tells you how the arguments are evaluated; but not what happens with those values. Untold effects could hide behind a function call.
This is also true for Racket. The language is basically all macros built on top of each other. While this superficial distinction between macros and other constructs serves a purpose, I think that purpose is largely misguided and invented.
What is the need for knowing if it's a macro or not when you could just know how it works (what'll it spit out / do?)?
While I do believe in limiting stuff for the sake of simplicity, this notation will actually burden the developer into not using the macro system fully, simply because someone wants there to be a non-forced distinction between macros and other constructs in the code.
Link is down for me.
Google Cache Text-Only:
http://webcache.googleusercontent.com/search?q=cache:cOp3ebJ...
Argh, thanks for the cache link. I'm still on heroku free-tier :\
This looks cool -- is there any source code? What language is it written in?
"Tulip is still in active development, and I could use a whole lot of help, both filling in the design gaps here and actually churning out the implementation"
OK interesting, it actually appears to be written in RPython, not full Python:
https://github.com/jneen/tulip/blob/master/tulip/libedit.py
(RPython is the "static" subset of Python used to bootstrap PyPy)
They have moved towards treating RPython as a framework:
Yeah, it's basically a toolkit for building jitted languages. Basically the easiest way to get a tracing jit these days. So it'll be a self-hosting jit similar to pypy or pixie.
Note that it definitely has types, they just aren't required explicitly. It seems to use dynamic type matching.
It's unityped! It has one type with infinitely many variants/tags (.<string>). Match failure occurs at runtime as in any other safe typed language such as Haskell or ML.
>Match failure occurs at runtime as in any other safe typed language such as Haskell or ML.
That is incredibly disingenuous. The only way a Haskell or ML program could be as colossally unsafe as a unityped language program is if the programmer used only one giant sum type for the entire program, and most functions in the program were non-total with respect to that type.
"Unityping" provides no static type safety. It is isomorphic to, and usually a euphemism for, the lack of any static type system.
Yeah, it was a difficult decision to remove types - I'd gotten myself into a corner trying to tack on dependent types, and it just wasn't happening. My bet is that unlike most of the un{i,}typed languages out there (most of which I'd categorize as lisps and smalltalks), tulip provides tagging and destructuring that allows the programmer to maintain some level of control over the polymorphism. Tulip will panic at runtime for non-total functions, but ideally you'll have the tools necessary to keep the panic as close to the problem as possible.
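To make "panic at runtime for non-total functions" concrete, here is a rough Python sketch. It is not Tulip syntax or Tulip's implementation: tagged values are modelled as plain tuples whose first element is the tag, and `MatchError` and `area` are names invented for the example.

```python
import math

class MatchError(Exception):
    """Raised when no branch matches, i.e. the function is not total."""

def area(shape):
    # Destructure the tagged value: first element is the tag, the rest are fields.
    tag, *fields = shape
    if tag == "circle":
        (r,) = fields
        return math.pi * r * r
    if tag == "rect":
        w, h = fields
        return w * h
    # No branch for this tag: fail here, close to the problem.
    raise MatchError(f"area: no branch for tag {tag!r}")

print(area(("rect", 2, 3)))   # 6
try:
    area(("triangle", 1, 2, 3))
except MatchError as err:
    print(err)
```

The failure is raised at the point of the failed match, which is the "keep the panic as close to the problem as possible" idea, but nothing catches the missing branch before the program runs.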
If you customize your runtime behavior based on any metadata about the value on which you operate, you have multiple types. Attempting to change the syntax will not change the fact that you need to differentiate behavior for numbers and strings.
> the fact that you need to differentiate behavior for numbers and strings
That's actually not exactly right: for example in Forth you really have no types at all.
Also, if someone uses a word such as "unityped" or "type with infinitely many variants" you should immediately know that any mention that "there are types, alright, just checked on runtime" will be immediately rejected. Majority of static typing fanatics are like that.
"Majority of static typing fanatics are like that."
That's not the problem. The problem is that to a first approximation, every language is "type safe" in the sense that you can't add a string to a number. Even in those languages where it looks like you can, it's because of a certain usually-limited set of automatic coercions, not because you can actually add a number to a string.
Truly adding a number to a string looks like this:
    number: 0x000000000000002a
    string: 0x7ffb000000007264
    result: 0x7ffb00000000728e
The string is, of course, a pointer, and the result, of course, is gibberish. This is why "no" languages to speak of implement this form of "untyped language"; it isn't what anybody actually wants. (Assembler, of course, has it, but that's an exception for obvious reasons.)
A term that describes essentially 100% of languages is not a useful one, so static typing usually refers to a language whose type system is somehow more restrictive at compile time than "Everything is a variant type and we'll work it out at runtime".
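The addition in that comment can be reproduced with plain integers. This Python sketch treats the string's address as nothing more than a 64-bit number, which is the whole point of the example; the variable names are only for illustration.

```python
number = 0x000000000000002A   # the integer 42
string = 0x7FFB000000007264   # the pointer value from the comment, as a bare integer

# With no types in play, "number + string" is just integer addition.
result = number + string

print(f"number: {number:#018x}")
print(f"string: {string:#018x}")
print(f"result: {result:#018x}")   # result: 0x7ffb00000000728e
```

The sum is a valid integer but no longer points at the original string, which is why the comment calls it gibberish.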
> The problem is that to a first approximation, every language is "type safe" in the sense that you can't add a string to a number
We're not discussing a concept of "type safety" here at all, but rather a concept of "untypedness". I just can't agree that for example Common Lisp (with CLOS), Smalltalk or Python are "untyped". They are not: untyped language is one which has no type errors both on compile time and runtime (unless I'm very mistaken). An obvious example is Assembler, but Forth or TCL qualify too. And quite a few others do too. See here: http://en.wikipedia.org/wiki/Programming_language#Typed_vers...
> so static typing usually refers to a language whose type system is somehow more restrictive at compile time than "Everything is a variant type and we'll work it out at runtime"
Again, it was never suggested that Tulip has "static types". It doesn't of course.
What I said is that it has types. I don't want to discuss how much better "static typing" is than "dynamic typing" or vice versa, this makes for a very boring discussion similar to Emacs vs. Vim and I'm not interested in it at all. I just object to the notion that "static types" are the only kind of types we can ever have in a language.
The problem is with "static typing fanatics", really. They'd like to bend the terminology in a way which helps them promote static typing, for example by equating all types with static types. This is both dishonest and unnecessary. No serious static typing advocate would do this (I hope) - static typing is a great idea able to defend on its own, there's no need to lie about "the other side" of the argument.
Well, all fanatics are like that. Way too much Kool-Aid, way too little critical thinking.
"I just can't agree that for example Common Lisp (with CLOS), Smalltalk or Python are "untyped"."
You completely missed my point, as evidenced by the fact you appear to believe I just claimed that when I in fact claimed the exact opposite, which is that since darned near nothing is "untyped", that's not a useful concept to use in discussing whether something is "typed" or not.
Assembler is truly untyped. Forth is untyped. Tcl is not untyped... it simply has a built-in coercion rule that it'll turn everything into strings if it doesn't like what you do to it. It's as close as you can get, but it isn't untyped.
A concept that describes only two languages is not all that useful.
"They'd like to bend the terminology in a way which helps them promote static typing, for example by equating all types with static types."
Again, the fact that you appear to have completely missed my point is evidenced by the fact that I drew a distinction whereby languages may be "dynamically" or "statically" typed but darned near nothing is "untyped".
With all due respect, you're not in a position to be claiming that other people are "fanatics"... you don't have the information to come to that conclusion because you appear to be incapable of reading what people say, because you've already decided in advance what they're going to say. Yes, that makes the world look like... well... anything you want, really, but it's not a true description. You are not in a position to be complaining about other people's "critical thinking" skills when you yourself aren't even accurately gathering information with which to critically think.
> Tcl is not untyped...
Wikipedia page says otherwise. Why won't you edit it and fix it if you're sure it's a mistake?
> A concept that describes only two languages is not all that useful.
Which is why you decide to change the meaning of this concept? Or what are you getting at? Do you want to say that "lack of static typing" equals "untyped"?
> With all due respect, you're not in a position to be claiming that other people are "fanatics"... you don't have the information to come to that conclusion because you appear to be incapable of reading what people say, because you've already decided in advance what they're going to say. Yes, that makes the world look like... well... anything you want, really, but it's not a true description. You are not in a position to be complaining about other people's "critical thinking" skills when you yourself aren't even accurately gathering information with which to critically think.
You know, there's a difference between my attacking a general notion some (unspecified) people share and your directly insulting me. I wonder why did you choose to read my comment as targeted at you personally? Do you feel like a "static typing fanatic"? Did you write any of the things I was complaining about? Did you say that Python is untyped? Did you say that the only kind of types we can have are static types?
I'm re-reading your comments and can't see any of these. I wonder, why the heck would you, then, assume I was criticizing you specifically? Are you really going to defend people who say those things? You know better and, if you read our conversation once again, you'll see that we're in an agreement (except for TCL). Don't you see I'm talking TO you, not ABOUT you? Is my written English that poor (well, it may be so, sorry)?
B and early C are untyped like this
Early C? The parent described accurately how pointer addition in C works for all char pointers. (Well, other than the “nobody wants that” part, because that's how you skip n bytes of a string.)
C has pointer arithmetic
you can easily add a number to a char in C and get a gibberish character, which is why C is a weakly typed language
Not true (technically). When you add an integer to a char, you get an integer. (If you add a floating point number, the result is such as well.)
You can, of course, use the integer you got like a char (after all, C's char is a small integer type) to get your gibberish. You can also use a floating point number as a char, because C has lax implicit conversions.
If you add '!' to '#' you get 'D', which is nonsense in any high level language.
Not true; you added the byte that happens to be identified in ASCII by !, 33, to the byte that happens to be identified in ASCII by #, 35, and get 68, which is D. This is all perfectly sensible and type safe, two 8-bit numbers being added to produce another 8-bit number; you are only confused by the surface syntax. There's plenty of high-level languages that will let you do this; there's all sorts of reasons to make numbers and the ASCII chars they represent easily usable for each other in source code.
No, I didn't. I actually added the character '!' to '#'. The C language doesn't DISTINGUISH characters from their numerical representations. That's where I say it's weakly typed. It does not have an actual character type, it only has 8 bit numbers.
Of course in C it is perfectly logical. I am pointing out that it is not as strongly typed as other type systems that have richer types.
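The arithmetic in this exchange can be checked in Python, which, unlike C, keeps characters and integers as separate types, so the conversion has to be spelled out with `ord` and `chr`. This is only a sketch of the ASCII sums being argued about, not a statement about how any C compiler behaves.

```python
bang = ord("!")    # 33 in ASCII
hash_ = ord("#")   # 35 in ASCII

total = bang + hash_
print(total, chr(total))   # 68 D

# Python refuses to mix the two types implicitly:
try:
    "!" + 1
except TypeError as err:
    print("TypeError:", err)
```

In C the `ord`/`chr` steps are implicit because `char` is itself a small integer type, which is the behaviour both sides are describing.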
Yep! It focuses more on dynamic type-checks than on static typing though, so I put it in the category of "untyped functional" - more like clojure and erlang than haskell or ml.
Well, dynamic types are still types. :) It also seems strongly typed through a lack of implicit conversions between types.
I would say this is more like go than anything, though it seems to lack methods (and interfaces) and includes a functional syntax.
You're going to run into issues when attempting to extend polymorphism for built-in functions to user-defined types—imagine trying to figure out how to sort an 'unknown' type without a way to compare them without modifying the method to be explicitly aware of the new type.
There does seem to be a method/interface system (Under the "Methods, Protocols, Implementations" header). And it seems to have some sort of dispatch system for tagged structures that can be later modified by the user.
Exactly! This is what the @method / @impl system is for - it's about equivalent to clojure's defprotocol. Future plans include named protocols consisting of multiple methods, and protocol-based matching.
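As a rough analogy only, an open method with separately registered implementations could look like this in Python. This is not Tulip's @method / @impl syntax or its implementation; the `Method` class, the tuple-based tags, and `sort_key` are all assumptions made for the sketch. It shows how a built-in operation such as sorting can be extended to a new tag without editing the sorting code.

```python
class Method:
    """An open function: implementations are looked up by the tag of the first argument."""

    def __init__(self, name):
        self.name = name
        self.impls = {}

    def impl(self, tag):
        # Decorator that registers an implementation for one tag.
        def register(fn):
            self.impls[tag] = fn
            return fn
        return register

    def __call__(self, value, *args):
        tag = value[0]
        try:
            fn = self.impls[tag]
        except KeyError:
            raise TypeError(f"no impl of {self.name} for tag {tag!r}") from None
        return fn(value, *args)

# "Library" code: declares the method and sorts by it.
sort_key = Method("sort_key")

def sort_tagged(values):
    return sorted(values, key=sort_key)

# "User" code, added later: teaches sort_key about a new tag.
@sort_key.impl("version")
def _version_key(v):
    _, major, minor = v
    return (major, minor)

print(sort_tagged([("version", 1, 2), ("version", 0, 9)]))
```

Calling `sort_key` on a tag with no registered implementation raises at runtime, which matches the dynamic-dispatch flavour discussed in this thread.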
"I’ve renamed the language from Unf to Tulip, because some folks pointed out that the old name created an unnecessarily sexualized environment"
Are fifth graders critiquing programming languages now? Seriously, who makes that association and then feels the need to comment on it?
It was a decision I made, partly because I realized they were right, and partly because I think tulips are pretty.
) ( ( _) |/
unnecessarily sexualized logo
she's doing a high kick away from the viewer?
um, it's a flower
“Unf” is quite widely used as a spelling of a moan, to express sexual desire or gratification. While it can be used to express non-sexual enjoyment, the sexual connotation it evokes is just unnecessary when it comes to a programming language, regardless of the original intent.
While I've never seen this use personally, UrbanDictionary very strongly corroborates this.
It’s worth noting that “universal noise of fucking” is a backronym—the word was originally onomatopoeic.
It is, still, but urbandictionary users like making inaccurate definitions as an attempt at "humour"
As a counter datapoint, this is the first time I have ever heard of this.
Perhaps as "umph" or "umf"? No?
No, not even slightly.
FWIW, I grew up in Scotland, then moved to England, so I'm probably from a rather different cultural background than the poster.
As a non-native speaker, I have seen "umph" before.