[UNN] Feature Macros, not an alternative to Feature Expressions

Alan Dipert

Jan 22, 2015, 9:02:02 PM

to cloju...@googlegroups.com

Hi everyone,

Micha Niskin and I wish to share that we identified concretely an
aspect of the Feature Macros proposal that we think makes the whole
thing unsound.  We unannounce it :-)

We couldn't have reached this conclusion without the vigorous argument
and discussion the Clojure community was kind enough to indulge us
with.  We invite everyone involved to share in the joy that comes with
working an idea completely through.  Go us!

The problem is one of composition.  While operations being applied
under the usual Lisp eval rules receive arguments after evaluation,
macros applied under the usual Lisp macro expansion rules *do not*
receive their arguments after macro-expansion.  It is a curious
asymmetry.

Because macros have the option of expanding their arguments, and
because most don't, we can't pass macros code containing other macros
and expect the same kind of inside-out composition we get with normal
Lisp.  Consider the normal Lisp function, which can be ignorant of the
code contributing to the argument values it sees:

(+ 1 2)
(+ (inc 0) (* 2 1))

In both cases, + sees an arg list of [1 2].  It doesn't know or care
what code represented its arguments prior to evaluation.

Macros aren't like this.  We can't pass arguments to macros that
contain macro calls and expect the top-most macro to be ignorant of
how its arguments came to be.
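
A minimal sketch of the asymmetry, for anyone who wants to poke at it in a REPL. The names show-vals and show-forms are made up for illustration and are not part of the proposal: the function sees evaluated values, while the macro sees the raw, unexpanded forms.

```clojure
;; Hypothetical helpers, for illustration only.

;; A function receives its arguments after evaluation.
(defn show-vals [& args]
  (vec args))

;; A macro receives its arguments as unevaluated, unexpanded forms.
;; This one just returns them quoted so we can look at them.
(defmacro show-forms [& forms]
  `(quote ~(vec forms)))

(show-vals (inc 0) (when true 2))
;; => [1 2]

(show-forms (inc 0) (when true 2))
;; => [(inc 0) (when true 2)]
```

Note that the inner when, itself a macro, reaches show-forms untouched.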

Consider the ns macro and the case-platform macro from our proposal
[1].  If ns macro-expanded its arguments, we could achieve FX-like
functionality like:

(ns example-portable-ns
  (:require (case-platform
                 :clj clojure.core
                 :cljs cljs.core)))

But ns doesn't macroexpand its args, and never will.

The workaround, which we applied in ignorance of the asymmetry as the
ns+ macro in the proposal, is to wrap.  If you wrap an existing macro,
you have an opportunity to control the expansion of its arguments.
This means that for every macro you want to exhibit the new semantic,
you need to wrap it.  This results in a code explosion problem (in the
form of wrapper macros) which is the same problem we're trying to
solve.
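
The wrapping workaround can be sketched roughly as follows. Everything here is a hypothetical stand-in (platform, this toy case-platform, requires and requires+ are not the code from the proposal); requires plays the role of a macro like ns that never expands its arguments, and requires+ is the wrapper.

```clojure
;; Hypothetical stand-ins; none of this is the proposal's actual code.

(def platform :clj) ; pretend build-time setting

;; Picks the form for the current platform at macroexpansion time.
(defmacro case-platform [& clauses]
  (get (apply hash-map clauses) platform))

;; An ordinary macro that, like ns, never expands its arguments.
(defmacro requires [& libs]
  `(quote ~(vec libs)))

;; The wrapper: expand each argument first, then hand off to the real macro.
(defmacro requires+ [& libs]
  `(requires ~@(map macroexpand libs)))

(requires (case-platform :clj clojure.core :cljs cljs.core))
;; => [(case-platform :clj clojure.core :cljs cljs.core)]

(requires+ (case-platform :clj clojure.core :cljs cljs.core))
;; => [clojure.core]
```

Every macro that should understand case-platform needs its own "+" twin, which is the explosion described above. This sketch also only expands the top level of each argument; nested positions would need a tree walk.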

The #+ and #- reader macros of Feature Expressions circumvent this
problem, because as reader constructs, they are the only
possibly-conditionalized thing preceding macro expand.  The code
containing them cannot know that they exist, in the same way that a
macro which received its arguments expanded sees no macros.  With
them, regular ns works fine, because it is ignorant of the
reader-level dispatch that preceded its expansion:

(ns example-portable-ns
  (:require #+clj clojure.core #+cljs cljs.core))

We're still not super-enthusiastic about FX as it stands, because
we're terrified by the prospect of losing code generation forever.  We
think there might be alternatives to mitigate though, such as boxing
feature-read forms in a new special form with :+ and :- meta hung on
it.  At least then we could generate and print things without
descending immediately into string munging.

We encourage you to think deeply and critically about FX too.  We
tried to, and were rewarded by learning something awesome about Lisp.
Yes, FX was invented by geniuses in the beforetimes and is probably
good, but if we "cargo cult" without reasoning anew for ourselves why it's good, we
just might regret it.

We'd like to acknowledge Brandon Bloom, who imagined something similar
to the problem we describe on the Feature Expressions design page back
in 2013. [2] We thank also Colin Fleming, whose mention in IRC of a "fear of an explosion of
+ macros" caused us to see ns+ in a new and unsavory light.

Everything we mention was probably also known by somebody, but it was
hard to Google.  If anyone knows any related references regarding the
weird missing macroexpand semantic, do send them our way.

Oh, and here is a prototype implementation of the Weird Semantic:
https://gist.github.com/alandipert/331885e36756e691f41a

Alan Dipert
Micha Niskin

1. https://github.com/feature-macros/clojurescript/tree/feature-macros/feature-macros-demo
2. http://dev.clojure.org/display/design/Feature+Expressions?focusedCommentId=6390065#comment-6390065

Alan Dipert

Jan 22, 2015, 11:58:46 PM

to cloju...@googlegroups.com

I forgot to acknowledge also Adrian, who responded to our thread yesterday with maybe the simplest formulation of the Feature Macro problem: macroexpansion works from the outside in.
Alan

James Reeves

Jan 23, 2015, 11:27:04 AM

to cloju...@googlegroups.com

Congratulations on your unannouncement. Being willing to try out new ideas is impressive; being able to accept and act upon criticism is even more so.

- James

Rich Hickey

Jan 23, 2015, 5:32:01 PM

to cloju...@googlegroups.com

Thanks for acknowledging the shortcomings, and prior work on the same idea.

I do think there are shortcomings to Common Lisp's #+, the primary being the one cited: it's not data. Without that, it is hard to make programs that generate feature-conditional programs, or ones that transform them.

That can easily be solved with:

1) a proper conditional-read form:

(#? ...)

and

2) a mode of the reader that does *not* do conditional read processing.

==
Another deficit of #+ is that each condition is floating around in the code, not visible as a single set of choices. The case-* of the feature-macro proposal was nice in organizing these choices.

I'm not sure about case-this vs case-that (i.e. features is a map with slots for this and that), or even 'case' as the right model (vs order-sensitive 'cond'), but the grouping is nice.

Setting aside features-as-map for a moment, given features-as-set, a form version could look like:

(#? :clj this :cljs that)

where #? reads as 'clojure.core/read-cond (or something).

Also interesting would be a splicing version ('clojure.core/read-cond-splicing):

(#?@ :clj [these-forms ...] :cljs [those-forms ...])

It's important that these be able to yield nothing, not nil, in order to be useful everywhere.

An obvious enhancement might be and/or/not boolean expressions:

(#? (or :clj :cljr) this
    :cljs that)

Clauses would be considered in order, first match wins.

:default can be reserved for supplying a default when no options are true.

(#? :clj this
    :cljs that
    :default 42)

If none true and no default, the form reads nothing (not nil, not error).
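
For readers trying this today: the syntax that eventually shipped in Clojure 1.7 is #?(...) and #?@(...) rather than the (#? ...) shape proposed here, and read-string accepts a :read-cond option. Assuming a JVM Clojure 1.7 or later (where :clj is always among the active features), the ordering, default and read-nothing rules can be checked with a sketch like this. read-allow is just a hypothetical convenience wrapper.

```clojure
;; Assumes Clojure 1.7+ on the JVM. read-allow is a made-up helper name.
(defn read-allow [s]
  (read-string {:read-cond :allow} s))

;; Clauses are tried in order; the first matching feature wins.
(read-allow "#?(:clj :this :cljs :that)")
;; => :this

;; :default is used when no feature matches.
(read-allow "#?(:cljs :that :default 42)")
;; => 42

;; With no match and no default, the form reads as nothing, not nil.
(read-allow "[1 #?(:cljs 2) 3]")
;; => [1 3]

;; The splicing version inserts the elements of the chosen collection.
(read-allow "[1 #?@(:clj [2 3] :cljs [4]) 5]")
;; => [1 2 3 5]
```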

'default-features' would still be available for macros to use.

====
Future possibilities

There was some talk about feature-macros being open, but in reality none of these things are very open. Either the feature is considered in the code or there is a default, there's no way to extend case/cond-like things without touching the code or having a true extension point.

Taking this idea further, we can say a final solitary (namespaced!) keyword can serve as a semantic label for extension:

(#? :clj this
    :cljs that
    ::whatever-extension)

Given this, some read-time mechanism can first be consulted to see if there is an entry for :your-lib/whatever-extension. If so, it is used as the value read from the form, else the form logic runs. Now if none of the conditions hold and no default it is an error, since there is a way to provide customization for your environment w/o changing the source.

This would also allow for shared, named snippets:

(#? :clojure.port/date-ctor)

There is a lot more infrastructure required for this extensibility, and it remains a future prospect.
====

For the present, Alex Miller will be coordinating prototyping of this mechanism ('read-conditionals' ?)

Feedback and help welcome.

Thanks,

Rich

Alex Miller

Jan 25, 2015, 10:48:11 PM

to cloju...@googlegroups.com

Quick update - Luke VanderHart built a prototype of this for Clojure over the weekend, which is attached at http://dev.clojure.org/jira/browse/CLJ-1424 as clj-1424-5.diff. I'll be doing some review of it Monday. (Thanks Luke!!)

One related piece of work that will be needed is a port of these changes to tools.reader (some of the code will already be present in the latest fx patch at http://dev.clojure.org/jira/browse/TRDR-14). If anyone is interested in helping out, I'd be happy to get a hand there! Let me know here so we're not duplicating effort.

The existing ClojureScript patch on CLJS-27 is probably the same, so that one should be ok.

My hope is that (pending feedback, which is welcome), we will have a complete prototype in the next day or two and can move towards inclusion in the next Clojure alpha (and tools.reader and ClojureScript).

Alex

Colin Fleming

Jan 26, 2015, 3:52:40 AM

to cloju...@googlegroups.com

So the plan is to move ahead with the #? and #?@ syntax?

al...@puredanger.com

Jan 26, 2015, 8:23:53 AM

to cloju...@googlegroups.com, cloju...@googlegroups.com

Well, as I said, pending feedback. Would love to have some!

Luke VanderHart

Jan 26, 2015, 10:13:26 AM

to cloju...@googlegroups.com

Hi Alex,

I'll be taking a crack at tools.reader today, probably.

Thanks,

-Luke

Chouser

Jan 26, 2015, 10:24:17 AM

to cloju...@googlegroups.com

Did I read the patch correctly, that :default, :none, and :else are all reserved but mean exactly the same thing? If that's correct, may I suggest we pick just one instead?

In my experience, synonyms for the same functionality provide little benefit, as everyone needs to know all the options anyway, in order to read each other's code. So it's just more things for people and tools to recognize, without providing any new meaning.

I'd recommend :default since that's already a special word in a couple other places in Clojure (tagged literals, and multimethods). But even if that's not chosen, I'd prefer any other single value over a set of three.

What do you think?

Alex Miller

Jan 26, 2015, 10:31:45 AM

to cloju...@googlegroups.com

Hey Chouser,

Rich wanted to reserve those for future possible meanings. I think I'd agree that we should pick one (:default seems right to me) as the canonical term. The others should probably throw errors for now.

Alex

Brent Millare

Jan 26, 2015, 11:01:35 AM

to cloju...@googlegroups.com

This sounds great but I'm still fuzzy on the basics. What's the purpose of "#? reads as 'clojure.core/read-cond"? Also, what shortcoming does "2) a mode of the reader that does *not* do conditional read processing." fix and how?

Alex Miller

Jan 26, 2015, 11:23:41 AM

to cloju...@googlegroups.com

Chouser

Jan 26, 2015, 1:22:25 PM

to cloju...@googlegroups.com

Also, is it intentional that reading (clojure.core/read-cond ...) does not behave the same as (#? ...)?  That is, (#? ...) can be read as c.c/read-cond depending on read options, but having been read, if it is printed again it doesn't round-trip back to #?.  This is different, for example, from how #(...) is read as (fn* [] (...)), which then retains its meaning.

Alex Miller

Jan 26, 2015, 3:20:18 PM

to cloju...@googlegroups.com

The intention is that clojure.core/read-cond can be read like #? (so while not identical in round-trip, they are at least semantically identical). There is a bug in the current patch in shouldReadConditionally() - should be .equals() instead of == for the symbol comparison.

After fixing that issue:

user=> (defn sr [s] (java.io.PushbackReader. (java.io.StringReader. s)))

user=> (read {} (sr "(#? :clj :x :default :y)"))

:x

user=> (read {:preserve-read-cond true} (sr "(#? :clj :x :default :y)"))

(clojure.core/read-cond :clj :x :default :y)

user=> (read {} (sr "(clojure.core/read-cond :clj :x :default :y)"))

:x
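
For comparison, in the reader that later shipped with Clojure 1.7 the preserving mode is spelled {:read-cond :preserve} and yields a reader-conditional object rather than a clojure.core/read-cond list. A small sketch, assuming Clojure 1.7 or later:

```clojure
;; Assumes Clojure 1.7+; :read-cond :preserve is the shipped counterpart
;; of the prototype's :preserve-read-cond flag shown above.
(def rc
  (read-string {:read-cond :preserve} "#?(:clj :x :default :y)"))

(reader-conditional? rc)
;; => true

;; The untouched clauses are available as plain data.
(:form rc)
;; => (:clj :x :default :y)

(:splicing? rc)
;; => false
```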

Alex Miller

Jan 28, 2015, 10:23:52 AM

to cloju...@googlegroups.com

Chouser

Jan 28, 2015, 10:34:05 AM

to cloju...@googlegroups.com

Thanks very much for keeping us updated, Alex. Highly appreciated.

Chouser

Jan 28, 2015, 3:34:25 PM

to cloju...@googlegroups.com

Has there been any thought of splitting the functionality into two
functions, rather than adding a flag to 'read'?

One function, you could call it 'parse', could consume text and
produce the kind of objects described above as
suspend-conditional-read (including reader-conditional and
tagged-literal objects).

A second function, maybe called 'read-expand', would take those
objects as input, translate the reader conditionals and tagged
literals, and return what a regular 'read' does today.

Then 'read' could be defined as the composition of parse and
read-expand, and no flag would be necessary. Would that be strictly
simpler?

—Chouser

Rich Hickey

Jan 28, 2015, 4:00:46 PM

to cloju...@googlegroups.com

Doing this in two steps means two passes and tree-rewriting. suspend-conditional-read is not important enough to engender that overhead for normal read. Plus, there's only one substrate so we'll have a flag internally anyway.

Chouser

Jan 28, 2015, 4:46:04 PM

to cloju...@googlegroups.com

If both styles of read (with and without suspend-conditional-read) consume
text, I think that means tools that manipulate reader-conditionals and then
want to eval the results will round trip back through text. That is,
they will:
1. read with suspend-conditional-read on
2. do their tree-walking, manipulation, whatever
3. print the results out to text (byte-array/disk/etc.)
4. read that with suspend-conditional-read off
...and then will have data in the shape eval wants.

Did I get that right?
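
The four steps above can be sketched with the reader options that later shipped in Clojure 1.7 (:read-cond :preserve and :read-cond :allow stand in for suspend-conditional-read on and off). The input string and the "manipulation" are made-up examples, and this assumes a JVM Clojure 1.7 or later.

```clojure
;; Assumes Clojure 1.7+ on the JVM; the source text is a made-up example.
(def src "[:a #?(:clj :jvm :cljs :js)]")

;; 1. Read with conditionals preserved as data.
(def preserved (read-string {:read-cond :preserve} src))

;; 2. Manipulate the tree (here: just append an element).
(def edited (conj preserved :b))

;; 3. Print the result back out to text.
(def text (pr-str edited))

;; 4. Read that text again, this time selecting a branch.
(def program (read-string {:read-cond :allow} text))

program
;; => [:a :jvm :b]
```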

On an entirely unrelated note, has anyone serialized a tree such that
it can be run through a pipeline of transducers and efficiently built
into a tree again? :-)

—Chouser

Herwig Hochleitner

Jan 28, 2015, 9:37:40 PM

to cloju...@googlegroups.com

Ad. syntax proposal: I like how #? and #?@ are mirrors of `~ and `~@

Though, ~ normally occurs some levels down of `. Can something like this be done for ? and # .. what would this mean?

When (pr x) -> #tag form, then maybe (read {:preserve-read-cond true} ..) should be (read {:preserve true} ..) and also preserve white space and newlines, closing the gap to editors.

@ Alex, Rich: Do you think that giving :splicing? to regular reader tags and implementing #? and #?@ as such would be missing a critical API distinction? Invite people to mess with side effects in reader tags?

@ Chris: The idea of using a transducer on a tree intrigues me ;-). Do you think this can be efficient, as in taking advantage of -XX:+DoEscapeAnalysis and thus avoiding allocating intermediate trees on the heap? Also, how do you feel about "The compiler needs an entry-point distinct from read" vs "Conditional reading as well as Reader tags are not part of the compilation process"?

What is the story for going from  (UUID/randomUUID) to (tagged-literal 'uuid "...")?

Currently it seems that this would be achieved by (read-string #{:preserve} (pr-str (UUID/randomUUID))).

Could anything be gained by splitting this out from print-method, e.g. *data-writers*?

Alex Miller

Jan 28, 2015, 11:03:33 PM

to cloju...@googlegroups.com

Herwig Hochleitner

Jan 29, 2015, 4:20:22 AM

to cloju...@googlegroups.com

Rich Hickey

Jan 29, 2015, 8:22:46 AM

to cloju...@googlegroups.com

Reader conditionals are not an evaluation feature. They are a reader feature. The reader reads text. That means they are *about* text. Basically they are a way to say "I want to write two substantially similar programs in one (text) file". Most tools and interpreters should be interested in one program, in one dialect, at a time, and are greatly simplified by not having to worry if they have been handed more than one :) Some tools need to manipulate files (text). Only those tools need to deal with this meta program.

Macros for this purpose do not work, for many reasons discussed at length. Were this an evaluation feature, it would need to be a phase of macroexpand that ran on code prior to its being sent to macro functions, and again prior to evaluation. And in both cases it would mean tree walking and rebuilding, potentially into nested data structures of arbitrary types, and likely not even possible given the rich, extensible set of types supported by Clojure and its reader.

Most important is this: *All of the branches is not a program in any dialect*

I don't see much need for many (any?) non-text tools to see all of the branches at once (as data). Because of the deeply nested contexts in which these can appear, direct interpretation would be quite convoluted and slow. And if you are writing a program that produces/transforms such a multi-program, you most likely are going to need to serialize it anyway (who will have more than one evaluator at this phase waiting for this as ephemeral data?).

Is this just a theoretical question or do people have particular tooling they envision that would not be well supported by this proposal?

In any case, if people want to write what you called 'read-expand' (really, 'find-one-program') over data returned by read with suspend-conditionals, they can. Doing the elision at the bottom is the efficient thing, and what the reader should do by default.
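
A rough sketch of such a 'find-one-program' pass over preserved data, assuming the reader that later shipped in Clojure 1.7 (:read-cond :preserve, reader-conditional?). The helper names are hypothetical, and for brevity it only walks lists and vectors; maps, sets and metadata are left alone, which hints at the tree-rebuilding cost mentioned above.

```clojure
;; Hypothetical helpers; assumes Clojure 1.7+. Only lists and vectors are
;; walked, and metadata on rebuilt collections is dropped.

(defn- chosen-forms
  "Zero or more forms selected from one preserved reader conditional."
  [rc features]
  (let [clauses (partition 2 (:form rc))
        match   (first (filter (fn [[feature _]]
                                 (or (= feature :default)
                                     (contains? features feature)))
                               clauses))]
    (cond
      (nil? match)    []                     ; no branch: contributes nothing
      (:splicing? rc) (vec (second match))   ; #?@ splices its elements
      :else           [(second match)])))

(defn- expand*
  "Returns a seq of the forms that `form` contributes to its parent."
  [form features]
  (cond
    (reader-conditional? form)
    (mapcat #(expand* % features) (chosen-forms form features))

    (seq? form)    [(apply list (mapcat #(expand* % features) form))]
    (vector? form) [(vec (mapcat #(expand* % features) form))]
    :else          [form]))

(defn find-one-program
  "Selects one dialect's program from data read with conditionals preserved."
  [form features]
  (first (expand* form features)))

(def preserved
  (read-string {:read-cond :preserve}
               "(ns example (:require #?(:clj clojure.core :cljs cljs.core)))"))

(find-one-program preserved #{:clj})
;; => (ns example (:require clojure.core))

(find-one-program preserved #{:cljs})
;; => (ns example (:require cljs.core))
```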

kovas boguta

Feb 3, 2015, 7:53:02 PM

to cloju...@googlegroups.com

Alex Miller

Feb 4, 2015, 8:45:54 AM

to cloju...@googlegroups.com

kovas boguta

Feb 4, 2015, 2:36:39 PM

to cloju...@googlegroups.com

Steve Miner

Feb 4, 2015, 2:45:29 PM

to cloju...@googlegroups.com

[I’m changing the subject as my remarks don’t have much to do with feature expressions.]

For anyone interested in the history of *default-data-reader-fn*, you can look up CLJ-927 and related discussions on the dev mailing list. I’ll just say there was not a consensus about the best default, but the important thing was to provide an option for the programmer to take control of unknown tags. Throwing on an unknown tag had already been established in the previous release so keeping that behavior was the conservative approach. And to be fair, it is possible that people depended on it, or at least were thankful that casual testing found unintentionally unknown tags (due to typos or other mistakes.)

Regarding the problem of mixing libraries and unknown tags: It gets tricky if you're trying to handle a whole class (I use the term loosely) of potentially unknown tags. As a library author, I can tell you to use my fancy function as your *default-data-reader-fn*, but what do you do if the other library has the same idea? Who's in charge? It should be the user, not the libraries.

I wrote a little library called Tagged [1] that’s mostly concerned with treating Clojure Records as EDN tagged literals, but it also provides some utilities for authoring data-readers.

[1] https://github.com/miner/tagged

I just updated the README to explain how I use what I call the tag-reader convention:

I use the term tag-reader to describe a function taking two args, the tag symbol and a value, like a *default-data-reader-fn*. Unlike a data-reader, a tag-reader may return nil if it does not want to handle a particular value. (See CLJ-1138 for more information about why a data-reader is not allowed to return nil.) The tag-reader convention makes it simpler to compose multiple tag-reader functions using `some-tag-reader`. You can wrap one or more tag-readers to create a data-reader with `data-reader`. The `throw-tag-reader` always throws so it's appropriate to use as your last resort tag-reader.
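
The convention can be illustrated with a small sketch. The helpers below are written from scratch for this example and are not the Tagged library's API (its `some-tag-reader` and `throw-tag-reader` may differ in signature and behavior); only *default-data-reader-fn* and read-string are real Clojure.

```clojure
;; Hypothetical helpers written for this sketch; they are NOT the Tagged
;; library's functions.

;; Compose tag-readers: the first one to return non-nil handles the tag.
(defn first-tag-reader [& tag-readers]
  (fn [tag value]
    (some #(% tag value) tag-readers)))

;; A tag-reader that handles one tag and declines everything else with nil.
(defn point-tag-reader [tag value]
  (when (= tag 'my/point)
    (zipmap [:x :y] value)))

;; Last-resort tag-reader: always throws.
(defn throwing-tag-reader [tag value]
  (throw (ex-info "No reader for tag" {:tag tag :value value})))

;; The user, not a library, decides the composition order.
(def my-default-reader
  (first-tag-reader point-tag-reader throwing-tag-reader))

(binding [*default-data-reader-fn* my-default-reader]
  (read-string "#my/point [1 2]"))
;; => {:x 1, :y 2}
```

Because each tag-reader may decline with nil, two libraries can both ship one without fighting over who owns the default.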

I've found it convenient to provide appropriate tag-readers in my libraries and let the user create his own *default-data-reader-fn* (or :default option for clojure.edn/read) by composing those tag-readers.

Steve Miner
steve...@gmail.com

kovas boguta

Feb 4, 2015, 3:23:09 PM

to cloju...@googlegroups.com

Brandon Bloom

Feb 5, 2015, 11:28:50 AM

to cloju...@googlegroups.com

This is their main point - to serve as an extensible, semantically rich data format. The default behavior thwarts this.

For what it's worth, this matches my experience.

If you do (defrecord Tagged [tag value]) yourself, now you've got a multi-standard problem in which no two pipeline processors can communicate unless they were written to the same Tagged type (read: written by the same author).

At minimum, it would be nice to have a Tagged type in the core language, because then a trivial default-data-reader can actually span the gap.

Even if the default behavior doesn't change outright, it could be readily normalized with a single flag.
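
As a sketch of what such a "trivial default-data-reader" looks like with a shared type: Clojure 1.7 later added tagged-literal (the function mentioned earlier in the thread) and tagged-literal?. Assuming Clojure 1.7 or later, and a made-up tag name, unknown tags can pass through a pipeline intact:

```clojure
;; Assumes Clojure 1.7+; #my/unknown is a made-up tag with no reader defined.
(def v
  (binding [*default-data-reader-fn* tagged-literal]
    (read-string "#my/unknown [1 2]")))

(tagged-literal? v)
;; => true

(:tag v)
;; => my/unknown

(:form v)
;; => [1 2]

;; Printing restores the original tagged text, so the value round-trips.
(pr-str v)
;; => "#my/unknown [1 2]"
```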