The so-called ``tagless-final'' style is a method of embedding domain-specific languages (DSLs) in a typed functional host language such as Haskell, OCaml, Scala or Coq. It is an alternative to the more familiar embedding as a (generalized) algebraic data type. It is centered around interpreters: Evaluator, compiler, partial evaluator, pretty-printer, analyzer, multi-pass optimizer are all interpreters of DSL expressions. Doing a tagless-final embedding is literally writing a denotational semantics for the DSL -- in a host programming language rather than on paper.
The tagless-final approach is particularly attractive if the DSL to embed is also typed. One can then represent in the host language not only DSL terms but also the DSL type system (i.e., type derivations): Only well-typed DSL terms become embeddable. Therefore, the type checker of the host language checks -- and even infers for us -- DSL types. Even DSLs with resource-sensitive (affine, linear) types are thus embeddable. The approach ensures by its very construction that all interpretations -- in particular, various transformations and optimizations -- are type-preserving.
The characteristic feature of the tagless-final approach is extensibility: a DSL expression, once written, can be interpreted in a variety of ways -- to evaluate, to pretty-print, to analyze, to transform or compile. At any time one may add more interpreters, more optimization passes, and even more expression forms to the DSL while re-using the earlier written DSL programs and interpreters as they are. Perhaps counter-intuitively, the tagless-final style supports DSL transformations: from reduction, constant propagation and partial evaluation to CPS transformations and loop interchange.
We have used the approach to implement extensible DSLs in the domains of language-integrated queries, non-deterministic and probabilistic programming, delimited continuations, computability theory, stream processing, hardware description languages, generation of specialized numerical kernels, and natural language semantics.
- Typed Tagless Final Interpreters: The First, Introductory Course (with published lecture notes)
- Modular, composable, typed optimizations
- Tagless-final optimizations, algebraically and semantically
- Finally Tagless, Partially Evaluated: Tagless Staged Interpreters for Simpler Typed Languages [The first paper]
- Algebraic Data Types and Pattern-Matching
- Beyond Church encoding: Boehm-Berarducci isomorphism of algebraic data types and polymorphic lambda-terms
- Pattern-matching on tagless-final terms
- Algebras
- Denotational semantics
- Tagless-Final Partial Evaluation with De Bruijn indices
- Typed Type Checking
- Relating Final and Initial typed tagless representations
- Abstract Categorial Grammars: Tagless-Final style for natural languages
- Intrinsic encoding
- Do-it-yourself domain-specific optimizing compiler: The beginning of the tagless-final approach
- Tagless (staged) interpreter typeclass for typed languages
Typed Tagless Final Interpreters: The First, Introductory Course (with published lecture notes)
The so-called `typed tagless final' approach of Carette et al. has collected and polished a number of techniques for representing typed higher-order languages in a typed metalanguage, along with type-preserving interpretation, compilation and partial evaluation. The approach is an alternative to the traditional encoding of an object language as a (generalized) algebraic data type. Both approaches permit multiple interpretations of an expression, to evaluate it, pretty-print, etc. The final encoding represents all and only typed object terms without resorting to generalized algebraic data types, dependent or other fancy types. The final encoding lets us add new language forms and interpretations without breaking the existing terms and interpreters.
These lecture notes introduce the final approach slowly and in detail, highlighting extensibility, the solution to the expression problem, and the seemingly impossible pattern-matching. We develop the approach further, to type-safe cast, run-time-type representation, Dynamics, and type reconstruction. We finish with telling examples of type-directed partial evaluation and encodings of type-and-effect systems and linear lambda-calculus.
Version
The current version is August 2012
References
lecture.pdf [275K]
Typed Tagless Final Interpreters
Generic and Indexed Programming:
International Spring School, SSGIP 2010, Oxford, UK, March 22-26, 2010,
Revised Lectures
Springer-Verlag Berlin Heidelberg, Lecture Notes in Computer Science
7470, 2012, pp. 130-174
doi:10.1007/978-3-642-32202-0_3
Finally Tagless, Partially Evaluated: Tagless Staged Interpreters for Simpler Typed Languages
We have built the first family of tagless interpretations for a higher-order typed object language in a typed metalanguage (Haskell or ML) that require no dependent types, generalized algebraic data types, or postprocessing to eliminate tags. The statically type-preserving interpretations include an evaluator, a compiler (or staged evaluator), a partial evaluator, and call-by-name and call-by-value CPS transformers.
Our principal technique is to encode De Bruijn or higher-order abstract syntax using combinator functions rather than data constructors. In other words, we represent object terms not in an initial algebra but using the coalgebraic structure of the lambda-calculus. Our representation also simulates inductive maps from types to types, which are required for typed partial evaluation and CPS transformations. Our encoding of an object term abstracts uniformly over the family of ways to interpret it, yet statically assures that the interpreters never get stuck. This family of interpreters thus demonstrates again that it is useful to abstract over higher-kinded types.
Joint work with Jacques Carette and Chung-chieh Shan.
Although the paper itself does not use the name `tagless-final', it gives a hint about its origins, which by now may be regarded as historical accidents. The name `tagless-final' seems to have arose spontaneously, and stuck.
Version
The current version is 1.3, 2009
References
JFP.pdf [217K]
Journal of Functional Programming 19(5):509-543, 2009
Lecture notes from the course on typed tagless-final embeddings
of domain-specific languages
with mode details and more examples
README.txt [3K]
Commented code accompanying the JFP paper,
with the complete implementations of all interpreters.
The code files are in the same directory as the README.txt file.
Relating Final and Initial typed tagless representations
One may discern two approaches to the typed tagless representation of an embedded DSL -- to be labeled, for historical reasons, `initial' and `final'. Either representation can be interpreted with no errors due to type-tag mismatch, or due to a reference to an unbound variable. The absence of both sorts of errors is statically assured and patent to the metalanguage compiler.
In the so-labeled initial approach, typed terms are represented by GADTs. The absence of type-tag mismatch errors is the central property of GADT. The absence of unbound variable reference errors is assured either by higher-order abstract syntax (Xi et al., POPL 2003) or De Bruijn indices and dependent types (Pasalic et al., ICFP 2002). This page has described the final tagless approach. Type-tag mismatch errors are patently absent because there are simply no type tags and hence no possibility of type errors during interpretation. The absence of the second sort of errors can likewise be assured by higher-order abstract syntax (used here) or De Bruijn indices.
We demonstrate that the final and initial typed tagless representations are related by bijection. We use the higher-order language of the Tagless Final paper (APLAS 2007), which is the superset of the language introduced in Xi et al (POPL 2003). In the latter paper, the tagless interpretation of the language was the motivation for GADT. In a bit abbreviated form, the final and the initial representations of our DSL in Haskell are defined as follows:
class Symantics repr where
int :: Int -> repr Int
lam :: (repr a -> repr b) -> repr (a->b)
app :: repr (a->b) -> repr a -> repr b
fix :: (repr a -> repr a) -> repr a
add :: repr Int -> repr Int -> repr Int
data IR h t where
Var :: h t -> IR h t
INT :: Int -> IR h Int
Lam :: (IR h t1 -> IR h t2) -> IR h (t1->t2)
App :: IR h (t1->t2) -> IR h t1 -> IR h t2
Fix :: (IR h t -> IR h t) -> IR h t
Add :: IR h Int -> IR h Int -> IR h Int
The data constructor Var of the initial representation
corresponds to HOASLift of Xi et al. The initial
representation is parameterized by the type of the hypothetical
environment h: that is, h t is the type of
an environment `cell' holding a value of the type t.
The relation between the two representations is established as follows:
instance Symantics (IR h) where
int = INT
lam = Lam
app = App
fix = Fix
add = Add
itf :: Symantics repr => IR repr t -> repr t
itf (Var v) = v
itf (INT n) = int n
itf (Lam b) = lam(\x -> itf (b (Var x)))
itf (App e1 e2) = app (itf e1) (itf e2)
itf (Fix b) = fix(\x -> itf (b (Var x)))
itf (Add e1 e2) = add (itf e1) (itf e2)
The mapping from the final to the initial and its converse are both total; composing one mapping with the other preserves interpretations. The code below gives concrete examples of that preservation. The totality is especially easy to see for the mapping from the final to the initial, since the mapping even looks like the identity.
Version
The current version is 1.1, Jan 1, 2008
References
Algebras
In the case of a first-order DSL, which can be described by an
algebraic signature, both final and initial representations are
initial algebras, and hence isomorphic.
InFin.hs [8K]
Haskell code with the complete definitions of both representations,
several sample interpreters, complete bijections and their compositions.
The code includes several concrete examples.
Formatted IO as an embedded DSL: the initial view
Formatted IO as an embedded DSL: the final view
An example of initial and final embeddings of a DSL of formatting
patterns
Abstract Categorial Grammars: Tagless-Final style for natural languages
Tagless (staged) interpreter typeclass for typed languages
We demonstrate a tagless (definitional) interpreter for a typed language implemented in a typed meta-language: Haskell with multi-parameter typeclasses and functional dependencies. The interpreter uses no universal type, no type tags, no pattern-matching. It is, in fact, total -- syntactically. The interpreter supports heterogeneous binding environment and the (functional) dependence of the result type on the structure of the source term. The interpreter is in fact a type class:
class Eval gamma exp result | gamma exp -> result where
eval :: exp -> gamma -> result
Our code has been greatly inspired by the ICFP 2002 paper by
Pasalic, Taha, and Sheard on staged tagless interpreters. The paper
gives the most lucid explanation of the tagging problem in typed
interpretations. Although the paper develops a dependently typed
language Meta-D for writing typed tagless interpreters, the paper
itself gives hints that dependent types are not really necessary.
The key phrase is about indexing types by singleton types
rather than by terms. The former is easily implementable in Haskell
as it is. The introduction section gives the other hint: the apparent
problem with the eval function is that it should
yield an Int when evaluating the literal constant expression
B 1 and yield a function when evaluating the term
L "x" (Var "x"). Indeed no ordinary function
can return values of different types. But an overloaded function can,
e.g., Haskell's read.
With the help of Template Haskell, we stage our tagless code to remove its interpretative overhead. Because expressions in Template Haskell are untyped, we add a newtype wrapper to maintain their types. Our staged interpreter deals exclusively with these typed code expressions, to be faithful to the Pasalic et al. paper. Template Haskell can print code values, so we can see the staged result: the `compiled' code. In particular, here is the running example of the paper and the result of its evaluation with our staged interpreter:
stest4 = show_tcode $ seval (L (TArr TInt TInt) (V Z)) HNil
*Staged> stest4
\x_0 -> x_0
There are indeed no tags. Here is another test:
stest3 = show_tcode $ seval (A (L TInt (V Z)) (B 2)) HNil
*Staged> stest3
(\x_0 -> x_0) 2#
If we change TInt above to TArr TInt TInt, we get a type error
before running stest3: The typing is indeed done at the
meta-level.
The present code was the first attempt to define tagless interpreters in a language without (overt) dependent types. This work has continued in cooperation with Jacques Carette and Chung-chieh Shan. We showed that writing typed interpreters becomes significantly simpler if we change the building blocks of object language terms, from data constructors to constructor functions.
Version
The current version is 1.1, Aug 17, 2006
References
Emir Pasalic, Walid Taha, Tim Sheard: Tagless Staged Interpreters for Typed Languages. ICFP 2002
Interp.hs [3K]
The tagless typed interpreter for the the typed language of the
above paper, viz. simply-typed lambda-calculus with De Bruijn indices.
The interpreter is deliberately patterned after the one in the paper,
including the type-level function TypEval. The code almost literally
implements the Meta-D interpreter from Fig.3 of the paper and the
typing rules from Fig.1 -- without any dependent types.
Staged.hs [3K]
The staged tagless typed interpreter
Implicit configurations -- or, type classes reflect the values of types
(Haskell Workshop 2004. Joint work with Chung-chieh Shan)
The paper demonstrates how easy it is to introduce type families
indexed by singleton types in Haskell as it was in 2003: Haskell98
extended with multi-parameter type classes with functional
dependencies.