Why Specifications Don't Compose

hillelwayne.com

111 points by msiemens 5 years ago · 64 comments

mmaroti 5 years ago

This is blatantly false. People are using formal verification tools in hardware design, and they do work and they do compose. Here is some simple Verilog code that accomplishes some of what he wants.

  module Blinker (
   input wire clock,
   output reg state
  );
  
  always @(posedge clock)
   state <= !state;
  
  `ifdef FORMAL
  
  // $past() has no defined value on the very first cycle,
  // so only check the toggle property after that
  reg past_valid = 0;
  
  always @(posedge clock)
   past_valid <= 1;
  
  always @(posedge clock)
   if (past_valid)
    assert ($past(state) != state);
  
  `endif
  
  endmodule
  
  module testbench (
   input wire clock,
   output wire state1,
   output wire state2
  );
  
  Blinker blinker1(clock, state1);
  Blinker blinker2(clock, state2);
  
  `ifdef FORMAL
  
  // assume that they start out different
  initial assume (state1 != state2);
  
  // then they stay different
  always @(posedge clock)
    assert (state1 != state2);
  
  `endif
  
  endmodule
Of course you NEED clock wires to the blinker. Composability comes from the use of assume and assert. You assume that your environment is behaving according to spec, and you prove that you are also behaving to spec PROVIDED the environment is working correctly. See https://symbiyosys.readthedocs.io/en/latest/index.html for more details.

Moral of the story: a broken spec cannot be used, and I am saying this as a mathematician!

  • junon 5 years ago

    This. We use Hoare logic verification in software in very similar ways, to assert on symbol intervals to make sure arithmetic and memory operations can succeed in e.g. C.

tlarkworthy 5 years ago

I am not convinced.

There is no justification for specification languages to insist on a single global namespace with no option for namespacing. Just because one operator doesn't compose does not undermine the fact that some things do compose, and that namespacing and scopes are generally useful organisation and communication techniques.

The only reason why LTL is in a global namespace is pure legacy inertia. Academics only apply it to toy problems, so it never needs the scaling features like namespacing.

It's not about logic. That's just post-hoc Stockholm syndrome.

  • dan-robertson 5 years ago

    I don’t understand this at all. Either you have a totally different definition of namespacing to me (For me, this is basically getting to write x in most places and Foo::x and Bar::x in some places instead of foo_x and bar_x everywhere), or you’re just not addressing any of the issues discussed in the article (fairness, liveness, framing, etc).

    • tlarkworthy 5 years ago

      Look at the 2nd paragraph of the motivating conversation:

      ``` Lars, 7:53: In software engineering, we have largely figured out modularization. All modern programming languages have some form of module system. You can pull in a module and import its functions anywhere. ```

      ... and then they write a blog about how this is impossible.

      Namespacing is one trivial part of modularization (avoiding naming clashes). None of the LTL implementations I have used have it. There is no good reason.

      You're mentioning 'fairness' and other concepts that are part of expressing proofs on top of a state-evolution system. Yes, the proofs are why we are using LTL, BUT a good amount of time is spent building the model we wish to prove stuff about. There is no reason the state transitions cannot be namespaced and modularised, and it would save a ton of time, even if we have to express proofs globally over the "tree of state space".

      Even on the proof level though, liveness is a property that demonstrates the system never gets cornered. I am pretty sure that property will be expressible in general terms for a wide family of interesting systems. So I am pretty sure their argument doesn't hold even at the proof-of-properties level.

      The real reason is that abstraction is not something the logicians will get cited for, so it's a waste of their time to tackle that thankless task. And they probably would not be very good at it anyway.

      • dan-robertson 5 years ago

        I think the point of the article isn't that you don't have namespaces (which are basically trivial) but that, given two named specs, you often can't combine them into something that gives you what you want. Instead you need to tease apart their definitions and combine them in a more complicated way. And it is this problem that the article is about.

  • ahelwer 5 years ago

    Can you describe how a formal specification system with namespacing would work?

    • kgwxd 5 years ago
      • gugagore 5 years ago

        To me Clojure Spec is more like IDL or a JSON Schema (or CBOR or Protobuf or etc), but with more "dynamic" stuff as opposed to just sum/product types.

        It doesn't occupy the same space as, say, TLA+.

        • hwayne 5 years ago

          Clojure Spec is a contract system, which is a common way to write code specifications. You can formally verify it in theory; ACL2, another lisp, is used in formal verification. It's also hard to compose formally-verified code, as Lars talks about in the original link.

          • gugagore 5 years ago

            Thanks for the correction. I've wanted to see more examples of Clojure Spec in practice because I had really understood it as an interface description language.

    • tlarkworthy 5 years ago

      Something equivalent to hierarchical state machines.

sharkbot 5 years ago

It would seem that the problem is the notion of composition is underspecified in this domain. My intuition would be that one needs a different mechanism of composition aside from the most obvious “use basic logic operators to combine two unrelated logic statements”.

It’s unclear what “the composition of the two systems” really means. Are they running in parallel with no interaction (akin to two AWS instances in different data centers running each program)? Are they running in parallel with superficial interaction (two threads in the same process)? Are they running concurrently with mediated interaction (coroutines)?

The nature of how the programs are composed necessarily would influence the nature of how the specs should be composed, I would think.

Finally, maybe the fault lies with the logic system, rather than the problem?

  • hansvm 5 years ago

    I came here to say exactly this. The problem isn't that composing the two specs was hard, but that their desired "composition" required internal details from both specs.

    The same difficulty arises in software development too. If the behavior you want isn't just a combination of other existing systems, but rather a combination which peeks inside and hacks on additional constraints then you're going to have a bad time. Maybe it's still worth it for some reason, but that's a leading indicator that a few files probably ought to be refactored.

    • mrkeen 5 years ago

      > The problem isn't that composing the two specs was hard, but that their desired "composition" required internal details from both specs.

      But there's nothing 'internal'.

      In conventional programming we pretend details are internal, e.g. by storing database handles as private fields. But that database is out in the real world.

      TLA+ needs to be able to tell you that you can reach a bad state from a particular interleaving of reads and writes to a supposedly internal variable.

      • hansvm 5 years ago

        My phrasing could probably have been better there. Do you have any ideas for improvements?

        The salient detail is not that you (or at least the verifier) can't reach in and look at the internal bits, but that if your goal is re-use then almost by definition you want to be able to take a whole spec, class, function, ..., and drop it in a new environment without having to wrap it in too many shims and modifications.

        Going back to the programming analogy, if the typical way a class was used was by picking and choosing half the private members to reflect over then that would strongly indicate it was a poor abstraction for the current use case. If you wanted to build a system using classes like that as your primitives then you'd incur a lot of development overhead from the impedance mismatch between the thing you want to build and the blocks you're trying to compose to build it with.

        TLA+ is a little mathy right? Let's tackle the idea from a different perspective :) The idea behind composability as used by the author is that you want complex specs to be describable as simple functions of simple specs -- C = f(B, A) where f isn't too distasteful. We don't just see that idea in programming and formal models; famous math problems often have that flavor [0], and in general it's pretty common to want to know what you can build from a set of blocks or to design blocks that can build lots of things. The insight the author had is that some things aren't cleanly related in _that_ way even if as humans we can "intuitively" see _some_ kind of a connection.

        Drawing a parallel to the real world, if you had two blinker systems (including power and other garbage) each adhering to some kind of blinker spec (always on or off and with some kind of guarantee that you'll keep alternating between those), you'd have a devil of a time composing those two blinkers to create a system of two blinkers which were in sync, whereas with a different set of building blocks it could be easy to create the desired result. The problem you uncovered isn't that the real world is hard to compose though, but that your chosen abstraction didn't fit the problem very well.

        [0] https://mathworld.wolfram.com/QuinticEquation.html

        • sharkbot 5 years ago

          Just thinking out loud, this feels a bit like the problem ML’s module system is meant to solve: allowing the specialization of modules without needing to uncover the internal details of the implementation. I wonder if some of the same machinery would apply in this case. One would likely need a typed linear temporal logic theory to make it work.

dan-robertson 5 years ago

I wonder to what extent this is a problem arising from the global timesteps of LTL. They seem to require global transformations to work around, and the result of the transformations seems to be harder to reason about. I wonder if some alternative logic would work better (e.g. one where time is more continuous and so there is no such thing as the X operator, or maybe one where time is partially ordered in some way, but I'm not sure what that would look like).

I don't really know how to talk about the problem of the shared variable coupling the two processes too tightly. It seems like the logic is too relaxed in allowing any variable to change at any time, forcing specifications to lock down variables and therefore be too brittle to connect.

scotty79 5 years ago

All this is saying is that you can't compose behaviors because we insist on having only one whole state, and only one concept of what the 'next' state is.

If you can have state with sub-states, and a 'next' operator that is relative to whichever sub-state we want to address, then this example is easily composable (see the sketch below).

But then of course you might want to have the sub-states interact with each other, so it complicates further. It just looks like multithreaded programming.
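
A minimal Haskell sketch of that idea (my own illustration, not from the article; all names are made up): each sub-state owns its own step function, and the composite 'next' operators address only the sub-state they care about.

    -- Hypothetical types and names, for illustration only.
    newtype Blinker = Blinker Bool deriving (Eq, Show)

    stepBlinker :: Blinker -> Blinker
    stepBlinker (Blinker b) = Blinker (not b)

    data System = System { blinkerX :: Blinker, blinkerY :: Blinker } deriving (Eq, Show)

    -- "Next" operators that are relative to a sub-state: each touches only
    -- the piece of the world it names and leaves the rest alone.
    nextX, nextY :: System -> System
    nextX s = s { blinkerX = stepBlinker (blinkerX s) }
    nextY s = s { blinkerY = stepBlinker (blinkerY s) }

    -- Synchronous composition is then just function composition ...
    nextBoth :: System -> System
    nextBoth = nextX . nextY

    -- ... and an interleaved composition is a choice of which sub-state moves.
    nextAny :: System -> [System]
    nextAny s = [nextX s, nextY s, nextBoth s]

Whether this buys you anything once the sub-states start interacting is exactly the multithreading-style complication mentioned above.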

  • Pet_Ant 5 years ago

    Sure, but do you have a formal model of such a system that is sound? I think unstated is that LTL is a known, intuitive, and well-understood model. Sure, you can ad-hoc a fix, but will it have any rigor?

    • scotty79 5 years ago

      That's the trade-off: either you have things simple or you have them expressive.

      But I'm a strong supporter of the approach where you start with the semantics you want and work hard to make it work, as opposed to simple-to-implement semantics you don't want, which make programming harder.

      I prefer the TypeScript approach to the C++ one.

      • hwayne 5 years ago

        Then you've left the realm of formal specification of systems, though, so you're entirely outside the scope of the problem.

        • scotty79 5 years ago

          No. I haven't left anything. I'm just saying that expecting a useful, widespread formal specification language to be something as simple as LTL is wildly optimistic, and you should expect that it will instead mirror and encompass the complexities and requirements of actual programming, like the necessity of composability.

          And it will most likely be somewhere between practical but unsound specification languages and those that come from pure mathematics.

          • hwayne 5 years ago

            I used LTL because it's simple enough to explain the core ideas without getting bogged down in affordances and details. It's purely a pedagogical choice, just like how nobody explains the halting problem in terms of Java.

            Switching to a different notation helps with some of the practical details, but doesn't remove the essential problems that arise in composing specs.

            • scotty79 5 years ago

              The source of the problem with composability in this post is that you don't have a concept of sub-states and 'partial next' operators.

              • hwayne 5 years ago

                Would you be willing to provide a full example, replicating the dependent spec?

                • Pet_Ant 5 years ago

                  I'm wondering if you could, as part of the O operator, list all the variables in consideration for the "next" state as far as that operator is concerned.

                  Like O[a,b,c]X: in the next state, which is determined by a change to any of a, b, or c, X will be true. In the case where the set of variables is all of them, they can be omitted and thus you get the existing behaviour.

                  I also suspect that the O operator with a subset of variables will behave identically to other instances with the same set.

                  However, I feel like this will still hit walls before practical limits; Beth would just need a non-trivial example.
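
                  For concreteness, here is one possible (purely speculative) reading of such an indexed operator over a trace σ = s0, s1, s2, ..., writing s|{a,b,c} for the restriction of a state to the listed variables:

                      (σ, i) ⊨ O[a,b,c] X   iff   (σ, j) ⊨ X,  where j = min { k > i : s_k|{a,b,c} ≠ s_i|{a,b,c} }

                  If the listed set is all variables and every step changes something, this collapses to the ordinary next operator, as suggested above; if the listed variables never change again, one still has to decide whether O[a,b,c] X holds vacuously or fails, which is exactly the kind of corner where a non-trivial example would be needed.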

          • Pet_Ant 5 years ago

            It must be sound or it's of no use. Reminds me of the Ultraviolet Catastrophe: they had a theory that gave them precise answers, which just happened to be blatantly wrong.

jkhdigital 5 years ago

Ah, universal composition... the mother of complexity. Seems like the standard approach, in all manifestations of composed systems, is to find some property that makes composition “work” and then sandbox or guardrail the real system to (1) ensure the property holds in practice, and (2) recover without too much collateral damage when it fails to hold.

Of course there are no black boxes in the real world, but the staggering hierarchical complexity of life would seem to indicate that one can engineer systems (whether natural or man-made) that are close enough to be indistinguishable.

ncmncm 5 years ago

In practice, expressible specs are always, always, always incomplete. Some requirements just don't fit into any usable formalism. Some that do fit unavoidably overspecify behavior, so that, e.g. more efficient, acceptable behavior would be disqualified.

There are reasons why we don't just write formal specs and generate a system from them. It is not just stubbornness.

What is astonishing is that certain isolated systems have been generated from formally complete specs. They need to be recognized as miraculous, unique, and unrepeatable.

MaxBarraclough 5 years ago

This post is about stateful formal methods. Formal methods can also be used with functional programming, right? Is it any easier to compose specifications in that case?

  • neel_k 5 years ago

    For behavioural correctness of terminating functional programs, you can by and large treat your programs as plain mathematical functions. Then reasoning about them is as easy (or more likely, as hard) as ordinary math.
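
    As a toy illustration of that point (my example, not neel_k's; the property is merely tested with QuickCheck here rather than proved): the specification of a terminating pure function is just an equation, which you can state directly and reason about by ordinary induction on lists.

        import Test.QuickCheck (quickCheck)

        -- The spec is an ordinary equation over a pure function:
        -- reversing a list twice is the identity.
        prop_reverseTwice :: [Int] -> Bool
        prop_reverseTwice xs = reverse (reverse xs) == xs

        main :: IO ()
        main = quickCheck prop_reverseTwice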

    If you want to reason about the performance of functional programs, or about the behaviour of imperative/concurrent programs, then you need more sophisticated techniques.

    But even in this case, specification composition (taken in the sense of the article, to mean the ability to state properties and proofs in a way which aligns with program module boundaries) has become a relatively routine technique in the last 5-10 years.

    Basically, there are three main ingredients we seem to want: (1) you need to understand the abstract resources your program manipulates and how different parts of the program transfer ownership of these resources, (2) you need to understand the protocols (i.e., state machines) that your program components are participating in, and (3) you need to understand the rely/guarantee invariants that the program components are maintaining.

    Historically, verification techniques have had good stories for one or two of ownership/protocols/invariants (e.g., TLA+ struggles with handling ownership), but the research community has developed techniques in the last few years that can handle all of them. As a result, there don't seem to be many small programs out of the reach of program verification -- compositional machine-checked proofs of lock-free data structures over weak memory are a thing people can do these days. These proofs are by no means easy, but that's because the algorithms are fundamentally intricate rather than because our proof techniques suck.

    We don't have any kind of proof that this is a complete inventory of the techniques we need to verify these kinds of programs. But there is the empirical fact that there are fewer and fewer small programs we don't know how to verify, so it's not an unthinkable possibility.

    • simiones 5 years ago

      But the article doesn't deal with the possibility/impossibility of verification at all. It deals with the fact that (some?) specifications don't compose elegantly.

      That is, that a sufficient specification for System A and a sufficient specification for System B can't be trivially composed into a sufficient specification for a System C = System A + System B. That's not to say that System C can't be specified, but that it can't be specified without re-creating the specifications for System A and System B.

      • hwayne 5 years ago

        If I'm being honest, a comment from /u/neel_k on this is enough to make me doubt my entire premise. He's a giant in the field and knows many, many things I don't.

  • maweki 5 years ago

    Now you have to talk about what a specification is.

    If it is a relation from some function input to its output, that would be one step. And of course, functions are composable that way.

    But say you have a system behaviour that is a stream of function applications, e.g. [f(0), f(f(0)), f(f(f(0)))...] and I forgot the name for this function. This infinite stream is your system behaviour.

    Now we have a different behaviour [g(0), g(g(0)), g(g(g(0)))...]

    You can (more or less) easily reason about the relation of g^n(0) to f^n(0) where both streams or behaviours are synchronous. But reasoning about the behaviour of all g^n(0) in relation to all f^m(0) is difficult.

    All the useful properties for these streams, liveness, fairness, as well as other invariants, are not easily encoded.
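
    A small Haskell sketch of the setup (my own illustration; f and g are arbitrary placeholder step functions, and the generator is the `iterate` mentioned below): the synchronous, index-by-index relation between the two behaviours is easy to state, but a property relating every g^n(0) to every f^m(0) has no such pointwise structure.

        -- Placeholder step functions, chosen only for illustration.
        f, g :: Integer -> Integer
        f n = n + 1
        g n = 2 * n + 1

        -- [f(0), f(f(0)), ...] and [g(0), g(g(0)), ...] as infinite streams.
        behF, behG :: [Integer]
        behF = tail (iterate f 0)
        behG = tail (iterate g 0)

        -- A synchronous property: relate the n-th elements pointwise, e.g. f^n(0) <= g^n(0).
        inSyncUpTo :: Int -> Bool
        inSyncUpTo k = and (take k (zipWith (<=) behF behG))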

    • chriswarbo 5 years ago

      Co-inductive methods are useful for that sort of thing.

      For example, we can prove two infinite streams are equal by showing that their definitions/generators are "bisimilar".

      • maweki 5 years ago

        But the question is not whether their infinite extensions or limit are equal, but whether at each finite combination some property about the stream progress holds.

    • bqmjjx0kac 5 years ago

      > and I forgot the name for this function.

      I believe this is called `iterate` in Haskell.

  • cobbal 5 years ago

    The fact that CompCert, seL4, and similar projects exist is proof that a large amount of composition is possible if you have enough time to spend on it. It’s true that you can’t compose arbitrary formalizations, but you can architect your formalizations to be composable. Functional or stateful matters less than the engineering work required.

  • simiones 5 years ago

    Not really, the problem described in the article could be rewritten to think of Next as a function World->World.

    • contravariant 5 years ago

      Though in that case requiring that Next blinks both 'x' and 'y' should obviously mean they are required to blink synchronously. Depending on how you model World you can either compose the two Next functions that blink x and y or you can multiply their respective worlds and use the product map.

      None of these feel unnatural.
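
      A tiny sketch of that second reading (names are mine): give each blinker its own world and its own Next, and the product map over the product of the worlds is exactly the "blink synchronously" interpretation.

          -- Each blinker's world is just its own bit, and its Next flips it.
          type WorldX = Bool
          type WorldY = Bool

          nextX :: WorldX -> WorldX
          nextX = not

          nextY :: WorldY -> WorldY
          nextY = not

          -- The product map over the product of the two worlds: both blink in lockstep.
          nextProduct :: (WorldX, WorldY) -> (WorldX, WorldY)
          nextProduct (x, y) = (nextX x, nextY y)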

      • simiones 5 years ago

        Sure, that's a natural way of reading this requirement, but is it the right way? How would you specify the system such that it can produce:

        [(x, y), (!x, y), (!x, !y), (x, !y), (!x, y), ...]

        (that is, any interleaving of the two blinks that preserves the constraint that there is at least one blink in each "step")

        • contravariant 5 years ago

          If you ask me, requiring at least one blink in each step is a bit awkward, as it is a global requirement as opposed to a local one.

          But you can add the requirement Next(World) <> World if you really want to. Then you just get two sets:

          - Next(World) <> World

          - Next(World).x in (World.x, !World.x)

          and

          - Next(World) <> World

          - Next(World).y in (World.y, !World.y)

          and if you combine them you get

          - Next(World) <> World

          - Next(World).x in (World.x, !World.x)

          - Next(World).y in (World.y, !World.y)

          which seems to do what you want.

          Adding global constraints seems like an easy way to shoot yourself in the foot though.
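
          A minimal executable sketch of that combined constraint set (my own encoding; World, successors, and so on are illustrative names), treating the spec as the set of allowed next worlds rather than a single Next function. For a world of two booleans the "stay or flip" constraints are vacuous, so all the work is done by Next(World) <> World, and the successors are exactly the interleavings asked for above: any step in which at least one of the two blinks.

              type World = (Bool, Bool)   -- (x, y)

              -- Next(World).x in (World.x, !World.x)
              xOk :: World -> World -> Bool
              xOk (x, _) (x', _) = x' == x || x' == not x

              -- Next(World).y in (World.y, !World.y)
              yOk :: World -> World -> Bool
              yOk (_, y) (_, y') = y' == y || y' == not y

              -- Next(World) <> World
              changed :: World -> World -> Bool
              changed w w' = w /= w'

              -- The combined spec as a relation: every world satisfying all three
              -- constraints is an allowed successor.
              successors :: World -> [World]
              successors w =
                [ w' | x' <- [False, True]
                     , y' <- [False, True]
                     , let w' = (x', y')
                     , xOk w w', yOk w w', changed w w' ]

              -- successors (False, False) == [(False,True),(True,False),(True,True)]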

  • marcosdumay 5 years ago

    What the article calls specifications are exactly programs in a logical paradigm (not exactly FP, but you can translate into it).

    My understanding is that the problem the article states is that in a synchronous system with a global clock, everything has to be synchronized to that clock. And that if you do not add cycle-number-dependent logic, you cannot have cycle-number-dependent memory. I fail to see what is new there, or even non-obvious enough to warrant an article.

Garlef 5 years ago

tl;dr:

"Here are some binary operators for a certain notion of specification and some very good examples of why these are not sufficient to express some obvious specs we would like to express."

But:

As others have noted: the given notion of composition is too restricted. The central shortcoming, as indicated in the article, seems to be that the individual parts need to be reintegrated into a new whole. Binary operators can obviously not do this.

Candidates for a compositional calculus that should be able to deal with this are operadic composition via wiring diagrams and comprehension expressions interpreted via arrows (here, the final return/yield is used to reintegrate the individual parts).

tome 5 years ago

If your programming language encourages you to structure your code in ways that don't compose when there are other perfectly good ways of structuring it that do compose then what you have on your hands is a bad programming language.

  • tome 5 years ago

    Hillel, you called me out on Twitter so I felt like it would be best if I elaborated, for a few reasons. Firstly, hopefully it will be reassuring to know that although my comment seemed glib I was actually trying to engage with the substance of your work; secondly, it will help clarify my thoughts; and thirdly, someone else may be interested.

    I don't have any experience with machine-verified specifications but I do have experience writing computer programs and (traditional, human-checked) mathematical proofs, so my comment comes from that perspective.

    Regarding

    > Me: 200 words on how we're talking about composing specifications, not code, because specifications aren't programming languages

    > HN: [parent]

    I'm sorry if the use of the terminology "programming language" was jarring. I tend to take the point of view that a huge range of human activity in what are traditionally called programming languages, probabilistic graphical models, languages for constraint solving and optimisation, a lot of mathematics (especially category-heavy mathematics) and even writing YAML for configuration is actually all just different aspects of the same thing. I chose the word "programming" in the parent which might betray some bias on my part but regardless of the word chosen the practice of combining smaller logical pieces into larger logical pieces is common to all. Your article is suggesting that there is something about the nature of specifications that makes this composition harder than it is for executable code.

    The conclusion may be true (I have no idea) but I'm afraid that I just don't see how your article justifies it. In the first example I see a language LTL whose design encourages users to write small logical units that cannot be composed[1] easily with the && operation. A more awkward encoding does achieve compositionality (through stuttering-invariance). It looks like TLA+ provides notation to make it more natural to write the small logical units in ways that they can be composed easily with &&. That makes it sound to me like TLA+ is a better language than LTL (for this purpose)! Better languages make it easier and more natural to write smaller units in ways that can be composed. That's all my parent comment says!

    In the second example it is composition with && itself that proves to be impossible. The composition rule is something more complicated:

        Next = (NextX(z) && Same(y)) || (NextY(z) && Same(x))
        Spec = InitX && InitY && □[Next]_xyz
    
    Unless I'm missing something, there's nothing wrong or hard about that composition rule; it's just that the tooling doesn't make it easy to work with. In Haskell we compose monadic functions with >>=. That would make life really awkward unless we had do notation. The problem is not that monadic functions are hard to compose, it's that the language needs to support them ergonomically. I can't see anything in your post that suggests to me that specifications are by nature hard to compose. All I can see is unergonomic languages.
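
    To make the Haskell analogy concrete (a toy example of my own; `half` and `quarter` are made-up names): composing monadic functions with >>= is not hard, and do notation adds nothing you couldn't already express -- it is purely the ergonomic affordance.

        -- A partial "halving" step that can fail.
        half :: Int -> Maybe Int
        half n = if even n then Just (n `div` 2) else Nothing

        -- Composing two monadic steps explicitly with >>= ...
        quarter :: Int -> Maybe Int
        quarter n = half n >>= half

        -- ... and the same composition written with do notation.
        quarter' :: Int -> Maybe Int
        quarter' n = do
          m <- half n
          half m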

    See also https://www.haskellforall.com/2016/04/worst-practices-should...

    [1] composed in the sense which your article elaborates, that is, in a way which combines the properties of both in some desirable way

    • hwayne 5 years ago

      As also mentioned on Twitter, I think I'm taking the criticism of this article much harder than usual. I have a response to your comment, but it's a bit "sideways" from the actual discussion. Ping me on email/twitter DM?

veltas 5 years ago

Please change title to reflect that this is not about "specifications" in the more common usage and is about formal methods.

  • simiones 5 years ago

    Not really, the problem described in the article applies to any kind of spec.

    You could take the specs to be "x must blink" and "y must blink". Either of these informal specs is more or less explicit enough to implement directly, but the spec "both x and y must blink" can be interpreted in too many ways to implement.

  • dejj 5 years ago

    What is the “more common usage” of the word “specifications”, for example?

    • MauranKilom 5 years ago

      I read this as the "specifications" in "requirements and specifications" for e.g. software projects. For example, "there should be a button that pulls in new data when clicked". That's generally pretty different from formal/mathematical specifications.

      • Twisol 5 years ago

        Even those kinds of specifications don't compose well. For instance, if you have a solved problem, and you add a requirement, you don't really have a solved problem anymore, and the new solution may look extremely different from the old one.

      • jkhdigital 5 years ago

        I’d suggest that they are in fact the same thing, and differ only in the level of formality... somewhat akin to the difference between machine code and interpreted scripts.

        • skybrian 5 years ago

          An informal spec is about getting your intention across to humans who will interpret it in some reasonable way and fill in any gaps. It’s not math or a computer program at all.

          David Chapman makes a distinction between rational and reasonable behavior, and it’s more like his “making breakfast” example:

          https://metarationality.com/reasonableness-aspects

        • gugagore 5 years ago

          Both machine code and interpreted scripts are executable, and so they have a formal meaning, at least for a given machine or interpreter.

          They both feel as "formal" as the other, and much more so than a spec written in e.g. English.

      • lou1306 5 years ago

        Uhm, are they that different, though?

        For instance, you can formally express at least a part of the "there should be a button that pulls in new data when clicked" specification. In LTL, that would be something like:

        [] (buttonClicked -> <>newDataPulled)

        (Notice that I switched from actions (click, pull) to states (clicked, pulled), as LTL is a state-based logic.)

        • pydry 5 years ago

          Yes.

          E.g. according to the article "Specs are math" <-- not in my experience

    • nonameiguess 5 years ago

      File types have specifications and at least some of them can compose. One of the more bizarre things I ever had to do for work was make a general-purpose file comparison tool that could ignore certain classes of differences that were expected, in order to validate program outputs against regression test keys. In one case, xml files had base64-encoded pngs embedded in them, and the pngs had metadata chunks with xml embedded.

      Another interesting case: the NGA is required to keep an archive of every geointelligence product created in the US for X years, and the library only supports the nitf file format. This format specifies that images are stored as block segments that can compose either as tiles or as overlays, or in some cases can be non-viewable pixel data (i.e. something that can be used for downstream machine processing but isn't meant to be meaningful to humans, and if you loaded it into a viewer it would look like nothing - consider something like a map of dead pixels).

      One consequence of this is that if you want to disseminate something more lightweight to your customers ordering imagery products, like a simple jpeg so they don't need nitf viewers to see them, you make the jpeg and embed it as a nitf image segment with metadata indicating that an actual nitf viewing program should ignore it.

      There are Python build tools that get around the fact that their config isn't compatible with toml out of the box by embedding it into pyproject.toml: they add a key-value pair where the key is "legacy-ini" and the value is a string enclosing the entire config ini for the other tool. Or take secrets in Kubernetes, which specify key-value pairs where the value is always a base64-encoded string, allowing you to store anything at all as a secret, including another yaml file, even a Kubernetes manifest. We presently use this where I'm at to get the Anchore Enterprise license into the cluster, and the license is in yaml format. Base64-encoded yaml embedded into another yaml file.

      Of course, none of this is all that exotic. File types are just a subset of data types, and obviously types compose, both functionally and in terms of building compound types, including recursive types, from simple types. I think the real issue this guy is seeing is due to temporal logic introducing unique challenges, not that specifications in general don't compose. Specifications without temporality compose pretty easily in many cases. This is why regular expressions and context free grammars and compilers are possible.

      Nonetheless, the same issue would come up here. Given we can specify programming languages as context free grammars and regular expressions, why can't we compose them and have a compiler that compiles both Rust and Go at the same time and makes a single executable?

      The answer is we can, but the syntax doesn't actually fully specify the languages. A full specification needs to include the runtime and the ABI, and the actual behavior of those may not be fully specified in any actual specification other than the code for the respective compilers. It's not that they can't be composed. SWIG is a thing. It's just much harder than composing CFGs. You probably can't do it in a blog post or by hand on a piece of paper.

billytetrud 5 years ago

It seems to me that your specification language (formal logic?) sucks. If you can't specify parameters, then you can't compose. That's all you seem to be saying. You define the concept of states, but then you complain that you can't differentiate the timing of states between x and y. Well, that's your own fault for not having a powerful enough definition of how states work. This has nothing to do with specifications at a fundamental level.

BenoitEssiambre 5 years ago

This is interesting and in my opinion appropriately blurs the line between code and specs.

I don't know what the standard is nowadays, but in the past, every time I have argued against a purely natural-language specification in favor of a code-based spec such as a reference implementation, people looked at me in shock that I wanted to skip the specification part.

I tried to explain that I don't want to skip the specification. I just don't think human language is nearly precise enough to write an adequate specification. Natural language words are incredibly polysemic and contextual. Look, for example, at how many meanings the word "break" has: https://www.merriam-webster.com/dictionary/break

Kolmogorov suggested a long time ago that fully specified information distills down to computer programs: https://en.wikipedia.org/wiki/Kolmogorov_complexity, https://en.wikipedia.org/wiki/Minimum_description_length

To me the ideal language for a pure specification might be a mix of natural language and pseudo-code with a pseudo test suite. However, if you are writing this, you might as well go one step further and write working, testable code.

I like the concept of Literate Programming (https://en.wikipedia.org/wiki/Literate_programming) and its descendants, like having code with extractable inline comments that auto-generate documentation. I would argue that modern pull-request-based workflows that tie discussions to version-controlled code changes are also the progression of this line of thought. A cleaned-up version of these makes sense to me for a specification.

And I get some of the concerns. While natural language under-specifies, reference implementations over-specify. This is more of a problem with low-level languages. Modern high-level languages are getting fairly close to a form of pseudo-code. I fully agree that reference implementations shouldn't contain, or should hide, low-level optimizations.

I also understand that reference implementations can unduly tie specs to specific hardware, OSs and platforms.

But to me over-specification is less of a problem than under-specification and it can be mitigated by labeling particular functions or blocks of code as implementation specific and not part of the spec.

Without a spec written in code, the different implementations always have subtle incompatibilities. I see egregious versions of under-specification in government, where horrendously bad specs are created in order to issue RFPs for getting software built. They usually end up with non-working software at mind-blowing cost.

People have this weird misconception that you are contracting out to build software. You are not. Building software is really easy. You press the build button or type the compile command. Building software has been fully automated for a long time. What is difficult is designing software and specifying what it must do. This is because there is a vast jungle of protocols, business flows, hardware and software platforms that need to be interacted with differently for different needs. This is what needs to be specified and only computer code can do it adequately.

I really wish, for example, that Mozilla adopted the Chromium core. To me Chromium is a de facto standard spec. I don't think it is possible to specify a standard that makes browsers fully compatible without them having a common core based on a reference implementation. And we really need a popular non-profit-managed release of said reference browser core.

waynesonfire 5 years ago

I also don't agree with this. Specs are referenced by other specs all the time.
