Ask not what an object is, but...


I can barely remember the days when objects were seen as a new, shiny, promising technology. Today, objects are often positioned somewhere between mainstream and retro, while the functional paradigm is enjoying an interesting renaissance. Still, over the last few months I stumbled upon a couple of blog posts asking the quintessential question, reminiscent of those dark old days: “what is an object?”

The most recent (September 2012) is mostly a pointer to a stripped-down definition provided by Brian Marick: “It’s a clump of name->value mappings, some functions that take such clumps as their first arguments, and a dispatch function that decides which function the programmer meant to call”. Well, honestly, this is more about a specific implementation of objects, and a rather poor fit for, say, the C++ implementation. It makes sense when you’re describing a way to implement objects (which is what Marick did), but it’s not a particularly far-reaching definition.

The slightly older one (July 2012) is much more ambitious and comprehensive. Cook aims to provide a “modern” definition of objects, unrestricted by specific languages and implementations. It’s an interesting post indeed, and I suggest that you take some time reading it, but in the end, it’s still very much about the mechanics of objects ("An object is a first-class, dynamically dispatched behavior").

Although it may seem ok from a language design perspective, defining objects through their mechanics leaves a vacuum in our collective knowledge: how do we design a proper object-oriented system?

On one end of the spectrum, we find people quibbling over increasingly irrelevant details of what an object is or is not (just see some of the comments on Cook's post :-); on the other end, we find programmers abusing objects all the time, building “object oriented software” out of manager classes and get/set methods, with just a few principles as their beacon toward the promised land.

Still, it's not my intention to question those definitions of objects. In a more radical way, I say we should redefine the question, not the answer.

[Optional] A little history…

(Feel free to skip this and jump to the next paragraph, if you don’t care much for the past. However, Alan Kay will promptly tell you that without history, it's just pop culture.)

Do we actually need to dig deeper into the definition of object orientation? Isn’t object oriented mainstream now? Isn't it true, as Cook says, that “The fundamental nature of objects is well-known and understood by the OO community”?

I tend to disagree. Bad object orientation is mainstream; good object orientation is rare. Open up the Android source code (just to mention a relatively recent large-scale development) and you’ll see that 5000-LOC classes are the norm rather than the exception.

So, maybe the question is not really the best one. Maybe we shouldn’t ask what objects are, but what object orientation is about. Surprisingly, or maybe not, most literature simply equates object orientation with the adoption of an object oriented language, adding very little to the conversation. There are also a few papers defining OO as "modeling the real world", which leaves a lot to be desired, but overall, the literature mostly focuses on mechanisms (inheritance, polymorphism, classes). Sure, we have the usual blurb on principles, but that's more of a band-aid and can only help so much.

Facing dissatisfaction with contemporary literature, sometimes I turn to history for inspiration. For instance, we may just look up Alan Kay for the ultimate definition of what object orientation is about, right?

Wrong :-). Alan has always been elusive about defining OO. Feel free to peruse this comprehensive wiki page, but don't feel ashamed if you leave with a sense of inconclusiveness. We could also look up an interesting email exchange between Stefan Ram and Alan, ending with "OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things", which is better than nothing, but yeah, not really great guidance when you're designing a system.

Well, maybe it's just old stuff. However, even recently, answering a (ridiculous) critique of OOP incredibly featured in Communications of the ACM, Alan played the "misunderstood" card and said “the remedy is to consign the current wide-spread meanings of “object-oriented” to the rubbish heap of still taught bad ideas, and to make up a new term for what I and my colleagues did”. Ouch.

Admittedly, he touches again on concepts like messaging, but honestly, we can't blame people for going with mechanistic definitions if we cannot provide a good alternative.

Well, if Alan won’t cut it, perhaps his nemesis (Bjarne Stroustrup) will. After all, Bjarne wrote a promising paper, “What is ‘Object-Oriented Programming’?”, revised in 1991. Still, the paper is largely language-centric, and the general idea is to equate “support for data abstraction” with the ability to define and use new types, and “support for object-oriented programming” with the ability to express type hierarchies. Not what I was looking for.

Well, maybe even “what is object orientation” is simply not the best question.

What if…

I guess you’re all familiar with this kind of structure:

It’s that saddle-shaped thing, made of cables and membranes, often used to cover stadiums and other large areas. The technical term is tensile structure.

Now, if you’re in computing, you can just call that a new construction paradigm, show a few examples of saddle-shaped things, then move to the much more interesting field of cables and membranes, and spend all your time debating about PTFE-coated fiberglass vs. PVC-coated polyester and how only a complete moron would ever consider creating a tensile structure in anything other than an ETFE film.

But this is not computing, so we can find a much better definition of tensile structure. You just have to fire up Wikipedia and step into an entirely new world. The page begins with a precise definition of tensile structure, based on forces and reactions to forces:

“A tensile structure is a construction of elements carrying only tension and no compression or bending.”

There is an immediate distinction from tensegrity, which carries compression as well. Only after a while do you find a discussion of materials, like membranes and cables, and of what is more appropriate when you want to build a tensile structure.

A precise definition like that removes any possible source of confusion. Being tensile is not about the saddle shape: you can't just mimic the shape of a tensile structure and claim that it is indeed a tensile structure. Besides, tensile structures don't have to be saddle-shaped.

It's also not about the materials. It’s not strictly necessary to use membranes and cables; it's just that, given our current understanding of materials, those seem like a very good fit.

Interestingly, the definition is short yet powerful. It provides guidance when you design structures and guidance when you choose (or even better, design) materials.

The real question

So maybe we should not ask "what is an object" and then define object orientation as "the dynamic creation and use of objects" (the quote is a bit out of context here – Cook was talking about languages – but you get the idea). Nor should we ask ourselves “what is object oriented programming” and define it in terms of language features. That would be very much like talking about cables and membranes and then saying that if you build it with cables and membranes, then it's a tensile structure. It appeals to people who want to reason at the material level (languages and constructs), but it's neither precise nor really useful for the software architect.

So the real question should be: what is an object oriented structure? Once we know the answer, we can freely discuss how different materials (languages, constructs, etc.) can help us build an OOS, how they help, and how they don't.

Ideally, the definition of OOS would also bring some clarity on what we can expect out of an OOS and what is not to be expected. For instance, we do not expect tensile structures to sustain heavy snow load.

We could also try to define the nature of a functional structure, and compare the two without (too much) advocacy. Same thing for the logic paradigm, still waiting for its own little renaissance. Or for Aspect-Oriented structures.

The problem, of course, is that we don’t know how to do that. We don’t have a theory of forces. We don’t have clear notions about how software materials react to forces. So we end up building an entire industry based on advocacy, more than on science. It doesn’t have to be like that, of course. Seneca used to say “It is not because things are difficult that we do not dare, but because we do not dare, things are difficult”. So let’s dare :-)

Interlude – the Three Spaces

I’ll assume that most of you guys are not familiar with my work on the Physics of Software, so I’ll recap a simple notion here, that will be useful in understanding what follows.

Software is shaped by decisions. Decisions about the function (what the software will do) and about the form (how we are going to do that). Some decisions are taken after careful consideration, some by routine, and others by feedback (from the users, or from the material). Decisions have this nasty habit of being unstable. Features will be added, removed, updated. We’ll change the form by refactoring. Etc. At any given point in time, our software is just the embodiment of a set of decisions. I consider that a single point in a multi-dimensional decision space. When decisions change, we move to another point in the decision space.

Software is encoded in artifacts. It doesn’t really matter if you write procedural, functional, or OO code. It doesn’t really matter if you use models or code. We cannot manipulate information; we can only manipulate a representation of information (an artifact). Languages and paradigms define the structure of their artifacts, and provide modular units. A function is a modular unit in the functional paradigm. A class is a unit in the OO paradigm. For various reasons, I tend to call those units “centers”, which is a more encompassing concept in the physics of software.

In the end, a point in the decision space is encoded into a set of artifacts in the artifact space. Those artifacts define a set of centers.

As we know, the same run-time behavior can be obtained by widely different organizations of artifacts and centers. In the end, however, the knowledge encoded in those artifacts will be executed. Things will happen on real hardware, not in some theoretical semantic space. Things like cache lines and multicore CPUs will come to life and play their role. I call this the run-time space.

Now, to cut it short, a change in the decision space will always be about the run-time or the artifact space, but it will always be carried out through a change in the artifact space. I want that new feature: it’s about the run-time space, but I need to add code in the artifact space. I want to refactor that switch-case into a polymorphic class hierarchy. It’s about the artifact space – the run-time, visible behavior won’t change a bit – and it’s also carried out in the artifact space.

In the physics of software, we acknowledge that materials (the artifacts) are shaped by decisions. Therefore, those decisions are the forces acting upon the materials (following D’Arcy – it’s a long story :-). Those forces will materialize as changes.

So, what is an Object Oriented Structure?

Ok, this is my best shot as of December, 2012. Maybe I’ll come up with a better definition at some point. Or maybe you will. I'm totally interested in improving it, but not really interested in a fight over it :-), so feel free to chime in, but please be kind. Note that I'm not trying to say that OO is "good" or anything like that. I'm just trying to define what an OOS is, from the perspective of forces and reactions to forces.

Given a Change H (in the decision space), H will occur with some probability P(H). Generally speaking, P(H) is unknown, but in many cases we can reasonably classify P(H) in discrete sets, e.g. “unlikely”, “possible”, “very likely”.

H will be about the Run-Time space or about the Artifact space, but will always entail some work (a change again) in the Artifact space. Let’s say that H will require a change (creation, deletion, or update) in artifacts A1...An.

Now, given a set of (meaningful) changes H1…Hk, occurring over time, an OOS is one that minimizes the total size of the artifacts you have to update, encouraging the creation or deletion of entire artifacts instead. Slightly more formally, within the limits of HTML text, an OOS will try to minimize:

Sum( j = 1…k ) { P(Hj) * Sum( i = 1…n ) { Size(Ai) | Ai is updated in Hj } }

A couple of notes:

  • Cheating is not allowed. If you delete a module and add it back with some changes, it counts as an update, not as 1 delete + 1 create.

  • Note the P(Hj) factor. It accounts for the fact that it’s not always possible to balance all forces, and that a good OOS accounts for the most likely changes.

Does it work?

I understand that this definition may not look like OO at all. Where is polymorphism? Where is encapsulation? Where is inheritance? Well, that's the material we use – the cables and membranes. A good definition of OOS must be beyond that.

Let’s use, once more, the familiar Shape example to explain why the definition above is aligned with Object Oriented Structures, while other structures tend not to comply with it. Say that we have a set of shapes: Circle, Polygon, perhaps some specialized polygons like Square and Triangle, etc. We want the ability to create one of those shapes, with a given position and size. We also want the ability to calculate the bounding box of that shape. Given that, we also want to get a bunch of those shapes, and distribute them evenly in space, like many programs do. Of course, that's just the beginning, we know that more features (and more shapes) are coming; we just don't know which features and which shapes.

What are our basic needs? Well, any shape is defined by some data, so we need to put those data somewhere. We also need to implement the bounding box logic. We also need to implement the "distribute evenly" logic. That requires the ability to move shapes around, of course. We can shuffle around that logic in different forms, of course. That's a design decision.

The decision space is usually very large, even for a small program like this. However, a few macro-decisions can be easily identified:

    1. We may choose a uniform representation for all the shapes, or let every shape have its own representation (if needed).

    2. We may have a separate implementation of the bounding box logic for each shape, or just a single center / artifact. That single center / artifact may or may not have to know about every shape type.

    3. We may have to know about every shape type inside the "distribute evenly" logic. Or not.

Let's go through this list with a little more attention to detail.

1) One might trivially say that every shape is just a list of points. Oh, the magic of lists. Except that, well, we don't want to turn the circle into a list of points, and maybe not even the regular polygons. Of course, we can still shoehorn a circle into a list of points. Put the center as the first point and the radius as the second point; or maybe any given point on the circumference will do. Or use 3 points to define the circle. But other things are easier when you store the center and radius (think area).

That's sort of misleading, though. It's a uniform representation with non-uniform behavior. For instance, when you want to move a circle you only move the center; an irregular polygon requires that you shift all the points. Also, what if we want to add a shape that is not easily defined by a list of points, like a fractal curve? You can choose a uniform representation if you want, but then you also need to store the shape type somewhere, and have some logic dealing with specific shape types, usually by switching on the shape type.

2) Calculating the bounding box for a circle and for an irregular polygon requires a different logic. We can have one big function switching on types and then using the uniform representation to carry out non-uniform behavior, or we can have a separate logic for each shape type. Then it all depends on how we want to invoke that logic: by switching (again) on shape type, or in some other way.

3) What is a reasonable implementation for the "distribute evenly" thing? Well, to be precise, it's usually a "distribute horizontally" or "distribute vertically" thing. Say that it's horizontally. A reasonable abstract algorithm is:

- take the bounding box of every shape involved

- add the width of each box together so you know the required space

- take the bounding box of the bounding boxes; take its width; that's the available space

- calculate free space as: available - required

- divide by number of shapes - 1; that's the spacing.

- start with the leftmost and space shapes accordingly, by moving their left side

Most of the logic is in terms of bounding boxes; basically, the only things dependent on the shape type are the calculation of the bounding boxes themselves, and moving shapes by the left side (different for a circle and an irregular polygon, again). Once more, we have choices: add a switch inside the distribute function (ugly), call a move function that switches inside, or do it some other way that does not require a switch.
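The abstract algorithm above can be sketched in a few lines of Python. This is a simplified, one-dimensional version with hypothetical names (BoundingBox, move_left_side); note how only bounding_box() and move_left_side() depend on the concrete shape type, and the algorithm itself never switches on type:

```python
from dataclasses import dataclass

@dataclass
class BoundingBox:
    left: float
    right: float
    @property
    def width(self):
        return self.right - self.left

class Circle:
    def __init__(self, cx, r):
        self.cx, self.r = cx, r
    def bounding_box(self):
        return BoundingBox(self.cx - self.r, self.cx + self.r)
    def move_left_side(self, x):
        self.cx = x + self.r          # moving a circle = moving its center

def distribute_horizontally(shapes):
    # Assumes at least two shapes; left-to-right order.
    shapes = sorted(shapes, key=lambda s: s.bounding_box().left)
    boxes = [s.bounding_box() for s in shapes]
    required = sum(b.width for b in boxes)          # space the shapes need
    available = boxes[-1].right - boxes[0].left     # box of the boxes
    spacing = (available - required) / (len(shapes) - 1)
    x = boxes[0].left
    for s in shapes:
        s.move_left_side(x)                         # per-shape logic
        x += s.bounding_box().width + spacing
```

For example, three unit-radius circles centered at 1, 3 and 10 end up centered at 1, 5.5 and 10: the algorithm only talked to bounding boxes, and the circles decided what "move the left side" means for them.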

Ok, we're set. Let's see if the definition of OOS above will tend to favor an OO implementation over the alternatives.

Create, don't Update

A very reasonable, likely Change is that we want to add a new shape type. We may not know the probability P(H), but we can say it's likely. If we want to do that by adding a new artifact, without changing existing ones, any kind of switching is forbidden. That includes the common cases of pattern matching, including case classes and the like, as they normally require updating the list of matches (it doesn't have to be so – more about this later).

So, the definition above basically forbids the switch-based (or match-based) implementation. There is relatively little choice: you need some form of late binding. You want to loop over shapes and ask for a bounding box. You can't look inside, because that requires switching over the shape type to make sense of the pseudo-uniform representation. You want to ask for the bounding box (a service, not data, guess what :-), but you don't (can't) know the function that will be called. Of course, the obvious OO way to do that (a Shape interface, a Circle class, a RegularPolygon class, etc.) is a very reasonable way to comply with this.
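A minimal sketch of that late-bound structure (Python; class and method names are illustrative): the client loops and asks, and adding a new shape type means creating a new artifact, not updating an existing one:

```python
from abc import ABC, abstractmethod

class Shape(ABC):
    @abstractmethod
    def bounding_box(self):
        """Return (left, top, right, bottom)."""

class Circle(Shape):
    def __init__(self, cx, cy, r):
        self.cx, self.cy, self.r = cx, cy, r
    def bounding_box(self):
        return (self.cx - self.r, self.cy - self.r,
                self.cx + self.r, self.cy + self.r)

class Polygon(Shape):
    def __init__(self, points):
        self.points = list(points)
    def bounding_box(self):
        xs = [x for x, _ in self.points]
        ys = [y for _, y in self.points]
        return (min(xs), min(ys), max(xs), max(ys))

# Client code asks for the service; it doesn't (can't) know which
# function will actually run -- that's the late binding.
def total_width(shapes):
    return sum(r - l for l, _, r, _ in (s.bounding_box() for s in shapes))
```

Adding, say, a FractalCurve later means writing one new class with its own bounding_box(); total_width and every other client stay untouched.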

Localized change

The prophets of public data and uniform representation won't tell you, but as you add new features, you may want to improve and adapt the internal representation. For instance, calculating the bounding box for a circle is simple math and you may want to do that on the fly; for an irregular polygon, you have to go through all the points, and you may want to cache that thing. It's a Change you may want to make at some point. You can't do that by merely adding a new artifact: you have to go in and change an existing one. Or many.

How do you estimate the impact? A proper OOS requires that this change will impact as little code as possible (see the artifact size in the formula). Well, if you:

  • expose a BoundingBox service, not your internal data

  • actually hide your data with some modular protection notion

you're pretty sure that there will be only one thing to change: the IrregularPolygon abstraction, and inside that, the BoundingBox function only. So you need some form of encapsulation, the modular hiding of data behind stable services. Once again, the obvious class-based design is a good way to comply.

As you know, most OO languages have a notion of public and private, and sometimes protected. You can easily see those notions as ways to limit the potential impact of a change. Theoretically, if a Change requires an Update of a Private part, that change will be local, which is consistent with the idea that an OOS tends to minimize the Update, not to eliminate it. Of course, for that to work, you really need to create an OOS. If you expose the private part through a Get/Set, the Update won’t be local anymore, reflecting the fact that the structure is not really OO.
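Here is a small Python sketch of the "localized change" point (names are hypothetical). Because clients can only reach the bounding_box() service, adding a cache to the irregular polygon touches exactly one artifact, and nothing outside it:

```python
class IrregularPolygon:
    def __init__(self, points):
        self._points = list(points)   # hidden representation
        self._bbox = None             # the cache is an internal decision

    def move_by(self, dx, dy):
        self._points = [(x + dx, y + dy) for x, y in self._points]
        self._bbox = None             # invalidate; clients never notice

    def bounding_box(self):
        """Stable service: (left, top, right, bottom), computed once."""
        if self._bbox is None:
            xs = [x for x, _ in self._points]
            ys = [y for _, y in self._points]
            self._bbox = (min(xs), min(ys), max(xs), max(ys))
        return self._bbox
```

Had clients been reading the points directly (or through a get_points()), the same caching decision would have rippled through every caller.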

Occam's Razor

Do we need an abstraction for the pentagon, one for the hexagon, one for every regular polygon? Or is it better to have only one? Once again, look at the reality of development, not at some philosophical idea. The reality of development is that you may not have got the best Shape interface from the very beginning, and that you'll change that. Maybe you didn't think of a "move left side" service in the beginning. Now you have to go in and add it to your shapes. That's a bad change from an OOS perspective: you change many artifacts (more on this later). Still, the definition is telling you to avoid unnecessary classification for its own sake. It's asking you to keep the number of artifacts (the cumulative size, actually) to a bare minimum. It's actually suggesting that in most cases, reusing a common implementation is good, because then you have only one place to change. Some kind of implementation inheritance will help balance this need with the need to specialize behavior (in some cases).

You don't need to "encapsulate" the Bounding Box (too much)

A large part of the "distribute evenly" logic is based on bounding boxes. A bounding box is an object and yeah, sure, it's nice to expose a few methods like height() and width() but, on the other hand, it's no big deal if you expose (perhaps read-only) the coordinates of top / left / right / bottom corners. Sure, strictly speaking, that would be a "violation of encapsulation", which is the kind of primitive and limited reasoning that principles tend to inspire. Once you look at the definition above, you'll quickly understand that the probability of change inside the bounding box, and especially the probability of that change having an impact on other artifacts, is pretty much zero. It's a stable abstraction, whereas Shape is an unstable notion (for a few hints on instability types, look up my previous post "Don't do it"). So, let's be honest: a few methods in BoundingBox won't hurt, but "exposing data" won't hurt much either.

You need to align the OOS with the force field

A very reasonable request would be a vertical version of the "distribute evenly" feature. The algorithm is basically the same, except now we have to move shapes by the top, not by the left side. That's a breaking change, and it's not localized. It's a high-impact change, because the obvious OOS (Shape interface + concrete classes) is well-aligned with the need to extend shape types, but not well-aligned with the need to extend shape methods.

We can look at this from different perspectives:

  1. We can’t protect ourselves from every possible orthogonal change in the multi-dimensional decision space. Not with “regular” OO anyway (and no, not even with regular FP). So we drop in the probability, and create a structure which is well-aligned with the most likely changes.

  2. We need better materials. Open classes, selective protection, etc. Understanding forces will guide us better. For instance, Scala-like case classes are a bad idea if you’re aiming for an OOS (because of the switch/case, of course, not because of the pattern matching itself). Find the real good idea (it's rather simple).

  3. OOP is a scam. It doesn’t work. Let’s move en masse to FP. Wait, what is a Functional Structure again?

Note that it would be wrong to say that OOP can’t sustain instability on the functional side. The well-known Command pattern, for instance, is an OOS dealing specifically with a set of functions that is known to be unstable (high probability of accretive change). OOP can’t deal well with orthogonal changes unless we improve the available materials. AOP, for instance, was a definite step forward in that area.
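As a sketch of that idea (Python; class names are hypothetical), here is a Command-style structure: the invoker is a stable artifact, and new operations arrive as new artifacts rather than as updates to existing ones:

```python
class Sprite:
    """A tiny movable thing, just to have something to operate on."""
    def __init__(self):
        self.x = self.y = 0
    def move_by(self, dx, dy):
        self.x += dx
        self.y += dy

class MoveCommand:
    """One operation = one artifact. A new operation is a new class."""
    def __init__(self, shape, dx, dy):
        self.shape, self.dx, self.dy = shape, dx, dy
    def execute(self):
        self.shape.move_by(self.dx, self.dy)
    def undo(self):
        self.shape.move_by(-self.dx, -self.dy)

class CommandStack:
    """Stable invoker: late-bound, accepts anything with execute/undo."""
    def __init__(self):
        self._done = []
    def run(self, command):
        command.execute()
        self._done.append(command)
    def undo_last(self):
        self._done.pop().undo()
```

Adding, say, a ResizeCommand later is a pure Creation in the artifact space; CommandStack never changes.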

It is worth considering that, in many cases, instability in some areas is temporary. Very much like atoms in an annealing process, our design is often unstable in the beginning, as we try to understand the "right" partitioning and the "right" responsibilities. In many cases, instability on methods does not go on forever, as we can find a kernel of stable methods (I often say that the kernel tends to "saturate"). After that, we can build algorithms outside the kernel, like the "distribute evenly" thing. By the way: where do we put those algorithms in a proper OOS?

Mainstream OOP is a good fit, but not the only possible one

Whereas the discussion on "what is an object" tends to put strong boundaries on our materials, focusing on forces and reactions to forces is really liberating. The following mechanisms tend to be useful to build an OOS:

  • selective protection; not necessarily the "closed" kind provided by classes

  • late binding; but genericity is useful as well, in some circumstances

  • code reuse mechanisms, like inheritance, prototypes, mix-ins, delegation (ideally, the kind of automatic delegation that does not require manual construction of proxies), and higher-order functions.

  • generally speaking, indirection mechanisms (I'll get back to this in a future post), including, of course, closures and lambdas.

  • add your stuff here : )

What hurts: whatever makes you build an hourglass shape. Switch/case, the wrong approach to pattern matching (see above), etc. Whatever leads to duplication and non-modular decisions. Early binding of things (for instance, having to specify a class name in code to get an instance). Etc.

It also goes without saying that the adoption of a mainstream OO language gives no guarantee that we're going to build an OOS. The state of practice largely proves that.

But… why should we do this?

Sorry, this is already a long post. Understanding why an OOS as defined above is, let's say… interesting, is left as the dreaded exercise for the reader.

Beyond principles

You can easily see how many principles follow directly from this definition:

  • Demeter's law (trying to minimize the impact of some updates)

  • Information Hiding (same as above)

  • Don't Repeat Yourself (for unstable code, I should say)

  • Etc etc.

As I said many times now, I think it's time to move beyond principles and toward a better understanding of forces and properties in software materials.

Is this really the Smalltalk way?

Absolutely not. Smalltalk aimed at something bigger than this. In a sense, Smalltalk tried to apply the same rule to the language itself, in a sort of reflective way. To really understand what Alan meant by "extreme late-binding of all things", you need to realize that in Smalltalk, "if" (ifTrue: actually) is a method – or a message, in Smalltalk parlance.

The message is sent to a Boolean object, and a code block is passed as a parameter. Yep. A code block. Talk about extreme late binding of things. The meaning of that code block is up to the ifTrue: method. It's not a compile-time decision. This is way beyond what we usually do with most mainstream OO languages, although there has been a recent (late) move toward similar concepts.
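For readers without Smalltalk at hand, here is a toy Python mimic of the idea (purely illustrative; this is not how Smalltalk is implemented). The conditional is a method on a boolean-like object, and the "code blocks" are passed as lambdas:

```python
class STrue:
    """Boolean object that answers the if-message by running the first block."""
    def if_true_if_false(self, then_block, else_block):
        return then_block()

class SFalse:
    """Boolean object that answers the if-message by running the second block."""
    def if_true_if_false(self, then_block, else_block):
        return else_block()

def st_bool(b):
    """Wrap a Python bool into a message-answering object."""
    return STrue() if b else SFalse()

# Roughly: (3 > 2) ifTrue: ['yes'] ifFalse: ['no']
result = st_bool(3 > 2).if_true_if_false(lambda: "yes", lambda: "no")
print(result)  # yes
```

No if-statement anywhere in the dispatch: the receiver decides what running a block means, which is the "extreme late binding" Alan was talking about.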

Building a language this way means we can change the language itself by adding things, instead of changing a compiler (or interpreter). Paradoxically, languages that proclaimed to be "pure OO" like Java were actually a step backward from this, even when compared to C++. Funny, in its own way.

A definition based on entanglement

For those of you more familiar with my work on the Physics of Software, here is a bonus :-) definition based on the notion of entanglement:

An Object Oriented Structure is a construction of centers carrying mostly */D and */C entanglement between the decision space and the artifact space. When */U entanglement is carried, the size of the involved artifacts is probabilistically minimized.

There are also a few corollaries that follow directly from the definition of OOS, but that might be worth stating:

An OOS reacts to growth in the problem space with Creation in the artifact space, not with Update in the artifact space.

An OOS reacts to enumerative variation in the problem space by creating an isomorphic set of centers (often, classes) in the artifact space. This is strongly related to the enumeration law.

An OOS does not force a common structure when common behavior suffices. Optimal representation can be pursued with local changes within each center. An OOS hides the chosen representation inside the centers whenever the representation is potentially unstable.

An OOS allows the targets to make sense of the request, freeing the source from knowledge about the target types. Hence, an OOS hides the nature of the center outside the center itself (it's a Shape, not a Polygon).

An OOS will react to compositional behavior by forming compositional structures, not by creating longer functions (see some patterns: composite, chain of responsibility, decorator, etc).

There would be more to say here, including some ideas on an economic theory of software design, but this is a blog post, not a book :-).

Surprise!

This post is actually “Notes on Software Design, chapter 17”. Ok, you don’t have to go back and read the other 17 chapters (there is a chapter 0 as well). But you may want to help the world by sharing this post :-)

