A first lesson in meta-rationality
An interesting category of problems is like Bongard problems in that you have to deduce the rule from examples, but the examples are presented one at a time, at random long intervals, so you have to work from memory. Most real-world learning is like this.
When working from memory, it's normal for your memory to have already parsed the previous situation into features. As some of the later examples in the blog illustrate, it's easy to fall into parsing examples into the wrong set of features, which is how you'll remember them.
While I could solve all the problems in the article, I doubt I could solve any but the simplest if I was shown 1 image per day over 12 days and not allowed to write anything down.
Perhaps the lesson is that when you're trying to deduce a rule (say, for what conditions your software crashes in) you can increase your rule-discovering power greatly by making notes and being able to look at several examples side-by-side.
This is exactly right, and it is exactly what makes quantum mechanics and relativity so hard to wrap your brain around: by the time you get around to learning them, you have almost certainly deeply internalized a classical model of the world. It's just obvious that classical mechanics is "correct", that the world consists of objects embedded in a three-dimensional space that exist in specific places at specific times, and talking about a world where this is not true seems not even to make sense, let alone qualify as a viable candidate for actual truth.
It is equally "obvious" that the heavens are governed by different laws of physics than the earth, because things on earth fall down if unsupported and naturally come to rest and things in the heavens don't. And of course all of these things are equally wrong.
One can and should apply the same lesson to social and political statements. For example, people get hung up on arguing about things like whether or not "God exists" as if they were arguing about a question of objective fact when actually what they are arguing about is the meaning of the words "God" and "exists."
I wrote a longer take on all this about six years ago:
http://blog.rongarret.info/2015/02/31-flavors-of-ontology.ht...
You also have to know you want to solve a problem.
Once you get to the point where you have any hypothesis whatsoever, no matter how weak, a systematic approach (saving examples as test cases) helps to avoid confirmation bias and makes testing further hypotheses less costly.
Another hard one is when there is a simple probabilistic rule. You usually end up with an over-complicated rule that covers all your data instead of the true rule. (Of course, that gets down to what is at the basis of the probability: are you satisfied with a probability?)
Probabilistic rules themselves tend to require much more data, which can be expensive.
In computing we try to write deterministic tests that either pass or fail, which means you can run them once after a change and know what the state is. Even if you just suspect flakiness you may have to run the test hundreds of times to be confident that the probability of failure is sufficiently low.
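To put a rough number on "hundreds of times" (a back-of-the-envelope sketch of my own, assuming failures are independent with a fixed per-run probability; the helper name is made up):

    import math

    # If a flaky test fails independently with probability p on each run, then n
    # consecutive passes happen with probability (1 - p)**n. To claim "the failure
    # rate is below p" with confidence c, we need (1 - p)**n <= 1 - c.
    def runs_needed(max_failure_rate: float, confidence: float) -> int:
        return math.ceil(math.log(1 - confidence) / math.log(1 - max_failure_rate))

    print(runs_needed(0.01, 0.95))  # 299: roughly 300 clean runs to be 95% sure failures are under 1%

So "hundreds of times" is not an exaggeration, and tightening either the failure bound or the confidence pushes the count up quickly.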
Here's a deduce-the-rule problem that completely stumped me until I wrote down a series of examples: https://illuminations.nctm.org/lessons/petals/petals.htm
It seems trivial from the name and I always get the same answer as it gives; is there a way to confirm my rule?
Spoiler?, ROT-13: Zl vagrecergngvba bs gur ehyr: "Crgnyf nebhaq gur ebfr" zrnaf pbeare qbgf nebhaq n pragre qbg, fb gjb sbe ebyyvat n guerr naq sbhe sbe ebyyvat n svir; mreb sbe nyy bgure ebyyf, nf gurl unir ab pragre qbg. Lrf, gur vafgehpgvba nobhg gur anzr orvat vzcbegnag jnf fb üore-boivbhf gung guvf jnf zl ulcbgurfvf rira orsber zl svefg tynapr ng gur qvpr.
> While I could solve all the problems in the article, I doubt I could solve any but the simplest if I was shown 1 image per day over 12 days and not allowed to write anything down.
This is true for me too. Even 1 image every minute over 12 minutes would be quite difficult for me. I'm not sure I could solve any of these problems without being able to look at the figures side-by-side, back-and-forth, again and again, until I get an "aha!" moment.
On a related note, I have found that revisiting old information multiple times over sometimes longish time spans and unconsciously comparing it with newer, fresher information often helps me gain insight and greater depth of knowledge.
Am I the only one that gets driven kind of crazy by these kinds of problems?
I'm not completely sure yet what it is, but I'm guessing it's the frustration of having to find a needle in a haystack of essentially infinite size: depending on how complicated you want to see the problem, there's an infinitude of potential 'solutions', and you never really know which level of complexity the author had in mind.
I love logic puzzles, where the system is constrained and you have to work within it, but these find-the-rule problems really aren't my thing so far. Maybe I'd need to develop a higher frustration tolerance for them, heh.
I think this has to do with tolerance for being stuck, and that varies depending on how rewarded you think you’ll be for figuring it out and getting unstuck.
Real science and math involves getting stuck on problems, perhaps for weeks, months, or years. I guess we should be happy that there are people who can tolerate being stuck.
My tolerance for getting stuck on a mere game has dropped dramatically since I was a kid; many of the games we played then are unplayable by modern standards. You had to draw maps and take notes, yourself, rather than the computer remembering things for you.
But text adventures back then sometimes weren’t meant to be played alone. The game might be single-player but it was a group activity for college students where you’d share ideas. The modern equivalent might be games where you’re expected to search the web to find recipes and strategies for things.
Yep, we aren't exactly being conditioned towards higher frustration tolerance nowadays... I don't think I could still deal with a PC without an SSD.
I think there are different kinds of being-stuck, though. With many problems in science, there's at least something you can try in order to gather more data. So you're stuck, but you can come up with new experiments to get new insights into the problem. Here, you're only given the one set of examples and have to make do. I guess you could still see the process of generating hypotheses and testing them against the examples as a sort of "experiment", but it still feels a lot "stuckier" if you don't get anywhere with it.
Don't worry you can always solve these problems trivially: The examples on the left side are all on the left side, while the examples on the right side are not.
Yeah, the author says that for any of these problems "there should only be one reasonable rule." But I suspect that "reasonable" here really points to contingent facts about human psychology, i.e. some rules just strike us as more intuitive or appealing than others, but they aren't correct in any objective sense. That sort of gives the lie to the notion that what we're exercising here is "meta-rationality."
The problem with these puzzles is that, without rules for the system, you can just make up your own rules and then solve the puzzle within the context of those rules.
For example, in the second puzzle, the arrangement of black-and-white shapes is the same on the left and right pages, but the right page is rotated relative to the left page. Is the question about the shapes as an ordered collection? Or is the question about the pages in their entirety? These problems tend to be underspecified, and end up being more of a guessing exercise about the author's intentions than anything else.
Yeah, I think a lot of it is learning to think the way the people who thought up the problem think.
It still may not be a bad exercise (like art students at a gallery copying a master's work), but you shouldn't get too far ahead of yourself claiming it's some sort of 'exercise in pure reason'.
These sorts of tasks are teaching you how to think a specific way which our society promotes. Mechanistic, natural, causal, rational (in the first-order logic sense) with a healthy dose of Ockham's razor and simplicity as an aesthetic.
I stress though, I'm a big fan of these things and the innovations they have enabled, but you still have to understand that they are an axiomatic underpinning. It's like Euclid's parallel postulate in a way: There is non-euclidean geometry out there.
The fact that the solution of these problems is in some sense satisfying also has a lot to do with the fact that the people making these problems are Western systematic-thinkers. They think like us (because we were trained by them).
That's not to say there isn't value in learning this way of thinking, it's gotten society a long way.
How would you characterize the opposite of (or alternative to) "systematic" thinking?
Not an alternative to systematization, but different systems. Just that the analogies or groups that make sense to one group may not make sense to others.
Could you give an example of a different type?
Broadly, for example, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2838233/
"""
Analytic cognition is characterized by taxonomic and rule-based categorization of objects, a narrow focus in visual attention, dispositional bias in causal attribution, and the use of formal logic in reasoning. In contrast, holistic cognition is characterized by thematic and family-resemblance-based categorization of objects, a focus on contextual information and relationships in visual attention, an emphasis on situational causes in attribution, and dialecticism (Nisbett, Peng, Choi, & Norenzayan, 2001).
"""
This also comes up a lot in cognitive test design.
The anecdote I've always heard in reference to it was
""" What is considered wise in one society may not be considered wise in another; the value and meaning of intelligence depends on cultural norms. Demonstrating the culturally-specific nature of knowledge and intelligence, Cole, Gay, Glick, and Sharp (1971) conducted an experiment in which Western participants and Kpelle participants from Liberia were given an object-sorting task. Participants were asked to sort twenty objects that were divided evenly into the linguistic cat-egories of foods, implements, food containers, and clothing. Westerners tended to sort these objects into the groups for food and implements, while Liberian partici-pants would routinely pair a potato with a knife because, they reasoned, the knife is used to cut the potato. When questioned, Liberian participants justified their pairings by stating that a wise person would group the items in this way. When the researchers asked them to show what an unwise person would do, they did the taxonomic sort that is more familiar to the Western culture. """
quoted from https://uscaseps.org/wp-content/uploads/2020/07/standardized...
That actually explains a lot! Thank you, that's a great anecdote
A simple example that still constrains the puzzle to human abilities (but also makes it less universal) might be this. Rather than diagrams/images, each puzzle consists of two groups of short depictions of two people interacting. The differences are in the relationships, emotional states, or modes of expression. That kind of judgement requires different perception, intuition, knowledge, and so on compared to the puzzles based on shapes. Probably a lot of people who were good with the "standard" Bongard problems would struggle with the "interpersonal" variety, and vice versa.
The flaw with that "solution" is that if you have found the rule, you should be able to say whether a new image belongs on the right or the left when it is presented to you. If you say, "I can do that, but I need one more bit of information, namely whether the new image belongs on the right or the left", that's a pretty severe defect.
There's no necessity (in general) that a new image should be classifiable under the rule. If I give you two finite groups A={1,3,9,-2} and B={7,-11,i,5} and the rule actually is tautological, then a new number 22 doesn't belong to either group under the rule.
A few of the examples from the article are actually similar. The one with two circles, where one circle is either clockwise or counterclockwise from the nearest indentation, only admits pictures with two circles, one on the surface of the other, and an indentation. There are images which wouldn't fit into either side.
A math professor of mine illustrated this point with number series (of the sort on aptitude tests, e.g. squares, arithmetic sequences, etc.) by listing an obvious sequence whose completion turned out to be an obscure function which diverged at the next point.
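I don't remember his exact sequence, but a toy version of the same trick (my own sketch, using polynomial interpolation rather than a diverging function) shows that any "next term" whatsoever can be justified:

    import numpy as np

    seq = [1, 2, 3, 4, 5]   # the "obvious" next term is 6...
    surprise = 42           # ...but a polynomial can be made to say otherwise

    xs = np.arange(1, len(seq) + 2)               # positions 1..6
    ys = np.array(seq + [surprise])               # the values we insist on
    coeffs = np.polyfit(xs, ys, deg=len(xs) - 1)  # degree-5 polynomial through all 6 points

    print([round(float(np.polyval(coeffs, x))) for x in xs])  # [1, 2, 3, 4, 5, 42]

Without some extra assumption about simplicity, every continuation is equally "correct".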
So, the trivial solution (and the more ultra-complicated solution) is defective basically because it's not interesting under the rules of the game, which assume the answer is somehow interesting, but not impossible to guess.
Thanks for that, actually made me chuckle :)
(And it illustrates the point quite well, as that is indeed probably the simplest and most general rule you can find.)
I can see where you're coming from. I always have to judge my working solutions using the parsimony of features and parsimony of rules to navigate the feature/solution landscape of Bongard problems.
The underlying assumption is that the problem's author has followed the same rules which is an assumption of good faith on my part. This narrows down the feature-solution space considerably or at least biases it in a way where I can prioritize hypotheses in a more tractable way.
Part of what I feel makes me a good puzzle solver is imagining that I'm a puzzle maker. What was going through the author's mind when they conceived the puzzle? If I can start to pull at that thread then the complexity of the puzzle will start to unravel.
Hmmmm yes, this assumption of good faith might actually be an important part of approaching these problems. I think I instinctively look for approaches to problems that would work even for adversarial examples. In the case of these puzzles, that of course doesn't work, because you then imagine the rule to be something completely outlandish and give up before you've even checked the easy options.
Something like "I know it's _possible_ the author has chosen a ridiculous rule, so it doesn't make sense for me to look for it, because even if I find it, I just got lucky and didn't actually solve the problem."
That might be a side-effect of perfectionism, actually.
I was thinking along these lines too when the author went into machines solving the problems. One bit that jumped out was the quote from Hofstadter:
> They depend on a sense of simplicity which is not just limited to earthbound human beings.
Followed immediately by a problem whose solution depends on having a sense of 3-D objects in gravity!
Solving Bongard problems is surely a hard thing to get an AI to do, but I am wondering too about AI-authored instances. Or, say, problems authored by aliens, with a different evolutionary history, and different in-built biases for cognition. Would they necessarily be solvable by humans, or our Bongard problems solvable by them? Some aspects (number maybe?) are probably universal. But even a good-faith puzzle maker has to take some assumption of shared basis for perception.
This probably connects up to the author's final point that "what objects even are" is not absolute.
> Followed immediately by a problem whose solution depends on having a sense of 3-D objects in gravity!
I can think of a Bongard problem where the images on the left are rebuses which spell English words, and images on the right are rebuses which spell Russian words. That's baking a LOT of a priori knowledge into the puzzle. Alexandre Linhares' "A glimpse at the metaphysics of Bongard problems" [1] talks a bit about what assumptions can go into these problems:
> An interesting but generally ignored aspect of Bongard problems is that their difficulty for a given subject is directly associated his or hers (or the system’s) previous experience. Since the problems consist of geometric figures, one may be led to believe that cultural factors do not influence the performance of a person attempting to solve them. This is not the case.
> Solving Bongard problems is surely a hard thing to get an AI to do, but I am wondering too about AI-authored instances.
I like where you're going. I left out the search I did of "generative adversarial networks bongard" because no good results popped up. (Hint: this would make for a fantastic HN post if any researcher wants to earn fake internet points).
Finally, this comment [2] stuck with me over the years. As always, would solving or generating Bongard problems be a quantum leap in AI, or would they, like so many other problems, be subsumed into the category of AI-solvable problems while we all move on to the next problem at the frontier?
[1] - http://app.ebape.fgv.br/comum/arq/Linhares2.pdf [2] - https://news.ycombinator.com/item?id=8964017
This makes a lot of sense. People have wondered how our cognition would be affected by living in zero gravity.
And our language reflects these cognitive biases; how would we express the experience of "getting high" in zero gravity?
I feel similarly; the author mentions spending ten minutes trying to solve one of these puzzles, and I can’t imagine doing that and enjoying it. Maybe it’s the case that spending more time on the ones that stumped me would yield fruit, but I have the impression that on this class of problem, if I don’t see the solution within two minutes then I probably won’t be able to figure it out in ten, which disincentivizes investing time in them. I’d be interested in knowing if the people who /can/ solve all the problems in the article do so by investing time in them and being methodical or if they just “see it” eventually, which is how I feel solving the easier ones.
There’s also the factor that some Bongard problems, independent of their difficulty factor, are just more satisfying than others. Spoiler for the fourth one, with pairs of circles: its solution is that the entries on the left have $property while the ones on the right...don’t. This makes the right side virtually useless except to check the rule that you derived from the left side.
Maybe it’s just that I don’t have research experience, and am thus unsteeled against problems that seem impenetrable, or maybe I just don’t have the mindset to be good at these, but I agree the really difficult ones can be frustrating.
The cumbersome part is that the first step requires you to pick up patterns and differences without knowing which are actually relevant. Which is fine for the first few puzzles, but gets much harder with increasing amounts of details.
It's basically the same kind of problem that one faces in the more frustrating debugging sessions, where you're looking at tons of data and trying to find a pattern or clue about what causes the bug under what circumstances.
At least with Bongard problems you don’t have heisenbugs.
You're certainly not the only one, but the point of the article is that solving these kinds of problems, with all the vagueness that implies, is an important feature of human intelligence.
Since I generally seem to do fairly well in problem solving, that is exactly what I'm trying to figure out: Are these problems actually representative of an important skill that you need for general problem solving, or are they, through their nature of being man-made puzzles, actually in a realm of their own?
When I'm looking at some pattern that I'm trying to find a rule for in real-life, I don't think I'm running into the same frustration and in fact greatly enjoy trying to figure out rules for how things work (or so I believe, at least).
I think a crucial difference is that I know that the problems I encounter in real-life are only "as complex as necessary", and the data I'm looking at is a direct result of some process that serves a specific goal; presumably one I think "makes sense", as I wouldn't look for a rule otherwise. In contrast, puzzles are made to be complicated on purpose, and I suspect that annoys me subconsciously to the point where my brain complains about engaging with it. But it's only these kinds of "figure out the rules" puzzles, so there has to be another important difference compared to logic puzzles. Possibly the difference is: for the logic puzzle, the "meta-rules" for the problem are made explicit and I know the solution-space exactly. For the Bongard problems here I found myself thinking for example: "wait, is it always just two groups distinguished by single rule, or can there be dependencies on the positions of the symbols within the groups as well? What kind of solution am I even looking for?", and that also apparently frustrates me.
Sorry for the wall of text, but I've actually been trying to figure out why these kinds of problems get on my nerves for quite a long time, lol.
Bongard puzzles are pretty much the same as the test matrices in IQ tests, which annoy me in exactly the same way these Bongard puzzles do. If you asked people questions in the same manner these puzzles do, they would refuse to answer because they'd feel trolled. Arguably, that's the case.
They're closely related to Raven's Progressive Matrices https://en.wikipedia.org/wiki/Raven%27s_Progressive_Matrices which are indeed the source for IQ tests. But the point of this all is that rationality works within a frame; Bongard games are just an illustration that the real problem of meaning and choice and acting in the world is the not-rational one of choosing a framing.
I don't know; the first took me two seconds, the second five, and the third five too. I stopped reading there because the article says the next ones are harder, and I would like to take some quiet time to read the whole thing.
I often solve chess problems on lichess, a similar concept. Maybe.
Is it really an infinite haystack? It's context. And simplest solutions first, gradually think of more complicated ones. First try to find a pattern visually, then use simple concepts, then more complex concepts.
From the Church-Turing Thesis, we know there’s nothing special going on! We know humans can’t do anything more than a computer can.
I see people making such claims about human cognition all the time, and I have no idea how it follows. (note the author is paraphrasing "people" here)
The Church-Turing Thesis says nothing about human cognition.
It is perfectly plausible that a human can do things a computer can't. (Scott Aaronson has a paper "Why Philosophers Should Care About Computational Complexity" which sheds some light on why that might be, but it's far from the only possible reason.)
The burden of proof is on people who claim that human cognition can be simulated by computer, not the other way around. To me, it seems far more likely that it can't.
Human cognition can obviously be simulated by "the laws of physics", since brains are material, but it seems very likely that computers are less powerful than that.
That's my refutation of the (silly IMO) "simulation argument". I'd argue it's simply not possible to simulate another universe. You can simulate something like SimCity or whatever, but not a real universe. The people who make that argument always seem to leave out the possibility that it's physically impossible.
In fact I would actually take the simulation argument ("we are almost certainly living in a simulation") as proof by contradiction that simulation is impossible.
> That's my refutation of the (silly IMO) "simulation argument"
I'm not one of these people, but your rebuttal wouldn't convince me if I were. Maybe we are the SimCity of a much more complex universe.
OK, but I'm saying the burden of proof is on those who think that simulation is possible. It's not at all obvious that it is.
As far as I can tell, the idea was basically made popular by movies [1], and there is no science behind it. All the science I know of points the other way -- simulating anything is incredibly hard and slow. It requires approximations and shortcuts to make it work for specific cases, and it doesn't work in the general case.
(Maybe some alternative model like quantum computation will be different, but we're much, much further from that. I think "adding two small numbers" is still an issue for state-of-the-art quantum computers.)
-----
Here's a nice example from a few days ago, trying to represent even a tiny part of the human brain in a computer:
https://news.ycombinator.com/item?id=27362883
[1] I think this is more literally true than you might expect; IIRC the published papers in philosophy liberally reference The Matrix, maybe because it attracts readers.
There's an important misconception here.
The part you're right about is that simulating what we understand about our own reality on some subset of that reality (e.g. some kind of computer) is really, really hard, possibly verging on impossible.
But that's not really the simulation hypothesis at all.
It's relatively easy to build a simulation of a simplified or (and this is important) a different reality in some subset of our own. Trivial examples like Conway's Game of Life come to mind, but also (somewhat obviously) SimCity.
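For what it's worth, "relatively easy" really is trivial here; a minimal Game of Life step (my own sketch, nothing from the article) fits in a dozen lines of Python:

    from collections import Counter

    # One generation of Conway's Game of Life on an unbounded grid,
    # representing the "universe" as the set of live-cell coordinates.
    def step(live):
        neighbour_counts = Counter(
            (x + dx, y + dy)
            for x, y in live
            for dx in (-1, 0, 1) for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)
        )
        # birth on exactly 3 neighbours, survival on 2 or 3
        return {cell for cell, n in neighbour_counts.items()
                if n == 3 or (n == 2 and cell in live)}

    glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
    for _ in range(4):
        glider = step(glider)   # after 4 steps the glider has shifted one cell diagonally
    print(sorted(glider))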
The simulation hypothesis is that our reality is a simulation running in some subset of a different reality. Given what we know about building simulations in our own reality, the hypothesis implicitly recognizes that the meta-reality in which our reality is a simulation is necessarily different from (and likely more complex than) our own.
There's an even deeper notion to the simulation hypothesis. If our reality is a "simulation", given the richness that we see around us, what is the difference between a "simulation" and something that isn't a simulation? Based on what I said above, the different "levels" would necessarily need to differ in terms of their own complexity. But is a reality in which our own could be simulated really any "better" than our own?
> given the richness that we see around us, what is the difference between a "simulation" and something that isn't a simulation?
One thought that stayed with me for a while is, my entire experience could, theoretically, be encoded in (a rather large) integer and in the same way that the number 2 exists in many contexts maybe I exist simultaneously in reality and multiple simulations as well.
This all assumes that the limitations of our universe are the same as the limitations on the hardware outside the simulation.
We can run Conway's game, but a being inside that simulation, contemplating its existence, would have no way to even begin to think about quantum computing.
The rules are just too different.
That being said, I almost look at the finite speed of light, and quantum effects as shortcuts to simulation.
The light speed limit allows greater parallelization by reducing the number of particles inside the local light cone. Likewise, quantum effects seem like a compiler optimization where a calculation isn't performed until the result is needed.
It's fun to think about, but I doubt we will ever know for sure either way.
It seems to be purely an issue of computing power, not feasibility. That’s why it’s conceivable that future advances would make it feasible.
Like if in 1950 you looked at a computer and thought “there’s no way to make realistic 3D graphics”.
OK, somewhat stale already... But WTH, there are still "Reply" links, so AFAICS replying is acceptable. And the "reverse-perspective" rebuttal I'm about to propose hasn't been proposed so far.
> OK, but I'm saying the burden of proof is on who think that simulation is possible. It's not at all obvious that it is.
You seem to be arguing about whether it's possible to simulate this reality, the one we're in, within itself. That's like saying Sim City is impossible, because you can't build the computer Sim City is running on in Sim City; you can't have your Sims build a PC and then run that same copy of the Sim City game on that simulated PC.
If we and the universe we live in are all part of a simulation, that simulation can be running on something that cannot be built in our universe -- within that simulation itself -- without, AFAICS, that constituting any logical contradiction.
Or, IOW: Ants can build anthills, but we can build ant farms. If an ant claims "I can't be living in an ant farm; there can be no such thing, because we ants don't know how to make one!"... Then it is wrong.
In the simulation hypothesis, it is not necessary for humans to know how to build the simulation environment, because in that hypothesis we aren't the simulation builders. We're just the Sims (or ants).
If we go around demanding proof of every hypothesis before considering its implications, we will find that we have very little to think about. Not even mathematics and philosophy are conducted at this level of rigor.
The burden of proof is a double-edged sword: it can be used to rein in tendentiousness, but also to avoid discussing an issue.
> The burden of proof is on people who claim that human cognition can be simulated by computer, not the other way around. To me, it seems far more likely that it can't.
C-T makes a claim about computable functions on natural numbers. It seems strange to argue that humans can perform such computations on a fundamental level better than a computer, thus we might assume the same is true of more complex computations. So while I suppose you could take the position that the burden of proof is to show every individual method of computation is equivalent, since there are infinitely many methods this seems a bit unfair.
I don't think anyone is claiming that humans can compute computable functions in some way better than a computer, the claim is that minds can do things that are not reducible to computing computable functions. See, for example, the Lucas-Penrose argument:
https://en.wikipedia.org/wiki/Penrose%E2%80%93Lucas_argument
Note that I am not suggesting that I concur with it.
I have recently come up with a model that has been useful to me for thinking about thinking.
It involves the realization that different brain regions must communicate, but also contain their own representation of reality.
Partial thought precursors echo back and forth between these regions, with each region amplifying or dampening parts of the idea that it recognizes as valid.
When multiple brain regions begin to agree on its validity to a high level, the aha moment occurs.
This model has some characteristics of waveform collapse, and discrete task specific neural networks. When multiple tasks specific networks arrive at consensus that a model matches experience, the proto-idea forms. This proto-idea can then be evaluated, and inspected. New scenarios are reflected off this new idea, to see if it continues to make sense.
Converting an idea into words makes it useful to others, and allows sharing of ideas. This process requires refinement by echoing back and forth with the proto-idea until the words match its shape.
In order for these words to be understood effectively, they need to make sense to the brains that are receiving them. That means the words chosen need to activate multiple brain regions that the listener may use to evaluate this new idea and have the aha moment themselves.
This process is easier when the 2 brains have many shared experiences to draw on, or communication is bi-directional to allow message refinement.
> It involves the realization that different brain regions must communicate, but also contain their own representation of reality.
Sure, but this is complicated by regions being fuzzy and overlapping (or perhaps interpenetrating).
Each part in a healthy brain should ideally take input, process it somehow, and reflect back an echo that amplifies and dampens the parts that match its representation of reality. The lines between regions aren't as important.
I'm almost thinking about how this could be turned into a device for conveying information with actual echos.
> I'm almost thinking about how this could be turned into a device for conveying information with actual echos.
Like the acoustic delay line memory used in early computers?
https://en.m.wikipedia.org/wiki/Delay_line_memory#Mercury_de...
If anyone is interested in this, they should look into Alfred Korzybski and general semantics. He coined the phrase "The map is not the territory", in case you want an idea of who he is.
IIRC, he at one point says in his book, "Science and Sanity: An Introduction to Non-Aristotelian Systems and General Semantics" something along the lines that the mistakes most people make are in categorization. Something along the lines of "Some things look the same but they are different, and some things look different but they are the same". It's a very interesting book and I loved how Non-Aristotelian logic was used in Null-A by A.E. van Vogt, which introduced it to me.
I've always considered "pattern recognition for the purpose of prediction" a core function of the human brain that precedes language, rationality, or any form of logic. So it feels counter-intuitive to me to label this meta-rationality or even associate it with rationality. I subscribe somewhat to the idea that much of our rational decision making is ex post facto, i.e., a verbal narration that comes after the decision making to explain the decision, the actual decision making being an opaque process that takes place inside our brain's neural network. Confession: I lost the author about halfway through the article.
This ex post facto explanation is referred to as rationalization and is ironically irrational. If you’ve already made a decision you’ve forfeited the option to do so rationally.
I do agree that most of what gets labeled rational is in fact ex post facto rationalization. I do it myself all the time, haha.
Ex post facto rationalization is a major annoyance of mine. Let's just be honest with ourselves and recognize that lots of what we want isn't rational. That's ok, use your rationality to keep yourself from making huge, dumb mistakes, but don't pretend you're doing it for "rational" reasons or through a "rational" process.
But I'm kind of over-strict about it and like to reserve "rational" for processes you can do with first-order logic. Given the complexity and lack of information in real life, that happens almost never. Of course this is not a terribly rational point of view.
Another good introduction: https://drossbucket.com/2017/09/30/metarationality-a-messy-i...
Despite admitting verbally that a map is not the territory, rationalists hope that if they take one map, and keep updating it long enough, this map will asymptotically approach the territory. In other words, that in every moment, using one map is the right strategy. Meta-rationalists don’t believe in the ability to update one map sufficiently (or perhaps just sufficiently quickly), and intentionally use different maps for different contexts. (Which of course does not prevent them from updating the individual maps.) As a side effect of this strategy, the meta-rationalist is always aware that the currently used map is just a map; one of many possible maps. The rationalist, having invested too much time and energy into updating one map, may find it emotionally too difficult to admit that the map does not fit the territory, when they encounter a new part of territory where the existing map fits poorly. Which means that on the emotional level, rationalists treat their one map as the territory.
Furthermore, meta-rationalists don’t really believe that if you take one map and keep updating it long enough, you will necessarily asymptotically approach the territory. First, the incoming information is already interpreted by the map in use; second, the instructions for updating are themselves contained in the map. So it is quite possible that different maps, even after updating on tons of data from the territory, would still converge towards different attractors. And even if, hypothetically, given infinite computing power, they would converge towards the same place, it is still possible that they will not come sufficiently close during one human life, or that a sufficiently advanced map would not even fit into a human brain. Therefore, using multiple maps may be the optimal approach for a human. (Even if you choose “the current scientific knowledge” as one of your starting maps.)
The above comment should be attributed to https://www.lesswrong.com/posts/hxxN75ZQ5GY4Tjwkv/inscrutabl...
That comment is picking at a fair critique, but some details seem to be wrong.
> rationalists hope that if they take one map, and keep updating it long enough, this map will asymptotically approach the territory
That is, as far as can be detected, what the human brain does. It isn't just the rationalists who have a view and keep updating it, hoping it will asymptotically approach the territory. It is exceedingly difficult to have a strategy that doesn't do that and still be a semi-functional member of society.
I'm struggling to see how someone could hold 'different' maps because they become one map in your head. Rationalists are perfectly comfortable with there being multiple possible scenarios leading to an outcome.
My guess is that this observation is getting at the fact that rationalists are very, very uncomfortable (to the point of falling apart, sometimes) with accepting "because I say so" as sufficient evidence to update a view, change behaviour, stop arguing, and be a good sport about the whole thing. Which is very much a social faux pas when dealing with high-status people, and often a mistake when dealing with inarticulate people who are nevertheless correct in their view.
> I'm struggling to see how someone could hold 'different' maps because they become one map in your head. Rationalists are perfectly comfortable with there being multiple possible scenarios leading to an outcome.
A street map and a subway map both describe the connectivity within a city, but even when a human internalizes them both, they don't get subsumed into a single map exactly; rather, the human mode-switches between them at various points to stitch together a route.
If you doubt this, try visualizing exactly which streets and landmarks are going by overhead as you travel between stations. It is rather hard to do, and as a result people often treat stations more like portals into a parallel wormhole network.
This of course uses literal maps as an exemplar, but similar ones are 'code switching' back and forth between dialects rather than blending them, and visual illusions that can be seen as one image or another, but not both at once.
> If you doubt this, try visualizing exactly which streets and landmarks are going by overhead as you travel between [subway -- CRC] stations. It is rather hard to do, and as a result people often treat stations more like portals into a parallel wormhole network.
I dunno, I find that if you're somewhat familiar with the more detailed map -- and the subway stations are marked on it -- it comes somewhat naturally. You probably fool yourself a bit if/when the subway lines are very curved, because (I guess) you'll tend to imagine pretty straight lines between stations, but... On the whole, that's pretty much why subways were built in the first place, so I wouldn't think you'll be all that much off.
There's existing terminology for this idea. Rationalists are positivists and meta-rationalists are constructivists. These are terms from the philosophy of science, and directly relate to how one treats the map-territory relationship.
They also relate directly to the philosophies of materialism and emergentism.
Those two terms don't seem to map like you suggest? Doing good science is basically the same as solving Bongard problems. Good positivist science relies on meta-rationality.
Yes, Bongard problems are synonymous with the process of creating mental models, which is the process of science. Logical positivism and constructivism are both scientific - constructivism is a superset of positivist tools and better accommodates systemic/emergent properties.
Positivism is analytic/materialist, which is to say that it purports that you can understand a thing by understanding the behaviour of its smallest pieces. This denies emergent properties, which cannot be understood from the parts. Constructivism allows for multiple distinct (ie, irreconcilable) models to be used in reference to the same subject, depending on the properties that are of interest. It does not require all models to be reconciled into a hierarchy.
Yes... metarationality is basically a sort of belief system about how and what we can understand. If we don't understand our limits and do not use rationalities as tools, we end up in scientism or very wrong belief systems built on "fakery" and self-delusion...
FYI, it is a synonym for the epistemology (philosophy of knowledge) called constructivism.
That line of thinking assumes you can’t share maps, as such it’s of minimal practical value.
To abuse the analogy, a glovebox full of old road atlases doesn't beat Google Maps. On the other hand, mixing Google Maps with your personal knowledge that a bridge is out is a useful meta-map.
I’ve heard about this idea in a form: «All models are false. Some are useful.»
Ahh, thanks, that's a very useful distinction to me
A site with lots of Bongard problems: https://www.foundalis.com/res/bps/bpidx.htm
I am reminded that the simplest regex for the words "apex, ibex, index" is 'apex|ibex|index'.
A commonality of all the boxes on the right is being on the right.
A commonality of all the boxes on the left is being on the left.
There is no offside in golf. The rules of the game only apply when we are playing the game.
Here, Wittgenstein might have said Bongard problems are another language game and the confusion arises from using words in a peculiar way...the game is pretending there is a problem in a Bongard problem.
It depends on how you define "simple": '(ap|ib|ind)ex' is fewer symbols and a smaller character graph.
There is no offside in golf...yet. But the rules are surprisingly long, and will only get longer as loopholes are found and exploited. Humans trying to codify a game, or any system, cannot do so precisely; others can and will find the gaps between the written rules and the intended meaning, and play a different game but pretend it's the same. Or I guess, every game is a language game?
Which I think is a point of meta rationality: we can't avoid including questions of the rules of the problem. It's clouds all the way down.
‘*’ is fewer characters.
If we are playing that language game.
> ‘*’ is fewer characters.
yes, but that produces false positives, whereas both 'apex|ibex|index' and '(ap|ib|ind)ex' do not.
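A quick sanity check of that claim (my own sketch; I'm reading the bare '*' as the regex '.*' and using full-string matching, so nothing here is from the thread itself):

    import re

    words = ["apex", "ibex", "index", "indices", "latex", "vortex"]

    for pattern in [r"apex|ibex|index", r"(ap|ib|ind)ex", r".*"]:
        accepted = [w for w in words if re.fullmatch(pattern, w)]
        print(pattern, "->", accepted)

    # Both word-list patterns accept exactly the three intended words;
    # '.*' accepts all six, including 'indices', 'latex' and 'vortex'.

So under full-string matching the two alternation patterns define the same language; the disagreement is only about which of them counts as "simpler".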
Your original claim was that 'apex|ibex|index' was "simplest", which means it has some property you think that '(ap|ib|ind)ex' lacks, but you haven't yet articulated what that IS.
It is the simplest to reason about.
And the simplest to produce algorithmically.
To prove it, extend each regex to include “indices”.
Being less clever, it is simpler to debug.
Depends on whether your relevant criterion is "those 3 words at random" or you have some reason to draw attention to the fact that they all end in "ex".
If I saw that list, I'd certainly assume there was something special about the "ex" ending, since they don't otherwise have much in common and it's extremely unlikely for three "-ex" words to show up just by chance. The shorter regex does a good job of highlighting that yes, "-ex" is the relevant criterion we care about.
It is easy to produce very brittle code by assuming a premise that allows one to rationalize being clever. Or as the British sometimes say “too clever by half.”
I just picked three words I fancied at the time I was illustrating my point. Not random, but not because -ex was important. The language game I was playing was writing an HN comment for amusement.
The finite state machine of my regular expression has one fewer node than yours. If that ain't simpler, it will do until simpler shows up.
Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it? — Brian Kernighan
The language game we're playing is defining the rules for the language game we're playing. In your extension of the rules, adding a requirement of extensibility (to add 'indices') is a valid move; in your opponent's version it isn't. It could just as well be argued that it is adding the extensibility requirement that is “too clever by half.”
‘There is no spoon’
These problems remind me of https://github.com/fchollet/ARC
"ARC can be seen as a general artificial intelligence benchmark, as a program synthesis benchmark, or as a psychometric intelligence test. It is targeted at both humans and artificially intelligent systems that aim at emulating a human-like form of general fluid intelligence."
One tricky thing about Bongard problems is that for any given problem there are likely many different rules that could distinguish the six positive examples from the six negative examples.
For example, maybe a problem that is "really" about circles vs. triangles also happens to have more black pixels in the left images than in the right images.
A key skill in solving these problems is not just to find a compact and discriminating description, but to find such a description that is also one that a human Bongard problem designer would be likely to think was a cool and elegant puzzle that needs an "Aha" moment to recognize. If you find such a description, then you're very likely to be right.
I suspect that that last part (recognizing when you have found a solution that is pleasing enough to be the answer) is likely to be the biggest challenge for ML-based approaches to Bongard problems.
This was a good read and I really enjoyed it (I'm another person who was turned onto Bongard problems by Hofstadter), but two parts weren't particularly strong.
The first one was the dismissal of intuition in a way that seemed pretty straw man like to me: "Mostly, “intuition” just means “mental activity we don’t have a good explanation for,” or maybe “mental activity we don’t have conscious access to.” It is a useless concept, because we don’t have good explanation for much if any mental activity, nor conscious access to much of it. By these definitions, nearly everything is “intuition,” so it’s not a meaningful category."
I think the author could have spent longer trying to come up with a better definition of what someone would mean by intuition in relation to these problems, instead of just setting up a poor one and immediately tearing it down. Intuition here would be contrasted against the deliberate procedural thinking of "let's list out qualities of these shapes" and would be something like seeing the solution straight away; but it can also be combined with procedural thinking, with the intuition suggesting possibly useful avenues and the deliberate part working through them. The contrast is that you could easily write down one set of steps to be replicated by others (the deliberate part: "I counted the sides on all shapes") but less so the other (intuition: "I thought x", "x jumped out").
The second is that the example they use for mushiness really isn't mushy. There is a perfectly concrete solution that doesn't involve any mushiness, which is simply that the convex hull of one set is triangular while the others are circular. The only mushiness involved is that saying "triangles vs circles" feels like enough of an answer to us not to need to specify any more. We think that we can continue with just this answer and be able to correctly identify any future instances, so it seems mushy, but you can probably think of examples that would confound the mushy solution yet be fine under the more concrete convex-hull one.
I thought that one was the most interesting too. The convex hull appears circular but it is not; in one case you have to join the dots between the center points of the triangles, and none of the points on the triangles are on a circle, or maybe all of them are if we discard the hull abstraction, in which case there are three circles.
Imagine trying to write code that identifies that. However, it's one of the most obvious ones to me.
Very Interesting. This reminds me of The Abstraction and Reasoning Corpus [1] by François Chollet, accompanying his paper On the Measure of Intelligence [2]...
Edit: Found a recent article mentioning both and discussing a NeurIPS paper on using Bongard problems to test AI systems [3].
[1] https://github.com/fchollet/ARC
[2] https://arxiv.org/abs/1911.01547
[3] https://spectrum.ieee.org/tech-talk/artificial-intelligence/...
David Chapman has multiple sites for approaching the same problem. However, all the sites (or 'books') are incomplete.
Every time I come across something by him I just want to read a complete book on the topic of metarationality from cover to cover.
Chapman's long-term hidden-agenda project is to teach us all by extended example that the notion of a "book" is intrinsically nebulous, and that the decision of whether and where one of his books stops and another begins (necessary for, say, counting how many books he has written) is itself dependent on the purpose you have for reading his books.
Yeah, as part of my free podcast audiobook version of Meaningness, I'm constantly asking him for updates, in order to provide a structure in which he'll work on one book at a time and completely finish it. So far it's working out pretty well.
I’d settle for more progress on In the Cells of the Eggplant.
Oddly, this page is on that site but not linked from the table of contents, nor does it show up as a recently changed page.
I know this is a little bit off topic, but I think meta-rationality is more about organizing other people's and machines' intelligence to achieve your goal, even when you are not highly intelligent yourself.
For the third I got:
triangle never in circle - circle never in triangle
compared to the given answer:
triangle bigger than circle - circle bigger than triangle
My solution is more general (worse), because it ignores size in non-containment arrangements, but also slightly more specific (better), because it constrains the single containment example in each set.
Neither of the rules says anything about overlapping cases, but there are no overlapping examples in the given sets. So there is an underlying constraint of no overlaps, but it applies to both sides, so it is not a distinguishing factor.
The idea is to arrive at a rule you can use for any given picture, to decide if it goes on the left or right:
> In a Bongard problem, you have to figure out what the rule is. You are given twelve specific images, and the result of applying the rule to each. (The rule assigns an image to either the left or right group.) Once you have discovered the rule, applying it to new images would be trivial.
Your rules don't do that. Most of the pictures have neither "triangle in circle" nor "circle in triangle", so your rules don't apply to... Ten out of the twelve pictures.
The issue with your solution is that it cannot decide, for a single combination of circle and triangle, whether it belongs on the left or the right side.
>The contents of the six boxes on the left all have something in common. The six on right also all have something in common, which is the opposite of the ones on the left.
I think you and the author might disagree on the meaning of "opposite" here. I think they mean logical negation and you are using a more colloquial interpretation.
There is no correct answer (see nebulosity); the point is to learn how the problem works. And what a marvelous thing your brain is, being able to come up with any solution at all.
I still don't understand how this is any different from the notion of problem solving. And I really didn't understand the human-AI equivalence: is it just the substrate-independent nature of Turing-completeness? If so, I think we still lack the epistemic toolkit to say anything conclusive about creating an equivalence, or for that matter any form of comparative relationship, especially given that AGI is still not a problem that is well-defined, let alone discussing the solution space. No?
> However, by “system” I mean, roughly, a set of rules that can be printed in a book weighing less than ten kilograms, and which a person can consciously follow.
Is this tongue in cheek or is it a very strange thing to say? That's definitely not a statement that should be made with no justification. I can guess what the author meant, but I don't really want to guess at basics of their argument.
It's important to exclude infinite books, which could just contain a lookup table for every possible situation rather than any general rules. And also finite but exponentially large books, which could contain a lookup table for every possible image (and be something like 2^(1024x768x24) pages long.)
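Just to make "exponentially large" concrete (my own arithmetic, assuming the 1024x768x24 figure above):

    import math

    # Number of distinct 1024x768 images with 24-bit colour is 2**(1024*768*24);
    # even writing that count in decimal takes millions of digits.
    bits = 1024 * 768 * 24
    print(math.ceil(bits * math.log10(2)))   # ~5.7 million decimal digits

So the lookup-table "book" is not just heavier than ten kilograms; the number of its entries can't even be written down in a ten-kilogram book.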
As people and societies we evolve towards metarationality. Reality is too complex to be handled within a single theory. We have to develop metarationality (wisdom) and pragmatically make jumps between contradictory theories. Any other approach is doomed, because the complexity will always be too much for any "single rationality" approach.
I don’t see how Bongard problems are human complete if the author can’t even solve half of them. Does that mean he doesn’t have human intelligence?
I think a better candidate for human complete is “knowing what other humans are thinking”. AKA “theory of mind”.
That would exclude many neurodivergent people. Perhaps also other cultures, and social classes.
It is easier to guess the state of minds of people who are similar to you. Because then your natural algorithm "what would I feel in such situation? what would make me say these words, act this way?" is more likely to match how they feel and think.
I suspect that many people overestimate their ability to "read other people's minds". First, they rarely verify their guesses. (I see another person and conclude that they are angry. I usually don't approach them and ask "hey, are you angry?". Therefore, if my guess was wrong, I am not going to learn it.) Second, if they turn out to be wrong, it's always the other person's fault. (If an autist cannot guess a neurotypical person's thoughts, it's the autist's fault. If a neurotypical person cannot guess an autist's thoughts, guess what, that's also the autist's fault.) Third, it is easier to guess thoughts of people we frequently talk to, because people usually think today the same thing they thought yesterday, and they already told us what they thought yesterday.
Being able to solve half of them by computer would be amazing progress, indicating a breakthrough in machine learning research.
But whether that means artificial general intelligence is solved is another question. Most people can’t play Go very well either.
It’s difficult to say whether the solution will generalize without having it, but easy to imagine that it might.
The author links to the Church-Turing thesis, which is widely assumed to justify claims about things that have nothing to do with what the thesis is actually about.
It is about computable functions and abstract machines.
We should keep one morning of school a week, no matter our age, to get together with other humans and discuss stuff like this. Individual reading and HN comments are fine, but we are missing the best part of learning, which is actually verbalizing and battle-testing those vague ideas. It would really help in cases like this... Have no friends? Make a club!