1. What are Counterfactuals?
1.1 Counterfactuals vs. Counter-to-Fact Conditionals
In philosophy and related fields, counterfactuals are taken to be sentences like (1):
- (1)
- If cats were able to talk, they would complain a lot.
The term counterfactual promotes a confusion worth dispelling. Despite the label, counterfactuals are not always counter-to-fact conditionals, i.e., conditionals with false antecedents. For example, (2) is a counterfactual and yet may have a true antecedent. Indeed, (2) can even be used to argue the antecedent is true (Anderson 1951).
- (2)
- If Jones had ingested arsenic, he would have shown exactly the symptoms he’s showing now.
For this reason, some prefer the term subjunctive conditional, reserving the term counterfactual for subjunctive conditionals with false antecedents (Brée 1982; von Fintel 1999b; Declerck and Reed 2001: 99). While slightly more enlightened, this terminology does not neatly align with the sprawling philosophical and interdisciplinary literature surveyed here. It also brings its own complications (section 7.3). And, despite examples like (2), some theorists argue subjunctives actually do (in some sense) assume their antecedent is false (Leahy 2011; Zakkou 2019).
For ease of exposition, this entry uses counterfactual and subjunctive conditional interchangeably. Most of the entry focuses on counterfactuals that are also counter-to-fact conditionals.
1.2 Indicatives vs. Subjunctives
Counterfactuals (i.e., subjunctive conditionals) contrast with indicative conditionals, which concern what is actually the case. To see the contrast, consider (3) and (4) (Adams 1970).[1]
- (3)
- If Oswald didn’t kill Kennedy, someone else did. (Indicative)
- (4)
- If Oswald hadn’t killed Kennedy, someone else would have. (Subjunctive)
We know Kennedy was assassinated in 1963. Thus, the indicative in (3) is true: someone killed Kennedy, so if it wasn’t Oswald, it was someone else. But, as far as we know, Oswald shot Kennedy and acted alone. Thus, as far as we know, the counterfactual in (4) is false: if Oswald hadn’t killed Kennedy, no one else would have.
Indicative and subjunctive conditionals differ in their linguistic form. Indicatives are written in the “indicative” mood common to declarative sentences, which typically feature verbs with simple tenses, as in If A was/is/will be true, B was/is/will be true. Subjunctives are written in the “subjunctive” mood, often featuring the past perfect, or “pluperfect” tense, as in If A were/had been true, B would be/have been true.[2]
Indicatives also sound “infelicitous” (linguistically inappropriate or contextually incoherent) when their antecedent is explicitly denied (Stalnaker 1975; Veltman 1986; though see Holguín 2020). By contrast, subjunctives sound fine even when their antecedent has been denied. This is illustrated in (5), where infelicity is marked with the hash symbol ‘#’.
- (5)
- Bob never danced.
- a.
- #If Bob danced, Leland danced. (Indicative)
- b.
- If Bob had danced, Leland would have danced. (Subjunctive)
- c.
- If Bob were to dance, Leland would dance. (Subjunctive)
We return to the significance of this distinction in section 7.3. This entry sets aside indicatives until then. See the entries on indicative conditionals and the logic of conditionals (see also Edgington 1995; J. Bennett 2003; Gillies 2012; von Fintel 2012).
1.3 Would vs. Might
Most of the counterfactuals discussed in this entry feature would in the consequent. There are also counterfactuals with other modals, such as might:
- (6)
- If cats were able to talk, they might complain a lot.
Some theorists maintain that might-counterfactuals are the “duals” of would-counterfactuals. On this view, (6) is equivalent to (7):
- (7)
- It’s not true that if cats were able to talk, they would not complain a lot.
This equivalence is controversial, however (section A.2 of Debates over Counterfactual Principles; see also section 7.1). Unless otherwise specified, the term counterfactual refers to would-counterfactuals in this entry.
2. Counterfactuals in Philosophy
Counterfactuals are ubiquitous in philosophy. They have been used to analyze many philosophical concepts, including causation, explanation, knowledge, freedom, rationality, harm, and value, just to name a few. Even when these analyses fail, they are considered the starting point for debate and offer crucial insights that shape subsequent discussion.
2.1 Metaphysics and Science
One reason counterfactuals are so useful is they capture a sense of dependency and tendency. It is often tempting to invoke counterfactuals when describing more specific relations of dependency and tendency, such as causation, explanation, and dispositions. This suggests counterfactuals could be fruitful for elucidating these concepts.
On the dependency side, consider causation. Why did Suzy’s throwing a rock at the window “cause” it to shatter? It’s not just because Suzy threw the rock and then it shattered: perhaps Billy tied his shoes at the same time, but he didn’t cause the window to shatter. Arguably, it’s because if Suzy hadn’t thrown the rock, the window wouldn’t have shattered. By contrast, even if Billy hadn’t tied his shoes, Suzy would still have thrown the rock, and so the window still would have shattered.
This illustrates the close connection between causal claims and counterfactuals: the former can often be paraphrased in terms of the latter. Why is that? A simple explanation is that causal claims are counterfactual claims: an (actual) event c causes an (actual) event e just in case if c had not occurred, e would not have occurred. This analysis, inspired by David Hume (1748) and developed by David Lewis (1973a,c) (cf. Mackie 1974), is the main starting point for much of the literature on causation.
Despite its influence, it is now generally agreed that this simple counterfactual analysis is problematic and needs refinement (Collins, Hall, and Paul 2004; Paul and Hall 2013). Still, even critics agree they must account for the intimate connection between counterfactuals and causation (Chisholm 1955). More recently, a new wave of counterfactual analyses has emerged using “causal models” (section 6.3; Hitchcock 2001, 2007; Woodward 2002, 2003: ch.5). See the entries on counterfactual theories of causation, causal models, and causation and manipulability.
Counterfactuals likewise figure in many theories of non-causal explanation. Wilson (2018) analyzes metaphysical explanation, or “grounding”, in terms of counterfactuals (cf. Schaffer 2016; Lange 2017; for criticism, see Koslicki 2016; K. Bennett 2017: §3.3). Baron et al. (2017, 2020) give a parallel analysis of mathematical explanation (cf. Lange 2013; for criticism, see Kasirzadeh 2023; Povich 2023). Lange (1999, 2000, 2009) analyzes laws in terms of a counterfactual notion of stability (cf. Woodward 2004). Conversely, some analyze counterfactuals in terms of laws (Maudlin 2007) or explanations (Kment 2006, 2014). As with causation, these analyses are enlightening albeit controversial starting points for theorizing. See the entries on 20th century theories of scientific explanation, causal approaches to scientific explanation, laws of nature, metaphysical grounding, and metaphysical explanation.
On the tendency side, consider dispositions. What does it mean to say a wine glass is disposed to break when dropped? Arguably, it means the glass would break if it were dropped. Indeed, many dispositional claims can be rephrased as counterfactuals in this way: an object x is disposed to M under condition C just in case if C were to obtain, x would M. Like with causation, this counterfactual analysis has historically been the starting point for work on dispositions (Ryle 1949; Quine 1960; N. Goodman 1955; E. Prior 1985; D. Lewis 1997; Fara 2005; Manley and Wasserman 2008, 2011). And again, even those who reject the analysis will acknowledge, and must account for, this tight connection. See the entry on dispositions.
For one more example, consider ability. You are able to take a break from reading this entry and grab a cup of tea. What does that mean? It doesn’t just mean it is possible for you to take a tea break: theoretically, it’s possible for you to teleport to Jupiter, but, alas, you are unable to do so. A natural thought, tracing back to Hume (1748) and Moore (1912), is this: you are able to take a tea break in that if you had tried to, you would have succeeded. By contrast, even if you were to try, you would not teleport to Jupiter. Indeed, many ability claims seem paraphrasable in this way, suggesting a conditional analysis: S is able to V just in case if S tried to V, they would succeed. Like with other analyses discussed above, this analysis has been both influential and controversial, with many proposed counterexamples and refinements (Chisholm 1964; Davidson 1973; Lehrer 1976; Ginet 1980; Cross 1986; Thomason 2005; Mandelkern, Shultheis, and Boylan 2017; Schwarz 2020; Willer 2021; Kittle 2023). But also like other concepts discussed above, an adequate analysis of ability must still account for the tight connection between ability ascriptions and counterfactuals. See the entry on abilities.
2.2 Epistemology and Mind
Counterfactuals are frequently invoked in discussions of knowledge, justification, and informational content. They also raise interesting epistemological questions in their own right.
Start with knowledge. It was once thought that knowledge is justified true belief: S knows that P just in case P is true, S believes P, and S’s belief is justified. Famously, however, Gettier (1963) presented devastating counterexamples to this analysis. Something more (or other) than justification is needed to turn a true belief into knowledge. But what? Intuitively, the belief can’t be “accidentally” true: the formation of the belief has to, in some sense, depend on its truth. But “depend” in what sense? One natural answer, of course, is counterfactual dependence. Indeed, this notion of dependence is often construed in one of two ways, both of which invoke counterfactuals:
- Sensitivity: If P were false, S would not believe that P (Nozick 1981).
- Safety: If S were to believe that P, P would not be false (Sosa 1999).[3]
While there’s much debate over the adequacy of these criteria, they play a crucial role in contemporary discussions in epistemology. They are used not just in the analysis of knowledge, but also to articulate related epistemological notions, such as “defeaters” or “reliable” belief formation processes. See the entries on the analysis of knowledge and reliabilist epistemology.
Certain ways of obtaining knowledge, such as thought experiments or imagination, often involve counterfactual reasoning. While thought experiments and imagination differ from counterfactual thinking in important respects (e.g., the former often have an “experiential” component), these notions are clearly related. In this vein, Williamson (2007) argues for a unified epistemology of thought experiments and counterfactuals. See the entries on thought experiments and imagination.
More generally, the epistemology of modality is fruitfully informed by counterfactual theorizing. Some philosophers are skeptical we can ever know what is metaphysically possible or necessary (van Inwagen 1998). But using standard counterfactual logic, Williamson (2007) shows how such modal knowledge can be derived from a more general capacity to know counterfactual conditionals, suggesting knowledge of metaphysical modality is within reach. This approach to modal epistemology has sparked much discussion and continues to be a central focus of the field (Jenkins 2008; Malmgren 2011; Roca-Royes 2011; Casullo 2012; Tahko 2012; Yli-Vakkuri 2013; Deng 2016; Vetter 2016b; Gregory 2017; Berto et al. 2018; Mallozzi 2021; Vaidya and Wallner 2021; Thomasson 2021). One point of contention is that Williamson’s principles require vacuism about counterpossibles, which we return to later (section 7.2). See the entry on the epistemology of modality.
Finally, counterfactuals arise in debates over the analysis of informational content. For example, tree rings “carry information” about the age of a tree (Dretske 2011). Likewise, certain brain states carry informational content: one might be in a brain state corresponding to thinking about cats. But what does it mean for such states to “carry information”? Why, for example, does that brain state represent cats and not something else? One prominent answer uses counterfactuals (Loewer 1983: 76; Cohen and Meskin 2006; cf. Fodor 1987, 1990). For example, Cohen and Meskin argue a state carries information that a is F just in case it is nonvacuously true that that state would not obtain if a were not F. See the entries on causal theories of mental content and semantic conceptions of information.
2.3 Agency and Rationality
Counterfactual reasoning is crucial for learning, planning, and deliberation and thus stars in many influential analyses of agency and rationality.
According to the standard view, a choice is a kind of intentional action, i.e., an action caused in part by one’s intentions. Empirical evidence suggests counterfactual thinking is central to forming rational intentions. Ruth Byrne (2005, 2016: 138) presents experiments showing that people who engage in counterfactual thinking after particular events in order to formulate plans improve the outcomes of their actions in related scenarios. However, those who dwell on how things could have been worse, or do not counterfactually reflect at all, show less persistence and no improvement in performance. This process can become further disordered when counterfactual thinking goes astray, e.g., in depression, anxiety, and schizophrenia (Byrne 2016: 140–143). This suggests counterfactual thinking strongly influences our ability to rationally plan and act. See the entries on agency, action, and intention.
For an intentional action to be a choice, it must be free or voluntary. What does that mean, though? A standard answer is that free action requires an ability to do otherwise. We saw already how counterfactuals may fruitfully elucidate abilities (section 2.1). They can also help clarify the “do otherwise” part. You arguably don’t choose whether to take a tea break unless the following is true:
- (8)
- If you had wanted to (not) take a tea break, you could have done so.
Experiments in social psychology support this link, suggesting belief in free will is linked to increased counterfactual thinking (Alquist et al. 2015) and that counterfactually reflecting on past events and choices is one significant way humans imbue life experiences with meaning and create a sense of self (Galinsky et al. 2005; Heintzelman et al. 2013; Kray et al. 2010; Seto et al. 2015). This might even be used as part of a pragmatic argument for believing in free will: because counterfactual thinking is so practically important, the belief in free will accompanying it is justified.[4]
The idea that choice entails a certain counterfactual claim immediately leads to a classic problem, however: if the future is determined by the past and the physical laws, then every action of every agent, including their “choices”, is predetermined (van Inwagen 1975). In that case, it looks like counterfactuals like (8) are false. Incompatibilist theories of free will accept this conclusion, arguing that we must either reject determinism or reject free will. Compatibilists reject this argument, seeking instead to understand agency in a way that is compatible with living in a deterministic universe, either by denying that choice requires counterfactuals like (8) or by explaining how (8) is compatible with determinism. See the entries on free will, arguments for incompatibilism, incompatibilist theories of free will, and compatibilism.
Finally, counterfactuals are helpful in understanding what constitutes rational choice. The standard view from decision theory, game theory, and economics is that an action is rational iff it maximizes one’s expected utility. But how exactly one defines expected utility is a controversial matter. Examples like Newcomb’s Problem (Nozick 1969) suggest there are competing definitions one could adopt. One prominent proposal, known as causal decision theory, uses counterfactuals: one should choose the action that would most likely bring about good outcomes were one to do it (Stalnaker 1972 [1981]; Gibbard and Harper 1978; Joyce 1999). Indeed, some authors suggest “causal” decision theory would be better called counterfactual decision theory (Collins 1996; Hedden 2023; Gallow 2024) and have used features of counterfactuals to inform the theory (Hedden 2023; McNamara 2024; for dissent, see DeRose 2010; Korzukhin 2014). Related work on belief revision explores how rational agents should revise their beliefs when they are inconsistent with something previously learned—much like a counterfactual antecedent demands—and uses structures that formally parallel those used in the semantics of counterfactuals (Harper 1975; Gärdenfors 1978, 1982; D. Lewis 1979b, 1981a; Levi 1988). See the entries on decision theory, causal decision theory, and formal representations of belief.
3. Semantic Puzzles
Despite their ubiquity, it is surprisingly difficult to clarify what counterfactuals mean. This is arguably the central question motivating the philosophical and linguistic literature on counterfactuals: what does it mean to say that if something were to occur, something else would occur?
Answering this question is challenging partly because counterfactuals exhibit a number of puzzling features. This section highlights some of the features of counterfactuals that every semantic analysis must account for.
Throughout, we use the following notation for counterfactuals due to Robert Stalnaker (1968):
Notation
\(A \gt C\) symbolizes the counterfactual if \(A\) were the case,
\(C\) would be the case
(more precisely: if it were/had been the case that \(A\), it
would be/have been the case that \(C\) ).
In the philosophical literature, one also sees a different notation due to David Lewis (1973b,c): \(A \mathbin{\Box\!\!\rightarrow} C\). (Lewis also uses \(A \mathbin{\Diamond\!\!\!\rightarrow} C\) for might-counterfactuals.) Both notations are roughly equally prevalent in the literature.[5]
3.1 Goodman’s Problem
Nelson Goodman (1947) articulated two puzzling features of counterfactuals that sparked much of the subsequent literature. The first is that counterfactuals seem to depend on certain background facts. But it is difficult to specify these facts in a non-circular manner. This is known as Goodman’s problem.
To see the problem, suppose I have a match that I never strike. In that case, (9) seems true:
- (9)
- If I had struck this match, it would have lit.
But the antecedent of (9) assumes certain background facts like the presence of oxygen—a struck match won’t light without oxygen. Moreover, the match must be dry, there must be relatively little wind, the friction between the striking surface and the match must be sufficient to produce heat, which must be sufficient to activate chemical energy stored in the match head, certain physical laws like conservation of energy must hold,…and on and on. There seems to be no end to the list of background conditions that the antecedents of even simple counterfactuals like (9) implicitly assume.[6]
Which background facts does a counterfactual antecedent assume, though? We cannot simply say it assumes all actual facts besides the fact that the match was not struck. After all, one such fact is that the match was not lit! Even excluding this, there is also the fact that the room is dark, that the match isn’t burnt, and so on, which seem to rule out the match being lit. Goodman suggests we only assume facts that are “cotenable” with the match being struck. But cotenable in what sense? The match not lighting is logically compatible with the antecedent, as is the match not being burnt, the room being dark, and so on. One is tempted to say those logically compatible facts would not obtain if the match were struck. But, Goodman notes, this definition of cotenability is circular, as it relies on the truth of other counterfactuals.[7]
Most theorists believe the key to solving Goodman’s problem is to better understand the role context plays in the truth of counterfactuals. Consider an example due to Quine (1960: §46):
- (10)
-
- a.
- If Caesar had been in charge during the Korean War, he would have used the atom bomb.
- b.
- If Caesar had been in charge during the Korean War, he would have used catapults.
One can easily imagine asserting either (10a) or (10b) in different contexts. When discussing Caesar’s brutality, (10a) sounds natural to assert. But when discussing Caesar’s choice of weaponry, (10b) might sound better. This suggests counterfactuals are context-sensitive: the possibilities we consider when evaluating the antecedent are constrained by the context in which the counterfactual is asserted (D. Lewis 1973b: 67).[8] Most accounts incorporate some version of this idea (Ichikawa 2011; Ippolito 2016; K. Lewis 2016, 2017). But, as Goodman’s example illustrates, it is difficult to spell out how, exactly, context constrains counterfactual evaluation.
3.2 Sobel Sequences
The second puzzling feature that Goodman (1947) observed is arguably the most iconic: adding information to the antecedent of a counterfactual can change its truth value. For example, (11a) could be true while (11b) is false.
- (11)
-
- a.
- If I had struck this match, it would have lit. \(S \gt L\)
- b.
- If I had struck this match and done so in a room without oxygen, it would have lit. \((S \wedge \neg O) \gt L\)
David Lewis (1973b: 10; 1973c: 419) dramatized the point with sequences like (12), where adding information to the antecedent repeatedly flips the counterfactual’s truth value. These are known as Sobel sequences (named after Howard Sobel).
- (12)
-
- a.
- If I were an Olympic athlete, I would have won the race. \(O \gt W\)
- b.
- If I were an Olympic athlete but had a broken leg, I wouldn’t have won the race. \((O \wedge B) \gt \neg W\)
- c.
- If I were an Olympic athlete and had a broken leg but were racing a bunch of snails, I would have won the race. \((O \wedge B \wedge S) \gt W\)
- \(\vdots\)
This feature of counterfactual antecedents—that strengthening them doesn’t preserve truth—is known as non-monotonicity (or, more precisely, downward non-monotonicity).[9] Not all analyses agree that counterfactuals are non-monotonic in this sense (section 4.2). But the apparent non-monotonicity of counterfactuals, as illustrated by Sobel sequences like (12), is a striking feature that all accounts must explain. For more on non-monotonicity, see the entries on non-monotonic logic and defeasible reasoning.
3.3 Counterfactual Fallacies
Non-monotonicity leads to failures of several counterfactual principles. Specifically, monotonicity corresponds to four principles of counterfactual reasoning. In fact, given modest auxiliary assumptions, these principles are equivalent to one another.[10] (Notation: “\(A_1,\dots,A_n \vDash B\)” stands for “\(A_1,\dots,A_n\) entail \(B\)”.)
- Antecedent Strengthening: \(A \gt C \vDash (A \wedge B) \gt C\)
- Transitivity: \(A \gt B, B \gt C \vDash A \gt C\)
- Contraposition: \(A \gt C \vDash \neg C \gt \neg A\)
- Simplification of Disjunctive Antecedents (SDA): \((A \vee B) \gt C \vDash (A \gt C) \wedge (B \gt C)\)
These principles are all prone to counterexamples. David Lewis (1973b: 31) refers to the first three as counterfactual fallacies. Whether the fourth, SDA, is a “fallacy” is much more controversial (section A.4 of the supplement Debates over Counterfactual Principles).
We have already seen counterexamples to Antecedent Strengthening in (11) and (12). Against Transitivity, Stalnaker (1968: 48) presents (13). Imagine Hoover has a strong disposition towards patriotism for his home country, wherever that is. But suppose also that being a communist in the US amounted to being a traitor. In that case, (13a) is true, given the US’s stance towards communism. And (13b) is true, given the era in which Hoover lived. But (13c) is false, given Hoover’s patriotism.
- (13)
-
- a.
- If J. Edgar Hoover were a communist, then he would be a traitor (to his home country). \(C \gt T\)
- b.
- If J. Edgar Hoover had been born a Russian, then he would be a communist. \(R \gt C\)
- c.
- #If J. Edgar Hoover had been born a Russian, he would be a traitor (to his home country). \(R \gt T\)
Contra Contraposition, Lewis (1973b: 35) presents (14). Imagine Olga went to the party but Boris didn’t. Boris wanted to go, but stayed home to avoid Olga. However, Olga really wanted to see Boris. Both Boris and Olga have reliable access to each other’s whereabouts. In that case, (14a) is true (Olga wants to see Boris) whereas (14b) is false (Boris wants to go to the party while avoiding Olga).
- (14)
-
- a.
- If Boris had gone to the party, Olga would have gone. \(B \gt O\)
- b.
- #If Olga had not gone, Boris would not have gone. \(\neg O \gt \neg B\)
Finally, against SDA, McKay and van Inwagen (1977: 354) offer (15). Imagine Spain preferred to remain neutral, but preferred working with the Axis over working with the Allies. In that case, (15a) is true. Suppose further that under no circumstances does Spain want to “play both sides”: the risks of getting caught are just too great. In that case, (15b) is false.
- (15)
-
- a.
- If Spain had fought for the Allies or the Axis, it would have fought for the Axis. \((L \vee X) \gt X\)
- b.
- #If Spain had fought for the Allies, it would have fought for the Axis. \(L \gt X\)
Not everyone accepts these counterexamples (section 4.2). Still, all analyses must provide an explanation of why these monotonicity principles seem to fail in these cases.
4. Strict Conditional Analyses
Arguably, the simplest analysis of counterfactuals is the strict conditional analysis (or just the strict analysis). On this approach, counterfactuals are “strict conditionals”: they are true only when their antecedent strictly necessitates, i.e., necessarily implies, their consequent.
The strict analysis was first articulated by Charles Sanders Peirce (1896: 33).[11] C. I. Lewis (1912, 1914) defended this analysis and developed an axiomatic system for it. A precise model-theoretic semantics for the strict conditional was first developed by Rudolf Carnap (1956: Ch.5). Saul Kripke (1963) introduced a modal semantics featuring an accessibility relation, which is crucially exploited by more modern forms of this analysis. (For more historical context, see the entry on modern origins of modal logic.)
Despite its simplicity, the strict analysis has not been as popular amongst philosophers as its main rival, the variably strict analysis (section 5). But strict analyses have recently made a comeback thanks to the rise of dynamic semantics (section 4.3).
4.1 Strict Conditionals
According to the strict analysis, counterfactuals are strict conditionals, meaning their antecedent necessarily implies their consequent. C. I. Lewis introduced \(A \strictif C\) for the strict conditional from \(A\) to \(C\). Nowadays, this is canonically expressed using tools from modal logic.
Modal logic extends standard propositional logic with two operators: \(\Box\) (necessity) and \(\Diamond\) (possibility). Roughly, \(\Box A\) says “it is necessary that \(A\)”, while \(\Diamond A\) says “it is possible that \(A\)”. These are analyzed using the notion of an “accessible world”. Intuitively, an accessible world is just a world that is possible in the relevant sense, be it metaphysical, nomological, epistemic, or deontic. Formally, accessibility is represented as a binary relation \(R\) over a set of worlds \(W\). Thus, \(wRv\) says that \(v\) is possible from \(w\)’s perspective, i.e., \(w\) can “access” \(v\). So \(\Box A\) says that \(A\) is true at every accessible world, while \(\Diamond A\) says that \(A\) is true at some accessible world.[12]
Canonically, \(A \strictif C\) is defined as \(\Box(A \mathbin{\supset} C)\), where \(\supset\) is the material conditional. In other words:
Strict Conditional Semantics
\(A \strictif C\) is true iff \(C\) is true at every accessible world
where \(A\) is true.
Figure 1 illustrates this semantics. In this minimal model, \(A \strictif C\) is true at \(w\): every \(A\)-world with an incoming arrow from \(w\) (i.e., \(v\) and \(u\)) is also a \(C\)-world. By contrast, \(C \strictif A\) is false at \(w\), since \(s\) is not an \(A\)-world.
Figure 1: A model illustrating the strict conditional analysis. An arrow from \(x\) to \(y\) means \(y\) is accessible from \(x\), i.e., \(xRy\).
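This semantics is easy to make computational. The following minimal Python sketch evaluates strict conditionals over a finite Kripke model; the particular worlds, accessibility relation, and valuation are assumptions chosen to be consistent with the description of Figure 1.

```python
# A minimal sketch of the strict conditional semantics over a finite Kripke
# model. The worlds, accessibility relation, and valuation are illustrative
# assumptions consistent with the description of Figure 1.

R = {"w": {"v", "u", "s"}}        # accessibility: from w one can "see" v, u, s
valuation = {
    "A": {"v", "u"},              # worlds where A is true
    "C": {"v", "u", "s"},         # worlds where C is true
}

def strict(antecedent, consequent, w):
    """The strict conditional is true at w iff every accessible
    antecedent-world is a consequent-world."""
    return all(x in valuation[consequent]
               for x in R.get(w, set())
               if x in valuation[antecedent])

print(strict("A", "C", "w"))      # True: v and u are both C-worlds
print(strict("C", "A", "w"))      # False: s is a C-world but not an A-world
```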
The logic of strict conditionals we obtain from this semantics depends on the constraints we place on accessibility. See the entry on the logic of conditionals.
4.2 Are the Counterfactual Fallacies Real Fallacies?
While the strict analysis seems relatively simple and straightforward, it faces an immediate problem: it validates all of the counterfactual fallacies (section 3.3).
- Antecedent Strengthening: \(A \strictif C \vDash (A \wedge B) \strictif C\)
- Transitivity: \(A \strictif B, B \strictif C \vDash A \strictif C\)
- Contraposition: \(A \strictif C \vDash \neg C \strictif \neg A\)
- SDA: \((A \vee B) \strictif C \vDash (A \strictif C) \wedge (B \strictif C)\)
The reason Antecedent Strengthening holds for \(\strictif\) is simple: since \(A \wedge B\) entails \(A\), every \((A \wedge B)\)-world is an \(A\)-world. Thus, if every accessible \(A\)-world is a \(C\)-world (i.e., \(A \strictif C\)), then so is every accessible \((A \wedge B)\)-world (i.e., \((A \wedge B) \strictif C\)). The other counterfactual fallacies are validated for similar reasons.
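This reasoning can also be checked by brute force. The sketch below is a sanity check rather than a proof, and all of its model-building details are choices of the illustration: it enumerates every two-world Kripke model and every valuation of \(A\), \(B\), and \(C\), and finds no countermodel to Antecedent Strengthening for the strict conditional.

```python
# A brute-force sanity check (not a proof): over every Kripke model with two
# worlds and every valuation of A, B, and C, Antecedent Strengthening never
# fails for the strict conditional.
from itertools import combinations, product

WORLDS = [0, 1]
PAIRS = [(x, y) for x in WORLDS for y in WORLDS]

def subsets(xs):
    return [set(c) for n in range(len(xs) + 1) for c in combinations(xs, n)]

def strict(R, ant, con, w):
    """con holds at every world accessible from w where ant holds."""
    return all(y in con for (x, y) in R if x == w and y in ant)

failures = 0
for R in subsets(PAIRS):                                   # every accessibility relation
    for A, B, C in product(subsets(WORLDS), repeat=3):     # every valuation
        for w in WORLDS:
            if strict(R, A, C, w) and not strict(R, A & B, C, w):
                failures += 1

print(failures)   # 0: no two-world countermodel to Antecedent Strengthening
```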
David Lewis (1973b) and Robert Stalnaker (1968) argued the counterexamples to these principles (section 3.3) refuted strict analyses. They concluded one must build a semantics around non-monotonicity (section 5).
Despite this, evidence suggests counterfactuals actually are monotonic, in line with the strict analysis. For one, in many cases, the counterfactual “fallacies” do not seem fallacious. The inferences below all sound reasonable.
- (16)
- Antecedent Strengthening
- a.
- If the switch were flipped up, the light would be on.
- b.
- So, if the switch were flipped up and painted red, the light would be on.
- (17)
- Transitivity
- a.
- If the switch were flipped up, the light would be on.
- b.
- If the light were on, I would be able to see.
- c.
- So, if the switch were flipped up, I would be able to see.
- (18)
- Contraposition
- a.
- If the switch were flipped up, the light would be on.
- b.
- So, if the light weren’t on, the switch wouldn’t be flipped up.
- (19)
- SDA
- a.
- If either the left or the right switch were flipped up, the light would be on.
- b.
- So, if the left switch were flipped up, the light would be on, and likewise if the right switch were flipped up.
Even if some instances of these inferences are “reasonable”, however, that does not show they are universally valid (see, e.g., Schultheis 2025). After all, we saw earlier that Sobel sequences like (20) (repeated from (12)) sound consistent, suggesting Antecedent Strengthening fails.
- (20)
-
- a.
- If I were an Olympic athlete, I would have won the race. \(O \gt W\)
- b.
- If I were an Olympic athlete but had a broken leg, I wouldn’t have won the race. \((O \wedge B) \gt \neg W\)
But even this is not so clear. Irene Heim observes that reversing the order of these counterfactuals, as in (21), sounds markedly worse (von Fintel 2001; Gillies 2007), suggesting the counterfactuals are inconsistent after all. These are known as Heim sequences, or reverse Sobel sequences.
- (21)
-
- a.
- If I were an Olympic athlete but had a broken leg, I wouldn’t have won the race. \((O \wedge B) \gt \neg W\)
- b.
- #If I were an Olympic athlete, I would have won the race. \(O \gt W\)
Further linguistic evidence comes from negative polarity items (or NPIs), such as any and ever. Linguists hypothesize that NPIs are only licensed in downward-monotonic environments (Ladusaw 1979; Katz 1991; Kadmon and Landman 1993; Partee 1993; von Fintel 1999a).[13] For example, doubt is downward-monotonic: doubting Ann is at the movies entails doubting Ann is at the movies eating popcorn. By contrast, believes is not downward-monotonic: believing Ann is at the movies doesn’t entail believing Ann is at the movies eating popcorn. This difference seems to directly correlate with the contrast between (22), which sounds fine, and (23), which sounds ungrammatical. Notably, however, NPIs are licensed in the antecedents of conditionals, as in (24). This suggests counterfactual antecedents are downward-monotonic, i.e., Antecedent Strengthening is valid after all.
- (22)
- I doubt Ann ever saw any movies.
- (23)
- #I believe Ann [ever] saw any movies.
- (24)
- If Ann had ever seen any movies, she would have eaten popcorn.
This last point is not decisive: expressions like exactly, as well as questions, superlatives, and the restrictors of most and only, all license NPIs despite not being downward-monotonic. Whether conditional antecedents are also exceptions is debated (Giannakidou 2002, 2011; Rothschild 2006; C. Barker 2018).
Even given all of this, strict theorists must provide an alternative explanation for the apparent counterexamples to these principles. They typically do so by appealing to context-sensitivity.
4.3 Second Wave Strict Analyses
In this vein, there was a subsequent wave of strict analyses aimed at addressing the appearance of non-monotonicity in counterfactuals, beginning with Daniels and Freeman (1980) and Warmbrōd (1981a,b), and followed by Lowe (1983, 1990) and Lycan (2001). Later, von Fintel (2001) and Gillies (2007) spearheaded a related approach using dynamic semantics.
These “second wave” strict analyses all invoke some version of the following ideas (though even rival theories accept the first two):
- Accessibility: The truth of a counterfactual depends on which worlds are accessible.
- Context-Sensitivity: Which worlds are accessible depends on the conversational context. Generally, accessible worlds are those that preserve certain background facts assumed by the conversational participants.
- Presupposition: Counterfactuals presuppose (in some sense) their antecedent is compatible with these background facts, i.e., the antecedent is true at some accessible world.
- Accommodation: When this presupposition is not met, conversational participants will try, to the extent possible, to accommodate this presupposition by expanding the set of worlds that are deemed accessible.
Let’s see how this applies to Sobel sequences like (20) and Heim sequences like (21). Start with (20). Normally, in accepting an assertion of (20a), we do not consider worlds where I am an Olympic athlete but have a broken leg to be accessible (we may not consider such worlds at all). Against this background, the presupposition of (20b) isn’t met: there is no accessible world where its antecedent is true. So when interpreting (20b), we must accommodate its presupposition by adding a world where I am an Olympic athlete and I do have a broken leg. In doing so, we will plausibly only add worlds where I lose the race (not worlds where I am racing snails, etc.). But now consider (21). If we accept (21a) at the start, there must be some accessible worlds where I am an Olympic athlete and yet lose the race (due to the broken leg). In that case, the presupposition of (21b) is still met—there are accessible worlds where I am an Olympic athlete—and so the accessibility relation does not change. But since (21a) has already been accepted, some of those accessible worlds are ones where I do lose the race. So (21b) can’t be accepted in this context.
On this account, Sobel sequences like (20) sound felicitous because of a context-shift: the counterfactuals are interpreted against different notions of accessibility, and so against different contexts. It’s like saying that “It’s raining” doesn’t entail “It’s raining” on the grounds that the weather can change between asserting the premise and the conclusion. Logical principles cannot be so easily dismissed, as they only concern which inferences are valid within a single context. Parallel explanations of the other counterfactual fallacies can be given (Warmbrōd 1981a,b).
One challenge for these second wave strict approaches is to clarify how accessibility changes in presupposition accommodation. In the story above, we said speakers interpreting (20b) after (20a) “plausibly” do not add worlds where I win the race (e.g., by racing snails). But why not? Which worlds should we add when accommodating presuppositions? If we simply add all antecedent-worlds, we will inadvertently add worlds where I do not lose the race, rendering (20b) unacceptable. Some second wave theorists suggest we only add the most “similar” antecedent-worlds (Warmbrōd 1981b; von Fintel 2001; Willer 2018). But similar in what sense? This same appeal to similarity in non-strict analyses has been criticized (section 5.4), suggesting strict analyses inherit much the same problems with similarity as their rivals.
Another challenge comes from probability judgments. Suppose a fair coin is never flipped. How likely is it that if it were flipped, it would land heads? Intuitively, the answer is 50%. But strict theorists say 0%: the antecedent definitely fails to necessitate the consequent. Accounting for these judgments is an open research program for strict theorists (see Willer 2025 for a recent attempt; section 7.1 discusses related issues).
The debate over whether counterfactuals are best given a strict analysis is very much ongoing. Critics propose ways of explaining reverse Sobel sequences within a non-monotonic analysis, like those discussed in section 5 (Moss 2012; Starr 2014a; Nichols 2017; K. Lewis 2018; Boylan and Schultheis 2021; Schultheis 2025). Advocates defend strict approaches as preferable on the basis of other data (Willer 2015, 2017, 2018; Williamson 2020; G. Greenburg 2021; Loewenstein 2021a; Moss forthcoming).
5. Variably Strict Analyses: The Lewis-Stalnaker Semantics
The main competitor to the strict analysis is, arguably, the most popular analysis of counterfactuals within philosophy. This analysis goes by many names, including “the similarity analysis”, “the ordering analysis”, “the selection function analysis”, and “the Lewis-Stalnaker semantics”. We will use the most general name, which encompasses all of these labels: the variably strict analysis. All variably strict analyses endorse some version of the following idea: a counterfactual is true when its consequent is true at all of the “closest” worlds where the antecedent is true. Whereas the strict analysis requires all antecedent-worlds to be consequent-worlds, variably strict analyses only require that all of the closest antecedent-worlds be consequent-worlds.
The variably strict analysis is commonly attributed to David Lewis and Robert Stalnaker (hence the alternative name). The actual history is a bit more nuanced: although publication dates do not tell the full story, the approach was developed roughly contemporaneously by Stalnaker (1968), Stalnaker and Thomason (1970), Lewis (1973b), Nute (1975b), and Sprigge (1970). William Todd (1964) gives an even earlier statement of the view:
When we allow for the possibility of the antecedent’s being true in the case of a counterfactual, we are hypothetically substituting a different world for the actual one. It has to be supposed that this hypothetical world is as much like the actual one as possible so that we will have grounds for saying that the consequent would be realized in such a world. (W. Todd 1964: 107)
The variably strict analysis has received both the most support and the most criticism in the literature, leading to many reformulations and refinements over the years. For ease of exposition, we present a popular version of the semantics, due to David Lewis, known as the ordering semantics. Another popular formulation due to Robert Stalnaker, using selection functions, is discussed in the logic of conditionals (see also Starr 2019: §2).
5.1 Closeness and Similarity
Variably strict analyses assume that, when it comes to counterfactuals, some worlds are more relevant than others. When considering what would happen if this glass had dropped, we do not consider worlds where bubble wrap suddenly materializes around it or where a wizard casts a spell on it to make it indestructible. We instead consider worlds like the actual world except for the fact that the glass is dropped. Such worlds only minimally differ from the actual world. Put figuratively: some worlds are closer to the actual world.
According to variably strict analyses, only the closest (or perhaps the sufficiently close) antecedent-worlds are relevant for counterfactuals. This is why the analysis is “variably” strict: which set of worlds we must consider varies with the antecedent. But within that set, the analysis is strict: for a counterfactual to be true, all of those closest (or sufficiently close) antecedent-worlds must be consequent worlds.
Talk of worlds being “close” or “far” is metaphorical. What does it actually mean? David Lewis (1973b) measured closeness by similarity: the more similar a world is to the actual world, the closer it is. (We’ll return to issues surrounding similarity in section 5.4.) We can imagine worlds are arranged in “spheres” around the actual world, as in Figure 2. The closer a world is to the center, the more similar it is to the actual world. Worlds on the same sphere are equally close to the actual world.
Figure 2: A similarity model. Worlds on the same ring are equally close to the center \(w\).
We can represent this idea more precisely using world-orderings. The idea is that each world \(w\) is associated with an ordering relation \(\leq_w\) over the set of its accessible worlds \(R(w)\), which we can think of as the “outermost” sphere. Here is a quick translation key:
- \(x \leq_w y\) says \(x\) is at least as close to \(w\) as \(y\) is
- \(x <_w y\) says \(x\) is strictly closer to \(w\) than \(y\) is
- \(x \equiv_w y\) says \(x\) and \(y\) are equally close to \(w\)
Most ordering theorists require \(\leq_w\) to be a “total preorder” over \(R(w)\), meaning it is reflexive (\(x \leq_w x\)), transitive (\(x \leq_w y \leq_w z\) implies \(x \leq_w z\)), and total (either \(x \leq_w y\) or \(y \leq_w x\)) over \(R(w)\) (totality is the most controversial; see section 6.1). They also require \(R\) to be reflexive, i.e., \(wRw\). Further constraints are often imposed to make \(\leq_w\) behave more like the intuitive notion of closeness or similarity. Below are some common ones.
- centering: every world is the unique closest world to itself: if \(x \in R(w)\) and \(x \neq w\), then \(w <_w x\)
- uniqueness (linearity): there are no ties, i.e., distinct worlds cannot be equally close: if \(x \equiv_w y\), then \(x = y\)
- limit assumption (well-foundedness): there are no “infinite descending chains” of ever closer and closer worlds: there are no \(x_1, x_2, x_3, \dots\) where \(x_1 >_w x_2 >_w x_3 >_w \cdots\) ad infinitum
These constraints each correspond to substantive principles of counterfactual reasoning (section 5.3).
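These conditions are straightforward to check on a finite model. In the following sketch, the three-world ordering is an invented example: it is a total preorder satisfying centering, while uniqueness fails because two worlds are tied (in a finite model the limit assumption holds automatically).

```python
# An invented three-world ordering, encoded as the relation "x is at least as
# close to w as y". It is a total preorder and satisfies centering, but
# uniqueness fails because u and v are tied. (In any finite model the limit
# assumption holds automatically.)
W = ["w", "u", "v"]
leq = {("w", "w"), ("u", "u"), ("v", "v"),   # reflexivity
       ("w", "u"), ("w", "v"),               # w is strictly closest
       ("u", "v"), ("v", "u")}               # u and v are equally close

reflexive  = all((x, x) in leq for x in W)
transitive = all((x, z) in leq for x in W for y in W for z in W
                 if (x, y) in leq and (y, z) in leq)
total      = all((x, y) in leq or (y, x) in leq for x in W for y in W)
centering  = all((("w", x) in leq) and ((x, "w") not in leq)
                 for x in W if x != "w")
uniqueness = all(x == y for x in W for y in W
                 if (x, y) in leq and (y, x) in leq)

print(reflexive, transitive, total)   # True True True: a total preorder
print(centering)                      # True: every other world is strictly further from w
print(uniqueness)                     # False: u and v are tied
```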
5.2 Ordering Semantics
Given a world-ordering, we can state the ordering semantics as follows:[14]
Ordering Semantics
\(A \gt C\) is true iff either:
- there is no accessible \(A\)-world, or:
- there is an accessible \((A \wedge C)\)-world that is closer than any accessible \((A \wedge \neg C)\)-world.
Given the limit assumption, the ordering semantics can be equivalently stated as follows:[15]
Ordering Semantics (simplified with the limit
assumption)
\(A \gt C\) is true iff all of the closest accessible \(A\)-worlds are
\(C\)-worlds.
Figure 3 illustrates the ordering semantics. In this model, \(A \gt C\) is true at \(w\), since the closest \(A\)-worlds (in this case, \(u_1\) and \(u_3\)) are all \(C\)-worlds. By contrast, \(C \gt A\) is false since there is a closest \(C\)-world (in fact, both of them, i.e., \(v_1\) and \(v_2\)) that is not an \(A\)-world. Notice that neither of the corresponding strict conditionals (\(A \strictif C\) and \(C \strictif A\)) is true.
Figure 3: A model of \(A \gt C\) in the ordering semantics.
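The ordering semantics is also easy to compute over a finite model. The sketch below encodes closeness to \(w\) as a numerical rank and checks the two claims made about Figure 3; the specific ranks, the extra world \(z\), and the valuation are assumptions consistent with the description in the text.

```python
# A sketch of the (simplified) ordering semantics over a finite model.
# Closeness to w is encoded as a numerical rank; the ranks, the extra world z,
# and the valuation are assumptions consistent with the description of Figure 3.

rank = {"w": 0, "v1": 1, "v2": 1, "u1": 2, "u3": 2, "z": 3}
valuation = {
    "A": {"u1", "u3", "z"},
    "C": {"u1", "u3", "v1", "v2"},
}

def closest(prop):
    """The closest accessible worlds where prop is true (limit assumption)."""
    candidates = [x for x in rank if x in valuation[prop]]
    if not candidates:
        return set()
    best = min(rank[x] for x in candidates)
    return {x for x in candidates if rank[x] == best}

def would(ant, con):
    """A > C: true iff all of the closest ant-worlds are con-worlds
    (vacuously true if there are no ant-worlds)."""
    return all(x in valuation[con] for x in closest(ant))

print(would("A", "C"))   # True: the closest A-worlds, u1 and u3, are C-worlds
print(would("C", "A"))   # False: the closest C-worlds, v1 and v2, are not A-worlds
```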
None of the counterfactual fallacies hold for variably strict conditionals. The reason is simple: generally, the stronger an antecedent is, the further away one must go to find a world where that antecedent is true.
Figure 4 depicts a simple ordering model where Antecedent Strengthening fails. Imagine \(A\) stands for “I am an Olympic athlete”, \(B\) stands for “I have a broken leg”, and \(C\) stands for “I win the race”. So the closest worlds where I am an Olympic athlete (\(A\)) are ones where I do not have a broken leg (\(\neg B\)). In those worlds, I win the race (\(C\)). Further away are worlds where I am an Olympic athlete and have a broken leg (\(A \wedge B\)). In the closest such worlds, I lose the race (\(\neg C\)). This model also refutes the other counterfactual fallacies; see the note for details.[16]
Figure 4: Counterexample to the counterfactual fallacies. For instance, the closest \(A\)-worlds, \(u_1\) and \(u_3\), are all \(C\)-worlds, but the closest \(A \wedge B\)-worlds, \(z_1\) and \(z_4\), are not \(C\)-worlds.
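The failure of Antecedent Strengthening in Figure 4 can be verified the same way. In the following self-contained sketch, the ranks and valuation are assumptions matching the description above, with \(A\) read as “I am an Olympic athlete”, \(B\) as “I have a broken leg”, and \(C\) as “I win the race”.

```python
# A self-contained countermodel to Antecedent Strengthening in the style of
# Figure 4. Propositions are sets of worlds; the ranks and valuation are
# assumptions matching the description in the text.

rank = {"w": 0, "u1": 1, "u3": 1, "z1": 2, "z4": 2}
A = {"u1", "u3", "z1", "z4"}     # Olympic-athlete worlds
B = {"z1", "z4"}                 # broken-leg worlds
C = {"u1", "u3"}                 # winning worlds
not_C = set(rank) - C

def closest(prop):
    candidates = [x for x in rank if x in prop]
    best = min(rank[x] for x in candidates) if candidates else None
    return {x for x in candidates if rank[x] == best}

def would(ant, con):
    return all(x in con for x in closest(ant))

print(would(A, C))          # True:  A > C
print(would(A & B, C))      # False: (A and B) > C — Antecedent Strengthening fails
print(would(A & B, not_C))  # True:  (A and B) > not-C, as in the Sobel sequence
```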
As for Heim sequences (section 4.2), various authors have proposed variably strict explanations of their infelicity, all of which appeal to some form of context-shifting (Moss 2012; Starr 2014a; K. Lewis 2018; Boylan and Schultheis 2021; Schultheis 2025). Whether these explanations are adequate, or better than rival strict accounts, is a live debate.
5.3 Counterfactual Principles
Many of the debates surrounding the variably strict analysis are internal to its proponents. Depending on which constraints we impose on world-orderings, different principles of counterfactual reasoning will come out as valid. Variably strict theorists disagree amongst themselves over which principles and constraints they should accept.
The supplemental entry, Debates over Counterfactual Principles, surveys some of the most influential of such debates, including debates over:
- Centering principles, such as And-to-If and Modus Ponens
- Conditional Excluded Middle (CEM)
- The Limit Assumption
- Duality of might- and would-counterfactuals
- Simplification of Disjunctive Antecedents (SDA) and Replacement of Equivalent Antecedents (REA)
- Uniformity principles (e.g., CSO)
- Import-Export
For more about the logic of variably strict conditionals, see the entry on the logic of conditionals.
5.4 Objections to Similarity
Many of the philosophical objections to the variably strict account focus on its appeal to similarity. As we’ll see, developing an adequate account of similarity is a tricky business.
One immediate objection is that the notion of “similarity” is entirely vague. However, variably strict theorists view vagueness as a feature, not a bug. David Lewis (1973b) argued that the vagueness of similarity precisely matches the vagueness of counterfactuals:
Counterfactuals are notoriously vague. That does not mean we cannot give a clear account of their truth conditions. It does mean that such an account must either be stated in vague terms—which does not mean ill-understood terms—or be made relative to some parameter that is fixed only within rough limits on any given occasion of language use. It is to be hoped that this imperfectly fixed parameter is a familiar one that we would be stuck with whether or not we used it in the analysis of counterfactuals; and so it will be. It will be the relation of comparative similarity. (D. Lewis 1973b: 1)
But similarity in what respects? Countless things are “similar” in some respects but not others. New York City and San Francisco are similar when comparing their cost of living or political leanings, but not when comparing topography or climate. What notion of similarity is relevant for counterfactuals?
Initially, Lewis (1973b: 92) proposed the following:
Lewis’s 1973 Proposal: Our familiar, intuitive concept of comparative overall similarity, just applied to possible worlds, is employed in assessing counterfactuals.
Kit Fine (1975: 452) presents a powerful objection to this proposal, known as the future similarity objection. Imagine that Nixon has a button to launch nuclear missiles at Russia and trigger a nuclear war. Fortunately, he never presses the button. Plausibly, (25) is true.
- (25)
- If Nixon had pressed the button, there would have been a nuclear war.
Lewis’s 1973 Proposal incorrectly predicts (25) is false. Suppose, optimistically, there never will be a nuclear war. Then a world where Nixon presses the button and, say, a short-circuit fortunately prevents the missiles from firing is intuitively much more similar to the actual world than one where there is a nuclear war. Tichý (1976: 271) presents a related but distinct counterexample (which is discussed more in Starr 2019: §2.5.1, §3.1).
In response, Lewis (1979a: 472) proposed a ranked system of weights that give what he calls “the standard resolution of similarity”:
Lewis’s 1979 System of Weights
- Avoid big, widespread, diverse violations of law. (“big miracles”)
- Maximize the spatiotemporal region throughout which perfect match of particular fact prevails.
- Avoid even small, localized, simple violations of law. (“little miracles”)
- It is of little or no importance to secure approximate similarity of particular fact, even in matters that concern us greatly.
So in Fine’s counterexample, a world where Nixon pushes the button and a small miracle then short-circuits the equipment to avert nuclear war counts as less similar than one where there is no such additional miracle and a nuclear war ensues. The former world is similar to our own only in one insignificant respect (approximate similarity of particular fact) but dissimilar in one important respect (the extra small miracle).
Unfortunately, even Lewis’s 1979 System of Weights faces counterexamples: particular matters of fact are sometimes held fixed after all. This is demonstrated by Morgenbesser conditionals, named after Sidney Morgenbesser, who made this observation.[17]
- (26)
- [You’re invited to bet heads on a coin toss. You decline.
The coin comes up heads.]
See, if you had bet heads, you would have won! (Slote 1978: 27)
Lewis (1979a: 472) admits not knowing what to make of such cases. Additional problems have been discussed (Bowie 1979; Elga 2000; Edgington 2004; Schaffer 2004; Kment 2006; Wasserman 2006; Dorr 2016).[18] Morreau (2010) even presents a formal argument, based on Arrow’s Theorem in social choice theory, suggesting there is no unproblematic way to aggregate respects of similarity into a coherent notion of overall similarity.
Independently, critics have questioned the very idea that counterfactuals rely on similarity. Horwich (1987: 172) pointedly asks “why we should have evolved such a baroque notion of counterfactual dependence”. Bowie (1979: 496–497) argues that no such system of weights can be given an independent, non-circular justification, given how complex the weights must be to correctly predict our counterfactual judgments. Relatedly, Hájek (2014b: 250) objects that similarity is not well suited for the kinds of counterfactuals commonly used in science:
Science has no truck with a notion of similarity; nor does Lewis’s (1979a) ordering of what matters to similarity have a basis in science.
In response, some variably strict theorists propose alternative notions of closeness not solely based on similarity (Kment 2006, 2014; Ippolito 2016; K. Lewis 2018). Others reject the reductionist project of seeking a non-circular, non-counterfactual analysis of closeness (e.g., Stalnaker 2019: 209ff; cf. Williamson’s (2000) knowledge-first approach to epistemology). Still, many find the reliance on these “squishy” notions troubling. These criticisms are precisely what motivate the alternative accounts explored in section 6.
6. Alternative Analyses
Both the strict and variably strict analyses rely on some notion of similarity. Variably strict analyses build similarity directly into the truth conditions of counterfactuals, while strict analyses appeal to similarity in pragmatic mechanisms such as presupposition accommodation (section 4.3). But, as we saw in section 5.4, theorists question the utility of similarity. Angelika Kratzer (1989: 626) nicely puts the point:
[I]t is not that the similarity theory says anything false about [particular] examples… It just doesn’t say enough. It stays vague where our intuitions are relatively sharp. I think we should aim for a theory of counterfactuals that is able to make more concrete predictions with respect to particular examples.
The analyses discussed in this section all eschew similarity in favor of more “objective” methods for evaluating counterfactuals, such as laws, objective chance, and explanation.
6.1 Premise Semantics
Nelson Goodman (1947) thought a counterfactual is true iff its consequent is entailed by its antecedent together with further premises.[19] Historically, this was a leading view of counterfactuals prior to the development of variably strict approaches (Ramsey 1931; Chisholm 1946; Mackie 1962; Rescher 1964). Later, this approach was refined by Frank Veltman (1976, 1985, 2005) and Angelika Kratzer (1979, 1981a,b, 1989, 1990, 2002, 2012), and is now called premise semantics.
Premise Semantics
\(A \gt C\) is true iff \(A\) together with further premises
associated with \(A\) necessarily entail \(C\).
This approach splits the difference between strict and variably strict accounts. Counterfactuals are strict conditionals containing further hidden premises in the antecedent. But which premises one adds may depend on what the antecedent is. The further premises associated with \(A\) are known as the premise set of \(A\).
One immediate question is what these “further” premises are: how do we determine an antecedent’s premise set? Goodman took the premise set to consist of facts and laws that are “cotenable” with the antecedent. But Goodman saw no way to define the notion of cotenability in non-counterfactual terms—this is precisely Goodman’s problem (section 3.1).
Both Veltman and Kratzer approached the question with more optimism. In different ways, they suggest context specifies which premises are added to an antecedent. For example, Kratzer assumes context supplies two sets of propositions for each world: a modal base, roughly the non-negotiable facts settled by context (such as certain laws or generalizations) and an ordering source, roughly the negotiable background facts one should try to preserve in evaluating modals and conditionals but may have to forego. A premise set for an antecedent is a consistent set of propositions that includes (i) the modal base, (ii) the antecedent, and (iii) as many propositions from the ordering source as possible while remaining consistent. (Note, this means an antecedent may have multiple premise sets; counterfactuals require every premise set to entail the consequent.) Veltman proposes a related but interestingly different way of computing premise sets (see Starr 2019: §3.1 for an overview).
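To see how this recipe works in a simple case, consider the match example again. The sketch below treats propositions as sets of worlds and computes Kratzer-style premise sets; the choice of modal base (the law connecting striking, oxygen, dryness, and lighting) and ordering source (which actual facts to preserve) is contextual, and the particular choices here are stipulated for illustration.

```python
# A toy sketch of Kratzer-style premise semantics for the match example.
# Propositions are sets of worlds (here, truth assignments to four atoms).
# Which propositions go into the modal base and ordering source is fixed by
# context; the choices below are stipulated for illustration.
from itertools import combinations

ATOMS = ["struck", "oxygen", "dry", "lit"]
WORLDS = [frozenset(c) for n in range(len(ATOMS) + 1)
          for c in combinations(ATOMS, n)]                  # all truth assignments

def prop(atom):
    return {w for w in WORLDS if atom in w}

struck, oxygen, dry, lit = map(prop, ATOMS)
law = {w for w in WORLDS                                    # struck & oxygen & dry -> lit
       if not {"struck", "oxygen", "dry"} <= w or "lit" in w}

modal_base = [law]
ordering_source = [oxygen, dry, set(WORLDS) - struck]       # actual facts worth preserving

def consistent(props):
    return bool(set.intersection(*map(set, props)))

def premise_sets(antecedent):
    """Each premise set = modal base + antecedent + a maximal subset of the
    ordering source that keeps the whole set consistent."""
    base = modal_base + [antecedent]
    idx = range(len(ordering_source))
    subs = [set(c) for n in range(len(ordering_source) + 1)
            for c in combinations(idx, n)]
    good = [s for s in subs
            if consistent(base + [ordering_source[i] for i in s])]
    maximal = [s for s in good if not any(s < t for t in good)]
    return [base + [ordering_source[i] for i in s] for s in maximal]

def would(antecedent, consequent):
    """True iff every premise set for the antecedent entails the consequent."""
    return all(set.intersection(*map(set, ps)) <= consequent
               for ps in premise_sets(antecedent))

print(would(struck, lit))                  # True: "if the match had been struck, it would have lit"
print(would(struck, set(WORLDS) - lit))    # False
```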
David Lewis (1981b) shows that Kratzer’s premise semantics is equivalent to the ordering semantics without the totality requirement on world-orderings (note 13). Where these approaches diverge is not in their logic (though see Chemla 2011 and Kaufmann 2017 for important caveats), but in their truth-conditions. While the ordering semantics must incorporate a complex system of weighing respects of similarity, like Lewis’s 1979 System of Weights, to make correct predictions, these predictions more naturally fall out from the way premises are calculated in premise semantics.
One might complain this approach does not address the worry of making concrete predictions about particular cases, as it remains unclear how context supplies a modal base and ordering source (Kanazawa, Kaufmann, and Peters 2005; Kratzer 2005; Veltman 2005; K. Schulz 2007, 2011; see Starr 2019: §3.1 for an overview). Several authors have suggested addressing these concerns by combining premise semantics with interventionist approaches from section 6.3 (K. Schulz 2007, 2011; Kaufmann 2013; Santorio 2014, 2019).
6.2 Probabilistic Analyses
While premise semantics has been prominent among linguists, probabilistic analyses have been more prominent among epistemologists and philosophers of science.
Broadly, there are two kinds of probabilistic analyses, corresponding to two notions of probability. First, there are accounts based on subjective probability. These are often inspired by Bayesian epistemology, which replaces the ordinary binary conception of belief with the gradated notion of credence. Second, there are accounts based on objective probability, i.e., chance. These are typically inspired by modern physical theories such as statistical mechanics or quantum mechanics. For more on different notions of probability, see the entry on interpretations of probability.
On the subjective side, Ernest Adams (1965, 1975) made a seminal proposal, based on his analysis of indicative conditionals:
Adams’s Prior Probability Analysis
The assertability of \(A \gt C\) is proportional to \(P_0(C \mid A)\),
where \(P_0\) is the agent’s credence function prior to learning
that \(A\) was false.
Adams did not think conditionals had truth conditions. One can only talk of a conditional’s “degree of assertability”. One challenge for this approach is to integrate assertability into a comprehensive compositional semantic theory of natural language, as such theories typically assume that (i) meaning is compositional (the meaning of a sentence is a function of its parts) and (ii) meaning is analyzed in terms of truth conditions. Several theorists have since tried to refine Adams's idea to meet this challenge (Manor 1974; S. Barker 1995; Edgington 1995, 2003, 2004; Goldstein 2019).
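For illustration only, the following sketch computes the key quantity in Adams’s analysis, the prior conditional credence \(P_0(C \mid A)\), from a toy prior over worlds. The worlds, numbers, and the assertability function are assumptions made up for the example, not anything Adams specifies.

```python
# A toy computation of Adams-style assertability as prior conditional credence.
# The prior distribution and the antecedent/consequent sets are illustrative.

def assertability(prior, antecedent, consequent):
    """Degree of assertability of A > C, measured by P0(C | A)."""
    p_a = sum(p for w, p in prior.items() if w in antecedent)
    if p_a == 0:
        return None  # conditional credence undefined if the prior gives A no weight
    p_ac = sum(p for w, p in prior.items() if w in antecedent and w in consequent)
    return p_ac / p_a

prior = {"w1": 0.4, "w2": 0.3, "w3": 0.2, "w4": 0.1}   # credences before learning not-A
A = {"w3", "w4"}                                        # antecedent-worlds
C = {"w3"}                                              # consequent-worlds

print(assertability(prior, A, C))   # 0.2 / 0.3, roughly 0.67
```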
While subjective probabilistic analyses have been pursued primarily by epistemologists, philosophers of science have been more drawn to analyses appealing to objective chance (Kvart 1986, 1992; Skyrms 1981; Kaufmann 2005a; Loewer 2007; Leitgeb 2012a,b; Hájek 2014a, 2025; Kocurek 2022). Such accounts aim to ground counterfactuals in notions that are more objective and more frequently employed in science than comparative similarity.
Very broadly, these approaches identify counterfactuals with conditional chance statements in some form. More precisely, where \({\textit{ch}}(A) = x\) says “the objective chance that \(A\) is \(x\)”:
Objective Chance Analysis
\(A \gt ({\textit{ch}}(C) = x)\) is true iff the objective chance that
\(C\), given \(A\), was \(x\) shortly before it was settled whether
\(A\) was false.
Two questions immediately arise. First, what does “shortly before” mean? One cannot say it is the time immediately before the antecedent is settled false: there may be no such time if time is dense. One might suggest it is the time at which \(A\) had the greatest chance of obtaining, but that could be quite a long time before the antecedent is settled false if its probability drops gradually enough. More questions arise for antecedents that are not time-specific, as in (1): at what point is the antecedent If cats were able to talk,… settled false? Similar problems arise for subjective probabilistic accounts.[20]
Second, what about bare counterfactuals of the form \(A \gt C\), whose consequent is not an objective chance statement? Theorists differ on what to say. Leitgeb (2012a,b) allows bare counterfactuals to be true when the relevant conditional chance is sufficiently high. Hájek (2014a, 2025), by contrast, requires the relevant conditional chance to be maximally high, equating \(A \gt C\) with \(A \gt ({\textit{ch}}(C) = 1)\). Each of these answers has drawbacks. Requiring only that the objective chance be sufficiently high fails to validate a plausible principle of counterfactual reasoning known as Agglomeration—\(A \gt B, A \gt C \vDash A \gt (B \wedge C)\)—since even when both \(B\) and \(C\) are sufficiently probable, \(B \wedge C\) may not be (Hawthorne 2005; Leitgeb 2012a; Hájek 2014a). By contrast, requiring the relevant conditional chance to be 1 will ensure most ordinary counterfactuals are false (Hájek defends this conclusion; see section 7.1). Kocurek (2022) offers a middle way between these on which Agglomeration holds but bare counterfactuals do not require chance 1 to be true.
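The Agglomeration worry is easy to see with made-up numbers. In the sketch below, the threshold and conditional chances are purely illustrative, and \(B\) and \(C\) are assumed to be independent given \(A\): each conjunct clears the threshold, but their conjunction does not.

```python
# Illustrative numbers only: a "sufficiently high chance" threshold lets A > B
# and A > C come out true while A > (B and C) comes out false.

threshold = 0.85          # assumed cutoff for "sufficiently high"
p_b_given_a = 0.9
p_c_given_a = 0.9
p_bc_given_a = p_b_given_a * p_c_given_a   # 0.81, assuming independence given A

print(p_b_given_a >= threshold, p_c_given_a >= threshold)   # True True
print(p_bc_given_a >= threshold)                            # False
```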
One worry for all probabilistic analyses, subjective or objective, is whether they are psychologically realistic. Do people really judge counterfactuals using probability? There is a large literature in psychology, beginning with Kahneman, Slovic, and Tversky (1982), showing that human reasoning diverges in predictable ways from precise probabilistic reasoning. It remains to be seen whether the kinds of probabilistic mistakes humans make are reflected in their counterfactual judgments. Separately, probabilistic knowledge of this sort imposes unreasonable demands on memory: even for very simple domains the probability calculus does not provide computationally tractable representations and algorithms for implementing Bayesian intelligence. These concerns are one of the main motivations for interventionist approaches discussed next.
6.3 Interventionist Semantics
Recent work in cognitive science and artificial intelligence has made heavy use of new mathematical tools, developed by Spirtes, Glymour, and Scheines (1993, 2000) and Pearl (2000, 2009), known as Bayesian networks, structural equation models, or causal models. These tools help address the memory limitations of traditional Bayesian frameworks while also providing simple algorithms for causal and counterfactual reasoning, among other cognitive processes. They also lead to a new theory of counterfactuals known as the interventionist semantics.
Structural equation models are models of relations of conditional dependence, which typically include three elements: (1) a set of variables, representing different events or facts, which can take various values (e.g., truth values, a probability, etc.); (2) a set of structural equations, which specify how the values of certain variables functionally depend on those of others; and (3) an assignment of values to variables in accordance with these structural equations. Figure 5 depicts a structural equation model involving eight variables. For example, \(H \mathrel{\colon=} F \vee G\) means the value of \(H\) is determined by the value of \(F \vee G\) (but not vice versa).[21]
Figure 5: A structural equation model.
To check whether a counterfactual is true in a structural equation model, one performs an intervention on the model to ensure the antecedent holds. To do this, one first “surgically” removes all arrows leading into the variables that correspond to the antecedent. One then resets the values of those variables to make the antecedent true. Finally, one updates the values of subsequent “descendant” variables that depend on the antecedent variables according to the remaining structural equations. So in the example above, \(E \gt H\) is true: if we intervene to ensure \(E\) is true (removing the arrow from \(C\) to \(E\) and changing the values of \(E\) and its descendants accordingly), then \(H\) is true. (Compare Lewis’s 1979 System of Weights.) More generally:
Interventionist Semantics
\(A \gt C\) is true iff after intervening to make \(A\) true, \(C\) is
true.
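For concreteness, here is a minimal sketch of this procedure on a toy two-equation model; the variables (rain, sprinkler, wet) and their equations are illustrative assumptions, not the model in Figure 5. Intervening severs the structural equation of the targeted variable, resets its value, and lets the remaining equations recompute everything downstream.

```python
# A toy interventionist evaluation over a structural equation model. The model
# (rain, sprinkler, wet) is an illustrative stand-in, not the model in Figure 5.

def solve(equations, settings):
    """Compute values for all variables from fixed settings and structural equations."""
    values = dict(settings)
    changed = True
    while changed:
        changed = False
        for var, eq in equations.items():
            if var not in values:
                try:
                    values[var] = eq(values)
                    changed = True
                except KeyError:
                    pass  # a parent variable has not been computed yet
    return values

def intervene(equations, settings, do):
    """Sever the equations of the intervened variables and reset their values."""
    pruned = {v: eq for v, eq in equations.items() if v not in do}
    return solve(pruned, {**settings, **do})

def counterfactual(equations, settings, do, consequent):
    """A > C is true iff C holds after intervening to make A true."""
    return consequent(intervene(equations, settings, do))

# Actual situation: it rained, so the sprinkler stayed off, and the grass is wet.
equations = {
    "sprinkler": lambda v: not v["rain"],
    "wet":       lambda v: v["rain"] or v["sprinkler"],
}
settings = {"rain": True}

# "If it hadn't rained, the grass would still have been wet":
print(counterfactual(equations, settings, {"rain": False}, lambda v: v["wet"]))  # True
```

The pruning step implements the “surgical” removal of incoming arrows described above: the intervened variable no longer listens to its parents, while its descendants are recomputed from the remaining equations.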
Structural equation models have many applications in cognitive science (Glymour 2001; Gopnik et al. 2004; Sloman 2005; Sloman and Lagnado 2005; Gopnik and Tenenbaum 2007; Chater et al. 2010; Rips 2010; Lucas and Kemp 2015) and artificial intelligence (see Pearl 2002, 2009, 2013; see also the entry on logic-based artificial intelligence). Interventionist counterfactuals are also used in robotics to develop safer autonomous vehicles (Thrun et al. 2006; Parisien and Thagard 2008; Chen et al. 2021), in climate science to attribute extreme weather events to climate change (Hannart et al. 2016; Stott et al. 2016; Hannart and Naveau 2018; Mengel et al. 2021; Mérel, Paroissien, and Gammans 2024), and in explainable AI (XAI) to address the “black box” problem (Wachter, Mittelstadt, and Russell 2018; Asher et al. 2022; Chou et al. 2022; Baron 2023; Celar and Byrne 2023; Verma et al. 2024; see Kasirzadeh and Smart 2021 for criticism).
As a semantic analysis, interventionism faces challenges. Some question whether even causal counterfactuals are best modeled in interventionist terms (Hiddleston 2005; Fisher 2017a,b; Zhang, Lam, and De Clercq 2019; see Starr 2019: §3.3 for discussion). Some raise questions over how to interpret these structural equations and whether they can, or must, be understood in non-counterfactual terms (cf. Goodman’s problem; see Spirtes, Glymour, and Scheines 1993, 2000; Pearl 2000, 2009; Woodward 2002, 2003; Halpern and Pearl 2005a,b). Finally, most interventionist accounts only accommodate very simple antecedents (e.g., a conjunction of atomics or negated atomics). Generalizing the semantics to incorporate more complex antecedents is an open research task. There are indications of progress on this front by combining the structural equation approach with others or employing more sophisticated techniques for computing interventions (K. Schulz 2007, 2011; Briggs 2012 (cf. Fine 2012a,b); Kaufmann 2013; Santorio 2014, 2019; Lucas and Kemp 2015; Snider and Bjorndahl 2015; Champollion, Ciardelli, and Zhang 2016; Günther 2017; Ciardelli, Zhang, and Champollion 2018; Stern 2021; Khoo 2022a; Wysocki 2023, forthcoming). See the entries on causal models, causation and manipulability, and counterfactual theories of causation.
7. Recent Debates
7.1 Counterfactual Skepticism and Counterfactual Probability
One burgeoning area of research is the interaction between counterfactuals and probability. While this topic has been extensively explored for indicatives (see the entries on indicative conditionals and the logic of conditionals), theorists have only begun to scratch the surface on how to carry over these lessons to counterfactuals (Skyrms 1981; Edgington 2004, 2008; Williams 2012; Moss 2013; Hájek 2014a,b; M. Schulz 2014, 2017; Schwarz 2018; Khoo 2022a,b; Kocurek 2022; Santorio 2022, 2023; Schultheis 2023; Willer 2025).
Reflecting on counterfactual probability leads to a puzzle. Theorists generally think that, in most ordinary contexts, counterfactuals like (27) are true.
- (27)
- If I were in my office, I would breathe normally.
Recently, however, Alan Hájek (2014a, 2021a,b, 2025) has used objective chance to defend counterfactual skepticism, the view that most ordinary counterfactuals are false.[22] His argument is strikingly simple, yet powerful.[23] According to quantum mechanics, there is a (very!) small chance that the air molecules in my office all quantum tunnel outside it. Thus, quantum mechanics seems to entail:
- (28)
- If I were in my office, there would be some chance of the air molecules in my office quantum tunneling to the outside.
However, (28) seems to entail (29):
- (29)
- If I were in my office, the air molecules in my office might have quantum tunneled to the outside.
But (29) seems inconsistent with (27). Hájek concludes our best physical theories imply most ordinary counterfactuals like (27) are false. This simple argument has sparked many attempts to resist it (D. Lewis 1986b; DeRose 1999; J. Bennett 2003; Hawthorne 2005; Edgington 2008; Ichikawa 2011; Leitgeb 2012a,b; Moss 2013; K. Lewis 2016; Stefánsson 2018; Sandgren and Steele 2021; Kocurek 2022; Boylan 2024; see Loewenstein 2021a for a notable exception). Loewenstein (2021b) provides a helpful overview of these debates.[24]
One option appeals to context-shifting. Karen Lewis (2016) argues that in contexts where we are inclined to assert (27), quantum tunneling possibilities are implicitly excluded as irrelevant (cf. Sandgren and Steele 2021; Boylan 2024). This appeal to context-shifting has been criticized, however (Hájek 2021b; Loewenstein 2021a,b). For example, such views face a dilemma: they must say either (28) is simply false in the context where (27) is true, contrary to physics, or else (28) does not entail (29) even holding context fixed, contrary to intuition.
Another option holds that (27) and (29) are compatible (thus denying the duality of would and might; see section A.2 of Debates over Counterfactual Principles). Stefánsson (2018) argues that (29) does not undermine the truth of (27) but only our ability to know it. He develops this idea using primitive counterfactual facts, or counterfacts (see Hawthorne 2005). The metaphysics of counterfacts is controversial, however (Hájek 2021a; Khoo 2022b). Kocurek (2022) argues that the apparent tension between (27) and (29) is explained by the fact that they cannot both be settled true. On this view, Hájek’s argument only supports counterfactual indeterminism, the view that most ordinary counterfactuals are unsettled, not that they’re false (cf. M. Schulz 2014, 2017). This draws parallels with ordinary indeterminism about future contingents (cf. Cariani 2022a; Boylan 2024; see P. Todd 2021 for skepticism about future contingents).
This connection between future contingents and counterfactuals may go even deeper. Cariani and Santorio (2018) and Cariani (2021) defend a semantics for future will that closely resembles the variably strict semantics for counterfactuals. Indeed, linguists hypothesize that both will and would derive from a common modal morpheme, sometimes called woll (Abusch 1997, 1998; Condoravdi 2002; Kaufmann 2005b), suggesting a unified treatment of counterfactuals and future contingents (for discussion, see Boylan 2023; Ninan 2024; P. Todd 2024; see also the entry on future contingents).
7.2 Counterpossibles
Another active area of research involves counterfactuals with impossible antecedents, also known as counterpossibles:
- (30)
- If Hobbes had squared the circle, he would be a famous mathematician.
- (31)
- If water were hydrogen peroxide, life would not exist.[25]
All of the semantic analyses of counterfactuals discussed in this entry strikingly predict that counterpossibles are trivially true. For if there are no possible worlds where the antecedent is true, it follows that, vacuously, “all” of the (accessible, closest, etc.) antecedent-worlds are consequent-worlds. This is counterintuitive, however. For one, while (30) and (31) seem true, they do not seem trivially true. More directly, neither of the following counterpossibles sounds true:
- (32)
- If Hobbes had squared the circle, he would have eliminated world hunger.
- (33)
- If water were hydrogen peroxide, we would all be just fine.
According to vacuism, the standard semantic analyses are correct: all counterpossibles are vacuously true (Lewis 1973b; Stalnaker 1968, 1996; Kratzer 1979; J. Bennett 2003; Williamson 2007, 2017, 2020; Emery and Hill 2017). Vacuists postulate pragmatic explanations for why (32) and (33) seem false. Emery and Hill (2017) defend a Gricean story on which the apparent falsity of counterpossibles arises from an implicature. Williamson (2007, 2017, 2020) argues counterpossibles seem false due to our reliance on a heuristic for evaluating counterfactuals that, while generally reliable, fails in these cases.
According to nonvacuism, the standard analyses are wrong: some counterpossibles are indeed false (Cohen 1987, 1990; Mares 1997; Nolan 1997; J. Goodman 2004; Vander Laan 2004; Krakauer 2012; Brogaard and Salerno 2013; Jago 2014; Kment 2014; Bernstein 2016; Berto, French, Priest, and Ripley 2018; Berto and Jago 2019). While there are many ways of implementing this idea, the simplest “quick fix” for most semantic analyses is to add impossible worlds to their models. If possible worlds are ways the world could have been, impossible worlds are ways the world could not have been. While some reject the coherence of this notion (Lewis 1986a: 7), many philosophers adopt relatively harmless conceptions of impossible worlds as abstract objects (e.g., sets of sentences or propositions; see Nolan 1997). Thus, we might amend variably strict analyses as follows: a counterfactual is true iff the consequent is true at all of the closest antecedent-worlds, regardless of whether those worlds are possible.
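To illustrate the amended clause, here is a toy sketch in which impossible worlds are modeled, harmlessly, as arbitrary sets of sentences and closeness is given by a stipulated ranking. The worlds, sentences, and ranking are assumptions made up for the example.

```python
# A toy version of the variably strict clause extended with impossible worlds.
# Worlds are modeled as arbitrary sets of sentences (so impossible worlds are
# allowed); the worlds, sentences, and closeness ranking are illustrative.

worlds = {
    "actual":       {"hobbes_failed_to_square_the_circle"},
    "w_squared":    {"squared_circle", "famous_mathematician"},   # impossible world
    "w_farfetched": {"squared_circle", "ended_world_hunger"},     # impossible, and less close
}

rank = {"actual": 0, "w_squared": 1, "w_farfetched": 2}   # lower = closer to actuality

def counterfactual(antecedent, consequent):
    """A > C is true iff C holds at all closest antecedent-worlds, possible or not."""
    a_worlds = [w for w in worlds if antecedent in worlds[w]]
    if not a_worlds:
        return True   # still vacuous if no world, even an impossible one, verifies A
    closest = min(rank[w] for w in a_worlds)
    return all(consequent in worlds[w] for w in a_worlds if rank[w] == closest)

print(counterfactual("squared_circle", "famous_mathematician"))   # True, cf. (30)
print(counterfactual("squared_circle", "ended_world_hunger"))     # False, cf. (32)
```

Everything philosophically contentious, in particular how impossible worlds are ranked for closeness, is simply stipulated here in the rank assignment.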
There is much debate over these general strategies. On the pragmatic side, nonvacuists criticize pragmatic explanations as insufficiently general just-so stories (see Byrne 2024 for recent empirical tests). On the semantic side, the use of impossible worlds raises many difficult questions. For example, what determines whether one impossible world is closer than another? Are there any constraints? Can an impossible world be “closer” than a possible world? (See Nolan 1997; Vander Laan 2004; Krakauer 2012; Kment 2014; Bernstein 2016.) What does the logic of counterfactuals look like with impossible worlds? Do any counterfactual principles survive this amendment? (See Nolan 1997; Brogaard and Salerno 2013; Berto, French, Priest, and Ripley 2018; Kocurek 2020; Kocurek and Jerzak 2021.)
Some authors defend intermediate positions. Vetter (2016a,b) utilizes Edgington’s (2008) observation that counterfactuals have an “epistemic” reading (section 7.3) to argue that counterpossibles are only nonvacuous on that reading (see Dohrn 2021 for criticism). Kocurek and Jerzak (2021) argue nonvacuous counterpossibles are counterconventionals, where the antecedent shifts the conventional interpretation of the relevant expressions (cf. Einheuser 2006; Kocurek, Jerzak, and Rudolph 2020).[26] Thus, when counterpossibles are nonvacuous, it is because the interpretation of their antecedent shifts to express a possible proposition (see Kocurek 2019, 2021b, 2024a,b for refinements of this approach; cf. Sandgren and Tanaka 2020; Locke 2021; Knowles 2023).
Another recent debate concerns counterpossibles with logically impossible antecedents, or counterlogicals (e.g., If it were both raining and not raining… or If the law of excluded middle had failed…).[27] Even nonvacuists disagree over whether counterlogicals are all vacuous: some nonvacuists say yes (e.g., Downing 1959; J. Goodman 2004; Kment 2014) while others say no (e.g., Cohen 1987, 1990; Mares 1997; Nolan 1997; Vander Laan 2004; Krakauer 2012; Brogaard and Salerno 2013; Berto et al. 2018). Kocurek and Jerzak (2021) summarize the arguments on both sides. This issue is significant for the study of counterfactuals. For example, some argue that counterlogicals trivialize the logic of counterfactuals: nearly every counterfactual principle, with few exceptions, can be undermined by a suitably chosen counterlogical (Cohen 1990: 131; Nolan 1997: 554–555). By contrast, Kocurek and Jerzak (2021) show how analyzing counterlogicals as counterconventionals avoids this consequence and leads to a non-trivial logic of counterfactuals (see Kocurek 2024a,b for an axiomatization of this logic). But what to make of this argument, and counterlogicals more generally, is still a live issue.
There are still many questions about counterpossibles and the field is actively growing. Kocurek (2021a) provides a general overview of the literature on counterpossibles. See also the entries on impossible worlds and hyperintensionality.
7.3 A Unified Theory of Ifs
This entry started with the distinction between indicative conditionals and subjunctive conditionals. Historically, many philosophers have simply assumed indicatives and subjunctives involve related but substantially different meanings (D. Lewis 1973b; Gibbard 1981; Jackson 1987; J. Bennett 2003). But there is no obvious lexical ambiguity in if. Indeed, with very few exceptions, languages all across the world use the same word for indicative and subjunctive if.[28] Some natural languages, such as Mandarin, Japanese, and Indonesian, do not even distinguish indicatives from subjunctives grammatically: the distinction must be pragmatically or contextually communicated (Comrie 1986).
Moreover, as illustrated by (34), future-directed indicative conditionals become counterfactuals once the antecedent is settled false (J. Bennett 2003; DeRose 2010):
- (34)
-
- a.
- [You bet that a coin will land heads. It has not been flipped yet.] If the coin lands heads, you will win the bet.
- b.
- [The coin lands tails.] Dang! If the coin had landed heads, you would have won!
This suggests there should be a “unified” theory of conditionals, which assigns a single semantic entry to if and derives the differences between indicatives and subjunctives from other grammatical features, such as tense and mood (see Carter 2023 for further arguments). How best to do this is an open question (Stalnaker 1975; Ellis 1978; Davis 1979; Thomason and Gupta 1981; Kratzer 1986, 2012; Edgington 1995; Weatherson 2001; Starr 2014a; Khoo 2015, 2022).
Stalnaker (1975) postulates indicative and subjunctive conditionals are variably strict, but indicatives carry an additional pragmatic constraint that subjunctives lack: for an indicative to sound felicitous, the closest antecedent-worlds to worlds in the “context set”—i.e., the set of worlds that are compatible with everything the conversational participants mutually assume—must also be within the context set (see Edgington 1995; Leahy 2011; Mandelkern 2018; Boylan and Schultheis 2022; Cariani 2022b for discussion). But at most, this characterizes an important difference between indicatives and subjunctives; it does little to explain why or how this difference arises in the first place.
Weatherson (2001) suggests these differences are explained by different flavors of modality: indicatives are epistemic conditionals, concerning our knowledge and uncertainty, whereas subjunctives are metaphysical, concerning objective dependency and tendency relations. See the entry on varieties of modality.
There is a complication, however: Dorothy Edgington (2008) observes counterfactuals can have an epistemic reading.[29] Imagine the FBI narrowed the possible Kennedy assassins to Oswald and Jones. After bringing both in for questioning, the FBI arrests Oswald. When asked why she brought Jones in, the FBI director might say:
- (35)
- If Oswald hadn’t been the murderer, it would have been Jones.
On one reading, the “epistemic reading”, (35) sounds true: the evidence at the time suggested that either Oswald or Jones was the shooter, so if it weren’t Oswald, it would be Jones. But on another reading, the “metaphysical” or “circumstantial” reading, (35) sounds false (cf. (4)): If Oswald hadn’t shot Kennedy, no one else would have. While the metaphysical reading may be more standard, the existence of epistemic readings suggests we can’t simply equate the indicative-subjunctive distinction with the epistemic-metaphysical distinction (Mackay 2023).
A further challenge for a unified theory of ifs is to explain how the differences between indicatives and subjunctives are linked to their different morphologies. One basic question concerns the role of tense in subjunctives. While subjunctive conditionals appear to contain past (or past perfect) tense, it is unclear whether this tense marking carries temporal meaning. For example, adding tomorrow to a past- or past-perfect-tensed sentence is typically ungrammatical, but it is fine in counterfactual conditionals:
- (36)
-
- a.
- #Bob had danced tomorrow.
- b.
- Too bad Bob broke his leg and can’t dance anymore. If Bob had danced tomorrow, Leland would have danced.
- (37)
-
- a.
- #Bob and Leland were on the dance floor tomorrow.
- b.
- Too bad Bob and Leland can’t come to the dance. If Bob and Leland were on the dance floor tomorrow, the crowd would go wild.
Iatridou (2000) suggests the tense marking in subjunctive conditionals is a “fake tense”. The literature is divided on the relationship between fake tense and real tense.
According to modal past theories, the past tense in subjunctives serves a modal function rather than a temporal one: it signals that the possibility described by the antecedent is not assumed to be among those left open by the discourse (Isard 1974; Lyons 1977; Palmer 1986; Iatridou 2000; K. Schulz 2007, 2014; Starr 2014a; Mackay 2019). This explains why indicatives, but not subjunctives, sound marked when their antecedent is explicitly denied. However, such a view must explain why tense should be systematically ambiguous between a temporal and modal reading in languages across the world (see Iatridou 2000; K. Schulz 2014; Mackay 2015, 2017).
A further complication is backtracking counterfactuals. Consider the following example due to Frank Jackson (1977) (cf. Gibbard’s (1981) “Sly Pete” case). Imagine your thrill-seeking friend sits on the edge of the roof of a tall building. Your friend is not suicidal, however, and so is not inclined to jump without proper safety measures in place, such as a net to catch them. They eventually step away from the edge safely. Now contrast the following pair of conditionals:
- (38)
-
- a.
- If your friend were to jump, they would fall to their death.
- b.
- If your friend had jumped, they would have ensured proper safety measures were in place so that they wouldn’t have fallen to their death.
Both counterfactuals seem reasonable (cf. Quine’s Caesar example in (10)). While (38a) is a forward tracking counterfactual, (38b) is a backtracking counterfactual (Downing 1959; Jackson 1977; Lewis 1979a; J. Bennett 1984, 2003; Khoo 2017). Intuitively, in evaluating the first, we hold fixed the past up to the point where your friend jumps, including the fact that (say) there is no net to catch your friend. In evaluating the second, however, we “backtrack” further into the past to ask what the conditions must be so as to lead to the antecedent being true. Notice the backtracking reading is easier to hear with the pluperfect morphology in (38b) (Davis 1979; Morton 1981). This suggests the tense carries something of its original temporal meaning.
According to temporal past theories, subjunctives are just the past-tensed versions of indicatives (Adams 1976; Skyrms 1981; Tedeschi 1981; Dudman 1984a,b; Arregui 2007, 2009; Ippolito 2006, 2008, 2013; Khoo 2015, 2022a). Thus, even though unembedded past-tensed sentences sound bad with tomorrow, they may sound fine if they are understood as embedded in a past-tensed indicative. Such theories thus aim to explain the differences between indicatives and subjunctives without postulating an ambiguity in tense morphology.
One challenge is to explain Morgenbesser conditionals (section 5.4). For example, while (39) seems true, the corresponding indicative conditional in (40) did not seem to be true in the past (S. Barker 1998; Edgington 2004; K. Schulz 2007; see Khoo 2022 for a defense of temporal past theories):
- (39)
- [You’re invited to bet heads on a coin toss. You decline. The coin comes up heads.] See, if you had bet heads, you would have won! (Slote 1978: 27)
- (40)
- If you bet heads, you will win!
To summarize: While it is tempting to develop a unified theory of ifs, it is quite tricky. Developing a theory that can explain all of these distinctions (metaphysical vs. epistemic, forward vs. backtracking, etc.) and account for the way if interacts with tense is one of the Holy Grails for theorists working on conditionals.