With the rapid development of artificial intelligence have come concerns about how machines will make moral decisions, and the major challenge of quantifying societal expectations about the ethical principles that should guide machine behaviour. To address this challenge, we deployed the Moral Machine, an online experimental platform designed to explore the moral dilemmas faced by autonomous vehicles. This platform gathered 40 million decisions in ten languages from millions of people in 233 countries and territories. Here we describe the results of this experiment. First, we summarize global moral preferences. Second, we document individual variations in preferences, based on respondents’ demographics. Third, we report cross-cultural ethical variation, and uncover three major clusters of countries. Fourth, we show that these differences correlate with modern institutions and deep cultural traits. We discuss how these preferences can contribute to developing global, socially acceptable principles for machine ethics. All data used in this article are publicly available.

Robustness checks: external validation of three factors. Calculated values correspond to those in Fig. 2a (AMCEs calculated using conjoint analysis). For example, ‘Sparing Pedestrians [Relation to AV]’ refers to the difference between the probability of sparing pedestrians and the probability of sparing passengers (attribute name: Relation to AV), aggregated over all other attributes. Error bars represent 95% confidence intervals of the means. a, Validation of textual description (seen versus not seen). By default, respondents see only the visual representation of a scenario, and the interpretation of what type of character each figure represents (for example, a female doctor) may not be obvious. Optionally, respondents can read a textual description of the scenario by clicking on ‘see description’. This panel shows that the direction and (except in one case) the order of effect estimates remain stable. The magnitude of the effects increases for respondents who read the textual descriptions, which indicates that the effects reported in Fig. 2a were not overestimated because of visual ambiguity. b, Validation of device used (desktop versus mobile). The direction and order of effect estimates remain stable regardless of whether respondents used desktop or mobile devices to complete the task. c, Validation of dataset (all data versus full first-session data versus survey-only data). The direction and order of effect estimates remain stable regardless of whether the analysis uses all data, data restricted to the first completed (13-scenario) session of each user, or data restricted to completed sessions after which the demographic survey was taken. The first completed session is an interesting subset because respondents completed it before seeing any summary of their results. Survey-only data are also interesting because the conclusions about individual variations in the main paper, Extended Data Fig. 3 and Extended Data Table 1 are drawn from this subset. See Supplementary Information for more details.
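To make the contrast concrete, here is a minimal Python sketch of the kind of difference-in-sparing-probabilities estimate this caption describes, assuming a hypothetical long-format responses table. The column names (‘relation_to_av’, ‘spared’) are illustrative, not the actual Moral Machine schema, and the paper estimates AMCEs with a full conjoint-analysis model rather than this raw contrast.

```python
# Minimal sketch: difference in sparing probabilities with a 95% CI,
# assuming one row per side of each dilemma, 'spared' = 1 if that side
# was chosen to be spared, 0 otherwise. Illustrative data only.
import numpy as np
import pandas as pd

def sparing_contrast(df: pd.DataFrame, attr: str, level_a: str, level_b: str):
    """P(spared | attr == level_a) - P(spared | attr == level_b),
    with a normal-approximation 95% confidence interval."""
    a = df.loc[df[attr] == level_a, "spared"]
    b = df.loc[df[attr] == level_b, "spared"]
    diff = a.mean() - b.mean()
    se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    return diff, (diff - 1.96 * se, diff + 1.96 * se)

# Example with made-up data:
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "relation_to_av": rng.choice(["pedestrians", "passengers"], size=10_000),
    "spared": rng.integers(0, 2, size=10_000),
})
print(sparing_contrast(demo, "relation_to_av", "pedestrians", "passengers"))
```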

… 

Average marginal component effect (AMCE) of attributes for different subpopulations. Subpopulations are characterized by respondents’ age (a, older versus younger), gender (b, male versus female), education (c, less versus more educated), income (d, higher versus lower income), political views (e, conservative versus progressive), and religious views (f, not religious versus very religious). Error bars represent 95% confidence intervals of the means. Note that the AMCE is positive for all considered subpopulations; for example, both male and female respondents indicated a preference for sparing females, but the latter group showed a stronger preference. See Supplementary Information for a detailed description of the cutoffs and the groupings of ordinal categories that were used to define each subpopulation.
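A hypothetical subgroup version of the same contrast, in the spirit of this panel, simply recomputes the estimate within each demographic group. Again, the column names (‘gender’, ‘relation_to_av’, ‘spared’) are assumptions for illustration, not the actual data schema.

```python
# Per-subgroup sparing-probability contrast on made-up data: the same
# pedestrians-versus-passengers difference, recomputed by group.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
demo = pd.DataFrame({
    "gender": rng.choice(["male", "female"], size=10_000),
    "relation_to_av": rng.choice(["pedestrians", "passengers"], size=10_000),
    "spared": rng.integers(0, 2, size=10_000),
})

for group, sub in demo.groupby("gender"):
    p = sub.groupby("relation_to_av")["spared"].mean()
    print(group, round(p["pedestrians"] - p["passengers"], 3))
```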

… 

Validation of the hierarchical clustering of countries. a, b, We use two internal validation metrics to compare three linkage criteria for hierarchical clustering (Ward, complete and average), in addition to the k-means algorithm: a, the Calinski–Harabasz index; b, the silhouette index. The x axis indicates the number of clusters. For both internal metrics, a higher index value indicates a ‘better’ fit of the partition to the data. c, d, We use two external validation metrics to compare the chosen hierarchical clustering algorithm (Ward) against random cluster assignment: c, purity; d, maximum matching. The histograms show the distributions of purity and maximum-matching values obtained by randomly assigning countries to nine clusters. The red dotted lines indicate the purity and maximum-matching values computed from the output of the hierarchical clustering algorithm using AMCE values. See Supplementary Information for more details.
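The following sketch shows the shape of this validation in Python, using a random stand-in matrix in place of the real per-country AMCE vectors (the dimensions 130 × 9 and the reference labelling are assumptions, not the paper’s actual inputs; the k-means comparison would be analogous via sklearn.cluster.KMeans).

```python
# Internal validation: score hierarchical partitions at several cluster
# counts, for each linkage criterion, on hypothetical per-country AMCEs.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.metrics import calinski_harabasz_score, silhouette_score

rng = np.random.default_rng(0)
X = rng.normal(size=(130, 9))  # stand-in: one row of 9 AMCEs per country

for method in ("ward", "complete", "average"):
    Z = linkage(X, method=method)
    for k in (2, 3, 4, 9):
        labels = fcluster(Z, t=k, criterion="maxclust")
        print(method, k,
              round(calinski_harabasz_score(X, labels), 1),
              round(silhouette_score(X, labels), 3))

# External validation: purity of a partition against a reference labelling.
# Repeating this with many random nine-cluster assignments gives the null
# histogram that the observed (Ward) value is compared against.
def purity(labels: np.ndarray, reference: np.ndarray) -> float:
    return sum(np.bincount(reference[labels == c]).max()
               for c in np.unique(labels)) / len(labels)

reference = rng.integers(0, 3, size=len(X))  # e.g., three macro-clusters
labels9 = fcluster(linkage(X, method="ward"), t=9, criterion="maxclust")
print(purity(labels9, reference))
```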

… 


ARTICLE https://doi.org/10.1038/s41586-018-0637-6

The Moral Machine experiment

Edmond Awad1, Sohan Dsouza1, Richard Kim1, Jonathan Schulz2, Joseph Henrich2, Azim Shariff3*, Jean-François Bonnefon4* & Iyad Rahwan1,5*


We are entering an age in which machines are tasked not only to promote well-being and minimize harm, but also to distribute the well-being they create, and the harm they cannot eliminate. Distribution of well-being and harm inevitably creates tradeoffs, whose resolution falls in the moral domain1–3. Think of an autonomous vehicle that is about to crash, and cannot find a trajectory that would save everyone. Should it swerve onto one jaywalking teenager to spare its three elderly passengers? Even in the more common instances in which harm is not inevitable, but just possible, autonomous vehicles will need to decide how to divide up the risk of harm between the different stakeholders on the road. Car manufacturers and policymakers are currently struggling with these moral dilemmas, in large part because they cannot be solved by any simple normative ethical principles such as Asimov’s laws of robotics4.

Asimov’s laws were not designed to solve the problem of universal machine ethics, and they were not even designed to let machines distribute harm between humans. They were a narrative device whose goal was to generate good stories, by showcasing how challenging it is to create moral machines with a dozen lines of code. And yet, we do not have the luxury of giving up on creating moral machines5–8. Autonomous vehicles will cruise our roads soon, necessitating agreement on the principles that should apply when, inevitably, life-threatening dilemmas emerge. The frequency at which these dilemmas will emerge is extremely hard to estimate, just as it is extremely hard to estimate the rate at which human drivers find themselves in comparable situations. Human drivers who die in crashes cannot report whether they were faced with a dilemma; and human drivers who survive a crash may not have realized that they were in a dilemma situation. Note, though, that ethical guidelines for autonomous vehicle choices in dilemma situations do not depend on the frequency of these situations. Regardless of how rare these cases are, we need to agree beforehand how they should be solved.

The key word here is ‘we’. As emphasized by former US president Barack Obama9, consensus in this matter is going to be important. Decisions about the ethical principles that will guide autonomous vehicles cannot be left solely to either the engineers or the ethicists. For consumers to switch from traditional human-driven cars to autonomous vehicles, and for the wider public to accept the proliferation of artificial intelligence-driven vehicles on their roads, both groups will need to understand the origins of the ethical principles that are programmed into these vehicles10. In other words, even if ethicists were to agree on how autonomous vehicles should solve moral dilemmas, their work would be useless if citizens were to disagree with their solution, and thus opt out of the future that autonomous vehicles promise in lieu of the status quo. Any attempt to devise artificial intelligence ethics must be at least cognizant of public morality.

Accordingly, we need to gauge social expectations about how autonomous vehicles should solve moral dilemmas. This enterprise, however, is not without challenges11. The first challenge comes from the high dimensionality of the problem. In a typical survey, one may test whether people prefer to spare many lives rather than few9,12,13; or whether people prefer to spare the young rather than the elderly14,15; or whether people prefer to spare pedestrians who cross legally, rather than pedestrians who jaywalk; or yet some other preference, or a simple combination of two or three of these preferences. But combining a dozen such preferences leads to millions of possible scenarios, requiring a sample size that defies any conventional method of data collection.
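A back-of-the-envelope calculation illustrates the combinatorial explosion. The numbers below are assumptions chosen for illustration, not the Moral Machine’s actual design space:

```python
# Hypothetical scenario space: each side of a dilemma holds 1-5 characters
# drawn, with repetition, from 20 character types. Crossing the two sides
# already yields billions of distinct scenarios, before adding contextual
# factors such as crossing legality or the swerve/stay distinction.
from math import comb

per_side = sum(comb(20 + k - 1, k) for k in range(1, 6))  # multisets, size 1-5
print(per_side)        # 53,129 possible character groups per side
print(per_side ** 2)   # ~2.8 billion two-sided combinations
```

No conventional survey can cover a space of this size, which motivates both the massive sample and the conjoint-style aggregation over attributes.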

The second challenge makes sample size requirements even more daunting: if we are to make progress towards universal machine ethics (or at least to identify the obstacles thereto), we need a fine-grained understanding of how different individuals and countries may differ in their ethical preferences16,17. As a result, data must be collected worldwide, in order to assess demographic and cultural moderators of ethical preferences.

As a response to these challenges, we designed the Moral Machine, a multilingual online ‘serious game’ for collecting large-scale data on how citizens would want autonomous vehicles to solve moral dilemmas in the context of unavoidable accidents. The Moral Machine attracted worldwide attention, and allowed us to collect 39.61 million decisions from 233 countries, dependencies, or territories (Fig. 1a). In the main interface of the Moral Machine, users are shown unavoidable accident scenarios with two possible outcomes, depending on whether the autonomous vehicle swerves or stays on course (Fig. 1b). They then click on the outcome that they find preferable. Accident scenarios are generated by the Moral Machine following an exploration strategy that …

1The Media Lab, Massachusetts Institute of Technology, Cambridge, MA, USA. 2Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA. 3Department of Psychology, University of British Columbia, Vancouver, British Columbia, Canada. 4Toulouse School of Economics (TSM-R), CNRS, Université Toulouse Capitole, Toulouse, France. 5Institute for Data, Systems & Society, Massachusetts Institute of Technology, Cambridge, MA, USA. *e-mail: shariff@psych.ubc.ca; jean-francois.bonnefon@tse-fr.eu; irahwan@mit.edu


