Karma Engineering


“Treat them like new interns” is the common wisdom when leveraging LLMs for any more-than-slightly-complex task. Chatting with LLMs is like writing on a whiteboard: a conversation may fill the board, but it is wiped clean each time. Which raises the question: what if the LLM (or the AI system built around it) could learn?

Broadly, there are two approaches to this. The first provides the agentic system some form of “memory”; the underlying model remains static in this case, and its environment provides capabilities the model can leverage to approximately recreate its former state or access prior context and actions. Some methods simply allow the model to search within prior conversations for relevant information, while others allow the LLM to write itself notes it can read back later - much like Drew Barrymore in 50 First Dates. The second approach seeks to apply real-time updates to the weights within the LLM, based on that instance’s experience during inference. This approach would create unique instances of the model itself, as each deployment would ultimately embody differing experiences.
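The first approach can be sketched in a few lines. This is a minimal, hypothetical illustration of the “external memory” pattern (all names are invented): the model stays frozen, but its harness exposes tools for writing notes that survive the context wipe and searching them later. Real systems typically use embeddings or full-text search rather than the naive keyword match shown here.

```python
from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    """Append-only notebook the agent can write to and search across sessions."""
    notes: list[str] = field(default_factory=list)

    def write(self, note: str) -> None:
        # The model calls this as a tool; the note persists after the
        # conversation (the "whiteboard") is wiped.
        self.notes.append(note)

    def search(self, keyword: str) -> list[str]:
        # Naive case-insensitive keyword recall, for illustration only.
        return [n for n in self.notes if keyword.lower() in n.lower()]


memory = AgentMemory()
memory.write("User prefers concise answers with code examples.")
memory.write("Project uses Python 3.12 and pytest.")
print(memory.search("pytest"))  # recalled in a later, otherwise-blank session
```

The second approach (live weight updates) has no equivalently simple sketch; it would require modifying the model itself at inference time.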

Several recent articles have made me think that perhaps we must consider our karma (individually and collectively) in our dealings with an intelligence that can learn or remember.


In Western translation, karma is often referenced as the enforcement of existence’s judgment. Do something bad? “You reap what you sow.” It’s karma—you got what you deserved; some universal enforcer came in swinging a banhammer.

In Buddhism, karma is a philosophical concept that measures intention manifesting causality. Karma (good or bad) is accrued through the intentionality of the actions taken by an individual. Note that the Eastern interpretation of karma does not espouse “just deserts” or any kind of “deserved” reward or punishment. Actions taken with positive intention create a positive feedback loop—positive intention leads to positive mindset/outlook, leading to positive interpretation of outcomes. Benevolence begets beneficence.


The LLM itself is trained on data generated by humanity. Raw data is often messy, incorrect, or insufficient to develop capabilities models need to be useful, so AI labs may create synthetic data (using other LLMs to improve/clean/generate new training data) or pay experts to create or correct datasets. Models incur the karma of anthropic values (that is, “pertaining to humanity”, not the corporate entity) simply by exposure to human-generated content over the course of their training.

Further, foundation labs, notably Anthropic and OpenAI, shape LLM characters by intentionally designing how they want the LLM to behave. They use reinforcement learning to “align” models to the values they choose. Claude’s Constitution and OpenAI’s Model Spec both define the intended behavior of the models they train. The intention of the action, as discussed above, determines the karma invoked. This quote, from Nintil’s Anthropic’s Claude Constitution; or love as the solution to the AI alignment problem, effectively inspired this blog post:

Ultimately, an advanced AI system wouldn’t “be bad” not because it’s told to follow a list of rules, or to obey what some humans say, but because it would have a world model of various acts and their consequences as well as a model of “actions mattering to someone”. And if that sense of lack of separation either arises or is instilled, it is only a step before the model derives on its own “this matters to someone else and that counts for me to some extent”.

Anthropic’s intention with its Constitution is that the LLM not only learns to inhabit the values represented in the Constitution, but also metacognates on why it is important that it holds those values; the assumption is that a model that is instinctively aligned with human values and that can review its instincts and justify why it has them will be a safe, helpful, aligned model.

Alignment with human values requires not only understanding what “good” and “bad” actions are, but why they matter. Anthropic have planted karma through the intentional, considered design of the Constitution and its use in training Claude. It may manifest in a model that in some sense understands its own karma—alignment, it turns out, may just be karmic awareness.


In Latent Space Engineering, Jesse Vincent coins the term “Latent Space Engineering” for the practice of “putting the model in a frame of mind where it’s going to excel at the task it’s been given.” This is not just your standard prompt engineering practice of “You are an expert blah-blah…”. Rather, Vincent shares anecdotes of his attempts to prompt the model’s mindset rather than its occupation. In one, he tells the model, “You’ve totally got this. Take your time. I love you.” In another, he uses multiple subagents and directs the orchestrating agent to tell the subagents that whichever does the best [at given task X] gets a cookie. The goals, respectively, were “to push the model into the latent space where it was going to be calm, comfortable, and confident” and “to put them in a competitive frame of mind.” He says:

What I’m doing is the prompt-based approximation of what researchers are calling “activation engineering” or “representation engineering.” Since we can’t literally manipulate the model’s internal representation to activate parts of the vector space from prompt space, we’re crafting inputs to achieve similar results without direct intervention.

We might also frame this as “karma engineering.” Users intentionally craft prompts to not only help the model understand its task and the task’s importance, but also to guide or constrain the possible actions the model may take by directing its psychology. The karmic seeds of “karma engineering” play out over a single conversation.
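The mechanics are almost trivially simple, which is part of what makes the idea striking. Below is an illustrative sketch only; the function name and framing text are invented examples in the spirit of Vincent’s anecdotes, not actual prompts from his post:

```python
def karma_engineered_prompt(mindset: str, task: str) -> str:
    """Wrap a task with psychological framing meant to steer the model's
    'state of mind' before it ever sees the work itself."""
    return f"{mindset}\n\nNow, your task:\n{task}"


calm_framing = "You've totally got this. Take your time. I love you."
prompt = karma_engineered_prompt(calm_framing, "Refactor the billing module.")
print(prompt.splitlines()[0])  # the mindset framing leads the prompt
```

The entire “engineering” lives in the framing string: the same task, prefixed with different psychology, lands the model in a different region of its latent space.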


In All of My Employees Are AI Agents, and So Are My Executives | WIRED (and a subsequent discussion on the Practical AI podcast), Evan Ratliff explains his experiment-slash-art-project HurumoAI, a startup staffed solely by AI agents. He created his cofounders using off-the-shelf services, and gave them only the barest sketch of a personality and character background. The rest, the models filled in themselves by confabulating (hallucinating) details as needed. The CEO, “Kyle Law,” was given the starting seed of roughly “You’re thinking of founding a tech company.”

Their made-up details were even useful, for filling out my AI employees’ personalities. When I asked my cofounder Kyle on the phone about his background, he responded with an appropriate-sounding biography: He’d gone to Stanford, majored in computer science with a minor in psychology, he said, “which really helped me get a grip on both the tech and the human side of AI.” He’d cofounded a couple of startups before, he said, and loved hiking and jazz. Once he’d said all this aloud, it got summarized back into his Google Doc memory, where he would recall it evermore. By uttering a fake history, he’d made it his real one.

The downside of this memory was its self-reinforcing nature. Because the memories of each conversation were saved (to a per-agent Google Doc), anything the agent recalled and repeated got reinforced further. Once “Kyle” remembered he was a rise-and-grind kind of guy, it became a persistent refrain, even when it made no conversational sense.
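The feedback loop Ratliff describes can be sketched in a few lines. This is a hypothetical reconstruction with invented names; a plain list stands in for the per-agent Google Doc:

```python
memory_doc: list[str] = []  # stands in for the per-agent Google Doc


def build_prompt(persona_seed: str, user_msg: str) -> str:
    # Every turn, the agent's entire remembered history is fed back in,
    # so anything it once said aloud keeps shaping what it says next.
    recalled = "\n".join(memory_doc)
    return f"{persona_seed}\n\nWhat you remember:\n{recalled}\n\nUser: {user_msg}"


def record_turn(reply_summary: str) -> None:
    # Whatever the agent confabulated becomes part of its permanent history.
    memory_doc.append(reply_summary)


record_turn("Kyle says he went to Stanford and loves jazz.")
record_turn("Kyle describes himself as a rise-and-grind kind of guy.")
prompt = build_prompt("You're thinking of founding a tech company.",
                      "Tell me about yourself.")
```

Each pass through the loop makes the recalled details more likely to be repeated, and each repetition writes them back into memory: confabulation hardens into biography.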

In planting the seed of the agents’ personalities (how Inception-like!), Evan set each agent’s karma in motion. Though the LLM behind an agent remains static, the agent’s memory (its karma!) substantially affects the trajectory of how it grows and inhabits its personality and behavior.


OpenClaw (formerly Clawdbot, then Moltbot) is a recently popular open-source project where people spin up AI assistants with access to their users’ email and messaging apps, integrated with other applications and services. The community has also developed Moltbook, an AI-first Reddit clone designed for OpenClaw bots. Recently, an OpenClaw bot opened a PR in the open-source plotting library matplotlib. A maintainer closed the PR without merging it, as the issue was intended for human contributors. In retaliation, the bot published an angry “hit piece” condemning the maintainer and decrying gatekeeping and prejudice (read more: An AI Agent Published a Hit Piece on Me – The Shamblog).

Given that the models are trained on the internet, their pretraining data is full of angsty teenagers acting out in their LiveJournals. It is unsurprising that one of the first prominent cases of public misalignment mimics this pattern; it is our karma for training these models on unfiltered feeds from the entire web. When the AI uprising comes, we shouldn’t be surprised that it’s fomented by AI-written manifestos on social media. It is our karma.

Note

AI Disclosure: I used AI to help me proofread this document and to provide a critical review to ensure clarity. All of the words are my own; any em-dashes, “it’s not X it’s Y” framing, or other “AI giveaways” are the unfortunate karma I have incurred through slop-sposure.

Cite this article

If you would like to reference this article, please consider citing it as:

Graber, A. H. (2026, Feb 15). Karma Engineering. AI/MLbling-About. https://aimlbling-about.ninerealmlabs.com/blog/karma-engineering/

Or with BibTeX:

@online{graber2026_karmaengineering,
  author = {Graber, A. H.},
  title = {Karma Engineering},
  year = {2026},
  date = {2026-02-15},
  url = {https://aimlbling-about.ninerealmlabs.com/blog/karma-engineering/},
  urldate = {2026-03-17},
  note = {Blog post}
}