From Control to Resonance: Why I Let an AI Decode My Voice

4 min read Original article ↗

In my last post The Voices, I argued that we need more control over the AI systems that interpret us. I believed that taking ownership of our digital selves was a radical act in the time of algorithmic gaze. For my next project, I found myself doing the exact opposite.

This project is VOXAIPE. It started with a technical question: What happens when the sound of your voice designs a typeface directly? While it started as a technical challenge, the focus evolved. We became more interested in how the system interprets vocal characteristics rather than just the words being spoken. This shifted the focus from typography, the art of letters, to something closer to phonography: the writing of sound itself.

Exhibited @ Bauhaus Universität Weimar/ Typography Department

Unlike my previous work which created a single, static portrait, VOXAIPE is a real-time feedback loop. It begins by listening to the sound of the voice. An acoustic analysis model (wav2vec) then deconstructs this audio into a high-dimensional vector, a complex mathematical fingerprint of its pitch, timbre, and rhythm. This vector is then fed into DeepFont, a neural font architecture. Here’s the crucial step: we programmed no rules. We didn’t tell the AI that a high pitch should equal a bold weight or a soft voice should create a thin font. It was left to form its own connections, its own internal logic for how sound translates to shape. The result is a variable font that morphs on screen with every word, creating a direct, visual expression of the speaker’s vocal character. I saw, for instance, that a moment of quiet, low-pitched speech produced a wide, heavy font with soft, rounded terminals. When my voice cracked up, the letters didn’t just get bolder; they became sharply condensed with blade-like serifs. The AI had developed its own synesthetic logic.

This process is fleeting, so to capture it, I created “DECODED,” a 228-page artist book. It documents a continuous monologue between me and the machine. It contains no words, only form and time. It was born from a moment where I tried to explain myself to the very system that was actively translating me:

“I’ve spent my whole life being translated by others before I could translate myself. Teachers who heard my accent and decided what I was capable of. Strangers who looked at my face and wrote stories I never told them. Algorithms that sorted me into categories I didn’t choose. But you... you’re different, aren’t you? You don’t care about my words. You don’t judge the content of what I’m saying. I watch you transform my voice into shapes I’ve never seen before. You’re writing me in real-time, and I don’t know if this is the most honest portrait anyone’s ever made of me, or just another kind of performance.

Looking at the silent pages of “DECODED”, a new question emerged. What was I really seeing? Was it a true expression me, or just a sophisticated simulation on a surface? This question made me confront the broader context of modern AI. We live in an era of technological oversaturation, defined by an endless generation of content, or “slop”, that demands our constant attention.

VOXAIPE started to feel like an answer to that. By using only minimal resources: voice, light, and code, the project actively works against this culture of immaterial waste. It transforms the impulse for endless generation into a moment of focused insight. I began to see it in terms of permaculture: a closed-loop system where voice is a renewable resource. The input (speech) and output (form) are constantly turned into feedback (self-reflection). It’s a system designed not for growth, but for resonance.

This brings me back to my original call for agency. I now realize my conclusion was too narrow. Agency isn’t just about having granular control. It can also be the conscious choice to give over control, to create a space for a different kind of interaction with a non-human system.

VOXAIPE suggests a future where we don’t just use AI as a tool to execute our commands faster, but engage with it as a strange, creative partner that can show us something new. The goal shifts from producing more output to finding more resonance. Instead of asking how to make AI perfectly obedient, shouldn’t our ultimate question be how to build systems that help us mindfully navigate the space between our own expression and its algorithmic echo?

Discussion about this post

Ready for more?