The ear does not do a Fourier transform

dissonances.blog

229 points by izhak 4 hours ago


edbaskerville - 3 hours ago

To summarize: the ear does not do a Fourier transform, but it does do a time-localized frequency-domain transform akin to wavelets (specifically, intermediate between wavelet and Gabor transforms). It does this because the sounds processed by the ear are often localized in time.
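To make the "intermediate between wavelet and Gabor" idea concrete, here is a minimal sketch. The exponent `alpha`, the reference duration of 5 ms at 1 kHz, and the function names are illustrative assumptions, not values from the article or from any cochlear model. With `alpha = 0` every filter gets the same envelope duration (Gabor-like), with `alpha = 1` the duration scales as 1/frequency (wavelet-like), and values in between give a mix of the two.

```python
import math

def window_duration(freq_hz, alpha, ref_freq_hz=1000.0, ref_duration_s=0.005):
    """Envelope duration (seconds) of a filter centered on freq_hz.

    alpha = 0     -> Gabor-like: same duration at every frequency
    alpha = 1     -> wavelet-like: duration proportional to 1/frequency
    0 < alpha < 1 -> somewhere in between
    The reference values are arbitrary, chosen only for illustration.
    """
    return ref_duration_s * (ref_freq_hz / freq_hz) ** alpha

def atom(freq_hz, alpha, t):
    """Gaussian-windowed cosine evaluated at time t (seconds)."""
    sigma = window_duration(freq_hz, alpha)
    envelope = math.exp(-0.5 * (t / sigma) ** 2)
    return envelope * math.cos(2 * math.pi * freq_hz * t)

for alpha, label in [(0.0, "Gabor-like"), (0.5, "intermediate"), (1.0, "wavelet-like")]:
    durations_ms = [1000 * window_duration(f, alpha) for f in (250.0, 1000.0, 4000.0)]
    print(label, [round(d, 2) for d in durations_ms])
```

The printed lists show how the envelope duration at 250 Hz, 1 kHz and 4 kHz changes as `alpha` moves from the Gabor end to the wavelet end.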

The article also describes a theory that human speech evolved to occupy an unoccupied space in frequency vs. envelope duration space. It makes no explicit connection between that fact and the type of transform the ear does—but one would suspect that the specific characteristics of the human cochlea might be tuned to human speech while still being able to process environmental and animal sounds sufficiently well.

A more complicated hypothesis off the top of my head: the location of human speech in frequency/envelope is a tradeoff between (1) occupying an unfilled niche in sound space; (2) optimal information density taking brain processing speed into account; and (3) evolutionary constraints on physiology of sound production and hearing.

antognini - an hour ago

If you want to get really deep into this, Richard Lyon has spent decades developing the CARFAC model of human hearing: Cascade of Asymmetric Resonators with Fast-Acting Compression. As far as I know it's the most accurate digital model of human hearing.

He has a PDF of his book about human hearing on his website: https://dicklyon.com/hmh/Lyon_Hearing_book_01jan2018_smaller...

shermantanktop - 3 hours ago

The thesis about human speech occupying less crowded spectrum is well aligned with a book called "The Great Animal Orchestra" (https://www.amazon.com/Great-Animal-Orchestra-Finding-Origin...).

The author details how the "dawn chorus" is made up of a vast number of species all making noise, each of which can still pick out mating calls and other signals because their vocalizations evolved into unique sonic niches.

It's quite interesting but also a bit depressing as he documents the decline in intensity of this phenomenon with habitat destruction etc.

superb-owl - 2 hours ago

The title seems a little click-baity and basically wrong. Gabor transforms, wavelet transforms, etc. are all generalizations of the Fourier transform, which give you a spectrum analysis at each point in time.

The content is generally good but I'd argue that the ear is indeed doing very Fourier-y things.

kazinator - 3 hours ago

> A Fourier transform has no explicit temporal precision, and resembles something closer to the waveforms on the right; this is not what the filters in the cochlea look like.

Perhaps the ear does something more vaguely analogous to a discrete Fourier transform on samples of data, which is what we do in a lot of signal processing.

In signal processing, we take windowed samples, and do discrete transforms on these. These do give us some temporal precision.

There is a trade-off there between frequency and temporal precision, analogous to the Heisenberg uncertainty principle in quantum mechanics. The better we know a frequency, the less precisely we know the timing. Only an infinite, periodic signal has a single precise frequency (or precise set of harmonics), which shows up as infinitely narrow blips in the frequency domain.

The continuous Fourier transform, by contrast, has no window at all. We transform an entire function like sin(x) over the entire domain. If that domain is interpreted as time, we are including all of eternity, so to speak, from negative infinite time to positive.
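A rough sketch of the windowing trade-off, in plain Python with a naive DFT. The 440 Hz tone, the 8 kHz sample rate, the Hann window and the two window lengths are all assumptions picked for illustration. A short window pins down when something happened but gives coarse frequency bins. A long window does the opposite.

```python
import cmath
import math

def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def dft_magnitudes(samples):
    """Magnitudes of the non-negative-frequency DFT bins (naive O(n^2))."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples)))
        for k in range(n // 2)
    ]

def analyze_window(signal, sample_rate, window_len):
    """Window the first window_len samples and report time/frequency resolution."""
    frame = [s * w for s, w in zip(signal[:window_len], hann(window_len))]
    mags = dft_magnitudes(frame)
    peak_bin = max(range(len(mags)), key=mags.__getitem__)
    return {
        "time_span_s": window_len / sample_rate,   # how much time the window covers
        "bin_width_hz": sample_rate / window_len,  # spacing between frequency bins
        "peak_hz": peak_bin * sample_rate / window_len,
    }

SAMPLE_RATE = 8000  # Hz, assumed
TONE_HZ = 440.0     # Hz, assumed
signal = [math.sin(2 * math.pi * TONE_HZ * i / SAMPLE_RATE) for i in range(1024)]

short = analyze_window(signal, SAMPLE_RATE, 64)
long_ = analyze_window(signal, SAMPLE_RATE, 512)
print("short window:", short)
print("long window: ", long_)
```

In both cases `time_span_s * bin_width_hz` is exactly 1, which is the trade-off in its simplest form: shrinking one side grows the other.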

xeonmc - 4 hours ago

Nit: It’s an unfortunate confusion of naming conventions, but Fourier Transform in the strictest sense implies an infinite “sampling” period, while the finite “sample” period counterpart would correspond to Fourier Series even though we colloquially refer to them interchangeably.

(I put “sampling” in quotes because it is really an “integration period” in this context of continuous-time integration, though that term is less immediately evocative of the concept people are colloquially familiar with. If we further impose a constraint of finite temporal resolution, so that it is honest-to-god “sampling”, then it becomes the Discrete Fourier Transform, of which the Fast Fourier Transform is one implementation.)
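To illustrate the last point, here is a small sketch in plain Python: a direct evaluation of the Discrete Fourier Transform definition next to a textbook recursive radix-2 FFT. The function names and test samples are made up for the example, and this FFT variant assumes the input length is a power of two. Both compute the same thing; the FFT just does it in fewer operations.

```python
import cmath

def dft(x):
    """Direct evaluation of the DFT definition, O(n^2)."""
    n = len(x)
    return [
        sum(x[i] * cmath.exp(-2j * cmath.pi * k * i / n) for i in range(n))
        for k in range(n)
    ]

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])  # DFT of even-indexed samples
    odd = fft(x[1::2])   # DFT of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddled = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddled
        out[k + n // 2] = even[k] - twiddled
    return out

samples = [0.0, 1.0, 0.0, -1.0, 0.5, 0.25, -0.5, 2.0]
slow = dft(samples)
fast = fft(samples)
print(max(abs(a - b) for a, b in zip(slow, fast)))  # difference is rounding noise
```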

It is this strict definition that the article title is rebuking, but it’s not quite what the colloquial usage loosely evokes in most people’s minds when we usually say Fourier Transform as an analysis tool.

So this article should have been comparing to Fourier Series analysis rather than Fourier Transform in the pedantic sense, albeit that’ll be a bit less provocative.

Regardless, it doesn’t at all take away from the salient points of this excellent article, which is a really interesting reframing of the concepts: what the ear does mechanistically is apply a temporal “weighting function” (filter), so it’s somewhere between Fourier series and Fourier transform. The article hits the nail on the head in presenting the sliding scale of conjugate-domain trade-offs (think: Heisenberg).

fennec-posix - 27 minutes ago

"It appears that human speech occupies a distinct time-frequency space. Some speculate that speech evolved to fill a time-frequency space that wasn’t yet occupied by other existing sounds."

I found this quite interesting, as I have noticed that I can detect voices in high-noise environments. E.g. HF Radio where noise is almost a constant if you don't use a digital mode.

amelius - 9 minutes ago

What does the continuous tingling of a hair cell sound like to the subject?

rolph - 34 minutes ago

supplemental:

Neuroanatomy, Auditory Pathway

https://www.ncbi.nlm.nih.gov/books/NBK532311/

Cochlear nerve and central auditory pathways

https://www.britannica.com/science/ear/Cochlear-nerve-and-ce...

Molecular Aspects of the Development and Function of Auditory Neurons

https://pmc.ncbi.nlm.nih.gov/articles/PMC7796308/

javier_e06 - an hour ago

This is fascinating.

I know of vocoders in military hardware that encode voices into something simpler for compression (a low-tone male voice), giving smaller packets that take less bandwidth. The ear must also have evolved together with our vocal cords and mouth to occupy available frequencies for transmission and reception, for optimal communication.

The parallels with waveforms don't end there. Waveforms are also optimized for different terrains (urban, jungle).

Are languages organic waveforms optimized to ethnicity and terrain?

Cool article indeed.

adornKey - 3 hours ago

This subject has bothered me for a long time. My question to guys into acoustics was always: if the cochlea performs some kind of Fourier transform, what are the chances that it uses sine waves as a basis for the vector space? If it did anything like that, it could just as well use any slightly different waveforms as a basis for the transformation. Stiffness and non-linearity will for sure ensure that any ideal rubber model in physics will in reality differ from the perfect sine.
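As a toy illustration of the "any basis would do" point, here is a sketch that decomposes a 4-sample signal onto an orthonormal Haar basis (square-ish steps, no sines involved) and reconstructs it. The basis choice, the function names and the sample values are assumptions for the example only; nothing here claims the cochlea uses Haar functions.

```python
import math

def haar_basis_4():
    """Orthonormal Haar basis for length-4 signals (rows are basis vectors)."""
    s = 1 / math.sqrt(2)
    return [
        [0.5, 0.5, 0.5, 0.5],    # overall average
        [0.5, 0.5, -0.5, -0.5],  # first half vs second half
        [s, -s, 0.0, 0.0],       # detail within the first half
        [0.0, 0.0, s, -s],       # detail within the second half
    ]

def project(signal, basis):
    """Coefficients = inner product of the signal with each basis vector."""
    return [sum(b * x for b, x in zip(vec, signal)) for vec in basis]

def reconstruct(coeffs, basis):
    """Weighted sum of basis vectors; exact because the basis is orthonormal."""
    length = len(basis[0])
    return [sum(c * vec[i] for c, vec in zip(coeffs, basis)) for i in range(length)]

signal = [3.0, 1.0, -2.0, 4.0]
basis = haar_basis_4()
coeffs = project(signal, basis)
rebuilt = reconstruct(coeffs, basis)
print(coeffs)
print(rebuilt)
```

The signal comes back intact, so the information is preserved. What a sinusoidal basis adds is a direct reading in terms of frequency, which is a separate question from whether the decomposition is possible.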

tryauuum - 4 hours ago

man I need to finally learn what a Fourier transform is

debo_ - an hour ago

Fourear transform

rolph - 4 hours ago

FT is frequency domain representation.

neural signaling by action potential, is also a representation of intensity by frequency.

the cochlea is where you can begin to talk about bio-FT phenomenon.

however the format "changes" along the signal path, whenever a synapse occurs.

p0w3n3d - 4 hours ago

Tbh I used to think that it does. For example, when playing higher notes, it's harder to hear the out-of-tune frequencies than on the lower notes.

gowld - 3 hours ago

Why is there no box diagram for cochlea "between wavelet and Gabor" ?

brcmthrowaway - 2 hours ago

OT: Does anyone here believe in Intelligent Design?

bloppe - 4 hours ago

Man, I've been spreading disinformation for years.
