PNAS · 10.1073/pnas.2404121121 · Dec 2024 · 11 min read

A corollary discharge circuit in human speech, or: how the brain whispers what it is about to say, to itself, in advance.

Every animal that moves needs a copy of the motor command sent to sensory systems so they can tell "I did that" from "something happened to me." In humans, this corollary discharge was inferred from behaviour for decades but never located. We mapped it. It lives in ventral precentral gyrus. It takes 120 ms to arrive at auditory cortex.
[Figure: Human cortex, cortical path of the discharge (schematic). Labels: ventral preCG, source (t = 0); STG, sink (t ≈ 120 ms); time marks at 30 ms, 70 ms and 100 ms; auditory input (your own voice), ± cancelled by incoming discharge; predicted spectrogram (from discharge). Self ≈ predicted. Prediction error ≈ the part that is new.]
Fig. 1: The discharge travels from ventral precentral gyrus to superior temporal gyrus in ~120 ms, carrying a prediction of what the mouth is about to produce.

If you whisper to yourself in a quiet room, the sound of your own voice does not startle you. If someone else whispers the same word, at the same volume, a foot from your ear, you jump. The nervous system has a prior on who made the sound, and the prior is built out of something the brain sent to itself a moment before the mouth moved.

That something is the corollary discharge. In frogs, in crickets, in cats, in primate visuomotor control, we have known for seventy years that such a signal exists. What had been missing, for humans, was a receipt. This paper is the receipt.

Why it was hard to catch

Behaviourally, corollary discharge has been easy to infer: auditory cortex is suppressed during self-generated sound, real-time perturbations of your own voice elicit specific correction patterns, and schizophrenia is partly a story about discharge failures. None of those tell you where the signal starts, when it arrives, or what it carries.

The source was elusive because the signal lives on a very short timeline: it has to arrive at auditory cortex before the sound reaches the ear, which gives it on the order of 100 ms to travel a few centimetres of cortex. Non-invasive tools lack either the spatial resolution (EEG/MEG) or the temporal resolution (fMRI) to localise it.

We used intracranial ECoG recordings from patients undergoing clinical monitoring — the one window where you get millisecond timing AND millimetre localisation AND healthy-ish human cortex. And we used connectivity techniques that could tell us not just when a region was active but where the activity came from.
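
To make "when it arrives" concrete, here is a minimal sketch of estimating a source-to-sink delay from two time series. It is not the connectivity analysis used in the paper; it swaps in plain lagged cross-correlation on simulated data. The function name `best_lag`, the 1 kHz sampling rate, the noise levels and the simulated 120 ms delay are all assumptions made for illustration.

```python
import random

def best_lag(source, sink, max_lag):
    """Return the lag (in samples) at which `sink` best matches a delayed `source`."""
    def score(lag):
        # Dot product of source[t] with sink[t + lag] over the overlapping samples.
        n = len(source) - lag
        return sum(source[t] * sink[t + lag] for t in range(n))
    return max(range(max_lag + 1), key=score)

# Toy data: the sampling rate and the delay are illustrative assumptions.
fs_hz = 1000                  # one sample per millisecond
true_delay_ms = 120
rng = random.Random(0)
source = [rng.gauss(0.0, 1.0) for _ in range(2000)]

# The sink sees a delayed, attenuated copy of the source plus its own noise.
sink = [0.0] * len(source)
for t in range(true_delay_ms, len(source)):
    sink[t] = 0.8 * source[t - true_delay_ms] + rng.gauss(0.0, 0.5)

lag_samples = best_lag(source, sink, max_lag=300)
lag_ms = 1000 * lag_samples / fs_hz
print(f"estimated source -> sink delay: {lag_ms:.0f} ms")
```

Cross-correlation only says that one signal resembles a delayed copy of the other. Attributing direction, as in "where the activity came from", needs stronger methods than this sketch.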

[Interactive figure: simulated schematic of Fig. 3, spatial path and timing, on a time axis from 0 to 180 ms. The signal leaves preCG (source) at t = 0 and arrives at STG (sink) around t = 120 ms, just in time to pre-empt the echo.]

What the signal actually looks like

The discharge is not a single pulse. It is a tiled, frequency-specific pattern that matches the spectral content of the speech about to be produced — high energy in the 50–150 Hz range during voiced segments, silence during the stop consonants, band-limited activity timed to formants. It is, in other words, a prediction of the spectrogram, delivered in advance.

If the discharge were just "here comes speech, hush," the phenomenon would be interesting but architecturally simple. That it is frequency-specific means the downstream comparator is doing something much more like subtraction than gating. The auditory cortex doesn't go quiet; it goes unsurprised.
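
The difference between gating and subtraction can be sketched on a toy spectrogram. This is an illustration of the two ideas, not a model from the paper: the functions `gate` and `subtract`, the gain of 0.2 and every number in the arrays are made up.

```python
def gate(observed, gain):
    """Gating: scale every time-frequency bin by the same gain."""
    return [[gain * x for x in band] for band in observed]

def subtract(observed, predicted):
    """Subtraction: remove the predicted energy bin by bin."""
    return [[o - p for o, p in zip(obs_band, pred_band)]
            for obs_band, pred_band in zip(observed, predicted)]

# Toy spectrograms: rows are frequency bands, columns are time frames.
# All values are made up for illustration (arbitrary units).
predicted = [
    [0.0, 0.0, 0.0, 0.0],   # band with no self-generated energy
    [1.0, 0.0, 1.0, 1.0],   # frame 1 is silent, like a stop closure
    [2.0, 0.0, 2.0, 1.0],
]
unexpected = [
    [0.0, 0.0, 0.5, 0.0],   # an external sound the prediction did not include
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
observed = [[p + u for p, u in zip(pred_band, unexp_band)]
            for pred_band, unexp_band in zip(predicted, unexpected)]

gated = gate(observed, gain=0.2)
residual = subtract(observed, predicted)

print("gated:   ", gated)
print("residual:", residual)
```

Under gating, the unexpected sound is turned down together with everything else, and a scaled copy of the self-generated energy is still there. Under subtraction, the self-generated energy cancels and what remains is exactly the part the prediction did not cover.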

The most elegant part of the paper, to me, is that the discharge spectrum tracks the about-to-be-produced phoneme. Not the category. The actual acoustic trajectory.
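
One simple way to ask whether a predicted pattern follows a specific acoustic trajectory rather than a category template is to correlate it with each. The sketch below uses a Pearson correlation over flattened toy spectrograms. It is not the spectrogram-matching analysis from the paper, and the arrays `discharge`, `produced_token` and `category_average` are hypothetical numbers chosen only to show the computation.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def flatten(spec):
    """Turn a bands-by-frames spectrogram into one flat list."""
    return [x for band in spec for x in band]

# Hypothetical toy spectrograms (2 bands x 4 frames, arbitrary units).
discharge        = [[0.1, 0.9, 0.8, 0.2], [0.0, 0.3, 0.7, 0.6]]
produced_token   = [[0.0, 1.0, 0.9, 0.1], [0.1, 0.2, 0.8, 0.7]]
category_average = [[0.3, 0.6, 0.6, 0.3], [0.2, 0.4, 0.5, 0.4]]

r_token = pearson(flatten(discharge), flatten(produced_token))
r_category = pearson(flatten(discharge), flatten(category_average))
print(f"match to produced token:   {r_token:.3f}")
print(f"match to category average: {r_category:.3f}")
```

With these made-up values the token-level match comes out higher than the category-level match. That outcome is built into the toy data and says nothing about the real recordings.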

The clinical hook

A natural question is whether the discharge is disrupted in conditions that have self-agency symptoms. Schizophrenia is the canonical example — patients sometimes experience their own thoughts as external voices, which is exactly the phenomenology you would expect if the comparator were miscalibrated. We do not answer that question in this paper. But we give the field an anatomical target and a timing window to look in, which is almost certainly the precondition for answering it.

120 ms · source → sink
ventral preCG · anatomical origin
STG · primary target

My role in this paper

I am a middle author. The lead, Amirhossein Khalilian-Gourtani, did the heavy causal-analysis lifting and is the reason the signal-to-noise in the connectivity figures is as clean as it is. My contribution was the spectrogram-matching analysis and some of the single-trial variance figures. I will say, without qualification, that this is the most scientifically mature paper I have been a coauthor on to date, and I am grateful for the window it gave me into how a really good senior postdoc runs a multi-year investigation.

The feedforward-and-feedback story from paper 06 and this corollary-discharge story are two sides of the same coin. That paper showed the bidirectional flow during production. This paper tells you one specific thing that flow is for. Together they begin to sketch a mechanistic account of how humans produce language without getting drowned out by the sound of producing it.
