AI is being used to resurrect the voices of dead pilots
Overview
On May 22, 2026, the National Transportation Safety Board (NTSB) temporarily suspended public access to its accident docket system after discovering that AI tools had been used to reconstruct cockpit voice recordings from a spectrogram image associated with UPS Flight 2976. The crash, which occurred in Louisville, Kentucky, the previous year, killed the pilots aboard the cargo flight. Federal law strictly prohibits the NTSB from including raw cockpit audio recordings in its public dockets, but the agency had inadvertently released a spectrogram file for this investigation. The file encoded the audio as an image of sound frequencies over time, and users subsequently used AI tools to reconstruct approximations of the deceased pilots' voices from it, prompting the agency to restrict access while it reviewed its dockets.
Key Highlights
- NTSB Action: The agency removed access to its docket system upon discovery of the AI reconstruction and restored access on Friday, but kept 42 investigations closed pending review, including the UPS Flight 2976 case.
- Incident Details: UPS Flight 2976 crashed in Louisville, Kentucky, resulting in the deaths of the pilots; the accident docket included a spectrogram file despite federal prohibitions on cockpit audio.
- Spectrogram Vulnerability: A spectrogram is produced by a mathematical transform (typically a short-time Fourier transform) that converts a sound signal, from its low to its high frequencies, into an image; the released file contained megabytes of encoded data.
- Reconstruction Method: Users combined the publicly available spectrogram image with a public transcript of the recording to create approximations of the cockpit voice recorder audio.
- AI Tools: Social media posts and NTSB confirmations indicate the use of AI tools, specifically citing "Codex," to perform the audio reconstruction.
- Catalyst: Scott Manley, a popular YouTuber covering physics and astronomy, noted on X that it was possible to reconstruct audio from the data encoded in the spectrogram image, drawing attention to the vulnerability.
- Regulatory Response: The NTSB confirmed the reconstruction in a post on X and is reviewing the 42 investigations that remain closed to assess whether they carry similar risks.
Technical Details
The core technical vulnerability lies in the information density of spectrograms and in how accessible the tools for inverting them have become. A spectrogram is a visual representation of a signal's frequency spectrum as it varies with time: each pixel records how much energy the signal carries at a given frequency and moment. Such an image usually preserves magnitude but not phase, so the original waveform cannot be recovered exactly, but phase can be estimated well enough to produce intelligible audio, which is consistent with the reconstructions being described as approximations. The article notes that the spectrogram file contained "megabytes of data," implying a high-resolution encoding of the original audio signal. Users combined this visual data with the public text transcript, likely using the transcript as a conditioning prompt or alignment guide. The mention of "Codex" suggests the use of large language models or code-generation models to write or run software that interprets the spectrogram data and synthesizes audio, or potentially multimodal models that process image inputs and generate audio outputs. Either way, the process defeats the purpose of the prohibition on releasing cockpit audio and demonstrates that converting audio to an image format does not sufficiently obfuscate the underlying information.
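The inversion step described above can be illustrated with a minimal sketch. It is not the method used in the UPS Flight 2976 case, which the source does not detail: it swaps in the classical Griffin-Lim phase-estimation algorithm (no AI model involved) and uses a synthetic two-tone signal in place of real audio. The function names, the window and hop sizes, the 80 dB pixel range, and the assumption that the image's peak level is known are all illustrative choices, not details from the article.

```python
import numpy as np

N_FFT = 256  # samples per analysis window (assumed value)
HOP = 64     # step between successive windows (assumed value)

def stft(x, n_fft=N_FFT, hop=HOP):
    """Complex short-time Fourier transform, shape (freq_bins, frames)."""
    x = np.pad(x, n_fft)  # zero-pad both ends so edge samples get full window coverage
    window = np.hanning(n_fft)
    starts = range(0, len(x) - n_fft + 1, hop)
    frames = np.array([x[s:s + n_fft] * window for s in starts])
    return np.fft.rfft(frames, axis=1).T

def istft(Z, length, n_fft=N_FFT, hop=HOP):
    """Least-squares overlap-add inverse of stft(); returns `length` samples."""
    window = np.hanning(n_fft)
    frames = np.fft.irfft(Z.T, n=n_fft, axis=1)
    out = np.zeros(length + 2 * n_fft)
    norm = np.zeros_like(out)
    for k, frame in enumerate(frames):
        s = k * hop
        out[s:s + n_fft] += frame * window
        norm[s:s + n_fft] += window ** 2
    out /= np.maximum(norm, 1e-8)
    return out[n_fft:n_fft + length]  # drop the zero padding added by stft()

def to_pixels(magnitude, floor_db=-80.0):
    """Quantize magnitudes to 8-bit grayscale on a dB scale, as an image would."""
    db = 20 * np.log10(np.maximum(magnitude, 1e-10) / magnitude.max())
    scaled = (np.clip(db, floor_db, 0.0) - floor_db) / -floor_db
    return np.round(255 * scaled).astype(np.uint8)

def from_pixels(pixels, peak, floor_db=-80.0):
    """Decode 8-bit pixels back to linear magnitudes (peak level assumed known)."""
    db = pixels.astype(float) / 255 * -floor_db + floor_db
    return peak * 10 ** (db / 20)

def griffin_lim(magnitude, length, n_iter=100, seed=0):
    """Estimate a waveform whose STFT magnitude approximates `magnitude`."""
    rng = np.random.default_rng(seed)
    phase = np.exp(2j * np.pi * rng.random(magnitude.shape))  # random starting phase
    for _ in range(n_iter):
        x = istft(magnitude * phase, length)
        phase = np.exp(1j * np.angle(stft(x)))  # keep the phase, discard the magnitude
    return istft(magnitude * phase, length)

def spectral_error(x, magnitude):
    """Relative distance between the STFT magnitude of x and the target."""
    return np.linalg.norm(np.abs(stft(x)) - magnitude) / np.linalg.norm(magnitude)

# Synthetic stand-in for audio: one second of two tones at 8 kHz.
FS = 8000
t = np.arange(FS) / FS
audio = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1200 * t)

# "Publish" only an 8-bit image of the magnitudes, then invert it.
true_magnitude = np.abs(stft(audio))
pixels = to_pixels(true_magnitude)
magnitude = from_pixels(pixels, peak=true_magnitude.max())
recovered = griffin_lim(magnitude, len(audio))
print(f"relative spectral error: {spectral_error(recovered, magnitude):.3f}")
```

The point of the sketch is that nothing beyond the pixel values and a few scale assumptions is needed to get back to a waveform. A real image would also require guessing the frequency and time axes and the color mapping, which is presumably where a transcript or an AI-assisted tool would help.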
Impact & Significance
This incident underscores a critical failure of traditional data redaction protocols in the era of generative AI. Regulatory bodies and organizations handling sensitive audio data can no longer rely on format conversion or visual obfuscation to protect privacy or comply with legal restrictions. The ability to reconstruct intelligible, emotionally resonant audio from spectrograms using accessible AI tools poses significant ethical and privacy risks, particularly where the voices of deceased individuals are concerned. For the AI industry, this highlights the dual-use nature of reconstruction models and the urgent need for robust detection mechanisms and policy frameworks to prevent the unauthorized reconstruction of sensitive biometric data such as voices. Developers and agencies must immediately audit data release pipelines to ensure that "anonymized" or transformed data cannot be reverse-engineered with current AI capabilities.