Silence is deceptive: a muted microphone can no longer keep your secrets

CarderPlanet

How can I listen to a photo and reconstruct speech from a silent video?

Remote work and regular video conferences have become commonplace over the past few years. We often mute the microphone during meetings so that no one can hear what is happening in the background (in the apartment, for example). Soon that may no longer be enough.

Kevin Fu, a professor at Northeastern University in Boston, has presented a method for extracting audio from silent video and even from photos. The technique, called Side Eye, is built on machine learning. It can recover a person's speech down to the exact words, and even the timbre of the voice.

Fun fact: the Side Eye concept was inspired by an episode of the sci-fi TV series Fringe, in which the characters extract sound from molten glass. After the episode aired, many critics called the method a "pseudoscientific invention." However, Fu states: "My lab specializes in creating the impossible."

Side Eye relies on two features common in modern devices: image stabilization and the rolling shutter. A rolling-shutter sensor reads the image line by line rather than all at once, so it registers the tiny vibrations that reach the camera when someone nearby is talking. Side Eye converts these vibrations, as they show up in the image, back into sound. Because every line acts as a separate reading, the audio signal gains thousands of times more detail than one reading per frame would give.
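To make the line-by-line idea concrete, here is a minimal sketch of why a rolling shutter yields so many more samples than the frame rate suggests. It is not the Side Eye implementation: the camera parameters (30 fps, 1080 rows), the 440 Hz test tone, the `readout_fraction` parameter and both helper functions are assumptions made up for illustration. The per-row image shift is simulated directly as a sine wave, whereas a real attack would have to estimate it from image content.

```python
import math

def row_sample_times(fps, rows, n_frames, readout_fraction=1.0):
    """Return the time (seconds) at which each image row is read out.

    readout_fraction is the share of the frame period spent reading rows;
    1.0 is an idealized sensor with no idle gap between frames.
    """
    frame_period = 1.0 / fps
    row_period = frame_period * readout_fraction / rows
    return [k * frame_period + r * row_period
            for k in range(n_frames)
            for r in range(rows)]

def zero_crossing_frequency(samples, duration):
    """Estimate a tone's frequency from sign changes (two per cycle)."""
    crossings = sum(1 for a, b in zip(samples, samples[1:])
                    if (a < 0) != (b < 0))
    return crossings / (2.0 * duration)

# Hypothetical camera and vibration, chosen only for illustration.
FPS, ROWS, FRAMES = 30, 1080, 30   # one second of video
TONE_HZ = 440.0

times = row_sample_times(FPS, ROWS, FRAMES)

# Simulated per-row shift caused by the vibration (arbitrary units).
row_shifts = [math.sin(2 * math.pi * TONE_HZ * t) for t in times]

global_shutter_rate = FPS            # one reading per frame
rolling_shutter_rate = FPS * ROWS    # one reading per row (idealized)
estimated_hz = zero_crossing_frequency(row_shifts, FRAMES / FPS)

print(f"global shutter:  {global_shutter_rate} samples/s")
print(f"rolling shutter: {rolling_shutter_rate} samples/s")
print(f"estimated tone:  {estimated_hz:.1f} Hz")
```

With one reading per frame, a 30 fps camera could only represent vibrations below 15 Hz, far under the range of speech. Treating each row as its own reading raises the rate by a factor equal to the row count, which is where the "thousands of times" figure plausibly comes from. Real sensors pause between frames, so the samples are not perfectly uniform; setting `readout_fraction` below 1.0 models that gap.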

Side Eye needs only a small amount of light to work effectively, but the more frames are available for analysis, the better. Interestingly, even footage showing nothing but a ceiling or wall can help reconstruct speech.

The output is muffled audio, which is then much easier to interpret.

From a cybersecurity perspective, Side Eye is clearly a potential threat. But there is also a positive side: the technique could find use in forensics and law. For example, investigators analyzing surveillance footage from a specific time could use it to confirm or refute an alibi.
 