AI Versus AI: Scientists Develop New Technologies To Confuse AI Assistants With Background Noise

"Big Brother is watching you." The sentence comes from George Orwell's famous dystopian novel 1984, where the slogan appears throughout the story's setting, a reminder that surveillance devices and secret police are everywhere.

Picture from Jason Reed/The Daily Dot

Nowadays, companies use "bossware" to monitor how employees work; many "spyware" apps can record phone calls; and smart home devices such as the Amazon Echo can record your daily conversations.

So how can we resist this pervasive monitoring? A newly developed technology called "neural voice camouflage" may help: it generates custom audio noise in the background as you speak, confusing any AI assistant that is listening.

The new system relies on an "adversarial attack". It uses machine learning, in which algorithms find patterns in data, to adjust the sound so that it masks the human voice. In essence, you use one AI to fool another.
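This is not the researchers' actual system, but the core adversarial-attack idea can be sketched in a few lines. The toy below is entirely hypothetical: a logistic "keyword detector" stands in for a full speech recognizer, and an FGSM-style perturbation (a small step against the sign of the score's gradient) stands in for the learned camouflage noise.

```python
import numpy as np

# Toy stand-in for a speech model: a logistic "keyword detector" over a
# fixed feature vector. The weights and features are made up for
# illustration; a real attack targets a full ASR network.
rng = np.random.default_rng(0)
w = rng.normal(size=256)                 # hypothetical detector weights
x = 0.1 * w + 0.01 * rng.normal(size=256)  # "audio" the detector fires on

def detect(features):
    """Probability that the detector hears the keyword."""
    return 1.0 / (1.0 + np.exp(-w @ features))

# FGSM-style adversarial noise: the gradient of sigmoid(w @ x) with
# respect to x is proportional to w, so stepping along -sign(w) lowers
# the detector's score while each sample moves by at most eps.
eps = 0.2
noise = -eps * np.sign(w)

before = detect(x)
after = detect(x + noise)
print(f"detector confidence: {before:.3f} -> {after:.3f}")
```

The point of the sketch is the asymmetry the article describes: a perturbation that is small sample-by-sample can still collapse the model's confidence, because it is aimed precisely along the directions the model is sensitive to.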

However, this is not as easy as it sounds. A machine-learning model normally needs to process an entire sound clip before it knows how to adjust it, which is not feasible when you want to camouflage speech in real time.

Therefore, in the new study, the researchers taught a neural network, a brain-inspired machine learning system, to effectively predict the future. Trained on many hours of recorded speech, it continuously processes two-second audio clips and disguises what is likely to be said next.

The AI listens to what has just been said and produces sounds designed to disrupt the many phrases that might plausibly follow.
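The real-time constraint above can be made concrete with a simulation. This is only a sketch under stated assumptions: the sample rate, chunk size, and the `predict_masking_noise` stand-in (which just shapes white noise by the context's energy) are all invented here; the actual trained network predicts perturbations tuned to likely follow-up phrases.

```python
import numpy as np

SAMPLE_RATE = 16_000          # assumed; the system's real rate may differ
CHUNK = SAMPLE_RATE // 2      # emit noise in 0.5 s hops (illustrative)
CONTEXT = 2 * SAMPLE_RATE     # "two seconds of audio" the model hears

rng = np.random.default_rng(0)

def predict_masking_noise(context):
    """Hypothetical stand-in for the trained predictive model: given the
    last two seconds of speech, emit noise for the NEXT chunk. Here we
    only scale white noise to the context's loudness."""
    level = 0.1 * np.sqrt(np.mean(context ** 2) + 1e-12)
    return level * rng.standard_normal(CHUNK)

def stream_camouflage(speech):
    """Simulate the real-time loop: the noise overlapping chunk t was
    computed from audio strictly BEFORE chunk t (no peeking ahead)."""
    out = np.zeros_like(speech)
    for start in range(CONTEXT, len(speech) - CHUNK, CHUNK):
        context = speech[start - CONTEXT:start]   # past audio only
        out[start:start + CHUNK] = predict_masking_noise(context)
    return speech + out   # what an eavesdropping microphone records

speech = np.random.default_rng(1).standard_normal(4 * SAMPLE_RATE)
mic = stream_camouflage(speech)
```

Note the causality in the loop: by the time any chunk of speech is playing, its masking noise is already fixed, which is exactly why the network must predict rather than react.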

For example, if someone has just said "enjoy the great feast", the system cannot predict exactly what will come next. But based on the context and the speaker's voice characteristics, it produces a sound that disrupts a range of possible follow-up phrases, including what actually comes next; here, the same speaker says, "that's being cooked."

To human listeners, this audio camouflage sounds like background noise, and they have no trouble understanding the spoken words. But machines stumble.

The scientists superimposed their system's output on recorded speech and fed it directly into an automatic speech recognition (ASR) system of the kind an eavesdropper might use to transcribe it. The camouflage raised the ASR software's word error rate from 11.3% to 80.2%. For example, "I'm nearly starred myself, for this qualifying KINDOMS is hard work" was transcribed as "Im moderately Starr my Scell for three for this conqernd KINDOMS as harenar ov the Recon".
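Word error rate, the metric quoted throughout these results, is the word-level edit distance between the ASR transcript and the reference, divided by the reference length. A minimal implementation (not the paper's evaluation code) looks like this:

```python
def word_error_rate(reference, hypothesis):
    """Edit distance (substitutions + insertions + deletions) between
    two word sequences, divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[-1][-1] / len(ref)

print(word_error_rate("that is being cooked", "that was being booked"))  # 0.5
```

A rate of 80.2% therefore means the transcript gets roughly four out of every five reference words wrong in some way, which is why the example transcription above is barely recognizable.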

By comparison, speech masked with white noise, or with a competing adversarial attack that lacks predictive ability and can only play noise half a second after hearing the content it masks, yielded error rates of just 12.8% and 20.5%, respectively. The work was presented last month in a paper at the International Conference on Learning Representations, which peer-reviews submitted manuscripts.

Even when the ASR system was trained to transcribe speech disturbed by neural voice camouflage (as an eavesdropper could conceivably do), the error rate remained 52.5%. In general, the hardest words to obscure were short ones such as "the", but these are also the least revealing parts of a conversation.

The researchers also tested the method in the real world, playing a voice recording combined with the camouflage through a set of speakers in the same room as a microphone. It still worked: "I also just got a new monitor", for example, was transcribed as "with reasons with they also toscat and neumonitor".
