For many of you, the first music player you owned was probably an MP3 player. Back then, everyone simply assumed that MP3 was synonymous with music.
Yet both MP3 and CD are digital audio, and an MP3 is only one twelfth the size of standard CD audio. To human ears there is a difference between the two, but it is much harder to detect than image compression. So what does MP3 do to the music? What is lost?
Hello, everyone. I'm Poor Critic. Today I'd like to talk about the audio format you use most often: MP3.
What's Missing From MP3
What Is The Difference Between Before And After Compression
If we want to reduce the size of a file, the most direct way is compression.
The compression most of us think of removes repetition. For example, if you buy five bottles of Coca-Cola at the supermarket, the receipt won't list Coke five times; it just says "Coca Cola * 5". This is equivalent to encoding the repeated parts of a file with fewer bytes. The file loses no data, and no information is missing after decoding, yet the file gets smaller.
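The receipt example above is essentially run-length encoding. The following is a minimal sketch of that idea, not the scheme MP3 itself uses; the function names rle_encode and rle_decode are made up for illustration.

```python
from itertools import groupby

def rle_encode(items):
    """Collapse runs of equal items into (item, count) pairs."""
    return [(item, len(list(group))) for item, group in groupby(items)]

def rle_decode(pairs):
    """Expand (item, count) pairs back into the original sequence."""
    return [item for item, count in pairs for _ in range(count)]

basket = ["Coca Cola"] * 5
receipt = rle_encode(basket)   # one line on the receipt instead of five
restored = rle_decode(receipt) # decoding gives back exactly what was bought
```

Because decoding restores the input exactly, nothing is lost; the saving comes only from the repetition.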
This is lossless compression. It is in fact the last step of MP3 encoding, performed by an algorithm called Huffman coding. However, this algorithm alone would not shrink an MP3 significantly.
That is because sound is extremely chaotic data with very high information entropy, so it cannot be reduced to 10% of CD size this way.
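To see why Huffman coding alone does not help much with chaotic data, here is a rough sketch that builds a Huffman code and measures the average number of bits per byte. It uses seeded random bytes as a stand-in for noisy audio samples, which is an assumption made for illustration; real audio is not uniformly random, and this is not the MP3 Huffman stage.

```python
import heapq
import random
from collections import Counter

def huffman_code_lengths(data):
    """Return {byte value: code length in bits} for a Huffman code of data."""
    counts = Counter(data)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries: (weight, tie breaker, symbols in this subtree)
    heap = [(w, i, [s]) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(counts, 0)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (w1 + w2, tie, s1 + s2))
        tie += 1
    return lengths

def average_bits_per_byte(data):
    """Average Huffman code length, weighted by how often each byte occurs."""
    lengths = huffman_code_lengths(data)
    counts = Counter(data)
    return sum(counts[s] * lengths[s] for s in counts) / len(data)

rng = random.Random(0)
repetitive = b"aaaaaaaabbbbccd" * 1000
noise = bytes(rng.randrange(256) for _ in range(100_000))

bits_repetitive = average_bits_per_byte(repetitive)  # well under 8 bits
bits_noise = average_bits_per_byte(noise)            # stays close to 8 bits
```

The repetitive input shrinks to a fraction of its size, while the noisy input still needs close to the full 8 bits per byte.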
Since lossless compression alone is a dead end, the only option is to throw some information away.
What Sound Did MP3 Lose?
The easiest way to find the answer is to compare.
We place the MP3 of a sound and its lossless version side by side on two tracks and invert the phase of one of them. If the two sounds were identical, they would cancel each other out and we would get silence. This is also the working principle of noise-cancelling headphones.
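The inversion test described above can be sketched in a few lines. This is only an illustration: the "degraded" track is produced by coarse rounding as a stand-in for a lossy copy, not by a real MP3 encoder, and the helper names are hypothetical.

```python
import math

def sine(freq_hz, seconds=0.01, rate=44100):
    """Generate a sine tone as a list of samples in the range [-1, 1]."""
    n = int(seconds * rate)
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]

def null_test(track_a, track_b):
    """Invert track_b, mix it with track_a, and return what is left over."""
    return [a + (-b) for a, b in zip(track_a, track_b)]

def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)

original = sine(440)
identical = list(original)
# Stand-in for a lossy copy: round every sample to a coarse grid
degraded = [round(s * 127) / 127 for s in original]

silence = peak(null_test(original, identical))   # exact copies cancel fully
residual = peak(null_test(original, degraded))   # the discarded part remains
```

Identical tracks cancel to silence; with an altered copy, whatever survives the cancellation is exactly the part that was changed or thrown away.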
But because MP3 is lossy compression, the two do not cancel completely, and what remains is the difference between the MP3 and the lossless version. Yet if a piece of music kept switching between MP3 and lossless, could you really tell them apart? I'm sure you couldn't.
This is the magic of the MP3 algorithm. It does not simply discard sound data; it discards data in a way you don't notice.
MP3 Birth History
The Story Of Brandenburg And Dieter Seitzer
In the late 1970s, a German professor named Dieter Seitzer came up with an idea ahead of its time. He wanted people to sit at home and order music over ISDN telephone lines, much like a jukebox.
At that time ISDN was mainly used for sending and receiving faxes, and its speed was only around 128 kbps.
So when Seitzer tried to patent the idea, the patent office staff told him it was out of the question: the ISDN network would have to be about 12 times faster to carry CD data.
Seitzer figured he could do nothing about network speed himself, but if he could invent an audio format only 1/12 the size of CD audio, he could still build this "digital jukebox". So he turned around and handed the task to a student named Karlheinz Brandenburg.
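The "about 12 times" figure can be checked with simple arithmetic. The sketch below assumes standard CD audio parameters (44.1 kHz, 16 bits, stereo), which the text does not state explicitly, and takes the 128 kbps ISDN speed from the passage above.

```python
# Standard CD audio parameters (assumed; not given in the text)
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2

# ISDN speed as quoted in the text
ISDN_BITS_PER_SECOND = 128_000

cd_bits_per_second = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS  # 1,411,200
speedup_needed = cd_bits_per_second / ISDN_BITS_PER_SECOND        # about 11

print(f"CD audio: {cd_bits_per_second / 1000:.1f} kbps")
print(f"ISDN would need to be {speedup_needed:.1f}x faster")
```

Under these assumptions the ratio comes out at roughly 11, in line with the rounded factor of 12 quoted in the story.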
Brandenburg's master's thesis focused on a speech compression algorithm called ATC, and that is what brought him to Seitzer's attention. But Brandenburg was dismayed when he received the task.
He thought: damn it, if a professor can't do it, how can a mere doctoral student? But a task assigned by one's advisor can't be refused, so he planned to spend a few years proving it impossible, write up the thesis, and coast to his doctorate. In the process of proving that "it is impossible to make an audio file 12 times smaller than a CD", he found that, well, it actually was possible.
Psychoacoustics And MP3
What convinced Brandenburg that the idea had a chance was an extremely obscure subject called psychoacoustics.
This is a branch of psychophysics. It originated from people's exploration of music and musical instruments, and later became a discipline that studies the relationship between human physiological perception and the objective world of sound. It sounds complicated, but it is actually easy to understand. For example, research on how human ears locate sounds and on the range of hearing both belong to psychoacoustics.
One classic result in psychoacoustics is the equal-loudness contour. This curve tells us that the human ear perceives different frequencies very differently. The range of human hearing is 20 to 20,000 Hz, and within this range the loudness we perceive differs from frequency to frequency.
A low-frequency sound needs a higher sound pressure to be perceived as equally loud as a mid-frequency sound.
For example, a bass needs a higher volume to sound as loud as a guitar. This is why the speaker cabinets for low-frequency instruments such as the bass are physically much larger than other speakers.
The lowest point of the curve appears at about 3000 Hz, which shows that people are most sensitive to sound at this frequency and can hear it at a relatively low sound pressure. For example, the fundamental frequency of most alarm sounds we hear is 1000 to 3000 Hz, so that the human ear can pick them up more easily and avoid danger.
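The dip near 3000 Hz can be reproduced numerically. The sketch below uses a commonly cited approximation of the threshold of hearing in quiet (usually attributed to Terhardt); this formula does not appear in the text and is brought in here only as an assumption to illustrate the shape of the curve.

```python
import math

def threshold_in_quiet_db(freq_hz):
    """Approximate threshold of hearing in quiet, in dB SPL.

    Terhardt-style approximation; an assumption for illustration,
    not a formula taken from the article.
    """
    k = freq_hz / 1000.0  # frequency in kHz
    return (3.64 * k ** -0.8
            - 6.5 * math.exp(-0.6 * (k - 3.3) ** 2)
            + 1e-3 * k ** 4)

# Scan the audible range in 10 Hz steps and find where the ear is most sensitive
most_sensitive_hz = min(range(20, 20001, 10), key=threshold_in_quiet_db)

for f in (100, 1000, 3000, 16000):
    print(f"{f:>6} Hz: {threshold_in_quiet_db(f):6.1f} dB SPL")
print("most sensitive near", most_sensitive_hz, "Hz")
```

Under this approximation, very low and very high frequencies need far more sound pressure to be heard than frequencies in the 3 to 4 kHz region.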
Interestingly, as we grow older our hearing range also shrinks, and most adults cannot hear sounds above 16,000 Hz.
Following this direction, the discoverers of the equal-loudness contour went on to develop something even more remarkable, which shows how large the gap between our senses and objective reality is.
This thing is "masking".
Sound Masking
One afternoon in 1958, a psychologist named Licklider went to the dentist. He told the dentist he did not need an anesthetic. Then he took out a pair of headphones and began listening to deafening music. Amid the loud music, the dentist removed three decayed teeth, and he felt no pain, as if he had been anesthetized.
Licklider named the technology Audiac. Later he took it on the road, working with dentists on tooth extractions. It also helped many women relieve pain during childbirth.
Audiac uses a strong auditory stimulus to suppress pain, which is a cross-sensory masking effect.
Within the auditory system, one sound can also be masked by another sound occurring at the same time. For example, in a band the guitar is often front and center, but if an instrument with a similar frequency range, such as a trumpet, suddenly joins in, the sound of the guitar is briefly drowned out. This is called frequency masking.
Let's use an animation as an example. When noise sweeping from low to high frequency passes over a sine tone, the tone is masked by the noise.
For another example, if you want to cover up a fart with a cough in class, you had better meet three conditions: first, the cough lasts at least as long as the fart; second, the cough is at least as loud as the fart; third, the two are close in frequency. If all three conditions are met, the masking is excellent.
What does this have to do with MP3? The MP3 algorithm exploits this characteristic of the human ear to throw away the drowned-out sounds in each frequency range of a song. This keeps the loss of sound quality to a minimum while reducing the file size.
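The idea of discarding drowned-out sounds can be sketched with a toy model. This is not the MP3 psychoacoustic model, which works with critical bands and spreading functions; the function names and the two thresholds (a third of an octave, 20 dB) are invented purely for illustration.

```python
import math

def is_masked(component, others, max_octaves=0.33, min_gap_db=20.0):
    """Toy rule: a tone is masked if a much louder tone sits close in frequency.

    component and others are (frequency in Hz, level in dB) pairs.
    max_octaves and min_gap_db are made-up illustrative thresholds.
    """
    freq, level = component
    for other_freq, other_level in others:
        distance = abs(math.log2(other_freq / freq))
        if distance <= max_octaves and other_level - level >= min_gap_db:
            return True
    return False

def drop_masked(components, **thresholds):
    """Keep only the components that no other component masks."""
    kept = []
    for i, component in enumerate(components):
        others = components[:i] + components[i + 1:]
        if not is_masked(component, others, **thresholds):
            kept.append(component)
    return kept

# A loud 1000 Hz tone, a quiet neighbour at 1100 Hz, and a quiet tone far away
spectrum = [(1000, 80), (1100, 50), (4000, 50)]
audible = drop_masked(spectrum)
```

In this toy spectrum the quiet 1100 Hz tone is dropped because the loud 1000 Hz tone sits right next to it, while the equally quiet 4000 Hz tone survives because nothing loud is nearby.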
Temporal Masking
But that's not enough.
When a noise stops abruptly, a gradually weakening masking effect actually lingers for 100 to 200 ms. During this period after the noise has completely stopped, sounds quieter than the noise are masked and we cannot hear them at all, as if our ears needed 200 ms to recover.
Not only that: noise can also mask sounds that come before it. The window is only about 50 ms, but that is a long time for the senses. It suggests that our brain needs roughly a 50 ms buffer before reporting to consciousness.
This is called temporal masking.
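The two time windows described above can be turned into a toy check. The linear decay used here is an assumption chosen for simplicity, not a measured curve, and is_temporally_masked is a hypothetical helper; only the 200 ms and 50 ms windows come from the text.

```python
def is_temporally_masked(offset_ms, level_db, masker_db,
                         post_ms=200.0, pre_ms=50.0):
    """Toy check for temporal masking around a loud noise (the masker).

    offset_ms > 0: the quiet sound occurs that long after the masker stops.
    offset_ms < 0: the quiet sound occurs that long before the masker starts.
    The masking threshold is assumed to fall off linearly to zero
    at the edge of each window (an illustrative simplification).
    """
    window_ms = post_ms if offset_ms >= 0 else pre_ms
    distance_ms = abs(offset_ms)
    if distance_ms >= window_ms:
        return False  # outside the masking window, the sound is audible
    threshold_db = masker_db * (1.0 - distance_ms / window_ms)
    return level_db < threshold_db

# A 40 dB sound 50 ms after an 80 dB noise stops: still hidden
hidden_after = is_temporally_masked(50, 40, 80)
# The same sound 250 ms later: the ear has recovered
heard_later = is_temporally_masked(250, 40, 80)
# A 30 dB sound 25 ms before the noise starts: hidden by the noise that follows
hidden_before = is_temporally_masked(-25, 30, 80)
```

An encoder can skip sounds for which such a check returns True, since a listener would not have heard them anyway.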
The core of the MP3 compression algorithm is a model of human auditory psychology that has been carefully refined over many years. It maps every moment of the music to a frame in the MP3 file, checks which frequencies and time spans in that frame fall under the two masking effects above, and throws away all the audio information that is covered up and that we cannot hear.
This process is not simply a precise, mechanical judgment. At its core, it is really about working with the senses.
In the early stage of testing the MP3 algorithm, testers had to find the algorithm's problems across a large number of songs. They compared the MP3 and lossless versions of various songs and rated each one on four levels: no difference, slight difference, somewhat unpleasant, and very unpleasant. The last two options in particular are highly subjective judgments.
This means that the invention and refinement of the MP3 algorithm actually took people's subjective judgment as one of its criteria. We cannot say the algorithm is completely subjective, but it is not absolutely objective either, so it cannot perform equally well on every song.
Battle Between Vega And Audio Coding Standards
Speaking of which, I have to mention a little story from the invention of MP3. In the final stage of testing the compression algorithm, Brandenburg and his colleagues ran into a big problem. Up to then, they had felt their algorithm was very impressive: in almost all double-blind tests it was difficult to hear any difference.
One day, he happened to read in a magazine that people liked to test their speakers with Suzanne Vega's song Tom's Diner, and he happened to find the CD in the Fraunhofer lab, so he loaded the song onto the computer.
The song is very simple: a pure a cappella vocal without accompaniment. But when he ran the song through the MP3 compression algorithm, this is what he got.
At lower MP3 bit rates, Vega's voice became hoarse and unnatural. So over the following year, the R&D team made thousands of minor adjustments to the MP3 algorithm. Brandenburg said he listened to the song at least 3000 times, probably more than anyone on Earth.
Finally, they succeeded in compressing Tom's Diner, and the song genuinely improved the MP3 compression algorithm.
Many years later, Brandenburg actually met Vega and heard her sing Tom's Diner live. Although he had heard it countless times, he said the song was still very good.
Brandenburg finally published his thesis in 1989. The next step was to bring the technology to the world. In the early 1990s, several emerging technologies appeared in the industry, including the now-familiar CD-ROM and DVD, and they were looking for a new audio coding standard.
So he and his team submitted their entry to the Moving Picture Experts Group (MPEG), competing with 13 other teams to become the new audio coding standard. The biggest competitor was a group called MUSICAM, backed by Philips. At that time Philips held the CD-ROM patents and was in the ascendant.
Therefore, although Brandenburg's team produced smaller files with better sound quality, they lost to MUSICAM in the end.
The reason was that MUSICAM's algorithm required less processing power. At a time when processors were generally not very powerful, that was a real advantage.
So in those years, MP3 was a clear failure. Even its inventors had begun to work on new audio codecs. MP3 was thrown into the dustbin of history.
Then, in the mid-1990s, the birth of two revolutionary technologies brought MP3 back to life: the World Wide Web and Windows 95.
An R&D team, also from Germany, developed a software player for MP3 and released it for Windows.
At that time, 1 GB hard disks were just beginning to spread and storage space was precious, while processors had improved greatly. So the smaller MP3 was gradually accepted by everyone and unexpectedly became the new audio coding standard. July 14, 1995 is regarded as the birthday of MP3: on that day Karlheinz Brandenburg and his colleagues at the Fraunhofer Institute decided to name the file extension of the compression algorithm ".mp3", after the full name of the industry standard, MPEG-2 Audio Layer III.
In the late 1990s, "mp3" replaced "sex" as the most searched word in search engines. When Brandenburg was on a business trip in Hong Kong, he saw 30 different brands of MP3 players in a shop window. He thought, "Well, we finally won."
MP3 Is Disappearing
The birth of MP3 was much more complicated than I had imagined. It is a research achievement that took many years and went through countless iterations, and it can be said to have reshaped the music industry. It was also with MP3 that music became a mass consumer good within everyone's reach.
From vinyl and tape to CD and MP3, every technological innovation has changed people's experience of music and the way we consume it. MP3 stands out in this history. Those who admire it believe MP3 is great because it made music easy for everyone to enjoy; those who oppose it regard it as a monster because it devoured the copyright that record companies relied on, and with it the golden age of the record industry.
Today, digital music remains, but MP3 is on the verge of being left behind. We no longer need to download music to a player and listen on that player. Everyone streams music on their phone; 5G networks and hundreds of GB of storage make heavy audio compression less necessary, and music platforms have gradually turned to lossless formats such as FLAC.
But we all remember the era of listening to MP3s, and the music that accompanied us.