Audio coding


Yao Wang
Polytechnic University, Brooklyn, NY 11201

• Psychoacoustic model of human hearing
– Threshold in quiet
– Frequency masking
– Temporal masking

• Basic steps in perceptual audio coding
– Quantization basics
– Subband analysis
– Bit allocation based on masking threshold

• MPEG audio coding
– MPEG-1 audio layers (including MP3) and technical differences
– MPEG-2 audio coding (BC and AAC)
– MPEG-4 audio coding
©Yao Wang, 2004

EE3414: Audio Coding


Speech vs. Audio Coding
• Speech coding
– Targeted for telephony applications
• High rate waveform-based speech coder: for comfortable, natural sound,
use simple predictive coding techniques
• Low rate model-based speech coders: for intelligible speech, sufficient for
communication purposes, use speech-production models (a filter driven by
an excitation signal)

• Audio coding
– For high quality production of music (including speech) in multiple
• Music has a much wider bandwidth and multichannels
• Waveform-based to retain the natural sound quality
• Make extensive use of human hearing properties in determining the
quantization levels in different frequency bands
– Each frequency component is quantized with a step-size that depends
on the hearing threshold
– Don’t code if the ear cannot hear it!
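The idea of a threshold-dependent step size can be sketched as follows. This is a minimal illustration, not the scheme of any particular codec: it assumes each frequency component comes with a hearing threshold expressed as an amplitude in the same linear units (the names `quantize_component`, `dequantize_component`, and the sample values are hypothetical). A uniform quantizer with step size Δ has an error of at most Δ/2, so choosing Δ = 2 × threshold keeps the error at or below the threshold, and any component smaller than the threshold is quantized to zero, that is, not coded.

```python
def quantize_component(x: float, threshold: float) -> int:
    """Uniform quantizer whose step size is tied to the hearing threshold.

    Assumption (illustrative only): `threshold` is a positive amplitude in the
    same linear units as `x`. With step = 2 * threshold, the rounding error is
    at most `threshold`.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    step = 2.0 * threshold
    return round(x / step)


def dequantize_component(index: int, threshold: float) -> float:
    """Reconstruct the amplitude from the quantizer index."""
    return index * 2.0 * threshold


# Hypothetical frequency components and per-component thresholds.
coefficients = [0.9, -0.04, 0.3]
thresholds = [0.05, 0.05, 0.2]

indices = [quantize_component(x, t) for x, t in zip(coefficients, thresholds)]
reconstructed = [dequantize_component(i, t) for i, t in zip(indices, thresholds)]

for x, t, i, xr in zip(coefficients, thresholds, indices, reconstructed):
    print(f"x={x:+.2f} threshold={t:.2f} index={i:+d} error={abs(x - xr):.3f}")
```

A larger threshold means a coarser step and fewer distinct indices to encode, which is where the bit savings come from. Real coders work with noise power relative to a masking threshold per band rather than a per-sample amplitude bound.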



Psychoacoustic Model of Human Hearing
• Ear as a filter bank
• Three masking effects:
– Threshold in quiet
– Frequency masking
– Temporal masking



Ear as a Filterbank
• The auditory system can be roughly modeled as a filterbank,
consisting of 25 overlapping bandpass filters, from 0 to 20 kHz
– The ear cannot distinguish sounds within the same band that occur simultaneously
– Each band is called a critical band
– The bandwidth of each critical band is about 100 Hz for signals below
500 Hz, and increases linearly after 500 Hz up to 5000 Hz
– 1 bark = width of 1 critical band
\[
\text{Bark}(f) =
\begin{cases}
f / 100, & f \le 500\ \text{Hz} \\
9 + 4 \log_2 (f / 1000), & f > 500\ \text{Hz}
\end{cases}
\]
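The piecewise mapping above can be sketched directly in code. The function name `hz_to_bark` is an assumption made for illustration; the formula itself is the one given on the slide. Note that the two branches agree at 500 Hz, since 9 + 4 log2(0.5) = 5 = 500/100.

```python
import math


def hz_to_bark(f_hz: float) -> float:
    """Convert a frequency in Hz to the Bark scale using the slide's formula."""
    if f_hz < 0:
        raise ValueError("frequency must be non-negative")
    if f_hz <= 500.0:
        # Below 500 Hz each critical band is about 100 Hz wide.
        return f_hz / 100.0
    # Above 500 Hz the Bark value grows with the logarithm of frequency.
    return 9.0 + 4.0 * math.log2(f_hz / 1000.0)


for f in (100, 500, 1000, 2000, 4000):
    print(f"{f:5d} Hz -> {hz_to_bark(f):.2f} Bark")
```

Under this mapping, two tones whose Bark values differ by less than 1 lie within roughly one critical band of each other.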







Threshold in Quiet
Put a person in a quiet room. Raise the level of a 1 kHz tone until it is just barely
audible. Vary the frequency and plot the level at which each tone becomes audible.

The threshold levels are frequency dependent. The human ear is most
sensitive at 2-4 kHz.
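The shape of this curve can be explored numerically. The sketch below does not come from the slides: it uses a closed-form approximation of the threshold in quiet that is commonly attributed to Terhardt and widely quoted in the perceptual coding literature, giving the threshold in dB SPL as a function of frequency. The function name `threshold_in_quiet_db` and the search grid are assumptions made for illustration.

```python
import math


def threshold_in_quiet_db(f_hz: float) -> float:
    """Approximate threshold in quiet (dB SPL) at frequency f_hz.

    Assumption: Terhardt-style approximation, not taken from the slides.
    Intended for roughly 20 Hz to 20 kHz.
    """
    khz = f_hz / 1000.0
    return (
        3.64 * khz ** -0.8                          # rise toward low frequencies
        - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)   # dip around 3.3 kHz
        + 1e-3 * khz ** 4                           # rise toward high frequencies
    )


# Search a 10 Hz grid over the audible range for the most sensitive frequency.
frequencies = range(20, 20001, 10)
most_sensitive_hz = min(frequencies, key=threshold_in_quiet_db)

print(f"lowest threshold near {most_sensitive_hz} Hz")
for f in (100, 1000, 3300, 10000):
    print(f"{f:6d} Hz -> {threshold_in_quiet_db(f):6.2f} dB SPL")
```

Under this approximation the minimum falls inside the 2-4 kHz region, in line with the statement above, and the threshold climbs steeply toward both ends of the audible range.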


Frequency Maskin...