MPEG Overview

MPEG-2

MPEG (Moving Picture Experts Group)

Pathway to compression

spatial and temporal redundancy
- pixel values are not independent
  - correlated to neighbors in the same frame
  - correlated to neighbors across frames
psychovisual redundancy
- human eye has a limited response to fine spatial detail
- human eye is less sensitive to detail around object edges

Spatial redundancy

transform to the frequency domain
- Fourier analysis: any periodic waveform can be reproduced by adding together harmonically related sinusoids of suitable amplitudes and phases
  - the transform by itself does not result in compression (it usually increases the amount of data)
  - picture samples are not periodic

STFT (short-time Fourier Transform)

break up the continuous time domain with windows
- rectangular
- # of frequencies depends on size of window
wrap samples into ring to make "continuous"
DFT (discrete Fourier transform)
- we want # input samples == # of frequency coefficients
- FFT is a fast way to compute DFT
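
As a rough sketch of the DFT point above (N input samples in, N frequency coefficients out), a direct O(N^2) DFT can be written in plain Python. The function name `dft` and the 8-sample cosine test signal are illustrative assumptions, not part of the original notes; an FFT would compute the same coefficients faster.

```python
import cmath
import math

def dft(samples):
    """Direct O(N^2) discrete Fourier transform: N samples -> N coefficients."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

# Illustrative signal: exactly one cosine cycle across an 8-sample window.
signal = [math.cos(2 * math.pi * t / 8) for t in range(8)]
coeffs = dft(signal)

# Energy lands in bins 1 and 7 (the mirrored negative frequency); the rest are ~0.
magnitudes = [round(abs(c), 6) for c in coeffs]
```

Because the window holds a whole number of cycles, wrapping the samples into a ring gives a truly periodic signal and only two bins are non-zero.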

DCT (Discrete Cosine Transform)

Compression?

64 pixels -> 64 coefficients
but
- most coefficients will be zero or close to zero
- statistically, the further from the top-left, the smaller the magnitude
compression
- run-length coding
- Huffman coding gain
- quantization of coefficient wordlengths
  - amount weighted according to visibility by a human observer
- quantization is not reversible in the decoder (lossy)
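
The chain above (64 pixels to 64 coefficients, quantization, then run-length coding) can be sketched in plain Python. This is a minimal illustration under stated assumptions: the quantizer steps (`step_base`, `step_slope`) are a made-up frequency weighting, not an MPEG quantization matrix, and the `"EOB"` marker and test block are hypothetical. Huffman coding of the resulting pairs is left out.

```python
import math

N = 8

def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct and unoptimized)."""
    def c(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out

def quantize(coeffs, step_base=8, step_slope=4):
    """Coarser steps for higher frequencies (assumed weighting, for illustration)."""
    return [[int(round(coeffs[u][v] / (step_base + step_slope * (u + v))))
             for v in range(N)] for u in range(N)]

def zigzag(block):
    """Read coefficients along anti-diagonals, starting at the top-left."""
    order = sorted(((u, v) for u in range(N) for v in range(N)),
                   key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]))
    return [block[u][v] for u, v in order]

def run_length(seq):
    """Encode as (zero_run, value) pairs; trailing zeros collapse to 'EOB'."""
    pairs, run = [], 0
    for value in seq:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")
    return pairs

# Illustrative block: a smooth brightness ramp, typical of low-detail picture areas.
block = [[100 + 4 * x + 2 * y for x in range(N)] for y in range(N)]
scanned = zigzag(quantize(dct_2d(block)))
pairs = run_length(scanned)
```

For a smooth block like this one, only a few low-frequency coefficients near the top-left survive quantization, so the 64 scanned values collapse into a short list of pairs.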

Sequences

try to predict the next picture from a previous picture
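send a difference picture
- difference between the prediction from the previously coded picture and the next picture
  - the original picture is not available at the decoder, so the encoder predicts from the picture as decoded
difference picture should also contain spatial redundancy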
- encode the difference picture before sending
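
A small sketch of why the difference is taken against the coded picture rather than the original: the encoder tracks the decoder's reconstruction, so quantization error does not accumulate from picture to picture. Pictures are modeled here as short lists of sample values, and a simple quantizer with an assumed `step` of 4 stands in for the transform coding of the difference picture; none of these names come from the original notes.

```python
def quantize(values, step):
    """Lossy stand-in for transform coding of the difference picture."""
    return [int(round(v / step)) for v in values]

def dequantize(levels, step):
    return [level * step for level in levels]

def encode_sequence(pictures, step=4):
    """Return quantized difference pictures, predicting from the decoded picture."""
    reference = [0] * len(pictures[0])   # what the decoder will also hold
    coded = []
    for picture in pictures:
        diff = [p - r for p, r in zip(picture, reference)]
        levels = quantize(diff, step)
        coded.append(levels)
        # Track the decoder's reconstruction, not the original picture.
        reference = [r + d for r, d in zip(reference, dequantize(levels, step))]
    return coded

def decode_sequence(coded, step=4):
    reference = [0] * len(coded[0])
    decoded = []
    for levels in coded:
        reference = [r + d for r, d in zip(reference, dequantize(levels, step))]
        decoded.append(reference)
    return decoded

pictures = [[10, 20, 30, 40], [12, 21, 33, 41], [15, 22, 35, 43]]
decoded = decode_sequence(encode_sequence(pictures))

# The error stays within half a quantizer step for every picture in the sequence.
max_error = max(abs(p - d)
                for pic, dec in zip(pictures, decoded)
                for p, d in zip(pic, dec))
```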

Motion-compensated inter-frame prediction

divide screen into areas called macroblocks (16x16)
each macroblock steered by a motion vector
vector gives offset in another picture to find pixel data
- fetch from another picture
motion vector overhead can account for 1/3 of bitrate in a "high-action" sequence
practical search ranges between +/-15 and +/-32 pixels
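
One simple way to find a motion vector is an exhaustive block-matching search that minimizes the sum of absolute differences (SAD). The sketch below is an assumption about how an encoder might do this, not a method stated in the notes; real encoders usually use faster search strategies. The function names, the 48x48 test frame, and the synthetic shift are all illustrative.

```python
import random

def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(size) for i in range(size))

def find_motion_vector(cur, ref, cx, cy, size=16, search=15):
    """Exhaustive search for the offset in `ref` that best matches the block at (cx, cy)."""
    height, width = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = sad(cur, ref, cx, cy, cx, cy, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that fall outside the reference picture.
            if 0 <= rx <= width - size and 0 <= ry <= height - size:
                cost = sad(cur, ref, cx, cy, rx, ry, size)
                if cost < best_cost:
                    best, best_cost = (dx, dy), cost
    return best, best_cost

# Synthetic test: the current picture is the reference moved 3 right and 2 down.
rng = random.Random(0)
W = H = 48
ref = [[rng.randrange(256) for _ in range(W)] for _ in range(H)]
cur = [[ref[(y - 2) % H][(x - 3) % W] for x in range(W)] for y in range(H)]

vector, cost = find_motion_vector(cur, ref, 16, 16)
```

Here the vector points back to where the macroblock's content sat in the reference picture, so the decoder only needs the vector plus a (here empty) difference block.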

Types of pictures

I or 'intra' pictures
- coded w/o reference to any other pictures
- just reduce spatial redundancy
- access points in the bitstream where decoding can begin
P or 'predictive' pictures
- use previous I or P for motion compensation
- each block is either predicted or intra-coded
B or 'bidirectionally-predictive' pictures
- can use the previous and/or next I/P for motion compensation
- require pictures to be sent out of natural display order (the next I/P must arrive before the B pictures that reference it)
- prediction is a fetch operation

Sequences of pictures

Summary

DBS

broadcast multiple channels over a transponder
- 23 Mbit/sec available
- mix/match to balance entropy
- 1997 - 5-6 channels per trans.
- 1999 - 8-10 channels per trans.
- 2001 - 11+ channels per trans.
take advantage of variable bit rates
- statistical multiplexing
- probability of all channels reaching peak entropy at once is very small
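
A back-of-the-envelope sketch of the statistical multiplexing idea, using the 23 Mbit/sec figure from above. The proportional allocation rule, the complexity values, and the 10% peak probability are hypothetical, and the channels are assumed to be independent.

```python
TRANSPONDER_MBIT = 23  # transponder capacity quoted above

def average_rate(channels):
    """Average bitrate per channel if the transponder is shared evenly."""
    return TRANSPONDER_MBIT / channels

def allocate(complexities, total=TRANSPONDER_MBIT):
    """Share the transponder in proportion to each channel's current complexity."""
    scale = total / sum(complexities)
    return [c * scale for c in complexities]

def prob_all_peak(channels, p_peak):
    """Probability that every channel peaks together, assuming independence."""
    return p_peak ** channels

# Hypothetical moment in time: one demanding channel, several easy ones.
rates = allocate([5, 1, 1, 2, 1])
simultaneous_peak = prob_all_peak(10, 0.1)
```

With ten channels that each peak 10% of the time, the chance of all peaking at once is about one in ten billion, which is why a busy channel can usually borrow bits from quieter ones.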

DVDs

12cm dia disc, short laser wavelength, finer track pitch, better optics
- approx. 5GB of storage -> movie at 5Mbit/sec
- disc thickness reduced to 0.6mm (can glue 2 together for a 1.2mm disc)
  - approx 10GB of storage -> movie at 10Mbit/sec
film performs well with MPEG
- frame rate of 24Hz vs 30Hz (20% savings)
- film source progressive (no interlacing for motion estimation)
- pre-digital source oversampled (very high quality signal)
- focusing, motion blur, etc from film more amenable to the transform and quantization of MPEG
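
The capacity and frame-rate figures above can be checked with a little arithmetic. This sketch assumes 1 GB means 10^9 bytes and ignores audio and overhead; the function name is illustrative.

```python
def playing_time_minutes(capacity_gb, bitrate_mbit):
    """Minutes of video that fit, treating 1 GB as 10**9 bytes."""
    bits = capacity_gb * 1e9 * 8
    return bits / (bitrate_mbit * 1e6) / 60

single = playing_time_minutes(5, 5)     # one 0.6mm layer at 5 Mbit/sec
double = playing_time_minutes(10, 10)   # two bonded layers at 10 Mbit/sec
film_saving = 1 - 24 / 30               # fraction of frames saved by 24Hz film
```

Both configurations give a bit over two hours of video, so doubling the storage can be spent on doubling the bitrate for a feature-length film.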