MPEG (Moving Picture Experts Group)

- 1988 working group within ISO
- borrows significantly from JPEG (lossy)
- encoder is algorithmic and adaptive
- decoder is "dumb"
- perfect for broadcasting (few encoders vs. many decoders)
- MPEG defines the bitstream - not encoders/decoders

Pathway to compression

- spatial and temporal redundancy
- pixel values are not independent
- correlated to neighbors in the same frame
- correlated to neighbors across frames

- psychovisual redundancy
- human eye has a limited response to fine spatial detail
- human eye is less sensitive to detail around object edges

Spatial redundancy

- transform to the frequency domain
- Fourier analysis - any periodic waveform can be reproduced by
adding together an arbitrary number of harmonically related
sinusoids of various amplitudes and phases
- does not by itself result in compression (usually increases the amount of data)
- samples are not periodic

STFT (short-time Fourier Transform)

- break up the continuous time domain with windows
- rectangular
- # of frequencies depends on size of window

- wrap samples into ring to make "continuous"
- DFT (discrete Fourier transform)
- we want # input samples == # of frequency coefficients
- FFT is a fast way to compute DFT
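
The "# input samples == # of frequency coefficients" point can be sketched with a direct DFT. This is a minimal illustration, not an FFT: the function name dft and the 8-sample cosine window are assumptions made for the example.

```python
import cmath
import math

def dft(samples):
    """Direct O(N^2) DFT: N input samples give N complex coefficients."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(samples))
        for k in range(n)
    ]

# Hypothetical window: exactly one cycle of a cosine over 8 samples.
window = [math.cos(2 * math.pi * t / 8) for t in range(8)]
coeffs = dft(window)

# Same number of coefficients as input samples.
print(len(window), len(coeffs))
```

Because the window holds a whole number of cycles, the energy lands in a single pair of coefficients (index 1 and its mirror, index 7); an FFT would give the same numbers with fewer operations.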

DCT (Discrete Cosine Transform)

- special case of DFT where the sine components are eliminated
- repeat original samples in time-reversed order
- perform DFT (8x8 block of pixels)
- produces as many useful coefficients as input samples
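
The "repeat in time-reversed order, then perform DFT" recipe can be checked numerically in one dimension. The sketch below is an assumption-laden illustration: it uses an unnormalized DCT-II, one row of 8 made-up pixel values, and hypothetical function names. An 8x8 block would apply the same idea to rows and then columns.

```python
import cmath
import math

def dct_via_mirrored_dft(samples):
    """DCT-II obtained from a DFT of the samples followed by their mirror image."""
    n = len(samples)
    mirrored = list(samples) + list(reversed(samples))
    m = 2 * n
    out = []
    for k in range(n):
        y_k = sum(v * cmath.exp(-2j * cmath.pi * k * t / m)
                  for t, v in enumerate(mirrored))
        # A half-sample phase shift leaves a purely real (cosine-only) term.
        out.append(0.5 * (y_k * cmath.exp(-1j * cmath.pi * k / m)).real)
    return out

def dct_direct(samples):
    """Unnormalized DCT-II computed from its cosine definition."""
    n = len(samples)
    return [
        sum(x * math.cos(math.pi * (t + 0.5) * k / n)
            for t, x in enumerate(samples))
        for k in range(n)
    ]

row = [52, 55, 61, 66, 70, 61, 64, 73]  # hypothetical pixel row
via_dft = dct_via_mirrored_dft(row)
direct = dct_direct(row)
```

The mirrored sequence is symmetric, which is why the sine terms cancel and only 8 of the 16 DFT outputs carry independent information.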

Compression?

- 64 pixels -> 64 coefficients
- but
- most coefficients will be 0 or close to 0
- statistically, the further from the top-left, the smaller the magnitude

- compression
- run-length coding
- Huffman coding gain
- quantization of coefficient wordlengths
- amount weighted according to visibility by a human observer

- quantization is not reversible in the decoder (lossy)
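
A rough sketch of how 64 coefficients shrink to a handful of symbols: transform, quantize with coarser steps away from the top-left, scan in zigzag order, then run-length code the zeros. The quantizer weighting (base, slope), the gradient test block, and the "EOB" marker are assumptions for illustration and are not taken from any MPEG table. The Huffman stage is left out.

```python
import math

N = 8  # block size used in the notes (8x8 pixels)

def dct_2d(block):
    """Orthonormal 8x8 DCT-II, computed directly from the definition."""
    def c(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos(math.pi * (y + 0.5) * u / N)
                          * math.cos(math.pi * (x + 0.5) * v / N))
            out[u][v] = c(u) * c(v) * s
    return out

def quantize(coeffs, base=8, slope=4):
    """Hypothetical weighting: step size grows with distance from top-left."""
    return [[round(coeffs[u][v] / (base + slope * (u + v)))
             for v in range(N)] for u in range(N)]

def zigzag(matrix):
    """Read the block diagonal by diagonal, low frequencies first."""
    order = sorted(
        ((u, v) for u in range(N) for v in range(N)),
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )
    return [matrix[u][v] for u, v in order]

def run_length(seq):
    """Emit (zero_run, value) pairs; trailing zeros collapse into 'EOB'."""
    pairs, run = [], 0
    for value in seq:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")
    return pairs

# Hypothetical smooth block: a gentle brightness ramp.
block = [[100 + 4 * x + 2 * y for x in range(N)] for y in range(N)]
levels = quantize(dct_2d(block))
encoded = run_length(zigzag(levels))
print(encoded)
```

For a smooth block like this one, only the first few zigzag positions survive quantization, so the run-length output is far shorter than 64 values. The rounding in quantize is the step the decoder cannot undo.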

Sequences

- try to predict the next picture from a previous picture
- send a difference picture
- difference between the previously coded (decoded) picture and the next picture
- the original picture is not available at the decoder, so prediction must start from the decoded one

- the difference picture should also contain spatial redundancy
- encode the difference picture before sending
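
The reason the encoder predicts from the coded picture rather than the original can be sketched with a toy one-dimensional "picture". Everything here is an assumption for illustration: the uniform quantizer stands in for the full transform coding of the difference picture, the first picture is predicted from zeros instead of being intra-coded, and the function names are hypothetical.

```python
def quantize(values, step):
    return [round(v / step) for v in values]

def dequantize(levels, step):
    return [level * step for level in levels]

def encode_sequence(pictures, step=4):
    """Yield quantized difference pictures, predicting from the decoded picture."""
    reference = [0] * len(pictures[0])  # what the decoder holds initially
    for picture in pictures:
        diff = [p - r for p, r in zip(picture, reference)]
        levels = quantize(diff, step)
        yield levels
        # Track the decoder's reconstruction, not the original picture.
        reference = [r + d for r, d in zip(reference, dequantize(levels, step))]

def decode_sequence(level_stream, size, step=4):
    """Rebuild each picture by adding the received difference to the previous one."""
    reference = [0] * size
    for levels in level_stream:
        reference = [r + d for r, d in zip(reference, dequantize(levels, step))]
        yield list(reference)

pictures = [[10, 20, 30, 40], [12, 21, 33, 41], [15, 25, 35, 45]]
decoded = list(decode_sequence(encode_sequence(pictures), size=4))
worst_error = max(abs(p - d)
                  for pic, dec in zip(pictures, decoded)
                  for p, d in zip(pic, dec))
```

Because encoder and decoder share the same reference, the error in every picture stays within half a quantizer step instead of accumulating from picture to picture.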

Motion-compensated inter-frame prediction

- divide screen into areas called macroblocks (16x16)
- each macroblock steered by a motion vector
- vector gives offset in another picture to find pixel data
- fetch from another picture

- motion vector overhead can account for 1/3 of bitrate in a "high-action" sequence
- practical search ranges between +/-15 and +/-32 pixels
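
How an encoder might pick a motion vector for one macroblock can be sketched as an exhaustive block-matching search. This is only one possible approach (the bitstream does not dictate the search method). The sum-of-absolute-differences cost, the function names, and the random test frame are assumptions; the 16x16 block and +/-15 range come from the notes.

```python
import random

def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between a current block and a reference block."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(size) for i in range(size))

def find_motion_vector(cur, ref, cx, cy, size=16, search=15):
    """Try every offset in the search range; return (best_vector, best_cost)."""
    height, width = len(ref), len(ref[0])
    best_vector, best_cost = None, float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue  # candidate block falls outside the reference picture
            cost = sad(cur, ref, cx, cy, rx, ry, size)
            if cost < best_cost:
                best_vector, best_cost = (dx, dy), cost
    return best_vector, best_cost

# Hypothetical test data: the current picture is the reference, shifted so that
# the content at (x, y) is found in the reference at (x + 3, y - 2).
rng = random.Random(0)
SIZE = 48
ref = [[rng.randrange(256) for _ in range(SIZE)] for _ in range(SIZE)]
cur = [[ref[(y - 2) % SIZE][(x + 3) % SIZE] for x in range(SIZE)]
       for y in range(SIZE)]

vector, cost = find_motion_vector(cur, ref, cx=16, cy=16)
print(vector, cost)
```

The decoder never performs this search. It only uses the transmitted vector as an offset to fetch pixel data, which is why the decoder can stay "dumb".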

Types of pictures

- I or 'intra' pictures
- coded w/o reference to any other pictures
- just reduce spatial redundancy
- access points in the bitstream where decoding can begin

- P or 'predictive' pictures
- use previous I or P for motion compensation
- each block is either predicted or intra-coded

- B or 'bidirectionally-predictive' pictures
- can use the previous and/or next I/P for motion compensation
- can require reordering from natural display order (the next I/P must be sent before the B pictures that use it)
- prediction is a fetch operation
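
The reordering caused by B pictures can be sketched as follows: each B picture is held back until the I or P picture it depends on has been sent. The picture labels and the function name are hypothetical, and the handling of trailing B pictures is a simplification.

```python
def coding_order(display):
    """Reorder pictures so each B follows both of its reference pictures."""
    out, pending_b = [], []
    for pic in display:
        if pic.startswith("B"):
            pending_b.append(pic)      # wait for the next I/P
        else:
            out.append(pic)            # send the reference picture first
            out.extend(pending_b)      # then the B pictures that use it
            pending_b = []
    out.extend(pending_b)              # trailing B pictures (sketch only)
    return out

display = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
print(coding_order(display))
# ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5']
```

The decoder has to buffer the later reference picture and restore display order on output.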

Sequences of pictures

- a sequence may consist of almost any pattern of I, P, and B pictures
- in a typical sequence, relative sizes of pictures are
- I = 3*P
- P = 1.5*B
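
The ratios above (I = 3*P, P = 1.5*B, so I = 4.5*B) can be turned into a rough bit budget per picture type. The 15-picture pattern, the 2.5 Mbit budget, and the function name are assumptions chosen for the example.

```python
def picture_sizes(pattern, bits_for_pattern):
    """Split a bit budget across picture types using the typical size ratios."""
    weights = {"I": 4.5, "P": 1.5, "B": 1.0}  # sizes in units of one B picture
    total = sum(weights[t] for t in pattern)
    return {t: bits_for_pattern * w / total for t, w in weights.items()}

# Hypothetical pattern: 1 I, 4 P and 10 B pictures sharing 2.5 Mbit.
pattern = "IBBPBBPBBPBBPBB"
sizes = picture_sizes(pattern, 2_500_000)
for kind in "IPB":
    print(kind, round(sizes[kind]))
```

With this pattern the single I picture takes 4.5 of 20.5 units, a little over a fifth of the bits, even though it is only one picture in fifteen.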

Summary

- lossless compression generally limited to around 2:1
- compression "sweet spot" of MPEG is between 8:1 and 30:1
- standard VHS - approx 1.5 Mbit/sec
- broadcast NTSC - approx 3 Mbit/sec
- sports/high temporal activity - approx 5-6 Mbit/sec
- Betacam (90 Mbit/sec) - approx 10 Mbit/sec

DBS

- broadcast multiple channels over a transponder
- 23 Mbit/sec available
- mix/match to balance entropy
- 1997 - 5-6 channels per transponder
- 1999 - 8-10 channels per transponder
- 2001 - 11+ channels per transponder

- take advantage of variable bit rates
- statistical multiplexing
- probability of all channels reaching peak entropy at once is very small
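
The statistical multiplexing argument can be illustrated with a small simulation. Only the 23 Mbit/sec capacity comes from the notes. The six channels, the 1-5 Mbit/sec demand range, and the uniform distribution are assumptions made purely for illustration.

```python
import random

CAPACITY = 23.0  # Mbit/sec per transponder, from the notes

def overflow_fraction(channels, low, high, trials=20000, seed=1):
    """Fraction of trials in which total demand exceeds the transponder capacity."""
    rng = random.Random(seed)
    over = 0
    for _ in range(trials):
        demand = sum(rng.uniform(low, high) for _ in range(channels))
        if demand > CAPACITY:
            over += 1
    return over / trials

# Hypothetical mix: 6 channels, each needing between 1 and 5 Mbit/sec.
fraction = overflow_fraction(channels=6, low=1.0, high=5.0)
print(f"peak demand {6 * 5.0} Mbit/sec, overflow in {fraction:.1%} of trials")
```

The peaks add up to more than the transponder can carry, yet under these assumptions the total only rarely exceeds capacity, which is what lets the multiplexer trade bits between channels.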

DVDs

- 12cm dia disc, short laser wavelength, finer track pitch, better optics
- approx. 5GB of storage -> movie at 5Mbit/sec
- disc thickness reduced to 0.6mm (two can be glued together for a 1.2mm disc)
- approx 10GB of storage -> movie at 10Mbit/sec
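
The storage figures above can be checked with simple arithmetic. The sketch assumes decimal gigabytes (10^9 bytes) and a constant bit rate; the function name is hypothetical.

```python
def playing_time_minutes(capacity_gb, bitrate_mbit_per_sec):
    """Minutes of video that fit on a disc at a constant bit rate."""
    bits = capacity_gb * 1e9 * 8
    return bits / (bitrate_mbit_per_sec * 1e6) / 60

print(round(playing_time_minutes(5, 5)))    # single 0.6mm layer at 5 Mbit/sec
print(round(playing_time_minutes(10, 10)))  # two layers glued, 10 Mbit/sec
```

Both cases come out at roughly 133 minutes, about the length of a feature film. The doubled capacity can therefore be spent on a higher bit rate rather than a longer running time.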

- film performs well with MPEG
- frame rate of 24Hz vs 30Hz (20% savings)
- film source progressive (no interlacing for motion estimation)
- pre-digital source oversampled (very high quality signal)
- focusing, motion blur, etc from film more amenable to the transform and quantization of MPEG