MPEG-2
MPEG (Moving Picture Experts Group)
- 1988 working group within ISO
- borrows significantly from JPEG (lossy)
- encoder is algorithmic and adaptive
- decoder is "dumb"
- perfect for broadcasting (few encoders vs. many decoders)
- MPEG defines the bitstream - not encoders/decoders
Pathway to compression
- spatial and temporal redundancy
    - pixel values are not independent
        - correlated to neighbors in the same frame
        - correlated to neighbors across frames
- psychovisual redundancy
    - human eye has a limited response to fine spatial detail
    - human eye is less sensitive to detail around object edges
    
 
Spatial redundancy
- transform to the frequency domain
    - Fourier analysis - any periodic waveform can be reproduced by
      adding together an arbitrary number of harmonically related
      sinusoids of various amplitudes and phases
      - does not result in compression (usually increases the amount of data)
      - samples are not periodic
      
 
 
STFT (short-time Fourier transform)
- break up the continuous time domain with windows
    - rectangular
    - # of frequencies depends on size of window
- wrap samples into ring to make "continuous"
- DFT (discrete Fourier transform)
    - we want # input samples == # of frequency coefficients
    - FFT is a fast way to compute DFT
    
 
 
 
 
 
 
 
 
 
DCT (Discrete Cosine Transform)
- special case of DFT where the sine components are eliminated
- repeat original samples in time-reversed order
- perform DFT (8x8 block of pixels)
- produces as many useful coefficients as input samples
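
The mirroring step above can be checked numerically. The following is a minimal one-dimensional sketch, assuming an unnormalized DCT-II and a direct O(N^2) DFT rather than an FFT. The function names and the sample row are hypothetical and chosen only for illustration.

```python
import cmath
import math

def dft(samples):
    """Direct O(N^2) discrete Fourier transform."""
    n = len(samples)
    return [
        sum(s * cmath.exp(-2j * math.pi * k * t / n) for t, s in enumerate(samples))
        for k in range(n)
    ]

def dct_via_mirrored_dft(samples):
    """Unnormalized DCT-II computed from a DFT of the mirrored samples."""
    n = len(samples)
    # Append the samples in time-reversed order: 2N samples with even symmetry.
    mirrored = list(samples) + list(reversed(samples))
    spectrum = dft(mirrored)
    # A half-sample phase shift leaves only cosine (real) terms.
    return [
        (cmath.exp(-1j * math.pi * k / (2 * n)) * spectrum[k]).real / 2
        for k in range(n)
    ]

def dct_direct(samples):
    """Unnormalized DCT-II from its cosine-sum definition, for comparison."""
    n = len(samples)
    return [
        sum(s * math.cos(math.pi * (2 * t + 1) * k / (2 * n))
            for t, s in enumerate(samples))
        for k in range(n)
    ]

row = [52, 55, 61, 66, 70, 61, 64, 73]  # hypothetical row of 8 pixel values
coeffs = dct_via_mirrored_dft(row)
```

Eight input samples give eight coefficients, and they agree with `dct_direct(row)` to within rounding error. An 8x8 block would apply the same transform along rows and then along columns.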
 
 
 
 
Compression?
- 64 pixels -> 64 coefficients
- but
    - most coefficients will be 0 or nearly 0
    - statistically, the further from the top-left, the
      smaller the magnitude
- compression
    - run-length coding
    - Huffman coding
    - quantization of coefficient wordlengths
        - amount weighted according to visibility by a human observer
        - not reversible in the decoder (lossy)
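
A rough sketch of how quantization and run-length coding combine on one 8x8 block. Several things here are assumptions not taken from the notes: the zigzag scan order, the step sizes (`base_step`, `slope`), the `"EOB"` marker, and the coefficient values. Huffman coding of the resulting symbols is left out.

```python
def zigzag_order(size=8):
    """Visit coefficients from top-left to bottom-right along anti-diagonals."""
    cells = [(r, c) for r in range(size) for c in range(size)]
    return sorted(
        cells,
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )

def quantize(block, base_step=8, slope=4):
    """Divide each coefficient by a step that grows away from the top-left."""
    return [
        [int(round(block[r][c] / (base_step + slope * (r + c))))
         for c in range(len(block[r]))]
        for r in range(len(block))
    ]

def run_length(block):
    """Encode the zigzag scan as (zero_run, level) pairs plus an end marker."""
    pairs, run = [], 0
    for r, c in zigzag_order(len(block)):
        level = block[r][c]
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    pairs.append("EOB")  # all trailing zeros collapse into one symbol
    return pairs

# Hypothetical coefficient block with energy concentrated near the top-left.
coeffs = [[0] * 8 for _ in range(8)]
coeffs[0][0], coeffs[0][1], coeffs[1][0], coeffs[2][1] = 624, -43, 31, 25
symbols = run_length(quantize(coeffs))
```

For this block the 64 coefficients reduce to four (run, level) pairs and one end-of-block marker. The division in `quantize` is the lossy step: the decoder can multiply by the step size but cannot recover the discarded remainder.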
    
 
Sequences
- try to predict the next picture from a previous picture
- send a difference picture
    - difference between the previous picture as coded (i.e. as the decoder reconstructs it) and the next picture
        - original picture not available at the decoder
- difference picture should also contain spatial redundancy
    - encode the difference picture before sending
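
The reason for differencing against the coded picture rather than the original can be sketched as below. This is an illustrative toy, not MPEG itself: `lossy_code` is a hypothetical stand-in for transform plus quantization, and the 4-pixel "pictures" are made up.

```python
def lossy_code(values, step=10):
    """Stand-in for transform + quantization: snap values to multiples of step."""
    return [step * round(v / step) for v in values]

def encode_sequence(pictures, step=10):
    """Return the coded first picture followed by coded difference pictures."""
    reference = lossy_code(pictures[0], step)  # what the decoder will hold
    stream = [reference]
    for picture in pictures[1:]:
        difference = [p - r for p, r in zip(picture, reference)]
        coded_diff = lossy_code(difference, step)
        stream.append(coded_diff)
        # Track the decoder's reconstruction, not the original picture.
        reference = [r + d for r, d in zip(reference, coded_diff)]
    return stream

def decode_sequence(stream):
    """Rebuild each picture by adding the difference to the previous output."""
    reference = stream[0]
    decoded = [reference]
    for coded_diff in stream[1:]:
        reference = [r + d for r, d in zip(reference, coded_diff)]
        decoded.append(reference)
    return decoded

# Hypothetical 4-pixel "pictures" that brighten slowly over 12 frames.
pictures = [[100 + 3 * t, 120 + 3 * t, 90 + 3 * t, 60 + 3 * t] for t in range(12)]
decoded = decode_sequence(encode_sequence(pictures))
worst_error = max(
    abs(p - d) for pic, dec in zip(pictures, decoded) for p, d in zip(pic, dec)
)
```

Because the encoder mirrors the decoder's state, the error in every decoded picture stays within half a quantizer step instead of accumulating from picture to picture.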
    
 
 
Motion-compensated inter-frame prediction
- divide screen into areas called macroblocks (16x16)
- each macroblock steered by a motion vector
- vector gives offset in another picture to find pixel data
    
    - fetch from another picture
    
 
- motion vector overhead can account for 1/3 of bitrate in
  a "high-action" sequence
- practical search ranges fall between +/-15 and +/-32 pixels
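
One way an encoder might choose a motion vector is an exhaustive block-matching search. The sketch below assumes a sum-of-absolute-differences (SAD) cost and a full search, neither of which the notes specify (the standard leaves the search method to the encoder). The test pictures and the small block and search sizes are hypothetical, chosen to keep the pure-Python example fast.

```python
def sad(current, reference, top, left, dy, dx, size):
    """Sum of absolute differences between a block and a displaced reference block."""
    return sum(
        abs(current[top + r][left + c] - reference[top + dy + r][left + dx + c])
        for r in range(size)
        for c in range(size)
    )

def find_motion_vector(current, reference, top, left, size=16, search=15):
    """Exhaustively test every offset within +/-search and keep the best match."""
    height, width = len(reference), len(reference[0])
    best_cost, best_vector = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip offsets that would fetch pixels outside the reference picture.
            if not (0 <= top + dy <= height - size and 0 <= left + dx <= width - size):
                continue
            cost = sad(current, reference, top, left, dy, dx, size)
            if best_cost is None or cost < best_cost:
                best_cost, best_vector = cost, (dy, dx)
    return best_vector, best_cost

# Hypothetical pictures: the current picture is the reference moved 2 rows down
# and 3 columns right, so the matching data sits at offset (-2, -3).
H, W = 24, 24
reference = [[(r * 37 + c * 91 + r * c * 7) % 256 for c in range(W)] for r in range(H)]
current = [[reference[(r - 2) % H][(c - 3) % W] for c in range(W)] for r in range(H)]
vector, cost = find_motion_vector(current, reference, top=8, left=8, size=8, search=4)
```

The decoder never repeats this search. It only receives the vector and fetches the pixel data at that offset, which is why the expensive work stays in the encoder.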
 
Types of pictures
- I or 'intra' pictures
    - coded w/o reference to any other pictures
    - just reduce spatial redundancy
    - access points in the bitstream where decoding can begin
- P or 'predictive' pictures
    - use previous I or P for motion compensation
    - each block is either predicted or intra-coded
- B or 'bidirectionally-predictive' pictures
    - can use the previous and/or next I/P for motion compensation
    - can cause a reorder from natural display order
    - prediction is a fetch operation
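
The reordering caused by B pictures can be sketched as follows: a B picture cannot be decoded until the I or P picture that follows it in display order has arrived, so that reference is sent first. The function name and the example pattern are hypothetical.

```python
def coded_order(display_types):
    """Reorder display-order picture types into a decodable transmission order.

    Each B picture is held back until the next I or P picture (its forward
    reference) has been emitted.
    """
    coded, pending_b = [], []
    for index, kind in enumerate(display_types):
        if kind == "B":
            pending_b.append((index, kind))
        else:
            coded.append((index, kind))  # send the reference picture first
            coded.extend(pending_b)      # then the B pictures that precede it
            pending_b = []
    coded.extend(pending_b)  # trailing B pictures with no later reference
    return coded

display = ["I", "B", "B", "P", "B", "B", "P"]
order = coded_order(display)
```

For the display pattern I B B P B B P, the pictures are sent in the order 0, 3, 1, 2, 6, 4, 5, and the decoder restores display order after decoding.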
    
 
Sequences of pictures
- a sequence may consist of almost any pattern of I, P, and B pictures
- in a typical sequence, relative sizes of pictures are
    
Summary
- lossless compression generally limited to around 2:1
- compression "sweet spot" of MPEG is between 8:1 and 30:1
- standard VHS - approx 1.5 Mbit/sec
- broadcast NTSC - approx 3 Mbit/sec
- sports/high temporal activity - approx 5-6 Mbit/sec
- Betacam (90 Mbit/sec) - approx 10 Mbit/sec
DBS (direct broadcast satellite)
- broadcast multiple channels over a transponder
    - 23 Mbit/sec available
    - mix/match to balance entropy
    - 1997 - 5-6 channels per transponder
    - 1999 - 8-10 channels per transponder
    - 2001 - 11+ channels per transponder
- take advantage of variable bit rates
    - statistical multiplexing
    - probability of all channels reaching peak entropy at once is
      very small
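
The arithmetic behind these figures can be sketched as below. The 23 Mbit/sec capacity and the channel counts come from the notes. The even split, the independence of channels, and the 10% peak probability are assumptions made only for illustration.

```python
TRANSPONDER_MBIT = 23.0  # capacity quoted in the notes

def average_rate_per_channel(channels, capacity=TRANSPONDER_MBIT):
    """Average bit rate each channel gets if the capacity is shared evenly."""
    return capacity / channels

def probability_all_peak(channels, peak_probability):
    """Chance that every channel peaks at once, assuming independent channels."""
    return peak_probability ** channels

# Average Mbit/sec per channel at the upper channel counts listed above.
rates = {
    year: average_rate_per_channel(n)
    for year, n in [(1997, 6), (1999, 10), (2001, 11)]
}

# Hypothetical: each of 10 channels is at its peak 10% of the time.
all_peak = probability_all_peak(channels=10, peak_probability=0.1)
```

At 10 channels the average is 2.3 Mbit/sec each, well below the 5-6 Mbit/sec quoted for high-activity material, which only works because a statistical multiplexer lends bits from quiet channels to busy ones.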
    
 
DVDs
- 12cm dia disc, short laser wavelength, finer track pitch, better optics
    - approx. 5GB of storage -> movie at 5Mbit/sec
    - disc thickness reduced to 0.6mm (can glue 2 together for 1.2mm disc)
        - approx. 10GB of storage -> movie at 10Mbit/sec
- film performs well with MPEG
    - frame rate of 24Hz vs 30Hz (20% savings)
    - film source is progressive (no interlacing to complicate motion estimation)
    - pre-digital source is oversampled (very high quality signal)
    - focusing, motion blur, etc. from film are more amenable to the transform
      and quantization of MPEG
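
The capacity and frame-rate figures above work out as follows. This is a back-of-envelope sketch that assumes 1 GB = 10^9 bytes and 1 Mbit = 10^6 bits and ignores audio and other overhead. The function names are hypothetical.

```python
def playing_time_minutes(capacity_gb, bitrate_mbit):
    """Minutes of video that fit, taking 1 GB = 10**9 bytes and 1 Mbit = 10**6 bits."""
    seconds = capacity_gb * 1e9 * 8 / (bitrate_mbit * 1e6)
    return seconds / 60

def frame_rate_saving(film_hz=24, video_hz=30):
    """Fraction of pictures saved by coding at the film rate, not the video rate."""
    return 1 - film_hz / video_hz

single = playing_time_minutes(5, 5)    # one 0.6 mm substrate
double = playing_time_minutes(10, 10)  # two substrates glued together
saving = frame_rate_saving()
```

Both configurations come to roughly 133 minutes, about one feature film. The doubled capacity is spent on doubling the bit rate rather than the running time.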