**Next:** Arithmetic and Logical Operations
**Up:** Floating Point
**Previous:** Decimal to Floating Point

## IEEE Floating Point Standard
The IEEE Floating Point Standard (FPS), IEEE 754, is the most widely
accepted standard representation for floating point numbers. The standard
provides definitions for *single precision* and *double precision*
representations.

The single precision IEEE FPS format is composed of 32 bits, divided
into a 23 bit mantissa, M, an 8 bit exponent, E, and a sign bit, S:

| Bit 31 | Bits 30-23 | Bits 22-0 |
|---|---|---|
| S (sign) | E (exponent) | M (mantissa) |

The normalized mantissa, m, is stored in bits 0-22 with the hidden
bit (the leading 1 of m) omitted. Thus M = m-1.

The exponent, e, is represented as a bias-127 integer in bits 23-30.
Thus, E = e+127.

The sign bit, S, indicates the sign of the mantissa, with S=0 for
positive values and S=1 for negative values.
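The field layout above can be illustrated with a minimal Python sketch. It assumes the standard `struct` module is used to obtain the 32 bit pattern of a value; the helper names `fps_fields` and `fps_value` are hypothetical and are not part of the standard.

```python
import struct

def fps_fields(x):
    """Return (S, E, M) for x stored as an IEEE single precision value."""
    # Pack x as a big-endian 32 bit float, then reread the same bytes
    # as an unsigned 32 bit integer to get at the raw bits.
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    S = word >> 31             # bit 31
    E = (word >> 23) & 0xFF    # bits 23-30
    M = word & 0x7FFFFF        # bits 0-22
    return S, E, M

def fps_value(S, E, M):
    """Rebuild a normalized value (0 < E < 255) from its three fields."""
    m = 1 + M / 2**23          # restore the hidden bit: m = M + 1
    e = E - 127                # remove the bias: e = E - 127
    return (-1) ** S * m * 2.0 ** e

print(fps_fields(1.0))         # (0, 127, 0)
print(fps_fields(-2.0))        # (1, 128, 0)
```

Here M is returned as a 23 bit integer, so the fraction it stands for is M / 2^23.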

Zero is represented by E = M = 0. Since S may be 0 or 1, there are
different representations for +0 and -0.

The maximum value, E = 255, is reserved for special values (usually
the result of floating point arithmetic), such as *overflow* results
whose exponents are too large to be represented.

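The special interpretations for E = 255 and M = 0 are $+\infty$ for
S = 0 and $-\infty$ for S = 1. Dividing a nonzero number by zero
produces one of these infinities. An invalid operation such as dividing
zero by zero produces a number with E = 255 and nonzero M called
NaN (Not a Number).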
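As a rough check of the reserved encodings, the following Python sketch builds each bit pattern from its S, E, and M fields and asks the hardware how it interprets the result. The helper `from_fields` is a hypothetical name, and the nonzero M chosen for the NaN (only the top mantissa bit set) is just one of many possible values.

```python
import math
import struct

def from_fields(S, E, M):
    """Assemble S, E, M into a 32 bit word and reinterpret it as a float."""
    word = (S << 31) | (E << 23) | M
    (x,) = struct.unpack(">f", struct.pack(">I", word))
    return x

pos_zero = from_fields(0, 0, 0)        # E = M = 0, S = 0
neg_zero = from_fields(1, 0, 0)        # E = M = 0, S = 1
pos_inf = from_fields(0, 255, 0)       # E = 255, M = 0, S = 0
neg_inf = from_fields(1, 255, 0)       # E = 255, M = 0, S = 1
nan = from_fields(0, 255, 1 << 22)     # E = 255, nonzero M (assumed value)

print(pos_zero, neg_zero, pos_inf, neg_inf, nan)
```

The two zeros compare equal, but `math.copysign(1, neg_zero)` recovers the sign bit and shows that the representations differ.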

To convert decimal 17.15 to IEEE FPS:

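- Convert decimal 17 to binary 10001. Convert decimal 0.15 to the
repeating binary fraction $0.00\overline{1001}$. Combine integer
and fraction to obtain binary $10001.00\overline{1001}$.
- Normalize the binary number to obtain $1.000100\overline{1001} \times 2^{4}$.
Thus, M = m-1 = 0.0001 0010 0110 0110 0110 011 (truncated to 23 bits) and
E = e+127 = 131 = 1000 0011.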
- The number is positive, so S=0.
- Align the values for M, E, and S in the correct fields.

The resulting 32 bit word is 0100 0001 1000 1001 0011 0011 0011 0011.
The hexadecimal value is 0x41893333.
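The conversion steps can be sketched in Python as follows. This is an illustration under stated assumptions, not a complete encoder: the function name `encode_fps` is hypothetical, exact rational arithmetic from the standard `fractions` module stands in for the hand calculation, the mantissa is truncated rather than rounded, and only values in the normalized range are handled.

```python
from fractions import Fraction
import struct

def encode_fps(decimal_text):
    """Encode a nonzero decimal string as a 32 bit IEEE FPS word (truncating)."""
    x = Fraction(decimal_text)
    if x == 0:
        raise ValueError("zero is a special case (E = M = 0)")
    S = 1 if x < 0 else 0          # sign bit
    x = abs(x)
    e = 0
    while x >= 2:                  # normalize so that 1 <= m < 2
        x /= 2
        e += 1
    while x < 1:
        x *= 2
        e -= 1
    M = int((x - 1) * 2**23)       # drop the hidden bit, keep 23 bits
    E = e + 127                    # bias-127 exponent
    return (S << 31) | (E << 23) | M

word = encode_fps("17.15")
print(hex(word))                   # 0x41893333

# Cross-check against the hardware conversion of the same value.
(hardware,) = struct.unpack(">I", struct.pack(">f", 17.15))
print(hex(hardware))               # 0x41893333
```

Truncation and round-to-nearest agree for 17.15 because the discarded bits are less than half of the last stored bit; for other values the two can differ by one in the last bit of M.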

The range of values for the mantissa, m, is between 1 and $2 - 2^{-23}$.

Because E=0 and E=255 are reserved, the range of values for the exponent, e,
is between -126 and +127.

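The largest positive number that can be represented is
$(2 - 2^{-23}) \times 2^{127}$, which is approximately $2^{128}$.
The decimal value of this number is approximately $3.4 \times 10^{38}$
since $2^{128} = 10^{128 \log_{10} 2} \approx 10^{38.53}$.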
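The limits on m and e can be checked numerically with a short Python sketch. It assumes the largest normalized value uses E = 254 (since E = 255 is reserved) together with an all-ones mantissa field; the variable names are illustrative only.

```python
import math
import struct

E = 254                    # largest unreserved exponent field, so e = 127
M = 0x7FFFFF               # all 23 mantissa bits set

m = 1 + M / 2**23          # equals 2 - 2**-23
largest = m * 2.0 ** (E - 127)

# Ask the hardware what the same bit pattern means.
(hardware,) = struct.unpack(">f", struct.pack(">I", (E << 23) | M))

print(largest)
print(hardware)
print(math.log10(2**128))  # decimal exponent of 2**128
```

Both computed values should agree, and the base 10 logarithm shows why the decimal magnitude is on the order of 10^38.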

The mantissa represents a 24 bit binary fraction, which corresponds to
approximately 7 decimal digits since
$2^{24} = 16{,}777{,}216 \approx 1.7 \times 10^{7}$.
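The precision limit can be observed directly. In this hedged Python sketch, `to_single` is a hypothetical helper that rounds a Python float (double precision) to single precision and back by way of the standard `struct` module.

```python
import math
import struct

def to_single(x):
    """Round x to the nearest single precision value."""
    (y,) = struct.unpack(">f", struct.pack(">f", x))
    return y

# Number of decimal digits carried by a 24 bit mantissa.
digits = 24 * math.log10(2)
print(round(digits, 2))            # 7.22

# 2**24 fits in 24 bits, but 2**24 + 1 needs 25 bits and is rounded.
print(to_single(16777216.0))       # 16777216.0
print(to_single(16777217.0))       # 16777216.0
```

Integers up to 2^24 survive the round trip exactly, while 2^24 + 1 does not, which matches the estimate of roughly 7 decimal digits.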

*CS 301 Class Account *

Mon Sep 13 11:15:41 ADT 1999