Representation of Integers, Signed Integers, and Reals (incl. Double Precision)

At a Glance

Frequency: 1 sub-part across 1 of 13 years (2024)
Priority tier: T4
Marks (count): 5 (1)
Average solve time: ~10 min
Difficulty mix: medium 1
Section: A | Dominant type: computation

Why This Chapter Matters

A single 5-mark question from 2024 covers the full spectrum of number representation — unsigned integers, signed integers (2’s complement), and IEEE 754 double-precision floating-point. The marks are quick if the double-precision bit layout is memorised and the bias-1023 formula is applied correctly. This atom is a reliable minimal-effort maximum-marks target.

Minimum Theory

Unsigned Integers

An $n$ -bit unsigned integer stores values from $0$ to $2^n - 1$ . The value is:

$V = \sum_{k=0}^{n-1} b_k \cdot 2^k$

where $b_k$ is the $k$ -th bit (LSB = $b_0$ ).
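The sum above can be checked mechanically. The following is a minimal sketch; the helper name `unsigned_value` is illustrative and not part of the syllabus. It assumes the bit string is written most significant bit first, as on paper.

```python
def unsigned_value(bits: str) -> int:
    """Evaluate V = sum(b_k * 2**k) for a bit string written MSB first."""
    n = len(bits)
    # bits[0] is b_{n-1} (the MSB); bits[-1] is b_0 (the LSB).
    return sum(int(b) << (n - 1 - i) for i, b in enumerate(bits))


print(unsigned_value("1101"))      # 13
print(unsigned_value("11111111"))  # 255, i.e. 2**8 - 1
```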

Signed Integers: Three Schemes

Scheme	Positive $N$	Negative $-N$	Range ( $n$ bits)
Sign-magnitude	$0\,\lvert N\rvert$	$1\,\lvert N\rvert$	$-(2^{n-1}-1)$ to $2^{n-1}-1$ ; two zeros
1’s complement	$N$	$\overline{N}$ (bitwise NOT)	$-(2^{n-1}-1)$ to $2^{n-1}-1$ ; two zeros
2’s complement	$N$	$\overline{N}+1$	$-2^{n-1}$ to $2^{n-1}-1$ ; one zero

2’s complement is universal in modern hardware. Its key advantage: ordinary binary addition works for both positive and negative numbers without special cases.
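To compare the three schemes on a concrete value, here is a small sketch. The function `encode_signed` and its scheme labels are hypothetical names chosen for illustration, and the code assumes `value` lies inside the range of the chosen scheme (no range checking is done).

```python
def encode_signed(value: int, n: int, scheme: str) -> str:
    """Return the n-bit pattern of `value` under the named scheme.

    Assumes `value` is within the representable range for that scheme.
    """
    mask = (1 << n) - 1
    if scheme == "sign-magnitude":
        sign = 1 if value < 0 else 0
        pattern = (sign << (n - 1)) | abs(value)
    elif scheme == "ones":
        # Negative: bitwise NOT of the magnitude, kept to n bits.
        pattern = value if value >= 0 else (~(-value)) & mask
    elif scheme == "twos":
        # Reducing modulo 2**n equals NOT(N) + 1 for a negative value.
        pattern = value & mask
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return format(pattern, f"0{n}b")


print(encode_signed(-5, 8, "sign-magnitude"))  # 10000101
print(encode_signed(-5, 8, "ones"))            # 11111010
print(encode_signed(-5, 8, "twos"))            # 11111011
```

In the 2's complement case, adding the patterns for $5$ and $-3$ as plain unsigned integers and keeping the low 8 bits gives `00000010`, which is the "no special cases" property in action.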

Detecting overflow in 2’s complement addition. Overflow occurs if and only if two numbers of the same sign are added and the result has the opposite sign.
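The same-sign rule can be sketched in a few lines. The function name `add_twos_complement` is an illustrative choice; the inputs are assumed to be $n$-bit patterns already (non-negative integers below $2^n$), not Python signed integers.

```python
def add_twos_complement(a: int, b: int, n: int = 8):
    """Add two n-bit 2's complement patterns.

    Returns (result_pattern, overflow_flag).
    """
    mask = (1 << n) - 1
    sign = 1 << (n - 1)
    result = (a + b) & mask  # discard any carry out of the top bit
    same_sign_inputs = (a & sign) == (b & sign)
    # Overflow: operands share a sign but the result's sign differs.
    overflow = same_sign_inputs and (result & sign) != (a & sign)
    return result, overflow


print(add_twos_complement(0b01111111, 0b00000001))  # (128, True):  127 + 1
print(add_twos_complement(0b11111111, 0b00000001))  # (0, False):   -1 + 1
print(add_twos_complement(0b10000000, 0b11111111))  # (127, True):  -128 + -1
```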

Floating-Point: IEEE 754 Double Precision

Bit layout (64 bits total):

$\underbrace{s}_{1}\;\underbrace{E_{10}\cdots E_0}_{11}\;\underbrace{m_{51}\cdots m_0}_{52}$

$s$ = sign bit (0 = positive, 1 = negative).
$E$ = biased exponent (11 bits), the value actually stored: $E = e + 1023$ , where $e$ is the true exponent, so $e = E - 1023$ .
$m$ = mantissa (52 bits); the leading 1 is implicit, giving an effective 53-bit significand.

Value of a normalised number ( $1 \le E \le 2046$ ):

$x = (-1)^s \times 1.m \times 2^{E - 1023}$

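where $1.m$ means $1 + \sum_{k=0}^{51} m_k \cdot 2^{k-52}$ , so the first stored bit $m_{51}$ has weight $2^{-1}$ and the last bit $m_0$ has weight $2^{-52}$ .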
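The formula for a normalised number can be tested against a real double. The sketch below relies on Python's `struct` module, whose `d` format is IEEE 754 binary64; the helper names `double_fields` and `normalised_value` are illustrative.

```python
import struct


def double_fields(x: float):
    """Split a double into (s, E, m) as integers."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    s = bits >> 63                 # 1 sign bit
    E = (bits >> 52) & 0x7FF       # 11 biased-exponent bits
    m = bits & ((1 << 52) - 1)     # 52 mantissa bits
    return s, E, m


def normalised_value(s: int, E: int, m: int) -> float:
    """Evaluate (-1)^s * 1.m * 2^(E - 1023) for a normalised number."""
    assert 1 <= E <= 2046, "formula applies to normalised numbers only"
    return (-1) ** s * (1 + m / 2**52) * 2.0 ** (E - 1023)


print(double_fields(1.0))                       # (0, 1023, 0)
print(double_fields(-2.0))                      # (1, 1024, 0)
print(normalised_value(*double_fields(6.5)))    # 6.5
```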

Special values:

$E$ (stored)	$m$	Meaning
0	0	$\pm 0$
0	$\ne 0$	Subnormal: $(-1)^s \times 0.m \times 2^{-1022}$
2047	0	$\pm \infty$
2047	$\ne 0$	NaN
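The table can be read as a decision procedure on the stored fields. A minimal sketch follows; `classify` is a hypothetical helper, and `5e-324` is used only as a convenient literal for the smallest positive subnormal.

```python
import struct


def classify(x: float) -> str:
    """Name the IEEE 754 category of x from its stored E and m fields."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    E = (bits >> 52) & 0x7FF
    m = bits & ((1 << 52) - 1)
    if E == 0:
        return "zero" if m == 0 else "subnormal"
    if E == 2047:
        return "infinity" if m == 0 else "NaN"
    return "normalised"


for x in (0.0, -0.0, 5e-324, 1.0, float("inf"), float("nan")):
    print(x, classify(x))
```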

Machine epsilon. The gap between $1$ and the next representable double above it, i.e. the weight of the last mantissa bit when the exponent is $0$ :

$\varepsilon_{\text{mach}} = 2^{-52} \approx 2.22 \times 10^{-16}$
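A quick check of this value, assuming a Python build whose `float` is an IEEE 754 double with the default round-to-nearest mode (true on mainstream platforms):

```python
import sys

eps = 2.0 ** -52

print(eps == sys.float_info.epsilon)  # True
print(1.0 + eps != 1.0)               # True: lands on the next double after 1
print(1.0 + eps / 2 == 1.0)           # True: the halfway case rounds back to 1
```

The last line is why the value is best described as a spacing rather than as "the smallest number that changes 1".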

Converting a decimal to double precision — procedure:

Determine the sign bit $s$ .
Convert $|x|$ to binary.
Normalise: write as $1.m \times 2^e$ (shift the binary point so that exactly one 1 is to the left).
Biased exponent: $E = e + 1023$ ; convert $E$ to 11-bit binary.
Mantissa: take the 52 bits after the binary point of $1.m$ , padding with zeros on the right if needed.
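The five steps can be sketched as code. This is one possible rendering, not a prescribed method: the name `to_double_fields` is hypothetical, `math.frexp` stands in for the manual normalisation, and the input is assumed to be finite, nonzero and within the normalised range.

```python
import math


def to_double_fields(x: float):
    """Return (sign bit, 11-bit exponent, 52-bit mantissa) as strings.

    Assumes x is finite, nonzero and in the normalised range.
    """
    s = 1 if x < 0 else 0                 # step 1: sign bit
    frac, exp = math.frexp(abs(x))        # |x| = frac * 2**exp, 0.5 <= frac < 1
    e = exp - 1                           # steps 2-3: |x| = (2*frac) * 2**e
    E = e + 1023                          # step 4: biased exponent
    assert 1 <= E <= 2046, "outside the normalised range"
    m = int((2 * frac - 1) * 2**52)       # step 5: drop the implicit leading 1
    return str(s), format(E, "011b"), format(m, "052b")


s, E, m = to_double_fields(0.15625)       # 0.15625 = 1.01_2 * 2**-3
print(s)        # 0
print(E)        # 01111111100  (1020 = -3 + 1023)
print(m[:10])   # 0100000000
```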

Question Archetypes

Archetype	Recognition
decimal-to-double	Represent a given decimal number in IEEE 754 double-precision format
interpret-bit-pattern	Given a 64-bit pattern, decode the double-precision value
signed-range-or-2s-comp	State the range, or convert a negative number to 2’s complement

decimal-to-double (1 question; 2024)

Recognition Cues

“Represent $x$ in IEEE 754 double-precision format.”
“Give the sign, exponent, and mantissa bits.”

Solution Template

Write $s = 0$ (positive) or $s = 1$ (negative).
Convert $|x|$ to binary: repeated division by 2 for the integer part, repeated multiplication by 2 for the fractional part.
Normalise to $1.f \times 2^e$ .
Compute biased exponent $E = e + 1023$ ; express as 11-bit binary.
Write the 52 mantissa bits (the fractional part $f$ , padded to 52 bits).
Assemble: $s\;|\;E_{10}\cdots E_0\;|\;m_{51}\cdots m_0$ .

Worked Example

2024 Paper 2, 2024-P2-Q8a (5 marks)

Represent the decimal number $-13.625$ in IEEE 754 double-precision (64-bit) floating-point format. Give the sign bit, biased exponent (in binary), and the first 10 bits of the mantissa.

Step 1 — sign bit.

$x = -13.625 < 0$ , so $s = 1$ .

Step 2 — convert $|x| = 13.625$ to binary.

Integer part: $13 = 8+4+1 = 1101_2$ .

Fractional part: $0.625 \times 2 = 1.25 \to$ bit 1; $0.25 \times 2 = 0.5 \to$ bit 0; $0.5 \times 2 = 1.0 \to$ bit 1. Stop.

So $0.625_{10} = 0.101_2$ .

Therefore: $13.625_{10} = 1101.101_2$ .

Step 3 — normalise.

$1101.101_2 = 1.101101_2 \times 2^3$

Exponent $e = 3$ .

Step 4 — biased exponent.

$E = 3 + 1023 = 1026_{10}$

Convert $1026$ to 11-bit binary:

$1026 = 1024 + 2 = 2^{10} + 2^1 \implies 10000000010_2$

Step 5 — mantissa (52 bits).

The fractional part of $1.101101$ is $101101\underbrace{00\cdots0}_{46}$ . The first 10 mantissa bits are $1011010000$ .

Step 6 — assemble.

$\underbrace{1}_{s}\;\underbrace{10000000010}_{E,\;11\text{ bits}}\;\underbrace{1011010000\cdots0}_{m,\;52\text{ bits}}$

$\boxed{s=1,\quad E = 10000000010_2,\quad m = 1011010000\underbrace{00\cdots0}_{42}}$
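The boxed answer can be cross-checked against the bit pattern that Python actually stores for $-13.625$ . This is a verification sketch only; it is not something to write in the exam.

```python
import struct

# Reinterpret the 8 bytes of the double as a 64-bit unsigned integer.
bits = struct.unpack(">Q", struct.pack(">d", -13.625))[0]
pattern = format(bits, "064b")

s, E, m = pattern[0], pattern[1:12], pattern[12:]
print(s)        # 1
print(E)        # 10000000010
print(m[:10])   # 1011010000
```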

Common Traps

Bias is 1023 for double precision, not 127 (which is for single precision / 32-bit).
The implicit leading 1 is not stored in the mantissa bits. The 52 bits hold only the fractional part after the binary point.
When the exponent is negative (e.g., $0.0625 = 2^{-4}$ ), the biased exponent is still positive: $E = -4 + 1023 = 1019$ .
Subnormal numbers (biased exponent = 0) do not have an implicit leading 1; their value is $0.m \times 2^{-1022}$ .
Confusing machine epsilon $2^{-52}$ with the smallest positive normalised double $2^{-1022}$ . (The smallest positive double overall is the subnormal $2^{-1074}$ .)
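Several of these traps can be checked numerically. In the sketch below, `biased_exponent` is an illustrative helper, and `5e-324` is the literal Python prints for the smallest positive subnormal.

```python
import struct
import sys


def biased_exponent(x: float) -> int:
    """Return the stored 11-bit exponent field E of a double."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return (bits >> 52) & 0x7FF


print(biased_exponent(0.0625))              # 1019: negative e, positive E
print(biased_exponent(sys.float_info.min))  # 1: smallest normalised double
print(biased_exponent(5e-324))              # 0: subnormal, no implicit 1
print(sys.float_info.min == 2.0 ** -1022)   # True
```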

Marks-Aware Writing

At 5 marks, an efficient answer has five numbered steps: sign bit, binary conversion of $|x|$ , normalisation showing $e$ , biased exponent computation and conversion to 11-bit binary, and the mantissa bits. Every step must be shown — the examiner cannot award marks for a final bit pattern without the derivation. Stating the IEEE 754 field widths (1-11-52) in the opening line saves you from being penalised for the wrong layout.

Practice Set

Only one historical question on this atom (shown above).