Lecture Notes

for

CSCI 301: Great Ideas in Computer Science

(c) 1993 - 2007 John E. Howland, All Rights Reserved

Department of Computer Science

Trinity University

One Trinity Place

San Antonio, Texas 78212-7200

Internet: jhowland@ariel.cs.trinity.edu

3. Computer Arithmetic

3.1 Complement Arithmetic

Most computers use complement arithmetic for integer representations. The reason for this is mostly to simplify the circuitry required to perform integer arithmetic operations. We will see in Section 3.1.1 that negative numbers may represented in complement form and that the operation of subtraction may be accomplished by adding the complement of a number. We will show that the complement of a number is very easy to calculate and both addition and subtraction can be accomplished by adding!

3.1.1 Base 10 Complement Arithmetic

To explain the basic idea behind complement arithmetic we will use base 10 representations for our numbers. However, complement arithmetic is independent of the number base used. Our choice of base 10 is one of convenience since most of us have better arithemetic skills using base 10 numbers than base 2 numbers.

First, note that ordinary mathematical notation for numbers uses positional notation. That is, the rightmost digit has a place value of 1 (100), the next rightmost digit has a place value of 10 (101), the next digit has a place value of 100 (102), etc. Hence, 123 is 1x100 + 2x10 + 3x1.

To simplify our discussion we limit our arithmetic to 4 digit integers. Limiting the size of numbers we deal with is reasonable since we need to store numbers in the memory of a computer and we do not want to use a representation which wastes valuable memory space.

Consider the following subtraction problem:


 4221
-0063
-----
which is the same problem as

  4221
+10000
 -0063
-10000
------
which is the same problem as

  4221
 +9999
 -0063
 +0001
-10000
------
We first combine

 +9999
 -0063
------
  9936
9936 is called the 9's complement of 0063. Next we add 1 to obtain the 10's complement.

  9937
The 10's complement is then added to 4221 yielding.

  4221
 +9937
------
 14158
Recall that we still need to subtract 10000, however, notice that this problem will always have a carry of 1 out of the 4th digit and since we are just representing 4 decimal digits, we accomplish the subtraction of 10000 by ignoring the carry. This process yields a result of 4158 which is the correct answer for the original problem. Note also, that we have accomplished the operation of subtraction by adding the 10's complement!

The 10's complement of a number acts like a negative representation for that number, that is, adding the 10's complement of x is the same as subtracting x. A little experimentation shows that if the left most digit is 5 or greater, then that number represents a negative. For example, (using 4 digit 10's complement representations) 9999 is -1 and 5000 is -5000 and 5001 is -4999.

3.1.2 Base 2 Complement Arithmetic

Complement arithmetic is independent of the radix and works in a similar fashion for radix 2 (base 2) as the following example will illustrate.

The reason we wish to use base 2 rather than base 10 is that electronics technology has not been able to produce inexpensive, reliable and fast circuits which have 10 stable states (used to represent the 10 base 10 digits 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9). However, it is possible to build inexpensive, reliable and fast circuits which have 2 stable states (used to represent the 2 base 2 digits 0 and 1). A popular family of circuits, TTL, (Transistor Transistor Logic) uses 0 volts to represent the digit 0 and +5 volts to represent the digit 1.

For simplicity, we limit our example to base 2 numbers having 6 bits (binary digits). The problem of


 31
-26
---
is, written in binary notation,

  011111
 -011010
---------
which is the same as the problem:

  011111
+1000000
 -011010
-1000000
--------
which is the same problem as:

  011111
 +111111
 -011010
 +000001
-1000000
--------
We first combine

 111111
-011010
-------
 100101
100101 is called the 1's complement of 011010. Notice that the 1's complement is obtained by simply changing 0's to 1's and 1's to 0's. This is easily implemented with a very simple and very basic TTL circuit element called a complementer or a not circuit. Next we add 1 to the 1's complement.

 100101
+000001
-------
 100110
We call 100110 the 2's complement of 011010. Next we add the 2's complement to our original number.

  011111
 +100110
--------
 1000101
As before, since we are representing just 6 bits, we accomplish the subtraction of 1000000 by ignoring the carry out of the 6th bit yielding a result of
 
000101
which is the answer (5 base 10).

2's complement representations may be used to represent negative values. A 2's complement representation is negative if the leftmost bit is 1 otherwise the representation is positive. Using 6 bit 2's complement representations, 111111 = -1, 100000 = -32, 100001 = - 31 and 011111 = 31.

3.1.3 Number Conversions

We need to be able to convert integers from base 10 representation to representations in another base, such as base 2 or base 16. Before we describe this conversion process, notice that in the positional notation for an integer n, base b, the rightmost digit (one's digit) is multiplied by b0. The next digit is multiplied by b1; the next digit by b2 , etc. The value of the number is determined as the sum of these respective powers. For example,

3.1.4 Example


124 = 1x102 + 2x101 + 4x100
Also, notice that the ratio of the place value of any two consecutive digits is always b. In Example 3.1.4, b = 10.

There are other commonly used number systems where the ratio of two consecutive digits is not always the same. For example, when we keep time in the form centuries, years, days, hours, minutes and seconds, the ratios between the consecutive digits are 100, 365, 24, 60 and 60. That is, there are 60 seconds in a minute, 60 minutes in an hour, 24 hours in a day, 365 days in a year and 100 years in a century. If we want to convert time, measured in seconds to time measured in centuries, years, days, hours, minutes and seconds we would use the following procedure:

3.1.5 Algorithm

a. Divide the number of seconds by 60 (the number of seconds per minute). The remainder of this division is the number of seconds. The quotient is the number of minutes.

b. Divide the number of minutes (the quotient of the previous division) by 60 (the number of minutes per hour). The remainder of this division is the number of minutes. The quotient of this division is the number of hours.

c. Divide the number of hours (the quotient of the previous division) by 24 (the number of hours per day). The remainder of this division is the number of hours. The quotient of this division is the number of days.

d. Divide the number of days (the quotient of the previous division) by 365 (the number of days per year). The remainder of this division is the number of days. The quotient of this division is the number of years.

e. Finally, divide the number of years (the quotient of the previous division) by 100 (the number of years per century). The remainder of this division is the number of years. The quotient of this division is the number of centuries.

Suppose that we represent the ratios of the place values of each digit in a number system as a list of numbers. We call such a list a base list. The base list for a time number system might be 100 365 24 60 60. This list is determined by the fact that there are 60 seconds in a minute, 60 minutes in an hour, 24 hours in a day, 365 days in a year and 100 years in a century. Algorithm 3.1.5, which converts time in seconds to time measured in centuries, years, days, hours, minutes and seconds uses the elements of the base list 100 365 24 60 60 in reverse order, dividing each quotient by the next (in reverse order) base list value. The remainders produced by this process are the digits in the new number system from least significant to most significant. This process is modeled by the following J verb:

3.1.6 Representation (Antibase)


rep =: #:
Suppose we wish to convert 1000000 seconds to centuries, years, days, hours, minutes and seconds. Then we would write:

100 365 24 60 60 rep 1000000 ==>
0 11 13 46 40
That is, 1000000 seconds is 0 years, 11 days, 13 hours, 46 minutes and 40 seconds. If you carry out the individual steps of the algorithm on this example, you can see that 1000000 divided by 60 is 16666 with a remainder of 40, 16666 divided by 60 is 277 with a remainder of 46, etc.

Notice that even though this algorithm divides by 100 (the number of years in a century), the result does not contain an item for the number of centuries. This is because the number of centuries is the quotient of the division by 100 whereas the number of years is the remainder of the division by 100. The rep verb contains a provision to obtain the quotient as well as the remainder of the division by the last element (in reverse order). If the first element of the base list (last element in reverse order) is zero, no division is performed and the rep verb terminates, returning the quotient as its answer.

To compute the 6 digit base 2 representation of 25 you would use a base list of 2 2 2 2 2 2.


2 2 2 2 2 2 rep 25 ==>
0 1 1 0 0 1
To compute the 4 digit base 3 representation of 29 you would write:
3 3 3 3 rep 29 ==>
1 0 0 2
The J verb base performs conversions from a given number system which is determined by a base list to base 10.

3.1.7 Base


base =: #.
For example, to compute the number of seconds in 0 years, 11 days, 13 hours, 46 minutes and 40 seconds you would write:

100 365 24 60 60 base 0 11 13 46 40 ==>
1000000
Notice that the base verb un-does the calculation done by the rep verb. The relationship between these two verbs is that base is the inverse of rep. A close examination of Agorithm 3.1.5 reveals a repeated process of division and subtraction to get the remainders. To un-do this calculation (which base does) we have to un-do (in reverse order) the operations of division and subtraction. This means we have to add and multiply. Other examples:

2 2 2 2 2 2 2 2 base 0 1 1 1 1 1 1 1 ==>
127
3 3 3 3 base 1 0 0 2 ==>
29
Finally, we should notice that the first element (last element in reverse order) of a base list is not used in the rep verb to produce a quotient for any successive divisions (there are none since this is the last divisior); it is only used to produce the last remainder (most significant digit of the result). This means that the base verb should not use the first element of the base list either and the first element of the representation list (most significant digit) is the first summand of a sequence of add followed by multiplication. A close examination of the base verb algorithm will reveal these facts.

3.1.8 Sizes of Integer Representations

Most modern computer systems represent several different sized integers in memory. The most common size for an individually addressable memory cell is 8 bits (one byte). These cells are used singly to represent very small integers and in pairs or groups of 4 to represent progressively larger integer values.

One byte integers, n, satisfy:

	
unsigned  0 <= n <= 255 = -1 + 28.
	
signed    -128 <= n <= 127 = -1 + 27
16 bit integers (short integers), n, satisfy:
	
unsigned  0 <= n <= 65535 = -1 + 216
	
signed    -32768 <= n <= 32767 = -1 + 215
32 bit integers (long integers), n, satisfy:

unsigned  0 <= n <= 4294967295 = -1 + 232
	
signed    -2147483648 <= n <= 2147483647 = -1 + 231
Recently, some machines are being produced which operate directly on 64 bit integers.

3.1.9 Summary

To summarize, given circuitry for addition (later we will see how such circuitry might be built) and circuitry for computing the 1's complement (we have to have complementing circuitry to add anyway) we can accomplish both addition and subtraction. Also, since multiplication is repeated addition and division is repeated subtraction, we can accomplish these operations as well.

3.2 Hexadecimal notation

When we wish to view 32 bit or 64 bit memory images of various types, the human eye has trouble dealing with 32 or 64 bits of information. Hence, groups of 4 bits at a time are used to solve this problem. So a 32 bit quantity requires only 8 symbols. This notation, hexadecimal (radix 16) uses the following symbols:

0000 - 0	1000 - 8
0001 - 1	1001 - 9
0010 - 2	1010 - A
0011 - 3	1011 - B
0100 - 4	1100 - C
0101 - 5	1101 - D
0110 - 6	1110 - E
0111 - 7	1111 - F
Hence, the 32 bit 2's complement value of -1 has the following hexadecimal representation:

FFFFFFFF
The 16 bit 2's complement value of 2 has the following hexadecimal representation:

0002
etc.

3.3 Binary Coded Decimal values (BCD)

Some computers support yet another representation for integers, called decimal arithmetic or BCD arithmetic.

BCD integers are variable length strings of bytes in memory where groups of 4 bits are used to represent decimal digits. This means that 2 decimal digits may be packed into each byte. Some systems support use of unused bits in the leftmost digit to represent the sign of the number. For example, 329 would require two bytes of memory and would look like (using hexadecimal notation)

0329

3.4 Scientific Notation

Scientific notation uses a two part representation for a number which consists of a signed fraction (mantissa) and a signed exponent (characteristic).

An example might be +1.234 x 1023

Numbers written in scientific notation are usually written in a normal form (for example the value of the fraction being written so that it is between 1 and 10). If the result of some calculation is .0001 x 1015, then it must be normalized as

1.000 x 1011

When scientific notation is used to represent inexact values on a computer it is usually referred to as floating point notation.

When graphed, adjacent floating point numbers close to zero are very closely spaced, however, the gaps between adjacent floating point numbers far from zero are more than astronomically large!

3.5 IEEE Floating Point Representations

The Institute for Electrical and Electronics Engineers is an engineering professional society which has been concerned with establishing various standards in the field of electrical engineering, electronics and computer engineering. During the early years of computer design, each manufacturer designed its own method of using scientific notation to represent inexact computational values. Not only were these representations not compatible, but they also used different operational methods to handle conversions and rounding of inexact values. This meant that the same sequence of ineact operations on different computers may often produce different results!

In an effort to try to standardize inexact computation, the IEEE formed a committee in the mid 1980's which produced the IEEE Standard 754 for binary floating point arithmetic. Since then, most computer manufacturers have begun to use this standard for representations which is given, in part, in Section 3.5.1.

3.5.1 Formats

Listed below are three IEEE data types. All three are also SANE (Standard Apple Numeric Environment) data types as well. In fact, Apple's SANE package provides the most complete and accurate implementation of the IEEE standard to date. Each of the diagrams in the following pages is followed by a table that gives the rules for evaluating the number. In each field of each diagram, the leftmost bit is the msb (most significant bit) and the rightmost is the lsb (least significant bit). Symbols used in the diagrams are defined in the following table.

___________________________________________
Symbol   Description
___________________________________________
v        value of the number
s        sign bit
e        biased exponent
i        explicit one's bit (extended type only)
f        fraction

3.5.2 Single

The 32-bit single format is divided into three fields as shown below:

The value v of the number is determined by these fields as shown in the following table:

Values of single-format numbers (32 bits)


___________________________________________________________
e        f         v                           class of v
___________________________________________________________
0<e<255  (any)     v=(-1)s x 2(e-127) x (1.f)   normalized
e=0      f!=0      v=(-1)s x 2(e-126) x (0.f)   denormalized
e=0      f=0       v=(-1)s x 0                 zero
e=255    f=0       v=(-1)s x infinity          infinity
e=255    f!=0      v is a NaN                  NaN

3.5.3 Double

The 64-bit double format is divided into three fields as shown below:

The value v of the number is determined by these fields as shown in the following table:

Values of double-format numbers (64 bits)


___________________________________________________________
e        f         v                           class of v
___________________________________________________________
0<e<2047 (any)     v=(-1)s x 2(e-1023) x (1.f)  normalized
e=0      f!=0      v=(-1)s x 2(e-1022) x (0.f)  denormalized
e=0      f=0       v=(-1)s x 0                 zero
e=2047   f=0       v=(-1)s x infinity          infinity
e=2047   f!=0      v is a NaN                  NaN
For example, the double representation (in hex notation) of 1.5 is

3FF8000000000000
is

3F847AE147AE147A

3.5.4 Extended

The 80-bit extended format is divided into four fields as shown below:

The value v of the number is determined by these fields as shown in the following table:

Values of extended-format numbers (80 bits)


___________________________________________________________
e           i  f    v                            class of v
___________________________________________________________
0<=e<=32766 1 (any) v=(-1)s x 2(e-16383) x (1.f)  normalized
0<=e<=32766 0  f!=0 v=(-1)s x 2(e-16383) x (0.f)  denormalized
0<=e<=32766 0  f=0  v=(-1)s x 0                  zero
e=32767  (any) f=0  v=(-1)s x infinity           infinity
e=32767  (any) f!=0 v is a NaN                   NaN

3.6 Numbers We Cannot Represent

Finally, we need some discussion about the kinds of numbers which can be represented. When we talk about the real numbers from a mathematical point of view, we sometimes classify the reals into two groups; those which are rational and those which are not rational (irrational).

The rationals are those which can be represented in the form

a/b where a and b are integers and b not zero.

An equivalent formulation is that the rationals are those numbers which have a repeating representation using some radix.

The irrational numbers can be characterized as those numbers whose representation in any radix never repeat.

This means that an infinite amount of memory would be required to exactly represent an irrational number. Since this is impossible, we are forced to cut off the representation after a fixed number of digits. As soon as this is done, we are no longer representing the irrational exactly, but rather, we have substituted a rational (which approximates the irrational) whose radix representation repeats in the digit zero.

This means that all computer numeric representations (called machine numbers) are necessarily rational numbers.