Observation: The most significant bit in a two’s complement signed number will specify the sign.
Observation: To take the negative of a two’s complement signed number we first complement (flip) all the bits, then add 1.
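The two observations above can be sketched in C for the 8-bit case. This is a minimal illustration, not code from the text; the function names is_negative8 and negate8 are hypothetical, and the bit pattern is held in a uint8_t so the arithmetic wraps modulo 256.

```c
#include <stdint.h>

/* Returns 1 if the most significant bit (bit 7) is set, which means the
   pattern is negative when read as an 8-bit two's complement number. */
int is_negative8(uint8_t bits) {
    return (bits & 0x80) != 0;
}

/* Two's complement negate: complement (flip) all the bits, then add 1.
   The cast back to uint8_t keeps only the low 8 bits of the result. */
uint8_t negate8(uint8_t bits) {
    return (uint8_t)(~bits + 1u);
}
```

For instance, negate8(0x64) returns 0x9C, the pattern for -100, and is_negative8(0x9C) returns 1. Note that 0x80 (-128) negates to itself, because +128 cannot be represented in 8 signed bits.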
Words and Halfwords
A word on the ARM Cortex M will have 32 bits. Consider an unsigned number with 32 bits, where each bit b31,...,b0 is binary and has the value 1 or 0. If a 32-bit number is used to represent an unsigned integer, then the value of the number is
N = 2^31 • b31 + 2^30 • b30 + ... + 2 • b1 + b0 = sum(2^i • bi) for i=0 to 31
There are 2^32 different unsigned 32-bit numbers. The smallest unsigned 32-bit number is 0 and the largest is 2^32-1. This range is 0 to about 4 billion.
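The sum over the basis elements can be written directly as a loop. The sketch below is only an illustration; the function name unsigned_value and the choice to pass the bits as an array (with bits[0] as the least significant bit) are assumptions made for the example.

```c
#include <stdint.h>

/* Evaluates N = sum(2^i * b_i) for i = 0 to 31.
   bits[i] holds b_i (0 or 1); bits[0] is the least significant bit. */
uint32_t unsigned_value(const uint8_t bits[32]) {
    uint32_t n = 0;
    for (int i = 0; i < 32; i++) {
        if (bits[i]) {
            n += (uint32_t)1 << i;   /* add the basis element 2^i */
        }
    }
    return n;
}
```

With every bit set the function returns 4294967295, which is 2^32-1, the largest unsigned 32-bit number.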
A halfword or double byte contains 16 bits, where each bit b15,...,b0 is binary and has the value 1 or 0, as shown in Figure 2.4.
Figure 2.4. A halfword has 16 bits.
Similar to the unsigned algorithm, we can use the basis to convert a decimal number into signed binary. We will work through the algorithm with the example of converting –100 to 8‑bit binary. We start with the most significant bit (in this case –128) and ask, “do we need to include it to make –100?” Yes (without –128, we would be unable to add the other basis elements together to get any negative result), so we set bit 7 and subtract the basis element from our value. Our new value equals –100 minus –128, which is 28. We go to the next largest basis element, 64, and ask, “do we need it?” We do not need 64 to generate our 28, so bit 6 is zero. Next we go to the next basis element, 32, and ask, “do we need it?” We do not need 32 to generate our 28, so bit 5 is zero. Now we need the basis element 16, so we set bit 4 and subtract 16 from our number 28 (28-16=12). Continuing along, we need basis elements 8 and 4 but not 2 or 1. Putting it together we get 10011100₂ (which means -128+16+8+4).
Table 2.2. Example conversion from decimal to signed 8-bit binary.
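The basis algorithm described above can be sketched in C as follows. The function name to_signed8_basis is hypothetical, and the sketch assumes the input is already in the range -128 to 127.

```c
#include <stdint.h>

/* Converts value (-128..127) to its 8-bit two's complement pattern by
   walking the basis -128, 64, 32, 16, 8, 4, 2, 1 from most to least
   significant and asking "do we need it?" at each step. */
uint8_t to_signed8_basis(int value) {
    uint8_t bits = 0;
    if (value < 0) {           /* only -128 can make the sum negative */
        bits |= 0x80;          /* set bit 7 */
        value -= -128;         /* e.g. -100 - (-128) = 28 */
    }
    for (int weight = 64, mask = 0x40; weight >= 1; weight /= 2, mask >>= 1) {
        if (value >= weight) { /* this basis element is needed */
            bits |= (uint8_t)mask;
            value -= weight;
        }
    }
    return bits;
}
```

Calling to_signed8_basis(-100) follows the same steps as the worked example (28, then 12, then 4, then 0) and returns 0x9C, which is 10011100 in binary.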
A second way to convert negative numbers into binary is to first convert the magnitude into unsigned binary, then do a two’s complement negate. For example, we earlier found that +100 is 01100100₂. The two’s complement negate is a two-step process. First we do a logical complement (flip all bits) to get 10011011₂. Then we add one to the result to get 10011100₂.
A third way to convert negative numbers into binary uses the number wheel. Let n be the number of bits in the binary representation. We specify precision, M=2^n, as the number of distinct values that can be represented. To convert a negative number into binary, we first add M to the number, then convert the unsigned result to binary using the unsigned method. This works because binary numbers with a finite n are like the minute-hand on a clock. If we add 60 minutes, the minute-hand is in the same position. Similarly, if we add M to or subtract M from an n-bit number, we go around the number wheel and arrive at the same place. This is one of the beautiful properties of two’s complement: unsigned and signed addition/subtraction are the same operation. In this example we have an 8-bit number, so the precision is 256. So, we first add 256 to the number, then convert the unsigned result to binary using the unsigned method. For example, to find –100, we add 256 plus –100 to get 156. Then we convert 156 to binary, resulting in 10011100₂.
This method works because in 8-bit binary math adding 256 to a number does not change the value. E.g., 256-100 has the same 8-bit binary value as –100.
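A short sketch of the number-wheel method for n = 8, assuming the input lies between -128 and 127. The names to_signed8_wheel and add8 are hypothetical; add8 is included only to show that a single 8-bit addition serves for both signed and unsigned interpretations.

```c
#include <stdint.h>

/* Number-wheel method for n = 8 bits, so M = 2^8 = 256.
   A negative value is moved once around the wheel by adding M,
   then the unsigned result is used as the bit pattern. */
uint8_t to_signed8_wheel(int value) {
    const int M = 256;
    if (value < 0) {
        value += M;            /* -100 + 256 = 156 */
    }
    return (uint8_t)value;     /* 156 = 0x9C = 10011100 in binary */
}

/* 8-bit addition: the sum is reduced modulo 256, so the same operation
   works whether the patterns are read as signed or unsigned. */
uint8_t add8(uint8_t a, uint8_t b) {
    return (uint8_t)(a + b);
}
```

For example, add8(to_signed8_wheel(-100), 100) returns 0. Read as unsigned, 156+100 = 256 wraps around to 0; read as signed, -100+100 = 0.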
When dealing with numbers on the computer, it will be convenient to memorize some Powers of 2 as shown in Table 2.3.
Table 2.3. Some powers of two that will be useful to memorize.
In C we can specify the number of bits used to store data using the data type. The number of bits used will vary from machine to machine, so it is wise to look up these specifications when developing code. The definition for char in C can vary, so with 8-bit variables we suggest using unsigned char or signed char, just to be perfectly clear. On this Keil compiler for this Cortex M, we have these data types
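Because sizes vary between compilers, one way to check them is to ask the compiler itself. The following is a minimal sketch using the standard headers limits.h and stdint.h; the macro BITS_OF and the function char_is_signed are hypothetical names introduced for this example, and the results depend on the compiler being used.

```c
#include <limits.h>
#include <stdint.h>

/* Number of bits the current compiler uses to store a given type. */
#define BITS_OF(type) ((int)(sizeof(type) * CHAR_BIT))

/* Returns 1 if plain char is signed on this compiler, 0 if unsigned.
   CHAR_MIN is 0 when char is unsigned and negative when it is signed. */
int char_is_signed(void) {
    return CHAR_MIN < 0;
}
```

BITS_OF(int) or BITS_OF(long) may differ from one machine to another, while the fixed-width types from stdint.h, such as uint8_t, int16_t and int32_t, always report 8, 16 and 32 bits where they are provided.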
Fixed-Point Numbers
We will use fixed-point numbers when we wish to express noninteger values in our computer. A fixed-point number contains two parts. The first part is a variable integer, called I. The variable integer will be stored on the computer. The second part of a fixed-point number is a fixed constant, called the resolution Δ. The fixed constant will NOT be stored on the computer. The fixed constant is something we keep track of while designing the software operations. The value of the number is the product of the variable integer times the fixed constant. The integer may be signed or unsigned. An unsigned fixed-point number is one that has an unsigned variable integer. A signed fixed-point number is one that has a signed variable integer.
Value = VariableInteger * FixedConstant = I*Δ
The precision of a number system is the
total number of distinguishable values that can be represented. The
precision of a fixed-point number is determined by the number of bits
used to store the variable integer. On most microcontrollers, we can use
8, 16, or 32 bits for the integer. With binary fixed point the fixed constant is a power of 2.
For example, consider a binary fixed-point number system where the resolution is 2^-4. The resolution is not stored on the computer, just the integer I.
As another example, consider a decimal fixed-point number system where the resolution is 10^-2. Again, the resolution is not stored on the computer, just the integer I.
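The relation Value = I*Δ can be sketched in C for both kinds of resolution. This is an illustration under stated assumptions: the names fix_encode, fix_decode, BINARY_SCALE and DECIMAL_SCALE are hypothetical, the scale is 1/Δ, and double is used only to make the conversion easy to follow (code on a microcontroller would normally keep to integer operations).

```c
#include <stdint.h>

#define BINARY_SCALE  16    /* 1/resolution for a resolution of 2^-4  */
#define DECIMAL_SCALE 100   /* 1/resolution for a resolution of 10^-2 */

/* Returns the variable integer I nearest to value/resolution.
   Only I is stored; the scale lives in the source code, not in memory. */
int32_t fix_encode(double value, int32_t scale) {
    double scaled = value * scale;
    return (int32_t)(scaled >= 0 ? scaled + 0.5 : scaled - 0.5);
}

/* Returns the value I*resolution represented by the stored integer I. */
double fix_decode(int32_t i, int32_t scale) {
    return (double)i / scale;
}
```

With a resolution of 2^-4, the value 2.25 is stored as the integer 36, and with a resolution of 10^-2 the value 12.34 is stored as 1234. Two numbers with the same resolution can be added by adding their integers: 36 + 8 = 44, which represents 2.25 + 0.5 = 2.75.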
EX:
If the resolution is 0.01 volts, what does the unsigned decimal number 1234 mean?
Ans : value=integer*resolution, so the value is 12.34 volts
If the resolution is 1/8 cm, how do you store 2.25 cm in the computer?
Ans : value = integer*resolution, so the integer is 8*2.25 = 18.
If the value is 12.3 ohms, and the integer is 12300, what is the resolution of the fixed-point number system?
Ans : value = integer*resolution, so resolution is 12.3 ohms/12300 = 0.001 ohms.
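The first two exercises can also be worked with integer operations only, which is how fixed-point values are often handled when no floating-point hardware is available. The sketch below is an assumption-laden illustration: format_hundredths and to_eighths are hypothetical names, and the first function assumes an unsigned number with a resolution of 0.01.

```c
#include <stdint.h>
#include <stdio.h>

/* Formats an unsigned fixed-point number with resolution 0.01 as text,
   using integer division for the whole part and remainder for the
   two fractional digits. */
void format_hundredths(uint32_t i, char *out, size_t size) {
    snprintf(out, size, "%lu.%02lu",
             (unsigned long)(i / 100), (unsigned long)(i % 100));
}

/* Builds the stored integer for a resolution of 1/8 from a whole part
   and a count of eighths, e.g. 2.25 = 2 + 2/8. */
uint32_t to_eighths(uint32_t whole, uint32_t eighths) {
    return whole * 8 + eighths;
}
```

Here format_hundredths(1234, buf, sizeof buf) writes "12.34" into buf, and to_eighths(2, 2) returns 18, matching the answers above.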