Operations on Bit Representations
Recall how we represent data in digital systems: by using Bits and Bit Technologies.
Boolean Algebra
We will work over the set of Bits:
We denote the set
Link to original
We write the possible outcomes as truth tables
Not (output) | Input |
---|---|
0 | 1 |
1 | 0 |
And | 0 | 1 |
---|---|---|
0 | 0 | 0 |
1 | 0 | 1 |
Or | 0 | 1 |
---|---|---|
0 | 0 | 1 |
1 | 1 | 1 |
XOr | 0 | 1 |
---|---|---|
0 | 0 | 1 |
1 | 1 | 0 |
Bit Parallel Operations
Boolean operations can be extended to multi-bit quantities.
In C
, we can write a&b
, a|b
, a^b
, ~a
to perform these operations independently and in parallel on corresponding bits of these quantities.
Ex: Given a = 0x4a0b
, b = 0xc3d5
Performing c = a & b
1. Convert a
and b
into binary from Hexadecimal: 1. a = 0100 1010 0000 1011
2. b = 1100 0011 1101 0101
2. Now, perform the and operation, we get c = 0100 0010 0000 0001
3. Can now convert this back to base 16, which is 4201
Bit Shifts
When shifting, you keep moving bits in the direction specified, and allow bits that overflow to just fall off and be replaced with 0
’s on the other side.
Given a -bit vector and an amount (with ), we define the following shifts:
- Logical Left Shift:
- C and Java:
x << k
- C and Java:
- Logical Right Shift:
- C:
x >> k
- Java:
x >>> k
- C:
- Arithmetic Right Shift:
- C and Java:
x >> k
- C and Java:
Ex: Given a = 0x4a0b
, c = 0xc3d5
Remember by converting,a = 0100 1010 0000 1011
, c (unsigned) = 1100 0011 1101 0101
b = a << 3
- Take the
a
binary representation, and shift each bit in it to the left by 3, replacing the new voids on the right with0
’s. - We get
0101 0000 0101 1000
- Deconstructing that back to Hex, we get
5058
.
- Take the
d = c >> 2
- Take the
c
binary representation, and shift each bit in it to the right by 2, replacing the new voids on the left by0
’s. - Get the new value (not written out)
- Convert back to base 16:
30f5
- Take the
Ambiguity of >>
in C
C just lets the operating system/architecture implementation perform whatever operator it wants.
In most cases, that ends up being the arithmetic shift.
Word Size
How many bits it takes to hold a pointer, or address.
Memory will be abstracted as an array of cells, indexed from through .
- Each cell holds one byte of data, each with an address.
- Memory accesses are in units of these cells (an integral number of bytes)
- This is called “byte-addressed memory”
- The number is called the word size of the machine
- This is the number of bits needed to hold a pointer (or address of the memory cell) on that machine
- AArch64 and x86-64:
- AArch32 and IA32:
- This size can be smaller/vary for embedded processing units
Byte Ordering
Defines the order by which bytes are counted when an address is defined.
This is very straightforward.
Say we have int x
in a C
program, where sizeof(x) = 4
and &x = 0x1000
, the occupied bytes of memory’s addresses are: 0x1000, 0x1001, 0x1002, 0x1003
because the bytes are stored consecutively.
Which byte goes into what cell? This question only matters if we have multi-byte values
Two Byte Orders
- Little-endian ordering: If the least significant byte is in the memory cell with the lowest-numbered address
- Byte is stored in memory
- Byte is stored in memory
- Big-endian ordering: If the most significant byte is in the memory cell with the lowest-numbered address
- Byte is stored in memory
- Byte is stored in memory
In AArch64, instructions are always stored in little-endian order.
Integers
For integers, we can do integer arithmetic.