Operations on Bit Representations

Recall how we represent data in digital systems: by using Bits and Bit Technologies.

Boolean Algebra

We will work over the set of Bits:

We denote the set

Link to original

We write the possible outcomes as truth tables

Not (output)Input
01
10
And01
000
101
Or01
001
111
XOr01
001
110

Bit Parallel Operations

Boolean operations can be extended to multi-bit quantities.

In C, we can write a&b, a|b, a^b, ~a to perform these operations independently and in parallel on corresponding bits of these quantities.

Ex: Given a = 0x4a0b, b = 0xc3d5

Performing c = a & b 1. Convert a and b into binary from Hexadecimal: 1. a = 0100 1010 0000 1011 2. b = 1100 0011 1101 0101 2. Now, perform the and operation, we get c = 0100 0010 0000 0001 3. Can now convert this back to base 16, which is 4201

Bit Shifts

When shifting, you keep moving bits in the direction specified, and allow bits that overflow to just fall off and be replaced with 0’s on the other side.

Given a -bit vector and an amount (with ), we define the following shifts:

  • Logical Left Shift:
    • C and Java: x << k
  • Logical Right Shift:
    • C: x >> k
    • Java: x >>> k
  • Arithmetic Right Shift:
    • C and Java: x >> k

Ex: Given a = 0x4a0b, c = 0xc3d5

Remember by converting,a = 0100 1010 0000 1011, c (unsigned) = 1100 0011 1101 0101

  1. b = a << 3
    1. Take the a binary representation, and shift each bit in it to the left by 3, replacing the new voids on the right with 0’s.
    2. We get 0101 0000 0101 1000
    3. Deconstructing that back to Hex, we get 5058.
  2. d = c >> 2
    1. Take the c binary representation, and shift each bit in it to the right by 2, replacing the new voids on the left by 0’s.
    2. Get the new value (not written out)
    3. Convert back to base 16: 30f5

Ambiguity of >> in C

C just lets the operating system/architecture implementation perform whatever operator it wants.

In most cases, that ends up being the arithmetic shift.

Word Size

How many bits it takes to hold a pointer, or address.

Memory will be abstracted as an array of cells, indexed from through .

  • Each cell holds one byte of data, each with an address.
  • Memory accesses are in units of these cells (an integral number of bytes)
    • This is called “byte-addressed memory”
  • The number is called the word size of the machine
    • This is the number of bits needed to hold a pointer (or address of the memory cell) on that machine
    • AArch64 and x86-64:
    • AArch32 and IA32:
    • This size can be smaller/vary for embedded processing units

Byte Ordering

Defines the order by which bytes are counted when an address is defined.

This is very straightforward.

Say we have int x in a C program, where sizeof(x) = 4 and &x = 0x1000, the occupied bytes of memory’s addresses are: 0x1000, 0x1001, 0x1002, 0x1003 because the bytes are stored consecutively.

Which byte goes into what cell? This question only matters if we have multi-byte values

Two Byte Orders

  • Little-endian ordering: If the least significant byte is in the memory cell with the lowest-numbered address
    • Byte is stored in memory
    • Byte is stored in memory
  • Big-endian ordering: If the most significant byte is in the memory cell with the lowest-numbered address
    • Byte is stored in memory
    • Byte is stored in memory

In AArch64, instructions are always stored in little-endian order.

Integers

For integers, we can do integer arithmetic.