Computer Architecture

Table of Contents

  1. RISC vs CISC
  2. von Neumann Architecture Model
  3. Memory and Addressing
  4. The von Neumann Bottleneck
  5. Modern Innovations in Computer Architecture
  6. Building a Processor

RISC vs CISC

  • The CPU executes instructions that are stored in various memory layers throughout the computer system (RAM, caches, registers).
  • A particular CPU has an Instruction Set Architecture (ISA), which defines:
    • The set of instructions the CPU uses and their binary encoding.
    • The set of CPU registers.
    • The effects of executing instructions on the state of the processor.
  • Examples of ISAs include SPARC, ARM, x86, MIPS, and PowerPC.
  • A micro-architecture is a specific implementation of an ISA; different micro-architectures can implement the same ISA with different circuitry. AMD and Intel both produce x86 processors, but with different micro-architectures.

Key Differences Between RISC and CISC

  • RISC (Reduced Instruction Set Computer):

    • Small set of basic instructions that execute quickly, typically in a single clock cycle.
    • Simpler micro-architecture design, requiring fewer transistors.
    • Programs may contain more instructions, but execution is highly efficient.
    • Example: ARM processors, widely used in mobile devices.
  • CISC (Complex Instruction Set Computer):

    • Designed to execute more complex instructions, which often take multiple cycles.
    • Programs are smaller as they contain fewer instructions.
    • Example: x86 processors, dominant in desktops and servers.
  • General Observations:

    • RISC architectures excel in scenarios requiring high efficiency and low power, such as mobile devices.
    • CISC architectures dominate general-purpose computing due to compatibility with legacy software and support for complex operations (the sketch after this list contrasts how the two styles handle the same C statement).
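
To make the contrast concrete, here is a small C sketch. The assembly shown in the comments is illustrative of what a compiler might emit for each ISA; it is not the exact output of any particular compiler or optimization level.

```c
#include <stdio.h>

/* The same C statement on a CISC ISA (x86-64) versus a RISC ISA (AArch64).
 * The assembly in the comments is illustrative of what a compiler might emit,
 * not the exact output of any particular compiler. */
void increment(int *a, long i) {
    a[i] += 1;
    /* x86-64 (CISC): a single instruction can read, modify, and write memory:
     *     incl (%rdi,%rsi,4)
     *
     * AArch64 (RISC): only loads and stores touch memory, so the same
     * statement becomes three simple instructions:
     *     ldr w2, [x0, x1, lsl #2]
     *     add w2, w2, #1
     *     str w2, [x0, x1, lsl #2]
     */
}

int main(void) {
    int a[4] = {40, 41, 42, 43};
    increment(a, 2);
    printf("a[2] = %d\n", a[2]);   /* prints 43 */
    return 0;
}
```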

von Neumann Architecture Model

  • Virtually all modern processors adhere to the von Neumann architecture model.
  • The von Neumann architecture consists of five components:
    1. Processing Unit:

      • Composed of the Arithmetic/Logic Unit (ALU) and Registers.
      • The ALU performs mathematical operations (addition, subtraction, etc.).
      • Registers are fast storage units for program data and instructions being executed.
    2. Control Unit:

      • Responsible for loading instructions from memory and coordinating execution with the processing unit.
      • Contains the Program Counter (PC) and Instruction Register (IR).
    3. Memory Unit:

      • Stores program data and instructions in Random Access Memory (RAM).
      • RAM provides fast, direct access to memory locations via unique addresses.
    4. Input Unit:

      • Loads program data and instructions into the computer.
    5. Output Unit:

      • Stores or displays program results.

Fetch-Decode-Execute-Store Cycle

  1. Fetch: The control unit fetches the next instruction from memory. It places the address in the program counter on the address bus, puts a read command on the control bus, and increments the PC. The memory unit reads the bytes stored at that address and places them on the data bus; the control unit then copies those bytes into the instruction register.
  2. Decode: The control unit decodes the instruction stored in the instruction register, splitting it into an opcode and operands and determining what action to take.
  3. Execute: The processing unit executes the instruction. The ALU performs the necessary calculations or data manipulations.
  4. Store: Results are stored in memory or registers (a minimal simulator of the full cycle is sketched after this list).
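
To make the cycle concrete, here is a minimal sketch in C of a toy machine that fetches, decodes, executes, and stores. The 4-byte instruction encoding and the opcode values are invented for illustration; they do not correspond to any real ISA.

```c
#include <stdint.h>
#include <stdio.h>

/* Toy machine: 4-byte instructions (opcode, dst, src1, src2), 8 registers.
 * The encoding is invented purely for illustration; it is not a real ISA. */
enum { OP_HALT = 0, OP_ADD = 1, OP_SUB = 2 };

int main(void) {
    uint8_t memory[] = {
        OP_ADD, 2, 0, 1,   /* r2 = r0 + r1 */
        OP_SUB, 3, 2, 0,   /* r3 = r2 - r0 */
        OP_HALT, 0, 0, 0,
    };
    int32_t regs[8] = {10, 32};   /* r0 = 10, r1 = 32, rest 0 */
    uint32_t pc = 0;              /* program counter */

    for (;;) {
        /* Fetch: read the instruction bytes at PC into the "instruction
         * register", then advance the PC. */
        uint8_t ir[4];
        for (int i = 0; i < 4; i++)
            ir[i] = memory[pc + i];
        pc += 4;

        /* Decode: split the instruction into opcode and operands. */
        uint8_t opcode = ir[0], dst = ir[1], src1 = ir[2], src2 = ir[3];
        if (opcode == OP_HALT)
            break;

        /* Execute: the ALU computes the result. */
        int32_t result = 0;
        switch (opcode) {
        case OP_ADD: result = regs[src1] + regs[src2]; break;
        case OP_SUB: result = regs[src1] - regs[src2]; break;
        }

        /* Store: write the result back to the destination register. */
        regs[dst] = result;
    }

    printf("r2 = %d, r3 = %d\n", regs[2], regs[3]);  /* r2 = 42, r3 = 32 */
    return 0;
}
```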

Memory and Addressing

  • Smallest Addressable Unit: In modern systems, the smallest addressable memory unit is 1 byte (8 bits).
  • 32-bit vs. 64-bit Architectures:
    • 32-bit systems: Address up to \(2^{32}\) bytes (4 GB).
    • 64-bit systems: Address up to \(2^{64}\) bytes (16 exabytes); see the sketch after this list.
  • Memory Hierarchy:
    • Registers > Cache > RAM > Secondary Storage.
    • Each layer balances speed and capacity, with registers being the fastest but smallest.
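
As a quick sketch of the arithmetic, the following C program computes the size of a 32-bit address space and reports the pointer width of the machine it runs on.

```c
#include <stdint.h>
#include <stdio.h>

int main(void) {
    /* With n-bit addresses and byte-addressable memory, a machine can name
     * 2^n distinct bytes. */
    uint64_t bytes_32 = 1ULL << 32;   /* 2^32 bytes = 4 GiB */
    printf("32-bit address space: %llu bytes (%llu GiB)\n",
           (unsigned long long)bytes_32,
           (unsigned long long)(bytes_32 >> 30));

    /* 2^64 does not fit in a 64-bit integer, so just report it symbolically. */
    printf("64-bit address space: 2^64 bytes (16 exabytes)\n");

    /* The pointer width of the machine compiling and running this program: */
    printf("sizeof(void *) here: %zu bytes\n", sizeof(void *));
    return 0;
}
```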

The von Neumann Bottleneck

  • Definition: The shared bus between the CPU and memory limits how quickly instructions and data can move between them, so the processor frequently sits idle waiting on memory.
  • Consequences:
    • Slower execution of memory-intensive programs.
    • Limits on parallel execution.
  • Mitigation:
    • Use of caches to reduce trips to main memory (the sketch after this list shows how much the memory access pattern matters).
    • Development of pipelining and out-of-order execution to improve instruction throughput.
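
The following C sketch illustrates why caches help: it sums the same matrix with a cache-friendly and a cache-hostile access pattern. The matrix size is arbitrary, and the exact timings depend entirely on the machine, but the sequential walk is typically several times faster because it makes far better use of each cache line fetched over the memory bus.

```c
#include <stdio.h>
#include <time.h>

#define N 4096

/* Sums a large matrix twice: once walking memory sequentially (cache- and
 * bus-friendly) and once striding across columns (cache-hostile). */
static int grid[N][N];

static void time_sum(int by_rows) {
    clock_t start = clock();
    long long sum = 0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            sum += by_rows ? grid[i][j] : grid[j][i];
    double secs = (double)(clock() - start) / CLOCKS_PER_SEC;
    printf("%s traversal: sum=%lld, %.3f s\n",
           by_rows ? "row-major   " : "column-major", sum, secs);
}

int main(void) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            grid[i][j] = 1;
    time_sum(1);   /* sequential: one cache line serves many accesses */
    time_sum(0);   /* strided: most accesses miss the cache */
    return 0;
}
```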

Modern Innovations in Computer Architecture

  • Harvard Architecture:
    • Separates data and instruction memory, reducing the von Neumann bottleneck.
  • Multicore Processors:
    • Incorporate multiple CPUs (cores) on a single chip for parallel execution.
  • Pipelining:
    • Breaks instruction execution into stages, allowing multiple instructions to be processed simultaneously.
    • Without pipelining, each instruction takes 4 cycles (fetch, decode, execute, store), giving a CPI (cycles per instruction) of 4.
    • The control circuitry of a CPU can be redesigned to achieve a better (lower) CPI.
    • The circuitry for each of the 4 stages is only active once every 4 cycles and sits idle for the other 3. For example, once an instruction's fetch stage completes, the fetch circuitry is idle for the remaining 3 clock cycles of that instruction. Pipelining puts that idle circuitry to work on later instructions: the CPU starts executing the next instruction before the current one has fully completed, so sequences of instructions overlap (the sketch after this list quantifies the effect on CPI).
    • The Intel Core i7 has a 14-stage pipeline.
    • A pipeline stall occurs when any stage of execution is forced to wait on another stage before it can continue.
  • Speculative Execution:
    • Predicts which instructions will be needed (for example, which way a branch will go) and executes them ahead of time, discarding the results if the prediction turns out to be wrong.
  • Graphics Processing Units (GPUs):
    • Specialized processors optimized for parallel computation, commonly used in machine learning and graphics.
  • RISC-V:
    • A modern open-standard RISC architecture gaining popularity for its flexibility and extensibility.
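
The following C sketch models an idealized 4-stage pipeline with no stalls: the first instruction occupies all 4 stages, and each later instruction finishes one cycle after the previous one, so N instructions take roughly 4 + (N - 1) cycles instead of 4N.

```c
#include <stdio.h>

/* Idealized pipeline model (no stalls): without pipelining, N instructions
 * take N * stages cycles; with pipelining, the first instruction needs all
 * stages and every later instruction completes one cycle after the previous
 * one, giving stages + (N - 1) cycles. */
int main(void) {
    const long stages = 4;      /* fetch, decode, execute, store */
    const long n = 1000000;     /* number of instructions */

    long unpipelined = n * stages;
    long pipelined = stages + (n - 1);

    printf("unpipelined: %ld cycles (CPI = %.2f)\n",
           unpipelined, (double)unpipelined / n);
    printf("pipelined:   %ld cycles (CPI = %.2f)\n",
           pipelined, (double)pipelined / n);
    return 0;
}
```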

Building a Processor

  • The CPU implements the processing and control units of the von Neumann architecture.
  • Key components include the ALU, registers, and control unit.

ALU

  • Performs arithmetic and logical operations on signed and unsigned integers; a separate floating-point unit (FPU) handles arithmetic on floating-point numbers.
  • The ALU takes integer operands and an opcode value that specifies which operation to perform on them (see the sketch below).
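
Here is a minimal C sketch of an ALU modeled as a pure function of an opcode and two operands. The opcode names and the set of operations are chosen for illustration only.

```c
#include <stdint.h>
#include <stdio.h>

/* Minimal ALU sketch: the opcode selects which operation is performed on the
 * two integer operands. The opcode values are invented for illustration. */
typedef enum { ALU_ADD, ALU_SUB, ALU_AND, ALU_OR, ALU_XOR } alu_op;

int32_t alu(alu_op op, int32_t a, int32_t b) {
    switch (op) {
    case ALU_ADD: return a + b;
    case ALU_SUB: return a - b;
    case ALU_AND: return a & b;
    case ALU_OR:  return a | b;
    case ALU_XOR: return a ^ b;
    }
    return 0;
}

int main(void) {
    printf("6 + 7 = %d\n", alu(ALU_ADD, 6, 7));
    printf("6 & 7 = %d\n", alu(ALU_AND, 6, 7));
    return 0;
}
```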

Registers

  • Fast, small storage units within the CPU that hold data and instructions being executed.
  • Common registers include the Program Counter (PC), Instruction Register (IR), and General-Purpose Registers (GPRs).
  • The CPU’s set of general-purpose registers is organized into a register file circuit.
    • A register file consists of a set of register circuits for storing data values and some control circuits for controlling reads and writes to its registers (see the sketch below).
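
Here is a minimal C sketch of a register file as a small data structure: a register number selects which register to read or write, and a write-enable control input gates whether a write takes effect. The register count and width are invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_REGS 16   /* register count and width chosen for illustration */

/* A register file bundles the registers with the control inputs that gate
 * access to them: a register number selects which register to read or write,
 * and a write-enable signal controls whether a write actually happens. */
typedef struct {
    int32_t regs[NUM_REGS];
} register_file;

int32_t regfile_read(const register_file *rf, unsigned regnum) {
    return rf->regs[regnum % NUM_REGS];   /* mask out-of-range numbers */
}

void regfile_write(register_file *rf, unsigned regnum, int32_t value,
                   bool write_enable) {
    if (write_enable)                     /* no write unless enabled */
        rf->regs[regnum % NUM_REGS] = value;
}

int main(void) {
    register_file rf = {0};
    regfile_write(&rf, 3, 42, true);      /* write 42 into r3 */
    regfile_write(&rf, 4, 99, false);     /* write-enable off: r4 unchanged */
    printf("r3 = %d, r4 = %d\n", regfile_read(&rf, 3), regfile_read(&rf, 4));
    return 0;
}
```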