The Brief History Of Computer Architecture Information Technology Essay

The processor (often called the CPU) is the brain of a PC. As its name suggests, a processor is a device that processes something. That something is data, and the data is made up of 0s and 1s (zeroes and ones in digital electronics).

To understand a processor, we need some knowledge of digital systems and how they function. All work inside a PC is carried out by means of voltage or, more accurately, two different voltages.

Digital systems use only two voltages: a low voltage for the off state, represented as 0, and a high voltage for the on state, represented as 1. These 0s and 1s are called bits. A single letter (such as A or B) is stored in 8 bits.

8 bits (for example 10010010) = 1 byte
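The relationship between a letter, its 8 bits and one byte can be sketched in a few lines of Python. This is only an illustration: the helper name to_bits is hypothetical, and the letter is assumed to use its ASCII character code.

```python
def to_bits(value: int, width: int = 8) -> str:
    """Return value as a zero-padded binary string of the given width."""
    return format(value, f"0{width}b")

# The letter 'A' has character code 65, which fits in one byte (8 bits).
letter_bits = to_bits(ord("A"))

# The 8-bit pattern quoted in the text, read back as a number.
example_value = int("10010010", 2)

print(letter_bits)    # 01000001
print(example_value)  # 146
```

Note that the pattern 10010010 is just an example of eight bits; it is not the code of the letter A.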

Processor Architecture

Figure 1

A processor (or CPU) processes bits (binary digits) of data. In its simplest form, the processor retrieves some data as input, performs some process on that data using the ALU, CU and memory, and then stores the result either in its own internal memory (cache) or in the system's memory. This is called output.

Figure 2

As Figure 2 shows, computer architecture is the way we talk to the machine. In practice, computer architecture describes how high-level components fit together and work together to deliver performance.

The Brief History of Computer Architecture

3.1 First Generation (1940-1950) – Vacuum Tube

Figure 3

ENIAC – 1945: Designed by Mauchly & Eckert and built by the US Army to calculate trajectories for ballistic shells during WWII; used 18,000 vacuum tubes and 1,500 relays; programmed by manually setting switches

UNIVAC – 1950: the first commercial computer

John Von Neumann architecture: Goldstine and Von Neumann took the idea of ENIAC and developed the concept of storing a program in memory. This is known as the “Von Neumann” architecture and has been the basis for virtually every machine designed since

Features:

Electron emitting devices

Data and programs are stored in a single read-write memory

Memory contents are addressable by location, regardless of the content itself

Machine language/Assembly language

Sequential execution

Use of drum memory or magnetic core memory, programs and data are loaded using paper tape or punch cards

2 Kb memory, 10 KIPS

Two types of models for a computing machine:

Harvard architecture – This has physically separate storage and signal pathways for its instructions and data. (The term originated with the Harvard Mark I, a relay-based computer system that used punched tape to store instructions and relay latches for data.)

Von Neumann architecture – a single storage structure holds both the set of instructions and the data. Such machines are also known as stored-program computers.

Von Neumann bottleneck – the limited bandwidth, or data transfer rate, between the CPU and memory.

3.2 Second Generation (1950-1964) – Transistors

Figure 4

William Shockley, John Bardeen, and Walter Brattain invented the transistor, which reduced the size of computers and improved their reliability.

First operating Systems: handled one program at a time

On-off switches controlled by electricity

High level languages

Floating point arithmetic

Reduced the computational time from milliseconds to microseconds

1959 – IBM's 7000 series mainframes were the company's first transistorized computers

3.3 Third Generation (1964-1974) – Integrated Circuits (IC)

Figure 5

Microprocessor chips combine thousands of transistors, placing an entire circuit on one computer chip

Semiconductor memory

Multiple computer models with different performance characteristics

Smaller computers that did not need a specialized room

2 Mb memory, 5 MIPS

Use of cache memory

IBM’s System 360 – the first family of computers making a clear distinction between architecture and implementation

3.4 Fourth Generation (1974-present) – Very Large-Scale Integration (VLSI)/Ultra Large-Scale Integration (ULSI)

Combines millions of transistors

Single-chip processor and the single-board computer emerged

Creation of the Personal Computer (PC)

Widespread use of data communications

Artificial intelligence: Functions & logic predicates

Object-Oriented programming: Objects & operations on objects

Massively parallel machine

Smallest in size because of the high component density

1971 – The 4004

Figure 6

The 4004 was the world’s first universal microprocessor, invented by Federico Faggin, Ted Hoff, and Stan Mazor. With just over 2,300 MOS transistors in an area of only 3 by 4 millimeters, it had as much power as the ENIAC.

Features:

4-bit CPU

1K data memory and 4K program memory

Clock rate: 740 kHz

Just a few years later, the word size of the 4004 was doubled to form the 8008.

Intel 8080 Motorola 68000 Intel 386 Alpha 21264

Figure 7

Comparison of CPUs (Intel, Motorola, Alpha)

Intel 8080 (1974): 8-bit data; 16-bit address; 6 μm NMOS; 6K transistors; 2 MHz

Motorola 68000 (1979): 32-bit architecture internally but 16-bit data bus; 16 32-bit registers (8 data and 8 address registers); 2-stage pipeline; no virtual memory support; the 68020 was fully 32-bit externally

Intel 386 (1985): 32-bit data; improved addressing; security modes (kernel, system services, application services, applications)

Alpha 21264: 64-bit address/data; adaptive branch prediction; superscalar; 15.2M transistors; out-of-order execution; 0.35 μm CMOS process; 256 TLB entries; 128 KB cache; 600 MHz

Table 1 – Comparison of CPUs

History of Computer Invention

Year | Inventors/Inventions | Description of Event
1936 | Konrad Zuse – Z1 Computer | First freely programmable computer.
1942 | John Atanasoff & Clifford Berry – ABC Computer | Who was first in the computing biz is not always as easy as ABC.
1944 | Howard Aiken & Grace Hopper – Harvard Mark I Computer | The Harvard Mark 1 computer.
1946 | John Presper Eckert & John W. Mauchly – ENIAC 1 Computer | 20,000 vacuum tubes later…
1948 | Frederic Williams & Tom Kilburn – Manchester Baby Computer & The Williams Tube | Baby and the Williams Tube turn on the memories.
1947/48 | John Bardeen, Walter Brattain & William Shockley – The Transistor | No, a transistor is not a computer, but this invention greatly affected the history of computers.
1951 | John Presper Eckert & John W. Mauchly – UNIVAC Computer | First commercial computer & able to pick presidential winners.
1953 | International Business Machines – IBM 701 EDPM Computer | IBM enters into ‘The History of Computers’.
1954 | John Backus & IBM – FORTRAN Computer Programming Language | The first successful high level programming language.
1955 (In Use 1959) | Stanford Research Institute, Bank of America, and General Electric – ERMA and MICR | The first bank industry computer – also MICR (magnetic ink character recognition) for reading checks.
1958 | Jack Kilby & Robert Noyce – The Integrated Circuit | Otherwise known as ‘The Chip’
1962 | Steve Russell & MIT – Spacewar Computer Game | The first computer game invented.
1964 | Douglas Engelbart – Computer Mouse & Windows | Nicknamed the mouse because the tail came out the end.
1969 | ARPAnet | The original Internet.
1970 | Intel 1103 Computer Memory | The world’s first available dynamic RAM chip.
1971 | Faggin, Hoff & Mazor – Intel 4004 Computer Microprocessor | The first microprocessor.
1971 | Alan Shugart & IBM – The “Floppy” Disk | Nicknamed the “Floppy” for its flexibility.
1973 | Robert Metcalfe & Xerox – The Ethernet Computer Networking | Networking.
1974/75 | Scelbi & Mark-8 Altair & IBM 5100 Computers | The first consumer computers.
1976/77 | Apple I, II & TRS-80 & Commodore Pet Computers | More first consumer computers.
1978 | Dan Bricklin & Bob Frankston – VisiCalc Spreadsheet Software | Any product that pays for itself in two weeks is a surefire winner.
1979 | Seymour Rubenstein & Rob Barnaby – WordStar Software | Word Processors.
1981 | IBM – The IBM PC – Home Computer | From an “Acorn” grows a personal computer revolution
1981 | Microsoft – MS-DOS Computer Operating System | From “Quick And Dirty” comes the operating system of the century.
1983 | Apple Lisa Computer | The first home computer with a GUI, graphical user interface.
1984 | Apple Macintosh Computer | The more affordable home computer with a GUI.
1985 | Microsoft Windows | Microsoft begins the friendly war with Apple.

More information on each invention is available at http://inventors.about.com/library/blcoindex.htm; clicking on a year gives the full details.

Analysis of the Pentium 4 32-bit Microprocessor Architecture

Produced: From 2000 to 2008

Manufacturer: Intel

Max. CPU clock rate: 3.6 GHz

FSB speeds: 400 MHz

Feature size: 180 nm to 65 nm

Instruction set: x86 (i386), MMX, SSE2, rapid execution engine, hyper pipelined technology, advanced dynamic execution, a new cache subsystem

Microarchitecture: NetBurst

Socket(s): Socket 478

Core name(s): Willamette, Northwood, Prescott, Cedar Mill

Details of any computer chip are available at http://happytrees.org/chips?page=manufacturer&manufacturer=Intel&family=4004

The Meaning of a 32-bit Microprocessor

32-bit refers to the number of bits that can be processed or transmitted in parallel, that is, at the same time. A single 32-bit element in a data format consists of four octets (four bytes) and is also called a double word.

The term ‘32-bit’ is also applied in the following contexts:

A 32-bit microprocessor can process data and memory addresses 32 bits wide in its registers.

A 32-bit data bus has 32 physical wires, which can transmit 32 bits in parallel.

In a graphics device such as a scanner or digital camera, the 32 bits are divided into two parts: 24 bits represent each pixel (true colour) and the remaining 8 bits are used for control information.

Within an operating system, it is the number of bits used for memory addresses.

In computer architecture, memory addresses and data units are 32 bits (4 bytes or octets) wide.

A 32-bit integer can store 0 to 4,294,967,295 as an unsigned number, or −2,147,483,648 to 2,147,483,647 using two’s complement encoding.

Four gigabytes (4,294,967,296 bytes) of addressable memory are available for reading and writing data if a processor has 32 memory address lines.
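The 32-bit limits quoted above follow directly from powers of two. The short sketch below recomputes them; the helper to_signed32 is a hypothetical name used only for this illustration of two's complement.

```python
BITS = 32

unsigned_max = 2**BITS - 1          # largest unsigned 32-bit value
signed_min = -(2**(BITS - 1))       # smallest two's complement value
signed_max = 2**(BITS - 1) - 1      # largest two's complement value
addressable_bytes = 2**BITS         # bytes reachable with 32 address lines

def to_signed32(raw: int) -> int:
    """Interpret a raw 32-bit pattern as a two's complement integer."""
    raw &= 0xFFFFFFFF
    return raw - 2**BITS if raw >= 2**(BITS - 1) else raw

print(unsigned_max)             # 4294967295
print(signed_min, signed_max)   # -2147483648 2147483647
print(addressable_bytes)        # 4294967296
print(to_signed32(0xFFFFFFFF))  # -1
```

The same bit pattern (all ones) is 4,294,967,295 when read as unsigned and -1 when read as two's complement, which is why the two ranges differ.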

32-bit has been the most important word size in computing for the last 20 years, because 32-bit CPUs and ALUs are built around registers, address buses, or data buses of that size. The term has become the standard name for such processors.

Address and data buses may be wider than 32 bits, but a 32-bit processor stores and manipulates 32-bit quantities internally.

Pentium IV 3.6GHz Bus Architecture

Figure 8

The North Bridge, or Memory Controller Hub, forms the main communication pathway. This pathway is called the processor bus (front-side bus, FSB) and lies between the CPU and the motherboard chipset. The bus runs at 66 MHz to 800 MHz in modern systems, depending on the motherboard design and its chipset.

Here are the basic differences between the Pentium 4 architecture and other CPU architectures:

The Pentium 4 can transfer four data items in each clock cycle. This is called QDR (Quad Data Rate). The local bus can therefore transfer data four times faster than its actual clock rate (see Table 2 below). When the real clock rate is 100 MHz, the local bus performs like a 400 MHz bus, and the data transfer rate on the system interface is 3.2 GB/s.

Real Clock | Performance | Transfer Rate
100 MHz | 400 MHz | 3.2 GB/s
133 MHz | 533 MHz | 4.2 GB/s
200 MHz | 800 MHz | 6.4 GB/s
266 MHz | 1,066 MHz | 8.5 GB/s

Table 2
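The transfer rates in Table 2 can be reproduced with a small calculation. The sketch below assumes a 64-bit (8-byte) data bus and four transfers per clock, as described in the text; the function name is hypothetical, and the 133 MHz and 266 MHz clocks are taken as 133.33 MHz and 266.67 MHz, which is an assumption made so the results line up with the table.

```python
def fsb_transfer_rate_mb_s(real_clock_mhz: float,
                           transfers_per_clock: int = 4,
                           bus_width_bytes: int = 8) -> float:
    """Peak transfer rate in MB/s of a quad-pumped front-side bus."""
    return real_clock_mhz * transfers_per_clock * bus_width_bytes

for clock_mhz in (100, 133.33, 200, 266.67):
    effective_mhz = clock_mhz * 4
    rate_gb_s = fsb_transfer_rate_mb_s(clock_mhz) / 1000
    print(f"{clock_mhz} MHz real -> {effective_mhz:.0f} MHz effective, "
          f"{rate_gb_s:.2f} GB/s")
```

For the 133 MHz row the calculation gives about 4.27 GB/s, which the table lists as 4.2 GB/s.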

The data path between the L2 cache and the L1 data cache is 256 bits wide. It was 64 bits wide in earlier Intel processors. (See Figure 8: L2 is “L2 cache/control” and L1 is “L1 D-Cache/D-TLB”.) This path is therefore four times faster than in earlier processors running at the same clock. However, in both earlier and current processors the data path between L2 and the pre-fetch unit is 64 bits wide. (The pre-fetch unit is “BTB and I-TLB” in Figure 8.)

The L1 instruction cache was relocated and given a new name, “Trace Cache”. It now sits after the decode unit rather than before the fetch unit. (Some people mistakenly conclude that the L1 instruction cache is missing in the Pentium 4. It is not missing; it simply has a new name and a different place.) The trace cache can hold about 12K microinstructions. If one microinstruction is assumed to be 100 bits wide, this corresponds to a size of about 150 KB (12K × 100 bits / 8 bits per byte).

Earlier Intel processors have 40 internal registers, but the Pentium 4 has 128. These are managed by the register renaming unit and the Register Alias Table, shown in Figure 8 as Rename/Alloc and RAT.

The Pentium 4 is built with five execution units working in parallel. Two of these units load and store data to RAM.

The processor bus normally consists of 50 to 100 physical lines (circuit paths). It is divided into three subassemblies:

The address bus (memory bus) is unidirectional (it carries traffic in one direction only). It transports the memory addresses that the processor needs to access in order to read or write data.

The data bus is bidirectional (it carries traffic in both directions) and transfers instructions and data to and from the processor.

The control bus (command bus) is also bidirectional. It carries orders and synchronization signals from the control unit to all other hardware components, and carries the response signals of the hardware back.

For example, suppose we have a Pentium 4 (3.6 GHz) processor with an 800 MHz bus, whose real clock rate is 200 MHz. Using the formula below, we can calculate its maximum instantaneous transfer rate.

800 MHz x 8 bytes (64 bits) = 6,400 MB/s

Bus Architecture: – Three buses:

Address:

If I/O, a value between 0000H and FFFFH is issued.

If memory, it depends on the architecture:

20-bits (8086/8088)

24-bits (80286/80386SX)

25-bits (80386SL/SLC/EX)

32-bits (80386DX/80486/Pentium)

36-bits (Pentium Pro/II/III)

Data:

8-bits (8088)

16-bits (8086/80286/80386SX/SL/SLC/EX)

32-bits (80386DX/80486)

64-bits (Pentium/Pro/II/III)

Control:

Most systems have at least 4 control bus connections (active low).

MRDC (Memory ReaD Control), MWRC, IORC (I/O Read Control), IOWC.
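The address bus widths listed above determine how much memory each processor family can address, since n address lines select 2^n byte locations. A minimal sketch, with the widths copied from the list and the helper name chosen only for illustration:

```python
# Address bus widths in bits, as listed in the text.
ADDRESS_BUS_BITS = {
    "8086/8088": 20,
    "80286/80386SX": 24,
    "80386SL/SLC/EX": 25,
    "80386DX/80486/Pentium": 32,
    "Pentium Pro/II/III": 36,
}

def addressable_megabytes(address_bits: int) -> int:
    """Memory reachable with the given number of address lines, in MB (2**20 bytes)."""
    return 2**address_bits // 2**20

# I/O addresses run from 0000H to FFFFH, so there are 2**16 I/O ports.
IO_PORTS = 0xFFFF + 1

for cpu, bits in ADDRESS_BUS_BITS.items():
    print(f"{cpu}: {bits} bits -> {addressable_megabytes(bits)} MB")
print(f"I/O ports: {IO_PORTS}")
```

This gives 1 MB for the 8086/8088, 4,096 MB (4 GB) for the 32-bit processors and 65,536 MB (64 GB) for the 36-bit Pentium Pro/II/III.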

Bus Standards:

ISA (Industry Standard Architecture): 8 MHz

8-bit (8086/8088)

16-bit (80286-Pentium)

EISA: 8 MHz

32-bit (older 386 and 486 machines)

PCI (Peripheral Component Interconnect): 33 MHz

32-bit or 64-bit (Pentiums)

New: PCI Express and PCI-X 533 MTS

VESA (Video Electronics Standards Association): Runs at processor speed

32-bit or 64-bit (Pentiums)

Only disk and video. Competes with the PCI but is not popular

USB (Universal Serial Bus): 1.5 Mbps,12 Mbps and now 480 Mbps

Newest systems

Serial connection to microprocessor

For keyboards, the mouse, modems and sound cards

To reduce system cost through fewer wires

AGP (Accelerated Graphics Port): 66 MHz

Newest systems

Fast parallel connection: Across 64-bits for 533MB/sec

For video cards

To accommodate the new DVD (Digital Versatile Disk) players

Latest AGP 3.0 with peak bandwidth of 2.1GB/s

ALU (Arithmetic Logic Unit)

The ALU (Arithmetic Logic Unit) is one of the most important parts of the CPU and handles integer operations. It is a very small unit inside the CPU (see Figure 9). In the Pentium 4, Intel runs this unit at twice the processor clock: if the CPU works at 1.8 GHz, the ALU works at 3.6 GHz. This doubling of the ALU speed does not, however, make other operations faster, such as floating-point, SSE or MMX operations.

Figure 9

ALUs execute simple integer instructions, so the new CPU should do very well in integer operations. However, the doubling of the ALU working frequency has no effect on Pentium 4 performance for floating-point, SSE or MMX operations.

However, taken across all operations (logic, add, subtract, multiply, divide, shift), the ALU latency of a 1.4 GHz Pentium 4 is comparable to that of a 1 GHz Pentium III. The ALU of a 1.4 GHz Pentium 4 takes about 0.35 ns to execute an add operation, while the Pentium III takes 1 ns for the same instruction. So although the ALU frequency was doubled, there is no big difference in overall operation execution time.
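The add latencies quoted above can be checked from the clock rates alone. The sketch below assumes that a simple add completes in exactly one ALU clock cycle on both processors, which is a simplification made for this illustration; the function name is hypothetical.

```python
def cycle_time_ns(frequency_ghz: float) -> float:
    """Duration of one clock cycle in nanoseconds."""
    return 1.0 / frequency_ghz

p4_core_ghz = 1.4
p4_alu_ghz = 2 * p4_core_ghz      # the ALU runs at twice the core clock
p3_core_ghz = 1.0

p4_add_ns = cycle_time_ns(p4_alu_ghz)   # assumed: one ALU cycle per add
p3_add_ns = cycle_time_ns(p3_core_ghz)  # assumed: one core cycle per add

print(f"Pentium 4 1.4 GHz add: {p4_add_ns:.3f} ns")
print(f"Pentium III 1 GHz add: {p3_add_ns:.3f} ns")
```

Under this assumption the Pentium 4 add takes roughly 0.36 ns, close to the 0.35 ns figure in the text, against 1 ns for the Pentium III.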

Further details about ALU and SSE, MMX are available at http://www.xbitlabs.com/articles/cpu/display/pentium4-1400-1.html

MEMORY MANAGEMENT UNIT (MMU)

Figure 10

Memory management is the management of computer memory. In its simplest form, it provides portions of memory to programs when they request them, and frees that memory when it is no longer needed.

The MMU/IOMMU (see note 1 below) is responsible for managing the computer’s memory system. This component is located between the CPU and system memory. The MMU translates CPU-visible virtual addresses to physical addresses; the IOMMU maps device-visible virtual addresses (device addresses or I/O addresses) to physical addresses. They can also provide memory protection against misbehaving devices. The MMU may be a separate chip, but usually it is integrated with the CPU.

There are three areas of memory management:

Hardware memory management

Operating system memory management

Application memory management

Hardware memory management involves random access memory (RAM) and memory caches. RAM is the physical storage located on the system board. It is the main storage that the computer reads data from and writes data to. Memory caches help the CPU reduce its processing time by holding copies of some of the data in main memory.

1. An example of an IOMMU is the Graphics Address Remapping Table (GART) used by AGP and PCI Express graphics cards.

Operating system memory management uses allocated hard disk space as memory when physical memory runs out of space. This hard disk space is called virtual memory (Figure 10). The computer does this automatically when a program requests memory. The allocation is performed by the MMU according to the needs of the operating system and other applications. The virtual address space of the CPU is a range of addresses divided into pages, which allows the operating system to allocate space on the hard disk in equal-sized units (Figure 12).
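How a virtual address is divided into pages can be sketched as follows. The example assumes 32-bit addresses and 4 KB pages, and uses a plain dictionary as a stand-in for the page table; the names, the table contents and the use of None for a page that currently lives on disk are all hypothetical choices for this illustration, not a description of any particular MMU.

```python
PAGE_SIZE = 4096     # assumed page size: 4 KB
OFFSET_BITS = 12     # log2(PAGE_SIZE)

def split_virtual_address(vaddr: int) -> tuple[int, int]:
    """Split a virtual address into (page number, offset within the page)."""
    return vaddr >> OFFSET_BITS, vaddr & (PAGE_SIZE - 1)

def translate(vaddr: int, page_table: dict) -> int:
    """Translate a virtual address to a physical one using a toy page table."""
    page, offset = split_virtual_address(vaddr)
    frame = page_table.get(page)
    if frame is None:
        # The page is not in physical memory; a real system would now
        # load it from the disk area used as virtual memory.
        raise LookupError(f"page fault on page {page:#x}")
    return (frame << OFFSET_BITS) | offset

# Toy page table: page 0 is in physical frame 0xA3, page 1 is on disk.
page_table = {0x0: 0xA3, 0x1: None}

print(split_virtual_address(0x12345678))       # (74565, 1656)
print(hex(translate(0x00000ABC, page_table)))  # 0xa3abc
```

Because every page has the same size, the offset part of the address is copied unchanged and only the page number needs to be looked up.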

Figure 11

Application memory management is the process of allocating the memory a program needs in order to run. In larger operating systems there may be many copies of one program. The memory management unit assigns the program the memory addresses that best fit its run, and copies of the same program are assigned the same addresses. The memory management unit also distributes memory resources to programs according to their needs (garbage collection, see note 1 below). Finally, the memory is recycled for further use when the operation is done.

1. Garbage collection is the automated allocation and release of computer memory resources for a program.

Figure 12

IA-32

Intel introduced the 80386 microprocessor in 1985. It extended the 8086 to 32 bits and established the IA-32 architecture. This architecture has been supported by the P5 (Pentium), P6 (Pentium Pro, II, III), P7 (Pentium 4), and Pentium M processor families over a long time while maintaining full software compatibility with OS code, even though the MMU is extremely complex, with many different possible operating modes.

Table 3 – Summary of Intel IA-32 in major processors

Examples of other well-known memory management architectures

VAX

ARM

IBM System/370 and successors

DEC Alpha

MIPS

Sun 1

PowerPC

IA-32 and x86-64 (an extension of IA-32)

Unisys MCP Systems (Burroughs B5000)

Pentium 4 Pipeline

Pipelining is a technique used in CPUs and other digital electronic devices to increase processing speed by overlapping the fetch, decode and execute steps of successive instructions. The Intel Pentium III uses an 11-stage pipeline, but the Pentium 4 has 20 stages, which allows the Pentium 4 to run at a higher clock rate than a Pentium III. The 90 nm generation of the Pentium 4 (Prescott) goes further still, with a 31-stage pipeline.

A pipeline is used to reduce the time each processing step takes and so to increase the clock rate of the processor. With more stages, each individual stage does less work and can be built from fewer transistors, which helps to achieve higher clock rates. The Pentium 4 is faster than the Pentium III because it can work at a higher clock rate. At the same clock rate, however, a Pentium III would be faster than a Pentium 4 because of its shorter pipeline.
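The trade-off between pipeline depth and clock rate can be illustrated with a very rough model. It assumes an ideal pipeline that completes one instruction per cycle once full, and that every mispredicted branch forces the whole pipeline to be refilled. The instruction count, the number of mispredictions and the clock rates below are hypothetical numbers chosen for the illustration, not measurements.

```python
def ideal_cycles(stages: int, instructions: int) -> int:
    """Cycles for an ideal pipeline: fill it once, then one instruction per cycle."""
    return stages + instructions - 1

def cycles_with_flushes(stages: int, instructions: int, mispredictions: int) -> int:
    """Add a refill penalty of (stages - 1) cycles per mispredicted branch (assumed)."""
    return ideal_cycles(stages, instructions) + mispredictions * (stages - 1)

def run_time_ns(cycles: int, clock_ghz: float) -> float:
    """Convert a cycle count to nanoseconds at the given clock rate."""
    return cycles / clock_ghz

INSTRUCTIONS = 1000     # hypothetical workload
MISPREDICTIONS = 50     # hypothetical number of mispredicted branches

short_pipe = cycles_with_flushes(11, INSTRUCTIONS, MISPREDICTIONS)  # 11 stages
long_pipe = cycles_with_flushes(20, INSTRUCTIONS, MISPREDICTIONS)   # 20 stages

print(short_pipe, long_pipe)  # 1510 1969
print(run_time_ns(short_pipe, 1.0), run_time_ns(long_pipe, 1.0))  # same clock
print(run_time_ns(long_pipe, 1.5))  # deeper pipeline at a higher clock
```

In this model the 11-stage pipeline wins at equal clock rates, while the 20-stage pipeline wins once its clock is raised to 1.5 GHz, which matches the argument made in the text.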

For this reason Intel announced that it would not use the NetBurst (Pentium 4) architecture for its 8th generation processors. Instead it planned to use the Pentium M architecture, which is based on the Pentium III, Intel’s 6th generation architecture.

Figure 13 shows the Pentium 4 20-stage pipeline.

Figure 13

Pentium 4 pipeline.

How a given instruction is processed by the Pentium 4 in each stage (see Figure 13):

TC Nxt IP: Trace cache next instruction pointer. This step looks at the branch target buffer (BTB) for the next microinstruction to be executed. This step uses two stages.

TC Fetch: Trace cache fetch. The microinstruction is fetched from the trace cache. This step uses two stages.

Drive: Sends the microinstruction to be processed to the resource allocator and register renaming circuit.

Alloc: Allocate. Checks which CPU resources the microinstruction will need, for example the memory load and store buffers.

Rename: If the program uses one of the x86 registers, it is renamed into one of the 128 internal registers. This step uses two stages.

Que: Queue. Microinstructions are placed in queues according to their type (for example, integer or floating point) and are held there until a slot of the same type opens in the scheduler.

Sch: Schedule. Microinstructions are scheduled for execution according to their type. Up to this stage they arrive in program order; the scheduler then re-orders them to keep all execution units full. This step uses three stages.

Disp: Dispatch. Sends the microinstructions to their execution engines. This step uses two stages.

RF: Register file. The internal registers, stored in the register file, are read. This step uses two stages.

Ex: Execute. The microinstructions are executed.

Flgs: Flags. The microprocessor flags are updated.

Br Ck: Branch check. Checks whether the branch taken by the program matches what the branch prediction circuit predicted.

Drive: Sends the results of this check to the branch target buffer (BTB) at the processor’s entrance.
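As a cross-check, the steps listed above can be tallied to confirm that they add up to the 20 stages of the Pentium 4 pipeline. The stage counts are taken from the list; steps for which no count is given are assumed to take one stage.

```python
# (step name, number of pipeline stages); steps with no stated count
# are assumed to occupy a single stage.
PENTIUM4_PIPELINE = [
    ("TC Nxt IP", 2),
    ("TC Fetch", 2),
    ("Drive", 1),
    ("Alloc", 1),
    ("Rename", 2),
    ("Que", 1),
    ("Sch", 3),
    ("Disp", 2),
    ("RF", 2),
    ("Ex", 1),
    ("Flgs", 1),
    ("Br Ck", 1),
    ("Drive", 1),
]

total_stages = sum(count for _, count in PENTIUM4_PIPELINE)

# Expand into one entry per stage, e.g. "Sch" appears three times.
stage_sequence = [name for name, count in PENTIUM4_PIPELINE for _ in range(count)]

print(total_stages)  # 20
print(stage_sequence)
```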

Figure 14

F: instruction fetch

D: decode

E: execute

M: memory access

W: register write-back
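The overlap of the five stages F, D, E, M and W across successive instructions can be drawn as a simple text diagram. This is a minimal sketch of an ideal pipeline with no stalls; the function name and the use of a dot for an idle cycle are choices made for this illustration.

```python
STAGES = ["F", "D", "E", "M", "W"]

def pipeline_diagram(num_instructions: int) -> list[str]:
    """One row per instruction, one column per clock cycle, '.' when idle."""
    total_cycles = len(STAGES) + num_instructions - 1
    rows = []
    for i in range(num_instructions):
        cells = ["."] * total_cycles
        # Instruction i enters the pipeline one cycle after instruction i - 1.
        for s, name in enumerate(STAGES):
            cells[i + s] = name
        rows.append(" ".join(cells))
    return rows

for row in pipeline_diagram(3):
    print(row)
# F D E M W . .
# . F D E M W .
# . . F D E M W
```

Three instructions finish in seven cycles instead of the fifteen they would need if each had to complete all five stages before the next one started.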

More details are available at http://www.hardwaresecrets.com/article/Inside-Pentium-4-Architecture/235/3

The Future of Microprocessor Architecture

Sandy Bridge is the new micro-architecture used by Intel CPUs from 2011. It is an evolution of the Nehalem micro-architecture, which was used in the first Core i7, Core i3 and Core i5 processors.

Intel’s 7th generation micro-architecture, called NetBurst and used in the Pentium 4, was not carried forward to the 8th generation. Instead Intel went back to its 6th generation micro-architecture, dubbed P6 and used in the Pentium Pro, Pentium II, and Pentium III, which proved to be more efficient. Intel developed the Core architecture from the Pentium M CPU (a 6th generation CPU) and used it in the Core 2 processor series (Core 2 Duo, Core 2 Quad). Intel then refined this design, adding an integrated memory controller, and released it as the Nehalem micro-architecture, used in the first Core i3, Core i5, and Core i7 processors. All new-generation Core i3, Core i5, and Core i7 processors to be released in 2011 and 2012 use the Sandy Bridge micro-architecture.

The main features of the Sandy Bridge micro-architecture

The north bridge (memory controller, graphics controller and PCI Express controller) is integrated into the same chip as the rest of the CPU.

32-nm manufacturing process

Ring architecture

New decoded microinstruction cache (L0) storing 1,536 microinstructions, which translates to more or less 6 kB

32 kB L1 instruction and 32 kB L1 data cache (same as Nehalem)

L2 was renamed to “mid-level cache” (MLC) with 256 kB

L3 memory cache called LLC (Last Level Cache) which is shared for CPU cores and the graphics engine

Next generation Turbo Boost technology

New AVX (Advanced Vector Extensions) instruction set

Improved graphics controller

Redesigned DDR3 memory controller supporting memories up to DDR3-1333

Integrated PCI Express controller (one x16 lane or two x8 lanes, same as Nehalem)

Socket with 1155 pins

More details are available at http://www.hardwaresecrets.com/article/Inside-the-Intel-Sandy-Bridge-Microarchitecture/1161/1

Conclusions

We cannot reach a definite conclusion on Pentium 4 performance. It has a number of advantages, which will allow Intel to increase the processor’s working frequency easily. However, the Pentium 4 falls behind the Athlon processor in some applications because of its very deep 20-stage pipeline and small L1 data cache, and in other applications its performance is about the same. That is why, in the near future, the Pentium 4 will not be able to beat the Athlon CPU, which can itself reach higher working frequencies using the new Palomino core and DDR SDRAM support.

On the other hand, the Pentium 4 has some serious drawbacks. The first is cost: the CPU is expensive compared to AMD, and so are the RDRAM and the main boards that the Pentium 4 needs. The second is that there is no big difference in application performance compared with the Athlon CPU, which is comparable to the Pentium III.

Intel has since shifted to a new architecture called Sandy Bridge, with a new chipset and new memory, but we do not yet have any idea about its prices. Intel will also shift to 0.13 micron manufacturing technology and new chipsets that can support cheaper memory than today’s RDRAM. Intel could then win the market, including high-end workstations.
