Microcode: Difference between revisions

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.
Comparison of the revision as of 15:13, 17 January 2015 (edit by Dsimic: "The reason for microprogramming: Why was this unlinked?") with the latest revision as of 05:10, 15 January 2025 (minor edit by Cmelrose.29: "fixed citation needed tag formatting").
(388 intermediate revisions by more than 100 users not shown)
{{Short description|Layer of hardware-level instructions or data structures}}
{{For|the CAD software vendor|MicroCode Engineering, Inc.}}
{{Program execution}}


In processor design, '''microcode''' serves as an intermediary layer situated between the central processing unit (CPU) hardware and the programmer-visible instruction set architecture of a computer, also known as its machine code.<ref name="Kent2813">{{cite book |last1=Kent |first1=Allen |url=https://books.google.com/books?id=EjWV8J8CQEYC |title=Encyclopedia of Computer Science and Technology: Volume 28 - Supplement 13 |last2=Williams |first2=James G. |date=April 5, 1993 |publisher=Marcel Dekker, Inc |isbn=0-8247-2281-7 |location=New York |access-date=Jan 17, 2016 |archive-url=https://web.archive.org/web/20161120161636/https://books.google.com/books?id=EjWV8J8CQEYC |archive-date=November 20, 2016 |url-status=live}}</ref>{{Page needed|date=July 2022}} It consists of a set of hardware-level instructions that implement the higher-level machine code instructions or control internal finite-state machine sequencing in many digital processing components. While microcode is used in Intel and AMD general-purpose CPUs in contemporary desktops and laptops, it functions only as a fallback path for scenarios that the faster hardwired control unit is unable to manage.<ref name="FogMicro">{{cite report |url=https://www.agner.org/optimize/microarchitecture.pdf |title=The microarchitecture of Intel, AMD and VIA CPUs |last1=Fog |first1=Agner |date=2017-05-02 |publisher=Technical University of Denmark |access-date=2024-08-21 |archive-url= https://web.archive.org/web/20170328065929/https://agner.org/optimize/microarchitecture.pdf |archive-date=2017-03-28 |url-status=live}}</ref>


Housed in special high-speed memory, microcode translates machine instructions, state machine data, or other input into sequences of detailed circuit-level operations. It separates the machine instructions from the underlying electronics, thereby enabling greater flexibility in designing and altering instructions. Moreover, it facilitates the construction of complex multi-step instructions, while simultaneously reducing the complexity of computer circuits. The act of writing microcode is often referred to as ''microprogramming'', and the microcode in a specific processor implementation is sometimes termed a ''microprogram''.


Through extensive microprogramming, small and simple microarchitectures can emulate more powerful architectures with wider word lengths, additional execution units, and so forth. This approach provides a relatively straightforward method of ensuring software compatibility between different products within a processor family.


Some hardware vendors, notably IBM and Lenovo, use the term ''microcode'' interchangeably with ''firmware''. In this context, all code within a device is termed microcode, whether it is microcode or machine code. For instance, updates to a hard disk drive's microcode often encompass updates to both its microcode and firmware.<ref>{{cite web |title=IBM pSeries Servers - Microcode Update for Ultrastar 73LZX (US73) 18/36 GB |url=http://download.boulder.ibm.com/ibmdl/pub/software/server/firmware/73lzx.html |url-status=dead |archive-url=https://web.archive.org/web/20190419105117/http://download.boulder.ibm.com/ibmdl/pub/software/server/firmware/73lzx.html |archive-date=April 19, 2019 |access-date=January 22, 2015 |website=IBM}}</ref>


==Overview==
===Instruction sets===
At the hardware level, processors contain a number of separate areas of circuitry, or "units", that perform different tasks. Commonly found units include the arithmetic logic unit (ALU), which performs operations such as adding or comparing two numbers, circuits for reading and writing data to external memory, and small areas of onboard memory to store these values while they are being processed. In most designs, additional high-performance memory, the register file, is used to store temporary values, not just those needed by the current instruction.<ref name=CPU>{{cite web |url=https://www.redhat.com/sysadmin/cpu-components-functionality |title=The central processing unit (CPU): Its components and functionality |website=Red Hat |first=David |last=Both |date=23 July 2020}}</ref>

To properly perform an instruction, the various circuits have to be activated in order. For instance, it is not possible to add two numbers if they have not yet been loaded from memory. In RISC designs, the proper ordering of these instructions is largely up to the programmer, or at least to the compiler of the programming language they are using. So to add two numbers, for instance, the compiler may output instructions to load one of the values into one register, load the second into another, call the addition function in the ALU, and then write the result back out to memory.<ref name=CPU/>


Because the higher-level concept "add these two numbers in memory" may require multiple instructions, it can become a performance bottleneck if those instructions are stored in main memory. Reading those instructions one by one takes up time that could be used to read and write the actual data. For this reason, it is common for non-RISC designs to have many different instructions that differ largely in where they store data. For instance, the MOS 6502 has eight variations of the addition instruction, {{code|ADC}}, which differ only in where they look to find the two operands.<ref>{{cite web |url=http://www.6502.org/tutorials/6502opcodes.html |title= NMOS 6502 Opcodes |first= John |last=Pickens |website=6502.org}}</ref>
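As a rough illustration of the eight {{code|ADC}} variations, the sketch below lists the NMOS 6502 opcode bytes by addressing mode. The table and the helper name <code>addressing_mode</code> are illustrative only and are not taken from the cited source; the point is that every entry performs the same addition and differs only in how the operand is located.

```python
# The eight NMOS 6502 encodings of ADC (add with carry).
# Every opcode performs the same addition; only the operand lookup differs.
ADC_OPCODES = {
    0x69: "immediate",
    0x65: "zero page",
    0x75: "zero page,X",
    0x6D: "absolute",
    0x7D: "absolute,X",
    0x79: "absolute,Y",
    0x61: "(indirect,X)",
    0x71: "(indirect),Y",
}


def addressing_mode(opcode: int) -> str:
    """Return the addressing mode for an ADC opcode byte (hypothetical helper)."""
    return ADC_OPCODES[opcode]
```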


Using the variation of the instruction, or "opcode", that most closely matches the ultimate operation can reduce the number of instructions to one, saving memory used by the program code and improving performance by leaving the data bus open for other operations. Internally, however, these instructions are not separate operations, but sequences of the operations the units actually perform. Converting a single instruction read from memory into the sequence of internal actions is the duty of the control unit, another unit within the processor.<ref name=microcode>{{cite web |url=http://www.righto.com/2022/11/how-8086-processors-microcode-engine.html#:~:text=In%201951%2C%20Maurice%20Wilkes%20came,memory%20called%20a%20control%20store. |title=How the 8086 processor's microcode engine works |website=Ken Shirriff's blog |first=Ken |last=Shirriff}}</ref>

===Microcode===
The basic idea behind microcode is to replace the custom hardware logic implementing the instruction sequencing with a series of simple instructions run in a "microcode engine" in the processor. Whereas a custom logic system might have a series of diodes and gates that output a series of voltages on various control lines, the microcode engine is connected to these lines instead, and these are turned on and off as the engine reads the microcode instructions in sequence. The microcode instructions are often bit-encoded to those lines; for instance, if bit 8 is true, that might mean that the ALU should be paused awaiting data. In this respect microcode is somewhat similar to the paper rolls in a player piano, where the holes represent which key should be pressed.

The distinction between custom logic and microcode may seem small: one uses a pattern of diodes and gates to decode the instruction and produce a sequence of signals, whereas the other encodes the signals as microinstructions that are read in sequence to produce the same results. The critical difference is that in a custom logic design, changes to the individual steps require the hardware to be redesigned. With microcode, all that changes is the code stored in the memory containing the microcode. This makes it much easier to fix problems in a microcode system. It also means that there is no effective limit to the complexity of the instructions; it is limited only by the amount of memory one is willing to use.
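The idea of a microcode engine can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a description of any real processor: the control-line names (<code>LOAD_A</code>, <code>LOAD_B</code>, <code>ALU_ADD</code>, <code>STORE</code>), their bit positions, and the <code>execute</code> function are all hypothetical. Each control word is a bit pattern whose set bits switch on control lines for one step, and changing an instruction's behaviour means editing the stored words rather than the engine.

```python
# Hypothetical control lines, one bit each; the positions are arbitrary.
LOAD_A = 1 << 0   # latch the first operand from memory
LOAD_B = 1 << 1   # latch the second operand from memory
ALU_ADD = 1 << 2  # have the ALU add the two latched operands
STORE = 1 << 3    # write the ALU result back to memory

# Control store: the microprogram for each machine instruction is a list of
# control words that the engine reads in sequence.
CONTROL_STORE = {
    "ADD": [LOAD_A, LOAD_B, ALU_ADD, STORE],
}


def execute(instruction, memory, src_a, src_b, dst):
    """Step through a microprogram; return the number of microsteps taken."""
    a = b = result = 0
    steps = 0
    for word in CONTROL_STORE[instruction]:
        # Each set bit turns on one control line for this step.
        if word & LOAD_A:
            a = memory[src_a]
        if word & LOAD_B:
            b = memory[src_b]
        if word & ALU_ADD:
            result = a + b
        if word & STORE:
            memory[dst] = result
        steps += 1
    return steps


memory = [2, 3, 0]
steps_before = execute("ADD", memory, 0, 1, 2)

# Changing the instruction only means rewriting the stored words;
# the engine itself (the "hardware") is untouched.
CONTROL_STORE["ADD"] = [LOAD_A | LOAD_B, ALU_ADD, STORE]
memory_after = [10, 20, 0]
steps_after = execute("ADD", memory_after, 0, 1, 2)
```

In this sketch the second version of the microprogram loads both operands in the same step, which shortens the instruction without any change to <code>execute</code>.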

The lowest layer in a computer's software stack is traditionally raw machine code instructions for the processor. In microcoded processors, fetching and decoding those instructions, and executing them, may be done by microcode. To avoid confusion, each microprogram-related element is differentiated by the ''micro'' prefix: microinstruction, microassembler, microprogrammer, etc.<ref>{{Cite web |title=ISO/IEC/IEEE 24765:2017(en) Systems and software engineering — Vocabulary |url=https://www.iso.org/obp/ui/#iso:std:iso-iec-ieee:24765:ed-2:v1:en |access-date=2024-06-23 |website=www.iso.org}}</ref>

Complex digital processors may also employ more than one (possibly microcode-based) control unit in order to delegate sub-tasks that must be performed essentially asynchronously in parallel. For example, the VAX 9000 has a hardwired IBox unit to fetch and decode instructions, which it hands to a microcoded EBox unit to be executed,<ref>{{cite book|url=http://www.bitsavers.org/pdf/dec/vax/9000/EK-KA90S-TD-001_VAX_9000_System_Technical_Description_May90.pdf|title=VAX 9000 System Technical Description|publisher=Digital Equipment Corporation|date=May 1990|id=EK-KA90S-TD-001|pages=3{{hyp}}5-3{{hyp}}32}}</ref> and the VAX 8800 has both a microcoded IBox and a microcoded EBox.<ref>{{cite book|url=http://bitsavers.org/pdf/dec/vax/8800/EK-KA882_8800sysTech2_Jul86.pdf|title=VAX 8800 System Technical Description Volume 2|publisher=Digital Equipment Corporation|date=July 1986|id=EK-KA882-TD-PRE}}</ref>

A high-level programmer, or even an assembly language programmer, does not normally see or change microcode. Unlike machine code, which often retains some backward compatibility among different processors in a family, microcode only runs on the exact electronic circuitry for which it is designed, as it constitutes an inherent part of the particular processor design itself.

===Design===
Engineers normally write the microcode during the design phase of a processor, storing it in a read-only memory (ROM) or programmable logic array (PLA)<ref>{{cite journal |last1=Manning |first1=B.M. |last2=Mitby |first2=J.S |last3=Nicholson |first3=J.O. |title=Microprogrammed Processor Having PLA Control Store |journal=IBM Technical Disclosure Bulletin |volume=22 |issue=6 |date=November 1979 |url=http://www.computerhistory.org/collections/accession/102660026 |access-date=2011-07-10 |url-status=live |archive-url=https://web.archive.org/web/20121001165413/http://www.computerhistory.org/collections/accession/102660026 |archive-date=2012-10-01}}</ref> structure, or in a combination of both.<ref>Often denoted a ROM/PLA control store in the context of usage in a CPU; {{cite web |last=Supnik |first=Bob |date=24 February 2008 |title=J-11: DEC's fourth and last PDP-11 microprocessor design ... features ... ROM/PLA control store |url=http://simh.trailing-edge.com/semi/j11.html |access-date=2011-07-10 |url-status=live |archive-url=https://web.archive.org/web/20110709032923/http://simh.trailing-edge.com/semi/j11.html |archive-date=2011-07-09}}</ref> However, machines also exist that have some or all microcode stored in static random-access memory (SRAM) or flash memory. This is traditionally denoted as ''writable control store'' in the context of computers, which can be either read-only or read–write memory. In the latter case, the CPU initialization process loads microcode into the control store from another storage medium, with the possibility of altering the microcode to correct bugs in the instruction set, or to implement new machine instructions.

===Microprograms===

Microprograms consist of a series of microinstructions, which control the CPU at a very fundamental level of hardware circuitry. For example, a single typical ''horizontal'' microinstruction might specify the following operations:
* Connect register 1 to the ''A'' side of the ALU
* Connect register 7 to the ''B'' side of the ALU
* Set the ALU to perform two's-complement addition
* Set the ALU's carry input to zero
* Store the result value in register 8
* Update the condition codes from the ALU status flags (''negative'', ''zero'', ''overflow'', and ''carry'')
* Microjump to a given μPC (microprogram counter) address for the next microinstruction
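The operations listed above can be pictured as bit fields packed side by side into one wide control word. The following Python sketch assumes a made-up field layout (the names and widths in <code>FIELDS</code>, the <code>ALU_ADD</code> code, and the next address <code>0x123</code> are all hypothetical); real microinstruction formats differ from machine to machine and are usually wider.

```python
# Hypothetical field layout: (name, width in bits), lowest bits first.
FIELDS = [
    ("a_sel", 4),       # register connected to the A side of the ALU
    ("b_sel", 4),       # register connected to the B side of the ALU
    ("alu_op", 3),      # ALU function select
    ("carry_in", 1),    # ALU carry input
    ("dest", 4),        # register that receives the result
    ("update_cc", 1),   # update condition codes from the ALU flags
    ("next_addr", 12),  # microjump target for the next microinstruction
]

ALU_ADD = 0b001  # assumed code for two's-complement addition


def pack(**values):
    """Pack named field values into a single microinstruction word."""
    word, shift = 0, 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << shift
        shift += width
    return word


def unpack(word):
    """Split a microinstruction word back into its named fields."""
    fields, shift = {}, 0
    for name, width in FIELDS:
        fields[name] = (word >> shift) & ((1 << width) - 1)
        shift += width
    return fields


# The example microinstruction from the list: R8 <- R1 + R7, then microjump.
micro = pack(a_sel=1, b_sel=7, alu_op=ALU_ADD, carry_in=0,
             dest=8, update_cc=1, next_addr=0x123)
```

Because every field has its own bits, all of the operations take effect in the same cycle; this is what makes the format "horizontal".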


To control all of the processor's features simultaneously in one cycle, the microinstruction is often wider than 50 bits; e.g., 128 bits on a 360/85 with an emulator feature. Microprograms are carefully designed and optimized for the fastest possible execution, as a slow microprogram would result in a slow machine instruction and degraded performance for related application programs that use such instructions.


==Justification==
Microcode was originally developed as a simpler method of developing the control logic for a computer. Initially, CPU instruction sets were hard-wired. Each step needed to fetch, decode, and execute the machine instructions (including any operand address calculations, reads, and writes) was controlled directly by combinational logic and rather minimal sequential state machine circuitry. While such hard-wired processors were very efficient, the need for powerful instruction sets with multi-step addressing and complex operations (''see below'') made them difficult to design and debug; highly encoded and varied-length instructions can contribute to this as well, especially when very irregular encodings are used.


Microcode simplified the job by allowing much of the processor's behaviour and programming model to be defined via microprogram routines rather than by dedicated circuitry. Even late in the design process, microcode could easily be changed, whereas hard-wired CPU designs were very cumbersome to change. Thus, this greatly facilitated CPU design.


From the 1940s to the late 1970s, a large portion of programming was done in assembly language; higher-level instructions mean greater programmer productivity, so an important advantage of microcode was the relative ease with which powerful machine instructions can be defined. The ultimate extension of this are "Directly Executable High Level Language" designs, in which each statement of a high-level language such as PL/I is entirely and directly executed by microcode, without compilation. The IBM Future Systems project and Data General Fountainhead Processor are examples of this. During the 1970s, CPU speeds grew more quickly than memory speeds and numerous techniques such as memory block transfer, memory pre-fetch and multi-level caches were used to alleviate this. High-level machine instructions, made possible by microcode, helped further, as fewer, more complex machine instructions require less memory bandwidth. For example, an operation on a character string can be done as a single machine instruction, thus avoiding multiple instruction fetches.


Architectures with instruction sets implemented by complex microprograms included the IBM System/360 and Digital Equipment Corporation VAX. The approach of increasingly complex microcode-implemented instruction sets was later called complex instruction set computer (CISC). An alternate approach, used in many microprocessors, is to use one or more programmable logic arrays (PLAs) or read-only memories (ROMs) (instead of combinational logic) mainly for instruction decoding, and let a simple state machine (without much, or any, microcode) do most of the sequencing. The MOS Technology 6502 is an example of a microprocessor using a PLA for instruction decode and sequencing. The PLA is visible in photomicrographs of the chip,<ref>{{cite web |url=http://www.visual6502.org/images/6502/ |title=6502 Images |access-date=January 22, 2015 |url-status=live |archive-url=https://web.archive.org/web/20160304093548/http://www.visual6502.org/images/6502/ |archive-date=March 4, 2016}}</ref> and its operation can be seen in the transistor-level simulation.


Microprogramming is still used in modern CPU designs. In some cases, after the microcode is debugged in simulation, logic functions are substituted for the control store.{{citation needed|date=March 2014}} Logic functions are often faster and less expensive than the equivalent microprogram memory.


===Benefits===
A processor's microprograms operate on a more primitive, totally different, and much more hardware-oriented architecture than the assembly instructions visible to normal programmers. In coordination with the hardware, the microcode implements the programmer-visible architecture. The underlying hardware need not have a fixed relationship to the visible architecture. This makes it easier to implement a given instruction set architecture on a wide variety of underlying hardware micro-architectures.


The IBM System/360 has a 32-bit architecture with 16 general-purpose registers, but most of the System/360 implementations use hardware that implements a much simpler underlying microarchitecture; for example, the System/360 Model 30 has 8-bit data paths to the arithmetic logic unit (ALU) and main memory and implements the general-purpose registers in a special unit of higher-speed core memory, and the System/360 Model 40 has 8-bit data paths to the ALU and 16-bit data paths to main memory and also implements the general-purpose registers in a special unit of higher-speed core memory. The System/360 Model 50 has full 32-bit data paths and implements the general-purpose registers in a special unit of higher-speed core memory.<ref>{{cite book|title=IBM System/360 Model 50 Functional Characteristics|url=http://bitsavers.org/pdf/ibm/360/functional_characteristics/A22-6898-1_360-50_funcChar_1967.pdf|publisher=IBM|id=A22-6898-1|page=7|year=1967|access-date=October 29, 2021}}</ref> The Model 65 through the Model 195 have larger data paths and implement the general-purpose registers in faster transistor circuits.{{Citation needed|date=September 2011}} In this way, microprogramming enabled IBM to design many System/360 models with substantially different hardware and spanning a wide range of cost and performance, while making them all architecturally compatible. This dramatically reduces the number of unique system software programs that must be written for each model.


A similar approach was used by Digital Equipment Corporation (DEC) in their VAX family of computers. As a result, different VAX processors use different microarchitectures, yet the programmer-visible architecture does not change.


Microprogramming also reduces the cost of field changes to correct defects (bugs) in the processor; a bug can often be fixed by replacing a portion of the microprogram rather than by changes being made to hardware logic and wiring.


==History==
===Early examples===
In 1947, the design of the MIT Whirlwind introduced the concept of a control store as a way to simplify computer design and move beyond ''ad hoc'' methods. The control store is a diode matrix: a two-dimensional lattice, where one dimension accepts "control time pulses" from the CPU's internal clock, and the other connects to control signals on gates and other circuits. A "pulse distributor" takes the pulses generated by the CPU clock and breaks them up into eight separate time pulses, each of which activates a different row of the lattice. When the row is activated, it activates the control signals connected to it.<ref>{{Cite tech report |last1=Everett |first1=R.R. |last2=Swain |first2=F.E. |year=1947 |title=Whirlwind I Computer Block Diagrams |publisher=MIT Servomechanisms Laboratory |id=R-127 |url=http://www.cryptosmith.com/wp-content/uploads/2009/05/whirlwindr-127.pdf |access-date=June 21, 2006 |url-status=dead |archive-url=https://web.archive.org/web/20120617112919/http://www.cryptosmith.com/wp-content/uploads/2009/05/whirlwindr-127.pdf |archive-date=June 17, 2012}}</ref>


In 1951, Maurice Wilkes<ref>{{multiref|{{cite tech report |last=Wilkes |first=Maurice |year=1951 |title=The Best Way to Design an Automatic Calculating Machine |institution=University of Manchester}}|{{cite book |last=Wilkes |first=Maurice |chapter=The Best Way to Design an Automatic Calculating Machine |chapter-url=https://www.cs.princeton.edu/courses/archive/fall09/cos375/BestWay.pdf |editor-first=M. |editor-last=Campbell-Kelly |title=The early British computer conferences |publisher=MIT Press |date=1989 |isbn=978-0-262-23136-7 |pages=182–4 }}}}</ref> enhanced this concept by adding ''conditional execution'', akin to a conditional in computer software. His initial implementation consisted of a pair of matrices: the first one generated signals in the manner of the Whirlwind control store, while the second matrix selected which row of signals (the microprogram instruction word, so to speak) to invoke on the next cycle. Conditionals were implemented by providing a way that a single line in the control store could choose from alternatives in the second matrix. This made the control signals conditional on the detected internal signal. Wilkes coined the term ''microprogramming'' to describe this feature and distinguish it from a simple control store.
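The two-matrix arrangement can be modelled loosely in Python. The sketch below is an illustration only and does not reproduce Wilkes's actual design: the signal names, the four-row microprogram, and the <code>run</code> function are invented for the example. Each row pairs a set of control signals (the first matrix) with two candidate next rows (the second matrix), and an internal flag selects between them.

```python
# Each row: (control signals, next row if flag is False, next row if flag is True).
# The signal set plays the role of the first matrix; the two next-row entries
# play the role of the second matrix. All names are hypothetical.
MICROPROGRAM = {
    0: ({"load_counter"}, 1, 1),
    1: ({"decrement"}, 2, 2),
    2: ({"test_zero"}, 1, 3),   # conditional: loop until the counter reaches zero
    3: ({"halt"}, None, None),
}


def run(counter_start):
    """Run the microprogram and return the sequence of rows visited."""
    if counter_start < 1:
        raise ValueError("counter_start must be at least 1")
    counter, flag, row, trace = 0, False, 0, []
    while row is not None:
        signals, if_false, if_true = MICROPROGRAM[row]
        trace.append(row)
        if "load_counter" in signals:
            counter = counter_start
        if "decrement" in signals:
            counter -= 1
        if "test_zero" in signals:
            flag = (counter == 0)
        # The detected internal signal chooses between the two alternatives.
        row = if_true if flag else if_false
    return trace
```

With a starting count of 2, the rows are visited in the order 0, 1, 2, 1, 2, 3: row 2 sends control back to row 1 until the counter reaches zero.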


===The 360===
{{main|System/360}}
Microcode remained relatively rare in computer design as the cost of the ROM needed to store the code was not significantly different from the cost of using a custom control store. This changed through the early 1960s with the introduction of mass-produced core memory and core rope, which were far less expensive than dedicated logic based on diode arrays or similar solutions. The first to take real advantage of this was IBM in their 1964 System/360 series. This allowed the machines to have a very complex instruction set, including operations that matched high-level language constructs like formatting binary values as decimal strings, with the complex series of instructions needed for such tasks stored in low-cost memory.<ref name=IBM>{{cite web |url=https://www.righto.com/2022/01/ibm360model50.html |title=Simulating the IBM 360/50 mainframe from its microcode |website=Ken Shirriff's blog |first=Ken |last=Shirriff}}</ref>


But the real value in the 360 line was that one could build a series of machines that were completely different internally, yet run the same ISA. For a low-end machine, one might use an 8-bit ALU that requires multiple cycles to complete a single 32-bit addition, while a higher-end machine might have a full 32-bit ALU that performs the same addition in a single cycle. These differences could be implemented in control logic, but the cost of implementing a completely different decoder for each machine would be prohibitive. Using microcode meant all that changed was the code in the ROM. For instance, one machine might include a ] and thus its microcode for multiplying two numbers might be only a few lines long, whereas on the same machine without the FPU this would be a program that did the same using multiple additions, and all that changed was the ROM.<ref name=IBM/>
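
The trade-off between a narrow and a wide ALU can be sketched in Python. This is only an illustration under assumed names (<code>alu8_add</code>, <code>add32_low_end</code>, <code>add32_high_end</code>); it does not reproduce any actual System/360 microprogram, and the "cycles" it counts are simply ALU passes.

```python
def alu8_add(a, b, carry_in):
    """One pass through a hypothetical 8-bit ALU: returns (sum byte, carry out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8


def add32_low_end(x, y):
    """32-bit addition as four 8-bit steps, as a low-end microprogram might do it."""
    result, carry, cycles = 0, 0, 0
    for byte_index in range(4):
        shift = 8 * byte_index
        byte_sum, carry = alu8_add((x >> shift) & 0xFF, (y >> shift) & 0xFF, carry)
        result |= byte_sum << shift
        cycles += 1
    return result, cycles  # the final carry out is discarded (wraps at 32 bits)


def add32_high_end(x, y):
    """The same architectural operation on a full 32-bit ALU: one step."""
    return (x + y) & 0xFFFFFFFF, 1
```

Both functions return the same 32-bit result for the same operands; only the number of ALU passes differs, which is the distinction the microcode hides from the programmer.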
In 1951 ] enhanced this concept by adding ''conditional execution'', a concept akin to a ] in computer software. His initial implementation consisted of a pair of matrices, the first one generated signals in the manner of the Whirlwind control store, while the second matrix selected which row of signals (the microprogram instruction word, as it were) to invoke on the next cycle. Conditionals were implemented by providing a way that a single line in the control store could choose from alternatives in the second matrix. This made the control signals conditional on the detected internal signal. Wilkes coined the term '''microprogramming''' to describe this feature and distinguish it from a simple control store.
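
A minimal sketch of the two-matrix arrangement, written in Python under assumed signal names and an invented four-row microprogram, might look like the following. The first table plays the role of the signal matrix and the second selects the next row, with one row offering two alternatives chosen by a tested condition.

```python
# Matrix A: the control signals each row raises (names are illustrative).
SIGNAL_MATRIX = {
    0: {"load_counter"},
    1: {"alu_add"},
    2: {"decrement_counter", "test_counter_zero"},
    3: {"store_result"},
}

# Matrix B: the row to use on the next cycle. A plain int is unconditional;
# a (row_if_false, row_if_true) pair is chosen by the tested condition.
NEXT_ROW_MATRIX = {
    0: 1,
    1: 2,
    2: (1, 3),   # loop back to row 1 until the counter reaches zero
    3: None,     # end of this microprogram
}


def run_microprogram(count):
    """Return the sequence of rows visited when the add step repeats `count` times."""
    if count < 1:
        raise ValueError("count must be at least 1")
    row, counter, trace = 0, count, []
    while row is not None:
        trace.append(row)
        if "decrement_counter" in SIGNAL_MATRIX[row]:
            counter -= 1
        next_row = NEXT_ROW_MATRIX[row]
        if isinstance(next_row, tuple):
            next_row = next_row[1] if counter == 0 else next_row[0]
        row = next_row
    return trace
```

Without the pair in row 2, the sequence of rows would be fixed; with it, the same control store produces a loop whose length depends on an internal signal.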


The outcome of this design was that customers could use a low-end model of the family to develop their software, knowing that if more performance was ever needed, they could move to a faster version and nothing else would change. This lowered the barrier to entry and the 360 was a runaway success. By the end of the decade, the use of microcode was ''de rigueur'' across the mainframe industry.
==Examples of microprogrammed systems==
] BIOS showing a "] CPU uCode Loading Error" after a failed attempt to upload microcode patches into the CPU.]]<!-- probably just because the CPU revision isn't recognized by this BIOS revision.-->


===Moving up the line===
* In common with many other complex mechanical devices, ] ] used banks of ] to control each operation. That is, it had a read-only control store. As such it deserves to be recognised as the first microprogrammed computer to be designed, even if it has not yet been realised in hardware.{{citation needed|date=November 2013}}
] is stored in the two large square blocks in the upper right and controlled by circuitry to the right of it. It takes up a significant amount of the total chip surface.]]
* The ]<ref>{{cite web|url=http://www.emidec.org.uk/ |title=EMIDEC 1100 computer |publisher=Emidec.org.uk |date= |accessdate=2010-04-26}}</ref> reputedly used a hard-wired control store consisting of wires threaded through ferrite cores, known as 'the laces'.
Early ]s were far too simple to require microcode, and were more similar to earlier mainframes in terms of their instruction sets and the way they were decoded. But it was not long before their designers began using more powerful ]s that allowed for more complex ISAs. By the mid-1970s, most new minicomputers and ]s were using microcode as well, such as most models of the ] and, most notably, most models of the ], which included high-level instructions not unlike those found in the 360.<ref>{{cite book |date=May 1988 |title=VLSI VAX Micro-Architecture |first=Bob |last=Supnik |publisher=Digital Equipment |url=http://simh.trailing-edge.com/docs/microarch.pdf}}</ref>
* Most models of the IBM System/360 series were microprogrammed:
:* The Model 25 was unique among System/360 models in using the top 16k bytes of core storage to hold the control storage for the microprogram. The 2025 used a 16-bit microarchitecture with seven control words (or microinstructions). At power up, or full system reset, the microcode was loaded from the card reader. The ] emulation for this model was loaded this way.
:* The ], the slowest model in the line, used an 8-bit microarchitecture with only a few hardware registers; everything that the programmer saw was emulated by the microprogram. The microcode for this model was also held on special punched cards, which were stored inside the machine in a dedicated reader per card, called "CROS" units (Capacitor Read-Only Storage). A second CROS reader was installed for machines ordered with 1620 emulation.
:* The Model 40 used 56-bit control words. The 2040 box implements both the System/360 main processor and the multiplex channel (the I/O processor). This model used "TROS" dedicated readers similar to "CROS" units, but with an inductive pickup (Transformer Read-only Store).
:* The Model 50 had two internal datapaths which operated in parallel: a 32-bit datapath used for arithmetic operations, and an 8-bit data path used in some logical operations. The control store used 90-bit microinstructions.
:* The Model 85 had separate instruction fetch (I-unit) and execution (E-unit) to provide high performance. The I-unit is hardware controlled. The E-unit is microprogrammed; the control words are 108 bits wide on a basic 360/85 and wider if an emulator feature is installed.
* The ] was microprogrammed with hand wired ferrite cores (a ]) pulsed by a sequencer with conditional execution. Wires routed through the cores were enabled for various data and logic elements in the processor.
* The Digital Equipment Corporation ] processors, with the exception of the PDP-11/20, were microprogrammed.<ref>{{cite book|author=Daniel P. Siewiorek, ], ]|title=Computer Structures: Principles and Examples|publisher=]|location=]|year=1982|isbn=0-07-057302-6}}</ref>
* Most ] minicomputers were microprogrammed. The task of writing microcode for the ] was detailed in the Pulitzer Prize-winning book ].
* Many systems from ] were microprogrammed:
:* The B700 "microprocessor" executed application-level opcodes using sequences of 16-bit microinstructions stored in main memory; each of these was either a register-load operation or mapped to a single 56-bit "nanocode" instruction stored in read-only memory. This allowed comparatively simple hardware to act either as a mainframe peripheral controller or to be packaged as a standalone computer.
:* The ] was implemented with radically different hardware including bit-addressable main memory but had a similar multi-layer organisation. The operating system would preload the interpreter for whatever language was required. These interpreters presented different virtual machines for ], ], etc.
* ] produced computers in which the microcode was accessible to the user; this allowed the creation of custom assembler level instructions. Microdata's ] operating system design made extensive use of this capability.
* The ]'s ], which serves as the console's ] and audio processor, utilized microcode; it is possible to implement new effects or tweak the processor to achieve the desired output. Some well-known examples of custom microcode include ]'s Nintendo 64 ports of the '']'', '']'' and '']''.
* The VU0 and VU1 vector units in the ] ] are microprogrammable; in fact, VU1 was only accessible via microcode for the first several generations of the SDK.


The same basic evolution occurred with ]s as well. Early designs were extremely simple, and even the more powerful 8-bit designs of the mid-1970s like the ] had instruction sets that were simple enough to be implemented in dedicated logic. By this time, the control logic could be patterned into the same die as the CPU, making the difference in cost between ROM and logic less of an issue. However, it was not long before these companies were also facing the problem of introducing higher-performance designs but still wanting to offer ]. Among early examples of microcode in micros was the ].<ref name=microcode/>
==Implementation==
Each microinstruction in a microprogram provides the bits that control the functional elements that internally compose a CPU. The advantage over a hard-wired CPU is that internal CPU control becomes a specialized form of a computer program. Microcode thus transforms a complex electronic design challenge (the control of a CPU) into a less complex programming challenge.


Among the ultimate implementations of microcode in microprocessors is the ]. This offered a highly ] with a wide variety of ]s, all implemented in microcode. This did not come without cost: according to early articles, about 20% of the chip's surface area (and thus cost) is the microcode system<ref>{{cite magazine |magazine=Byte |date= April 1983 |title=Design Philosophy Behind Motorola's MC68000 |first= Thomas |last= Starnes |url=http://www.easy68k.com/paulrsm/doc/dpbm68k1.htm}}</ref> and {{cn|reason=later estimates suggest approximately 23,000|date=December 2024}} of the system's 68,000 transistors were part of the microcode system.
To take advantage of this, computers were divided into several parts:


===RISC enters===
A ] picked the next word of the control store. A sequencer is mostly a counter, but usually also has some way to jump to a different part of the control store depending on some data, usually data from the ] and always some part of the control store. The simplest sequencer is just a register loaded from a few bits of the control store.
While companies continued to compete on the complexity of their instruction sets, and the use of microcode to implement these was unquestioned, in the mid-1970s an internal project in IBM was raising serious questions about the entire concept. As part of a project to develop a high-performance all-digital ], a team led by ] began examining huge volumes of performance data from their customers' 360 (and ]) programs. This led them to notice a curious pattern: when the ISA presented multiple versions of an instruction, the ] almost always used the simplest one, instead of the one most directly representing the code. They learned that this was because those instructions were always implemented in hardware, and thus ran the fastest. Using the other instruction might offer higher performance on some machines, but there was no way to know what machine they were running on. This defeated the purpose of using microcode in the first place, which was to hide these distinctions.<ref name=risc>{{Cite journal
| last1 = Cocke | first1 = John
| last2 = Markstein | first2 = Victoria
| doi = 10.1147/rd.341.0004
| url = https://www.cis.upenn.edu/~milom/cis501-Fall11/papers/cocke-RISC.pdf
| title = The evolution of RISC technology at IBM
| journal = IBM Journal of Research and Development
| volume = 34| issue = 1
| pages = 4–11
| date=January 1990
}}</ref>


The team came to a radical conclusion: "Imposing microcode between a computer and its users imposes an expensive overhead in performing the most frequently executed instructions."<ref name=risc/>
A ] set is a fast memory containing the data of the central processing unit. It may include the program counter, stack pointer, and other numbers that are not easily accessible to the application programmer. Often the register set is a triple-ported ]; that is, two registers can be read, and a third written at the same time.


The result of this discovery was what is today known as the ] concept. The complex microcode engine and its associated ROM are reduced or eliminated completely, and those circuits are instead dedicated to things like additional registers or a wider ALU, which increases the performance of every program. When complex sequences of instructions are needed, this is left to the compiler, which is the entire purpose of using a compiler in the first place. The basic concept was soon picked up by university researchers in California, where simulations suggested such designs would trivially outperform even the fastest conventional designs. It was one such project, at the ], that introduced the term RISC.
An ] performs calculations, usually addition, logical negation, a right shift, and logical AND. It often performs other functions, as well.


The industry responded to the concept of RISC with both confusion and hostility, including a famous dismissive article by the VAX team at Digital.<ref name=comments>{{cite journal |url=https://dl.acm.org/doi/pdf/10.1145/641914.641918 |title=Comments on "The Case for the Reduced Instruction Computer" |first1=Douglas |last1=Clark |first2=William |last2=Strecker |date=September 1980 |journal=ACM|volume=8 |issue=6 |pages=34–38 |doi=10.1145/641914.641918 |s2cid=14939489 }}</ref> A major point of contention was that implementing the instructions outside of the processor meant it would spend much more time reading those instructions from memory, thereby slowing overall performance no matter how fast the CPU itself ran.<ref name=comments/> Proponents pointed out that simulations clearly showed the number of instructions was not much greater, especially when considering compiled code.<ref name=risc/>
There may also be a ] and a ], used to access the main ].


The debate raged until the first commercial RISC designs emerged in the second half of the 1980s, which easily outperformed the most complex designs from other companies. By the late 1980s it was over; even DEC was abandoning microcode for their ] designs, and CISC processors switched to using hardwired circuitry, rather than microcode, to perform many functions. For example, the ] uses hardwired circuitry to fetch and decode instructions, using microcode only to execute instructions; register-register move and arithmetic instructions required only one microinstruction, allowing them to be completed in one clock cycle.<ref>{{cite conference|url=https://ieeexplore.ieee.org/document/63682|title=The execution pipeline of the Intel i486 CPU|book-title= Digest of Papers Compcon Spring '90. Thirty-Fifth IEEE Computer Society International Conference on Intellectual Leverage|publisher=]|isbn=0-8186-2028-5|location=San Francisco, CA|doi=10.1109/CMPCON.1990.63682}}</ref> The ]'s fetch and decode hardware fetches instructions and decodes them into series of micro-operations that are passed on to the execution unit, which schedules and executes the micro-operations, possibly doing so ]. Complex instructions are implemented by microcode that consists of predefined sequences of micro-operations.<ref>{{cite web|url=http://stffrdhrn.github.io/content/2019/Intel_PentiumPro.pdf|title=Pentium Pro Processor At 150, 166, 180, and 200 MHz|publisher=]|date=November 1995|type=Datasheet}}</ref>
Together, these elements form an "]". Most modern ] have several execution units. Even simple computers usually have one unit to read and write memory, and another to execute user code.


Some processor designs use machine code that runs in a special mode, with special instructions, available only in that mode, that have access to processor-dependent hardware, to implement some low-level features of the instruction set. The DEC Alpha, a pure RISC design, used ] to implement features such as ] (TLB) miss handling and interrupt handling,<ref name="axp-architecture-manual">{{cite book|url=http://bitsavers.org/pdf/dec/alpha/Sites_AlphaAXPArchitectureReferenceManual_2ed_1995.pdf|title=Alpha AXP Architecture Reference Manual|edition=Second|chapter=Part I / Common Architecture, Chapter 6 Common PALcode Architecture|publisher=]|date=1995|isbn=1-55558-145-5}}</ref> as well as providing, for Alpha-based systems running ], instructions requiring interlocked memory access that are similar to instructions provided by the ] architecture.<ref name="axp-architecture-manual" /> CMOS ] CPUs, starting with the G4 processor, and ] CPUs use ] to implement some instructions.<ref>{{cite journal|last=Rogers|first=Bob|title=The What and Why of zEnterprise Millicode|journal=IBM Systems Magazine|date=Sep–Oct 2012|url=http://www.ibmsystemsmag.com/mainframe/administrator/performance/millicode_rogers/|archive-url=https://web.archive.org/web/20121009085728/http://www.ibmsystemsmag.com/mainframe/administrator/performance/millicode_rogers/|archive-date=October 9, 2012|url-status=dead}}</ref>
These elements could often be brought together as a single chip. This chip came in a fixed width that would form a "slice" through the execution unit. These were known as ']' chips. The ] family is one of the best known examples of bit slice elements.


==Examples==
The parts of the execution units and the execution units themselves are interconnected by a bundle of wires called a ].
* The ] envisioned by ] uses ] to store its internal procedures.
* The ]<ref>{{cite web |url=http://www.emidec.org.uk/ |title=EMIDEC 1100 computer |publisher=Emidec.org.uk |access-date=April 26, 2010 |url-status=live |archive-url=https://web.archive.org/web/20100612184405/http://www.emidec.org.uk/ |archive-date=June 12, 2010}}</ref> reputedly uses a hard-wired control store consisting of wires threaded through ferrite cores, known as "the laces".
* Most models of the IBM System/360 series are microprogrammed:
** The ] is unique among System/360 models in using the top 16&nbsp;K bytes of core storage to hold the control storage for the microprogram. The 2025 uses a 16-bit microarchitecture with seven control words (or microinstructions). After system maintenance or when changing operating mode, the microcode is loaded from the card reader, tape, or other device.<ref>{{cite book |title=IBM System/360 Model 25 Functional Characteristics |date=January 1968 |publisher=IBM |id=A24-3510-0 |page=22 |url=http://www.bitsavers.org/pdf/ibm/360/functional_characteristics/A24-3510-0_360-25_funcChar_Jan68.pdf |access-date=October 29, 2021}}</ref> The ] emulation for this model is loaded this way.
** The ] uses an 8-bit microarchitecture with only a few hardware registers; everything that the programmer sees is emulated by the microprogram. The microcode for this model is also held on special punched cards, which are stored inside the machine in a dedicated reader per card, called "CROS" units (Capacitor Read-Only Storage).<ref name="360-30-feto">{{cite book |url=http://www.bitsavers.org/pdf/ibm/360/fe/2030/Y24-3360-1_2030_FE_Theory_Opns_Jun67.pdf |title=Field Engineering Theory of Operation, 2030 Processing Unit, System/360 Model 30 |edition=First |date=June 1967 |publisher=IBM |id=Y24-3360-1 |access-date=2019-11-09 |url-status=live |archive-url=https://web.archive.org/web/20200401215647/http://www.bitsavers.org/pdf/ibm/360/fe/2030/Y24-3360-1_2030_FE_Theory_Opns_Jun67.pdf |archive-date=2020-04-01}}</ref>{{rp|2–5}} Another CROS unit is added for machines ordered with 1401/1440/1460 emulation<ref name="360-30-feto"/>{{rp|4–29}} and for machines ordered with 1620 emulation.<ref name="360-30-feto"/>{{rp|4–75}}
** The ] uses 56-bit control words. The 2040 box implements both the System/360 main processor and the multiplex channel (the I/O processor). This model uses ''TROS'' dedicated readers similar to ''CROS'' units, but with an inductive pickup (Transformer Read-only Store).
** The ] has two internal datapaths which operate in parallel: a 32-bit datapath used for arithmetic operations, and an 8-bit datapath used in some logical operations. The control store uses 90-bit microinstructions.
** The ] has separate instruction fetch (I-unit) and execution (E-unit) to provide high performance. The I-unit is hardware controlled. The E-unit is microprogrammed; the control words are 108 bits wide on a basic 360/85 and wider if an emulator feature is installed.
* The ] is microprogrammed with hand wired ferrite cores (a ]) pulsed by a sequencer with conditional execution. Wires routed through the cores are enabled for various data and logic elements in the processor.
* The Digital Equipment Corporation ] processor, KL10 and KS10 ] processors, and ] processors with the exception of the PDP-11/20, are microprogrammed.<ref>{{cite book|url=https://archive.org/details/computerstructur01siew/page/671|chapter-url=http://gordonbell.azurewebsites.net/computer_structures_principles_and_examples/csp0687.htm|editor1=Daniel P. Siewiorek|editor-link1=Daniel Siewiorek|editor2=C. Gordon Bell|editor-link2=Gordon Bell|editor3=Allen Newell|editor-link3=Allen Newell|title=Computer Structures: Principles and Examples|chapter=Implementation and Performance Evaluation of the PDP-11 Family|author1=Edward A. Snow|author2=Daniel P. Siewiorek|page=|publisher=]|location=]|year=1982|isbn=0-07-057302-6|url-access=registration}}</ref>
* Most ] minicomputers are microprogrammed. The task of writing microcode for the ] is detailed in the Pulitzer Prize-winning book titled '']''.
* Many systems from ] are microprogrammed:
:* The B700 "microprocessor" executes application-level opcodes using sequences of 16-bit microinstructions stored in main memory; each of these is either a register-load operation or mapped to a single 56-bit "nanocode" instruction stored in read-only memory. This allows comparatively simple hardware to act either as a mainframe peripheral controller or to be packaged as a standalone computer.
:* The ] is implemented with radically different hardware including bit-addressable main memory but has a similar multi-layer organisation. The operating system preloads the interpreter for whatever language is required. These interpreters present different virtual machines for ], ], etc.
* ] produced computers in which the microcode is accessible to the user; this allows the creation of custom assembler level instructions. Microdata's ] operating system design makes extensive use of this capability.
* The ] workstation used a microcoded design but, unlike many computers, the microcode engine is not hidden from the programmer in a layered design. Applications take advantage of this to accelerate performance.
* The ] is described as having both ].<ref>{{cite journal|url=https://www.computer.org/csdl/magazine/co/1981/09/01667517/13rRUwciPii|title=Design of a Small Business Data Processing System|first=Frank|last=Soltis|journal=]|date=September 1981|volume=14|pages=77–93|doi=10.1109/C-M.1981.220610|s2cid=398484}}</ref> In practice, the processor implements an instruction set architecture named the ''Internal Microprogrammed Interface'' (IMPI) using a horizontal microcode format. The so-called vertical microcode layer implements the System/38's hardware-independent ] (MI) instruction set by translating MI code to IMPI code and executing it. Prior to the introduction of the ] processor line, early ] systems used the same architecture.<ref name="inside-as400">{{cite book|title=Inside the AS/400, Second Edition|url=https://books.google.com/books?id=5DoPAAAACAAJ|isbn=978-1882419661|author=Frank G. Soltis|year=1997|publisher=Duke Press}}</ref>
* The ]'s ] (RCP), which serves as the console's ] and audio processor, utilizes microcode; it is possible to implement new effects or tweak the processor to achieve the desired output. Some notable examples of custom RCP microcode include the high-resolution graphics, particle engines, and unlimited draw distances found in ]'s '']'', '']'', and '']'';<ref name="Interview: Battling the N64 (Naboo)">{{cite web |url=http://ign64.ign.com/articles/087/087646p1.html |title=Interview: Battling the N64 (Naboo) |publisher=IGN64 |date=November 10, 2000 |access-date=March 27, 2008 |url-status=live |archive-url=https://web.archive.org/web/20070913180626/http://ign64.ign.com/articles/087/087646p1.html |archive-date=September 13, 2007}}</ref><ref name="Indiana Jones and the Infernal Machine">{{cite web |title=Indiana Jones and the Infernal Machine |website=IGN |url=http://www.ign.com/articles/2000/12/13/indiana-jones-and-the-infernal-machine-2 |date=December 12, 2000 |access-date=September 24, 2013 |url-status=live |archive-url=https://web.archive.org/web/20130927083807/http://www.ign.com/articles/2000/12/13/indiana-jones-and-the-infernal-machine-2 |archive-date=September 27, 2013}}</ref> and the ] playback found in ]' '']''.<ref name="Postmortem RE2 N64">{{cite news |last=Meynink |first=Todd |date=July 28, 2000 |url=http://www.gamasutra.com/view/feature/3148/postmortem_angel_studios_.php |title=Postmortem: Angel Studios' Resident Evil 2 (N64 Version) |work=] |publisher=] |access-date=October 18, 2010 |url-status=live |archive-url=https://web.archive.org/web/20121021070818/http://www.gamasutra.com/view/feature/3148/postmortem_angel_studios_.php |archive-date=October 21, 2012}}</ref>
{{Further|topic=Nintendo 64 microcode|Nintendo 64 programming characteristics|Nintendo 64 Game Pak}}
* The VU0 and VU1 vector units in the ] ] are microprogrammable; in fact, VU1 is only accessible via microcode for the first several generations of the SDK.
* The MicroCore Labs {{Webarchive|url=https://web.archive.org/web/20161103224205/http://www.microcorelabs.com/mcl86.html |date=2016-11-03 }} , {{Webarchive|url=https://web.archive.org/web/20170202042033/http://www.microcorelabs.com/mcl51.html |date=2017-02-02 }} and {{Webarchive|url=https://web.archive.org/web/20181221000146/http://www.microcorelabs.com/mcl65.html |date=2018-12-21 }} are examples of highly encoded "vertical" microsequencer implementations of the Intel 8086/8088, 8051, and MOS 6502.
* The Meta 4 Series 16 computer system was a user-microprogrammable system first available in 1970. The microcode had a primarily vertical style with 32-bit microinstructions.<ref>{{cite book |url=http://www.bitsavers.org/pdf/digitalScientific/7032MO_Meta4Series16RefMan.pdf |title=Digital Scientific Meta 4 Series 16 Computer System Reference Manual |id=7032MO |publisher=Digital Scientific Corporation |date=May 1971 |access-date=2020-01-14 |url-status=live |archive-url=https://web.archive.org/web/20200114014526/http://www.bitsavers.org/pdf/digitalScientific/7032MO_Meta4Series16RefMan.pdf |archive-date=2020-01-14}}</ref> The instructions were stored on replaceable program boards with a grid of bit positions. One (1) bits were represented by small metal squares that were sensed by amplifiers, zero (0) bits by the absence of the squares.<ref>{{cite book |url=http://www.bitsavers.org/pdf/digitalScientific/7024MO_ROMmanual_Mar70.pdf|title=Digital Scientific Meta 4 Computer System Read-Only Memory (ROM) Reference Manual |id=7024MO |publisher=Digital Scientific Corporation |date=March 1970 |access-date=2020-01-14 |url-status=live |archive-url=https://web.archive.org/web/20190923061816/http://bitsavers.org/pdf/digitalScientific/7024MO_ROMmanual_Mar70.pdf |archive-date=2019-09-23}}</ref> The system could be configured with up to 4K 16-bit words of microstore.
One of Digital Scientific's products was an emulator for the ].<ref>{{cite book |url=http://www.bitsavers.org/pdf/digitalScientific/7006MO_Meta16_SysMan_Jun70.pdf |title=The Digital Scientific Meta 4 Series 16 Computer System Preliminary System Manual |id=7006MO |publisher=Digital Scientific Corporation |date=June 1970 |access-date=2020-01-14 |url-status=live |archive-url=https://web.archive.org/web/20190923061755/http://bitsavers.org/pdf/digitalScientific/7006MO_Meta16_SysMan_Jun70.pdf |archive-date=2019-09-23}}</ref><ref>{{cite book |url=http://www.bitsavers.org/pdf/digitalScientific/M4-005P-170_1130rom_Jan70.pdf |title=Digital Scientific Meta 4 Computer System Typical ROM Pattern Listing and Program To Simulate The IBM 1130 Instruction Set |id=M4/005P-170 |publisher=Digital Scientific Corporation |date=January 1970 |access-date=2020-01-14 |url-status=live |archive-url=https://web.archive.org/web/20200324115023/http://bitsavers.org/pdf/digitalScientific/M4-005P-170_1130rom_Jan70.pdf |archive-date=2020-03-24}}</ref>
* The ] is a ] made by ] from 1975 through the early 1980s. It was used to implement three different computer architectures in microcode: the ], the ], and the ] ], a cost-reduced PDP-11.<ref>{{cite web |url=http://www.antiquetech.com/?page_id=782 |title=Western Digital 1600 |publisher=AntiqueTech |access-date=5 January 2017 |url-status=dead |archive-url=https://web.archive.org/web/20170103021205/http://www.antiquetech.com/?page_id=782 |archive-date=3 January 2017}}</ref>
* Earlier ] processors are fully microcoded. x86 processors implemented ] (patch by ] or ]) since ] and ]. Such processors implemented microcode ROM and microcode SRAM in their silicon.
* Many ]s and ]s implement patchable microcode (patch by operating system). Such microcode is patched to device's ] or ], for example, ] of a video card.


==Implementation==
Programmers develop microprograms, using basic software tools. A ] allows a programmer to define the table of bits symbolically. A ] program executes the bits in the same way as the electronics (hopefully), and allows much more freedom to debug the microprogram.
Each microinstruction in a microprogram provides the bits that control the functional elements that internally compose a CPU. The advantage over a hard-wired CPU is that internal CPU control becomes a specialized form of a computer program. Microcode thus transforms a complex electronic design challenge (the control of a CPU) into a less complex programming challenge. To take advantage of this, a CPU is divided into several parts:
* An ] may decode instructions in hardware and determine the microcode address for processing the instruction in parallel with the ].
* A ] picks the next word of the control store. A sequencer is mostly a counter, but usually also has some way to jump to a different part of the control store depending on some data, usually data from the ] and always some part of the control store. The simplest sequencer is just a register loaded from a few bits of the control store.
* A ] set is a fast memory containing the data of the central processing unit. It may include registers visible to application programs, such as ] and the ], and may also include other registers that are not easily accessible to the application programmer. Often the register set is a triple-ported ]; that is, two registers can be read, and a third written at the same time.
* An ] performs calculations, usually addition, logical negation, a right shift, and logical AND. It often performs other functions, as well.
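
The parts listed above can be tied together in a small Python sketch. Everything in it is an assumption made for illustration: the 16-bit datapath, the register names, the five-field control word, and the three-word microprogram (which subtracts <code>r1</code> from <code>r0</code> using only NOT and ADD). It is not modeled on any particular CPU.

```python
MASK = 0xFFFF  # assumed 16-bit datapath

# A minimal ALU offering the four operations named in the text.
ALU_OPS = {
    "add": lambda a, b: (a + b) & MASK,
    "not": lambda a, b: ~a & MASK,   # second operand is ignored
    "shr": lambda a, b: a >> 1,      # second operand is ignored
    "and": lambda a, b: a & b,
}

# Each control-store word: (alu_op, source_a, source_b, destination, next_address).
# A next_address of None stops the sequencer.
CONTROL_STORE = [
    ("not", "r1", "r1", "tmp", 1),      # tmp = ~r1
    ("add", "tmp", "one", "tmp", 2),    # tmp = ~r1 + 1 (two's complement of r1)
    ("add", "r0", "tmp", "r0", None),   # r0 = r0 + tmp, i.e. r0 - r1
]


def run(registers, start=0):
    """Sequencer loop: fetch a control word, drive the ALU, write back, pick the next word."""
    regs = dict(registers, one=1, tmp=0)  # 'one' and 'tmp' are hidden from the programmer
    address = start
    while address is not None:
        op, src_a, src_b, dest, next_address = CONTROL_STORE[address]
        regs[dest] = ALU_OPS[op](regs[src_a], regs[src_b])  # two reads, one write
        address = next_address
    return regs
```

For example, <code>run({"r0": 10, "r1": 3})["r0"]</code> evaluates to 7. The sequencer here is the simplest kind mentioned in the list: the next address comes straight from a field of the current control word.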


There may also be a ] and a ], used to access the main ]. Together, these elements form an "]". Most modern ] have several execution units. Even simple computers usually have one unit to read and write memory, and another to execute user code. These elements could often be brought together as a single chip. This chip comes in a fixed width that would form a "slice" through the execution unit. These are known as "]" chips. The ] family is one of the best known examples of bit slice elements.<ref>{{cite book |title=Computer Architecture and Organization |last=Hayes |first=John P. |isbn=0-07-027363-4 |year=1978 |page=300|publisher=McGraw-Hill }}</ref> The parts of the execution units and the whole execution units are interconnected by a bundle of wires called a ].
After the microprogram is finalized, and extensively tested, it is sometimes used as the input to a computer program that constructs logic to produce the same data. This program is similar to those used to optimize a ]. No known computer program can produce optimal logic, but even pretty good logic can vastly reduce the number of transistors from the number required for a ROM control store. This reduces the cost of producing, and the electricity consumed by, a CPU.


Programmers develop microprograms, using basic software tools. A ] allows a programmer to define the table of bits symbolically. Because of its close relationship to the underlying architecture, "microcode has several properties that make it difficult to generate using a compiler."<ref name=Kent2813 /> A ] program is intended to execute the bits in the same way as the electronics, and allows much more freedom to debug the microprogram. After the microprogram is finalized, and extensively tested, it is sometimes used as the input to a computer program that constructs logic to produce the same data.{{citation needed|date=February 2018}} This program is similar to those used to optimize a ]. Even without fully optimal logic, heuristically optimized logic can vastly reduce the number of transistors from the number needed for a ] (ROM) control store. This reduces the cost to produce, and the electricity used by, a CPU.
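
To give a sense of what defining the table of bits symbolically means, here is a toy microassembler in Python. The field layout, mnemonics, and source syntax are all invented for this sketch, and as a simplification every source line must carry a label.

```python
# Assumed field layout, least significant bit first: (name, width in bits).
FIELDS = [("next", 4), ("dest", 2), ("src_b", 2), ("src_a", 2), ("alu", 2)]
REGISTERS = {"r0": 0, "r1": 1, "tmp": 2, "one": 3}
ALU = {"add": 0, "not": 1, "shr": 2, "and": 3}


def assemble_line(text, labels):
    """Assemble 'alu src_a src_b dest next_label' into one control word."""
    alu, src_a, src_b, dest, next_label = text.split()
    values = {
        "alu": ALU[alu],
        "src_a": REGISTERS[src_a],
        "src_b": REGISTERS[src_b],
        "dest": REGISTERS[dest],
        "next": labels[next_label],
    }
    word, shift = 0, 0
    for name, width in FIELDS:
        if values[name] >= (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word |= values[name] << shift
        shift += width
    return word


def assemble(source):
    """Two passes: collect label addresses, then emit one word per line."""
    lines = [ln.strip() for ln in source.strip().splitlines() if ln.strip()]
    labels = {ln.split(":")[0]: address for address, ln in enumerate(lines)}
    return [assemble_line(ln.split(":", 1)[1], labels) for ln in lines]


PROGRAM = """
start: not r1 r1 tmp inc
inc:   add tmp one tmp sub
sub:   add r0 tmp r0 start
"""
```

Calling <code>assemble(PROGRAM)</code> yields one integer per source line, each of which is the bit pattern that would be placed in the control store. A microprogram simulator would then interpret those same integers in software.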

Microcode can be characterized as ''horizontal'' or ''vertical'', referring primarily to whether each microinstruction controls CPU elements with little or no decoding (horizontal microcode){{efn|IBM horizontally microcoded processors had multiple micro-orders and register select fields that required decoding.}} or requires extensive decoding by combinational logic before doing so (vertical microcode). Consequently, each horizontal microinstruction is wider (contains more bits) and occupies more storage space than a vertical microinstruction.


===Horizontal microcode===
"Horizontal microcode has several discrete micro-operations that are combined in a single microinstruction for simultaneous operation."<ref name=Kent2813 /> Horizontal microcode is typically contained in a fairly wide control store; it is not uncommon for each word to be 108 bits or more. On each tick of a sequencer clock a microcode word is read, decoded, and used to control the functional elements that make up the CPU.


In a typical implementation a horizontal microprogram word comprises fairly tightly defined groups of bits. For example, one simple arrangement might be:

{| class="wikitable"
|-
| Register source A || Register source B || Destination register || ALU operation || Type of jump || Jump address
|}
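
As a rough illustration of how such a word could be laid out, the following sketch packs the six fields into a single integer and unpacks them again. The field names and bit widths are assumptions chosen for the example; real control stores use whatever widths the hardware needs.

```python
# Hypothetical layout for the six-field word described above.
# Names and widths are illustrative assumptions, not taken from a real CPU.
FIELDS = [
    ("src_a", 4),       # register source A
    ("src_b", 4),       # register source B
    ("dest", 4),        # destination register
    ("alu_op", 4),      # ALU operation
    ("jump_type", 2),   # type of jump
    ("jump_addr", 12),  # jump address within the control store
]
WORD_BITS = sum(width for _, width in FIELDS)


def pack(**values):
    """Pack named field values into one microword; omitted fields default to 0."""
    word = 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(word):
    """Split a microword back into its named fields."""
    fields = {}
    shift = WORD_BITS
    for name, width in FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields
```

Because every field has a fixed position, the hardware can route each group of bits straight to the unit it controls, which is what makes the word wide.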


For this type of micromachine to implement a JUMP instruction with the address following the opcode, the microcode might require two clock ticks. The engineer designing it would write microassembler source code looking something like this:
{{sxhl|

# Any line starting with a number-sign is a comment
# This is just a label, the ordinary way assemblers symbolically represent a
# memory address.
InstructionJUMP:
# To prepare for the next instruction, the instruction-decode microcode has already
# [...]
# instruction to the memory data register for use by the instruction decode.
# The sequencer instruction "next" means just add 1 to the control word address.
MDR, NONE, MAR, COPY, NEXT, NONE
# This places the address of the next instruction into the PC.
# This gives the memory system a clock tick to finish the fetch started on the
# previous microinstruction.
# The sequencer instruction is to jump to the start of the instruction decode.
MAR, 1, PC, ADD, JMP, InstructionDecode
# The instruction decode is not shown, because it is usually a mess, very particular
# to the exact processor being emulated. Even this example is simplified.
# [...]
# it from the word following the op-code. Therefore, rather than just one
# jump instruction, those CPUs have a family of related jump instructions.
|ucode}}

For each tick it is common to find that only some portions of the CPU are used, with the remaining groups of bits in the microinstruction being no-ops. With careful design of hardware and microcode, this property can be exploited to parallelise operations that use different areas of the CPU; for example, in the case above, the ALU is not required during the first tick, so it could potentially be used to complete an earlier arithmetic instruction.
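
The two-tick JUMP microprogram can be traced with a small simulation. This is only a sketch under simplifying assumptions: registers are a dictionary, the ALU knows just `COPY` and `ADD`, and the memory fetch started by loading the MAR is modelled as completing by the end of the microprogram. The function names are hypothetical.

```python
# Each tuple mirrors one microassembler line:
# (source A, source B, destination, ALU operation, type of jump, jump address)
JUMP_MICROPROGRAM = [
    ("MDR", None, "MAR", "COPY", "NEXT", None),
    ("MAR", 1, "PC", "ADD", "JMP", "InstructionDecode"),
]


def alu(op, a, b):
    """Minimal ALU supporting only the two operations the example uses."""
    if op == "COPY":
        return a
    if op == "ADD":
        return a + b
    raise ValueError(f"unsupported ALU operation: {op}")


def run_jump(regs, memory):
    """Run the JUMP microprogram; return (ticks used, next microcode label)."""
    ticks = 0
    target = None
    for src_a, src_b, dest, op, jump_type, target in JUMP_MICROPROGRAM:
        b = 0 if src_b is None else src_b
        regs[dest] = alu(op, regs[src_a], b)
        ticks += 1
        if jump_type == "JMP":
            break  # the sequencer leaves this microprogram
    # Assumption: the fetch triggered by writing the MAR has finished by now,
    # so the MDR holds the instruction at the jump target.
    regs["MDR"] = memory[regs["MAR"]]
    return ticks, target
```

Starting with the jump target in the MDR, the run leaves the target in the MAR, the target plus one in the PC, and the fetched instruction in the MDR, ready for the instruction decode.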


===Vertical microcode===
In vertical microcode, each microinstruction is significantly encoded; that is, the bit fields generally pass through intermediate combinational logic that, in turn, generates the control and sequencing signals for internal CPU elements (ALU, registers, etc.). This is in contrast with horizontal microcode, in which the bit fields either directly produce the control and sequencing signals or are only minimally encoded. Consequently, vertical microcode requires smaller instruction lengths and less storage, but requires more time to decode, resulting in a slower CPU clock.<ref>{{cite web
|url = http://euler.mat.uson.mx/~havillam/ca/CS323/0708.cs-323003.html
|title = CS-323: High Performance Microprocessors – Chapter 1. Microprogramming
|date = 2009-10-12
|access-date = 2015-08-08
|author1 = Neal Harman
|author2 = Andy Gimblett
|website = mat.uson.mx
|archive-date = 2015-04-19
|archive-url = https://web.archive.org/web/20150419164703/http://euler.mat.uson.mx/~havillam/ca/CS323/0708.cs-323003.html
|url-status = dead
}}</ref>


Some vertical microcode is just the assembly language of a simple conventional computer that is emulating a more complex computer.
Some processors, such as DEC Alpha processors and the CMOS microprocessors on later IBM System/390 and z/Architecture mainframes, use machine code, running in a special mode that gives it access to special instructions, special registers, and other hardware resources unavailable to regular machine code, to implement some instructions and other functions,<ref>{{cite book |last=Vaupel |first=Robert |year=2013 |url=https://books.google.com/books?id=1-dt0ABZQOcC&q=millicode |title=High Availability and Scalability of Mainframe Environments using System z and z/OS as example |page=26 |publisher=KIT Scientific |isbn=978-3-7315-0022-3}}</ref><ref>{{cite journal |last=Rogers |first=Bob |date=September–October 2012 |title=The What and Why of zEnterprise Millicode |journal=IBM Systems Magazine |url=http://www.ibmsystemsmag.com/mainframe/administrator/performance/millicode_rogers/ |access-date=2013-11-07 |url-status=dead |archive-url=https://web.archive.org/web/20131016100828/http://ibmsystemsmag.com/mainframe/administrator/performance/millicode_rogers/ |archive-date=2013-10-16}}</ref> such as page table walks on Alpha processors.<ref>{{cite web |url=http://download.majix.org/dec/palcode_dsgn_gde.pdf|title=PALcode for Alpha Microprocessors System Design Guide |publisher=Digital Equipment Corporation |date=May 1996 |access-date=November 7, 2013 |url-status=live |archive-url=https://web.archive.org/web/20110815022514/http://download.majix.org/dec/palcode_dsgn_gde.pdf |archive-date=August 15, 2011}}</ref> This is called PALcode on Alpha processors and millicode on IBM mainframe processors.


Another form of vertical microcode has two fields:
{| class="wikitable"
|-
| Field select || Field value
|}


The ''field select'' selects which part of the CPU will be controlled by this word of the control store. The ''field value'' controls that part of the CPU. With this type of microcode, a designer explicitly chooses to make a slower CPU to save money by reducing the unused bits in the control store; however, the reduced complexity may increase the CPU's clock frequency, which lessens the effect of an increased number of cycles per instruction.
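
The trade-off can be made concrete with a small sketch. It assumes a hypothetical CPU with four control points and shows one horizontal word being re-expressed as a sequence of two-field vertical words, one control point per word; the names are illustrative only.

```python
# Hypothetical control points; the index of each name is its "field select" code.
CONTROL_POINTS = ["src_a", "src_b", "dest", "alu_op"]


def to_vertical(horizontal_word):
    """Translate one horizontal word into (field select, field value) words.

    `horizontal_word` maps control-point names to values; None marks a no-op
    field, which needs no vertical word at all.
    """
    return [
        (CONTROL_POINTS.index(name), value)
        for name, value in horizontal_word.items()
        if value is not None
    ]


def run_vertical(words):
    """Apply vertical words one per cycle; return (final state, cycles used)."""
    state = dict.fromkeys(CONTROL_POINTS, 0)
    for select, value in words:
        state[CONTROL_POINTS[select]] = value
    return state, len(words)
```

A horizontal word that drives three control points in one tick becomes three narrow vertical words here, so no storage is spent on the unused field but the work takes three cycles instead of one.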


As transistors grew cheaper, horizontal microcode came to dominate the design of CPUs using microcode, with vertical microcode being used less often.


When both vertical and horizontal microcode are used, the horizontal microcode may be referred to as ''nanocode'' or ''picocode''.<ref>{{cite book |last=Spruth |first=Wilhelm |date=December 2012 |title=The Design of a Microprocessor |publisher=Springer Science & Business Media |isbn=978-3-642-74916-2 |page=31 |url=https://books.google.com/books?id=0YmqCAAAQBAJ |access-date=Jan 18, 2015 |url-status=live |archive-url=https://web.archive.org/web/20161120195023/https://books.google.com/books?id=0YmqCAAAQBAJ |archive-date=November 20, 2016}}</ref>

=={{Anchor|IML}}Writable control store==
{{Main|Writable control store}}


A few computers were built using ''writable microcode''. In this design, rather than storing the microcode in ROM or hard-wired logic, the microcode is stored in a RAM called a ''writable control store'' or ''WCS''. Such a computer is sometimes called a ''writable instruction set computer'' (WISC).<ref>{{cite journal |last=Koopman |first=Philip Jr. |date=1987 |url=http://www.ece.cmu.edu/~koopman/forth/rochester_87.pdf |title=Writable instruction set, stack oriented computers: The WISC Concept |journal=The Journal of Forth Application and Research |pages=49–71 |url-status=live |archive-url=https://web.archive.org/web/20080511192958/http://www.ece.cmu.edu/~koopman/forth/rochester_87.pdf |archive-date=2008-05-11}}</ref>


Many experimental prototype computers use writable control stores; there are also commercial machines that use writable microcode, such as the Burroughs Small Systems, early Xerox workstations, the DEC VAX 8800 (''Nautilus'') family, the Symbolics L- and G-machines, a number of IBM System/360 and System/370 implementations, some DEC PDP-10 machines,<ref>{{cite newsgroup |last=Smith |first=Eric |date=3 September 2002 |url=http://pdp10.nocrew.org/cpu/kl10-ucode.txt |title=Re: What was the size of Microcode in various machines |message-id=qhn0qyveyu.fsf@ruckus.brouhaha.com |newsgroup=alt.folklore.computers |access-date=18 December 2008 |url-status=live |archive-url=https://web.archive.org/web/20090126231132/http://pdp10.nocrew.org/cpu/kl10-ucode.txt |archive-date=26 January 2009}}</ref> and the Data General Eclipse MV/8000.<ref>{{cite web |last=Smotherman |first=Mark |title=CPSC 3300 / The Soul of a New Machine |url=https://people.computing.clemson.edu/~mark/330/eagle.html |quote=4096 x 75-bit SRAM writable control store: 74-bit microinstruction with 1 parity bit (18 fields) |access-date=2023-10-27}}</ref>


The IBM System/370 includes a facility called ''Initial-Microprogram Load'' (''IML'' or ''IMPL'')<ref>{{cite book
 |publisher = IBM
 |title = IBM System/370 Principles of Operation
 |id = GA22-7000-4
 |version = Fourth Edition
 |date = September 1974
 |url = http://www.bitsavers.org/pdf/ibm/370/princOps/GA22-7000-4_370_Principles_Of_Operation_Sep75.pdf
 |pages = 98, 245
 |access-date = 2012-08-27
 |archive-date = 2012-02-29
 |archive-url = https://web.archive.org/web/20120229195635/http://bitsavers.org/pdf/ibm/370/princOps/GA22-7000-4_370_Principles_Of_Operation_Sep75.pdf
 |url-status = live
}}</ref> that can be invoked from the console, as part of ''power-on reset'' (''POR'') or from another processor in a tightly coupled multiprocessor complex.


Some commercial machines, for example IBM 360/85,<ref>{{cite book
 |publisher = IBM
 |title = IBM System/360 Model 85 Functional Characteristics
 |id = A22-6916-1
 |url = http://www.bitsavers.org/pdf/ibm/360/functional_characteristics/A22-6916-1_360-85_funcChar_Jun68.pdf
 |version = SECOND EDITION
 |date = June 1968
 |access-date = October 29, 2021
}}</ref><ref>{{cite book
 | publisher = IBM
 | title = IBM System/360 Special Feature Description 709/7090/7094 Compatibility Feature for IBM System/360 Model 85
 | id = GA27-2733-0
 | version = First Edition
 | date = March 1969}}</ref> have both a read-only storage and a writable control store for microcode.


WCS offers several advantages including the ease of patching the microprogram and, for certain hardware generations, faster access than ROMs can provide. User-programmable WCS allows the user to optimize the machine for specific purposes.


Starting with the Pentium Pro in 1995, several x86 CPUs have writable microcode.<ref name="Stiller_1996">{{cite journal |last1=Stiller |first1=Andreas |last2=Paul |first2=Matthias R.<!-- info contributor on processor internals --> |date=1996-05-12 |title=Prozessorgeflüster |series=Trends & News |language=de |journal=c't |publisher=Heise Verlag |url=https://www.heise.de/ct/artikel/Prozessorgefluester-284546.html |access-date=2017-08-28 |url-status=live |archive-url=https://web.archive.org/web/20170828172141/https://www.heise.de/ct/artikel/Prozessorgefluester-284546.html |archive-date=2017-08-28}}</ref><ref>{{cite book |url=http://www.intel.com/Assets/PDF/manual/253668.pdf |title=Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A: System Programming Guide, Part 1 |chapter=9.11: Microcode update facilities |publisher=Intel |date=September 2016}}</ref> This, for example, has allowed bugs in the Intel Core 2 and Intel Xeon microcodes to be fixed by patching their microprograms, rather than requiring the entire chips to be replaced.
A second prominent example is the set of microcode patches that Intel offered for some of their processor architectures of up to 10 years in age, in a bid to counter the security vulnerabilities discovered in their designs, Spectre and Meltdown, which went public at the start of 2018.<ref> by Paul Alcorn on March 15, 2018</ref><ref>{{cite web |url=https://downloadcenter.intel.com/download/27591/Linux-Processor-Microcode-Data-File |title=Download Linux* Processor Microcode Data File |access-date=2018-03-21 |url-status=dead |archive-url=https://web.archive.org/web/20180319202103/https://downloadcenter.intel.com/download/27591/Linux-Processor-Microcode-Data-File |archive-date=2018-03-19}}</ref> A microcode update can be installed by Linux,<ref>{{cite web |url=http://urbanmyth.org/microcode/ |title=Intel Microcode Update Utility for Linux |archive-url=https://web.archive.org/web/20120226174302/http://urbanmyth.org/microcode/ |archive-date=2012-02-26 |url-status=dead}}</ref> FreeBSD,<ref>{{cite web |url=https://svnweb.freebsd.org/ports/head/sysutils/cpupdate/ |title= Index of /head/sysutils/cpupdate |publisher=Freebsd.org |access-date=2020-01-16 |url-status=live |archive-url=https://web.archive.org/web/20200401215701/https://svnweb.freebsd.org/ports/head/sysutils/cpupdate/ |archive-date=2020-04-01}}</ref> Microsoft Windows,<ref>{{Cite news |url=http://support.microsoft.com/kb/936357 |title=A microcode reliability update is available that improves the reliability of systems that use Intel processors |access-date=2008-02-25 |url-status=live |archive-url=https://web.archive.org/web/20080223074207/http://support.microsoft.com/kb/936357 |archive-date=2008-02-23}}</ref> or the motherboard BIOS.<ref>{{cite web |url=http://www.intel.com/support/motherboards/server/sb/cs-021619.htm |title=Server Products - BIOS Update required when Missing Microcode message is seen during POST |date=January 24, 2013 |website=Intel 
|archive-url=https://web.archive.org/web/20140901063251/http://www.intel.com/support/motherboards/server/sb/cs-021619.htm |archive-date=September 1, 2014}}</ref>


Some machines offer user-programmable writable control stores as an option, including the HP 2100, DEC PDP-11/60, TI-990/12,<ref>{{cite web |title=Model 990/12 LR Computer Depot Maintenance and Repair Manual |url=http://www.bitsavers.org/pdf/ti/990/990-12/2268241_990-12CPU_DepoRepair_Feb83.pdf |website=Bitsavers.org |publisher=Texas Instruments |access-date=15 February 2024}}</ref><ref>{{cite book |title=Texas Instruments Model 990 Computer MDS-990 Microcode Development System Programmer's Guide |location=Texas Instruments Archives, RG-20 accession 94-08, Box 10, 45C. DeGolyer Library, Southern Methodist University, Dallas, TX USA |edition=15 August 1979}}</ref> and Varian Data Machines V-70 series minicomputers.


==Comparison to VLIW and RISC==
{{unreferenced section|date=August 2023}}
{{Update|section|reason=Many CISC processors now do instruction fetch and decode in hardware, and execute most if not all instructions in hardware, and both RISC and CISC processors execute several operations per clock cycle|date=December 2023}}
The design trend toward heavily microcoded processors with complex instructions began in the early 1960s and continued until roughly the mid-1980s. At that point the RISC design philosophy started becoming more prominent.


A CPU that uses microcode generally takes several clock cycles to execute a single instruction, one clock cycle for each step in the microprogram for that instruction. Some CISC processors include instructions that can take a very long time to execute. Such variations interfere with both interrupt latency and, what is far more important in modern systems, pipelining.


When designing a new processor, a hardwired control RISC has the following advantages over microcoded CISC:
* Programming has largely moved away from assembly level, so it's no longer worthwhile to provide complex instructions for productivity reasons.
* Simpler instruction sets allow direct execution by hardware, avoiding the performance penalty of microcoded execution.
* Analysis shows complex instructions are rarely used, hence the machine resources devoted to them are largely wasted.
* The machine resources devoted to rarely used complex instructions are better used for expediting performance of simpler, commonly used instructions.
* Complex microcoded instructions may require many clock cycles that vary, and are difficult to pipeline for increased performance.

There are counterpoints as well:
* The complex instructions in heavily microcoded implementations may not take much extra machine resources, except for microcode space. For example, the same ALU is often used to calculate an effective address and to compute the result from the operands, e.g., the original Z80, 8086, and others.
* The simpler non-RISC instructions (i.e., involving direct memory operands) are frequently used by modern compilers. Even immediate to stack (i.e., memory result) arithmetic operations are commonly employed. Although such memory operations, often with varying length encodings, are more difficult to pipeline, it is still fully feasible to do so, as clearly exemplified by the i486, AMD K5, Cyrix 6x86, Motorola 68040, etc.
* Non-RISC instructions inherently perform more work per instruction (on average), and are also normally highly encoded, so they enable smaller overall size of the same program, and thus better use of limited cache memories.
* The simpler instructions in CISC architectures are also directly executed in hardware in modern implementations.


Many RISC and VLIW processors are designed to execute every instruction (as long as it is in the cache) in a single cycle. This is very similar to the way CPUs with microcode execute one microinstruction per cycle. VLIW processors have instructions that behave similarly to very wide horizontal microcode, although typically without such fine-grained control over the hardware as provided by microcode. RISC instructions are sometimes similar to the narrow vertical microcode.


Microcode has been popular in application-specific processors such as network processors, digital signal processors, channel controllers, disk controllers, network interface controllers, flash memory controllers, graphics processing units, and in other hardware.

==Micro-operations==
Modern CISC implementations, such as the x86 family starting with the NexGen Nx586, Intel Pentium Pro, and AMD K5, decode instructions into dynamically buffered micro-operations with an instruction encoding similar to RISC or traditional microcode. A hardwired instruction decode unit directly emits micro-operations for common x86 instructions, but falls back to a more traditional microcode ROM containing micro-operations for more complex or rarely used instructions.<ref name=FogMicro/>

For example, an x86 might look up micro-operations from microcode to handle complex multistep operations such as loop or string instructions, floating-point unit transcendental functions or unusual values such as denormal numbers, and special-purpose instructions such as CPUID.
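
The split between a hardwired decoder and a microcode ROM fallback can be sketched as a two-level lookup. Everything in this example is a hypothetical illustration: the instruction mnemonics are simplified, the micro-operation names are invented, and real decoders work on instruction bytes rather than strings.

```python
# Hypothetical fast path: common instructions map directly to a few micro-operations.
HARDWIRED = {
    "add": ["alu_add"],
    "mov": ["move"],
    "load": ["gen_address", "mem_read"],
}

# Hypothetical microcode ROM: longer sequences for complex or rare instructions.
MICROCODE_ROM = {
    "rep_movs": ["check_count", "mem_read", "mem_write", "dec_count", "loop_if_nonzero"],
    "cpuid": ["read_leaf", "lookup_id_table", "write_result_regs"],
}


def decode(mnemonic):
    """Return (path taken, micro-operations) for one instruction."""
    if mnemonic in HARDWIRED:
        return "hardwired", list(HARDWIRED[mnemonic])
    if mnemonic in MICROCODE_ROM:
        return "microcode", list(MICROCODE_ROM[mnemonic])
    raise KeyError(f"unknown instruction: {mnemonic}")
```

Calling `decode("add")` takes the hardwired path, while `decode("cpuid")` falls back to the ROM; either way the execution core sees the same kind of micro-operation stream.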


==See also==
{{Portal|Electronics}}
{{Div col|colwidth=20em}}
* Address generation unit (AGU)
* Floating-point unit (FPU)
{{div col end}}

==Notes==
{{Notelist}}


==References==
{{Reflist|30em}}


==Further reading==
* {{cite journal
| last=Smith |first=Richard E.
| year = 1988
| title = A Historical Overview of Computer Architecture
 | journal = Annals of the History of Computing
| volume = 10
| issue = 4
| pages = 277–303
| url = http://doi.ieeecomputersociety.org/10.1109/MAHC.1988.10039
| access-date= June 21, 2006
| doi = 10.1109/MAHC.1988.10039 |s2cid = 16405547
}}
* {{cite web
| author = Smotherman, Mark
| title = A Brief History of Microprogramming
| date = October 2022
| url = https://people.computing.clemson.edu/~mark/uprog.html
| access-date = October 27, 2023}}
* {{cite journal
|last1=Wilkes |first1=M. V. |author1-link=Maurice Wilkes
| title = The Genesis of Microprogramming
| journal = Annals of the History of Computing
| year = 1986
| volume = 8
| issue = 2
| pages = 116–126
| url = http://doi.ieeecomputersociety.org/10.1109/MAHC.1986.10035
| access-date = August 7, 2006
| doi = 10.1109/MAHC.1986.10035 |s2cid = 1978847
}}
* {{cite journal
|last1=Wilkes |first1=M. V. |author1-link=Maurice Wilkes |last2=Stringer |first2=J. B. |author2-link=John Bentley Stringer
| date = April 1953
| title = Microprogramming and the Design of the Control Circuits in an Electronic Digital Computer
| journal = Proc. Cambridge Phil. Soc
| volume = 49
| issue = pt. 2
| pages = 230–238
| url = http://research.microsoft.com/~gbell/Computer_Structures_Principles_and_Examples/csp0174.htm
| access-date = August 23, 2006
| doi = 10.1017/S0305004100028322| bibcode = 1953PCPS...49..230W
|s2cid=62230627 }}
* {{cite book
| last=Husson |first=S.S.
| year = 1970
| title = Microprogramming Principles and Practices
| publisher = Prentice-Hall
| isbn = 0-13-581454-5
| url-access = registration
| url = https://archive.org/details/microprogramming00huss}}
* {{cite journal
| last = Tucker
| first = S.G.
| year = 1967
| url = http://domino.research.ibm.com/tchjr/journalindex.nsf/a3807c5b4823c53f85256561006324be/758c1e6a8a3e5d0285256bfa00685a2f?OpenDocument
| title = Microprogram control for SYSTEM/360
| journal = IBM Systems Journal
| volume = 6
| issue = 4
| pages = 222–241
| doi = 10.1147/sj.64.0222
}}
* {{cite web |first=Ken |last=Shirriff |title=How the 8086 processor's microcode engine works |date=December 2022 |url=https://www.righto.com/2022/11/how-8086-processors-microcode-engine.html}}





Latest revision as of 05:10, 15 January 2025

Layer of hardware-level instructions or data structures For the CAD software vendor, see MicroCode Engineering, Inc.

In processor design, microcode serves as an intermediary layer situated between the central processing unit (CPU) hardware and the programmer-visible instruction set architecture of a computer, also known as its machine code. It consists of a set of hardware-level instructions that implement the higher-level machine code instructions or control internal finite-state machine sequencing in many digital processing components. While microcode is utilized in Intel and AMD general-purpose CPUs in contemporary desktops and laptops, it functions only as a fallback path for scenarios that the faster hardwired control unit is unable to manage.

Housed in special high-speed memory, microcode translates machine instructions, state machine data, or other input into sequences of detailed circuit-level operations. It separates the machine instructions from the underlying electronics, thereby enabling greater flexibility in designing and altering instructions. Moreover, it facilitates the construction of complex multi-step instructions, while simultaneously reducing the complexity of computer circuits. The act of writing microcode is often referred to as microprogramming, and the microcode in a specific processor implementation is sometimes termed a microprogram.

Through extensive microprogramming, microarchitectures of smaller scale and simplicity can emulate more robust architectures with wider word lengths, additional execution units, and so forth. This approach provides a relatively straightforward method of ensuring software compatibility between different products within a processor family.

Some hardware vendors, notably IBM and Lenovo, use the term microcode interchangeably with firmware. In this context, all code within a device is termed microcode, whether it is microcode or machine code. For instance, updates to a hard disk drive's microcode often encompass updates to both its microcode and firmware.

Overview

Instruction sets

At the hardware level, processors contain a number of separate areas of circuitry, or "units", that perform different tasks. Commonly found units include the arithmetic logic unit (ALU) which performs instructions such as addition or comparing two numbers, circuits for reading and writing data to external memory, and small areas of onboard memory to store these values while they are being processed. In most designs, additional high-performance memory, the register file, is used to store temporary values, not just those needed by the current instruction.

To properly perform an instruction, the various circuits have to be activated in order. For instance, it is not possible to add two numbers if they have not yet been loaded from memory. In RISC designs, the proper ordering of these instructions is largely up to the programmer, or at least to the compiler of the programming language they are using. So to add two numbers, for instance, the compiler may output instructions to load one of the values into one register, the second into another, call the addition function in the ALU, and then write the result back out to memory.

Because completing this higher-level concept, "add these two numbers in memory", may require multiple instructions, this can represent a performance bottleneck if those instructions are stored in main memory. Reading those instructions one by one takes up time that could be used to read and write the actual data. For this reason, it is common for non-RISC designs to have many different instructions that differ largely in where they store data. For instance, the MOS 6502 has eight variations of the addition instruction, ADC, which differ only in where they look to find the two operands.

Using the variation of the instruction, or "opcode", that most closely matches the ultimate operation can reduce the number of instructions to one, saving memory used by the program code and improving performance by leaving the data bus open for other operations. Internally, however, these instructions are not separate operations, but sequences of the operations the units actually perform. Converting a single instruction read from memory into the sequence of internal actions is the duty of the control unit, another unit within the processor.

Microcode

The basic idea behind microcode is to replace the custom hardware logic implementing the instruction sequencing with a series of simple instructions run in a "microcode engine" in the processor. Whereas a custom logic system might have a series of diodes and gates that output a series of voltages on various control lines, the microcode engine is connected to these lines instead, and these are turned on and off as the engine reads the microcode instructions in sequence. The microcode instructions are often bit-encoded to those lines; for instance, if bit 8 is true, that might mean that the ALU should be paused awaiting data. In this respect microcode is somewhat similar to the paper rolls in a player piano, where the holes represent which key should be pressed.

The distinction between custom logic and microcode may seem small: one uses a pattern of diodes and gates to decode the instruction and produce a sequence of signals, whereas the other encodes the signals as microinstructions that are read in sequence to produce the same results. The critical difference is that in a custom logic design, changes to the individual steps require the hardware to be redesigned. Using microcode, all that changes is the code stored in the memory containing the microcode. This makes it much easier to fix problems in a microcode system. It also means that there is no effective limit to the complexity of the instructions; it is limited only by the amount of memory one is willing to use.
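The bit-encoding idea can be illustrated with a minimal Python sketch. The control-line names, their bit positions, and the three-step microprogram are all hypothetical and chosen only for illustration; no real processor is being described.

```python
# Hypothetical control lines; bit 0 is the first name in the list.
CONTROL_LINES = ["REG_READ", "REG_WRITE", "ALU_ADD", "ALU_WAIT", "MEM_READ", "MEM_WRITE"]


def active_lines(microinstruction: int) -> list[str]:
    """Return the control lines whose bit is set in the microinstruction word."""
    return [
        name
        for bit, name in enumerate(CONTROL_LINES)
        if (microinstruction >> bit) & 1
    ]


def run(microprogram: list[int]) -> list[list[str]]:
    """Read the microinstructions in sequence, as the microcode engine would."""
    return [active_lines(word) for word in microprogram]


# An invented microprogram: load an operand, add it, store the result.
ADD_FROM_MEMORY = [
    0b010010,  # MEM_READ + REG_WRITE
    0b000101,  # REG_READ + ALU_ADD
    0b100001,  # REG_READ + MEM_WRITE
]
```

Calling `run(ADD_FROM_MEMORY)` yields one list of active lines per step. Changing the behaviour means editing the data in `ADD_FROM_MEMORY`, not the `run` function, which mirrors the point that a microcode fix changes stored code rather than hardware.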

The lowest layer in a computer's software stack is traditionally raw machine code instructions for the processor. In microcoded processors, fetching and decoding those instructions, and executing them, may be done by microcode. To avoid confusion, each microprogram-related element is differentiated by the micro prefix: microinstruction, microassembler, microprogrammer, etc.

Complex digital processors may also employ more than one (possibly microcode-based) control unit in order to delegate sub-tasks that must be performed essentially asynchronously in parallel. For example, the VAX 9000 has a hardwired IBox unit to fetch and decode instructions, which it hands to a microcoded EBox unit to be executed, and the VAX 8800 has both a microcoded IBox and a microcoded EBox.

A high-level programmer, or even an assembly language programmer, does not normally see or change microcode. Unlike machine code, which often retains some backward compatibility among different processors in a family, microcode only runs on the exact electronic circuitry for which it is designed, as it constitutes an inherent part of the particular processor design itself.

Design

Engineers normally write the microcode during the design phase of a processor, storing it in a read-only memory (ROM) or programmable logic array (PLA) structure, or in a combination of both. However, machines also exist that have some or all microcode stored in static random-access memory (SRAM) or flash memory. The memory holding the microcode is called the control store; it can be either read-only or read–write memory, and in the latter case it is traditionally called a writable control store. With a writable control store, the CPU initialization process loads microcode into the control store from another storage medium, with the possibility of altering the microcode to correct bugs in the instruction set, or to implement new machine instructions.

Microprograms

Microprograms consist of series of microinstructions, which control the CPU at a very fundamental level of hardware circuitry. For example, a single typical horizontal microinstruction might specify the following operations:

  • Connect register 1 to the A side of the ALU
  • Connect register 7 to the B side of the ALU
  • Set the ALU to perform two's-complement addition
  • Set the ALU's carry input to zero
  • Store the result value in register 8
  • Update the condition codes from the ALU status flags (negative, zero, overflow, and carry)
  • Microjump to a given μPC address for the next microinstruction

To control all of the processor's features simultaneously in one cycle, the microinstruction is often wider than 50 bits; e.g., 128 bits on a 360/85 with an emulator feature. Microprograms are carefully designed and optimized for the fastest possible execution, as a slow microprogram would result in a slow machine instruction and degraded performance for related application programs that use such instructions.
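As a rough illustration, the horizontal microinstruction listed above can be packed into a single word. The field names, field widths, and the `ALU_ADD` code below are assumptions made for this sketch; real machines use their own, usually much wider, layouts.

```python
# (field name, width in bits), least significant field first.
# This 30-bit layout is hypothetical and far narrower than real designs.
FIELDS = [
    ("a_src", 4),      # register connected to the A side of the ALU
    ("b_src", 4),      # register connected to the B side of the ALU
    ("alu_op", 4),     # ALU function select
    ("carry_in", 1),   # ALU carry input
    ("dest", 4),       # register that receives the result
    ("update_cc", 1),  # update condition codes from the ALU flags
    ("next_upc", 12),  # microjump target
]

ALU_ADD = 0b0001  # hypothetical code for two's-complement addition


def pack(**values: int) -> int:
    """Pack named field values into one microinstruction word."""
    word, shift = 0, 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << shift
        shift += width
    return word


def unpack(word: int) -> dict[str, int]:
    """Split a microinstruction word back into its named fields."""
    fields, shift = {}, 0
    for name, width in FIELDS:
        fields[name] = (word >> shift) & ((1 << width) - 1)
        shift += width
    return fields


# The example from the list: R1 + R7 -> R8, carry-in 0, update flags, microjump.
WORD = pack(a_src=1, b_src=7, alu_op=ALU_ADD, carry_in=0,
            dest=8, update_cc=1, next_upc=0x2A)
```

Because every field occupies its own bits, all of the listed operations are specified at once in one word, which is why horizontal microinstructions grow wide as the number of controlled units increases.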

Justification

Microcode was originally developed as a simpler method of developing the control logic for a computer. Initially, CPU instruction sets were hardwired. Each step needed to fetch, decode, and execute the machine instructions (including any operand address calculations, reads, and writes) was controlled directly by combinational logic and rather minimal sequential state machine circuitry. While such hard-wired processors were very efficient, the need for powerful instruction sets with multi-step addressing and complex operations (see below) made them difficult to design and debug; highly encoded and varied-length instructions can contribute to this as well, especially when very irregular encodings are used.

Microcode simplified the job by allowing much of the processor's behaviour and programming model to be defined via microprogram routines rather than by dedicated circuitry. Even late in the design process, microcode could easily be changed, whereas hard-wired CPU designs were very cumbersome to change. Thus, this greatly facilitated CPU design.

From the 1940s to the late 1970s, a large portion of programming was done in assembly language; higher-level instructions mean greater programmer productivity, so an important advantage of microcode was the relative ease with which powerful machine instructions can be defined. The ultimate extension of this is "Directly Executable High Level Language" designs, in which each statement of a high-level language such as PL/I is entirely and directly executed by microcode, without compilation. The IBM Future Systems project and Data General Fountainhead Processor are examples of this. During the 1970s, CPU speeds grew more quickly than memory speeds and numerous techniques such as memory block transfer, memory pre-fetch and multi-level caches were used to alleviate this. High-level machine instructions, made possible by microcode, helped further, as fewer, more complex machine instructions require less memory bandwidth. For example, an operation on a character string can be done as a single machine instruction, thus avoiding multiple instruction fetches.

Architectures with instruction sets implemented by complex microprograms included the IBM System/360 and Digital Equipment Corporation VAX. The approach of increasingly complex microcode-implemented instruction sets was later called complex instruction set computer (CISC). An alternate approach, used in many microprocessors, is to use one or more programmable logic array (PLA) or read-only memory (ROM) (instead of combinational logic) mainly for instruction decoding, and let a simple state machine (without much, or any, microcode) do most of the sequencing. The MOS Technology 6502 is an example of a microprocessor using a PLA for instruction decode and sequencing. The PLA is visible in photomicrographs of the chip, and its operation can be seen in the transistor-level simulation.

Microprogramming is still used in modern CPU designs. In some cases, after the microcode is debugged in simulation, logic functions are substituted for the control store. Logic functions are often faster and less expensive than the equivalent microprogram memory.

Benefits

A processor's microprograms operate on a more primitive, totally different, and much more hardware-oriented architecture than the assembly instructions visible to normal programmers. In coordination with the hardware, the microcode implements the programmer-visible architecture. The underlying hardware need not have a fixed relationship to the visible architecture. This makes it easier to implement a given instruction set architecture on a wide variety of underlying hardware micro-architectures.

The IBM System/360 has a 32-bit architecture with 16 general-purpose registers, but most of the System/360 implementations use hardware that implements a much simpler underlying microarchitecture. For example, the System/360 Model 30 has 8-bit data paths to the arithmetic logic unit (ALU) and main memory and implements the general-purpose registers in a special unit of higher-speed core memory, while the System/360 Model 40 has 8-bit data paths to the ALU and 16-bit data paths to main memory and also implements the general-purpose registers in a special unit of higher-speed core memory. The Model 50 has full 32-bit data paths and implements the general-purpose registers in a special unit of higher-speed core memory. The Model 65 through the Model 195 have larger data paths and implement the general-purpose registers in faster transistor circuits. In this way, microprogramming enabled IBM to design many System/360 models with substantially different hardware and spanning a wide range of cost and performance, while making them all architecturally compatible. This dramatically reduces the number of unique system software programs that must be written for each model.

A similar approach was used by Digital Equipment Corporation (DEC) in their VAX family of computers. As a result, different VAX processors use different microarchitectures, yet the programmer-visible architecture does not change.

Microprogramming also reduces the cost of field changes to correct defects (bugs) in the processor; a bug can often be fixed by replacing a portion of the microprogram rather than by changes being made to hardware logic and wiring.

History

Early examples

In 1947, the design of the MIT Whirlwind introduced the concept of a control store as a way to simplify computer design and move beyond ad hoc methods. The control store is a diode matrix: a two-dimensional lattice, where one dimension accepts "control time pulses" from the CPU's internal clock, and the other connects to control signals on gates and other circuits. A "pulse distributor" takes the pulses generated by the CPU clock and breaks them up into eight separate time pulses, each of which activates a different row of the lattice. When the row is activated, it activates the control signals connected to it.

In 1951, Maurice Wilkes enhanced this concept by adding conditional execution, a concept akin to a conditional in computer software. His initial implementation consisted of a pair of matrices: the first one generated signals in the manner of the Whirlwind control store, while the second matrix selected which row of signals (the microprogram instruction word, so to speak) to invoke on the next cycle. Conditionals were implemented by providing a way that a single line in the control store could choose from alternatives in the second matrix. This made the control signals conditional on the detected internal signal. Wilkes coined the term microprogramming to describe this feature and distinguish it from a simple control store.
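The two-matrix arrangement with a conditional can be sketched as a small simulation. Everything in it is invented for illustration: the four rows, the signal names, and the single "zero" condition are assumptions, not a reconstruction of Wilkes's actual design.

```python
# Hypothetical four-row control store. "signals" stands for the first matrix
# (which control lines fire); "next" stands for the second matrix (which row
# runs on the following cycle). A conditional row lists two alternatives.
CONTROL_STORE = {
    0: {"signals": ["load_counter"], "next": 1},
    1: {"signals": ["decrement"], "next": 2},
    2: {"signals": [], "test": "zero", "next": {False: 1, True: 3}},
    3: {"signals": ["halt"], "next": None},
}


def run(start_value: int, max_cycles: int = 1000) -> list[int]:
    """Return the rows visited while counting start_value down to zero."""
    counter, row, visited = 0, 0, []
    for _ in range(max_cycles):
        entry = CONTROL_STORE[row]
        visited.append(row)
        for signal in entry["signals"]:
            if signal == "load_counter":
                counter = start_value
            elif signal == "decrement":
                counter -= 1
        next_row = entry["next"]
        if "test" in entry:
            # The detected internal signal selects one of the alternatives.
            flags = {"zero": counter == 0}
            next_row = next_row[flags[entry["test"]]]
        if next_row is None:
            return visited
        row = next_row
    raise RuntimeError("microprogram did not halt")
```

With a start value of 2, `run(2)` visits rows 0, 1, 2, 1, 2, 3: row 2 loops back to row 1 until the counter reaches zero. Without the conditional row, the store could only step through a fixed sequence.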

The 360

Main article: System/360

Microcode remained relatively rare in computer design as the cost of the ROM needed to store the code was not significantly different from the cost of using a custom control store. This changed through the early 1960s with the introduction of mass-produced core memory and core rope, which were far less expensive than dedicated logic based on diode arrays or similar solutions. The first to take real advantage of this was IBM in their 1964 System/360 series. This allowed the machines to have a very complex instruction set, including operations that matched high-level language constructs like formatting binary values as decimal strings, with the complex series of instructions needed for this task stored in low-cost memory.

But the real value in the 360 line was that one could build a series of machines that were completely different internally, yet ran the same ISA. For a low-end machine, one might use an 8-bit ALU that requires multiple cycles to complete a single 32-bit addition, while a higher-end machine might have a full 32-bit ALU that performs the same addition in a single cycle. These differences could be implemented in control logic, but the cost of implementing a completely different decoder for each machine would be prohibitive. Using microcode meant all that changed was the code in the ROM. For instance, one machine might include a floating point unit and thus its microcode for multiplying two numbers might be only a few lines long, whereas on the same machine without the FPU this would be a program that did the same using multiple additions, and all that changed was the ROM.
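The 8-bit versus 32-bit ALU comparison can be made concrete with a short sketch. The function names are hypothetical, and the code models only the arithmetic, not any particular System/360 model.

```python
def alu8_add(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """One pass through an 8-bit adder: returns (sum byte, carry out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8


def add32_narrow(x: int, y: int) -> tuple[int, int]:
    """32-bit addition as four 8-bit ALU passes, least significant byte first."""
    result, carry = 0, 0
    for i in range(4):
        a = (x >> (8 * i)) & 0xFF
        b = (y >> (8 * i)) & 0xFF
        byte, carry = alu8_add(a, b, carry)
        result |= byte << (8 * i)
    return result, carry


def add32_wide(x: int, y: int) -> tuple[int, int]:
    """The same addition on a full 32-bit ALU, in a single pass."""
    total = x + y
    return total & 0xFFFFFFFF, total >> 32
```

Both functions return the same (sum, carry) pair for any two 32-bit inputs, so a program cannot tell them apart except by speed. In a microcoded machine the four-pass loop would live in the microprogram, which is why the same instruction set could run on hardware with very different data paths.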

The outcome of this design was that customers could use a low-end model of the family to develop their software, knowing that if more performance was ever needed, they could move to a faster version and nothing else would change. This lowered the barrier to entry and the 360 was a runaway success. By the end of the decade, the use of microcode was de rigueur across the mainframe industry.

Moving up the line

The microcode (and "nanocode") of the Motorola 68000 is stored in the two large square blocks in the upper right and controlled by circuitry to the right of it. It takes up a significant amount of the total chip surface.

Early minicomputers were far too simple to require microcode, and were more similar to earlier mainframes in terms of their instruction sets and the way they were decoded. But it was not long before their designers began using more powerful integrated circuits that allowed for more complex ISAs. By the mid-1970s, most new minicomputers and superminicomputers were using microcode as well, such as most models of the PDP-11 and, most notably, most models of the VAX, which included high-level instructions not unlike those found in the 360.

The same basic evolution occurred with microprocessors as well. Early designs were extremely simple, and even the more powerful 8-bit designs of the mid-1970s like the Zilog Z80 had instruction sets that were simple enough to be implemented in dedicated logic. By this time, the control logic could be patterned into the same die as the CPU, making the difference in cost between ROM and logic less of an issue. However, it was not long before these companies were also facing the problem of introducing higher-performance designs but still wanting to offer backward compatibility. Among early examples of microcode in micros was the Intel 8086.

Among the ultimate implementations of microcode in microprocessors is the Motorola 68000. This offered a highly orthogonal instruction set with a wide variety of addressing modes, all implemented in microcode. This did not come without cost: according to early articles, about 20% of the chip's surface area (and thus cost) is the microcode system, and a portion of the system's 68,000 transistors was part of the microcode system.

RISC enters

While companies continued to compete on the complexity of their instruction sets, and the use of microcode to implement these was unquestioned, in the mid-1970s an internal project in IBM was raising serious questions about the entire concept. As part of a project to develop a high-performance all-digital telephone switch, a team led by John Cocke began examining huge volumes of performance data from their customers' 360 (and System/370) programs. This led them to notice a curious pattern: when the ISA presented multiple versions of an instruction, the compiler almost always used the simplest one, instead of the one most directly representing the code. They learned that this was because those instructions were always implemented in hardware, and thus ran the fastest. Using the other instruction might offer higher performance on some machines, but there was no way to know what machine they were running on. This defeated the purpose of using microcode in the first place, which was to hide these distinctions.

The team came to a radical conclusion: "Imposing microcode between a computer and its users imposes an expensive overhead in performing the most frequently executed instructions."

The result of this discovery was what is today known as the RISC concept. The complex microcode engine and its associated ROM are reduced or eliminated completely, and those circuits are instead dedicated to things like additional registers or a wider ALU, which increases the performance of every program. When complex sequences of instructions are needed, this is left to the compiler, which is the entire purpose of using a compiler in the first place. The basic concept was soon picked up by university researchers in California, where simulations suggested such designs would trivially outperform even the fastest conventional designs. It was one such project, at the University of California, Berkeley, that introduced the term RISC.

The industry responded to the concept of RISC with both confusion and hostility, including a famous dismissive article by the VAX team at Digital. A major point of contention was that implementing the instructions outside of the processor meant it would spend much more time reading those instructions from memory, thereby slowing overall performance no matter how fast the CPU itself ran. Proponents pointed out that simulations clearly showed the number of instructions was not much greater, especially when considering compiled code.

The debate raged until the first commercial RISC designs emerged in the second half of the 1980s, which easily outperformed the most complex designs from other companies. By the late 1980s it was over; even DEC was abandoning microcode for their DEC Alpha designs, and CISC processors switched to using hardwired circuitry, rather than microcode, to perform many functions. For example, the Intel 80486 uses hardwired circuitry to fetch and decode instructions, using microcode only to execute instructions; register-register move and arithmetic instructions required only one microinstruction, allowing them to be completed in one clock cycle. The Pentium Pro's fetch and decode hardware fetches instructions and decodes them into series of micro-operations that are passed on to the execution unit, which schedules and executes the micro-operations, possibly doing so out-of-order. Complex instructions are implemented by microcode that consists of predefined sequences of micro-operations.

Some processor designs use machine code that runs in a special mode, with special instructions, available only in that mode, that have access to processor-dependent hardware, to implement some low-level features of the instruction set. The DEC Alpha, a pure RISC design, used PALcode to implement features such as translation lookaside buffer (TLB) miss handling and interrupt handling, as well as providing, for Alpha-based systems running OpenVMS, instructions requiring interlocked memory access that are similar to instructions provided by the VAX architecture. CMOS IBM System/390 CPUs, starting with the G4 processor, and z/Architecture CPUs use millicode to implement some instructions.

Examples

  • The Analytical engine envisioned by Charles Babbage uses pegs inserted into rotating drums to store its internal procedures.
  • The EMIDEC 1100 reputedly uses a hard-wired control store consisting of wires threaded through ferrite cores, known as "the laces".
  • Most models of the IBM System/360 series are microprogrammed:
    • The Model 25 is unique among System/360 models in using the top 16 K bytes of core storage to hold the control storage for the microprogram. The 2025 uses a 16-bit microarchitecture with seven control words (or microinstructions). After system maintenance or when changing operating mode, the microcode is loaded from the card reader, tape, or other device. The IBM 1410 emulation for this model is loaded this way.
    • The Model 30 uses an 8-bit microarchitecture with only a few hardware registers; everything that the programmer sees is emulated by the microprogram. The microcode for this model is also held on special punched cards, which are stored inside the machine in a dedicated reader per card, called "CROS" units (Capacitor Read-Only Storage). Another CROS unit is added for machines ordered with 1401/1440/1460 emulation and for machines ordered with 1620 emulation.
    • The Model 40 uses 56-bit control words. The 2040 box implements both the System/360 main processor and the multiplex channel (the I/O processor). This model uses TROS dedicated readers similar to CROS units, but with an inductive pickup (Transformer Read-only Store).
    • The Model 50 has two internal datapaths which operate in parallel: a 32-bit datapath used for arithmetic operations, and an 8-bit datapath used in some logical operations. The control store uses 90-bit microinstructions.
    • The Model 85 has separate instruction fetch (I-unit) and execution (E-unit) to provide high performance. The I-unit is hardware controlled. The E-unit is microprogrammed; the control words are 108 bits wide on a basic 360/85 and wider if an emulator feature is installed.
  • The NCR 315 is microprogrammed with hand wired ferrite cores (a ROM) pulsed by a sequencer with conditional execution. Wires routed through the cores are enabled for various data and logic elements in the processor.
  • The Digital Equipment Corporation PDP-9 processor, KL10 and KS10 PDP-10 processors, and PDP-11 processors with the exception of the PDP-11/20, are microprogrammed.
  • Most Data General Eclipse minicomputers are microprogrammed. The task of writing microcode for the Eclipse MV/8000 is detailed in the Pulitzer Prize-winning book titled The Soul of a New Machine.
  • Many systems from Burroughs are microprogrammed:
    • The B700 "microprocessor" executes application-level opcodes using sequences of 16-bit microinstructions stored in main memory; each of these is either a register-load operation or mapped to a single 56-bit "nanocode" instruction stored in read-only memory. This allows comparatively simple hardware to act either as a mainframe peripheral controller or to be packaged as a standalone computer.
    • The B1700 is implemented with radically different hardware including bit-addressable main memory but has a similar multi-layer organisation. The operating system preloads the interpreter for whatever language is required. These interpreters present different virtual machines for COBOL, Fortran, etc.
  • Microdata produced computers in which the microcode is accessible to the user; this allows the creation of custom assembler level instructions. Microdata's Reality operating system design makes extensive use of this capability.
  • The Xerox Alto workstation used a microcoded design but, unlike many computers, the microcode engine is not hidden from the programmer in a layered design. Applications take advantage of this to accelerate performance.
  • The IBM System/38 is described as having both horizontal and vertical microcode. In practice, the processor implements an instruction set architecture named the Internal Microprogrammed Interface (IMPI) using a horizontal microcode format. The so-called vertical microcode layer implements the System/38's hardware-independent Machine Interface (MI) instruction set by translating MI code to IMPI code and executing it. Prior to the introduction of the IBM RS64 processor line, early IBM AS/400 systems used the same architecture.
  • The Nintendo 64's Reality Coprocessor (RCP), which serves as the console's graphics processing unit and audio processor, utilizes microcode; it is possible to implement new effects or tweak the processor to achieve the desired output. Some notable examples of custom RCP microcode include the high-resolution graphics, particle engines, and unlimited draw distances found in Factor 5's Indiana Jones and the Infernal Machine, Star Wars: Rogue Squadron, and Star Wars: Battle for Naboo; and the full motion video playback found in Angel Studios' Resident Evil 2.
Further information on Nintendo 64 microcode: Nintendo 64 programming characteristics and Nintendo 64 Game Pak

Implementation

Each microinstruction in a microprogram provides the bits that control the functional elements that internally compose a CPU. The advantage over a hard-wired CPU is that internal CPU control becomes a specialized form of a computer program. Microcode thus transforms a complex electronic design challenge (the control of a CPU) into a less complex programming challenge. To take advantage of this, a CPU is divided into several parts:

  • An I-unit may decode instructions in hardware and determine the microcode address for processing the instruction in parallel with the E-unit.
  • A microsequencer picks the next word of the control store. A sequencer is mostly a counter, but usually also has some way to jump to a different part of the control store depending on some data, usually data from the instruction register and always some part of the control store. The simplest sequencer is just a register loaded from a few bits of the control store.
  • A register set is a fast memory containing the data of the central processing unit. It may include registers visible to application programs, such as general-purpose registers and the program counter, and may also include other registers that are not easily accessible to the application programmer. Often the register set is a triple-ported register file; that is, two registers can be read, and a third written at the same time.
  • An arithmetic and logic unit performs calculations, usually addition, logical negation, a right shift, and logical AND. It often performs other functions, as well.
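
The microsequencer item in the list above can be illustrated with a short Python sketch. It is not a model of any real machine: the operation names `NEXT`, `JMP`, and `DISPATCH`, the toy control store, and the dispatch table are all assumptions made for the example.

```python
# Hypothetical microsequencer: mostly a counter, with jumps driven by
# bits from the current control word and, for dispatch, by the opcode
# held in the instruction register. All names here are illustrative.

def next_address(upc, seq_op, jump_addr, opcode, dispatch_table):
    """Return the next control-store address."""
    if seq_op == "NEXT":        # plain counter behaviour
        return upc + 1
    if seq_op == "JMP":         # address taken from the control word
        return jump_addr
    if seq_op == "DISPATCH":    # address chosen by the instruction register
        return dispatch_table[opcode]
    raise ValueError(f"unknown sequencer operation: {seq_op}")


# A toy control store: each word is (description, seq_op, jump_addr).
CONTROL_STORE = [
    ("decode", "DISPATCH", None),    # 0: instruction decode
    ("jump step 1", "NEXT", None),   # 1
    ("jump step 2", "JMP", 0),       # 2: back to decode
    ("nop step", "JMP", 0),          # 3: back to decode
]
DISPATCH = {"JUMP": 1, "NOP": 3}


def trace(opcode, start=0, steps=4):
    """Follow the sequencer for a few ticks and return visited addresses."""
    upc, visited = start, []
    for _ in range(steps):
        visited.append(upc)
        _, seq_op, jump_addr = CONTROL_STORE[upc]
        upc = next_address(upc, seq_op, jump_addr, opcode, DISPATCH)
    return visited
```

Calling `trace("JUMP")` walks from the decode word through both steps of the hypothetical jump routine and back to decode, which shows the counter and the two kinds of jump working together.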

There may also be a memory address register and a memory data register, used to access the main computer storage. Together, these elements form an "execution unit". Most modern CPUs have several execution units. Even simple computers usually have one unit to read and write memory, and another to execute user code. These elements could often be combined on a single chip of fixed width that forms a "slice" through the execution unit; such chips are known as "bit slice" chips. The AMD Am2900 family is one of the best known examples of bit slice elements. The parts of the execution units, and the execution units themselves, are interconnected by a bundle of wires called a bus.
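
The idea of building a wide execution unit from fixed-width slices can be sketched as follows. This is only an illustration of carry chaining between slices; the 4-bit width and the function names are assumptions, and the code does not model the Am2900 parts themselves.

```python
# Illustrative 4-bit adder "slice"; wider units are built by feeding the
# carry out of one slice into the carry in of the next.

SLICE_BITS = 4
SLICE_MASK = (1 << SLICE_BITS) - 1


def slice_add(a, b, carry_in):
    """Add two 4-bit values plus a carry; return (4-bit sum, carry out)."""
    total = (a & SLICE_MASK) + (b & SLICE_MASK) + carry_in
    return total & SLICE_MASK, total >> SLICE_BITS


def sliced_add(a, b, n_slices=4):
    """Add two words using n_slices chained slices (16 bits by default)."""
    result, carry = 0, 0
    for i in range(n_slices):
        shift = i * SLICE_BITS
        partial, carry = slice_add(a >> shift, b >> shift, carry)
        result |= partial << shift
    return result, carry
```

With the default of four slices, `sliced_add` behaves like a 16-bit adder with a carry-out flag; changing `n_slices` changes the word width without touching the slice itself.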

Programmers develop microprograms, using basic software tools. A microassembler allows a programmer to define the table of bits symbolically. Because of its close relationship to the underlying architecture, "microcode has several properties that make it difficult to generate using a compiler." A simulator program is intended to execute the bits in the same way as the electronics, and allows much more freedom to debug the microprogram. After the microprogram is finalized, and extensively tested, it is sometimes used as the input to a computer program that constructs logic to produce the same data. This program is similar to those used to optimize a programmable logic array. Even without fully optimal logic, heuristically optimized logic can vastly reduce the number of transistors from the number needed for a read-only memory (ROM) control store. This reduces the cost to produce, and the electricity used by, a CPU.
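
As a rough sketch of what a microassembler does, the following Python turns symbolic lines into a table of numbers in two passes. The mnemonic tables, the six-field line format, and the `#` comment syntax are assumptions chosen for the example rather than the conventions of any particular tool.

```python
# Minimal two-pass microassembler sketch. The mnemonic tables and the
# six-field line format are assumptions made for this example.

REGISTERS = {"NONE": 0, "PC": 1, "MAR": 2, "MDR": 3}
ALU_OPS = {"NOP": 0, "COPY": 1, "ADD": 2}
JUMPS = {"NEXT": 0, "JMP": 1}


def assemble(source):
    """Translate symbolic lines into tuples of field numbers."""
    # Pass 1: strip comments and record the address of each label.
    labels, lines = {}, []
    for raw in source.splitlines():
        text = raw.split("#", 1)[0].strip()
        if not text:
            continue
        if text.endswith(":"):
            labels[text[:-1]] = len(lines)
        else:
            lines.append([part.strip() for part in text.split(",")])

    # Pass 2: replace every symbol with its number.
    table = []
    for src_a, src_b, dest, op, jump, target in lines:
        table.append((
            REGISTERS[src_a],
            REGISTERS[src_b],
            REGISTERS[dest],
            ALU_OPS[op],
            JUMPS[jump],
            0 if target == "NONE" else labels[target],
        ))
    return table
```

A misspelled mnemonic or an undefined label raises `KeyError`, which is the kind of early check that makes symbolic source easier to work with than a hand-written table of bits.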

Microcode can be characterized as horizontal or vertical, referring primarily to whether each microinstruction controls CPU elements with little or no decoding (horizontal microcode) or requires extensive decoding by combinatorial logic before doing so (vertical microcode). Consequently, each horizontal microinstruction is wider (contains more bits) and occupies more storage space than a vertical microinstruction.

Horizontal microcode

"Horizontal microcode has several discrete micro-operations that are combined in a single microinstruction for simultaneous operation." Horizontal microcode is typically contained in a fairly wide control store; it is not uncommon for each word to be 108 bits or more. On each tick of a sequencer clock a microcode word is read, decoded, and used to control the functional elements that make up the CPU.

In a typical implementation a horizontal microprogram word comprises fairly tightly defined groups of bits. For example, one simple arrangement might be:

Register source A | Register source B | Destination register | Arithmetic and logic unit operation | Type of jump | Jump address
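
To make the idea of "tightly defined groups of bits" concrete, here is a minimal Python sketch that packs the six fields listed above into one word and splits it apart again. The field widths (and therefore the 30-bit total) are assumptions for illustration; real horizontal microwords are usually much wider.

```python
# Hypothetical layout for the six-field word above; the widths are
# assumptions chosen for illustration only.
FIELDS = [            # (name, width in bits), most significant first
    ("src_a", 4),
    ("src_b", 4),
    ("dest", 4),
    ("alu_op", 4),
    ("jump_type", 2),
    ("jump_addr", 12),
]
WORD_BITS = sum(width for _, width in FIELDS)


def pack(**values):
    """Pack named field values into one microinstruction word."""
    word = 0
    for name, width in FIELDS:
        value = values.get(name, 0)   # unused fields default to 0 (no-op)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(word):
    """Split a microinstruction word back into its named fields."""
    fields = {}
    for name, width in reversed(FIELDS):
        fields[name] = word & ((1 << width) - 1)
        word >>= width
    return fields
```

Because every field has its own fixed position, the hardware can route each group of bits straight to the unit it controls, which is the defining property of a horizontal format.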

For this type of micromachine to implement a JUMP instruction with the address following the opcode, the microcode might require two clock ticks. The engineer designing it would write microassembler source code looking something like this:

   # Any line starting with a number-sign is a comment
   # This is just a label, the ordinary way assemblers symbolically represent a 
   # memory address.
 InstructionJUMP:
       # To prepare for the next instruction, the instruction-decode microcode has already
       # moved the program counter to the memory address register. This instruction fetches
       # the target address of the jump instruction from the memory word following the
       # jump opcode, by copying from the memory data register to the memory address register.
       # This gives the memory system two clock ticks to fetch the next 
       # instruction to the memory data register for use by the instruction decode.
       # The sequencer instruction "next" means just add 1 to the control word address.
    MDR, NONE, MAR, COPY, NEXT, NONE
       # This places the address of the next instruction into the PC.
       # This gives the memory system a clock tick to finish the fetch started on the
       # previous microinstruction.
       # The sequencer instruction is to jump to the start of the instruction decode.
    MAR, 1, PC, ADD, JMP, InstructionDecode
       # The instruction decode is not shown, because it is usually a mess, very particular
       # to the exact processor being emulated. Even this example is simplified.
       # Many CPUs have several ways to calculate the address, rather than just fetching
       # it from the word following the op-code. Therefore, rather than just one
       # jump instruction, those CPUs have a family of related jump instructions.

For each tick it is common to find that only some portions of the CPU are used, with the remaining groups of bits in the microinstruction being no-ops. With careful design of hardware and microcode, this property can be exploited to parallelise operations that use different areas of the CPU; for example, in the case above, the ALU is not required during the first tick, so it could potentially be used to complete an earlier arithmetic instruction.
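
The effect of the two-microinstruction JUMP sequence can be checked with a toy simulation. The sketch below is a simplification under stated assumptions: the sequencer fields are omitted, memory is a Python dictionary, and a fetch started by loading the memory address register is modelled as completing immediately instead of taking one or two ticks.

```python
# Toy simulation of the two-microinstruction JUMP sequence.
# Memory timing is simplified: loading MAR is treated as an instant fetch.

def alu(op, a, b):
    """The two ALU operations used by the JUMP microprogram."""
    if op == "COPY":
        return a
    if op == "ADD":
        return a + b
    raise ValueError(f"unknown ALU operation: {op}")


def run_jump(memory, pc):
    """Run the JUMP microprogram; pc addresses the word after the opcode."""
    # State left behind by instruction decode: MAR = PC, and the memory
    # system has already delivered the jump target into MDR.
    regs = {"PC": pc, "MAR": pc, "MDR": memory[pc]}

    microprogram = [
        # (source A, source B, destination, ALU operation)
        ("MDR", None, "MAR", "COPY"),   # target address -> MAR
        ("MAR", 1, "PC", "ADD"),        # target + 1 -> PC
    ]
    for src_a, src_b, dest, op in microprogram:
        a = regs[src_a]
        b = src_b if isinstance(src_b, int) else 0   # constant or unused
        regs[dest] = alu(op, a, b)
        if dest == "MAR":
            # Loading MAR starts a fetch of the next instruction.
            regs["MDR"] = memory[regs["MAR"]]
    return regs
```

For a memory image such as `{101: 200, 200: 0x42}`, `run_jump(memory, 101)` leaves the instruction at the jump target in the memory data register and the address of the following word in the program counter.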

Vertical microcode

In vertical microcode, each microinstruction is significantly encoded, that is, the bit fields generally pass through intermediate combinatorial logic that, in turn, generates the control and sequencing signals for internal CPU elements (ALU, registers, etc.). This is in contrast with horizontal microcode, in which the bit fields either directly produce the control and sequencing signals or are only minimally encoded. Consequently, vertical microcode requires smaller instruction lengths and less storage, but requires more time to decode, resulting in a slower CPU clock.

Some vertical microcode is just the assembly language of a simple conventional computer that is emulating a more complex computer. Some processors, such as DEC Alpha processors and the CMOS microprocessors in later IBM System/390 and z/Architecture mainframes, implement some instructions and other functions, such as page table walks on Alpha processors, in machine code that runs in a special mode. This mode gives the code access to special instructions, special registers, and other hardware resources unavailable to regular machine code. This is called PALcode on Alpha processors and millicode on IBM mainframe processors.

Another form of vertical microcode has two fields:

Field select | Field value

The field select selects which part of the CPU will be controlled by this word of the control store. The field value controls that part of the CPU. With this type of microcode, a designer explicitly chooses to make a slower CPU to save money by reducing the unused bits in the control store; however, the reduced complexity may increase the CPU's clock frequency, which lessens the effect of an increased number of cycles per instruction.
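
The trade-off of the two-field format can be sketched in Python. The field names and the rule that a zero value means "unused" are assumptions for the example; the point is only that one wide word turns into several narrow ones.

```python
# Hypothetical vertical format: each word is (field_select, field_value)
# and sets one control field per tick. Field names are illustrative.

FIELD_NAMES = ["src_a", "src_b", "dest", "alu_op"]


def decode_vertical(words):
    """Apply (select, value) words in order; return the resulting controls."""
    controls = {name: 0 for name in FIELD_NAMES}
    for select, value in words:
        controls[FIELD_NAMES[select]] = value   # the decoding step
    return controls


def to_vertical(horizontal):
    """Expand one horizontal word (a dict of fields) into vertical words,
    one for each field that is actually used (non-zero)."""
    return [
        (FIELD_NAMES.index(name), value)
        for name, value in horizontal.items()
        if value != 0
    ]
```

A horizontal word that uses three of its four fields becomes three vertical words: no storage is spent on the unused field, but the same work now takes three control-store reads instead of one.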

As transistors grew cheaper, horizontal microcode came to dominate the design of CPUs using microcode, with vertical microcode being used less often.

When both vertical and horizontal microcode are used, the horizontal microcode may be referred to as nanocode or picocode.

Writable control store

Main article: Writable control store

A few computers were built using writable microcode. In this design, rather than storing the microcode in ROM or hard-wired logic, the microcode is stored in a RAM called a writable control store or WCS. Such a computer is sometimes called a writable instruction set computer (WISC).

Many experimental prototype computers use writable control stores; there are also commercial machines that use writable microcode, such as the Burroughs Small Systems, early Xerox workstations, the DEC VAX 8800 (Nautilus) family, the Symbolics L- and G-machines, a number of IBM System/360 and System/370 implementations, some DEC PDP-10 machines, and the Data General Eclipse MV/8000.

The IBM System/370 includes a facility called Initial-Microprogram Load (IML or IMPL) that can be invoked from the console, as part of power-on reset (POR) or from another processor in a tightly coupled multiprocessor complex.

Some commercial machines, for example IBM 360/85, have both a read-only storage and a writable control store for microcode.

WCS offers several advantages including the ease of patching the microprogram and, for certain hardware generations, faster access than ROMs can provide. User-programmable WCS allows the user to optimize the machine for specific purposes.

Starting with the Pentium Pro in 1995, several x86 CPUs have writable microcode. This has allowed, for example, bugs in the Intel Core 2 and Intel Xeon microcode to be fixed by patching the microprograms rather than replacing the chips. A second prominent example is the set of microcode patches that Intel offered for some of its processor architectures up to 10 years old, in a bid to counter the security vulnerabilities discovered in their designs, Spectre and Meltdown, which became public at the start of 2018. A microcode update can be installed by Linux, FreeBSD, Microsoft Windows, or the motherboard BIOS.

Some machines offer user-programmable writable control stores as an option, including the HP 2100, DEC PDP-11/60, TI-990/12, and Varian Data Machines V-70 series minicomputers.

Comparison to VLIW and RISC

The design trend toward heavily microcoded processors with complex instructions began in the early 1960s and continued until roughly the mid-1980s. At that point the RISC design philosophy started becoming more prominent.

A CPU that uses microcode generally takes several clock cycles to execute a single instruction, one clock cycle for each step in the microprogram for that instruction. Some CISC processors include instructions that can take a very long time to execute. Such variations interfere with both interrupt latency and, what is far more important in modern systems, pipelining.

When designing a new processor, a hardwired control RISC has the following advantages over microcoded CISC:

  • Programming has largely moved away from assembly level, so it's no longer worthwhile to provide complex instructions for productivity reasons.
  • Simpler instruction sets allow direct execution by hardware, avoiding the performance penalty of microcoded execution.
  • Analysis shows complex instructions are rarely used, hence the machine resources devoted to them are largely wasted.
  • The machine resources devoted to rarely used complex instructions are better used for expediting performance of simpler, commonly used instructions.
  • Complex microcoded instructions may require many clock cycles that vary, and are difficult to pipeline for increased performance.

There are counterpoints as well:

  • The complex instructions in heavily microcoded implementations may not take much extra machine resources, except for microcode space. For example, the same ALU is often used to calculate an effective address and to compute the result from the operands, e.g., the original Z80, 8086, and others.
  • The simpler non-RISC instructions (i.e., involving direct memory operands) are frequently used by modern compilers. Even immediate to stack (i.e., memory result) arithmetic operations are commonly employed. Although such memory operations, often with varying length encodings, are more difficult to pipeline, it is still fully feasible to do so, as exemplified by the i486, AMD K5, Cyrix 6x86, Motorola 68040, etc.
  • Non-RISC instructions inherently perform more work per instruction (on average), and are also normally highly encoded, so they enable smaller overall size of the same program, and thus better use of limited cache memories.

Many RISC and VLIW processors are designed to execute every instruction (as long as it is in the cache) in a single cycle. This is very similar to the way CPUs with microcode execute one microinstruction per cycle. VLIW processors have instructions that behave similarly to very wide horizontal microcode, although typically without such fine-grained control over the hardware as provided by microcode. RISC instructions are sometimes similar to the narrow vertical microcode.

Microcode has been popular in application-specific processors such as network processors, digital signal processors, channel controllers, disk controllers, network interface controllers, flash memory controllers, graphics processing units, and in other hardware.

Micro-operations

Modern CISC implementations, such as the x86 family starting with the NexGen Nx586, Intel Pentium Pro, and AMD K5, decode instructions into dynamically buffered micro-operations with an instruction encoding similar to RISC or traditional microcode. A hardwired instruction decode unit directly emits micro-operations for common x86 instructions, but falls back to a more traditional microcode ROM containing micro-operations for more complex or rarely used instructions.

For example, an x86 processor might look up micro-operations from microcode to handle complex multistep operations such as loop or string instructions, floating-point unit transcendental functions or unusual values such as denormal numbers, and special-purpose instructions such as CPUID.
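
The split between a hardwired fast path and a microcode ROM fallback can be sketched as a pair of lookup tables. The instruction strings and micro-operation names below are invented for illustration and do not reflect the micro-operations of any real x86 implementation.

```python
# Sketch of a decoder with a hardwired fast path and a microcode ROM
# fallback. Instruction names and micro-op sequences are hypothetical.

HARDWIRED = {
    "ADD r, r": ["add"],
    "ADD r, [m]": ["load", "add"],
}

MICROCODE_ROM = {
    "CPUID": ["read_id_0", "read_id_1", "write_regs"],
    "REP MOVSB": ["load", "store", "dec_count", "loop_if_nonzero"],
}


def decode(instruction):
    """Return (path, micro-ops) for an instruction."""
    if instruction in HARDWIRED:          # common case: no ROM lookup
        return "hardwired", HARDWIRED[instruction]
    if instruction in MICROCODE_ROM:      # rare or complex case
        return "microcode", MICROCODE_ROM[instruction]
    raise KeyError(f"undefined instruction: {instruction}")
```

In this sketch the common arithmetic forms never touch the ROM table, while `CPUID` and the string instruction are expanded from it, mirroring the division of labour described above.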

Notes

  1. IBM horizontally microcoded processors had multiple micro-orders and register select fields that required decoding.
