% File: 60001 - Advanced Computer Architecture/credit/credit.tex
\chapter{Credit}
\section*{Image Credit}
\begin{center}
    \begin{tabular}{r p{.8\textwidth}}
        \textbf{Front Cover} & Intel i386 die shot by Pauli Rautakorpi \href{https://commons.wikimedia.org/wiki/File:Intel_80386_DX_die.JPG}{on Wikimedia here}. \\
    \end{tabular}
\end{center}
% File: 60001 - Advanced Computer Architecture/oooscheduling/oooscheduling.tex
\chapter{Dynamic Scheduling}
\section{Bypassing Stalls}
The basic idea behind out-of-order scheduling is that instructions behind a stalled instruction are allowed to continue, provided data dependences and hazards permit.
\begin{itemize}
    \item When an instruction stalls (e.g. on a cache miss, or because forwarding is not possible), save the state of that instruction.
    \item Instructions are issued in order, have their dependencies analysed, and can then be executed out of order.
    \item When its operands become available, allow execution of the stalled instruction to continue.
\end{itemize}
\begin{definitionbox}{Read After Write / True Dependence}
    The output of one instruction is required as the input to another.
\end{definitionbox}
\begin{definitionbox}{Write After Read / Anti Dependence}
\begin{minted}{text}
sub $4, $3, $6 # $4 = $3 - $6 (Read $3) (use $3 before the next instruction overwrites it)
add $3, $2, $1 # $3 = $2 + $1 (Write $3)
\end{minted}
    A later instruction overwrites a register that a preceding instruction reads as an input, so the read must happen before the write.
\end{definitionbox}
\begin{definitionbox}{Write after Write / Output Dependence}
\begin{minted}{text}
add $3, $2, $1 # $3 = $2 + $1 (Write $3)
sub $3, $4, $6 # $3 = $4 - $6 (Write $3)
addiu $7, $3, 100 # $7 = $3 + 100 (Read $3)
\end{minted}
    The two writes are dependent because they target the same location: the value of the later write must be the one present in that location for subsequent reads.
\end{definitionbox}
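
The three dependence types above can be checked mechanically. The following is a minimal sketch, assuming every instruction has exactly one destination register and a tuple of source registers; the \texttt{Instr} record and the \texttt{hazards} function are hypothetical names introduced only for illustration.
\begin{minted}{Python}
from typing import NamedTuple


class Instr(NamedTuple):
    """Hypothetical instruction record: one destination, a tuple of sources."""
    dest: str
    srcs: tuple


def hazards(first: Instr, second: Instr) -> set:
    """Return the dependences of `second` on the earlier instruction `first`."""
    found = set()
    if first.dest in second.srcs:
        found.add("RAW")  # second reads what first writes (true dependence)
    if second.dest in first.srcs:
        found.add("WAR")  # second overwrites an input of first (anti dependence)
    if second.dest == first.dest:
        found.add("WAW")  # both write the same register (output dependence)
    return found


# The write-after-read example from the definition box above.
sub = Instr(dest="$4", srcs=("$3", "$6"))
add = Instr(dest="$3", srcs=("$2", "$1"))
print(hazards(sub, add))  # {'WAR'}
\end{minted}
Only a RAW result forces the second instruction to wait for a value; WAR and WAW arise from reusing a register name, which is why renaming can remove them.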
\section{Tomasulo's Algorithm}
An out-of-order execution algorithm that dynamically renames registers. It was designed to work around the limited number of floating-point registers in the IBM architecture specification, and to allow faster computation on the IBM 360/91.
\begin{itemize}
    \item Each register contains a tag (null means the value is present; otherwise it is the identifier of the unit the result will come from).
    \item Adding tags achieves a (simple) form of register renaming.
    \item A common data bus is used to broadcast the result of an operation, together with its tag (the unit it came from).
\end{itemize}
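
The tag mechanism in the list above can be illustrated with a small sketch. It is a simplification under stated assumptions, not the IBM 360/91 design: the \texttt{Register}, \texttt{ReservationStation} and \texttt{broadcast} names are hypothetical, and timing, issue logic and the functional units themselves are left out.
\begin{minted}{Python}
class Register:
    def __init__(self, value=0.0):
        self.value = value
        self.tag = None  # None: value is present; otherwise the producing unit's id


class ReservationStation:
    def __init__(self, unit_id):
        self.unit_id = unit_id
        self.operands = []  # each operand is a [value, tag] pair

    def issue(self, sources):
        # Copy values that are ready; remember the tag of those that are not.
        self.operands = [[reg.value, reg.tag] for reg in sources]

    def ready(self):
        return all(tag is None for _, tag in self.operands)

    def snoop(self, tag, value):
        # Capture a broadcast result if an operand is waiting on this tag.
        for operand in self.operands:
            if operand[1] == tag:
                operand[0], operand[1] = value, None


def broadcast(tag, value, registers, stations):
    """Common data bus: deliver (tag, value) to everything waiting on the tag."""
    for reg in registers.values():
        if reg.tag == tag:
            reg.value, reg.tag = value, None
    for station in stations:
        station.snoop(tag, value)


regs = {"F2": Register(2.0), "F4": Register()}
regs["F4"].tag = "MUL1"                 # F4 will be produced by unit MUL1
add1 = ReservationStation("ADD1")
add1.issue([regs["F4"], regs["F2"]])    # waits on MUL1 for its first operand
broadcast("MUL1", 6.0, regs, [add1])    # MUL1 finishes and broadcasts
\end{minted}
After the broadcast both the waiting station and the register hold the value, so the consumer never needs to read the register itself.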
\begin{minted}{Python}
"""Super abbreviated pseudocode for the IBM 360/91 Out of Order Execution"""
# ...
\end{minted}
    \textbf{Complexity} & Led to delays in design, and adds hardware overhead to overcome an ISA issue. \\
    \textbf{Limited by CDB} & The CDB must go through all functional units, and only one instruction can write to the bus per cycle. \\
    \textbf{No Precise Interrupts} & As instructions are executed out of order, we cannot clearly define the point in the \textit{in-order} program text that the processor has reached at any given time. \\
\end{tabbox}
It is possible to overlap loop iterations:
\begin{itemize}
    \item Register renaming (effectively) allows for different physical destinations (e.g. a result skips the register and goes straight to the functional unit that needs it).
    \item Reservation stations can buffer old values to avoid write after read / anti dependence stalls.
\end{itemize}
\lectlink{https://imperial.cloud.panopto.eu/Panopto/Pages/Viewer.aspx?id=2faf6023-3195-440e-b4f3-af2a010f70d6}{Chapter 2 - Part 2: Speculation}
\section{Precise Interrupts}
To support precise interrupts the processor needs a consistent state:
\begin{itemize}
    \item All instructions up to some point have committed their changes to machine state (registers \& memory).
    \item No instructions past that point have committed.
    \item Hence on an interrupt (e.g. page fault, syscall) we can easily save the state, and later restart where the interrupt suspended execution of the program.
    \item This is also important for branches (we need to undo, or prevent committing, work that was executed speculatively).
\end{itemize}
Hence we want a \textit{speculative Tomasulo algorithm}:
\begin{enumerate}
    \item Issue/Dispatch (get an instruction from the buffer of fetched instructions; send the operands \& reorder buffer number to the destination)
    \item Execution (out-of-order execution of issued instructions)
    \item Write Back (the result is broadcast on the common data bus to the reorder buffer and any waiting functional units)
    \item Commit (the reorder buffer holds completed instructions in issue order; once the oldest instruction has completed, its result updates the register and machine state)
\end{enumerate}
This requires several additions:
\begin{itemize}
    \item Commit unit to manage the reorder buffer
    \item Issue side registers for execution
    \item Commit side registers for the committed results
    \item Ability to flush the reorder buffer on a branch mispredict
\end{itemize}
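
The commit stage and the flush on a mispredict can be sketched as follows. This is a minimal model, assuming one destination register per instruction; the \texttt{ReorderBuffer} class and its method names are hypothetical, and reservation stations and the common data bus are omitted.
\begin{minted}{Python}
from collections import deque


class ReorderBuffer:
    def __init__(self):
        self.entries = deque()  # oldest instruction on the left
        self.committed = {}     # commit side registers

    def issue(self, dest):
        """Allocate an entry at the tail, in program order."""
        entry = {"dest": dest, "value": None, "done": False}
        self.entries.append(entry)
        return entry

    def write_back(self, entry, value):
        """Record a result; this may happen in any order."""
        entry["value"], entry["done"] = value, True

    def commit(self):
        """Retire finished instructions from the head only (issue order)."""
        retired = []
        while self.entries and self.entries[0]["done"]:
            entry = self.entries.popleft()
            self.committed[entry["dest"]] = entry["value"]
            retired.append(entry["dest"])
        return retired

    def flush_after(self, entry):
        """Drop every entry younger than `entry`, e.g. a mispredicted branch.

        Assumes `entry` is still in the buffer.
        """
        while self.entries and self.entries[-1] is not entry:
            self.entries.pop()


rob = ReorderBuffer()
a, b = rob.issue("$1"), rob.issue("$2")
rob.write_back(b, 20)   # the younger instruction finishes first
print(rob.commit())     # [] because the older instruction is still pending
rob.write_back(a, 10)
print(rob.commit())     # ['$1', '$2']
\end{minted}
Because only the head of the buffer may retire, the commit side registers always reflect a prefix of the program, which is the consistent state a precise interrupt needs.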
\section{Store Buffering}
Stores are a problem: they cannot be completed until they are committed, yet succeeding loads could otherwise be executed straight away.
\begin{itemize}
    \item We could stall all succeeding loads until the store is complete.
    \item We can buffer uncommitted stores, associated with their addresses, and check these for every load (taking the nearest matching store on a hit, or going to memory on a miss). Loads must be stalled until all possibly aliasing store addresses are resolved.
\end{itemize}
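
The buffering option above can be sketched as a lookup over the uncommitted stores. This is a minimal model under stated assumptions: the \texttt{StoreBuffer} class is a hypothetical name, every buffered store precedes the load in program order, and an address of \texttt{None} stands for one that has not been computed yet.
\begin{minted}{Python}
class StoreBuffer:
    def __init__(self):
        self.stores = []  # uncommitted stores, oldest first: [address, value]

    def add(self, address, value):
        entry = [address, value]  # address is None until it has been computed
        self.stores.append(entry)
        return entry

    def load(self, address, memory):
        """Return ("stall", None), ("forward", value) or ("memory", value)."""
        for store_address, value in reversed(self.stores):  # youngest first
            if store_address is None:
                return ("stall", None)     # might alias: wait until resolved
            if store_address == address:
                return ("forward", value)  # nearest matching store wins
        return ("memory", memory.get(address, 0))


memory = {0x10: 1}
buffer = StoreBuffer()
buffer.add(0x20, 5)
print(buffer.load(0x20, memory))  # ('forward', 5)
print(buffer.load(0x10, memory))  # ('memory', 1)
pending = buffer.add(None, 9)     # a store whose address is not yet known
print(buffer.load(0x20, memory))  # ('stall', None)
\end{minted}
Scanning from the youngest store means a match found before any unresolved entry is safe to forward, since it would overwrite whatever an older store wrote to that address.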
Loads and stores use computed addresses, which are not always known at issue time:
\begin{itemize}
    \item We can speculate, and forward a store's result to a load.
    \item We must recover when the computed address does not match the speculated one.
\end{itemize}
Hence we can add a \textit{forwarding predictor} to determine whether a store should be forwarded to some load behind it in the pipeline.
\begin{sidenotebox}{Dependence Prediction}
    More can be read about predicting the dependence of a load on an earlier store instruction \href{https://jilp.org/vol2/v2paper13.pdf}{here}.
\end{sidenotebox}
\section{Register Update Unit}
The register update unit (RUU) is an alternative to reservation stations and the reorder buffer.
\begin{itemize}
    \item A single table holds instructions after fetch, acting as a reservation station.
    \item Once its operands are found, an instruction can be issued (hence the functional unit is determined after the operands, unlike in Tomasulo's).
    \item RUU entries are committed to update the commit side registers.
\end{itemize}
\begin{tabbox}{prosbox}
    \textbf{Monitors} & In Tomasulo's, every reservation station and reorder buffer entry needs a comparator to monitor the common data bus. With the RUU strategy, fewer comparators are required. \\
    \textbf{Tags} & With the RUU the tags are ROB entries. Furthermore, the RUU is indexed by the tag. \\
\end{tabbox}