Commit 01aa2e7

Fixed formating with latexindent
1 parent 5664515 commit 01aa2e7

File tree

37 files changed

+3368
-3368
lines changed

Binary file not shown.

60001 - Advanced Computer Architecture/caches/caches.tex

Lines changed: 256 additions & 256 deletions
Large diffs are not rendered by default.

60001 - Advanced Computer Architecture/credit/credit.tex

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,9 +1,9 @@
11
\chapter{Credit}
22
\section*{Image Credit}
33
\begin{center}
4-
\begin{tabular}{r p{.8\textwidth}}
5-
\textbf{Front Cover} & Intel i386 die shot by Pauli Rautakorpi \href{https://commons.wikimedia.org/wiki/File:Intel_80386_DX_die.JPG}{on Wikimedia here}. \\
6-
\end{tabular}
4+
\begin{tabular}{r p{.8\textwidth}}
5+
\textbf{Front Cover} & Intel i386 die shot by Pauli Rautakorpi \href{https://commons.wikimedia.org/wiki/File:Intel_80386_DX_die.JPG}{on Wikimedia here}. \\
6+
\end{tabular}
77
\end{center}
88

99
\section*{Content}

60001 - Advanced Computer Architecture/exploiting_parallelism/exploiting_parallelism.tex

Lines changed: 96 additions & 96 deletions
Large diffs are not rendered by default.
Lines changed: 26 additions & 26 deletions
Original file line numberDiff line numberDiff line change
@@ -1,40 +1,40 @@
11
\chapter{Introduction}
22
\section{Course Structure and Logistics}
33
\begin{minipage}{.3\textwidth}
4-
\begin{center}
5-
\begin{tikzpicture}
6-
\clip (0,0) circle (2cm) ;
7-
\node[anchor=center] at (0,-0.5) {\includegraphics[width=4.5cm]{introduction/images/paul_kelly.jpg}};
8-
\end{tikzpicture}
9-
\centerline{\textbf{Prof Paul Kelly}}
10-
\end{center}
4+
\begin{center}
5+
\begin{tikzpicture}
6+
\clip (0,0) circle (2cm) ;
7+
\node[anchor=center] at (0,-0.5) {\includegraphics[width=4.5cm]{introduction/images/paul_kelly.jpg}};
8+
\end{tikzpicture}
9+
\centerline{\textbf{Prof Paul Kelly}}
10+
\end{center}
1111
\end{minipage}
1212
\hfill
1313
\begin{minipage}{.68\textwidth}
14-
Teaching the entire course.
15-
\begin{itemize}
16-
\item Microprocessor design.
17-
\item Optimising software for hardware, and compiler design.
18-
\item Optimising hardware for specific software tasks.
19-
\item Challenges past, present \& future.
20-
\end{itemize}
21-
Taught through pre-recorded lectures and live tutorial sessions.
14+
Teaching the entire course.
15+
\begin{itemize}
16+
\item Microprocessor design.
17+
\item Optimising software for hardware, and compiler design.
18+
\item Optimising hardware for specific software tasks.
19+
\item Challenges past, present \& future.
20+
\end{itemize}
21+
Taught through pre-recorded lectures and live tutorial sessions.
2222
\end{minipage}
2323
\\ \begin{minipage}{.63\textwidth}
24-
This course is largely textbook based.
25-
\begin{itemize}
26-
\item $936$ pages covering the course content and more.
27-
\item Useful appendices covering both introductory and advanced material.
28-
\end{itemize}
29-
The book is written by John Hennessy and David Patterson.
24+
This course is largely textbook based.
25+
\begin{itemize}
26+
\item $936$ pages covering the course content and more.
27+
\item Useful appendices covering both introductory and advanced material.
28+
\end{itemize}
29+
The book is written by John Hennessy and David Patterson.
3030
\end{minipage}
3131
\hfill
3232
\begin{minipage}{.35\textwidth}
33-
\begin{center}
34-
\includegraphics[height=6cm]{introduction/images/comp_arch_quant_approach.jpg}
35-
\end{center}
36-
\centerline{\textbf{Computer Architecture:}}
37-
\centerline{\textbf{A Quantitative Approach (\nth{6} Edition)}}
33+
\begin{center}
34+
\includegraphics[height=6cm]{introduction/images/comp_arch_quant_approach.jpg}
35+
\end{center}
36+
\centerline{\textbf{Computer Architecture:}}
37+
\centerline{\textbf{A Quantitative Approach (\nth{6} Edition)}}
3838
\end{minipage}
3939

4040
\lectlink{https://imperial.cloud.panopto.eu/Panopto/Pages/Viewer.aspx?id=9c34ae74-31d9-4a8b-89cd-af2a010ddb5d}{Chapter 1 - Part 1: Introduction}

60001 - Advanced Computer Architecture/oooscheduling/oooscheduling.tex

Lines changed: 39 additions & 39 deletions
Original file line numberDiff line numberDiff line change
@@ -3,41 +3,41 @@ \chapter{Dynamic Scheduling}
33
\section{Bypassing Stalls}
44
The basic concept behind out of order scheduling is that instructions behind a stall can be allowed to continue provided data dependence/hazards allow.
55
\begin{itemize}
6-
\item When an instruction stalls (e.g. a cache miss, or forwarding is not possible), save the state of that instruction.
7-
\item Instructions are issued in order, have their dependencies analysed, and can then be executed out of order.
8-
\item When its operands are available, allow execution of the stalled instruction to continue.
6+
\item When an instruction stalls (e.g. a cache miss, or forwarding is not possible), save the state of that instruction.
7+
\item Instructions are issued in order, have their dependencies analysed, and can then be executed out of order.
8+
\item When its operands are available, allow execution of the stalled instruction to continue.
99
\end{itemize}
1010
\begin{definitionbox}{Read After Write / True Dependence}
11-
\begin{minted}{text}
11+
\begin{minted}{text}
1212
add $3, $2, $1 # $3 = $2 + $1 (Write $3)
1313
sub $4, $3, $6 # $4 = $3 - $6 (Read $3) (needs previous instruction's value)
1414
\end{minted}
15-
The output of one instruction is required as the input to another.
15+
The output of one instruction is required as the input to another.
1616
\end{definitionbox}
1717

1818
\begin{definitionbox}{Write After Read / Anti Dependence}
19-
\begin{minted}{text}
19+
\begin{minted}{text}
2020
sub $4, $3, $6 # $4 = $3 - $6 (Read $3) (use $3 before the next instruction overwrites)
2121
add $3, $2, $1 # $3 = $2 + $1 (Write $3)
2222
\end{minted}
23-
Some instruction will overwrite an input to a preceding instruction.
23+
Some instruction will overwrite an input to a preceding instruction.
2424
\end{definitionbox}
2525
2626
\begin{definitionbox}{Write after Write / Output Dependence}
27-
\begin{minted}{text}
27+
\begin{minted}{text}
2828
add $3, $2, $1 # $3 = $2 + $1 (Write $3)
2929
sub $3, $4, $6 # $3 = $4 - $6 (Write $3)
3030
addiu $7, $3, 100 # $7 = $3 + 100 (Read $3)
3131
\end{minted}
32-
The two writes target the same location, so they must take effect in program order for subsequent reads to see the correct value.
32+
The two writes target the same location, so they must take effect in program order for subsequent reads to see the correct value.
3333
\end{definitionbox}
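The three dependence kinds above can be checked mechanically from the registers each instruction reads and writes. The following is a minimal sketch, not part of the original notes: the \texttt{Instr} type and the \texttt{hazards} function are hypothetical names introduced only for illustration, and each instruction is assumed to write exactly one register.
\begin{minted}{python}
from typing import NamedTuple, Set, Tuple


class Instr(NamedTuple):
    """Hypothetical instruction model: one destination, several sources."""
    dest: str
    srcs: Tuple[str, ...]


def hazards(first: Instr, second: Instr) -> Set[str]:
    """Dependences of `second` on the earlier instruction `first`."""
    found = set()
    if first.dest in second.srcs:
        found.add("RAW")  # second reads what first writes
    if second.dest in first.srcs:
        found.add("WAR")  # second overwrites an input of first
    if first.dest == second.dest:
        found.add("WAW")  # both write the same register
    return found


add = Instr("$3", ("$2", "$1"))  # add $3, $2, $1
sub = Instr("$4", ("$3", "$6"))  # sub $4, $3, $6

print(hazards(add, sub))  # {'RAW'}
print(hazards(sub, add))  # {'WAR'}
\end{minted}
Only the RAW case is a true dependence. WAR and WAW come from reusing a register name, which is why renaming can remove them.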
3434

3535
\section{Tomasulo's Algorithm}
3636
An out of order execution algorithm used to dynamically rename registers to bypass the limited number of floating-point registers in the IBM architecture specification, and allow faster computation on the IBM 360/91.
3737
\begin{itemize}
38-
\item Each register contains a tag (null means the value is present, otherwise it is the identifier of the unit the result will come from).
39-
\item Adding tags achieves a simple form of register renaming.
40-
\item A common data bus is used to broadcast the result of an operation, along with its tag (the unit it came from).
38+
\item Each register contains a tag (null means the value is present, otherwise it is the identifier of the unit the result will come from).
39+
\item Adding tags achieves a simple form of register renaming.
40+
\item A common data bus is used to broadcast the result of an operation, along with its tag (the unit it came from).
4141
\end{itemize}
4242
\begin{minted}{Python}
4343
"""Super abbreviated pseudocode for the IBM360/91 Out of Order Execution """
@@ -81,68 +81,68 @@ \section{Tomasulo's Algorithm}
8181
self.value = data
8282
\end{minted}
8383
\begin{tabbox}[.6\textwidth]{consbox}
84-
\textbf{Complexity} & Led to delays in design, and adds hardware overhead to overcome an ISA issue. \\
85-
\textbf{Limited by CDB} & The CDB must go through all functional units, and only one instruction can write to the bus per cycle. \\
86-
\textbf{No Precise Interrupts} & As instructions are executed out of order, we cannot clearly define a point in the \textit{in-order} program text that the processor has reached at any given time. \\
84+
\textbf{Complexity} & Led to delays in design, and adds hardware overhead to overcome an ISA issue. \\
85+
\textbf{Limited by CDB} & The CDB must go through all functional units, and only one instruction can write to the bus per cycle. \\
86+
\textbf{No Precise Interrupts} & As instructions are executed out of order, we cannot clearly define a point in the \textit{in-order} program text that the processor has reached at any given time. \\
8787
\end{tabbox}
8888

8989
It is possible to overlap loop iterations:
9090
\begin{itemize}
91-
\item Register renaming (effectively) allows for different physical destinations (e.g. bypassing the register and going straight to a functional unit).
92-
\item Reservation stations can buffer old values to avoid write after read / anti dependence stalls.
91+
\item Register renaming (effectively) allows for different physical destinations (e.g. bypassing the register and going straight to a functional unit).
92+
\item Reservation stations can buffer old values to avoid write after read / anti dependence stalls.
9393
\end{itemize}
9494

9595
\lectlink{https://imperial.cloud.panopto.eu/Panopto/Pages/Viewer.aspx?id=2faf6023-3195-440e-b4f3-af2a010f70d6}{Chapter 2 - Part 2: Speculation}
9696

9797
\section{Precise Interrupts}
9898
In order to use precise interrupts we need a consistent state.
9999
\begin{itemize}
100-
\item All instructions up to some point have committed changes to machine state (registers \& memory).
101-
\item No instructions past that point have committed.
102-
\item Hence on an interrupt (e.g. page fault, syscall) we can easily save state, and restart where the interrupt suspended execution of a program.
103-
\item This is also important for branches (we need to undo, or prevent the commit of, work executed speculatively).
100+
\item All instructions up to some point have committed changes to machine state (registers \& memory).
101+
\item No instructions past that point have committed.
102+
\item Hence on an interrupt (e.g. page fault, syscall) we can easily save state, and restart where the interrupt suspended execution of a program.
103+
\item This is also important for branches (we need to undo, or prevent the commit of, work executed speculatively).
104104
\end{itemize}
105105
Hence we want to make a \textit{speculative Tomasulo algorithm}:
106106
\begin{enumerate}
107-
\item Issue/Dispatch (Get instruction from buffer of fetched instructions, send operands \& reorder buffer number to destination)
108-
\item Execution (Out of order execution of issued instructions)
109-
\item Write Back (broadcast the result on the common data bus to waiting functional units and the reorder buffer)
110-
\item Commit (Update register with reorder result, reorder buffer takes completed instructions, puts in issue order and updates state)
107+
\item Issue/Dispatch (Get instruction from buffer of fetched instructions, send operands \& reorder buffer number to destination)
108+
\item Execution (Out of order execution of issued instructions)
109+
\item Write Back (broadcast the result on the common data bus to waiting functional units and the reorder buffer)
110+
\item Commit (Update register with reorder result, reorder buffer takes completed instructions, puts in issue order and updates state)
111111
\end{enumerate}
112112
This requires several additions
113113
\begin{itemize}
114-
\item Commit unit to manage reorder buffer
115-
\item Issue side registers for execution
116-
\item Commit side registers for the committed results
117-
\item Ability to flush the reorder buffer on a branch mispredict
114+
\item Commit unit to manage reorder buffer
115+
\item Issue side registers for execution
116+
\item Commit side registers for the committed results
117+
\item Ability to flush the reorder buffer on a branch mispredict
118118
\end{itemize}
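The commit stage and the flush on a mispredict can be sketched as a queue kept in issue order. This is a minimal illustration, not the course's implementation: the \texttt{ReorderBuffer} class and its method names are hypothetical, and entries are assumed to write a single register.
\begin{minted}{python}
from collections import deque


class ReorderBuffer:
    """Hypothetical reorder buffer: entries held in issue order."""

    def __init__(self):
        self.entries = deque()  # oldest instruction at the left
        self.committed = {}     # commit side registers

    def issue(self, dest):
        """Allocate an entry at issue time, before the result exists."""
        entry = {"dest": dest, "value": None, "done": False}
        self.entries.append(entry)
        return entry

    def write_back(self, entry, value):
        """Record a result; this may happen out of order."""
        entry["value"] = value
        entry["done"] = True

    def commit(self):
        """Retire completed entries from the head only, so in order."""
        while self.entries and self.entries[0]["done"]:
            entry = self.entries.popleft()
            self.committed[entry["dest"]] = entry["value"]

    def flush(self):
        """Discard all uncommitted work, e.g. on a branch mispredict."""
        self.entries.clear()


rob = ReorderBuffer()
first = rob.issue("$3")
second = rob.issue("$4")

rob.write_back(second, 7)  # the younger instruction finishes first
rob.commit()
print(rob.committed)       # {}

rob.write_back(first, 5)   # the head is now complete
rob.commit()
print(rob.committed)       # {'$3': 5, '$4': 7}
\end{minted}
Because only the head of the queue may retire, the commit side registers always reflect a point in the in-order program, which is what a precise interrupt needs.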
119119

120120
\section{Store Buffering}
121121
Stores are an issue as they cannot be completed until committed, but succeeding loads can be executed straight away.
122122
\begin{itemize}
123-
\item We could stall all succeeding loads until the store is complete.
124-
\item We can buffer uncommitted stores with their addresses, and check the buffer on every load (taking the nearest matching store on a hit, or going to memory on a miss). Loads must be stalled until all possibly aliasing store addresses are resolved.
123+
\item We could stall all succeeding loads until the store is complete.
124+
\item We can buffer uncommitted stores with their addresses, and check the buffer on every load (taking the nearest matching store on a hit, or going to memory on a miss). Loads must be stalled until all possibly aliasing store addresses are resolved.
125125
\end{itemize}
126126
Loads and stores use computed addresses (not always known at issue time)
127127
\begin{itemize}
128-
\item Can speculate, and forward a store's result to a load.
129-
\item Must recover when the computed address does not match the speculated one.
128+
\item Can speculate, and forward a store's result to a load.
129+
\item Must recover when the computed address does not match the speculated one.
130130
\end{itemize}
131131
Hence we can add a \textit{forwarding predictor} to determine if a store should be forwarded to some load behind it in the pipeline.
132132
\begin{sidenotebox}{Dependence Prediction}
133-
More can be read about predicting the dependence of a load on another store instruction \href{https://jilp.org/vol2/v2paper13.pdf}{here}.
133+
More can be read about predicting the dependence of a load on another store instruction \href{https://jilp.org/vol2/v2paper13.pdf}{here}.
134134
\end{sidenotebox}
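The store buffer lookup described in this section can be sketched as follows. This is a simplified illustration under stated assumptions: every store address is already resolved (so no speculation or recovery is modelled), unwritten memory reads as zero, and the \texttt{StoreBuffer} class and its method names are hypothetical.
\begin{minted}{python}
class StoreBuffer:
    """Hypothetical buffer of uncommitted stores, oldest first."""

    def __init__(self, memory):
        self.memory = memory  # committed memory state (address -> value)
        self.pending = []     # uncommitted (address, value) pairs

    def store(self, address, value):
        """Buffer a store; memory is untouched until commit."""
        self.pending.append((address, value))

    def load(self, address):
        """Forward from the nearest (youngest) matching store, else memory."""
        for pending_address, value in reversed(self.pending):
            if pending_address == address:
                return value
        return self.memory.get(address, 0)  # assumption: default of zero

    def commit_oldest(self):
        """Write the oldest buffered store to memory."""
        address, value = self.pending.pop(0)
        self.memory[address] = value


memory = {0x10: 1}
buffer = StoreBuffer(memory)
buffer.store(0x10, 2)
buffer.store(0x10, 3)

print(buffer.load(0x10))  # 3
print(memory[0x10])       # 1
\end{minted}
Searching from the youngest store backwards is what gives the nearest hit. A real design would also track stores whose addresses are still unknown, and stall or speculate on loads that might alias them.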
135135

136136
\section{Register Update Unit}
137137
An alternative to reservation stations and the reorder buffer.
138138
\begin{itemize}
139-
\item A single table of instructions after fetch, acting as a reservation station.
140-
\item Once the operands are found, the instruction can be issued (hence functional unit determined after operands, unlike in Tomasulo's)
141-
\item RUU entries are committed to update the commit side registers.
139+
\item A single table of instructions after fetch, acting as a reservation station.
140+
\item Once the operands are found, the instruction can be issued (hence functional unit determined after operands, unlike in Tomasulo's)
141+
\item RUU entries are committed to update the commit side registers.
142142
\end{itemize}
143143
\begin{tabbox}{prosbox}
144-
\textbf{Monitors} & In Tomasulo's every reservation station and reorder buffer entry needs to have a comparator and monitor the common data bus. With the RUU strategy, fewer comparators are required. \\
145-
\textbf{Tags} & With RUU the tags are ROB entries. Furthermore the RUU is indexed by the tag. \\
144+
\textbf{Monitors} & In Tomasulo's every reservation station and reorder buffer entry needs to have a comparator and monitor the common data bus. With the RUU strategy, fewer comparators are required. \\
145+
\textbf{Tags} & With RUU the tags are ROB entries. Furthermore the RUU is indexed by the tag. \\
146146
\end{tabbox}
147147

148148
\section{Register Alias Tables}
