---
title: "Parallelization and Remote Servers"
author: Abby Bratt, Maria Kuruvilla, and Colin Okasaki
date: February 5, 2020
output: ioslides_presentation
---
# Why Parallelize?
## Big Data and Slow Code
- Lots of data, simple operations, or:
- Not so much data, complex operations, or even:
- Lots of data, complex operations
## (Dis)Advantages of parallelization
- $N$ cores may speed up your code by a factor of $N$, but:
- only if you can efficiently distribute your code to cores
- Iterative algorithms: the bane of parallel computation
- Communication: the real bane of parallel computation
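
A minimal sketch of the speedup (and its overhead) using base R's `parallel` package; `slow_task` is a made-up placeholder for an expensive, independent computation:

```{r, eval=FALSE}
library(parallel)

# hypothetical slow task: each call takes ~0.1 s and is independent
slow_task <- function(i) { Sys.sleep(0.1); i^2 }

system.time(lapply(1:20, slow_task))        # serial: ~2 s

cl <- makeCluster(4)                        # start 4 worker processes
system.time(parLapply(cl, 1:20, slow_task)) # roughly 2 s / 4, plus
stopCluster(cl)                             # cluster startup overhead
```

Note the overhead: starting workers and shipping data to them costs time, so parallelizing only pays off when each task is slow relative to the communication.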
## Basic Structure of a computer
- Some CPUs (processors)
- Each CPU has some cores
- Each core can do 1 computation at a time
- Theoretically top speed = all cores active
- Reality: cores need to communicate
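
You can ask R how many cores your machine reports, e.g.:

```{r}
library(parallel)
detectCores()                 # logical cores (may count hyperthreads)
detectCores(logical = FALSE)  # physical cores, where the OS reports them
```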
## Why use a remote server
- More cores
- Built for heavy computation
- Doesn't slow down your personal computer
- Feel like a pro, brag to your friends
## Recap
- When should you parallelize
- When should you not parallelize
- When should you use a remote server
## Practice
- MCMC
    - Yes and no
    - Can run multiple chains in parallel, but each chain is iterative
    - Computationally intensive, so run remotely
- Bootstrap
    - Yes
    - Embarrassingly parallel
- Likelihood evaluation
    - Often, for big, conditionally independent data
- Linear algebra
    - Yes, for some operations
    - E.g., matrix multiplication
- Optimization
    - Often iterative, so hard to parallelize directly
    - Independent starting points can run in parallel
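
The bootstrap case above can be sketched with `parallel`; the data and statistic here (`x`, `boot_stat`) are made-up placeholders, with the mean standing in for whatever you actually resample:

```{r, eval=FALSE}
library(parallel)

x <- rnorm(500)  # hypothetical data
# one bootstrap replicate: resample with replacement, recompute the statistic
boot_stat <- function(i, data) mean(sample(data, replace = TRUE))

cl <- makeCluster(4)
clusterSetRNGStream(cl, 2020)  # reproducible RNG across workers
boots <- parSapply(cl, 1:10000, boot_stat, data = x)
stopCluster(cl)

sd(boots)  # bootstrap standard error
```

Each replicate is independent of the others, which is exactly what "embarrassingly parallel" means: no communication between workers until the results are collected.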
# How to Access the QERM/SEFS Servers
# A Basic R Program
# How to Access Hyak