---
title: "Data Science and visualisation : introduction"
author: "Etienne Côme"
date: "October 17, 2024"
output:
  revealjs::revealjs_presentation:
    theme: white
    transition: none
    self_contained: true
    css: slides.css
  beamer_presentation:
    toc: false
    incremental: false
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
```
## Data Science ?
*The next sexy job*
*The ability to take data to be able to understand it, to process
it, to extract value from it, to visualize it, to communicate it,
that’s going to be a hugely important skill.*
**-- Hal Varian, Google**
## Data Science ?
*Data science, as it's practiced, is a blend of Red-Bull-fueled
__hacking__ and espresso-inspired __statistics__.*
*Data science is the civil engineering of data. Its acolytes
possess a practical knowledge of tools & materials, coupled
with a theoretical understanding of what's possible.*
**-- Mike Driscoll, CEO of Metamarkets**
## <span style="color:#000">Drew Conway’s Data Science Venn Diagram </span> {data-background=#ffffff}
<div style="text-align:center">
<img src="./images/VennDiagram.png" height="500px">
</div>
## Data Science ?
*A data scientist is someone who can obtain, scrub, explore, model
and interpret data, blending hacking, statistics and machine
learning. Data scientists not only are adept at working with data, but
appreciate data itself as a first-class product.*
**-- Hilary Mason, chief scientist at bit.ly**
## Data Science?
*Talk of data also conjures up the __data scientist__, that five-legged sheep of data who combines __statistical and computing skills__ with a perfect grasp of the company's __business stakes__... Is this figure, too, a fantasy of the prevailing discourse on big data?*
## Data Science?
*While some profiles may come close to this description, in practice data science, like science in general, is most often done not alone but __in a group__. (...)*
*Another little-known fact is that data science is first and foremost a __craft__. Each problem and each dataset calls for a specific approach that cannot be industrialized, which many people still fail to understand.*
## A fashion with ancient origins
<div style="text-align:center"><img src="./images/640px-Johannes_Kepler_1610.jpg" height="500px"></br>
<a href="http://fr.wikipedia.org/wiki/Johannes_Kepler">Johannes Kepler</a></div>
## A fashion with ancient origins
<div style="text-align:center"><img src="./images/minardportrait.jpg" height="500px"></br>
<a href="https://en.wikipedia.org/wiki/Charles_Joseph_Minard">Charles Joseph Minard</a></div>
## A fashion with ancient origins
<div style="text-align:center"><img src="./images/Minard.png" width="100%"></br>
<a href="https://en.wikipedia.org/wiki/Charles_Joseph_Minard">Charles Joseph Minard</a></div>
## A fashion with ancient origins
<div style="text-align:center">
<img src="./images/William_Sealy_Gosset.jpg" height="500px"></br>
<a href="https://en.wikipedia.org/wiki/William_Sealy_Gosset">William Sealy Gosset (Student)</a></div>
## Key competencies
### 1. Prepare data (DB)
Recover, mix, enrich, filter, clean, verify, format, transform data...
### 2. Models (ML/Stats)
Decision tree, regression, clustering, graphical model, SVM...
### 3. Interpret/share (Visualisation)
Graphics, Data visualization, Maps...
## Key competencies
### 1. Prepare data (DB) -- 80% of the job
Recover, mix, enrich, filter, clean, verify, format, transform data...
### 2. Implementing a method or a model (ML/Stats)
Decision tree, regression, clustering, graphical model, SVM...
### 3. Interpret/share (Visualisation) -- 80% of the job
Graphics, Data visualization, Maps...
## Key competencies
### 1. Data Munging
Retrieve, mix, enrich, filter, clean, verify, format, transform data
### 2. Statistics
Traditional data analysis
### 3. Visualisation
Graphics, Data visualization, Maps...
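## Key competencies: a minimal sketch

The three steps above can be illustrated on a toy example. This is only a sketch under assumptions that are not in the original slides: it uses R's built-in `mtcars` dataset, the `dplyr` and `ggplot2` packages, and hypothetical names (`cars`, `gearbox`, `fit`, `p`) chosen for illustration.

```r
library(dplyr)
library(ggplot2)

# 1. Data munging: keep complete rows and derive a readable label column
#    (`cars` and `gearbox` are illustrative names, not from the slides)
cars <- mtcars %>%
  filter(!is.na(mpg), !is.na(wt)) %>%
  mutate(gearbox = if_else(am == 1, "manual", "automatic"))

# 2. Statistics: ordinary least-squares regression of fuel efficiency on weight
fit <- lm(mpg ~ wt, data = cars)

# 3. Visualisation: scatter plot with the fitted regression line overlaid
p <- ggplot(cars, aes(x = wt, y = mpg, colour = gearbox)) +
  geom_point() +
  geom_abline(intercept = coef(fit)[1], slope = coef(fit)[2]) +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")
print(p)
```

Even on a dataset this small, most of the lines go to preparing the data and shaping the plot rather than to the model itself.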
## Course Outline
<ul>
<li> handling R data with dplyr </li>
<li> introduction to visualization, good practices & common mistakes</li>
<li> ggplot and grammar of graphics </li>
<li> spatial data </li>
<li> introduction to cartography </li>
</ul>
## {data-background="images/lbcbig.jpg"}
<h1 style="color:#000">Some projects </h1>
<h4 style="text-align:center" class="shadow"><a href="http://www.comeetie.fr/galerie/leboncoin/">
http://www.comeetie.fr/galerie/leboncoin/</a></h4>
## {data-background="https://www.comeetie.fr/assets/img/fp2019.png"}
<h1 style="color:#000">Some projects </h1>
<a href="https://www.comeetie.fr/galerie/francepixels2023/">
<h4 style="text-align:center">https://www.comeetie.fr/galerie/francepixels2023/</h4>
</a>
## {data-background="#fff"}
<h1 style="color:#000">Some projects </h1>
<div style="text-align:center">
<a href="https://www.comeetie.fr/galerie/sankeystif/">
<img src="images/metro.png" height="50%"></img>
<h4 style="text-align:center">https://www.comeetie.fr/galerie/sankeystif/</h4>
</a>
</div>
## {data-background="#fff"}
<h1>Smart-card data analysis</h1>
<div style="text-align:center">
<img src="images/valid_metro_PontchaillouNColor.png" height="50%"></img>
</div>
## {data-background="#fff"}
<h1>Smart-card data analysis</h1>
<div style="text-align:center">
<img src="images/day_calendar.png" height="50%"></img>
</div>
## {data-background="#fff"}
<h1>Smart-card data analysis</h1>
<div style="text-align:center">
<img src="images/decomp_m1.png" height="50%"></img>
</div>
## {data-background="#fff"}
<h1>Smart-card data analysis</h1>
<div style="text-align:center">
<img src="images/clusters_1_4_7_9MMOG.png" width="60%"></img>
</div>
## {data-background="#fff"}
<h1>Metro load prediction</h1>
<div style="text-align:center">
<img src="images/predlaod.png" width="70%"></img>
</div>
## {data-background="#fff"}
<h1>Metro load prediction</h1>
<div style="text-align:center">
<img src="images/model_nogrey.png" height="50%"></img>
</div>
## {data-background="#fff"}
<h1>Metro load prediction</h1>
<div style="text-align:center">
<img src="images/inciden_corr.png" height="50%"></img>
</div>