Commit 71e8a2d

chennesy authored and jt14den committed
add exercises to dataviz
1 parent 265867b commit 71e8a2d

4 files changed: 103 additions and 0 deletions
episodes/data-visualisation.md

Lines changed: 103 additions & 0 deletions
@@ -188,6 +188,109 @@ fig.show()

Here is a view of the [interactive output of the Plotly bar chart](learners/bar_plot_int.html).

::::::::::::::::::::::::::::::::::::::: challenge

## Plotting with Pandas

1. Load the dataset `df_long.pkl` using Pandas.
2. Create a new DataFrame that only includes the data for the "Chinatown" branch.
3. Use the Pandas plotting function to plot the "circulation" column over time.

::::::::::::::: solution

## Solution

```python
import pandas as pd

df_long = pd.read_pickle('data/df_long.pkl')
# Keep only the rows for the Chinatown branch
chinatown = df_long[df_long['branch'] == 'Chinatown']
# Plot circulation against the DataFrame index
chinatown['circulation'].plot()
```

![Chinatown plot](fig/chinatown_circulation.png){alt='image showing the circulation of the Chinatown branch over ten years'}
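
If `data/df_long.pkl` is not at hand, the same filter-then-plot pattern can be tried on a small stand-in. The sketch below is only an illustration: the `toy` DataFrame and its numbers are made up, and it assumes, as "over time" suggests, that the real `df_long` is indexed by date.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Made-up stand-in for df_long: two branches, three months each
dates = pd.to_datetime(['2020-01-01', '2020-02-01', '2020-03-01'] * 2)
toy = pd.DataFrame(
    {'branch': ['Chinatown'] * 3 + ['Uptown'] * 3,
     'circulation': [120, 135, 150, 80, 95, 90]},
    index=dates,
)

# A boolean mask keeps only the rows where the condition is True
chinatown_toy = toy[toy['branch'] == 'Chinatown']

# With a datetime index, .plot() puts the dates on the x-axis
ax = chinatown_toy['circulation'].plot()
```

In a notebook the `matplotlib.use('Agg')` line can be dropped so the figure displays inline.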

:::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::::: challenge

## Modify a plot display

Add a line to the code below to plot the Uptown branch circulation including the following plot elements:

- A title, "Uptown Circulation"
- "Year" and "Circulation Count" labels for the x and y axes
- A green plot line

```python
import pandas as pd
df_long = pd.read_pickle('data/df_long.pkl')
uptown = df_long[df_long['branch'] == 'Uptown']
```

::::::::::::::: solution

## Solution

```python
uptown['circulation'].plot(title='Uptown Circulation',
                           color='green',
                           xlabel='Year',
                           ylabel='Circulation Count')
```

![Uptown plot](fig/uptown_plot.png){alt='image showing the circulation of the Uptown branch with labels'}
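
The same result can also be reached in steps, because `.plot()` returns a Matplotlib `Axes` object that can be adjusted afterwards. The sketch below uses a made-up `uptown_toy` Series indexed by year instead of the lesson data, so the numbers are purely illustrative.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Made-up yearly circulation counts standing in for the Uptown data
uptown_toy = pd.Series([900, 950, 1010, 980],
                       index=[2018, 2019, 2020, 2021],
                       name='circulation')

# .plot() returns the Axes it drew on; the colour is set at plot time
ax = uptown_toy.plot(color='green')

# Title and axis labels can then be set on the Axes directly
ax.set_title('Uptown Circulation')
ax.set_xlabel('Year')
ax.set_ylabel('Circulation Count')
```

Passing everything as keyword arguments, as in the solution, is shorter. Keeping the `Axes` is handy when a plot needs further tweaks later.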

:::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::::: challenge

## Plot the top five branches

Modify the code below to only plot the five Chicago Public Library branches with the highest circulation.

```python
import plotly.express as px
import pandas as pd
df_long = pd.read_pickle('data/df_long.pkl')
total_circulation_by_branch = df_long.groupby('branch')['circulation'].sum().reset_index()

top_five = total_circulation_by_branch.___________________

# Create a bar plot
fig = px.bar(top_five._______, x='branch', y='circulation', width=600, height=600, title='Total Circulation by Branch')
fig.show()
```

::::::::::::::: solution

## Solution

```python
import plotly.express as px
import pandas as pd

df_long = pd.read_pickle('data/df_long.pkl')
total_circulation_by_branch = df_long.groupby('branch')['circulation'].sum().reset_index()

# Sort branches from highest to lowest total circulation
top_five = total_circulation_by_branch.sort_values(by='circulation', ascending=False)

# Create a bar plot of the first five rows (head() returns five rows by default)
fig = px.bar(top_five.head(), x='branch', y='circulation', width=600, height=600, title='Total Circulation by Branch')
fig.show()
```

![Top five circulation branches](fig/top_five_circ.png){alt='a bar plot of the top five branch circulation figures'}
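
Sorting and then calling `head()` is one way to pick the top rows. Pandas also offers `DataFrame.nlargest()`, which does both in one call. The sketch below compares the two on a made-up `totals` table; the branch names and figures are hypothetical and are not the real Chicago Public Library totals.

```python
import pandas as pd

# Made-up totals for seven hypothetical branches
totals = pd.DataFrame({
    'branch': ['A', 'B', 'C', 'D', 'E', 'F', 'G'],
    'circulation': [500, 1200, 300, 900, 1500, 700, 100],
})

# Approach used in the solution: sort descending, then take the first five rows
by_sort = totals.sort_values(by='circulation', ascending=False).head(5)

# One-step alternative: the five rows with the largest circulation
by_nlargest = totals.nlargest(5, 'circulation')

print(by_nlargest['branch'].tolist())  # ['E', 'B', 'D', 'F', 'A']
```

Either result can be passed to `px.bar()` in place of `top_five.head()`. When several branches tie on circulation, the two approaches may order the tied rows differently.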

:::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::::::::::::::::


::: keypoints
35.3 KB

episodes/fig/top_five_circ.png

47 KB

episodes/fig/uptown_plot.png

91.1 KB

0 commit comments
