add video for loading spreadsheets into a DataFrame
Steve Baskauf committed Mar 4, 2024
1 parent 98bd5f7 commit b18cc57
Showing 1 changed file with 39 additions and 1 deletion.
40 changes: 39 additions & 1 deletion script/codegraf/ees3/index.md
@@ -180,9 +180,47 @@ states_df.loc['PA', 'population']

The resulting object has the type of the cell value.


---

# Loading a DataFrame from a file

One nice thing about loading spreadsheet data into a pandas DataFrame is that the file can come either from your local file directory or from a URL. The same function can be used for either data source.

## Loading a spreadsheet via a URL (3m02s)

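Functions and methods for moving data between spreadsheets and pandas DataFrames:

`pd.read_csv()` reads from a CSV file into a DataFrame.

`dataframe.to_csv()` writes from a DataFrame to a CSV file.

`pd.read_excel()` reads from an Excel file into a DataFrame.

`dataframe.to_excel()` writes from a DataFrame to an Excel file.

Note that the two read operations are functions of the pandas module (`pd`), while the two write operations are methods of the DataFrame object being written.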

For details about reading from particular sheets in an Excel file, using delimiters other than commas, etc., see the [pandas User Guide](https://pandas.pydata.org/docs/user_guide/io.html) and [this Stack Overflow post](https://stackoverflow.com/questions/26521266/using-pandas-to-pd-read-excel-for-multiple-worksheets-of-the-same-workbook).
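As a minimal sketch of a write-then-read round trip, the example below saves a small DataFrame as CSV and as tab-separated text, then loads each file back. The data, column names, and file names are made up for illustration, and a temporary directory is used so that nothing is left behind on disk.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical example data; the column names and values are made up.
original_df = pd.DataFrame({"site": ["A", "B", "C"], "count": [3, 5, 8]})

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "example.csv"

    # to_csv() is called on the DataFrame; index=False leaves out the row labels.
    original_df.to_csv(csv_path, index=False)

    # read_csv() is called on the pandas module; it accepts a file path or a URL.
    loaded_df = pd.read_csv(csv_path)

    # For a delimiter other than a comma, pass sep= when writing and when reading.
    tsv_path = Path(tmp_dir) / "example.tsv"
    original_df.to_csv(tsv_path, sep="\t", index=False)
    loaded_tsv_df = pd.read_csv(tsv_path, sep="\t")

print(loaded_df)
```

To load from the web instead, the path passed to `pd.read_csv()` would be replaced by a URL string.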

<iframe width="1120" height="630" src="https://www.youtube.com/embed/JVwKj7H8QU0" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

Note: when loading files via a URL, be sure that the URL delivers the raw file, not an HTML representation of the file.
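One rough way to tell the two apart is to look at the first characters that the URL delivers. The sketch below is only a heuristic: `looks_like_html` is a hypothetical helper (not part of pandas), and the two sample strings are made up to stand in for downloaded text.

```python
def looks_like_html(text_start: str) -> bool:
    """Return True if the beginning of a download looks like an HTML page."""
    snippet = text_start.lstrip().lower()
    return snippet.startswith("<!doctype html") or snippet.startswith("<html")


# Made-up samples standing in for the first characters delivered by a URL.
raw_sample = "state,population\nPA,1\n"
html_sample = "<!DOCTYPE html>\n<html><head><title>file.csv</title></head></html>"

print(looks_like_html(raw_sample))
print(looks_like_html(html_sample))
```

If the check reports HTML, look on the web page for a link to the raw or downloadable version of the file and use that URL instead.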

**Examining the DataFrame**

Use the `.head()` method to view only the first few rows of a DataFrame. The `number_of_lines` argument is optional; if it is omitted, 5 rows are shown:
```
dataframe.head(number_of_lines)
```

The `.tail()` method is similar, but shows the last few rows of a DataFrame (also 5 by default).

The `.shape` attribute returns a tuple of the number of rows and number of columns.

The `.columns` attribute returns the column names as a pandas Index object. Use the `list()` function to convert into a simple Python list.

The `.index` attribute returns the row labels as a pandas Index object. Use the `list()` function to convert it into a simple Python list.
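The methods and attributes above can be tried together on a small DataFrame. This is an illustrative sketch: the row labels, column names, and values are made up.

```python
import pandas as pd

# Hypothetical data: the labels, column names, and values are made up.
example_df = pd.DataFrame(
    {"x": [1, 2, 3, 4, 5, 6, 7], "y": [10, 20, 30, 40, 50, 60, 70]},
    index=["a", "b", "c", "d", "e", "f", "g"],
)

first_rows = example_df.head()   # no argument: first 5 rows
first_two = example_df.head(2)   # first 2 rows
last_rows = example_df.tail(3)   # last 3 rows

n_rows, n_columns = example_df.shape      # tuple of (rows, columns)
column_names = list(example_df.columns)   # Index object converted to a list
row_labels = list(example_df.index)       # Index object converted to a list

print(first_two)
print(n_rows, n_columns)
print(column_names)
print(row_labels)
```

Note that `.head()` and `.tail()` are called with parentheses because they are methods, while `.shape`, `.columns`, and `.index` are attributes and take none.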

----

# Loops

## for loops (5m46s)
