[DOC] Pivot Table #3825
Codecov Report
@@ Coverage Diff @@
## master #3825 +/- ##
==========================================
+ Coverage 84.76% 84.93% +0.16%
==========================================
Files 374 376 +2
Lines 69172 70136 +964
==========================================
+ Hits 58637 59571 +934
- Misses 10535 10565 +30
| **Outputs** | ||
| - Pivot Table: contingency matrix as set in the widget |
... as shown? As seen?
| - Filtered Data: subset selected from the plot | ||
| - Grouped Data: data table grouped by row values |
Maybe - Grouped Data: aggregates over groups defined by row values?
| **Pivot Table** summarizes the data of a more extensive table into a table of statistics. The statistics can include sums, averages, counts, etc. The widget also allows selecting a subset from the plot and grouping by row values, which have to be a discrete variable. Data with only numeric variables cannot be displayed in the plot. |
plot? Probably table? (This appears twice.)
| 1. Discrete or numeric variable that will be used for row values. Numeric variables are considered as integers in this case. Variable values will appear as rows in the table. |
The last sentence is perhaps redundant. You can also remove "in this case", and perhaps "that will be" as well.
| 2. Discrete variable that will be used for column values. Variable values will appear as columns in the table. | ||
| 3. Values that will be used for aggregation. Aggregated values will appear as cells in the table. |
Consider removing that will be; also above.
| 4. Aggregation methods: | ||
| - For any variable type: | ||
| - *Count*: number of instances that appear in the data |
- *Count*: size of the group, that is, the number of instances with the given row and column value.
I'm not totally sure this is better, though. :)
| - *Count defined*: number of non-empty (not NaN) instances in the data. |
Huh, maybe "number of instances with this combination of the row and column value, for which the value that is used for aggregation is defined".
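To make the difference between the two definitions discussed in this thread concrete, here is a minimal sketch in plain Python. It is not the widget's implementation; the records, the helper names `count` and `count_defined`, and the use of NaN to mark a missing value are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy rows: (row value, column value, value used for aggregation).
# NaN stands in for a missing value; this is an assumption of the sketch.
records = [
    ("0", "female", 1.0),
    ("0", "female", math.nan),
    ("0", "male", 3.0),
    ("1", "female", math.nan),
    ("1", "male", 5.0),
]


def count(rows):
    """Count: size of each (row, column) group, missing values included."""
    return Counter((r, c) for r, c, _ in rows)


def count_defined(rows):
    """Count defined: group members whose aggregated value is not NaN."""
    return Counter((r, c) for r, c, v in rows if not math.isnan(v))


counts = count(records)
defined = count_defined(records)
```

In this toy data the group ("0", "female") has two instances but only one defined value, so the two aggregations disagree for that cell.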
| Example of a pivot table with only discrete variables selected. We are using *heart-disease* data set for this example. We are using the values of *diameter narrowing* as row values, namely 0 and 1. Our columns are values of *gender*, namely female and male. We are using *thal* as values in our cells. |
We are using the values of *diameter narrowing* as row values -> Rows correspond to values of *diameter narrowing* variable.
You can skip namely.
| Example of a pivot table with numeric variables. We are using *heart-disease* data set for this example. We are using the values of *diameter narrowing* as row values, namely 0 and 1. Our columns are values of *gender*, namely female and male. We are using *rest SBP* as values in our cells. |
| Example | ||
| ------- | ||
| We are using *Forest Fires* for this example. The data is loaded in the [Datasets](../data/datasets.md) widget and passed to **Pivot Table**. The *Forest Fires* dataset reports forest fires by the month and day they happened. We can aggregate all occurrences of forest fires by selecting *Count* as the aggregation method and using *month* as row and *day* as column values. Since we are using *Count*, it does not matter what our *Values* variable will be, so we will leave it as is. |
it does not matter what our *Values* variable will be -> maybe *Values* is unimportant (or something similar)
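A small sketch of why the *Values* variable plays no role with *Count*: the table can be built from the row and column values alone. The sample pairs below are made up and are not taken from the actual *Forest Fires* data, and `pivot_count` is a hypothetical helper, not part of the widget.

```python
from collections import Counter

# Hypothetical (month, day) pairs, one per reported fire.
fires = [
    ("aug", "fri"),
    ("aug", "fri"),
    ("aug", "sun"),
    ("sep", "sat"),
    ("mar", "fri"),
]


def pivot_count(pairs):
    """Build a nested dict {row: {column: count}} from (row, column) pairs."""
    cells = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    # Combinations that never occur get an explicit 0 so every row has all columns.
    return {r: {c: cells[(r, c)] for c in cols} for r in rows}


table = pivot_count(fires)
```

No third field is read anywhere in `pivot_count`, which is the point the sentence under review is trying to make.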
Comments addressed as well as possible.
Issue
Pivot Table needs docs.
#3823
Description of changes
Add documentation for the widget.
Includes