| 1 | +--- |
| 2 | +jupyter: |
| 3 | + jupytext: |
| 4 | + notebook_metadata_filter: all |
| 5 | + text_representation: |
| 6 | + extension: .md |
| 7 | + format_name: markdown |
| 8 | + format_version: '1.2' |
| 9 | + jupytext_version: 1.4.2 |
| 10 | + kernelspec: |
| 11 | + display_name: Python 3 |
| 12 | + language: python |
| 13 | + name: python3 |
| 14 | + language_info: |
| 15 | + codemirror_mode: |
| 16 | + name: ipython |
| 17 | + version: 3 |
| 18 | + file_extension: .py |
| 19 | + mimetype: text/x-python |
| 20 | + name: python |
| 21 | + nbconvert_exporter: python |
| 22 | + pygments_lexer: ipython3 |
| 23 | + version: 3.7.7 |
| 24 | + plotly: |
| 25 | + description: How to add empirical cumulative distribution function (ECDF) plots. |
| 26 | + display_as: statistical |
| 27 | + language: python |
| 28 | + layout: base |
| 29 | + name: Empirical Cumulative Distribution Plots |
| 30 | + order: 16 |
| 31 | + page_type: u-guide |
| 32 | + permalink: python/ecdf-plots/ |
| 33 | + thumbnail: thumbnail/figure-labels.png |
| 34 | +--- |
| 35 | + |
| 36 | +### Overview |
| 37 | + |
| 38 | +[Empirical cumulative distribution function plots](https://en.wikipedia.org/wiki/Empirical_distribution_function) are a way to visualize the distribution of a variable, and Plotly Express has a built-in function, `px.ecdf()`, to generate such plots. [Plotly Express](/python/plotly-express/) is the easy-to-use, high-level interface to Plotly, which [operates on a variety of types of data](/python/px-arguments/) and produces [easy-to-style figures](/python/styling-plotly-express/). |
| 39 | + |
| 40 | +Alternatives to ECDF plots for visualizing distributions include [histograms](https://plotly.com/python/histograms/), [violin plots](https://plotly.com/python/violin/), [box plots](https://plotly.com/python/box-plots/) and [strip charts](https://plotly.com/python/strip-charts/). |
| 41 | + |
| 42 | +### Simple ECDF Plots |
| 43 | + |
| 44 | +Providing a single column to the `x` variable yields a basic ECDF plot. |
| 45 | + |
| 46 | +```python |
| 47 | +import plotly.express as px |
| 48 | +df = px.data.tips() |
| 49 | +fig = px.ecdf(df, x="total_bill") |
| 50 | +fig.show() |
| 51 | +``` |
| 52 | + |
| 53 | +Providing multiple columns leverages Plotly Express' [wide-form data support](https://plotly.com/python/wide-form/) to show multiple variables on the same plot. |
| 54 | + |
| 55 | +```python |
| 56 | +import plotly.express as px |
| 57 | +df = px.data.tips() |
| 58 | +fig = px.ecdf(df, x=["total_bill", "tip"]) |
| 59 | +fig.show() |
| 60 | +``` |
| 61 | + |
| 62 | +It is also possible to map another variable to the color dimension of a plot. |
| 63 | + |
| 64 | +```python |
| 65 | +import plotly.express as px |
| 66 | +df = px.data.tips() |
| 67 | +fig = px.ecdf(df, x="total_bill", color="sex") |
| 68 | +fig.show() |
| 69 | +``` |
| 70 | + |
| 71 | +### Configuring the Y axis |
| 72 | + |
| 73 | +By default, the Y axis shows probability, but it is also possible to show raw counts by setting the `ecdfnorm` argument to `None` or to show percentages by setting it to `percent`. |
| 74 | + |
| 75 | +```python |
| 76 | +import plotly.express as px |
| 77 | +df = px.data.tips() |
| 78 | +fig = px.ecdf(df, x="total_bill", color="sex", ecdfnorm=None) |
| 79 | +fig.show() |
| 80 | +``` |
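
The three `ecdfnorm` settings differ only in how the cumulative count is scaled. The following is a minimal NumPy sketch of that arithmetic on a small made-up sample. It is a hand computation for illustration, not Plotly's internal implementation, and the variable names are hypothetical.

```python
import numpy as np

# Small made-up sample, used only for illustration
values = np.array([3.0, 1.0, 2.0, 2.5])

x = np.sort(values)
counts = np.arange(1, len(x) + 1)   # like ecdfnorm=None: running count of points at or below x
probability = counts / len(x)       # like ecdfnorm="probability" (the default): fraction in [0, 1]
percent = 100 * probability         # like ecdfnorm="percent": the same fraction scaled to [0, 100]

for xi, c, p, pct in zip(x, counts, probability, percent):
    print(f"x={xi}: count={c}, probability={p:.2f}, percent={pct:.0f}")
```

Only the Y-axis scale changes between the three settings; the shape of the curve stays the same.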
| 81 | + |
| 82 | +If a `y` value is provided, the Y axis shows the cumulative sum of `y` rather than the cumulative count. |
| 83 | + |
| 84 | +```python |
| 85 | +import plotly.express as px |
| 86 | +df = px.data.tips() |
| 87 | +fig = px.ecdf(df, x="total_bill", y="tip", color="sex", ecdfnorm=None) |
| 88 | +fig.show() |
| 89 | +``` |
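
To make "sum of `y`" concrete, here is a minimal NumPy sketch of the idea: sort the rows by the `x` variable, then take a running sum of `y`. The data is made up and the names are hypothetical; this illustrates the arithmetic and is not Plotly's own code. When `color` is set, the same computation would apply within each color group.

```python
import numpy as np

# Made-up bills and tips, used only for illustration
total_bill = np.array([20.0, 10.0, 30.0])
tip = np.array([3.0, 1.5, 5.0])

order = np.argsort(total_bill)           # sort rows by the x variable
x_sorted = total_bill[order]
cumulative_tip = np.cumsum(tip[order])   # running sum of y over rows at or below each x

print(x_sorted)
print(cumulative_tip)
```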
| 90 | + |
| 91 | +### Reversed and Complementary CDF plots |
| 92 | + |
| 93 | +By default, the Y value represents the fraction of the data that is *at or below* the value on the X axis. Setting `ecdfmode` to `"reversed"` reverses this, with the Y axis representing the fraction of the data *at or above* the X value. Setting `ecdfmode` to `"complementary"` plots `1-ECDF`, meaning that the Y values represent the fraction of the data *above* the X value. |
| 94 | + |
| 95 | +In `standard` mode (the default), the right-most point is at 1 (or the total count/sum, depending on `ecdfnorm`) and the left-most point is above 0. |
| 96 | + |
| 97 | +```python |
| 98 | +import plotly.express as px |
| 99 | +fig = px.ecdf(x=[1, 2, 3, 4], markers=True, ecdfmode="standard", |
| 100 | +              title="ecdfmode='standard' (Y=fraction at or below X value, this is the default)") |
| 101 | +fig.show() |
| 102 | +``` |
| 103 | + |
| 104 | +In `reversed` mode, the left-most point is at 1 (or the total count/sum, depending on `ecdfnorm`) and the right-most point is above 0. |
| 105 | + |
| 106 | +```python |
| 107 | +import plotly.express as px |
| 108 | +fig = px.ecdf(x=[1, 2, 3, 4], markers=True, ecdfmode="reversed", |
| 109 | +              title="ecdfmode='reversed' (Y=fraction at or above X value)") |
| 110 | +fig.show() |
| 111 | +``` |
| 112 | + |
| 113 | +In `complementary` mode, the right-most point is at 0 and no points are at 1 (or the total count/sum). This follows from the definition of the CCDF as 1-ECDF: the ECDF has no point at 0, so the CCDF has no point at 1. |
| 114 | + |
| 115 | +```python |
| 116 | +import plotly.express as px |
| 117 | +fig = px.ecdf(x=[1, 2, 3, 4], markers=True, ecdfmode="complementary", |
| 118 | +              title="ecdfmode='complementary' (Y=fraction above X value)") |
| 119 | +fig.show() |
| 120 | +``` |
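
The Y values in the three modes can be checked by hand for the sample `[1, 2, 3, 4]`. The following is a minimal NumPy sketch that applies the definitions directly (fraction at or below, fraction at or above, and 1-ECDF). It is an illustration with hypothetical variable names, not Plotly's internal implementation.

```python
import numpy as np

x = np.array([1, 2, 3, 4])
n = len(x)

# Fraction of the data at or below each x value (ecdfmode="standard")
standard = np.array([(x <= v).sum() for v in x]) / n
# Fraction of the data at or above each x value (ecdfmode="reversed")
reversed_ = np.array([(x >= v).sum() for v in x]) / n
# 1 - ECDF: fraction of the data strictly above each x value (ecdfmode="complementary")
complementary = 1 - standard

print("standard:     ", standard)
print("reversed:     ", reversed_)
print("complementary:", complementary)
```

For this sample, the standard values rise from 0.25 to 1, the reversed values fall from 1 to 0.25, and the complementary values fall from 0.75 to 0.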
| 121 | + |
| 122 | +### Orientation |
| 123 | + |
| 124 | +By default, plots are oriented vertically (i.e. the variable is on the X axis and counted/summed upwards), but this can be overridden with the `orientation` argument. |
| 125 | + |
| 126 | +```python |
| 127 | +import plotly.express as px |
| 128 | +df = px.data.tips() |
| 129 | +fig = px.ecdf(df, x="total_bill", y="tip", color="sex", ecdfnorm=None, orientation="h") |
| 130 | +fig.show() |
| 131 | +``` |
| 132 | + |
| 133 | +### Markers and/or Lines |
| 134 | + |
| 135 | +ECDF plots can be configured to show lines and/or markers with the `lines` and `markers` arguments. |
| 136 | + |
| 137 | +```python |
| 138 | +import plotly.express as px |
| 139 | +df = px.data.tips() |
| 140 | +fig = px.ecdf(df, x="total_bill", color="sex", markers=True) |
| 141 | +fig.show() |
| 142 | +``` |
| 143 | + |
| 144 | +```python |
| 145 | +import plotly.express as px |
| 146 | +df = px.data.tips() |
| 147 | +fig = px.ecdf(df, x="total_bill", color="sex", markers=True, lines=False) |
| 148 | +fig.show() |
| 149 | +``` |
| 150 | + |
| 151 | +### Marginal Plots |
| 152 | + |
| 153 | +ECDF plots also support [marginal plots](https://plotly.com/python/marginal-plots/). |
| 154 | + |
| 155 | +```python |
| 156 | +import plotly.express as px |
| 157 | +df = px.data.tips() |
| 158 | +fig = px.ecdf(df, x="total_bill", color="sex", markers=True, lines=False, marginal="histogram") |
| 159 | +fig.show() |
| 160 | +``` |
| 161 | + |
| 162 | +```python |
| 163 | +import plotly.express as px |
| 164 | +df = px.data.tips() |
| 165 | +fig = px.ecdf(df, x="total_bill", color="sex", marginal="rug") |
| 166 | +fig.show() |
| 167 | +``` |
| 168 | + |
| 169 | +### Facets |
| 170 | + |
| 171 | +ECDF plots also support [faceting](https://plotly.com/python/facet-plots/). |
| 172 | + |
| 173 | +```python |
| 174 | +import plotly.express as px |
| 175 | +df = px.data.tips() |
| 176 | +fig = px.ecdf(df, x="total_bill", color="sex", facet_row="time", facet_col="day") |
| 177 | +fig.show() |
| 178 | +``` |
| 179 | + |
| 180 | +```python |
| 181 | + |
| 182 | +``` |