Commit c2f959c

removing useless git LFS
1 parent dc123c5 commit c2f959c

481 files changed: 1,884,181 additions, 0 deletions

bonus content/effective data visualization/Bonus - Effective Multi-dimensional Data Visualization.ipynb (+2,034). Large diffs are not rendered by default.

bonus content/effective data visualization/winequality-red.csv (+1,600). Large diffs are not rendered by default.

bonus content/effective data visualization/winequality-white.csv (+4,899). Large diffs are not rendered by default.
@@ -0,0 +1,72 @@
+Citation Request:
+This dataset is publicly available for research. The details are described in [Cortez et al., 2009].
+Please include this citation if you plan to use this database:
+
+P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis.
+Modeling wine preferences by data mining from physicochemical properties.
+In Decision Support Systems, Elsevier, 47(4):547-553. ISSN: 0167-9236.
+
+Available at: [@Elsevier] http://dx.doi.org/10.1016/j.dss.2009.05.016
+[Pre-press (pdf)] http://www3.dsi.uminho.pt/pcortez/winequality09.pdf
+[bib] http://www3.dsi.uminho.pt/pcortez/dss09.bib
+
+1. Title: Wine Quality
+
+2. Sources
+Created by: Paulo Cortez (Univ. Minho), Antonio Cerdeira, Fernando Almeida, Telmo Matos and Jose Reis (CVRVV) @ 2009
+
+3. Past Usage:
+
+P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis.
+Modeling wine preferences by data mining from physicochemical properties.
+In Decision Support Systems, Elsevier, 47(4):547-553. ISSN: 0167-9236.
+
+In the above reference, two datasets were created, using red and white wine samples.
+The inputs include objective tests (e.g. pH values) and the output is based on sensory data
+(median of at least 3 evaluations made by wine experts). Each expert graded the wine quality
+between 0 (very bad) and 10 (very excellent). Several data mining methods were applied to model
+these datasets under a regression approach. The support vector machine model achieved the
+best results. Several metrics were computed: MAD, confusion matrix for a fixed error tolerance (T),
+etc. Also, we plot the relative importances of the input variables (as measured by a sensitivity
+analysis procedure).
+
+4. Relevant Information:
+
+The two datasets are related to red and white variants of the Portuguese "Vinho Verde" wine.
+For more details, consult: http://www.vinhoverde.pt/en/ or the reference [Cortez et al., 2009].
+Due to privacy and logistic issues, only physicochemical (inputs) and sensory (the output) variables
+are available (e.g. there is no data about grape types, wine brand, wine selling price, etc.).
+
+These datasets can be viewed as classification or regression tasks.
+The classes are ordered and not balanced (e.g. there are much more normal wines than
+excellent or poor ones). Outlier detection algorithms could be used to detect the few excellent
+or poor wines. Also, we are not sure if all input variables are relevant. So
+it could be interesting to test feature selection methods.
+
+5. Number of Instances: red wine - 1599; white wine - 4898.
+
+6. Number of Attributes: 11 + output attribute
+
+Note: several of the attributes may be correlated, thus it makes sense to apply some sort of
+feature selection.
+
+7. Attribute information:
+
+For more information, read [Cortez et al., 2009].
+
+Input variables (based on physicochemical tests):
+1 - fixed acidity
+2 - volatile acidity
+3 - citric acid
+4 - residual sugar
+5 - chlorides
+6 - free sulfur dioxide
+7 - total sulfur dioxide
+8 - density
+9 - pH
+10 - sulphates
+11 - alcohol
+Output variable (based on sensory data):
+12 - quality (score between 0 and 10)
+
+8. Missing Attribute Values: None
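
The dataset description above mentions unbalanced, ordered quality classes and evaluation by MAD and by a fixed error tolerance (T). The following is a minimal Python sketch of those three checks, not the authors' implementation. It assumes MAD means the mean absolute deviation between actual and predicted scores, and that a prediction counts as correct when its absolute error is at most T. The function names and the sample values are hypothetical and are not taken from the CSV files in this commit.

```python
from collections import Counter


def quality_distribution(scores):
    """Count how many samples fall in each quality grade, sorted by grade."""
    return dict(sorted(Counter(scores).items()))


def mean_absolute_deviation(actual, predicted):
    """Average of |actual - predicted| over all samples (assumed meaning of MAD)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def accuracy_within_tolerance(actual, predicted, tolerance):
    """Fraction of predictions whose absolute error is at most `tolerance`."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= tolerance)
    return hits / len(actual)


# Made-up values for illustration only; these are not rows from the dataset.
actual = [5, 5, 6, 5, 7, 3, 6, 5]
predicted = [5.5, 5.0, 5.5, 6.0, 6.0, 5.0, 6.5, 5.0]

print(quality_distribution(actual))
print(mean_absolute_deviation(actual, predicted))
print(accuracy_within_tolerance(actual, predicted, tolerance=0.5))
```

With real data, `actual` would be the `quality` column (attribute 12) and `predicted` the output of whatever regression model is being evaluated. A grade count dominated by the middle scores is the imbalance the description warns about.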
