🍽️ Restaurant Tips Analysis

This project aims to use the restaurant tips dataset to practice creating composition plots and visualizations. We will examine the relationship between different variables and the tips given.

The dataset consists of information from 244 restaurant bills, collected in the US in 1987.

It includes details about the tips given to restaurant staff, such as the total bill, tip amount, gender of the person paying, smoking status, day of the week, time of day, and party size.

Data details

Source: Swiss coding academy

The main goal of this analysis is to learn more about the relationship between different variables and the tips given.

To reach this goal, we need to answer the questions below:

What is the data like?
What needs to be cleaned in the data?
How should the data group customers?
What needs to be calculated from the data?
How do we compare the groups?

How do we answer them?

What is the data like?

Import pandas and matplotlib.
Read and check the data. Each record includes:

The day it occurred
If it was at lunch or dinner
The total bill
The sex of the person
If they were a smoker or not
The size of the party

The first five rows of the data:

	Id	Total_bill	Tip	Sex	Smoker	Day	Time	Size
0	0	16.99	1.01	Female	No	Sun	Dinner	2
1	1	10.34	1.66	Male	No	Sun	Dinner	3
2	2	21.01	3.50	Male	No	Sun	Dinner	3
3	3	23.68	3.31	Male	No	Sun	Dinner	2
4	4	24.59	3.61	Female	No	Sun	Dinner	4
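
As a minimal sketch of this step, the snippet below loads the five rows shown above from an inline string rather than from the real file, whose name and location are not given here. The `sample` text and the variable name `tips` are assumptions made for illustration.

```python
import io

import pandas as pd

# The five rows shown above, typed inline so the example is self-contained.
# In the real analysis the data would be read from the dataset file instead.
sample = """Id,Total_bill,Tip,Sex,Smoker,Day,Time,Size
0,16.99,1.01,Female,No,Sun,Dinner,2
1,10.34,1.66,Male,No,Sun,Dinner,3
2,21.01,3.50,Male,No,Sun,Dinner,3
3,23.68,3.31,Male,No,Sun,Dinner,2
4,24.59,3.61,Female,No,Sun,Dinner,4
"""

tips = pd.read_csv(io.StringIO(sample))

print(tips.head())   # first rows of the table
print(tips.shape)    # (rows, columns)
tips.info()          # column names, non-null counts and dtypes
```

With the full dataset, `tips.shape` should report the 244 bills mentioned in the introduction.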

What needs to be cleaned in the data?

Check for null values and duplicate rows. Result: there are none.
Check the column types. The string columns are stored as object, so we convert them to the correct types.

Result: the dtypes are Float64(2), Int64(2), string(4), so the types are now correct.
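
The checks above could look roughly like the sketch below. It does not claim to be the notebook's exact code: the three rows are a small illustrative sample, and `convert_dtypes()` is an assumption, chosen because it produces dtypes with the names reported above (Float64, Int64, string).

```python
import pandas as pd

# Illustrative rows only, not the full dataset.
tips = pd.DataFrame({
    "Id": [0, 1, 2],
    "Total_bill": [16.99, 10.34, 21.01],
    "Tip": [1.01, 1.66, 3.50],
    "Sex": ["Female", "Male", "Male"],
    "Smoker": ["No", "No", "No"],
    "Day": ["Sun", "Sun", "Sun"],
    "Time": ["Dinner", "Dinner", "Dinner"],
    "Size": [2, 3, 3],
})

null_count = int(tips.isnull().sum().sum())     # total missing cells
duplicate_count = int(tips.duplicated().sum())  # fully repeated rows

# convert_dtypes() picks nullable extension types:
# Float64 for floats, Int64 for integers, string for text.
tips = tips.convert_dtypes()

print(null_count, duplicate_count)
print(tips.dtypes)
```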

How should the data group customers?

We need to compare the tip figures between pairs of customer groups. We categorize the data to be compared as follows:

Smokers and non-smokers
Male and female
Weekends and weekdays
Lunch and dinner (we checked the Time column, and the restaurant only has these 2 service times)
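
The weekend/weekday split needs a derived column, since the data only stores the day name. The sketch below is one way to build it. The rows are hypothetical, the day labels `Thur` and `Fri` are assumed spellings (only `Sun` appears in the rows shown earlier), the column name `Is_weekend` is made up for illustration, and treating Sat and Sun as the weekend is also an assumption.

```python
import pandas as pd

# Hypothetical rows for illustration; the values are not from the dataset.
tips = pd.DataFrame({
    "Day": ["Thur", "Fri", "Sat", "Sun"],
    "Time": ["Lunch", "Dinner", "Dinner", "Dinner"],
    "Tip": [2.0, 3.0, 2.5, 4.0],
})

# Assumption: Sat and Sun count as the weekend, every other day as a weekday.
tips["Is_weekend"] = tips["Day"].isin(["Sat", "Sun"])

# Confirm which service times exist in the Time column.
service_times = sorted(tips["Time"].unique())

print(tips)
print(service_times)
```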

What needs to be calculated from the data?

View the descriptive statistics of the data
Calculate the following metrics for the customer groups listed above: min, max, mean, median
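
A minimal sketch of these two calculations, using a tiny made-up table in place of the real data (the tip values below are hypothetical):

```python
import pandas as pd

# Hypothetical tips for two groups; not values from the dataset.
tips = pd.DataFrame({
    "Smoker": ["No", "No", "Yes", "Yes", "Yes"],
    "Tip": [1.0, 3.0, 2.0, 4.0, 6.0],
})

# Overall descriptive statistics for the tip column.
print(tips["Tip"].describe())

# Min, max, mean and median of the tip for each group.
summary = tips.groupby("Smoker")["Tip"].agg(["min", "max", "mean", "median"])
print(summary)
```

The same `groupby` call can be reused for the other comparisons by swapping `Smoker` for the relevant grouping column.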

How do we compare the groups?

We use a t-test to compare the groups, and we use matplotlib to compare their distributions.
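
The exact t-test variant is not stated here. The sketch below assumes an independent two-sample t-test through `scipy.stats.ttest_ind` with its default equal-variance setting, applied to hypothetical tip amounts. The names `group_a` and `group_b` are placeholders for any of the group pairs listed above.

```python
import numpy as np
from scipy import stats

# Hypothetical tip amounts for two groups (not values from the dataset).
group_a = np.array([3.0, 2.5, 4.0, 3.5, 5.0])
group_b = np.array([2.0, 3.0, 2.5, 3.5, 4.0])

# Assumption: an independent two-sample t-test with equal variances,
# which is the default behaviour of scipy.stats.ttest_ind.
result = stats.ttest_ind(group_a, group_b)

alpha = 0.05  # significance level used in this README
significant = bool(result.pvalue < alpha)

print(result.statistic, result.pvalue, significant)
```

A p-value above `alpha` means the test does not detect a difference in mean tip between the two groups. It does not prove that the groups tip the same amount.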

The detailed steps are in this notebook: https://colab.research.google.com/drive/1ZyT3H1C8TqUXtE99oqKlXoY_WLR1vGZl?usp=sharing

📝 SUMMARY

With the results in hand, we have the following insights and conclusions:

First: smokers and non-smokers

Insights based on measures of central tendency comparison:

Based on the measures:

The max tip value belongs to the smokers group: 10 USD
The average tip value is higher for smokers than for non-smokers

General conclusion: on average, smokers tip more than non-smokers

TtestResult:

statistic=0.09222805186888201
pvalue=0.9265931522244976

Insights based on distribution comparison:

The t-test between smokers and non-smokers gives pvalue = 0.927 > 0.05, so the difference in mean tip between these two customer groups is not statistically significant.

General conclusion:

From the table of min, max, and median values and the distribution plot, we can see that the average tip amount is 2.9 USD and the highest tip amount is 10 USD. Smokers tip more than non-smokers, but the difference is not significant. Tips between 1 USD and 2.5 USD are the most common.

Second: female and male

Insights based on measures of central tendency comparison:

Based on the measures:

The max tip value belongs to the male group: 10 USD
The average tip value is higher for male customers than for female customers

General conclusion: on average, male customers tip more than female customers

TtestResult:

statistic=-1.387859705421269
pvalue=0.16645623503456755

Insights based on distribution comparison:

The t-test between male and female customers gives pvalue = 0.166 > 0.05, so the difference in mean tip between these two groups is not statistically significant.

General conclusion:

From the table of min, max, and average values and the distribution plot, we see that male customers tip more than female customers, but the difference is not significant.

Third: weekends and weekdays

Insights based on measures of central tendency comparison:

Based on the measures:

The max tip value occurs on a weekend: 10 USD
The average tip value is higher on weekends than on weekdays

General conclusion: on average, weekend customers tip more than weekday customers

TtestResult:

statistic=1.1028993019409794
pvalue=0.27154326510606286

Insights based on distribution comparison:

The t-test between weekends and weekdays gives pvalue = 0.27 > 0.05, so the difference in mean tip between these two groups is not statistically significant.

General conclusion:

From the table of min, max, and average values and the distribution plot, we see that weekend customers tip more than weekday customers, but the difference is not significant.

Last: dinner and lunch

Insights based on measures of central tendency comparison:

Based on the measures:

The max tip value occurs at dinner: 10 USD
The average tip value is higher at dinner than at lunch

General conclusion: on average, dinner customers tip more than lunch customers

TtestResult:

statistic=1.9062569301202392
pvalue=0.05780153475171558

Insights based on distribution comparison:

The t-test between dinner and lunch gives pvalue = 0.0578 > 0.05. This is close to the threshold, but the difference in mean tip between these two groups is still not statistically significant at the 0.05 level.

General conclusion:

From the table of min, max, and average values and the distribution plot, we see that the dinner group tips more than the lunch group, but the difference is not significant.


📝 CONCLUSION

After processing and calculating the data, the highlights are:

Smokers tip more than non-smokers
Male customers tip more than female customers
Weekend customers tip more than weekday customers
The dinner group tips more than the lunch group

However, none of these differences is statistically significant at the 0.05 level, so the data does not show a clear relationship between these variables and the tips given.
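
To make the decision rule explicit, the short sketch below compares the four p-values reported in the summary (rounded to four decimals) with the 0.05 threshold. The dictionary and variable names are illustrative only.

```python
# p-values reported in the summary above, rounded to four decimals.
reported_pvalues = {
    "smokers vs non-smokers": 0.9266,
    "male vs female": 0.1665,
    "weekends vs weekdays": 0.2715,
    "dinner vs lunch": 0.0578,
}

alpha = 0.05  # significance level used throughout this README

for comparison, pvalue in reported_pvalues.items():
    verdict = "significant" if pvalue < alpha else "not significant"
    print(f"{comparison}: p = {pvalue} -> {verdict}")

# Comparisons that pass the threshold (empty when none do).
significant = [name for name, p in reported_pvalues.items() if p < alpha]
```

Dinner vs lunch comes closest to the threshold, which is why it is the comparison most worth revisiting with more data.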
