AtliQ Hotels Data Analysis Project
Data Load¶
Datasets¶
We have 5 csv files with records over 1 lac
- dim_date.csv
- dim_hotels.csv
- dim_rooms.csv
- fact_aggregated_bookings.csv
- fact_bookings.csv
Quick Exploration¶
booking_id | property_id | booking_date | check_in_date | checkout_date | no_guests | room_category | booking_platform | ratings_given | booking_status | revenue_generated | revenue_realized | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | May012216558RT11 | 16558 | 27-04-22 | 1/5/2022 | 2/5/2022 | -3.0 | RT1 | direct online | 1.0 | Checked Out | 10010 | 10010 |
1 | May012216558RT12 | 16558 | 30-04-22 | 1/5/2022 | 2/5/2022 | 2.0 | RT1 | others | NaN | Cancelled | 9100 | 3640 |
2 | May012216558RT13 | 16558 | 28-04-22 | 1/5/2022 | 4/5/2022 | 2.0 | RT1 | logtrip | 5.0 | Checked Out | 9100000 | 9100 |
3 | May012216558RT14 | 16558 | 28-04-22 | 1/5/2022 | 2/5/2022 | -2.0 | RT1 | others | NaN | Cancelled | 9100 | 3640 |
4 | May012216558RT15 | 16558 | 27-04-22 | 1/5/2022 | 2/5/2022 | 4.0 | RT1 | direct online | 5.0 | Checked Out | 10920 | 10920 |
date | mmm yy | week no | day_type | |
---|---|---|---|---|
0 | 01-May-22 | May 22 | W 19 | weekend |
1 | 02-May-22 | May 22 | W 19 | weekeday |
2 | 03-May-22 | May 22 | W 19 | weekeday |
3 | 04-May-22 | May 22 | W 19 | weekeday |
4 | 05-May-22 | May 22 | W 19 | weekeday |
property_id | check_in_date | room_category | successful_bookings | capacity | |
---|---|---|---|---|---|
0 | 16559 | 1-May-22 | RT1 | 25 | 30.0 |
1 | 19562 | 1-May-22 | RT1 | 28 | 30.0 |
2 | 19563 | 1-May-22 | RT1 | 23 | 30.0 |
3 | 17558 | 1-May-22 | RT1 | 30 | 19.0 |
4 | 16558 | 1-May-22 | RT1 | 18 | 19.0 |
property_id | property_name | category | city | |
---|---|---|---|---|
0 | 16558 | Atliq Grands | Luxury | Delhi |
1 | 16559 | Atliq Exotica | Luxury | Mumbai |
2 | 16560 | Atliq City | Business | Delhi |
3 | 16561 | Atliq Blu | Luxury | Delhi |
4 | 16562 | Atliq Bay | Luxury | Delhi |
room_id | room_class | |
---|---|---|
0 | RT1 | Standard |
1 | RT2 | Elite |
2 | RT3 | Premium |
3 | RT4 | Presidential |
Preparing Data¶
We can clearly see that the dates are inconsistent across the dataset.
booking_id | property_id | booking_date | check_in_date | checkout_date | no_guests | room_category | booking_platform | ratings_given | booking_status | revenue_generated | revenue_realized | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | May012216558RT11 | 16558 | 2022-04-27 00:00:00 | 2022-05-01 00:00:00 | 2022-05-02 00:00:00 | -3.0 | RT1 | direct online | 1.0 | Checked Out | 10010 | 10010 |
1 | May012216558RT12 | 16558 | 2022-04-30 00:00:00 | 2022-05-01 00:00:00 | 2022-05-02 00:00:00 | 2.0 | RT1 | others | NaN | Cancelled | 9100 | 3640 |
date | mmm yy | week no | day_type | |
---|---|---|---|---|
0 | 2022-05-01 | May 22 | W 19 | weekend |
1 | 2022-05-02 | May 22 | W 19 | weekeday |
property_id | check_in_date | room_category | successful_bookings | capacity | |
---|---|---|---|---|---|
0 | 16559 | 2022-05-01 | RT1 | 25 | 30.0 |
1 | 19562 | 2022-05-01 | RT1 | 28 | 30.0 |
<class 'pandas.core.frame.DataFrame'> RangeIndex: 92 entries, 0 to 91 Data columns (total 4 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 date 92 non-null datetime64[ns] 1 mmm yy 92 non-null object 2 week no 92 non-null object 3 day_type 92 non-null object dtypes: datetime64[ns](1), object(3) memory usage: 3.0+ KB
<class 'pandas.core.frame.DataFrame'> RangeIndex: 9200 entries, 0 to 9199 Data columns (total 5 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 property_id 9200 non-null int64 1 check_in_date 9200 non-null datetime64[ns] 2 room_category 9200 non-null object 3 successful_bookings 9200 non-null int64 4 capacity 9198 non-null float64 dtypes: datetime64[ns](1), float64(1), int64(2), object(1) memory usage: 359.5+ KB
<class 'pandas.core.frame.DataFrame'> RangeIndex: 134590 entries, 0 to 134589 Data columns (total 12 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 booking_id 134590 non-null object 1 property_id 134590 non-null int64 2 booking_date 134590 non-null object 3 check_in_date 134590 non-null object 4 checkout_date 134590 non-null object 5 no_guests 134587 non-null float64 6 room_category 134590 non-null object 7 booking_platform 134590 non-null object 8 ratings_given 56683 non-null float64 9 booking_status 134590 non-null object 10 revenue_generated 134590 non-null int64 11 revenue_realized 134590 non-null int64 dtypes: float64(2), int64(3), object(7) memory usage: 12.3+ MB
df_bookings date columns are still object dtypes
<class 'pandas.core.frame.DataFrame'> RangeIndex: 134590 entries, 0 to 134589 Data columns (total 12 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 booking_id 134590 non-null object 1 property_id 134590 non-null int64 2 booking_date 134590 non-null datetime64[ns] 3 check_in_date 134590 non-null datetime64[ns] 4 checkout_date 134590 non-null datetime64[ns] 5 no_guests 134587 non-null float64 6 room_category 134590 non-null object 7 booking_platform 134590 non-null object 8 ratings_given 56683 non-null float64 9 booking_status 134590 non-null object 10 revenue_generated 134590 non-null int64 11 revenue_realized 134590 non-null int64 dtypes: datetime64[ns](3), float64(2), int64(3), object(4) memory usage: 12.3+ MB
The datatypes are successfully converted
Helper functions for Visualization¶
Exploring bookings data
(134590, 12)
array(['RT1', 'RT2', 'RT3', 'RT4'], dtype=object)
array(['direct online', 'others', 'logtrip', 'tripster', 'makeyourtrip', 'journey', 'direct offline'], dtype=object)
booking_platform others 55066 makeyourtrip 26898 logtrip 14756 direct online 13379 tripster 9630 journey 8106 direct offline 6755 Name: count, dtype: int64
Number of bookings done through booking platforms¶
Overall Aggregation of Bookings data
property_id | booking_date | check_in_date | checkout_date | no_guests | ratings_given | revenue_generated | revenue_realized | |
---|---|---|---|---|---|---|---|---|
count | 134590.000000 | 134590 | 134590 | 134590 | 134587.000000 | 56683.000000 | 1.345900e+05 | 134590.000000 |
mean | 18061.113493 | 2022-06-11 05:51:49.871461120 | 2022-06-14 22:52:29.939816960 | 2022-06-17 07:51:51.262352640 | 2.036170 | 3.619004 | 1.537805e+04 | 12696.123256 |
min | 16558.000000 | 2022-04-07 00:00:00 | 2022-05-01 00:00:00 | 2022-05-02 00:00:00 | -17.000000 | 1.000000 | 6.500000e+03 | 2600.000000 |
25% | 17558.000000 | 2022-05-19 00:00:00 | 2022-05-23 00:00:00 | 2022-05-25 00:00:00 | 1.000000 | 3.000000 | 9.900000e+03 | 7600.000000 |
50% | 17564.000000 | 2022-06-11 00:00:00 | 2022-06-15 00:00:00 | 2022-06-17 00:00:00 | 2.000000 | 4.000000 | 1.350000e+04 | 11700.000000 |
75% | 18563.000000 | 2022-07-04 00:00:00 | 2022-07-08 00:00:00 | 2022-07-10 00:00:00 | 2.000000 | 5.000000 | 1.800000e+04 | 15300.000000 |
max | 19563.000000 | 2022-07-31 00:00:00 | 2022-07-31 00:00:00 | 2022-08-06 00:00:00 | 6.000000 | 5.000000 | 2.856000e+07 | 45220.000000 |
std | 1093.055847 | NaN | NaN | NaN | 1.034885 | 1.235009 | 9.303604e+04 | 6928.108124 |
Min and Max Revenue generated
Minimum Revenue: 6500 Maximum Revenue: 28560000
Exploring rest of the files
(25, 4)
property_id | property_name | category | city | |
---|---|---|---|---|
0 | 16558 | Atliq Grands | Luxury | Delhi |
1 | 16559 | Atliq Exotica | Luxury | Mumbai |
2 | 16560 | Atliq City | Business | Delhi |
3 | 16561 | Atliq Blu | Luxury | Delhi |
4 | 16562 | Atliq Bay | Luxury | Delhi |
category Luxury 16 Business 9 Name: count, dtype: int64
city Delhi 5 Hyderabad 6 Bangalore 6 Mumbai 8 Name: count, dtype: int64
Number of Hotels per City¶
property_id | check_in_date | room_category | successful_bookings | capacity | |
---|---|---|---|---|---|
0 | 16559 | 2022-05-01 | RT1 | 25 | 30.0 |
1 | 19562 | 2022-05-01 | RT1 | 28 | 30.0 |
2 | 19563 | 2022-05-01 | RT1 | 23 | 30.0 |
3 | 17558 | 2022-05-01 | RT1 | 30 | 19.0 |
4 | 16558 | 2022-05-01 | RT1 | 18 | 19.0 |
array([16559, 19562, 19563, 17558, 16558, 17560, 19558, 19560, 17561, 16560, 16561, 16562, 16563, 17559, 17562, 17563, 18558, 18559, 18561, 18562, 18563, 19559, 19561, 17564, 18560])
Total Successful Bookings done by each Property
property_id 16558 3153 16559 7338 16560 4693 16561 4418 16562 4820 16563 7211 17558 5053 17559 6142 17560 6013 17561 5183 17562 3424 17563 6337 17564 3982 18558 4475 18559 5256 18560 6638 18561 6458 18562 7333 18563 4737 19558 4400 19559 4729 19560 6079 19561 5736 19562 5812 19563 5413 Name: successful_bookings, dtype: int64
Properties where Successful Bookings crossed the Capacity
property_id | check_in_date | room_category | successful_bookings | capacity | |
---|---|---|---|---|---|
3 | 17558 | 2022-05-01 | RT1 | 30 | 19.0 |
12 | 16563 | 2022-05-01 | RT1 | 100 | 41.0 |
4136 | 19558 | 2022-06-11 | RT2 | 50 | 39.0 |
6209 | 19560 | 2022-07-02 | RT1 | 123 | 26.0 |
8522 | 19559 | 2022-07-25 | RT1 | 35 | 24.0 |
9194 | 18563 | 2022-07-31 | RT4 | 20 | 18.0 |
Max Capacity
property_id | check_in_date | room_category | successful_bookings | capacity | |
---|---|---|---|---|---|
27 | 17558 | 2022-05-01 | RT2 | 38 | 50.0 |
128 | 17558 | 2022-05-02 | RT2 | 27 | 50.0 |
229 | 17558 | 2022-05-03 | RT2 | 26 | 50.0 |
328 | 17558 | 2022-05-04 | RT2 | 27 | 50.0 |
428 | 17558 | 2022-05-05 | RT2 | 29 | 50.0 |
... | ... | ... | ... | ... | ... |
8728 | 17558 | 2022-07-27 | RT2 | 22 | 50.0 |
8828 | 17558 | 2022-07-28 | RT2 | 21 | 50.0 |
8928 | 17558 | 2022-07-29 | RT2 | 23 | 50.0 |
9028 | 17558 | 2022-07-30 | RT2 | 32 | 50.0 |
9128 | 17558 | 2022-07-31 | RT2 | 30 | 50.0 |
92 rows × 5 columns
Data Cleaning¶
date | mmm yy | week no | day_type | |
---|---|---|---|---|
0 | 2022-05-01 | May 22 | W 19 | weekend |
1 | 2022-05-02 | May 22 | W 19 | weekeday |
2 | 2022-05-03 | May 22 | W 19 | weekeday |
3 | 2022-05-04 | May 22 | W 19 | weekeday |
4 | 2022-05-05 | May 22 | W 19 | weekeday |
We can see that the day_type column values are inconsistent
date | mmm yy | week no | day_type | |
---|---|---|---|---|
0 | 2022-05-01 | May 22 | W 19 | weekend |
1 | 2022-05-02 | May 22 | W 19 | weekday |
2 | 2022-05-03 | May 22 | W 19 | weekday |
3 | 2022-05-04 | May 22 | W 19 | weekday |
4 | 2022-05-05 | May 22 | W 19 | weekday |
Now it is cleaned
Cleaning df_bookings
property_id | booking_date | check_in_date | checkout_date | no_guests | ratings_given | revenue_generated | revenue_realized | |
---|---|---|---|---|---|---|---|---|
count | 134590.000000 | 134590 | 134590 | 134590 | 134587.000000 | 56683.000000 | 1.345900e+05 | 134590.000000 |
mean | 18061.113493 | 2022-06-11 05:51:49.871461120 | 2022-06-14 22:52:29.939816960 | 2022-06-17 07:51:51.262352640 | 2.036170 | 3.619004 | 1.537805e+04 | 12696.123256 |
min | 16558.000000 | 2022-04-07 00:00:00 | 2022-05-01 00:00:00 | 2022-05-02 00:00:00 | -17.000000 | 1.000000 | 6.500000e+03 | 2600.000000 |
25% | 17558.000000 | 2022-05-19 00:00:00 | 2022-05-23 00:00:00 | 2022-05-25 00:00:00 | 1.000000 | 3.000000 | 9.900000e+03 | 7600.000000 |
50% | 17564.000000 | 2022-06-11 00:00:00 | 2022-06-15 00:00:00 | 2022-06-17 00:00:00 | 2.000000 | 4.000000 | 1.350000e+04 | 11700.000000 |
75% | 18563.000000 | 2022-07-04 00:00:00 | 2022-07-08 00:00:00 | 2022-07-10 00:00:00 | 2.000000 | 5.000000 | 1.800000e+04 | 15300.000000 |
max | 19563.000000 | 2022-07-31 00:00:00 | 2022-07-31 00:00:00 | 2022-08-06 00:00:00 | 6.000000 | 5.000000 | 2.856000e+07 | 45220.000000 |
std | 1093.055847 | NaN | NaN | NaN | 1.034885 | 1.235009 | 9.303604e+04 | 6928.108124 |
(1) Cleaning invalid guests
booking_id | property_id | booking_date | check_in_date | checkout_date | no_guests | room_category | booking_platform | ratings_given | booking_status | revenue_generated | revenue_realized | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | May012216558RT11 | 16558 | 2022-04-27 | 2022-05-01 | 2022-05-02 | -3.0 | RT1 | direct online | 1.0 | Checked Out | 10010 | 10010 |
3 | May012216558RT14 | 16558 | 2022-04-28 | 2022-05-01 | 2022-05-02 | -2.0 | RT1 | others | NaN | Cancelled | 9100 | 3640 |
17924 | May122218559RT44 | 18559 | 2022-05-12 | 2022-05-12 | 2022-05-14 | -10.0 | RT4 | direct online | NaN | No Show | 20900 | 20900 |
18020 | May122218561RT22 | 18561 | 2022-05-08 | 2022-05-12 | 2022-05-14 | -12.0 | RT2 | makeyourtrip | NaN | Cancelled | 9000 | 3600 |
18119 | May122218562RT311 | 18562 | 2022-05-05 | 2022-05-12 | 2022-05-17 | -6.0 | RT3 | direct offline | 5.0 | Checked Out | 16800 | 16800 |
18121 | May122218562RT313 | 18562 | 2022-05-10 | 2022-05-12 | 2022-05-17 | -4.0 | RT3 | direct online | NaN | Cancelled | 14400 | 5760 |
56715 | Jun082218562RT12 | 18562 | 2022-06-05 | 2022-06-08 | 2022-06-13 | -17.0 | RT1 | others | NaN | Checked Out | 6500 | 6500 |
119765 | Jul202219560RT220 | 19560 | 2022-07-19 | 2022-07-20 | 2022-07-22 | -1.0 | RT2 | others | NaN | Checked Out | 13500 | 13500 |
134586 | Jul312217564RT47 | 17564 | 2022-07-30 | 2022-07-31 | 2022-08-01 | -4.0 | RT4 | logtrip | 2.0 | Checked Out | 38760 | 38760 |
As you can see above, the number of guests having less than zero value represents data error. We can filter these records.
booking_id | property_id | booking_date | check_in_date | checkout_date | no_guests | room_category | booking_platform | ratings_given | booking_status | revenue_generated | revenue_realized | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | May012216558RT12 | 16558 | 2022-04-30 | 2022-05-01 | 2022-05-02 | 2.0 | RT1 | others | NaN | Cancelled | 9100 | 3640 |
2 | May012216558RT13 | 16558 | 2022-04-28 | 2022-05-01 | 2022-05-04 | 2.0 | RT1 | logtrip | 5.0 | Checked Out | 9100000 | 9100 |
4 | May012216558RT15 | 16558 | 2022-04-27 | 2022-05-01 | 2022-05-02 | 4.0 | RT1 | direct online | 5.0 | Checked Out | 10920 | 10920 |
5 | May012216558RT16 | 16558 | 2022-05-01 | 2022-05-01 | 2022-05-03 | 2.0 | RT1 | others | 4.0 | Checked Out | 9100 | 9100 |
6 | May012216558RT17 | 16558 | 2022-04-28 | 2022-05-01 | 2022-05-06 | 2.0 | RT1 | others | NaN | Cancelled | 9100 | 3640 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
134584 | Jul312217564RT45 | 17564 | 2022-07-30 | 2022-07-31 | 2022-08-01 | 2.0 | RT4 | others | 2.0 | Checked Out | 32300 | 32300 |
134585 | Jul312217564RT46 | 17564 | 2022-07-29 | 2022-07-31 | 2022-08-03 | 1.0 | RT4 | makeyourtrip | 2.0 | Checked Out | 32300 | 32300 |
134587 | Jul312217564RT48 | 17564 | 2022-07-30 | 2022-07-31 | 2022-08-02 | 1.0 | RT4 | tripster | NaN | Cancelled | 32300 | 12920 |
134588 | Jul312217564RT49 | 17564 | 2022-07-29 | 2022-07-31 | 2022-08-01 | 2.0 | RT4 | logtrip | 2.0 | Checked Out | 32300 | 32300 |
134589 | Jul312217564RT410 | 17564 | 2022-07-31 | 2022-07-31 | 2022-08-01 | 2.0 | RT4 | makeyourtrip | NaN | Cancelled | 32300 | 12920 |
134578 rows × 12 columns
(134578, 12)
After removing invalid guests:
Rows : 134578
Columns : 12
(2) Outlier removal in revenue generated
Minimum Revenue: 6500 Maximum Revenue: 28560000
(15378.036937686695, 93040.1549314641)
294498.50173207896
-263742.4278567056
booking_id | property_id | booking_date | check_in_date | checkout_date | no_guests | room_category | booking_platform | ratings_given | booking_status | revenue_generated | revenue_realized | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
2 | May012216558RT13 | 16558 | 2022-04-28 | 2022-05-01 | 2022-05-04 | 2.0 | RT1 | logtrip | 5.0 | Checked Out | 9100000 | 9100 |
111 | May012216559RT32 | 16559 | 2022-04-29 | 2022-05-01 | 2022-05-02 | 6.0 | RT3 | direct online | NaN | Checked Out | 28560000 | 28560 |
315 | May012216562RT22 | 16562 | 2022-04-28 | 2022-05-01 | 2022-05-04 | 2.0 | RT2 | direct offline | 3.0 | Checked Out | 12600000 | 12600 |
562 | May012217559RT118 | 17559 | 2022-04-26 | 2022-05-01 | 2022-05-02 | 2.0 | RT1 | others | NaN | Cancelled | 2000000 | 4420 |
129176 | Jul282216562RT26 | 16562 | 2022-07-21 | 2022-07-28 | 2022-07-29 | 2.0 | RT2 | direct online | 3.0 | Checked Out | 10000000 | 12600 |
The Rows above show revenue values higher than the higher limit. Therefore, they should be removed.
booking_id | property_id | booking_date | check_in_date | checkout_date | no_guests | room_category | booking_platform | ratings_given | booking_status | revenue_generated | revenue_realized | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | May012216558RT12 | 16558 | 2022-04-30 | 2022-05-01 | 2022-05-02 | 2.0 | RT1 | others | NaN | Cancelled | 9100 | 3640 |
4 | May012216558RT15 | 16558 | 2022-04-27 | 2022-05-01 | 2022-05-02 | 4.0 | RT1 | direct online | 5.0 | Checked Out | 10920 | 10920 |
5 | May012216558RT16 | 16558 | 2022-05-01 | 2022-05-01 | 2022-05-03 | 2.0 | RT1 | others | 4.0 | Checked Out | 9100 | 9100 |
6 | May012216558RT17 | 16558 | 2022-04-28 | 2022-05-01 | 2022-05-06 | 2.0 | RT1 | others | NaN | Cancelled | 9100 | 3640 |
7 | May012216558RT18 | 16558 | 2022-04-26 | 2022-05-01 | 2022-05-03 | 2.0 | RT1 | logtrip | NaN | No Show | 9100 | 9100 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
134584 | Jul312217564RT45 | 17564 | 2022-07-30 | 2022-07-31 | 2022-08-01 | 2.0 | RT4 | others | 2.0 | Checked Out | 32300 | 32300 |
134585 | Jul312217564RT46 | 17564 | 2022-07-29 | 2022-07-31 | 2022-08-03 | 1.0 | RT4 | makeyourtrip | 2.0 | Checked Out | 32300 | 32300 |
134587 | Jul312217564RT48 | 17564 | 2022-07-30 | 2022-07-31 | 2022-08-02 | 1.0 | RT4 | tripster | NaN | Cancelled | 32300 | 12920 |
134588 | Jul312217564RT49 | 17564 | 2022-07-29 | 2022-07-31 | 2022-08-01 | 2.0 | RT4 | logtrip | 2.0 | Checked Out | 32300 | 32300 |
134589 | Jul312217564RT410 | 17564 | 2022-07-31 | 2022-07-31 | 2022-08-01 | 2.0 | RT4 | makeyourtrip | NaN | Cancelled | 32300 | 12920 |
134573 rows × 12 columns
After removing the rows showing values above higher limit of revenue:
Rows: 134573
Columns: 12
(134573, 12)
Checking for Revenue realized
count 134573.000000 mean 12695.983585 std 6927.791692 min 2600.000000 25% 7600.000000 50% 11700.000000 75% 15300.000000 max 45220.000000 Name: revenue_realized, dtype: float64
33479.358661845814
booking_id | property_id | booking_date | check_in_date | checkout_date | no_guests | room_category | booking_platform | ratings_given | booking_status | revenue_generated | revenue_realized | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
137 | May012216559RT41 | 16559 | 2022-04-27 | 2022-05-01 | 2022-05-07 | 4.0 | RT4 | others | NaN | Checked Out | 38760 | 38760 |
139 | May012216559RT43 | 16559 | 2022-05-01 | 2022-05-01 | 2022-05-02 | 6.0 | RT4 | tripster | 3.0 | Checked Out | 45220 | 45220 |
143 | May012216559RT47 | 16559 | 2022-04-28 | 2022-05-01 | 2022-05-03 | 3.0 | RT4 | others | 5.0 | Checked Out | 35530 | 35530 |
149 | May012216559RT413 | 16559 | 2022-04-24 | 2022-05-01 | 2022-05-07 | 5.0 | RT4 | logtrip | NaN | Checked Out | 41990 | 41990 |
222 | May012216560RT45 | 16560 | 2022-04-30 | 2022-05-01 | 2022-05-03 | 5.0 | RT4 | others | 3.0 | Checked Out | 34580 | 34580 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
134328 | Jul312219560RT49 | 19560 | 2022-07-31 | 2022-07-31 | 2022-08-02 | 6.0 | RT4 | direct online | 5.0 | Checked Out | 39900 | 39900 |
134331 | Jul312219560RT412 | 19560 | 2022-07-31 | 2022-07-31 | 2022-08-01 | 6.0 | RT4 | others | 2.0 | Checked Out | 39900 | 39900 |
134467 | Jul312219562RT45 | 19562 | 2022-07-28 | 2022-07-31 | 2022-08-01 | 6.0 | RT4 | makeyourtrip | 4.0 | Checked Out | 39900 | 39900 |
134474 | Jul312219562RT412 | 19562 | 2022-07-25 | 2022-07-31 | 2022-08-06 | 5.0 | RT4 | direct offline | 5.0 | Checked Out | 37050 | 37050 |
134581 | Jul312217564RT42 | 17564 | 2022-07-31 | 2022-07-31 | 2022-08-01 | 4.0 | RT4 | makeyourtrip | 4.0 | Checked Out | 38760 | 38760 |
1299 rows × 12 columns
One observation we get from the above dataframe is that all rooms are of RT4 category which means a presidential suite. Now, since RT4 is a luxurious room it is likely it's rent will be higher. To make a fair analysis, we need to do data analysis only on RT4 room types
Checking Revenue realized for RT4 rooms
room_id | room_class | |
---|---|---|
0 | RT1 | Standard |
1 | RT2 | Elite |
2 | RT3 | Premium |
3 | RT4 | Presidential |
count 16071.000000 mean 23439.308444 std 9048.599076 min 7600.000000 25% 19000.000000 50% 26600.000000 75% 32300.000000 max 45220.000000 Name: revenue_realized, dtype: float64
50583
Here, the higher limit comes to be 50583 and in our dataframe above we can see that the max value for the revenue realized is 45220. Hence, we can conclude that there are no outliers in the room_category column and therefore, we don't need to do any data cleaning on it.
Checking for null values
booking_id 0 property_id 0 booking_date 0 check_in_date 0 checkout_date 0 no_guests 0 room_category 0 booking_platform 0 ratings_given 77897 booking_status 0 revenue_generated 0 revenue_realized 0 dtype: int64
Total values in our dataframe are 1,34,573. Out of that 77,899 rows have null rating. Since there are many rows with null rating, we should not filter these values. Also we should not replace this rating with a median or mean rating.
Cleaning df_agg_bookings
property_id | check_in_date | room_category | successful_bookings | capacity | |
---|---|---|---|---|---|
0 | 16559 | 2022-05-01 | RT1 | 25 | 30.0 |
1 | 19562 | 2022-05-01 | RT1 | 28 | 30.0 |
2 | 19563 | 2022-05-01 | RT1 | 23 | 30.0 |
3 | 17558 | 2022-05-01 | RT1 | 30 | 19.0 |
4 | 16558 | 2022-05-01 | RT1 | 18 | 19.0 |
Checking for null values
property_id 0 check_in_date 0 room_category 0 successful_bookings 0 capacity 2 dtype: int64
property_id | check_in_date | room_category | successful_bookings | capacity | |
---|---|---|---|---|---|
8 | 17561 | 2022-05-01 | RT1 | 22 | NaN |
14 | 17562 | 2022-05-01 | RT1 | 12 | NaN |
Filling the null values with median values
25.0
property_id | check_in_date | room_category | successful_bookings | capacity | |
---|---|---|---|---|---|
8 | 17561 | 2022-05-01 | RT1 | 22 | 25.0 |
14 | 17562 | 2022-05-01 | RT1 | 12 | 25.0 |
Properties where successful bookings are more than capacity. That is an outlier
property_id | check_in_date | room_category | successful_bookings | capacity | |
---|---|---|---|---|---|
3 | 17558 | 2022-05-01 | RT1 | 30 | 19.0 |
12 | 16563 | 2022-05-01 | RT1 | 100 | 41.0 |
4136 | 19558 | 2022-06-11 | RT2 | 50 | 39.0 |
6209 | 19560 | 2022-07-02 | RT1 | 123 | 26.0 |
8522 | 19559 | 2022-07-25 | RT1 | 35 | 24.0 |
9194 | 18563 | 2022-07-31 | RT4 | 20 | 18.0 |
(9200, 5)
(9194, 5)
Rows after cleaning: 9194
Columns after cleaning: 5
Data Transformation¶
property_id | check_in_date | room_category | successful_bookings | capacity | |
---|---|---|---|---|---|
0 | 16559 | 2022-05-01 | RT1 | 25 | 30.0 |
1 | 19562 | 2022-05-01 | RT1 | 28 | 30.0 |
2 | 19563 | 2022-05-01 | RT1 | 23 | 30.0 |
4 | 16558 | 2022-05-01 | RT1 | 18 | 19.0 |
5 | 17560 | 2022-05-01 | RT1 | 28 | 40.0 |
Create occupancy percentage column
property_id | check_in_date | room_category | successful_bookings | capacity | occ_pct | |
---|---|---|---|---|---|---|
0 | 16559 | 2022-05-01 | RT1 | 25 | 30.0 | 0.833333 |
1 | 19562 | 2022-05-01 | RT1 | 28 | 30.0 | 0.933333 |
2 | 19563 | 2022-05-01 | RT1 | 23 | 30.0 | 0.766667 |
4 | 16558 | 2022-05-01 | RT1 | 18 | 19.0 | 0.947368 |
5 | 17560 | 2022-05-01 | RT1 | 28 | 40.0 | 0.700000 |
Converting values to percentage format
property_id | check_in_date | room_category | successful_bookings | capacity | occ_pct | |
---|---|---|---|---|---|---|
0 | 16559 | 2022-05-01 | RT1 | 25 | 30.0 | 83.33 |
1 | 19562 | 2022-05-01 | RT1 | 28 | 30.0 | 93.33 |
2 | 19563 | 2022-05-01 | RT1 | 23 | 30.0 | 76.67 |
4 | 16558 | 2022-05-01 | RT1 | 18 | 19.0 | 94.74 |
5 | 17560 | 2022-05-01 | RT1 | 28 | 40.0 | 70.00 |
<class 'pandas.core.frame.DataFrame'> Index: 9194 entries, 0 to 9199 Data columns (total 6 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 property_id 9194 non-null int64 1 check_in_date 9194 non-null datetime64[ns] 2 room_category 9194 non-null object 3 successful_bookings 9194 non-null int64 4 capacity 9194 non-null float64 5 occ_pct 9194 non-null float64 dtypes: datetime64[ns](1), float64(2), int64(2), object(1) memory usage: 502.8+ KB
Column is successfully created.
Insights Generation¶
1. What is the average occupancy rate in each of the room categories?
room_category RT1 57.89 RT2 58.01 RT3 58.03 RT4 59.28 Name: occ_pct, dtype: float64
Manager may say, "I don't understand RT1, RT2".
Therefore:
Printing room categories such as Standard, Premium, Elite etc along with average occupancy percentage would be better
room_id | room_class | |
---|---|---|
0 | RT1 | Standard |
1 | RT2 | Elite |
2 | RT3 | Premium |
3 | RT4 | Presidential |
Merging df_agg_bookings and df_rooms
New datframe: df
property_id | check_in_date | room_category | successful_bookings | capacity | occ_pct | room_id | room_class | |
---|---|---|---|---|---|---|---|---|
0 | 16559 | 2022-05-01 | RT1 | 25 | 30.0 | 83.33 | RT1 | Standard |
1 | 19562 | 2022-05-01 | RT1 | 28 | 30.0 | 93.33 | RT1 | Standard |
2 | 19563 | 2022-05-01 | RT1 | 23 | 30.0 | 76.67 | RT1 | Standard |
3 | 16558 | 2022-05-01 | RT1 | 18 | 19.0 | 94.74 | RT1 | Standard |
4 | 17560 | 2022-05-01 | RT1 | 28 | 40.0 | 70.00 | RT1 | Standard |
Dropping room_id which is already in df_agg_bookings by name room_category
property_id | check_in_date | room_category | successful_bookings | capacity | occ_pct | room_class | |
---|---|---|---|---|---|---|---|
0 | 16559 | 2022-05-01 | RT1 | 25 | 30.0 | 83.33 | Standard |
1 | 19562 | 2022-05-01 | RT1 | 28 | 30.0 | 93.33 | Standard |
2 | 19563 | 2022-05-01 | RT1 | 23 | 30.0 | 76.67 | Standard |
3 | 16558 | 2022-05-01 | RT1 | 18 | 19.0 | 94.74 | Standard |
4 | 17560 | 2022-05-01 | RT1 | 28 | 40.0 | 70.00 | Standard |
Average Occupancy % by Room Class
room_class Elite 58.01 Premium 58.03 Presidential 59.28 Standard 57.89 Name: occ_pct, dtype: float64
2. Average occupancy rate per city.
property_id | property_name | category | city | |
---|---|---|---|---|
0 | 16558 | Atliq Grands | Luxury | Delhi |
1 | 16559 | Atliq Exotica | Luxury | Mumbai |
2 | 16560 | Atliq City | Business | Delhi |
3 | 16561 | Atliq Blu | Luxury | Delhi |
4 | 16562 | Atliq Bay | Luxury | Delhi |
Merging df and df_hotels
df already has df_agg_bookings data and df_rooms data before this
property_id | check_in_date | room_category | successful_bookings | capacity | occ_pct | room_class | property_name | category | city | |
---|---|---|---|---|---|---|---|---|---|---|
0 | 16559 | 2022-05-01 | RT1 | 25 | 30.0 | 83.33 | Standard | Atliq Exotica | Luxury | Mumbai |
1 | 19562 | 2022-05-01 | RT1 | 28 | 30.0 | 93.33 | Standard | Atliq Bay | Luxury | Bangalore |
2 | 19563 | 2022-05-01 | RT1 | 23 | 30.0 | 76.67 | Standard | Atliq Palace | Business | Bangalore |
3 | 16558 | 2022-05-01 | RT1 | 18 | 19.0 | 94.74 | Standard | Atliq Grands | Luxury | Delhi |
4 | 17560 | 2022-05-01 | RT1 | 28 | 40.0 | 70.00 | Standard | Atliq City | Business | Mumbai |
Average Occupancy Rate by City¶
3. When was the occupancy better? Weekday or Weekend?
date | mmm yy | week no | day_type | |
---|---|---|---|---|
0 | 2022-05-01 | May 22 | W 19 | weekend |
1 | 2022-05-02 | May 22 | W 19 | weekday |
2 | 2022-05-03 | May 22 | W 19 | weekday |
3 | 2022-05-04 | May 22 | W 19 | weekday |
4 | 2022-05-05 | May 22 | W 19 | weekday |
Merging df and df_date
property_id | check_in_date | room_category | successful_bookings | capacity | occ_pct | room_class | property_name | category | city | date | mmm yy | week no | day_type | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 16559 | 2022-05-01 | RT1 | 25 | 30.0 | 83.33 | Standard | Atliq Exotica | Luxury | Mumbai | 2022-05-01 | May 22 | W 19 | weekend |
1 | 19562 | 2022-05-01 | RT1 | 28 | 30.0 | 93.33 | Standard | Atliq Bay | Luxury | Bangalore | 2022-05-01 | May 22 | W 19 | weekend |
2 | 19563 | 2022-05-01 | RT1 | 23 | 30.0 | 76.67 | Standard | Atliq Palace | Business | Bangalore | 2022-05-01 | May 22 | W 19 | weekend |
3 | 16558 | 2022-05-01 | RT1 | 18 | 19.0 | 94.74 | Standard | Atliq Grands | Luxury | Delhi | 2022-05-01 | May 22 | W 19 | weekend |
4 | 17560 | 2022-05-01 | RT1 | 28 | 40.0 | 70.00 | Standard | Atliq City | Business | Mumbai | 2022-05-01 | May 22 | W 19 | weekend |
Occupancy %- Weekday vs Weekend
day_type weekday 51.81 weekend 73.96 Name: occ_pct, dtype: float64
4. Monthly Trend by Occupancy
mmm yy May 22 58.93 Jun 22 58.03 Jul 22 57.94 Name: occ_pct, dtype: float64
Monthly Trend by Occupancy¶
In the month of June, what is the occupancy for different cities?
array(['May 22', 'Jun 22', 'Jul 22'], dtype=object)
property_id | check_in_date | room_category | successful_bookings | capacity | occ_pct | room_class | property_name | category | city | date | mmm yy | week no | day_type | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
3098 | 16559 | 2022-06-01 | RT1 | 14 | 30.0 | 46.67 | Standard | Atliq Exotica | Luxury | Mumbai | 2022-06-01 | Jun 22 | W 23 | weekday |
3099 | 18560 | 2022-06-01 | RT1 | 18 | 30.0 | 60.00 | Standard | Atliq City | Business | Hyderabad | 2022-06-01 | Jun 22 | W 23 | weekday |
3100 | 19562 | 2022-06-01 | RT1 | 18 | 30.0 | 60.00 | Standard | Atliq Bay | Luxury | Bangalore | 2022-06-01 | Jun 22 | W 23 | weekday |
3101 | 19563 | 2022-06-01 | RT1 | 14 | 30.0 | 46.67 | Standard | Atliq Palace | Business | Bangalore | 2022-06-01 | Jun 22 | W 23 | weekday |
3102 | 17558 | 2022-06-01 | RT1 | 8 | 19.0 | 42.11 | Standard | Atliq Grands | Luxury | Mumbai | 2022-06-01 | Jun 22 | W 23 | weekday |
Average Occupancy % by City in June
city Delhi 61.46 Mumbai 57.79 Hyderabad 57.69 Bangalore 55.85 Name: occ_pct, dtype: float64
Reading and Exploring new data
property_id | property_name | category | city | room_category | room_class | check_in_date | mmm yy | week no | day_type | successful_bookings | capacity | occ% | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 16559 | Atliq Exotica | Luxury | Mumbai | RT1 | Standard | 2022-08-01 | Aug-22 | W 32 | weekeday | 30 | 30 | 100.00 |
1 | 19562 | Atliq Bay | Luxury | Bangalore | RT1 | Standard | 2022-08-01 | Aug-22 | W 32 | weekeday | 21 | 30 | 70.00 |
2 | 19563 | Atliq Palace | Business | Bangalore | RT1 | Standard | 2022-08-01 | Aug-22 | W 32 | weekeday | 23 | 30 | 76.67 |
3 | 19558 | Atliq Grands | Luxury | Bangalore | RT1 | Standard | 2022-08-01 | Aug-22 | W 32 | weekeday | 30 | 40 | 75.00 |
4 | 19560 | Atliq City | Business | Bangalore | RT1 | Standard | 2022-08-01 | Aug-22 | W 32 | weekeday | 20 | 26 | 76.92 |
Cleaning the day_type column
property_id | property_name | category | city | room_category | room_class | check_in_date | mmm yy | week no | day_type | successful_bookings | capacity | occ% | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 16559 | Atliq Exotica | Luxury | Mumbai | RT1 | Standard | 2022-08-01 | Aug-22 | W 32 | weekday | 30 | 30 | 100.0 |
1 | 19562 | Atliq Bay | Luxury | Bangalore | RT1 | Standard | 2022-08-01 | Aug-22 | W 32 | weekday | 21 | 30 | 70.0 |
Index(['property_id', 'property_name', 'category', 'city', 'room_category', 'room_class', 'check_in_date', 'mmm yy', 'week no', 'day_type', 'successful_bookings', 'capacity', 'occ%'], dtype='object')
Index(['property_id', 'check_in_date', 'room_category', 'successful_bookings', 'capacity', 'occ_pct', 'room_class', 'property_name', 'category', 'city', 'date', 'mmm yy', 'week no', 'day_type'], dtype='object')
(7, 13)
Columns in august data: 13
(9194, 14)
Columns in the dataframe where the august data is to be appended: 14
Number of Columns should be same
Dropping the extra date column created due to merge operation
property_id | check_in_date | room_category | successful_bookings | capacity | occ_pct | room_class | property_name | category | city | mmm yy | week no | day_type | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 16559 | 2022-05-01 | RT1 | 25 | 30.0 | 83.33 | Standard | Atliq Exotica | Luxury | Mumbai | May 22 | W 19 | weekend |
1 | 19562 | 2022-05-01 | RT1 | 28 | 30.0 | 93.33 | Standard | Atliq Bay | Luxury | Bangalore | May 22 | W 19 | weekend |
2 | 19563 | 2022-05-01 | RT1 | 23 | 30.0 | 76.67 | Standard | Atliq Palace | Business | Bangalore | May 22 | W 19 | weekend |
3 | 16558 | 2022-05-01 | RT1 | 18 | 19.0 | 94.74 | Standard | Atliq Grands | Luxury | Delhi | May 22 | W 19 | weekend |
4 | 17560 | 2022-05-01 | RT1 | 28 | 40.0 | 70.00 | Standard | Atliq City | Business | Mumbai | May 22 | W 19 | weekend |
(9194, 13)
Updating the records
property_id | check_in_date | room_category | successful_bookings | capacity | occ_pct | room_class | property_name | category | city | mmm yy | week no | day_type | occ% | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
9196 | 19563 | 2022-08-01 | RT1 | 23 | 30.0 | NaN | Standard | Atliq Palace | Business | Bangalore | Aug-22 | W 32 | weekday | 76.67 |
9197 | 19558 | 2022-08-01 | RT1 | 30 | 40.0 | NaN | Standard | Atliq Grands | Luxury | Bangalore | Aug-22 | W 32 | weekday | 75.00 |
9198 | 19560 | 2022-08-01 | RT1 | 20 | 26.0 | NaN | Standard | Atliq City | Business | Bangalore | Aug-22 | W 32 | weekday | 76.92 |
9199 | 17561 | 2022-08-01 | RT1 | 18 | 26.0 | NaN | Standard | Atliq Blu | Luxury | Mumbai | Aug-22 | W 32 | weekday | 69.23 |
9200 | 17564 | 2022-08-01 | RT1 | 10 | 16.0 | NaN | Standard | Atliq Seasons | Business | Mumbai | Aug-22 | W 32 | weekday | 62.50 |
5. Revenue realized per city
booking_id | property_id | booking_date | check_in_date | checkout_date | no_guests | room_category | booking_platform | ratings_given | booking_status | revenue_generated | revenue_realized | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | May012216558RT12 | 16558 | 2022-04-30 | 2022-05-01 | 2022-05-02 | 2.0 | RT1 | others | NaN | Cancelled | 9100 | 3640 |
4 | May012216558RT15 | 16558 | 2022-04-27 | 2022-05-01 | 2022-05-02 | 4.0 | RT1 | direct online | 5.0 | Checked Out | 10920 | 10920 |
5 | May012216558RT16 | 16558 | 2022-05-01 | 2022-05-01 | 2022-05-03 | 2.0 | RT1 | others | 4.0 | Checked Out | 9100 | 9100 |
property_id | property_name | category | city | |
---|---|---|---|---|
0 | 16558 | Atliq Grands | Luxury | Delhi |
1 | 16559 | Atliq Exotica | Luxury | Mumbai |
2 | 16560 | Atliq City | Business | Delhi |
Merging df_bookings and df_hotels
booking_id | property_id | booking_date | check_in_date | checkout_date | no_guests | room_category | booking_platform | ratings_given | booking_status | revenue_generated | revenue_realized | property_name | category | city | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | May012216558RT12 | 16558 | 2022-04-30 | 2022-05-01 | 2022-05-02 | 2.0 | RT1 | others | NaN | Cancelled | 9100 | 3640 | Atliq Grands | Luxury | Delhi |
1 | May012216558RT15 | 16558 | 2022-04-27 | 2022-05-01 | 2022-05-02 | 4.0 | RT1 | direct online | 5.0 | Checked Out | 10920 | 10920 | Atliq Grands | Luxury | Delhi |
2 | May012216558RT16 | 16558 | 2022-05-01 | 2022-05-01 | 2022-05-03 | 2.0 | RT1 | others | 4.0 | Checked Out | 9100 | 9100 | Atliq Grands | Luxury | Delhi |
city Bangalore 420383550 Delhi 294404488 Hyderabad 325179310 Mumbai 668569251 Name: revenue_realized, dtype: int64
Revenue realized by City¶
6. Month by month revenue
array(['May 22', 'Jun 22', 'Jul 22'], dtype=object)
date | mmm yy | week no | day_type | |
---|---|---|---|---|
0 | 2022-05-01 | May 22 | W 19 | weekend |
1 | 2022-05-02 | May 22 | W 19 | weekday |
Merging df_bookings_all and df_date
booking_id | property_id | booking_date | check_in_date | checkout_date | no_guests | room_category | booking_platform | ratings_given | booking_status | revenue_generated | revenue_realized | property_name | category | city | date | mmm yy | week no | day_type | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | May012216558RT12 | 16558 | 2022-04-30 | 2022-05-01 | 2022-05-02 | 2.0 | RT1 | others | NaN | Cancelled | 9100 | 3640 | Atliq Grands | Luxury | Delhi | 2022-05-01 | May 22 | W 19 | weekend |
1 | May012216558RT15 | 16558 | 2022-04-27 | 2022-05-01 | 2022-05-02 | 4.0 | RT1 | direct online | 5.0 | Checked Out | 10920 | 10920 | Atliq Grands | Luxury | Delhi | 2022-05-01 | May 22 | W 19 | weekend |
2 | May012216558RT16 | 16558 | 2022-05-01 | 2022-05-01 | 2022-05-03 | 2.0 | RT1 | others | 4.0 | Checked Out | 9100 | 9100 | Atliq Grands | Luxury | Delhi | 2022-05-01 | May 22 | W 19 | weekend |
booking_id | property_id | booking_date | check_in_date | checkout_date | no_guests | room_category | booking_platform | ratings_given | booking_status | revenue_generated | revenue_realized | property_name | category | city | mmm yy | week no | day_type | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | May012216558RT12 | 16558 | 2022-04-30 | 2022-05-01 | 2022-05-02 | 2.0 | RT1 | others | NaN | Cancelled | 9100 | 3640 | Atliq Grands | Luxury | Delhi | May 22 | W 19 | weekend |
1 | May012216558RT15 | 16558 | 2022-04-27 | 2022-05-01 | 2022-05-02 | 4.0 | RT1 | direct online | 5.0 | Checked Out | 10920 | 10920 | Atliq Grands | Luxury | Delhi | May 22 | W 19 | weekend |
2 | May012216558RT16 | 16558 | 2022-05-01 | 2022-05-01 | 2022-05-03 | 2.0 | RT1 | others | 4.0 | Checked Out | 9100 | 9100 | Atliq Grands | Luxury | Delhi | May 22 | W 19 | weekend |
mmm yy Jul 22 572843348 Jun 22 553925855 May 22 581767396 Name: revenue_realized, dtype: int64
Monthly Trend by Revenue¶
7. Revenue realized per hotel type
property_name Atliq Seasons 66086735 Atliq Grands 211462134 Atliq Bay 259996918 Atliq Blu 260851922 Atliq City 285798439 Atliq Palace 304081863 Atliq Exotica 320258588 Name: revenue_realized, dtype: int64
Index(['Atliq Exotica', 'Atliq Palace', 'Atliq City', 'Atliq Blu', 'Atliq Bay', 'Atliq Grands', 'Atliq Seasons'], dtype='object', name='property_name')
Top Hotels by Revenue¶
8. Average rating per city
city Delhi 3.78 Hyderabad 3.66 Mumbai 3.65 Bangalore 3.41 Name: ratings_given, dtype: float64
Index(['Delhi', 'Hyderabad', 'Mumbai', 'Bangalore'], dtype='object', name='city')
Avg Rating per City¶
Revenue realized per Booking Platform¶
Revenue by City and Booking Platform¶
Revenue: Weekday vs Weekend¶
Finding Revenue by Room Category
Merging df_bookings and df_rooms
booking_id | property_id | booking_date | check_in_date | checkout_date | no_guests | room_category | booking_platform | ratings_given | booking_status | revenue_generated | revenue_realized | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | May012216558RT12 | 16558 | 2022-04-30 | 2022-05-01 | 2022-05-02 | 2.0 | RT1 | others | NaN | Cancelled | 9100 | 3640 |
4 | May012216558RT15 | 16558 | 2022-04-27 | 2022-05-01 | 2022-05-02 | 4.0 | RT1 | direct online | 5.0 | Checked Out | 10920 | 10920 |
5 | May012216558RT16 | 16558 | 2022-05-01 | 2022-05-01 | 2022-05-03 | 2.0 | RT1 | others | 4.0 | Checked Out | 9100 | 9100 |
6 | May012216558RT17 | 16558 | 2022-04-28 | 2022-05-01 | 2022-05-06 | 2.0 | RT1 | others | NaN | Cancelled | 9100 | 3640 |
7 | May012216558RT18 | 16558 | 2022-04-26 | 2022-05-01 | 2022-05-03 | 2.0 | RT1 | logtrip | NaN | No Show | 9100 | 9100 |
room_id | room_class | |
---|---|---|
0 | RT1 | Standard |
1 | RT2 | Elite |
2 | RT3 | Premium |
3 | RT4 | Presidential |
booking_id | property_id | booking_date | check_in_date | checkout_date | no_guests | room_category | booking_platform | ratings_given | booking_status | revenue_generated | revenue_realized | room_id | room_class | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | May012216558RT12 | 16558 | 2022-04-30 | 2022-05-01 | 2022-05-02 | 2.0 | RT1 | others | NaN | Cancelled | 9100 | 3640 | RT1 | Standard |
1 | May012216558RT15 | 16558 | 2022-04-27 | 2022-05-01 | 2022-05-02 | 4.0 | RT1 | direct online | 5.0 | Checked Out | 10920 | 10920 | RT1 | Standard |
2 | May012216558RT16 | 16558 | 2022-05-01 | 2022-05-01 | 2022-05-03 | 2.0 | RT1 | others | 4.0 | Checked Out | 9100 | 9100 | RT1 | Standard |
3 | May012216558RT17 | 16558 | 2022-04-28 | 2022-05-01 | 2022-05-06 | 2.0 | RT1 | others | NaN | Cancelled | 9100 | 3640 | RT1 | Standard |
4 | May012216558RT18 | 16558 | 2022-04-26 | 2022-05-01 | 2022-05-03 | 2.0 | RT1 | logtrip | NaN | No Show | 9100 | 9100 | RT1 | Standard |
booking_id | property_id | booking_date | check_in_date | checkout_date | no_guests | room_category | booking_platform | ratings_given | booking_status | revenue_generated | revenue_realized | room_class | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | May012216558RT12 | 16558 | 2022-04-30 | 2022-05-01 | 2022-05-02 | 2.0 | RT1 | others | NaN | Cancelled | 9100 | 3640 | Standard |
1 | May012216558RT15 | 16558 | 2022-04-27 | 2022-05-01 | 2022-05-02 | 4.0 | RT1 | direct online | 5.0 | Checked Out | 10920 | 10920 | Standard |
2 | May012216558RT16 | 16558 | 2022-05-01 | 2022-05-01 | 2022-05-03 | 2.0 | RT1 | others | 4.0 | Checked Out | 9100 | 9100 | Standard |
3 | May012216558RT17 | 16558 | 2022-04-28 | 2022-05-01 | 2022-05-06 | 2.0 | RT1 | others | NaN | Cancelled | 9100 | 3640 | Standard |
4 | May012216558RT18 | 16558 | 2022-04-26 | 2022-05-01 | 2022-05-03 | 2.0 | RT1 | logtrip | NaN | No Show | 9100 | 9100 | Standard |