1645 - Hopper Company Queries II [LeetCode]
Aug 31, 2024
4 min read
(To see the full code, go to my github) Write an SQL query to calculate the percentage of working drivers (working_percentage) for each month of 2020. The working percentage is determined by the ratio of drivers who accepted rides in a given month to the total number of drivers available during that month.
If the number of available drivers during a month is zero, the working_percentage should be 0.00.
The result should display all months from 1 to 12 (January to December), with the working_percentage rounded to two decimal places.
Drivers: Stores information about drivers and the date they joined the platform.
![1645 - Hopper Company Queries II [LeetCode]](https://static.wixstatic.com/media/b635b2_754a166e17eb47d1a99f5643b6ff92a7~mv2.png/v1/fill/w_177,h_264,al_c,q_85,usm_0.66_1.00_0.01,enc_auto/b635b2_754a166e17eb47d1a99f5643b6ff92a7~mv2.png)
Rides: Contains information about rides requested by users and the date of request.
![1645 - Hopper Company Queries II [LeetCode]](https://static.wixstatic.com/media/b635b2_5d1f855a8ae740b2b8fcf3e734e887cf~mv2.png/v1/fill/w_239,h_515,al_c,q_85,usm_0.66_1.00_0.01,enc_auto/b635b2_5d1f855a8ae740b2b8fcf3e734e887cf~mv2.png)
AcceptedRides: Details about rides that have been accepted by drivers, including distance and duration.
![1645 - Hopper Company Queries II [LeetCode]](https://static.wixstatic.com/media/b635b2_81c7273df8864b1885d9989261ca7d64~mv2.png/v1/fill/w_329,h_354,al_c,q_85,usm_0.66_1.00_0.01,enc_auto/b635b2_81c7273df8864b1885d9989261ca7d64~mv2.png)
Solution
We will approach the problem in four comprehensive steps:
Generate a sequence of months from 1 to 12.
Calculate the cumulative total number of drivers available by each month in 2020.
Determine the number of active drivers for each month in 2020.
Compute the working percentage for each month and present the final result.
Let's explore each step in detail.
Step 1: Generate a Sequence of Months from 1 to 12
Ensure that all months from January to December are represented in the final result, even if there is no data for certain months.
We create a Common Table Expression (CTE) named Months.
The CTE recursively generates numbers from 1 to 12, each representing a month of the year.
SELECT 1 AS month: Initiates the sequence with month 1 (January).
UNION ALL SELECT month + 1 FROM Months WHERE month < 12: Recursively adds 1 to the previous month until it reaches 12 (December).
WITH Months AS (
SELECT 1 AS month
UNION ALL
SELECT month + 1
FROM Months
WHERE month < 12
)
![1645 - Hopper Company Queries II [LeetCode]](https://static.wixstatic.com/media/b635b2_8ba4ca916aaf4635807bcc6fe8082ba2~mv2.png/v1/fill/w_93,h_598,al_c,q_85,blur_3,enc_auto/b635b2_8ba4ca916aaf4635807bcc6fe8082ba2~mv2.png)
Step 2: Calculate the Cumulative Total Number of Drivers Available by Each Month in 2020
Determine how many drivers were available (i.e., had joined the platform) up to each month in 2020.
Joining Months and Drivers:
We perform a FULL JOIN between the Months CTE and the Drivers table.
Join Condition: MONTH(d.join_date) = m.month AND YEAR(d.join_date) = 2020
WHERE Condition: YEAR(join_date) IS NULL OR YEAR(join_date) <= 2020
This ensures we consider drivers who joined on or before each month in 2020 and include all months even if no drivers joined in a particular month.
Grouping and Counting Drivers per Month:
GROUP BY m.month: Groups records by each month.
COUNT(d.driver_id): Counts the number of drivers who joined in that specific month.
Calculating Cumulative Total:
SUM(COUNT(d.driver_id)) OVER (ORDER BY m.month) AS total_drivers:
The window function calculates a running total of drivers up to each month.
This gives us the cumulative number of drivers available by the end of each month.
TotalDrivers AS (
SELECT
m.month,
-- Use SUM() as a window function to get the cumulative count of drivers who joined up to that month
SUM(COALESCE(drivers_joined, 0)) OVER (ORDER BY m.month) AS total_drivers
FROM (
-- For each month, count the number of drivers who joined in that month
SELECT
m.month,
COUNT(d.driver_id) AS drivers_joined
FROM
Months m
FULL JOIN
Drivers d
ON
MONTH(d.join_date) = m.month
AND YEAR(d.join_date) = 2020
WHERE
YEAR(join_date) IS NULL
OR YEAR(join_date) <= 2020
GROUP BY
m.month
) m
)
![1645 - Hopper Company Queries II [LeetCode]](https://static.wixstatic.com/media/b635b2_e8d72da3232447798c5c97e17dd9b51f~mv2.png/v1/fill/w_60,h_161,al_c,q_85,usm_0.66_1.00_0.01,blur_2,enc_auto/b635b2_e8d72da3232447798c5c97e17dd9b51f~mv2.png)
Step 3: Determine the Number of Active Drivers for Each Month in 2020
Identify how many drivers were actively working (i.e., accepted at least one ride) in each month of 2020.
Joining Rides and AcceptedRides:
INNER JOIN between Rides and AcceptedRides on ride_id.
Ensures we only consider rides that were accepted by drivers.
Filtering for Year 2020:
WHERE YEAR(r.requested_at) = 2020: Focuses only on rides requested in 2020.
Grouping and Counting Active Drivers:
GROUP BY MONTH(r.requested_at): Groups data by each month.
COUNT(DISTINCT a.driver_id): Counts unique drivers who accepted rides in each month.
ActiveDrivers AS (
SELECT
MONTH(r.requested_at) AS month,
COUNT(DISTINCT a.driver_id) AS active_drivers
FROM
Rides r
-- Join with the AcceptedRides table to consider only rides that were accepted by a driver
INNER JOIN
AcceptedRides a
ON
r.ride_id = a.ride_id
WHERE
YEAR(r.requested_at) = 2020
GROUP BY
MONTH(r.requested_at)
)
![1645 - Hopper Company Queries II [LeetCode]](https://static.wixstatic.com/media/b635b2_42d2c9c9db1e48a9a63d49eb59a75c6b~mv2.png/v1/fill/w_62,h_81,al_c,q_85,usm_0.66_1.00_0.01,blur_2,enc_auto/b635b2_42d2c9c9db1e48a9a63d49eb59a75c6b~mv2.png)
Step 4: Compute the Working Percentage for Each Month and Present the Final Result
Calculate the working percentage for each month by comparing active drivers to total available drivers and compile the final result set.
Joining All Data Sources:
LEFT JOIN between Months and TotalDrivers on month.
LEFT JOIN between the result and ActiveDrivers on month.
LEFT JOINs ensure that all months are included, even if there are no drivers or no active drivers in a particular month.
Calculating Working Percentage:
COALESCE(a.active_drivers, 0): If active_drivers is NULL (no active drivers), defaults to 0.
COALESCE(t.total_drivers, 1): If total_drivers is NULL (no drivers available), defaults to 1 to avoid division by zero.
(active_drivers * 100.0) / total_drivers: Calculates the percentage.
ROUND(..., 2): Rounds the result to two decimal places.
Ordering Results:
ORDER BY m.month: Ensures the results are displayed from January to December.
SELECT
m.month,
-- Calculate the percentage of active drivers out of the total available drivers
ROUND(
COALESCE(a.active_drivers, 0) * 100.0 /
COALESCE(t.total_drivers, 1),
2) AS working_percentage
FROM
Months m
-- Left join with TotalDrivers to get the cumulative count of drivers up to each month
LEFT JOIN
TotalDrivers t
ON
m.month = t.month
-- Left join with ActiveDrivers to get the count of active drivers for each month
LEFT JOIN
ActiveDrivers a
ON
m.month = a.month
ORDER BY
m.month;
FinalResult:
![1645 - Hopper Company Queries II [LeetCode]](https://static.wixstatic.com/media/b635b2_7eb1f4d806654c0f9b7a56b2bd533686~mv2.png/v1/fill/w_78,h_150,al_c,q_85,usm_0.66_1.00_0.01,blur_2,enc_auto/b635b2_7eb1f4d806654c0f9b7a56b2bd533686~mv2.png)
Thank you for taking the time to explore this solution!
I hope you found the explanation and the SQL query helpful. If you have any questions, suggestions, or ideas to improve the code, feel free to reach out or open an issue.
Hi