Given a list of dates, group them into successive ranges of K days, measured from the earliest date in the list. Each group collects the dates that fall within one K-day range, starting from the smallest date.
Input : test_list = [datetime(2020, 1, 4), datetime(2019, 12, 30), datetime(2020, 1, 7), datetime(2019, 12, 27), datetime(2020, 1, 20), datetime(2020, 1, 10)], K = 10
Output : [(0, [datetime.datetime(2019, 12, 27, 0, 0), datetime.datetime(2019, 12, 30, 0, 0), datetime.datetime(2020, 1, 4, 0, 0)]), (1, [datetime.datetime(2020, 1, 7, 0, 0), datetime.datetime(2020, 1, 10, 0, 0)]), (2, [datetime.datetime(2020, 1, 20, 0, 0)])]
Explanation : 27 Dec to 4 Jan fall in the same group because each is fewer than 10 days after the earliest date (27 Dec). The remaining dates are grouped the same way, into successive 10-day ranges.
Input : test_list = [datetime(2020, 1, 4), datetime(2019, 12, 30), datetime(2020, 1, 7), datetime(2019, 12, 27), datetime(2020, 1, 20), datetime(2020, 1, 10)], K = 14
Output : [(0, [datetime.datetime(2019, 12, 27, 0, 0), datetime.datetime(2019, 12, 30, 0, 0), datetime.datetime(2020, 1, 4, 0, 0), datetime.datetime(2020, 1, 7, 0, 0)]), (1, [datetime.datetime(2020, 1, 10, 0, 0), datetime.datetime(2020, 1, 20, 0, 0)])]
Explanation : 27 Dec to 7 Jan fall in the same group because each is fewer than 14 days after the earliest date (27 Dec). The remaining dates are grouped the same way, into successive 14-day ranges.
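The arithmetic behind the first example can be checked with a short sketch. It assumes the group number is the count of whole K-day spans elapsed since the earliest date; the names `offsets`, `elapsed` and `bucket` are illustrative only.

```python
from datetime import datetime

dates = [datetime(2020, 1, 4), datetime(2019, 12, 30), datetime(2020, 1, 7),
         datetime(2019, 12, 27), datetime(2020, 1, 20), datetime(2020, 1, 10)]
K = 10

min_date = min(dates)  # the earliest date anchors every range

# Map each date to (days since the earliest date, group number).
offsets = {}
for d in sorted(dates):
    elapsed = (d - min_date).days
    offsets[d.strftime("%Y-%m-%d")] = (elapsed, elapsed // K)

for day, (elapsed, bucket) in offsets.items():
    print(day, elapsed, bucket)
# 2019-12-27 0 0
# 2019-12-30 3 0
# 2020-01-04 8 0
# 2020-01-07 11 1
# 2020-01-10 14 1
# 2020-01-20 24 2
```

Dates with the same group number end up in the same group, which reproduces the three groups shown for K = 10.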
Method #1: Using groupby() + sort()
Here we sort the dates and then group them with groupby(), using a key function that maps each date to the number of whole K-day ranges elapsed since the earliest date.
Python3
# Python3 code to demonstrate working of
# Group dates in K ranges
# Using groupby() + sort()
from itertools import groupby
from datetime import datetime

# initializing list
test_list = [datetime(2020, 1, 4), datetime(2019, 12, 30), datetime(2020, 1, 7),
             datetime(2019, 12, 27), datetime(2020, 1, 20), datetime(2020, 1, 10)]

# printing original list
print("The original list is : " + str(test_list))

# initializing K
K = 7

# initializing start date
min_date = min(test_list)


# utility function: number of whole K-day ranges between min_date and date
def group_util(date):
    return (date - min_date).days // K


# sorting before grouping, since groupby() only merges adjacent items
test_list.sort()

# grouping by utility function to group by K days
temp = []
for key, val in groupby(test_list, key=group_util):
    temp.append((key, list(val)))

# using strftime to convert to a user-friendly format
res = []
for sub in temp:
    intr = []
    for ele in sub[1]:
        intr.append(ele.strftime("%Y/%m/%d"))
    res.append((sub[0], intr))

# printing result
print("Grouped dates : " + str(res))

Output:
The original list is : [datetime.datetime(2020, 1, 4, 0, 0), datetime.datetime(2019, 12, 30, 0, 0), datetime.datetime(2020, 1, 7, 0, 0), datetime.datetime(2019, 12, 27, 0, 0), datetime.datetime(2020, 1, 20, 0, 0), datetime.datetime(2020, 1, 10, 0, 0)]
Grouped dates : [(0, ['2019/12/27', '2019/12/30']), (1, ['2020/01/04', '2020/01/07']), (2, ['2020/01/10']), (3, ['2020/01/20'])]
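The sort step matters because itertools.groupby() starts a new group every time the key changes, so equal keys are merged only when they are adjacent. The sketch below illustrates this with the day offsets of the sample dates (for K = 7) instead of datetime objects; the `offsets` list and the `bucket` helper are hypothetical names used for illustration.

```python
from itertools import groupby

# Days since the earliest date (27 Dec 2019), in the original input order.
offsets = [8, 3, 11, 0, 24, 14]
K = 7


def bucket(offset):
    # Same idea as the key function in the listing: whole K-day ranges elapsed.
    return offset // K


# Without sorting, the same key shows up in several separate groups.
unsorted_keys = [key for key, _ in groupby(offsets, key=bucket)]

# After sorting, each key appears exactly once.
sorted_keys = [key for key, _ in groupby(sorted(offsets), key=bucket)]

print(unsorted_keys)  # [1, 0, 1, 0, 3, 2]
print(sorted_keys)    # [0, 1, 2, 3]
```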
Method #2: Using sort and iteration
Approach
1. Sort the list of dates in ascending order.
2. Initialize a dictionary that maps each group number to its list of dates.
3. Initialize variables to keep track of the current group number and the start date of the current group.
4. Iterate through the sorted list of dates, comparing the current date with the start date of the current group.
5. If the difference between the current date and the start date is less than or equal to K days, add the current date to the current group.
6. If the difference is greater than K days, start a new group with the current date as its start date and add the current date to it.
7. Convert the dictionary to a list of (group number, dates) tuples and return it.
Algorithm
1. Sort the given list of dates in ascending order.
2. Initialize an empty dictionary to store the groups of dates.
3. For each date in the sorted list, calculate the number of days since the start date of the current group by subtracting the two datetime objects and reading the days attribute of the resulting timedelta.
4. If the number of days is greater than K, start a new group with this date as its start date. Otherwise, add the date to the current group.
5. Convert the dictionary into a list of tuples and return the result.
Python3
from datetime import datetime
from collections import defaultdict


def group_dates(dates, K):
    # group number -> list of dates in that group
    groups = defaultdict(list)
    dates.sort()
    group_num = 0
    start_date = None
    for date in dates:
        if start_date is None:
            # the first date opens group 0
            start_date = date
        else:
            diff = (date - start_date).days
            if diff > K:
                # more than K days past the group start: open a new group
                group_num += 1
                start_date = date
        groups[group_num].append(date)
    return list(groups.items())


dates = [datetime(2020, 1, 4), datetime(2019, 12, 30), datetime(2020, 1, 7),
         datetime(2019, 12, 27), datetime(2020, 1, 20), datetime(2020, 1, 10)]
K = 7
print(group_dates(dates, K))

Output:
[(0, [datetime.datetime(2019, 12, 27, 0, 0), datetime.datetime(2019, 12, 30, 0, 0)]), (1, [datetime.datetime(2020, 1, 4, 0, 0), datetime.datetime(2020, 1, 7, 0, 0), datetime.datetime(2020, 1, 10, 0, 0)]), (2, [datetime.datetime(2020, 1, 20, 0, 0)])]
Time complexity: O(n log n) – sorting the list of dates takes O(n log n) time, where n is the number of dates. The loop that iterates through the sorted list of dates takes O(n) time.
Auxiliary Space: O(n) – we store the groups of dates in a dictionary that can potentially contain n elements.
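For K = 7 this method returns three groups, while the groupby() method returns four for the same dates. The difference comes from where each range starts: the groupby() key measures every date from the earliest date in the list, whereas the loop here restarts the count at the first date of each new group and accepts a difference of up to and including K days. The sketch below puts the two strategies side by side; `fixed_buckets` and `anchored_windows` are hypothetical helper names, not functions from the listings.

```python
from datetime import datetime
from itertools import groupby


def fixed_buckets(dates, K):
    """Ranges measured from the earliest date, as in the groupby() method."""
    ordered = sorted(dates)
    if not ordered:
        return []
    first = ordered[0]
    return [list(group) for _, group in
            groupby(ordered, key=lambda d: (d - first).days // K)]


def anchored_windows(dates, K):
    """Ranges measured from the first date of each group, as in the loop method."""
    groups = []
    start = None
    for d in sorted(dates):
        if start is None or (d - start).days > K:
            groups.append([])  # open a new group anchored at this date
            start = d
        groups[-1].append(d)
    return groups


dates = [datetime(2020, 1, 4), datetime(2019, 12, 30), datetime(2020, 1, 7),
         datetime(2019, 12, 27), datetime(2020, 1, 20), datetime(2020, 1, 10)]

by_bucket = fixed_buckets(dates, 7)
by_window = anchored_windows(dates, 7)

print([len(g) for g in by_bucket])  # [2, 2, 1, 1]
print([len(g) for g in by_window])  # [2, 3, 1]
```

Which behaviour is wanted depends on whether the ranges should line up with a fixed calendar grid or follow the data.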
Method #3: Using a while loop to iterate over the dates and create groups based on the K value
Approach:
- Sort the dates in ascending order.
- Initialize an empty list called “groups”.
- Set a variable called “current_group” to 0.
- Set a variable called “group_start_date” to the first date in the sorted list.
- Set a variable called “group_end_date” to None.
- While there are still dates left in the list:
  - Get the next date in the list.
  - If the difference between the current date and the group start date is less than or equal to K:
    - Set the group end date to the current date.
  - Else:
    - Append the current group (i.e., the dates between the group start date and the group end date) to the “groups” list.
    - Set the group start date to the current date.
    - Set the group end date to None.
    - Increment the current group number.
- Append the final group to the “groups” list.
- Return the “groups” list.
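The steps above can be sketched with an explicit while loop. This is one possible reading rather than a definitive implementation: it assumes each entry of “groups” is a (group number, dates) tuple, and it collects the dates of the current group in a list named `members` instead of tracking a group end date. The function name `group_dates_while` and the index `i` are also illustrative.

```python
from datetime import datetime


def group_dates_while(dates, K):
    ordered = sorted(dates)
    groups = []
    if not ordered:
        return groups

    current_group = 0
    group_start_date = ordered[0]
    members = []  # dates collected for the current group (assumed helper)

    i = 0
    while i < len(ordered):
        date = ordered[i]
        if (date - group_start_date).days > K:
            # Close the current group and start a new one at this date.
            groups.append((current_group, members))
            current_group += 1
            group_start_date = date
            members = []
        members.append(date)
        i += 1

    # Append the final group.
    groups.append((current_group, members))
    return groups


dates = [datetime(2020, 1, 4), datetime(2019, 12, 30), datetime(2020, 1, 7),
         datetime(2019, 12, 27), datetime(2020, 1, 20), datetime(2020, 1, 10)]
result = group_dates_while(dates, 7)
print([(num, len(group)) for num, group in result])  # [(0, 2), (1, 3), (2, 1)]
```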
Python3
from collections import defaultdict
from datetime import datetime


def group_dates(dates, K):
    # group number (as a string) -> list of dates in that group
    groups = defaultdict(list)
    dates.sort()
    group_num = 0
    start_date = None
    # a for loop over the sorted dates visits them in the same order
    # as the while loop described in the approach
    for date in dates:
        if start_date is None:
            start_date = date
        else:
            diff = (date - start_date).days
            if diff > K:
                # more than K days past the group start: open a new group
                group_num += 1
                start_date = date
        groups[str(group_num)].append(date)
    print(groups)
    return list(groups.items())


# input
dates = [datetime(2020, 1, 4), datetime(2019, 12, 30), datetime(2020, 1, 7),
         datetime(2019, 12, 27), datetime(2020, 1, 20), datetime(2020, 1, 10)]
K = 7
print(group_dates(dates, K))

Output:
defaultdict(<class 'list'>, {'0': [datetime.datetime(2019, 12, 27, 0, 0), datetime.datetime(2019, 12, 30, 0, 0)], '1': [datetime.datetime(2020, 1, 4, 0, 0), datetime.datetime(2020, 1, 7, 0, 0), datetime.datetime(2020, 1, 10, 0, 0)], '2': [datetime.datetime(2020, 1, 20, 0, 0)]})
[('0', [datetime.datetime(2019, 12, 27, 0, 0), datetime.datetime(2019, 12, 30, 0, 0)]), ('1', [datetime.datetime(2020, 1, 4, 0, 0), datetime.datetime(2020, 1, 7, 0, 0), datetime.datetime(2020, 1, 10, 0, 0)]), ('2', [datetime.datetime(2020, 1, 20, 0, 0)])]
Time complexity: O(n log n), where n is the number of dates in the input list. Sorting dominates; the single pass over the sorted dates takes O(n).
Auxiliary space: O(n), since every date is stored in one of the groups.