Spilt DateTime Ranges in SQL Server





1.00/5 (1 vote)
Split date time ranges into days or possibly other ranges using CTE in SQL Server
Background
I had a case where I had to convert a date range into a set of rows. For example, I had a record saying a product discount from 1st June till 10th June. This single row needs to be transformed into 10 rows, one for each day of sales. How could I do this in a fast and scalable manner, since I had thousands and thousands of records which might result in millions? That's not all.
- There could be different types of products
- Overlapping date ranges, leading to more discounts (sum)
Split Ranges to Day Wise Range
Here, we are going to convert the date range to possible day records.
Data
DECLARE @tblDateRange TABLE (
Id INT IDENTITY(1, 1),
TypeId INT,
FromDateTime DATETIME,
ToDateTime DATETIME
)
INSERT INTO @tblDateRange (TypeId, FromDateTime, ToDateTime)
VALUES
(1, '2023-01-01 10:00:00.000', '2023-01-01 10:00:00.000'), --start and end is same
(2, '2023-02-02 00:00:00.000', '2023-02-04 23:59:59.000'), --start and end is start
--and end of a day
(3, '2023-03-05 10:00:00.000', '2023-03-06 23:59:59.000'), --start in middle of
--another date
(4, '2023-04-07 00:00:00.000', '2023-04-08 21:00:00.000'), --end in middle of
--another date
(5, '2023-05-09 11:00:00.000', '2023-05-11 11:00:00.000'), --start and end in
--middle of another date
(6, '2023-06-01 10:00:00.000', '2023-06-01 22:00:00.000'); --start and end in
--middle of same date
Split Query
In this split process, we are using recursion to populate new rows inside a CTE.
WITH
DateRanges(Id, LevelNo, [Date], DateWiseStartDateTime, DateWiseEndDateTime)
AS
(
SELECT
p.Id,
1,
CAST(p.FromDateTime AS DATE),
p.FromDateTime,
IIF(DATEADD(SECOND, -1, DATEADD(DAY, DATEDIFF(DAY, 0, p.FromDateTime) + _
1, 0)) < p.ToDateTime, DATEADD(SECOND, -1, _
DATEADD(DAY, DATEDIFF(DAY, 0, p.FromDateTime) + 1, 0)), p.ToDateTime)
FROM @tblDateRange p
UNION ALL
SELECT
c.Id,
p.LevelNo + 1,
CAST(DATEADD(DAY, DATEDIFF(DAY, 0, _
DATEADD(DAY, 1, p.DateWiseStartDateTime)), 0) AS DATE),
DATEADD(DAY, DATEDIFF(DAY, 0, DATEADD(DAY, 1, p.DateWiseStartDateTime)), 0),
IIF(c.ToDateTime < DATEADD(DAY, 1, p.DateWiseEndDateTime), _
c.ToDateTime, DATEADD(DAY, 1, p.DateWiseEndDateTime))
FROM @tblDateRange c
JOIN DateRanges p ON c.Id = p.Id
WHERE DATEADD(DAY, DATEDIFF(DAY, 0, _
DATEADD(DAY, 1, p.DateWiseStartDateTime)), 0) < _
c.ToDateTime --date wise start datetime < range to datetime
)
SELECT
*
FROM DateRanges
ORDER BY Id, LevelNo
OPTION (MAXRECURSION 30000) -- default 100, max 32767, no limit 0
By default, it is going to support 100 recursive calls, and we don't need to use OPTION (MAXRECURSION 30000)
. If it requires more than that, we need to use that OPTION
clause where the max value can be 32767. Use 0 for no limit.
Join Split Rows to Actual Rows
Let's join the split rows with actual data. Here is the new select
query in the CTE:
SELECT
d.*,
r.LevelNo,
r.Date,
r.DateWiseStartDateTime,
r.DateWiseEndDateTime
FROM @tblDateRange d
JOIN DateRanges r ON d.Id = r.Id
ORDER BY d.Id, r.LevelNo
OPTION (MAXRECURSION 30000) -- default 100, max 32767, no limit 0
Limitations
Imagine we have ranges for 10 or 20 years. For each year, it's going to generate about 356 rows each year. The end query will be a bit slow. Another way is splitting them into small ranges based on overlapping and nonoverlapping ranges.
Split Ranges to Overlapping/Related Ranges
Here, we are going to convert the date range to all possible small ranges. Here is an idea:
1 12
|-------|-----------------------------|---------|-----------|--------|
2 6
|-----------------------------|
8 10
|-----------|
Input
Range Part
1-12 1
2-6 1
8-10 1
Expected Result
NewRange SUM of Parts
1-2 1
3-6 2
7-8 1
9-10 2
11-12 1
As we can see, here overlapping and nonoverlapping ranges are available.
Data
DECLARE @tableDataRanges TABLE(
Id INT IDENTITY(1, 1),
TypeId INT,
FromDateTime DATETIME,
ToDateTime DATETIME,
Points INTEGER
);
INSERT INTO @tableDataRanges (TypeId, FromDateTime, ToDateTime, Points)
VALUES
(1, '2023-01-13', '2023-01-20 23:59:59', 20), --no overlapping, no split
(1, '2023-02-10', '2023-02-20 23:59:59', 10), --15-20 overlapping,
--should be splited in two
(1, '2023-02-15', '2023-02-25 23:59:59', 10), --15-20 overlapping,
--should be splited in two
(2, '2023-02-10 12:00:00', '2023-02-20 23:00:00', 20), --15-20 overlapping,
--should be splited in two
(2, '2023-02-15 06:00:00', '2023-02-25 23:59:59', 20), --15-20 overlapping,
--should be splited in two
(3, '2023-02-10', '2023-02-20 23:59:59', 30), --has overlapping, but no split
(3, '2023-02-05', '2023-02-25 23:59:59', 20); --10-20 overlapping,
--should be splited in three
Split Query
WITH
PosibleStartDateTimes
AS
(
SELECT FromDateTime AS StartDateTime, TypeId, 1 AS Cover
FROM @tableDataRanges
UNION ALL
SELECT DATEADD(SECOND, 1, ToDateTime) AS StartDateTime, TypeId, -1 AS Cover
FROM @tableDataRanges
)
,AggregatedStartDateTimes
AS
(
SELECT StartDateTime, TypeId, SUM(Cover) AS Cover
FROM PosibleStartDateTimes
GROUP BY StartDateTime, TypeId
)
,StartDateTimeToRanges
AS
(
SELECT
StartDateTime AS FromDateTime,
TypeId,
LEAD(StartDateTime) OVER (PARTITION BY TypeId ORDER BY StartDateTime) _
AS ToDateTime,
SUM(Cover) OVER (PARTITION BY TypeId ORDER BY StartDateTime) AS NumberOfParts
FROM AggregatedStartDateTimes
)
,PossibleRanges
AS
(
SELECT
FromDateTime,
DATEADD(SECOND, -1, ToDateTime) AS ToDateTime,
TypeId,
NumberOfParts
FROM StartDateTimeToRanges
WHERE NumberOfParts > 0
)
SELECT
*
FROM PossibleRanges
--WHERE FromDateTime < ToDateTime --optional
ORDER BY TypeId, FromDateTime;
We have different TypeId
data in the table. That's why in the OVER
clause, we are using PARTITION BY TypeId
partitioning by row type.
Join Split Rows to Actual Rows
Let's join the split range rows with actual data. Here is the new select
query in the CTE:
SELECT
d.*,
r.NumberOfParts,
r.FromDateTime AS RangeFromDateTime,
r.ToDateTime AS RangeToDateTime
FROM @tableDataRanges d
JOIN PossibleRanges r ON d.TypeId = r.TypeId _
AND (d.FromDateTime <= r.FromDateTime AND d.ToDateTime >= r.ToDateTime)
ORDER BY d.Id, r.FromDateTime;
Others
If there is no type difference, simply we can do a query without using PARTITION BY
:
DECLARE @tableRanges TABLE(
Id INT IDENTITY(1, 1),
FromDateTime DATETIME,
ToDateTime DATETIME,
Points INTEGER
);
INSERT INTO @tableRanges (FromDateTime, ToDateTime, Points)
VALUES
('2018-01-01', '2018-01-31 23:59:59', 80),
('2018-01-07', '2018-01-10 23:59:59', 10),
('2018-01-07', '2018-01-31 23:59:59', 10),
('2018-01-11', '2018-01-31 23:59:59', 5),
('2018-01-25', '2018-01-27 23:59:59', 5),
('2018-02-02', '2018-02-23 23:59:59', 100);
WITH
PosibleStartDateTimes
AS
(
SELECT FromDateTime AS StartDateTime, Points, 1 AS Cover
FROM @tableRanges
UNION ALL
SELECT DATEADD(SECOND, 1, ToDateTime) AS StartDateTime, -1 * Points, -1 AS Cover
FROM @tableRanges
)
,AggregatedStartDateTimes
AS
(
SELECT StartDateTime, SUM(Points) AS Points, SUM(Cover) AS Cover
FROM PosibleStartDateTimes
GROUP BY StartDateTime
)
,StartDateTimeToRanges
AS
(
SELECT
StartDateTime AS FromDateTime,
LEAD(StartDateTime) OVER (ORDER BY StartDateTime) AS ToDateTime,
SUM(Points) OVER (ORDER BY StartDateTime) AS Points,
SUM(Cover) OVER (ORDER BY StartDateTime) AS NumberOfParts
FROM AggregatedStartDateTimes
)
,PossibleRanges
AS
(
SELECT
FromDateTime,
DATEADD(SECOND, -1, ToDateTime) AS ToDateTime,
Points,
NumberOfParts
FROM StartDateTimeToRanges
WHERE NumberOfParts > 0
)
SELECT
*
FROM PossibleRanges
--WHERE FromDateTime < ToDateTime --optional
ORDER BY FromDateTime;
References
- Split overlapping ranges into all unique ranges
- Get distinct consecutive date ranges from overlapping date ranges
- How to Expand a Range of Dates into Rows using a SQL Server Numbers Table
Conclusion
Nonrecursive PARTITION BY
query was the fastest of them all
Code/Environment
Find the same code .sql file as the zip, if things not working as expected let me know. Tested code in Microsoft SQL Server 2019 (RTM) - 15.0.2000.5 (X64).
History
- 1st July, 2023: Initial version