Saturday, September 21, 2024
Google search engine
HomeData Modelling & AI#BackToBasics: DATEFROMPARTS()

#BackToBasics: DATEFROMPARTS()

Just before 2015 drew to a close, I challenged myself – in response to a tweet from Tim Ford – to publish an entry level post on the first Wednesday of every month.

Well, today’s July 6th, and I apologize for coming in at the 11th hour, but I still wanted to talk briefly about a new set of functions introduced in SQL Server 2012:

 

We’ll focus on just DATEFROMPARTS() for now, to keep things simple.

The point of these functions is to make it easier to construct a date, or datetime, or datetime2 variable, when you know the individual parts. DATEFROMPARTS() takes three arguments: year, month, and day, and returns a date value. So, for example, SELECT DATEFROMPARTS(2016,7,6); would yield the date 2016-07-06.

When would you use this? One example is something I often see in reporting stored procedures, where they accept the year and month as individual parameters:

CREATE PROCEDURE dbo.MonthlyReport

  @Year  int,

  @Month int

 

This is often because a web form or application has a dropdown for the year and a separate dropdown for the month. Which is okay, it just leads to passing those integers in directly, and then the query has to fumble around with turning those into a usable range query.

So then what? Well, I see all kinds of fun ways people use these input parameters in the where clause:

-- applying year/month functions to the date column:

WHERE YEAR(DateColumn) = @Year

  AND MONTH(DateColumn) = @Month;

-- or

WHERE DATEPART(YEAR, DateColumn) = @Year

  AND DATEPART(MONTH, DateColumn) = @Month;

-- converting both sides to a string:

WHERE CONVERT(char(6), DateColumn, 112)
= CONVERT(char(4), @Year) + RIGHT('0' + CONVERT(varchar(2), @Month), 2);

-- or this more elaborate form:

WHERE CONVERT(char(6), DateColumn, 112) + '01'
= CONVERT(char(4), @Year) + RIGHT('0' + CONVERT(varchar(2), @Month), 2) + '01';

-- or even more elaborate, making the comparison between dates:

WHERE CONVERT(date, CONVERT(char(6), DateColumn, 112) + '01')
= CONVERT(date, CONVERT(char(4), @Year)
+ RIGHT('0' + CONVERT(varchar(2), @Month), 2) + '01');

-- or converting to an int and then to a string:

WHERE CONVERT(char(6), DateColumn, 112)
= CONVERT(char(6), @Year * 100 + @Month);

-- or by converting column to string and back to date,
-- and funky math with 0-date on the parameters:

WHERE CONVERT(date, CONVERT(char(6), DateColumn, 112) + '01')
= DATEADD(MONTH, @Month-1, DATEADD(YEAR, @Year-1900, 0));

-- or a variety of other ways - yes, I've seen all of these and more!
-- please don't get me started on CONVERT(float, FLOOR/CEILING(...)) varieties

 

 

To see a more elaborate script you can copy to demonstrate the output of these expressions, click here.

-- this acts as our column, a random day in July:
DECLARE @DateColumn date = '20160706';

SELECT YEAR(@DateColumn), -- 2016
MONTH(@DateColumn), -- 7
DATEPART(YEAR,@DateColumn), -- 2016
DATEPART(MONTH,@DateColumn); -- 7

-- converting both sides to a string:
SELECT CONVERT(char(6), @DateColumn, 112); -- '201607'
SELECT CONVERT(char(4), @Year)
+ RIGHT('0' + CONVERT(varchar(2), @Month), 2); -- '201607'

-- or this more elaborate form:
SELECT CONVERT(char(6), @DateColumn, 112) + '01'; -- '20160701'
SELECT CONVERT(char(4), @Year)
+ RIGHT('0' + CONVERT(varchar(2), @Month), 2) + '01'; -- '20160701'

-- or even more elaborate:
SELECT CONVERT(date, CONVERT(char(6), @DateColumn, 112) + '01'); -- 2016-07-01
SELECT CONVERT(date, CONVERT(char(4), @Year)
+ RIGHT('0' + CONVERT(varchar(2), @Month), 2) + '01'); -- 2016-07-01

-- or converting to an int and then to a string:
SELECT CONVERT(char(6), @DateColumn, 112); -- '201607'
SELECT CONVERT(char(6), @Year * 100 + @Month); -- '201607'

-- or by converting column to string and back to date,
-- and funky math with 0-date on the parameters:
SELECT CONVERT(date, CONVERT(char(6), @DateColumn, 112) + '01'); -- 2016-07-01
SELECT CONVERT(date, DATEADD(MONTH, @Month-1,
DATEADD(YEAR, @Year-1900, 0))); -- 2016-07-01

 

Now, these all get the right answer, so what’s the problem? They’re ugly. And, more importantly, non-sargable. Which means they won’t be able to use an index on DateColumn or take advantage of partition elimination; SQL Server will have to scan the entire table or index and apply those calculations to every single row in order to perform any comparisons.

The way around this is to use an open-ended date range (see this post to understand why, and this post to see why we don’t use BETWEEN).

But how do we do that with parameters like @Year and @Month? Before SQL Server 2012, you’d have a mix of things from above. The following is sargable, but still has that ugly feeling:

SELECT ...

WHERE DateColumn >= CONVERT(date, CONVERT(char(4), @year)

  + RIGHT('0' + CONVERT(varchar(2), @Month), 2) + '01')

  -- replace this expression with any method from above

  AND DateColumn < DATEADD(month, 1, CONVERT(date, CONVERT(char(4), @year)

  + RIGHT('0' + CONVERT(varchar(2), @Month), 2) + '01'));

 

And you could do it a little more cleanly (my preference) by using a local variable, so you only have the ugly code once:

DECLARE @Start date = CONVERT(char(4), @year)

  + RIGHT('0' + CONVERT(varchar(2), @Month), 2) + '01';

  -- replace this expression with any method from above

SELECT ...
 WHERE DateColumn >= @Start
 AND DateColumn < DATEADD(MONTH, 1, @Start);

 

 

But that’s still pretty ugly. The beauty of functions like DATEFROMPARTS() is that I can turn this pair of parameters into a valid date value (the first of the given month) in a very simple and clean way, and still get the same performance as the above method:

DECLARE @Start date = DATEFROMPARTS(@Year, @Month, 1);

SELECT ...
 WHERE DateColumn >= @Start
 AND DateColumn < DATEADD(MONTH, 1, @Start);

 

 

(And this works similarly if I only accept the year as a parameter, or all three parts, or the parts that make up the input parameters to one of the other functions listed above.)

The Pudding

Now, to see why NOT being sargable is bad, I threw together this small repro, which only requires the presence of any copy of AdventureWorks, and then manually determining the date ranges present in Sales.SalesOrderHeader:

-- create a simple table with an index

-- on a date column:

CREATE TABLE dbo.DateEntries
(
id int IDENTITY(1,1) PRIMARY KEY,
DateColumn date
);

CREATE INDEX SeekMe ON dbo.DateEntries(DateColumn);

-- insert 31,645 rows (most copies of AdventureWorks):

INSERT dbo.DateEntries(DateColumn)
SELECT OrderDate
FROM AdventureWorks.Sales.SalesOrderHeader;
GO

-- now create a stored procedure that will
-- attempt a reporting query using the various
-- methods presented above:

CREATE PROCEDURE dbo.MonthlyReport
@Year int,
@Month int
AS
BEGIN
SET NOCOUNT ON;

DECLARE @Start_Convert date = CONVERT(char(4), @year)
+ RIGHT('0' + CONVERT(varchar(2), @Month), 2) + '01';

DECLARE @Start_DateFromParts date = DATEFROMPARTS(@Year, @Month, 1);

SELECT /* year/month */ DateColumn
FROM dbo.DateEntries
WHERE YEAR(DateColumn) = @Year
AND MONTH(DateColumn) = @Month;

SELECT /* datepart(year/month) */ DateColumn
FROM dbo.DateEntries
WHERE DATEPART(YEAR, DateColumn) = @Year
AND DATEPART(MONTH, DateColumn) = @Month;

SELECT /* both sides to string */ DateColumn
FROM dbo.DateEntries
WHERE CONVERT(char(6), DateColumn, 112)
= CONVERT(char(4), @Year) + RIGHT('0'
+ CONVERT(varchar(2), @Month), 2);

SELECT /* more elaborate both string */ DateColumn
FROM dbo.DateEntries
WHERE CONVERT(char(6), DateColumn, 112) + '01'
= CONVERT(char(4), @Year) + RIGHT('0'
+ CONVERT(varchar(2), @Month), 2) + '01';

SELECT /* even more elaborate */ DateColumn
FROM dbo.DateEntries
WHERE CONVERT(date, CONVERT(char(6), DateColumn, 112) + '01')
= CONVERT(date, CONVERT(char(4), @Year)
+ RIGHT('0' + CONVERT(varchar(2), @Month), 2) + '01');

SELECT /* to int then string */ DateColumn
FROM dbo.DateEntries
WHERE CONVERT(char(6), DateColumn, 112)
= CONVERT(char(6), @Year * 100 + @Month);

SELECT /* funky math */ DateColumn
FROM dbo.DateEntries
WHERE CONVERT(date, CONVERT(char(6), DateColumn, 112) + '01')
= DATEADD(MONTH, @Month-1, DATEADD(YEAR, @Year-1900, 0));

SELECT /* using @Start_Convert */ DateColumn
FROM dbo.DateEntries
WHERE DateColumn >= @Start_Convert
AND DateColumn < DATEADD(MONTH, 1, @Start_Convert);

SELECT /* using @Start_DateFromParts */ DateColumn
FROM dbo.DateEntries
WHERE DateColumn >= @Start_DateFromParts
AND DateColumn < DATEADD(MONTH, 1, @Start_DateFromParts);
END
GO

-- now, to figure out what ranges we have
-- note that your range may be different
-- depending on which copy of AdventureWorks

SELECT
MIN(DateColumn), -- 2011-05-31
MAX(DateColumn) -- 2014-06-30
FROM dbo.DateEntries;
GO

-- let's pick a random month in that range (I'll
-- go with July 2012), and examine the output

-- remember, the range you need may be different

EXEC dbo.MonthlyReport @Year = 2012, @Month = 7;

 

 

We can run this last stored procedure call on its own in SQL Sentry Plan Explorer, and look at the results grid and the graphical plans. In the results grid, we can see how all of the approaches except for the final two resulted in an index scan and a much higher number of reads:

Scans vs. Seeks in results gridResults grid, showing scans vs. seeks

And in the graphical plan, we can see the seek on the left (representing the two open-ended range queries), and the scan on the right (representing the other approaches). Note also the warning you get about residual I/O, just a further indicator that you performed a scan when you didn’t have to – and this meant plenty of wasted I/O:

Two plans - seek on the left, scan on the rightGraphical plan, showing more expensive scan and residual warnings on the right

Conclusion

If you’re currently performing any of these dirty tricks to get valid date ranges out of individual numeric or string date components, consider using the *FROMPARTS() family instead. It will make your code a little cleaner, and hopefully will encourage more open-ended range queries, too.

Aaron (@AaronBertrand) is a Data Platform MVP with industry experience dating back to Classic ASP and SQL Server 6.5. He is editor-in-chief of the performance-related blog, SQLPerformance.com.

Aaron’s blog focuses on T-SQL bad habits and best practices, as well as coverage of updates and new features in Plan Explorer, SentryOne, and SQL Server.

RELATED ARTICLES

Most Popular

Recent Comments