
Bad habits: Using (certain) metadata “helper” functions

[See an index of all bad habits / best practices posts]

For a long time, I was a big proponent of all the metadata helper functions. These are the little one-liners, like OBJECT_ID(), that let you derive properties of an object, usually based on another property. While very handy in many cases, they can be problematic in others – especially if you are trying to use NOLOCK against the catalog views / DMVs. Let’s say you want to be able to see tables that have been created, even if the transaction hasn’t yet been committed:

BEGIN TRANSACTION;
CREATE TABLE dbo.foo(id INT);

In another window, run this, which works fine:

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT name, [object_id] FROM sys.objects WHERE name = N'foo';

However, try this instead:

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT OBJECT_ID(N'dbo.foo'); -- blocked

You can see why; in the window where the table was created, run the following query:

SELECT request_mode
  FROM sys.dm_tran_locks
  WHERE resource_database_id = DB_ID()
  AND resource_associated_entity_id = OBJECT_ID(N'dbo.foo');

You’ll see that the session_id that started the transaction has a Sch-M lock on the table. And while the read uncommitted query against sys.objects does not require any locks (even if you wrap it in a transaction), the call that goes through OBJECT_ID() actually requests a Sch-S lock, which immediately goes into the WAIT state – hence the block, which will persist until the original transaction is either committed or rolled back. Which reminds me: don’t forget to roll back that transaction you started above.
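
That is, back in the first window:

-- in the window where dbo.foo was created:
ROLLBACK TRANSACTION;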

Adam Machanic filed a Connect item for this issue way back in 2009, but it was closed as “Won’t Fix.”

So, this is something to keep in mind if you are writing diagnostic queries of any kind and want to use read uncommitted to minimize both the impact of your queries on other activity on the system, and the impact of that activity on your queries. It is highly unlikely these types of queries will actually cause problems – unless you’re performing very aggressive queries against things like sys.dm_db_index_physical_stats() – but you can certainly be a victim in a lot of scenarios.

Another issue that may come up is the behavior when permissions prevent a user from accessing the metadata. Consider a case where a user executes a stored procedure that in turn calls another procedure. Sometimes you may want home-grown auditing set up without worrying about giving every single user explicit access to the chain of things the auditing needs to touch. However, it may not work out as you expect:

USE [master];
GO
CREATE LOGIN peon WITH PASSWORD = N'peon', CHECK_POLICY = OFF;
GO
USE some_database;
GO
CREATE USER peon FOR LOGIN peon;
GO
CREATE PROCEDURE dbo.NestedProcedure
AS
BEGIN
  SET NOCOUNT ON;
 
  SELECT helper = OBJECT_SCHEMA_NAME(@@PROCID);
 
  SELECT [catalog] = s.name
    FROM sys.schemas AS s
    INNER JOIN sys.objects AS o
    ON s.[schema_id] = o.[schema_id]
    WHERE o.[object_id] = @@PROCID;
END
GO
CREATE PROCEDURE dbo.WrapperProcedure
AS
BEGIN
  SET NOCOUNT ON;
  EXEC dbo.NestedProcedure;
END
GO
GRANT EXECUTE ON dbo.WrapperProcedure TO peon;
GO
EXECUTE AS USER = N'peon';
EXEC dbo.WrapperProcedure;
REVERT;

In this case, the call to OBJECT_SCHEMA_NAME() yields NULL, while the query against sys.schemas / sys.objects yields an empty result set:

Results from nested procedure

In both cases the name has been shielded from the user (yay security!), but it is important to understand the difference in behavior, depending on what you are doing with the output, how many joins are involved in the queries, etc. A workaround in this case would be to pass @@PROCID to a utility stored procedure that executes as a higher-privileged user, and perform the name resolution there – certainly a simpler solution than granting blanket metadata permissions to all users.
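
A minimal sketch of that workaround, assuming dbo owns both procedures (the resolver’s name and parameters are hypothetical, not from any established convention):

CREATE PROCEDURE dbo.ResolveObjectName
  @object_id INT,
  @name      NVARCHAR(520) OUTPUT
WITH EXECUTE AS OWNER -- name resolution runs under the owner's permissions, not the caller's
AS
BEGIN
  SET NOCOUNT ON;
  SELECT @name = QUOTENAME(s.name) + N'.' + QUOTENAME(o.name)
    FROM sys.objects AS o
    INNER JOIN sys.schemas AS s
      ON o.[schema_id] = s.[schema_id]
    WHERE o.[object_id] = @object_id;
END
GO

Inside dbo.NestedProcedure you would then call EXEC dbo.ResolveObjectName @object_id = @@PROCID, @name = @resolved OUTPUT; and use @resolved for the audit trail – peon never needs VIEW DEFINITION on anything.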

Another case that came up recently from Kris Gruttemeyer is when investigating data from DMVs which cross database boundaries. He had a query like this:

SELECT [SP Name] = OBJECT_NAME(t.objectid),
   [Number of Executions] = SUM(s.execution_count)
FROM sys.dm_exec_procedure_stats AS s
CROSS APPLY sys.dm_exec_sql_text(s.sql_handle) AS t
WHERE OBJECT_NAME(t.objectid) = N'MySPName'
GROUP BY OBJECT_NAME(t.objectid);

This returned executions of stored procedures that he was sure weren’t being called – and he validated that via trace. The reason? He was running the stats query from the right database, but had procedures in other databases that *happened* to have been assigned the same object_id. While many people assume the object_id of a user object will be unique across an instance, it is only unique within a database; on my system, I was able to reproduce this scenario quite easily:

Same object_id, different names
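
If you want to check for such collisions yourself, a query along these lines (using the db1/db2 names from this example) will list them:

SELECT [db1 object] = o1.name,
   [db2 object] = o2.name,
   o1.[object_id]
FROM db1.sys.objects AS o1
INNER JOIN db2.sys.objects AS o2
ON o1.[object_id] = o2.[object_id];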

Then I executed only the stored procedure in db1:

EXEC db1.dbo.db1_foo;
GO 100

Now, I imagined that I was actually looking for executions of the other procedure, db2.dbo.db2_bar, so I ran his query above, changing only the filter on the procedure name:

USE db2;
GO
SELECT [SP Name] = OBJECT_NAME(t.objectid),
   [Number of Executions] = SUM(s.execution_count)
FROM sys.dm_exec_procedure_stats AS s
CROSS APPLY sys.dm_exec_sql_text(s.sql_handle) AS t
WHERE OBJECT_NAME(t.objectid) = N'db2_bar'
GROUP BY OBJECT_NAME(t.objectid);

One would expect an empty set, but you actually get evidence of the executions of the other stored procedure, in the other database, that just happens to have the same object_id:

This is not the result you were looking for

Joining to the catalog views on objectid/object_id would still cause the same problem: it would yield the local object name, even though the local procedure may never have been executed:

SELECT [SP Name] = p.name,
   [Number of Executions] = SUM(s.execution_count)
FROM sys.dm_exec_procedure_stats AS s  
CROSS APPLY sys.dm_exec_sql_text(s.sql_handle) AS t
INNER JOIN db2.sys.procedures AS p
   ON p.[object_id] = t.objectid
WHERE p.name = N'db2_bar'
GROUP BY p.name;

Results are still showing the executions from that other procedure that “borrowed” our object_id:

Still not the result you were looking for

The proper way to do this, IMHO, would be to use the catalog views instead of the helper functions *and* add the following clause to filter to the database we care about:

AND t.[dbid] = DB_ID(N'db2')

You could just say DB_ID(), but by specifying the database name explicitly, you no longer have to care about the current database context. (I’ve also been known to be a stickler about the schema prefix.) So the query becomes:

SELECT [SP Name] = p.name,
   [Number of Executions] = SUM(ps.execution_count)
FROM sys.dm_exec_procedure_stats AS ps  
CROSS APPLY sys.dm_exec_sql_text(ps.sql_handle) AS t
INNER JOIN db2.sys.procedures AS p
   ON p.[object_id] = t.objectid
INNER JOIN db2.sys.schemas AS s
   ON p.[schema_id] = s.[schema_id]
WHERE p.name = N'db2_bar'
  AND s.name = N'dbo'
  AND t.[dbid] = DB_ID(N'db2')
GROUP BY p.name;

And now we get an empty result set, as expected. The real clincher was the filter on t.[dbid], but the fact that OBJECT_NAME() can quietly produce a false positive based on database context makes me that much less willing to use it.

There are other cases where this is not a problem that needs to be solved – consider functions like DB_ID() and DB_NAME(). How often would you expect to need read uncommitted to use these? After all, it’s not like you can run ALTER DATABASE in a transaction, or that you’d want dirty data during CREATE DATABASE. A similar case is ALTER LOGIN: SUSER_SNAME() is blocked even under read uncommitted, but so are queries against sys.server_principals, so there is no *additional* problem caused by using the function. And as it turns out, the database-related metadata functions don’t seem to be affected by isolation at all – I created a DDL trigger that just ran a WAITFOR and then rolled back, and while it was running, queries against sys.databases were blocked (unless I used read uncommitted), but DB_ID() and DB_NAME() returned immediately.
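
The DDL trigger test, roughly – the trigger name and the ten-second delay are mine:

CREATE TRIGGER HoldAlterDatabase
ON ALL SERVER
FOR ALTER_DATABASE
AS
BEGIN
  WAITFOR DELAY '00:00:10'; -- hold the ALTER (and its locks) for a while
  ROLLBACK;                 -- then undo it
END
GO

-- window 1 (sits in the trigger for ~10 seconds):
ALTER DATABASE db1 SET AUTO_CLOSE OFF;

-- window 2, meanwhile:
SELECT DB_ID(N'db1'), DB_NAME(DB_ID(N'db1')); -- returns immediately
SELECT name FROM sys.databases;               -- blocked (without read uncommitted)

-- cleanup:
DROP TRIGGER HoldAlterDatabase ON ALL SERVER;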

So, not ALL metadata helper functions are bad. These are the main ones where I try to write joins instead, primarily to allow read uncommitted *if* it is required at some point. As much as it pains me to write all that code instead of the much more convenient shorthand, using the long-way joins makes my code much more consistent (even if a bit more verbose) and, when I’m writing code that other people will use, ensures they will not be blocked when otherwise operating under read uncommitted (a sample join equivalent follows the list):

  • OBJECT_ID()
  • OBJECT_NAME()
  • OBJECT_SCHEMA_NAME()
  • SCHEMA_ID()
  • SCHEMA_NAME()
  • OBJECTPROPERTY()
  • COLUMNPROPERTY()
  • HAS_PERMS_BY_NAME()

Another bonus is that accessing the catalog views works across databases and across linked servers without a problem (you just need the (server and) database prefix), whereas some of the metadata functions have some real stumbling blocks. As demonstrated above, you can say OBJECT_ID(N'database.dbo.name'), but how do you do the inverse (OBJECT_NAME(<what goes here?>))? If you somehow know the object_id, you can use the optional database_id parameter, and resolve that using yet another metadata function (e.g. OBJECT_NAME([object_id], DB_ID(N'database'))). But this doesn’t work across servers, and not all metadata functions have this optional parameter. So again, like many bad habits, one of the big payoffs can be consistency – always using the joins, instead of only using them when you have to because your “preferred” method works in fewer scenarios.
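
To make that asymmetry concrete, here are both directions against the db2 example from earlier (assuming db2.dbo.db2_bar still exists):

-- helper functions: the inverse needs the optional database_id parameter
SELECT OBJECT_NAME(OBJECT_ID(N'db2.dbo.db2_bar'), DB_ID(N'db2'));

-- catalog views: the same database prefix works in both directions
SELECT o.[object_id], o.name
  FROM db2.sys.objects AS o
  INNER JOIN db2.sys.schemas AS s
    ON o.[schema_id] = s.[schema_id]
  WHERE s.name = N'dbo'
    AND o.name = N'db2_bar';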

If you never use read uncommitted when accessing metadata, and are okay with blocking or being blocked, and don’t mind having different kinds of hoops to jump through depending on which metadata you’re accessing from where, this post isn’t for you.

As always, “it depends”

There are some cases where a helper function is the only practical way to achieve the goal. For example, if you want to determine whether a #temp table exists in your session, you do this:

IF OBJECT_ID(N'tempdb.dbo.#tablename') IS NOT NULL

You can’t use tempdb.sys.objects directly, because in that catalog view, the table name is actually something like:

#tablename__________________________________________________________00000000ED04
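
A quick way to see the padded name for your own session’s table:

CREATE TABLE #tablename(id INT);

SELECT name
  FROM tempdb.sys.objects
  WHERE [object_id] = OBJECT_ID(N'tempdb.dbo.#tablename');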

You can’t just look for a table named with the pattern N'#tablename%', because then you’d also include similar temp tables from other users’ sessions. You could try some of the methods in Dude, who owns that #temp table?, but these are very heavy-handed for this purpose, and probably still not fool-proof.

Of course, when checking if a #temp table exists, my primary objection to the helper functions is largely irrelevant anyway – you’d have a hard time blocking yourself inadvertently in a self-contained session due to mismatched isolation semantics.

[See an index of all bad habits / best practices posts]

Aaron (@AaronBertrand) is a Data Platform MVP with industry experience dating back to Classic ASP and SQL Server 6.5. He is editor-in-chief of the performance-related blog, SQLPerformance.com.

Aaron’s blog focuses on T-SQL bad habits and best practices, as well as coverage of updates and new features in Plan Explorer, SentryOne, and SQL Server.
