An AD User is permissioned through which AD groups that are logins on my SQL Server?

This is a useful query to impersonate any windows user and then list the various AD groups that the user is gaining access to your SQL Server through.  Even if the windows user isn’t an explicit login on the SQL Server, the script will return the AD groups that he or she is a member of that allowed access to SQL Server.

SELECT @LOGINNAME = ‘domainname\loginname’
SELECT distinct token.*,grp.*
FROM sys.login_token token
JOIN sys.server_principals grp
ON token.sid = grp.sid
WHERE token.[type] = ‘WINDOWS GROUP’
AND grp.[type] = ‘G’
order by 3

Prevent SQL Cluster Failover For Patching/Reboots (Powershell)

At my current shop we have almost 200 windows servers and VMs that are rebooted every 90 days as part of a regular windows patching process.  Each time we execute this process, our DBA team must validate all sql services, databases and clusters after the patching and reboot process is complete.  Our strong preference is to avoid any manual intervention by DBAs while still making sure that the reboot process doesn’t result in a clustered SQL Server instance being hosted by different node in the cluster than prior to the reboot.

To simplify this process we put together a pair of powershell scripts:

1) Pre Process – sets the possible owner of SQL Server network name cluster resources to only have their current hosting node as a possible owner (this will prevent the instance from failing over to another node a waves of reboots happen)

2)Post Process – sets the possible owners of the SQL Server clustered network name resource to all nodes in the cluster (our standard configuration for clustered sql instances).

We have a control table (and stored procedure) that is used to store the various Clustered names in our environment.  The powershell scripts adhere to this basic high level process flow:

  1. Open a result set of all sql cluster names
  2. Loop through the cluster names one at a time
    1. for each cluster name, pull all Network Name resources
      1. Loop through each network name
        1.  For each network name resource either set the possible owners to only the current owner (pre) or all nodes in the cluster (post)

Please note: the code must be run as a domain account that has the permissions to manage the cluster resources (in our case we use dedicated administrative level domain accounts when we run this).

First let’s look at the outer loop that gets a list of clusters:


The code first imports the FailoverClusters powershell module so that we can call cluster specific commands later in the code.  The remaining lines in the code above set up a new SQL Server connection and fill dataset with a list of clusters.  In the example above, I have a hardcoded, single cluster name but you could have a table of all your clusters and have them returned as a single column resultset.

The code below is the most important part that connects to the cluster, finds the network name resources and sets their possible owners.


Our approach to prevent failovers of sql servers to other nodes in the clusters is to temporarily set the SQL Server network name resource to only have it’s current cluster node host as it’s only possible owner.  With this configuration in place on the network name resource, rebooting the host will NOT cause the clustered sql instance to fail over to a different node in the cluster.  The code above prints the following information:

1) The name of the cluster

2) List the current owner node for the network name

3) List all possible owner nodes in the cluster for the network name resource

4) Sets the possible owners for the network name to ONLY THE CURRENT cluster node (so after this runs, the network name resource, and all other sql related resources in the cluster group, will NOT fail over in the event of an owner node reboot or failure)

The POST version of the script simply sets the possible owners of the network name resources to ALL nodes in the cluster (our standard configuration).  The code below shows similar logic as the PRE script, but includes a different call to Set-ClusterOwnerNode in the powershell (to set all nodes as possible owners);



Comment or email me if you need copies of the PS1 files for this.

My default schema is not working….

I ran into an issue the other day where we moved an application database that was using named schemas (other than DBO) to logically organize the tables within the database.  When we moved the database to a new sql server, we mapped the application’s SQL login to the new database (with the default schema set to a named schema within the database and NOT dbo).

The problem is that the application’s connection seemed to be using the dbo schema by default even though the mapped DB user had the correct default schema.  After a little digging, we learned that if a login has the sysadmin role at the server level, then all queries against user database will default to using the dbo schema regardless of what you have set for the default schema in the mapped database user account.

I found the answer here:

but there are countless other links describing this behavior.  Hope it saves someone a little time.

Simple way to get last SQL Server Service Restart

There are a lot of different ways to check when your sql instance last restarted but a coworker reminded me of the simple TempDB method.  Basically, every time that the SQL Server service restarts, TempDB is recreated.  So to find out the last instance restart, just run this:

SELECT CREATE_DATE FROM sys.databases where name=’tempdb’

It works on SQL2005+.  Old School!

Convert an Existing Normal Table to a Partitioned Table – Quick and Dirty Edition

I’m sure there a thousand posts on the best way to partition a table and manage it, but I wanted to provide a “quick and dirty” solution that we put together for something in my shop in the last week.  Here’s a little background….

We have a table called ApplicationLog that is a centralized table for logging application activity across multiple apps within our environment.  The table typically gets ~200K rows per month but in some heavier testing/release months, that number can climb into the tens of millions (depending on how verbose the logging is set to and if there are any botched releases that result in a flood of applog messages).  At the time the DBA team got involved, there was a monthly purging job that just ran a “delete from applicationlog where date<getdate()-180” to trim out entries older than 180 days.  This worked for a couple years without incident, but then more applications started logging their messages to the table and the volume of data just got too high.  The problem presented itself when the monthly purging job started getting deadlocks as it attempted to delete rows that were (presumably) on the same pages as new messages being logged. We wanted to use partitioning on this table to prevent deadlocks and blocking during the purge process.

First lets take a look at the table to be partitioned:


Pretty basic table, no funky datatypes, and the ID is an identity.  It also had a single index, clustered on the identity ID column… even though no-one every queries by the ID…. classic developer stuff.  The basic steps we followed to set up the table for partitioning include:

  1. Define a partition Function
  2. Define a Partition scheme
  3. Define a new table with the same structure (ApplicationLog_Part)
  4. Define a new clustered index on the Date field (and using the partition scheme)
  5. Populate the new partitioned table with the data from the existing table
  6. Rename the table (ApplicationLog becomes ApplicationLog_Backup and ApplicationLog_Part becomes ApplicationLog)

Lets take a look at these one by one.  The partition function is the piece of code that defines the boundaries between partitions.  In our case this is simple a series of dates on the first of each month.  The code below illustrates the Partition Function and Scheme we used:


The first statement makes the partition function with the historical and future partition boundaries predetermined.  The above is a condensed version for readability, but the real version has partition boundaries going out to 12/1/2025.  We added all the boundaries up front because we wanted to keep things simple and not have to dynamically regenerate the partition function with different boundary definitions each month.

Also notice that at the time of definition, a partition function has nothing to do with any particular table or data set.  It is merely a series of boundary definitions with a single parameter that can be passed (of the same datatype as the partition boundary values) [sort of like a big CASE statement]..

The partition scheme is just a construct that can employ the partition function as rows are inserted into an underlying table.

The next step was to define a new table with the identical scheme as ApplicationLog and then create a clustered index (that used the partition scheme/function) on it.  One of our requirements was that we couldn’t change the structure of the table at all (so that the existing logging code would continue to work without any changes).  So we made a new version of the table called ApplicationLog_Part:


After the table is created, we define a new clustered index on the Date field (not the identity ID) and then use the ON clause in the create index statement to specify how to store the data as it is inserted.  In the case above, we reference the partition scheme ps_dbo_ApplicationLog(date) with the column that it needs to determine which partition to route the data to.  The next step is to populate the temporary table with the last 120 days of data from the  current table:


In our case we had to preserve the identity column so you see the identity_insert set operation at the top.  Then you basically perform an insert from the original table to the temporary (_part) table.  Once the insert finishes, you can evaluate the distribution of data with the second query.  The final step is to rename the tables and reset the identity value.


As shown above, you can simply rename the tables and then reset the identity configuration on the _part table (now named ApplicationLog).  Now you have a working partitioned table called ApplicationLog that accepts inserts from all the existing code without issues.

The final piece I’ll cover is handling the routine deletion of aging data.  Since the table is now partitioned, we don’t need to run delete statements that can contend with applications writing to the table.  Instead we can take advantage of the SWITCH operation.  The SWITCH PARTITION option allows you to instantaneously move a whole partition of data out of the original partitioned table and into it’s own separate table.


The great thing about it is that the operation is almost instantaneous regardless of the number of rows in the partition.  There are countless overviews of partitioning out there and a lot of it handles the complexities of dynamically managing the partition function boundaries by recreating the partition function regularly.  My “quick and dirty” approach doesn’t handle these complexities.  Instead I simply define the partition function with boundaries predefined and going out 25 years (long past when I’d expect this applicationlog table to actually be in service.  Another element that we did not have to address was having partitions on different physical files/ drives.  That is something that can be useful when you have tiered storage and you want  keep your recent partitions on your fastest underlying disks and your aging partitions on your slower cheaper disks.

PS> sorry for all the images of code, but it looks nicer with the SSMS formatting.  If you need any of the code, I can send it along.

Which Databases are Actually Used? CPU and Memory Use BY Database Queries

We have a fairly large SQL Server environment where I work with over 200 total SQL instances.  They are spread about 60/40 non production vs. production and with an active development staff, we have many new databases being created and worked on in our non production environments every week.  In order to keep an eye on which databases are used (in use) we began capturing a snapshot of database level CPU and memory usage twice a day (6AM/6PM).  Over time this data can give a decent indicator of which databases have activity (user, developer or otherwise) and their overall “rank” against the other databases on a particular instance.  This type of data helps us make decisions on:

  • Clean Up – was this database created as a one off test data set in dev? can it be deleted?
  • Consolidation – is this database used minimally enough to be considered a candidate for consolidation to a shared sql instance?
  • High Level Resource Usage per instance – which DB is consuming the most resources on which instance?

In order to keep things simple, we use basic DMV queries to collect this data twice a day.. For CPU usage we look at sys.dm_exec_query_stats and for memory usage, we look at sys.dm_os_buffer_descriptors.  Let’s take a quick look at the collection queries:  Note: a colleague sent me some basic queries that I tweaked for my needs, so please forgive me if I am not attributing the right folks from the sql community who originally posted these (or similar).

–Declare a Date for the collection time

Declare @TheDate date;

Select @TheDate = getdate();




(SELECT DatabaseID, DB_Name(DatabaseID) AS [DatabaseName], SUM(total_worker_time) AS [CPU_Time_Ms]

FROM sys.dm_exec_query_stats AS qs

CROSS APPLY(SELECT CONVERT(int, value) AS [DatabaseID]            

FROM sys.dm_exec_plan_attributes(qs.plan_handle)       

WHERE attribute= N’dbid’) AS F_DB

GROUP BY DatabaseID)

SELECT @TheDate ,@@ServerNameDatabaseName, [CPU_Time_Ms],

CAST([CPU_Time_Ms] * 1.0 / SUM([CPU_Time_Ms]) OVER() * 100.0 AS DECIMAL(5, 2)) AS [CPUPercent]


WHERE DatabaseID > 4 — system databases

AND DatabaseID <> 32767 — ResourceDB



 The query above can be run on any SQL Server 2008+ machine and return the total CPU time consumed per database (since the last reboot or exec_stats flush) as well as the percentage of the total CPU consumed across the whole server.  The results look like this:


Remember that the above values are cumulative through the last reboot, so you need some intelligence to self join the table across intervals and calculate the deltas.

The other query determines which databases are consuming the most memory at the time of the collection:

Declare @TheDate date;

Select @TheDate = getdate();
— get memused per DB
With DB_MemUsed as(
SELECT @@ServerName as Servername, db_name(database_id) as databasename,database_id,

              COUNT(*) * 8 / 1024 AS MBUsed

FROM   sys.dm_os_buffer_descriptors

where database_id > 4

and database_id  <> 32767

GROUP BY      database_id


SELECT @TheDate,ServerName, DatabaseName, MBUsed,

CAST(MBUsed * 1.0 / SUM(MBUsed) OVER() * 100.0 AS DECIMAL(5, 2)) AS MBUsedPercent


WHERE Database_ID > 4 — system databases

AND Database_ID <> 32767 — ResourceDB



The result set looks like this:


Collecting the data to a central database table over time provides a nice, centrally managed data set to run trending queries against.  This information is NOT cumulative so each collection is just a snapshot of the memory consumption at that particular time. This makes it easier to run aggregates against the data over a time period like this:

Select datepart(year,a.capture_time) as captureyear,

datepart(week,a.capture_time) as captureweek,



avg(MBUsed) as AvgMB_MEMUsed,

Max(MBUsed) as MaxMB_MEMUsed,

Min(MBUsed) as MinMB_MEMUsed

from [DB_MemUsage_ByDatabase_ByDay] a

where a.servername = ‘<MyINstanceName>’

group by datepart(year,a.capture_time),

datepart(week,a.capture_time)  ,



The above query assumes you are saving the results of the memory usage query to a table called DB_MemUsage_ByDatabase_ByDay.  Once several days’ worth of collections are available, you can run the above query to get some aggregate memory usage statistics “By Week”.

Hope this is helpful, send me an email or comment if you need assistance or have any questions.

Wasting space with heaps

Over the last few months, our team of DBAs has been developing various data collection routines for monitoring (and designing the tables to store said monitoring data).  We are planning to remove a 3rd party SQL Server monitoring tool from our environment in the next couple of months, so most of the effort has been on making sure we don’t lose any critical monitoring/analysis functionality once we remove that product.  This is all well and good, but in our haste to “get the data collecting”, we found a few problematic issues.

1) If you are going to collect some data set (wait stats, file io stats, etc) ALWAYS ALWAYS ALWAYS implement a deleting process as well.  Set a ridiculous retention period (e.g. 10 years) if you don’t want to delete anything, but at least have the process in place to trim aging records off of your monitoring tables.  I know it’s the most obvious, but it is often overlooked or deferred with the thought that “we’ll delete when it grows too big”.

2) DON’T FORGET a Clustered Index!  In the vast majority of our routines, we execute a DMV query against a series of servers and then write the results to table on a central database server.  Nothing fancy there, but I found myself “cloning” routines (copy/paste powershell code, quickly make a table struct with a SELECT INTO, etc.) and not taking the time to define at least a clustered index.  In at least 2 cases, we had tables that were heaps (no clustered index) and over time their unused space grew to > 7 times the actual required space to house the rows.  This was caused by us deleting aging records from the heap.  A heap can’t reuse space allocated to it so it will always grow (unless you copy it to another heap [select into], runt he alter table rebuild command, or add a clustered index).

I added a few clustered indexes to the problem tables and the unused space dropped down to <1MB per table.

I’m sure the above is obvious to most readers, but I am hoping that by admitting to this sloppiness here, it will prevent me from overlooking this basic things in the future.

Unlocking SQL Logins Without a Password

So today I broke something.  While prepping to bring our disaster recovery SQL servers in line with their production counterparts, I was going through the servers and adding linked servers that were enabled on the main production servers but not enabled/existing on their DR counterparts.  It seemed simple enough, especially since most of our linked servers are using pass through authentication of the user’s windows domain credentials to connect.  HOWEVER, in a handful of cases we are using SQL Server logins for the security context of linked servers.

The problem arose when I attempted to add a linked server, using a SQL Server login/password.  I had the wrong password (it was wrong in our password management application.. but that is another story entirely).  I added the linked server without incident since SQL Server doesn’t confirm the provided credentials before creating the linked servers.  Consequently, when I added the linked server with the wrong password, and then attempted to open the linked server in Management Studio a few times, the SQL Server login on the target server was locked out.  This resulted in a chain reaction as several production applications were using that login for their normal security context.

Needless to say, internal customers started complaining about security errors coming from a core business application “all of a sudden”.

As soon as I saw the emails, I knew it had something to do with me and what I was working on.  I checked the SQL Error log on the server in question and saw the obvious set of failed login attempts followed by a stream of locked out login error messages.  A bonehead move to be sure, but when I tried to unlock the account in management studio, I was told that I had to provide the password to the alter login screen if I wanted to unlock the account.  Sigh… not a real problem, more of an annoyance… or so I thought.  When I went in to our password management application, of course, the sql login information was NOT there.

The fastest way to get the cow out of the ditch was to disable the account policy on the SQL login first, and THEN unlock it using SSMS’s login properties window.

So i had to execute this simple command:


Disabling the CHECK_POLICY prevented SQL Server from trying to apply the Domain level controls on AD accounts and I was able to unlock the SQL Login after that.

We decided as a rule to leave CHECK_POLICY=OFF for a handful of SQL Server logins in our environment to avoid the scenario of a sloppy (but good intentioned) admin causing a critical SQL Server login to get locked out due to a misconfiguration.

Collecting xEvents Query Stats From Multiple SQL Instances to a Central Database Part 4

In this final post about xEvent based query monitoring data collection, I’ll cover the following:

  1. Running daily pre-aggregations of query stats by query hash and purging aging raw xEvent data
  2. Some useful sample queries for making sense of all the data you are now collecting.

Once you have a solution like this in place and large datasets are growing larger on your central database, you need to consider a data retention process.  Obviously you can’t keep a record of every execution of every sql statement in perpetuity, but you do want to maintain a balance between low granularity in the short term and high granularity over a longer period of time.  The way I do this is to have a couple extra tables that store daily aggregations of stats (by database and query hash).  Basically I run a job every day to calculate the aggregates by database and query hash (individual statements).  I added some basic data warehousing fields (day/week/month/year/etc.) and I am aggregating Execution count, CPU, Reads and Durations.  The table diagram below shows this:


The queries to generate rows for these tables are straightforward.  For example:

declare @StartDate Date=getdate()-1
declare @EndDate date
Select @EndDate = dateadd(day,1,@Startdate)– delete any data you already have for yesterday
— Run the group by query aganist the raw table XEvents_Queries
— and insert rows to the pre-aggregate table [XEvents_Queries_ByDBAggregate]
INSERT INTO [dbo].[XEvents_Queries_ByDBAggregate]
Select b.dbserverid, a.servername, datepart(year,a.timestamp) rowyear,datepart(month,a.timestamp) rowmonth,datepart(day,a.timestamp) rowDay,
datepart(week,a.timestamp) rowweek,datepart(weekday,a.timestamp) rowdayofweek,a.databasename, count(*) as executioncount,
sum(a.logical_reads) as TotalReads,
Avg(a.logical_reads) as AvgReads,
Max(a.logical_reads) as MaxReads,
sum(cpu_time_ms) as TotalCPU,
avg(cpu_time_ms) as AvgCPU,
max(cpu_time_ms) as MaxCPU,
sum(duration_ms) as TotalDuration,
avg(duration_ms) as AvgDuration,
max(duration_ms) as MaxDuration
from XEvents_Queries a
inner join DB_ServerMain b
on a.ServerName = b.ServerName
WHERE a.timestamp >= @Startdate
and a.timestamp < @EndDate
group by b.dbserverid, a.servername, datepart(year,a.timestamp) ,datepart(month,a.timestamp) ,datepart(day,a.timestamp) ,
datepart(week,a.timestamp) ,datepart(weekday,a.timestamp) ,a.databasename

The code above sets two variables (notice they are date and not datetime) for yesterday and today and then it generates the aggregate data (by databasename in the example above).  As designed, you can put the logic above into a sql agent job and run it at 12:01AM each day to generate the aggregates for the previous day. The whole script to populate both the database level and queryhash level aggregate tables can be found here:aggregatesonceaday.sql  This script will create the aggregate tables:preaggregatetablecreates.sql

Once you have these in place, you can easily build trend analysis and troubleshooting queries like this:


— return the  queries by database AND highest total reads per week


as instancename,



isnull(objectname,) as objectname,

isnull(objectname,) + ‘ – ‘ + sql as sql,

dateadd(week, rowweekofyear, dateadd(year, rowyear1900, 0)) 4

    datepart(dw, dateadd(week, rowweekofyear, dateadd(year, rowyear1900, 0)) 4) + 1 as WeekStart,


sum(TotalReads) TotalReads,

avg(AvgReads) AvgReads,

max(MaxReads) MaxReads,

sum(TotalCPU) TotalCPU,

avg(AvgCPU) AvgCPU,

max(MaxCPU) MaxCPU,


avg(AvgDuration) AvgDuration,

max(MaxDuration) MaxDuration

,RANK()  over (PARTITION by dateadd(week, rowweekofyear, dateadd(year, rowyear1900, 0)) 4

   datepart(dw, dateadd(week, rowweekofyear, dateadd(year, rowyear1900, 0)) 4) + 1 DatabaseName

   order by sum(TotalReads)  desc) as Rank

from XEvents_Queries_ByDBByQueryHashAggregate qs

WHERE dateadd(week,rowweekofyear,‘1/1/’+convert(varchar(10),rowyear)) > getdate()-35 — for the last 5 weeks

AND qs.servername =  ‘<servernamehere>’

group by





isnull(objectname,) + ‘ – ‘ + sql,

dateadd(week, rowweekofyear, dateadd(year, rowyear1900, 0)) 4 – datepart(dw, dateadd(week, rowweekofyear, dateadd(year, rowyear1900, 0)) 4) + 1


And this… to get the aggregates by database only (so you can get a basic “how active is this database in general?” type metric:

— return the  query metrics by database only

select dbserverid,



dateadd(week, rowweek, dateadd(year, rowyear1900, 0)) 4

datepart(dw, dateadd(week, rowweek, dateadd(year, rowyear1900, 0)) 4) + 1 as StartDate,


sum(TotalReads) TotalReads,

avg(AvgReads) AvgReads,

max(MaxReads) MaxReads,

sum(TotalCPU) TotalCPU,

avg(AvgCPU) AvgCPU,

max(MaxCPU) MaxCPU,


avg(AvgDuration) AvgDuration,

max(MaxDuration) MaxDuration

from emdw..XEvents_Queries_ByDBAggregate

WHERE dateadd(week,rowweek,‘1/1/’+convert(varchar(10),rowyear)) > getdate()-35

AND servername =  ‘<servernamehere>’

group by




dateadd(week, rowweek, dateadd(year, rowyear1900, 0)) 4

datepart(dw, dateadd(week, rowweek, dateadd(year, rowyear1900, 0)) 4) + 1

order by dateadd(week, rowweek, dateadd(year, rowyear1900, 0)) 4

datepart(dw, dateadd(week, rowweek, dateadd(year, rowyear1900, 0)) 4) +desc

This should get you going and please post comments with any questions or requests for clarifications.

Collecting xEvents Query Stats From Multiple SQL Instances to a Central Database Part 3

In this post on collecting xEvent query execution data to a central database I will cover the logic in the powershell script (provided as a link below) that determines the instances to collect from and then connect and collect the data to the central database.

First a refresher with an image from my previous post.

figure 1

As you can see above, the powershell script first queries the central database (DB_ServerMain table) to determine what instances to collect from.  Then it loops through those instances and executes the xEvents tSQL queries from my previous posts.  Let’s take a quick look at some parts of the PS1 file.

Determine What Instances to Collect From

At the top of the powershell file is a query and then the beginning of a loop:


The first part of the PS1 file opens a connection to the central database and executes the query against DB_ServerMain to determine which SQL instances to collect xEvent query data from. At the bottom of the snippet above, you can see that it assumes all .XEL files will be in “C:\DBAMonitoring\traces\” on the remote machine.  It them goes into a foreach loop iterating through the list of remote SQL Server instances to collect from.

Within the loop, a connection string for source (remote server) and destination (central collection server) are defined, some remote instance specific variables are set (for incremental loading) and then the SQL statement is defined dynamically:


Note that xel filename and fileoffset are embedded in the dynamic SQL prior to it’s execution.  Then the actual call to run the query on the remote server and save the results to the central server is executed in a TRY/CATCH block:


The code first defines the destination table name in the central DB instance (“XEvents_Queries”).  Then it opens the connection to the remote instance, creates a SqlCommand object (with the query AND the source connection) and then executes the query (with the .ExecuteReader() method of  the command object).  At this point, nothing has been written to the central collection database.  If there is an error in the execution of the query on the remote instance, then the CATCH block will fire.  In the powershell script, my catch block basically assumes there is a file rollover condition by checking for the phrase “The offset” in the caught exception: $ex1.Contains(“The offset”).  If the named xel file and offset are no longer valid (due to a file rollover), it basically resets the sql query to call the system function with NULLs instead of named incremental starting point (like this:   FROM sys.fn_xe_file_target_read_file(‘C:\DBAMonitoring\traces\sql2012*.xel’, null,NULL,NULL))

The final actionable piece of the script uses the SQLBulkCopy object to connect to the central collection database, and then write resultset from the remote instance to it:


Basically, it connected to the central collection server with the conneciton string declared at the beginning of the script.  It also defines the destination table, bacthsize and timeout values.  The .WriteToServer method is called with the $Reader variable that contains the resultset from the remote instance.  The .WriteToServer method will perform a BCP insert operation in chunks of 5000 rows.

That’s pretty much it for the powershell script.  You can download the full version of it here FetchQueryMonitorxEvents.ps1.