Collecting xEvents Query Stats From Multiple SQL Instances to a Central Database Part 3

In this post on collecting xEvent query execution data to a central database, I will cover the logic in the PowerShell script (provided as a link below) that determines which instances to collect from, then connects to each one and collects the data into the central database.

First a refresher with an image from my previous post.

[Figure 1: centralized xEvents collection overview, from the previous post]

As you can see above, the PowerShell script first queries the central database (the DB_ServerMain table) to determine which instances to collect from. It then loops through those instances and executes the xEvents T-SQL queries from my previous posts. Let’s take a quick look at some parts of the PS1 file.

Determine What Instances to Collect From

At the top of the PowerShell file is a query and then the beginning of a loop:

[Screenshot: PS1 part 1 – the control-table query and the start of the foreach loop]

The first part of the PS1 file opens a connection to the central database and executes the query against DB_ServerMain to determine which SQL instances to collect xEvent query data from. At the bottom of the snippet above, you can see that it assumes all .XEL files will be in “C:\DBAMonitoring\traces\” on the remote machine. It then goes into a foreach loop, iterating through the list of remote SQL Server instances to collect from.
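Since the screenshots can be hard to read, here is a minimal sketch of this first section. The connection string and variable names are illustrative assumptions, and I use the simple control query here (the full version, with the last file/offset lookups per instance, appears in Part 2 below):

# Hypothetical connection string to the central collection database
$centralConnStr = "Server=CENTRALSQL;Database=DBAMonitoring;Integrated Security=SSPI;"
$traceDir = "C:\DBAMonitoring\traces\"   # assumed .XEL location on every remote machine

# Which instances should we collect from?
$listQuery = "SELECT ServerName FROM DB_ServerMain WHERE enablexEventQueryCollection = 1"

$da = New-Object System.Data.SqlClient.SqlDataAdapter($listQuery, $centralConnStr)
$serverTable = New-Object System.Data.DataTable
[void]$da.Fill($serverTable)

# Loop through the monitored instances
foreach ($row in $serverTable.Rows) {
    $instance = $row.ServerName
    # per-instance collection continues in the next snippets...
}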

Within the loop, connection strings for the source (remote server) and destination (central collection server) are defined, some instance-specific variables are set (for incremental loading), and then the SQL statement is built dynamically:

[Screenshot: PS1 part 2 – connection strings and the dynamically built collection query]
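A hedged reconstruction of that step follows. The variable names are mine; $lastFile and $lastOffset would come from the control query in Part 2, which returns either the text 'NULL' or the last file name/offset collected for the instance:

# Source = remote instance, destination = central collection server
$sourceConnStr = "Server=$instance;Database=master;Integrated Security=SSPI;"
$destConnStr   = $centralConnStr

# Splice the stored values into SQL: quote real file names, pass NULL otherwise
$fileParam   = if ($lastFile -eq 'NULL') { 'NULL' } else { "'$lastFile'" }
$offsetParam = $lastOffset   # already the text 'NULL' or a numeric offset

# Dynamic SQL: incremental read of the .xel files, followed by the XPath
# parsing SELECT from Part 2 (abbreviated here)
$xeQuery = @"
SELECT CAST(event_data AS XML) AS event_data_XML, file_name, file_offset
INTO #Events
FROM sys.fn_xe_file_target_read_file('C:\DBAMonitoring\traces\sql2012*.xel',
     NULL, $fileParam, $offsetParam);
SELECT @@SERVERNAME AS ServerName, file_name, file_offset /* ...XPath columns... */
FROM #Events;
"@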

Note that the .xel file name and file offset are embedded in the dynamic SQL prior to its execution. Then the actual call to run the query on the remote server and save the results to the central server is executed in a TRY/CATCH block:

[Screenshot: PS1 part 3 – TRY/CATCH around the remote query execution]

The code first defines the destination table name in the central DB instance (“XEvents_Queries”). Then it opens the connection to the remote instance, creates a SqlCommand object (with the query AND the source connection), and executes the query with the .ExecuteReader() method of the command object. At this point, nothing has been written to the central collection database. If there is an error in the execution of the query on the remote instance, the CATCH block will fire. In the PowerShell script, my CATCH block basically assumes there is a file rollover condition by checking for the phrase “The offset” in the caught exception: $ex1.Contains("The offset"). If the named .xel file and offset are no longer valid (due to a file rollover), it resets the SQL query to call the system function with NULLs instead of a named incremental starting point, like this: FROM sys.fn_xe_file_target_read_file('C:\DBAMonitoring\traces\sql2012*.xel', NULL, NULL, NULL)
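Here is a sketch of that TRY/CATCH, using the "The offset" check and the NULL,NULL reset described above. The timeout is a placeholder, and the rollover branch assumes the failure happened at .ExecuteReader(), after the connection opened:

$destTable = "XEvents_Queries"
try {
    $srcConn = New-Object System.Data.SqlClient.SqlConnection($sourceConnStr)
    $srcConn.Open()
    $cmd = New-Object System.Data.SqlClient.SqlCommand($xeQuery, $srcConn)
    $cmd.CommandTimeout = 300        # illustrative value
    $Reader = $cmd.ExecuteReader()   # nothing written to the central DB yet
}
catch {
    $ex1 = $_.Exception.Message
    if ($ex1.Contains("The offset")) {
        # File rollover: the saved file/offset no longer exists, so re-read
        # everything currently on disk (NULL,NULL starting point)
        $xeQuery = $xeQuery.Replace("$fileParam, $offsetParam", "NULL, NULL")
        $cmd = New-Object System.Data.SqlClient.SqlCommand($xeQuery, $srcConn)
        $Reader = $cmd.ExecuteReader()
    }
    else {
        Write-Error "Collection from $instance failed: $ex1"
    }
}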

The final actionable piece of the script uses the SqlBulkCopy object to connect to the central collection database and write the resultset from the remote instance to it:

[Screenshot: PS1 part 4 – SqlBulkCopy load into the central database]

Basically, it connects to the central collection server with the connection string declared at the beginning of the script. It also defines the destination table, batch size, and timeout values. The .WriteToServer method is called with the $Reader variable that contains the resultset from the remote instance, and it performs a bulk insert operation in chunks of 5000 rows.
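Here is the equivalent sketch; the 5000-row batch size is from the description above, while the timeout value is a placeholder:

# Stream the remote resultset straight into the central table
$bulkCopy = New-Object System.Data.SqlClient.SqlBulkCopy($destConnStr)
$bulkCopy.DestinationTableName = $destTable
$bulkCopy.BatchSize       = 5000   # commit in 5000-row chunks
$bulkCopy.BulkCopyTimeout = 600    # seconds; illustrative value
$bulkCopy.WriteToServer($Reader)   # pulls from the remote reader, writes centrally

$Reader.Close()
$srcConn.Close()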

That’s pretty much it for the PowerShell script. You can download the full version of it here: FetchQueryMonitorxEvents.ps1.

Collecting xEvents Query Stats From Multiple SQL Instances to a Central Database Part 2

In this post I will continue from the previous post about centralizing xEvents-based query execution stats collection and reporting. The key topics I will cover are:

1) Determining which SQL instances you need to collect from and where the last collection left off (to handle incremental loads throughout the day from the source servers to the central DB server)

2) Reviewing the XPath-oriented queries that parse the .XEL files into a relational format for storage in the central DB

In my last post, I ended by discussing the .XEL file’s offset functionality and how to use it with the sys.fn_xe_file_target_read_file system function to bring back a set of events “since the last poll”. This functionality is what I use to perform incremental loads of xEvent query execution data from various remote SQL Servers into my central database. First let’s take a look at the basic data model I am using:

[Figure: data model – the DB_ServerMain control table and the XEvents_Queries storage table]

DB_ServerMain has the name of the SQL Server and a flag (enablexEventQueryCollection) to denote whether it should be included in centralized collection. Straightforward enough… you can get the list of SQL instances to collect xEvents from with a query like:

Select *
from DB_ServerMain
where enablexEventQueryCollection = 1
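In case the data-model image above doesn’t come through, here is a hypothetical reconstruction of the two tables. The column types are my own guesses; the XEvents_Queries column list mirrors the parsing query later in this post:

# Hypothetical central-DB schema; run once against the collection database
$ddl = @'
CREATE TABLE dbo.DB_ServerMain (
    ServerName                  sysname NOT NULL PRIMARY KEY,
    enablexEventQueryCollection bit     NOT NULL DEFAULT (0)
);
CREATE TABLE dbo.XEvents_Queries (
    ServerName      sysname        NOT NULL,
    file_name       nvarchar(260)  NOT NULL,
    file_offset     bigint         NOT NULL,
    query_hash      nvarchar(100)  NULL,
    [timestamp]     datetime       NULL,
    EventName       varchar(50)    NULL,
    duration_ms     int            NULL,
    object_type     varchar(100)   NULL,
    DatabaseName    sysname        NULL,
    ObjectName      nvarchar(4000) NULL,
    HostMachine     varchar(100)   NULL,
    client_app_name varchar(100)   NULL,
    nt_username     varchar(100)   NULL,
    cpu_time_ms     int            NULL,
    physical_reads  bigint         NULL,
    logical_reads   bigint         NULL,
    writes          bigint         NULL,
    row_count       bigint         NULL,
    statement       nvarchar(4000) NULL
);
'@
Invoke-Sqlcmd -ServerInstance 'CENTRALSQL' -Database 'DBAMonitoring' -Query $ddl  # assumes the sqlps module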

Recall from my previous post that we can use the file_offset value in the .XEL file to run an incremental load (return everything after the specified offset). In the data model presented above, you can see the XEvents_Queries table has the servername, file_name, and file_offset values (along with the query stats specifics). In order to fully automate the collection process, we need the name of the SQL Server instance, the name of the last .XEL file collected from, AND the highest file_offset value within that last file that we have a record of.

SELECT a.ServerName,
       ISNULL(xEventFile.xEventFile_Name, 'NULL') AS xEventFile_Name,
       ISNULL(CONVERT(varchar(50), xEventFileOffset.xEventFile_Offset), 'NULL') AS xEventFile_Offset
FROM DB_ServerMain a
OUTER APPLY (SELECT MAX(file_name) AS xEventFile_Name
             FROM dbo.XEvents_Queries ws
             WHERE ws.servername = a.servername) AS xEventFile
OUTER APPLY (SELECT MAX(file_offset) AS xEventFile_Offset
             FROM dbo.XEvents_Queries ws
             WHERE ws.servername = a.servername
               AND ws.file_name = (SELECT MAX(file_name)
                                   FROM dbo.XEvents_Queries ws2
                                   WHERE ws2.servername = a.servername)) AS xEventFileOffset

I’ll admit, it’s not the slickest code in the world, but it works to pull the servername, file_name, and file_offset for each instance to be monitored. The OUTER APPLYs help keep it fast enough even after you have millions of rows in XEvents_Queries. Once you know the instances you want to collect from (and the latest file_name/file_offset combination), you can connect to each SQL instance and select the contents of the .XEL files using the additional parameters in the call to the sys.fn_xe_file_target_read_file() system function, like this:

SELECT CAST(event_data AS XML) AS event_data_XML, file_name, file_offset
INTO #Events
FROM sys.fn_xe_file_target_read_file('C:\DBAMonitoring\traces\sql2012*.xel', null,
     'C:\DBAMonitoring\traces\sql2012_0_130295018340930000.xel', 141312) AS F

In the example above, the call will create (and fill) a temporary table called #Events holding the full event XML, file_name, and file_offset for every event in the files that match the wildcard search “C:\DBAMonitoring\traces\sql2012*.xel”. It will start with the file and file_offset we specified (sql2012_0_130295018340930000.xel and 141312) and disregard any earlier files or earlier entries within that file (based on the file_offset value).

The above query is basically the “incremental load” since the last poll, but it only pulls the raw XML event data (into an XML column) along with the file_name and file_offset associated with each event. In order to make sense of the XML event, it needs to be parsed using XPath functions to pull out the individual elements (e.g. reads, writes, duration, etc.). The code below shows that logic:

SELECT @@ServerName AS ServerName,
       file_name,
       file_offset,
       event_data_XML.value('(/event/action[@name=''query_hash'']/value)[1]', 'nvarchar(100)') AS query_hash,
       DATEADD(mi, DATEDIFF(mi, GETUTCDATE(), GETDATE()),
               event_data_XML.value('(/event/@timestamp)[1]', 'datetime')) AS timestamp,
       event_data_XML.value('(/event/@name)[1]', 'varchar(50)') AS EventName,
       event_data_XML.value('(/event/data[@name=''duration'']/value)[1]', 'int') / 1000 AS duration_ms,
       event_data_XML.value('(/event/data[@name=''object_type'']/text)[1]', 'varchar(100)') AS object_type,
       DB_NAME(event_data_XML.value('(/event/action[@name=''database_id'']/value)[1]', 'int')) AS DatabaseName,
       CASE event_data_XML.value('(/event/@name)[1]', 'varchar(50)')
            WHEN 'sp_statement_completed'
                THEN event_data_XML.value('(/event/data[@name=''object_name'']/value)[1]', 'nvarchar(4000)')
            ELSE OBJECT_NAME(event_data_XML.value('(/event/data[@name=''object_id'']/value)[1]', 'bigint'),
                             event_data_XML.value('(/event/action[@name=''database_id'']/value)[1]', 'int'))
       END AS ObjectName,
       event_data_XML.value('(/event/action[@name=''client_hostname'']/value)[1]', 'varchar(100)') AS HostMachine,
       event_data_XML.value('(/event/action[@name=''client_app_name'']/value)[1]', 'varchar(100)') AS client_app_name,
       event_data_XML.value('(/event/action[@name=''nt_username'']/value)[1]', 'varchar(100)') AS nt_username,
       event_data_XML.value('(/event/data[@name=''cpu_time'']/value)[1]', 'int') / 1000 AS cpu_time_ms,
       event_data_XML.value('(/event/data[@name=''physical_reads'']/value)[1]', 'bigint') AS physical_reads,
       event_data_XML.value('(/event/data[@name=''logical_reads'']/value)[1]', 'bigint') AS logical_reads,
       event_data_XML.value('(/event/data[@name=''writes'']/value)[1]', 'bigint') AS writes,
       event_data_XML.value('(/event/data[@name=''row_count'']/value)[1]', 'bigint') AS row_count,
       CASE event_data_XML.value('(/event/@name)[1]', 'varchar(50)')
            WHEN 'sql_batch_completed'
                THEN event_data_XML.value('(/event/data[@name=''batch_text'']/value)[1]', 'nvarchar(4000)')
            ELSE event_data_XML.value('(/event/data[@name=''statement'']/value)[1]', 'nvarchar(4000)')
       END AS statement
FROM #Events

The code above uses XPath to parse the various elements of the XML event into usable tabular columns that match the order and structure of the XEvents_Queries table in the central database. The results look like this:

[Screenshot: tabular results of the XPath parsing query]

So now you have access to the reads/writes/CPU/durations, as well as the statement within a larger batch or stored procedure that is using those resources. You can run these types of queries on the host in question anytime to get a real-time look at query activity, but pulling it all into a central database (with rows marked with an @@ServerName to delineate between instances) is a more manageable solution for environments where there is more than one instance to keep an eye on (read: any environment).

With the information from this post and previous posts, you should be able to set up an active xEvents session to capture “statement completed” events. You can then use a query similar to the example above to parse your captured xEvents data into easily consumable, aggregate-able (is that a word?) data sets for analysis.

I have used this type of setup on the local server to answer questions like the following (a sketch of the first one appears after the list):

1) What are my top 5 most-executed T-SQL statements on the server between 8:05AM and 10:30AM yesterday?

2) Which database had the most I/O/CPU usage for the last 6 hours? Morning vs. afternoon for the last 3 days? Etc.
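For example, here is a hedged sketch for question 1 against the central database; the instance name, dates, and database names are placeholders you would adjust:

$topQuery = @'
SELECT TOP 5
       query_hash,
       COUNT(*)         AS executions,
       SUM(duration_ms) AS total_duration_ms,
       MAX(statement)   AS sample_statement
FROM dbo.XEvents_Queries
WHERE ServerName = 'SOMEINSTANCE'
  AND [timestamp] >= '20140210 08:05' AND [timestamp] < '20140210 10:30'
GROUP BY query_hash
ORDER BY executions DESC;
'@
Invoke-Sqlcmd -ServerInstance 'CENTRALSQL' -Database 'DBAMonitoring' -Query $topQuery |
    Format-Table -AutoSize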

In my next post I’ll tie all of this together by going through a PowerShell PS1 file that has the above queries embedded in it and actually performs the write of query event rows from remote servers to the central database.

Collecting xEvents Query Stats From Multiple SQL Instances to a Central Database Part 1

In my previous post, I discussed setting up xEvents sessions in lieu of server-side traces to collect statement-level activity from a SQL Server instance. I covered the basics of setting up the session to collect the sp_statement_completed and sql_statement_completed events to a .xel event file. I also provided some queries that allow you to read the entries from all .xel files in a directory and then parse the content into a regular tabular resultset for sorting, filtering, and aggregation. While this is all very useful information to capture on a specific instance, most (if not all) of us in the DBA world are managing multiple SQL Server instances across production and non-production environments. In this post I’ll cover the following topics:

1) Standardizing your xEvents sessions on individual SQL instances to streamline the centralization process

2) Using PowerShell to automate the collection of xEvent data from multiple instances and store it in a centralized set of tables on one instance (a central management or monitoring SQL Server)

3) Automating an aggregation process to generate pre-aggregated table rows (by day, month, etc.) for the query data we are collecting.

Standardize your Query Performance xEvents Sessions

In order to simplify the centralization of xEvent data, I adhered to the following simple configurations:

  1. All query performance xEvent sessions write their .xel files to the same named directory on every SQL instance. In my case, I use C:\DBAMonitoring\traces for this.
  2. I configure the xEvent session to write to 2MB files and roll over once 40 files have accumulated. This ensures that I never have more than 80MB of disk space in use on the local machine. Later in the post, I’ll cover how I collect and centralize the data from the .xel files every 15 minutes, so as long as you don’t produce more than 80MB worth of .xel files every 15 minutes, you probably won’t lose captured event data between polls.
  3. All query performance xEvent .xel files should have the same prefix so they can be read into memory together. In my case, all .xel files start with sql2012<somestring>.xel. (A scripted example of a session following these conventions appears after this list.)
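To make the conventions concrete, here is a hypothetical deployment script for such a session. The session name, action list, and instance name are my assumptions based on this series rather than the exact session from my earlier post (which may, for example, also capture sql_batch_completed, since the parsing query in Part 2 handles it):

# Hypothetical: deploy a standardized session to one monitored instance
$sessionDdl = @'
CREATE EVENT SESSION [QueryPerf] ON SERVER
ADD EVENT sqlserver.sp_statement_completed (
    ACTION (sqlserver.query_hash, sqlserver.database_id, sqlserver.client_hostname,
            sqlserver.client_app_name, sqlserver.nt_username)),
ADD EVENT sqlserver.sql_statement_completed (
    ACTION (sqlserver.query_hash, sqlserver.database_id, sqlserver.client_hostname,
            sqlserver.client_app_name, sqlserver.nt_username))
ADD TARGET package0.event_file (
    SET filename = N'C:\DBAMonitoring\traces\sql2012.xel',  -- conventions 1 and 3
        max_file_size = (2),        -- 2MB files (convention 2)
        max_rollover_files = (40))  -- caps disk use at ~80MB
WITH (STARTUP_STATE = ON);
ALTER EVENT SESSION [QueryPerf] ON SERVER STATE = START;
'@
Invoke-Sqlcmd -ServerInstance 'SOMEINSTANCE' -Query $sessionDdl  # assumes the sqlps module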

With these settings in place on each SQL 2012 instance that I want to collect xEvent query execution data from, I can use a PowerShell script, executed on a central database instance, that connects remotely to the monitored instances and pulls the recent xEvent data into a set of tables.

Centralized xEvent Processing and Collection

The diagram below illustrates the simple way that this solution works.

[Figure 1: centralized xEvents collection – a central database pulling from multiple monitored instances]

The central database has a few tables to handle the following:

1) Control Table – has the names of the SQL instances that you would like to collect xEvent data from. This list is used by the PowerShell script to determine which SQL instances to collect from each time that it runs.

2) xEvent Tables – These tables store the query performance information (xEvents) from the various servers in your environment.  Then you can run queries or write reports on top of this data to review instance specific query performance metrics.

The PowerShell script performs the following actions:

1) Query the control table to determine which instances to collect from.
2) Loop through the list of instances and, for each one:
a) Connect and read the latest xEvent data from the .xel files on the remote server
b) Pre-process the data on the remote machine (in a temp table) and then load it directly into the xEvent tables in the central database

In my previous post (Capture and Review SQL 2012 Query Performance History with xEvents) I provided a T-SQL script to run on a local instance that would read the .xel files as XML strings into a temp table. The second half of the query then parsed the XML data into the familiar trace-like fields, including:

Field Name – Description
query_hash – the unique query hash value that allows you to group identical calls without comparing/parsing the actual statement
timestamp – time of the statement execution
EventName – either sp_statement_completed or sql_statement_completed
duration_ms – the duration in ms of the statement
object_type – shows prepared or adhoc to indicate the type of call
DatabaseName – the DB within which the statement was executed
ObjectName – the name of the stored procedure called (if applicable)
cpu_time_ms – total CPU ms used to complete the query
physical_reads – physical read count required to complete the query
logical_reads – logical read count required to complete the query
writes – write count required to complete the query
row_count – number of rows affected by the query
statement – the actual statement that was executed

With those values above, you can run all kinds of grouping and time-windowed queries to see your most costly statements by CPU/reads/writes/duration. The only thing missing for centralizing the content is to maintain an instance name with each record as well. In the queries I run through the PowerShell script, I include @@ServerName as ServerName in the select list, plus a few other control-related fields, including the .xel file name that the event came from and the offset within the file that the event is part of. Let’s take a quick step back and review how the .XEL files work in terms of logging xEvents to disk.

xEvents to Disk (.XEL file target), sys.fn_xe_file_target_read_file and Offsets

SQL Server provides a system function to read the data out of .xel files directly from T-SQL. The function is sys.fn_xe_file_target_read_file (documentation here: http://technet.microsoft.com/en-us/library/cc280743.aspx). A basic call looks like this:

SELECT top 10000 file_name, file_offset, CAST(event_data AS XML) AS event_data_XML
FROM sys.fn_xe_file_target_read_file('C:\dbamonitoring\traces\sql2012*.xel', null, null, null) AS F;

The above reads all of the files that start with “sql2012” and end with “.xel” in the directory and returns a resultset of events, including the file name the event came from, the group offset number within the file (used for filtering), and an XML column of the full event data. The results are shown below:

[Screenshot: fn_xe_file_target_read_file results showing events sharing file_offset values]

As you can see above, within a .xel file, xEvents are logged in groups (denoted by a shared file_offset value). When you call sys.fn_xe_file_target_read_file, you can pass a file name and a file_offset as parameters:

SELECT top 10000 file_name, file_offset, CAST(event_data AS XML) AS event_data_XML
FROM sys.fn_xe_file_target_read_file('C:\dbamonitoring\traces\sql2012*.xel', null, 'C:\dbamonitoring\traces\sql2012_0_130268187987170000.xel', 775168) AS F;

In the example above, the select will return rows beginning with the specified file and file_offset and continue with rows from any subsequent files in the directory that match the sql2012*.xel search string. Using this filtering functionality of the system function yielded better performance (in my limited tests) than comparing on an event timestamp, which requires XPath-ing out the timestamp value and then a type conversion to make it a date for comparison.

Part 2 coming soon…