Collecting xEvents Query Stats From Multiple SQL Instances to a Central Database Part 1

In my previous post, I discussed setting up xEvents sessions in lieu of server-side traces to collect statement-level activity from a SQL Server instance.  I covered the basics of setting up the session to write the sp_statement_completed and sql_statement_completed events to a .xel event file.  I also provided some queries that read the entries from all .xel files in a directory and parse the content into a regular tabular result set for sorting, filtering, and aggregation.  While this is all very useful information to capture on a specific instance, most (if not all) of us in the DBA world manage multiple SQL Server instances across production and non-production environments.  In this post I’ll cover the following topics:

1) Standardizing your xEvents sessions on individual SQL instances to streamline the centralization process

2) Using PowerShell to automate the collection of xEvent data from multiple instances and store it in a centralized set of tables on one instance (a central management or monitoring SQL Server)

3) Automating an aggregation process to generate pre-aggregated table rows (by day, month, etc.) for the query data we are collecting.

Standardize your Query Performance xEvents Sessions

In order to simplify the centralization of xEvent data, I adhered to the following simple configurations:

  1. All query performance xEvent sessions write their .xel files to the same named directory on every SQL instance.  In my case, I use c:\DBAMONITORING\traces for this.
  2. I configure the xEvent session to write to 2MB files and to roll over after 40 files have accumulated.  This ensures that I never have more than 80MB of disk space in use on the local machine.  Later in this series, I’ll cover how I collect and centralize the data from the .xel files every 15 minutes; as long as you don’t produce more than 80MB worth of .xel files every 15 minutes, you probably won’t lose captured event data between polls.
  3. All query performance xEvent .xel files should have the same prefix so they can be read together.  In my case, all .xel file names take the form SQL2012<somestring>.xel.
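
As a reminder of what such a standardized session might look like, here is a minimal sketch that applies the three settings above.  The session name QueryPerformance and the file suffix _QueryPerf are placeholders of mine, not names from the original setup, and the choice of actions is an assumption; adjust all of them to match your own session.

```sql
-- Sketch only: the session name and the file suffix are hypothetical.
CREATE EVENT SESSION [QueryPerformance] ON SERVER
ADD EVENT sqlserver.sp_statement_completed (
    ACTION (sqlserver.database_id, sqlserver.query_hash)
),
ADD EVENT sqlserver.sql_statement_completed (
    ACTION (sqlserver.database_id, sqlserver.query_hash)
)
ADD TARGET package0.event_file (
    SET filename           = N'C:\DBAMONITORING\traces\SQL2012_QueryPerf.xel',
        max_file_size      = (2),   -- size of each file, in MB
        max_rollover_files = (40)   -- 40 files x 2 MB = 80 MB ceiling on disk
)
WITH (STARTUP_STATE = ON);          -- start the session again when the instance restarts
GO

-- Creating a session does not start it.
ALTER EVENT SESSION [QueryPerformance] ON SERVER STATE = START;
```

Because the directory, prefix, and size limits are identical on every instance, the central collector can use one search string (sql2012*.xel in the same folder) everywhere.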

With these settings in place on each SQL 2012 instance that I want to collect xEvent query execution data from, I can run a PowerShell script on a central database instance that connects remotely to the monitored instances and pulls the recent xEvent data into a set of tables.

Centralized xEvent Processing and Collection

The diagram below illustrates the simple way that this solution works.

[Figure 1]

The central database has a few tables to handle the following:

1) Control Table – has the names of the SQL instances that you would like to collect xEvent data from.  The PowerShell script uses this list to determine which SQL instances to collect from each time it runs.

2) xEvent Tables – These tables store the query performance information (xEvents) from the various servers in your environment.  Then you can run queries or write reports on top of this data to review instance specific query performance metrics.
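
To make the two kinds of tables concrete, here is a rough sketch of what they could look like.  The table name XEvents_Queries comes from the comment thread on this post; the control table name (MonitoredInstances), every column data type, and the instance name SQLPROD01 are assumptions of mine for illustration, not the actual schema.

```sql
-- Control table: one row per monitored instance (hypothetical name and columns).
CREATE TABLE dbo.MonitoredInstances (
    InstanceName sysname NOT NULL PRIMARY KEY,
    IsEnabled    bit     NOT NULL DEFAULT (1)
);

-- xEvent table: one row per captured statement (data types are assumptions).
CREATE TABLE dbo.XEvents_Queries (
    ServerName     sysname       NOT NULL,  -- instance the event came from
    file_name      nvarchar(260) NOT NULL,  -- .xel file the event was read from
    file_offset    bigint        NOT NULL,  -- offset of the event's group in that file
    query_hash     decimal(20,0) NULL,
    [timestamp]    datetime2(3)  NOT NULL,
    EventName      varchar(50)   NOT NULL,
    duration_ms    bigint        NULL,
    object_type    varchar(25)   NULL,
    DatabaseName   sysname       NULL,
    ObjectName     sysname       NULL,
    cpu_time_ms    bigint        NULL,
    physical_reads bigint        NULL,
    logical_reads  bigint        NULL,
    writes         bigint        NULL,
    row_count      bigint        NULL,
    [statement]    nvarchar(max) NULL
);

-- Example report: top 10 query hashes by total CPU for one instance, last 24 hours.
-- xEvent timestamps are recorded in UTC, so compare against UTC time.
SELECT TOP (10)
    query_hash,
    COUNT(*)           AS executions,
    SUM(cpu_time_ms)   AS total_cpu_ms,
    SUM(logical_reads) AS total_logical_reads,
    AVG(duration_ms)   AS avg_duration_ms
FROM dbo.XEvents_Queries
WHERE ServerName = N'SQLPROD01'   -- hypothetical instance name
  AND [timestamp] >= DATEADD(DAY, -1, SYSUTCDATETIME())
GROUP BY query_hash
ORDER BY total_cpu_ms DESC;
```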

The PowerShell script performs the following actions:

1) Query the Control table to determine which instances to collect from.
2) Loop through the list of instances and, for each one:
   a) Connect and read the latest xEvent data from the .xel files on the server.
   b) Pre-process the data on the remote machine (in a temp table), then load it directly into the xEvent tables in the central database.

In my previous post (Capture and Review SQL 2012 Query Performance History with xEvents) I provided a T-SQL script to run on a local instance that reads the .xel files into a temp table as XML strings.  The second half of the query then parses the XML data into the familiar trace-like fields, including:

Field Name      Description
query_hash      the unique query hash value that allows you to group identical calls without comparing/parsing the actual statement
timestamp       time of the statement execution
EventName       either sp_statement_completed or sql_statement_completed
duration_ms     the duration in ms of the statement
object_type     shows prepared or adhoc to indicate the type of call
DatabaseName    the DB within which the statement was executed
ObjectName      the name of the stored procedure called (if applicable)
cpu_time_ms     total CPU ms used to complete the query
physical_reads  physical read count required to complete the query
logical_reads   logical read count required to complete the query
writes          write count required to complete the query
row_count       number of rows affected by the query
statement       the actual statement that was executed

With those values above, you can run all kinds of grouping and time-windowed queries to see your most costly statements by CPU, reads, writes, or duration.  The only thing missing to centralize the content is an instance name stored with each record.  In the queries I run through the PowerShell script, I include @@SERVERNAME AS ServerName in the select list, plus a few other control-related fields, including the name of the .xel file that the event came from and the offset within the file that the event is part of.  Let’s take a quick step back and review how .xel files work in terms of logging xEvents to disk.
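
Before that, here is a minimal sketch of the kind of select list described above: the instance name, the file name and offset, and a handful of the parsed fields.  It is not the full script from the previous post; the XPath expressions and data types are my assumptions, and the division by 1000 assumes that duration and cpu_time are reported in microseconds, which you should verify on your version.

```sql
-- Sketch: shred a few fields out of the event XML and tag each row with its source.
SELECT
    @@SERVERNAME  AS ServerName,   -- instance the rows were read on
    x.file_name,                   -- .xel file the event came from
    x.file_offset,                 -- offset of the event's group within that file
    x.event_data_XML.value('(/event/@name)[1]', 'varchar(50)')       AS EventName,
    x.event_data_XML.value('(/event/@timestamp)[1]', 'datetime2(3)') AS [timestamp],
    x.event_data_XML.value('(/event/action[@name="query_hash"]/value)[1]',
                           'decimal(20,0)')                          AS query_hash,
    x.event_data_XML.value('(/event/data[@name="duration"]/value)[1]',
                           'bigint') / 1000                          AS duration_ms,
    x.event_data_XML.value('(/event/data[@name="cpu_time"]/value)[1]',
                           'bigint') / 1000                          AS cpu_time_ms,
    x.event_data_XML.value('(/event/data[@name="logical_reads"]/value)[1]',
                           'bigint')                                 AS logical_reads,
    x.event_data_XML.value('(/event/data[@name="statement"]/value)[1]',
                           'nvarchar(max)')                          AS [statement]
FROM (
    SELECT file_name, file_offset, CAST(event_data AS XML) AS event_data_XML
    FROM sys.fn_xe_file_target_read_file(
             N'C:\dbamonitoring\traces\sql2012*.xel', NULL, NULL, NULL)
) AS x;
```

The query_hash value is only present if the session collects the sqlserver.query_hash action; otherwise that column comes back NULL.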

xEvents to Disk (.XEL file target), sys.fn_xe_file_target_read_file and Offsets

SQL Server provides a system function to read the data out of .xel files directly from TSQL.  The function is sys.fn_xe_file_target_read_file (documentation here: http://technet.microsoft.com/en-us/library/cc280743.aspx).  A basic call looks like this:

SELECT TOP 10000 file_name, file_offset, CAST(event_data AS XML) AS event_data_XML
FROM sys.fn_xe_file_target_read_file('C:\dbamonitoring\traces\sql2012*.xel', NULL, NULL, NULL) AS F;

The above reads all of the files in the directory that start with “sql2012” and end with “.xel” and returns a result set of events, including the name of the file each event came from, the group offset number within the file (used for filtering), and an XML column of the full event data.  The results are shown below:

[Screenshot: results of the query above, showing the file_name, file_offset, and event_data_XML columns]

As you can see above, xEvents in a .xel file are logged in groups (denoted by a shared file_offset value).  When you call sys.fn_xe_file_target_read_file, you can pass a file name and a file_offset as the third and fourth parameters:

SELECT TOP 10000 file_name, file_offset, CAST(event_data AS XML) AS event_data_XML
FROM sys.fn_xe_file_target_read_file('C:\dbamonitoring\traces\sql2012*.xel', NULL, 'C:\dbamonitoring\traces\sql2012_0_130268187987170000.xel', 775168) AS F;

In the example above, the select will return rows beginning with the specified file and file_offset and continue with rows from any subsequent files in the directory that match the sql2012*.xel search string.  Using this filtering functionality of the system function yielded better performance (in my limited tests) than comparing on an event timestamp which requires xpath-ing out the timestamp value and then a type conversion to make it a date for comparison.
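
The hard-coded file name and offset can be replaced with values saved from the previous load.  The sketch below is one possible way to do that, assuming the central table stores ServerName, file_name, file_offset, and timestamp for each event; those column names and the instance name SQLPROD01 are hypothetical, and the actual incremental load is covered later in this series.

```sql
-- Step 1 (central database): find the last position loaded for one instance.
DECLARE @instance sysname = N'SQLPROD01';   -- hypothetical instance name
DECLARE @saved_file nvarchar(260), @saved_offset bigint;

SELECT TOP (1) @saved_file = file_name, @saved_offset = file_offset
FROM dbo.XEvents_Queries
WHERE ServerName = @instance
ORDER BY [timestamp] DESC;

SELECT @saved_file AS last_file, @saved_offset AS last_offset;
GO

-- Step 2 (monitored instance): the collecting script substitutes the values from
-- step 1 here. If both are NULL (first run), the function reads every matching file.
DECLARE @last_file   nvarchar(260) = N'C:\dbamonitoring\traces\sql2012_0_130268187987170000.xel';
DECLARE @last_offset bigint        = 775168;

SELECT file_name, file_offset, CAST(event_data AS XML) AS event_data_XML
FROM sys.fn_xe_file_target_read_file(
         N'C:\dbamonitoring\traces\sql2012*.xel', NULL, @last_file, @last_offset) AS F;
```

This sketch does not handle the case where the saved file has already been rolled off the monitored instance between polls; a real collector would need a fallback for that.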

Part 2 coming soon.


2 thoughts on “Collecting xEvents Query Stats From Multiple SQL Instances to a Central Database Part 1”

  1. In the last query of part 1, you hard-coded the name of the initial .xel file. Is there any way to get that without hard-coding it? Also, where do you store the file offset value so the next time your code runs, it’ll use the most recent offset?

    • If you check out Part 2, I store the filename and offset of the event in the XEvents_Queries table in the central database. Then in Part 3, the PowerShell script uses those values from the table to perform an incremental load of XEL files since the last load. I hope that clarifies.
