Capture and Review SQL 2012 Query Performance History with xEvents

In this post, I'll walk through the xEvents setup we run on a number of SQL 2012 instances to collect query performance metrics as a server-side trace replacement (note: this post assumes you are running SQL 2012 or later). If you are familiar with server-side tracing and/or Profiler from SQL 2005/2008, a lot of this will make sense to you. Basically, I run an xEvent session that captures:

  • sp_statement_completed – individual statements within stored procedures that require more than 10 logical reads to complete
  • sql_statement_completed – individual statements within a SQL batch (ad hoc) that require more than 10 logical reads to complete

We could just capture stored procedure completed and batch completed events, but going down to the statement level provides a few significant advantages (and challenges).

Advantages 

  • Statement-level captures identify the specific statements within a larger stored procedure or batch, which directs optimization efforts at the truly slow elements. If you capture events at the sp or batch completed level, you still have to do the work of finding the specific parts of the stored procedure that are slowest or most problematic.
  • In SQL 2012, the sp and sql statement completed events include a query_hash value, so you can easily aggregate similar calls that use different parameters. Back when we used server-side traces to capture query execution information, I had to develop a complicated, resource-intensive process that stripped raw SQL calls of their parameter values and formatting so they could be aggregated for statistical analysis.

Disadvantages

  • Statement-level captures take in much more data than the sp/batch completed events, so you need to make sure you plan accordingly (more on this later).
  • Since there is a much higher volume of data collected, it is harder to keep a long-term, highly granular history (more on this below).

So the first step is to set up an xEvent session on the server in question.  I use this code:

CREATE EVENT SESSION [query_performance] ON SERVER
ADD EVENT sqlserver.sql_statement_completed(
    ACTION(sqlserver.query_hash, sqlserver.client_hostname, sqlserver.database_id,
           sqlserver.plan_handle, sqlserver.nt_username,
           sqlserver.client_pid,
           sqlserver.client_app_name)
    WHERE (sqlserver.database_id >= 6 AND sqlserver.query_hash <> 0 AND logical_reads > 10)
),
ADD EVENT sqlserver.sp_statement_completed(
    ACTION(sqlserver.query_hash, sqlserver.client_hostname, sqlserver.database_id,
           sqlserver.plan_handle, sqlserver.nt_username,
           sqlserver.client_pid,
           sqlserver.client_app_name)
    WHERE (sqlserver.database_id >= 6 AND sqlserver.query_hash <> 0 AND logical_reads > 10)
)
ADD TARGET package0.event_file(SET filename = N'C:\DBAMonitoring\traces\sql2012.xel',
    max_file_size = (2),
    max_rollover_files = (40))
WITH (MAX_MEMORY = 4096 KB,
    EVENT_RETENTION_MODE = ALLOW_MULTIPLE_EVENT_LOSS,
    MAX_DISPATCH_LATENCY = 120 SECONDS,
    MAX_EVENT_SIZE = 0 KB,
    MEMORY_PARTITION_MODE = NONE,
    TRACK_CAUSALITY = OFF,
    STARTUP_STATE = ON)

The code above creates the event session (it names the events to capture and whatever filters we want) but does not start it. Some additional comments about the CREATE EVENT SESSION code above:

  • I capture sql_statement_completed and sp_statement_completed to get SQL statement level information (within ad hoc batches and stored procedures, respectively). My preference is to capture only the lowest-level query activity, so I don't have batch completed and sp completed events cluttering the data set of individual SQL statements.
  • You also have the option of capturing xEvents into the ring_buffer target, but I have found the .xel files easier to deal with, and they allow more fine-grained control of retention (see the sketch after this list). As you can see above, I am capturing xEvents to .xel files in the C:\DBAMonitoring\traces folder on the SQL instance in question. I also limit files to 2 MB each and no more than 40 files total, so the maximum disk footprint is 80 MB before older files are rolled over (aged out and deleted).
  • Since I am capturing query execution statistics for later aggregation, it is OK if I don't collect every single event, and I don't want collection to impact performance, so I set EVENT_RETENTION_MODE=ALLOW_MULTIPLE_EVENT_LOSS on the event session.
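
If you do go with the ring_buffer target instead, its contents can be pulled out of the xEvents DMVs and shredded much like the file target output below. This is a minimal sketch, assuming the session name used in this post and a ring_buffer target added to it:

-- read the ring_buffer target contents for this session as XML (sketch)
SELECT CAST(t.target_data AS XML) AS ring_buffer_xml
FROM sys.dm_xe_sessions AS s
JOIN sys.dm_xe_session_targets AS t
    ON t.event_session_address = s.address
WHERE s.name = N'query_performance'
  AND t.target_name = N'ring_buffer';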

Once you run the code above, the session is created on the SQL instance but not yet started. You can either run a start command in T-SQL:

-- start session
ALTER EVENT SESSION query_performance
ON SERVER
STATE = START;

or use SSMS to start the session

use SSMS to manage event sessions from within the Management folder of the instance
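Either way, you can confirm the session is actually running with a quick check against the xEvents DMVs (a minimal sketch):

-- a running session shows up in sys.dm_xe_sessions once it is started
SELECT name, create_time
FROM sys.dm_xe_sessions
WHERE name = N'query_performance';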

Once the xEvents session is running, you can see the .xel files collecting in the directory specified:

.xel files act just like the old .trc files in that they reach a max size and then roll over to a new file. Once the rollover file count is reached, older files are deleted as new files are generated.

So just remember that the "history" you can poll from the active .xel files is limited by the volume of events captured and the sizing controls specified for .xel file retention and rollover. You can query active .xel files much like you could query .trc files in SQL 2005 and 2008. That said, I have found it easier to load the data from the .xel files into temp (or permanent) tables and then run additional queries for aggregations.

-- copy event data into temp table #Events
-- reads all .xel files that start with "sql2012" and end with ".xel" and puts each event into an XML column that we can query directly
SELECT CAST(event_data AS XML) AS event_data_XML
INTO #Events
FROM sys.fn_xe_file_target_read_file('C:\DBAMonitoring\traces\sql2012*.xel', NULL, NULL, NULL) AS F;
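
Before shredding everything, a quick sanity check (just a sketch) shows how many events were loaded and how far back the retained files actually reach:

-- optional sanity check: event count and the time window covered by the loaded .xel files
SELECT COUNT(*) AS events_loaded,
       MIN(event_time) AS oldest_event,
       MAX(event_time) AS newest_event
FROM (SELECT event_data_XML.value('(/event/@timestamp)[1]', 'datetime') AS event_time
      FROM #Events) AS E;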

After the #Events table is populated, we can run the following XPath query to build another temp table (#Queries) with the raw content separated out into individual columns.

-- extract event details into a temp table #Queries
SELECT
    event_data_XML.value('(/event/action[@name="query_hash"]/value)[1]', 'nvarchar(4000)') AS query_hash,
    event_data_XML.value('(/event/@timestamp)[1]', 'datetime') AS timestamp,
    event_data_XML.value('(/event/@name)[1]', 'varchar(50)') AS EventName,
    event_data_XML.value('(/event/data[@name="duration"]/value)[1]', 'int') / 1000 AS duration_ms,
    event_data_XML.value('(/event/data[@name="object_type"]/text)[1]', 'varchar(100)') AS object_type,
    DB_NAME(event_data_XML.value('(/event/action[@name="database_id"]/value)[1]', 'int')) AS DatabaseName,
    CASE event_data_XML.value('(/event/data[@name="object_type"]/text)[1]', 'varchar(100)')
        WHEN 'PROC' THEN OBJECT_NAME(event_data_XML.value('(/event/data[@name="object_id"]/value)[1]', 'BIGINT'))
    END AS ObjectName,
    event_data_XML.value('(/event/data[@name="cpu_time"]/value)[1]', 'int') / 1000 AS cpu_time_ms,
    event_data_XML.value('(/event/data[@name="physical_reads"]/value)[1]', 'BIGINT') AS physical_reads,
    event_data_XML.value('(/event/data[@name="logical_reads"]/value)[1]', 'BIGINT') AS logical_reads,
    event_data_XML.value('(/event/data[@name="writes"]/value)[1]', 'BIGINT') AS writes,
    event_data_XML.value('(/event/data[@name="row_count"]/value)[1]', 'BIGINT') AS row_count,
    event_data_XML.value('(/event/data[@name="statement"]/value)[1]', 'NVARCHAR(MAX)') AS statement
INTO #Queries
FROM #Events;

With the raw data now in tabular format (#Queries), you can run any number of aggregations based on the query_hash field. The query_hash is an identifier shared by similar queries that differ only in their parameter values. It is a huge time saver compared to the old server-side trace approach, where you had to write your own "genericize the SQL statement so I can group on it" functionality.

So with the #Queries table in place, you can run the query below to return the specific statements (grouped by query_hash) ordered by total logical reads:

-- filter and include a sample query
WITH QueryHashTotals AS
(
    SELECT query_hash, DatabaseName, ObjectName, EventName,
           COUNT(*) AS num_queries,
           SUM(logical_reads) AS sum_logical_reads
    FROM #Queries
    GROUP BY query_hash, DatabaseName, ObjectName, EventName
),
RunningTotals AS
(
    SELECT query_hash, DatabaseName, ObjectName, EventName, num_queries, sum_logical_reads,
           CAST(100. * sum_logical_reads
                / SUM(sum_logical_reads) OVER()
                AS NUMERIC(5, 2)) AS pct,
           CAST(100. * SUM(sum_logical_reads) OVER(ORDER BY sum_logical_reads DESC
                                                   ROWS UNBOUNDED PRECEDING)
                / SUM(sum_logical_reads) OVER()
                AS NUMERIC(5, 2)) AS running_pct
    FROM QueryHashTotals
)
SELECT RT.*,
       (SELECT TOP (1) statement
        FROM #Queries AS Q
        WHERE Q.query_hash = RT.query_hash) AS sample_statement
FROM RunningTotals AS RT
ORDER BY sum_logical_reads DESC;

With the data in tables, you can do pretty much whatever you want with it in terms of aggregations and queries. Only simple modifications to the query are required to return results by writes, duration, CPU, etc. In addition to the aggregations you can derive (Top N queries), you can also use the raw data for investigations ("What was the query stream leading up to my database issue?"). It all depends on how much data you can retain and at what level of granularity.
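
For the investigation scenario, a raw timeline query against #Queries is usually enough. This is just a sketch, and the time window is a made-up example you would replace with the period leading up to your issue:

-- raw statement stream for a specific time window (example window; adjust as needed)
SELECT [timestamp], DatabaseName, ObjectName, EventName, duration_ms, cpu_time_ms, logical_reads, [statement]
FROM #Queries
WHERE [timestamp] >= '2013-06-01 14:30' AND [timestamp] < '2013-06-01 14:45'
ORDER BY [timestamp];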

My next post will discuss how to roll out this xEvents-based query monitoring solution to multiple machines and collect the data into a centralized database for reporting across an entire environment of SQL Server 2012 instances. In the meantime, if you have any questions or issues with the samples above, email me or comment and I'll try to reply as soon as I can.