
Tuesday, November 1, 2011

Configuring a server side trace

When I’m chasing SQL Server performance problems, SQL Server Profiler is still my number one tool. Although I know that Extended Events provide a more lightweight solution, they are still a bit cumbersome to use (but I’ve seen that we can expect some improvements with SQL Server 2012).

When I’m using Profiler to isolate performance issues, I try to configure server side traces whenever possible. Fortunately, SQL Server Profiler will help you create a script for a server side trace (File/Export/Script Trace Definition), so you don’t have to figure out all the event- and column-codes. Very good!
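
Incidentally, if you ever do need to translate the numeric event and column IDs back into readable names, the trace catalog views can help. A quick sketch:

-- Look up trace event and column names for the numeric IDs
select trace_event_id, name
  from sys.trace_events

select trace_column_id, name
  from sys.trace_columns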

As I was doing the same configuration again and again, I decided to encapsulate the T-SQL code for the configuration in a stored procedure.

And here comes dbo.configureServerSideTrace:

if object_id('dbo.configureServerSideTrace', 'P') is not null
  drop procedure dbo.configureServerSideTrace
go

-- Example for:
-- Start Trace
--   declare @traceID int
--   exec dbo.configureServerSideTrace @traceStatus = 1
--                                    ,@traceID = @traceID output
--                                    ,@maxFileSize = 10000
--                                    ,@traceFileName = N'e:\VMITrace\Undo'
--                                    ,@spId = @@spid
--
-- End Trace
--   exec dbo.configureServerSideTrace @traceStatus = 0, @traceID = @traceID



create procedure dbo.configureServerSideTrace
                 (@traceStatus   bit                  -- 1 => Start Trace
                                                      -- 0 => Stop Trace
                 ,@traceID       int output           -- If the trace is started, this param will return the TraceID
                                                      -- For stopping the trace, the param has to be provided
                 ,@spId          int           = null -- Provide the @@spid if you want to filter only events for this connection
                                                      -- Optional. If not provided => no filter. Not needed for stopping the trace
                 ,@maxFileSize   bigint        = 5000 -- Maximum trace file size in megabytes. The trace will be stopped once this size is reached.
                 ,@traceFileName nvarchar(200) = null -- Name of the trace file (server side!)
                                                      -- Optional. Not needed for stopping the trace
                                                      -- Attention! If the file already exists, the SP will yield an error
                                                      -- and no trace is started.
                 )
as
begin

if (@traceStatus = 0)
begin
  exec sp_trace_setstatus @TraceID, 0
  exec sp_trace_setstatus @TraceID, 2
  return;
end

-- Create a Queue
declare @rc int

exec @rc = sp_trace_create @TraceID output, 0, @traceFileName, @maxfilesize, NULL
if (@rc != 0) goto error

-- Set the events
declare @on bit
set @on = 1
exec sp_trace_setevent @TraceID, 43, 15, @on
exec sp_trace_setevent @TraceID, 43, 48, @on
exec sp_trace_setevent @TraceID, 43, 1, @on
exec sp_trace_setevent @TraceID, 43, 34, @on
exec sp_trace_setevent @TraceID, 43, 35, @on
exec sp_trace_setevent @TraceID, 43, 51, @on
exec sp_trace_setevent @TraceID, 43, 4, @on
exec sp_trace_setevent @TraceID, 43, 12, @on
exec sp_trace_setevent @TraceID, 43, 13, @on
exec sp_trace_setevent @TraceID, 43, 14, @on
exec sp_trace_setevent @TraceID, 43, 22, @on
exec sp_trace_setevent @TraceID, 42, 1, @on
exec sp_trace_setevent @TraceID, 42, 14, @on
exec sp_trace_setevent @TraceID, 42, 22, @on
exec sp_trace_setevent @TraceID, 42, 34, @on
exec sp_trace_setevent @TraceID, 42, 35, @on
exec sp_trace_setevent @TraceID, 42, 51, @on
exec sp_trace_setevent @TraceID, 42, 4, @on
exec sp_trace_setevent @TraceID, 42, 12, @on
exec sp_trace_setevent @TraceID, 45, 16, @on
exec sp_trace_setevent @TraceID, 45, 48, @on
exec sp_trace_setevent @TraceID, 45, 1, @on
exec sp_trace_setevent @TraceID, 45, 17, @on
exec sp_trace_setevent @TraceID, 45, 18, @on
exec sp_trace_setevent @TraceID, 45, 34, @on
exec sp_trace_setevent @TraceID, 45, 35, @on
exec sp_trace_setevent @TraceID, 45, 51, @on
exec sp_trace_setevent @TraceID, 45, 4, @on
exec sp_trace_setevent @TraceID, 45, 12, @on
exec sp_trace_setevent @TraceID, 45, 13, @on
exec sp_trace_setevent @TraceID, 45, 14, @on
exec sp_trace_setevent @TraceID, 45, 22, @on
exec sp_trace_setevent @TraceID, 45, 15, @on
-- XML Statistics Profile
exec sp_trace_setevent @TraceID, 146, 1, @on
exec sp_trace_setevent @TraceID, 146, 51, @on
exec sp_trace_setevent @TraceID, 146, 4, @on
exec sp_trace_setevent @TraceID, 146, 12, @on

-- Filter: Log only events for the provided @@spid
if @spId is not null
  exec sp_trace_setfilter @TraceID, 12, 0, 0, @spID

-- Set the trace status to start
exec sp_trace_setstatus @TraceID, 1

goto finish

error:
select ErrorCode=@rc

finish:
end

go

Some annotations:

  • For the parameters, see the comments.
  • Don’t specify a filename extension for the trace file. .TRC will be added automatically.
  • Ensure that the output file does not already exist. Otherwise you’ll get an error.
  • Very often I place the code for starting and stopping the trace around some “interesting code” inside a stored procedure. That is, I wrap some more or less awkward code between starting and stopping a trace like this:

declare @traceID int
exec dbo.configureServerSideTrace @traceStatus = 1
                                 ,@traceID = @traceID output
                                 ,@maxFileSize = 10000
                                 ,@traceFileName = N'e:\MyTrace\Test'
                                 ,@spId = @@spid

--
-- Code of interest
--


exec dbo.configureServerSideTrace @traceStatus = 0, @traceID = @traceID
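
Once the trace has been stopped, the captured events can be pulled back into a query window for analysis. A minimal sketch, assuming the trace file name from the example above (remember that .TRC is appended automatically):

-- Load the server side trace file for analysis
select *
  from fn_trace_gettable(N'e:\MyTrace\Test.trc', default)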

Cheers.

Sunday, February 27, 2011

Calculating SQL Server Data Compression Savings

SQL Server 2008 Enterprise edition offers the option of storing table or index data in a compressed format, which may save a huge amount of storage space and – much more important – I/O requests and buffer pool utilization. There are two different options for data compression, namely Row and Page level compression. This blog post is not concerned with how these two work internally and will also not explain the differences between the two. If you’d like to know more about this, you’ll find plenty of useful information on the internet - including links to further articles (e.g. here, here, and here).

Whether or not compression is worthwhile isn’t an easy question to answer. One aspect that has to be taken into account is certainly the amount of storage that may be saved by storing a particular table or index in either of the two compressed formats. SSMS offers a Data Compression Wizard that can provide storage-saving estimates for row or page level compression. From the context menu of a table or index, just open Storage/Manage Compression… In the combo box at the top, select the compression type (Row or Page) and press the Calculate button at the bottom. Here’s a sample of a calculated saving for an index:

[Image: Data Compression Wizard showing the calculated compression savings for an index]

Unfortunately, SSMS does not offer an option for calculating estimated savings for more than one table or index at once. If you, let’s say, would like to know the estimated storage savings of page level compression for your 10 largest tables, there’s no GUI support in SSMS that will assist you in finding an answer. This is where the stored procedure sp_estimate_data_compression_savings comes in handy. This procedure – as you may have guessed from its name – provides estimated savings for row or page level compression for any table or index. You have to provide the table or index as a parameter to the procedure. In other words: the procedure only calculates the estimation for one table or index at a time. If you want to retrieve the calculations for more than one table or index as a result set, there’s some more work to do, since the procedure has to be invoked multiple times. Here’s a script that calculates the estimated savings of page level compression for all user tables in the current database.
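
Just to show the procedure call itself, here is a single-object invocation. This is only a sketch; dbo.FactInternetSales is taken from the AdventureWorksDW2008R2 sample database used further below, so substitute any table of your own:

-- Estimate page compression savings for a single table
exec sp_estimate_data_compression_savings
     @schema_name      = 'dbo'
    ,@object_name      = 'FactInternetSales'
    ,@index_id         = null
    ,@partition_number = null
    ,@data_compression = 'page'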

-- Determine the estimated impact of compression
-- NOTE: This script is only for SQL Server Enterprise and Developer edition.

set nocount on

-- We create a temp table for the result
if (object_id('tempdb..#comp', 'U') is not null)
  drop table #comp
go

create table #comp
(
  object_name sysname
 ,schema_name sysname
 ,index_id int
 ,partition_number int
 ,[size_with_current_compression_setting (KB)] bigint
 ,[size_with_requested_compression_setting (KB)] bigint
 ,[sample_size_with_current_compression_setting (KB)] bigint
 ,[sample_size_with_requested_compression_setting (KB)] bigint
)
go

-- Calculate estimated impact of page level compression for all
-- user-tables and indexes in all schemas.
-- NOTE:
--  1) To get the estimated impact of row level compression change the last parameter
--     of sp_estimate_data_compression_savings to 'row' instead.
--  2) We don't care about partitioning here. If this is important for you,
--     you have to modify the fourth parameter of sp_estimate_data_compression_savings.
--     Please refer to BOL.

declare @cmd nvarchar(max)
set @cmd = ''
select @cmd = @cmd
   + ';insert #comp exec sp_estimate_data_compression_savings '''
   + schema_name(schema_id) + ''','''
   + name + ''',null, null, ''page'''
  from sys.tables
 where objectproperty(object_id, 'IsUserTable') = 1
exec (@cmd);
-- Do some further calculations for a more meaningful result
with compressionSavings as
(
  select quotename(schema_name) + '.' + quotename(object_name) as table_name
        ,index_id
        ,[size_with_current_compression_setting (KB)]
        ,[size_with_requested_compression_setting (KB)]
        ,cast(case
                when [size_with_current_compression_setting (KB)] = 0 then 0
                else 100.0*(1.0 - 1.0
                       * [size_with_requested_compression_setting (KB)]
                       / [size_with_current_compression_setting (KB)])
              end as decimal(6,2)) as [Estimated Savings (%)]
    from #comp
)
select cs.table_name
      ,isnull(i.name, i.type_desc) as index_name
      ,cs.[size_with_current_compression_setting (KB)]
      ,cs.[size_with_requested_compression_setting (KB)]
      ,cs.[Estimated Savings (%)]
  from compressionSavings as cs
       left outer join sys.indexes as i
                    on i.index_id = cs.index_id
                   and i.object_id = object_id(cs.table_name, 'U')
 order by cs.[Estimated Savings (%)] desc

-- Get rid of the temp table
drop table #comp
go

The script calculates the impact of Page level compression but may easily be adapted to consider Row level compression instead; please read the comments inside the script. Also, please note that the script will only run on SQL Server Enterprise and Developer editions. All other editions do not support data compression.
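
As a quick sketch of that adaptation, only the last parameter in the dynamically built statement has to change from 'page' to 'row'; everything else stays the same:

-- Build the dynamic command for Row level compression estimates instead
select @cmd = @cmd
   + ';insert #comp exec sp_estimate_data_compression_savings '''
   + schema_name(schema_id) + ''','''
   + name + ''',null, null, ''row'''
  from sys.tables
 where objectproperty(object_id, 'IsUserTable') = 1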

Here’s a partial result retrieved from running the script against the AdventureWorksDW2008R2 database.

[Image: partial result showing estimated page compression savings per table and index, ordered by savings]

If you execute the script, please be aware that it may produce quite some I/O. Running the script against your production database during business hours is therefore not a very good idea.

Tuesday, February 22, 2011

SQL Server Start Time

Have you ever tried finding out the time your SQL Server instance was started? There are some sophisticated solutions, like the one from Tracy Hamlin (twitter), which takes advantage of the fact that tempdb is re-created every time SQL Server starts. Her solution goes like this:

select create_date
  from sys.databases where database_id = 2

Another answer to the question, which I’ve sometimes seen on the internet, queries the login time of one of the system processes:

select login_time
  from sys.dm_exec_sessions where session_id = 1

This was my preferred way – until yesterday, when I discovered the following simple method:

select sqlserver_start_time
  from sys.dm_os_sys_info

Easy, isn’t it? Interestingly though, each of the above three queries yields a different result. Here’s a query with a sample output:

select (select sqlserver_start_time
          from sys.dm_os_sys_info) as sql_server_start_time
      ,(select create_date
          from sys.databases where database_id = 2) as tempdb_creation_time
      ,(select login_time
          from sys.dm_exec_sessions where session_id = 1) as sysprocess_login_time

Result:

[Image: sample output showing three slightly different start times]

It seems the SQL Server service must be started first. Only after the service is running is tempdb created, followed by a subsequent start of all system processes. I can’t imagine that the three slightly different times make any difference in practice, e.g. if you try to find out for how many hours your SQL Server instance has been running. But there may be existing applications out there that have to be aware of the difference.
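
If uptime is all you are after, any of the three values will do. For instance, here’s a quick sketch for the number of hours the instance has been running:

-- Instance uptime in hours
select datediff(hour, sqlserver_start_time, getdate()) as uptime_hours
  from sys.dm_os_sys_info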

Sunday, January 16, 2011

Exploring SQL Server Blockings and Timeouts

Last Thursday I was giving a presentation about information collection and evaluation of SQL Server Blockings and Timeouts at the regional PASS chapter meeting in Munich.

You may download the presentation as well as the corresponding scripts here (German only).

Thursday, December 9, 2010

How useful is your backup?

A backup is worth nothing, if you can’t utilize it for restore.

You probably agree with this well-known piece of wisdom, don’t you?

This week I had to learn another aspect of it - the hard way: There are many cases that may require a restore. Only one of those cases is recovering a database from a state of failure. Another situation may require access to legacy data of a long-since decommissioned database, a database that doesn’t exist anymore. In this case a successful restore from an older backup, although necessary, may not be sufficient. Here’s the story why.

I was called by a customer who asked me to pull two documents out of a legacy (SQL Server 2000) database. We found an 8 year old backup we could rely on, and the restore worked well. Great! But then we discovered the documents had been stored in an IMAGE column and nobody had an idea how they had been encoded at the time they were stored. We soon recognized that we needed the original application to get access to the documents, only to realize that nobody knew where to find the installation package. Eventually I found an 8 year old backup which included the old VB6 code. I could have used this code to rebuild the application and also an installation package, but only with considerable difficulties, because:

  • The application uses some ActiveX components that I may not be able to find anymore.
  • The original application ran on Windows NT, so I might have to install this OS first, including the required service pack. Even if I found the installation CDs, I doubt I would be able to find the appropriate device drivers, but maybe I could set up a virtual machine.
  • I would have to install Visual Studio 6, including the latest service pack, and I had no idea where to find the installation CDs.
  • The application may rely on deprecated SQL Server 2000 features, so I would need a SQL Server 2000 installation on which the existing backup could be restored. In the worst case, I might even have to install the appropriate service pack to make the application work. I have no idea which service pack this would be, so there’s a chance I’d have to experiment.

I realized that all those steps require a big effort and would take some days to accomplish. If only we had built a virtual machine of at least one legacy client system before replacing all client PCs with newer machines…

So, I’d modify the introductory statement like this:

A backup is worth nothing, if you can’t utilize the data that is contained inside this backup.

Thursday, November 4, 2010

Multiple statistics sharing the same leading column

SQL Server will not prevent you from creating identical indexes or statistics. You may think that everything is under your control, but have you ever added a missing index that was reported by the query optimizer or the DTA? If so, you will certainly have created duplicate statistics. Have a look at the following example:

-- create test table with 3 columns
-- filler is only needed to widen the row

create table T1
 (
  c1 int not null identity(1,1) primary key
 ,c2 int not null
 ,filler nchar(200) not null default '#'
 )
go

-- insert 500000 rows with random
-- values between 0 and 49999 for c2

insert T1(c2)
  select top(500000) abs(checksum(newid())) % 50000
    from sys.fulltext_system_stopwords as a
        ,sys.fulltext_system_stopwords as b

Before invoking the following SELECT command, enable display of the actual execution plan (Ctrl-M, or select Query/Include Actual Execution Plan from the menu).

-- get some data
select * from T1 where c2 = 100
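
If you prefer staying in T-SQL instead of using the SSMS option, the actual plan can also be returned as XML alongside the result. A minimal sketch:

-- Return the actual execution plan as XML
set statistics xml on
select * from T1 where c2 = 100
set statistics xml off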

If you look at the execution plan, you see that the optimizer complains about a missing index on column c2. The predicted improvement is about 99%, so adding the index is certainly a good idea. Let’s do it:

-- add missing index
create nonclustered index ix_1 on t1(c2)

Perfect! The query runs much faster now and needs a lot fewer resources. But have a look at the table’s statistics:

[Image: statistics of table T1, showing three statistics objects]

You see three statistics: one for the primary key, a second one for our created index ix_1, and a third one that was automatically created during execution plan generation for the first SELECT statement. This is the statistics object named _WA_Sys…. If the AUTO CREATE STATISTICS option is set to ON, the optimizer will add missing statistics automatically. In our little experiment, the optimizer had to generate this column statistics on column c2 in order to make some assumptions about the number of rows that had to be processed.
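
If you prefer querying the metadata over browsing Object Explorer, a small sketch like the following lists the statistics on T1 together with their leading column (the exact _WA_Sys… name will differ on your system):

-- List all statistics on T1 and their leading column
select s.name as stats_name, s.auto_created, c.name as column_name
  from sys.stats as s
       inner join sys.stats_columns as sc
               on sc.object_id = s.object_id and sc.stats_id = s.stats_id
       inner join sys.columns as c
               on c.object_id = sc.object_id and c.column_id = sc.column_id
 where s.object_id = object_id('T1')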

And here’s the problem: When creating the index on column c2, a statistics object on this column is also created, since every index has a corresponding linked statistics object. That’s just the way it works. At the time the index was added, the column statistics on c2 (that _WA_Sys… statistics) already existed. If you don’t remove it manually, this statistics object will remain there forever, although it is needless now. All it’s good for is increasing the maintenance effort during statistics updates. You can safely remove this statistics object by executing:

drop statistics t1._WA_Sys_...

If you didn’t think about this before, there’s a chance that you’ll find some of those superfluous statistics duplicates inside your database(s). Here’s a query that finds index-related and column-statistics that match on the first column. Looking for matches on the first column is sufficient here, since the optimizer only automatically adds missing single-column statistics.

with all_stats as
(
  select s.object_id, s.name as stats_name, s.stats_id, s.has_filter
        ,s.auto_created, sc.stats_column_id
        ,sc.column_id, c.name as column_name
    from sys.stats as s
         inner join sys.stats_columns as sc
                 on s.stats_id = sc.stats_id
                and s.object_id = sc.object_id
                and sc.stats_column_id = 1
         inner join sys.columns as c
                 on c.object_id = s.object_id
                and c.object_id = sc.object_id
                and c.column_id = sc.column_id
   where objectproperty(s.object_id, 'IsUserTable') = 1
)
select row_number() over(partition by s1.object_id, s1.column_name order by s1.column_name) as stats_no
      ,object_name(s1.object_id) as table_name
      ,s1.stats_name as stats_name
      ,s2.stats_name as identical_stats_name
      ,s2.column_name
  from all_stats as s1
       inner join all_stats as s2
               on s1.object_id = s2.object_id
              and s1.stats_id != s2.stats_id
              and s1.stats_column_id = s2.stats_column_id
              and s1.column_id = s2.column_id
              and s1.has_filter = s2.has_filter
              and s1.auto_created != s2.auto_created

With that query at hand, you may easily find redundant statistics. Here’s a sample output:

[Image: sample output listing pairs of redundant statistics]

If you’d like to find out more about SQL Server statistics, you may want to check out my series of two articles, published recently on the Simple-Talk platform.

Part 1: Queries, Damned Queries and Statistics
Part 2: SQL Server Statistics: Problems and Solutions

It’s free, so you might want to give it a try. Feel free to vote, if you like it!
