Baselining SQL Server for Azure

When migrating SQL Server workloads to Azure there is a trade-off between optimizing for performance and optimizing for costs. You need to know how much work your sql servers are performing when evaluating them prior to a migration.

Ideally you will have a tool that will capture key metrics such as your servers wait statistics as well as resource consumption, however not all of us have that luxury, and they can often be hard to aggregate and report from. This post will show how you can capture a lot of these metrics from within sql server negating the need for a monitoring tool and at the same time giving you an easy repository to analyse from.

To start of you need to know what the metrics are you need to capture, these typically include:

CPU – example perfmon counters
- \Processor Information(_Total)% Processor Time
- \Process(sqlservr)% Processor Time
Memory
Storage
- \LogicalDisk\Disk Reads/Sec (read IOPS)
- \LogicalDisk\Disk Writes/Sec (write IOPS)
- \LogicalDisk\Disk Read Bytes/Sec (read throughput requirements for the data, log, and tempdb files)
- \LogicalDisk\Disk Write Bytes/Sec (write throughput requirements for the data, log, and tempdb files)

As documented here – https://learn.microsoft.com/en-us/azure/azure-sql/virtual-machines/windows/performance-guidelines-best-practices-collect-baseline?view=azuresql-vm

Perfmon is one way of capturing this, but it is also readily available from within sql server itself.

CPU

For CPU you can use sys.dm_os_ring_buffers DMV and specifically type ‘RING_BUFFER_SCHEDULER_MONITOR’ which contains cpu utilisation captured every 60 seconds to give you the SQL process utilisation.

–CPU

    WITH SQLProcessCPU
    AS (SELECT TOP (30)
               SQLProcessUtilization AS ‘SQL_CPU_Usage’,
               SystemIdle AS ‘CPU_Idle’,
               100 – y.SystemIdle – y.SQLProcessUtilization AS ‘Other_CPU_USage’,
               ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS ‘row_number’
        FROM
        (
            SELECT record.value(‘(./Record/@id)[1]’, ‘int’) AS record_id,
                   record.value(‘(./Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]’, ‘int’) AS [SystemIdle],
                   record.value(‘(./Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]’, ‘int’) AS [SQLProcessUtilization],
                   timestamp
            FROM
            (
                SELECT timestamp,
                       CONVERT(XML, record) AS [record]
                FROM sys.dm_os_ring_buffers
                WHERE ring_buffer_type = N’RING_BUFFER_SCHEDULER_MONITOR’ 
                AN D record like ‘%<SystemHealth>%’
  ) AS x
        ) AS y )
    –INSERT INTO [DBA].perf.cpu
    SELECT GETDATE() AS Dt,
           AVG(SQLProcessCPU.SQL_CPU_Usage) AS ‘SQLProcessUtilization_1_Minutes’,
           AVG(SQLProcessCPU.CPU_Idle) AS [AVG_CPU_Idle],
           AVG(SQLProcessCPU.Other_CPU_USage) AS [AVG_Other_Processes]
    FROM SQLProcessCPU
    WHERE row_number
    BETWEEN 1 AND 2;

–CPU

WITH SQLProcessCPU

AS (SELECT TOP (30)

SQLProcessUtilization AS ‘SQL_CPU_Usage’,

SystemIdle AS ‘CPU_Idle’,

100 – y.SystemIdle – y.SQLProcessUtilization AS ‘Other_CPU_USage’,

ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS ‘row_number’

FROM

(

SELECT record.value(‘(./Record/@id)[1]’, ‘int’) AS record_id,

record.value(‘(./Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]’, ‘int’) AS [SystemIdle],

record.value(‘(./Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]’, ‘int’) AS [SQLProcessUtilization],

timestamp

FROM

(

SELECT timestamp,

CONVERT(XML, record) AS [record]

FROM sys.dm_os_ring_buffers

WHERE ring_buffer_type = N‘RING_BUFFER_SCHEDULER_MONITOR’

AN D record like ‘%<SystemHealth>%’

) AS x

) AS y )

–INSERT INTO [DBA].perf.cpu

SELECT GETDATE() AS Dt,

AVG(SQLProcessCPU.SQL_CPU_Usage) AS ‘SQLProcessUtilization_1_Minutes’,

AVG(SQLProcessCPU.CPU_Idle) AS [AVG_CPU_Idle],

AVG(SQLProcessCPU.Other_CPU_USage) AS [AVG_Other_Processes]

FROM SQLProcessCPU

WHERE row_number

BETWEEN 1 AND 2;

Memory

Memory performance metrics can be found by querying the sys.dm_os_performance_counters and sys.dm_os_sys_memory DMV’s

–Memory 
 
    –INSERT INTO [DBA].perf.memory 
    SELECT GETDATE() AS Dt, 
           pc.object_name, 
           pc.counter_name, 
           pc.instance_name, 
           pc.cntr_value, 
           pc.cntr_type 
    FROM sys.dm_os_performance_counters pc 
    WHERE pc.counter_name IN ( ‘Target Server Memory (KB)’, ‘Total Server Memory (KB)’, ‘Lazy writes/sec’, 
                               ‘page life expectancy’ 
                             ) 
          AND pc.object_name NOT LIKE ‘%:Buffer Node%’ 
    UNION 
    SELECT GETDATE() AS Dt, 
           ‘sys.dm_os_sys_memory’ AS [object_name], 
           ‘available_physical_memory_kb’ AS counter_name, 
           ‘OS’ AS [instance_name], 
           available_physical_memory_kb AS [cntr_value], 
           ” AS [cntr_type] 
    FROM sys.dm_os_sys_memory;

—Memory

—INSERT INTO [DBA].perf.memory

SELECT GETDATE() AS Dt,

pc.object_name,

pc.counter_name,

pc.instance_name,

pc.cntr_value,

pc.cntr_type

FROM sys.dm_os_performance_counters pc

WHERE pc.counter_name IN ( ‘Target Server Memory (KB)’, ‘Total Server Memory (KB)’, ‘Lazy writes/sec’,

‘page life expectancy’

)

AND pc.object_name NOT LIKE ‘%:Buffer Node%’

UNION

SELECT GETDATE() AS Dt,

‘sys.dm_os_sys_memory’ AS [object_name],

‘available_physical_memory_kb’ AS counter_name,

‘OS’ AS [instance_name],

available_physical_memory_kb AS [cntr_value],

” AS [cntr_type]

FROM sys.dm_os_sys_memory;

Batch Requests/second

 –SQL Server Metrics 
 
    DECLARE @v1 BIGINT, 
            @delay SMALLINT = 2, 
            @time DATETIME; 
 
    SELECT @time = DATEADD(SECOND, @delay, ’00:00:02′); 
 
    SELECT @v1 = cntr_value 
    FROM master.sys.dm_os_performance_counters 
    WHERE counter_name = ‘Batch Requests/sec’; 
 
    WAITFOR DELAY @time; 
 
   /* INSERT INTO perf.SQLServer 
    ( 
        Dt, 
        Instance, 
        Metric, 
        Metric_Value 
    ) */
    SELECT GETDATE() AS [Dt], 
           @@SERVERNAME AS [Instance], 
           ‘BatchRequests/sec’ AS [Metric], 
           (cntr_value – @v1) / @delay AS [Metric_Value] 
    FROM master.sys.dm_os_performance_counters 
    WHERE counter_name = ‘Batch Requests/sec’;

—SQL Server Metrics

DECLARE @v1 BIGINT,

@delay SMALLINT = 2,

@time DATETIME;

SELECT @time = DATEADD(SECOND, @delay, ’00:00:02′);

SELECT @v1 = cntr_value

FROM master.sys.dm_os_performance_counters

WHERE counter_name = ‘Batch Requests/sec’;

WAITFOR DELAY @time;

/* INSERT INTO perf.SQLServer

(

Dt,

Instance,

Metric,

Metric_Value

) */

SELECT GETDATE() AS [Dt],

@@SERVERNAME AS [Instance],

‘BatchRequests/sec’ AS [Metric],

(cntr_value – @v1) / @delay AS [Metric_Value]

FROM master.sys.dm_os_performance_counters

WHERE counter_name = ‘Batch Requests/sec’;

Storage

I have taken Tim Radney’s modification of Paul Randal’s IO Script https://www.sqlskills.com/blogs/tim/capturing-throughput-usage-in-sql-server/

 –Storage

    /*============================================================================
  File:     ShortPeriodIOLatencies.sql
 
  Summary:  Short snapshot of I/O latencies
 
  SQL Server Versions: 2005 onwards
——————————————————————————
  Written by Paul S. Randal, SQLskills.com
 
  (c) 2014, SQLskills.com. All rights reserved.
 
  For more scripts and sample code, check out http://www.SQLskills.com
 
  You may alter this code for your own *non-commercial* purposes (e.g. in a
  for-sale commercial tool). Use in your own environment is encouraged.
  You may republish altered code as long as you include this copyright and
  give due credit, but you must obtain prior permission before blogging
  this code.
 
  THIS CODE AND INFORMATION ARE PROVIDED “AS IS” WITHOUT WARRANTY OF
  ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED
  TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A
  PARTICULAR PURPOSE.

  OF – Modified to include Tim Radneys analysis for azure storage
============================================================================*/

IF EXISTS
    (
        SELECT *
        FROM [tempdb].[sys].[objects]
        WHERE [name] = N’##SQLskillsStats1′
    )
        DROP TABLE [##SQLskillsStats1];

    IF EXISTS
    (
        SELECT *
        FROM [tempdb].[sys].[objects]
        WHERE [name] = N’##SQLskillsStats2′
    )
        DROP TABLE [##SQLskillsStats2];


    SELECT [database_id],
           [file_id],
           [num_of_reads],
           [io_stall_read_ms],
           [num_of_writes],
           [io_stall_write_ms],
           [io_stall],
           [num_of_bytes_read],
           [num_of_bytes_written],
           [file_handle]
    INTO ##SQLskillsStats1
    FROM sys.dm_io_virtual_file_stats(NULL, NULL);

    WAITFOR DELAY 00:02:00 –2 minutes;

     SELECT [database_id],
           [file_id],
           [num_of_reads],
           [io_stall_read_ms],
           [num_of_writes],
           [io_stall_write_ms],
           [io_stall],
           [num_of_bytes_read],
           [num_of_bytes_written],
           [file_handle]
    INTO ##SQLskillsStats2
    FROM sys.dm_io_virtual_file_stats(NULL, NULL);


    WITH [DiffLatencies]
    AS (SELECT
            — Files that werent in the first snapshot
            [ts2].[database_id],
            [ts2].[file_id],
            [ts2].[num_of_reads],
            [ts2].[io_stall_read_ms],
            [ts2].[num_of_writes],
            [ts2].[io_stall_write_ms],
            [ts2].[io_stall],
            [ts2].[num_of_bytes_read],
            [ts2].[num_of_bytes_written]
        FROM [##SQLskillsStats2] AS [ts2]
            LEFT OUTER JOIN [##SQLskillsStats1] AS [ts1]
                ON [ts2].[file_handle] = [ts1].[file_handle]
        WHERE [ts1].[file_handle] IS NULL
        UNION
        SELECT
            — Diff of latencies in both snapshots
            [ts2].[database_id],
            [ts2].[file_id],
            [ts2].[num_of_reads] – [ts1].[num_of_reads] AS [num_of_reads],
            [ts2].[io_stall_read_ms] – [ts1].[io_stall_read_ms] AS [io_stall_read_ms],
            [ts2].[num_of_writes] – [ts1].[num_of_writes] AS [num_of_writes],
            [ts2].[io_stall_write_ms] – [ts1].[io_stall_write_ms] AS [io_stall_write_ms],
            [ts2].[io_stall] – [ts1].[io_stall] AS [io_stall],
            [ts2].[num_of_bytes_read] – [ts1].[num_of_bytes_read] AS [num_of_bytes_read],
            [ts2].[num_of_bytes_written] – [ts1].[num_of_bytes_written] AS [num_of_bytes_written]
        FROM [##SQLskillsStats2] AS [ts2]
            LEFT OUTER JOIN [##SQLskillsStats1] AS [ts1]
                ON [ts2].[file_handle] = [ts1].[file_handle]
        WHERE [ts1].[file_handle] IS NOT NULL)
    INSERT INTO DBA.perf.storage
    (
        Dt,
        DB,
        Drive,
        Type_Desc,
        Reads,
        Writes,
        ReadLatency_ms,
        WriteLatency_ms,
        AvgBPerRead,
        AvgBPerWrite,
        Physical_Name,
        num_of_mb_written,
        num_of_mb_read,
        mb_per_sec_Written,
        mb_per_Sec_Read
    )
    SELECT GETDATE() AS Dt,
           DB_NAME([vfs].[database_id]) AS [DB],
           LEFT([mf].[physical_name], 2) AS [Drive],
           [mf].[type_desc],
           [num_of_reads] AS [Reads],
           [num_of_writes] AS [Writes],
           [ReadLatency(ms)] = CASE
                                   WHEN [num_of_reads] = 0 THEN
                                       0
                                   ELSE
           ([io_stall_read_ms] / [num_of_reads])
                               END,
           [WriteLatency(ms)] = CASE
                                    WHEN [num_of_writes] = 0 THEN
                                        0
                                    ELSE
           ([io_stall_write_ms] / [num_of_writes])
                                END,
           — [Latency] =
           — CASE WHEN ([num_of_reads] = 0 AND [num_of_writes] = 0)
           — THEN 0 ELSE ([io_stall] / ([num_of_reads] + [num_of_writes])) END,
           [AvgBPerRead] = CASE
                               WHEN [num_of_reads] = 0 THEN
                                   0
                               ELSE
           ([num_of_bytes_read] / [num_of_reads])
                           END,
           [AvgBPerWrite] = CASE
                                WHEN [num_of_writes] = 0 THEN
                                    0
                                ELSE
           ([num_of_bytes_written] / [num_of_writes])
                            END,
           — [AvgBPerTransfer] =
           — CASE WHEN ([num_of_reads] = 0 AND [num_of_writes] = 0)
           — THEN 0 ELSE
           — (([num_of_bytes_read] + [num_of_bytes_written]) /
           — ([num_of_reads] + [num_of_writes])) END,
           [mf].[physical_name],
           (num_of_bytes_written / 1048576) AS num_of_mb_written,
           (num_of_bytes_read / 1048576) AS num_of_mb_read,
           (num_of_bytes_written / 1048576 / 120) AS mb_per_sec_Written,
           (num_of_bytes_read / 1048576 / 120) AS mb_per_sec_read
    FROM [DiffLatencies] AS [vfs]
        JOIN sys.master_files AS [mf]
            ON [vfs].[database_id] = [mf].[database_id]
               AND [vfs].[file_id] = [mf].[file_id]
    — ORDER BY [ReadLatency(ms)] DESC
    ORDER BY [WriteLatency(ms)] DESC;


    — Cleanup
    IF EXISTS
    (
        SELECT *
        FROM [tempdb].[sys].[objects]
        WHERE [name] = N’##SQLskillsStats1′
    )
        DROP TABLE [##SQLskillsStats1];

    IF EXISTS
    (
        SELECT *
        FROM [tempdb].[sys].[objects]
        WHERE [name] = N’##SQLskillsStats2′
    )
        DROP TABLE [##SQLskillsStats2];

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

134

135

136

137

138

139

140

141

142

143

144

145

146

147

148

149

150

151

152

153

154

155

156

157

158

159

160

161

162

163

164

165

166

167

168

169

170

171

172

173

174

175

176

177

178

179

180

181

182

183

184

185

186

187

188

189

190

191

192

—Storage

/*============================================================================

File: ShortPeriodIOLatencies.sql

Summary: Short snapshot of I/O latencies

SQL Server Versions: 2005 onwards

——————————————————————————

Written by Paul S. Randal, SQLskills.com

For more scripts and sample code, check out http://www.SQLskills.com

You may alter this code for your own *non-commercial* purposes (e.g. in a

for-sale commercial tool). Use in your own environment is encouraged.

You may republish altered code as long as you include this copyright and

give due credit, but you must obtain prior permission before blogging

this code.

THIS CODE AND INFORMATION ARE PROVIDED “AS IS” WITHOUT WARRANTY OF

ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED

TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A

PARTICULAR PURPOSE.

OF – Modified to include Tim Radneys analysis for azure storage

============================================================================*/

IF EXISTS

(

SELECT *

FROM [tempdb].[sys].[objects]

WHERE [name] = N‘##SQLskillsStats1’

)

DROP TABLE [##SQLskillsStats1];

IF EXISTS

(

SELECT *

FROM [tempdb].[sys].[objects]

WHERE [name] = N‘##SQLskillsStats2’

)

DROP TABLE [##SQLskillsStats2];

SELECT [database_id],

[file_id],

[num_of_reads],

[io_stall_read_ms],

[num_of_writes],

[io_stall_write_ms],

[io_stall],

[num_of_bytes_read],

[num_of_bytes_written],

[file_handle]

INTO ##SQLskillsStats1

FROM sys.dm_io_virtual_file_stats(NULL, NULL);

WAITFOR DELAY 00:02:00 —2 minutes;

SELECT [database_id],

[file_id],

[num_of_reads],

[io_stall_read_ms],

[num_of_writes],

[io_stall_write_ms],

[io_stall],

[num_of_bytes_read],

[num_of_bytes_written],

[file_handle]

INTO ##SQLskillsStats2

FROM sys.dm_io_virtual_file_stats(NULL, NULL);

WITH [DiffLatencies]

AS (SELECT

— Files that werent in the first snapshot

[ts2].[database_id],

[ts2].[file_id],

[ts2].[num_of_reads],

[ts2].[io_stall_read_ms],

[ts2].[num_of_writes],

[ts2].[io_stall_write_ms],

[ts2].[io_stall],

[ts2].[num_of_bytes_read],

[ts2].[num_of_bytes_written]

FROM [##SQLskillsStats2] AS [ts2]

LEFT OUTER JOIN [##SQLskillsStats1] AS [ts1]

ON [ts2].[file_handle] = [ts1].[file_handle]

WHERE [ts1].[file_handle] IS NULL

UNION

SELECT

— Diff of latencies in both snapshots

[ts2].[database_id],

[ts2].[file_id],

[ts2].[num_of_reads] – [ts1].[num_of_reads] AS [num_of_reads],

[ts2].[io_stall_read_ms] – [ts1].[io_stall_read_ms] AS [io_stall_read_ms],

[ts2].[num_of_writes] – [ts1].[num_of_writes] AS [num_of_writes],

[ts2].[io_stall_write_ms] – [ts1].[io_stall_write_ms] AS [io_stall_write_ms],

[ts2].[io_stall] – [ts1].[io_stall] AS [io_stall],

[ts2].[num_of_bytes_read] – [ts1].[num_of_bytes_read] AS [num_of_bytes_read],

[ts2].[num_of_bytes_written] – [ts1].[num_of_bytes_written] AS [num_of_bytes_written]

FROM [##SQLskillsStats2] AS [ts2]

LEFT OUTER JOIN [##SQLskillsStats1] AS [ts1]

ON [ts2].[file_handle] = [ts1].[file_handle]

WHERE [ts1].[file_handle] IS NOT NULL)

INSERT INTO DBA.perf.storage

(

Dt,

DB,

Drive,

Type_Desc,

Reads,

Writes,

ReadLatency_ms,

WriteLatency_ms,

AvgBPerRead,

AvgBPerWrite,

Physical_Name,

num_of_mb_written,

num_of_mb_read,

mb_per_sec_Written,

mb_per_Sec_Read

)

SELECT GETDATE() AS Dt,

DB_NAME([vfs].[database_id]) AS [DB],

LEFT([mf].[physical_name], 2) AS [Drive],

[mf].[type_desc],

[num_of_reads] AS [Reads],

[num_of_writes] AS [Writes],

[ReadLatency(ms)] = CASE

WHEN [num_of_reads] = 0 THEN

ELSE

([io_stall_read_ms] / [num_of_reads])

END,

[WriteLatency(ms)] = CASE

WHEN [num_of_writes] = 0 THEN

ELSE

([io_stall_write_ms] / [num_of_writes])

END,

— [Latency] =

— CASE WHEN ([num_of_reads] = 0 AND [num_of_writes] = 0)

— THEN 0 ELSE ([io_stall] / ([num_of_reads] + [num_of_writes])) END,

[AvgBPerRead] = CASE

WHEN [num_of_reads] = 0 THEN

ELSE

([num_of_bytes_read] / [num_of_reads])

END,

[AvgBPerWrite] = CASE

WHEN [num_of_writes] = 0 THEN

ELSE

([num_of_bytes_written] / [num_of_writes])

END,

— [AvgBPerTransfer] =

— CASE WHEN ([num_of_reads] = 0 AND [num_of_writes] = 0)

— THEN 0 ELSE

— (([num_of_bytes_read] + [num_of_bytes_written]) /

— ([num_of_reads] + [num_of_writes])) END,

[mf].[physical_name],

(num_of_bytes_written / 1048576) AS num_of_mb_written,

(num_of_bytes_read / 1048576) AS num_of_mb_read,

(num_of_bytes_written / 1048576 / 120) AS mb_per_sec_Written,

(num_of_bytes_read / 1048576 / 120) AS mb_per_sec_read

FROM [DiffLatencies] AS [vfs]

JOIN sys.master_files AS [mf]

ON [vfs].[database_id] = [mf].[database_id]

AND [vfs].[file_id] = [mf].[file_id]

— ORDER BY [ReadLatency(ms)] DESC

ORDER BY [WriteLatency(ms)] DESC;

— Cleanup

IF EXISTS

(

SELECT *

FROM [tempdb].[sys].[objects]

WHERE [name] = N‘##SQLskillsStats1’

)

DROP TABLE [##SQLskillsStats1];

IF EXISTS

(

SELECT *

FROM [tempdb].[sys].[objects]

WHERE [name] = N‘##SQLskillsStats2’

)

DROP TABLE [##SQLskillsStats2];

It’s important to note here the time interval of 2 minutes. If you are less worried about bursty throughput 5 minutes is a good interval, but you must be aware it can aggregate throughput over 2 minutes to give the appearance of lower throughput. See the example below:

Max of 1150mbps read at the time shown: