Sunday, October 17, 2010

Which counters I should use to troubleshoot my sql server hardware resources?

Performance monitor is one of the best tools you can use to look how your sql server is coping up with the resources.
If you type and run perfmon.exe from your windows OS, you have already taken a first-rate step to dive into your monitoring process.
Question normally comes to our mind is what counters should we use to monitor usage of resources for such a powerful application like SQL Server?
Whether you are a DBA or operation system engineer who is helping DBA, it is always handy to keep a user defined data collector set to monitor your sql server resources.
You can use this set to create your benchmark or you can also use this to troubleshoot performance from time to time.
I normally keep this collector set handy, so that whenever I visit the client site and if required I can just upload these counters and start collecting the data. I keep the collection interval to the default of 15 seconds.
I am allotting the performance counters as for each hardware resources here. When we think from hardware perspective, we think of Memory, CPU and Storage (IO). I do not want to go to Networking here, Each of these counters can give you quick insight of any potential bottlenecks.
MEMORY
Object
Counter
Threshold
Memory
Available Bytes
Set according to your system
Memory
Pages/sec
< 50
SQL Server: Buffer Manager
Checkpoint Pages/sec
< 30 pages per second
SQL Server: Buffer Manager
Buffer cache hit ratio
>= 90%(normally it should be 99-100% for healthy system)
SQL Server: Buffer Manager
Page Life Expectancy
>300 sec
SQL Server: Buffer Manager
Lazy Writes/sec
< 20



CPU
Object
Counter
Threshold
Processor(_Total)
% Processor time

< 80%
Processor(_Total)
% Privileged Time

< 10%
System
Context Switches/sec

<1000
SQL Server:Statistics
Batch Requests/sec

Set according to your system and compare with the benchmark value
SQL Server: Statistics
SQL Compilation/sec
>100


STORAGE
Object
Counter
Threshold
Physical Disk
Avg. Disk Sec/Read

<= 10 ms is good
>10 ms and <=15 ms is tolerable
>20 ms is very bad
Physical Disk
Avg. Disk Sec/Write

<= 10 ms is good
>10 ms and <=20 ms is tolerable
>20 ms is very bad
Physical Disk
% Disk Time

<85%


enjoy! :) :)

No comments:

Post a Comment