Tuesday, October 5, 2010

First Three things to do when MS SQL Server performance hits badly

I have experienced so many times at the client sites when manager comes to me and say that their xyz system is really slow right now
or xyz system is running like a dog.Can you please look at it!!
The first thing i do after hearing this is give response like "Ohh! let me have a look" or "Ouch". Now whats next?
In this article I would like to demonstrate what should be next after somebody complains and leave.
SQL Server 2005 and later is blessed with so many tools and built in system objects which can take you in the right direction.
one such feature is "WAIT STATISTICS". There is very useful DMV which you can use to know where your sql waits are and that is

"sys.dm_os_wait_stats "


Make sure you calculate the delta between two snapshot from sys.dm_os_wait_stats  taken at the interval of 5 seconds.
Check the top wait types and take that as a direction for your self to move to the next right path. Atleast you will come to know whether processes are waiting long on IO, CPU, memory or internal latches etc etc.

for example if you see notice that PAGEIOLATCH_xx waits are high then you can be sure that there is lot of physical IO is going on then what your storage system can digest.

simmilarly CXPACKET wait types can take you the conclusion that there are lot of parallel processes are running which could be the reason for inefficiency.

High percentage of signal wait indicates that processors are not keeping up.

Overall the analysis of result from above query will take you to the next right path, which I am going to explain now.

FILE STATISTICS
------------------------------
This is the second thing I look at. This statistics come from the DMV "sys.dm_io_virtual_file_stats".
This DMv shows you the actual read and write perform to the disk for a snapshot of time.
Here also it is important to take a two snapshots and calculate the Rate to get the actual idea about how much IO is perform on a particular drive or database.
I would leave this exercise for you :).Please contact me if you need assistance in that.
What you should be interested in knowing is IO_STALL(in milliseconds) per IO Operation.Anything more than 15 is a concern for disk subsystem.
Once I know that IO is an issue, I can drill down further to look into which queries are expensive in terms of IO.

TOP Queries in terms of CPU, Duration and IO
----------------------------------
SQL Server gives you handy real time view of top expensive queries in terms of IO and CPU.You can either use reports via SSMS or leverage
sys.dm_exec_sql_text(sql_handle) and sys.dm_exec_query_stats to find out the expensive queries.

Following above path can guide you through the rest of journey of troubleshooting.From here you can either start collecting few performance counters or even can run a trace to dig deeper.

No comments:

Post a Comment