Monday, April 22, 2013

My SSIS performance is slow! What can I do apart from sneaking into TSQL behind each package!

Hi Guys, it’s been while I posted something interesting but lately I experienced the scenario dealing with SSIS performance tuning and I thought it’s a good chance to share some knowledge with you!
I am sure, as a DBA's we come across scenarios where we get complaints like SSIS ETL data loading exceeds allowable time threshold etc etc
I am also sure, we try to leverage things like sneaking into queries running behind each step or turn on performance counters on source and destination sql server
But there is something else, we can look into and that is SSIS buffer usage!!
I would like to split this article into two major parts
1. Transformation
I would prefer not to just say to my client that SSIS design is not efficient. Rather I would say how efficient transformation behaviour for each tree in the SSIS package is :)!

Let’s take one step further, where we would like to collect how SSIS plays with its buffer when executing the threads in each tree. I am highlighting each sentence as a bullet point for better understanding!
·        We all are familiar with Buffer, but here I am focusing more on SSIS buffer instead of SQL Server DB engine buffer.
·        SSIS takes advantage of their in- memory structure (Buffer) in a same way as SQL engine does. Basically to avoid loading data all the time from the source, It controls the data using buffer as part of the transformation from source and destination
·        Behind SSIS, we have logic of execution trees and each tree has its own in memory structure a.k.a buffer to play with.
·        There can be multiple threads running for each tree.
·        Goal of design should be to avoid overhead of data into new buffer every time!

2. Buffer Sizing
To utilise buffer to its maximum potential, SSIS makes certain decisions, which are based on 4 input parameters:
Estimated Row Size: SSIS computes this value based on metadata
DefaultMaxBufferRows: setting which can be change (10,000 records)
DefaultMaxBufferSize: setting which can be change (10 MB)
MinBufferSize: setting which can be change (65536 bytes)

If Estimated Row Size * DefaultMaxBufferRows > MaxBufferSize
{
     SSIS reduces total number of rows which can be store in buffer
}
Else
{
SSIS increases the total number of rows which can be store in buffer
}
So next time, If you turn on package logging with the BufferSizeTuning property enabled, you can get better picture of where and how buffer size changes and you can adjust the values of DefaultMaxBufferRows and DefaultMaxBufferSize to optimize the number of rows in buffer