1. Non-Blocking Transformations, Streaming and Row-Based
Most of the SSIS Transformations are non-blocking.
This means that the transformation logic that is applied in the transformation
does not impede(block) the data from moving on to the next transformation
after the transformation logic is applied to the row.
The difference is whether the SSIS Transformation can use internal information and processes to handle its work(Streaming)
or whether the transformation has to call an external process to retrieve information it needs for the work(Row-Based).
2. Semi-Blocking Transformations
Hold up records in the Data Flow for a period of time before allowing the memory buffers to be passed downstream.
3. Blocking Transformations
For one reason or another, these components require a complete review of the upstream data
before releasing any row downstream to the connected transformations and destinations.
另一种分法,分同步和异步
Synchronous and Asynchronous Transformation Outputs
A transformation output is asynchronous
if the buffers used in the input are different from the buffers used in the output.
(The LineageIDs are different)
all of the semi-blocking and blocking transformations have asynchronous outputs by definition
— none of them can pass input buffers on downstream because the data is held up for processing and reorganized.
With the exception of the Union All, all of the non-blocking transformations have synchronous outputs.
在设计时候的建议
Data Flow Design Practices
1. Limit synchronicity (要异步,并发,利用DataFlow)
2. Reduce staging and disk IO
3. Reduce the reliance on an RDBM(SSSIS said it would have better performance in Data Flow than in RDBMS.)
一些优化的建议
Optimizing Buffers, Execution Trees, and Engine Threads
1. Remove unused columns
2. Add the available use of more execution threads for the Data Flow(设置参数,增加线程数)
3. Avoid Row-Based Transformations
4. When not enough RAM is available in the system, the blocking transformations may also require temporary disk storage.(注意内存)
监控
Pipeline Performance Monitoring
Use the Windows operating system tool called Performance Monitor (PerfMon for short)
Choose the SQLServer:SSIS Pipeline object
The Buffer Memory counter shows the total memory being used by SSIS
and can be compared with the amount of available system memory
to know if SSIS processing is bottlenecked by the available physical memory.
OLE DB Destination的说明
Rows Per Batch & Maximum Insert Commit Size
A few options of note here are the Rows Per Batch option,
which specifies how many rows are in each batch sent to the destination,
and another option is the Maximum Insert Commit Size,
which specifies how large the batch size will be prior to issuing a commit statement.
FAST LOAD
A common question is what is the difference between fast load and the normal load (table or view option) for the OLE DB Destination.
The Fast Load option specifies that SSIS will load data in bulk into the OLE DB Destination's target table.
Because this is a bulk operation, error handling via a redirection or ignoring of the data errors is not allowed.