To train data mining models using this destination, you need a connection to SQL Server Analysis Services, where the mining structure and the mining models reside. For this, you can use an Analysis Services Connection Manager to connect to an instance of Analysis Services or to an Analysis Services project. The Data Mining Model Training Editor has two tabs, Connection and Columns, in which you can configure the required properties. In the Connection tab, you specify the connection manager for Analysis Services in the Connection Manager field and then specify the mining structure that contains the mining models you want this data to train. Once you select a mining structure in the Mining Structure field, the list of mining models is displayed in the Mining Models area, and this destination adapter will train all the models contained within the specified mining structure. In the Columns tab, you can map Available Input Columns to the Mining Structure Columns. The processing of the mining models requires the data to be sorted, which you can achieve by adding a Sort transformation before the Data Mining Model Training destination.
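If your training data comes from a relational source, you can alternatively presort it in the source query instead of adding a Sort transformation. The following is a minimal sketch only; the view and column names are illustrative and assume the data is keyed on a single case-level column:

-- Hypothetical source query that returns the training cases already sorted
-- on the case key, so a separate Sort transformation is not required.
SELECT CustomerKey, Age, YearlyIncome, CommuteDistance, BikeBuyer
FROM dbo.vTargetMail
ORDER BY CustomerKey;

If you presort at the source like this, remember to mark the source output as sorted (the IsSorted and SortKeyPosition properties in the Advanced Editor) so that the pipeline knows about the sort order.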
DataReader Destination
When your ADO.NET–compliant application needs to access data from the data flow
of an SSIS package, you can use the DataReader destination. Integration Services can provide data straight from the pipeline to your ADO.NET application in cases where processing needs to happen dynamically when users request data through the ADO.NET DataReader interface; the SSIS data processing extension facilitates provision of the data via the DataReader destination. An excellent use of the DataReader destination is as a data source for an SSRS report.
The DataReader destination doesn't have a custom UI but uses the Advanced Editor to expose all its properties, organized in three tabs. You can specify the Name, Description, LocaleID, and ValidateExternalMetadata properties in the Common Properties section of the Component Properties tab. In the Custom Properties section, you can specify a ReadTimeout value in milliseconds and, if this value is exceeded, you can choose to fail the component via the FailOnTimeout field.
In the Input Columns tab, you can select the columns you want to output, assign each of them an output alias, and specify a usage type of READONLY or READWRITE from the drop-down list box. Finally, the Input And Output Properties tab lists only the input column details, as the DataReader destination has only one input and no error output.
Dimension Processing Destination
One of the frequent uses of Integration Services is to load data warehouse dimensions using the dimension processing destination. This destination can be used to load and process an SQL Server Analysis Services dimension. Being a destination, it has one input and no output, and it does not support an error output.
The dimension processing destination has a custom user interface, but the Advanced Editor can also be used to modify properties that are not available in the custom editor. In the Dimension Processing Destination Editor, the properties are grouped logically in three different pages. In the Connection Manager page, you can specify the connection manager for Analysis Services to connect to the Analysis Services server or an Analysis Services project. Using this connection manager, the Dimension Processing Destination Editor accesses all the dimensions in the source and displays them as a list for you to select the one you want to process. Next, you can choose the processing method from the Add (incremental), Full, or Update options. In the Mappings page, you can map Available Input Columns to Available Destination Columns using a drag-and-drop operation.
The Advanced page allows you to configure error handling in the dimension processing destination. You can choose from several options to configure the way you want the errors to be handled:

- By default, this destination will use the default Analysis Services error handling, which you can change by unchecking the Use Default Error Configuration check box.
- When the dimension processing destination processes a dimension to populate values from the underlying columns, an unacceptable key value may be encountered. In such cases, you can use the Key Error Action field to specify that the record be discarded by selecting the DiscardRecord value, or you can convert the unacceptable key value to the UnknownMember value. UnknownMember is a property of the Analysis Services dimension indicating that the supporting column doesn't have a value.
- Next, you can specify the processing error limits and can choose either to ignore errors or to stop on error. If you select the Stop On Error option, you can then specify the error threshold using the Number Of Errors option. You can also specify the on-error action, either to stop processing or to stop logging when the error threshold is reached, by selecting the StopProcessing or StopLogging value.
- You can also specify actions for specific error conditions such as these:
  - When the destination raises a Key Not Found error, you can select IgnoreError or ReportAndStop, whereas by default it is ReportAndContinue.
  - Similarly, you can configure the Duplicate Key error, for which the default action is IgnoreError. You can set it to ReportAndStop or ReportAndContinue if you wish.
  - When a null key is converted to the UnknownMember value, you can choose ReportAndStop or ReportAndContinue. By default, the destination will IgnoreError.
  - When a null key value is not allowed in the data, this destination will ReportAndContinue by default. However, you can set it to IgnoreError or ReportAndStop.
- You can specify a path for the error log using the Browse button.
Excel Destination
Using the Excel destination, you can output data straight to an Excel workbook, worksheet, or range. You use an Excel Connection Manager to connect to an Excel workbook. Like the Excel source, the Excel destination treats the worksheets and ranges in an Excel workbook as tables or views. The Excel destination has one regular input and one error output.
This destination has its own custom user interface that you can use to configure its properties; the Advanced Editor can also be used to modify the remaining properties. The Excel Destination Editor lists its properties in three different pages.
In the Connection Manager page, you can select the name of the connection manager from the drop-down list in the OLE DB Connection Manager field. Then you can choose one of these three data access mode options:
- Table or view: Lets the Excel destination load data into an Excel worksheet or named range; specify the name of the worksheet or the range in the Name Of The Excel Sheet field.
- Table name or view name variable: Indicates that the name of the table or view is contained within a variable that you specify in the Variable Name field.
- SQL command: Allows you to load the results of an SQL statement into an Excel file.
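As a rough illustration of the kind of statement used with the SQL command access mode, the sketch below selects from a worksheet using Jet/ACE-style SQL syntax; the worksheet name (SalesData) and the column names are assumptions, not part of the original text:

-- Illustrative only: [SalesData$] refers to a worksheet named SalesData
-- in the workbook that the Excel Connection Manager points to.
SELECT OrderID, OrderDate, SalesAmount
FROM [SalesData$]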
In the Mappings page, you can map Available Input Columns to the Available Destination Columns using a drag-and-drop operation. In the Error Output page, you can configure the behavior of the Excel destination for errors and truncations. You can ignore the failure, redirect the data, or fail the component for each of the columns in case of an error or a truncation.
Flat File Destination
Every now and then you may need to output some data from disparate sources to a text file, as this is often the most convenient way to share data with external systems. You can build an Integration Services package to connect to those disparate sources, extract data using customized extraction rules, and output the required data set to a text file using the flat file destination adapter. This destination requires a Flat File Connection Manager to connect to a text file. When you configure a Flat File Connection Manager, you also configure various properties to specify the type of the file and how the data will reside in the file. For example, you can choose the format of the file to be delimited, fixed width, or ragged right (also called mixed format). You also specify how the columns and rows will be delimited and the data type of each column. In this way, the Flat File Connection Manager provides a basic structure to the file, which the destination adapter uses as is. This destination has one input and no error output.
The Flat File destination has a simple customized user interface, though you can also use the Advanced Editor to configure some of the properties. In the Flat File Destination Editor, you can specify the connection manager you want to use for this destination in the Flat File Connection Manager field and select the check box for "Overwrite data in the file" if you want to overwrite the existing data in the flat file. Next, you are given an opportunity to provide a block of text in the Header field, which can be added before the data as a header to the file. In the Mappings page, you can map Available Input Columns to the Available Destination Columns.
OLE DB Destination
You can use the OLE DB destination when you want to load your transformed data
to OLE DB–compliant databases, such as Microsoft SQL Server, Oracle, or Sybase
database servers. This destination adapter requires an OLE DB Connection Manager with an appropriate OLE DB provider to connect to the data destination. The OLE DB destination has one regular input and one error output.
This destination adapter has a custom user interface that can be used to configure most of the properties; alternatively, you can also use the Advanced Editor. In the OLE DB Destination Editor, you can specify an OLE DB connection manager in the Connection Manager page. If you haven't configured an OLE DB Connection Manager in the package yet, you can create a new connection by clicking New. Once you've specified the OLE DB Connection Manager, you can select the data access mode from the drop-down list. Depending on the option you choose, the editor interface changes to collect the relevant information. Here you have five options to choose from:
- Table or view: You can load data into a table or view in the database specified by the OLE DB Connection Manager. Select the table or the view from the drop-down list in the Name Of The Table Or The View field. If you don't already have a table in the database where you want to load data, you can create a new table by clicking New. An SQL statement for creating a table is generated for you when you click New; its columns use the same data type and length as the input columns, which you can change if you want. However, if you provide the wrong data type or a shorter column length, you will not be warned and may get errors at run time. If you are happy with the CREATE TABLE statement, all you need to do is provide a table name, replacing the [OLE DB Destination] string after CREATE TABLE in the SQL statement (see the example after this list).
- Table or view - fast load: The data is loaded into a table or view as in the preceding option; however, you can configure additional options when you select the fast load data access mode. The additional fast load options are:
  - Keep identity: During loading, the OLE DB destination needs to know whether it has to keep the identity values coming in the data or it has to assign unique values itself to the columns configured as identity columns.
  - Keep nulls: Tells the OLE DB destination to keep the null values in the data.
  - Table lock: Acquires a table lock during the bulk load operation to speed up the loading process. This option is selected by default.
  - Check constraints: Checks the constraints at the destination table during the data loading operation. This option is selected by default.
  - Rows per batch: Specifies the number of rows in a batch in this box. The loading operation handles the incoming rows in batches, and the setting in this box will affect the buffer size, so you should test out a suitable value for this field based on the memory available to this process at run time on your server.
  - Maximum insert commit size: Specify a value in this box to indicate the maximum number of rows that the OLE DB destination commits at a time during loading. The default value of 2147483647 indicates that all the rows are considered a single batch and are handled together, i.e., they will commit or fail as a single batch. Use this setting carefully, taking into consideration how busy your system is and how many rows you want to handle in a single batch. A smaller value means more commits and hence the overall loading will take more time; however, if the server is a transactional server hosting other applications, this might be a good way to share resources on the server. If the server is a dedicated reporting or data mart server, or you're loading at a time when other activities on the server are less active, then using a higher value in this box will reduce the overall loading time.

Make sure you use the fast load data access mode when loading double-byte character set (DBCS) data; otherwise, you may get corrupted data loaded into your table or view. DBCS is a character set in which each character is represented by two bytes. Environments using ideographic writing systems such as Japanese, Korean, and Chinese use DBCS, as these languages contain more characters than can be represented by 256 code points. These double-byte characters are commonly called Unicode characters. Examples of data types that support Unicode data in SQL Server are nchar, nvarchar, and ntext, whereas Integration Services has the DT_WSTR and DT_NTEXT data types to support Unicode character strings.

- Table name or view name variable: Works like the table or view access mode except that in this access mode you supply the name of a variable in the Variable Name field that contains the name of the table or the view.
- Table name or view name variable - fast load: Works like the table or view - fast load access mode except here you supply the name of a variable in the Variable Name field that contains the name of the table or the view. You still specify the fast load options in this data access mode.
- SQL command: Loads the result set of an SQL statement using this option. You can provide the SQL query in the SQL Command Text dialog box or build a query by clicking Build Query.
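Here is a sketch of what the statement generated by clicking New might look like once you have replaced the default [OLE DB Destination] name; the table and column definitions are illustrative, not taken from the text:

-- Generated CREATE TABLE statement, edited to give the table a real name.
-- Review each data type and length; the editor will not warn you if they
-- are too narrow for the incoming columns.
CREATE TABLE [dbo].[DailySalesStaging] (
    [OrderID] INT,
    [OrderDate] DATETIME,
    [SalesAmount] NUMERIC(18, 4),
    [CustomerName] NVARCHAR(100)
)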
In the Mappings page, you can map Available Input Columns to the Available
Destination Columns using a drag-and-drop operation, and in the Error Output page,
you can specify the behavior when an error occurs.
Partition Processing Destination
The partition processing destination is used to load and process an SQL Server Analysis Services partition and works like the dimension processing destination. This destination has a custom user interface that is similar to the one for the dimension processing destination. This destination adapter requires the Analysis Services Connection Manager to connect to the cubes and their partitions that reside in an Analysis Services server or an Analysis Services project.
The Partition Processing Destination Editor has three pages to configure properties. In the Connection Manager page, you can specify an Analysis Services Connection Manager and can choose from the three processing methods: Add (incremental) for incremental processing; Full, which is the default option and performs full processing of the partition; and Data Only, which performs update processing of the partition. In the Mappings page, you can map Available Input Columns to the Available Destination Columns using a drag-and-drop operation. In the Advanced page, you can configure error-handling options for when various types of errors occur. The error-handling options are similar to those available on the Advanced page of the dimension processing destination.
Raw File Destination
Sometimes you may need to stage data between processes, for which you will want to extract data at the fastest possible speed. For example, if you have multiple packages that work on a data set one after another, where a package needs to export the data at the end of its operation for the next package to continue its work on the data, a raw file destination and raw file source combination can be an excellent choice. The raw file destination writes raw data to the destination raw file in an SSIS native format that doesn't require translation. This raw data can be imported back into the system using the raw file source discussed earlier. Using the raw file destination to export data and the raw file source to import it back into the system results in high performance for the staging or export/import operation. However, if you have binary large object (BLOB) data that needs to be handled in such a fashion, the raw file destination cannot help you, as it doesn't support BLOB objects.
The Raw File Destination Editor has two pages to expose the configurable properties. The Connection Managers page allows you to select an access mode, either File name or File name from variable, to specify how the filename information is provided. You can either specify the filename and path in the File Name field directly or use a variable to pass these details. Note that the raw file destination doesn't use a connection manager to connect to the raw file, and hence you don't specify a connection manager in this page; it connects to the raw file directly using the specified filename or by reading the filename from a variable.
Next, you can choose from the following four options to write data to a file in the Write Option field:

- Append: Lets you use an existing file and append data to the already existing data. This option requires that the metadata of the appended data match the metadata of the existing data in the file.
- Create Always: This is the default option and always creates a new file using the filename details provided either directly in the File Name field or indirectly in a variable specified in the Variable Name field.
- Create Once: In situations where you are using the data flow inside repeating logic, i.e., inside a loop container, you may want to create a new file in the first iteration of the loop and then append data to the file in the second and later iterations. You can achieve this by using this option.
- Truncate And Append: If you have an existing raw file that you want to write the data into, but want to delete the existing data before the new data is written, you can use this option to truncate the existing file first and then append the data to it.
In all these options, wherever you use an existing file, the metadata of the data being loaded to the destination must match the metadata of the specified file. In the Columns tab, you can select the columns you want to write into the raw file and assign each of them an output alias as well.
Recordset Destination
Sometimes you may need to take a record set from the data flow and pass it to other elements in the package. Of course, in this instance you do not want to write to external storage and then read from it unnecessarily. You can achieve this by using a variable and the recordset destination, which populates an in-memory ADO record set into the variable at run time.
This destination adapter doesn't have its own custom user interface but uses the Advanced Editor to expose its properties. When you double-click this destination, the Advanced Editor for Recordset Destination opens and displays properties organized in three tabs. In the Component Properties tab, you can specify the name of the variable that will hold the record set in the Variable Name field. In the Input Columns tab, you can select the columns you want to extract to the variable and assign an alias to each of the selected columns, along with specifying whether each is a read-only or a read-write column. As this destination has only one input and no error output, the Input And Output Properties tab lists only the input columns.
Script Component Destination
You can use the script component as a data flow destination when you choose Destination in the Select Script Component Type dialog box. When deployed as a destination, this component supports only one input and no output because, as you know, data flow destinations don't have outputs. The script component as a destination is covered in Chapter 11.
SQL Server Compact Destination
Integration Services stretches out to give you an SQL Server Compact destination, enabling your packages to write data straight to an SQL Server Compact database table. This destination uses the SQL Server Compact Connection Manager to connect to an SQL Server Compact database. The SQL Server Compact Connection Manager lets your package connect to a compact database file, and then you can specify the table you want to update in the SQL Server Compact destination.
You need to create an SQL Server Compact Connection Manager before you can configure an SQL Server Compact destination. This destination does not have a custom user interface and hence uses the Advanced Editor to expose its properties. When you double-click this destination, the Advanced Editor for SQL Server Compact Destination opens with four tabs. Choose the connection manager for a Compact database in the Connection Manager tab. Specify the table name you want to update in the Table Name field under the Custom Properties section of the Component Properties tab.
In the Column Mappings tab, you can map Available Input Columns to the Available Destination Columns using a drag-and-drop operation. The Input And Output Properties tab shows you the External Columns and Input Columns in the Input collection and the Output Columns in the Error Output collection. The SQL Server Compact destination has one input and supports an error output.
SQL Server Destination
We have looked at two different ways to import data into SQL Server: using the Bulk Insert Task in Chapter 5 and the OLE DB destination earlier in this chapter. Though both are capable of importing data into SQL Server, they suffer from some limitations. The Bulk Insert task is a faster way to import data but is a part of the control flow, not the data flow, and doesn't let you transform data before import. The OLE DB destination is part of the data flow and lets you transform the data before import; however, it isn't the fastest method to import data into SQL Server. The SQL Server destination combines the benefits of both components: it lets you transform the data before import and uses the speed of the Bulk Insert task to import data into local SQL Server tables and views. The SQL Server destination can write data into a local SQL Server only. So, if you want to import data faster into an SQL Server table or view on the same server where the package is running, use an SQL Server destination rather than an OLE DB destination. Being a destination adapter, it has one input only and does not support an error output.
The SQL Server destination has a custom user interface, though you can also use the Advanced Editor to configure its properties. In the Connection Manager page of the SQL Destination Editor, you can specify a connection manager, a data source, or a data source view in the Connection Manager field to connect to an SQL Server database. Then select a table or view from the drop-down list in the Use A Table Or View field. You also have the option to create a new connection manager or a new table or view by clicking the New buttons provided. In the Mappings page, you can map Available Input Columns to the Available Destination Columns using a drag-and-drop operation.
You specify the Bulk Insert options in the Advanced page of the SQL Destination Editor dialog box. You can configure the following ten options in this page (a rough T-SQL equivalent is sketched after the list):
- Keep identity: This option is not checked by default. Check this box to keep the identity values coming in the data rather than using the unique values assigned by SQL Server.
- Keep nulls: This option is not checked by default. Check this box to retain the null values.
- Table lock: This option is checked by default. Uncheck this option if you don't want to lock the table during loading. This option may affect the availability of the tables being loaded to other applications or users. If you want to allow concurrent use of the SQL Server tables that are being loaded by this destination, uncheck this box; however, if you are running this package at a quiet time, when no other applications or users are accessing the tables being loaded, or you do not want to allow concurrent use of those tables, it is better to leave the default setting.
- Check constraints: This option is checked by default, which means any constraint on the table being loaded will be checked during loading. If you're confident the data being loaded does not break any constraints and want a faster import, you may uncheck this box to save the processing overhead of checking constraints.
- Fire triggers: This option is not checked by default. Check this box to let the bulk insert operation execute insert triggers on the target tables during loading. Selecting to execute insert triggers on the destination table may affect the performance of the loading operation.
- First row: Specify a value for the first row from which the bulk insert will start.
- Last row: Specify a value in this field for the last row to insert.
- Maximum number of errors: Provide a value for the maximum number of rows that cannot be imported due to errors in data before the bulk insert operation stops. Leave the First Row, Last Row, and Maximum Number Of Errors fields blank to indicate that you do not want to specify any limits; however, if you're using the Advanced Editor, use a value of -1 to indicate the same.
- Timeout: Specify the number of seconds in this field before the bulk insert operation times out.
- Order columns: Specify a comma-delimited list of columns in this field to sort the data on, in ascending or descending order.
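Most of these options correspond closely to the WITH clause arguments of the T-SQL BULK INSERT statement, which may help clarify what each one does. The statement below is only a sketch; the file path, table, and column names are made up, and the Timeout option has no direct counterpart here:

-- Illustrative T-SQL rough equivalent of the SQL Server destination options.
BULK INSERT dbo.SalesStaging
FROM 'C:\Data\sales.txt'
WITH (
    KEEPIDENTITY,           -- Keep identity
    KEEPNULLS,              -- Keep nulls
    TABLOCK,                -- Table lock
    CHECK_CONSTRAINTS,      -- Check constraints
    FIRE_TRIGGERS,          -- Fire triggers
    FIRSTROW = 2,           -- First row
    LASTROW = 100000,       -- Last row
    MAXERRORS = 10,         -- Maximum number of errors
    ORDER (OrderID ASC)     -- Order columns
);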
Data Flow Paths
First, think of how you connect tasks in the control flow. You click the first task in the control flow to highlight the task and display a green arrow, representing output from the task. Then you drag the green arrow onto the next task in the work flow to create a connection between the tasks, represented by the green line by default. The green line, called a precedence constraint, enables you to define conditions under which the following tasks can be executed. In the data flow, you connect the components in the same way you