Document information

Title: Bust a Move with Your SSIS – Passing Package Variables
Author: Bill Kenworthy
Institution: Global Knowledge Training LLC
Subject: Information technology
Type: White paper
Year of publication: 2007
Pages: 15
File size: 546 KB


Bust a Move with Your SSIS – Passing Package Variables

Expert Reference Series of White Papers

Integration Services (SSIS), the next generation of the Extract, Transform, and Load (ETL) feature included with Microsoft SQL Server 2005, has befuddled database administrators. This is due to the new programming paradigm and the complexity of the development environment. SSIS includes many new capabilities bundled with a Visual Studio front end. This paper explores the creation of sample development data using the most basic features of this new interface. The challenge of declaring and passing package variables contains enough material for an hour’s skill-building session with this new tool.

Extract, Transform, and Load (ETL)

The challenge of the changes to the ETL feature is comparable to a rock band taking over a square dance. It’s time to let go, learn to dance to a new beat, and bust some moves with this great new tool, SSIS. The complexities of this new tool have caused many administrators to retreat to the familiar tempo of Data Transformation Services (DTS). However, as we explore this tool, you will find the creativity available in this new environment compelling enough to bust some moves of your own.

The goal of this paper is to demonstrate the generation of a table of random data to be used in a development environment, exposing the nuances of passing variable values into the various task objects that make up the package.

One of the challenges of application development in a database is having data available for testing and reporting. Adding to this challenge are the typical privacy requirements that prevent the use of actual patient personal identification. In my experience, it is always better to develop with test data that is as reflective of the production data as possible, especially when interviewing end users using a prototype of the system under development. I will walk through the steps needed to create an SSIS package that builds a table of test data.

The end result of the execution of the package, the Person table, presents just a couple of challenges. The package will be configured with package-level variables that will be passed from one task to another. This poses the challenges of declaring the variables and referencing them throughout the package. Our goal is to generate random patient name and SSN information to be inserted into the Person table, whose schema is shown in Step 1. The purpose of this table is to replace a table containing an extract of live production data to facilitate application testing. Overwriting the contents of a table with randomly generated data will obscure the private details in a database, which are protected by legal constraints, while allowing proper acceptance testing and comparison to an existing system prior to cutover to the new system.

Bill Kenworthy, Global Knowledge Instructor, MCDBA


The project at hand starts with the generation of a simple database containing five tables and two stored procedures. An SSIS solution is created to use this database and populate the Person table with random data in three columns: FirstName, LastName, and SSN.

The name data is generated from the contents of two driving tables, FName and LName. These tables contain the seed data for name generation. The result of the execution of the name generation is a table containing an Identity column and columns for First and Last Name. The SSN generation fills a separate table and requires no seed data.

The name generation takes place inside a For Loop container that uses a package variable to control the number of rows generated. Once the loop completes, execution is passed to a SQL Task that runs the SSN generation stored procedure. After completion of this task, a Merge Join is the last major data manipulation activity. The Merge Join, contained in a Data Flow Task, combines the two staging tables into the final product. The moves that make this package possible are creating package variables, passing the variable values to the task objects, and careful attention to matching column data types. A data dictionary for the development database is contained in Resource A.

The Project

1. Generate tables and stored procedures for the project
2. Create an SSIS project and add the needed variables
3. Add a SQL Task that truncates the working tables
4. Add a For Loop and configure the looping parameters
5. Call a stored procedure in the loop, passing a parameter
6. Generate the SSN data using an Execute SQL Task
7. Add a Data Flow Task
8. Configure a data merge task

The tasks to be accomplished in this project

The following sections outline the individual steps of creating the package; the reader will find the configuration entries in the screen shots valuable in duplicating this demonstration. The database and its objects are the foundation upon which we build our transformation. The solution requires 2 seed tables, 2 staging tables used to hold temporary results, and a final table holding the merged contents of the two staging tables. The database diagram is shown in Figure 1.

Step 1 Generate tables and stored procedures for the project

The data dictionary for this database is contained in the Resources section of this document. The script for generating the schema and stored procedures is in Resource B. The code has hard-coded references to the DEV database; the script should be run in the context of a database with that name.

Figure 1 Database Schema for the project

Step 2 Open an Integration Services Project

Open your project, then right-click on any clear space on the Control Flow pane and choose Variables from the context menu to open the variable declaration dialog box. Add two Int32 variables, Counter and MaxRows, with values of 0 and 1000. The Counter variable is used to pass the current loop index into the SQL Task contained in the loop that will be added to the project in Step 4. MaxRows is the number of rows to be inserted into the Person table. Figure 2 shows the dialog box with the appropriate entries.

Note: references to variables in this environment are case sensitive.

Resource C contains a reference to a topic in SQL Server 2005 Books Online that describes variables and links to how-to topics.

Step 3 Add a SQL Task truncating working tables

Add a SQL Task to the Control Flow window as the first task in the project and set its parameters as shown in Figure 3. Note the configuration of the ConnectionType as ADO.NET. Although this SQL Task doesn’t pass parameters in the SQLStatement property, I like to keep the settings of similar objects consistent. Specifying ADO.NET as the connection type allows reference to parameters using the @ naming convention. The SQL query simply truncates the Person and Name tables. An appropriate reference to this object in Books Online is listed in Resource C.
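As a minimal sketch, the SQLStatement for this task might look like the following. The table names come from the text above; the dbo schema and the exact wording of the statement are assumptions, since the actual statement appears only in Figure 3.

```sql
-- Sketch of a SQLStatement for the truncate task (assumed wording).
-- Assumes the connection manager points at the DEV database and that
-- the Resource B script has already created these tables.

-- TRUNCATE TABLE removes every row and resets any IDENTITY seed,
-- so each package run starts from an empty table.
TRUNCATE TABLE [dbo].[Person];
TRUNCATE TABLE [dbo].[Name];
```

If the SSN staging table is also reused between runs, a third TRUNCATE TABLE statement would follow the same pattern; the text does not say whether the original package does this.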

Step 4 Add a For Loop and configure looping parameters

The For Loop container defines a repeating control flow in a package. In this package, the For Loop is used to repeat the execution of the MakeAName stored procedure until the number of rows configured in the MaxRows variable has been inserted into the Name table. The For Loop container uses three expressions to define the loop: InitExpression, EvalExpression, and AssignExpression, which hold the initialization, evaluation, and assignment (increment) control values. As you can see in Figure 4, the variable @Counter is used for indexing in the loop. This reference is case-sensitive and must match a package variable name, with the @ prefix necessary in this property page. For example, the variable @MaxRows matches the MaxRows package variable. An appropriate reference to this object in Books Online is listed in Resource C.
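The three loop expressions behave like the parts of a counting loop. The sketch below restates that logic as a Transact-SQL WHILE loop purely for illustration. The expression text quoted in the comments is an assumption about what Figure 4 contains, and the local variables (including the hypothetical @Iterations counter) only stand in for the package variables.

```sql
-- Local variables standing in for the package variables Counter and MaxRows.
DECLARE @Counter int, @MaxRows int, @Iterations int;
SET @MaxRows = 1000;
SET @Iterations = 0;

SET @Counter = 0;                    -- initialization, e.g. @Counter = 0
WHILE @Counter < @MaxRows            -- evaluation,     e.g. @Counter < @MaxRows
BEGIN
    -- In the package, the SQL Task inside the loop would run here.
    SET @Iterations = @Iterations + 1;

    SET @Counter = @Counter + 1;     -- assignment,     e.g. @Counter = @Counter + 1
END

-- The loop body ran once for each value of @Counter from 0 to 999.
SELECT @Iterations AS Iterations;    -- returns 1000
```

Because the evaluation is tested before each pass, a MaxRows value of 0 would skip the loop body entirely.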

Figure 2 Variables dialog box

Figure 3 SQL Task to truncate the working tables


Figure 4 Configuring the For Loop Task

Step 5 Call a stored procedure in the loop passing a parameter.

A SQL Task, configured with the connection type ADO.NET, calls the stored procedure MakeAName. Note my preference for the ConnectionType property. Each connection type supports a different syntax for passing parameters: ADO.NET supports the @ reference, while other connection types use a ? (question mark). I prefer the @ syntax because it is consistent with the syntax used in Transact-SQL. An appropriate reference to this object in Books Online is listed in Resource C.

Figure 5 Property page for the SQL Task embedded in the Loop Task


Figure 6 Parameter map entry passes variable value from loop.

The second part of configuring this SQL Task is the mapping required to tie the variable referenced in the SQL statement to the package variable value being passed into the SQL Task by its parent container. The first three columns are selected from combo box choices; the developer enters the Parameter Name value by hand.
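To make the mapping concrete, here is a small sketch of a named-parameter call in Transact-SQL. It does not reproduce MakeAName, which is in Resource B: the procedure name MakeANameDemo and the parameter name @Seed are hypothetical, chosen only to show how a value such as the loop's Counter reaches a procedure through a named @ parameter.

```sql
-- Hypothetical stand-in for MakeAName; the name MakeANameDemo and the
-- parameter name @Seed are illustrative only.
IF OBJECT_ID(N'dbo.MakeANameDemo', N'P') IS NOT NULL
    DROP PROCEDURE dbo.MakeANameDemo;
GO
CREATE PROCEDURE dbo.MakeANameDemo
    @Seed int
AS
BEGIN
    SET NOCOUNT ON;
    -- RAND(seed) is repeatable: the same seed always yields the same value,
    -- so a changing loop index gives a different value on each iteration.
    SELECT @Seed AS Seed, RAND(@Seed) AS RandomValue;
END
GO
-- Roughly what the SQL Task runs once the parameter mapping binds the
-- package variable Counter to the named parameter @Seed.
DECLARE @Counter int;
SET @Counter = 5;
EXEC dbo.MakeANameDemo @Seed = @Counter;
```

Under these assumptions, the mapping row would pair the package variable User::Counter (direction Input, data type Int32) with a hand-typed Parameter Name of @Seed, matching the name used in the statement.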

Figure 7 Control Flow diagram of the project to this point.

Test it!

Now the project is at a point where it can be tested. Your package should resemble the package shown in Figure 7. If your package errors out when you run the debugger, consult the Execution Results tab for error messages and resolve the errors.


Step 6 Generating the SSN data using an Execute SQL task.

Figure 8 Configuration of the SQL Task that follows execution of the loop

Step 7 Add a data flow task

Assemble the data flow objects as shown in Figure 9; the properties to be set are in the table in Resource D. Appropriate references to the objects used in this data flow, for lookup in Books Online, are listed in Resource C. The Merge Join properties are detailed in Step 8.

Figure 9 The Data Flow contains 6 objects


Step 8 Configure a data merge task

The entire configuration of the Merge Join is shown in Figure 10. This is the only property page for the Merge Join.

Data typing is strong in this environment. The data type of each column in the output table must match that of the corresponding column in the Person table. Figure 11 shows that the FirstName output column has been configured as a Unicode string; the LastName column in this datareader and the SSN column in the SSN datareader should be set to Unicode as well.
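For readers more at home in Transact-SQL, the effect of the Merge Join can be sketched as an inner join between the two staging tables. This is only an analogy, not the package's data flow: the join key (Id), the Person column types, and the sample rows are assumptions, and table variables stand in for the real tables so the sketch runs on its own. The explicit CAST mirrors the type-matching requirement described above.

```sql
-- Table variables stand in for the staging and final tables (assumed shapes).
DECLARE @Name   TABLE (Id int, FirstName nvarchar(50), LastName nvarchar(50));
DECLARE @SSN    TABLE (Id int, SSN char(11));
DECLARE @Person TABLE (FirstName nvarchar(50), LastName nvarchar(50), SSN nvarchar(11));

-- Made-up sample rows; single-row INSERTs keep the script valid on SQL Server 2005.
INSERT INTO @Name (Id, FirstName, LastName) VALUES (1, N'Ann', N'Lee');
INSERT INTO @Name (Id, FirstName, LastName) VALUES (2, N'Raj', N'Stone');
INSERT INTO @SSN (Id, SSN) VALUES (1, '000-00-0001');
INSERT INTO @SSN (Id, SSN) VALUES (2, '000-00-0002');

-- Join the two staging sets on Id and convert the non-Unicode SSN
-- to a Unicode type so it matches the destination column.
INSERT INTO @Person (FirstName, LastName, SSN)
SELECT n.FirstName, n.LastName, CAST(s.SSN AS nvarchar(11))
FROM @Name AS n
INNER JOIN @SSN AS s ON s.Id = n.Id
ORDER BY n.Id;

SELECT FirstName, LastName, SSN FROM @Person;
```

Unlike this query, the Merge Join transformation requires both of its inputs to be sorted on the join key before they reach it.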

Figure 10 Configuration of the Merge Join Object

Figure 11 Configuring the datareader column datatype properties

Figure 12 Final control flow of the project

The finished project should have a Control Flow diagram as shown in Figure 12. In this example, annotations have been added to label each task in the flow.

Figure 13 shows a few of the rows from the Person table populated using the SSIS package. A weakness in my calls to the RAND() SQL function inside the MakeASSN procedure shows up as a lot of commonality in the second and third segments of the data in the SSN column. In this snapshot of the data, you see that the value 71 is very popular in the second segment of the string, and there is a modal distribution in the last four characters of the string. There are clumps of similar values; ‘7137’ shows up in rows 98-103. I think the inclusion of a Common Language Runtime (CLR) assembly with a function to generate a random SSN string would be a significant improvement to the package and provide a performance increase.
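Short of a CLR assembly, one commonly used Transact-SQL alternative is to draw each segment from CHECKSUM(NEWID()) instead of RAND(). The sketch below is not the MakeASSN procedure from Resource B; it is a different technique, shown under the assumption that an SSN-shaped string of the form NNN-NN-NNNN is all that is needed. The variable names are illustrative.

```sql
-- Build one SSN-shaped string (NNN-NN-NNNN) from three independent random integers.
-- NEWID() produces a fresh GUID on every call, so no seed is involved.
DECLARE @Area int, @Grp int, @Serial int;

-- Taking the modulo before ABS avoids an overflow when CHECKSUM
-- happens to return the smallest possible int value.
SET @Area   = ABS(CHECKSUM(NEWID()) % 1000);
SET @Grp    = ABS(CHECKSUM(NEWID()) % 100);
SET @Serial = ABS(CHECKSUM(NEWID()) % 10000);

-- Left-pad each segment with zeros so the result is always 11 characters.
SELECT RIGHT('000'  + CAST(@Area   AS varchar(3)), 3) + '-'
     + RIGHT('00'   + CAST(@Grp    AS varchar(2)), 2) + '-'
     + RIGHT('0000' + CAST(@Serial AS varchar(4)), 4) AS SSN;
```

Seeded RAND() calls with neighboring seed values tend to return neighboring results, which may be one source of the clumping described above; drawing from NEWID() sidesteps seeding altogether. The 11-character result fits the char(11) SSN column defined in Resource B.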

Figure 13 The data generated by


I’ve presented a common development scenario, generating representative test data, and a possible solution to this requirement. The solution presented demonstrates a control flow containing a looping task, several SQL tasks, and a data flow using a merge object. Declaring package variables and passing variable values between tasks requires careful attention to detail when configuring the various tasks to share values among them. SSIS presents a flexible programming structure allowing no practical limit to extension. This flexibility brings with it a finer structure for controlling a group of tasks, the complexity of which bears careful experimentation. The environment provides many opportunities for tapping into the power of the .NET Framework but brings with it some new baggage such as case sensitivity, connection type requirements, and strict type casting.


About the Author:

Bill Kenworthy has been working with SQL Server since version 6.0. His love for database challenges is reflected in his writing. Bill lives with his wife and 2 dogs at the end of a dirt road in northern Washington State.

Resources:

A Data Dictionary for the project

Staging Tables

FName, LName
Two seed tables; the number of rows is not necessarily equal. These two tables contain the first and last name values that will be selected randomly and inserted into a row in the Name table.

Name, SSN
Working tables holding Name and SSN working data.

Production Table

Person
Stores patient name and SSN data.

Stored procedures

MakeAName requires an integer parameter that is used to seed the RAND() function. The procedure inserts a row into the Dev.dbo.Person table, providing values for the FirstName and LastName columns. The name values are randomly selected from the seed tables (FName and LName).

MakeASSN populates the SSN table with a unique combination of characters generated by the RAND() function. The stored procedure checks the size of the Person table and inserts the same number of rows into the staging table.

B Script for creation of the database objects

-- Seed table of first names
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[FName]') AND type in (N'U'))
BEGIN
CREATE TABLE [dbo].[FName](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [FirstName] [nvarchar](50) NULL
) ON [PRIMARY]
END
GO

-- Seed table of last names
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[LName]') AND type in (N'U'))
BEGIN
CREATE TABLE [dbo].[LName](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [LastName] [nvarchar](50) NULL
) ON [PRIMARY]
END
GO

-- Staging table for the generated SSN strings
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[SSN]') AND type in (N'U'))
BEGIN
CREATE TABLE [dbo].[SSN](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [SSN] [char](11) NULL
) ON [PRIMARY]
END
GO
