If Not Row.LastName_IsNull Then
    .Write(Row.LastName)
End If
.Write(columnDelimiter)
If Not Row.Title_IsNull Then
    .Write(Row.Title)
End If
.Write(columnDelimiter)
If Not Row.SalesAmount_IsNull Then
    .Write(Row.SalesAmount)
End If
.Write(columnDelimiter)
If Not Row.IsBonusPaid_IsNull Then
    .Write(Row.IsBonusPaid)
End If
.Write(columnDelimiter)
If Not Row.BonusAmount_IsNull Then
    .Write(Row.BonusAmount)
End If
.WriteLine()
End With
Close the VSTA IDE and the Script Transformation Editor. We will now move on to create a script destination to write to a nonstandard text file.
Nonstandard Text File Destination
In this part, you will write the asynchronous output to a nonstandard text file. Considering that you can now do some of the configurations yourself, this section is kept short and quick.
52. Add a File Connection Manager to create a SalesSummary.txt file in the C:\SSIS\RawFiles folder.
53. Add a Script destination component, rename it Script Non Standard Destination, and join it with the asynchronous output from the Script Transformation component.
54. Perform the following in the editor:
    - Change the language.
    - Select both the input columns.
    - Rename Input 0 to DstAsyncInput.
    - Add the SalesSummary.txt Connection Manager and name it SalesSummary.
55. In the Code-design mode, perform the following:
    - Add the System.IO namespace in the Imports section.
    - Create two script variables:

Dim cmSalesSummary As String
Private textWriter As StreamWriter
    - Add the following code for the AcquireConnections method:

Public Overrides Sub AcquireConnections(ByVal Transaction As Object)
    Dim connMgr As IDTSConnectionManager100 = Me.Connections.SalesSummary
    cmSalesSummary = CType(connMgr.AcquireConnection(Nothing), String)
End Sub
56. Initialize the textWriter in the PreExecute method:
textWriter = New StreamWriter(cmSalesSummary, False)
57. Close the textWriter in the PostExecute method:
textWriter.Close()
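Taken together, steps 56 and 57 translate into the following two overrides (a minimal sketch; it assumes the standard ScriptMain override signatures and the cmSalesSummary and textWriter variables declared earlier):

```vb
Public Overrides Sub PreExecute()
    MyBase.PreExecute()
    ' Open the destination file; False means overwrite any existing file
    textWriter = New StreamWriter(cmSalesSummary, False)
End Sub

Public Overrides Sub PostExecute()
    ' Flush buffered output and release the file handle
    textWriter.Close()
    MyBase.PostExecute()
End Sub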
58. Last, type the following code in the DstAsyncInput_ProcessInputRow method:
With textWriter
    .Write("This is a Summary Report for Sales")
    .WriteLine()
    .WriteLine()
    .Write("Total Sales Amount: " + Row.TotalSalesAmount.ToString)
    .WriteLine()
    .Write("Total Bonus Amount: " + Row.TotalBonusAmount.ToString)
    .WriteLine()
End With
59. Execute the package and check the output files. You will see the data from the synchronous output written to a comma-delimited text file and the data from the asynchronous output written to a nonstandard text file (Figure 11-19).
Figure 11-19 A nonstandard text file written with the data from asynchronous output
Debugging Techniques for Script Component
In the previous parts, you've used a combination of the Row Count transformation and data viewers to break the data flow and inspect the records visually. While this technique is very helpful in debugging and identifying the component that is causing issues, it fails if you want to know what's happening within your own developed component. You've used breakpoints within your scripts while working with the Script task to step through the code and isolate the error-causing rows, but unfortunately the Script component does not support breakpoints. So, in a Script component you have to use alternate techniques to capture the information within your code. SSIS provides the following methods to capture information from within your script:
Raise informational messages   Use the MessageBox.Show, Console.Write, or Console.WriteLine methods. You will use these methods during development when you want to find out what rows or values are causing errors. For example, when you use the MessageBox.Show method, the output is displayed in a modal message box, which you have to click to proceed further. You won't use this method for a normal run; otherwise, you would have to keep clicking through the pipeline to progress the package execution. You can test your package by adding the following line in the Script Transformation component in the XfrInput_ProcessInputRow method, just before the line where you pass the BonusPaid value to the Row.IsBonusPaid column (refer to Figure 11-20). This will show you the bonus for each row, and you will have to click OK to progress to the next row:
MessageBox.Show("BonusAmount: " + Bonus.ToString)
Raise events   You can raise various events to let the package know about information, warnings, or errors raised within the script. The Script component provides the following methods that you use to raise different events:
- FireCustomEvent   Fires a user-defined custom event; e.g., you may want to capture certain data-related attributes.
- FireError   Fires an error event.
- FireInformation   Fires an informational message; e.g., information for auditing purposes.
- FireProgress   Fires the on-progress event.
- FireWarning   Fires a warning that is less critical than an error; e.g., a warning that a varchar(50) column is being mapped to a varchar(10) column, which will work as long as the data doesn't exceed ten characters.
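As a sketch of how the error- and warning-raising methods are called (the subComponent and description strings here are illustrative, not from the exercise; the signatures follow the IDTSComponentMetaData100 interface, with FireError taking a ByRef cancel flag that can stop execution):

```vb
Dim cancel As Boolean = False
' FireError(errorCode, subComponent, description, helpFile, helpContext, ByRef cancel)
Me.ComponentMetaData.FireError(0, "Script Transformation Component", _
    "BonusAmount could not be calculated", String.Empty, 0, cancel)

' FireWarning(warningCode, subComponent, description, helpFile, helpContext)
Me.ComponentMetaData.FireWarning(0, "Script Transformation Component", _
    "BonusAmount is unusually large", String.Empty, 0)
```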
As you know from the Script task, these methods were exposed by the Events property of the Dts global object. The Script component doesn't have any such object; though the event-firing methods have the same names, they are methods of the IDTSComponentMetaData100 interface exposed by the ComponentMetaData property of the ScriptMain class. This also means the usage is different from what you used in the Script task. Once the events have been raised by the Script component, the package event handlers then manage the event notifications. You can build your own event handlers to respond to a certain event; e.g., you can change the course of a package workflow in response to an event raised by the Script component.
To raise an event, you need to declare a variable for this interface:

Dim myMetaData As IDTSComponentMetaData100

and then add the following lines in the code where you earlier put the MessageBox (Figure 11-20):

If Bonus > 1000 Then
    myMetaData = Me.ComponentMetaData
    myMetaData.FireInformation(0, "Script Transformation Component", _
        "Bonus paid: " + Bonus.ToString, String.Empty, 0, False)
End If
Figure 11-20 Raising events in a Script component
This will raise informational messages to tell the package the amount of bonus paid whenever the bonus is more than 1000. You can see these messages displayed in the Output window and in the Progress tab. Refer to Books Online to learn how the different methods can be written and how they print their messages.
Log information   Much as when raising events, you can log detailed information about the execution progress, results, and errors. You can also raise events first and then log those events later. The main difference between raising events and then logging them, compared with logging directly to a log provider, is that direct logging doesn't provide you an opportunity to build event handlers to respond to the events. If you want to log directly without using any other log event, you must use the ScriptComponentLogEntry event while configuring logging for the Data Flow task, as shown in Figure 11-21.

Figure 11-21 Custom log entry for the Script component

The ScriptComponentLogEntry event gets added to the Data Flow task events in the Log Details tab when you add a Script component into the Data Flow task. After selecting this log entry, use the Log method within the ScriptMain class to log directly to the log providers enabled in the package. For example, if you add the following code in the Script Transformation component instead of the information event, the package will log the lines where the bonus is more than 1000:
Dim bt(0) As Byte
If Bonus > 1000 Then
    Me.Log(Bonus.ToString, 0, bt)
End If
Review
You’ve used the Script component to configure a source, a transformation, and a
destination in this exercise You’ve also learned the different types of outputs—i.e.,
synchronous and asynchronous outputs—that you can configure for a Script
Transformation component The general steps you followed to create any of these
components is to configure the component first in the metadata design mode and
then in the code design mode to write custom code for the processing rules Do not
underestimate the power of metadata design mode, as the settings you use in this
mode are used to create code for you in the three project items—BufferWrapper.vb,
ComponentWrapper.vb, and main.vb While the code is auto-generated for both the
BufferWrapper.vb and ComponentWrapper.vb items whenever you make a change,
this does not happen for the main.vb project item The code for the main.vb project
item is auto-generated only once, when you invoke the code design mode for the first
time Though you can always write the code to fix anything that has been missed by
the auto-generated code, it will be far easier and effective to let the auto-generation
engine work for you, especially during the initial days of your working with the Script
component Also, make sure you understand the differences between a Script task and
the Script component, especially how the different properties and methods are exposed
within both the objects For example, the variables and the connections are exposed by
the Dts object in the Script task, while they are exposed using strongly typed accessor
properties within the Script component Last, the debugging techniques are different
in both the objects—while you can use breakpoints to debug a Script task, you’ve to use
alternate methods to debug a Script component such as MessageBoxes, data viewers,
and events
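The point about how variables are exposed can be sketched side by side (illustrative only; it assumes a package variable named RowCount that has been added to the component's ReadWriteVariables list):

```vb
' Script task: access through the Dts global object
Dts.Variables("User::RowCount").Value = 100

' Script component: strongly typed accessor generated from the metadata design
Me.Variables.RowCount = 100
```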
Trang 7You’ve worked with the object model of Integration Services in this chapter and learned the different options it provides to extend an SSIS package Based on your problem and the skills you have in-house, you can choose an appropriate method from scripting, developing custom objects, or programming packages from scratch Later you worked with the Script task and the Script component and understood the various methods and properties provided by their object models You also learned that though you have to write code differently for both the objects to use their methods, they provide the same functions, such as access to package variables or connection managers and the ability
to raise events and log information There are many samples contained in the Books Online that can enhance your understanding about the workings of SSIS object model Also, check out the CodePlex site as well to see some rich programming content to hone your programming skills
Data Warehousing and SQL Server 2008 Enhancements

In This Chapter
- The Need for a Data Warehouse
- Data Warehouse Design Approaches
- Data Warehouse Architecture
- Data Warehouse Data Models
- Loading a Dimension Using a Slowly Changing Dimension
- Data Model Schema
- Building a Star Schema
- SQL Server 2008 R2 Features and Enhancements
- SQL Server 2008 R2 Data Warehouse Editions
- SQL Server 2008 R2 Data Warehouse Solutions
- SQL Server 2008 R2 Data Warehouse Enhancements
- Summary
Now that you have gone through all the development chapters on Integration Services, it is time to take a slight digression to understand the facilities available outside SSIS that can be very helpful in developing a well-performing data operations and management system. Soon after developers start using an ETL tool such as Integration Services, they find themselves working on some operation of a data mart or a data warehouse. It is therefore necessary that you understand data warehousing and the processes involved, along with the tools and utilities available in SQL Server 2008 that can be exploited to develop data warehousing processes. You can use the tools and utilities provided in SQL Server 2008 to enhance an SSIS package or replace some of the package functionality so that the overall solution performs well. This chapter is split primarily into two parts, with the first part focusing on data warehousing and the second part on SQL Server 2008 database engine enhancements, such as change data capture (CDC), backup compression, and the MERGE statement, along with two new editions specially targeted at data warehouses. The purpose of this chapter is not to cover how to use or configure the new functionalities, as that would be too great a deviation from the subject of this book, but to show how you can use them alongside SSIS to best utilize the toolset you have. However, if I succeed in my little effort to induce enough interest in data warehousing, there are tons of books and other online material available that I would recommend you check out.
The Need for a Data Warehouse
As a business starts, it needs to capture information and record transactions to support business needs. The applications that help serve this need are called line of business (LOB) applications. The systems used to support line of business applications are referred to as online transactional processing (OLTP) systems. OLTP systems are designed to respond very quickly for accessing and recording small chunks of information. Generally, lots of users are connected to such systems, performing transactions to support activities such as sales, billing, and order management. Technically, OLTP systems are highly normalized to capture the current state of information, not to keep historical data. As the business grows, the business managers need some reporting out of the information systems to take the business forward; for example, how many sales happened yesterday or how many deliveries were completed. These kinds of reports are called operational reporting and to some extent can be served from OLTP systems. However, when the business has grown enough that the OLTP systems are busy most of the time, you can't run reports on them. Running reports touches more data and requires more resources, which affects the performance of already-busy OLTP systems and can result in lost sales. So, IT extracts data from the OLTP systems into new systems to serve the reports. As the report recipients generally do not want to look at detailed information related to individual transactions, it makes sense to extract data only to the lowest level of granularity required.
Think of a web-based business that allows customers to buy products online. The underlying OLTP system keeps information about all the activities of the customers that may not be important for creating business reports. For example, a customer might have added three similar types of products to the basket, but while checking out removed two of the products from the basket after looking at the delivery charges for each of the items. While this data might be interesting for purposes such as data mining or understanding customer behavior, it is not very interesting for the business manager, who is concerned only about the sales of the day. So, while taking data out of an OLTP system, the data is cleaned at the individual sales level, removing unwanted details. This data is still detailed data as per the requirements of the business manager and is kept more or less in the same format as that of the source OLTP system. Furthermore, this data might be combined with data from other source systems to create a bigger picture. In our example, consider joining the sales information with the information coming from the customer repository system to know who is an active buyer. These systems that keep cleaned, subject-oriented information at a detailed level and collect this information in real- or near-real-time mode are called operational data stores, or ODS for short.
Once the business creates an ODS system, it looks forward to getting the right information at the right time to make well-informed decisions. However, an ODS is limited by its very definition and cannot answer questions that are spread across the enterprise or are related to multiple departments. This is the beginning of the need for a decision support system (DSS) within the organization. As an organization grows further, and it becomes important to improve business performance and make strategically right decisions, more complex questions are put to the information systems. These complex questions generally contain comparisons with a previous period, such as sales this quarter compared to sales in the same quarter last year. This creates a need to keep data over longer periods of time. Some businesses that fall under strict regulations, e.g., the financial services industry, may be required to keep data for as long as five to ten years. Keeping data over longer periods is not just to comply with regulations, but also to get answers that otherwise would not be possible. A business dependent on sales around Christmas day might ask for a sales comparison over a week not with the last year, but with the most recent year when Christmas fell on the same day of the week (say, Thursday). Also, as the organization evolves and starts to sell multiple products across multiple zones and countries, another need arises: to see a single version of the truth. This means that the data from multiple source systems needs to be collected in a common place, and business rules need to be applied to collate the data to create an information store that can answer complex questions across countries, currencies, products, departments, and time. The operational reports created on top