To create a data-driven subscription, select the Subscriptions tab on the report you wish to deliver and
click the New Data-driven Subscription button. This will guide you through the process of creating a
data-driven subscription. Data-driven subscriptions can be delivered by e-mail or written to a file share.
In either case, you can specify a data source containing the dynamic data for the report and write a
query to return the appropriate data.
Figure 74-5 shows the options available to specify the command or query that returns data for the
data-driven subscription. Just like a report, this data can be accessed from a variety of data sources, including
Microsoft SQL Server, Oracle, and XML. The values returned in the command or query can be used to
execute the report, as shown in Figure 74-6.
FIGURE 74-5
Use the Data-Driven Subscription feature to tailor report subscriptions to users based on another data
source.
FIGURE 74-6
To control report execution, provide static values or use values from the database.
In addition to dynamically setting the delivery settings for the report, the query fields can also set
values for the report parameters. This powerful feature enables you to dynamically deliver the right report
with the right content to the right user. Table 74-4 contains the delivery settings available for an e-mail
subscription, and Table 74-5 contains the delivery settings available for a file share subscription.
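As a rough illustration of the idea, the following Python sketch simulates how each row returned by the subscription query could drive one delivery. The ReportRecipients table, its columns, and the Region report parameter are hypothetical, and an in-memory sqlite3 database merely stands in for the real data source. Reporting Services performs this mapping itself once the query fields are assigned to settings and parameters in the subscription wizard.

```python
import sqlite3

# Hypothetical recipient table; in practice it lives in the subscription's data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ReportRecipients (Email TEXT, Format TEXT, Region TEXT)")
conn.executemany(
    "INSERT INTO ReportRecipients VALUES (?, ?, ?)",
    [("myself@xyz.com", "PDF", "West"), ("myboss@xyz.com", "CSV", "East")],
)


def build_deliveries(connection):
    """Map each query row to e-mail delivery settings and report parameter values."""
    rows = connection.execute(
        "SELECT Email, Format, Region FROM ReportRecipients ORDER BY Email"
    )
    deliveries = []
    for email, render_format, region in rows:
        deliveries.append({
            # Delivery settings: two values come from the query, two are static.
            "settings": {
                "TO": email,
                "IncludeReport": True,
                "RenderFormat": render_format,
                "Subject": "Daily sales summary",
            },
            # Report parameter (hypothetical) supplied by the same row.
            "parameters": {"Region": region},
        })
    return deliveries


deliveries = build_deliveries(conn)
for delivery in deliveries:
    print(delivery["settings"]["TO"], delivery["settings"]["RenderFormat"],
          delivery["parameters"])
```

Each row yields one delivery, so adding a recipient or changing a preferred format becomes a data change rather than a change to the subscription.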
Subscriptions can generate a variety of output formats, as detailed in Table 74-6. This provides great
flexibility to accommodate different usage of the output. For example, one user might prefer to receive
the report as a PDF because all of the formatting of the report remains intact and the file may be
easily distributed, while another user might prefer to receive the report as a comma-delimited file (CSV) so
the data can be imported into another system. Both CSV and Excel formats are a good choice if the user
wants the data in Excel, although Excel will attempt to retain the formatting of the report within Excel,
TABLE 74-4
Available E-Mail Delivery Settings
Setting | Description | Sample Value
TO | List of e-mail addresses to which the report will be sent. Separate multiple addresses with semicolons. Required. | myself@xyz.com;myboss@xyz.com
CC | List of e-mail addresses to which the report should be copied. Separate multiple addresses with semicolons. Optional. | mycoworker@xyz.com
BCC | List of e-mail addresses to which the report should be blind copied. Separate multiple addresses with semicolons. Optional. | mysecretinformer@xyz.com
ReplyTo | The e-mail address to which replies should be sent. Optional. | reportReplies@xyz.com
IncludeReport | True or False value. Set to True to include the report in the e-mail message. Use RenderFormat to control the format. | True
RenderFormat | The format of the report. See Table 74-6 for the list of valid values. Required when IncludeReport is True. |
Priority | Use High, Normal, or Low to set the priority of the e-mail message. | High
Subject | The subject of the e-mail message. | Daily sales summary
Comment | Text to be included in the body of the e-mail message. | This is the daily sales summary. Please review.
IncludeLink | True or False value. Set to True to include a link in the e-mail body to the report on the report server. Note that this is a link to the actual report with the parameters used to execute the report for this subscription; it is not a link to a snapshot of the report. | True
Data-driven subscriptions allow the same scheduling or trigger options as normal subscriptions. Once
you create a data-driven subscription, it will appear in the list of subscriptions on the My Subscriptions
page. Use this page to view information about the subscription, including trigger type, last run date and
time, and the subscription’s status. You can also edit the subscription from this page.
TABLE 74-5
Available File Share Delivery Settings
Setting | Description | Sample Value
FILENAME | The name of the file to be written to the shared folder. | MyReport_1
FILEEXTN | True or False value. When this is True, the file extension will be appended to the filename based on the specified render format. | True
PATH | The UNC path for the shared folder to which the file will be written. | \\computer\sharedFolder
RENDER_FORMAT | The format of the report. See Table 74-6 for the list of valid values. |
USERNAME | The username credential required to access the file share. | myDomain\bobUser
PASSWORD | The password credential required to access the file share. | Bobpasswd
WRITEMODE | Valid values include None, AutoIncrement, and OverWrite. | AutoIncrement
TABLE 74-6
Available Report Formats
Format | Description
HTML4.0 | Web page for IE 5.0 or later (.htm)
XML | XML file with report data
RPL | Report Page Layout, the Reporting Services internal binary format
Reporting Services provides a robust set of facilities to enable administration of the report server. SQL
Server Management Studio configures the basic server features and defines roles, while Report Manager
configures the application of those roles to individual objects. Up-front planning of report server
permissioning, including both the granularity with which permissions will be managed and the reports
both shared and not shared between various users, can drive deployment strategies, especially the folder
hierarchy.
Deploying reports and related objects via BIDS, Report Manager, or custom applications provides many
options to meet the needs of individual environments. Consider deploying documentation and related
information directly to the report server as well; users welcome such supporting information, and the
report server is happy to provide ad-hoc access to nearly any file type.
Linked reports provide a way to customize report execution in many ways, enabling a “develop
once, deploy many times” strategy. They also can be used to simplify permission schemes. Standard
subscriptions provide users with a way to be notified of reports containing the latest information or
periodic updates. Data-driven subscriptions provide enterprise installations with a convenient way to
manage centralized report generation and distribution.
Given a reasonably well-thought-out configuration, these features combine to provide a platform that can
be an effective tool for users, developers, and administrators alike.
Analyzing Data with Excel
IN THIS CHAPTER
Understanding the benefits of ad-hoc data analysis
Building connections to both relational and multidimensional data
Sorting, filtering, and reviewing relational data in Excel tables
Discovering data relationships and trends using PivotTables and PivotCharts
Taking analysis to the next level using data mining add-ins
Using data mining to detect erroneous data based only on data set patterns
Forecasting time series data based on historical trends
Reporting Services provides a method to create reports that expose trends,
exceptions, and other important aspects of data stored in SQL Server.
Reports can be created with a level of interactivity, but even the most
interactive reports limit how the end-user interacts with the data.
Using Microsoft Excel to analyze data gives users much greater flexibility and
interactivity. Because Excel is in common use and most staff already know how to
use at least the basic features, it also lowers the training hurdles that prospective
users face. This enables a much larger audience to undertake data analysis, so
they won’t have to make do with canned reports.
The advantage of data analysis is the ability to discover trends and relationships
that are not obvious, and to look at data in ways and combinations not normally
performed. Ad-hoc analysis is also a good way to quickly prototype reports,
enabling report development to happen once requirements are well understood.
With the addition of data mining features to Excel 2007, options for including
mining models in routine analysis can be explored as well.
This chapter focuses on the features of Microsoft Excel 2007 that use SQL Server
data or features: how to retrieve data from relational and multidimensional
databases, common ways of analyzing such data, and how data mining features
can help you understand the data.
Organizational interest in data analysis tends to be focused among a small
population, with the majority of staff satisfied with reports created by others. Interested
staff tend to share the following characteristics:
■ They perceive the value of data to their professional success (or feel
hindered by a lack of data).
■ They have mastered basic office automation skills (e.g., spreadsheet
construction).
Championing data analysis among staff likely to have these characteristics can have a positive impact on the
organization, increasing the availability of data to staff while decreasing the number of reports needed.
Data Connections
A data connection describes how to connect to a server or other source of data, and optionally the query
or table from which to retrieve data. This may seem like a mundane topic, but how a workbook’s
connections are defined has important implications for validity, reuse, and sharing of analyses. The Data tab
of Excel’s Ribbon provides several functions to create and manage connections:
■ Get External Data: Invokes wizards and dialogs to create a new connection. Once the connection has been defined for the workbook, a connection file containing a description of the connection is created to enable that connection to be reused. While connections can be defined to a variety of data sources, this chapter focuses on getting data from SQL Server. The primary ways to define these connections are located on the From Other Sources menu, and include From SQL Server, From Analysis Services, and From Microsoft Query.
■ Existing Connections: Lists all the connections that Excel can find, including those in the current workbook, any connection files found on the network (in a SharePoint Data Connection Library), or any connection files in the user’s local My Data Sources folder. Selecting a connection file makes that connection part of the workbook and invokes dialogs to import its data.
While the Existing Connections dialog often contains all the connections of interest, connection files located in other folders can be retrieved by pressing the Browse for More button. If the desired file is not found, it can be created by pressing the New Source button on the Browse dialog, which invokes the Data Connection Wizard, similar to choosing Get External Data, described above.
■ Connections: Choosing Connections from the Data tab of Excel’s Ribbon will display the connections currently in use by the workbook. The workbook’s copy of a connection can be viewed and modified via the properties dialog or removed from the workbook with the Remove button, and new connections can be added by clicking the Add button, which invokes the Existing Connections dialog described above. In addition, the properties of all existing connections can be examined and modified.
By default, a connection is cached by each workbook in which it is used, and that cached copy is
used to retrieve data until that data source becomes unavailable, at which point the corresponding data
connection file is read to see if anything has changed. Alternatively, setting the connection property “Always
use connection file” (on the Definition tab) will reverse Excel’s search order (file first, cached connection
last). This alternate setting is useful when a data source could change without the old one being
eliminated; in that case only the connection file requires an update, instead of every workbook that references
the connection.
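The search order described above can be modeled in a few lines. The following Python sketch is only an illustration of the described behavior, not Excel's actual implementation; the function name, the dictionary layout, and the server names OLDSQL and NEWSQL are assumptions made for the example.

```python
def resolve_connection(cached, file_definition, is_available,
                       always_use_connection_file=False):
    """Return the first connection definition, in search order, that is reachable.

    cached                      definition stored inside the workbook
    file_definition             definition read from the connection file
    is_available                callable that reports whether a source responds
    always_use_connection_file  models the "Always use connection file" property
    """
    # Default order: cached copy first, connection file second.
    candidates = [cached, file_definition]
    if always_use_connection_file:
        # The property reverses the order: file first, cached copy last.
        candidates.reverse()
    for definition in candidates:
        if definition is not None and is_available(definition):
            return definition
    return None


# Scenario: the data moved to a new server, but the old server is still running.
cached_copy = {"server": "OLDSQL"}
file_copy = {"server": "NEWSQL"}
everything_up = lambda definition: True

print(resolve_connection(cached_copy, file_copy, everything_up))
print(resolve_connection(cached_copy, file_copy, everything_up,
                         always_use_connection_file=True))
```

With the default order the workbook keeps reading from the old server for as long as it responds; with the property set, updating the single connection file redirects every workbook that uses it.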
Other important connection properties to consider include the following:
■ Save password: Found on the Definition tab, this determines whether a password is saved in the connection and thus visible to others, or not saved, whereby each user is prompted for the password. This issue can be avoided by using integrated security connections.
■ Refresh options: Found on the Usage tab, this enables external data to be refreshed when the workbook is opened, on a regular interval, or both.
■ OLAP options: Found on the Usage tab, this determines which server formatting will appear
for query results and how many rows will be displayed on drill-through operations.
Managing Connections in Microsoft Office SharePoint Server
It is generally desirable to deploy task or subject-specific data connections, rather than train staff about
servers, databases, tables, and so on; but deploying data connections in an organization can be challenging
as well, especially when database locations or structural changes require updates to an unknown number of
workbooks stored in an unknown number of locations.
SharePoint offers a good solution for centrally storing and managing connection information via a Data
Connection Library. Connections stored in such a library appear as “Connection files on the Network” for
Excel users. Setting up the connection library takes a few steps:
1. Choose the site on which the connection library will be hosted, select Create from
the Site Actions menu, and then choose Data Connection Library. Give it a name and
click the Create button. Save the URL of the new library.
2. The new library must then be marked as trusted to work as expected. Run SharePoint
Central Administration, choose the Application Management tab, and then select the
“Create or configure this farm’s shared services” link to display a list of shared service
providers. Choose the provider that hosts the services of interest (e.g., SharedServices1)
to view available operations. Select “Trusted data connection libraries” to view the list
of trusted libraries and add the URL of the library created above. When copying URLs
from the address bar of a browser, eliminate the /Forms/AllItems.aspx suffix.
For example, http://home.mysite.com/Sitename/Libraryname/Forms/AllItems.aspx
becomes http://home.mysite.com/Sitename/Libraryname
when specifying the library location.
3. Upload connection files to your new Data Connection Library either by using the
library’s Upload function or by exporting connections from Excel. Find the Export
Connection File button on the Connection Properties dialog in Excel.
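The URL clean-up mentioned in step 2 is easy to get wrong by hand when several libraries are registered. The following is a minimal Python sketch of that one step; the function name library_url is hypothetical, and the sketch assumes the library uses the default /Forms/AllItems.aspx view page named in the step.

```python
VIEW_SUFFIX = "/Forms/AllItems.aspx"


def library_url(browser_url):
    """Strip the default view page from a library URL copied from the browser."""
    url = browser_url.strip()
    # Compare case-insensitively, since URL casing can vary in the address bar.
    if url.lower().endswith(VIEW_SUFFIX.lower()):
        url = url[: -len(VIEW_SUFFIX)]
    # Drop any trailing slash so the result names the library itself.
    return url.rstrip("/")


print(library_url("http://home.mysite.com/Sitename/Libraryname/Forms/AllItems.aspx"))
```

A URL that already points at the library passes through unchanged, so the function can be applied to every address before it is added to the trusted list.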
Sharing connections requires a bit of planning. Server names must make sense for the audience that is
sharing the connection (e.g., referring to the “localhost” server implies a different server for each user of
the connection). In addition, choose authentication methods that enable the intended audience to read the
target data.
Data Connection Libraries are a good approach to sharing connection information in an organization, and a
requirement if Excel Services is used.
Data Connection Wizard
Choosing to create a new connection to either SQL Server or Analysis Services will invoke the Data
Connection Wizard to define the server, credentials, database, and optionally the table/cube to be queried.
Specifying only a database results in a very generic connection that can be widely used by an audience
that understands which database object they desire access to, but a generic connection to a relational
database also prompts the Excel user to choose the appropriate database object every time the data is
refreshed.
Because the Data Connection Wizard does not offer the opportunity to enter a query, it is best suited
to relational scenarios in which views have been built to present large, flattened data sets appropriate
for performing analyses without requiring joins to other tables or views. For other scenarios, modify the
connection to include a query:
1. Create a connection that defines the appropriate server and database using the wizard.
2. Construct the T-SQL required to return the data of interest in another environment, such as
SQL Server Management Studio.
3. Launch the Connections dialog from the Data tab, locate the new connection, and examine its
properties. On the Definition tab of the dialog, change the command type to SQL, and then paste the T-SQL into the connection.
When the wizard creates the connection, it automatically creates an .odc file containing the connection details in the user’s My Data Sources folder. This enables future references
to the same data to be chosen from the Existing Connections list, a handy reuse that is improved by
carefully naming connections.
Anytime the properties of a connection are altered, such as described above, Excel warns that the cached
and file copies of the connection will be different. They can be made identical again by using the Export
Connection File function in the Connection Properties dialog.
Several types of connection files are discussed in the following sections. The wizards described
previously will generate Office Data Connection (.odc) files, whereas the From Microsoft Query wizards will
generate an Excel ODBC Query (.dqy) file. Excel OLE DB Query (.rqy) and Excel OLAP Query (.oqy) files provide
an alternative that enables the placement of queries in easily edited connection files without the baggage
associated with ODBC.
Microsoft Query
Microsoft Query provides a graphical design environment for relational queries. While this is somewhat
complex and relies on the deprecated ODBC technology, it can be effective for some users who have
difficulty with other methods of query construction.
Create a new Microsoft Query connection by choosing From Microsoft Query on the Data tab’s From
Other Sources menu. The wizard walks through choosing or defining a data source, and launches the
Microsoft Query applet to define the query that will return data to Excel. Once the query has been
defined, simply exit the applet to return the selected data to Excel for analysis. Note the “Use the Query
Wizard to create/edit queries” check box at the bottom of the first wizard dialog. Checking this box will
insert into the dialog additional steps that attempt to simplify the query definition process, but many
users will find these additional steps confusing.
Unlike the Data Connection Wizard, the Microsoft Query process saves only a Data Source Name (.dsn)
file, which omits the query definition. Fortunately, the full connection information, including the query, can
be saved to an .odc file by using the Export Connection File function in the Connection Properties dialog.
Connection file types
Excel 2007 emphasizes the use of Office Data Connection (.odc) files, which are XML documents that
define the connection details. These files are easily created using Excel’s wizards and tools as described
above, but not easy to create or edit outside of Excel.
The Excel OLE DB Query (.rqy) file is useful to define extensions outside of Excel. Consider the
following example that defines a relational query against AdventureWorksDW:
The connection files in this section contain long lines that must not be broken across multiple
lines. Unfortunately, they appear that way in print due to space limitations. In the samples
below, make sure to continue indented text on the previous line.
QueryType=OLEDB
Version=1
Connection=Provider=SQLOLEDB;Server=(local);Database=AdventureWorksDW;
    Trusted_Connection=yes
CommandText=SELECT LastName, Gender, EmailAddress FROM dbo.DimCustomer
The connection string uses the standard format and the command text can be any valid query, although
it must be listed on a single line. For queries against Analysis Services, the similar Excel OLAP Query
(.oqy) file, shown next, provides a simple format to define a cube connection:
QueryType=OLEDB
Version=1
CommandType=Cube
Connection=Provider=MSOLAP.3;Initial Catalog=Adventure Works DW;
    Data Source=(local);Location=(local)
CommandText=Adventure Works
Alternatively, connections using MDX queries can be defined using the format shown here:
QueryType=OLEDB
Version=1
CommandType=MDX
Connection=Provider=MSOLAP.3;Initial Catalog=Adventure Works DW;
    Data Source=(local);Location=(local)
CommandText=select [Measures].[Internet Order Count] on 0,
    [Date].[Calendar Year].members on 1 from [Adventure Works]
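Because every value in these files must stay on one line, generating them from a script can avoid accidental line breaks. The following Python sketch writes a file using the keys shown in the samples above; the function name build_query_file and the file name InternetOrders.oqy are hypothetical, and the sketch assumes Excel accepts the keys in the order the samples use.

```python
from pathlib import Path
import tempfile


def build_query_file(connection_parts, command_text, command_type=None):
    """Return the text of an .rqy/.oqy style file with each value on one line."""
    # Join the connection string pieces with semicolons, never with line breaks.
    connection = ";".join(f"{key}={value}" for key, value in connection_parts.items())
    # Collapse line breaks and repeated whitespace in the query into single spaces.
    query = " ".join(command_text.split())
    lines = ["QueryType=OLEDB", "Version=1"]
    if command_type:
        lines.append(f"CommandType={command_type}")
    lines.append(f"Connection={connection}")
    lines.append(f"CommandText={query}")
    return "\n".join(lines) + "\n"


mdx = """select [Measures].[Internet Order Count] on 0,
    [Date].[Calendar Year].members on 1 from [Adventure Works]"""

text = build_query_file(
    {
        "Provider": "MSOLAP.3",
        "Initial Catalog": "Adventure Works DW",
        "Data Source": "(local)",
        "Location": "(local)",
    },
    mdx,
    command_type="MDX",
)

# Write to a temporary folder; a real deployment would target a shared location.
target = Path(tempfile.mkdtemp()) / "InternetOrders.oqy"
target.write_text(text)
print(text)
```

Passing the connection string as separate parts sidesteps the question of where the printed samples were wrapped, and the query can be formatted across several lines in the script without affecting the file.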
Basic Data Analysis
Data can be retrieved using connections into three forms within Excel. A data table provides a simple
list of data, with one row in Excel for each row returned from the relational query it is based on