Trang 1Nielsen c07.tex V4 - 07/23/2009 9:03pm Page 162
Part I Laying the Foundation
The script first loads all of the DLLs needed for SMO and the snap-ins (code and commands that
extend PowerShell’s capabilities), and then it loads the snap-ins. Once this is done, all of the SQL Server
provider functionality is available, and this script can be run against PowerShell 1.0 or PowerShell 2.0
when it is available.
The SQL PSDrive – SQLSERVER:
Native PowerShell provides the ability to navigate not only the disk file system, but also the system
registry as though it were a file system. (This is expected behavior for a shell environment, as Unix shell
systems treat most everything as a file system as well.) The SQL Server provider adds a new PowerShell
drive, also referred to as a PSDrive, called SQLSERVER:. The Set-Location cmdlet (usually aliased
as cd) is used to change to the SQLSERVER: drive, and then SQL Server can be navigated like the file
system.
The SQLSERVER: drive has four main directories: SQL, SQLPolicy, SQLRegistration, and
DataCollection.
■ The SQL folder provides access to the database engine, SQL Server Agent, Service Broker, and
Database Mail, all using the various SMO DLLs.
■ The SQLPolicy folder provides access to policy-based management using the DMF and Facets
DLLs.
■ The SQLRegistration folder enables access to the Registered Servers (and the new Central
Management Server feature of SQL Server 2008).
■ The DataCollection folder enables access to the Data Collector objects provided with the
Management Data Warehouse feature of SQL Server 2008.
You can browse the SQLSERVER: file system just like a disk file system. Issuing the command cd SQL
(or Set-Location SQL) and running the Get-ChildItem cmdlet returns the local server and any
other servers that may have been recently accessed from the PowerShell session. Changing to the local
server and running Get-ChildItem returns the names of the SQL Server instances installed on that
server. Changing to one of the instances and running Get-ChildItem returns the collections of objects
available to that server, such as BackupDevices, Databases, Logins, and so on. Changing to the Databases
collection and running Get-ChildItem returns the list of user databases, along with some of the
database properties. The results will look something like Figure 7-6.
SQL cmdlets
The SQL Server PowerShell snap-in also provides new cmdlets specific for use with SQL Server.
The majority of administrative functions are managed using SMO, and data access is managed using
ADO.NET, as mentioned before, so no cmdlets were needed for these functions. Some functions are just
easier using cmdlets, so they were provided. They include the following:
■ Invoke-Sqlcmd
■ Invoke-PolicyEvaluation
■ Encode-SqlName
■ Decode-SqlName
■ Convert-UrnToPath
FIGURE 7-6
Navigating the SQL Server “filesystem”
The first, Invoke-Sqlcmd, takes query text and sends it to SQL Server for processing. Rather than set
up the structures in ADO.NET to execute queries, the Invoke-Sqlcmd cmdlet returns results from a
query passed in as a parameter or from a text file, which provides a very easy way to get data out of
SQL Server. It can perform either a standard Transact-SQL query or an XQuery statement, which
provides additional flexibility.
The Invoke-PolicyEvaluation cmdlet uses the Policy-based Management feature of SQL Server
2008. It evaluates a set of objects against a policy defined for one or more servers to determine whether
or not the objects comply with the conditions defined in the policy. It can also be used to reset
object settings to comply with the policy, if that is needed. Lara Rubbelke has a set of blog posts on
using this cmdlet:
http://sqlblog.com/blogs/lara_rubbelke/archive/2008/06/19/evaluating-policies-on-demand-through-powershell.aspx
The character set used by SQL Server has a number of conflicts with the character set allowed by
PowerShell. For example, a standard SQL Server instance name is SQLTBWS\INST01. The backslash
embedded in the name can cause PowerShell to infer a file system directory and subdirectory, because it
uses that character to separate the elements of the file system. The Encode-SqlName cmdlet converts
strings acceptable to SQL Server into strings acceptable to PowerShell. For example, the instance name
SQLTBWS\INST01 would be converted by this cmdlet into SQLTBWS%5CINST01.
The Decode-SqlName cmdlet does the exact opposite of Encode-SqlName: It converts the
PowerShell-acceptable string of SQLTBWS%5CINST01 back to SQLTBWS\INST01.
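The %5C in the example is ordinary percent-encoding: 5C is the hexadecimal character code for the backslash. The round trip can be sketched in a few lines of Python. This is only an illustration of the encoding idea, not the cmdlets' implementation; the function names encode_sql_name and decode_sql_name are hypothetical, and the real cmdlets may escape a different set of characters than Python's quote function does.

```python
from urllib.parse import quote, unquote


def encode_sql_name(name: str) -> str:
    """Percent-encode every character except letters, digits, and _ . - ~
    (hypothetical stand-in for the idea behind Encode-SqlName)."""
    return quote(name, safe="")


def decode_sql_name(encoded: str) -> str:
    """Reverse the percent-encoding (stand-in for Decode-SqlName)."""
    return unquote(encoded)


instance = "SQLTBWS\\INST01"          # Python string literal for SQLTBWS\INST01
encoded = encode_sql_name(instance)
decoded = decode_sql_name(encoded)

print(encoded)   # SQLTBWS%5CINST01
print(decoded)   # SQLTBWS\INST01
```

Because the encoded form contains no backslash, it can sit inside a provider path as a single path element.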
Because SMO uses Uniform Resource Names (URNs) for its objects, a cmdlet is provided to convert
those URN values to path names, which can be used in a Set-Location cmdlet, for example, to
navigate through the SQL Server objects. The URN for the HumanResources.Employee table in
AdventureWorks2008 on SQLTBWS\INST01 is as follows:
Server[@Name='SQLTBWS\INST01']\Database[@Name='AdventureWorks2008']\
Table[@Name='Employee' and @Schema='HumanResources']
Converting that to a path using Convert-UrnToPath would yield the following:
SQLSERVER:\SQL\SQLTBWS\INST01\Databases\AdventureWorks2008\
Tables\HumanResources.Employee
Summary
After looking at the basics of PowerShell and exploring a few ways to get some interesting information
about servers, this chapter reviewed a script to provide information about each server you manage. Then
it examined some of the structures in SQL Server Management Objects (SMO) and some scripts to
perform basic administrative tasks. This chapter also looked at a couple of scripts to extract data from SQL
Server, because that’s a common request from businesspeople. Finally, this chapter took a quick look at
the features in SQL Server 2008 to make PowerShell an integral part of the SQL Server toolset.
Much more can be explored with PowerShell, but this will provide a starting point. Automation enables
administrators to do more in less time and provide more value to the companies that employ them.
PowerShell is a powerful way to automate most everything an administrator needs to do with SQL
Server.
Manipulating Data
with Select
IN THIS PART
Chapter 8
Introducing Basic Query Flow
Chapter 9
Data Types, Expressions, and Scalar Functions
Chapter 10
Merging Data with Joins and Unions
Chapter 11
Including Data with Subqueries and CTEs
Chapter 12
Aggregating Data
Chapter 13
Windowing and Ranking
Chapter 14
Projecting Data Through Views
Chapter 15
Modifying Data
Chapter 16
Modification Obstacles
SQL is like algebra in action.
The etymology of the word “algebra” goes back to the Arabic word “al-jabr,”
meaning “the reunion of broken parts,” or literally, “to set a broken bone.”
Both algebra and SQL piece together fragments to solve a problem.
I believe select is the most powerful word in all of computer science.
Because select is so common, it’s easy to take it for granted, but no keyword
in any programming language I can think of is as powerful and flexible.
Select can retrieve, twist, shape, join, and group data in nearly any way
imaginable, and it’s easily extended with the insert, update, delete (and now
merge!) commands to modify data.
Part II begins by exploring the basic logical query flow and quickly digs
deeper into topics such as aggregate queries, relational division, correlated
subqueries, and set-difference queries. I’ve devoted 15 chapters to the select
command and its variations because understanding the multiple options and
creative techniques available with queries is critical to becoming a successful
SQL Server developer, DBA, or architect.
Please don’t assume that Part II is only for beginners. These 15 chapters
present the core power of SQL. Part IX explores optimization strategies, and
it may be tempting to go straight there for optimization ideas, but the
second strategy of Smart Database Design is using good set-based code.
Here are nine chapters describing how to optimize your database by writing
better queries.
If SQL Server is the box, Part II is about being one with the box.
Introducing Basic
Query Flow
IN THIS CHAPTER
Logical flow of the query
Restricting the result set
Projecting data
Specifying the sort order
SQL is the romance language of data, but wooing the single correct answer
from gigabytes of relational data can seem overwhelming until the logical
flow of the query is mastered.
One of the first points to understand is that SQL is a declarative language. This
means that the SQL query logically describes the question to the SQL Query
Optimizer, which then determines the best method to physically execute the
query. As you’ll see in the next eight chapters, there are often many ways of
stating the query, but each method could be optimized to the same query
execution plan. This means you are free to express the SQL query in the way
that makes the most sense and will be the easiest to maintain. In some cases, one
method is considered cleaner or faster than another: I’ll point those instances
out as well.
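As a small illustration of stating the same question in more than one way, the sketch below asks "which customers have placed at least one order?" with a subquery and again with a join. It runs against an in-memory SQLite database through Python's standard sqlite3 module, which merely stands in for SQL Server here; the Customer and SalesOrder tables and their data are hypothetical. The sketch only shows that both statements describe the same result; it makes no claim about the execution plans either engine would choose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE SalesOrder (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
    INSERT INTO Customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO SalesOrder VALUES (10, 1), (11, 1), (12, 3);
""")

# Statement 1: describe the customers using an IN subquery.
with_subquery = conn.execute("""
    SELECT Name
    FROM Customer
    WHERE CustomerID IN (SELECT CustomerID FROM SalesOrder)
    ORDER BY Name;
""").fetchall()

# Statement 2: describe the same customers using a join.
# DISTINCT removes the duplicate row produced by Ann's two orders.
with_join = conn.execute("""
    SELECT DISTINCT c.Name
    FROM Customer AS c
      JOIN SalesOrder AS o ON o.CustomerID = c.CustomerID
    ORDER BY c.Name;
""").fetchall()

print(with_subquery)
print(with_join)
```

Both statements return Ann and Cy; which form to keep is a matter of readability and maintenance.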
SQL queries aren’t limited to SELECT. The four Data Manipulation Language
(DML) commands, SELECT, INSERT, UPDATE, and DELETE, are sometimes
taught as four separate and distinct commands. However, I see queries as a single
structural method of manipulating data; in other words, it’s better to think of
the four commands as four verbs that may each be used with the full power and
flexibility of SQL.
Neither are SQL queries limited to graphical interfaces. Many SQL developers
who came up through the ranks from Access and who have built queries using
only the Access query interface are amazed when they understand the enormous
power of the full SQL query.
This chapter builds a basic single table query and establishes the logical query
execution order critical for developing basic or advanced queries. With this
foundation in place, the rest of Part II develops the basic SELECT into what I believe
is the most elegant, flexible, and powerful command in all of computing.
Understanding Query Flow
One can think about query flow in four different ways. Personally, when I develop SQL code, I imagine
the query using the logical flow method. Some developers think through a query visually using the
layout of SQL Server Management Studio’s Query Designer. The syntax of the query is in a specific fixed
order: SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY. To illustrate the declarative
nature of SQL, the fourth way of thinking about the query flow, the actual physical execution of the
query, is optimized to execute in the most efficient order depending on the data mix and the available
indexes.
Syntactical flow of the query statement
In its basic form, the SELECT statement tells SQL Server what data to retrieve, including which
columns, rows, and tables to pull from, and how to sort the data.
Here’s an abbreviated syntax for the SELECT command:
SELECT [DISTINCT] [TOP (n)] *, columns, or expressions
  [FROM data source(s)]
    [JOIN data source
      ON condition] (may include multiple joins)
  [WHERE conditions]
  [GROUP BY columns]
  [HAVING conditions]
  [ORDER BY columns];
The SELECT statement begins with a list of columns or expressions. At least one expression is
required; everything else is optional. The simplest possible valid SELECT statement is as follows:
SELECT 1;
The FROM portion of the SELECT statement assembles all the data sources into a result set, which is
then acted upon by the rest of the SELECT statement. Within the FROM clause, multiple tables may be
referenced by using one of several types of joins.
When no FROM clause is supplied, SQL Server returns a single row with values. (Oracle requires a FROM
DUAL to accomplish the same thing.)
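The single-row behavior of a SELECT without a FROM clause is easy to confirm. The sketch below uses an in-memory SQLite database through Python's sqlite3 module as a stand-in for SQL Server (an assumption made so the example is self-contained); the column aliases Total and Label are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The simplest valid SELECT: one expression, no FROM clause.
simplest = conn.execute("SELECT 1;").fetchall()

# Several expressions, still no FROM clause: one row, one column per expression.
cursor = conn.execute("SELECT 1 + 1 AS Total, 'abc' AS Label;")
column_names = [col[0] for col in cursor.description]
expressions = cursor.fetchall()

print(simplest)       # [(1,)]
print(column_names)   # ['Total', 'Label']
print(expressions)    # [(2, 'abc')]
```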
The WHERE clause acts upon the record set assembled by the FROM clause to filter certain rows based
upon conditions.
Aggregate functions perform summation-type operations across the data set. The GROUP BY clause can
group the larger data set into smaller data sets based on the columns specified in the GROUP BY clause.
The aggregate functions are then performed on the new smaller groups of data. The results of the
aggregation can be restricted using the HAVING clause.
Finally, the ORDER BY clause determines the sort order of the result set.
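The clauses described above can be seen working together in one statement. The following is a minimal sketch, assuming a hypothetical Sale table with made-up data, run against an in-memory SQLite database through Python's sqlite3 module rather than against SQL Server. The statement itself uses only clauses from the abbreviated syntax shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sale (Region TEXT, Amount INTEGER);")
conn.executemany(
    "INSERT INTO Sale (Region, Amount) VALUES (?, ?);",
    [("East", 100), ("East", 250), ("West", 50), ("West", 75),
     ("North", 300), ("North", 20), ("South", 10)],
)

totals = conn.execute("""
    SELECT Region, SUM(Amount) AS Total   -- columns and expressions to return
    FROM Sale                             -- data source
    WHERE Amount >= 50                    -- filter individual rows
    GROUP BY Region                       -- form one group per region
    HAVING SUM(Amount) > 200              -- filter the groups
    ORDER BY Total DESC;                  -- sort the final result
""").fetchall()

print(totals)   # [('East', 350), ('North', 300)]
```

The WHERE clause removes the 20 and 10 rows before any totals are computed, so South never forms a group; the HAVING clause then removes West, whose total is only 125.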
A graphical view of the query statement
SQL Server Management Studio includes two basic methods for constructing and submitting queries:
Query Designer and Query Editor. Query Designer offers a graphical method of building a query,
whereas Query Editor is an excellent tool for writing SQL code or ad hoc data retrieval because there are
no graphics to get in the way and the developer can work as close to the SQL code as possible.
From SQL Server’s point of view, it doesn’t matter where the query originates; each statement is
evaluated and processed as a SQL statement.
When selecting data using Query Designer, the SQL statements can be entered as raw code in the third
pane, as shown in Figure 8-1. The bottom pane displays the results in Grid mode or Text mode and
displays any messages. The Object Browser presents a tree of all the objects in SQL Server, as well as
templates for creating new objects with code.
FIGURE 8-1
The Query Designer can be used to graphically create queries
If text is selected in the Query Editor, then only the highlighted text is submitted to SQL
Server when the Execute command button or the F5 key is pressed. This is an excellent
way to test single SQL statements or portions of SQL code.
Though it may vary depending on the user account settings, the default database is probably the master database Be sure to change to the appropriate user database using the database selector combo box in the toolbar, or the USE database command.
The best solution is to change the user’s default database to a user database and avoid master
altogether.
Logical flow of the query statement
The best way to think through a SQL DML statement is to walk through the query’s logical flow (see
Figure 8-2). Because SQL is a declarative language, the logical flow may or may not be the actual
physical flow that SQL Server’s query processor uses to execute the query. Nor is the logical flow the same as
the query syntax. Regardless, I recommend thinking through a query in the following order.
FIGURE 8-2
A simplified view of the logical flow of the query showing how data moves through the major clauses
of the SQL select command
[Figure labels: Data Source(s), From, Expr(s), Order]
Here’s a more detailed explanation of the logical flow of the query. Note that every step except step 4 is
optional:
1. [From]: The query begins by assembling the initial set of data, as specified in the FROM portion of the SELECT statement. (Chapter 10, “Merging Data with Joins and Unions,” and Chapter 11, “Including Data with Subqueries and CTEs,” discuss how to build even the most complex FROM clauses.)
2. [Where]: The filter process is actually the WHERE clause selecting only those rows that meet the criteria.
3. [Aggregations]: SQL can optionally perform aggregations on the data set, such as finding the
average, grouping the data by values in a column, and filtering the groups (see Chapter 12,
“Aggregating Data”).
4. Column Expressions: The SELECT list is processed, and any expressions are calculated (covered in Chapter 9, “Data Types, Expressions, and Scalar Functions,” and Chapter 11,
“Including Data with Subqueries and CTEs”).
5. [Order By]: The resulting rows are sorted according to the ORDER BY clause.
6. [Over]: Windowing and ranking functions can provide a separately ordered view of the results
with additional aggregate functions.
7. [Distinct]: Any duplicate rows are eliminated from the result set.
8. [Top]: After the rows are selected, the calculations are performed, and the data is sorted into
the desired order, SQL can restrict the output to the top few rows.
9. [Insert, Update, Delete]: The final logical step of the query is to apply the data modification
action to the results of the query. These three verbs are explained in Chapter 15, “Modifying
Data.”
10. [Output]: The inserted and deleted virtual tables (normally only used with a trigger) can be
selected and returned to the client, inserted into a table, or serve as a data source to an outer
query.
11. [Union]: The results of multiple queries can be stacked using a union command (see
Chapter 10, “Merging Data with Joins and Unions”).
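One way to internalize the steps above is to replay a few of them by hand. The sketch below walks steps 1, 2, 4, 5, 7, and 8 as plain Python list operations and then compares the outcome with what a database engine returns for the equivalent statement. Everything here is an assumption for illustration: the Sale table and its rows are made up, SQLite (through Python's sqlite3 module) stands in for SQL Server, and SQLite's LIMIT plays the role of T-SQL's TOP. It mirrors the way of thinking described in the list, not how any engine physically executes the query.

```python
import sqlite3

# Hypothetical data: (Product, Region, Qty)
source = [
    ("Widget", "East", 5), ("Gadget", "East", 12), ("Widget", "West", 8),
    ("Gizmo", "West", 3), ("Gadget", "North", 12), ("Gizmo", "North", 20),
]

# Step 1 [From]: assemble the initial data set.
step1 = list(source)
# Step 2 [Where]: keep only rows that meet the criteria (Qty >= 5).
step2 = [row for row in step1 if row[2] >= 5]
# Step 4 Column Expressions: calculate the SELECT list (Product, Qty * 2).
step4 = [(product, qty * 2) for (product, _region, qty) in step2]
# Step 5 [Order By]: DoubleQty descending, then Product ascending.
step5 = sorted(step4, key=lambda row: (-row[1], row[0]))
# Step 7 [Distinct]: drop duplicate rows while keeping the sorted order.
step7 = list(dict.fromkeys(step5))
# Step 8 [Top]: restrict the output to the top two rows.
step8 = step7[:2]

# The same question stated declaratively.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sale (Product TEXT, Region TEXT, Qty INTEGER);")
conn.executemany("INSERT INTO Sale VALUES (?, ?, ?);", source)
sql_rows = conn.execute("""
    SELECT DISTINCT Product, Qty * 2 AS DoubleQty
    FROM Sale
    WHERE Qty >= 5
    ORDER BY DoubleQty DESC, Product
    LIMIT 2;
""").fetchall()

print(step8)
print(sql_rows)
```

Both approaches arrive at the same two rows, which is the point of the logical flow: it predicts the result of the statement regardless of the physical plan the engine chooses.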
As more complexity has been added to the SQL SELECT command over the years, how to think
through the logical flow has also become more complex. In various sources, you’ll find minor differences
in how SQL MVPs view the logical flow. That’s OK; it’s just a way to think through a query, and this
is the way I think through writing a query.
As you begin to think in terms of the SQL SELECT statement, rather than in terms of the graphical user
interface, understanding the flow of SELECT and how to read the query execution plan will help you
think through and develop difficult queries.
Physical flow of the query statement
SQL Server will take the SELECT statement and develop an optimized query execution plan, which may
not be in the execution order you would guess (see Figure 8-3). The indexes available to the SQL Server
Query Optimizer also affect the query execution plan, as explained in Chapter 64, “Indexing Strategies.”
The rest of this chapter walks through the logical order of the basic query.
From Clause Data Sources
The first logical component of a typical SQL SELECT statement is the FROM clause. In a simple SQL
SELECT statement, the FROM clause contains a single table. However, the FROM clause can also combine
data from multiple sources and multiple types of data sources. The maximum number of tables that may
be accessed within a single SQL SELECT statement is 256.
The FROM clause is the foundation of the rest of the SQL statement. In order for a table column to be in
the output, or accessed in the WHERE conditions, or in the ORDER BY, it must be in the FROM clause.
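The rule that a column must come from a data source named in the FROM clause can be demonstrated with a statement that breaks it. This is a sketch under stated assumptions: the Customer and SalesOrder tables are hypothetical, and SQLite (through Python's sqlite3 module) stands in for SQL Server, so the exact error text differs from what SQL Server would report.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE SalesOrder (OrderID INTEGER PRIMARY KEY,
                             CustomerID INTEGER, Amount INTEGER);
    INSERT INTO Customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO SalesOrder VALUES (10, 1, 40), (11, 2, 15);
""")

# Amount lives in SalesOrder, which is not in the FROM clause, so the
# statement is rejected before any rows are read.
try:
    conn.execute("SELECT Name, Amount FROM Customer;")
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

# Adding SalesOrder to the FROM clause makes Amount available to the
# SELECT list and to ORDER BY.
rows = conn.execute("""
    SELECT c.Name, o.Amount
    FROM Customer AS c
      JOIN SalesOrder AS o ON o.CustomerID = c.CustomerID
    ORDER BY o.Amount;
""").fetchall()

print(error)
print(rows)
```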
Possible data sources
SQL is extremely flexible and can accept data from seven distinctly different types of data sources within
the FROM clause:
■ Local SQL Server tables
■ Subqueries serving as derived tables, also called subselects or in-line views, which are explained in
Chapter 11, “Including Data with Subqueries and CTEs.” Common table expressions (CTEs)
are functionally similar to subqueries but may be referenced multiple times within the query