The UNION Keyword
A UNION is useful if you want to get data from two tables within the same result set. For example, if we want to see the bid and ask for IBM as well as the bids and asks for all the IBM options in one result set, the SQL statement would read as follows:
SELECT StockSymbol, Bid, Ask FROM Stock
WHERE StockSymbol = 'IBM'
UNION
SELECT OptionSymbol, Bid, Ask FROM OptionContracts
WHERE StockSymbol = 'IBM';
See Figure 13.6.
FIGURE 13.6
The data type for the columns in each SELECT statement must match for a UNION to work. This is not an issue in the above example because the columns selected from each table have matching data types.
The INSERT Statement
Up to this point we have only queried the Options.mdb database and looked at the results. We may, however, also be interested in changing the data. In order to add, delete, or modify the data in the Options.mdb database, we will first need to add some elements to our SQLexample program.
Step 5 Add another button to your form.
Step 6 Add the following code to the Button2_Click event:
Private Sub Button2_Click(ByVal sender As ...) Handles Button2.Click
    Try
        myConnect.Open()
        Dim command As New OleDbCommand(TextBox1.Text, myConnect)
        command.ExecuteNonQuery()
    Catch
        MsgBox("Please enter a valid SQL statement.")
    Finally
        myConnect.Close()
    End Try
End Sub
An OleDbCommand object is an SQL statement that we can use to perform transactions against a database. We use the ExecuteNonQuery() member method to execute UPDATE, INSERT, and DELETE statements.
For the remainder of the chapter, SELECT statements should be executed using the first button, and all other transactions should be executed using this new, second button.
The SQL INSERT statement enables us to add data to a table in a database. Here is an example showing the syntax for adding a record to the OptionTrades table:
INSERT INTO OptionTrades
(TradeDateTime, OptionSymbol, BuySell, Price, Quantity, TradeStatus)
VALUES (#02/27/2003#, 'IBMDP', 'B', 2.60, 10, 'F');
You can verify that this data has been added to the table by writing a simple SELECT statement.
Notice that all values for all columns have been supplied save for the TradeID column, which is generated automatically. If a value for a column is to be left blank, the keyword NULL could be used to represent a blank column value. In regard to data types, notice that strings are delimited by single quotes, numerical data does not need single quotes, and dates are defined with pound signs. As we have mentioned previously, each RDBMS is different, and so you should look into the documentation of your system to see how to define the data types. Whatever your RDBMS, the comma-delimited list of values must match the table structure exactly in the number of attributes and the data type of each attribute.
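The INSERT statement above embeds its values as literals in the SQL text. As a minimal sketch, the same record could also be added from VB.NET with command parameters, which leaves the quoting of strings and the formatting of dates to the data provider. The module name, the Jet provider string, the C:\Options.mdb path, and the OleDbType chosen for each column are all assumptions made for illustration; they are not taken from the SQLexample program.

```vbnet
Imports System
Imports System.Data.OleDb

Module InsertSketch
    Sub Main()
        ' Assumed provider string and file location; adjust for your system.
        Dim myConnect As New OleDbConnection( _
            "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Options.mdb")

        ' OleDb uses positional ? placeholders, so parameters bind in the
        ' order in which they are added, not by name.
        Dim sql As String = _
            "INSERT INTO OptionTrades " & _
            "(TradeDateTime, OptionSymbol, BuySell, Price, Quantity, TradeStatus) " & _
            "VALUES (?, ?, ?, ?, ?, ?)"
        Dim command As New OleDbCommand(sql, myConnect)

        ' The column types below are guesses about the OptionTrades schema.
        command.Parameters.Add("TradeDateTime", OleDbType.Date).Value = New DateTime(2003, 2, 27)
        command.Parameters.Add("OptionSymbol", OleDbType.VarChar, 10).Value = "IBMDP"
        command.Parameters.Add("BuySell", OleDbType.VarChar, 1).Value = "B"
        command.Parameters.Add("Price", OleDbType.Double).Value = 2.6
        command.Parameters.Add("Quantity", OleDbType.Integer).Value = 10
        command.Parameters.Add("TradeStatus", OleDbType.VarChar, 1).Value = "F"

        Try
            myConnect.Open()
            ' ExecuteNonQuery returns the number of rows affected.
            Dim rowsAdded As Integer = command.ExecuteNonQuery()
            Console.WriteLine("Rows added: " & rowsAdded.ToString())
        Catch ex As OleDbException
            Console.WriteLine("Insert failed: " & ex.Message)
        Finally
            myConnect.Close()
        End Try
    End Sub
End Module
```

Because the values never become part of the SQL string, a symbol containing a single quote cannot break the statement.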
The UPDATE Statement
The SQL UPDATE statement is used to modify existing data in one or several rows of a database table. The following SQL updates one row in the Stock table, the dividend amount for IBM:
UPDATE Stock SET DividendAmount = 55
WHERE StockSymbol = 'IBM';
SQL does not limit us to updating only one column. The following SQL statement updates both the dividend amount and the dividend date columns in the Stock table:
UPDATE Stock SET DividendAmount = 50, DividendDate = #03/18/2003#
WHERE StockSymbol = 'IBM';
The update expression can be a constant, any computed value, or even the result of a SELECT statement that returns a single row and a single column. If the WHERE clause is omitted, then the specified attribute is set to the same value in every row of the table. We can also set multiple attribute values at the same time with a comma-delimited list of attribute-equals-expression pairs.
The DELETE Statement
As its name implies, we use an SQL DELETE statement to remove data from a table in a database. Like the UPDATE statement, either single rows or multiple rows can be deleted. The following SQL statement deletes one row of data from the StockTrades table:
DELETE FROM StockTrades
WHERE TradeID = 40;
The following SQL statement will delete all records from the StockTrades table that represent trades before January 4, 2003:
DELETE FROM StockTrades
WHERE TradeDateTime < #01/04/2003#;
If the WHERE clause is omitted, then every row of the table is deleted, which of course should be done with great caution.
BEGIN, COMMIT, and ROLLBACK
Transaction commands such as INSERT, UPDATE, and DELETE may also be used together with keywords such as BEGIN, COMMIT, and ROLLBACK, depending upon the RDBMS you are using. For example, to make your DML changes visible to the rest of the users of the database, you may need to include a COMMIT. If you have made an error in updating data and wish to restore your private copy of the database to the way it was before you started, you may be able to use the ROLLBACK keyword.
In particular, the COMMIT and ROLLBACK statements are part of a very important and versatile Oracle capability to control sequences of changes to a database. You should consult the documentation of your particular RDBMS with regard to the use of these keywords.
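From VB.NET, the same idea is available through the OleDbTransaction object rather than through SQL keywords. The following is only a sketch, assuming the Stock and StockTrades tables used earlier and a hypothetical C:\Options.mdb path and Jet provider string. It runs two of this chapter's statements as a single unit, so that either both changes are kept or neither is.

```vbnet
Imports System
Imports System.Data.OleDb

Module TransactionSketch
    Sub Main()
        ' Assumed provider string and file location; adjust for your system.
        Dim myConnect As New OleDbConnection( _
            "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Options.mdb")

        ' A transaction can only be started on an open connection.
        ' (A failure to open is not handled in this sketch.)
        myConnect.Open()
        Dim myTransaction As OleDbTransaction = myConnect.BeginTransaction()
        Try
            ' Each command must be enlisted in the transaction explicitly.
            Dim updateCommand As New OleDbCommand( _
                "UPDATE Stock SET DividendAmount = 50 WHERE StockSymbol = 'IBM'", _
                myConnect, myTransaction)
            updateCommand.ExecuteNonQuery()

            Dim deleteCommand As New OleDbCommand( _
                "DELETE FROM StockTrades WHERE TradeID = 40", _
                myConnect, myTransaction)
            deleteCommand.ExecuteNonQuery()

            ' Make both changes permanent.
            myTransaction.Commit()
        Catch ex As Exception
            ' Undo everything done since BeginTransaction.
            myTransaction.Rollback()
            Console.WriteLine("Changes rolled back: " & ex.Message)
        Finally
            myConnect.Close()
        End Try
    End Sub
End Module
```

How far a rollback can reach, and whether other users see uncommitted changes, still depends on the RDBMS behind the connection.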
DATA DEFINITION LANGUAGE
We use DDL to create or modify the structure of tables in a database. When we execute a DDL statement, it takes effect immediately. Again, for all transactions, you should click Button2 to execute these nonqueries. You will be able to verify the results of the SQL statements by creating simple SELECT statements and executing a query with Button1 in your program.
Creating Views
A view is a saved, read-only SQL statement. Views are very useful when you find yourself writing the same SQL statement over and over again. Here is a sample SELECT statement to find all the IBM option contracts with an 80 strike:
SELECT * FROM OptionContracts
WHERE StockSymbol = 'IBM' AND OptionSymbol LIKE '%P';
Although not overly complicated, the above SQL statement is not overly simplistic either. Rather than typing it again and again, we can create a VIEW. The syntax for creating a VIEW is as follows:
CREATE VIEW IBM80s AS SELECT * FROM OptionContracts
WHERE StockSymbol = 'IBM' AND OptionSymbol LIKE '%P';
The above code creates a VIEW named IBM80s. Now to run it, simply type in the following SQL statement:
SELECT * FROM IBM80s;
Views can be deleted as well using the DROP keyword.
DROP VIEW IBM80s;
Creating Tables
As you know by now, database tables are the basic structure in which data is stored. In the examples we have used so far, the tables have been preexisting. Oftentimes, however, we need to build a table ourselves. While we are certainly able to build tables ourselves with an RDBMS such as MS Access, we will cover the SQL code to create tables in VB.NET.
As a review, tables contain rows and columns. Each row represents one piece of data, called a record, and each column, called a field, represents a component of that data. When we create a table, we need to specify the column names as well as their data types. Data types are usually database-specific but often can be broken into integers, numerical values, strings, and Date/Time. The following SQL statement builds a simple table named Trades:
CREATE TABLE Trades
(myInstr Char(4) NOT NULL, myPrice Numeric(8,2) NOT NULL, myTime Date NOT NULL);
The general syntax for the CREATE TABLE statement is as follows:
CREATE TABLE TableName (Column1 DataType1 Null/Not Null, ...);
The data types that you will use most frequently are VARCHAR2(n), a variable-length character field where n is its maximum width; CHAR(n), a fixed-length character field of width n; NUMERIC(w,d), where w is the total width of the field and d is the number of places after the decimal point (omitting it produces an integer); and DATE, which stores both date and time in a unique internal format. NULL and NOT NULL indicate whether a specific field may be left blank.
Tables can be dropped as well. When a table is dropped, all the data it contains is lost.
DROP TABLE Trades;
Altering Tables
We have already seen that the INSERT statement can be used to add rows. Columns as well can be added to or removed from a table. For example, if we want to add a column named Exchange to the StockTrades table, we can use the ALTER TABLE statement. The syntax is:
ALTER TABLE StockTrades ADD Exchange char(4);
As we have seen in the previous chapter, all tables must have a primary key. We can use the ALTER TABLE statement to specify TradeID as the primary key in the Trades table we created previously. The column named as the key must already exist in the table, so a TradeID column would first have to be added to Trades with ALTER TABLE ... ADD.
ALTER TABLE Trades ADD PRIMARY KEY(TradeID);
Columns can be removed as well using the ALTER TABLE statement.
ALTER TABLE StockTrades DROP Exchange;
SUMMARY
Over the course of this chapter, we have looked at SQL data manipulation language and data definition language. While we have certainly not covered all of SQL, you should now be fairly proficient at extracting and modifying data in a database as well as changing the structure of tables within a database.
SQL consists of a limited number of SQL statements and keywords, which can be arranged logically to perform transactions against a database. While it is easy to get good at SQL, it is very difficult to become an expert.
1. What is SQL? What are DDL and DML?
2. What document should you consult to find out the specifics of SQL transactions against your RDBMS?
3. What is an OleDbCommand object, and what is the ExecuteNonQuery() method?
4. If we found corrupt data in a database, what statements might we use to either correct it or get rid of it?
5. What is the syntax of CREATE TABLE?
PROJECT 13.1
The Finance.mdb database contains price data. However, we very often will be interested in a time series of log returns. Create a VB.NET application that will modify the AXP table to include a Returns column. Then make the calculations for the log returns and populate the column.
PROJECT 13.2
Create a VB.NET application that will connect to the Finance.mdb database and return the average volume for a user-defined stock between any two user-defined dates.
COLLECTION OBJECT
Copyright © 2004 by The McGraw-Hill Companies, Inc.
The Collection class allows us to store groups of objects of different data types and to easily count, look up, and add or remove objects within the collection using the Count and Item properties and the Add and Remove methods of the Collection class. Furthermore, we can iterate through the elements in a collection using a For Each...Next loop. Collections do not have fixed sizes, and memory allocation is completely dynamic, and so in many cases they will be a superior way of handling data compared with arrays.
As with arrays, it will be important to note the index of the first element. Most often, the Collection objects we will use will be 1-based. That is, the index of the first element will be by default 1 and not zero as with arrays. Also, Collection objects allow us to access elements of the collection by either index or an optional string key. As we will see later, other collection types allow only numeric index references and may not have a key. Here are the properties and methods associated with Collection objects:
Collection properties
Name      Description                                        Example
Count     Returns the number of objects in the collection    dblNum = myColl.Count
Item      Returns a specific element of the collection       dblObj = myColl.Item(1) or dblObj = myColl.Item(strKey)

Collection methods
Name      Description                                        Example
Add       Adds an object to the collection                   myColl.Add(myObj)
Remove    Removes an object from the collection              myColl.Remove(2) or myColl.Remove(strKey)
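The members listed above can be exercised in a few lines. This is a minimal console sketch rather than part of the Portfolio project; the module name, the stored values, and the keys "IBM" and "AXP" are made up for illustration.

```vbnet
Imports System

Module CollectionSketch
    Sub Main()
        Dim myColl As New Collection()

        ' A Collection accepts objects of different types.
        ' The second argument to Add is the optional string key.
        myColl.Add(83.54, "IBM")
        myColl.Add("American Express", "AXP")
        myColl.Add(New DateTime(2003, 2, 27))      ' no key, index only

        Console.WriteLine(myColl.Count.ToString())   ' 3

        ' Item works by 1-based index or by key.
        Console.WriteLine(myColl.Item(1).ToString())      ' first element added
        Console.WriteLine(myColl.Item("AXP").ToString())  ' American Express

        ' Remove also works by index or by key. The remaining elements
        ' move up, so index 1 now refers to the AXP entry.
        myColl.Remove("IBM")
        Console.WriteLine(myColl.Count.ToString())   ' 2
        Console.WriteLine(myColl.Item(1).ToString()) ' American Express

        ' For Each visits the elements in index order.
        For Each element As Object In myColl
            Console.WriteLine(element.GetType().Name)
        Next
    End Sub
End Module
```

Because every element comes back typed as Object, the caller has to know, or test, what each one really is before using it.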
We have in fact already had some experience with collection objects. As you may recall, several of the ADO.NET objects we looked at in Chapter 12, like DataRowCollection and DataColumnCollection, are collection classes. They derive from the InternalDataCollectionBase class in System.Data rather than from the Visual Basic Collection class, but they offer the same Count and Item pattern.
Here is an example using a Collection object.
Step 1 Create a new Windows application named Portfolio.
Step 2 In the Project menu item, select Add Class twice and add two new classes. Into these class code modules, paste in the code for the StockOption and CallOption classes.
Step 3 Go back to the Form1 code window, and in the Form1_Load event, add the code to create a Collection object named MyPortfolio.
    Dim MyPortfolio As New Collection()
Step 4 Next add the code to create a CallOption object called myOption.
    Dim myOption As New CallOption("IBMDP")
Step 5 Add myOption to MyPortfolio.
    MyPortfolio.Add(myOption)
Step 6 Now we can actually release myOption using the Nothing keyword.
When we assign Nothing to an object, the object reference no longer refers to that object instance. The CallOption itself survives, because MyPortfolio still holds a reference to it.
    myOption = Nothing
Step 7 Still within the Form1_Load event, let’s create another option and add it to MyPortfolio.
    myOption = New CallOption("SUQEX")
    MyPortfolio.Add(myOption)
    myOption = Nothing
MyPortfolio now consists of two CallOption objects, neither known by the name myOption, but rather by their respective indexes within the MyPortfolio collection.
... MicroSystems option (SUQEX) in the following way:
    Label1.Text = MyPortfolio.Item(2).Strike
CREATING A CUSTOMIZED COLLECTION
... is that if we try to use a For Each CallOption In MyPortfolio...Next loop to process a portfolio of options, an error will occur, since one element in MyPortfolio may be, for example, a GovtBond object.
In cases where we require a more robust collection, we can, through inheritance from the CollectionBase class, create our own Collection class and add our own functionality. The CollectionBase class, found in the System.Collections namespace, includes the public Clear method, the Count property, and a protected property called List that implements the IList interface. The methods and properties (Add, Remove, and Item) require that we codify the implementation, as you will see. Here are the important properties and methods of the MustInherit CollectionBase class:
Public members
Count           Returns the number of elements in the CollectionBase object
Clear           Deletes all elements from the CollectionBase object
Equals          Determines whether two objects in the CollectionBase are equal
GetEnumerator   Returns an enumerator that can iterate through the elements of a CollectionBase
RemoveAt        Deletes an element from the CollectionBase object at a specified index

IList members
CopyTo          Copies the elements of a CollectionBase to a one-dimensional array
Add             Adds an element at the end of the CollectionBase
Contains        Determines whether a specified element is contained in a CollectionBase
...             ... index
Remove          Removes the first occurrence of a specified element from the CollectionBase
In this example, we will create an OptionCollection that only accepts CallOptions as opposed to any object. Then we will add methods to buy, implementing IList.Add(), and sell, IList.RemoveAt(), CallOptions. Also we will need to implement the Item property that returns the CallOption at a specified index. This customized OptionCollection class will be zero-based.
Step 1 Start a new Windows application and name it OptionCollection.
Step 2 In the same way as in the previous example, add the code for the StockOption and CallOption classes.
Step 3 Now add a code module for a third class called OptionCollection with the following code:
Public Class OptionCollection
    Inherits System.Collections.CollectionBase

    Public Sub Buy(ByVal myOption As CallOption)
        List.Add(myOption)
    End Sub

    Public Sub Sell(ByVal myIndex As Integer)
        List.RemoveAt(myIndex)
    End Sub

    Public ReadOnly Property Item(ByVal myIndex As Integer) As CallOption
        Get
            ' List stores Object references, so convert back to CallOption.
            Return CType(List.Item(myIndex), CallOption)
        End Get
    End Property
End Class
Notice that the public Buy and Sell methods wrap the Add() and RemoveAt() methods of the List property, and the Item property wraps the Item property of that same List, which our class inherits from its parent CollectionBase class.
Step 4 In the Form1_Load event, create an instance of the OptionCollection class called MyOptionPortfolio. Also create two CallOption objects.
    Dim myOptionPortfolio As New OptionCollection()
    Dim myFirstOption As New CallOption("IBMDP")
    Dim mySecondOption As New CallOption("SUQEX")
Step 5 Add the two CallOptions to MyOptionPortfolio by “buying” them.
    myOptionPortfolio.Buy(myFirstOption)
    myOptionPortfolio.Buy(mySecondOption)
Step 6 Sell the IBMDP option.
    myOptionPortfolio.Sell(0)
Step 7 The SUQEX option is left in the portfolio as you can ...
FIGURE 14.2
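To see the whole exercise run outside a form, the steps can be condensed into a console sketch. The CallOption class here is only a stand-in that stores a symbol, since the real StockOption and CallOption code is not reproduced in this section; the Symbol property and the module name are assumptions.

```vbnet
Imports System
Imports System.Collections

' Stand-in for the CallOption class; it keeps only the symbol (assumption).
Public Class CallOption
    Private mySymbol As String

    Public Sub New(ByVal symbol As String)
        mySymbol = symbol
    End Sub

    Public ReadOnly Property Symbol() As String
        Get
            Return mySymbol
        End Get
    End Property
End Class

' Same shape as the OptionCollection class in Step 3.
Public Class OptionCollection
    Inherits CollectionBase

    Public Sub Buy(ByVal myOption As CallOption)
        List.Add(myOption)
    End Sub

    Public Sub Sell(ByVal myIndex As Integer)
        List.RemoveAt(myIndex)
    End Sub

    Public ReadOnly Property Item(ByVal myIndex As Integer) As CallOption
        Get
            Return CType(List.Item(myIndex), CallOption)
        End Get
    End Property
End Class

Module OptionCollectionSketch
    Sub Main()
        Dim myOptionPortfolio As New OptionCollection()
        myOptionPortfolio.Buy(New CallOption("IBMDP"))
        myOptionPortfolio.Buy(New CallOption("SUQEX"))

        ' Zero-based, so index 0 is the IBMDP option.
        myOptionPortfolio.Sell(0)

        Console.WriteLine(myOptionPortfolio.Count.ToString())   ' 1

        ' Buy only accepts CallOption, so this typed loop cannot meet
        ' an object of some other class.
        For Each myOption As CallOption In myOptionPortfolio
            Console.WriteLine(myOption.Symbol)                  ' SUQEX
        Next
    End Sub
End Module
```

A call such as myOptionPortfolio.Buy("a string") would be rejected by the compiler (or at run time with Option Strict Off), which is the protection the plain Collection class could not give us.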
Using high-quality data almost always pays off even though it’s more expensive. In any case, though, time spent finding good data and giving it a good once-over is worth the effort and expense.
All data should be cleaned before use. But serious data cleaning involves more than just visually scanning data in Excel and updating bad records with good data. Rather, it requires that we decompose and reassemble data. This takes time.
Data cleaning is a process that consists of first detection and then correction of data errors and of updating the dirty data source with clean data or preferably creating a new data source to hold the entire cleaned data set. Maintaining the original dirty data source in its original form allows us to go back if we make a mistake in our cleaning algorithms and consequently further corrupt the data.
Another problem requiring data cleaning occurs when, depending on the time interval we’re looking at, the data we have is not in the individual ticks or bars we desire (bars being fixed units of time with a date/time, an open, a high, a low, a close, and maybe even a volume and/or open interest). We may, for example, possess tick data and want to analyze bars of several different durations: a minute in length, 5 minutes, a day, a week, or a month. It is, of course, possible to convert raw tick data into a series of bars by writing a simple VB.NET program to generate the bar data and save it to a new database.
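The core of such a conversion program might look like the following sketch. It assumes the ticks arrive sorted by time, works in memory rather than against a database, and handles fixed-length bars such as 1 or 5 minutes (calendar weeks and months would need a different bucketing rule). The Tick and Bar types and all their field names are hypothetical, and the generic List(Of T) requires Visual Basic 2005 or later.

```vbnet
Imports System
Imports System.Collections.Generic

Module TickToBarSketch

    ' Hypothetical tick and bar types; the field names are assumptions.
    Structure Tick
        Public TradeTime As DateTime
        Public Price As Double
        Public Volume As Long
    End Structure

    Class Bar
        Public StartTime As DateTime
        Public OpenPrice As Double
        Public HighPrice As Double
        Public LowPrice As Double
        Public ClosePrice As Double
        Public Volume As Long
    End Class

    Function MakeTick(ByVal tradeTime As DateTime, ByVal price As Double, _
                      ByVal volume As Long) As Tick
        Dim t As Tick
        t.TradeTime = tradeTime
        t.Price = price
        t.Volume = volume
        Return t
    End Function

    ' Groups time-ordered ticks into bars of a fixed length.
    Function BuildBars(ByVal ticks As List(Of Tick), _
                       ByVal barLength As TimeSpan) As List(Of Bar)
        Dim bars As New List(Of Bar)
        Dim current As Bar = Nothing
        For Each t As Tick In ticks
            ' Round the tick time down to the start of its bar.
            Dim barStart As New DateTime( _
                t.TradeTime.Ticks - (t.TradeTime.Ticks Mod barLength.Ticks))
            If current Is Nothing OrElse barStart <> current.StartTime Then
                ' First tick of a new bar sets the open.
                current = New Bar()
                current.StartTime = barStart
                current.OpenPrice = t.Price
                current.HighPrice = t.Price
                current.LowPrice = t.Price
                bars.Add(current)
            End If
            current.HighPrice = Math.Max(current.HighPrice, t.Price)
            current.LowPrice = Math.Min(current.LowPrice, t.Price)
            current.ClosePrice = t.Price      ' last tick seen so far
            current.Volume += t.Volume
        Next
        Return bars
    End Function

    Sub Main()
        Dim ticks As New List(Of Tick)
        ticks.Add(MakeTick(New DateTime(2003, 2, 27, 9, 30, 5), 83.5, 100))
        ticks.Add(MakeTick(New DateTime(2003, 2, 27, 9, 30, 40), 83.54, 200))
        ticks.Add(MakeTick(New DateTime(2003, 2, 27, 9, 31, 10), 83.48, 300))

        For Each b As Bar In BuildBars(ticks, TimeSpan.FromMinutes(1))
            Console.WriteLine("{0:HH:mm} O={1} H={2} L={3} C={4} V={5}", _
                b.StartTime, b.OpenPrice, b.HighPrice, b.LowPrice, _
                b.ClosePrice, b.Volume)
        Next
    End Sub
End Module
```

With the three sample ticks, the first two fall into the 09:30 bar and the third opens the 09:31 bar. Minutes with no ticks produce no bar at all, which is itself a form of missing data that a later scan should look for.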
Let’s look at some of the common types of bad data we often encounter in financial markets:
Type of Bad Data          Example
Bad quotes                Tick of 23.54 should be 83.54
Missing data              Blank field or data coded as “9999,” “NA,” or “0”
Column-shifted data       Value printed in an adjacent column
File corruption           CD or floppy disk errors
Different data formats    Data from different vendors may come in different formats or table schemas
As we know, the use of a large amount of in-sample data will produce more stable models and have less curve-fitting danger, thereby increasing the probability of success out-of-sample and consequently during implementation. Sophisticated models, such as GARCH(1,1), are often more affected by bad data as compared with simpler models.
Since many forecasting models, like GARCH, are extremely sensitive to even a few bad data points, we should be sure to look at means, medians, standard deviations, histograms, and minimum and maximum values of our data. A good way to do this is to sort through the data set to examine values outside an expected range. Or we can run scans to highlight suspicious, missing, extraneous, or illogical data points. Here are a few, but certainly not all, methods often used to scan data:
Scanning for Bad Data
Intraperiod high tick less than closing price
Intraperiod low tick greater than opening price
Volume less than zero
Bars with wide high-low ranges relative to some previous time period
Closing deviance: Divide the absolute value of the difference between each closing price and the previous closing price by the average of the preceding 20 absolute values
Data falling on weekends or holidays
Data with out-of-order dates or with duplicate bars
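Several of the scans listed above reduce to one-line comparisons, and the closing deviance is a short loop. The sketch below works on plain arrays and values rather than a database; the function names, the sample prices, and the cutoff of 5 for a suspicious deviance are all assumptions chosen for illustration. Holiday checks are left out because they need an exchange calendar.

```vbnet
Imports System

Module ScanSketch

    ' Closing deviance for bar t: |close(t) - close(t-1)| divided by the
    ' average of the preceding lookback absolute close-to-close changes.
    ' Requires t >= lookback + 1.
    Function ClosingDeviance(ByVal closes() As Double, ByVal t As Integer, _
                             ByVal lookback As Integer) As Double
        Dim sum As Double = 0
        For i As Integer = t - lookback To t - 1
            sum += Math.Abs(closes(i) - closes(i - 1))
        Next
        Dim average As Double = sum / lookback
        If average = 0 Then Return 0
        Return Math.Abs(closes(t) - closes(t - 1)) / average
    End Function

    ' Per-bar sanity checks; returns an empty string for a clean bar.
    Function DescribeProblems(ByVal barDate As DateTime, _
            ByVal openPrice As Double, ByVal highPrice As Double, _
            ByVal lowPrice As Double, ByVal closePrice As Double, _
            ByVal volume As Long) As String
        Dim problems As String = ""
        If highPrice < closePrice Then problems &= "high < close; "
        If lowPrice > openPrice Then problems &= "low > open; "
        If volume < 0 Then problems &= "negative volume; "
        If barDate.DayOfWeek = DayOfWeek.Saturday OrElse _
           barDate.DayOfWeek = DayOfWeek.Sunday Then
            problems &= "weekend date; "
        End If
        Return problems
    End Function

    Sub Main()
        ' A deliberately bad bar: dated Saturday, March 1, 2003, with the
        ' high below the close. Prints "high < close; weekend date; ".
        Console.WriteLine(DescribeProblems(New DateTime(2003, 3, 1), _
            50.0, 50.2, 49.8, 50.5, 1000))

        ' Thirty quiet closes with one bad quote planted at index 25.
        Dim closes(29) As Double
        For k As Integer = 0 To 29
            closes(k) = 83.5 + 0.05 * (k Mod 2)
        Next
        closes(25) = 23.54

        Const lookback As Integer = 20
        Const threshold As Double = 5.0   ' assumed cutoff, tune for your data
        For t As Integer = lookback + 1 To closes.Length - 1
            Dim deviance As Double = ClosingDeviance(closes, t, lookback)
            If deviance > threshold Then
                Console.WriteLine("Suspicious close at index " & t.ToString())
            End If
        Next
    End Sub
End Module
```

Note that a single bad quote tends to be flagged twice, once for the jump away from the true price and once for the jump back, so a cleaning routine should inspect neighboring flags together rather than treating each as a separate error.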
As mentioned, data cleaning has three components: auditing data to find bad data or to highlight suspicious data, fixing bad data, and applying the fix to the data set or preferably saving the data to a new data source. The methods we choose to accomplish these three tasks constitute a data transformation management system (DTMS). The hope is that our DTMS will improve the quality of the data as well as the success of our models. To review, a DTMS should capture data from your data source, clean it, and then save it back or create a new data source with the clean data.
As with any process, it pays to plan ahead when building a DTMS. Before you begin, identify and categorize all the types of errors you expect to encounter in your data, survey the available techniques to address those different types of errors, and develop a system to identify and resolve the errors.
Of course, as we mentioned, you should purchase data only from reputable vendors who take data integrity seriously. Even so, you should always scan and clean your data. Dealing with quality vendors does not remove that step, but it will save time and improve results.
CREATING A DATA TRANSFORMATION MANAGEMENT SYSTEM
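Imports System.Data.OleDb
Public Class Form1
    Inherits System.Windows.Forms.Form
    Dim myDataSet As New DataSet()

    Private Sub Form1_Load(ByVal sender As ...) Handles MyBase.Load
        Dim myConnect As New OleDbConnection("Provider=...\DirtyFinance.mdb")
        Dim myAdapter As New OleDbDataAdapter("select * from AXP", myConnect)
        myConnect.Open()
        ' The listing is cut off at this point in the source. The lines below
        ' are inferred from the later use of myDataSet.Tables("AXPdata").
        myAdapter.Fill(myDataSet, "AXPdata")
        myConnect.Close()
    End Sub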
... or objects. Then we can simply pass a reference to the collection as an input argument to the procedure and commence cleaning. In the AXPdata DataTable, let’s search for intraday high prices that are less than the closing price.
Step 3 Create a subroutine called CleanHighLessThanClose() that accepts as an input argument a reference to a DataRowCollection object. This subroutine should loop through the elements of the collection and find instances where the intraday high is less than the close.
As we discussed in Chapter 11, the DirtyFinance.mdb Access database contains dirty data. For simplicity, your subroutine should, upon finding a dirty data point, show a message box alerting the user to the bad data as well as its index.
Private Sub CleanHighLessThanClose(ByRef myDataPoints As _
        DataRowCollection)
    Dim x As Integer
    ' DataRowCollection is zero-based.
    For x = 0 To myDataPoints.Count - 1
        If myDataPoints(x).Item("HighPrice") < myDataPoints(x).Item("ClosePrice") Then
            MsgBox("Bad Data Point: High of " & _
                myDataPoints(x).Item("HighPrice") & _
                " and Close of " & myDataPoints(x).Item("ClosePrice") & _
                " at " & Str(x))
        End If
    Next x
End Sub
Step 4 Add a button to Form1, and in the Button1_Click event, call the subroutine to clean the table, passing a reference to the DataRowCollection. The Rows property of the DataTable returns a reference to the DataRowCollection.
Private Sub Button1_Click(ByVal sender As ...) Handles Button1.Click
    CleanHighLessThanClose(myDataSet.Tables("AXPdata").Rows)
End Sub
Step 5 Run the program. Figure 14.3 shows the result.
FIGURE 14.3
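The subroutine above only reports bad rows. As a sketch of the remaining DTMS step, saving clean data to a new source while leaving the dirty one untouched, the following console program applies the same high-versus-close test to an in-memory DataTable and copies the rows that pass into a second table. The sample prices, the AXPclean table name, and the module name are made up; a real program would fill the dirty table from DirtyFinance.mdb and write the clean table back out to a new database.

```vbnet
Imports System
Imports System.Data

Module CleanTableSketch
    Sub Main()
        ' In-memory stand-in for the AXPdata table; the values are invented.
        Dim dirty As New DataTable("AXPdata")
        dirty.Columns.Add("HighPrice", GetType(Double))
        dirty.Columns.Add("ClosePrice", GetType(Double))
        dirty.Rows.Add(New Object() {35.1, 34.9})
        dirty.Rows.Add(New Object() {23.54, 35.0})   ' high below close: bad
        dirty.Rows.Add(New Object() {35.4, 35.2})

        ' Clone copies the column structure only, not the rows, so the
        ' original dirty table is preserved as it was.
        Dim clean As DataTable = dirty.Clone()
        clean.TableName = "AXPclean"

        For x As Integer = 0 To dirty.Rows.Count - 1
            Dim row As DataRow = dirty.Rows(x)
            Dim highPrice As Double = CDbl(row("HighPrice"))
            Dim closePrice As Double = CDbl(row("ClosePrice"))
            If highPrice < closePrice Then
                Console.WriteLine("Bad data point at index " & x.ToString())
            Else
                ' ImportRow copies the row's values into the clean table.
                clean.ImportRow(row)
            End If
        Next

        Console.WriteLine("Clean rows kept: " & clean.Rows.Count.ToString())
    End Sub
End Module
```

Run as written, this reports the bad point at index 1 and keeps two clean rows. Dropping a bad bar is only one possible fix; depending on the model, replacing the value or flagging the row for manual review may be more appropriate.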