
Business Analytics: Data Analysis and Decision Making, 5th edition, by Wayne L. Winston, Chapter 18


Page 2

in a format suitable for analysis.

 In an Excel® file, which might still need to be rearranged to get it in the form of a rectangular data set

 In a text file (or ASCII file), which is any file that can be opened and read in a text editor such as Notepad; it can be imported into Excel using Excel’s text import wizard

 In a relational database (such as Access, SQL Server, Oracle), which can be imported into Excel by forming a query using the Microsoft Query package

 A query specifies exactly which data you want to import.

 On the Web, which can be imported into Excel by creating a query and then running it in Excel

values.

Page 3

Rearranging Excel Data

in the form of a data set—a rectangular array of data with

observations in rows, variables in columns, and variable names in the top row.

 Sometimes simple cutting and pasting works

 In other cases, advanced Excel functions are required

 In all cases, it is best to map out a plan and then decide how to implement it

Page 4

Example 18.1:

Baseball Salaries Original.xlsx

Objective: To rearrange the data from the baseball Web queries into a single data set.

Solution: Data on baseball salaries was imported into Excel from a Web site, with a separate Web query for each of the 30 teams.

to the right, with only a few players listed

 To rearrange all of the data into four long columns with the headings Player, Team, Salary, and Position, follow these steps:

1. Insert a blank column before column B, and enter the label Team in cell B2.

2. Cut the Arizona Diamondbacks team name from cell A1 and paste it next to the first Arizona player in cell B3. Then copy it down for the other Arizona players.

3. Repeat step 2 for each of the other teams.

4. Delete unnecessary rows of labels for the other teams.
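The same rearrangement can be sketched outside Excel. The Python below is only an illustration under assumed conditions: each team block is taken to be a row holding the team name alone, then a header row, then player rows, with blank rows between blocks. The function name `stack_team_blocks` and all player data are hypothetical and do not come from Baseball Salaries Original.xlsx.

```python
def stack_team_blocks(rows):
    """Turn per-team blocks into long rows of (Player, Team, Salary, Position)."""
    long_rows = []
    team = None
    for row in rows:
        cells = [c for c in row if c not in ("", None)]
        if not cells:               # blank separator row between teams
            continue
        if len(cells) == 1:         # a lone cell is assumed to be a team name
            team = cells[0]
            continue
        if row[0] == "Player":      # header row repeated for each team
            continue
        player, salary, position = row[:3]
        long_rows.append((player, team, salary, position))
    return long_rows


# Hypothetical worksheet contents, one list per worksheet row.
raw = [
    ["Arizona Diamondbacks", "", ""],
    ["Player", "Salary", "Position"],
    ["Player A", 500000, "Pitcher"],
    ["Player B", 750000, "Catcher"],
    ["", "", ""],
    ["Atlanta Braves", "", ""],
    ["Player", "Salary", "Position"],
    ["Player C", 600000, "Outfielder"],
]

data_set = [("Player", "Team", "Salary", "Position")] + stack_team_blocks(raw)
```

The team name is carried down to every player row, which is what steps 2 and 3 do by cut, paste, and copy.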

Page 5

Example 18.2:

CPI.xlsx (slide 1 of 2)

Objective: To rearrange the monthly data into two long columns, one with month-year and one with the CPI.

Solution: Monthly data on the Consumer Price Index (CPI) was imported from the Web using a Web query. A few rows appear below.

 The desired results after rearranging are shown to the right.

Page 6

Example 18.2:

CPI.xlsx (slide 2 of 2)

 Create the range name Data (for all the CPI values, not the headings in row 1 or column A).

 Add a new worksheet for the rearranged data, create the column headings in row 1, enter 1 in cells A2 and B2, and enter 1913 in cell C2.

 To generate the recurring pattern of 1 to 12 in column B, enter the formula =IF(B2<12,B2+1,1) in cell B3 and copy this down as far as necessary.

 To generate the pattern in column A, enter the formula =IF(B3=1,A2+1,A2) in cell A3 and copy it down.

 To generate the years in column C, enter the formula =IF(B3=1,C2+1,C2) in cell C3 and copy it down.

 To generate the month-year values in column D, enter the formula =DATE(C2,B2,1) in cell D2 and copy it down.

 To generate the CPI values in column E, enter the formula =INDEX(Data,A2,B2) in cell E2 and copy it down.

 Copy columns D and E and paste them over themselves as values. Then delete columns A-C.
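For readers who want to see the logic of these formulas in one place, here is a rough Python sketch. The row index in column A and the month index in column B become the two loop counters, and `data[i][j]` plays the role of INDEX(Data,A,B). The function name `wide_to_long` and the numbers are placeholders, not actual CPI values.

```python
from datetime import date


def wide_to_long(data, first_year):
    """data[i][j] holds the value for year first_year + i and month j + 1."""
    long_rows = []
    for i, year_values in enumerate(data):
        for j, value in enumerate(year_values):
            if value is None:       # months not yet reported in the final year
                continue
            long_rows.append((date(first_year + i, j + 1, 1), value))
    return long_rows


# Placeholder values: one full year and one half year.
data = [
    [100.0 + m for m in range(12)],
    [112.0 + m for m in range(6)] + [None] * 6,
]

rows = wide_to_long(data, first_year=1913)
```

Each output row pairs a first-of-month date, as =DATE(C2,B2,1) does, with the matching value.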

Page 7

Importing Text Data

format that is readable only by that package.

file, usually with a txt extension.

 With fixed width, each variable’s value starts and stops at fixed positions (columns) in the line

 Each line of data has the same length, and the columns line up.

 With delimited data, there is a delimiter character, usually a tab, comma, semicolon, or space, that separates the values in a line

 The lines are typically of different lengths and do not line up nicely.

import wizard.
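The difference between the two layouts can be shown with a small Python sketch. It is an illustration only: the sample lines, the break positions, and the function names `parse_fixed_width` and `parse_delimited` are assumptions made for this example and do not describe any file used in the chapter.

```python
import csv
import io


def parse_fixed_width(line, breaks):
    """Cut a line at fixed column positions, like break lines on a ruler."""
    edges = [0] + list(breaks) + [len(line)]
    return [line[a:b].strip() for a, b in zip(edges, edges[1:])]


def parse_delimited(text, delimiter):
    """Split each line on a delimiter character using the csv module."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# Fixed width: every value starts and stops at a known position.
fixed_line = "AL  1895  63.5"
fixed_fields = parse_fixed_width(fixed_line, breaks=[4, 10])

# Delimited: values are separated by tabs, so line lengths differ.
delimited_text = "Country\tYear\tSubscribers\nA\t2002\t1500\nBB\t2003\t27000\n"
delimited_rows = parse_delimited(delimited_text, "\t")
```

With fixed width the positions carry the structure, so a data dictionary is needed; with delimited data the delimiter carries it.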

Page 8

Example 18.3:

srn_tmp.txt (slide 1 of 2)

Objective: To import the fixed-width text file data into Excel by using Excel’s text import wizard.

Solution: The text file srn_tmp.txt was downloaded from the Web. It contains state, regional, and national (srn) annual data (1895-2005) on temperature. A small portion of the data is shown below.

 The Web site from which the text file was downloaded also has a data dictionary, srn_data.txt, that indicates what the variables are and how they are stored in columns. Part of this data dictionary is shown to the right.

Page 9

Example 18.3:

srn_tmp.txt (slide 2 of 2)

 Open the srn_tmp.txt file within Excel, and the first step of the text import wizard appears. Select Fixed width.

 The second step of the wizard allows you to separate (or parse) the columns as listed in the data dictionary. Click on the third, fourth, and fifth positions on the ruler.

 The last step of the wizard allows you to fine-tune the import, column by column, but you can bypass this step and simply click Finish.

 The data is imported into Excel, as shown below.

 Create column headings in row 1, using the data dictionary as a guide.

 Use “Save As” to save the txt file as an xlsx (or xls) file.

Page 10

Example 18.4:

Objective: To see how delimited text data can be imported into Excel with the text import wizard.

Solution: Annual data by country on the number of mobile

subscribers during 2002-2009 was downloaded from the Web into the

 This time there are column headings in row 1, but the ragged lines indicate that this file must be delimited, not fixed width

Page 11

Example 18.4:

Delimited option in step 1 of the wizard.

 In step 2, Excel guesses that the file is tab-delimited, which is correct, so click on Finish to accept this

 The Flags column contains no data, so it can be deleted

 Column A can also be deleted because it includes a constant value, Mobile Subscribers

 The numbers in column D can be reformatted to Numeric with 0 decimals
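The cleanup described here, dropping a column with no data and a column that holds one constant value, can be sketched in Python as follows. This is a hedged illustration: the function name `drop_uninformative_columns`, the header names, and the sample rows are hypothetical stand-ins, not the contents of the downloaded file.

```python
def drop_uninformative_columns(header, rows):
    """Drop columns that are entirely empty or hold a single constant value."""
    keep = []
    for i in range(len(header)):
        values = {row[i] for row in rows}
        if values <= {""}:          # no data at all, like the Flags column
            continue
        if len(values) == 1:        # one constant value in every row
            continue
        keep.append(i)
    new_header = [header[i] for i in keep]
    new_rows = [[row[i] for i in keep] for row in rows]
    return new_header, new_rows


header = ["Series", "Country", "Year", "Value", "Flags"]
rows = [
    ["Mobile Subscribers", "A", "2002", "1500", ""],
    ["Mobile Subscribers", "A", "2003", "1800", ""],
    ["Mobile Subscribers", "B", "2002", "2700", ""],
]

clean_header, clean_rows = drop_uninformative_columns(header, rows)

# Convert the numeric column from text to integers (0 decimals).
value_col = clean_header.index("Value")
for row in clean_rows:
    row[value_col] = int(row[value_col])
```

A rule like this should be applied with care, since a column that is constant in a small sample might vary in the full file.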

Page 12

Comments About Importing Text Data

open it directly into Excel, without the import text wizard

 Excel automatically parses the values between the commas into columns

option to save it in some type of format, text or otherwise.

 You can try copying and pasting the data into Excel, but it is possible that everything will be pasted into a single column

 If this happens, highlight the data in this column and click on the Text to Columns button on Excel’s Data ribbon

 The purpose of this button is to parse delimited data in a single column into several columns.

chance that the data will not line up properly—that is, the data will get into the wrong columns

 Look closely at the parsed data before proceeding
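One way to "look closely" at parsed data is to count the fields in each row. The sketch below is an assumption-laden illustration: it splits pasted lines on a comma, as a Text to Columns style operation would, and reports rows whose field count differs from the first row. The function names and sample lines are invented for this example.

```python
def text_to_columns(lines, delimiter=","):
    """Split each single-column line into several fields on the delimiter."""
    return [line.split(delimiter) for line in lines]


def misaligned_rows(rows):
    """Return indexes of rows whose field count differs from the first row."""
    expected = len(rows[0])
    return [i for i, row in enumerate(rows) if len(row) != expected]


# The last line has an extra comma inside a value, so it splits into 4 fields.
pasted = [
    "Name,City,Amount",
    "Ann,Boston,120",
    "Bob,New York, NY,95",
]

rows = text_to_columns(pasted)
bad = misaligned_rows(rows)
```

A row flagged this way usually means a delimiter appeared inside a value, which is the typical cause of data landing in the wrong columns.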

Page 13

Importing Relational Database Data

others are extremely complex and powerful packages.

 For database creation, querying, manipulation, and reporting, they have many advantages over spreadsheets

 However, they are not nearly as powerful as spreadsheets for statistical analysis

Excel, where the statistical analysis can be performed.

 Microsoft includes software called Microsoft Query in its Office suite that makes the importing relatively easy

Page 14

Introduction to Relational Databases

tables.

 They are also called single-table databases, where table is the database term for a rectangular range of data, with rows corresponding to records and columns corresponding to fields.

 Flat files are fine for relatively simple database applications, but they are not powerful enough for more complex applications.

 A relational database is a set of related tables, where each table is a rectangular arrangement of fields and records, and the tables are linked explicitly.

 The linked fields are called keys.

 A primary key must contain unique values, whereas a foreign key can contain duplicate values.
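The distinction between the two kinds of keys can be tried out with Python's built-in sqlite3 module. This is a minimal sketch using SQLite rather than Access, with a made-up two-table schema; the table names, field names, and rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Customers (
    CustomerID INTEGER PRIMARY KEY,     -- primary key: values must be unique
    Name TEXT
);
CREATE TABLE Orders (
    OrderID INTEGER PRIMARY KEY,
    CustomerID INTEGER REFERENCES Customers(CustomerID),  -- foreign key
    Units INTEGER
);
INSERT INTO Customers VALUES (1, 'Customer A'), (2, 'Customer B');
INSERT INTO Orders VALUES (10, 1, 50), (11, 1, 80), (12, 2, 30);
""")

# A second customer with CustomerID 1 violates the primary key.
try:
    conn.execute("INSERT INTO Customers VALUES (1, 'Duplicate')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The foreign key value 1 appears in two orders, which is allowed.
orders_for_customer_1 = conn.execute(
    "SELECT COUNT(*) FROM Orders WHERE CustomerID = 1"
).fetchone()[0]
```

One customer can have many orders, so the customer's ID repeats in Orders but appears only once in Customers.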

Page 15

Using Microsoft Query

 The first method uses the From Access button in the Get External Data group on the Data ribbon.

 It is limited to importing whole tables or saved queries.

 The second method employs the Microsoft Query software, which allows you to import all or part of the data from many database packages into Excel.

 It comes with the Office package, but may need to be installed.

 Once Microsoft Query is installed, importing data from Access (or any other supported database package) is essentially a three-step process:

1. Define the source, so that Excel knows what type of database the data is in and where the data is located.

2. Use Microsoft Query to define a query.

3. Return the data to Excel.

Page 16

Example 18.5:

Objective: To illustrate how Microsoft Query can be used to import the results of queries on the Shirt Orders database into Excel.

Solution: Fine Shirt Company has created an Access database file that has information on its sales to its customers during the period of 2005 through 2009.

 The database has three related tables, Customers, Products, and Orders, with a link between the CustomerID fields in the Customers and Orders tables and a link between the ProductID fields in the Products and Orders tables, as shown in the diagram below.

Page 17

Example 18.5:

not Access.

First, define a data source.

 Open a blank spreadsheet in Excel and select From Microsoft Query from the From Other Sources dropdown menu on the Data ribbon.

 Select the top <New Data Source> item in the Choose Data Source dialog box and then click OK.

 Fill in the Create New Data Source dialog box, and then the ODBC Microsoft Access Setup dialog box to indicate which database file you want to use. Click OK to return to the Choose Data Source dialog box.

 In the Choose Data Source dialog box, make sure the Shirt Orders item is selected and the bottom checkbox is unchecked, and click OK. This brings up the Add Tables dialog box and begins the second step, where you define the query.

 Specify which tables are relevant for the query, which fields you want to return to

Page 18

Example 18.5:

 The results of one query (find all of the records in the Orders table that correspond to orders for at least 80 units made by the customer Shirts R Us for the product Long-sleeve Tunic, and return the dates and units ordered for these orders) are shown to the right.

 Once the results of the query data are returned to Excel, you can then begin the statistical analysis of the data.

means that you can refresh the data in Excel

if the Access data change

that you can edit your query

 If your ultimate goal is to create a pivot

table based on the database data, you can

do this directly, as shown in the next

Page 19

Example 18.5 (Continued):

Objective: To illustrate how Microsoft Query can be used to import data directly into a pivot table.

Solution: Fine Shirt Company would like to break down revenue from its various customers and products by using pivot tables.

through the usual PivotTable button on the Insert menu.

 Get into Microsoft Query and define a query.

 When you select the Return Data to Microsoft Excel menu item from the File menu, you see a dialog box where you can specify the type of report you want and where you want it. Select PivotTable Report (or PivotChart and PivotTable Report).

 From here, you can create any pivot tables in the usual way.

Page 20

Example 18.5 (Continued):

product, customer (using the Report Filter area at the top), and

quarter of year, is shown below.

 You have the option of obtaining corresponding pivot charts automatically.

 The pivot table is linked to the query. This means that you can go back to Microsoft Query, edit the query, and return to Excel to update the pivot table.
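What a pivot table computes can be imitated in a few lines of Python. The sketch below sums revenue by product and quarter after filtering on one customer, which mimics a Report Filter. It is an assumption-based illustration: the field names, the helper names `quarter` and `pivot_sum`, the revenue figures, and the names "Product B" and "Customer X" are invented; only "Shirts R Us" and "Long-sleeve Tunic" appear in the example above.

```python
from collections import defaultdict
from datetime import date


def quarter(d):
    """Quarter of the year (1 to 4) for a date."""
    return (d.month - 1) // 3 + 1


def pivot_sum(records, row_key, col_key, value_key):
    """Sum value_key for every (row_key, col_key) combination."""
    table = defaultdict(lambda: defaultdict(float))
    for rec in records:
        table[rec[row_key]][rec[col_key]] += rec[value_key]
    return {row: dict(cols) for row, cols in table.items()}


# Hypothetical query results.
orders = [
    {"Customer": "Shirts R Us", "Product": "Long-sleeve Tunic",
     "Date": date(2005, 2, 14), "Revenue": 1200.0},
    {"Customer": "Shirts R Us", "Product": "Long-sleeve Tunic",
     "Date": date(2005, 3, 2), "Revenue": 800.0},
    {"Customer": "Shirts R Us", "Product": "Product B",
     "Date": date(2005, 8, 20), "Revenue": 500.0},
    {"Customer": "Customer X", "Product": "Long-sleeve Tunic",
     "Date": date(2005, 5, 9), "Revenue": 300.0},
]
for rec in orders:
    rec["Quarter"] = quarter(rec["Date"])

# Filter on one customer, then break revenue down by product and quarter.
filtered = [rec for rec in orders if rec["Customer"] == "Shirts R Us"]
pivot = pivot_sum(filtered, "Product", "Quarter", "Revenue")
```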

Page 21

SQL Statements

databases

language) was developed several decades ago.

 It is often called the “language of databases.”

 Behind each query developed in Microsoft Query is an SQL statement

 The statements can be viewed by clicking the SQL button in the Query toolbar once you have created a query.

 The statements include keywords such as SELECT, FROM, WHERE, and AND, as shown in the example below.
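As a rough illustration of these keywords, the sketch below writes an SQL statement for the query described in Example 18.5 (orders of at least 80 units by Shirts R Us for the Long-sleeve Tunic) and runs it with Python's built-in sqlite3 module. This is not the statement Microsoft Query generates. Apart from CustomerID and ProductID, the field names (Name, Description, OrderDate, UnitsOrdered) and all rows are assumptions, and SQLite stands in for Access.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, Description TEXT);
CREATE TABLE Orders (
    OrderID INTEGER PRIMARY KEY,
    CustomerID INTEGER,
    ProductID INTEGER,
    OrderDate TEXT,
    UnitsOrdered INTEGER
);
INSERT INTO Customers VALUES (1, 'Shirts R Us'), (2, 'Customer X');
INSERT INTO Products VALUES (1, 'Long-sleeve Tunic'), (2, 'Product B');
INSERT INTO Orders VALUES
    (1, 1, 1, '2005-03-01', 90),
    (2, 1, 1, '2006-07-15', 40),
    (3, 1, 2, '2006-08-01', 120),
    (4, 2, 1, '2007-01-10', 100),
    (5, 1, 1, '2008-11-20', 80);
""")

# The first two conditions link the tables through their key fields.
query = """
SELECT Orders.OrderDate, Orders.UnitsOrdered
FROM Customers, Orders, Products
WHERE Customers.CustomerID = Orders.CustomerID
  AND Products.ProductID = Orders.ProductID
  AND Customers.Name = 'Shirts R Us'
  AND Products.Description = 'Long-sleeve Tunic'
  AND Orders.UnitsOrdered >= 80
ORDER BY Orders.OrderDate
"""

results = conn.execute(query).fetchall()
```

SELECT names the fields to return, FROM names the tables, and WHERE with AND combines the linking conditions and the filters.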

Page 22

Web Queries

(slide 1 of 2)

steps required to import the data into Excel for analysis vary greatly.

 Many Web sites provide buttons that allow you to download the data directly into Excel.

 On some Web sites, the only way to get the data into Excel is to cut and paste.

 A Web query is an Excel tool that lies between these two extremes.

 Web queries search for HTML <Table> tags, find the corresponding data, and bring them into Excel in the usual row and column format.

 Many data sets on the Web import beautifully with Web queries, but some return virtually nothing.

 Sometimes you need to run several Web queries on the same basic site to get all of the data you want.

 In a URL, the part to the right of the question mark (if any) is called the query string, and it specifies exactly which data you want.
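The query string can be pulled apart with Python's standard urllib.parse module, as in this small sketch. The URL and its parameter names are made up for illustration and do not refer to a real data source.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical URL; everything after the question mark is the query string.
url = "https://www.example.com/data?series=cpi&start=1913&end=2009"

parts = urlsplit(url)
query_string = parts.query          # the text after the question mark
params = parse_qs(query_string)     # each name maps to a list of values
```

Changing a value in the query string (for example, a different start year) is how the same page can return a different slice of data.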

Page 23

Web Queries

(slide 2 of 2)

 Make sure you have an active connection to the Web, and open a new workbook in Excel.

 Click the From Web button on the Data ribbon.

 Fill in the dialog box, the most important part of which is the URL (the address of the page) at the top.

 Once you enter the URL, click Go. You will see the Web page with yellow arrows next to all of the tables.

 Click any of these yellow arrows to change them to green checkmarks. The selected tables will then be imported into Excel.

 After you click Import, specify where to place the results.

 A link to the Web page remains, so you can refresh to obtain the latest data.
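The idea of searching a page for table tags and returning rows and columns can be sketched with Python's standard html.parser module. This is an illustration of the general technique, not of how Excel implements Web queries. The class name `TableExtractor` and the sample page are hypothetical, the page is supplied as a string rather than downloaded, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every table cell, grouped by table and row."""

    def __init__(self):
        super().__init__()
        self.tables = []     # list of tables, each a list of rows
        self._row = None     # row being built, or None outside a row
        self._cell = None    # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:   # ignore text outside table cells
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


page = """
<html><body><p>Intro text that is ignored.</p>
<table>
<tr><th>Month</th><th>Value</th></tr>
<tr><td>Jan</td><td>100.0</td></tr>
</table></body></html>
"""

parser = TableExtractor()
parser.feed(page)
tables = parser.tables
```

A page that lays out its data without table tags would yield an empty list here, which is consistent with the remark that some pages return virtually nothing.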

Page 24

Cleansing Data

 This is especially the case when you obtain data from external sources such as the Web.

 It is your responsibility to correct any problems before you do any serious analysis.

 Cleansing data requires careful detective work to uncover all possible errors that might be present.

 Once an error is found, it is not always clear how to correct it (for example, missing data).

 Some subjectivity and common sense must be used when cleansing data sets.

Page 25

Example 18.6:

Objective: To find and fix errors in this company’s data set.

Solution: The data file has data on 1500 customers of a particular company. A portion of these data appears below.

Page 26

Example 18.6:

 The data set has a number of problems, all of which you might encounter in real data sets:

and then formatted as a date)

text field)

trailing zeroes)

the total amount spent)

 Use Excel tools to search for the suspicious data values.

 Then use other Excel tools, such as Find and Replace, to fix the errors.
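The search for suspicious values can also be automated. The Python sketch below flags three kinds of problems: a date that cannot be parsed, an unexpected category code, and a total that does not match its parts. It is only an illustration, and the field names (Birthdate, Gender, Purchases, Total), the accepted codes, and the records are hypothetical rather than taken from the example's data file.

```python
from datetime import datetime


def find_suspicious(records):
    """Return (record index, field, reason) for each value that looks wrong."""
    problems = []
    for i, rec in enumerate(records):
        # Dates should parse as month/day/year.
        try:
            datetime.strptime(rec["Birthdate"], "%m/%d/%Y")
        except (TypeError, ValueError):
            problems.append((i, "Birthdate", "not a valid date"))
        # Category codes should come from a known list (assumed here).
        if rec["Gender"] not in ("Male", "Female"):
            problems.append((i, "Gender", "unexpected code"))
        # The total should equal the sum of its parts, up to rounding.
        if abs(sum(rec["Purchases"]) - rec["Total"]) > 0.005:
            problems.append((i, "Total", "does not equal sum of purchases"))
    return problems


records = [
    {"Birthdate": "03/15/1960", "Gender": "Male",
     "Purchases": [100.0, 50.0], "Total": 150.0},
    {"Birthdate": "13/45/1970", "Gender": "Female",
     "Purchases": [20.0], "Total": 20.0},
    {"Birthdate": "07/04/1955", "Gender": "M",
     "Purchases": [10.0, 5.0], "Total": 25.0},
]

problems = find_suspicious(records)
```

Checks like these only locate candidates; deciding how to correct each one still takes judgment.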

Date posted: 10/08/2017, 10:35
