
Techniques and Methods of GIS for Business

Richard P. Greene, Northern Illinois University, USA

John C. Stager, Claremont Graduate University, USA

Abstract

This chapter reviews some standard techniques and methods of geographic information systems for business applications. Characteristics of spatial databases are first reviewed and discussed. Methods of displaying spatial data are compared and contrasted and GIS overlay procedures are described. Two case studies showcase many of the techniques introduced. The first case study illustrates the use of GIS for analyzing an urban labor market while the second demonstrates the integration of modeling functions into a GIS with an application of the gravity model.

Introduction

Managing a company requires a multitude of decisions. Decision makers have historically relied on statistical analysis of their customer sales records to make future decisions. They look at charts to track a product’s sales trends. They look at store sales records to see which stores are doing well and which are not. The company’s database contains much other information whose value often goes unrecognized. Almost every database table contains some location information, such as an address, phone number, or zip code. Most companies use these attributes to keep contact lists, print mailing labels, and send billing statements and advertisements.

A decision maker who looks solely at statistical charts and tables is missing a gold mine of geographic information. A type of geographic analysis called spatial analysis can be performed on this information to yield trade area information, new customers, and competitor area analysis.

This chapter discusses selected techniques and methods of geographic information systems (GIS), with a focus on their applications to business. First, the standard techniques available within GIS software packages are presented. These include buffer delineations, overlay analyses, and geocoding, all of which underlie many GIS applications already in wide use by businesses. Second, GIS allows for advanced spatial analyses and interpretation of spatial data. Simply stated, spatial analysis manipulates geographic coordinates and associated attribute data to solve a spatial problem. Spatial analyses that are especially relevant for businesses are illustrated. Two example applications illustrate the range of analyses that GIS can provide for decision support in business. The first shows how GIS can assist in the spatial analysis of an urban labor market’s industrial specialization. The second shows how businesses can use the gravity model to determine the spatial extent of their market areas. These examples can be replicated with any commercial GIS software.

The use of GIS for business applications is growing rapidly. There are many examples of excellent business uses of GIS, and the literature on the topic has grown in recent years (see Thrall, 2002; Boyles, 2002; and Grimshaw, 2000, for an overview). The purpose of this chapter is not only to highlight GIS techniques relevant for spatial analysis in business but, more importantly, to showcase their use in two case studies.

Spatial Databases

Companies consider data to be a company asset. Spatial data, falling within this definition of data, are therefore also an asset. However, spatial data are often not used to the extent needed to fully leverage their value. Many existing attributes of company databases are spatial, including addresses, zip codes, and telephone area codes. These are not typically thought of as spatial attributes because they are not specified as latitude and longitude coordinates. For example, a bank may regularly use an address for mailing statements. When the bank wants to open a new branch, it may base the location decision on surveys or guesses. Instead, the bank could use a GIS to plot its existing transaction data by geocoding each customer’s home address and recording the branch at which each transaction was made.

Using the GIS, the bank could then perform analyses, such as a what-if analysis, based on the distance from each home to the branch used. This analysis could identify possible new branch locations that reduce the distance from customers to the branch they use. This is a very simple example, but it does illustrate the possible use of existing spatial data that are not generally thought of as spatial.
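The what-if analysis described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the coordinates are invented, the function names are hypothetical, and each customer is assumed to use the nearest branch, which real transaction data may contradict.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def mean_distance_to_nearest(customers, branches):
    """Average distance from each customer to the closest branch (an assumption:
    customers are taken to use their nearest branch)."""
    total = 0.0
    for lat, lon in customers:
        total += min(haversine_km(lat, lon, b_lat, b_lon) for b_lat, b_lon in branches)
    return total / len(customers)

# Hypothetical geocoded customer homes and branch sites as (latitude, longitude).
customers = [(41.93, -88.75), (41.95, -88.72), (41.88, -88.10), (41.90, -88.12)]
existing = [(41.93, -88.77)]
candidate = (41.89, -88.11)

before = mean_distance_to_nearest(customers, existing)
after = mean_distance_to_nearest(customers, existing + [candidate])
print(f"mean distance before: {before:.1f} km, after: {after:.1f} km")
```

Comparing the two averages for several candidate sites gives a simple ranking of possible branch locations.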

Typically, what makes a database spatial is the connection of the data to a geographically referenced coordinate system. A geographical coordinate system precisely locates features on the earth in terms of an X and Y coordinate position. Latitude and longitude are the most frequently used reference coordinates. Both are angles measured at the center of the earth to a point on the earth’s surface. Many GIS databases are instead referenced with coordinates that have been transformed into a map projection and its associated coordinate system; these are typically referred to as X and Y coordinates.

Commercially available database management systems (DBMSs) have, directly or through extensions, implemented support for these spatial data. Three examples are the Oracle Database 9.X Server with the Oracle Spatial 9.X spatial database; Informix’s spatial data-blades, including 2D, 3D, and Geodetic; and ESRI’s Spatial Database Engine (SDE). However, it is also possible to store spatial data in a traditional DBMS, especially given that some spatial data (e.g., address information) are already stored in most databases.

It is also possible to store other spatial data (e.g., base maps, overlays) in traditional DBMSs. This can be accomplished using a Binary Large Object (BLOB), which is a collection of bits that can contain anything (e.g., video, text, music, raster data, vector data, or any other digital data). Moreover, a BLOB can store a shape file (.SHP) in a record. The shape file has become an industry standard for storing some types of spatial data.

Spatial data require a distinct set of operations to manipulate the X and Y coordinates stored in the DBMS. For example, a spatial query may be a SQL SELECT statement with a WHERE clause such as WHERE Street_1 intersects Street_2. The GIS intersect operation can also be applied to two polygons that overlap in X and Y coordinate space. These operations are not generic to SQL and the database world. That is, a GIS intersect is not the intersection used in set processing (e.g., the intersection of two sets), but rather a spatial intersection (e.g., does the trade area of Sears intersect with the trade area of Kohl’s?). Other familiar database operations also differ in a GIS context. Joins, for example, need to be oriented toward spatial data. Normally, we join on like fields and like values. In the spatial world, we need to support a join that relates a point (e.g., a store location) to a polygon (e.g., a store’s trade area) based solely on their geometry, with no shared textual field to match on.
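To make the point-to-polygon join concrete, the following Python sketch relates store points to trade-area polygons using a ray-casting containment test. It is an illustration under stated assumptions, not how any particular spatial DBMS implements the operation: the store names, polygon coordinates, and function names are all hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon,
    given as a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray running right from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def spatial_join(points, polygons):
    """For each named point, list the names of the polygons that contain it."""
    return {
        name: [pname for pname, poly in polygons.items() if point_in_polygon(x, y, poly)]
        for name, (x, y) in points.items()
    }

# Hypothetical store locations and trade areas in projected X and Y coordinates.
stores = {"store_a": (2.0, 2.0), "store_b": (7.0, 1.0)}
trade_areas = {
    "area_1": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "area_2": [(3, 0), (8, 0), (8, 3), (3, 3)],
}
joined = spatial_join(stores, trade_areas)
print(joined)
```

Note that the join key is geometric containment; the two inputs share no field that could be matched in an ordinary relational join.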

Finally, DBMS indexing needs to allow items in the database to be retrieved and displayed without a table scan (i.e., sequentially processing the entire data table, in the case of a relational database). The advantage of indexing is speed and efficiency. Indexing in a DBMS is similar to the indexing of a book. If one were reading a book with no index, the only method for finding a specific item in the book would be to read the entire book in sequence from beginning to end. With an index that is arranged alphabetically, one simply looks up the term and turns to the referenced page number. Similarly, an index can be built for a DBMS, which improves the efficiency of various queries and operations.
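The book-index analogy can be sketched in Python by contrasting a full scan with a lookup through a prebuilt index. This is a toy illustration with invented rows and hypothetical function names; a real DBMS would use structures such as B-trees for attributes and R-trees or grids for spatial data.

```python
from collections import defaultdict

# Hypothetical customer table.
rows = [
    {"customer": "Avery", "zip_code": "60115"},
    {"customer": "Blake", "zip_code": "60178"},
    {"customer": "Casey", "zip_code": "60115"},
]

def scan(rows, zip_code):
    """Full table scan: every row is examined."""
    return [r["customer"] for r in rows if r["zip_code"] == zip_code]

def build_index(rows, field):
    """Map each value of the field to the positions of the rows holding it."""
    index = defaultdict(list)
    for position, row in enumerate(rows):
        index[row[field]].append(position)
    return index

zip_index = build_index(rows, "zip_code")
# Lookup through the index touches only the matching rows.
via_index = [rows[i]["customer"] for i in zip_index["60115"]]
print(via_index)
```

Both approaches return the same customers; the index pays a one-time build cost in exchange for avoiding the scan on each later query.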

Most enterprises have already implemented rudimentary spatial data into their operations. They have done so ever since they placed the first address into a database. Do enterprises have to switch to a DBMS that fully supports spatial data? They do if spatial processing is a requirement or is wholly or partially a core competency of the firm. However, when the goal is to use spatial data to increase productivity, provide better customer service, or increase sales, it may be fine to use what is available and add what is possible within the current DBMS.

One technique for transforming a customer’s address into spatial data represented as an X and Y location is geocoding, often referred to as address matching. Address matching is a process that compares two addresses to determine whether they are the same. To match addresses, the GIS software examines the components of addresses in both the database file where addresses are maintained and the attribute table associated with a GIS layer of roads. The U.S. Census Bureau’s digital street maps are commonly used for this purpose, as their attribute tables hold four street address numbers for each street segment: a low and a high number for each side of the street. Each range indicates the possible numbers that could fall within a particular block, and the numbers are divided into even numbers on one side of the street and odd numbers on the other.

The address components for this type of street are typically represented as:

Left_from   Left_to   Right_from   Right_to   Street_name   Type
201         299       200          298        SUNSET        ST

GIS software tools can then take a table of addresses and interpolate a point for each address based on these address ranges. For example, a customer’s home address could be geocoded into latitude and longitude coordinates and then stored in the customer database as an attribute of the customer, just as the address is. When a delivery is scheduled to that customer, the latitude and longitude could be placed into a shipping file for all the shipments that must be made on a given day. Using this file, the shipments can be divided among the available trucks. Then, using the list of shipments on a given truck, a GIS could use the latitude and longitude to route the truck on the most efficient path for all of the deliveries.
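The interpolation step can be sketched as follows, using the SUNSET ST record shown earlier. This is a simplified illustration, not the algorithm of any specific GIS package: the segment endpoint coordinates are invented, the function name is hypothetical, and each range is assumed to run from low to high along the segment.

```python
def geocode_on_segment(house_number, segment):
    """Interpolate a point along a street segment from its address ranges.

    Returns (x, y, side), or None when the number fits neither range.
    Assumes each range ascends from the segment's start to its end.
    """
    for side in ("left", "right"):
        lo, hi = segment[f"{side}_from"], segment[f"{side}_to"]
        # Within the range and same parity (odd/even) means this side matches.
        if lo <= house_number <= hi and house_number % 2 == lo % 2:
            fraction = (house_number - lo) / (hi - lo) if hi != lo else 0.5
            (x1, y1), (x2, y2) = segment["start"], segment["end"]
            return (x1 + fraction * (x2 - x1), y1 + fraction * (y2 - y1), side)
    return None

sunset = {
    "left_from": 201, "left_to": 299,
    "right_from": 200, "right_to": 298,
    "street_name": "SUNSET", "type": "ST",
    "start": (0.0, 0.0), "end": (100.0, 0.0),  # hypothetical endpoint coordinates
}

# 250 SUNSET ST is even, so it falls on the right side, about halfway along.
print(geocode_on_segment(250, sunset))
```

Production geocoders typically also offset the point to the matched side of the street and handle descending ranges, both of which are omitted here.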

Another example would be for an electric utility to keep spatial data on the company assets that serve its customers. Suppose a problem occurs: someone accidentally knocks down a utility pole. Based on the connections that relate customers to that pole (e.g., electric transmission and delivery circuits), the utility could determine the extent of the outage and proactively notify affected customers about the problem and the estimated time of its resolution. Spatial data processing of a geometric network of electric lines, with the addresses represented as points, would be employed to generate the list of affected customers.
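A network trace of this kind can be sketched as a breadth-first traversal. The example below is a minimal sketch under strong assumptions: the network is radial (power flows one way from the substation), and the pole names, connections, and customer addresses are all invented for illustration.

```python
from collections import deque

def affected_customers(feeds, customers_at, failed_node):
    """Collect customers attached at or downstream of a failed node."""
    seen = {failed_node}
    queue = deque([failed_node])
    affected = []
    while queue:
        node = queue.popleft()
        affected.extend(customers_at.get(node, []))
        # Follow every line that is fed from this node.
        for nxt in feeds.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(affected)

# Hypothetical radial network: each node lists the nodes it feeds.
feeds = {
    "substation": ["pole_1"],
    "pole_1": ["pole_2", "pole_3"],
    "pole_2": ["pole_4"],
}
customers_at = {
    "pole_1": ["12 Elm St"],
    "pole_2": ["40 Oak Ave"],
    "pole_3": ["7 Pine Rd"],
    "pole_4": ["41 Oak Ave", "43 Oak Ave"],
}
print(affected_customers(feeds, customers_at, "pole_2"))
```

Customers on pole_1 and pole_3 are unaffected because they are not fed through the failed pole.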

Querying Attributes and Spatial Display of Data

The benefit for a company that stores its customer data in a GIS is that it allows the company to visualize the spatial patterns of those customers. Consider the case of a company that has a large database filled with tables covering all aspects of its customers, collected over the years. The value of the database lies not in its large size but in its ability to answer questions. You can make queries on the database that result in a smaller subset of data. For example, you might make a query such as, “show me last month’s customers that spent over $200.” The value of your database lies in the ability to structure these queries, as well as in the methods available for displaying their results. For instance, if the above query yields 20 customers, then one may choose to display them in a list, sort them, and sum their total spending. Alternatively, one could display the results in a chart. It would also be useful to display each customer on a map, which is made possible when the data are stored in a GIS. Such a customer map might include the store locations in order to reveal patterns such as clustering around a certain store or zip code. The observations may lead to a decision to mail advertisements to nearby zip codes that appear underserved.

Some GIS support a standardized language to perform such queries. It is called the Structured Query Language or SQL (some pronounce SQL as sequel). A basic query expression is made up of three parts: a field name, an operator, and a value. Such expressions can be combined with other expressions using connectors such as AND and OR. For instance, from the sales example above, if you wanted to visualize customers originating from high income zip codes, the selection condition may be written as SALES_AMOUNT > 200 AND ZIP_CODE = HIGH INCOME, where HIGH INCOME stands for the zip codes classified as high income. In a full SQL statement, a condition of this kind follows the WHERE keyword of a SELECT statement. The GIS would then highlight all of the customers, represented as dots, that met the above criteria.
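The query above can be tried against an ordinary relational database. The sketch below uses Python’s built-in sqlite3 module with an in-memory table; the table name, the INCOME_CLASS column, and the sample rows are assumptions made for illustration, since the chapter does not specify a schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical customer table; income_class is an assumed attribute
    -- recording whether the customer's zip code is high income.
    CREATE TABLE customers (
        name TEXT, zip_code TEXT, income_class TEXT, sales_amount REAL
    );
    INSERT INTO customers VALUES
        ('Avery', '60115', 'HIGH', 250.0),
        ('Blake', '60178', 'LOW',  320.0),
        ('Casey', '60115', 'HIGH',  90.0);
""")

# Two expressions (field, operator, value) joined by the AND connector.
rows = conn.execute("""
    SELECT name, zip_code, sales_amount
    FROM customers
    WHERE sales_amount > 200 AND income_class = 'HIGH'
""").fetchall()
print(rows)
conn.close()
```

In a GIS, the rows returned would correspond to the customer points highlighted on the map.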

Following a set of queries, a company will wish to present the results on a map for decision-making purposes, such as convincing a decision maker to take a certain action.

Many businesses in the days before GIS would place a map of an area on their wall and push colored pins into the map to signify locations of importance for strategic decision-making. Similarly, when traveling, a business person would take a highlighting marker and mark the route to be taken or record the route that was taken. Today, a critical component included in GIS maps is a legend consisting of appropriate symbols, colors, and classifications used for drawing the map layers shown on the map. These legends vary in type ranging from one color or one symbol displays to the display of many colors and symbols. For instance, a metropolitan map illustrating disposable income patterns may vary the symbol size for zip code points to denote levels of income originating from an income attribute contained in the zip code attribute table. The latter technique, referred to as a graduated symbol map, is illustrated in the first case study at the end of the chapter.

Lines are another way that features are represented on maps and, like areas and points, they can be colored or symbolized based on an attribute contained in a table. For instance, if a company has captured the flow of its customers with an origin, such as a zip code location, and a destination, such as a store location, then the business may decide to illustrate the relationship with lines connecting the zip codes to the store, varying the line width according to the volume of customers originating from each zip code. A flow map of this nature will quickly reveal the directions from which the store is attracting customers as well as the location of underserved areas.

Areas, referred to as polygons, are also used to communicate information beyond their location and relative size. For instance, the same metropolitan map of disposable income patterns described above, drawn this time with census blocks rather than zip code points, may use different colors to denote levels of income originating from an income attribute contained in a census block attribute table. This latter example is referred to as a graduated color map, which is created by picking a color ramp (a spectrum of color that allows a distinct color for each represented value). For our example we would pick a red color ramp that runs from white to pink to bright red to dark red. We would equate the dark red with higher disposable incomes. Incomes are assigned to specific colors using a mapping classification technique discussed in the next section.

Another symbolization technique is a dot density map. This technique equates the income to a number of dots that are shown within the census blocks of the metropolitan area. Let us say that each dot represents $100 of disposable income. If the disposable income of a specific census block were $2,000, then there would be 20 dots within the boundary of that block on the map. One potential problem with this technique is that the placement of the dots within the block is random, so the dots do not show where in the block the income actually originates.
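The dot density calculation and the random placement can be sketched as follows. This is an illustrative sketch only: the block outline is invented, the function names are hypothetical, and dots are placed by rejection sampling within the block’s bounding box, which is one simple approach rather than the method of any particular GIS.

```python
import random

def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside

def dot_density(polygon, value, value_per_dot, seed=0):
    """Place value // value_per_dot random dots inside the polygon."""
    rng = random.Random(seed)  # fixed seed so the map is reproducible
    n_dots = int(value // value_per_dot)
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    dots = []
    while len(dots) < n_dots:
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        # Keep only candidate points that fall inside the block boundary.
        if point_in_polygon(x, y, polygon):
            dots.append((x, y))
    return dots

# Hypothetical census block with $2,000 of disposable income, $100 per dot.
block = [(0, 0), (10, 0), (10, 5), (0, 5)]
dots = dot_density(block, value=2000, value_per_dot=100)
print(len(dots))
```

Because the positions come from a random generator, a different seed yields a different but equally valid map of the same block.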

Mapping Classification Methods

Just as in preparing a graph of sales, mapping will often require a user to first classify the data into classes in order to simplify the display. Thus, when you use the symbolization techniques described above for map layers, you decide how many classes each needs and how to break the data into classes. Each class has a beginning and an ending value. You can pick these values through your own criteria, or you can use an established classification method. Each method uses a mathematical rule to calculate the range of each class, and some of the more common ones employed in GIS software are described below.

Natural Breaks

Natural breaks are said to occur within the data when there are large jumps between values of observations in the dataset. The natural breaks method looks for obvious breaks or gaps in the data to establish classes. If one were to request three classes to be generated from the data, the GIS would attempt to find three clusters of values that are separated from one another by gaps.
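The gap-finding idea described above can be sketched in Python. This is a deliberately simplified heuristic that splits the sorted values at the largest gaps; many GIS packages implement natural breaks with the Jenks optimization method instead, which this sketch does not reproduce. The function name and sales figures are hypothetical.

```python
def gap_breaks(values, n_classes):
    """Split sorted values at the (n_classes - 1) largest gaps between neighbours."""
    data = sorted(values)
    # Pair each gap with the position of the value just before it.
    gaps = [(data[i + 1] - data[i], i) for i in range(len(data) - 1)]
    largest = sorted(gaps, reverse=True)[: n_classes - 1]
    cut_positions = sorted(i for _, i in largest)
    classes, start = [], 0
    for i in cut_positions:
        classes.append(data[start : i + 1])
        start = i + 1
    classes.append(data[start:])
    return classes

# Hypothetical sales values with two obvious gaps.
sales = [12, 15, 14, 48, 52, 50, 95, 99]
print(gap_breaks(sales, 3))
```

The three resulting classes correspond to the low, middle, and high clusters in the data.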

Equal Interval

In the equal interval classification, the lowest value is subtracted from the highest value to compute a range. Then the range is divided by the number of classes that are desired. The resulting class width is added to the lowest value to get the upper bound of the lowest class. The width is then added to that bound to get the upper bound of the second class, and the step is repeated for additional classes. An example of the equal interval classification is presented in the first case study at the end of the chapter.
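The equal interval steps can be written as a short Python function. The function name and the sample values are hypothetical; the arithmetic follows the description above.

```python
def equal_interval_breaks(values, n_classes):
    """Return the upper bound of each class for an equal interval classification."""
    low, high = min(values), max(values)
    width = (high - low) / n_classes  # range divided by the number of classes
    return [low + width * k for k in range(1, n_classes + 1)]

# Range is 110 - 20 = 90, so three classes are each 30 wide.
print(equal_interval_breaks([20, 35, 50, 80, 110], 3))  # [50.0, 80.0, 110.0]
```

The last upper bound always equals the highest value in the data.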

Equal Area

An equal area classification sets class boundaries so as to include an equal proportion of the map area in each established class. Thus, the map will appear balanced in that each class will cover approximately the same area. In business this may be of use if a company wishes to map a product that it wishes to distribute equally over a trading area.
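One way to approximate an equal area classification is to sort the features by value and cut the sorted list wherever the cumulative area passes an equal share of the total. The sketch below takes that approach; it is an assumption about how such a classifier might work, not a description of a specific GIS product, and the feature values and areas are invented.

```python
def equal_area_classes(features, n_classes):
    """Group (value, area) features into classes of roughly equal total area.

    Features are sorted by value; each is assigned by the midpoint of its
    slice of the cumulative area.
    """
    ordered = sorted(features)
    total = sum(area for _, area in ordered)
    target = total / n_classes  # area each class should cover
    classes = [[] for _ in range(n_classes)]
    cumulative = 0.0
    for value, area in ordered:
        idx = min(int((cumulative + area / 2) / target), n_classes - 1)
        classes[idx].append(value)
        cumulative += area
    return classes

# Hypothetical features as (attribute value, area); total area is 12.
features = [(10, 4.0), (20, 1.0), (30, 1.0), (40, 2.0), (50, 2.0), (60, 2.0)]
print(equal_area_classes(features, 3))
```

In this example each of the three classes covers an area of 4, even though the classes hold different numbers of features.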

Standard Deviation

Many GIS software packages have introduced the standard deviation or another measure of dispersion for establishing a classification. A standard deviation measures the typical difference between the values in a set and the mean of that set. To calculate the standard deviation, one first calculates the mean of the values, which is the total of the values divided by the number of values. Next, the difference between each value and the mean is squared, the squared differences are summed, and the sum is divided by the number of items; finally, the square root of the result is computed. The formula to compute a standard deviation is:

\[
s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n}}
\]

where s is the standard deviation, n is the number of items in the list, X_i is the value of item i, and X̄ is the mean of the items in the list. For classification breaks, a user decides on the number of standard deviations; for instance, two will result in four classes, two above the mean and two below the mean. This is an effective method for showing extremes in the data. In the case of sales, for example, a business can quickly visualize areas of extremely low and extremely high sales volume.
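The calculation and the resulting four-class scheme can be sketched in Python. The function names and the sample values are hypothetical, and the class boundaries are assumed to fall at whole standard deviations from the mean, with the outermost classes left open-ended.

```python
import math

def population_std(values):
    """Return (mean, s), dividing by n as in the formula above."""
    mean = sum(values) / len(values)
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / len(values))
    return mean, s

def std_dev_class(value, mean, s, n_dev=2):
    """Class index for a value: 0 is the lowest class, 2 * n_dev - 1 the highest.

    With n_dev=2 the classes are: below -1 s, -1 s to the mean,
    the mean to +1 s, and above +1 s.
    """
    z = (value - mean) / s
    idx = math.floor(z) + n_dev
    return max(0, min(2 * n_dev - 1, idx))

# Hypothetical sales values with mean 5 and standard deviation 2.
sales = [2, 4, 4, 4, 5, 5, 7, 9]
mean, s = population_std(sales)
print(mean, s)
print([std_dev_class(v, mean, s) for v in sales])
```

Values in class 0 and class 3 are the extremes a business would look for first on the resulting map.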

Table Joins, Buffer, and Overlays

A number of advanced database operations are available in GIS. Table joins are not unique to GIS; they are a feature of all relational databases, including those that underlie GIS. Joining tables is the technique for using data from multiple sources in your analysis. For example, you have a table of data of all of your customers’ transactions (e.g., sales) for a particular
