RELATIONAL MANAGEMENT and DISPLAY of SITE ENVIRONMENTAL DATA - PART 5



PART FIVE - USING THE DATA


CHAPTER 18

DATA SELECTION

An important key to successful use of an EDMS is to allow users to easily find the data they need. There are two ways for the software to assist the user with data selection: text-based and graphical. With text-based queries, the user describes the data to be retrieved using words, generally in the query language of the software. Graphical queries involve selecting data from a graphical display such as a graph or a map. Query-by-form is a hybrid technique that uses a graphical interface to make text-based selections.

TEXT-BASED QUERIES

There are two types of text-based queries: canned and ad hoc. The trade-off is ease of use vs. flexibility.

Canned queries

Canned queries are procedures where the query is prepared ahead of time, and the retrieval is done the same way each time. An example would be a specific report for management or regulators, which is routinely generated from a menu selection screen. The advantage of canned selections is that they can be made very easy to use since they involve a minimum of choices for the user. The goal of this process is to make it easy to quickly generate the output that will be required most of the time by most of the users. The EDMS should make it easy to add new canned queries, and to connect to external data selection tools if required. Figure 85 shows an example of a screen from Access from which users can select pre-made queries. The different icons next to the queries represent the different query types, including select, insert, update, and delete. The user can execute a query by double-clicking on it. Queries that modify data (action queries), such as insert, update, and delete, display a warning dialog box before performing the action. Other than with the icons, this screen does not separate selection queries from action queries, which results in some risk in the hands of inexperienced or careless users.


Figure 85 - Access database window showing the Queries tab

Ad hoc queries

Sometimes it is necessary to generate output with a format or data content that was not anticipated in the system design. Text selections of this type are called ad hoc queries (“ad hoc” is a Latin term meaning “for this”). These are queries that are created when they are needed for a particular use. This type of selection is more difficult to provide to the user, especially the casual user, in a way that they can comfortably use. Performing ad hoc text-based queries usually requires that users have a good understanding of the structure and content of the database, as well as a medium to high level of expertise in using the software. The data model should be included with the system documentation to assist them in doing this.

Unfortunately, ad hoc queries also carry a high risk that the data retrieved may not be valid. For example, the user may not include the units for analyses, and the database may contain different units for a single parameter sampled at different times. The data retrieved will be invalid if the units are assumed to be the same, and there is no visible indication of the problem. This is particularly dangerous when the user is not seeing the result of the query directly, but using the data indirectly to generate some other result such as statistics or a contour map. In general, it is desirable to formalize and add to the menu as wide a variety of correctly formatted retrievals as possible. Then casual users are likely to get valid results, and “power users” can use the ad hoc queries only as necessary.
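The mixed-units risk described above can be checked before retrieved data is passed on to statistics or mapping. The following is only a minimal sketch in Python; the row layout and the function name find_mixed_units are assumptions made for illustration, not part of any particular EDMS.

```python
from collections import defaultdict

def find_mixed_units(results):
    """Return {parameter: sorted units} for parameters reported in more than one unit."""
    units_by_parameter = defaultdict(set)
    for row in results:
        units_by_parameter[row["parameter"]].add(row["units"])
    return {p: sorted(u) for p, u in units_by_parameter.items() if len(u) > 1}

# Hypothetical retrieved rows: sulfate was reported in different units at different times.
rows = [
    {"parameter": "Sulfate", "value": 1200.0, "units": "mg/l"},
    {"parameter": "Sulfate", "value": 950000.0, "units": "ug/l"},
    {"parameter": "Iron", "value": 0.3, "units": "mg/l"},
]
mixed = find_mixed_units(rows)
print(mixed)
```

A non-empty result is a signal that the values cannot be compared or contoured until they are converted to consistent units.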

Figure 86 shows an example of creation of an ad hoc text-based query. The user has created a new query, selected the tables for display, dragged the fields from the tables to the grid, and entered selection criteria. In this case, the user has asked for all “Sulfate” results for the site “Rad Industries” where the value is > 1000. Access has translated this into SQL, which is shown in the second panel, and the user can toggle between the two. The third panel shows the query in datasheet view, which displays the selected data. The design and SQL views contain the same information, although in Access it is possible to write a query, such as a union query, that can’t be displayed in design view and must be shown in SQL. Some advanced users prefer to type in the SQL rather than use design view, but even for them the drag and drop can save typing and minimize errors.
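For readers without the figure, the same ad hoc selection might look roughly like the sketch below. It uses Python's built-in sqlite3 module rather than Access, and the table names, field names, and sample rows are all hypothetical; the actual data model in the figure may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical four-level data model: sites, stations, samples, analyses.
conn.executescript("""
    CREATE TABLE sites    (site_id INTEGER PRIMARY KEY, site_name TEXT);
    CREATE TABLE stations (station_id INTEGER PRIMARY KEY, site_id INTEGER, station_name TEXT);
    CREATE TABLE samples  (sample_id INTEGER PRIMARY KEY, station_id INTEGER, sample_date TEXT);
    CREATE TABLE analyses (sample_id INTEGER, parameter TEXT, value REAL, units TEXT);
    INSERT INTO sites VALUES (1, 'Rad Industries'), (2, 'Other Site');
    INSERT INTO stations VALUES (1, 1, 'MW-1'), (2, 2, 'MW-9');
    INSERT INTO samples VALUES (1, 1, '1985-03-01'), (2, 1, '1986-03-01'), (3, 2, '1985-03-01');
    INSERT INTO analyses VALUES
        (1, 'Sulfate', 1400.0, 'mg/l'),
        (2, 'Sulfate',  800.0, 'mg/l'),
        (2, 'Iron',    2000.0, 'mg/l'),
        (3, 'Sulfate', 5000.0, 'mg/l');
""")

# All sulfate results for one site where the value is greater than 1000.
# The units are retrieved along with the value so mixed units stay visible.
ad_hoc_sql = """
    SELECT st.station_name, sa.sample_date, a.parameter, a.value, a.units
    FROM sites s
    JOIN stations st ON st.site_id = s.site_id
    JOIN samples sa  ON sa.station_id = st.station_id
    JOIN analyses a  ON a.sample_id = sa.sample_id
    WHERE s.site_name = ? AND a.parameter = ? AND a.value > ?
    ORDER BY sa.sample_date
"""
rows = conn.execute(ad_hoc_sql, ("Rad Industries", "Sulfate", 1000)).fetchall()
for row in rows:
    print(row)
```

The joins are what make this kind of query hard for casual users: the person writing it has to know how the tables relate to each other.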


Figure 86 - A text-based query in design, SQL, and datasheet views

GRAPHICAL SELECTION

A second selection type is graphical selection. In this case, the user generates a graphical display, such as a map, of a given site, selects the stations (monitoring wells, borings, etc.), then retrieves associated analytical data from the database.


Figure 87 - Interactive graphical data selection

Figure 88 - Editing a well selected graphically


Figure 89 - Batch-mode graphical data selection

Geographic Information System (GIS) programs such as ArcView, MapInfo, and Enviro Spase provide various types of graphical selection capability. Some map add-ins that can be integrated with database management and other programs, such as MapObjects and GeoObjects, also offer this feature.

There are two ways of graphically selecting data, interactive and batch. In Figure 87 the user has opened a map window and a list window showing a site and some monitoring wells. The user then double-clicked on one of the wells on the map, and the list window scrolled to show some additional information on the well.

In Figure 88 a well was selected graphically, then the user called up an editing screen to view and possibly change data for that well. The capability of working with data in its spatial context can be a valuable addition to an EDMS.

In Figure 89 the user wanted to work with wells in or near two ponds. The user dragged a rectangle to select a group of wells, and then individually selected another. Then the user asked the software to create a list of information about those wells, which is shown on the bottom part of the screen. In this case the spatial component was a critical part of the selection process.

Selection based on distance from a point can also be valuable. The point can be a specific object, such as a well, or any other location on the ground, such as a proposed construction location. The GIS can help you perform these selections.
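Behind the map display, rectangle and distance selections reduce to simple coordinate tests. The sketch below illustrates the idea in Python with hypothetical well names and coordinates; a real GIS would also handle map projections and spatial indexing.

```python
import math

# Hypothetical well coordinates in site units (for example, feet east and north).
wells = {"MW-1": (100.0, 200.0), "MW-2": (150.0, 260.0), "MW-3": (400.0, 90.0)}

def select_in_rectangle(stations, xmin, ymin, xmax, ymax):
    """Names of stations that fall inside a dragged rectangle."""
    return sorted(name for name, (x, y) in stations.items()
                  if xmin <= x <= xmax and ymin <= y <= ymax)

def select_within_distance(stations, x0, y0, radius):
    """Names of stations within a given distance of a point."""
    return sorted(name for name, (x, y) in stations.items()
                  if math.hypot(x - x0, y - y0) <= radius)

print(select_in_rectangle(wells, 50, 150, 200, 300))
print(select_within_distance(wells, 100, 200, 60))
```

The resulting list of station names can then be used as the selection criterion for retrieving samples and analyses from the database.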

Other types of graphical selection include selection from graphs and selections from cross sections. Some graphics and statistics programs allow you to create a graph, and then click on a point on the graph and bring up information about that point, which may represent a station, sample, or analysis. GIS programs that support cross section displays can provide a similar feature where a user can click on a soil boring in a cross section, and then call up data from that boring, or a specific sample for that boring.


Figure 90 - Example of query-by-form

QUERY-BY-FORM

A technique that works well for systems with a variety of different user skill levels is query-by-form, or QBF. In this technique, a form is presented to the user with fields for some of the data elements that are most likely to be used for selection. The user can fill out as many of the fields as needed to select the subset of interest. The software then creates a query based on the selection criteria. This query can then be used as the basis for a variety of different lists, reports, graphs, maps, or file exports. Figure 90 shows an example of this method.


Figure 91 - Query-by-form screen showing selection criteria for different data levels

In this example, the user has selected Analyses in the upper right corner. Along the left side the user selected “Rad Industries” as the site, and “MW-1” as the station name. In the center of the screen, the user has selected a sample date later than 1/1/1985, and “Sulfate” as the parameter. The lower left of the screen indicates that there are 16 records that match these criteria, meaning that there are 16 sulfate measurements for this well for this time period. When the user selected List, the form at the bottom of the screen was displayed showing the results.
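One way software might turn a filled-in form into a query is sketched below in Python. The form field names, the SQL column names, and the analysis_view name are hypothetical; the point is only that blank fields add no restriction and that the user's entries are passed as parameters rather than pasted into the SQL text.

```python
def build_query(criteria):
    """Build a parameterized query from the form fields that were filled in."""
    # Hypothetical mapping of form field -> SQL condition.
    conditions = {
        "site": "site_name = ?",
        "station": "station_name = ?",
        "date_after": "sample_date > ?",
        "parameter": "parameter = ?",
    }
    clauses, params = [], []
    for field, condition in conditions.items():
        value = criteria.get(field)
        if value not in (None, ""):   # blank fields place no restriction
            clauses.append(condition)
            params.append(value)
    sql = "SELECT * FROM analysis_view"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params

# The selections described above, with the date written in ISO form.
form = {"site": "Rad Industries", "station": "MW-1",
        "date_after": "1985-01-01", "parameter": "Sulfate"}
sql, params = build_query(form)
print(sql)
print(params)
```

The same generated query can then feed a list, report, graph, map, or export without the user ever seeing the SQL.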

To be effective, the form for querying should represent the data model, but in a way that feels comfortable to the user. Also, the screen should allow the user to see the selection options available. Figure 91 shows four different versions of a screen allowing users to make selections at four different levels of the data hierarchy.

The more defined the data model, the easier it is to provide advanced user-friendly selection. The Access query editor is very flexible, and will work with any tables and fields that might be in the database. However, the user has to know the values to enter into the selection criteria. If the fields are well defined and won’t change, then a screen like that shown in Figures 90 and 91 can provide selection lists to select values from. Figure 92 shows an example of a screen showing the user a list of parameter names to choose from.


Figure 92 - Query-by-form screen showing data choices

One final point to be emphasized is the reliance of data quality on good selection practices. This was discussed above and in Chapter 15. Improper selection and display can result in data that is easy to misinterpret. Great care must be taken in system design, implementation, and user training so that the data retrieved accurately represents the answer to the question the user intended to ask.


CHAPTER 19

REPORTING AND DISPLAY

It takes a lot of work to build a good database. Because of this, it makes sense to get as much benefit from the data as possible. This means providing data in formats that are useful to as many aspects of the project as possible, and printed reports and other displays are one of the primary output goals of most data management projects. This chapter covers a variety of issues for reports and other displays. Graph displays are described in Chapter 20. Cross sections are discussed in Chapter 21, and maps and GIS displays in Chapter 22. Chapter 23 covers statistical analysis and display, and using the EDMS as a data source for other programs is described in Chapter 24.

TEXT OUTPUT

Whether the user has performed a canned or ad hoc query, the desired result might be a tabular display. This display can be viewed on the screen, printed, saved to a file, or copied to the clipboard for use in other applications. Figure 93 is an example of this type of display. This is the most basic type of retrieval. This is considered unformatted output, meaning that the data is there, but there is no particular presentation associated with it.

Figure 93 - Tabular display of output from the selection screen


Figure 94 - Banded report for printing

FORMATTED REPORTS

Once a selection has been made, another option is formatted output. The data can be sent to a formatted report for printing or electronic distribution. A formatted report is a template designed for a specific purpose and saved in the program. The report is based on a query or table that provides the data, and the report form provides the formatting.

Standard (banded) reports

Figure 94 is an example of a report formatted for printing. This example shows a standard banded report, where the data at different parent-child levels is displayed in horizontal bands across the page. This is the easiest type of report to create in many database systems, and is most useful when there is a large amount of information to present for each data element, because one or more lines can be dedicated to each result.

Cross-tab reports

The next figure, Figure 95, shows a different organization called a cross-tab or pivot table report. In this layout, one element of the data is used to create the headers for columns. In this example, the sample event information is used as column headers.


Figure 95 - Cross-tab report with samples across and parameters down

Figure 96 - Cross-tab report with parameters across and samples down

Figure 96 is a cross-tab pivoted the other way, with parameters across and sample events down. In general, cross-tab reports are more compact than banded reports because multiple results can be shown on one line.
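The pivot behind both cross-tab layouts can be sketched in a few lines of Python. The record layout and the values below are made up for illustration; most database and reporting tools provide this operation built in.

```python
def cross_tab(results, row_key, col_key):
    """Pivot one-result-per-record data into {row: {column: value}} plus sorted column headers."""
    table, columns = {}, set()
    for r in results:
        table.setdefault(r[row_key], {})[r[col_key]] = r["value"]
        columns.add(r[col_key])
    return table, sorted(columns)

def print_cross_tab(table, columns):
    print("".ljust(12) + "".join(c.ljust(12) for c in columns))
    for row in sorted(table):
        cells = (str(table[row].get(c, "")).ljust(12) for c in columns)
        print(row.ljust(12) + "".join(cells))

# Hypothetical results, one record per analysis.
results = [
    {"sample": "1981-02-26", "parameter": "Sulfate", "value": 1400},
    {"sample": "1981-02-26", "parameter": "Iron", "value": 0.3},
    {"sample": "1981-04-20", "parameter": "Sulfate", "value": 1250},
]
# Samples across and parameters down, then the other way around.
by_parameter, sample_cols = cross_tab(results, "parameter", "sample")
by_sample, parameter_cols = cross_tab(results, "sample", "parameter")
print_cross_tab(by_parameter, sample_cols)
print_cross_tab(by_sample, parameter_cols)
```

Note that this sketch keeps only one value per cell, which is exactly why multiple observations of the same parameter for one sample need special handling in a cross-tab.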


Figure 97 - Data display options

Cross-tab reports provide a challenge regarding the display of field data when multiple field observations must be displayed with the analytical data. Typically there will be one result for each analyte (ignoring dilutions and reanalyses), but several observations of pH for each sample. In a cross-tab, the additional pH values can be displayed either as additional columns or additional rows. Adding rows usually takes less space than additional columns, so this may be preferred, but either way the software needs to address this issue.

FORMATTING THE RESULT

There are a number of options that can affect how the user sees the data. Figure 97 shows a panel with some of these options for how the data might be displayed.

The user can select which regulatory limit or regulatory limit group to use for comparison, how to handle non-detected values, how to display graphs and handle field data, whether to include calculated parameters, how to display the values and flags, how to format the date and time, and whether to convert to consistent units and display regulatory limits.

Regulatory limit comparison

For investigation and remediation projects, an important issue is comparison of analytical results to regulatory limits or target levels. These limits might be based on national regulations such as federal drinking water standards, state or local government regulations, or site-specific goals based on an operating permit or Record of Decision (ROD). Project requirements might be to display all data with exceedances highlighted, or to create a report with only the exceedances. For most constituents, the comparison is against a maximum value. For others, such as pH, both an upper and a lower limit must be met.

The first step in using regulatory limits is to define the limit types that will be used. Figure 98 shows a software screen for doing this. The user enters the regulatory limit types to be used, along with a code for each type.

The next step is to enter the limits themselves. Figure 99 shows a form for doing this. Limits can be entered as either site-specific or for all sites. For each limit, the matrix, parameter, and limit type are entered, along with the upper and lower limits and units. The regulatory limit units are particularly important: they must be considered in later comparisons and in conversion to consistent units, as described below.

There is one complication that must be addressed for limit comparison to be useful for many project requirements. Often the requirement is for different parameters, or groups of parameters, to be compared to different limit types on the same report. For example, the major ions might be compared to federal drinking water standards, but the organics may be compared to more stringent local or site-specific criteria. This requires that the software provide a feature to allow the use of different limits for different parameters. Figure 100 shows a screen for doing this. The user enters a name for the group, and then selects limits from the various limit types to use in that group.


Figure 98 - Form for defining regulatory limit types

Figure 99 - Form for entering regulatory limits

Figure 100 - Form for defining regulatory limit groups


Figure 101 - Selection of regulatory limit or group for reporting

After the limits and groups have been defined, they can be used in reporting. Figure 101 shows a panel from the selection screen where the user is selecting the limit type or group for comparison. The list contains both the regulatory limit types and the regulatory limit groups, so either one can be used at report time. The software code should be set up to determine which type of limit has been selected, and then retrieve the proper data for comparison.
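The logic of deciding whether the user picked a limit type or a limit group, and then comparing against upper and lower limits, might be sketched as follows. This is a Python illustration only: the dictionaries, names, and limit numbers are hypothetical and are not actual regulatory values, and the results are assumed to already be in the same units as the limits.

```python
# Hypothetical limits keyed by (limit type, parameter): (lower, upper); None means no bound.
LIMITS = {
    ("federal", "Sulfate"): (None, 250.0),
    ("federal", "pH"): (6.5, 8.5),
    ("site", "Benzene"): (None, 0.001),
}
# A limit group maps each parameter to the limit type it should be compared against.
GROUPS = {"permit_report": {"Sulfate": "federal", "pH": "federal", "Benzene": "site"}}

def exceeds(value, lower, upper):
    """True if the value is below the lower limit or above the upper limit."""
    return (lower is not None and value < lower) or (upper is not None and value > upper)

def check_result(parameter, value, selection):
    """selection is a limit type or a limit group name; returns True, False, or None."""
    limit_type = GROUPS[selection].get(parameter) if selection in GROUPS else selection
    limit = LIMITS.get((limit_type, parameter))
    if limit is None:
        return None   # no applicable limit for this parameter
    return exceeds(value, *limit)

print(check_result("Sulfate", 1400.0, "permit_report"))
print(check_result("pH", 6.0, "federal"))
print(check_result("Benzene", 0.002, "federal"))
```

Returning None when no limit applies lets the report distinguish "within limits" from "not compared", which matters when a report mixes parameters from several limit types.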

Value and flag

Analytical results contain much more information than just the measured value. A laboratory deliverable file may contain 30 or more fields of data for each analysis. In a banded report there is room to display all of this data. When the result is displayed in a cross-tab report, there is only one field for each result, but it is still useful to display some of this additional information. The items most commonly involved in this are the value, the analytical flag, and the detection limit. Different EDMS programs handle this in different ways, but one way to do it is using fields for reporting factor and reporting basis that are based on the analytical flag. Another way to do it is to have a text field for each analysis containing exactly the formatting desired. Examples of reporting factor and reporting basis values, and how each result might look, are shown in the following table:

Basis code   Reporting basis                                    Reporting factor   Value   Flag   Detection limit   Result

b            Both value and flag                                1                  3.7     v      0.1               3.7 v

l            Less than sign (<) and detection limit or value    1                  3.7     u      0.1               < 0.1

g            Greater than sign (>) and

d            Detection limit (times factor)


Flag code Flag Reporting factor Reporting basis

Non-detects

When laboratories analyze for a constituent, it may or may not be found. If it is not found, it is referred to as not detected, or a non-detect. The various different detection limits used by laboratories are discussed in Chapter 12. If the result is not detected at the appropriate limit, the lab should flag (qualify) the data with a flag such as “u” for “undetected.” It should also report the detection limit and the limit type. It may or may not place the detection limit in the value field.

In a full, banded report, the value, flag, detection limit, and detection limit type can all be reported. In a cross-tab report, or an export such as an XYZ file for contouring, there is no room for that. There are several ways to handle non-detects in these cases, and often a combination of them is used.

Ignore them – Analyses for which the constituent was not detected can be excluded. This is generally not a good idea, since the fact that the constituent wasn’t detected is useful information.

Display the value – The software can display the value provided by the laboratory, but this is risky, because the laboratory may or may not place the detection limit in the value field. It has the advantage of being easy to implement, because the report can be based on only one field.


Figure 102 - Form for defining calculated parameters

Display the detection limit – It makes sense to display the detection limit for non-detected values and the value if there was a detection. This is more complicated to program than just basing the report on the value field, because the software has to look at the analysis record and determine which field to display, either using an IF statement (or more likely the slightly different immediate IIF) or using program code.

Display the limit and qualify it – If the limit is displayed, it is helpful to qualify it in the report, either by displaying a less than sign (<) or the flag. To do this only for the non-detects requires special handling in the software.

Apply a factor to the limit – Sometimes a numerical factor is applied to the detection limit before it is displayed. A common factor is one half, although others are sometimes used. The thinking is that the true value is somewhere between the detection limit and zero, so one half is a good guess. This can be useful for estimating volumes of a material, or for other statistical calculations.

Display a zero – A variation on using a factor is to use a zero for non-detects. This is usually not correct technically, but can be useful in some applications like contour mapping. If you do use a zero value in contouring, be sure to do so with care. The value is not really a zero, but is less than a specific value (the detection limit), and setting it to zero could be misleading, especially if the detection limit is highly elevated, and the real value could be different enough from zero to affect the surface. Another option for contouring is to set the value to the indeterminate value, which is the value (such as -99999) that the contouring program ignores in calculating the surface, but then you are throwing away the useful information that the value is low. Some, but not many, contouring programs allow you to specify that the value is less than a certain amount, and then the software constrains the surface based on that information. That is the best solution if it is available.

Which approach is best for displaying non-detects depends on the use of the data. It is important that data users be aware of how the result is being displayed.
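The options listed above can be pictured as a single display function with a selectable policy. The Python sketch below is an assumption-laden illustration: the policy names are invented here, and it assumes that a flag of "u" marks a non-detect as in the example flag mentioned earlier.

```python
def display_result(value, flag, detection_limit, policy="qualify", factor=0.5):
    """Return what to show for one analysis under the chosen non-detect policy."""
    if flag != "u":            # detected: show the measured value
        return value
    if policy == "ignore":
        return None            # leave the non-detect out of the output
    if policy == "value":
        return value           # risky: may or may not hold the detection limit
    if policy == "limit":
        return detection_limit
    if policy == "qualify":
        return f"< {detection_limit}"
    if policy == "factor":
        return detection_limit * factor
    if policy == "zero":
        return 0.0
    raise ValueError(f"unknown policy: {policy}")

print(display_result(3.7, "v", 0.1))
print(display_result(3.7, "u", 0.1))
print(display_result(3.7, "u", 0.1, policy="factor"))
```

Keeping the policy as an explicit argument makes it easier to record on the report which convention was used, so data users know how the non-detects were displayed.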

Calculated fields

Sometimes it is helpful to display data that is based on calculations using data that is in the database. These are referred to as calculated fields or derived values. These are results that are not contained in the database, but are generated “on the fly” at retrieval time. The software can provide a system for defining and calculating these results. Figure 102 shows an example of how this might be presented.


In this screen, the user has specified that the software is to calculate the mass of the total dissolved solids for a sample. The input parameters have been selected as the total dissolved solids concentration times the effluent volume. The result must then be scaled to the output units of kilograms by dividing by one million. The screen is also asking for a nesting order, which determines the order in which multiple calculations are to be performed, allowing complicated multi-step calculations with many parameters if necessary. There is also a checkbox to enable and disable the calculated field, so that a particular calculation can be turned off and on without deleting it.
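A calculated-parameter system with a nesting order and an enable switch might be sketched as below. This Python example assumes the concentration is in mg/l and the volume in liters, so that dividing by one million gives kilograms; the second calculation and all names are hypothetical and are included only to show why the order matters.

```python
# Each entry: (nesting order, enabled, output name, function of the values so far).
calculations = [
    (2, True, "TDS load (kg/day)",
     lambda v: v["TDS mass (kg)"] / v["Days"]),
    (1, True, "TDS mass (kg)",
     lambda v: v["TDS (mg/l)"] * v["Effluent volume (l)"] / 1_000_000),
]

def run_calculations(values, calculations):
    """Apply enabled calculations in nesting order; later ones may use earlier results."""
    out = dict(values)
    for _, enabled, name, func in sorted(calculations, key=lambda c: c[0]):
        if enabled:
            out[name] = func(out)
    return out

stored = {"TDS (mg/l)": 500.0, "Effluent volume (l)": 2_000_000.0, "Days": 10.0}
result = run_calculations(stored, calculations)
print(result["TDS mass (kg)"], result["TDS load (kg/day)"])
```

Because the calculated values are produced at retrieval time, nothing in the stored data changes when a calculation is disabled or redefined.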

Consistent units

It is possible that different results for the same parameter in the database might be in different units. This can be avoided at import time, as described in Chapter 13, but that is not always desirable. When the data is displayed in a banded report with one or more lines per result, and the units displayed, then multiple units may not be a problem, since a unit is shown with each value. In a cross-tab report, or if only the numbers (and not the units) are being retrieved for use in statistics, graphing, or mapping, then it is mandatory to convert to consistent units. A good approach is to define in the software the target units for each parameter and matrix. Matrix is important because the units for different matrices usually should be different. For example, in water the concentration of a constituent like a metal is reported as mass per unit volume, such as milligrams per liter, while for a solid such as soil, it is in mass per mass, such as milligrams per kilogram or parts per million. A screen for defining target units for each parameter is shown in Figure 103.

The next step is to define all of the conversion factors necessary to do the conversions. This is also shown in Figure 103. Conversion of different units of the same scale, such as from milligrams per liter to micrograms per liter, is pretty straightforward. Not all conversions are this simple, however, and great care must be taken in converting between different types of measure. For example, the laboratory may express measurements of radioactive materials like radium-226 in activity, such as picocuries per gram. In order to determine how much material is there, it is useful to have the data in mass units, such as milligrams per kilogram. This conversion, however, depends on a number of factors, such as the isotopic mix, physical properties of the sample, and so on, and consequently is at best site-specific, and at worst involves complicated statistical calculations. Be sure you know what you are doing before you go too far with unit conversions.

Once the desired concentration units and conversion factors have been defined, the software can perform the conversion. It is obvious that the value should be converted, but usually you will also want to convert other related information, such as the detection limit, regulatory limits used for comparison, and so on.
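A simple same-scale conversion driven by target units and a factor table might look like the following Python sketch. The table contents and field names are hypothetical, and only straightforward multiplicative conversions are covered; conversions between different types of measure, such as activity to mass, are deliberately left out.

```python
TARGET_UNITS = {("Sulfate", "Water"): "mg/l"}     # keyed by (parameter, matrix)
FACTORS = {("ug/l", "mg/l"): 0.001,               # multiply by the factor to convert
           ("mg/l", "ug/l"): 1000.0}

def to_target(record):
    """Convert the value, detection limit, and regulatory limit to the target units."""
    target = TARGET_UNITS[(record["parameter"], record["matrix"])]
    if record["units"] == target:
        return dict(record)
    factor = FACTORS[(record["units"], target)]   # KeyError if no factor is defined
    converted = dict(record, units=target)
    for field in ("value", "detection_limit", "reg_limit"):
        if converted.get(field) is not None:
            converted[field] = converted[field] * factor
    return converted

record = {"parameter": "Sulfate", "matrix": "Water", "units": "ug/l",
          "value": 1400000.0, "detection_limit": 500.0, "reg_limit": None}
converted = to_target(record)
print(converted)
```

Letting a missing factor raise an error, rather than passing the value through unchanged, avoids silently mixing units in the output.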

Other issues

There are a number of other issues that arise in formatting the data to satisfy project needs. These include handling of decimal places and date and time formatting.

Handling of decimal places, or significant figures, is an issue that is not handled well in many software programs. Try this experiment. Open a new spreadsheet in Excel. In one of the cells, type in 3.00, and press Enter. The zeros go away. Access and other programs lose trailing zeros the same way. This results in lost information. If the analysis was to two decimal places, then those zeros should be displayed. There are two ways to handle this in an Access-based database. One is to store the value as a text string, rather than as a number. The other is to store the number of decimal places in a separate field, and combine the two if necessary at retrieval time using a user-defined function.
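The second approach, storing the number of decimal places separately and recombining at retrieval time, amounts to a very small function. It is shown here in Python as a sketch; in an Access-based system the equivalent would be a user-defined function, and the name format_result is an assumption.

```python
def format_result(value, decimals):
    """Render a stored number with the number of decimal places the lab reported."""
    return f"{value:.{decimals}f}"

print(format_result(3.0, 2))
print(format_result(3.7, 1))
```

The number itself stays numeric for sorting, statistics, and comparison to limits, while the display preserves the reported precision.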


Figure 103 - Forms for defining units by parameter and matrix, and conversion between units

The issue of date and time formatting is related to the way that the data management software stores dates and times, and how you want them displayed. For example, Access combines dates and times into one field. This field is a numeric field, with the whole number (left of the decimal point) representing the date. Internally this is stored as the number of days since Dec 30, 1899, so a value of 2 is Jan 1, 1900, and Jan 1, 2002 is 37257. The decimal portion of the date number (right of the decimal point) represents the time, starting at midnight. For example, a value of .5 is 12:00 PM (noon) and 8:30 AM is .3541666667. This combination of date and time storage is different from some other systems, such as dBase and FoxPro, where the date and time are stored in separate fields. For environmental projects, the date is nearly always important, but the time may or may not be. For example, for soil samples taken once, the time during the day that they were taken may not be important, but for air samples taken every hour, it certainly would be. For systems like Access that combine the date and time, it is useful to have a feature to turn the display of the time on and off as appropriate for the data being displayed. Reports can be formatted to display the date and time in different fields if desired.
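The combined date and time number can be decoded, and the time display switched on or off, with a sketch like the one below. It is written in Python and assumes a day zero of December 30, 1899, which is the starting point that makes 37257 come out as Jan 1, 2002; the function names are illustrative.

```python
from datetime import datetime, timedelta

# Assumption: day zero for Access-style date/time numbers.
ACCESS_DAY_ZERO = datetime(1899, 12, 30)

def serial_to_datetime(serial):
    """Whole part of the number is days since day zero; the fraction is the time of day."""
    return ACCESS_DAY_ZERO + timedelta(days=serial)

def format_date(serial, show_time=True):
    """Format the stored number, optionally hiding the time portion."""
    stamp = serial_to_datetime(serial)
    return stamp.strftime("%m/%d/%Y %H:%M" if show_time else "%m/%d/%Y")

print(format_date(37257.5))
print(format_date(37257.5, show_time=False))
```

A report for one-time soil samples might call this with show_time=False, while an hourly air monitoring report would leave the time visible.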



Figure 104 - Reports with different levels of formatting for performance comparison

Formatting and performance

Keep in mind that asking the software to perform sophisticated formatting comes at a cost. In Figure 104, the panel on the top has formatted values and comparison to regulatory limits. Notice that a regulatory limit is displayed for sulfate, and both sulfate values are bolded and underlined because they exceed this limit. Also, for 4/20/1981 the value for iron shows the value and analytical flags, and the value for nitrate shows “<” and the detection limit. This retrieval for 315 records takes 17 seconds. The panel on the bottom displays only the numbers, with no comparison to limits, and takes 1.5 seconds. In data management (as in most everything else) nothing is free.

INTERACTIVE OUTPUT

In the past, nearly all of the focus of data management has been on generating printed reports. As data management software evolves, it is now becoming possible to work interactively with the data in ways that before were either not possible or not time-effective.

Figure 105 shows an example of this type of interactive display. The software is showing the environmental data in a TreeView display. This display, which is similar to the Windows Explorer display, shows sites at the highest level, then stations, samples, and analyses. At each level, the most pertinent data is displayed. This type of display lets the user “drill down” to find a particular result quickly, even in a large database.


Figure 105 - TreeView display of site data

ELECTRONIC DISTRIBUTION OF DATA

Often the person managing the data is not the person using it. The best approach is for everyone that needs the data to have direct access to it through the EDMS. For various reasons, such as cost and location, this is not always possible. There are several ways to overcome this. One is to make the data available more generally, such as through Web access. Another way is through electronic distribution of reports. The Adobe Portable Document Format (PDF) and the free PDF reader are a convenient way to distribute reports. Users create the report that they want in the EDMS, and then print it to the PDF format using Acrobat for distribution. Recipients of the report can use the free Acrobat reader to see it, formatted the way the database user intended.


CHAPTER 20

GRAPHS

There’s an old saying that a picture is worth a thousand words. In many situations, presenting data in a graphical display makes the information much more understandable. A well-designed graph of the data in a table can be many times more informative than the table alone. This chapter and the next two describe and show a variety of graphic displays that can be used to present environmental data. This chapter discusses traditional graphs. Other graphic displays, such as maps and cross sections, are discussed in the following two chapters.

GRAPH OVERVIEW

There’s a good and a bad side to graphs. They can be used to display data in a format conducive to greater understanding. They can also be confusing, misleading, or even dishonest. An excellent book by Tufte (1983) provides a wealth of information on various aspects of graphical data display, including graphs and maps. According to Tufte, graphical displays should:

Show the data

Induce the viewer to think about the substance rather than about methodology, graphic design, the technology of graphic production, or something else

Avoid distorting what the data has to say

Present many numbers in a small space

Make large data sets coherent

Encourage the eye to compare different pieces of data

Reveal the data at several levels of detail, from a broad overview to fine structure

Serve a reasonably clear purpose: description, exploration, tabulation, or decoration

Be closely integrated with the statistical and verbal description of a data set

In addition, Tufte provides the following six principles of graphical integrity:

The representation of numbers, as physically measured on the surface of the graphic itself, should be directly proportional to the numerical quantities expressed

Clear, detailed, and thorough labeling should be used to defeat graphical distortions and ambiguity. Write out explanations of the data on the graphic itself. Label important events in the data

Show data variation, not design variation

In time-series displays of money, deflated and standardized units of monetary measurement are nearly always better than nominal units

The number of information-carrying (variable) dimensions depicted should not exceed the number of dimensions in the data

Graphics must not quote data out of context


Following these two sets of guidelines will greatly increase your chance of creating good graphical displays. Additional general information on graphs can be found in Milne (1992), and information specific to environmental graphing in Sara (1994, pp. 11-19 to 11-28).

GENERAL CONCEPTS

Because graphing software is so accessible and easy to use, there is a tendency to throw together a graph of a bunch of data and be done with it. If you try to follow Tufte’s guidelines above, then clearly there is more to it than that, from making sure the data is amenable to the graphing technique you will be using to confirming at the end that the graph communicates the correct message. If you keep in mind the key concepts of creating a graph, rather than take them for granted, your graphs will be much more effective.

Generally graphs present data with one data element graphed as a function of another.Commonly the independent variable, which is often presented against the X (horizontal) axis, istime, and the dependent variable, presented against the Y (vertical) axis, is the measured value It

is also possible to plot one observed value against another Sometimes the X-axis is called the

abscissa and the Y-axis is called the ordinate.

Data issues

Back in the day when graphs were created by hand, the person creating the graph was forced to look at each data point, because he or she scaled it off and drew it on the graph. With automated programs like Microsoft Excel and Golden Software’s Grapher, it is easy to create a graph without giving it much thought. This can result in a graph that looks great but, in the worst case, is totally meaningless. For example, if you take a data set like the one graphed in Figure 106 and set the scale to logarithmic as discussed below, Grapher will complain if some of the data has a zero value and can’t be graphed, but Excel won’t. Those values may be important, and won’t be displayed in either case, but with Excel you might not even know they are gone.
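The zero-value problem described above can be caught before the data ever reaches the graphing program. The following is a minimal sketch, not the behavior of any particular package; the function name `split_for_log_scale` and the sample values are hypothetical.

```python
def split_for_log_scale(values):
    """Separate values that can be drawn on a logarithmic axis from those that cannot.

    Zero, negative, and missing (None) values have no position on a log scale.
    """
    plottable = [v for v in values if v is not None and v > 0]
    dropped = [v for v in values if v is None or v <= 0]
    return plottable, dropped


# Hypothetical analytical results, including two zero values.
results = [0.0, 12.5, 340.0, 0.0, 1100.0, 4.2]
plottable, dropped = split_for_log_scale(results)
if dropped:
    print(f"Warning: {len(dropped)} of {len(results)} values "
          f"cannot be shown on a log scale: {dropped}")
```

Reporting the dropped values explicitly, rather than silently omitting them, lets the person making the graph decide whether they matter.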

There are a number of other data issues that can trip you up in creating graphs. Chapter 19 discussed the importance of checking units during data retrieval. Non-detects and flagged data must be used carefully. Duplicate data can also be a problem.

A good policy is to take a hard look at the data after it has been retrieved from the EDMS, but before it is graphed. Look at every number, or if there is too much data to do that, sort in various ways to understand the data ranges, relationships between different values, and so on. Time spent doing this will be rewarded by better graphs, ones that you are more likely to be able to trust.

Coordinate systems

Graphing involves taking values and plotting them relative to some coordinate system. For most graphs this is a Cartesian XY system, but other systems, such as polar and radial plots, are possible. Think about which system will work best with your data and the message you are trying to get across, rather than just using the default provided by the software.

Graph scales

The scales of the graph determine the spacing of the points relative to each axis. In the simple case of an X-Y graph of two constituents against each other, the value range for each constituent will be used as the scale for each axis. In the case of a time-sequence graph, one of the axes (usually the horizontal one) is the time or date range, and the other is the value or values.

Figure 106 - Comparison of linear vs logarithmic scales

For the case where the data has a large dynamic range, or where the data is lognormally distributed, a logarithmic scale on one or both axes may be appropriate. A graph with a logarithmic scale on one axis and a linear scale on the other is called a semi-log plot, and one with both axes logarithmic is called a log-log plot. The graph on the right side of Figure 106 shows a log-log plot. The goal is to see the relationship between the two constituents in each sample. The left graph shows the data graphed on a linear scale. Most of the data is clustered in the lower left, and it is difficult to say what the relationship is. The right graph shows a logarithmic scale for both constituents, and it is possible to see that there is a rough correlation between the two, and a sample with a high value in one is likely to have a high value in the other. In fact, it appears that there may be several populations with different linear relationships between the constituents, perhaps representing different sources of the material. This was not at all apparent from the linear graph.
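One way to judge whether a data set has a "large dynamic range" is to count how many powers of ten it spans. The sketch below is only a rule of thumb: the function names and the two-decade threshold are assumptions made for illustration, not a standard from the text.

```python
import math


def decades_spanned(values):
    """Return log10(max/min) of the positive values, i.e. the number of decades covered."""
    positive = [v for v in values if v > 0]
    if len(positive) < 2:
        return 0.0
    return math.log10(max(positive) / min(positive))


def suggest_scale(values, threshold_decades=2.0):
    """Suggest an axis scale; the threshold is an arbitrary, adjustable assumption."""
    if decades_spanned(values) >= threshold_decades:
        return "logarithmic"
    return "linear"


print(suggest_scale([1, 10, 1000]))     # spans three decades
print(suggest_scale([200, 400, 800]))   # spans well under one decade
```

A suggestion like this is a starting point only. The choice of scale should still depend on the message the graph is meant to convey.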

Labels and annotations

There are two basic types of labels and annotations: those associated directly with graph elements, and those not. Examples of the first type are the scale labels and scale titles. Scale labels identify positions along a scale axis. Usually there will be one set of labels per axis, such as the numbers annotating the tic marks and the text label for the axis. Labels not associated with graph elements include the graph title, legends, comments, and so on.

TYPES OF GRAPHS

Because graphics are so useful, people have developed many different types of graphs to best represent their data. This section describes some of the most popular types of graphs, and the following one shows some examples.

Whenever presenting a forecast, give a number and a date, but never both
Rich (1996)

Line graphs – Line graphs are often used to represent data in a series. A grid is drawn, and then one or more series of data are drawn on the grid. Lines are used to connect the points to highlight trends and patterns. Often the horizontal axis (abscissa) is time, and the vertical axis (ordinate) is the value being compared, but this is not required. Line graphs are probably the most common type of technical graph.

Bar graphs – Bar graphs, also called column graphs, are good for displaying increases and decreases in quantity over a period of time. They work best when the amount of data to be displayed is not large. As with line graphs, the horizontal axis is often time.

Area graphs – Area graphs are similar to line graphs, except the areas under the curve(s) are filled.

Stacked graphs – A stacked graph is a variety of bar or (more commonly) area graph where the values are stacked cumulatively rather than each starting at zero.

Scatter plots – A scatter plot is used for displaying two variables for each point against each other. Scatter plots are very popular for technical data.

Box plots – Box plots are special bar graphs that show the minimum, maximum, mean, and lower and upper quartiles for each data group.

Picture graphs – In picture graphs, the data is displayed with symbols rather than lines or bars. These are sometimes used for business presentations, but are not commonly used for displays of technical data.

Pie charts – A pie chart is a type of graph used to display the fractional parts of a whole like slices of a pie, where the size, or more accurately the angular displacement, of each slice is based on the percentage of the whole contributed by each value.

Surface plots – Surface plots are used to show one variable as a function of two others. They are similar to contour displays used on maps, but the two independent variables can be something other than map coordinates.

Rose diagram – A rose diagram is a circular graph of angular data. Angular measurements, such as joint or cross-bed directions, are grouped by an angle range, such as 10° or 30°, and the number of observations in each range is shown as a distance from the center. Before designing a rose diagram, you should examine the variability in the data and set the increments (angle range) to be graphed appropriately. If the increment is too small for the data, then only “noise” is displayed. If too coarse, the real variability is lost. An alternative way of drawing the rose diagram is to start at the outer edge and increase the values toward the center. This often helps to define trends in multi-modal data sets better than the more conventional approach (Mike Wiley, pers. comm., 2002).

Polar plot – A polar plot is also a circular graph of angular data. Values as a function of angle are shown as distances from the center, creating a line graph within a circle.

Maps – It’s important to remember that maps are a type of graph. Because maps have so many special issues to discuss, they will be covered separately in Chapter 22. There are also many opportunities for combining maps with traditional graphs to create visually rich and informative displays.
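The grouping step behind the rose diagram described above can be sketched in a few lines. This is an illustrative sketch only: the function name `rose_bins` and the direction readings are hypothetical, and the bins are assumed to start at 0° and to include their lower edge.

```python
def rose_bins(angles_deg, increment=30):
    """Count angular observations in equal angle ranges starting at 0 degrees."""
    if 360 % increment != 0:
        raise ValueError("increment must divide 360 evenly")
    counts = [0] * (360 // increment)
    for angle in angles_deg:
        # Wrap the angle into [0, 360) and find the bin it falls in.
        index = int((angle % 360) // increment) % len(counts)
        counts[index] += 1
    return counts


# Hypothetical joint directions in degrees; -10 wraps around to 350.
directions = [5, 12, 28, 95, 100, 185, 190, 200, 355, -10]
print(rose_bins(directions, increment=30))
print(rose_bins(directions, increment=10))
```

Running the same data with 30° and 10° increments shows the trade-off noted above: the finer increment scatters ten readings across 36 bins, so most bins hold zero or one observation.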

GRAPH EXAMPLES

The following examples show graphs created by several different programs. Figure 107 shows a number of graphs created with Microsoft Excel. Figure 108 shows some more technical graph types created with Grapher from Golden Software.

The previous examples have used programs outside the EDMS. Figure 110 shows a fairly typical graph of one parameter (sulfate) from two wells plotted as a function of time within an EDMS program. Figure 111 shows a variation on the time sequence graph where data from several years is folded onto one 12-month graph. This was done to help identify seasonality in the data.
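Folding several years of data onto one 12-month axis amounts to grouping observations by calendar month and ignoring the year. The sketch below illustrates that idea with made-up values; the function name `fold_by_month` is hypothetical and does not come from any EDMS product.

```python
from collections import defaultdict
from datetime import date


def fold_by_month(observations):
    """Group (date, value) pairs by calendar month, ignoring the year."""
    by_month = defaultdict(list)
    for sample_date, value in observations:
        by_month[sample_date.month].append(value)
    return {month: by_month[month] for month in sorted(by_month)}


# Hypothetical results from three years of winter and summer sampling.
observations = [
    (date(1990, 1, 15), 4.0), (date(1990, 7, 15), 9.0),
    (date(1991, 1, 20), 5.0), (date(1991, 7, 18), 11.0),
    (date(1992, 1, 12), 3.0), (date(1992, 7, 22), 10.0),
]
folded = fold_by_month(observations)
monthly_mean = {month: sum(vals) / len(vals) for month, vals in folded.items()}
print(monthly_mean)
```

Plotting the folded values against month number (1 to 12) gives the kind of display described for Figure 111, where a consistent difference between months suggests seasonality.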

Figure 107 - Examples of several graph types created with Microsoft Excel (panels: line (time sequence) graph, 3-D bar graph, 3-D bar graph with too much data, 3-D area graph, and stacked area graph, each showing sodium and sulfate)

Figure 108 - Examples of several graph types created with Grapher from Golden Software (panels include a trilinear plot)

Figure 109 - Additional graph examples (panels include a pie chart created with Excel)


Figure 110 - Formatted graph of selected parameter

Figure 111 - A graph of a constituent (blood lead) by month created by an EDMS

Sometimes it is useful to view graph data in its spatial context. Figure 112 shows an example of this type of display. A map with an airphoto backdrop is displayed, along with symbols for the well locations. Time-sequence graphs are shown for five of the monitoring wells, with leader lines to the wells from which the samples were taken. This type of display shows the time sequence data, along with the spatial context of the wells, so inferences can be made about the progression of values over time for different parts of the facility. Graphing in spatial context is of greatest value where the variation is expected to relate to geographic position. For example, in addition to the water quality parameters shown, parameters such as water level elevation and temperature often benefit from being displayed in this manner.

Figure 113 shows an enlarged view of part of Figure 112. The graphs show the value of the constituents of interest, along with the vertical scales of the graph. It also shows horizontal lines for the mean value for bicarbonate, along with lines located three standard deviations above and below the mean, and a line for the regulatory limit. Points that are outside the limit lines are displayed in a different color. These points deserve additional scrutiny to determine if they are erroneous or real.
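The limit lines described for Figure 113 can be sketched as follows. This is an illustration, not the method used by the program that drew the figure. It assumes the mean and sample standard deviation are computed from a baseline period and then applied to later results; all values and the regulatory limit are hypothetical.

```python
from statistics import mean, stdev


def control_limits(baseline, n_sigma=3.0):
    """Return (lower, center, upper) limits from a baseline data set."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation (an assumption)
    return center - n_sigma * spread, center, center + n_sigma * spread


def flag_points(values, lower, upper, regulatory_limit=None):
    """List (index, value, reasons) for points that deserve additional scrutiny."""
    flagged = []
    for index, value in enumerate(values):
        reasons = []
        if value < lower or value > upper:
            reasons.append("outside control limits")
        if regulatory_limit is not None and value > regulatory_limit:
            reasons.append("above regulatory limit")
        if reasons:
            flagged.append((index, value, reasons))
    return flagged


baseline = [200, 210, 195, 205, 190, 200, 208, 192]   # hypothetical baseline results
recent = [203, 198, 260, 150, 215]                    # hypothetical later results
lower, center, upper = control_limits(baseline)
flagged = flag_points(recent, lower, upper, regulatory_limit=250)
for index, value, reasons in flagged:
    print(index, value, ", ".join(reasons))
```

Using a separate baseline matters with small data sets: when a suspect point is included in the calculation it inflates the standard deviation, and it may then fall inside its own three-sigma limits.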


Figure 112 - Graphs displayed with leader lines to their map locations

Figure 113 - Enlarged graph showing control chart limits and outliers

CURVE FITTING

Often a graph, especially a time-sequence graph, will expose a trend in the data. Many graphing programs provide a way to fit a curve to the data, which can help in understanding the trend by smoothing out irregularities and variations in the data.

Figure 114 - Graphs showing trend lines

Curve fitting must be used with caution, however. Figure 114 shows an example of two graphs of the same data set, with trend lines suggesting two very different conclusions. The data set consists of six monthly observations: 10, 25, 30, 20, 25, and 25. The question is whether the data is trending up or down. Fitting a second-order polynomial suggests that the data is trending down. Changing to a third-order polynomial suggests an upward trend. Which is correct? A scarier question is: Which will you use to prove your point?

Because graphing software makes it so easy to use high-order polynomial fitting, it is tempting to use high orders to improve the fit. However, a third-order polynomial is the lowest order that can produce both concave and convex curves on the same plot. This may be the highest order appropriate for many data sets.
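The conflicting trends can be reproduced from the six observations listed above. The sketch below fits both polynomials by ordinary least squares using only the standard library. It assumes the observations fall in months 1 through 6 and reads "trend" as the slope of the fitted curve at the last observation; both are assumptions made for illustration, and a graphing program may use a different fitting method.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit; returns c where y = c[0] + c[1]*x + c[2]*x**2 + ..."""
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x**(i+j)) and b[i] = sum(y * x**i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def slope_at(coeffs, x):
    """Derivative of the fitted polynomial at x."""
    return sum(i * c * x ** (i - 1) for i, c in enumerate(coeffs) if i > 0)


months = [1, 2, 3, 4, 5, 6]            # assumed month numbering
values = [10, 25, 30, 20, 25, 25]      # observations from the text
for degree in (2, 3):
    coeffs = polyfit(months, values, degree)
    direction = "down" if slope_at(coeffs, months[-1]) < 0 else "up"
    print(f"order {degree}: trending {direction} at month {months[-1]}")
```

With only six points, the sign of the end slope flips between the two fits, which is the caution the text is making: the fitted curve reflects the chosen order as much as the data.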

GRAPH THEORY

Graph theory is a topic that might be confused with the theory of creating graphs, but actually covers a different topic. It is discussed here to make the point that graph theory and the theoretical basis for graphing data or functions are different issues. Some of the theoretical issues related to creating graphs, such as data issues, scales, etc., are discussed above. The basic material of graph theory is spatial connectivity or topology. In graph theory, “graph” is used to denote a set of vertices possibly connected by edges, as opposed to graphing data or values. Graphs in graph theory consist of points connected by lines (vertices connected by edges), and then various kinds of studies are performed on these graphs. Unlike geometry, topology ignores spatial issues, and addresses only issues that don’t change when objects are deformed. An example of the type of problem studied by graph theory is the Four-Color Problem (Figure 115), which is the proposition that any map can be colored using four colors in such a way that adjacent regions (those sharing a common boundary segment, not just a point) receive different colors.

Graph theory may have application for environmental projects by analyzing the relationships between different areas of interest, or similar area-based studies.
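To make the graph-theory sense of "graph" concrete, the sketch below represents five adjacent areas as vertices, with an edge wherever two areas share a boundary, and then searches for a coloring that uses at most four colors. The area names and adjacencies are invented, and simple backtracking is used here only as an illustration; it is practical only for small maps.

```python
def four_color(adjacency, colors=("red", "green", "blue", "yellow")):
    """Assign one of four colors to each region so that neighbors differ.

    adjacency maps each region to the set of regions sharing a boundary with it.
    Returns a dict of region -> color, or None if no assignment is found.
    """
    regions = sorted(adjacency)
    assignment = {}

    def assign(index):
        if index == len(regions):
            return True
        region = regions[index]
        for color in colors:
            if all(assignment.get(nb) != color for nb in adjacency[region]):
                assignment[region] = color
                if assign(index + 1):
                    return True
                del assignment[region]  # undo and try the next color
        return False

    return assignment if assign(0) else None


# Hypothetical areas of interest: area C touches all four of the others.
adjacency = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "E"},
    "C": {"A", "B", "D", "E"},
    "D": {"A", "C", "E"},
    "E": {"B", "C", "D"},
}
coloring = four_color(adjacency)
print(coloring)
```

Only the adjacency relationships matter to the result. The shapes, sizes, and coordinates of the areas play no part, which is the sense in which topology ignores spatial issues.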

Figure 115 - Example of the Four-Color Problem in graph theory


CHAPTER 21

CROSS SECTIONS, FENCE DIAGRAMS, AND 3-D DISPLAYS

Environmental data, and geologic data in general, is inherently three dimensional. A number of graphical tools have been developed to assist with visualizing the 3-D configuration and relationships contained in the data. These range from logs through cross sections and fence diagrams to block diagrams.

LITHOLOGIC AND WIRELINE LOGS

Rock or soil samples and geophysical measurements from boreholes and from outcrops make up the basic data for many geologic projects. Displaying this data as a function of depth is the first step in interpretation. Before the advent of personal computers, lithologic logs of samples were prepared by manual drafting onto strip-log paper. Wireline geophysical logs were drawn with analog recorders on special chart paper. Digital wireline logs arrived long before personal computers. They were recorded on tape and plotted with pen plotters attached to mainframe or minicomputers. Now both lithologic and wireline logs, including combinations of both in one display, can be easily created using a computer program on a personal computer.

For drill cuttings and outcrop samples, the plot usually consists of patterns for lithology types along with a text description of the rock, both plotted against depth on the vertical axis. Curves for other factors, either measured or interpreted, may also be included. Measured factors that can be plotted might include grain size, porosity, or oil saturation, while interpretive factors might include depositional energy or diagenetic alteration. Figure 116 shows an example of a typical lithologic log for an environmental project.

Geophysical measurements from boreholes (or less commonly from outcrops) are widely used for determining rock properties, and are also very valuable for stratigraphic correlation. Displays of two or more geophysical curves, such as spontaneous potential (SP) along with resistivity, or gamma ray plotted with neutron density or sonic travel time, are widely used for stratigraphic and structural interpretation of subsurface rocks. Figure 117 shows an example of a small portable geophysical logging device.


Figure 116 - Lithologic log for an environmental project (Courtesy of RockWare)

Figure 117 - Gamma ray logger system (Courtesy of Geotech Environmental Equipment)


Figure 118 - Cross section created from relational data

CROSS SECTIONS

Several lithologic or geophysical logs can be displayed side by side to form a cross section. The use of cross sections on environmental projects is discussed in Sara (1994, pp. 7-17 to 7-21). The manual approach is to tape several logs onto a big sheet of graph paper (cross section paper), with the vertical position based on elevation (structural cross section) or on a stratigraphic horizon (stratigraphic cross section). This type of display is used to interpret the spatial position of rock units or the lateral variation in lithologies. This is particularly useful to assist with correlation of lithologic and stratigraphic units. Contamination values can be added to increase the information content of the cross sections.

The vertical scale can be changed relative to the horizontal scale to adjust the vertical exaggeration for cross sections, block diagrams, and other displays. This is important because many geological features are tabular in shape, and vertical exaggeration is necessary to be able to see the features.
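Vertical exaggeration is commonly expressed as the ratio of the horizontal scale to the vertical scale. The sketch below uses that definition with hypothetical plot scales given in ground meters per plotted centimeter; the function names are illustrative only.

```python
def vertical_exaggeration(horizontal_m_per_cm, vertical_m_per_cm):
    """Ratio of horizontal to vertical plot scale (1.0 means no exaggeration)."""
    return horizontal_m_per_cm / vertical_m_per_cm


def plotted_thickness_cm(thickness_m, vertical_m_per_cm):
    """How thick a layer appears on the plot at the given vertical scale."""
    return thickness_m / vertical_m_per_cm


# A hypothetical section plotted at 100 m per cm horizontally.
print(vertical_exaggeration(100.0, 100.0))   # no exaggeration
print(vertical_exaggeration(100.0, 5.0))     # vertical scale changed to 5 m per cm
print(plotted_thickness_cm(2.0, 100.0))      # a 2 m layer without exaggeration
print(plotted_thickness_cm(2.0, 5.0))        # the same layer with exaggeration
```

In this example a 2 m thick layer would be a 0.02 cm line with no exaggeration, which is too thin to see, but plots 0.4 cm thick at 20 times exaggeration.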

Computers can be used to create cross section displays once the basic data on lithology, chemistry, or log values has been entered. The user specifies which logs are to be used, how each log is to be displayed, and other information such as how the cross section is to be hung (structural or stratigraphic datum) and how the logs are to be labeled. Most cross section programs allow correlation lines to be drawn from log to log to display stratigraphic and structural relationships, and some allow the user to interactively pick formation tops from the logs for entry into a database.

Figure 118 shows a cross section display of the concentration of uranium and radium in soil. It was generated to show the part of the site that will need to be excavated. It includes a combination of laboratory data from soil samples along with downhole data from gamma logs. Uranium values are to the left of each log, and radium values to the right. The shaded rectangles represent soil samples, and the continuous lines show the downhole gamma surveys. Both the boxes and the lines are truncated at the excavation cutoff. The elevations of geologic units, as well as the ground surface and water table, have been added to aid in interpretation. A series of parallel cross sections of this type can be used to calculate the volume to be excavated.
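One common way to estimate a volume from a series of parallel cross sections is the average end-area method, sketched below. The text does not say which method was used for this site, so treat this as one possible approach; the areas and spacings are hypothetical.

```python
def volume_from_sections(areas, spacings):
    """Average end-area volume estimate.

    areas    -- excavation area on each parallel cross section (e.g. square meters)
    spacings -- distance between consecutive sections (e.g. meters)
    """
    if len(spacings) != len(areas) - 1:
        raise ValueError("need exactly one spacing between each pair of sections")
    return sum((a1 + a2) / 2.0 * d
               for a1, a2, d in zip(areas, areas[1:], spacings))


# Hypothetical excavation areas on three sections spaced 25 m apart.
areas = [120.0, 150.0, 90.0]
spacings = [25.0, 25.0]
volume = volume_from_sections(areas, spacings)
print(volume)   # cubic meters, given the units assumed above
```

The estimate improves as the sections are spaced more closely relative to how quickly the excavation area changes between them.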


Figure 119 - Electric log cross section

Figure 120 - Map and cross section views of model results

Figure 119 shows a cross section of electric log data. This type of display is very useful for performing subsurface correlations.

Figure 120 shows two graphic displays from the same project, one a map and one a cross section. The values from the borings were used to create a 3-D geostatistical block model, which was then displayed along with the map and cross section.

Cross sections can also be used to demonstrate changes in water chemistry or contaminant distribution over time. If the same wells were resampled over time, then cross sections of each sampling period will show the changes. This adds a fourth dimension to the information obtained.

PROFILES

A type of display that is similar to a cross section is a profile, which is like a slice through a surface. The surface is usually a grid created by a contouring program. The profile represents the values of that surface along the line of the profile. Sometimes profiles and log cross sections are combined to show what the surface does between control points.
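Extracting a profile from a gridded surface can be sketched as sampling the grid at evenly spaced points along the profile line. The example below uses bilinear interpolation between grid nodes, which is one common choice and not necessarily what a given contouring program does. Coordinates are in grid-cell units, and the small grid is hypothetical.

```python
def bilinear(grid, x, y):
    """Interpolate the surface at (x, y); grid[row][col] with row = y and col = x.

    Assumes 0 <= x <= number of columns - 1 and 0 <= y <= number of rows - 1.
    """
    col = min(int(x), len(grid[0]) - 2)
    row = min(int(y), len(grid) - 2)
    fx, fy = x - col, y - row
    near = grid[row][col] * (1 - fx) + grid[row][col + 1] * fx
    far = grid[row + 1][col] * (1 - fx) + grid[row + 1][col + 1] * fx
    return near * (1 - fy) + far * fy


def profile(grid, start, end, n_points):
    """Return (distance along line, surface value) pairs; n_points must be at least 2."""
    (x0, y0), (x1, y1) = start, end
    length = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
    samples = []
    for i in range(n_points):
        t = i / (n_points - 1)
        x, y = x0 + t * (x1 - x0), y0 + t * (y1 - y0)
        samples.append((t * length, bilinear(grid, x, y)))
    return samples


# Hypothetical 3 x 3 grid of a sloping surface (value = x + 10 * y).
grid = [
    [0.0, 1.0, 2.0],
    [10.0, 11.0, 12.0],
    [20.0, 21.0, 22.0],
]
samples = profile(grid, start=(0, 0), end=(2, 2), n_points=5)
for distance, value in samples:
    print(round(distance, 3), value)
```

Plotting the value against the distance along the line gives the profile. Log cross sections could then be positioned along the same distance axis.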


Figure 121 - Fence diagram (Courtesy of RockWare, Inc.)

FENCE DIAGRAMS AND STICK DISPLAYS

Extending from a one-dimensional lithologic or geophysical log or a two-dimensional cross section to a three-dimensional display is very difficult with hand drafting techniques (see Tearpock and Bischke, 1991, pp. 182-194), but can be easily done with a computer. The user first specifies which wells are to be used, what data elements are to be displayed, and how they are to be shown. The software then uses the X-Y coordinates of the well locations to project them onto a three-dimensional perspective view. The logs can be shown with no connections between them (stick diagram) or the formations can be connected from well to well (fence diagram). This type of display can show three-dimensional relationships that are difficult to discern using other methods.

Figure 121 shows an example of a fence diagram. In this example, the geology from the borings, which are shown as curved lines, has been interpolated across the site, and then profiles drawn at regularly spaced intervals. Unless the wells are regularly spaced, which they usually aren’t, either the lines of the fence diagram must be crooked, or the lines drawn straight and the data interpolated at the intersections. This requires considerable confidence in your understanding of the data and the spatial relationships at the site.


Figure 122 - Block diagram created automatically from relational data

BLOCK DIAGRAMS AND 3-D DISPLAYS

Another type of three-dimensional display is the block diagram. Block diagrams can be made from two- or three-dimensional grid models of a particular volume of rock. Some block diagram software allows certain stratigraphic or lithologic units to be made transparent so the user can see into the block diagram. Generating block diagrams for large grid models is computationally intensive and requires a powerful computer to produce results in a reasonable amount of time. Fortunately, current high-end personal computers have the power to do this for all but the largest projects. Figure 122 shows an example of a block diagram created from data extracted from an EDMS. In Figure 123 the low concentration material has been removed (made transparent) to show only the higher concentration material. Also, logs for the boreholes have been added. Figures 124 and 125 are more complicated figures with 3-D surface features and a contaminant plume. Figure 125 adds the depth to bedrock.

Although block diagrams such as those shown from Figures 122 to 125 are quite computationally intensive and can take several minutes or more to create, the benefits can far outweigh the inconvenience. Block diagrams with this kind of detail are very difficult to produce manually. Accurate rendering of 3-D objects that are faithful to the data, as shown in these figures, is virtually impossible without using a computer.

Because many people, especially non-technical people, find it difficult or impossible to visualize objects in three dimensions, block and 3-D diagrams can be a powerful tool in illustrating and proving your case. Providing an understanding of spatial relationships for regulators, attorneys, and environmental activists can be greatly aided by these displays.


Figure 123 - Deviated boreholes and plume display (Courtesy of RockWare, Inc.)

Figure 124 - 3-D facility display created with a mapping program (Courtesy of RockWare, Inc.)


Figure 125 - 3-D display of contamination under a refinery created with a GIS (Courtesy of Dan Heidenreich, HSI Geotrans)


CHAPTER 22

MAPPING AND GIS

Most environmental data is inherently spatial. That means that the observation was taken at a specific location in map coordinates (X and Y) and depth (or elevation). Often seeing the data in its spatial context imparts more information content than seeing it as a text-only presentation. Using computerized mapping to help understand this spatial context makes sense for many projects. This chapter covers issues related to computerized mapping, including software for creating maps, displaying your data, contouring and modeling, and specialized map displays.

MAPPING CONCEPTS

Since earth scientists often spend a large amount of their time working with maps, it is logical to consider computerization of the map generation and manipulation process. Computerized mapping covers a wide variety of activities, and programs are available to help with most of them. There are some advantages and disadvantages to consider, however.

Advantages and disadvantages of computerized mapping

Before making a commitment to computerized mapping, a thorough appraisal should be made of the timesaving and other benefits that will be provided. The problem is similar to the one encountered with computer-aided design (CAD) software. Making the first map with the computer will take as much or more time than doing it by hand. This is especially true if the learning curve for the mapping software is taken into consideration. The time savings will come later, when the map needs to be redone or changes need to be made. Then the computer eliminates re-drafting, which can improve accuracy as well as improve speed in generating the second map.

Another advantage of the computerized mapping process is that it allows the earth scientist to make maps and diagrams that either could not or would not have been made by hand. Good examples are trend surface and residual maps, other derived maps, block diagrams, and maps of data that may have previously been considered unimportant. Other examples of maps more likely to be made are multiple maps of different time periods. Having the computer generate the maps makes it more likely that these maps will be made. Some of these experimental maps will not be useful and will be thrown away. Others may provide surprising insight into the data and the geology behind it, and could prove tremendously valuable. Computer-generated maps are, for the most part, unbiased, which can be of value in many situations. Finally, computerized mapping can greatly improve the ease and accuracy of volumetric calculations.


Often the decision on whether computerized mapping is appropriate for a project depends on the number of maps to be made and the amount of data to be mapped. For small projects, hand mapping is often better. For large projects with thousands (or millions) of data points, the computer may be the only way to do it.

Types of maps

Since maps are so widely used, there are hundreds of different kinds of maps. A few of those types of maps will be discussed here. In some cases, the final map display is a combination of several map types.

Base maps – The most fundamental type of map is the base map. Whether derived from a topographic map or commercial or proprietary data, a base map usually must be constructed before any other type of map can be made. Geographic, cultural, and sample location data must all be collected and related together in the right spatial positions. The importance of this step must not be underestimated, and this subject is discussed in more detail below.

Posted data – The next step after creating a base map is often to post data, creating a posted data map. In many cases, one of the primary goals of organizing a database is to create this type of map. Retrieving data and posting it on maps is described in a later section.

Bubble maps – A bubble map, also called a dot map or pin map, expands on data posting by using the symbol on the map to represent the value being displayed. The symbol’s size, color, or shape can reflect the value being featured. Figures 129 and 130 later in the chapter show examples of this type of map.

Thematic maps – Thematic maps use the display of various map elements to communicate data, usually numeric values. For example, each county on a regional map could be color coded to represent the value of some variable, such as economic or environmental parameters. Sometimes this is done in perspective view with the polygon of each county extruded to a height that represents the value.

Contour maps – Contour maps use contour lines, color fill, and other graphical displays to communicate numeric information, usually of a continuous or nearly continuous surface (such as one broken by faults). The many issues related to creating and displaying contour maps are discussed below.

Surface geology – It is often useful to make geologic maps of surface or subsurface geology and/or other features. As the use of computers for image processing and analysis increases, more surface geology projects are being done on the computer using airphotos and satellite photos. Software exists that allows the user to move interpreted information from images onto line drawings (maps) for output to plotters and for integration with other types of maps.

Airphotos and satellite images – Aerial imagery, whether taken from an airplane or satellite, can be of great value in environmental mapping. Often airphotos are available for various time periods, which can assist with documenting the history of the site. In order to be used for mapping, images must be ortho-rectified to remove any spatial distortion caused by the imaging process, and to allow them to be registered to a particular map geometry. Once this has been done, both types of images can be used for map backdrops to illustrate a variety of points about site data.
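For the bubble maps described above, one design question is how symbol size should relate to the value. A common convention, assumed here rather than taken from the text, is to make the symbol area proportional to the value, so the radius scales with the square root. The function name and numbers below are hypothetical.

```python
import math


def bubble_radius(value, max_value, max_radius=10.0):
    """Radius for a map symbol whose area is proportional to the value.

    Non-positive values get a radius of zero; a real map would probably
    give them a separate symbol (for example, for non-detects).
    """
    if value <= 0 or max_value <= 0:
        return 0.0
    return max_radius * math.sqrt(value / max_value)


# Hypothetical concentrations posted at three wells.
concentrations = {"MW-1": 25.0, "MW-2": 100.0, "MW-3": 0.0}
largest = max(concentrations.values())
for well, value in concentrations.items():
    print(well, bubble_radius(value, largest))
```

Scaling the radius directly with the value would make a sample with four times the concentration cover sixteen times the area, which overstates the difference.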

Base map creation

If the EDMS will include a map component, then base map information must be loaded into either the database or the geographic information system (GIS) before a map can be displayed. Then the analytical and other information can be overlaid on the base map. Loading the base map data involves two steps. The first is to create a base map. Often this is done using a computer-aided drafting program such as AutoCAD or the digitizing capabilities of the GIS. This base map should provide sufficient locational information as a reference for the data being displayed while keeping

Title	Relational Management and Display of Site Environmental Data - Part 5
Institution	CRC Press LLC
Subject	Site Environmental Data Management
Type	Chapter
Year of publication	2002
