4 Requirements Analysis and Conceptual Data Modeling
This chapter shows how the ER and UML approaches can be applied to the database life cycle, particularly in steps I through II(b) (as defined in Section 1.2), which include the requirements analysis and conceptual data modeling stages of logical database design. The example introduced in Chapter 2 is used again to illustrate the ER modeling principles developed in this chapter.
4.1 Introduction
Logical database design is accomplished with a variety of approaches, including the top-down, bottom-up, and combined methodologies. The traditional approach, particularly for relational databases, has been a low-level, bottom-up activity, synthesizing individual data elements into normalized tables after carefully analyzing the data element interdependencies defined during the requirements analysis. Although the traditional process has been somewhat successful for small- to medium-sized databases, when used for large databases its complexity can be overwhelming to the point where practicing designers do not bother to use it with any regularity. In practice, a combination of the top-down and bottom-up approaches is used; in most cases, tables can be defined directly from the requirements analysis.
The conceptual data model has been most successful as a tool for communication between the designer and the end user during the […] data elements, because data elements usually represent the attributes. Therefore, using entities as an abstraction for data elements and focusing on the relationships between entities greatly reduces the number of objects under consideration and simplifies the analysis. Though it is still necessary to represent data elements by attributes of entities at the conceptual level, their dependencies are normally confined to the other attributes within the entity or, in some cases, to attributes associated with other entities with a direct relationship to their entity.
The major interattribute dependencies that occur in data models are the dependencies between the entity keys, the unique identifiers of different entities that are captured in the conceptual data modeling process. Special cases, such as dependencies among data elements of unrelated entities, can be handled when they are identified in the ensuing data analysis.
The logical database design approach defined here uses both the conceptual data model and the relational model in successive stages. It benefits from the simplicity and ease of use of the conceptual data model and the structure and associated formalism of the relational model. To facilitate this approach, it is necessary to build a framework for transforming the variety of conceptual data model constructs into tables that are already normalized or that can be normalized with a minimum of transformation. The beauty of this type of transformation is that it results in normalized or nearly normalized SQL tables from the start; frequently, further normalization is not necessary.
Before we do this, however, we need to first define the major steps of the relational logical design methodology in the context of the database life cycle.
4.2 Requirements Analysis
Step I, requirements analysis, is an extremely important step in the database life cycle and is typically the most labor intensive. The database designer must interview the end user population and determine exactly what the database is to be used for and what it must contain. The basic objectives of requirements analysis are:
• To delineate the data requirements of the enterprise in terms of basic data elements.
• To describe the information about the data elements and the relationships among them needed to model these data requirements.
• To determine the types of transactions that are intended to be executed on the database and the interaction between the transactions and the data elements.
• To define any performance, integrity, security, or administrative constraints that must be imposed on the resulting database.
• To specify any design and implementation constraints, such as specific technologies, hardware and software, programming languages, policies, standards, or external interfaces.
• To thoroughly document all of the preceding in a detailed requirements specification. The data elements can also be defined in a data dictionary system, often provided as an integral part of the database management system.
The conceptual data model helps designers accurately capture the real data requirements because it requires them to focus on semantic detail in the data relationships, which is greater than the detail that would be provided by functional dependencies (FDs) alone. The semantics of the ER model, for instance, allow for direct transformations of entities and relationships to at least first normal form (1NF) tables. They also provide clear guidelines for integrity constraints. In addition, abstraction techniques such as generalization provide useful tools for integrating end user views to define a global conceptual schema.
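As a rough sketch of such a direct transformation, suppose a City entity is described by a country and a population, and a Project entity has its headquarters located in exactly one city. The table and column names, the one-to-many reading of the relationship, the made-up sample row, and the use of Python's built-in sqlite3 module are all assumptions chosen for illustration; they are not prescribed by the methodology. Each entity becomes a table with atomic column values (1NF), and the relationship becomes a foreign key that the database can enforce as an integrity constraint.

```python
import sqlite3


def build_schema() -> sqlite3.Connection:
    """Create an in-memory database holding the two illustrative tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite checks foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        -- Entity City: one row per city, every column holds a single value.
        CREATE TABLE city (
            city_name  TEXT PRIMARY KEY,
            country    TEXT NOT NULL,
            population INTEGER
        );
        -- Entity Project: the "headquarters located in" relationship
        -- (assumed one-to-many) is carried by a foreign key to city.
        CREATE TABLE project (
            project_name TEXT PRIMARY KEY,
            hq_city      TEXT NOT NULL REFERENCES city (city_name)
        );
        """
    )
    return conn


def try_insert_project(conn: sqlite3.Connection, project_name: str, hq_city: str) -> bool:
    """Insert a project; return False if an integrity constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO project VALUES (?, ?)", (project_name, hq_city)
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_schema()
with conn:
    # Made-up sample values, not real data.
    conn.execute(
        "INSERT INTO city VALUES (?, ?, ?)", ("Exampleville", "Examplestan", 1000)
    )

print(try_insert_project(conn, "Alpha", "Exampleville"))  # city exists
print(try_insert_project(conn, "Beta", "Nowhere"))        # city is unknown
```

In this sketch the second insert is refused because no matching city row exists, which is one way the relationship semantics captured in the conceptual model can carry over into an enforceable constraint.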
4.3 Conceptual Data Modeling
Let us now look more closely at the basic data elements and relationships that should be defined during requirements analysis and conceptual design. These two life cycle steps are often done simultaneously.
Consider the substeps in step II(a), conceptual data modeling, using the ER model:
• Classify entities and attributes (classify classes and attributes in UML).
• Identify the generalization hierarchies (for both the ER model and UML).
4.3.1 Classify Entities and Attributes
Though it is easy to define entity, attribute, and relationship constructs, it is not as easy to distinguish their roles in modeling the database. What makes a data element an entity, an attribute, or even a relationship? For example, project headquarters are located in cities. Should “city” be an entity or an attribute? A vita is kept for each employee. Is “vita” an entity or a relationship?
The following guidelines for classifying entities and attributes will help the designer’s thoughts converge to a normalized relational database design:
• Entities should contain descriptive information.
• Multivalued attributes should be classified as entities.
• Attributes should be attached to the entities they most directly describe.
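The first guideline can be read as a small decision rule, sketched here against the "city" question raised above. The `DataElement` class, its field names, and the `classify` function are hypothetical helpers invented for this illustration. Treating an element that has relationships of its own as an entity is an interpretation of the guideline, not something it states outright.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DataElement:
    """A candidate data element found during requirements analysis."""
    name: str
    # Descriptive information kept about the element, e.g. ("country", "population").
    descriptive_info: Tuple[str, ...] = ()
    # True if the element takes part in relationships of its own.
    has_relationships: bool = False
    # True if values must be drawn from a fixed set of valid instances.
    restricted_to_valid_set: bool = False


def classify(element: DataElement) -> str:
    """Return "entity" or "attribute" for a candidate data element."""
    if element.descriptive_info:
        return "entity"      # there is descriptive information about it
    if element.restricted_to_valid_set:
        return "entity"      # identity is constrained by set membership
    if element.has_relationships:
        return "entity"      # assumption: relationships imply an entity
    return "attribute"       # only an identifier is needed


city_name_only = DataElement("city")
city_described = DataElement("city", descriptive_info=("country", "population"))
state = DataElement("state", restricted_to_valid_set=True)

for element in (city_name_only, city_described, state):
    print(element.name, "->", classify(element))
```

The same element name can land on either side of the rule: "city" is an attribute when only its name is needed and an entity once descriptive information is recorded about it.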
Now we examine each guideline in turn.
Entity Contents
Entities should contain descriptive information. If there is descriptive information about a data element, the data element should be classified as an entity. If a data element requires only an identifier and does not have relationships, it should be classified as an attribute. With “city,” for example, if there is some descriptive information such as “country” and “population” for cities, then “city” should be classified as an entity. If only the city name is needed to identify a city, then “city” should be classified as an attribute associated with some entity, such as Project. The exception to this rule is that if the identity of the value needs to be constrained by set membership, you should create it as an entity. For example, “State” is much the same as city, but you probably want to have a State entity that contains all the valid State instances. Examples of other data elements in the real world that are typically classified as entities