conceptual data model—An organization of data that describes the relationships among the primitive data elements.. data model—An organization of data that describes the relationships amo
Trang 1X and A are either simple or composite attributes (data items), X must be a superkey in that table This is a strong form of 3NF and
is the basis for most practical normalization methodologies
candidate key—Any subset of the attributes (data items) in a superkey
that is also a superkey and is not reducible to another superkey
CASE tool—Computer-aided software engineering tool, a software
design tool to assist in the logical design of large or complex data-bases examples include ERwin Data Modeler and Rational Rose using UML
class—A concept in a real-world system, represented by a noun in UML;
similar to an entity in the ER model
class diagram (UML)—A conceptual data model; a model of the static
relationships between data elements of a system (similar to an ER diagram)
completeness constraint—Double-line symbol connecting a
super-type entity with the subsuper-types to designate that the listed subsuper-type entities represent a complete set of possible subtypes
composition—A relationship between one class and a group of other
classes in UML; the class at the diamond (aggregate) end of the relationship is composed of the class(es) at the small (component) end; similar to aggregation in the ER model
conceptual data model—An organization of data that describes the
relationships among the primitive data elements For example, in the ER model, it is a diagram of the entities, their relationships, and their attributes
connectivity of a relationship—A constraint on the count of the
number of associated entity occurrences in a relationship, either one or many
data item—The basic component of a data record in a file or database
table; the smallest unit of information that has meaning in the real world Examples include customer lastname, address, and identification number
data model—An organization of data that describes the relationships
among primitive and composite data elements
data warehouse—A large repository of historical data that can be
inte-grated for decision support
database—A collection of interrelated stored data that serves the needs
of multiple users; a collection of tables in the relational model
Trang 2database administrator (DBA)—The person in a software
organiza-tion who is in charge of designing, creating, and maintaining the databases of an enterprise The DBA makes use of a variety of soft-ware tools provided by a DBMS
database life cycle—An enumeration and definition of the basic steps
in the requirements analysis, design, creation, and maintenance
of a database as it evolves over time
database management system (DBMS)—A generalized software
system for storing and manipulating databases Examples include Oracle, IBM’s DB2, Microsoft SQL Server, or Access
data mining—A way of extracting knowledge from a database by
searching for correlations in the data and presenting promising hypotheses to the user for analysis and consideration
DBA—See database administrator.
degree of a relationship—The number of entities associated in the
relationship: recursive binary (1 entity), binary (2 entities), ternary
(3 entities), n-ary (n entities).
denormalization—The consolidation of database tables to increase
performance in data retrieval (query), despite the potential loss of data integrity Decisions on when to denormalize tables are based
on cost/benefit analysis by the DBA
deployment diagram (UML)—Shows the physical nodes on which a
system executes This is more closely associated with physical database design
dimension table—The smaller tables used in a data warehouse to
denote the attributes of a particular dimension, such as time, loca-tion, customer characteristics, product characteristics, etc
disjointness constraint (d)—A symbol in an ER diagram to designate
that the lower-level entities in a generalization relationship have nonoverlapping (disjoint) occurrences If the occurrences overlap, then use the designation (o) in the ER diagram
entity—A data object that represents a person, place, thing, or event if
informational interest; it corresponds to a record in a file when stored For example, you could define employee, customer, project, team, and department as entities
entity cluster—The result of a grouping operation on a collection of
entities and relationships in an ER model to form a higher-level abstraction, which can be used to more easily keep track of the major components of a large-scale global schema
Trang 3entity instance (or occurrence)—A particular occurrence of an
entity For example, an instance of the entity actor would be Johnny Depp
entity-relationship (ER) diagram—A diagram (or graph) of entities
and their relationships, and possibly the attributes of those enti-ties
entity-relationship (ER) model—A conceptual data model
involv-ing entities, relationships among entities, and attributes of those entities
exclusion constraint—A symbol (+) between two relationships in the
ER model with a common entity that implies that either one rela-tionship must hold at a given point in time, or the other must hold, but not both
existence dependency—A dependency between two entities such that
one is dependent upon the other for its existence, and cannot exist alone For example, an employee work-history entity cannot exist without the corresponding employee entity Also refers to the connectivity between two entities as being mandatory or optional
fact table—The dominating table in a data warehouse and its star
schema, containing dimension attributes and data measures at the individual data level
fifth normal form (5NF)—A table is in fifth normal form (5NF) if and
only if there are no possible lossless decompositions into any sub-set of tables; in other words, if there is no possible lossless decom-position, then the table is in 5NF (see Section 6.5)
file—A collection of records of the same type For example, an
employee file is a collection of employee records
first normal form (1NF)—A table is in first normal form (1NF) if and
only if there are no repeating columns of data taken from the same domain and having the same meaning
foreign key—Any attribute in a SQL table (key or nonkey) that is taken
from the same domain of values as the primary key in another SQL table and can be used to join the two tables (without loss of data integrity) as part of a SQL query
fourth normal form (4NF)—A table is in fourth normal form (4NF) if
and only if it is at least in BCNF and if whenever there exists a nontrivial multivalued dependency of the form X->>Y, then X must be a superkey in the table
Trang 4functional dependency (FD)—The property of one or more
attributes (data items) that uniquely determines the value of one
or more other attributes (data items) Given a table R, a set of attributes B is functionally dependent on another set of attributes
A if, at each instant of time, each A value is associated with only one B value
generalization—A special type of abstraction relationship that
speci-fies that several types of entities with certain common attributes can be generalized (or abstractly defined) with a higher-level entity type, a supertype entity; an “is-a” type relationship For example, employee is a generalization of engineer, manager, and administrative assistant, based on the common attribute job-title
A tool often used to make view integration possible
global schema—A conceptual data model that shows all the data and
their relationships in the context of an entire database
key—A generic term for a set of one or more attributes (data items) that,
taken collectively, enables one to identify uniquely an entity or a record in a SQL table; a superkey
logical design—The steps in the database life cycle involved with the
design of the conceptual data model (schema), schema integra-tion, transformation to SQL tables, and table normalization; the design of a database in terms of how the data is related, but with-out regard to how it will be stored
lossless decomposition—A decomposition of a SQL table into two or
more smaller tables is lossless if and only if the cycle of table decomposition (normalization) and the recomposition (joining the tables back through common attributes) can be done without loss of data integrity
mandatory existence—A connectivity between two entities that has
a lower bound of one One example is the “works-in” relationship between an employee and a department; every department has at least one employee at any given time Note: if this is not true, then the existence is optional
multiplicity—In UML, the multiplicity of a class is an integer that
indicates how many instances of that class are allowed to exist
multivalued dependency (MVD)—The property of a pair of rows in
a SQL table such that if the two rows contain duplicate values of attribute(s) X, then there is also a pair of rows obtained by inter-changing the values of Y in the original pair This is a multivalued
Trang 5dependency of X->>Y For example, if two rows have the attribute values ABC and ADE, where X=A and Y has the values C and E, then the rows ABE and ADC must also exist for an MVD to occur
A trivial MVD occurs when Y is a subset of X or X union Y is the entire set of attributes in the table
normalization—The process of breaking up a table into smaller tables
to eliminate problems with unwanted loss of data (the egregious side effects of losing data integrity) from the deletion of records and inefficiencies associated with multiple data updates
online analytical processing (OLAP)—A query service that overlays
a data warehouse by creating and maintaining a set of summary views (automatic summary tables, or ASTs) to enable quick access
to summary data
optional existence—A connectivity between two entities that has a
lower bound of zero For example, for the “occupies” relationship between an employee and an office, there may exist some offices that are not currently occupied
package—In UML, a package is a graphical mechanism used to
orga-nize classes into groups for better readability
physical design—The step in the database life cycle involved with the
physical structure of the data; that is, how it will be stored, retrieved, and updated efficiently In particular, it is concerned with issues of table indexing and data clustering on secondary storage devises (disk)
primary key—A key that is selected from among the candidate keys
for a SQL table to be used to create an index for that table
qualified association—In UML, an association between classes may
have constraints specified in the class diagram
record—A group of data items treated as a unit by an application; a row
in a database table
referential integrity—A constraint in a SQL database that requires,
for every foreign key instance that exists in a table, that the row (and thus the primary key instance) of the parent table associated with that foreign key instance must also exist in the database
reflexive association—In UML, a reflexive association relates a class
to itself
relationship—A real-world association among one or more entities.
For example, “purchased” could be a relationship between cus-tomer and product