
DATA MODELING FUNDAMENTALS (P16)


data modeling tools, analysis and design tools, and tools for documenting and testing applications

Circular Structure: A data structure consisting of three or more entity types forming cyclical relationships where the first is related to the second, the second to the third, and so on, and finally the last related back to the first. In a good data model, circular structures are resolved.

Composite Key: Primary key made up of more than one attribute.

Concatenated Key: Same as Composite Key.

Conceptual Completeness: Conceptual completeness of a data model implies that it is a complete representation of the information requirements of the organization.

Conceptual Correctness: Conceptual correctness of a data model implies that it is a true replica of the information requirements of the organization.

Conceptual Data Model: A generic data model capturing the true meaning of the information requirements of an organization. Does not conform to the conventions of any class of database systems such as hierarchical, network, relational, and so on.

Conceptual Entity Type: Set representing the type of the objects, not the physical objects themselves.

Data Dictionary: Repository holding the definitions of the data structures in a database. In a relational database, the data dictionary contains the definitions of all the tables, columns, and so on.
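As an illustration of the Data Dictionary entry, the sketch below reads table and column definitions back from a database catalog. It assumes SQLite through Python's standard `sqlite3` module; the `customer` and `invoice` tables and the helper names are hypothetical and do not come from the book.

```python
import sqlite3

# In-memory database with two illustrative tables (names are assumptions).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("CREATE TABLE invoice (invoice_no INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)")


def table_names(connection):
    """List the tables recorded in SQLite's catalog table, sqlite_master."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


def column_definitions(connection, table):
    """Return (column name, declared type) pairs for one table.

    PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
    The table name is interpolated directly, so pass trusted names only.
    """
    rows = connection.execute(f"PRAGMA table_info({table})").fetchall()
    return [(row[1], row[2]) for row in rows]


print(table_names(conn))
print(column_definitions(conn, "invoice"))
```

Other database systems expose their dictionaries differently, so this shows the idea rather than a portable interface.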

Data Integrity: Accuracy and consistency of the data stored in the organization’s database system.

Data Manipulation: Operations for altering data in the database. Data manipulation includes retrieval, addition, update, and deletion of data.

Data Mining: Knowledge discovery process. Data mining algorithms uncover hidden relationships and patterns from a given set of data on which they operate. Knowledge discovery is automatic, not through deliberate search and analysis by analysts.

Data Model: Representation of the real-world information requirements that gets implemented in a computer system. A data model provides a method and means for describing real-world information by using specific notations and conventions.

Data Repository: Storage of the organization’s data in databases. Stores all data values that are part of the databases.

Data View: See User View.

Data Warehouse: A specialized database having a collection of transformed and integrated data, stored for the purpose of providing strategic information to the organization.

Database: Repository where an ordered, integrated, and related collection of the organization’s data is stored for the purpose of computer applications and information sharing.

Database Administration: Responsibility for the technical aspects of the organization’s database. Includes the physical design and handling of the technical details such as database security, performance, day-to-day maintenance, backup, and recovery. Database administration is more technical than managerial.

Database Administrator (DBA): Specially trained technical person performing the database administration functions in an organization.

Database Practitioners: Includes the set of IT professionals such as analysts, data modelers, designers, programmers, and database administrators who design, build, deploy, and maintain database systems.

DBMS: Database Management System. Software system to store, access, maintain, manage, and safeguard the data in databases.

DDLC: Database Development Life Cycle. A complete process from beginning to end, with distinct phases for defining information requirements, creating the data model, designing the database, implementing the database, and maintaining it thereafter.

Decomposition of Relations: Splitting of relations or tables into smaller relations for the purpose of normalizing them.

Degree: The number of entity types or object sets that participate in a relationship. For a binary relationship the degree is 2.

Dimension Entity Type: In a STAR schema, a dimension entity type represents a business dimension such as customer or product along which metrics like sales are analyzed.

DKNF: Domain Key Normal Form. This is the ultimate goal in transforming a relation into the highest normal form. A relation is in DKNF if it represents one topic and all of its business rules can be expressed through domain constraints and key relationships.

Domain: The set of all permissible data values and data types for an attribute of an entity type.

DSS: Decision Support System. Application that enables users to make strategic decisions. Decision support systems are driven by specialized databases.

End-Users: See Users.

Entity: A real-world “thing” of interest to an organization.

Entity Instance: A single occurrence of an entity type. For example, a single invoice is an instance of the entity type called INVOICE.

Entity Integrity: A rule or constraint to ensure the correctness of an entity type or relational table.

ERD: Entity-Relationship Diagram. A graphical representation of entities and their relationships in the Entity-Relationship data modeling technique.

Entity Set: The collection of all entity instances of a particular type of entity.

Entity Type: Refers to the type of entity occurrences in an entity set. For example, all customers of an organization form the CUSTOMER entity type.

E-R Data Modeling: Design technique for creating an entity-relationship diagram from the information requirements.

Evolutionary Modeling: Data modeling as promoted by the Agile Software Development movement. This is a type of iterative modeling methodology where the model evolves in “creation, feedback, revision” cycles.

External Data Model: Definition of the data structures in a database that are of interest to various user groups in an organization. It is the way users view the database from outside.

Fact Entity Type: In a STAR schema, a fact entity type represents the metrics such as sales that are analyzed along business dimensions such as customer or product.

Feasibility Study: One of the earlier phases in DDLC, conducting a study of the readiness of an organization and the technological, economic, and operational feasibility of a database system for the organization.

Fifth Normal Form (5NF): A relation that is already in the fourth normal form and without any join dependencies.

First Normal Form (1NF): A relation that has no repeating groups of values for a set of attributes in a single row.

Foreign Key: An attribute in a relational table used for establishing a direct relationship with another table, known as the parent table. The values of the foreign key attribute are drawn from the primary key values of the parent table.

Fourth Normal Form (4NF): A relation that is already in the third normal form and without any multivalued dependencies.

Functional Dependency: The value of an attribute B in a relation depending on the value of another attribute A. For every instance of attribute A, its value uniquely determines the value of attribute B in the relation.
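The Functional Dependency entry can be checked mechanically on sample data. The sketch below is a minimal illustration: the function name `holds` and the `employees` rows are assumptions made for this example. Passing the check on a sample only shows the dependency is not contradicted; it does not prove the business rule.

```python
def holds(rows, determinant, dependent):
    """Return True if each value of `determinant` maps to one value of `dependent`."""
    seen = {}
    for row in rows:
        a = row[determinant]
        b = row[dependent]
        # A second, different B value for the same A value breaks A -> B.
        if a in seen and seen[a] != b:
            return False
        seen[a] = b
    return True


# Hypothetical sample rows of an EMPLOYEE relation.
employees = [
    {"emp_id": 1, "dept": "D10", "dept_name": "Sales"},
    {"emp_id": 2, "dept": "D10", "dept_name": "Sales"},
    {"emp_id": 3, "dept": "D20", "dept_name": "Finance"},
]

print(holds(employees, "dept", "dept_name"))  # dept determines dept_name here
print(holds(employees, "dept", "emp_id"))     # dept does not determine emp_id
```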

Generalization: The concept that some entity types are general cases of other entity types. The entity types in the general cases are known as super-types.

Generalizing Specialists: A trend among software developers, as promoted by the agile software development movement, where specialists acquire more and more diverse skills and expand their horizons. Accordingly, data modelers are no longer specialists with just data modeling skills.

Gerund: Representation of a relationship between two entity types as an entity type itself.

Homonyms: Two or more data elements having the same name but containing different data.

Identifier: One or more attributes whose values can uniquely identify the instances of an entity type.

Identifying Relationship: A relationship between two entity types where one entity type depends on another entity type for its existence. For example, the entity type ORDER-DETAIL cannot exist without the entity type ORDER.

Inheritance: The property that sub-sets inherit the attributes and relationships of their super-set.

Intrinsic Characteristics: Basic or inherent properties of an object or entity.

IT: Information Technology. Covers all computing and data communications in an organization. Typically, the CIO is responsible for IT operations in an organization.

Iterative Modeling: This implies that the modeling process is not strictly carried out in a sequential manner such as modeling all entity types, modeling all relationships, modeling all attributes, and so on. Iterative modeling allows the data modeler to constantly go back, verify, readjust, and ensure cohesion and completeness.

Key: One or more attributes whose values can uniquely identify the rows of a relational table.

Logical Data Model: Also sometimes referred to as a conventional data model, consists of the logical data structure representing the information requirements of an organization. This data model conforms to the conventions of a class of database systems such as hierarchical, network, relational, and so on. The logical data model for a relational database system consists of tables or relations.

Logical Design: Process of designing and creating a logical data model.

Matrix: Consists of members or elements arranged in rows and columns. In the relational data model, a table or relation may be compared to a matrix, thereby making it possible to apply matrix algebra functions to the data represented in the table.

MDDBMS: Multi-dimensional database management system. Used to create and manage multi-dimensional databases for OLAP.

Meta-data: Data about the data of an organization.

Model Transformation: Process of mapping and transforming the components of a conceptual data model to those of a logical or conventional data model.

MOLAP: Multidimensional Online Analytical Processing. An analytical processing technique in which multidimensional data cubes are created and stored in separate proprietary databases.

Normal Form: A state of a relation or table, free from incorrect dependencies among the attributes. See also Boyce-Codd Normal Form, First Normal Form, Second Normal Form, and Third Normal Form.

Normalization: The step-by-step method of transforming a random table into a set of normalized relations free from incorrect dependencies and conforming to the rules of the relational data model.

Null Value: A value of an attribute, different from zero or blank, used to indicate a missing, non-applicable, or unknown value.
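The difference between a null and a zero shows up as soon as the data is queried. The following sketch assumes SQLite through Python's `sqlite3` module; the `employee` table and its `bonus` column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, bonus REAL)")
# Employee 2 has a known bonus of zero; employee 3 has an unknown bonus (NULL).
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [(1, 500.0), (2, 0.0), (3, None)],
)

# Comparing with 0 does not match the NULL row.
zero_bonus = conn.execute("SELECT emp_id FROM employee WHERE bonus = 0").fetchall()

# NULL has to be tested with IS NULL.
unknown_bonus = conn.execute("SELECT emp_id FROM employee WHERE bonus IS NULL").fetchall()

# Aggregates such as AVG skip NULLs, so the average covers two rows, not three.
average_bonus = conn.execute("SELECT AVG(bonus) FROM employee").fetchone()[0]

print(zero_bonus, unknown_bonus, average_bonus)
```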

OLAP: Online Analytical Processing. Powerful software systems providing extensive multidimensional analysis, complex calculations, and fast response times. Usually present in data warehousing systems.

Physical Data Model: Data model representing the information requirements of an organization at a physical level of hardware and system software, consisting of the actual components such as data files, blocks, records, storage allocations, indexes, and so on.

Physical Design: Process of designing the physical data model.

Practitioners: See Database Practitioners.

Primary Key: A single attribute or a set of attributes that uniquely identifies an instance of an object set or entity type and is chosen as the primary key.

RDBMS: Relational Database Management System.

Referential Integrity: Refers to two relational tables that are directly related. Referential integrity between related tables is established if non-null values in the foreign key attribute of the child table are primary key values in the parent table.

Relation: In relational database systems, a relation is a two-dimensional table with columns and rows, conforming to relational rules.

Relational Data Model: A conventional or logical data model where data is perceived as two-dimensional tables with rows and columns. Each table represents a business object; each column represents an attribute of the object; each row represents an instance of the object.

Relational Database: A database system built based on the relational data model.

Relationship: A relationship between two object sets or entity types represents the associations of the instances of one object set with the instances of the other object set. Unary, binary, or ternary relationships are the common ones depending on the number of object sets participating in the relationship. A unary relationship is recursive: instances of an object set are associated with instances of the same object set. Relationships may be mandatory or optional based on whether some instances may or may not participate in the relationship.

Repeating Group: A group of attributes in a relation that has multiple sets of values for the attributes.
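To make the Repeating Group entry concrete, the sketch below holds an order's line items as a repeating group and then flattens them into one row per item, which is the usual step toward first normal form. The data and the function name `to_first_normal_form` are assumptions for illustration.

```python
# Each order carries a repeating group of (product, quantity) values.
unnormalized = [
    {"order_no": 1, "items": [("Widget", 2), ("Gadget", 1)]},
    {"order_no": 2, "items": [("Widget", 5)]},
]


def to_first_normal_form(orders):
    """Emit one (order_no, product, quantity) row per member of the repeating group."""
    return [
        (order["order_no"], product, quantity)
        for order in orders
        for product, quantity in order["items"]
    ]


flat = to_first_normal_form(unnormalized)
for row in flat:
    print(row)
```

In the flattened rows, `order_no` alone no longer identifies a row; in this sample the pair (`order_no`, product) does.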

ROLAP: Relational Online Analytical Processing. An online analytical processing technique in which multidimensional data cubes are created on the fly by the relational database engine.

Second Normal Form (2NF): A relation that is already in the first normal form and without partial key dependencies.

Set Theory: Mathematical concept where individual members form a set. Set operations can be used to combine or select members from sets in several ways. In a relational data model, the rows or tuples of a table or relation may be considered as forming a set. As such, set operations may be applied to manipulation of data represented as tables.
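The idea of treating rows as members of a set can be shown with Python's built-in `set` type. The two small relations below, `east` and `west`, are hypothetical and hold (customer number, name) tuples.

```python
# Two union-compatible relations: every row is a (customer_no, name) tuple.
east = {(1, "Adams"), (2, "Baker"), (3, "Chen")}
west = {(3, "Chen"), (4, "Diaz")}

union = east | west         # rows in either relation; duplicates collapse
intersection = east & west  # rows present in both relations
difference = east - west    # rows in east that are not in west

# Selection: keep only the rows that satisfy a condition.
selection = {row for row in union if row[0] >= 3}

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
print(sorted(selection))
```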

Specialization: The concept that some entity types are special cases of other entity types. The entity types in the special cases are known as sub-types.

SQL: Structured Query Language. Has become the standard language interface for relational databases.

Stakeholders: All people in the organization who have a stake in the success of the data system.

STAR Schema: The arrangement of the collection of fact and dimension entity types in the dimensional data model, resembling a star formation, with the fact entity type placed in the middle and surrounded by the dimension entity types. Each dimension entity type is in a one-to-many relationship with the fact entity type.
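A very small STAR schema can be sketched as follows, assuming SQLite through Python's `sqlite3` module. The table names (`dim_customer`, `dim_product`, `fact_sales`), the sample rows, and the helper `sales_by` are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, customer_name TEXT);
    CREATE TABLE dim_product  (product_key  INTEGER PRIMARY KEY, product_name  TEXT);
    -- The fact table sits in the middle and points at each dimension.
    CREATE TABLE fact_sales (
        customer_key INTEGER REFERENCES dim_customer(customer_key),
        product_key  INTEGER REFERENCES dim_product(product_key),
        sales_amount REAL
    );
    INSERT INTO dim_customer VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO dim_product  VALUES (10, 'Widget'), (20, 'Gadget');
    INSERT INTO fact_sales   VALUES (1, 10, 100.0), (1, 20, 50.0), (2, 10, 25.0);
    """
)


def sales_by(dimension_table, key_column, label_column):
    """Total the sales metric along one dimension.

    Identifiers are interpolated into the SQL text, so pass trusted constants only.
    """
    query = f"""
        SELECT d.{label_column}, SUM(f.sales_amount)
        FROM fact_sales AS f
        JOIN {dimension_table} AS d ON d.{key_column} = f.{key_column}
        GROUP BY d.{label_column}
        ORDER BY d.{label_column}
    """
    return conn.execute(query).fetchall()


print(sales_by("dim_customer", "customer_key", "customer_name"))
print(sales_by("dim_product", "product_key", "product_name"))
```

Each dimension row can match many fact rows, which is the one-to-many relationship named in the definition.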

Strategic Information: May refer to information in an organization used for making strategic decisions.

Strong Entity: An entity on which a weak entity depends for its existence. See also Weak Entity.

Sub-types: See Specialization.

Subset: An entity type that is a special case of another entity type known as the superset.

Super-types: See Generalization.

Superset: An entity type that is a general case of another entity type known as the subset.

Surrogate Key: A unique value generated by the computer system used as a key for a relation. A surrogate key has no business meaning apart from the computer system.

Synonyms: Two or more data elements containing the same data but having different names.

Syntactic Completeness: Syntactic completeness of a data model implies that the modeling process has been carried out completely to produce a good data model for the organization.

Syntactic Correctness: Syntactic correctness of a data model implies that the representation using the appropriate symbols does not violate any rules of the modeling technique.

Third Normal Form (3NF): A relation that is already in the second normal form and without any transitive dependencies, that is, dependencies of non-key attributes on the primary key through other non-key attributes, not directly.

Transitive Dependency: In a relation, the dependency of a non-key attribute on the primary key through another non-key attribute, not directly.
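A transitive dependency is usually removed by splitting the relation in two. In the sketch below, `dept_name` depends on the key `emp_id` only through `dept_id`, so the rows are projected into an employee relation and a department relation. All names and data are hypothetical.

```python
# emp_id -> dept_id and dept_id -> dept_name, so emp_id -> dept_name is transitive.
employee_rows = [
    {"emp_id": 1, "emp_name": "Ann", "dept_id": "D10", "dept_name": "Sales"},
    {"emp_id": 2, "emp_name": "Bob", "dept_id": "D10", "dept_name": "Sales"},
    {"emp_id": 3, "emp_name": "Cy", "dept_id": "D20", "dept_name": "Finance"},
]


def project(rows, columns):
    """Relational projection: keep the named columns and drop duplicate rows."""
    result = []
    for row in rows:
        projected = tuple(row[column] for column in columns)
        if projected not in result:
            result.append(projected)
    return result


# After decomposition, each non-key attribute depends directly on its own key.
employee = project(employee_rows, ["emp_id", "emp_name", "dept_id"])
department = project(employee_rows, ["dept_id", "dept_name"])

print(employee)
print(department)
```

The department name is now stored once per department instead of once per employee.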

Triad: A set of three related entity types where one of the relationships is redundant. Triads must be resolved in a refined data model.

Tuple: A row in a relational table.

UML: Unified Modeling Language. Its forerunners constitute the wave of object-oriented analysis and design methods of the 1980s and 1990s. UML is a unified language because it directly unifies the leading methods of Booch, Rumbaugh, and Jacobson. OMG (Object Management Group) has adopted UML as a standard.

User View: View of the database by a single user group. Therefore, a data view of a particular user group includes only those parts of the database that group is concerned with. The collection of all data views of all the user groups constitutes the total data model.

Users: In connection with data modeling, the term users includes all people who use the data system that is built based on the particular data model.

Weak Entity: An entity that depends for its existence on another entity known as a strong entity. For example, the entity type ORDER DETAIL cannot exist without the entity type ORDER. See also Strong Entity.

XML: eXtensible Markup Language. Introduced to overcome the limitations of HTML. XML is extensible, portable, structured, and descriptive. In a very limited way, it may be used in data modeling.

Aggregation See Relationships, special cases of, aggregation

Agile movement, the, 376 – 379

generalizing specialists, 379

philosophies, 378

principles, 378

See Data modeling, agile modeling principles

See also Modeling, agile; Modeling, evolutionary

Assembly structures, 147 – 148

Attribute, checklist for validation of, 178 – 180

Attributes, 100, 158 – 178

constraints for, 169 – 170

null values, 170

range, 170

type, 170

value set, 169

data, as, 161

domain, definition of, 164

domains, 164 – 169

attribute values, for, 166

information content, 165

misrepresented, 167

split, 167

names, 163

properties or characteristics, 158

relationships of, 160

types of, 171 – 175

optional, 173

simple and composite, 171

single-valued and multi-valued, 171

stored and derived values, with, 172

values, 162

Business intelligence, 300

Business rules, incorporation of, 25

Case study

E-R model, 84

UML model, 87

Categorization See Specialization / Generalization, categorization

Circular structures, See Relationships, design issues of, circular structures

Class diagram, 62 See also UML

Conceptual and physical entity types, 145 – 147

Conceptual model symbols and meanings, 77

Data lifecycle, 7 – 9

Data mining, 334 – 342

OLAP versus data mining, 336

techniques, 338

data modeling for, 341

Data model

communication tool, 5

components of, 18 – 20

database blueprint, 5

external, 13, 75

conceptual, 14 – 15, 75

identifying components, 77 – 80

review procedure, 76 – 77

logical, 15 – 17, 75, 104 – 107

transformation steps, 107 – 110

Data Modeling Fundamentals By Paulraj Ponniah

Copyright © 2007 John Wiley & Sons, Inc.

Data model (Continued )

physical, 17, 76, 111 – 112

quality, 26 – 29, 348

approach to good modeling, 351

assurance process, 365 – 373

aspects of, 365

assessment of, 370

stages of, 366

definitions, of, 351 – 360

checklists, 358

dimensions, 361

good and bad models, 349

meaning of, 360

relational, 109

symbols, 19 – 20

Data model diagram, review of, 103 – 104

Data modeling

agile modeling principles, application of, 34 – 35

approaches, 36 – 38, 44 – 47

data mining, for, 341

data warehouse, for the, 38 – 39

methods and techniques

IDEF1X, 51

Information Engineering, 50

Object Role Modeling (ORM), 55

Peter Chen (E-R) modeling, 48

Richard Barker’s, 53

XML, 57

steps of, 20 – 26

tips, practical, 392 – 421

bill-of-materials, 409

iterative modeling, 399 – 401

cycles, establishing, 399

increments, 400

partial models, integration of, 401

layout, conceptual model, 409 – 417

adding texts, 416

component arrangement, 410

visual highlights, 417

legal entities, 402

locations and places, 403

logical data model, 417 – 421

persons, 407

requirements definition, 393 – 396

stakeholder participation, 396 – 399

time periods, 405

Data system development life cycle. See DDLC

Data warehouse, 301 – 325

data staging, 304

data storage, 304

dimensional to relational, 322

families of STARS, 321

information delivery, 305

modeling

business data, dimensional nature of, 306

dimensional modeling, 308 – 312

dimension entity type, 309, 313

fact entity type, 309, 314

information package, 307

snowflake schema, 318

source data, 304

STAR schema, 312 – 318

data granularity, 315, 317

degenerate dimensions, 316

factless fact entity type, 316

fully additive measures, 315

semi-additive measures, 315

technologies, 302

Database design

conceptual to relational, 243

informal, 272

model transformation method

attributes to columns, 250

entity types to relations, 250

identifiers to keys, 252

transformation of relationships, 252 – 267

mandatory and optional conditions, 261 – 265

transformation summary, 267

when to use, 248

traditional method, 244

Databases, post-relational, 39 – 40

DDLC, 29 – 33

design, 31

implementation, 31

phases and tasks, 32

process, starting the, 30

requirements definition, 30

roles and responsibilities, 33

Decision-support systems, 296 – 301

data modeling for, 301

history of, 297

Dimensional analysis See OLAP systems, dimensional analysis

Domains See Attributes, domains

E-R modeling See Data modeling, methods and techniques; Peter Chen (E-R) modeling

Entity, checklist for validation of, 153 – 155

Entity integrity See Relational model, entity integrity

Entity types

aggregation, 129

association, 129

category of, 127

definition, comprehensive, 116

existence dependency, 132

homonyms, 125

ID dependency, 132

identifying, 120

intersection, 129

regular, 128

strong, 128

subtype, 128

supertype, 128

synonyms, 125

IDEF1X See Data modeling, methods and techniques, IDEF1X

Identifiers or keys, 101, 175 – 178

generalization hierarchy, in, 177 – 178

guidelines for, 176

keys, definitions of, 175

need for, 175

Informal design, 272 – 276

potential problems, 273 – 276

addition anomaly, 276

deletion anomaly, 275

update anomaly, 275

Information engineering See Data modeling, methods and techniques; Information engineering

Information levels, 11 – 13

Integration definition for information modeling See Data modeling, methods and techniques, IDEF1X

Key See also Identifiers or keys

composite, 176

natural, 176

primary, 176

surrogate, 176

Meta-modeling, 40

Modeling, agile, 379 – 385

documentation, 383

feasibility, 384

practices

additional, 383

primary, 381

principles

auxiliary, 381

basic, 380

Modeling, evolutionary, 385 – 387

benefits of, 387

flexibility, need for, 386

nature of, 386

Modeling time dimension, 149

Normalization methodology, 276 – 291

fundamental normal forms, 278 – 285

Boyce-Codd normal form, 284

first normal form, 278

second normal form, 279

third normal form, 281

higher normal forms, 285 – 288

domain-key normal form, 288

fifth normal form, 287

fourth normal form, 286

normalization as verification, 291

steps, 277, 290

OLAP systems, 325 – 333

data modeling for, 332

dimensional analysis, 326

features, 325

hypercubes, 328

MOLAP, 330

ROLAP, 330

Online analytical processing See OLAP systems

ORM See Data modeling, methods and techniques; Object Role Modeling

Peter Chen See Data modeling, methods and techniques; Peter Chen (E-R) modeling

Process modeling, 40

Quality See Data model, quality

Recursive structures, 145

Referential integrity See Relational model, referential integrity

Relational model, 231 – 242

columns as attributes, 234

entity integrity, 240

functional dependencies, 242

mathematical foundation, 232

modeling concept, single, 232

notation for, 237

referential integrity, 240

relation or table, 233

Date posted: 07/07/2014, 09:20
