fact_id te
state_id (FK) city_id (FK)
City city_id state_id (FK) city
State state_id cou
Chapter 10
Figure 10-37: Musicians, bands, their online advertisements, and some other goodies
How It Works
Figure 10-37 shows the analyzed data warehouse database model for online musician and band advertisements. The most significant requirement is to ultimately produce a single star schema, if a single star schema is possible. Also add any dimensional and fact information shown as additional in
Examine Figure 10-37 once more. Think about the records in the tables. Yes, many advertisements are
www.ticketmaster.com will reveal to you the sheer volume of advertisements, musicians, bands, shows, discography (released CDs), and venues. Figure 10-38 takes another slant on this data warehouse database model by rolling all of these tables into a single fact table.
Musician: musician_id, musician, phone, instruments, skills
Advertisement: advertisement_id, band_id (FK), musician_id (FK), ad_date, ad_text
Band: band_id, band, founding_date, genre
Discography: discography_id, band_id (FK), cd_name, release_date, price
Merchandise: merchandise_id, band_id (FK), type, price
Show: show_id, venue_id (FK), band_id (FK), venue, date, time
Venue: venue_id, venue, address, directions, phone
Figure 10-38: Denormalized (musicians, bands, their online advertisements, and some other goodies).
Figure 10-38 is a partially complete data warehouse database model, with all the facts rolled into a single table. Figure 10-39 shows a finalized, much more sensible star schema, based purely on relative record numbers in various tables from Figure 10-39. Tables with larger record numbers tend to be factual rather than dimensional in nature.
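Rolling normalized tables into a single fact table can be pictured as one large join. The following is only a minimal sketch, using Python's built-in sqlite3 module as a convenient stand-in for a real data warehouse engine. The three tables are a cut-down subset of Figure 10-37, the sample rows are invented for illustration, and the wide table's field names are assumptions modeled loosely on Figure 10-38.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Cut-down versions of three normalized tables from Figure 10-37
    CREATE TABLE band (band_id INTEGER PRIMARY KEY, band TEXT, founding_date TEXT);
    CREATE TABLE musician (musician_id INTEGER PRIMARY KEY, musician TEXT, phone TEXT);
    CREATE TABLE advertisement (
        advertisement_id INTEGER PRIMARY KEY,
        band_id     INTEGER REFERENCES band (band_id),
        musician_id INTEGER REFERENCES musician (musician_id),
        ad_date TEXT,
        ad_text TEXT
    );

    -- Invented sample rows
    INSERT INTO band VALUES (1, 'The Normal Forms', '2001-05-01');
    INSERT INTO musician VALUES (1, 'Ada', '555-0100'), (2, 'Grace', '555-0101');
    INSERT INTO advertisement VALUES
        (1, 1, NULL, '2005-01-10', 'Band seeks drummer'),
        (2, NULL, 1, '2005-01-11', 'Guitarist available'),
        (3, NULL, 2, '2005-01-12', 'Singer available');

    -- Roll everything into one wide, denormalized table
    CREATE TABLE artists AS
    SELECT a.advertisement_id AS artist_id,
           m.musician         AS musician_name,
           m.phone            AS musician_phone,
           b.band             AS band_name,
           b.founding_date    AS band_founding_date,
           a.ad_date          AS advertisement_date,
           a.ad_text          AS advertisement_text
    FROM advertisement a
    LEFT JOIN musician m ON m.musician_id = a.musician_id
    LEFT JOIN band b     ON b.band_id = a.band_id;
""")

rows = conn.execute(
    "SELECT musician_name, band_name, advertisement_text "
    "FROM artists ORDER BY artist_id"
).fetchall()
```

Each row of the wide table repeats whatever band or musician details apply to it. That repetition is the price paid for avoiding joins at query time.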
Artists: artist_id, musician_id (FK), musician_name, musician_phone, musician_email, band_name, band_founding_date, discography_cd_name, discography_release_date, discography_price, show_date, show_time, venue_name, venue_address, venue_directions, venue_phone, advertisment_date, advertisement_text
Musician: musician_id, instruments, skills
Merchandise: merchandise_id, type, price
Band: genre
Creating and Refining Tables During the Design Phase
Figure 10-39: Denormalized into a single star schema (musicians, bands, their online advertisements, and some other goodies)
star schemas because all the separate elements (such as bands, advertisements, shows, and venues) are all related to each other. Thus, a single star schema (a single fact table) is the most appropriate data warehouse database model design in this situation.
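To see why a single fact table surrounded by small dimensions is convenient, consider how a report would be written against it. The sketch below is an illustration only, again using Python's sqlite3 module. The tables are trimmed versions of those in Figure 10-39, and the sample rows and the query itself are assumptions made for demonstration purposes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimensions (small tables)
    CREATE TABLE genre (
        genre_id  INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES genre (genre_id),
        genre     TEXT
    );
    CREATE TABLE instrument (
        instrument_id INTEGER PRIMARY KEY,
        section_id    INTEGER,
        instrument    TEXT
    );

    -- Fact table (large table), trimmed to a few fields
    CREATE TABLE artists (
        artist_id     INTEGER PRIMARY KEY,
        genre_id      INTEGER REFERENCES genre (genre_id),
        instrument_id INTEGER REFERENCES instrument (instrument_id),
        musician_name TEXT,
        band_name     TEXT
    );

    -- Invented sample rows
    INSERT INTO genre VALUES (1, NULL, 'Rock'), (2, NULL, 'Jazz');
    INSERT INTO instrument VALUES (1, 1, 'Guitar'), (2, 1, 'Drums');
    INSERT INTO artists VALUES
        (1, 1, 1, 'Ada',   'The Normal Forms'),
        (2, 1, 1, 'Grace', 'The Normal Forms'),
        (3, 2, 2, 'Linus', 'Kernel Panic');
""")

# A typical star schema query: one fact table joined directly to each dimension
report = conn.execute("""
    SELECT g.genre, i.instrument, COUNT(*) AS fact_rows
    FROM artists f
    JOIN genre g      ON g.genre_id = f.genre_id
    JOIN instrument i ON i.instrument_id = f.instrument_id
    GROUP BY g.genre, i.instrument
    ORDER BY g.genre, i.instrument
""").fetchall()
```

Every dimension is exactly one join away from the fact table, which keeps reporting queries short and predictable.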
Summary
In this chapter, you learned about:
Artists: artist_id, merchandise_id (FK), genre_id (FK), instrument_id (FK), musician_name, musician_phone, musician_email, band_name, band_founding_date, discography_cd_name, discography_release_date, discography_price, show_date, show_time, venue_name, venue_address, venue_directions, venue_phone, advertisment_date, advertisement_text
Instrument: instrument_id, section_id, instrument
Merchandise: merchandise_id, type, price
Genre: genre_id, parent_id (FK), genre
❑ How to create and refine tables
no profit, and, thus, no company). This chapter has primarily expanded on Chapter 9, from analysis (what to do) into design (how to solve it). Once again, the online auction house database model has been expanded on, and detailed further by the design process, as the continuing case study. Chapter 11 digs even further into the design process by describing and specifying fields within each table, along with datatypes and indexing. The discussion on indexing is especially about alternate (secondary) indexing.
Exercises
Use the ERDs in Figure 10-32 and Figure 10-39 to help you answer these questions:
tables in the proper order by understanding the relationships between the tables
Once again, create the tables in the proper order by understanding the relationships between the tables
Filling in the Details with a Detailed Design
“Digging ever deeper gives clarity to definition, and definition of clarity.” (Gavin Powell)
The further you go the more you discover.
This chapter provides the details on the internal structure of tables in terms of fields, field content, field formatting, and indexing on fields. This chapter digs a little deeper into the case study material presented in the previous two chapters. Chapter 9 introduced a database model in its infancy, by analyzing what needed to be done. Chapter 10 unearthed structural detail by describing how tables are built and how they are joined together.
This chapter delves into the details of the tables themselves, by designing the precise content and structure of individual fields. Indexing is included at this stage because indexes are created against specific table fields. An index is not quite the same thing as a key, such as a primary key. A primary key is required to be unique across all records in a table; therefore, many database engines automatically create a unique index for that primary key (which helps performance when checking for uniqueness). Foreign keys, on the other hand, do not have to be unique, and many relational databases, even sophisticated ones, do not automatically create indexes on foreign keys. This is intentional, of course. If an index is required on a foreign key field (which it more often than not is), an index must be manually created for that foreign key.
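The difference between automatic primary key indexes and manual foreign key indexes can be observed directly. The following is a minimal sketch using SQLite through Python's sqlite3 module, chosen only because it needs no installation. Other database engines behave differently in the details. The table and field names follow the case study loosely (listing_id stands in for listing#), and the index name xfk_listing_ticker is a hypothetical naming choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE currency (
        ticker   TEXT PRIMARY KEY,   -- unique index is created automatically
        currency TEXT
    );
    CREATE TABLE listing (
        listing_id  INTEGER PRIMARY KEY,  -- in SQLite this aliases the row ID
        ticker      TEXT REFERENCES currency (ticker),  -- foreign key, no index
        description TEXT
    );
""")

def index_names(table):
    """Return the names of all indexes SQLite reports for a table."""
    rows = conn.execute(f"PRAGMA index_list({table})").fetchall()
    return [row[1] for row in rows]  # column 1 of each row is the index name

currency_indexes = index_names("currency")  # one automatic primary key index
listing_before = index_names("listing")     # nothing exists for the foreign key

# The foreign key index has to be created by hand
conn.execute("CREATE INDEX xfk_listing_ticker ON listing (ticker)")
listing_after = index_names("listing")
```

In this sketch the engine indexes the CURRENCY primary key on its own, but the TICKER foreign key in LISTING stays unindexed until the CREATE INDEX command is issued.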
By the end of this chapter, you will have a good understanding of how best to structure fields and their datatype formats, and of how, when, and where those formats apply. Also, you will have a better conceptual understanding of foreign key indexing and alternate (secondary) indexing.
In this chapter, you learn about the following:
and some specialized datatypes
❑ Using keys and indexes
Case Study: Refining Field Structure
In this section, you refine the field content of tables for both the OLTP and data warehouse database models. You continue with the consistent case study development of database models for the online auction house.
The OLTP Database Model
Figure 11-1 shows the most recent version of the OLTP database model for the online auction house.
Figure 11-1: The online auction house OLTP database model
History: history_id, seller_id (FK), buyer_id (FK), comment_date, comments
Listing: listing#, category_id (FK), buyer_id (FK), seller_id (FK), ticker (FK), description, image, start_date, listing_days, starting_price, reserve_price, buy_now_price, number_of_bids, winning_price
Category: category_id, parent_id, category
Currency: ticker, currency, exchange_rate, decimals
Seller: seller_id, seller, popularity_rating, join_date, address, return_policy, international, payment_methods
Buyer: buyer_id, buyer, popularity_rating, join_date, address
Bid: listing# (FK), buyer_id (FK), bid_price, bid_date
Chapter 11
Analysis and design are an ongoing process. Figure 11-1 shows two further examples of backtracking and refining:
house is based in the U.S. and the U.S. dollar is the default currency. Currencies are separated because there can be a fair amount of complexity involved in currency exchange conversions.
histories. When a trader is only a buyer, that trader will have no history of activity as a seller;
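To give a feel for why currency conversion earns its own table, here is a small sketch of how an application might use the EXCHANGE_RATE and DECIMALS fields of the CURRENCY table. It is an illustration under stated assumptions, not the case study's actual logic: the rates are invented sample values, EXCHANGE_RATE is assumed to mean units of the currency per one U.S. dollar, and the convert function is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical rows of the CURRENCY table: ticker -> (exchange_rate, decimals).
# Assumption: exchange_rate is the number of currency units per one U.S. dollar.
CURRENCIES = {
    "USD": (Decimal("1"), 2),
    "EUR": (Decimal("0.80"), 2),
    "JPY": (Decimal("110"), 0),
}

def convert(amount, from_ticker, to_ticker):
    """Convert an amount between currencies by way of the U.S. dollar."""
    from_rate, _ = CURRENCIES[from_ticker]
    to_rate, to_decimals = CURRENCIES[to_ticker]
    in_dollars = Decimal(amount) / from_rate
    # Round to the number of decimal places the target currency uses
    quantum = Decimal(10) ** -to_decimals
    return (in_dollars * to_rate).quantize(quantum, rounding=ROUND_HALF_UP)

yen = convert("10.00", "USD", "JPY")
dollars = convert("8.00", "EUR", "USD")
euros = convert("110", "JPY", "EUR")
```

Even this tiny version has to deal with rounding rules and differing decimal places, which hints at the complexity the text mentions.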
Figure 11-2 shows a refined field structure for the online auction house OLTP database model shown in Figure 11-1.
Figure 11-2: Refining fields for the online auction house OLTP database model
History: history_id, seller_id (FK), buyer_id (FK), comment_date, feedback_positive, feedback_neutral, feedback_negative
Listing: listing#, category_id (FK), buyer_id (FK), seller_id (FK), ticker (FK), description, image, start_date, listing_days, starting_price, reserve_price, buy_now_price, number_of_bids, winning_price
Category: category_id, parent_id, category
Currency: ticker, currency, exchange_rate, decimals
Seller: seller_id, seller, company, company_url, popularity_rating, join_date, address_line_1, address_line_2, town, zip, postal_code, country, return_policy, international_shipping, payment_method_personal_check, payment_method_cashiers_check, payment_method_paypal, payment_method_western_union, payment_method_USPS_postal_order, payment_method_international_postal_order, payment_method_wire_transfer, payment_method_cash, payment_method_visa, payment_method_mastercard, payment_method_american_express
Buyer: buyer_id, buyer, popularity_rating, join_date, address_line_1, address_line_2, town, zip, postal_code, country
Bid: listing# (FK), buyer_id (FK), bid_price, proxy_bid, bid_date
Field additions and changes are refinements of both structure and application, as shown in Figure 11-2.
structural refinement is a change to an existing field. Changing the ADDRESS field to five separate fields is a structural refinement. Field refinements are described as follows:
❑ The INCREMENT field is added to the LISTING table. Sellers can set a price increment for a
set. The system may also override bid increments (based on all pricing factors) if an increment entered by the seller does not equate appropriately with all the pricing values set by the seller.
bid up to. When a bidder enters a proxy bid, it permits the online auction site to act on behalf of the bidder, increasing the bidder's bid price up to the proxy bid value (the maximum the bidder is prepared to pay).
TOWN, ZIP, POSTAL_CODE, and COUNTRY fields. The ZIP field is used in the U.S. Postal codes are used in other countries. It is necessary to divide address details up in this way for two reasons:
more effective with information split into separate fields.
might make for less efficiency in joins. Additionally, this Boolean type division of multiple selectable options is best handled at the application level. It is simply too detailed for handling at the lower level of the database model.
be tightly controlled by applications because of the immense computing power utilized to manage huge quantities of concurrent Internet users. Allowing ad hoc access to OLTP databases and applications will kill your system and result in no users, and thus no business. If this is not the case, it is unlikely that you are building an OLTP database model.
METHODS field is split, as shown in the previous option. This one is left to your imagination.
❑ The HISTORY table COMMENTS field could be split into multiple field options, perhaps helping to
COMMENTS_SERVICE_LEVEL, COMMENTS_BUYER_PROMPTNESS). There are many other possibilities. Comments could even be split into a field structure based on pick list type of preset
fields containing options for positive, neutral, and negative feedback (perhaps even all three can be entered). When people using online Internet sites feel that they can comment, it makes them feel empowered. Empowering people encourages further business. If buyers and sellers develop poor reputations, those buyers and sellers, not the auction house, are responsible for an ill-gained reputation. This excludes the company from being involved in disputes in any respect other than the role of arbitration.
❑ The COMPANY_URL and COMPANY fields are added to the SELLER table. The inclusion of the term COMPANY in the name of the field is intentional. This implies that only bona fide company or corporate level traders should be allowed to enter company names or URLs (or both fields). Thus, only sellers trading as online retailers (even competing online auctioneers) are ultimately encouraged to list themselves as full-fledged going concerns. This allows the online auction house in the case study example to become not only an auctioneer for the individual, but also a well-publicized portal to the Internet for other auctioneers. The Internet has huge market potential. An online auction house with a well-established market presence (in the minds of potential Internet buyers) is extremely valuable for other retailers, and is thus an obvious source of profitability for the online auction house of this case study.
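Of the refinements listed above, proxy bidding is the one that implies actual logic rather than just extra fields. The following sketch shows one way the rule "bid on the bidder's behalf, up to the proxy bid value" might be expressed. It is an assumption-laden illustration: the function name next_proxy_bid is hypothetical, the increment is treated as a fixed amount, and a real auction site would apply many more rules.

```python
from decimal import Decimal

def next_proxy_bid(current_price, increment, proxy_bid):
    """Return the bid the site would place for the bidder, or None.

    current_price: the highest bid price so far on the listing
    increment:     the price increment set for the listing
    proxy_bid:     the maximum the bidder is prepared to pay
    """
    if current_price >= proxy_bid:
        return None  # the bidder's maximum has already been reached
    # Raise the bid by one increment, but never beyond the proxy bid value
    return min(current_price + increment, proxy_bid)

first = next_proxy_bid(Decimal("10.00"), Decimal("0.50"), Decimal("12.00"))
capped = next_proxy_bid(Decimal("11.80"), Decimal("0.50"), Decimal("12.00"))
stopped = next_proxy_bid(Decimal("12.00"), Decimal("0.50"), Decimal("12.00"))
```

Logic of this kind lives in the application. The database model only needs to store the values it works with, such as the PROXY_BID field in the BID table.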
It is essential to note that the changes made to the OLTP database model for the online auction house, as shown in Figure 11-2, are mostly application-oriented. In other words, even though some may be analytical (back to the analysis stage) in nature, these changes are more likely design level refinements. At this stage, field refinements become less a matter of database modeling, and perhaps more to do with how applications will handle things. So far, the OLTP database model in this case study has been far more tightly controlled than the data warehouse model. At this point of field refinement, the OLTP database model may appear to become less mathematical.
The important thing to remember is that the database model is good at doing certain things, and that application SDKs (such as Java) are good at doing other things. You don’t need to be a database expert or an experienced Java programmer to understand the basic precept of this change in approach.
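As a small illustration of that division of labor, consider the payment method fields in Figure 11-2. The database simply stores one Boolean field per method, while the application decides which ones to switch on. The sketch below is written in Python rather than Java purely for brevity. The field names come from Figure 11-2, and the payment_flags function is a hypothetical helper.

```python
# Payment methods taken from the SELLER table field names in Figure 11-2
PAYMENT_METHODS = (
    "personal_check", "cashiers_check", "paypal", "western_union",
    "USPS_postal_order", "international_postal_order", "wire_transfer",
    "cash", "visa", "mastercard", "american_express",
)

def payment_flags(selected):
    """Turn the methods picked on a form into Boolean SELLER field values."""
    unknown = set(selected) - set(PAYMENT_METHODS)
    if unknown:
        raise ValueError(f"unknown payment methods: {sorted(unknown)}")
    return {
        f"payment_method_{method}": method in selected
        for method in PAYMENT_METHODS
    }

flags = payment_flags({"paypal", "visa"})
```

The validation and the mapping from a pick list to individual fields sit comfortably in application code, leaving the table structure itself simple.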
Essentially, the OLTP database model might become somewhat more end-user oriented at this stage, and perhaps a little less mathematically confusing to the database modeling uninitiated. So far in this book, the data warehouse modeling approach has always been more end-user oriented. The data warehouse model always looks like a company from an operational perspective. From this point in time, the OLTP database model begins to look a little friendlier and a little less geeky.
Overall, many of these changes may seem a little over the top. That is because many of these changes are
complicated because it has too many fields. You might shout, “Overkill,” and you might be correct, of course! In that case, the case study is performing its demonstrative function by showing what can be done, not necessarily what should be done. You are beginning to see that a database model should not only be mathematically driven, but also application driven. The needs of front-end applications can sometimes partially dictate database model design because database model and applications are dependent on each other in many respects. The ERD shown in Figure 11-2 is a little too busy. When things are too busy, it means structure may be becoming overcomplicated.
The Data Warehouse Database Model
Figure 11-3 shows the most recent version of the data warehouse database model for the online auction house.
Figure 11-3: The online auction house data warehouse database model.
Listing-Bid Facts
Category Hierarchy: category_id, parent_id (FK), category
Seller: seller_id, popularity_rating, join_date, address, return_policy, international, payment_methods
Buyer: buyer_id, buyer, popularity_rating, join_date, address
Listing_Bids_History: fact_id, time_id (FK), buyer_id (FK), location_id (FK), seller_id (FK), category_id (FK), listing#, listing_description, listing_image, listing_start_date, listing_days, listing_currency, listing_starting_price, listing_reserve_price, listing_buy_now_price, listing_number_of_bids, listing_winning_price, listing_winner_buyer, bidder, bidder_price, bidder_date
Time: time_id, month, quarter, year
Location: location_id, region, country, state, city

History Facts
Category Hierarchy: category_id, parent_id (FK), category
Seller: seller_id, seller, popularity_rating, join_date, return_policy, international, payment_methods
Buyer: buyer_id, buyer, popularity_rating, join_date
History Comment: fact_id, time_id (FK), buyer_id (FK), location_id (FK), seller_id (FK), category_id (FK), buyer_comment_date, buyer_comments, seller_comment_date, seller_comments
Time: time_id, month, quarter, year
Location: location_id, region, country, state, city
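The Category Hierarchy dimension in Figure 11-3 points back to itself through PARENT_ID, so reading a full category path means walking from parent to child. One way to do that, sketched here with SQLite's recursive query support through Python's sqlite3 module, is shown below. The sample categories are invented, and other database engines offer their own syntax for hierarchical queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (
        category_id INTEGER PRIMARY KEY,
        parent_id   INTEGER REFERENCES category (category_id),
        category    TEXT
    );
    -- Invented sample hierarchy; top-level categories have no parent
    INSERT INTO category VALUES
        (1, NULL, 'Electronics'),
        (2, 1,    'Computers'),
        (3, 2,    'Laptops'),
        (4, NULL, 'Books');
""")

# Start at the top-level categories, then repeatedly attach children
path_query = """
    WITH RECURSIVE tree (category_id, path) AS (
        SELECT category_id, category
        FROM category
        WHERE parent_id IS NULL
        UNION ALL
        SELECT c.category_id, tree.path || ' > ' || c.category
        FROM category c
        JOIN tree ON c.parent_id = tree.category_id
    )
    SELECT path FROM tree ORDER BY path
"""
paths = [row[0] for row in conn.execute(path_query)]
```

Each entry in the result spells out a category together with all of its ancestors, which is the form a report grouped by category would typically need.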