Midterm review notes: MongoDB Data Modeling

Industrial University of Ho Chi Minh City (IUH). Midterm review notes for the Data Modeling course (MongoDB): theory and practice exercises.

These notes gather the key material for the midterm exam of the Data Modeling course, focusing on the MongoDB database management system:

- Theory: a summary of the concepts to master before designing a real model.
- Detailed workload analysis table: an analysis framework for actors and CRUD operations (creating an account, publishing a post, viewing a post, running trend queries, ...).
- Quantitative parameters: how to calculate and determine the key modeling factors: operation frequency (times/s), data life, data size, durability, and freshness.
- Query modeling: analysis of read patterns such as index scan and collection scan to optimize system performance.

The material is presented as tables so that students can apply it to practical database design problems.


1 Theory to review


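Workload analysis table. Columns: Actor, Operation, Data needed, Operation type, Frequency, Data accuracy, Data life, Data size, Durability, Read pattern (index scan / collection scan), Freshness.

Operation: Create a new account
Actor: User
Data needed: userName, pa[…], email
Frequency: […]/(24*3600) times/s
Data life: […] years
Data size: […] bytes
Durability: Majority

Operation: Publish a post
Actor: User
Data needed: Id_user, Id[…], thoiGianD[…], title, noiDung
Operation type: Write
Frequency: (10[…])/(24*3600) times/s
Data accuracy: Low or high
Data life: 10 years
Data size: 1000 bytes
Durability: Majority

Operation: View a post
Actor: User
Data needed: Id_user, Id_BaiViet, thoiDiemXem, like
Operation type: Write
Frequency: (10 000*100)/(24*3600) ≈ 11.6 times/s
Data accuracy: High
Data life: 10 years
Data size: 1000 bytes
Durability: Majority

Operation: Trend query
Actor: Analyst
Data needed: Id_user, Id[…], thoiDiemXem, CountLike, […]ment, thoi[…], CountView
Operation type: Read
Frequency: (10*10)*2/(24*3600) ≈ 0.002 times/s
Data accuracy: High
Data life: 10 years
Read pattern: Index
Freshness: < 1 hour (data updated within the last hour)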
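The frequency column divides a daily operation count by the number of seconds in a day. A minimal sketch of that arithmetic follows; the helper name `ops_per_second` is hypothetical, and the daily counts are copied from the table as written.

```python
SECONDS_PER_DAY = 24 * 3600


def ops_per_second(ops_per_day: float) -> float:
    """Convert a daily operation count into an average rate per second."""
    return ops_per_day / SECONDS_PER_DAY


# "View a post": 10 000 * 100 operations per day, as written in the table.
view_rate = ops_per_second(10_000 * 100)

# "Trend query": (10 * 10) * 2 operations per day, as written in the table.
trend_rate = ops_per_second((10 * 10) * 2)

print(f"view post:   {view_rate:.2f} ops/s")
print(f"trend query: {trend_rate:.4f} ops/s")
```

These are averages over a full day; a real workload with peak hours would need a higher design rate.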


36 characters, 1 byte per character

266 bytes

ObjectID

Double, ISODate, Int


Post
{
  _id: ObjectID(),
  "PostID": String
}

User
{
  _id: ObjectID(),
  "UserID": String,
  "name": String,
  postList[0,1,100]: String
}

Likes, comments
{
  _id: ObjectID(),
  "UserID": String
}

Post
{
  _id: ObjectID(),
  "PostID": String,
  "content": String,
  "UserID": String
}

User
{
  _id: ObjectID(),
  "name": String,
  "UserID": String
}

{
  "content": String
}

{
  "UserID": String
}


User
{
  _id: ObjectID(),
  "UserID": String,
  "name": String,
  PostList[0,1,100]
  {
    _id: ObjectID(),
    "PostID": String,
    "content": String,
    Like[0,1,100]
    comment[0,1,100]
    {
      "UserID": String,
      "content": String
    }
  }
}
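As an illustration of the embedded User model, here is a minimal sketch of one document as a plain Python dictionary, with a check of the array bounds. It assumes the triple `[0,1,100]` reads as minimum, most likely, and maximum number of elements; the sample values, the `UserID` field inside `Like`, and the helper `within_bounds` are hypothetical. The `_id` field is omitted because MongoDB would generate it on insert.

```python
from typing import Sequence, Tuple

# Assumed reading of the bounds: (minimum, most likely, maximum).
POST_LIST_BOUNDS = (0, 1, 100)


def within_bounds(items: Sequence, bounds: Tuple[int, int, int]) -> bool:
    """Return True when the array length respects the minimum and maximum."""
    minimum, _most_likely, maximum = bounds
    return minimum <= len(items) <= maximum


# One user document with its posts, likes and comments embedded.
user = {
    "UserID": "u001",
    "name": "An",
    "PostList": [
        {
            "PostID": "p001",
            "content": "Hello",
            "Like": [{"UserID": "u002"}],  # field inside Like is an assumption
            "comment": [{"UserID": "u002", "content": "Nice post"}],
        }
    ],
}

print(within_bounds(user["PostList"], POST_LIST_BOUNDS))
```

Everything about a user arrives in one read with this shape, at the price of a document that grows with every post, like, and comment.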


One-to-One: Embedded

users [10M]
  _id: <objectId>
  name: <string>
  street: <string>
  city: <string>
  zip: <string>
  shipping_address
    street: <string>
    city: <string>
    zip: <string>

- One-to-One: embed, using subdocuments
- This address information may profit from a little more organization
- Using the document model, regroup each set of address information into a subgroup
- Preferred representation:
  - Preserves simplicity
  - Documents are clearer
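Regrouping address fields into subdocuments can be sketched as a small transformation on a plain dictionary. The flat field names (`shipping_street` and so on), the subdocument name `address`, and the function `group_address_fields` are assumptions made for illustration; only `shipping_address` and the `street`, `city`, `zip` keys come from the diagram.

```python
ADDRESS_KEYS = ("street", "city", "zip")


def group_address_fields(flat_user: dict) -> dict:
    """Move each set of address fields into its own subdocument."""
    return {
        "_id": flat_user["_id"],
        "name": flat_user["name"],
        "address": {key: flat_user[key] for key in ADDRESS_KEYS},
        "shipping_address": {
            key: flat_user["shipping_" + key] for key in ADDRESS_KEYS
        },
    }


# Hypothetical user with all address fields at the top level.
flat_user = {
    "_id": 1,
    "name": "An",
    "street": "1 Le Loi",
    "city": "Ho Chi Minh City",
    "zip": "700000",
    "shipping_street": "2 Nguyen Hue",
    "shipping_city": "Ho Chi Minh City",
    "shipping_zip": "700000",
}

nested_user = group_address_fields(flat_user)
print(nested_user["shipping_address"])
```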

One-to-One: Reference

- Example:

stores [1000]
  _id: <objectId>
  storeId: <string>
  name: <string>
  address: <string>
  city: <string>
  state: <string>
  zip: <string>
  coords [2]: <double>

store details [1000]
  _id: <objectId>
  storeId: <string>
  name: <string>
  description: <string>
  […] [1, 5, 10]: <string>
  […]
    street: <string>
    city: <string>
    zip: <string>
  staff [1, 10, 100]
    name: <string>
    address: <string>
    title: <string>

For a given store, we would find additional information like the manager and staff by querying the store details collection using the link to the corresponding document, as we do for any relation expressed by a reference.
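Following a reference amounts to a second lookup keyed on the shared identifier. The sketch below uses in-memory lists in place of the two collections; the sample values and the helper `find_details` are hypothetical, and `storeId` is assumed to be the linking field because it appears in both diagrams.

```python
from typing import Optional

# In-memory stand-ins for the "stores" and "store details" collections.
stores = [
    {"storeId": "s1", "name": "Downtown", "city": "Palo Alto"},
    {"storeId": "s2", "name": "Airport", "city": "San Jose"},
]
store_details = [
    {
        "storeId": "s1",
        "description": "Flagship store",
        "staff": [{"name": "Lan", "title": "Cashier"}],
    },
]


def find_details(store_id: str, details: list) -> Optional[dict]:
    """Second lookup: fetch the details document linked to a store."""
    return next((d for d in details if d["storeId"] == store_id), None)


store = stores[0]
details = find_details(store["storeId"], store_details)
print(details["staff"] if details else "no details")
```

The frequently read store document stays small; the cost is the extra query whenever the details are needed.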


One-to-Many: Embed, in the "one" side

- The first representation embeds the documents of the "many" side in the "one" side
- Example: in a product catalog, we keep the top reviews of an item within the item itself because we want to display these reviews once the item gets retrieved from the database
- For simple applications where the number of embedded documents is small
- For querying on the "many" side, we use multikey indexes

items [100K]
  title: <string>
  slogan: <string>
  description: <string>
  stars: <int>
  category: <string>
  img_url: <string>
  price: <decimal>
  sold_at [1, 1000]: <objectId>
  top_reviews […, 10, 20]
    […]: <string>
    user_name: <string>
    body: <string>
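A query on the embedded "many" side filters items by a value inside their review array. The sketch below does this with a linear scan over plain dictionaries; in MongoDB, an index on an array field (a multikey index) would serve the same kind of filter. Sample values and the function `items_reviewed_by` are hypothetical.

```python
# Items with their top reviews embedded (sample data is hypothetical).
items = [
    {
        "title": "Mug",
        "price": 5.0,
        "top_reviews": [
            {"user_name": "lan", "body": "Sturdy"},
            {"user_name": "minh", "body": "Good value"},
        ],
    },
    {"title": "Lamp", "price": 20.0, "top_reviews": []},
    {
        "title": "Desk",
        "price": 90.0,
        "top_reviews": [{"user_name": "lan", "body": "Easy to assemble"}],
    },
]


def items_reviewed_by(user_name: str, catalog: list) -> list:
    """Titles of items that embed at least one review by the given user."""
    return [
        item["title"]
        for item in catalog
        if any(review["user_name"] == user_name for review in item["top_reviews"])
    ]


print(items_reviewed_by("lan", items))
```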

One-to-Many: Embed, in the "many" side

orders [10M]
  _id: <objectId>
  date: <date>
  customer_id: <int>
  items [1, 2, 100]
    item_id: <string>
    quantity: <int>
    price: <decimal>
  shipping_address
    street: <string>
    city: <string>
    zip: <string>

{
  date: "2010/06/24",
  address: {
    street: "300 University",
    city: "Palo Alto, CA, USA",
  }
}

{
  date: "2019/06/24",
  address: {
    street: "108 Forest",
    city: "Palo Alto, CA, USA",
  }
}

- Less often used
- Useful if the "many" side is queried more often than the "one" side
- Embedded object is duplicated
  - Duplication may be preferable for dynamic objects
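Why duplication can be preferable for an object that changes over time: each order keeps its own copy of the address as it was when the order was placed. This is a minimal sketch with plain dictionaries; the `customer` document, its fields, and the function `place_order` are hypothetical.

```python
import copy

# Hypothetical customer whose current address may change later.
customer = {
    "customer_id": 1,
    "address": {"street": "300 University", "city": "Palo Alto, CA, USA"},
}


def place_order(customer_doc: dict, date: str) -> dict:
    """Create an order that embeds a copy of the customer's current address."""
    return {
        "date": date,
        "customer_id": customer_doc["customer_id"],
        "shipping_address": copy.deepcopy(customer_doc["address"]),
    }


order = place_order(customer, "2019/06/24")

# The customer moves; the existing order must keep the old address.
customer["address"]["street"] = "108 Forest"

print(order["shipping_address"]["street"])
```

With a reference in place of the copy, past orders would appear to have shipped to the new address.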


One-to-Many: Reference, in the "one" side

- The third representation is to have two collections:

- Example:

zips
  _id: <objectId>
  city: <string>
  loc [2]: <double>
  zip: <string>
  stores [0, 1, 5]: <string>

stores
  _id: <objectId>
  storeId: <string>
  name: <string>
  state: <string>
  zip: <string>
  coords [2]: <double>


One-to-Many: Reference, in the "many" side

zips [10000]
  _id: <string>
  city: <string>
  loc [2]: <double>
  pop: <long>
  state: <string>
  zip: <string>

stores [1000]
  _id: <objectId>
  storeId: <string>
  name: <string>
  address: <string>
  city: <string>
  state: <string>
  zip: <string>
  coords [2]: <double>

Relationship zips to stores: [0, 1, 5]

- Preferred representation using references
- Allows for large documents and a high count of these
- No need to manage the references on the "one" side
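The two reference placements for zips and stores can be compared in a small sketch. On the "one" side, each zip document carries an array of store identifiers that must be maintained; on the "many" side, each store carries its own zip value and the list is derived by filtering. Sample values and the function `stores_in_zip` are hypothetical.

```python
# Reference in the "one" side: the zip document lists its stores.
zips_with_refs = [
    {"zip": "94301", "city": "Palo Alto", "stores": ["s1", "s2"]},
]

# Reference in the "many" side: each store points to its zip.
stores = [
    {"storeId": "s1", "name": "Downtown", "zip": "94301"},
    {"storeId": "s2", "name": "University Ave", "zip": "94301"},
    {"storeId": "s3", "name": "Airport", "zip": "95110"},
]


def stores_in_zip(zip_code: str, store_docs: list) -> list:
    """Derive the stores of a zip from the references held by the stores."""
    return [s["storeId"] for s in store_docs if s["zip"] == zip_code]


print(stores_in_zip("94301", stores))
```

Adding a store in the second layout touches only the new store document, which is the point of the last bullet.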

Many-to-Many: Embed, in the main side

carts [10K]
  _id: <objectId>
  date: <date>

items [0, 100K]
  _id: <int>
  title: <string>
  img_url: <string>
  price: <decimal>

[…]
  title: <string>
  stars: <int>
  category: <string>
  price: <decimal>

- The documents from the less queried side are embedded
- Results in duplication
- Keep "source" for the embedded documents in another collection
- Indexing is done on the array


Many-to-Many: Reference, in the main side

items
  slogan: <string>
  description: <string>
  img_url: <string>
  price: <decimal>
  […]
  top_reviews [0, 20]
    user_name: <string>
    date: <date>
    body: <string>

stores
  […]
  name: <string>
  address: <string>
  zip: <string>
  coords [2]: <double>

- […] documents of the other collection
- […] first query on the "main" collection

Many-to-Many: Reference, in the secondary side

- Example:

items
  slogan: <string>
  description: <string>
  stars: <int>
  price: <decimal>
  top_reviews [0, 20]
    […]
    user_name: <string>
    date: <date>
    body: <string>

stores
  name: <string>
  address: <string>
  city: <string>
  coords [2]: <double>
  items_sold [1, 10K, 100K]: <int>

- […] documents of the other collection
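When the references sit in the secondary collection (here `items_sold` in stores), a question asked from the items side needs a pass over stores, and resolving titles needs a second lookup in items. The sketch below uses in-memory lists; sample values and both helper functions are hypothetical.

```python
# Main collection.
items = [
    {"_id": 1, "title": "Mug"},
    {"_id": 2, "title": "Lamp"},
]

# Secondary collection holding the array of references.
stores = [
    {"name": "Downtown", "items_sold": [1, 2]},
    {"name": "Airport", "items_sold": [2]},
]


def stores_selling(item_id: int, store_docs: list) -> list:
    """Names of the stores whose reference array contains the item."""
    return [s["name"] for s in store_docs if item_id in s["items_sold"]]


def titles_sold_by(store_name: str, store_docs: list, item_docs: list) -> list:
    """Second query: resolve a store's references into item titles."""
    titles_by_id = {item["_id"]: item["title"] for item in item_docs}
    store = next(s for s in store_docs if s["name"] == store_name)
    return [titles_by_id[item_id] for item_id in store["items_sold"]]


print(stores_selling(2, stores))
print(titles_sold_by("Downtown", stores, items))
```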


One-to-Zillions Relationship

items [100K]
  _id: <int>
  title: <string>
  slogan: <string>
  description: <string>
  stars: <int>
  category: <string>
  img_url: <string>
  price: <decimal>
  top_reviews [0, 20]
    user_id: <string>
    user_name: <string>
    date: <date>
    body: <string>

item_views [100B]
  _id: <string>
  item_id: <int>
  user_id: <objectId>
  view_date: <date>

Relationship items to item_views (Crow's Foot extension): [0, 1K, 100M]

Quantifying the relationship

One-to-Zillions: Reference, in the "zillions" side
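With up to 100M views per item, the views cannot be embedded in the item, so each view document carries a reference to its item. A minimal sketch of that layout with in-memory data follows; the sample values are hypothetical, and the per-item count is computed with a plain scan.

```python
from collections import Counter

# Each view is its own small document referencing the item it belongs to.
item_views = [
    {"item_id": 1, "user_id": "u1", "view_date": "2019-06-24"},
    {"item_id": 1, "user_id": "u2", "view_date": "2019-06-24"},
    {"item_id": 2, "user_id": "u1", "view_date": "2019-06-25"},
]

# Number of views per item, derived from the references.
views_per_item = Counter(view["item_id"] for view in item_views)

print(views_per_item[1])
```

The item document itself never grows, whatever the number of views recorded against it.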


Details of a Write Operation

Data in Operation: device ID, time stamp, device metrics
Frequency: about 1.67 M/sec (100 000 000 / 60)
Data Durability: one node, no need to wait for majority

Details of a Read Operation

- Workload analysis

Description: run analytic/trend query on temperature
Operation Type: read
Data in Operation: temperature metrics
Frequency: 100/hour (10 scientists * 10 op/hr)
Data Freshness: up to the last hour
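The two frequencies and the durability requirement can be written down as a short sketch. The mapping from a durability label to a write concern is an illustration using MongoDB's `w` option, where `w: 1` requests acknowledgment from a single node and `w: "majority"` waits for a majority of replica set members; the dictionary name and its keys are hypothetical, and no driver call is made.

```python
# Write operation: 100 000 000 device writes per minute.
write_rate_per_sec = 100_000_000 / 60

# Read operation: 10 scientists, each running 10 queries per hour.
read_rate_per_hour = 10 * 10

# Hypothetical mapping from the durability column to a write concern document.
WRITE_CONCERN_BY_DURABILITY = {
    "one node": {"w": 1},
    "majority": {"w": "majority"},
}

print(f"writes: {write_rate_per_sec / 1_000_000:.2f} M/sec")
print(f"reads:  {read_rate_per_hour} per hour")
print(WRITE_CONCERN_BY_DURABILITY["one node"])
```

At this write rate, waiting for a majority on every write would add latency to a stream where losing an occasional sample is acceptable, which is the trade-off the durability line records.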
