
Computer hardware analytic for electronic commerce


DOCUMENT INFORMATION

Basic information

Title: Computer Hardware Analytic for Electronic Commerce
Authors: Ly Gia Bao, Tran Nguyen Anh Huy, Tran Anh Thai, Le Huynh Long Vy
Supervisors: Dr. Phan Trong Nhan, Assoc. Prof. Nguyen Thanh Binh
University: Ho Chi Minh University of Technology
Major: Computer Science
Document type: Graduation Thesis
Year of publication: 2021
City: Ho Chi Minh City
Pages: 87
File size: 2.2 MB


HO CHI MINH UNIVERSITY OF TECHNOLOGY COMPUTER SCIENCE AND ENGINEERING FACULTY

GRADUATION THESIS

Computer Hardware Analytic for Electronic Commerce

Department: Computer science

Advisor: Dr Phan Trong Nhan

Reviewer: Assoc Prof Nguyen Thanh Binh

-o0o -

Tran Nguyen Anh Huy 1752244

HO CHI MINH CITY, 12/2021

UNIVERSITY OF TECHNOLOGY

DEPARTMENT: HTTT
Note: Students must attach this sheet to the first page of the thesis report

FULL NAME: Ly Gia Bao, Student ID: 1752089
FULL NAME: Tran Nguyen Anh Huy, Student ID: 1752244
FULL NAME: Tran Anh Thai, Student ID: 1752494
FULL NAME: Le Huynh Long Vy, Student ID: 1752637

MAJOR: Computer Science    CLASS:

1. Thesis title:

Computer Hardware Analytics for Electronic Commerce

2. Tasks (requirements on content and initial data):

Surveying and analyzing related systems

Finding out hardware data sources for analytics

Proposing core features for the whole system towards recommendation, association rules, personalization, segmentation, and trends of hardware combination

Identifying main academic problems to be solved

Researching relevant theory (mainly association rules, prediction, recommendation techniques) and practical approaches for the core features implementation as well as for those issues

Researching in-need technologies

Designing the system with the proposed features and solutions

Developing a system prototype

Implementing and refining the system

Conducting empirical experiments

3. Task assignment date: 23/08/2021

4. Task completion date: 13/12/2021

5. Advisor's full name:    Part supervised:

1) Dr. Phan Trong Nhan

The content and requirements of the thesis have been approved by the Department.

- Date: 20 December 2021

2. Topic: Computer Hardware Analytics for Electronic Commerce

3. Advisor's full name: Dr. Phan Trong Nhan

- The students developed a system that analyzes computer components' score bands and recommends combinations of them, paying particular attention to compatibility when there are many choices, to meet the end users' needs. Both auto and manual modes with advanced options are supported. Moreover, users can customize their options as they wish. Even when no result is found, the recommendation module is triggered automatically.

- The students analyzed the problem statement as well as its related work and challenges.

- The students applied modern technologies such as Django/Python (compatible with the MVC pattern), Redis, Selenium, and Scrapy to implement the system, and deployed the database on Google Cloud Platform.

- The students were very proactive and open to novel knowledge and technology.

7. Main shortcomings of the thesis:

- The evaluation of the recommendation feature and the system testing are still limited.

- Some features need improvement, such as the dashboard (more metrics) and the component choices (further personalization, flexibility, and reliable benchmark scores).

8. Recommendation: Approved for defense [X]    Supplement before defense [ ]    Not approved for defense [ ]

9. Three questions the students must answer before the committee:

10. Overall evaluation (in words: excellent, good, average): Very good    Score: 9.5/10

Signature (full name)

Phan Trong Nhan

REVIEW OF GRADUATION THESIS

2 Major: Computer Science

3 Thesis title: Computer Hardware Analytic for Electronic Commerce

4 Reviewer: Assoc.Prof Nguyen Thanh Binh, PhD

5 Contents

- In this thesis, the authors built a system helping the users to build a computer even if they have no experience, while simultaneously providing an alternative for those who want to pick and choose the components for their own computer The tasks in this thesis:

+ Build a web-server to interact with users

+ Build a collection of data from multiple sources

+ Build a data storage

+ Develop an algorithm that will automatically generate a PC based on the user's requirements

+ Develop a simple recommendation system based on user's interactions.

- The authors should explain the system they built in more detail. The system has disadvantages for users.

- Some spelling errors in the essay

- The author should clarify the advantages of the system which was built compared to

We guarantee that this research is our own, conducted under the supervision and guidance of our instructor, Phan Trong Nhan. The results of our research are legitimate and have not been published in any form prior to this. All materials used in this research were collected by ourselves from various sources, which are appropriately listed in the references section. In case of any mistake, we will take full responsibility for it.

In this topic, we are trying to create a system that focuses on helping users build a computer even if they have no experience, while simultaneously providing an alternative for those who want to pick and choose the components for their own computer. The system’s major goals are to create a computer based on the user’s requirements, collect the necessary data, and store it.

Furthermore, using the information gathered from user interactions, we will create a simple recommendation system that identifies some of the most popular PCs and suggests them to users.

1 Introduction
1.1 Problem statement
1.2 Goal
1.3 Scope
1.4 Thesis structure
2 Methodologies and Theoretical Background
2.1 Related work
2.1.1 Manual PC builders
2.1.2 Automatic PC generator
2.1.3 PC performance calculator
2.2 Crawler
2.2.1 Selenium
2.2.2 Scrapy
2.3 Web server
2.3.1 Django
2.3.2 Redis
2.4 Google Cloud platform
3 System Requirement Analysis
3.1 Problems
3.1.1 Data gathering
3.1.2 Building PC
3.1.3 Recommendation system
3.2 Functional requirement
3.2.1 Automatic computer builder
3.2.2 Manual computer builder
3.3 Non-Functional requirement
3.4 Diagram
3.4.1 Use-case
3.4.2 Use-case Specification
4 System Design
4.1 General Architecture
4.1.1 How the system works
4.1.2 System architecture
4.2 Crawler component
4.3 Database Component
4.3.1 Database schema
4.3.2 Detailed information of each entity in the ERD
4.4 Web server Component
4.4.1 General architecture
4.4.2 Build PC module
4.4.2.a Checking the compatibility between the components
4.4.2.b Build a PC from the user’s requirements
4.4.3 Database module
4.4.4 Redis
4.5 Recommendation system
5 System Implementation
5.1 Technologies used
5.2 Building a Prototype
5.2.1 Home Page
5.2.2 PC Auto Generator Page
5.2.3 PC Auto Generator Result Page
5.2.4 PC Manual Generator Page
5.2.5 Customize PC Set Page
5.2.6 View PC Set Details Page
5.2.7 Search Product Page
5.2.8 Data Visualization
6 System Testing
6.1 Non-functional Testing
6.2 Functional Testing
7 Conclusion
7.1 Achieved result
7.2 Evaluation
7.3 Future development

1.1 Budget distribution for gaming PC[3]
2.1 Scrapy architecture[13]
2.2 Django Architecture[7]
2.3 MTV pattern[7]
3.1 Use-Case diagram
3.2 Use-Case diagram
4.1 General architecture
4.2 General idea of index page and product page
4.3 The flowchart of crawling
4.4 Database schema
4.5 Web server general architecture
4.6 Checking compatibility between the CPU and motherboard
4.7 Intel CPU generation
4.8 Workflow for auto building PC
4.9 How PC scores are calculated[16]
4.10 Workflow with data from Redis
5.1 Home Page
5.2 PC Auto Generator Page
5.3 Expand Advanced options
5.4 Generate successfully
5.5 Expand PC 1
5.6 Generate fails
5.7 Recommendation when generate fails
5.8 PC Manual Generator Page
5.9 Modify RAMs
5.10 Customize PC 1
5.11 PC 1 Details Page
5.12 Search Product Page
5.13 Dashboard
6.1 Non-functional testing for the auto builder
6.2 Non-functional testing for the auto builder result page
6.3 Non-functional testing for the manual builder
6.4 Functional testing for the auto builder
6.5 Functional testing for the auto builder (cont.)
6.6 Functional testing for the manual builder
6.7 Functional testing for the recommendation system
7.1 Non-functional testing for navigation and login page
7.2 Non-functional testing for register page
7.3 Non-functional testing for detail page and product page
7.4 Functional testing for login page and register page
7.5 Functional testing for the auto builder result page
7.6 Functional testing for detail page and product page

1 User Login
2 User Login (cont.)
3 User Register
4 Build PC Automatically
5 Build PC Manually
6 Re-examine Requirements
7 View PC Set’s Details
8 View PC Set’s Details
9 Access Seller Pages
10 Save a Complete PC Set
11 Search Product
12 Sort Product
13 View Dashboard
14 CPU table
15 Motherboard table
16 GPU table
17 RAM table
18 SSD table
19 HDD table
20 PSU table
21 CASE table
22 RAM_Usage table
23 SSD_Usage table
24 HDD_Usage table
25 PSU_Usage table
26 Main_comp table
27 Base budget percentage for each component compared to maximum budget
28 Technologies used

Abbreviation Meaning

PC Personal computer
CPU Central processing unit
GPU Graphics processing unit
RAM Random-access memory
HDD Hard disk drive
SSD Solid-state drive
PSU Power supply unit

1 Introduction

In recent years, with the advancement of technology, a PC can help us with many aspects of life, including studying, working, and entertainment. But a PC is not a simple product, and not all PCs are the same. In fact, a PC is made up of various components, and the purpose and grade of the computer may vary depending on the components used. For example, a PC used for entertainment, specifically gaming, prioritizes the budget for the graphics processing unit (GPU) first, then the CPU and memory. Other components, such as storage and the power supply unit (PSU), have lower priority because the GPU, CPU, and memory have a greater impact on the gaming experience.

Figure 1.1: Budget distribution for gaming PC[3]

A PC used for work, notably office work, on the other hand, may not even require a GPU, but rather a strong CPU and enough RAM to perform office tasks well. Compared with the heavy image-processing load of running a game, office tasks frequently involve only documents and calculations.

Each component of a PC may have many manufacturers, and each of them may have its own specifications that need to be taken into consideration when purchasing.

As a result, a system that allows a user to select from a wide range of components and build a PC that meets their needs while staying within budget has proven to be beneficial.

1.2 Goal

The main objective of the thesis is to develop a web-based system that assists users in the process of building a PC by checking the compatibility and performance of the components and making sure that the result matches the user’s requirements, while also providing a tool for those who want to pick and choose their own components. The system needs a way to collect the necessary data from multiple websites, specifically computer hardware websites, as well as storage for that data. In addition, the thesis proposes a simple recommendation system based on the user’s interactions with the system.

In the scope of this thesis, we will focus on:

• Build a web-server to interact with users

• Build a collection of data from multiple sources

• Build a data storage

• Develop an algorithm that will automatically generate a PC based on user’s requirements

• Develop a simple recommendation system based on user’s interactions

The content of the thesis is presented as follows:

• Chapter 1 Introduction: Gives an overview, the scope, and the goal of the thesis

• Chapter 2 Methodologies and Theoretical Background: Presents the theoretical background of the related topics as well as the tools and technologies that will be used

• Chapter 3 System Requirement Analysis: Analyzes the requirements of the system as well as the functions it will provide

• Chapter 4 System Design: Presents the design and architecture of the entire system as well as its components

• Chapter 5 System Implementation: Describes how each component of the system is implemented

• Chapter 6 System Testing: Presents the test results of the system

• Chapter 7 Conclusion: Summarizes the achieved results as well as the direction of development of the topic.

It operates by determining which components are compatible with the ones the user selected earlier. However, some of the restrictions are not checked or are checked incorrectly, for example when the number of random-access memory (RAM) modules you can choose exceeds the number of slots supported by the motherboard, or when you can choose an Intel CPU to go with a motherboard that only supports AMD CPUs. Only a dedicated website like https://pcpartpicker.com/ is reliable, as it thoroughly checks all compatibility issues whenever a new component is added to the build.

For the automatic PC generation feature, our group could only locate one site, https://wccftech.com/pc-builder/. It has some options for users to specify their requirements before generating the PC. The result is generated quickly and matches the user’s requirements.

However, the system has some downsides. First, it may not yield any results at some budget points. Second, there are only a few options for users to describe their PC requirements. Finally, for a specific budget, the system yields only one result.

In order to calculate a PC’s performance, a specific testing application must be executed on the PC. Therefore, you need actual hardware to run the testing program, which provides the performance scores for the PC components. This requires a big pool of sample hardware in order to have enough data.

To do this, the https://www.userbenchmark.com/ website created software that calculates the performance of the PC running it and published that software for users to download. The software then gathers the data from users and displays it on the website.

In this thesis, we will use the performance data from this website.

In our project, we will be using Selenium on Python with the Google Chrome driver. Through the Selenium Python API, we can access all functionalities of Selenium WebDriver in an intuitive way.

Scrapy[14] is an application framework for crawling websites and extracting structured data, which can be used for a wide range of useful applications, like data mining, information processing, or historical archival.

Even though Scrapy was originally designed for web scraping, it can also be used to extract data using APIs (such as Amazon Associates Web Services) or as a general-purpose web crawler.

Scrapy architecture:[13]

Figure 2.1: Scrapy architecture[13]

The data flow in Scrapy is controlled by the execution engine, and goes like this:

1. The Engine gets the initial Requests to crawl from the Spider.

2. The Engine schedules the Requests in the Scheduler and asks for the next Requests to crawl.

3. The Engine sends the Requests to the Downloader, passing through the Downloader Middlewares.

4. Once the page finishes downloading, the Downloader generates a Response (with that page) and sends it to the Engine, passing through the Downloader Middlewares.

5. The Engine receives the Response from the Downloader and sends it to the Spider for processing, passing through the Spider Middleware.

6. The Spider processes the Response and returns scraped items and new Requests (to follow) to the Engine, passing through the Spider Middleware.

7. The Engine sends processed items to Item Pipelines, then sends processed Requests to the Scheduler and asks for possible next Requests to crawl.

8. The process repeats (from step 1) until there are no more requests from the Scheduler.

Scrapy components:[13]

• Scrapy Engine: The engine is responsible for controlling the data flow between all components of the system, and triggering events when certain actions occur.

• Scheduler: The Scheduler receives requests from the engine and enqueues them for feeding them later (also to the engine) when the engine requests them.

• Downloader: The Downloader is responsible for fetching web pages and feeding them to the engine which, in turn, feeds them to the spiders.

• Spiders: Spiders are custom classes written by Scrapy users to parse responses and extract items from them or additional requests to follow.

• Item Pipeline: The Item Pipeline is responsible for processing the items once they have been extracted (or scraped) by the spiders. Typical tasks include cleansing, validation and persistence (like storing the item in a database).

• Downloader middlewares: Downloader middlewares are specific hooks that sit between the Engine and the Downloader and process requests when they pass from the Engine to the Downloader, and responses that pass from the Downloader to the Engine.

• Spider middlewares: Spider middlewares are specific hooks that sit between the Engine and the Spiders and are able to process spider input (responses) and output (items and requests).
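The data flow and components described above can be mimicked with a small in-memory loop. This is only a toy sketch written for illustration, not Scrapy's own code: the `FAKE_WEB` dictionary, the function names, and the dictionary-based requests and items are all assumptions, and middlewares are left out.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page title, outgoing links).
FAKE_WEB = {
    "shop/index": ("Index", ["shop/cpu-1", "shop/gpu-1"]),
    "shop/cpu-1": ("CPU A", []),
    "shop/gpu-1": ("GPU B", ["shop/cpu-1"]),
}


def downloader(url):
    """Downloader: fetch a page and wrap it in a response."""
    title, links = FAKE_WEB[url]
    return {"url": url, "title": title, "links": links}


def spider_parse(response):
    """Spider: yield one scraped item, then the requests to follow."""
    yield {"item": response["title"], "source": response["url"]}
    for link in response["links"]:
        yield {"request": link}


def item_pipeline(item, storage):
    """Item Pipeline: clean the item and persist it."""
    storage.append({"name": item["item"].strip(), "source": item["source"]})


def engine(start_urls):
    """Engine: move requests, responses and items between the components."""
    scheduler = deque(start_urls)  # Scheduler: queue of pending requests
    seen = set(start_urls)         # never schedule the same URL twice
    storage = []
    while scheduler:               # repeat until no requests remain
        response = downloader(scheduler.popleft())
        for output in spider_parse(response):
            if "request" in output:
                if output["request"] not in seen:
                    seen.add(output["request"])
                    scheduler.append(output["request"])
            else:
                item_pipeline(output, storage)
    return storage


crawled = engine(["shop/index"])
```

In real Scrapy the engine, scheduler and downloader are provided by the framework, and only the spider and the item pipeline are normally written by the user.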

Django is a high-level Python web framework that encourages rapid development and clean, pragmatic design. It offers a standard method for fast and effective website development by providing many tools and built-in features to handle common web development tasks like user authentication, content administration, site maps, etc.

Figure 2.2: Django Architecture[7]

Django is designed around the Model-Template-View (MTV) pattern, which is a little different from the commonly used Model-View-Controller (MVC) pattern. The main difference is that Django itself manages the user’s interaction, which is the Controller part in MVC.

Figure 2.3: MTV pattern[7]

After the user’s interaction is handled, a view is called, which interacts with the model and templates to produce a response that is returned to the user.
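To make the division of roles in MTV concrete, the following framework-free sketch imitates it with the standard library only. It is not Django code: `PRODUCTS`, `PRODUCT_TEMPLATE`, `product_view`, `URL_PATTERNS` and `dispatch` are hypothetical names, and a plain dictionary stands in for the model layer.

```python
from string import Template

# Model: the data layer (a dictionary stands in for Django models here).
PRODUCTS = {1: {"name": "Example CPU", "price": 199}}

# Template: presentation only, no business logic.
PRODUCT_TEMPLATE = Template("<h1>$name</h1><p>Price: $price USD</p>")


def product_view(product_id):
    """View: read from the model and render the result with the template."""
    product = PRODUCTS.get(product_id)
    if product is None:
        return 404, "Not found"
    return 200, PRODUCT_TEMPLATE.substitute(product)


# The "Controller" role: in Django the framework itself maps URLs to views.
URL_PATTERNS = {"product": product_view}


def dispatch(path):
    """Route a path such as 'product/1' to the matching view."""
    name, _, arg = path.partition("/")
    view = URL_PATTERNS.get(name)
    if view is None or not arg.isdigit():
        return 404, "Not found"
    return view(int(arg))


status, body = dispatch("product/1")
```

The point of the sketch is that the developer writes only the model, the template and the view, while the routing step is owned by the framework.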

Advantages of Django[8]

• Django is easy to set up and run

• It provides an easy-to-use interface for various administrative activities

• Helps you to define patterns for the URLs in your application

• Offers built-in authentication system

• Cache framework comes with multiple cache mechanisms

• High-level framework for rapid web development

• A complete stack of tools

• Data modelled with Python classes

Disadvantages of Django[8]

• It is a monolithic platform

• High dependence on the Django ORM; broad knowledge is required

• Only allows you to handle a single request at a time

• Routing requires some knowledge of regular expressions

A few general steps to build an application on Django:[4]

• Design the model

• Install the model

• Use the provided API to access the data

• Use the administrative interface to set up the data

• Design URLs

• Design template

• Run the application


2.3.2 Redis

Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache, and message broker. Redis provides data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes, and streams. Redis has built-in replication, Lua scripting, LRU eviction, transactions, and different levels of on-disk persistence, and provides high availability via Redis Sentinel and automatic partitioning with Redis Cluster.[12]

In-memory databases are purpose-built databases that rely primarily on memory for data storage, in contrast to databases that store data on disk or SSDs. In-memory data stores are designed to enable minimal response times by eliminating the need to access disks. Because all data is stored and managed exclusively in main memory, in-memory databases risk losing data upon a process or server failure. In-memory databases can persist data on disks by storing each operation in a log or by taking snapshots.[1]
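The log-based persistence idea can be illustrated with a toy key-value store. This is a minimal sketch, not how Redis is implemented: the `TinyStore` class, the JSON-lines log format and the file name are assumptions made for the example.

```python
import json
import os
import tempfile


class TinyStore:
    """Dictionary held in memory, backed by an append-only log on disk."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._replay()

    def _replay(self):
        # Rebuild the in-memory state from the log after a restart.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as log:
            for line in log:
                key, value = json.loads(line)
                self.data[key] = value

    def set(self, key, value):
        # Record the operation in the log first, then apply it in memory.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps([key, value]) + "\n")
        self.data[key] = value

    def get(self, key):
        # Reads are served from memory and never touch the disk.
        return self.data.get(key)


log_file = os.path.join(tempfile.mkdtemp(), "store.log")
store = TinyStore(log_file)
store.set("cpu:1", {"name": "Example CPU", "score": 90})
restored = TinyStore(log_file)  # simulates a process restart
```

Reads stay fast because they only hit the dictionary, while the log lets a new process recover the state that would otherwise be lost on failure.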

There is no official support for Windows builds[12], so we cannot use Redis directly from the provider. Instead, we use it through third-party software, Memurai[11], which is developed by Janea Systems.

Google Cloud Platform (GCP), offered by Google, is a suite of cloud computing services that runs on the same infrastructure that Google uses internally for its end-user products, such as Google Search, Gmail, Google Drive, and YouTube. Alongside a set of management tools, it provides a series of modular cloud services including computing, data storage, data analytics, and machine learning. Registration requires a credit card or bank account details.

Google Cloud Platform provides infrastructure as a service, platform as a service, and serverless computing environments, as well as enterprise mapping services.

In this project, we will take advantage of Cloud SQL provided by GCP

Key features of Cloud SQL[6]

• Fully managed: Cloud SQL automatically ensures the databases are reliable, secure, and scalable so that your business continues to run without disruption.

• Integrated: Access Cloud SQL instances from just about any application.

• Reliable: Easily configure replication and backups to protect your data. Go further by enabling automatic failover to make your database highly available.

• Easy migrations to Cloud SQL: Database Migration Service (DMS) makes it easy to migrate your production databases to Cloud SQL with minimal downtime.

Some advantages of using Cloud SQL[6]

• Reduce maintenance cost with fully managed MySQL, PostgreSQL and SQL Server databases

• Ensure business continuity with reliable and secure services backed by a 24/7 SRE team

• Automate database provisioning, storage capacity management, and other time-consuming tasks

• Database observability made easy for developers with Cloud SQL Insights

• Easy integration with existing apps and Google Cloud services like GKE and BigQuery

Data is always one of the most critical aspects of any project. As a result, acquiring data is a must-do step before beginning to construct anything. However, this procedure may raise issues along the way that require immediate and suitable solutions. While conducting studies for this thesis, we identified two data collection issues, which are discussed below.

First and foremost, data cannot be downloaded if the crawler is unable to locate it. Today’s websites are built with very diverse and advanced techniques to provide their users with the best experience. For example, with the use of AJAX, a website can dynamically load content as the user interacts with the page instead of putting all the content into the HTML in one go. This prevents the crawler from getting the desired data, because the data is not present in the HTML at the time the crawler fetches it. As a result, we have to use tools with browser integration to get the content of the website, or find a way to send the request that retrieves the data directly from the server.
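The browser-based option can be sketched with Selenium: load the page in headless Chrome, wait until the dynamically loaded element exists, and only then read the HTML. The function name `fetch_rendered_html`, its parameters and the selector passed by the caller are hypothetical, and the sketch assumes Selenium 4 with a Chrome driver available on the machine.

```python
def fetch_rendered_html(url, css_selector, timeout=10):
    """Return the page HTML once an element matching css_selector exists."""
    # Imported here so this module still loads where Selenium is missing.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("--headless")  # run without a visible window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Block until the AJAX-loaded content is present in the DOM.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, css_selector))
        )
        return driver.page_source
    finally:
        driver.quit()  # always release the browser process
```

A call such as `fetch_rendered_html(product_url, ".price")` would return markup that already contains the price block, which can then be handed to an ordinary HTML parser. Driving a real browser is much slower than a plain HTTP request, so it is best reserved for pages that cannot be read any other way.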

Second, the data collection process can sometimes make too many visits to a website that contains a lot of information. This can overload the server, which affects the experience of the users on that website. Moreover, it can get even worse when the website owner detects that a bot is accessing their site, leading them to set up prevention methods such as blocking IP addresses or requiring verification before allowing access. Therefore, we need our crawler to behave as much like a human as possible while respecting the website owner by reducing the number of requests or increasing the interval between visits.
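One simple way to increase the interval between visits is a throttle that enforces a minimum, slightly randomized gap between consecutive requests. The class below is an illustrative sketch; `PoliteThrottle` and its default delays are assumptions, not values taken from the thesis.

```python
import random
import time


class PoliteThrottle:
    """Enforce a minimum, slightly randomized gap between requests."""

    def __init__(self, min_delay=2.0, jitter=1.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_delay = min_delay
        self.jitter = jitter
        self.clock = clock          # injectable so the class can be tested
        self.sleep = sleep
        self.last_request = None

    def wait(self):
        """Block until it is acceptable to send the next request."""
        gap = self.min_delay + random.uniform(0, self.jitter)
        if self.last_request is not None:
            elapsed = self.clock() - self.last_request
            if elapsed < gap:
                self.sleep(gap - elapsed)
        self.last_request = self.clock()
```

The crawler would call `throttle.wait()` immediately before each page request. The random jitter keeps the visits from arriving at perfectly regular intervals.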

In order for a PC to work properly, we need to take compatibility into consideration when combining the components. In the process of generating PCs based on the user’s requirements, handling the budget is the most difficult part. The budget needs to be split reasonably between the components: focusing too much on a single component can cause a bottleneck, while setting a fixed budget for each component may return slightly worse results because many different combinations are ruled out by the fixed budget. So we need to set a suitable and flexible price range for each component based on the purpose of the PC.
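A flexible price range can be derived from a base share of the budget plus a tolerance around it. The sketch below is only an illustration of that idea: the shares in `BASE_SHARE` and the 25% tolerance are made-up numbers, not the percentages used in the thesis.

```python
# Hypothetical base shares of the total budget per purpose (illustrative
# numbers only, not the thesis's actual percentages).
BASE_SHARE = {
    "gaming": {"gpu": 0.40, "cpu": 0.20, "ram": 0.10, "mainboard": 0.10,
               "storage": 0.10, "psu": 0.05, "case": 0.05},
    "office": {"cpu": 0.30, "ram": 0.20, "mainboard": 0.15,
               "storage": 0.20, "psu": 0.10, "case": 0.05},
}


def price_ranges(budget, purpose, tolerance=0.25):
    """Return a (low, high) price window for each component."""
    ranges = {}
    for component, share in BASE_SHARE[purpose].items():
        target = budget * share
        ranges[component] = (round(target * (1 - tolerance), 2),
                             round(target * (1 + tolerance), 2))
    return ranges


ranges = price_ranges(1000, "gaming")
```

With a budget of 1000 and the assumed gaming shares, the GPU window spans 300 to 500 instead of a single fixed price, so a slightly cheaper or more expensive GPU can still be combined with other parts as long as the total stays within budget.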

Running a combination over all of the available components can cause a drastic increase in run time, which will slow down the process. To prevent this, some steps need to be taken to reduce the input components before running the combination, as well as to set a proper limit on how many results are needed.
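Both measures can be sketched in a few lines: first keep only a handful of well-scoring parts per category, then enumerate combinations lazily and stop once enough valid builds are found. The function names, the part dictionaries and the sample prices and scores are hypothetical.

```python
from itertools import islice, product


def top_candidates(parts, low, high, keep=3):
    """Keep only the best-scoring parts inside the price window."""
    in_range = [p for p in parts if low <= p["price"] <= high]
    return sorted(in_range, key=lambda p: p["score"], reverse=True)[:keep]


def generate_builds(candidates, budget, limit=5):
    """Lazily enumerate combinations and stop after `limit` valid builds."""
    names = list(candidates)
    combos = product(*(candidates[name] for name in names))
    affordable = (
        dict(zip(names, combo))
        for combo in combos
        if sum(part["price"] for part in combo) <= budget
    )
    return list(islice(affordable, limit))


cpus = [{"name": "CPU A", "price": 200, "score": 80},
        {"name": "CPU B", "price": 350, "score": 95},
        {"name": "CPU C", "price": 900, "score": 99}]
gpus = [{"name": "GPU A", "price": 300, "score": 70},
        {"name": "GPU B", "price": 500, "score": 90}]

candidates = {"cpu": top_candidates(cpus, 100, 400),
              "gpu": top_candidates(gpus, 200, 600)}
builds = generate_builds(candidates, budget=800)
```

Because `product` and the generator expression are lazy, `islice` stops the enumeration as soon as the limit is reached, so the full Cartesian product is never materialized.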

In this thesis, the recommendation system is used when the user cannot find any result from the PC building process. It works based on the interactions of users with the website. Those records help to find which PC is the most popular, and the system takes it as the recommendation. However, users may spam those interactions, which may lead to incorrect results when applying the recommendation system. To solve this, we need to limit the score that can be gained from each interaction session between a user and a PC.
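The capping idea can be sketched as follows: points are first summed per (session, PC) pair, then clipped before being added to the PC's popularity. The action names, their weights and the cap value are hypothetical, since this passage does not state concrete values.

```python
from collections import defaultdict

# Hypothetical weights and cap; the passage gives no concrete values.
ACTION_POINTS = {"view": 1, "customize": 2, "export": 3}
SESSION_CAP = 5


def popularity(events):
    """events: iterable of (session_id, pc_id, action) tuples."""
    per_session = defaultdict(int)
    for session_id, pc_id, action in events:
        per_session[(session_id, pc_id)] += ACTION_POINTS.get(action, 0)
    totals = defaultdict(int)
    for (_, pc_id), points in per_session.items():
        totals[pc_id] += min(points, SESSION_CAP)  # spam cannot exceed cap
    return dict(totals)


def recommend(events, top_n=3):
    """Return the ids of the most popular PCs, best first."""
    scores = popularity(events)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


events = [("s1", "pc1", "view")] * 20 + [
    ("s2", "pc2", "export"),
    ("s3", "pc2", "export"),
]
ranking = recommend(events)
```

In this sample, one session viewing `pc1` twenty times contributes only 5 points, so `pc2`, which was exported in two separate sessions, still ranks first.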

The following are the necessary functions that the system should provide:

• Build a complete computer set with great compatibility and performance automatically

• Help create a brand new computer set from scratch

• Provide a more detailed glimpse of the selected computer set

• Provide further details when users search for a specific item

• Offer to preserve a copy of the desired computer set

3.2.1 Automatic computer builder

• Help basic users obtain computer sets based only on goal and budget

• Include various criteria for users to better describe their conditions

• Allow users to review their standards after receiving results

• Show the list of the computer sets that were found, together with crucial features like performance and price

• Provide users with a more complete view of the computer they’re interested in

• Provide search forms for customers to fill out in order to find their desired items

• Ensure that all of the components are compatible and operate properly together

• Allow users to add different items to the RAM, SSD, and HDD categories as long as they are compatible with the mainboard

• Allow users to modify the quantity of RAMs, SSDs, and HDDs as long as they are compatible with the mainboard

• Allow users to change the components they’ve chosen

• Update total cost automatically when modifying the computer set
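The compatibility and total-cost requirements for the manual builder can be illustrated with a small rule checker. This is a sketch under assumptions: the field names (`socket`, `ram_slots`, `ram_type`, `quantity`) and the three rules shown are hypothetical and cover only a fraction of what a real checker needs.

```python
def check_compatibility(cpu, mainboard, ram_modules):
    """Return a list of readable problems; an empty list means compatible."""
    problems = []
    if cpu["socket"] != mainboard["socket"]:
        problems.append("CPU socket does not match the mainboard")
    if len(ram_modules) > mainboard["ram_slots"]:
        problems.append("more RAM modules than mainboard slots")
    if any(ram["type"] != mainboard["ram_type"] for ram in ram_modules):
        problems.append("RAM type is not supported by the mainboard")
    return problems


def total_cost(parts):
    """Sum price times quantity so the total follows every change."""
    return sum(part["price"] * part.get("quantity", 1) for part in parts)


cpu = {"name": "Example CPU", "socket": "AM4", "price": 200}
board = {"name": "Example board", "socket": "LGA1200",
         "ram_slots": 2, "ram_type": "DDR4", "price": 120}
ram = [{"type": "DDR4"}] * 3

issues = check_compatibility(cpu, board, ram)
cost = total_cost([cpu, board, {"price": 40, "quantity": 2}])
```

Returning a list of problems rather than a single boolean lets the interface tell the user exactly which choice has to change.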

• The system offers an easy-to-use interface

• The system is able to respond quickly

• The system can be loaded in almost all web browsers

• The system can be loaded on almost all operating system platforms

• User passwords shall be hashed with the PBKDF2 algorithm with a SHA256 hash

• The system shall be easy for users to interact with

• The system shall check the validity of user inputs

• The system shall crawl data from trusted pages
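The password requirement above can be illustrated with the standard library. Django's default password hasher already applies PBKDF2 with SHA256, so an application would normally rely on it; the sketch below only shows the underlying idea, and the function names, the iteration count and the salt size are assumptions.

```python
import hashlib
import hmac
import os

ITERATIONS = 260_000  # assumed work factor; tune for the target hardware


def hash_password(password, salt=None):
    """Return (salt, derived_key) to store instead of the raw password."""
    salt = salt if salt is not None else os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                              salt, ITERATIONS)
    return salt, key


def verify_password(password, salt, expected_key):
    """Re-derive the key from the attempt and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)


salt, key = hash_password("s3cret")
```

Because PBKDF2 is a one-way key derivation function, the stored value cannot be decrypted back into the password; login succeeds only if the same key is derived again from the submitted password and the stored salt.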


3.4 Diagram

General use-case diagram for all users of the system:

• Actor: Member, Visitor

• Number of use-cases: 6

Figure 3.1: Use-Case diagram

Detailed use-case diagram for all users of the system:

• Actor: User

• Number of use-cases: 11

Figure 3.2: Use-Case diagram

• User account has already been created

• User has accessed the system

Post-Condition(s) User successfully logged into the system

Basic Flow

1 User clicks on the Login button on the top right of the website

2 System redirects User to Login page

3 User enters their Username and Password into the fields in the login form

4 System validates the login information successfully and redirects User to the previous page

Alternative Flow

1a User clicks ’Show detailed information’ button after they have built a computer set by the Manual Generator or after they receive a list of computer sets from the Auto Generator

Use-case continues to step 3

4a System validates the login information successfully and redirects User to the PC’s detail page

4b If user registers and then logs in, system redirects user to Home page

Exception Flow

4b System fails to validate the login information and displays the message

4b1 User returns to step 3

4b2 User chooses to cancel the login and goes back to previous page

Table 2: User Login (cont.)

Use Case ID 2

Use Case Name User Register

Description User enters their information to create an account and then can use their login information to log into the website

Pre-Condition(s) User has accessed the system

Post-Condition(s) User successfully created an account

Basic Flow

1 User clicks on the Sign-up button on the top right of the website

2 System redirects User to Register page

3 User enters their Username, Email, Password and Confirmed Password into the fields in the register form

4 System validates the register information successfully

5 System redirects User to the login page, and displays the successful message

4b1 User returns to step 3

4b2 User chooses to cancel the registration and goes back to previous page

Table 3: User Register

Use Case ID 3

Use Case Name Build PC Automatically

Description User describes their conditions by filling in the form, and the system generates a list of computer sets that match the user’s wishes

Pre-Condition(s) User has accessed the system

Post-Condition(s) User receives a list of computer sets which match their conditions

Basic Flow

1 User clicks the Generator category and chooses Auto feature on the navigation bar

2 System redirects User to the PC Auto Builder Form page

3 User selects their goal, fills out the budget and submits

4 System validates the form successfully

5 System redirects User to the result page, and displays the list of appropriate computer sets

Use Case ID 4

Use Case Name Build PC Manually

Description User builds a computer set from the beginning

Pre-Condition(s) User has accessed the system

Post-Condition(s) User receives a complete and compatible computer set

Basic Flow

1 User clicks the Generator category and chooses Manual feature on the navigation bar

2 System redirects User to the Manual Generator page

3 User completes the form for the component they want

4 System returns a list of appropriate products that meet user’s requirements

5 User selects one or more items, depending on the component’s category and the computer’s compatibility

6 User repeats step 3 for each additional component until they are pleased with the current computer configuration

Alternative Flow

6a User removes the selected item and re-selects another

6b User adjusts the quantity of the selected item (only RAM, SSD, HDD)

Use-case continues step 6

Exception Flow

4b System fails to retrieve any items that match user’s requirements

4b1 User returns to step 3

Table 5: Build PC Manually

Use Case ID 5

Use Case Name Re-examine Requirements

Description User has received the Auto Generator’s result and wishes to review their inputs once again

Pre-Condition(s) User has received the Auto Generator’s result and is on the result page

Post-Condition(s) User sees their specified requirements

Basic Flow

1 User clicks on the drop-down bar near the top of the page

2 System shows the user’s requests

Alternative Flow

Exception Flow

Table 6: Re-examine Requirements

Use Case ID 6

Use Case Name Customize a PC Set

Description User customizes the PC set that they have chosen

Pre-Condition(s) User has received the Auto Generator’s result and is on the result

2 System shows general information of the computer set

3 User clicks the ’Customize this build’ button from that computer set

4 System redirects user to the Manual Generator page, which holds all the information of the selected computer set

5 User customizes the set until they are pleased

Alternative Flow

Exception Flow

Table 7: View PC Set’s Details

2 System shows general information of the computer set.

3 User clicks the ’Show detailed information’ button from that computer set

4 System redirects user to the PC’s details page

Use Case ID 8

Use Case Name Access Seller Pages

Description User wants to know where they can get the components for the computer set they’ve picked

Pre-Condition(s) User has selected a computer set and is viewing it on the detail page.

Post-Condition(s) User is directed to the seller pages

3 User clicks the link they want

4 System redirects user to the appropriate pages

Use Case ID 9

Use Case Name Save a Complete PC Set

Description User is able to save a copy of their chosen computer set

Pre-Condition(s) User has selected a computer set and is viewing it on the detail page.

Post-Condition(s) User has a copy of their chosen computer set on their local device

Basic Flow

1 User clicks the ’Export to PDF’ button at the end of the page

2 System sends a download request to the user’s browser

3 User receives a PDF file containing the in-depth information of the selected computer set

Alternative Flow

Exception Flow

Table 10: Save a Complete PC Set

Use Case ID 10

Use Case Name Search Product

Description User searches for a product they want

Pre-Condition(s) User has already accessed the website

Post-Condition(s) User gets the information of the selected product

Basic Flow

1 User clicks the Product category and chooses the type of component they want on the navigation bar

2 System redirects user to the page listing all of the products of the chosen type

3 User browses for the desired product

Alternative Flow

3a User reduces the number of products to browse by using the filters on the left side

4a System returns the products that match the user’s requirements.

Use case continues at step 3

Use Case ID 11

Use Case Name Sort Product

Description The user sorts the products into the desired order

Pre-Condition(s) User is on the Product page

Post-Condition(s) User gets the list of products in the desired order

Basic Flow

1 User clicks on the header of the column they want to sort

2 System sorts the list and displays it to the user

Description User views the dashboard

Pre-Condition(s) User has accessed the system

Post-Condition(s) User can view the dashboard page

Basic Flow

1 User clicks ’View dashboard’ in the navigation bar

2 User is redirected to the dashboard page

Alternative Flow

Exception Flow

Table 13: View Dashboard

4 System Design

Data is collected from hardware component websites (https://www.userbenchmark.com/, https://gearvn.com/, https://phongvu.vn/, etc.) and then pushed to the database for storage. The web server then uses that data from the database, in conjunction with Redis, to provide services to users through the browser.

Figure 4.1: General architecture

The system consists of 3 main components:

• Crawler: This component is responsible for collecting data from across the Internet.

• Google Cloud SQL: The main database of the whole system. It is hosted on Google Cloud, which provides features that help make the database easy to manage, reliable, and secure.

• Web Server: The primary component, which is in charge of most of the features in this thesis and uses the data from the database. Redis serves as a cache for the server to improve performance.
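The interaction between the web server, Redis, and the database can be illustrated with a minimal cache-aside sketch. The thesis does not state which caching strategy is used, so cache-aside is an assumption here. `InMemoryCache` is a hypothetical stand-in that exposes only `get` and `set`, so the example runs without a Redis server; `get_product`, the `product:<id>` key format, and the `query_db` callback are also illustrative names.

```python
import json
from typing import Callable, Dict, Optional


class InMemoryCache:
    """Hypothetical stand-in for a cache client; stores string values by key."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._store.get(key)

    def set(self, key: str, value: str) -> None:
        self._store[key] = value


def get_product(product_id: int,
                cache: InMemoryCache,
                query_db: Callable[[int], dict]) -> dict:
    """Return a product record, consulting the cache before the database."""
    key = f"product:{product_id}"  # assumed key naming scheme
    cached = cache.get(key)
    if cached is not None:
        # Cache hit: skip the database entirely.
        return json.loads(cached)
    # Cache miss: read from the database, then store the result for next time.
    row = query_db(product_id)
    cache.set(key, json.dumps(row))
    return row
```

With this arrangement, repeated requests for the same product reach the database only once. A real deployment would also need an expiry or invalidation policy so that cached entries do not outlive updated prices.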

On the hardware component websites that we collect data from, each type of component generally has two kinds of pages: the index page and the product page. The index page is where the list of products is displayed; for each product, it usually contains only the name and the link to that product’s page. The product page is where the data we need is located, so it is the page we extract data from.

Figure 4.2: General idea of index page and product page

Starting from an initial page, which can be the home page or the first index page, we collect all of its URLs to product pages and then move on to the next index page. We repeat this process until we reach the last index page and have collected all the product page URLs.
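As a minimal sketch of handling one index page, the following uses only the Python standard library to pull product links and the link to the next index page out of static HTML. The CSS class names `product-link` and `next-page` are hypothetical, since every real site uses its own markup, and `parse_index_page` is an illustrative helper rather than the thesis's actual code.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects href values of <a> tags that carry a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if self.css_class in classes and attr.get("href"):
            self.links.append(attr["href"])


def parse_index_page(html, base_url):
    """Return (product URLs, next index URL or None) for one index page."""
    products = LinkCollector("product-link")  # hypothetical class name
    products.feed(html)
    pager = LinkCollector("next-page")  # hypothetical class name
    pager.feed(html)
    # Resolve relative links against the URL of the page they came from.
    product_urls = [urljoin(base_url, href) for href in products.links]
    next_url = urljoin(base_url, pager.links[0]) if pager.links else None
    return product_urls, next_url
```

A page with no `next-page` link yields `None` as the second value, which signals that the last index page has been reached.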

During the crawling process, we encountered a few different ways in which a website presents its content:

• All the content is present in the HTML when the web page is loaded. This is the easiest type of page to extract information from, as everything is already in the HTML: we can request the HTML and then read the data. Product pages are usually presented this way.

• No content, or only some of it, is available in the initial HTML, and the rest is produced by JavaScript code that runs when the browser loads the page or on user interaction. We cannot simply request the HTML to extract the data, because the initial HTML no longer contains the information we need; it is added by the JavaScript after the HTML is loaded. This type of page is usually the index page, where the list of products is located. In this case, we have 2 solutions:

– Solution 1: We mimic the request that the JavaScript makes and use it to get the data directly from the server. We then extract the data from the server’s JSON response.

– Solution 2: We use Selenium, which can control the browser and simulate user actions such as clicking or scrolling, to render the data into the HTML for extraction.
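Solution 1 can be sketched in two parts: building the same request URL that the page's JavaScript would send, and reading product links out of the JSON that comes back. The endpoint, the `category` and `page` parameters, and the `items`/`url` field names below are all hypothetical; in practice they would be found by inspecting the site's network traffic. The HTTP call itself is left out so that the sketch stays runnable offline.

```python
import json
from urllib.parse import urlencode


def build_listing_request(api_base: str, category: str, page: int) -> str:
    """Build the URL of the (assumed) listing endpoint for one result page."""
    query = urlencode({"category": category, "page": page})
    return f"{api_base}?{query}"


def extract_product_urls(response_text: str) -> list:
    """Read product URLs from a JSON response body with an assumed shape."""
    payload = json.loads(response_text)
    # Skip entries that lack a URL instead of failing on them.
    return [item["url"] for item in payload.get("items", []) if "url" in item]
```

Compared with driving a browser through Selenium, this approach avoids rendering the page at all, but it depends on the site's internal API, which can change without notice.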

Overall, the crawling follows these steps:

Figure 4.3: The flowchart of crawling

• Step 1: Initialize the "Index" queue with the index URLs. These URLs are unique.

• Step 2: Loop through the "Index" queue to get each index page and extract the product URLs from it. Put all the product URLs into the "Product" queue.

• Step 3: Loop through the "Product" queue to get each product page and extract data from it. The data is recorded in a JSON file.

• Step 4: When the "Product" queue is empty, the process stops.
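The four steps above can be sketched as a two-queue loop. This is an illustrative outline, not the thesis's actual crawler: `fetch`, `extract_product_urls`, and `extract_product_data` are hypothetical callbacks supplied by the caller, and skipping product URLs that were already queued is an added assumption.

```python
import json
from collections import deque


def crawl(index_urls, fetch, extract_product_urls, extract_product_data):
    """Run the two-queue crawl and return the list of extracted records."""
    # Step 1: fill the "Index" queue with unique index URLs, keeping order.
    index_queue = deque(dict.fromkeys(index_urls))
    product_queue = deque()
    seen = set()

    # Step 2: drain the "Index" queue and collect product URLs.
    while index_queue:
        page = fetch(index_queue.popleft())
        for url in extract_product_urls(page):
            if url not in seen:  # assumption: a product is crawled only once
                seen.add(url)
                product_queue.append(url)

    # Steps 3 and 4: drain the "Product" queue; stop when it is empty.
    records = []
    while product_queue:
        page = fetch(product_queue.popleft())
        records.append(extract_product_data(page))
    return records


def save_records(records, path):
    """Write the extracted records to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(records, handle, ensure_ascii=False, indent=2)
```

Passing the fetching and parsing logic in as callbacks keeps the loop independent of whether a page is read from static HTML, a JSON endpoint, or a Selenium-rendered document.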

Date posted: 02/06/2022, 20:18
