HO CHI MINH UNIVERSITY OF TECHNOLOGY
COMPUTER SCIENCE AND ENGINEERING FACULTY
GRADUATION THESIS
Computer Hardware Analytic for Electronic Commerce
Department: Computer science
Advisor: Dr Phan Trong Nhan
Reviewer: Assoc Prof Nguyen Thanh Binh
-o0o -
Tran Nguyen Anh Huy 1752244
HO CHI MINH CITY, 12/2021
UNIVERSITY OF TECHNOLOGY
DEPARTMENT: HTTT (Information Systems)
Note: Students must attach this sheet to the first page of the thesis report.
FULL NAME: Ly Gia Bao, Student ID: 1752089
FULL NAME: Tran Nguyen Anh Huy, Student ID: 1752244
FULL NAME: Tran Anh Thai, Student ID: 1752494
FULL NAME: Le Huynh Long Vu, Student ID: 1752637
MAJOR: Computer Science  CLASS:
1. Thesis title:
Computer Hardware Analytics for Electronic Commerce
2. Tasks (requirements on content and initial data):
Surveying and analyzing related systems
Finding out hardware data sources for analytics
Proposing core features for the whole system towards recommendation, association rules, personalization, segmentation, and trends of hardware combination
Identifying main academic problems to be solved
Researching relevant theory (mainly association rules, prediction, recommendation
techniques) and practical approaches for the core features implementation as well as for those issues
Researching in-need technologies
Designing the system with the proposed features and solutions
Developing a system prototype
Implementing and perfecting the system
Conducting empirical experiments
3. Thesis assignment date: 23/08/2021
4. Task completion date: 13/12/2021
5. Supervisor's full name: Supervised part:
1) Dr. Phan Trong Nhan
The content and requirements of the thesis have been approved by the Department.
Date: 20 December 2021
2. Topic: Computer Hardware Analytics for Electronic Commerce
3. Supervisor's full name: Dr. Phan Trong Nhan
- Students developed a system that analyzes computer components' score bands and recommends combinations of them, with particular attention to compatibility when many choices are available, to meet the end users' needs. Both auto and manual modes with advanced options are supported. Moreover, the users can customize their options as they wish. Even when no result is found, the recommendation module is triggered automatically.
- Students analyzed the problem statement as well as its related work and challenges.
- Students applied modern technologies such as Django/Python (compatible with the MVC pattern), Redis, Selenium, and Scrapy to implement the system, and deployed the database on Google Cloud Platform.
- Students were very proactive and open to novel knowledge and technology.
7. Main shortcomings of the thesis:
- The evaluation of recommendation and system testing is still limited.
- Some features need improvement, such as the dashboard (more metrics) and the component choices (further personalization, flexibility, and reliable benchmark scores).
8. Recommendation: Approved for defense [X]  Supplement further before defense [ ]  Not approved for defense [ ]
9. Three questions the students must answer before the Committee:
10. Overall assessment (in words: excellent, good, average): Very good  Score: 9.5/10
Signature (with full name)
Phan Trong Nhan
REVIEW OF GRADUATION THESIS
2 Major: Computer Science
3 Thesis title: Computer Hardware Analytic for Electronic Commerce
4 Reviewer: Assoc.Prof Nguyen Thanh Binh, PhD
5 Contents
- In this thesis, the authors built a system that helps users build a computer even if they have no experience, while also providing an alternative for those who want to pick and choose the components for their own computer. The tasks in this thesis:
+ Build a web-server to interact with users
+ Build a collection of data from multiple sources
+ Build a data storage
+ Develop an algorithm that will automatically generate a PC based on the user's
requirements
+ Develop a simple recommendation system based on user's interactions.
- The authors should explain the system they built in more detail. The system has
disadvantages for users.
- There are some spelling errors in the essay.
- The author should clarify the advantages of the system which was built compared to
We guarantee that this research is our own, conducted under the supervision and guidance of our Instructor Phan Trong Nhan. The result of our research is legitimate and has not been published in any form prior to this. All materials used within this research were collected by ourselves from various sources, which are appropriately listed in the references section. In any case of mistake, we will take full responsibility for it.
In this topic, we are trying to create a system that focuses on helping users build a computer even if they have no experience, while simultaneously providing an alternative for those who want to pick and choose the components for their own computer. The system's major goals are to create a computer based on the user's requirements, collect the necessary data, and store it.
Furthermore, using the information gathered from user interactions, we will create a simple recommendation system that identifies some of the most popular PCs and suggests them to users.
Contents
1 Introduction
1.1 Problem statement
1.2 Goal
1.3 Scope
1.4 Thesis structure
2 Methodologies and Theoretical Background
2.1 Related work
2.1.1 Manual PC builders
2.1.2 Automatic PC generator
2.1.3 PC performance calculator
2.2 Crawler
2.2.1 Selenium
2.2.2 Scrapy
2.3 Web server
2.3.1 Django
2.3.2 Redis
2.4 Google Cloud platform
3 System Requirement Analysis
3.1 Problems
3.1.1 Data gathering
3.1.2 Building PC
3.1.3 Recommendation system
3.2 Functional requirement
3.2.1 Automatic computer builder
3.2.2 Manual computer builder
3.3 Non-Functional requirement
3.4 Diagram
3.4.1 Use-case
3.4.2 Use-case Specification
4 System Design
4.1 General Architecture
4.1.1 How the system work
4.1.2 System architecture
4.2 Crawler component
4.3 Database Component
4.3.1 Database schema
4.3.2 Detailed information of each entity in the ERD
4.4 Web server Component
4.4.1 General architecture
4.4.2 Build PC module
4.4.2.a Checking the compatibility between the components
4.4.2.b Build a PC from the user's requirements
4.4.3 Database module
4.4.4 Redis
4.5 Recommendation system
5 System Implementation
5.1 Technologies used
5.2 Building a Prototype
5.2.1 Home Page
5.2.2 PC Auto Generator Page
5.2.3 PC Auto Generator Result Page
5.2.4 PC Manual Generator Page
5.2.5 Customize PC Set Page
5.2.6 View PC Set Details Page
5.2.7 Search Product Page
5.2.8 Data Visualization
6 System Testing
6.1 Non-functional Testing
6.2 Functional Testing
7 Conclusion
7.1 Achieved result
7.2 Evaluation
7.3 Future development
List of Figures
1.1 Budget distribution for gaming PC[3]
2.1 Scrapy architecture[13]
2.2 Django Architecture[7]
2.3 MTV pattern[7]
3.1 Use-Case diagram
3.2 Use-Case diagram
4.1 General architecture
4.2 General idea of index page and product page
4.3 The flowchart of crawling
4.4 Database schema
4.5 Web server general architecture
4.6 Checking compatibility between the CPU and motherboard
4.7 Intel CPU generation
4.8 Workflow for auto building PC
4.9 How PC scores are calculated[16]
4.10 Workflow with data from Redis
5.1 Home Page
5.2 PC Auto Generator Page
5.3 Expand Advanced options
5.4 Generate successfully
5.5 Expand PC 1
5.6 Generate fails
5.7 Recommendation when generate fails
5.8 PC Manual Generator Page
5.9 Modify RAMs
5.10 Customize PC 1
5.11 PC 1 Details Page
5.12 Search Product Page
5.13 Dashboard
6.1 Non-functional testing for the auto builder
6.2 Non-functional testing for the auto builder result page
6.3 Non-functional testing for the manual builder
6.4 Functional testing for the auto builder
6.5 Functional testing for the auto builder (cont.)
6.6 Functional testing for the manual builder
6.7 Functional testing for the recommendation system
7.1 Non-functional testing for navigation and login page
7.2 Non-functional testing for register page
7.3 Non-functional testing for detail page and product page
7.4 Functional testing for login page and register page
7.5 Functional testing for the auto builder result page
7.6 Functional testing for detail page and product page
List of Tables
1 User Login
2 User Login (cont.)
3 User Register
4 Build PC Automatically
5 Build PC Manually
6 Re-examine Requirements
7 View PC Set's Details
8 View PC Set's Details
9 Access Seller Pages
10 Save a Complete PC Set
11 Search Product
12 Sort Product
13 View Dashboard
14 CPU table
15 Motherboard table
16 GPU table
17 RAM table
18 SSD table
19 HDD table
20 PSU table
21 CASE table
22 RAM_Usage table
23 SSD_Usage table
24 HDD_Usage table
25 PSU_Usage table
26 Main_comp table
27 Base budget percentage for each component compared to maximum budget
28 Technologies used
Abbreviation Meaning
PC Personal computer
CPU Central processing unit
GPU Graphics processing unit
RAM Random-access memory
HDD Hard disk drive
SSD Solid-state drive
PSU Power supply unit
1 Introduction
1.1 Problem statement
In recent years, with the advancement of technology, a PC can help us with many aspects of life, including studying, working, and entertainment. But a PC is not a simple product, and not all PCs are the same. In fact, a PC is made up of various components, and the purpose and grade of the computer may vary depending on the components used. For example, a PC used for entertainment, specifically gaming, spends its budget first on the graphics processing unit (GPU) and then on the CPU and memory. Other components such as storage and the power supply unit (PSU) have lower priority because the GPU, CPU, and memory have a higher impact on the gaming experience.
Figure 1.1: Budget distribution for gaming PC[3]
A PC used for work, notably office work, on the other hand, may not even require a GPU, but rather a strong CPU and RAM to provide greater performance when performing office tasks. Compared with the heavy image-processing load of running a game, office tasks frequently involve only documents and calculations.
Each component of a PC may have many manufacturers, and each product has its own specifications that need to be taken into consideration when purchasing.
As a result, a system that allows a user to select from a wide range of components and build a PC that meets their needs while staying within budget has proven beneficial.
1.2 Goal
The main objective of the thesis is to develop a web-based system that assists users in building a PC by checking the compatibility and the performance of the components, making sure that the build matches the user's requirements, while also providing a tool for those who want to pick and choose their own components. The system needs a way to collect the necessary data from multiple websites, specifically computer hardware websites, as well as storage for that data. In addition, the thesis proposes a simple recommendation system based on the user's interactions with the system.
1.3 Scope
In the scope of this thesis, we will focus on:
• Build a web-server to interact with users
• Build a collection of data from multiple sources
• Build a data storage
• Develop an algorithm that will automatically generate a PC based on user's requirements
• Develop a simple recommendation system based on user's interactions
1.4 Thesis structure
The content of the thesis is presented as follows:
• Chapter 1 Introduction: Presents an overview, the scope, and the goal of the thesis
• Chapter 2 Methodologies and Theoretical Background: Presents the theoretical background of the related topics as well as the tools and technologies that will be used
• Chapter 3 System Requirement Analysis: Analyzes the requirements of the system as well as the functions it will provide
• Chapter 4 System Design: Presents the design and architecture of the entire system as well as its components
• Chapter 5 System Implementation: Describes how each component of the system is implemented
• Chapter 6 System Testing: Presents the test results of the system
• Chapter 7 Conclusion: Summarizes the achieved results as well as the direction of development of the topic.
It operates by determining which components are compatible with the ones the user selected earlier. However, some of the restrictions are not checked or are checked incorrectly, for example when the number of random-access memory (RAM) modules you can choose exceeds the number of slots supported by the motherboard, or when you can choose an Intel CPU to go with a motherboard which only supports AMD CPUs. Only the dedicated website https://pcpartpicker.com/ is reliable, as it thoroughly checks all the compatibility issues when a new component is added to the build.
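The two restrictions mentioned above can be illustrated with a minimal Python sketch. The class names, fields, and example parts here are hypothetical and only serve the illustration; they are not the schema used later in this thesis, and real compatibility checking involves many more rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    name: str
    socket: str  # hypothetical field, e.g. "LGA1200" (Intel) or "AM4" (AMD)


@dataclass(frozen=True)
class Motherboard:
    name: str
    socket: str
    ram_slots: int


def compatibility_issues(cpu, board, ram_sticks):
    """Return a list of human-readable problems; an empty list means compatible."""
    issues = []
    # An Intel CPU cannot go on a board that only supports AMD sockets, and vice versa.
    if cpu.socket != board.socket:
        issues.append(f"{cpu.name} ({cpu.socket}) does not fit {board.name} ({board.socket})")
    # The number of RAM sticks must not exceed the slots on the motherboard.
    if ram_sticks > board.ram_slots:
        issues.append(f"{ram_sticks} RAM sticks exceed the {board.ram_slots} slots on {board.name}")
    return issues
```

Returning a list of messages instead of a single boolean lets the interface tell the user which rule was violated.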
2.1.2 Automatic PC generator
For the automatic PC generation feature, our group could only locate one site, which is https://wccftech.com/pc-builder/. It has some options for the user to specify their requirements before generating the PC. The result is generated fast and matches the user's requirements.
However, the system has some downsides. Firstly, the system may not yield any results at some budget points. Secondly, there are only a few options for the user to describe their PC requirements. Finally, for a specific budget, the system only yields one result.
2.1.3 PC performance calculator
In order to calculate a PC's performance, a specific testing application must be executed on the PC. Therefore, actual hardware is needed to run the testing program, which provides the performance score for the PC components. This requires a big pool of sample hardware in order to have enough data.
To do this, the https://www.userbenchmark.com/ website created software that measures the performance of the PC running it and published that software for users to download. The software then gathers the data from users, and the results are displayed on the website.
In this thesis, we will use the performance data from this website.
In our project, we will be using Selenium on Python with the Google Chrome Driver. Through the Selenium Python API, we can access all functionalities of Selenium WebDriver in an intuitive way.
2.2.2 Scrapy
Scrapy[14] is an application framework for crawling web sites and extracting structured data which can be used for a wide range of useful applications, like data mining, information processing or historical archival.
Even though Scrapy was originally designed for web scraping, it can also be used to extract data using APIs (such as Amazon Associates Web Services) or as a general-purpose web crawler.
Scrapy architecture:[13]
The data flow in Scrapy is controlled by the execution engine, and goes like this:
1 The Engine gets the initial Requests to crawl from the Spider
2 The Engine schedules the Requests in the Scheduler and asks for the next Requests to crawl
3 The Engine sends the Requests to the Downloader, passing through the Downloader Middlewares
4 Once the page finishes downloading, the Downloader generates a Response (with that page) and sends it to the Engine, passing through the Downloader Middlewares
5 The Engine receives the Response from the Downloader and sends it to the Spider for processing, passing through the Spider Middleware
6 The Spider processes the Response and returns scraped items and new Requests (to follow) to the Engine, passing through the Spider Middleware
7 The Engine sends processed items to Item Pipelines, then sends processed Requests to the Scheduler and asks for possible next Requests to crawl
8 The process repeats (from step 1) until there are no more requests from the Scheduler.
Figure 2.1: Scrapy architecture[13]
Scrapy Component:[13]
• Scrapy Engine: The engine is responsible for controlling the data flow between all components of the system, and triggering events when certain actions occur
• Scheduler: The Scheduler receives requests from the engine and enqueues them for feeding them later (also to the engine) when the engine requests them
• Downloader: The Downloader is responsible for fetching web pages and feeding them to the engine which, in turn, feeds them to the spiders
• Spiders: Spiders are custom classes written by Scrapy users to parse responses and extract items from them or additional requests to follow
• Item Pipeline: The Item Pipeline is responsible for processing the items once they have been extracted (or scraped) by the spiders. Typical tasks include cleansing, validation and persistence (like storing the item in a database)
• Downloader middlewares: Downloader middlewares are specific hooks that sit between the Engine and the Downloader and process requests when they pass from the Engine to the Downloader, and responses that pass from Downloader to the Engine.
• Spider middlewares: Spider middlewares are specific hooks that sit between the Engine and the Spiders and are able to process spider input (responses) and output (items and requests)
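To make the data flow between these components more concrete, the following is a simplified, pure-Python model of the loop. It does not use the Scrapy API: the fake in-memory "web", the function names, and the item format are all assumptions made for illustration, and middlewares are left out.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, links found on the page).
FAKE_WEB = {
    "/index": ("index", ["/cpu/1", "/cpu/2"]),
    "/cpu/1": ("CPU one", []),
    "/cpu/2": ("CPU two", []),
}


def downloader(url):
    """Downloader: fetch a page and hand back a response."""
    text, links = FAKE_WEB[url]
    return {"url": url, "text": text, "links": links}


def spider_parse(response):
    """Spider: yield scraped items and follow-up requests."""
    if response["url"].startswith("/cpu/"):
        yield ("item", {"url": response["url"], "name": response["text"]})
    for link in response["links"]:
        yield ("request", link)


def run_engine(start_urls):
    """Engine: move requests and items between the other components."""
    scheduler = deque(start_urls)  # Scheduler: queue of pending requests
    seen = set(start_urls)
    pipeline_output = []           # Item Pipeline: here it only stores the items
    while scheduler:
        response = downloader(scheduler.popleft())
        for kind, payload in spider_parse(response):
            if kind == "item":
                pipeline_output.append(payload)
            elif payload not in seen:
                seen.add(payload)
                scheduler.append(payload)
    return pipeline_output
```

Calling `run_engine(["/index"])` walks from the index page to both product pages and returns the two scraped items. In real Scrapy the loop is asynchronous, and the spider and pipeline are classes registered with the framework.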
2.3 Web server
2.3.1 Django
Django is a high-level Python web framework that encourages rapid development and clean, pragmatic design. It offers a standard method for fast and effective website development by providing many tools and built-in features to handle common web development tasks like user authentication, content administration, site maps, etc.
Figure 2.2: Django Architecture[7]
Django is designed based on the Model-Template-View (MTV) pattern, which is a little different from the commonly used Model-View-Controller (MVC) pattern. The main difference is that Django itself manages the user's interaction, which is the Controller part in MVC.
Figure 2.3: MTV pattern[7]
After the user's interaction is handled, a view is called, which interacts with the model and templates to produce a response that is returned to the user.
Advantages of Django[8]
• Django is easy to set up and run
• It provides an easy-to-use interface for various administrative activities
• Helps you to define patterns for the URLs in your application
• Offers built-in authentication system
• Cache framework comes with multiple cache mechanisms
• High-level framework for rapid web development
• A complete stack of tools
• Data modelled with Python classes
Disadvantages of Django[8]
• It is a monolithic platform
• High dependence on the Django ORM; broad knowledge required
• Only allows you to handle a single request at a time
• Routing requires some knowledge of regular expressions
A few general steps to build an application on Django:[4]
• Design the model
• Install the model
• Use the provided API to access the data
• Use the administrative interface to set up the data
• Design URLs
• Design template
• Run the application
2.3.2 Redis
Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache, and message broker. Redis provides data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes, and streams. Redis has built-in replication, Lua scripting, LRU eviction, transactions, and different levels of on-disk persistence, and provides high availability via Redis Sentinel and automatic partitioning with Redis Cluster.[12]
In-memory databases are purpose-built databases that rely primarily on memory for data storage, in contrast to databases that store data on disk or SSDs. In-memory data stores are designed to enable minimal response times by eliminating the need to access disks. Because all data is stored and managed exclusively in main memory, in-memory databases risk losing data upon a process or server failure. In-memory databases can persist data on disks by storing each operation in a log or by taking snapshots.[1]
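The idea of persisting an in-memory store by logging each operation can be sketched as follows. This is a toy key-value store written for illustration only: the class name and the JSON-lines log format are assumptions, and it does not reflect how Redis implements its own persistence.

```python
import json
import os


class LoggedStore:
    """Toy in-memory key-value store that persists writes to an append-only log."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        # Rebuild the in-memory state by replaying the log, if one exists.
        if os.path.exists(log_path):
            with open(log_path, encoding="utf-8") as log:
                for line in log:
                    key, value = json.loads(line)
                    self.data[key] = value

    def set(self, key, value):
        # Append the operation to the log on disk, then update memory.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps([key, value]) + "\n")
        self.data[key] = value

    def get(self, key, default=None):
        # Reads are served from memory and never touch the disk.
        return self.data.get(key, default)
```

After a restart, creating a new `LoggedStore` on the same path replays the log and restores the last value written for each key.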
There is no official support for Windows builds[12], so we cannot use Redis directly from the provider but instead through third-party software, Memurai[11], a Redis-compatible store for Windows developed by Janea Systems.
2.4 Google Cloud platform
Google Cloud Platform (GCP), offered by Google, is a suite of cloud computing services that all run on the same infrastructure that Google uses internally for its end-user products, such as Google Search, Gmail, Google Drive, and YouTube. Alongside a set of management tools, it provides a series of modular cloud services including computing, data storage, data analytics and machine learning. Registration requires a credit card or bank account details.
Google Cloud Platform provides infrastructure as a service, platform as a service, serverless computing environments, and enterprise mapping services.
In this project, we will take advantage of Cloud SQL provided by GCP.
Key features of Cloud SQL[6]
• Fully managed: Cloud SQL automatically ensures the databases are reliable, secure, and scalable so that your business continues to run without disruption
• Integrated: Access Cloud SQL instances from just about any application
• Reliable: Easily configure replication and backups to protect your data. Go further by enabling automatic failover to make your database highly available
• Easy migrations to Cloud SQL: Database Migration Service (DMS) makes it easy to migrate your production databases to Cloud SQL with minimal downtime
Some advantages of using Cloud SQL[6]
• Reduce maintenance cost with fully managed MySQL, PostgreSQL and SQL Server databases
• Ensure business continuity with reliable and secure services backed by 24/7 SRE team
• Automate database provisioning, storage capacity management, and other time-consuming tasks
• Database observability made easy for developers with Cloud SQL Insights
• Easy integration with existing apps and Google Cloud services like GKE and BigQuery
3 System Requirement Analysis
3.1 Problems
3.1.1 Data gathering
Data is always one of the most critical aspects of any project. As a result, acquiring data is a must-do step before beginning to construct anything. However, this procedure may raise issues along the way that require immediate and suitable solutions. While conducting the studies for this thesis, we encountered two data collection issues, which are discussed below.
First and foremost, data cannot be downloaded if the crawler is unable to locate it. Today's websites are built with very diverse and advanced techniques to provide their users with the best experience. For example, with the use of AJAX, a website can dynamically load content as the user interacts with the page instead of putting all the content into the HTML in one go. This prevents the crawler from getting the desired data, which is invisible to the crawler because it is not present in the HTML at the time the crawler fetches the page. As a result, we have to use tools with browser integration to get the content of the website, or find a way to send the request that gets the data directly from the server.
Second, the data collection process can sometimes make too many visits to a website that contains a lot of information. This can overload the server and degrade the experience of that website's users. Moreover, it can get even worse when the website owner detects that a bot is accessing the site and sets up prevention methods such as blocking IPs or requiring verification before allowing access. Therefore, our crawler needs to behave as much like a human as possible while respecting the website owner, by reducing the number of requests or increasing the interval between visits.
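Spacing out the visits can be sketched in a few lines of Python. The helper below is a hypothetical illustration, not the crawler code of this thesis: the delay bounds are assumed values, and the `fetch` and `sleep` parameters are injected so the helper can be tried without touching a real website.

```python
import random
import time


def polite_fetch_all(urls, fetch, min_delay=2.0, max_delay=5.0, sleep=time.sleep):
    """Fetch URLs one by one, pausing a random interval between requests."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Randomized pause so requests are spaced out and not perfectly regular.
            sleep(random.uniform(min_delay, max_delay))
        results.append(fetch(url))
    return results
```

In a Scrapy project, a similar effect is usually obtained through the framework's download delay settings rather than a hand-written loop.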
3.1.2 Building PC
For a PC to work properly, we need to consider compatibility when combining the components. When generating PCs based on the user's requirements, handling the budget is the most difficult part. The budget needs to be split reasonably between the components: focusing too much on a single component can cause a bottleneck, while setting a fixed budget for each component may return slightly worse results because many different combinations are ruled out by the fixed budget. So we need to set a suitable and flexible price range for each component based on the purpose of the PC.
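One way to express a flexible price range is a base share of the budget per component plus a tolerance around it. The sketch below is only an illustration of that idea: the shares and the 25% tolerance are made-up values, not the percentages used by the system.

```python
# Hypothetical base shares of the total budget for a gaming build.
GAMING_SHARES = {"gpu": 0.40, "cpu": 0.20, "ram": 0.10, "other": 0.30}


def price_ranges(budget, shares, tolerance=0.25):
    """Return {component: (low, high)} around each base share of the budget."""
    ranges = {}
    for component, share in shares.items():
        base = budget * share
        # Allow each component to go some way below or above its base amount.
        ranges[component] = (base * (1 - tolerance), base * (1 + tolerance))
    return ranges
```

With a budget of 1000 and these assumed values, the GPU range comes out at roughly 300 to 500, so a slightly cheaper or more expensive GPU is still considered instead of being ruled out by a fixed amount.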
Running combinations over all of the available components can drastically increase run-time, which slows down the process. To prevent this, some steps need to be taken to reduce the input components before generating the combinations, and a proper limit must be set on how many results are needed.
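The two measures described here, pruning the inputs and limiting the number of results, can be sketched with the standard library. The function below is a hypothetical two-component example (CPU and GPU only); parts are assumed to be simple (name, price) tuples.

```python
from itertools import islice, product


def candidate_builds(cpus, gpus, ranges, budget, limit=5):
    """Return at most `limit` (cpu, gpu) pairs that fit the budget.

    Each part is a (name, price) tuple; `ranges` maps "cpu"/"gpu" to (low, high).
    """
    def in_range(parts, key):
        low, high = ranges[key]
        return [p for p in parts if low <= p[1] <= high]

    # Prune each list first so the Cartesian product stays small.
    cpus, gpus = in_range(cpus, "cpu"), in_range(gpus, "gpu")
    fitting = (pair for pair in product(cpus, gpus)
               if pair[0][1] + pair[1][1] <= budget)
    # The generator is lazy, so iteration stops once enough results are found.
    return list(islice(fitting, limit))
```

Because `product` grows multiplicatively with each component list, removing even a few out-of-range parts per category reduces the number of combinations considerably.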
3.1.3 Recommendation system
In this thesis, the recommendation system is used when the user cannot find any result from the PC building process. It works based on the interactions of the users with the website. Those records help find which PC is the most popular, and the system takes it for recommendation. However, a user may spam those interactions, which may lead to incorrect results when applying the recommendation system. To solve this, we need to limit the score which can be gained from each interaction session between a user and a PC.
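A capped popularity score can be sketched as below. The action names, their weights, and the cap value are assumptions chosen for the example, not the values used by the system.

```python
from collections import defaultdict

# Hypothetical interaction weights and per-session cap.
WEIGHTS = {"view": 1, "customize": 2, "export": 3}
SESSION_CAP = 5


def popularity(events):
    """events: iterable of (session_id, pc_id, action). Returns {pc_id: score}."""
    per_session = defaultdict(int)
    for session_id, pc_id, action in events:
        key = (session_id, pc_id)
        # Cap what one session can contribute to one PC, so spam has no extra effect.
        per_session[key] = min(SESSION_CAP, per_session[key] + WEIGHTS[action])
    totals = defaultdict(int)
    for (_, pc_id), score in per_session.items():
        totals[pc_id] += score
    return dict(totals)


def most_popular(events):
    """Return the PC with the highest capped score, or None without events."""
    scores = popularity(events)
    return max(scores, key=scores.get) if scores else None
```

With this cap, one session viewing the same PC a hundred times contributes at most 5 points, so two sessions that each export another PC (3 points each) still outweigh it.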
3.2 Functional requirement
The following are necessary activities that should be supported by the system:
• Build a complete set of computers with great compatibility and performance automatically
• Help create a brand new computer set from scratch
• Provide a more detailed glimpse of the selected computer set
• Provide further details when users search for a specific item
• Offer to preserve a copy of the desired computer set
3.2.1 Automatic computer builder
• Help basic users obtain computer sets based only on goal and budget
• Include various criteria for users to better describe their conditions
• Allow users to review their standards after receiving results
• Show the list of the computer sets that were found, together with crucial features like performance and price
• Provide users with a more complete view of the computer they’re interested in
• Provide search forms for customers to fill out in order to find their desired items
• Ensure that all of the components are compatible and operate properly together
• Allow users to add different items to the RAM, SSD, and HDD categories as long as they are compatible with the mainboard
• Allow users to modify the quantity of RAMs, SSDs, and HDDs as long as they are compatible with the mainboard
• Allow users to change the components they’ve chosen
• Update total cost automatically when modifying the computer set
3.3 Non-Functional requirement
• The system offers an easy-to-use interface
• The system is able to respond quickly
• The system can be loaded in almost all web browsers
• The system can be loaded on almost all operating system platforms
• User passwords shall be hashed with the PBKDF2 algorithm with a SHA256 hash
• The system shall be easy for users to interact with
• The system shall check the validity of user inputs
• The system shall crawl data from trusted pages
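The password requirement can be illustrated with Python's standard library. This is a minimal sketch rather than the system's implementation: the iteration count is an assumed value, and in a Django project this step is normally handled by the framework, whose default password hasher also uses PBKDF2 with SHA256.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 260_000  # assumed work factor; tune for the deployment hardware


def hash_password(password, salt=None):
    """Return (salt, derived_key) using PBKDF2-HMAC-SHA256."""
    salt = salt or secrets.token_bytes(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password, salt, expected_key):
    """Recompute the key with the stored salt and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)
```

Only the salt and the derived key are stored. The original password cannot be recovered from them, which is why this is hashing rather than encryption.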
3.4 Diagram
3.4.1 Use-case
General use-case diagram for all users of the system:
• Actor: Member, Visitor
• Number of use-cases: 6
Figure 3.1: Use-Case diagram
Use-case diagram for all users of the system at detailed level:
• Actor: User
• Number of use-cases: 11
Figure 3.2: Use-Case diagram
• User account has already been created
• User has accessed the system
Post-Condition(s) User successfully logged into the system
Basic Flow
1 User clicks on the Login button on the top right of the website
2 System redirects User to Login page
3 User enters their Username and Password into the fields in the login form
4 System validates the login information successfully and redirects User to the previous page
Alternative Flow
1a User clicks the 'Show detailed information' button after they have built a computer set with the Manual Generator or after they receive a list of computer sets from the Auto Generator
Use-case continues to step 3
4a System validates the login information successfully and redirects User to the PC's detail page
4b If User registers and then logs in, System redirects User to the Home page
Exception Flow
4b System fails to validate the login information and displays the message
4b1 User returns to step 3
4b2 User chooses to cancel the login and goes back to the previous page
Table 2: User Login (cont.)
Use Case ID 2
Use Case Name User Register
Description User enters their information to create an account and can then use their login information to log into the website
Pre-Condition(s) User has accessed the system
Post-Condition(s) User successfully created an account
Basic Flow
1 User clicks on the Sign-up button on the top right of the website
2 System redirects User to Register page
3 User enters their Username, Email, Password and Confirmed Password into the fields in the register form
4 System validates the register information successfully
5 System redirects User to the login page and displays the success message
4b1 User returns to step 3
4b2 User chooses to cancel the registration and goes back to the previous page
Table 3: User Register
Use Case ID 3
Use Case Name Build PC Automatically
Description User describes their conditions by filling in the form, and System generates a list of computer sets that match the user's desire
Pre-Condition(s) User has accessed the system
Post-Condition(s) User receives a list of computer sets which match their conditions
Basic Flow
1 User clicks the Generator category and chooses the Auto feature on the navigation bar
2 System redirects User to the PC Auto Builder Form page
3 User selects their goal, fills out the budget and submits
4 System validates the form successfully
5 System redirects User to the result page and displays the list of appropriate computer sets
Use Case ID 4
Use Case Name Build PC Manually
Description User builds a computer set from the beginning
Pre-Condition(s) User has accessed the system
Post-Condition(s) User receives a complete and compatible computer set
Basic Flow
1 User clicks the Generator category and chooses the Manual feature on the navigation bar
2 System redirects User to the Manual Generator page
3 User completes the form for the component they want
4 System returns a list of appropriate products that meet user's requirements
5 User selects one or more items, depending on the component's category and the computer's compatibility
6 User repeats step 3 for each additional component until they are pleased with the current computer configuration
Alternative Flow
6a User removes the selected item and selects another
6b User adjusts the quantity of the selected item (only RAM, SSD, HDD)
Use-case continues step 6
Exception Flow
4b System fails to retrieve any items that match user's requirements
4b1 User returns to step 3
Table 5: Build PC Manually
Use Case ID 5
Use Case Name Re-examine Requirements
Description User has received the Auto Generator’s result and wishes to review
their inputs once again
Pre-Condition(s) User has received the Auto Generator’s result and is on the result
page
Post-Condition(s) User sees their specified requirements
Basic Flow
1 User clicks on the drop-down bar near top of the page
2 System shows the user’s requests
Alternative Flow
Exception Flow
Table 6: Re-examine Requirements
Use Case ID 6
Use Case Name Customize a PC Set
Description User customizes the PC set that they have chosen
Pre-Condition(s) User has received the Auto Generator’s result and is on the result
2 System shows general information of the computer set
3 User clicks the 'Customize this build' button from that computer set
4 System redirects user to the Manual Generator page which holds all the information of the selected computer set
5 User customizes the set until they are pleased
Alternative Flow
Exception Flow
Table 7: View PC Set’s Details
Trang 332 System shows general information of the computer set.
3 User clicks the ’Show detailed information’ button from thatcomputer set
4 System redirects user to the PC’s details page
Trang 34Use Case ID 8
Use Case Name Access Seller Pages
Description User wants to know where they can get the components for the
com-puter set they’ve picked
Pre-Condition(s) User has selected a computer set and is viewing it on the detail page.Post-Condition(s) User is directed to the seller pages
3 User clicks the link they want
4 System redirects user to the appropriate pages
Use Case ID 9
Use Case Name Save a Complete PC Set
Description User is able to save a copy of their chosen computer set
Pre-Condition(s) User has selected a computer set and is viewing it on the detail page.
Post-Condition(s) User has a copy of their chosen computer set on their local device
Basic Flow
1 User clicks the ’Export to PDF’ button at the end of the page
2 System sends a download request to the user’s browser
3 User receives a PDF file containing the in-depth information
of the selected computer set
Alternative Flow
Exception Flow
Table 10: Save a Complete PC Set
Use Case ID 10
Use Case Name Search Product
Description User searches for a product they want
Pre-Condition(s) User has already accessed the website
Post-Condition(s) User gets the information of the selected product
Basic Flow
1 User clicks Product category and chooses the type of component they want on the navigation bar
2 System redirects user to the page listing all of the products
of the chosen type
3 User browses for the desired product
Alternative Flow
3a User reduces the number of products they have to browse
by using the filters on the left side
4a System returns the products that match the user’s requirements.
Use case continues at step 3
Use Case ID 11
Use Case Name Sort Product
Description The user sorts the products into the desired order
Pre-Condition(s) User is on the Product page
Post-Condition(s) User gets the list of products in the desired order
Basic Flow
1 User clicks on the header of the column they want to sort
2 System sorts the list and displays it to the user
Description User views the dashboard
Pre-Condition(s) User has accessed the system
Post-Condition(s) User can view the dashboard page
Basic Flow
1 User clicks view dashboard in the navigation board
2 User is redirected to the dashboard page
Alternative Flow
Exception Flow
Table 13: View Dashboard
4 System Design
Data will be collected from hardware component websites (https://www.userbenchmark.com/, https://gearvn.com/, https://phongvu.vn/, etc.) and then pushed
to the database for storage. The web server will then use that data from the database, in conjunction with Redis, to provide services to the users through the browser.
Figure 4.1: General architecture
The system consists of 3 main components:
• Crawler: This component is responsible for collecting data from across the Internet.
• Google Cloud SQL: The main database of the whole system. It is powered
by Google Cloud, which provides many useful features that make the database easy
to manage, reliable, and secure.
• Web Server: The primary component, which is in charge of most of the features
in this thesis and uses the data from the database. Redis serves as a cache for the server to improve performance.
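The caching role of Redis described above can be illustrated with a minimal cache-aside sketch. It is an assumption about how such a cache might be used, not the thesis's actual implementation: the key format, the `get_product` and `query_db` names, and the 300-second expiry are hypothetical. The code only relies on `get` and `setex`, which the redis-py `Redis` client provides; an in-memory stand-in is used so the example runs without a Redis server.

```python
import json
from typing import Callable, Optional, Protocol


class Cache(Protocol):
    """The two calls this sketch needs; redis-py's Redis client offers both."""

    def get(self, name: str) -> Optional[bytes]: ...

    def setex(self, name: str, time: int, value: str) -> object: ...


class InMemoryCache:
    """Hypothetical stand-in for Redis (it ignores the expiry time)."""

    def __init__(self) -> None:
        self._data = {}

    def get(self, name: str) -> Optional[bytes]:
        return self._data.get(name)

    def setex(self, name: str, time: int, value: str) -> bool:
        # Redis returns stored values as bytes, so the stand-in does the same.
        self._data[name] = value.encode("utf-8")
        return True


def get_product(product_id: int, cache: Cache,
                query_db: Callable[[int], dict],
                ttl_seconds: int = 300) -> dict:
    """Return a product, reading the cache first and the database on a miss."""
    key = f"product:{product_id}"  # hypothetical key format
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    row = query_db(product_id)  # slow path: ask the main database
    cache.setex(key, ttl_seconds, json.dumps(row))
    return row
```

With a real server, a `redis.Redis(...)` client could be passed in place of `InMemoryCache()`; repeated requests for the same product would then skip the database until the entry expires.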
On the hardware component websites that we collect data from, each type
of component generally has two kinds of pages: the index page and the product page. The index page displays the list of products; for each product, it usually contains only the name and the link to that product's page. The product page holds the data
we need, and it is the page we extract the data from.
Figure 4.2: General idea of index page and product page
Starting from the initial page, which can be the home page or the first index page, we collect all of its URLs to product pages and then move on to the next index page. We repeat this process until we reach the last index page and have collected all the product pages.
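The index-page traversal described above could be sketched as follows. This is only an illustration using Python's standard `html.parser`: the `product-link` class, the `rel="next"` pagination link, and the `fetch` callable are assumptions made for the example, since each real website marks its product links and pagination differently.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IndexPageParser(HTMLParser):
    """Collects product links and the next-page link from one index page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.product_urls = []
        self.next_url = None

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return
        classes = (attr.get("class") or "").split()
        if "product-link" in classes:        # assumed marker for product links
            self.product_urls.append(urljoin(self.base_url, href))
        elif attr.get("rel") == "next":      # assumed marker for pagination
            self.next_url = urljoin(self.base_url, href)


def collect_product_urls(first_index_url, fetch):
    """Follow index pages from the first to the last, gathering product URLs.

    `fetch` is any callable that maps a URL to the page's HTML text.
    """
    product_urls, seen_index = [], set()
    url = first_index_url
    while url and url not in seen_index:     # stop at the last page or on a loop
        seen_index.add(url)
        parser = IndexPageParser(url)
        parser.feed(fetch(url))
        product_urls.extend(parser.product_urls)
        url = parser.next_url
    return product_urls
```

Passing `fetch` as a parameter keeps the traversal logic independent of how a page is actually downloaded, which matters because some index pages need more than a plain HTML request.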
During the crawling process, we encountered a few different ways in which a website presents its content:
• All the content is located in the HTML when the web page is loaded. This is the easiest type of page to extract information from, as everything is already in the HTML: we can request the HTML and then get the data. Product pages are usually presented this way.
• No content, or only part of it, is available in the initial HTML; the rest is hidden behind JavaScript code that runs when the browser loads the page or on user interaction. We cannot directly request the HTML to extract the data because the initially requested HTML does not contain the information we need; the JavaScript adds it after the HTML is loaded. This type of page is usually the index page, where the list of products is located. In this case, we have 2 solutions:
– Solution 1: We mimic the request that the JavaScript sends and use it to get the data directly from the server. We then extract the data from the server's JSON response.
– Solution 2: We use Selenium, which can control the browser and simulate user actions such as clicking or scrolling, to render the data into the HTML for extraction.
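As a rough illustration of Solution 1, the sketch below requests a JSON endpoint with the standard library and extracts product entries from the response. The endpoint URL, the request headers, and the payload shape (a `products` list whose items carry `name` and `url`) are hypothetical; in practice they have to be read from the browser's network inspector for each website.

```python
import json
from urllib.request import Request, urlopen


def fetch_listing_json(api_url: str) -> str:
    """Call the endpoint that the page's JavaScript would call.

    The headers are an assumption; some sites require others.
    """
    request = Request(api_url, headers={"Accept": "application/json",
                                        "User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


def extract_products(json_text: str) -> list:
    """Pull name and URL out of a response with the assumed payload shape."""
    payload = json.loads(json_text)
    return [{"name": item["name"], "url": item["url"]}
            for item in payload.get("products", [])]


# Example with a canned response instead of a live request.
sample = '{"products": [{"name": "CPU A", "url": "/p/1", "price": 100}]}'
products = extract_products(sample)
```

When such an endpoint exists, this route is usually lighter than driving a browser, because no page rendering is needed.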
Overall, the crawling follows these steps:
Figure 4.3: The flowchart of crawling
• Step 1: Initialize the "Index" queue with the index URLs. These URLs are unique.
• Step 2: Loop through the "Index" queue to get each index page and extract the product URLs from it. Put all the product URLs into the "Product" queue.
• Step 3: Loop through the "Product" queue to get each product page and extract data from it. The data is recorded in a JSON file.
• Step 4: When the "Product" queue is empty, the process stops.
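The four steps above can be sketched with two in-memory queues. This is a minimal sketch under stated assumptions, not the crawler built for the thesis: `fetch`, `extract_product_urls`, and `extract_product_data` are hypothetical callables supplied by the caller, and skipping product URLs that were already queued is an addition that the steps do not mention.

```python
import json
from collections import deque


def crawl(index_urls, fetch, extract_product_urls, extract_product_data):
    """Run the two-queue crawl and return the extracted product records."""
    # Step 1: the "Index" queue holds unique index URLs, in their given order.
    index_queue = deque(dict.fromkeys(index_urls))
    product_queue = deque()
    seen_products = set()

    # Step 2: read every index page and queue the product URLs found on it.
    while index_queue:
        page = fetch(index_queue.popleft())
        for url in extract_product_urls(page):
            if url not in seen_products:     # assumption: skip duplicates
                seen_products.add(url)
                product_queue.append(url)

    # Step 3: read every product page and extract its data.
    records = []
    while product_queue:
        url = product_queue.popleft()
        records.append(extract_product_data(url, fetch(url)))

    # Step 4: the "Product" queue is empty, so the process stops.
    return records


def save_records(records, path):
    """Write the extracted records to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(records, handle, ensure_ascii=False, indent=2)
```

For example, `save_records(crawl(urls, fetch, find_links, parse_product), "products.json")` would run the whole process, where the three callables are whatever fetching and parsing functions suit the target website.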