VIETNAM NATIONAL UNIVERSITY
UNIVERSITY OF TECHNOLOGY
FACULTY OF COMPUTER SCIENCE AND ENGINEERING
GRADUATION THESIS
A Decision Support System
for Outfit Planning
Council: Software Engineering
Instructor: Assoc. Prof. Quan Thanh Tho, Ph.D.
Student: Nakamura Ryuta - 1752004
Ho Chi Minh City, December 2022
Declaration of Authenticity
I declare that this research is my work, conducted under the supervision and guidance of Assoc. Prof. Dr. Quan Thanh Tho. The result of my research is legitimate and has not been published in any form before this. All materials used in this research have been collected from various sources and are listed in the references section.
In addition, within this research, I also used the results of several other authors and organizations. They have all been aptly referenced.
In any case of plagiarism, I stand by my actions and will be responsible for them. University of Technology - Vietnam National University HCMC, therefore, is not responsible for any copyright infringements conducted within my research.
Ho Chi Minh City, December 2022
Author
Nakamura Ryuta
Firstly, I would like to express my deep and sincere thanks to Dr. Quan Thanh Tho for supervising my thesis and thesis proposal. I have benefited greatly from his wealth of knowledge and meticulous editing.
Next, I am very thankful to Mr. Nguyen Quang Duc. He helped me a lot with developing the system and writing the thesis. I would also like to thank my lab mates. Without their support, I could not have completed this meaningful research.
In this paper, a decision support system is designed and built to help users choose their daily outfits. The system takes in outfits and their corresponding events and then analyzes them to provide daily outfit suggestions, making outfit planning easier for the user. Association analysis techniques are used in the system to predict the information necessary to create daily outfit recommendations.
In the process, association rules and the algorithms that mine them are researched. Finally, a web application that actually recommends outfits using association rules is developed.
1 Introduction
1.1 Problem Statement
1.2 Goals
1.3 Scope
1.4 Thesis Structure
2 Theoretical Background
2.1 Web Application
2.2 Recommender System
2.3 Association Rule Mining
3 Related Works
3.1 Similar Works
3.2 Algorithms Used in Association Rule Mining
3.2.1 Apriori Algorithm
3.2.2 Eclat Algorithm
3.2.3 FP-growth Algorithm
4 Application Implementation
4.1 Technology
4.1.1 React (JavaScript Library)
4.1.2 MUI (UI Library)
4.1.3 FastAPI (Web Framework for Python)
4.1.4 mlxtend
4.2 Application Design
4.2.1 Application Requirements
4.2.2 Use-case Diagram and the Description
4.2.3 Application Mock-up
4.3 System Implementation
4.3.1 Overall System Structure
4.3.2 Finding Association Rules
5 Result and Evaluation
5.1 Evaluation of My Outfit Recommender System
5.1.1 Evaluation
5.2 Web Application
5.2.1 Recommendation Flow
5.2.2 Web Performance
6 Conclusion
6.1 Summary
6.2 Future Development
List of Figures
1.1 Activity Diagram
2.1 Web Application Architecture
2.2 Confusion Matrix
4.1 Use-case Diagram of Sign Up and Log In
4.2 Select Items
4.3 Add/Edit Items
4.4 Use-case Diagram of Create Outfits
4.5 Use-case Diagram of Set User’s Event in Calendar
4.6 Use-case Diagram of Suggest Outfit Plans
4.7 Landing Page
4.8 Sign Up
4.9 Log In
4.10 Check Items
4.11 Main Screen
4.12 Items
4.13 Outfits
4.14 Calendar
4.15 Find Association Rules
4.16 Show Outfit Recommendation
4.17 Find Association Rules
4.18 Apriori
4.19 Association Rules
5.1 My Outfit History for One Month
5.2 Test Data Set
5.3 Create New Outfit
5.4 Created Outfit
5.5 Add Event
5.6 First Recommendation
5.7 Second Recommendation
5.8 Third Recommendation
List of Tables
2.1 Comparison of Content-based filtering and Collaborative filtering
3.1 Comparison of Apriori, Eclat and FP-Growth
3.2 Data Set Example
3.3 Candidate Set 1 (C1)
3.4 Frequent Item Set 1 (L1)
3.5 Candidate Set 2 (C2)
3.6 Frequent Item Sets 2 (L2)
3.7 Candidate Set 3 (C3)
3.8 Generated Association Rules
4.1 Use-case Description of Sign Up
4.2 Use-case Description of Log In
4.3 Use-case Description of Select Items
4.4 Use-case Description of Add/Edit Items
4.5 Use-case Description of Create Outfits
4.6 Use-case Description of Set User’s Events to Their Calendar
4.7 Use-case Description of Suggest Outfit Plans
5.1 Association Rules from My Outfit History for One Month
5.2 Initial Data Sets
5.3 Frequent Item Sets
5.4 Association Rules
5.5 Web Performance
or activity”. In the first phase, the system gathers information on what outfits are chosen for what kind of events. The system uses association analysis techniques to analyze trends in the data set.
Association [...] of the information necessary to create daily outfit recommendations.
The activity diagram in Figure 1.1 shows the flow of my recommendation system, from website access to the system recommending an outfit and the user accepting it. First, the user accesses the website. If the user has already registered, the dashboard appears after the user logs in. If the user is not registered, he/she is taken to the dashboard after registering and logging in. Next, the user adds his/her event to the calendar. At that time, if the user has enough outfit history, an outfit is recommended. Association rule mining is used to recommend outfits. In more detail, association rules are generated from the combinations of events and their corresponding outfits in the user history. Therefore, if the user’s outfit history is non-existent or insufficient, the user must manually register the outfit along with the event. Finally, if the user accepts the recommendation of the system, the outfit is registered. If the user does not accept, the system suggests another outfit until the user accepts.
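The idea of deriving a recommendation from past (event, outfit) pairs can be illustrated with a minimal sketch. The history data, the function name `recommend`, and the 0.5 threshold are hypothetical and chosen only for illustration; the sketch ranks outfits by how often they were worn for the given event and does not reproduce the actual implementation.

```python
from collections import Counter

# Hypothetical outfit history of one user: (event, outfit) pairs.
history = [
    ("work", "suit"),
    ("work", "suit"),
    ("work", "shirt+chinos"),
    ("gym", "sportswear"),
    ("party", "dress shirt+jeans"),
    ("gym", "sportswear"),
]


def recommend(history, event, min_confidence=0.5):
    """Rank outfits for `event` by the confidence of the rule event => outfit."""
    event_count = sum(1 for e, _ in history if e == event)
    if event_count == 0:
        # No history for this event: the outfit must be registered manually.
        return []
    outfit_counts = Counter(o for e, o in history if e == event)
    ranked = [(o, c / event_count) for o, c in outfit_counts.most_common()]
    # Keep only the rules that are confident enough (assumed threshold).
    return [(o, conf) for o, conf in ranked if conf >= min_confidence]


print(recommend(history, "work"))
print(recommend(history, "wedding"))
```

An empty result corresponds to the case in which the history is non-existent or insufficient.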
of a recommendation system for an outfit and the algorithm used for the association rules are presented. The next chapter shows how the application is implemented. In the “Result and Evaluation” chapter, my recommendation system is evaluated on the basis of data collected from actual users. Finally, conclusions and future developments are discussed.
CHAPTER 1 INTRODUCTION
Figure 1.1: Activity Diagram
A web application (or web app) runs in a web browser. It is different from software that runs locally and natively on the operating system of the device. Recently, many IT companies have offered their services as web applications. These services do not require installation. They are also easy for the provider to publish. In addition, more and more web applications are incorporating machine learning models. Therefore, I decided to implement this system as a web application and release it to the public.
On a technical note, it is common practice these days to divide web applications into a front-end and a back-end. Figure 2.1 shows the general architecture of modern web applications. Various frameworks are available for both the front-end and the back-end. For the front-end, React, Vue, and Angular are the three most commonly used frameworks. These frameworks enable a technology called SPA (single-page application). Because content is switched within a single web page, no page transitions are needed, and the presentation is not tied to browser behavior. For the back-end, various frameworks have emerged in each programming language. When developing web applications that use machine learning, Python is often used for the back-end because of its extensive libraries.
CHAPTER 2 THEORETICAL BACKGROUND
Figure 2.1: Web Application Architecture
2.2 Recommender System
A recommender system is a system that provides suggestions for items that may be desirable to a particular user. In general, the suggestions relate to various decision-making processes, such as which products to buy, which music to listen to, or which online news to read.
There are two main approaches to recommendation systems. One is “Collaborative filtering”, and the other is “Content-based filtering”. Table 2.1 compares them. Collaborative filtering is a method of making recommendations based on users’ behavioral history with items. The paper [1] introduces a recommendation system based on collaborative filtering. Content-based filtering, on the other hand, is a method of recommending items by ranking their similarity based on the items’ features. In collaborative filtering, recommendations can be made through the behavioral history of other users without any information or knowledge of the target item. Content-based recommendation, in contrast, requires a design that converts the features of an item into a feature vector.
Table 2.1: Comparison of Content-based filtering and Collaborative filtering
Content-based filtering: Works with less data; item features are needed
According to the paper [2], recommendation systems are classified into three levels depending on the degree of personalization. If the same recommendation is made to all users, the system is classified as “non-personalized”. When recommendations are made based on data currently entered by the user, it is called “ephemeral personalization”. The most personalized recommendation method, “persistent personalization,” makes different recommendations based on the user’s personal information and past usage history, even if users have the same input and behavior in the system. My system is classified as “persistent personalization”. If multiple users select the same event, they will be recommended completely different outfits because they have different outfit histories.
The same paper also classifies recommendation systems by business objective. “Broad Recommendation Lists” makes recommendations using overall statistics and editors on the operator’s side. Best sellers and overall sales rankings are examples of this. In “Customer Comments and Ratings,” users make comments and reviews on a system provided by the provider.
The following measures are used to evaluate the accuracy of a recommendation system’s predictions. “Accuracy” is the percentage of agreement between the predicted results and the test data. “Precision” is the percentage of items determined to be conforming that are actually conforming. “Recall” is the percentage of all actually conforming items that are judged conforming.
These numbers can be found using the confusion matrix shown in Figure 2.2.
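As a small worked example, the three measures can be computed directly from the four cells of a confusion matrix. The counts below are made-up numbers used only for illustration, and the function name is hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision and recall from confusion-matrix counts.

    tp: predicted conforming, actually conforming
    fp: predicted conforming, actually non-conforming
    fn: predicted non-conforming, actually conforming
    tn: predicted non-conforming, actually non-conforming
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators (no positive predictions or labels).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts for 20 test recommendations.
accuracy, precision, recall = classification_metrics(tp=8, fp=2, fn=4, tn=6)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```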
My system uses an algorithm for outfit suggestions. In this project, in the first stage, the user determines the outfits. Then, once enough data on the user’s tendencies has been collected, rules describing those tendencies are extracted from the data set. For this purpose, association rule analysis is used in my system. Association rule mining is a rule-based machine learning method for discovering interesting relations between variables in large databases. The concept of association rule mining is explained in detail in the next section.
Figure 2.2: Confusion Matrix
2.3 Association Rule Mining
Association rule mining uncovers interesting patterns hidden in transactional records (usually referred to as item sets). It is best known for analyzing data sets collected from point-of-sale (POS) systems. Association rule analysis is used in a wide range of applications today.
According to the original definition proposed by Agrawal [4], the problem of association rule mining is defined as follows. Let $I = \{i_1, i_2, \ldots, i_n\}$ be a set of items and let $D = \{t_1, t_2, \ldots, t_m\}$ be a set of transactions over these items. A rule is defined as an implication of the form $X \Rightarrow Y$, where $X, Y \subseteq I$. The set of items $X$ is the antecedent or left-hand side (LHS), while $Y$ is the consequent of the association rule, which is also called the right-hand side (RHS) [5].
There are well-known measures for evaluating association rules. The following three indicators are used for recommendations.
• $\mathrm{Support} = \dfrac{\mathrm{Frequency}(X, Y)}{N}$
The support of the rule $X \Rightarrow Y$ is the percentage of the $N$ transactions in the data set that contain all the items in both $X$ and $Y$. It indicates how frequently the relevant item sets appear in the data set.
• $\mathrm{Confidence} = \dfrac{\mathrm{Frequency}(X, Y)}{\mathrm{Frequency}(X)}$
The confidence of the rule is the percentage of transactions containing $X$ that also contain $Y$. It indicates how often the rule is valid.
• $\mathrm{Lift} = \dfrac{\mathrm{Support}(X, Y)}{\mathrm{Support}(X) \cdot \mathrm{Support}(Y)}$
A practical solution to the problem of finding too many association rules satisfying the support and confidence constraints is to further filter or rank the rules found using additional interest measures. A popular measure for this purpose is lift. Lift, originally called Interest, measures the number of times $X$ and $Y$ occur together compared to the expected number of times if they were statistically independent. It can be interpreted as the deviation of the support of the whole rule from the support expected under independence, given the support of the LHS and the RHS. Higher lift values indicate stronger associations [6].
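The three indicators can be checked by hand on a small data set. The following sketch uses hypothetical transactions that mix an event with clothing items; the helper names `frequency` and `rule_metrics` are illustrative, and the code assumes that X and Y each occur at least once so that no denominator is zero.

```python
# Hypothetical transactions: each set mixes an event with the items worn.
transactions = [
    {"work", "shirt", "chinos"},
    {"work", "shirt", "jeans"},
    {"gym", "t-shirt", "shorts"},
    {"work", "shirt", "chinos"},
    {"party", "shirt", "jeans"},
]


def frequency(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def rule_metrics(x, y, transactions):
    """Support, confidence and lift of the rule x => y."""
    n = len(transactions)
    support_xy = frequency(x | y, transactions) / n
    support_x = frequency(x, transactions) / n
    support_y = frequency(y, transactions) / n
    confidence = support_xy / support_x
    lift = support_xy / (support_x * support_y)
    return support_xy, confidence, lift


support, confidence, lift = rule_metrics({"work"}, {"chinos"}, transactions)
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 for the rule {work} ⇒ {chinos} means that the two occur together more often than would be expected if they were independent.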
Chapter 3
Related Works
3.1 Similar Works
Similar work on outfit recommendation systems is presented in this section.
We consider several factors when choosing what to wear tomorrow (dress code, fashion, season, weather, color harmony, etc.) and combine the clothes we already own. The Outfit Support System is a machine learning system that assists in this complex decision-making process. There are several studies on such decision-making systems.
The paper [7], “Trip outfits advisor: Location-oriented clothing recommendation”, studies outfit suggestions for travel. By observing photo data from several popular travel websites, it was discovered that people’s clothing choices and color combinations were strongly correlated with the weather, the season, and the types of major attractions at the destination. This shows the effectiveness of a system that suggests clothing based on the travel destination.
Many fashion-specific recommendation systems use image features. The paper [8], “Hi, magic closet, tell me what to wear!”, presents a system that suggests appropriate clothing for situations such as travel and sports. Recommendations are made using the clothing attributes together with a feature-occasion potential, an attribute-occasion potential, and an attribute-attribute potential.
Whether an item fits the user’s aesthetic sense is very important in the user’s clothing choice. The paper [9], “Aesthetic-based Clothing Recommendation”, proposes to introduce aesthetic information, which is highly relevant to user preference, into clothing recommender systems.
3.2 Algorithms Used in Association Rule Mining
Many algorithms have been proposed to generate association rules. Some well-known algorithms are Apriori, Eclat, and FP-Growth, but they only do half the job, since they are algorithms for mining frequent itemsets. Another step is needed to generate rules from the frequent itemsets found in a database.
Table 3.1: Comparison of Apriori, Eclat and FP-Growth
            Apriori                 Eclat                 FP-Growth
Approach    Breadth First Search    Depth First Search    Divide & Conquer
3.2.1 Apriori Algorithm
The Apriori algorithm was proposed by Agrawal and Srikant in 1994 [10]. It proceeds by identifying the individual items that are frequent in the database and extending them to larger and larger itemsets as long as those itemsets appear in the database with sufficient frequency.
Below are the steps of the Apriori algorithm:
• Step-1: Determine the support of the item sets in the transactional database, and select the minimum support and confidence.
• Step-2: Keep all item sets whose support is at least the minimum (selected) support value.
• Step-3: Find all the rules in these subsets that have a confidence higher than the threshold (minimum confidence).
• Step-4: Sort the rules in decreasing order of lift.
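The steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the code used in the system: the nine transactions are hypothetical, the thresholds (a minimum support count of 2 and a minimum confidence of 0.5) are chosen arbitrarily, and the final sorting by lift (Step-4) is omitted for brevity.

```python
from itertools import combinations

# Hypothetical transactions over the items A to E.
transactions = [
    {"A", "B"}, {"B", "D"}, {"B", "C"}, {"A", "B", "D"}, {"A", "C"},
    {"B", "C"}, {"A", "C"}, {"A", "B", "C", "E"}, {"A", "B", "C"},
]


def apriori(transactions, min_count):
    """Return {item set: support count} for every frequent item set."""
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    # L1: frequent single items.
    current = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(current)
    k = 2
    while current:
        # Candidate set Ck: join frequent (k-1)-item sets and keep a
        # candidate only if all of its (k-1)-subsets are frequent.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        # Lk: count the candidates against the transaction table.
        current = {}
        for cand in candidates:
            count = sum(1 for t in transactions if cand <= t)
            if count >= min_count:
                current[cand] = count
        frequent.update(current)
        k += 1
    return frequent


def generate_rules(frequent, min_confidence):
    """Return (lhs, rhs, confidence) for every sufficiently confident rule."""
    rules = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, r)):
                confidence = count / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, confidence))
    return rules


frequent = apriori(transactions, min_count=2)
rules = generate_rules(frequent, min_confidence=0.5)
print(len(frequent), "frequent item sets,", len(rules), "rules")
```

The lookup `frequent[lhs]` is safe because every subset of a frequent item set is itself frequent, which is the property the candidate pruning relies on.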
Table 3.2: Data Set Example
Transaction ID    Item-set
Table 3.3: Candidate Set 1 (C1)
Item-set    Support Count
In the next step, candidate set 2 (C2) is generated with the help of L1. In C2, pairs of items from L1 are formed as two-item subsets. After the subsets are created, their support counts are checked again against the main transaction table of the data set. This yields the candidate set shown in Table 3.5.
Again, the support count of C2 is compared with the minimum support count, and the item sets with a lower support count are eliminated from C2. This generates the frequent item sets shown in Table 3.6.
Table 3.4: Frequent Item Set 1 (L1)
Item-set    Support Count
Table 3.6: Frequent Item Sets 2 (L2)
Item-set    Support Count
is calculated from the data sets. It gives Table 3.7.
Next, the frequent item set table (L3) is created. As the C3 table above shows, only one item set has a support count equal to the minimum support count. So L3 has only one combination, i.e., {A, B, C}.
The association rules are generated from the previous subset. To generate the association rules, first, a new table is created with the possible rules from the resulting combination {A, B, C}. For all the rules, the confidence is calculated using formula
Table 3.7: Candidate Set 3 (C3)
Item-set    Support Count
Table 3.8: Generated Association Rules
Rules    Support    Confidence (%)
3.2.2 Eclat Algorithm
ECLAT [11] stands for Equivalence Class Clustering and Bottom-Up Lattice Traversal. It is often described as a more efficient and scalable version of the Apriori algorithm. While the Apriori algorithm operates horizontally, mimicking a breadth-first search of a graph, the ECLAT algorithm operates vertically, like a depth-first search of a graph. Due to this vertical approach, ECLAT is typically faster than the Apriori algorithm.
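The vertical, depth-first idea can be sketched as follows. Each item is mapped to the set of transaction IDs that contain it (its tid-list), and the support of a larger item set is obtained by intersecting tid-lists instead of rescanning the transactions. The transactions and the minimum support count are hypothetical, and the sketch is a simplified illustration rather than the algorithm exactly as published.

```python
# Hypothetical transactions over the items A to E.
transactions = [
    {"A", "B"}, {"B", "D"}, {"B", "C"}, {"A", "B", "D"}, {"A", "C"},
    {"B", "C"}, {"A", "C"}, {"A", "B", "C", "E"}, {"A", "B", "C"},
]


def eclat(transactions, min_count):
    """Return {item set: support count} using tid-list intersections."""
    # Vertical layout: item -> set of transaction IDs containing it.
    tidlists = {}
    for tid, t in enumerate(transactions):
        for item in t:
            tidlists.setdefault(item, set()).add(tid)

    frequent = {}

    def extend(prefix, items):
        # `items` holds (item, tids) pairs that are frequent with `prefix`.
        for i, (item, tids) in enumerate(items):
            itemset = prefix | {item}
            frequent[frozenset(itemset)] = len(tids)
            # Depth-first step: intersect with the remaining tid-lists.
            suffix = []
            for other, other_tids in items[i + 1:]:
                common = tids & other_tids
                if len(common) >= min_count:
                    suffix.append((other, common))
            extend(itemset, suffix)

    start = sorted(
        ((item, tids) for item, tids in tidlists.items()
         if len(tids) >= min_count),
        key=lambda pair: pair[0],
    )
    extend(set(), start)
    return frequent


frequent = eclat(transactions, min_count=2)
for itemset, count in sorted(frequent.items(), key=lambda p: sorted(p[0])):
    print(sorted(itemset), count)
```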
3.2.3 FP-growth Algorithm
The two main drawbacks of the Apriori algorithm are:
1. It is necessary to construct a candidate set at each step.
2. It must scan the database repeatedly to build the set of candidates.
These two properties inevitably slow down the algorithm. To overcome these redundant steps, a new association rule mining algorithm called the Frequent Pattern Growth (FP-growth) algorithm was developed [12]. This algorithm overcomes the shortcomings of the Apriori algorithm by storing all transactions in a trie data structure.
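How transactions are stored in a trie can be illustrated with a short sketch. It covers only the construction of the tree in two scans of the data; the mining step of FP-growth, which extracts the frequent item sets from the tree, is not shown. The transactions, the class name `Node`, and the helper names are hypothetical.

```python
from collections import Counter

# Hypothetical transactions over the items A to E.
transactions = [
    {"A", "B"}, {"B", "D"}, {"B", "C"}, {"A", "B", "D"}, {"A", "C"},
    {"B", "C"}, {"A", "C"}, {"A", "B", "C", "E"}, {"A", "B", "C"},
]


class Node:
    """One trie node: an item, how many transactions pass through it."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_count):
    # First scan: count the items and drop the infrequent ones.
    counts = Counter(item for t in transactions for item in t)
    frequent = {i: c for i, c in counts.items() if c >= min_count}
    root = Node(None)
    # Second scan: insert each transaction with its most frequent items
    # first, so that transactions with common items share a path.
    for t in transactions:
        items = sorted((i for i in t if i in frequent),
                       key=lambda i: (-frequent[i], i))
        node = root
        for item in items:
            if item not in node.children:
                node.children[item] = Node(item, node)
            node = node.children[item]
            node.count += 1
    return root


def count_nodes(node):
    """Number of nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(child) for child in node.children.values())


tree = build_fp_tree(transactions, min_count=2)
print("nodes in tree:", count_nodes(tree))
```

Because shared prefixes are stored once, the tree is usually much smaller than the list of transactions, and the database does not have to be scanned again after these two passes.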
Chapter 4
Application Implementation
4.1 Technology
4.1.1 React (JavaScript Library)
My system uses version 17 of React (https://reactjs.org/), a JavaScript library for building user interfaces, on the front-end to implement a single-page application (SPA). React is among the most widely used JavaScript frameworks and libraries currently available. Since it is also available as open source, it is constantly being improved and its quality is maintained.
My system also uses version 12 of Next.js (https://nextjs.org/), a JavaScript framework by Vercel that is built on React. It is capable of server-side rendering, among other features.
4.1.2 MUI (UI Library)
MUI (https://mui.com/) is a UI library for React. My system uses MUI version 5. There are several UI frameworks for React, and MUI is one of the most popular. MUI makes it easy to build on the Material Design advocated by Google.
4.1.3 FastAPI (Web Framework for Python)
FastAPI (https://fastapi.tiangolo.com/), a web framework for Python, is used on the back-end to implement the recommender system. My system uses FastAPI version 0.86.0. SQLAlchemy, Uvicorn, and Pydantic are used together with it. This web framework is often used to build web applications that use machine learning models because it uses Python and is easy to implement. Although there are some very popular frameworks in Python, such as Flask and Django, FastAPI has several advantages over these frameworks.
The key features are the following:
• Automatically generates OpenAPI documentation
• Makes it easy to use Python type definitions
• Easy to implement asynchronous communication
• Can be implemented with less code
4.1.4 mlxtend
The mlxtend (machine learning extensions) library (http://rasbt.github.io/mlxtend/) provides a variety of tools that are useful for day-to-day data science work. It includes features such as learning curve plots and stacking that complement scikit-learn and matplotlib. The mlxtend library also contains a module for mining association rules.
1. Users can sign up and log in.
2. Users can input their available clothing items.
3. Users can create outfits from items that they have chosen or added.
4. Users can manually set the outfit plans they desire.
5. The system suggests an outfit plan based on the plan that the user enters in the calendar, using association rule mining.
6. Users can accept or reject the suggested plan, and if users reject it, they can change the plan.
(a) The system suggests an outfit plan.
(b) The user rejects the outfit plan.
(c) The system suggests a different outfit plan.
(d) Repeat until the user accepts the recommendation.
(e) The user accepts the outfit plan.
(f) The system stores the outfit plan.
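The accept/reject cycle in requirement 6 can be sketched as a simple loop. The function name, the ranked candidate list, and the callback standing in for the user's decision are all hypothetical, and what happens when every candidate has been rejected is an assumption of this sketch (here the caller falls back to manual input).

```python
def suggest_until_accepted(candidates, user_accepts):
    """Offer ranked outfit plans one by one until the user accepts one.

    `candidates` is a list of outfit plans ordered from best to worst.
    `user_accepts` is a callback that returns True or False for a plan;
    it is a placeholder for the interaction in the user interface.
    """
    rejected = []
    for plan in candidates:            # (a)/(c) suggest the next plan
        if user_accepts(plan):         # (e) the user accepts
            return plan, rejected      # (f) the caller stores the plan
        rejected.append(plan)          # (b) the user rejects
    # Assumption: no candidate left, so the user sets the plan manually.
    return None, rejected


ranked = ["suit", "shirt+chinos", "polo+jeans"]
chosen, rejected = suggest_until_accepted(
    ranked, lambda plan: plan == "shirt+chinos"
)
print("chosen:", chosen, "rejected:", rejected)
```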
Non-functional Requirements
The non-functional requirements for my system are as follows:
1. The web application runs in a browser.
2. The time to recommend outfits is less than 10 seconds.
3. A server can support service for 1000 users simultaneously.
4. It is easy to understand how to use the system.
5. The explanations are simple, even for those who are unfamiliar with fashion.
Figure 4.1: Use-case Diagram of Sign Up and Log In
4.2.2 Use-case Diagram and the Description
Use-case diagrams are graphical representations of possible interactions between users and a system. The use-case diagrams and their descriptions are shown below.
Sign Up and Log In
The sign-up and log-in use-case is shown in Figure 4.1. The description of the use case is presented in Table 4.1 and Table 4.2.
Input User Information
The use-case of “Input User Information” is shown in Figure 4.2. The description of the use case is presented in Table 4.3.
Table 4.1: Use-case Description of Sign Up
Related requirements: Users can sign up
Goal in the context: The system stores the user name and password in the database
Successful end condition: Store correct information
Failed end condition: Fails to store the user information
Primary actors: User, System
Trigger: Users press the “Sign Up” button
Main flow:
1. The user inputs user information
2. The system validates the input data
3. The user presses the “Sign Up” button
4. The system stores the user name and password
Exception:
Table 4.2: Use-case Description of Log In
Related requirements: Users can log in
Goal in the context: The system authenticates users by user name and password from the server, and gets user information from the database
Preconditions: Users sign up
Successful end condition: The system shows the main page, which shows user information (User Name) on the Navbar
Failed end condition: Fails to authenticate user
Primary actors: User, System
Trigger: Users press the “Log In” button
Main flow:
1. The user inputs user information
2. The user presses the “Log In” button
3. The system calls the API of the back-end server
4. The e-mail and password are checked
5. If the server returns OK and a User ID, the system calls the back-end API to fetch the User ID and User Name
6. The system redirects to the main page
Exception:
Figure 4.2: Select Items
Table 4.3: Use-case Description of Select Items
Use-case name: Select items
Related requirements: Users can select items they own
Goal in the context: The system stores collected data in the database
Successful end condition: Dashboard page shows up
Failed end condition: The database does not store items that the user has
Primary actors: User, System
Main flow:
1. The system shows available items
2. Users pick items
3. Dashboard page shows up
Exception:
Figure 4.3: Add/Edit Items