Vietnamese Aodai Clothes based on Computer Vision and Machine Learning
Thang Cao, Hung T Nguyen, Hien M Nguyen and Yukinobu Hoshino
Abstract The more that human society develops, the greater the human need for well-mannered and elegant clothes, especially traditional costumes. Selecting fine clothes for a specific occasion is always an interesting individual question. Based on computer vision and machine learning, this research proposes a Kansei (emotional) evaluation for the Aodai, the traditional and well-known Vietnamese dress for women. Features of an Aodai image are described by color coherence vectors. Self-organizing maps and multi-layer neural networks are used to learn the relationships between the image features and the Kansei words. Once trained, the system can recommend which Aodai is suitable for a woman based on her desired feelings. She can use this recommendation when purchasing an Aodai at on-line stores or selecting one from her own collection for an outing. Topics for future research include investigating other image representation methods, such as combinations of color buckets in different parts of the Aodai, using more detailed descriptions of decorative patterns, and integrating conspicuity factors such as color harmony, discriminability, and visibility.
Thang Cao
The University of Electro-Communications, Tokyo, Japan, e-mail: cao@hpc.is.uec.ac.jp
Hung T Nguyen
VNU University of Science, Hanoi, Vietnam, e-mail: nguyenthehungkhmttn@gmail.com
Hien M Nguyen
Hanoi Water Resources University, Hanoi, Vietnam, e-mail: hiennm@wru.vn
Yukinobu Hoshino
Kochi University of Technology, Kochi, Japan, e-mail: hoshino.yukinobu@kochi-tech.ac.jp
1 Introduction
One important aspect that highlights the beauty of Vietnamese women is their Aodai costume. Early versions of the Vietnamese Aodai appeared in the 17th century in the Nguyen Dynasty. Throughout the country's history, the Aodai has changed little in design, decorative pattern, and color [1]. Currently, the most popular Aodai style is a long dress that fits tightly around the woman's upper torso and splits into two flaps from the waist down, covering wide pants. This style emphasizes a woman's bust and curves while making it easy to move, as shown in Fig. 1.
Each Aodai is customized to fit a specific body. Color and decorative patterns, together with the wearer's emotions, normally depend on her age and the setting in which she goes out. Young girls often dress in pure white, office women in delicate pastels with slight decoration, and middle-aged women in strong, rich colors and decorations, as illustrated in Fig. 2. A woman's Aodai also embodies her personality and social position.
Using computer vision and machine learning, this paper presents a Kansei evaluation system for the Aodai. Based on Aodai images and emotional evaluations gathered from a survey, the system estimates whether an Aodai image fits the feelings of a woman. Section 2 describes the selection of an Aodai for different occasions. Section 3 introduces Kansei engineering in fashion design. Sections 4 and 5 present the selection of Kansei words, image features, and data preparation in our experiments. Section 6 presents the modeling of the Kansei evaluation for Aodai images, represented by CCV histograms, with self-organizing maps and neural networks. Our conclusions and future work are discussed in Section 7.
Fig. 1: Design of Aodai (Source: Wikipedia)
Fig. 2: Examples of Aodai for girls (left), office women (middle), and middle-aged women (right)
2 Selecting an Aodai for an Occasion
Whatever the occasion, the more elegantly we dress, the more respect we receive from the people around us. A person's clothes and fashion may also bring a relaxed and pleasant atmosphere to others. We often ask ourselves how to choose suitable clothes for a specific occasion, whether we want to stand out or to blend in with others.
In Vietnam, a woman often has a collection of Aodai in a variety of colors and decorations, and for each outing she chooses one that fits her own emotions. Here are some questions she may ask herself when choosing clothes:
• Meeting place: university campus, office, hotel, or park?
• Emotions she wants others to feel about her: vivid, sweet, or gentle?
• Activity purpose: conference, outing, showing, or ceremony?
• Who she will meet: students, office staff, or businessmen?
• Who will be around her: young, middle-aged, or old people, or all of them?
From such questions, the woman chooses the Aodai whose color and decoration she thinks most suitable, and she hopes that others will feel the same way. However, a choice made from one individual's emotions does not always match the feelings of others, and the wearer may need advice, especially when going to a special event. She also needs a recommendation when looking for an appropriate Aodai at on-line stores.
3 Kansei Engineering in Fashion Design
Kansei or Affective Engineering translates human emotions and feelings into specific parameters that can be used for product development, design, and evaluation [2]. So far there have been few Kansei studies related to fashion, such as those on clothing and fabric design [3, 4, 5] and on fashion more broadly [6, 7, 8].
Ogata and Onisawa [3] proposed a system that presents several clothing design patterns to the user. Based on the user's evaluation, the system runs a genetic algorithm to search through clothing patterns until a satisfactory pattern is found.
Kim and Cho [4] also used an interactive genetic algorithm to develop a fashion design support system. They classified the design of dresses for women into three parts, each represented by a separate 3-D model, and then created different designs from combinations of these models. The system suggests a preferable design through an interactive session with the user.
Using rough set theory, Santos and Rebelo [5] constructed a semantic database describing relations between the function and the context of use of clothes. The proposed system provides clothing designers and producers with relevant information, such as users' clothing preferences for a certain task, and can therefore help at the beginning of the clothing design process.
Using Principal Component Analysis, Anitawati et al. investigated the relations between e-commerce website designs and consumers' emotional responses to the websites. The relations were analysed based on predefined rules on colours, design elements, layouts, page orientations, and typography [6].
Through a survey of fashion experts on a variety of fashions collected from fashion magazines and documents worldwide, Yi-Ching Chang et al. used cluster analysis and multidimensional scaling analysis to identify distinct fashion style images and to define a suitable design language for each style image. The survey was also used to find differences in how designers and consumers perceive fashion style images [7].
S. Ishihara et al. presented an automatic semantic structure analyser and Kansei expert system (ES) builder using self-organizing neural networks. The system automatically analyses the semantic structure of Kansei words by using two self-organizing maps. Via a graphical user interface, users can browse and explore the Kansei structures generated by the ES [8].
Eric and Kamei used a multi-layer neural network to produce a color conspicuity value from the RGB values of two figures and a potential ground. The output value, which is the ideal relative area of the two figures, is applied to visualization designs by weighting each conspicuity value with a ground coefficient and the relative size of every color in a design [14].
This research deals with the emotional evaluation of Vietnamese Aodai images. Image features and Kansei words are learned by self-organizing maps and multi-layer neural networks. Once trained, the system can recommend which Aodai is suitable for a woman based on her desired feelings, and it can be used when purchasing an Aodai at on-line stores or selecting one from her own fashion collection.
4 Selecting Kansei Words for Aodai
From common adjectives that Vietnamese people use to express their emotions and feelings about clothes, we collected 34 Kansei words categorized into three groups: Elegant, Active, and Inactive. After conducting an initial survey, we selected only nine words for the three groups, as listed in Table 1. We then used the semantic differential scale method with five levels, from one to five, in a survey on the emotions evoked by different Aodai clothes. Fig. 3 shows an interface of the survey program in Vietnamese.
Table 1: Kansei words for Aodai
5 Image Features and Training Data
To give reasonable emotional advice for selecting clothes, the system should be able to model the relations between Kansei words and clothing characteristics such as color, size, and type. A popular method for representing clothing images is the histogram, as described below.
5.1 Color Intensity Histogram
Fig. 3: An interface of the survey program
A color intensity histogram represents the number of occurrences of each color intensity in an image. For an image $I : (x, y) \to [0, 255]$, where $(x, y)$ is the pixel in row $x$ and column $y$ of the image, the color intensity histogram is given as follows:
$$h_i = \left|\{(x, y) : I(x, y) = i\}\right|, \quad i = 0, 1, \dots, 255 \qquad (1)$$
That means $h_i$ is the number of pixels having a color intensity value of $i$. Color histograms are often used to compare images because different objects usually have distinctive histograms, and histograms are easy to calculate. However, a color histogram only shows overall pixel intensity information and does not represent correlations between colored objects in the image. Two different images may have the same color intensity histogram.
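The histogram definition and its main limitation can be illustrated with a minimal sketch. It assumes a grayscale image stored as a list of rows of integers in [0, 255]; the function name `intensity_histogram` and the two toy images are hypothetical and are not part of the original system.

```python
def intensity_histogram(image, levels=256):
    """Return a list h where h[i] is the number of pixels with intensity i."""
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1
    return hist


# Two different 2x2 toy images (assumed data) that contain the same pixels
# in a different spatial arrangement.
checker = [[0, 255],
           [255, 0]]
stripes = [[0, 0],
           [255, 255]]

hist_checker = intensity_histogram(checker)
hist_stripes = intensity_histogram(stripes)

# The images differ, yet their histograms are identical, because the
# histogram ignores where the pixels are located.
same_histogram = hist_checker == hist_stripes
```

Here `same_histogram` is true although the two images are clearly different, which is the weakness that motivates adding spatial information to the histogram.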
5.2 Color Coherence Vector
Color Coherence Vectors (CCV) is a histogram-based method for comparing images that incorporates spatial information [10]. A color's coherence is defined as the degree to which pixels of that color are members of large similarly-colored regions. The significant regions are called coherent regions. Coherent pixels are part of some sizeable contiguous region, while incoherent pixels are not. A CCV stores the numbers of coherent and incoherent pixels for each color. CCVs prevent coherent pixels in one image from matching incoherent pixels in another. This allows a fine distinction that cannot be made with color intensity histograms.
To compute a CCV, an image is first slightly blurred, and then the color space is discretized into $n$ color buckets. Next, the connected components of pixels that have the same discretized color bucket are calculated. A pixel is coherent if the size of its connected component exceeds a fixed value $\tau$, and the pixel is incoherent otherwise. The CCV of an image is the vector $\langle(\alpha_i, \beta_i)\rangle$, $i = 1, \dots, n$, where $\alpha_i$ is the number of coherent pixels and $\beta_i$ is the number of incoherent pixels of the $i$-th discretized color. It has been reported that CCV is better than color histograms in image comparison [10]. Fig. 4 illustrates CCV regions on an Aodai image.
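The steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation used in the experiments: it works on a grayscale image given as a list of rows, omits the blurring step, uses 8-connectivity, and the names `color_coherence_vector`, `n_buckets`, and `tau` are chosen here for illustration.

```python
from collections import deque


def color_coherence_vector(image, n_buckets=2, tau=2, levels=256):
    """Return [(alpha_i, beta_i)] per bucket: coherent and incoherent pixel counts."""
    height, width = len(image), len(image[0])
    bucket_width = levels // n_buckets
    # Discretize every pixel into one of n_buckets color buckets.
    buckets = [[min(v // bucket_width, n_buckets - 1) for v in row] for row in image]
    seen = [[False] * width for _ in range(height)]
    ccv = [[0, 0] for _ in range(n_buckets)]

    for r in range(height):
        for c in range(width):
            if seen[r][c]:
                continue
            # Breadth-first search over one connected component of equal buckets.
            bucket = buckets[r][c]
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and not seen[ny][nx]
                                and buckets[ny][nx] == bucket):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Pixels are coherent only if their component is larger than tau.
            ccv[bucket][0 if size > tau else 1] += size
    return [tuple(pair) for pair in ccv]


# Toy images (assumed data): both contain thirteen dark and three bright pixels.
scattered = [[10, 10, 10, 200],
             [10, 10, 10, 10],
             [200, 10, 10, 10],
             [10, 10, 10, 200]]
clustered = [[200, 200, 200, 10],
             [10, 10, 10, 10],
             [10, 10, 10, 10],
             [10, 10, 10, 10]]

ccv_scattered = color_coherence_vector(scattered)  # [(13, 0), (0, 3)]
ccv_clustered = color_coherence_vector(clustered)  # [(13, 0), (3, 0)]
```

Both toy images have the same intensity histogram, but their CCVs differ: the three bright pixels are incoherent when scattered and coherent when they form one region.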
Fig. 4: An illustration of CCV regions
5.3 Training Data
Training data for building the system consist of 110 images of Aodai clothes and the corresponding Kansei words collected from a survey of 41 Vietnamese people of various ages and social positions. After the survey, we have 110 × 41 = 4510 training instances; the detailed numbers for each Kansei word are shown in Table 2. From the Aodai images, normalized CCV histograms are created and clarified after performing appropriate image pre-processing steps, such as histogram equalization and noise removal.
Table 2: The number of training instances for each Kansei word
Kansei Words Training Instances Percentage
6 Modeling Relations between Aodai Clothes and Kansei Words
6.1 Modeling by SOM
Self-Organizing Maps (SOM) are a kind of unsupervised learning that is often used to discover structures or relationships in data. A SOM automatically finds a mapping from the space of input vectors to a one- or two-dimensional space. The mapping preserves the closeness between the vectors: two input vectors close to each other are mapped to points on the output map that keep the spatial relationship of the input space [11].
The advantage of SOM is that it is simple, easy to understand, and good for visualization. One can easily train the network and then intuitively evaluate how well the training performed and how similar the objects are. The limitation of SOM is the accuracy of distances among output neurons. It is easy to see the distribution of input vectors on the output map, but it is difficult to accurately evaluate distances and similarities between them. Moreover, if the output dimension and learning algorithm are chosen improperly, similar input vectors may not always be close to each other, and the network may converge to a local optimum [12].
SOM has so far been used in many practical applications, including Kansei modeling [8, 9]. In this research, the inputs to the SOM are CCV histograms and its output is a map showing the locations of Aodai images. Aodai images with similar CCV histograms are arranged in the vicinity of each other, so the Kansei words describing similar Aodai are also in the vicinity of each other. The modeling of Aodai images and Kansei words by SOM is illustrated in Fig. 5.
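As a rough illustration of how such a map is trained, the sketch below implements a small SOM with a Gaussian neighbourhood and linearly decaying learning rate. The grid size, decay schedule, parameter names, and toy feature vectors are all assumptions made for this example; the paper does not state the settings it used.

```python
import math
import random


def find_winner(weights, vector):
    """Return (row, col) of the neuron whose weight vector is closest to `vector`."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            dist = sum((wi - vi) ** 2 for wi, vi in zip(w, vector))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def train_som(data, rows=3, cols=3, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Train a rows x cols map; all hyperparameters here are illustrative."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # learning rate decays to zero
        sigma = max(sigma0 * (1.0 - frac), 0.3)  # neighbourhood radius shrinks
        for vector in data:
            br, bc = find_winner(weights, vector)
            for r in range(rows):
                for c in range(cols):
                    grid_dist2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_dist2 / (2.0 * sigma ** 2))
                    w = weights[r][c]
                    # Pull the neuron toward the input, more strongly near the winner.
                    for k in range(dim):
                        w[k] += lr * h * (vector[k] - w[k])
    return weights


# Toy normalized feature vectors standing in for CCV histograms (assumed data).
features = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.0], [0.1, 0.1, 0.8], [0.0, 0.2, 0.8]]
som = train_som(features)
positions = [find_winner(som, f) for f in features]
```

Each entry of `positions` is the map location of one input, so images with similar feature vectors tend to land on the same or neighbouring neurons.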
Fig. 5: Modeling Aodai images and Kansei words by SOM
Fig. 6: Normalized emotional degrees (right) for an Aodai image (left) using similar images on SOM's output map that is shown in Fig. 7
At a winner neuron on the output map, the modeled emotional degrees are estimated from the training instances that fall on the neuron and on its neighbours, as described below.
Let the winner neuron be $B$, its neighbour neurons be $B_n$ ($n = 1, \dots, N$), and the degrees of the emotional words modeled by the winner be $A_j$ ($j = 1, \dots, 9$ for the nine emotional words). $A_j$ is computed as follows:
$$A_j = A_{Bj} + \sum_{n=1}^{N} A_{B_n j} \times d_{B_n \to B} \qquad (2)$$
where $A_{Bj}$ and $A_{B_n j}$ are the degrees of the word $A_j$ on the neurons $B$ and $B_n$, respectively, and $d_{B_n \to B}$ is the distance of the neighbour neuron $B_n$ to the winner neuron $B$; $d_{B_n \to B}$ is close to one when $B_n$ is near $B$ and close to zero otherwise.
When a woman chooses an Aodai, its image is fed to the SOM inputs and a winner neuron is identified on the output map. By Eq. (2), the system estimates the degrees of the Kansei words associated with the winner neuron as the emotional evaluations for the Aodai. Fig. 6 shows an example of emotional evaluations for an Aodai image by SOM. An output map is illustrated in Fig. 7.
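Eq. (2) can be illustrated with a short worked example. The sketch assumes that the degrees on each neuron and the weights $d_{B_n \to B}$ are already available as lists; the function names, the three-word toy data, and the sum-to-one normalization are assumptions for illustration (the paper does not state how the degrees in Fig. 6 are normalized).

```python
def emotional_degrees(winner_degrees, neighbour_degrees, closeness):
    """Apply Eq. (2): A_j = A_Bj + sum over n of A_Bnj * d_(Bn->B)."""
    result = list(winner_degrees)
    for degrees, d in zip(neighbour_degrees, closeness):
        for j, value in enumerate(degrees):
            result[j] += value * d
    return result


def normalize(values):
    """Scale the degrees so they sum to one (an assumed normalization)."""
    total = sum(values)
    return [v / total for v in values] if total else list(values)


# Toy example with three words instead of nine, to keep the arithmetic visible.
winner = [2.0, 1.0, 0.0]                       # A_Bj on the winner neuron B
neighbours = [[1.0, 0.0, 1.0],                 # A_B1j on neighbour B_1
              [0.0, 2.0, 2.0]]                 # A_B2j on neighbour B_2
closeness = [0.5, 0.25]                        # d_(B1->B), d_(B2->B)

degrees = emotional_degrees(winner, neighbours, closeness)  # [2.5, 1.5, 1.0]
normalized = normalize(degrees)                             # about [0.5, 0.3, 0.2]
```

For the first word, the computation is 2.0 + 1.0 × 0.5 + 0.0 × 0.25 = 2.5, so a neighbour contributes less the farther it lies from the winner.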
6.2 Modeling by Multi-Layer Neural Networks
Fig. 7: A SOM map for Aodai images after learning with CCV histograms
As a kind of supervised learning, multi-layer neural networks (NN) are an effective technique to analyse, model, and make sense of complex data across a broad range of applications. They enable intelligent systems to learn from experience and examples, improving the performance of the system over time [13, 14, 15]. To train a neural network, a set of training instances with corresponding outputs needs to be provided. A trained neural network can be used to predict outputs for unknown input data.
In modeling emotional evaluations of Aodai clothes, the inputs to the neural network are the features of Aodai images, and the outputs are Kansei words with their degrees. After training, the relations between the image features and emotional words are generalized, and the trained neural network can assign a proper emotional word to a new Aodai image. When a woman looks for an Aodai, the system can help her identify how people feel about the Aodai that she likes. The modeling by neural networks is shown in Fig. 8.
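The input-output mapping described above can be sketched as a forward pass through a network with one hidden layer. The layer sizes, sigmoid activation, random initial weights, and the toy feature vector are assumptions for illustration; the paper does not specify its architecture here, and the training step (for example, backpropagation on the survey data) is omitted.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """Each neuron holds n_in weights followed by one bias term."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def forward_layer(layer, inputs):
    # zip stops after the n_in weights; the trailing bias is added separately.
    return [sigmoid(sum(w * x for w, x in zip(neuron, inputs)) + neuron[-1])
            for neuron in layer]


def predict(network, features):
    """Propagate image features through every layer and return word degrees."""
    activations = features
    for layer in network:
        activations = forward_layer(layer, activations)
    return activations


rng = random.Random(0)
n_features, n_hidden, n_words = 8, 5, 9   # nine outputs, one per Kansei word
network = [init_layer(n_features, n_hidden, rng),
           init_layer(n_hidden, n_words, rng)]

# A toy normalized feature vector standing in for a CCV histogram (assumed data).
features = [0.4, 0.1, 0.0, 0.2, 0.1, 0.1, 0.05, 0.05]
degrees = predict(network, features)
```

With untrained weights the nine values in `degrees` carry no meaning; after training on the survey instances, each output would approximate the degree of one Kansei word for the given image.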