Improving sales forecasting models by integrating customers' feedback: a case study of fashion products


INTRODUCTION

Background of the study

Potential consumers often rely on online reviews before making purchase decisions. On the Shopee platform, customer feedback falls into two types: written reviews and ratings. Written reviews convey detailed customer sentiment after an order has been received, whereas ratings are typically assigned prior to purchase and can be influenced by various seller strategies. My research found that many retrieved comments were null and displayed only a rating score, since customers can delete their comments after the crawler has processed the data. To improve sales forecasting accuracy and reduce data noise, the rating feature is therefore removed and the analysis focuses on sentiment and daily sales volume.

Online reviews significantly affect product sales: by offering detailed insight into user experiences, they reduce customer uncertainty. The rise of Internet technology has amplified the popularity of these reviews, prompting Vietnamese e-commerce platforms such as Shopee, Lazada, and Sendo to build review systems that incentivize customer feedback. This shift influences purchasing decisions, as potential buyers rely on reviews to gauge whether sentiment about a product is negative, neutral, or positive. Consequently, various methods have been developed to assess the sentiment conveyed in online reviews.

Research objectives

Sales forecasting involves estimating future revenue by predicting the volume of products or services a company will sell. By accurately forecasting product sales, companies can develop effective strategies in marketing, sales management, and production, ultimately enhancing profitability and reducing costs. Research indicates that effective sales forecasting is crucial for informed decision-making and operational efficiency.

Customer feedback significantly impacts purchasing decisions, as it helps to alleviate or amplify consumers' uncertainty regarding a product. By offering detailed insights into users' experiences, businesses can provide essential information that influences potential buyers.

This study proposes a novel method for sales forecasting that integrates a convolutional neural network (CNN) model with the sentiment analysis capabilities of the PhoBERT model, using data from product reviews. By applying the pre-trained BERT model to assess the sentiment of online comments, the approach aims to improve the accuracy of sales predictions. Notably, little research has incorporated online review content to boost product sales. The effectiveness of the proposed forecasting method is evaluated on actual fashion data from the Vietnamese e-commerce platform Shopee.

This study focuses on machine learning-based prediction, specifically analyzing data from the Shopee e-commerce platform. Our project aims to document the methods used to address issues and forecast trends in women's fashion items on Shopee. We concentrate on a specific category within the Fashion sector, monitoring products such as jackets, coats, and vests to derive meaningful insights and predictions.

Shopee offers several valuable features for research, including detailed product information such as item ID, name, and link. Each product is accompanied by its number of reviews, ratings, and comments, along with the total comment count for each item. The user names of reviewers and the timestamps of their comments are also provided, alongside individual ratings and the quantities sold over consecutive days. Price information is available for every product.

From these data, related charts and growth forecasting models of the products can be built.

This research focuses on the "Jacket, Coat, Vest" category within the fashion industry, analyzing various models to predict daily sales with minimal error. By emphasizing sentiment analysis and time series data, we aim to enhance previous studies in Machine Learning. Our findings will assist sellers on the Shopee e-commerce platform in identifying upcoming trends in women's fashion, ultimately enabling them to optimize performance and financial outcomes. Accurate sales forecasting will empower businesses to make informed decisions, recognize annual sales trends linked to events, and manage inventory levels effectively to avoid overstocking or understocking. Additionally, this research will facilitate better financial management and strategic purchasing, ensuring businesses maintain a balanced budget throughout their operations.

Research methods

This article presents a novel strategy for predicting product sales from product review data through a combination of regression models, deep learning techniques, and sentiment analysis. By applying the pre-trained BERT model to analyze the sentiment of online reviews, the approach improves the accuracy of sales forecasting models. Notably, this research fills a gap in the existing literature, as few studies have explored the impact of online review content on boosting product sales. The effectiveness of the proposed strategy is evaluated using real-world fashion data from Shopee.

The remaining sections of this project are structured as follows: Chapter 2 includes a literature review, while the study framework and methodology cover data collection status, the management of online review data, and sentiment analysis models.

Chapter 3 outlines 15 forecasting models and their performance criteria. Chapter 4 presents the forecasting results, comparing them to the standard model and other sales forecasting models. Finally, Chapter 5 discusses the implications and limitations of the study, along with recommendations for future research.

Buyers and sellers both care about reactions on Shopee. This report illustrates the relationship between reactions and historical sold.

On Shopee, each product displays several main reactions: rating score, rating count, ratings with comments, likes, and, most importantly, historical sold.

Figure 1.1 Example of an item's interface on Shopee

Figure 1.2 Rating with context actions

Historical sales data is a key indicator of a product's revenue, which prompts the need to analyze customer reactions alongside this metric. Reactions can be categorized into two groups: pre-purchase actions and post-purchase feedback. Before making a purchase, potential buyers often seek information relevant to the product, including reviews and comments from previous customers. In contrast, ratings and contextual feedback are provided only after customers have completed their purchase, reflecting their experience with the product.

An analysis of total likes and historical sales across 26 levels in a specific category reveals a weak correlation between these two variables. The presence of various tools designed to boost likes artificially suggests that the displayed like counts may not be genuine, leading to misleading data that fails to reflect any true relationship.

Figure 1.3 Relationship between total sold and total likes

Total historical sales rise together with the total number of ratings, especially ratings accompanied by comments. Since customers can only provide ratings after receiving their orders, ratings accompanied by comments are more reliable than simple likes. Furthermore, sellers incur a fee for each transaction, which makes these figures costly and difficult to manipulate.

On average, the number of comments in each category equals the historical sold count multiplied by 0.0959, minus 177,089.
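As a rough illustration of the fitted line stated above, the following sketch evaluates it for one input. The function name and the example input of 10 million units are hypothetical; only the slope 0.0959 and the intercept of minus 177,089 come from the text.

```python
# Coefficients of the category-level linear fit reported in the text.
SLOPE = 0.0959
INTERCEPT = -177089


def expected_comments(historical_sold: float) -> float:
    """Estimate the number of comments in a category from its historical sold count."""
    return SLOPE * historical_sold + INTERCEPT


if __name__ == "__main__":
    # Hypothetical category with 10 million units sold (not a figure from the study).
    print(round(expected_comments(10_000_000)))
```

Because the intercept is negative, the line only gives positive estimates for categories with roughly 1.85 million or more units sold, so it should be read as a description of large categories rather than of individual items.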

Figure 1.4 Relationship between total historical sold and total ratings with context

Customers leave significantly more feedback on purchases in three key categories: women's clothing, beauty products, and fashion accessories. These items, primarily bought by women, are typically non-essential. Conversely, categories such as home care and tickets or vouchers receive fewer comments, as home care products are seen as necessities, while tickets and vouchers lack distinctive features.

By analyzing the features used to construct a dataset for crawling, I have concluded that there is a significant relationship between the historical sales index and various attributes of items listed on the Shopee website. Specifically, when assessing the correlation between customer reactions and historical sales, the rating together with the content index provides a more reliable evaluation.

- Women tend to comment after buying products more than men.
- Essential goods receive fewer comments.
- Products with few distinguishing characteristics receive fewer comments.

This project focuses on "Improving Sales Forecasting Models by Integrating Customer Feedback: A Case Study of Fashion Products." The initial challenge involves data analysis on the Shopee e-commerce platform, where I monitored a selection of fashion products to create growth forecasting models. The aim is to predict trends in women's fashion items on Shopee by incorporating sentiment analysis from natural language processing (NLP) within the broader field of data science. To address complex issues, I used software development techniques from computer science to manage data sources and apply ETL methods for effective data organization. As I work on advanced deep learning models and innovative techniques for sales forecasting and sentiment analysis, it is essential to use mathematics and statistics to optimize the models and identify key factors that enhance performance while reducing error rates through data augmentation.

Under the guidance of Assoc. Prof. Dr. Tran Thi Oanh and with Mr. Nguyen Trung Hieu, the CEO of Ciaolink, as my client, I focused on understanding the specific requirements and challenges associated with e-commerce sites. This collaboration allowed me to clarify the project's objectives and explore various data crawling techniques to address business problems effectively. I realized the importance of developing a tailored end-to-end product that meets the unique needs of the company.

I meticulously developed my graduation thesis project to leave a lasting impression at university. By addressing the raised requirements and challenges, I established a solid foundation for building my model and achieving meaningful results.

develop more to be a subject for my graduation thesis. They, together with the decision makers, guided me in carrying out this project.

This project assists sellers on the Shopee e-commerce platform by providing insights into upcoming women's fashion trends. Our analytical models aim to enhance sales performance and financial outcomes by understanding customer psychology, ultimately driving higher profits and revenue. Consequently, businesses can confidently compete in the marketplace.

An e-commerce website analysis report is a powerful tool that provides insights into an online store's performance. By understanding user behavior and identifying areas for improvement, a business can enhance the user experience, attract more customers, boost its conversion rate, and optimize the overall efficiency of its e-commerce site.

Data analytics is a powerful strategy for businesses to expand their audience reach and gain insights into customer interests and behaviors, such as preferred products and effective advertisements. Companies that quickly adapt their business models to evolving market trends and customer demands can achieve a significant competitive edge over both established competitors and new entrants. Additionally, continuously assessing and optimizing an e-commerce site's performance is essential for improving user experience, ultimately leading to increased sales.

Business analytics enables companies to analyze effective marketing KPIs and strategies, providing an interactive framework that simplifies the tracking of marketing initiative success across multiple levels of detail.

I have focused on a single e-commerce platform, Shopee, and the category I chose is women's jackets within women's fashion.

LITERATURE REVIEW

E-commerce transactions generate vast amounts of data, leading to various challenges when analyzing sales metrics. Key sales functions include identifying product attributes, setting prices, calculating net sales, and launching new products. Previous studies have explored the Expectation Maximization (EM) algorithm and various predictive techniques. Notably, Convolutional Neural Networks (CNN) have been employed for sales forecasting. While some research focuses on feature extraction for sales predictions, this study emphasizes sentiment analysis, particularly using PhoBERT, to automatically identify effective features that improve outcomes. Other studies have used neural network techniques, including recurrent neural networks (RNN) and long short-term memory networks (LSTM), each offering its own methodology for sales forecasting.

A 2018 study used the Nonlinear Autoregressive Neural Network (NARNN) to establish a forecasting and preprocessing framework for e-commerce challenges. Time series analysis, particularly the ARIMA algorithm, was employed to compare the effectiveness of these research studies. Recognized as a leading method for sales forecasting, time series analysis uses the autoregressive function to enhance predictive accuracy. Research indicates that machine learning models for sales time series forecasting serve as essential tools in modern business intelligence. Findings also suggest that the ARIMA model excels in delivering precise predictions within time series analysis.

A recent study highlights the potential of machine learning in forecasting restaurant sales, revealing that many establishments struggle with accurate daily sales estimates due to inadequate training in sales prediction methods. That research evaluates the effectiveness of supervised learning techniques for sales tracking and analysis, enabling better financial decision-making in the restaurant and e-commerce sectors. Its findings indicate that both random forest and gradient boosting models fit sales data well, with random forest emerging as the superior regression model due to its lower error rate. However, subsequent studies have shifted focus to time series models, neglecting the significant role of sentiment analysis, particularly of customer reviews, which are crucial in influencing purchasing decisions.

Time series analysis techniques, including autoregressive (AR), integrated (I), and moving average (MA) models, are commonly employed in traditional sales forecasting. These models use a linear function of historical sales data to predict future sales. More advanced models, such as autoregressive moving average (ARMA) and autoregressive integrated moving average (ARIMA), offer greater versatility and performance. However, these time series models are best suited to products with predictable or seasonal sales patterns, since they rely on past sales data for forecasting. As the volume of e-commerce data grows, newer techniques are emerging to better address irregular sales patterns. For instance, Kulkarni et al. use search phrase volume as a marketing metric to forecast new product sales from online search information, while Ramanathan et al. employ multiple linear regression analysis for improved forecasting accuracy.

Incorporating product-specific demand drivers enhances the accuracy of promotional sales predictions. Yeo et al.'s approach uses customers' browsing behavior to forecast product sales, yet its application is often limited to specific commercial contexts. These methods require extensive feature extraction from data using specialized domain knowledge, which makes it hard to organize and extract meaningful information effectively.
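For reference, the linear form shared by these models can be written as an ARMA(p, q) equation. The notation is standard textbook notation rather than the source's own: y_t is the sales value at time t, c is a constant, the phi_i are autoregressive coefficients, the theta_j are moving-average coefficients, and epsilon_t is a white-noise error term.

```latex
y_t = c + \sum_{i=1}^{p} \phi_i \, y_{t-i} + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j} + \varepsilon_t
```

ARIMA(p, d, q) applies the same equation after differencing the series d times, which is how the "integrated" component handles trends in the raw sales data.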

Online customer reviews are user-generated content found on e-commerce and other websites, serving as unstructured data that reflects customer opinions on products and services. These reviews are a vital source of product-related information, enabling businesses to gauge consumer sentiment and preferences. Research such as that conducted by Lassen et al. demonstrates the potential of online reviews in forecasting quarterly sales, particularly for products like iPhones.

Recent studies have used social media data to enhance predictive modeling in various industries. For instance, a linear regression model using Twitter data successfully predicted iPhone sales by analyzing factors such as tweet subjectivity, volume, and sentiment ratios. Similarly, Shukri et al. employed text mining and sentiment analysis to assess consumer satisfaction for automobile brands such as Mercedes-Benz, Audi, and BMW, categorizing emotions and polarities with a Naive Bayesian classifier. Their findings indicated a strong correlation between emotional categorizations and consumer satisfaction, highlighting the impact of user satisfaction on the virality of ideas on social media. In the film industry, Hur et al. used multiple regression models, including artificial neural networks and support vector regression, and incorporated sentiment analysis of movie reviews to improve accuracy in forecasting audience numbers. Researchers in another study demonstrated the effectiveness of multivariate regression models that combine social media and financial market data with time series analysis to estimate monthly automobile sales.

That study employs multivariate regression data and least squares support vector regression (LSSVR) models to forecast monthly total vehicle sales in the USA. Using three data sources (stock market prices, tweet sentiment scores, and hybrid statistics), the research integrates tweet sentiment and stock market values into a comprehensive dataset. Seasonal variables related to monthly vehicle sales are also incorporated, alongside deseasonalized factors derived from the sales data and input sources. The analysis compares LSSVR and backpropagation neural networks against various time series models, including the naive model, the exponential smoothing model, the autoregressive integrated moving average (ARIMA) model, and the seasonal ARIMA model. The findings indicate that LSSVR models yield more accurate predictions when using hybrid data and deseasonalizing methods than the other models.

Support vector machines (SVMs), rooted in Vapnik-Chervonenkis theory and the principle of structural risk minimization, have gained prominence as a powerful data classification method. Initially developed for regression tasks, support vector regression (SVR) has become increasingly popular for function approximation and regression estimation problems. However, both SVM and SVR encounter difficulties when addressing computationally complex quadratic functions. To improve computational efficiency, researchers have suggested modifying these models so that the quadratic programming problem is solved as a set of linear equations.

SentiStrength requires data preprocessing for effective text analysis, as highlighted in a study. First, only the relevant text area is retained, and duplicate content, often found in advertisements, is removed. Distracting symbols and website links are then eliminated. After the tweets have been filtered, SentiStrength evaluates the remaining data and generates scores that reflect the strength of positive and negative sentiment: a positive sentiment score from 1 to 5 and a negative sentiment score from -1 to -5. A score of 0 does not exist, and scores of 1 and -1 imply neither positivity nor negativity [33, 34].
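The filtering steps described above (dropping links, stripping distracting symbols, and removing duplicates) can be sketched as follows. This is a minimal illustration under stated assumptions, not the cited study's pipeline: the regular expressions, the function names, and the case-insensitive duplicate check are all hypothetical choices.

```python
import re

# Assumed patterns: what counts as a "link" or a "distracting symbol" is a guess.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
# Keep word characters, whitespace, and basic punctuation; drop everything else.
SYMBOL_PATTERN = re.compile(r"[^\w\s.,!?'-]")


def clean_tweet(text: str) -> str:
    """Remove links and symbols, then collapse repeated whitespace."""
    text = URL_PATTERN.sub(" ", text)
    text = SYMBOL_PATTERN.sub(" ", text)
    return " ".join(text.split())


def preprocess(tweets: list[str]) -> list[str]:
    """Clean every tweet and drop empty results and case-insensitive duplicates."""
    seen = set()
    cleaned = []
    for tweet in tweets:
        text = clean_tweet(tweet)
        key = text.lower()
        if text and key not in seen:
            seen.add(key)
            cleaned.append(text)
    return cleaned


if __name__ == "__main__":
    sample = ["Great car! http://t.co/abc #BMW", "great car!  #BMW", "@@ ###"]
    print(preprocess(sample))
```

Only the cleaned, de-duplicated texts would then be passed to the sentiment scorer.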

In the study [22], one-step-ahead rolling forecasting was applied to LSSVR models for multivariate regression, predicting the total vehicle sales for the upcoming month from the emotion scores of tweets and current stock market prices. The research used data from February 2016 to August 2017 for testing, while the training data spanned September 2009 to January 2016, and incorporated various models such as ARIMA, SARIMA, BPNN, and LSSVRTS [36].

[22] also constructed the BPNN model, which has one hidden layer and 10 hidden nodes, and utilized genetic algorithms [37] to generate suitable parameters for the BPNN and

The LSSVRTS models include four time series approaches: the naive model, the exponential smoothing model, the autoregressive integrated moving average (ARIMA) model, and the seasonal autoregressive integrated moving average (SARIMA) model, alongside one multivariate regression model.

The least squares support vector regression model was used to forecast monthly total vehicle sales. Furthermore, deseasonalizing techniques were applied to the monthly vehicle sales data, incorporating sentiment assessments from tweets and stock market values based on the deseasonalized components derived from the sales data.

RESEARCH METHODOLOGY

Sentiment Analysis Models

According to [7,8,9,10,11], the most advanced language model for Vietnamese is the pre-trained PhoBERT model (Pho, i.e. "Phở", is a popular dish in Vietnam):

The PhoBERT variants, "base" and "large," are the first large-scale monolingual language models available for Vietnamese and are built on the RoBERTa framework. RoBERTa optimizes the BERT pre-training procedure for more robust performance, and PhoBERT's pre-training strategy follows it.

PhoBERT sets a new benchmark in Vietnamese natural language processing (NLP) by excelling in four key tasks: part-of-speech tagging, dependency parsing, named-entity recognition, and natural language inference. It surpasses previous monolingual and multilingual models, demonstrating its superior capabilities in understanding and processing the Vietnamese language.

The first large-scale monolingual language models pre-trained for Vietnamese, PhoBERT-base and PhoBERT-large, are now publicly available. Experimental results show that PhoBERT consistently surpasses the performance of the latest multilingual model, XLM-R.

The PhoBERT models, developed by VinAI Research, advance the state of the art in various Vietnamese NLP tasks, including part-of-speech tagging, dependency parsing, named-entity recognition, and natural language inference. By using the methods outlined in the official publications, these models aim to support future research and applications in Vietnamese NLP.

Sentiment analysis is a vital task in natural language processing (NLP) that involves training machine learning models to classify text by opinion polarity. Among the available models, RoBERTa improves on the BERT pre-training method, leading to better performance and serving as the basis for the PhoBERT pre-training strategy. This study introduces an innovative approach for conducting sentiment analysis on Vietnamese reviews.

Our proposed sentiment analysis model leverages SentiWordNet, a specialized lexical resource for sentiment classification, in conjunction with PhoBERT, an optimized version of the BERT model tailored for the Vietnamese language.

In this study, we used the pre-trained PhoBERT model to analyze customer reviews for various items, categorizing them into three sentiment labels: "Negative," "Neutral," and "Positive." We employed raw data collected through web crawling together with the VLSP 2016 dataset, which contains reviews from the technology sector and is therefore suitable for comments on the Shopee platform. The PhoBERT experiment used a training dataset to fine-tune the model and a testing dataset to evaluate its performance in a similar context. The model analyzes and classifies sentences into sentiment categories, allowing it to predict whether new sentences are positive, negative, or neutral based on the training it received.

Sales Forecasting Models

The modeling process begins with selecting algorithms. Four methods widely used in predictive analysis are considered: linear regression, decision trees, random forests, and convolutional neural networks. Linear regression forecasts one variable from another, and techniques such as regularization, cross-validation, and dimension reduction help mitigate overfitting. Decision trees form a tree-like model of decisions and their potential outcomes; they handle both numerical and categorical data and maintain performance even when their assumptions are slightly violated. A random forest combines multiple decision trees that collectively produce a prediction. The baseline for testing the models is time series analysis, in particular the Autoregressive Integrated Moving Average (ARIMA) model.

ARIMA is chosen for its high prediction accuracy, which makes it one of the best available forecasting tools. ARIMA is a conventional statistical machine learning algorithm that does not require feature extraction, whereas the Long Short-Term Memory (LSTM) deep learning algorithm relies on memory cells and forget gates to store and discard information. This research forecasts market sales using ARIMA and compares its performance metrics and loss against various machine learning methods to improve the accuracy of future sales predictions.

Recent advancements in deep learning architecture have enabled the effective processing of vast amounts of real-time market data on e-commerce platforms. This study uses a long short-term memory (LSTM) network to predict daily product demand by analyzing time series and historical sales data. The raw sales data is preprocessed through various methods before being input into the model, which periodically adjusts sales figures based on actual demand. The analysis is conducted on data from 300 stores across Vietnam, with a training and testing split by date that optimizes prediction accuracy and minimizes error. The study highlights the advantage of deep learning in multivariate time series forecasting, as it eliminates the need for manual feature extraction. The potential of convolutional neural networks (CNN) is also noted for automatically extracting features to assess daily sales variance and any increases in average sales over specified intervals.

Integrating reviews and ratings of customers into forecasting models

Figure 3.1 A framework for integrating customers' feedback into forecasting models

The researcher enhances forecasting models by incorporating sentiment analysis, using raw data files that were crawled and matched by item ID. The dataset used for training and testing includes key columns such as "Quantity Sold," "Item," "Negative," and "Neutral."

We have selected regression models to integrate with the results of sentiment analysis. To achieve this, we create new columns for each record, specifically three columns that represent the labels "Positive," "Negative," and "Neutral."

The analysis categorizes customer comments into "Neutral" and "Positive" labels and calculates their rates by dividing the counts by the total number of comments. An additional column reflects the average customer rating for each record. Sales quantities for 300 items were collected between March 4, 2023, and April 13, 2023. The training set includes data up to April 9, 2023, while the testing set comprises data from the subsequent period. The study hypothesizes that customer feedback significantly influences purchasing decisions on Shopee, so that sentiment analysis models can be applied to improve forecasting accuracy.
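The date-based split described above can be sketched as follows. The record layout (a dict with a `day` key) and the function name are assumptions made for illustration; only the collection window and the April 9, 2023 cut-off come from the text.

```python
from datetime import date, timedelta

TRAIN_END = date(2023, 4, 9)  # last training day stated in the text


def split_by_date(records):
    """Split records (dicts with a 'day' key holding a datetime.date) at TRAIN_END."""
    train = [r for r in records if r["day"] <= TRAIN_END]
    test = [r for r in records if r["day"] > TRAIN_END]
    return train, test


if __name__ == "__main__":
    # One hypothetical item observed daily over the stated collection window.
    start, end = date(2023, 3, 4), date(2023, 4, 13)
    days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    records = [{"day": d, "q_sold": 0} for d in days]
    train, test = split_by_date(records)
    print(len(train), len(test))
```

Splitting by date rather than at random keeps every test observation strictly later than the training data, which is what a forecasting evaluation needs.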

Social networks and the internet have both changed significantly. One of the primary web destinations for users is customer reviews, which are helpful for expressing the

Sentiment analysis, also known as opinion mining, is a crucial activity in natural language processing (NLP) that evaluates users' attitudes and thoughts across various scenarios. Academics use crowdsourced data, such as customer feedback, to perform sentiment analyses on products, events, or environments. This process categorizes text-related attitudes as neutral, positive, or negative, drawing on text analytics and computational linguistics. The primary goal of sentiment analysis is to ascertain the author's viewpoint regarding a specific context, which may reflect personal opinions, emotional states, or the intentional expression of feelings.

Sentiment analysis plays a vital role in distinguishing between facts and opinions within source materials, where facts represent objective events and opinions reflect subjective feelings and attitudes. Understanding the context of customer ratings is essential for predicting purchasing decisions. In e-commerce, where customer feedback significantly influences data, integrating sentiment analysis improves the accuracy of sales predictions. This improved analysis provides reliable recommendations for organizations and individuals regarding item sales trends, ultimately leading to more effective supply chain management.

In the second phase of my research, I analyzed a dataset comprising 3,000 comments matched with the sales data of 300 items on the Shopee platform, using an updating tool for effective data management. I applied the conventional statistical models developed in the initial phase to compare the performance of different methods and data scales. Specifically, I used the PhoBERT model for sentiment analysis, with the VLSP 2016 dataset, to improve the accuracy of my findings.

I analyzed 37 comments for 60 items on Shopee, focusing on customer feedback for technical products, and achieved an accuracy of 0.88. This level of accuracy allows for reliable predictions of comments in the updated file. All comments were categorized using the same sentiment scoring system as before, with classifications of Negative: 0, Neutral: 1, and Positive: 2. The analysis used two files containing data on 300 matching items: one detailing customer feedback for each item and the other tracking the quantity sold.
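To make the label encoding concrete, the sketch below turns a list of predicted label IDs for one item into the per-label rates described earlier (count divided by the total number of comments). The function name and the short keys `neg`, `neu`, and `pos` are hypothetical; only the 0/1/2 encoding is taken from the text.

```python
from collections import Counter

# Label encoding stated in the text: Negative: 0, Neutral: 1, Positive: 2.
LABELS = {0: "neg", 1: "neu", 2: "pos"}


def sentiment_rates(label_ids):
    """Return the share of comments carrying each sentiment label for one item."""
    total = len(label_ids)
    if total == 0:
        # An item without comments gets zero for every rate (an assumed convention).
        return {name: 0.0 for name in LABELS.values()}
    counts = Counter(label_ids)
    return {name: counts.get(label_id, 0) / total for label_id, name in LABELS.items()}


if __name__ == "__main__":
    # Four hypothetical comments: two positive, one neutral, one negative.
    print(sentiment_rates([2, 2, 1, 0]))
```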

In just 41 days, I collected data on 300 items and added four new features (avg_neu, avg_neg, avg_pos, and avg_rate) to the sales forecasting models alongside the quantity sold (q_sold). These features are derived from the mean sentiment scores associated with each item, calculated from the current day and previous days to ensure accuracy in the training models. However, the limited variation in how often the sentiment scores change for each item suggests that customer feedback may not significantly affect the predictions of sales quantities.
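One way to read "calculated from the current day and previous days" is as an expanding (cumulative) mean, sketched below. This reading is an assumption, since the text does not say how many previous days are included, and the function name is hypothetical.

```python
def expanding_mean(values):
    """For each day t, return the mean of values[0..t] (current and all earlier days)."""
    means = []
    running_total = 0.0
    for day_count, value in enumerate(values, start=1):
        running_total += value
        means.append(running_total / day_count)
    return means


if __name__ == "__main__":
    # Hypothetical daily positive-comment rates for one item over three days.
    daily_pos_rate = [1.0, 0.0, 0.5]
    print(expanding_mean(daily_pos_rate))
```

Because each value only uses information available up to that day, a feature such as avg_pos built this way does not leak future feedback into the training rows.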

Conventional sales forecasting models, including Linear Regression, Decision Tree Regressor, Random Forest Regressor, and Long Short-Term Memory (LSTM), can incorporate sentiment features; however, ARIMA, a statistical machine learning algorithm, requires feature extraction as it does not handle multi-dimensional data. To combine sentiment scores with the ARIMA model, an ensemble approach using ARIMA-LSTM is proposed, which involves building an additional linear regression model that incorporates three features: the outputs of ARIMA and LSTM, sentiment scores, and ratings. After experiments with various forecasting models, including Linear Regression, Decision Tree, Random Forest, ARIMA, LSTM, and ensemble LSTM-ARIMA, the Convolutional Neural Network (CNN) demonstrated superior performance, particularly when using the rolling window method on smaller datasets. Furthermore, when sentiment features were integrated into the regression models, the CNN model achieved a lower Mean Absolute Error (MAE) score compared to the Root Mean Square Error (RMSE) score, highlighting its effectiveness in sales forecasting.
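The rolling window method mentioned above can be sketched as follows: each training sample is a fixed-length slice of past daily sales, and the target is the value on the following day. The function name and the window size of 3 are illustrative assumptions, not settings reported in the study.

```python
def make_windows(series, window_size):
    """Build (input window, next value) pairs by sliding over a daily sales series."""
    inputs, targets = [], []
    for start in range(len(series) - window_size):
        inputs.append(series[start:start + window_size])
        targets.append(series[start + window_size])
    return inputs, targets


if __name__ == "__main__":
    # Hypothetical daily quantities sold for one item.
    q_sold = [1, 2, 3, 4, 5]
    windows, next_day = make_windows(q_sold, window_size=3)
    print(windows, next_day)
```

With only 41 days per item, overlapping windows of this kind give a model such as a CNN or LSTM many more training samples than one sample per series would.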

Therefore, the sales forecasting models take selected features as input, namely the daily quantity sold and the accumulated sentiment scores (the average scores of negative, neutral, and positive comments), and predict daily sales volumes. Comparing sales forecasting models that incorporate sentiment analysis against those that do not, using evaluation metrics such as Mean Absolute Error (MAE) and Root Mean Square Error (RMSE), provides a comprehensive assessment of model performance.
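For reference, the two evaluation metrics can be written in a few lines. This is a minimal sketch using only the standard library; the function names and the sample quantities are illustrative and do not come from the study's data.

```python
import math
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of the absolute differences between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Square root of the average squared difference; large misses weigh more."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical daily quantities for one item, with one large miss on the last day.
actual = [3, 5, 2, 10]
predicted = [2, 5, 4, 4]
mae = mean_absolute_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)
print(mae, rmse)
```

Because RMSE squares each error before averaging, a single sales spike that the model misses raises RMSE much more than MAE, which is why the two metrics are reported side by side.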

RESULTS ANALYSIS AND DISCUSSIONS

Data Collection

Building a tool to crawl real-time data from the Shopee website poses significant challenges because the site differs from other e-commerce platforms such as Lazada and Tiki. Manual crawling methods, often showcased on platforms like YouTube, are insufficient for obtaining large datasets such as daily sales quantities, since Shopee's API does not provide direct access to this information and standard web scraping techniques are frequently blocked. Consequently, acquiring daily sales data and comments for sales forecasting models is impractical on a personal computer and requires a Virtual Private Server (VPS) for effective data crawling. This process requires substantial investment in labor, time, and resources. To address these challenges, I conducted a small-scale research project that concatenates data files with matching item IDs through programming loops. This approach aims to analyze the impact of customer feedback on sales forecasting models by integrating sentiment analysis with AI techniques, and ultimately to explore the hypothesis that positive feedback increases the likelihood that consumers select an item.

From the above figure, the data collection process includes 7 steps as follows:

Step 1. Determine the products and the features to crawl on Shopee: Item Name, Daily Quantity Sales, Price, Rating, and Customer Reviews.

Step 2. Determine the amount of data to crawl: all selected features over the most recent 16 days.

Step 3. Build a tool that automatically crawls the data day by day.

Step 4. Identify the exact parameters in the Shopee API that return these features.

Step 5. Open the connection to the server.

Step 6. Send the request to Shopee.

Step 7. Save the data in JSON and CSV format, or in a database if the volume is large, and clean the features in the dataset if necessary. The output is one zip file containing two files, one for quantity sold and one for comments, with the items in the same order.
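The saving part of Step 7 can be sketched with the standard library alone. The request itself is left out on purpose, since the Shopee endpoints and parameters are not given here. The function name `save_records` and the field names `item_id`, `date`, and `q_sold` are assumptions chosen for this example.

```python
import csv
import json
from pathlib import Path
from typing import Dict, List


def save_records(records: List[Dict], out_dir: Path, stem: str) -> None:
    """Write the same records to <stem>.json and <stem>.csv inside out_dir."""
    if not records:
        raise ValueError("nothing to save")
    out_dir.mkdir(parents=True, exist_ok=True)
    # ensure_ascii=False keeps Vietnamese item names readable in the JSON file.
    (out_dir / f"{stem}.json").write_text(
        json.dumps(records, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    # newline="" is the documented way to open a file for the csv module.
    with open(out_dir / f"{stem}.csv", "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(records[0].keys()))
        writer.writeheader()
        writer.writerows(records)


# Hypothetical rows standing in for one day's parsed response (dates as DD/MM/YYYY).
sample = [
    {"item_id": 1, "date": "04/03/2023", "q_sold": 12},
    {"item_id": 2, "date": "04/03/2023", "q_sold": 0},
]
```

Calling `save_records(sample, Path("crawl_output"), "quantity_sold")` would write both files into a `crawl_output` folder. A second call with the comment rows would produce the matching comments file.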

The following table details the attributes used in sales prediction, showing what each one means and what type of data it is.

Table 4.1 Data crawled statistics in initial progress

Attributes | Crawl methods | Number of data crawled
Item Names | Automatically crawl using tool | 300 items
Daily Quantity Sales | Automatically crawl using tool built | 10 rows of quantity sales for each item (10*300 records)
Prices | Manually crawl information showed in Shopee website | 300 ranges of prices
Ratings | Manually crawl information showed in Shopee website | 300 ratings
Reviews | Manually crawl information showed in Shopee website | 5646 comments
 | Manually crawl information showed in Shopee website | 300 general quantity sales for 300 items

Table 4.2 Data crawled statistics in updated progress

Attributes Description Crawl methods Number of data crawled

Item Ordering number of items Automatically crawl using tool built

ID ID number of each item

Automatically crawl using tool built

Name Name of each item Automatically crawl using tool built

Date (DD/MM/YYYY) Automatically crawl using tool built

41 rows of quantity sales for each item (41*300 records)

Automatically crawl using tool built 300 items

Automatically crawl real-time comments using tool built

All comments crawled each day (max 3000 comments for each item)

Account name of customer left comment for each item

Automatically crawl real-time comments using tool built

All users left comments for each item (300 items)

Raw time customer left comment for each item

Automatically crawl using tool built

All comment time (can transfer to YYYY-MM-DD type) for each comment one user left

Rating Rating score of each item Automatically crawl using tool built

Automatically crawl real-time comments

All real-time comments left for each item (300 items)

Initially, we attempted to crawl data from the Shopee website's "Jacket, Coat & Vest" category using Python, following suggestions from YouTube. While we successfully retrieved key information such as item links, titles, ratings, prices, and sales quantities in Vietnam and globally, we encountered challenges with data cleaning. Specifically, we needed to address issues like inconsistent formatting in sales figures and price ranges. Most critically, we lacked daily sales quantity data, which is essential for our prediction model focused on a single category within the e-commerce platform. Therefore, in the next phase, we aimed to automate the data crawling process to eliminate the need for manual extraction and coding of each item's information.

The use of a rolling window method on 12,300 rows of historical data significantly enhances demand forecasting for our AI model. By employing a tool designed to automatically crawl raw data, we gather practical, daily updated data more efficiently than manual methods, which struggle to handle large volumes of customer reviews and sales data. Researchers must monitor specific parameters for each item to obtain daily sales figures. While an Excel worksheet can store around a million rows, optimizing data processing speed and memory necessitates saving crawled e-commerce data in databases.

Data Pre-processing

For the pretrained PhoBERT sentiment analysis model, we gathered the following five dataset files:

- “Comment-shopee.xlsx”
- “Comment_shopee_exclude_time.xlsx”
- “df_quantityRatingComment.csv”
- “SA2016-Test-Gold.zip”
- “SA2016-training_data.zip”

Data files are uploaded to Google Colab by accessing files stored in Google Drive. I initially labeled the data in a public file using a numerical system to indicate sentiment: 2 for positive, 0 for negative, and 1 for neutral. These labeled data points were then concatenated into a training set. The accompanying images showcase sample data from each file, with the exception of the last two zip files, which contain public data used to train the PhoBERT model for sentiment analysis. Additionally, the file ‘Comment-shopee.xlsx’ includes timestamps, names, and dates corresponding to comments made by individuals about various items.

However, the raw comments crawled from the Shopee website were labeled manually with three labels (0: positive, 1: neutral, 2: negative), without considering how strongly positive or negative each comment is.

Figure 4.2 Self-labeled comments on Shopee

In the second file, I removed timestamps and the names of individuals who left comments, resulting in a total of 60 items on the first page. This modification facilitates the construction of sentiment analysis models, as each comment is manually labeled into three categories: ‘Good: 0’, ‘Neutral: 1’, and ‘Bad: 2’. These labels are consistent with those in the original raw data file.

Figure 4.3 Self-labeled comments on Shopee, excluding time
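The manual Shopee labels listed above (0 for good, 2 for bad) run in the opposite direction to the public dataset's scheme (0 for negative, 2 for positive). If the two sources are concatenated into one training set, one of them has to be remapped first. The sketch below shows one way to do that. It assumes the codes are exactly as listed. Whether and where the project performed this step is not stated, and the names `MANUAL_TO_PUBLIC` and `to_public_scheme` are hypothetical.

```python
# Manual Shopee labels as described: 0 = good, 1 = neutral, 2 = bad.
# Public-dataset labels as described: 0 = negative, 1 = neutral, 2 = positive.
MANUAL_TO_PUBLIC = {0: 2, 1: 1, 2: 0}


def to_public_scheme(manual_labels):
    """Map manually assigned labels onto the public-dataset convention."""
    return [MANUAL_TO_PUBLIC[label] for label in manual_labels]


# Two good comments, one neutral, one bad (hypothetical).
converted = to_public_scheme([0, 0, 1, 2])
print(converted)
```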

We collected samples by navigating through each item on the website and saved them in a data frame that corresponds to the quantity sold, identified by No_Item.

Figure 4.4 Comment data sample crawled manually

In Google Colab, I began the data pre-processing step by labeling a public dataset with numerical values based on sentiment: 2 for positive, 0 for negative, and 1 for neutral. I then combined these labeled data into a training set. Following a similar approach, I processed the testing data, which was also publicly available and categorized into three groups: Positive (2), Negative (0), and Neutral (1). I ensured that the manually assigned labels matched those in the public dataset, resulting in a comprehensive classification of comments from the Shopee platform into the three designated labels.

Figure 4.5 Total number of comments on Shopee platform divided into 3 labels

The majority of comments reflect neutral to positive emotions, with 986 neutral and 4,240 positive comments recorded. In contrast, negative comments account for just 7.3% of the total, while positive comments comprise a significant 75%. This data highlights the overall favorable sentiment expressed in the feedback.

The pretrained PhoBERT model was utilized to enhance the prediction of new sentences, highlighting the importance of improving forecasting models in the presence of noisy data and precise context in rating comments. When comments exhibit high similarity with minimal variance, it becomes crucial to implement advanced techniques in feature engineering to achieve better outcomes.

An essential aspect of sentiment analysis is the preprocessing of text data, which involves conducting numerous experiments and research using Python code to eliminate irrelevant characters and spaces. This process is crucial for enhancing the quality of the data, and I have compiled statistics from the raw data to develop a function named clean_text.

Figure 4.6 Clean text function in sentiment analysis

In the given function, characters and blank spaces do not affect the sentiment classification of comments, making their removal an effective method for eliminating noisy data.
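The body of clean_text appears only in the figure, so the following is a stand-in sketch of what such a function might look like rather than the original. It assumes that cleaning means lowercasing, replacing punctuation and emoticon symbols with spaces, and collapsing repeated whitespace. The exact rules used in the project may differ.

```python
import re
import unicodedata


def clean_text(text: str) -> str:
    """Lowercase a comment, drop punctuation and symbols, collapse whitespace."""
    # NFC keeps each Vietnamese letter as a single code point so \w matches it.
    text = unicodedata.normalize("NFC", text).lower()
    # Replace everything that is not a letter, digit, underscore or space.
    text = re.sub(r"[^\w\s]", " ", text)
    # Collapse runs of whitespace and trim both ends.
    return re.sub(r"\s+", " ", text).strip()


# A hypothetical raw comment with extra spaces, punctuation and an emoticon.
cleaned = clean_text("  Áo đẹp lắm!!!   Giao hàng NHANH :) 10/10 ")
print(cleaned)
```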

After cleaning the text, the three sets (train, dev, and test) consist of cleaned comments with their labels, as shown in the following figure.

The following figures describe how I processed Shopee comments in order to predict their labels with the PhoBERT model after the training and inference process.

By applying an optional function called “extract_comment”, the raw data is shown clearly in the “comment” column together with its sentiment label.

Figure 4.9 Example of cleaned data with predicted sentiment label

Exploratory Data Analysis (EDA)

4.3.1 EDA comment data to train PhoBERT

In this process, I used “Unicode” to convert comments with accented characters into unaccented text and then counted the values in each category, where True marks comments that have no accents.
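A check of this kind can be approximated with Python's standard `unicodedata` module, which is swapped in here and is not necessarily the tool the project used. The helper names `strip_accents` and `is_unaccented` and the sample comments are hypothetical.

```python
import unicodedata
from collections import Counter


def strip_accents(text: str) -> str:
    """Return text with Vietnamese diacritics removed."""
    # NFD splits letters such as "ế" into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # "đ" and "Đ" are separate letters rather than accented "d", so map them by hand.
    without_marks = without_marks.replace("đ", "d").replace("Đ", "D")
    return unicodedata.normalize("NFC", without_marks)


def is_unaccented(text: str) -> bool:
    """True when removing accents leaves the comment unchanged."""
    return strip_accents(text) == unicodedata.normalize("NFC", text)


comments = ["ao dep qua", "áo đẹp quá", "giao hang nhanh"]
# Count comments per category: True = written without accents.
flag_counts = Counter(is_unaccented(comment) for comment in comments)
print(flag_counts)
```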

I divided the dataset into training and development sets with a ratio of 80% training data to 20% testing data. After concatenating these sets, I printed the shapes of the Train, Dev, and Test datasets. Understanding the number of labels in each group within the training dataset is crucial, as illustrated in the accompanying figure.

Figure 4.10 Value counts of each label in training set
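The 80/20 split and the per-label counts can be sketched as follows. This is a standard-library illustration with a fixed random seed. The function name `split_train_dev`, the seed, and the toy data are assumptions, and the project may have used a library helper instead.

```python
import random
from collections import Counter
from typing import List, Sequence, Tuple


def split_train_dev(samples: Sequence, train_ratio: float = 0.8, seed: int = 42) -> Tuple[List, List]:
    """Shuffle a copy of the samples and split it into train and dev parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical (comment, label) pairs; labels follow 0 = negative, 1 = neutral, 2 = positive.
data = [(f"comment {i}", i % 3) for i in range(10)]
train, dev = split_train_dev(data)
# Number of training samples per label, as plotted in the value-count figures.
label_counts = Counter(label for _, label in train)
print(len(train), len(dev), dict(label_counts))
```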

Visualizing data in a bar chart allows for easier comparison, as it aligns with the labels established in the training and development sets.

Figure 4.11 Bar plot showing value counts of each label in train set

Figure 4.12 Bar plot showing value counts of each label in dev set

The next bar chart shows the distribution of sentence lengths in the training dataset.

The vertical y-axis is the number of sentences; meanwhile, the horizontal x-axis is the number of words.

Figure 4.13 Bar plot showing distribution of the length of sentences in train set
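The values behind such a plot can be computed with a counter. The sketch below assumes that sentence length is the number of whitespace-separated tokens, which for Vietnamese counts syllables rather than segmented words. The function name and the sample sentences are illustrative.

```python
from collections import Counter


def length_distribution(sentences):
    """Count how many sentences have each whitespace-token length."""
    return Counter(len(sentence.split()) for sentence in sentences)


dist = length_distribution(["hàng đẹp", "giao hàng nhanh", "tốt", "rất ưng ý"])
# Sorted (number of words, number of sentences) pairs: the x and y values of the bars.
bars = sorted(dist.items())
print(bars)
```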

The next step is to view the frequency of appearance of each word in the train set and dev set

Figure 4.14 Frequency of appearance of each word in the train set

Figure 4.15 Frequency of appearance of each word in the dev set

4.3.2 EDA data to load in sales forecasting models

Figure 4.16 Frequency changes across items

The analysis of the new data set reveals that the number of comments on each item remained unchanged during the specified date range (March 4, 2023 to April 13, 2023). The average number of sentiment changes for each item is 16.58.

In a 41-day period, sentiment changes occur only 16.58 times on average, highlighting their limited impact on efficiency compared to daily sales fluctuations, as sentiment variations represent just six percent of an item's lifecycle. The average sentiment rating across labels is notably low at 0.001, and the cumulative sentiment value remains relatively stable leading up to the predicted date, indicating minimal daily changes. To incorporate sentiment features into sales forecasting models effectively, it is essential to preprocess noisy data, as the majority of comments on platforms like Shopee tend to be neutral to positive, resulting in only a modest influence of sentiment on the quantity of items sold.
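One simple way to count sentiment changes for an item is to compare each day's value with the previous day's. The sketch below assumes that a "change" means any difference between consecutive daily values of a sentiment feature. The function name and the seven-day series are hypothetical.

```python
def count_changes(daily_values):
    """Count the days on which the value differs from the previous day."""
    return sum(1 for prev, cur in zip(daily_values, daily_values[1:]) if cur != prev)


# Hypothetical avg_pos series for one item over seven days.
series = [0.50, 0.50, 0.52, 0.52, 0.52, 0.49, 0.49]
changes = count_changes(series)
print(changes)
```

Averaging this count over all items gives a figure comparable to the per-item average reported above.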

Experiment results of sentiment analysis models

The research used three sources of data for training set including sample data from

The 2016 VLSP dataset from Shopee consists of comments data, which is split into a training set and a testing set in an 80:20 ratio. The PhoBERT model is used to evaluate the dataset, with results presented through a classification report that details label values and predictions, along with a heat map for visual representation.

The 'positive' label represents the largest volume, accounting for 47.51% of the actual values that match the predicted values. The overall accuracy of PhoBERT is 88.47%, with strong F1-scores across the three labels, although the 'neutral' label's F1-score is lower at 62.68%. The F1-score is crucial for evaluating model performance, as it is the harmonic mean of precision and recall.

It accounts for both false positives and false negatives. As a result, it works well with an unbalanced dataset. Table 4.3 shows the experimental results, including the performance measured on each label.
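To make the per-label metrics concrete, the sketch below computes precision, recall, and F1 for one label from two label lists. It is a plain-Python illustration, and the label lists are invented for the example rather than taken from the experiments.

```python
def precision_recall_f1(true_labels, predicted_labels, label):
    """Per-label precision, recall and F1 from two equally long label lists."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == label and p == label)  # true positives
    fp = sum(1 for t, p in pairs if t != label and p == label)  # false positives
    fn = sum(1 for t, p in pairs if t == label and p != label)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical labels: 0 = negative, 1 = neutral, 2 = positive.
y_true = [2, 2, 2, 1, 1, 0]
y_pred = [2, 2, 1, 1, 2, 0]
p, r, f1 = precision_recall_f1(y_true, y_pred, label=1)
print(p, r, f1)
```

In this toy data the neutral label is confused with the positive one in both directions, which lowers its precision and recall together. That is the kind of pattern a low neutral F1-score points to.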

1 https://vlsp.org.vn/vlsp2016/eval/sa

2 https://huggingface.co/vinai/phobert-base

Table 4.3 Results of PhoBERT model on each label (%)

precision recall F1-score support

This study introduces a deep sentiment representation model that integrates convolutional neural networks (CNN) and long short-term memory recurrent neural networks (LSTM). The model uses pre-trained word vectors, allowing the CNN to extract key local features from the text, which are then processed by a four-layer LSTM to create sentence representations for sentiment classification. By employing 1-dimensional convolution, the model handles unstructured data effectively. Extensive experiments on the dataset, which included preprocessing steps such as removing null comments and stop words, demonstrated the model's effectiveness. Notably, it retains sentiment features with zero value to ensure comprehensive feature inclusion over a 41-day period. In sentiment analysis, the PhoBERT model outperformed the other deep learning models, LSTM and CNN, which achieved accuracies of 81.99% and 80.05% respectively; of these two, LSTM had the better F1-scores and weighted averages.

Table 4.4 Results of LSTM model on each label (%)

precision recall F1-score support

Table 4.5 Results of CNN model on each label (%)

precision recall F1-score support

The analysis of the three sentiment analysis models revealed that the PhoBERT model achieved the highest performance. Consequently, the researcher decided to enhance sales forecasting models by integrating customer reviews with the results from the PhoBERT model.

Table 4.6 Results of three models on sentiment analysis

precision recall F1-score support

LSTM 82.16 81.99 82.05 5119

Experiment results of sales forecasting models

The evaluation metrics used in this study are presented alongside the results and discussions. Focusing on daily sales forecasting, the researcher employed PhoBERT for sentiment analysis of customer comments, as well as linear regression, decision tree, random forest, and convolutional neural network models. The Mean Absolute Error (MAE) was selected as the evaluation metric to determine model performance. By comparing the outcomes from each model, the researcher aims to identify the model with the lowest error rate.

Table 4.7 Results of sales prediction models with and without integrated customers' feedback, using one model built to predict 300 items

Integrating customer feedback into data analysis can significantly reduce evaluation metrics like MAE and RMSE. To assess the sentiment of recent comments effectively, implementing a rolling window function is essential. This approach not only highlights the quality of recent feedback but also enhances data augmentation, particularly when the window size is appropriately selected.

The data frame used for training models includes four key features: daily quantity sold and the accumulated sentiment average score for each label (negative, neutral, and positive comments), across seven rows for each item. These features are automatically extracted by a CNN model, providing valuable insights into consumer sentiment and sales performance.
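The rolling-window construction can be sketched as follows. The example assumes that each training sample is a block of seven consecutive days with the four features, and that the target is the quantity sold on the following day. That target definition and the function name `make_windows` are assumptions for illustration.

```python
from typing import List, Sequence, Tuple

FEATURES = ("q_sold", "avg_neg", "avg_neu", "avg_pos")


def make_windows(rows: Sequence[dict], window: int = 7) -> Tuple[List[List[List[float]]], List[float]]:
    """Turn one item's daily rows into (window x features) inputs and next-day targets."""
    inputs, targets = [], []
    for start in range(len(rows) - window):
        block = rows[start:start + window]
        inputs.append([[row[name] for name in FEATURES] for row in block])
        # The target is the quantity sold on the day right after the window.
        targets.append(rows[start + window]["q_sold"])
    return inputs, targets


# Ten hypothetical days for one item; the collected data has 41 days per item.
days = [{"q_sold": d, "avg_neg": 0.1, "avg_neu": 0.2, "avg_pos": 0.7} for d in range(10)]
X, y = make_windows(days)
print(len(X), len(X[0]), len(X[0][0]), y)
```

Under these assumptions, 41 days and a seven-day window yield 34 samples per item, so sliding the window multiplies the number of training examples obtained from a short history.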

To analyze sales data effectively, key features such as daily sales variance and the increase in sales after seven days should be extracted. The CNN model outperforms conventional regression models such as Linear Regression, DecisionTreeRegressor, and RandomForestRegressor in sales forecasting, achieving a 4.8% improvement in MAE when combined with sentiment scores. Unlike LSTM models, which require extensive datasets, CNNs perform well with fewer data points, which underlines the importance of normalization, regularization, and dropout in mitigating overfitting. Moreover, MAE is a more suitable error metric for sales quantity than MSE, because MSE squares each error and can therefore produce disproportionately high values. When the dataset is expanded beyond the initial ten-day capture, the models can measure sales spikes over 41 days. The CNN model yields an MAE of 4.52 and an RMSE of 11.92 with sentiment features, compared to 4.75 and 12.14 without them. As the dataset grows, model accuracy is expected to improve by up to 5%. The DecisionTreeRegressor model reaches an MAE of 4.56 with sentiment features, a 3.39% reduction in error, while RandomForestRegressor shows a smaller decrease, from 4.7 to 4.56. The RandomForestRegressor with customer reviews integrated results in an RMSE of 12.20, compared to 12.31 without.

Linear Regression and Decision Tree Regression record RMSE values of 12.25 and 12.22 when integrating customer feedback, and 12.36 and 12.16 without sentiment features, so only the Decision Tree shows a slight increase in RMSE. The analysis indicates that CNN outperforms traditional regression models, which often require manual feature extraction and struggle with overfitting in large datasets. Additionally, while CNN's multiple layers facilitate the extraction of crucial features, Random Forest Regressor finds it challenging to identify the relevant features associated with sales forecasting.

Integrating PhoBERT sentiment analysis with the CNN model improves forecasting performance, reducing the Mean Absolute Error (MAE) by approximately 0.23 (from 4.75 to 4.52), which is a 4.84% decrease in error compared to the CNN model without sentiment analysis. For the DecisionTreeRegressor, incorporating sentiment analysis features lowers the MAE by 0.16, a 3.39% reduction in error.
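The percentage figures follow from a simple relative-reduction calculation, shown here as a short check. The CNN values are taken from the text. The decision tree's MAE without sentiment features (4.72) is not stated directly and is inferred from the reported MAE of 4.56 with sentiment features and the reported improvement of 0.16.

```python
def relative_reduction(before: float, after: float) -> float:
    """Percentage by which the error dropped from `before` to `after`."""
    return (before - after) / before * 100


# CNN: MAE 4.75 without sentiment features, 4.52 with them.
cnn = relative_reduction(4.75, 4.52)
# Decision tree: 4.56 with sentiment features; 4.72 without is inferred (4.56 + 0.16).
tree = relative_reduction(4.72, 4.56)
print(f"CNN: {cnn:.2f}%  DecisionTree: {tree:.2f}%")
```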

CONCLUSION AND FUTURE RESEARCH

Title	Improving sales forecasting models by integrating customers’ feedbacks: A case study of fashion products
Author	Luong Thuy Vy
Supervisor	Assoc. Prof. Dr. Tran Thi Oanh
University	Vietnam National University, Hanoi International School
Major	Computer Science
Type	Graduation project
Year of publication	2023
City	Hanoi
