
HSUM-HC: Integrating Bert-based hidden aggregation to hierarchical classifier for Vietnamese aspect-based sentiment analysis

Tri Cong-Toan Tran
Ho Chi Minh City University of Technology
Vietnam National University Ho Chi Minh City
Ho Chi Minh, Viet Nam
tri.tran.1713657@hcmut.edu.vn

Thien Phu Nguyen
Ho Chi Minh City University of Technology
Vietnam National University Ho Chi Minh City
Ho Chi Minh, Viet Nam
thien.nguyen.phu@hcmut.edu.vn

Thanh-Van Le*
Ho Chi Minh City University of Technology
Vietnam National University Ho Chi Minh City
Ho Chi Minh, Viet Nam
ltvan@hcmut.edu.vn

*Corresponding Author

Abstract—Aspect-Based Sentiment Analysis (ABSA), which aims to identify sentiment polarity towards specific aspects in customers' comments or reviews, has been an attractive topic of research in social listening. In this paper, we construct a specialized model utilizing PhoBert's top-level hidden layers integrated into a hierarchical classifier, taking advantage of these components to propose an effective classification method for the ABSA task. We evaluated our model's performance on two public Vietnamese datasets, and the results show that our implementation outperforms previous models on both datasets.

Index Terms—aspect-based sentiment analysis, PhoBert, BERT, hidden layer aggregation, hierarchical classifier, Vietnamese corpus

I. INTRODUCTION

The fast growth of e-commerce, particularly the B2C (business-to-customer) model, has resulted in a rise in online purchasing habits. It makes day-to-day transactions extremely simple for the general public and has ultimately become one of the most popular forms of shopping, especially during a global pandemic like COVID-19. Due to the sheer development of social media platforms, customers are encouraged to provide reviews and comments expressing their positive or negative sentiments about the products or services that they experienced. Analyzing a huge amount of data to mine public opinion is a time-consuming and labor-intensive operation. As a result, building an automatic sentiment analysis system can help consumers exploit the quality judgments of others about products of interest. Moreover, such a system supports businesses in better managing their reputation, understanding business requirements so as to adapt to customers' needs, and avoiding marketing disasters. For this reason, sentiment analysis has become one of the most attractive study fields in machine learning among academic and business researchers in recent years.

There has been interesting previous research on sentiment analysis for Vietnamese text using the VLSP 2016 datasets¹. However, plain sentiment analysis no longer provides enough information, since it assumes that an entire review has only one topic and one sentiment, while a product can have both pros and cons across many aspects. The challenge of Aspect-based sentiment analysis (ABSA) is not only detecting the aspects in a review but also the sentiment attached to each aspect. A review can consist of dozens or hundreds of words about multiple aspects with a different sentiment toward each, and determining which sentiment words go with which aspect can be very difficult. With ABSA, reviews about a product can be analyzed in detail, showing the reviewer's opinion on each aspect of that product.

The main problem of ABSA is as follows: given a customer review about a domain (e.g. hotel or restaurant), the goal is to identify the set of (Aspect, Polarity) pairs that fit the opinions mentioned in the review. Each aspect is a pair of an entity and an attribute, and the polarity is one of negative, neutral, or positive sentiment. For each domain, all possible combinations of entities and attributes are predefined. The ABSA task is divided into two phases: (i) identify pairs of entities and attributes, and (ii) analyze the sentiment polarity for the corresponding aspect (entity#attribute) identified in the previous phase.

For example, the review "Nơi đây có quang cảnh tuyệt đẹp, đồ ăn cũng ngon nhưng phục vụ hơi tệ" (This place has an amazing view, the food is great too but the service is bad) will output (Entity#Attribute: Polarity) pairs as follows: (Hotel#Design&Features: Positive), (Food&Drinks#Quality: Positive), (Service#General: Negative).

1 https://vlsp.org.vn/vlsp2016/eval/sa
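To make this output format concrete, the review and its gold labels can be represented as simple data structures; the sketch below is illustrative, and the exact label strings are assumptions following the (Entity#Attribute: Polarity) convention above.

```python
# Illustrative representation of the ABSA output for the example review;
# the label strings are assumptions following the convention in the text.
review = ("Nơi đây có quang cảnh tuyệt đẹp, đồ ăn cũng ngon "
          "nhưng phục vụ hơi tệ")

# Expected output: a set of (aspect, polarity) pairs, where each aspect
# is a predefined entity#attribute combination for the hotel domain.
gold_labels = {
    ("HOTEL#DESIGN&FEATURES", "positive"),
    ("FOOD&DRINKS#QUALITY", "positive"),
    ("SERVICE#GENERAL", "negative"),
}
```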

In this paper, we propose a method that uses multiple of Bert's top-level hidden layers for classification, combined with an intuitive hierarchical classifier, for the ABSA task. Our results demonstrate that a large model with many hidden layers contains useful information which can be exploited to get better results. We achieved the highest scores when applying our method to two Vietnamese ABSA datasets: the VLSP² and UIT ABSA [1] datasets.

II. RELATED WORK

In recent years, Sentiment Analysis has taken off and been strongly developed through advanced research for social listening. Many corpora and tasks have been developed, such as SemEval 2015 (Task 12) [2] and 2016 (Task 5) [3] for various languages, including English, Chinese, etc. The first public Vietnamese benchmark datasets were released by the VLSP (Vietnamese Language and Speech Processing) community in 2018. The organizers built two benchmark document-level corpora with 4,751 and 5,600 reviews for the restaurant and hotel domains, respectively.

Several interesting methods have been proposed to handle these tasks. The earliest works were heavily based on feature engineering (Wagner et al. [4]; Kiritchenko et al. [5]), making use of combinations of n-grams and sentiment lexicon features to solve various ABSA tasks in SemEval 2014. Nguyen and Shirai [6], Wang et al. [7], and Tang et al. [8] were able to achieve higher accuracy by improving on neural networks with hierarchical structure, integrating dependency relations and phrases [6], an attention module [7], or a target-dependent mechanism [8]. Ma et al. [9] incorporated useful commonsense knowledge into a deep neural network to further enhance the model.

Recently, language models pre-trained over large text corpora, such as ELMo (Peters et al. [10]), OpenAI GPT (Radford et al. [11]), and especially BERT (Devlin et al. [12]), have shown their effectiveness in alleviating the effort of feature engineering. Chi Sun et al. [13] proposed four methods for converting the ABSA task, such as question answering (QA) and natural language inference (NLI), into a sentence-pair classification task by constructing auxiliary sentences and fine-tuning a BERT model to solve it. The sentence pair is created by concatenating the original sentence with an auxiliary sentence generated by one of several methods from the target-aspect pair. Karimi et al. [14] proposed two modules called Parallel Aggregation and Hierarchical Aggregation, utilizing the hidden layers of the BERT language model to produce deeper semantic representations of input sequences. A prediction and its loss are computed for each of the selected modules, and these losses are then aggregated to produce the final loss of the model. They used Conditional Random Fields (CRFs) for the sequence labeling task, which yielded better results. In addition, their experiments also show that training BERT for a large number of epochs does not cause the model to overfit.

2 https://vlsp.org.vn/vlsp2018/eval/sa

For a low-resource language such as Vietnamese, there has been little study of aspect-based sentiment analysis over the years, but progress has been steady. Oanh et al. [15] proposed a BERT-based hierarchical model which integrates the context information of the entity layer into the prediction of the aspect layer, optimizing a global loss function to capture the information from all layers. Their model consists of two main components: a Bert component encodes the context information of the review into a representation vector, and this representation vector is used as input to the hierarchical model to generate multiple outputs (entity; aspect; polarity), one per layer. Thin et al. [16] investigated the performance of various monolingual pre-trained language models compared with multilingual models on the Vietnamese aspect category detection problem. This research showed the effectiveness of PhoBert compared to several models, including XLM-R [17], the mBERT model [12], and another version of the BERT model for the Vietnamese language.

III. PROPOSED MODEL

In this section we introduce HSUM-HC, our ABSA approach inheriting the benefits of PhoBert with hidden layer aggregation and hierarchical classifiers for Vietnamese text (Fig. 1). By deeply analyzing the characteristics of each model, we believe this combination gives us a model that is well suited for the ABSA task. PhoBert is a monolingual model pre-trained specifically for the Vietnamese language. Input sequences are tokenized and fed into the PhoBert model; we then take the top n hidden layers as meaningful context input for the next step, the hierarchical aggregation layer. The output of that layer is then input into a hierarchical classifier for predicting the set of aspects and sentiment polarities.

1) Bert Model: There have been many multilingual pre-trained Bert models that support Vietnamese, but as pointed out by [18], these models have two main problems: little pre-training data and a failure to recognize compound words. PhoBert was created to address these problems; it is also the first monolingual Bert model pre-trained for Vietnamese. PhoBert's pre-training approach is based on RoBerta [19], which improves Bert's training procedures for better performance. The pre-training was done with 20GB of monolingual text (the Vietnamese Wikipedia and a Vietnamese news corpus³) and employs a word segmenter, VnCoreNLP⁴, to tokenize compound words (e.g. khách_sạn, thức_uống). We use PhoBert as the pre-trained model in our research because we aim to process Vietnamese text for the ABSA task. For fine-tuning, we follow the steps taken when pre-training the model: we use VnCoreNLP for word segmentation, and PhoBert's tokenizer to split sequences into tokens, map tokens to their indices, and add the [CLS] token at the start and the [SEP] token at the end of each sequence. This tokenizer also gives us the attention masks and pads sequences to ensure equal length. The list of token ids and attention masks is then input into the Bert model.

3 https://github.com/binhvq/news-corpus
4 https://github.com/vncorenlp/VnCoreNLP

Figure 1: Our HSUM-HC model for the ABSA task
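Concretely, this preprocessing pipeline can be sketched as follows, assuming the vncorenlp Python wrapper and the Hugging Face tokenizer for PhoBert; the jar path and pretrained-model name are illustrative assumptions, not taken from our released code.

```python
# Sketch of the preprocessing pipeline described above, assuming the
# vncorenlp Python wrapper and the Hugging Face tokenizer for PhoBert.
# The jar path is a placeholder for a local VnCoreNLP installation.
from vncorenlp import VnCoreNLP
from transformers import AutoTokenizer

# Word segmentation merges Vietnamese compound words (e.g. "khách sạn"
# -> "khách_sạn") so that they match PhoBert's pre-training vocabulary.
segmenter = VnCoreNLP("VnCoreNLP-1.1.1.jar", annotators="wseg")
tokenizer = AutoTokenizer.from_pretrained("vinai/phobert-large")

def encode(review: str, max_len: int = 256):
    # VnCoreNLP returns one token list per sentence; rejoin them into a
    # single space-separated, word-segmented string.
    segmented = " ".join(
        token for sentence in segmenter.tokenize(review) for token in sentence
    )
    # The tokenizer adds PhoBert's sequence delimiters (its equivalents
    # of [CLS]/[SEP]), pads to a fixed length, and returns the attention
    # mask alongside the token ids.
    return tokenizer(
        segmented,
        padding="max_length",
        truncation=True,
        max_length=max_len,
        return_tensors="pt",
    )
```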

2) Hidden layer aggregation with hierarchical classifiers: A Bert-based model with a hierarchical classifier was created by Oanh et al. [15] to deal with ABSA. Its architecture is based on how a human would manually annotate the same task. It carries out classification in three layers: Entity, Aspect, and Sentiment. The process is to first label the entity (e.g. Hotel, Room, ...), then identify the entity's attribute (e.g. Design, Comfort, ...) to form an aspect, and lastly analyze the sentiment for that aspect in the review. Every layer contributes its output as context to the next layer. With this architecture, we can solve ABSA with an end-to-end model, without the need for multiple classifiers.
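A minimal sketch of such a three-layer hierarchical head is shown below; the single-linear-layer heads, concatenation wiring, and dimensions are our assumptions based on the description above, not the authors' released code.

```python
import torch
import torch.nn as nn

class HierarchicalClassifier(nn.Module):
    """Three stacked prediction layers (Entity -> Aspect -> Sentiment),
    each passing its output as extra context to the next layer. The
    linear heads and concatenation scheme are illustrative assumptions."""

    def __init__(self, hidden_size: int, n_entity: int, n_aspect: int,
                 n_polarity: int):
        super().__init__()
        self.entity_head = nn.Linear(hidden_size, n_entity)
        self.aspect_head = nn.Linear(hidden_size + n_entity, n_aspect)
        self.polarity_head = nn.Linear(hidden_size + n_aspect, n_polarity)

    def forward(self, pooled: torch.Tensor):
        # pooled: (batch, hidden_size) sentence representation from Bert.
        entity = self.entity_head(pooled)
        aspect = self.aspect_head(torch.cat([pooled, entity], dim=-1))
        polarity = self.polarity_head(torch.cat([pooled, aspect], dim=-1))
        return entity, aspect, polarity  # raw logits for the three layers
```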

In the original Bert with hierarchical classifier implementation from Oanh et al. [15], we observe that some improvements can be made to achieve better performance on this task. Firstly, they used a multilingual Bert model and did further training for Vietnamese to create a pre-trained model accustomed to Vietnamese. However, it is still not specialized, since without the use of a segmenter, Vietnamese compound words are not handled properly. We experimented with the model architecture in their paper and saw that we could improve the result by around 3% by using PhoBert as the pre-trained model and VnCoreNLP for word segmentation. Secondly, in their implementation, only the last hidden layer was used to make the prediction; this means the top layer is considered most important, and the information in all previous hidden layers is not utilized. [20] showed that all hidden layers of BERT can contain information, and that higher-level layers hold valuable semantic information. Thus, we can enhance the Bert-based model by using these layers. For that reason, we implemented the hierarchical hidden layer aggregation architecture of [14], which adds a BERT layer on top of the hidden layers. Each output is aggregated with the previous hidden layer and then goes through a hierarchical classifier, and the total loss is the sum of every classifier's losses.

The Binary Cross-Entropy loss function for each layer $L_i$ of the classifier is calculated as follows:

$$L_i = -\sum_{c=1}^{C} \left[ y_c \cdot \log(\sigma(\hat{y}_c)) + (1 - y_c) \cdot \log(1 - \sigma(\hat{y}_c)) \right] \quad (1)$$

with $C$ being the number of classes for that layer.

The loss for each classifier is the sum of the three prediction layers' losses calculated above:

$$classifier\_loss = L_1 + L_2 + L_3 \quad (2)$$

The total loss is the sum of all classifiers' losses, with $H$ being the number of classifiers:

$$total\_loss = \sum_{h=1}^{H} classifier\_loss_h \quad (3)$$

With this implementation, we obtain an enhanced model with the goal of achieving the best possible performance on the aspect-based sentiment analysis task: a monolingual pre-trained model for Vietnamese text, a mechanism to exploit this pre-trained model to its full potential, and a hierarchical classifier. Our promising results are presented in detail in the experiments section.
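A condensed sketch of this aggregation scheme together with the losses of Eqs. (1)-(3) follows. For brevity it uses a plain running sum of the top-n hidden states (omitting the extra BERT layer of [14]) and BCEWithLogitsLoss for Eq. (1); all shapes and wiring are assumptions, not a definitive implementation.

```python
import torch
import torch.nn as nn

class HSumHC(nn.Module):
    """Sketch: attach one hierarchical classifier to each of the top-n
    hidden layers, feeding each classifier the running sum of that layer
    and the layers above it; the total loss sums all classifier losses."""

    def __init__(self, bert, n_layers: int, make_classifier):
        super().__init__()
        self.bert = bert                  # e.g. a loaded PhoBert-large
        self.n_layers = n_layers          # 4 or 8 in our experiments
        self.classifiers = nn.ModuleList(
            [make_classifier() for _ in range(n_layers)]
        )
        # BCEWithLogitsLoss applies the sigmoid of Eq. (1) internally.
        self.bce = nn.BCEWithLogitsLoss()

    def forward(self, input_ids, attention_mask, targets):
        # targets: (entity, aspect, polarity) multi-hot float tensors.
        out = self.bert(input_ids, attention_mask=attention_mask,
                        output_hidden_states=True)
        top = out.hidden_states[-self.n_layers:]   # top-n hidden layers
        total_loss = 0.0
        agg = torch.zeros_like(top[-1])
        for hidden, clf in zip(reversed(top), self.classifiers):
            agg = agg + hidden                     # aggregate downwards
            logits = clf(agg[:, 0])                # <s>-token representation
            # Eq. (2): sum of the three layers' BCE losses per classifier;
            # Eq. (3): sum over all H classifiers.
            total_loss = total_loss + sum(
                self.bce(l, t) for l, t in zip(logits, targets)
            )
        return total_loss
```

Here make_classifier could build the hierarchical head sketched in the previous subsection, e.g. lambda: HierarchicalClassifier(1024, n_entity, n_aspect, n_polarity) for a hidden size of 1024.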

IV. EXPERIMENTS

A. Datasets

We evaluated our model's performance on the VLSP 2018 ABSA dataset, which was the first public Vietnamese dataset for the ABSA task. This dataset was collected from user comments on Agoda⁵ and consists of document-level reviews. The length of each review varies considerably: some are short sentences, while others contain hundreds of words, with the longest containing around 1,000 words.

We also evaluated our model on the UIT ABSA dataset, which consists of sentence-level reviews with relatively short sentences, having only 1.65 aspects per review on average. The data was collected from mytour⁶. In the construction of both datasets, multiple annotators were employed and the raw data were manually annotated under strict guidelines.

5 https://www.agoda.com/vi-vn
6 https://mytour.vn

The datasets deal with the hotel and restaurant domains and are divided into training, development, and testing sets with similar label ratios. There are 34 aspects for the hotel domain, and each review can have a varying number of aspects. Details about the datasets can be seen in Table I and Table II. From the standard deviation for each dataset, it is apparent that the aspect distribution is very uneven, with the most frequent aspect appearing around 2,000 times and the rarest aspect appearing only 2 or 3 times.

Table I: Dataset Details for VLSP 2018 ABSA

Type    #Reviews    #Aspects    Avg. Aspects    σ    Avg. Length

Table II: Dataset Details for UIT ABSA

Type     #Reviews    #Aspects    Avg. Aspects    σ         Avg. Length
train    7180        11812       1.65            469.00    18.25

B. Evaluation Metrics

To evaluate the performance of ABSA models, we use the micro-average method. The evaluation is done in two phases: Phase A evaluates the model's capability in detecting the aspects of a review, and Phase B evaluates (aspect, polarity) pair detection. The Precision, Recall, and F1 scores are computed with the following formulas:

$$Precision = \frac{\sum_{c_i \in C} TP_{c_i}}{\sum_{c_i \in C} (TP_{c_i} + FP_{c_i})}$$

$$Recall = \frac{\sum_{c_i \in C} TP_{c_i}}{\sum_{c_i \in C} (TP_{c_i} + FN_{c_i})}$$

$$F1 = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall}$$
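These micro-averaged formulas translate directly into code; the sketch below assumes that per-class true-positive, false-positive, and false-negative counts have already been collected.

```python
def micro_prf(tp: dict, fp: dict, fn: dict):
    """Micro-averaged Precision/Recall/F1 over per-class counts, as in
    the formulas above; tp/fp/fn map each class c_i to its count."""
    tp_sum = sum(tp.values())
    fp_sum = sum(fp.values())
    fn_sum = sum(fn.values())
    precision = tp_sum / (tp_sum + fp_sum) if (tp_sum + fp_sum) else 0.0
    recall = tp_sum / (tp_sum + fn_sum) if (tp_sum + fn_sum) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1
```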

C. Experimental Setup

As mentioned above, we use VnCoreNLP's segmenter to segment each review before using PhoBert. We then use PhoBert's tokenizer to get the token ids and attention masks, and perform padding. We use Hugging Face's AdamW optimizer⁷ together with the constant scheduler⁸ for warmup; the base learning rates we choose are 2e-5 and 5e-6 for the document- and sentence-level datasets, respectively. We set the warmup ratio to 0.25 and the batch size to 10, then train each model for 100 epochs. The BERT model we use is PhoBert-large, with 24 Transformer blocks (25 hidden states, counting the embedding output) and a hidden size of 1024. We test the performance of two settings: 4-layer aggregation (HSUM-HC_4) and 8-layer aggregation (HSUM-HC_8).

8 https://huggingface.co/transformers/main_classes/optimizer_schedules.html
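A sketch of this training setup using the Hugging Face optimizer and scheduler named above; model and train_loader are assumed placeholders (e.g. the HSumHC sketch and a PyTorch DataLoader with batch_size=10), and interpreting the warmup ratio as a fraction of total training steps is our assumption.

```python
# Sketch of the training setup above; `model` and `train_loader` are
# assumed to exist. Hyperparameters follow the text; the warmup-ratio
# interpretation (fraction of total steps) is an assumption.
from transformers import AdamW, get_constant_schedule_with_warmup

EPOCHS = 100
WARMUP_RATIO = 0.25
LR = 2e-5  # document-level datasets; 5e-6 for sentence-level

total_steps = EPOCHS * len(train_loader)
optimizer = AdamW(model.parameters(), lr=LR)
scheduler = get_constant_schedule_with_warmup(
    optimizer, num_warmup_steps=int(WARMUP_RATIO * total_steps)
)

for epoch in range(EPOCHS):
    for batch in train_loader:
        loss = model(batch["input_ids"], batch["attention_mask"],
                     batch["targets"])
        loss.backward()
        optimizer.step()
        scheduler.step()      # constant LR after the warmup phase
        optimizer.zero_grad()
```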

D. Experimental Results and Discussion

We compared our model's performance with previous work on the same datasets. For the UIT ABSA dataset, all results besides ours are the baseline results from [1]; these results are taken from the multi-task approach (except for SVM).

1) Experimental Results: Results for the two datasets can be seen in Tables III and IV. Overall, we find that our implementation outperforms previous methods on the same tasks. For the VLSP 2018 dataset, our model achieved an F1 score of 85.20% for Phase A and 80.08% for Phase B, a significant improvement over previous deep learning models. Notably, compared to [15], our model performs considerably better when applying a hierarchical classifier with a language-specific pre-trained model and its hidden layers, improving by 3.14% in Phase A and 5.39% in Phase B. The F1 score of our model is 6.04% higher than that of [16], which used PhoBert-base with a linear layer for aspect detection. For the UIT ABSA dataset, our model scored 80.78% and 75.25% in Phase A and Phase B, respectively. Our model also improved by at least 1.68% in Phase A and 1.56% in Phase B over the baseline models in [1]. The results also show that using the top 8 layers for hidden layer aggregation gives better performance than using only 4; this is because we are using a large model with more hidden layers, so more layers can contain useful semantic information.

From the results on the UIT ABSA sentence-level dataset, we can see that our implementation can have lower precision but much higher recall than previous models, which leads to a higher F1 score than the deep learning models: it outperforms them overall. This is even more apparent on the document-level dataset, whose longer reviews require the model to capture long-range dependencies; each review also has a higher number of aspects on average, so this task can be considered more challenging than the sentence-level one. Nevertheless, our model scores significantly higher on the document level than it did on the sentence level. This means that our model, instead of being challenged by long sequences and forgetting information, can actually learn the extra information in these sequences and make use of it to achieve better results. Our model shows its true potential when put through a more demanding task with more information to learn.

Overall, the results show that our implementation is effective in dealing with ABSA, and that all three components (PhoBert, HSUM, and the hierarchical classifier) are essential for improving the model's performance.

2) Loss and performance curve: In our experiments, we trained our model for a large number of epochs with relatively little data. Our training loss curves can be seen in Fig. 2; at first glance, it appears that our model started to overfit very early, as the validation loss kept increasing. However, we observe that this is not the case. Even though the validation loss was increasing, performance still slowly increased, as can be seen in Fig. 3. This behavior was also observed by [14] and [21], indicating that the model still learns at a slow and steady

Table III: Results on the test set of the VLSP 2018 dataset, Hotel domain

                        Phase A (Aspect Detection)      Phase B (Aspect Polarity Detection)
Models                  Precision    Recall    F1       Precision    Recall    F1
BiLSTM + CNN            84.03        72.52     77.85    76.53        66.04     70.90
Our method
HSUM-HC_8               86.79        83.66     85.20    84.52        76.08     80.08
HSUM-HC_4               85.59        83.39     84.67    83.50        74.65     78.83

Table IV: Results on the test set of the UIT ABSA dataset, Hotel domain

                        Phase A (Aspect Detection)      Phase B (Aspect Polarity Detection)
Models                  Precision    Recall    F1       Precision    Recall    F1
Multiple SVM            76.68        74.70     75.68    69.06        67.28     68.16
LSTM + Attention        83.47        69.07     75.59    76.22        63.07     69.03
BiLSTM + Attention      82.02        72.08     76.73    74.68        65.63     69.86
CNN-LSTM + Attention    76.92        70.76     73.71    69.02        63.50     66.14
BiLSTM-CNN              77.11        78.22     77.66    70.23        71.23     70.72
PhoBert-base            83.46        75.18     79.10    77.75        70.03     73.69
Our method
HSUM-HC_8               80.26        81.31     80.78    76.87        73.71     75.25
HSUM-HC_4               79.75        80.96     80.34    76.89        72.97     74.88

Figure 2: The loss curves on the validation and test sets for VLSP 2018 (left) and UIT ABSA dataset (right)

Figure 3: The F1 curves on the validation and test sets for VLSP 2018 (left) and UIT ABSA dataset (right)

pace. At some point the performance plateaus and the learning process stops. This can be explained by the fact that BERT was pre-trained on an enormous amount of data and therefore does not easily overfit.

E. Conclusion

We implemented an effective method that utilizes the hidden layers of Bert with a hierarchical classifier to deal with the Vietnamese ABSA task. We experimented on two datasets at different review levels and significantly outperformed previous methods, achieving state-of-the-art results on both datasets. We find that, since PhoBert-large provides 25 hidden states, using 8 layers for aggregation gives better performance than the original usage of 4 layers. For future work, we plan to apply our model to different domains and languages, and to test it with online customer reviews to explore its potential applications.

ACKNOWLEDGMENT

We would like to thank the VLSP 2018 organizers and the UIT NLP Group for providing us with the ABSA datasets.

REFERENCES

[1] D. Van Thin, N. L.-T. Nguyen, T. M. Truong, L. S. Le, and D. T. Vo, "Two new large corpora for Vietnamese aspect-based sentiment analysis at sentence level," ACM Trans. Asian Low-Resour. Lang. Inf. Process., vol. 20, no. 4, May 2021. [Online]. Available: https://doi.org/10.1145/3446678

[2] M. Pontiki, D. Galanis, H. Papageorgiou, S. Manandhar, and I. Androutsopoulos, "SemEval-2015 task 12: Aspect based sentiment analysis," in Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015). Denver, Colorado: Association for Computational Linguistics, Jun. 2015, pp. 486–495.

[3] M. Pontiki, D. Galanis, H. Papageorgiou, I. Androutsopoulos, S. Manandhar, M. AL-Smadi, M. Al-Ayyoub, Y. Zhao, B. Qin, O. De Clercq, V. Hoste, M. Apidianaki, X. Tannier, N. Loukachevitch, E. Kotelnikov, N. Bel, S. M. Jiménez-Zafra, and G. Eryiğit, "SemEval-2016 task 5: Aspect based sentiment analysis," in Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016). San Diego, California: Association for Computational Linguistics, Jun. 2016, pp. 19–30.
[4] J. Wagner, P. Arora, S. Cortes, U. Barman, D. Bogdanova, J. Foster, and L. Tounsi, "DCU: Aspect-based polarity classification for SemEval task 4," in Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014). Dublin, Ireland: Association for Computational Linguistics, Aug. 2014, pp. 223–229.
[5] S. Kiritchenko, X. Zhu, C. Cherry, and S. Mohammad, "NRC-Canada-2014: Detecting aspects and sentiment in customer reviews," in Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014). Dublin, Ireland: Association for Computational Linguistics, Aug. 2014, pp. 437–442.
[6] T. H. Nguyen and K. Shirai, "PhraseRNN: Phrase recursive neural network for aspect-based sentiment analysis," in Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Lisbon, Portugal: Association for Computational Linguistics, Sep. 2015, pp. 2509–2514.
[7] Y. Wang, M. Huang, X. Zhu, and L. Zhao, "Attention-based LSTM for aspect-level sentiment classification," in Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Austin, Texas: Association for Computational Linguistics, Nov. 2016, pp. 606–615.
[8] D. Tang, B. Qin, X. Feng, and T. Liu, "Effective LSTMs for target-dependent sentiment classification," 2016.
[9] Y. Ma, H. Peng, and E. Cambria, "Targeted aspect-based sentiment analysis via embedding commonsense knowledge into an attentive LSTM," Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32, no. 1, Apr. 2018. [Online]. Available: https://ojs.aaai.org/index.php/AAAI/article/view/12048
[10] M. E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer, "Deep contextualized word representations," in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). New Orleans, Louisiana: Association for Computational Linguistics, Jun. 2018, pp. 2227–2237.
[11] A. Radford and K. Narasimhan, "Improving language understanding by generative pre-training," 2018.
[12] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of deep bidirectional transformers for language understanding," 2019.
[13] C. Sun, L. Huang, and X. Qiu, "Utilizing BERT for aspect-based sentiment analysis via constructing auxiliary sentence," in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Minneapolis, Minnesota: Association for Computational Linguistics, Jun. 2019, pp. 380–385.
[14] A. Karimi, L. Rossi, and A. Prati, "Improving BERT performance for aspect-based sentiment analysis," arXiv preprint arXiv:2010.11731, 2020.
[15] O. T. Tran and V. T. Bui, "A BERT-based hierarchical model for Vietnamese aspect based sentiment analysis," in 2020 12th International Conference on Knowledge and Systems Engineering (KSE), Can Tho, Viet Nam, 2020, pp. 269–274.
[16] D. V. Thin, L. S. Le, V. X. Hoang, and N. L.-T. Nguyen, "Investigating monolingual and multilingual BERT models for Vietnamese aspect category detection," 2021.
[17] A. Conneau, K. Khandelwal, N. Goyal, V. Chaudhary, G. Wenzek, F. Guzmán, E. Grave, M. Ott, L. Zettlemoyer, and V. Stoyanov, "Unsupervised cross-lingual representation learning at scale," 2020.
[18] D. Q. Nguyen and A. T. Nguyen, "PhoBERT: Pre-trained language models for Vietnamese," CoRR, vol. abs/2003.00744, 2020. [Online]. Available: https://arxiv.org/abs/2003.00744
[19] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov, "RoBERTa: A robustly optimized BERT pretraining approach," CoRR, vol. abs/1907.11692, 2019. [Online]. Available: http://arxiv.org/abs/1907.11692
[20] G. Jawahar, B. Sagot, and D. Seddah, "What does BERT learn about the structure of language?" in Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, Jul. 2019, pp. 3651–3657.
[21] X. Li, L. Bing, W. Zhang, and W. Lam, "Exploiting BERT for end-to-end aspect-based sentiment analysis," in Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019), Hong Kong, China, Nov. 2019, pp. 34–41.
