
Thesis: Transductive Support Vector Machines for Cross-lingual Sentiment Classification


DOCUMENT INFORMATION

Basic information

Title: Transductive Support Vector Machines for Cross-lingual Sentiment Classification
Author: Nguyen Thi Thuy Linh
Supervisor: Professor Ha Quang Thuy
University: University of Engineering and Technology, Vietnam National University
Major: Computer Science
Type: Thesis
Year of publication: 2009
City: Hanoi
Pages: 4
File size: 40.5 KB


Content


Transductive Support Vector Machines for Cross-lingual Sentiment Classification


Nguyen Thi Thuy Linh

Faculty of Information Technology

University of Engineering and Technology, Vietnam National University, Hanoi

Supervised by Professor Ha Quang Thuy

A thesis submitted in fulfillment of the requirements for the degree of

Master of Computer Science

December, 2009


Table of Contents

1 Introduction
1.1 Introduction
1.2 What might be involved?
1.3 Our approach
1.4 Related works
1.4.1 Sentiment classification
1.4.1.1 Sentiment classification tasks
1.4.1.2 Sentiment classification features
1.4.1.3 Sentiment classification techniques
1.4.1.4 Sentiment classification domains
1.4.2 Cross-domain text classification

2 Background
2.1 Sentiment Analysis
2.1.1 Applications
2.2 Support Vector Machines
2.3 Semi-supervised techniques
2.3.1 Generate maximum-likelihood models
2.3.2 Co-training and bootstrapping
2.3.3 Transductive SVM

3 The semi-supervised model for cross-lingual approach
3.1 The semi-supervised model
3.2 Review Translation
3.3 Features
3.3.1 Words Segmentation
3.3.2 Part of Speech Tagging
3.3.3 N-gram model



4.5.1 Effect of cross-lingual corpus
4.5.2 Effect of extraction features
4.5.2.1 Using stopword list
4.5.2.2 Segmentation and Part of speech tagging


Abstract

Sentiment classification has received much attention and has many useful applications in business and intelligence. This thesis investigates the sentiment classification problem using machine learning techniques. Because Vietnamese sentiment corpora are limited while many English sentiment corpora are available on the Web, we combine English corpora as training data with a number of unlabeled Vietnamese data in a semi-supervised model. Machine learning eliminates the language gap between the training set and the test set in our model. Moreover, we also examine types of features to obtain the best performance.

The results show that semi-supervised classifiers are quite good at leveraging a cross-lingual corpus, compared with the classifier without a cross-lingual corpus. In terms of features, we find that using only the unigram model turns out to give the best performance.

Date posted: 21/05/2025, 20:33
