Vietnam National University - Ho Chi Minh City
Ho Chi Minh City University of Technology
Faculty of Computer Science and Engineering
GRADUATE THESIS
GROUNDED LANGUAGE LEARNING: IMPROVE TEXT REPRESENTATION WITH VISUAL INFORMATION
Major: Computer Science
Council: Computer Science 11
Supervisor: Assoc. Prof. Quan Thanh Tho
Reviewer: Mr. Le Dinh Thuan
Student: Nguyen Tran Cong Duy (1710043)
Grounded Language Learning: Improve Text Representation with Visual Information
Thesis
Nguyen Tran Cong Duy
Supervisor: Assoc. Prof. Quan Thanh Tho
I hereby declare that, except for results referenced from other related works as specified in the thesis, the contents presented in this thesis are my own work, and that no part of it has been submitted for a degree at another school.
Ho Chi Minh City, July 11, 2021
First and foremost, I am deeply grateful to my supervisor, Assoc. Prof. Quan Thanh Tho, for his consistent help and guidance throughout my thesis, and for the opportunities to study and do research that he gave me. I also thank my parents and my best friends for their constant encouragement, support, and care.
People learn languages through listening, speaking, reading, writing, and multimodal interaction with the real world. From a young age, children are taught to listen and speak, to use gestures, and to learn through pictures. They learn language not only from books that contain words alone but also from a combination of images, stories, and descriptive sentences. Today's language models, by contrast, are mostly trained on purely linguistic data rather than on real-world signals. A few systems have combined language with other modalities in applications for robotics and for visual and linguistic tasks, with positive results, but doing so remains a major challenge in machine learning and deep learning today.