VIETNAM NATIONAL UNIVERSITY HO CHI MINH CITY
HO CHI MINH CITY UNIVERSITY OF TECHNOLOGY
FACULTY OF COMPUTER SCIENCE AND ENGINEERING

CAPSTONE PROJECT REPORT

A VISUAL ART GENERATOR FOR MUSIC ARTISTS

MAJOR: COMPUTER SCIENCE
SUPERVISOR: TRUONG TUAN ANH, Ph.D.
Supervisor Signature:

Date:
VIETNAM NATIONAL UNIVERSITY HCMC
HO CHI MINH CITY UNIVERSITY OF TECHNOLOGY
FACULTY: COMPUTER SCIENCE AND ENGINEERING
DEPARTMENT: SOFTWARE ENGINEERING

SOCIALIST REPUBLIC OF VIETNAM
Independence - Freedom - Happiness

GRADUATION THESIS/PROJECT ASSIGNMENT
Note: Students must attach this sheet to the first page of the report.

FULL NAME: Huỳnh Phước Thiện
STUDENT ID: 1952463
MAJOR: Computer Science
CLASS:

1. Thesis/project title:
An automatic visual art generator for music artists

2. Tasks (requirements on content and initial data):
This project requires creating a system that helps artists create their own visualizers for their performances. Some processes of the system are as follows:
- The waveform is analyzed to extract frequency data.
- The data is combined with an art installation framework such as p5.js (a JavaScript framework) or openFrameworks (C++).
- Based on these two steps, suitable systems for processing visual art are built.
- The artists can set up a theme for their songs and control it over time with the system they choose.

Phase 1:
✔ Study the knowledge and technologies related to the topic
✔ List the main features of the application (functional requirements)
✔ Design the architecture of the system

Phase 2:
✔ Implement the algorithms and functionality of the application
✔ Evaluate the algorithm and the application

3. Date of assignment: 30/05/2023
4. Date of completion: 15/09/2023
5. Supervisor: Dr. Trương Tuấn Anh. Supervised part:

The thesis content and requirements have been approved by the Department.
May 30, 2023
HEAD OF DEPARTMENT: Dr. TRƯƠNG TUẤN ANH
MAIN SUPERVISOR: Dr. TRƯƠNG TUẤN ANH
FOR FACULTY/DEPARTMENT USE:
Reviewer (preliminary grading): _ Unit: _ Defense date: _ Final grade: _ Thesis archive location: _
The project "A Visual Art Generator For Artists" is the result of a year of research by the author, under the enthusiastic guidance of Truong Tuan Anh, Ph.D., at the Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology.

First of all, the author would like to express sincere and deep gratitude to Truong Tuan Anh, Ph.D., for always accompanying and supporting the author with valuable experience and knowledge. The advice and instruction of the teachers have led to today's sweet fruit.

The author would like to thank his special advisor and friend, Nguyen Quoc Viet, for helping the author bring his insane ideas into reality. This is a valuable foundation as well as an inspiration for the author not to give up on developing the project.

The author would like to thank his family, relatives, and friends for helping him during the most difficult times. After four years of university, the author has many good memories and valuable lessons that help him be more resilient on the path of personal development.

The author would like to thank himself for being brave enough to face this insanely difficult challenge, for stepping out of his comfort zone, and for finally creating something meaningful.

Finally, the author is truly grateful to have his beloved special one, best friend, and soulmate, Nguyen Anh Thao Ngan, in his life and on his creative journey since day one. She has never doubted the author's creative journey and always encourages him to move forward.

In the process of research and reporting, mistakes are inevitable, and the author would love to receive your comments and suggestions so that the project can become more complete, be expanded, and be put into practice.

Once again, sincerely thank you!

The author, Huynh Phuoc Thien
While unconventional, and perhaps met with skepticism, the approach the author proposes diverges from common practice in the field. Typically, digital artists tend to work with tools like TouchDesigner, Blender, Processing, or Python. However, the author wants to show an alternative perspective.

Over the past decade, the web browser has undergone a remarkable transformation. The advent of code splitting, ES6+ conventions, and the HTML canvas has contributed significantly to this progression. Moreover, the availability of powerful libraries like Three.js, D3.js, and p5.js has revolutionized the creation of performance-friendly animations directly in the browser itself.

It is essential to recognize that the browser environment is now capable of supporting intricate visual art projects and complex animations that were previously reserved for traditional desktop applications. Therefore, the system from this project embraces the shift towards web-based technologies and enables artists to leverage the full potential of the modern browser to create captivating audio-visual experiences.

In this project, the author has researched and proposed a web-based generator system which can help music artists play along with 3D objects and textures and create unique visualizers for their songs and performances.
Contents

1 Introduction
1.1 Overview
1.1.1 The urgency of the subject
1.2 Research orientations
1.2.1 Research goals
1.2.2 Research missions
1.2.3 Application scopes
1.2.4 Features
1.2.5 Non-functional requirements
2 Related works
2.1 TouchDesigner
2.1.1 Main Features of TouchDesigner
2.1.2 What makes TouchDesigner different?
2.1.3 Some constraints
2.2 Synesthesia
2.2.1 The features of Synesthesia
2.2.2 What makes Synesthesia different?
2.2.3 Some Constraints
3 Technologies
3.1 Next JS/React JS
3.1.1 Next JS
3.1.2 React
3.2 Three JS/React Three Fiber
3.2.1 Three JS
3.2.2 React Three Fiber
3.3 Redux Toolkit
3.4 Web MIDI API
4 System analysis and design
4.1 Current issues of VAG and a creative method to manipulate the render on a web-based application
4.2 Audio Analysis and Control
4.2.1 Definition of Audio
4.2.2 Audio File Formats
4.2.3 Audio Data
4.2.4 How to analyze and extract the Audio data?
4.2.5 Audio Control with MIDI Devices
4.3 3D Rendering
4.3.1 Coordinate System
4.3.2 Geometries
4.3.3 Camera
4.3.4 Lighting
4.4 User stories
4.5 Use cases
4.6 Architecture diagrams
4.7 How the data flows
4.7.1 Entities
4.7.2 Relationships
5 Implementation
5.1 Programmings
5.2 Redux Toolkit
5.2.1 Redux Store
5.2.2 3D Components
5.2.3 Get components together for rendering
5.2.4 Loading Texture
5.2.5 Loading 3D Model
5.2.6 Load Audio and Analyze
5.2.7 Catch audio signal using MIDI
5.2.8 Playing the audio
5.2.9 MIDI Script - Controlling by MIDI
5.2.10 MIDI Messages Control to modify the texture
5.3 Edit Site
5.3.1 User adds 3D objects
5.3.2 Manual Controlling The Parameters
5.3.3 Orbit Control, Axis helper and Stats
6 System testing
6.1 MIDI Connection Testing
6.2 Testing Objects Rendering
6.3 User Acceptance Testing
6.3.1 Testing functional requirements
6.3.2 Testing non-functional requirements
7 Conclusions
7.1 Results achieved
7.2 Improving points
7.3 Future works
List of Tables

4.1 The description for the "Audio File Upload and Analyze" use-case diagram
4.2 The description for the "Choose 3D Objects" use-case diagram
4.3 The description for the "Modify Texture" use-case diagram
4.4 The description for the "Set Lighting" use case
4.5 The description for the "Set Camera" use case
4.6 The description for the "Render Scene" use case
List of Figures

2.1 An example of a TouchDesigner project
2.2 Interface of TouchDesigner
2.3 OP Create Dialog - many methods for the creative process
2.4 Synesthesia's interface
3.1 Structure of Redux Toolkit
4.1 We can extract data from audio wave files
4.2 And put it in control for our scene renderer
4.3 The WAV format
4.4 The MP3 format
4.5 AnalyserNode extracts the frequency data
4.6 GainNode modifies the amplitude data
4.7 Novation Launchkey MK2
4.8 The Coordinate System
4.9 The Basic Geometries
4.10 Two Types of Camera
4.11 Two Types of Lighting
4.12 The use case diagram for the Visual Art Generator
4.13 The general architecture diagram for the VAG system
4.14 The deployment diagram for the VAG system
4.15 The VAG application architecture diagram
4.16 Data flow of VAG's 3D Components
5.1 Structure of Redux Store
5.2 Create a 3D Geometry
5.3 The default rendering
5.4 The code of Texture Loader
5.5 The Brick Wall texture
5.6 The code to load 3D models into VAG
5.7 The skeleton model loaded into VAG
5.8 The code to load and analyze audio
5.9 The code to update the state of the audio node
5.10 The example code of playing the audio file
5.11 The code to request MIDI access
5.12 The code to handle the success and failure of the MIDI connection
5.13 The code to retrieve the message data of the MIDI data
5.14 The code to modify texture by MIDI note amplitude (velocity)
5.15 The code to modify texture by audio signal triggered by the MIDI controller
5.16 The Operator List
5.17 The Code of Operator List
5.18 The function to render by clicking Add
5.19 The Geometry After Rendering
5.20 The GUI Manual Control
5.21 Orbit Control, Axis Helper and Stats
5.22 The Coordinate System with Orbit Control, Axis Helper and Stats
6.1 Testing MIDI Connection
6.2 Testing Objects Rendering
Introduction

In this chapter, the author will present an overview of the urgency of the topic, the objectives, and the scope of the project's research.

Contents
1.1 Overview
1.1.1 The urgency of the subject
1.2 Research orientations
1.2.1 Research goals
1.2.2 Research missions
1.2.3 Application scopes
1.2.4 Features
1.2.5 Non-functional requirements
1.1 Overview
1.1.1 The urgency of the subject
Audio and visual arts have long shared a mutual and inspiring relationship, with each form enhancing and elevating the other's impact. This special identity of audio and visual elements is rooted in their innate ability to convey emotions, tell stories, and engage audiences on a deeper, multi-sensory level. Together, they create a more immersive and captivating experience, transcending the boundaries of traditional artistic expression.

In the realm of live performances nowadays, audio and visuals are equally crucial. Music concerts, for instance, often incorporate synchronized light shows, projections, or LED displays that complement the music's mood and tempo. These visual elements intensify the overall experience, creating a mesmerizing ambiance that immerses the audience in the artist's world. Many applications have been used in practice; examples include VJing software used in the world-class concerts of top-tier artists such as Beyonce, Kanye West, and Taylor Swift. However, this approach requires a lot of resources, and everything has to be under strict management at many levels.

The author wants to find a solution to, first, the complexity of the available visual tools, and second, the limited possibilities for music artists to work on and experiment with their own creations. Thus, a suitable method is required to implement the visualizer in a web-based generator while, at the same time, letting the artists manage the audio and trigger it to control the textures of the visualizer.

The development of the Visual Art Generator serves as an alternative to purchasing other offerings on the market. It is also simpler to operate and to access.
In live music performance
In typical live performances, the artists mainly work on the audio side and let a VJ team do the visual tasks. The synchronization between the audio and visual elements is critical to achieving a cohesive and immersive experience. Thus, VJs often rehearse alongside musicians, gaining an intimate understanding of the music's structure, beats, and dynamics. However, this approach consumes a lot of time and resources. The music artists also put their trust in the visual teams, and sometimes it does not end well: major issues such as copyright arise if these visual teams are not legitimate businesses. Therefore, music artists and performers deserve full control of what is happening on their stage.
In audio-visual artwork
Recently, concepts such as digital art and generative art have become popular among art communities. Many artists have built and developed a foundation for digital and generative web-based art. It can be controversial whether this is art or not; however, the power of technology, randomness, and proper arrangement can bring unexpected effects compared to traditional art.

With the help of the Visual Art Generator, artists can experiment and merge their audio and visuals into one and create a whole new experience for the audience.
1.2 Research orientations
1.2.1 Research goals
The goal of the research is to find an alternative to traditional VJ and visual software, and to develop it into a web-based application that supports artists, with which artists can create the components of their visualizers and trigger some of their textures. This research will first be applied to the author himself as a music producer and then possibly expand to other music artist communities.
1.2.2 Research missions
The subject focuses on the following tasks:
• Research artists' interests when searching for visualizers online.
• Research possible solutions for controlling and rendering 3D objects with sounds and audio, and develop the most suitable solution to implement the project.
• Based on the results, implement a web-based application as a generator that helps conduct and render visualizers.
1.2.4 Features

• Import favorite 3D models
Overview: Artists are able to import their 3D objects from Blender or any other source of model rendering. This enriches their projects with complex and intricate models, enhancing the visual experience and realism of their creations. By supporting various file formats and utilizing advanced rendering techniques, the web-based application integrates pre-existing 3D assets into their designs, saving time and expanding the creative possibilities.

• 3D element rendering
Overview: The Visual Art Generator provides a main canvas on which artists render their 3D elements after they finish arranging their designs.

• Customize and manipulate textures, lighting, and camera angles
Overview: The Visual Art Generator supports artists in setting up and modifying the textures of the 3D elements, the lighting, and the camera types and angles. These are the most important factors of all 3D renderers.

• Audio analysis
Overview: Valuable information about the audio (frequency, amplitude, and other characteristics) is extracted and used to trigger the textures and other factors. With this approach, artists have a free and unpredictable method for designing their own visualizers.

• Connect and control MIDI devices
Overview: The Visual Art Generator provides a connection to MIDI devices, and artists can use their MIDI controllers to control their designs.
1.2.5 Non-functional requirements
• Performance: Capable of handling visualizations efficiently, ensuring smooth user interactions and rendering.

• Compatibility: The visual generator should be compatible with a wide range of web browsers to ensure accessibility for users across different platforms.

• Stable rendering state: The rendering settings should remain stable over time, so that a configured scene keeps rendering consistently throughout a session.

• Usability: The user interface should be intuitive, user-friendly, and aesthetically pleasing, enabling users of varying skill levels to create visual art effortlessly.
Related works

In this chapter, the author will analyze some related works and point out the advantages and disadvantages of the technologies they use.

Contents
2.1 TouchDesigner
2.1.1 Main Features of TouchDesigner
2.1.2 What makes TouchDesigner different?
2.1.3 Some constraints
2.2 Synesthesia
2.2.1 The features of Synesthesia
2.2.2 What makes Synesthesia different?
2.2.3 Some Constraints
2.1 TouchDesigner

TouchDesigner is a visual, node-based programming language developed by the Toronto-based company Derivative. It is now one of the best choices for creators working in fields such as media systems, architectural projections, live music visuals, and software design. However, we have to make it clear that TouchDesigner is not an application that is ready from the start to perform actions that may seem simple in other applications. TouchDesigner is an environment with extreme depth and many potential pitfalls.
Figure 2.1: An example of a TouchDesigner project
With its node-based interface, users can control the flow more easily through a visualization of what is going on in their project. From that, they can manage effects, camera angles, and much more.
Figure 2.2: Interface of TouchDesigner
Figure 2.3: OP Create Dialog - many methods for the creative process
2.1.1 Main Features of TouchDesigner
1 Rendering and compositing
2 Workflow and scalable architecture
3 Video and audio in / out
4 Multi-display support
5 Video mapping
6 Animation and control channels
7 Custom control panels and application building
8 3D engine and tools
9 Device and software interoperability
10 Scripting and programming
2.1.2 What makes TouchDesigner different?

• With the graphical interface, users can see a visualization rather than lines of code. This can make "programming" a more interesting experience for users.

• There are a lot of operators, organized into families, and they support each other to maximize the quality of the user's workflow.

• Moreover, TouchDesigner also supports Python for custom coding. This is a really helpful feature for those who have basic knowledge of Python.
2.1.3 Some constraints
• Steep Learning Curve: TouchDesigner has a steep learning curve, especially for those new to visual programming or computer graphics. Its node-based interface and complex workflows can be overwhelming for beginners.

• Hardware Requirements: To fully utilize the capabilities of TouchDesigner, a powerful computer with a dedicated graphics card is often required. This can be a limitation for users with less capable hardware.

• Commercial License Cost: TouchDesigner requires a commercial license for commercial use or for projects that generate revenue above a certain threshold. This cost can be prohibitive for individual artists or small studios with limited budgets.

• Debugging Complex Networks: As projects in TouchDesigner grow in complexity, managing and debugging the node network can become challenging, especially when dealing with multiple interconnected components.

• Platform Dependency: TouchDesigner is primarily designed for Windows, and while there are macOS and Linux versions, some features or third-party extensions may be limited on non-Windows platforms.
2.2 Synesthesia
Synesthesia is an audio-reactive visual instrument used by VJs, musicians, and creative coders worldwide. With its simple interface, Synesthesia allows anyone to harness the power of shaders and create mind-blowing visual experiences in real time just by dragging the mouse.
2.2.1 The features of Synesthesia
• Get up and running in seconds with 50+ built-in scenes
• Advanced audio algorithms automatically translate music into visual action
• Drive the action with powerful, MIDI-mappable controls
• Import your own videos and logos and transform them in real-time
• Integrate with other VJ software using Syphon/Spout and NDI
• Import shaders from Shadertoy and ISF to add fresh content
• Create your own scenes with a live coding environment
• Generate real-time video output to send directly to a projector
• New content available through the in-app marketplace
Figure 2.4: Synesthesia’s interface
2.2.2 What makes Synesthesia different?
Synesthesia is really easy to use, and it provides a wide range of themes for songs, artists, and festivals. Thus, on big stages Synesthesia can be considered the king of experimental experience, giving the audience a chance to dive into the song through visualization. Users can customize their theme with lots of elements and create their own visualizer for the performance.
2.2.3 Some Constraints
• Artistic Interpretation: The artistic choices made by the software developers in designing the visual representations can influence the synesthetic experience. However, these choices might not align with the preferences or expectations of all users.

• Personal Variations: Synesthesia is a highly subjective and personal experience, and different individuals may have unique ways of perceiving audio-visual connections. Creating a one-size-fits-all solution may not fully cater to everyone's synesthetic experiences.

• Copyright and Intellectual Property: Live visual software may be used to enhance performances or events, but using copyrighted material without proper authorization could lead to legal issues.

• Hardware Compatibility: Compatibility with different audio setups (microphones, speakers, audio interfaces) and visual output devices (monitors, projectors) can be a concern.

• Performance Context: The effectiveness of the synesthetic experience can be influenced by the performance context (e.g., size of the audience, lighting conditions, venue acoustics).
Technologies

In this chapter, the author presents the technologies used to build the Visual Art Generator.

Contents
3.1 Next JS/React JS
3.1.1 Next JS
3.1.2 React
3.2 Three JS/React Three Fiber
3.2.1 Three JS
3.2.2 React Three Fiber
3.3 Redux Toolkit
3.4 Web MIDI API
3.1 Next JS/React JS

3.1.1 Next JS

Next.js is a web application development framework. You can utilize React components to create user interfaces, and Next.js then adds structure, functionality, and optimizations to your application. Next.js also abstracts and configures tooling for you, such as bundling, building, and more. This lets you concentrate on developing your application rather than setting up tooling. Next.js can help you build interactive, dynamic, and fast web applications, whether you are an individual developer or part of a larger team.
Next.js includes some main features:
• Routing: A file-system based router built on top of Server Components that supports layouts, nested routing, loading states, error handling, and more.

• Rendering: Client-side and server-side rendering with Client and Server Components, further optimized with static and dynamic rendering on the server, and streaming on Edge and Node.js runtimes.

• Data Fetching: Simplified data fetching with async/await support in React Components and a fetch() API that aligns with React and the Web Platform.

• Styling: Support for your preferred styling methods, including CSS Modules, Tailwind CSS, and CSS-in-JS.

• Optimizations: Image, font, and script optimizations to improve your application's Core Web Vitals and user experience.

• TypeScript: Improved support for TypeScript, with better type checking and more efficient compilation, as well as a custom TypeScript plugin and type checker.

• API Reference: Updates to the API design throughout Next.js. Please refer to the API Reference section for the new APIs.
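As a brief sketch of the file-system routing convention (the file path and component below are illustrative, not taken from the project's codebase):

```javascript
// pages/about.js — with Next.js's file-system router, a file at this
// path is automatically served at the /about route.
export default function AboutPage() {
  return (
    <main>
      <h1>About the Visual Art Generator</h1>
      <p>A web-based visualizer for music artists.</p>
    </main>
  );
}
```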
3.1.2 React
React, developed by Facebook, is a declarative, fast, and adaptable JavaScript library for creating user interfaces. It allows you to build complicated user interfaces out of small, isolated pieces of code known as "components."
React includes some main features:
• Component-Based Architecture: React employs a component-based architecture in which the user interface is separated into reusable and independent components. Each component can handle its own state and behavior, making complex UIs easier to maintain and scale.

• Virtual DOM: React implements a Virtual DOM, a virtual version of the actual DOM. When the data changes, React compares the Virtual DOM to the real DOM and updates only the appropriate parts, reducing the number of direct manipulations of the actual DOM.

• Declarative Syntax: React's declarative syntax allows developers to express how the UI should look based on the current state of the application. This method simplifies the code, making it more predictable and easier to reason about.

• JSX (JavaScript XML): React makes use of JSX, a syntactic extension that allows developers to write UI components in a combination of JavaScript and HTML-like syntax. JSX makes it easy to perceive and comprehend the UI structure.

• Unidirectional Data Flow: React has a unidirectional data flow, which means that data moves in just one direction, from parent to child components. This ensures that the application state is predictable and manageable.
Purpose of Next JS and React
The author used properties of Next JS and React JS to build the foundation of the web-based Visual Art Generator.

3.2 Three JS/React Three Fiber
3.2.1 Three JS
Three.js is a free and open-source JavaScript library that allows developers to build enthralling 3D computer graphics and immersive visualizations directly in web browsers. Created by Mr.doob (Ricardo Cabello) and a vibrant community of contributors, Three.js reduces the intricacies of WebGL (Web Graphics Library) and allows for the seamless integration of 3D content into web applications.

Designers and developers may use Three.js to create interactive 3D environments, animated models, spectacular visual effects, and virtual reality experiences that work flawlessly in the browser. The library abstracts most of the low-level WebGL code by providing a rich collection of tools and utilities, making it accessible to both seasoned 3D artists and those new to 3D graphics.

Three.js has the following major features:
• Geometries: Three.js includes a number of built-in geometries, including cubes, spheres, planes, tori, and more. These geometries serve as the foundation for the scene's 3D objects and shapes.

• Materials: The library includes a variety of material types, such as basic, Lambert, Phong, and physically based rendering (PBR) materials. Materials specify how 3D objects react to light and interact with their surroundings, allowing for realistic rendering.

• Cameras: Three.js provides a variety of camera types, including PerspectiveCamera and OrthographicCamera, to manipulate the 3D scene's view and perspective. This enables developers to generate a variety of camera angles and viewpoints.

• Lights: Diverse light sources, such as AmbientLight, DirectionalLight, PointLight, and SpotLight, allow for the modeling of diverse lighting situations inside the 3D scene, improving visual realism.

• Animation: Three.js provides animation support in the form of keyframe-based animation and skeletal animation. Animating 3D objects, cameras, and lights allows developers to build dynamic and interactive scenes.

• Textures and Mapping: The library allows you to apply textures to 3D objects, allowing you to display complex and realistic surfaces. UV mapping allows for exact texture placement.

• Post-processing: Three.js includes post-processing effects such as bloom, depth of field, and color tweaks that can be applied to the final rendered output to improve the visual quality and ambience of the scene.

• Raycasting: This technique permits interaction with 3D objects in the scene. Mouse clicks or touches on 3D objects can be detected and responded to by developers, enabling user interaction and interactivity.

• Shadows: Three.js allows you to cast and generate shadows from light sources, which adds depth and authenticity to your 3D scene.

• WebXR: The WebXR API allows Three.js to create virtual reality (VR) and augmented reality (AR) experiences that let users explore 3D content in immersive ways.

• Physics: Using physics libraries such as Ammo.js, developers may add physics simulations to 3D objects, allowing for realistic interactions and behavior inside the scene.

• Exporters and Importers: Three.js can import 3D models in a variety of file formats, including OBJ, STL, GLTF, FBX, and others. It also provides exporters with the opportunity to save 3D scenes for use in other applications.
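The building blocks above (geometry, material, camera, light, animation) come together in the canonical Three.js starter scene. This is a generic sketch assuming the `three` npm package, not code from the VAG project:

```javascript
import * as THREE from 'three';

// Scene, camera, and renderer are the core of every Three.js app
const scene = new THREE.Scene();
const camera = new THREE.PerspectiveCamera(
  75, window.innerWidth / window.innerHeight, 0.1, 1000
);
camera.position.z = 5;

const renderer = new THREE.WebGLRenderer({ antialias: true });
renderer.setSize(window.innerWidth, window.innerHeight);
document.body.appendChild(renderer.domElement);

// A cube with a light-reactive material, lit by a point light
const cube = new THREE.Mesh(
  new THREE.BoxGeometry(1, 1, 1),
  new THREE.MeshStandardMaterial({ color: 0x44aa88 })
);
scene.add(cube);

const light = new THREE.PointLight(0xffffff, 1);
light.position.set(0, 0, 3);
scene.add(light);

// Render loop: rotate the cube a little on every frame
function animate() {
  requestAnimationFrame(animate);
  cube.rotation.x += 0.01;
  cube.rotation.y += 0.01;
  renderer.render(scene, camera);
}
animate();
```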
3.2.2 React Three Fiber

React Three Fiber is a strong and unique toolkit that combines React.js with Three.js features to create gorgeous and interactive 3D web applications. Built on top of React and Three.js, React Three Fiber provides a declarative approach to constructing 3D scenes, meshes, lighting, and animations using common React components and hooks.

Developers may use React Three Fiber to construct complex 3D scenes and control the state of 3D objects by leveraging React's simplicity and reactivity. It abstracts Three.js's complexity, allowing developers to focus on generating compelling 3D content rather than dealing directly with WebGL code.

The library uses the React rendering system to efficiently refresh the 3D scene in response to changes in the application state. This means that when the React components' state changes, React Three Fiber automatically updates the 3D scene, optimizing performance and ensuring smooth visual interactions. It is simple to specify and structure 3D elements, meshes, cameras, and lights.

Purpose of Three JS and React Three Fiber

The author combined these two libraries to easily render Three JS components in the web-based application.
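The declarative style can be sketched as follows; this assumes the `@react-three/fiber` package, and the component names are illustrative:

```javascript
import React, { useRef } from 'react';
import { Canvas, useFrame } from '@react-three/fiber';

// The scene graph is expressed declaratively as React components
function SpinningBox() {
  const mesh = useRef();
  // useFrame runs on every rendered frame, replacing a manual render loop
  useFrame(() => {
    mesh.current.rotation.x += 0.01;
    mesh.current.rotation.y += 0.01;
  });
  return (
    <mesh ref={mesh}>
      <boxGeometry args={[1, 1, 1]} />
      <meshStandardMaterial color="#44aa88" />
    </mesh>
  );
}

export default function App() {
  return (
    <Canvas camera={{ position: [0, 0, 5] }}>
      <pointLight position={[0, 0, 3]} />
      <SpinningBox />
    </Canvas>
  );
}
```

React Three Fiber creates the underlying Three.js objects for these elements and updates them when component state changes.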
3.3 Redux Toolkit
Redux Toolkit is a tool for simplifying and streamlining state management in Redux-based applications. It is built on Redux and seeks to address common difficulties and codify best practices for state management, making it easier for developers to design scalable and maintainable systems.

Figure 3.1: Structure of Redux Toolkit
Principles of Redux Toolkit

The first principle is that everything that changes in the application, including data and UI state, is saved in an object known as the "state" or "state tree." The application heavily relies on multiple data sources throughout its operation: data from the initial server response, user activities (data input, menu clicks, button presses, and so on), data updates from the server, and data computed within the application (e.g., computing account balances based on exchange-rate variations). These data sources arrive from many locations and at various times, making it difficult to control the application effectively. They can affect specific components or numerous components within the program, or cause a chain reaction of events.

The second principle asserts that the state is read-only. The only way to change the application's state is to dispatch an "action" (an object that describes what happened). The application's state cannot be altered directly; it stays as an object, and changes can only be made via dispatched actions. With Redux, state changes occur only when an action occurs.

The third principle is to use pure functions that take the previous state and an action as parameters and return the new state. These functions are referred to as "reducers." These pure functions are the only way the state of the program changes: they take the current state and an action as input and return the next state.
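The three principles can be illustrated with a minimal plain-JavaScript sketch (a hypothetical counter reducer, without the Toolkit's helpers):

```javascript
// A reducer: a pure function (previousState, action) -> nextState.
// It never mutates the state; it returns a new object instead.
function counterReducer(state = { value: 0 }, action) {
  switch (action.type) {
    case 'counter/incremented':
      return { ...state, value: state.value + 1 };
    case 'counter/decremented':
      return { ...state, value: state.value - 1 };
    default:
      return state; // unknown actions leave the state unchanged
  }
}

// The state is read-only: it only changes through dispatched actions.
let state = counterReducer(undefined, { type: '@@INIT' });
state = counterReducer(state, { type: 'counter/incremented' });
state = counterReducer(state, { type: 'counter/incremented' });
state = counterReducer(state, { type: 'counter/decremented' });
console.log(state.value); // 1
```

Redux Toolkit's `createSlice` generates such reducers and their action creators automatically, while still following these principles underneath.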
Purpose of Redux Toolkit
The author used Redux Toolkit to store the renders and the values of component state, and to manage them in a proper data flow.
3.4 Web MIDI API
The Web MIDI API is a JavaScript interface that allows web applications to communicate with MIDI (Musical Instrument Digital Interface) devices connected to a user's computer or device. Web developers can use this API to access, send, and receive MIDI messages, enabling the seamless integration of MIDI devices into web-based music and multimedia applications.

The Web MIDI API's key features are:
• Enables web applications to access and enumerate any MIDI devices attached to
a user's system, such as MIDI keyboards, controllers, synthesizers, and drum machines.
• Allows applications to send and receive MIDI signals to and from connected MIDI devices. Applications can use this to operate external MIDI hardware or to respond to incoming MIDI data.
• Web MIDI API offers real-time connectivity with MIDI devices, making it ideal for applications requiring low-latency, quick interactions.
• Web MIDI API exposes information about MIDI input and output ports, enabling developers to connect MIDI devices to web applications.
• Web MIDI API supports event listeners, which developers can use to handle incoming MIDI signals such as note-on/off events, controller changes, and other MIDI data.
• The Web MIDI API interacts easily with the Web Audio API, allowing synchronization and interaction between MIDI devices and audio elements in the web application.
Purpose of Web MIDI API
The author wanted to establish a connection between the Visual Art Generator and MIDI devices, so that users can control and manipulate elements in the render.
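The note-on/off and controller-change events mentioned above are encoded as short byte sequences. The following sketch decodes raw MIDI bytes into event objects; in the browser the bytes would arrive via `input.onmidimessage`, but here they are hard-coded so the decoding logic can run anywhere:

```javascript
// Decode a raw MIDI message (an array of bytes) into an event object.
// The returned shapes are illustrative, not a fixed Web MIDI API format.
function decode(bytes) {
  const [status, data1, data2] = bytes;
  const type = status & 0xf0;    // high nibble: message kind
  const channel = status & 0x0f; // low nibble: MIDI channel (0-15)
  switch (type) {
    case 0x90: // note-on (velocity 0 is treated as note-off by convention)
      return data2 === 0
        ? { kind: "note-off", channel, note: data1, velocity: 0 }
        : { kind: "note-on", channel, note: data1, velocity: data2 };
    case 0x80: // note-off
      return { kind: "note-off", channel, note: data1, velocity: data2 };
    case 0xb0: // control change, e.g. a knob or fader on a controller
      return { kind: "cc", channel, controller: data1, value: data2 };
    default:
      return { kind: "other", channel };
  }
}

console.log(decode([0x90, 60, 100])); // note-on, middle C, velocity 100
console.log(decode([0xb0, 21, 64]));  // control change on controller 21
```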
System analysis and design
In this chapter, the author gives a deeper insight into some of the important work done while analyzing and designing the components that make up the Visual Art Generator.
4.3.1 Coordinate System
4.3.2 Geometries
4.3.3 Camera
4.3.4 Lighting
4.4 User stories
4.5 Use cases
4.6 Architecture diagrams
4.7 How the data flows
4.7.1 Entities
4.7.2 Relationships
4.1 Current issues of VAG and a creative method to manipulate the render on a web-based application
VAG is an experiment by the author in controlling art with his audio. The idea is appealing; however, some major issues need to be handled. Because it is an audio-visual collaboration, VAG faces two big difficulties: how to render 3D objects, and how to control them with sound.
Therefore, this thesis primarily focuses on experimenting with rendering 3D objects and manipulating them on the web through a combination of audio analysis and 3D rendering methods.
Figure 4.1: We can extract data from audio wave files
Figure 4.2: And use it to control our scene renderer
4.2 Audio Analysis and Control
4.2.1 Definition of Audio
Audio refers to the electronic representation or reproduction of sound waves, often in the form of electrical impulses, digital data, or analog recordings. It covers all audible frequencies, allowing sound to be captured, processed, and played back for a variety of applications such as music, communication, entertainment, and multimedia.
When we receive sound from any source, our brain analyses it and gathers information. If the sound data is properly formed, our brain can recognize the pattern matching each word and encode the intelligible data carried by the waveform. The wave itself yields numerical data in the form of frequencies.
Within the scope of this thesis, VAG plays a role similar to that of the brain: it receives the data
of audio files and processes it in order to manipulate the textures.
4.2.2 Audio File Formats
There are many formats for audio files. However, the author uses the two most popular formats in the world as the standard input. These two formats are:
• WAV: developed by Microsoft and IBM. It is a lossless, raw file format that applies
no compression to the original sound recording.
• MP3: created by the Fraunhofer Society in Germany and widely used across
the world. It is the most popular audio file format, since it makes music easy to store on portable devices and transfer via the Internet. Although MP3 compresses the audio, it nevertheless provides decent sound quality.
Figure 4.3: The WAV format
Figure 4.4: The MP3 format
4.2.3 Audio Data
Audio data is a digital representation of analog sound that retains the main qualities of the original. A sound, as we learned in physics class, is a wave of vibrations that travels through a medium such as air or water and eventually reaches our ears. When analyzing audio data, three main aspects must be considered: time period, amplitude, and frequency.
Time period: how long the sound lasts, i.e., how many seconds it takes to complete one cycle of vibration.
Amplitude: the measure of sound strength in decibels (dB), which we perceive as loudness.
Frequency: measured in Hertz (Hz), it indicates how many sound vibrations happen per second. People perceive frequency as low or high pitch.
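The relationships between these quantities can be made concrete with a short calculation. The reference amplitude chosen for the decibel conversion below is arbitrary, for illustration only:

```javascript
// Frequency (Hz) and time period (s) are reciprocals: T = 1 / f.
const frequency = 440;        // the A4 concert pitch
const period = 1 / frequency; // ≈ 0.00227 s per vibration cycle

// Amplitude ratios are expressed in decibels: dB = 20 * log10(a / aRef).
// The reference amplitude aRef is an arbitrary choice here.
function toDecibels(amplitude, reference) {
  return 20 * Math.log10(amplitude / reference);
}

console.log(period.toFixed(5)); // "0.00227"
console.log(toDecibels(1, 1));  // 0  (same strength as the reference)
console.log(toDecibels(10, 1)); // 20 (10x the amplitude is +20 dB)
```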
4.2.4 How to analyze and extract the audio data?
Two main features are used to manipulate the textures of 3D objects within the scope of this thesis: the amplitude and the frequency of the audio file.
The Three.js library provides two classes for this, AudioAnalyser and Audio. They
use the Web Audio API, which provides a wide range of audio handling and processing capabilities
for web-based applications.
Get the Frequency Data
The Web Audio API provides an interface called AnalyserNode. This interface extracts
the frequency data and feeds it into the flow that manipulates the textures.
Figure 4.5: AnalyserNode extracts the frequency data
The Amplitude Data
The Web Audio API provides an interface called GainNode, which represents
a change in volume. It is an audio-processing module that applies a given gain to the input data before propagating it to the output. A GainNode always has
exactly one input and one output, both with the same number of channels.
Figure 4.6: GainNode modifies the amplitude data
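Conceptually, a GainNode multiplies every input sample by a gain factor before passing it on. A real GainNode does this inside the browser's audio graph; this pure sketch just makes the arithmetic visible:

```javascript
// Apply a gain factor to a buffer of audio samples (values in [-1, 1]).
// This mirrors what a GainNode does to each channel of its input.
function applyGain(samples, gain) {
  return samples.map((s) => s * gain);
}

const input = [0.5, -0.25, 1.0];
console.log(applyGain(input, 0.5)); // [0.25, -0.125, 0.5] (quieter)
console.log(applyGain(input, 2));   // [1, -0.5, 2] (louder; may clip)
```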
4.2.5 Audio Control with MIDI Devices
Within the scope of this thesis, the author uses a MIDI device called the Novation Launchkey MK2, developed by Novation Music.
Figure 4.7: Novation Launchkey MK2
The audio data will be arranged and mapped onto suitable control keys of the Novation Launchkey, and the user will be able to control it via CC messages or notes on the device.
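Such a mapping can be sketched as a linear rescaling of the 7-bit CC value (0–127) that a Launchkey knob emits onto an arbitrary render parameter range. The parameter ranges below are illustrative, not VAG's actual mappings:

```javascript
// Map a 7-bit MIDI CC value (0-127) linearly onto [min, max].
function mapCC(value, min, max) {
  return min + (value / 127) * (max - min);
}

// Hypothetical uses: a knob driving rotation speed and another driving hue.
console.log(mapCC(0, 0, 2));    // 0     (knob fully down)
console.log(mapCC(127, 0, 2));  // 2     (knob fully up)
console.log(mapCC(64, 0, 360)); // ≈ 181.4 (mid knob -> mid hue)
```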
4.3 3D Rendering
From basic advertisements to immersive virtual reality, 3D visualization is ubiquitous. 3D rendering is used by architects, product designers, industrial designers, and branding firms
to generate attractive, realistic visuals that resemble real life.
For a 3D rendering web application such as VAG, it is important to keep the scene simple and easy to present. Moreover, the flow of renders has to be smooth and stable for the user to experience. This requires some basic knowledge of 3D setup and rendering, such as the coordinate system, space, coloring, transparency, shadowing, lighting, camera setup, etc.
The author presents some of these basic features of a 3D rendering scene below. This
is also the default setup of VAG.
4.3.1 Coordinate System
Figure 4.8: The Coordinate System
In geometry, a coordinate system is a system that uses one or more numbers, or coordinates, to define the position of points or other geometric components on a manifold such as Euclidean space. The order of the coordinates matters, and they are sometimes identified by their position in an ordered tuple, and other times by a letter, as in "the x-coordinate." In elementary mathematics the coordinates are assumed to be real numbers, but they can also be complex numbers or members of a more abstract system, such
as a commutative ring. The use of a coordinate system allows geometry problems to be translated into numerical problems and vice versa; this is the foundation of analytic geometry.
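As a small example of geometry turning into arithmetic, the distance between two points in 3D space follows from applying the Pythagorean theorem per axis:

```javascript
// Euclidean distance between two points given by their (x, y, z) coordinates.
function distance(a, b) {
  const dx = b.x - a.x;
  const dy = b.y - a.y;
  const dz = b.z - a.z;
  return Math.sqrt(dx * dx + dy * dy + dz * dz);
}

console.log(distance({ x: 0, y: 0, z: 0 }, { x: 3, y: 4, z: 0 })); // 5
```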
4.3.2 Geometries
Geometries are the fundamental building elements used in 3D rendering to construct 3D objects and shapes within a virtual 3D space. Geometries define the structure, size, and look of 3D objects and are critical in producing realistic and visually appealing 3D representations. Some examples of basic geometries used in 3D rendering include:
• Sphere: A perfectly round 3D shape, resembling a ball.
• Cylinder: A 3D shape with circular bases and a curved surface connecting them.
• Cone: A 3D shape with a circular base and a single curved surface tapering to a point.

4.3.3 Camera

• Projection:

– Perspective Camera: A perspective camera models how human eyesight works
in real life. By converging distant objects towards a vanishing point, it provides a sense of depth and realism: items appear smaller as they move away from the camera, and the camera's field of view influences how large items appear.
– Orthographic Camera: The orthographic camera does not exhibit perspective effects. It projects objects onto the view plane without taking their distance from the camera into account. As a result, regardless of their distance, all objects appear the same size in the produced image. This camera is frequently used in technical drawings, engineering, and certain stylised rendering styles.
• Depth Perception:
– Perspective Camera: Items in the foreground appear larger than those in
the background due to perspective projection, creating a sense of depth and spatial connection in the image.
– Orthographic Camera: Lacks depth perception, since all objects are the same
size regardless of their distance from the camera. It shows the scene in an isometric
or "flat" perspective.
• Usage:
– Perspective Camera: Commonly employed in scenes that demand a sense of
realism, immersion, and depth perception. It is popular in 3D games, films, animations, and architectural visualizations.
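The contrast between the two cameras can be sketched numerically: under perspective projection an object's apparent size shrinks with distance, while under orthographic projection it stays constant. The focal length of 1 below is an arbitrary illustrative choice:

```javascript
// Apparent size under a simple pinhole perspective model:
// projected size is proportional to objectSize / distance.
function perspectiveSize(objectSize, distance, focalLength = 1) {
  return (objectSize * focalLength) / distance;
}

// Under orthographic projection, distance has no effect on apparent size.
function orthographicSize(objectSize) {
  return objectSize;
}

console.log(perspectiveSize(2, 1)); // 2   (near: appears full size)
console.log(perspectiveSize(2, 4)); // 0.5 (far: appears 4x smaller)
console.log(orthographicSize(2));   // 2   (same at any distance)
```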