As described above, the frames of soccer videos were categorized into four view types, i.e., V = {C, M, G, Gp}. The processing time for the view decision in soccer videos was measured. Table 1 shows the time measured for the view type decision under different terminal computing power. As shown, the longest time is taken to detect the global view with goal post (Gp).
From the experimental results, the first condition for the soccer video, meaning its stability in real-time for the proposed filtering system, can be found by substituting these results into Eq. (3). By assuming the confidence limit of the filtering performance, Tfp, it is observed that the sampling rate, fs, becomes 2.5 frames per second from the experimental result.
As a result of the experiments, we obtain the system requirements for real-time filtering of soccer videos as shown in Fig. 10.
Table 1 Processing time for the view type decision

View type    Terminal 1   Terminal 2   Terminal 3
E[PT(C)]     0.170 sec    0.037 sec    0.025 sec
E[PT(M)]     0.270 sec    0.075 sec    0.045 sec
E[PT(G)]     0.281 sec    0.174 sec    0.088 sec
E[PT(Gp)]    0.314 sec    0.206 sec    0.110 sec
Fig 9 Variation of filtering performance according to sampling rate
Substituting E[PT(Gp)] of Table 1 into Eq. (6), we acquire the number of input channels and frame sampling rates available in the used filtering system. As shown, the number of input channels depends on both the sampling rate and the terminal capability. By assuming the confidence limit of the filtering performance, Tfp, we also get the minimum sampling rate from Fig. 10.
Fig 10 The number of input channels that enables the real-time filtering system to satisfy the filtering requirements in (a) Terminal 1, (b) Terminal 2, and (c) Terminal 3. Lines ① and ② indicate the conditions of Eq. (6) and Fig. 9, respectively. Line ① shows that the number of input channels is inversely proportional to the sampling rate, given the processing time of Gp. Line ② marks the range of sampling rates required to maintain over 80% filtering performance. And line ③ (the dotted horizontal line) represents the minimum number of channels, i.e., one channel.
To maintain stability in the filtering system, the number of input channels and the sampling rate should be set within the region where the ②, ③ lines meet. Supposing that the confidence limit of the filtering performance is 80%, Fig. 10 illustrates the following results: one input channel is allowable for real-time filtering in Terminal 1 at sampling rates between 2.5 and 3 frames per second. In Terminal 2, one or two channels are allowable at sampling rates between 2.5 and 4.5 frames per second. Terminal 3 can have less than four channels at sampling rates between 2.5 and 9 frames per second. The results show that Terminal 3, which has the highest capability, supports a higher number of input channels for real-time filtering than the others.
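To make the channel-count arithmetic concrete, the following sketch recomputes these operating points from Table 1. It assumes the stability condition reduces to n · fs · E[PT(Gp)] ≤ 1 (the total filtering workload per second must not exceed one second of processing time); this is an illustrative reading of Eq. (6), not its exact form.

```python
# Illustrative check of the Fig. 10 channel counts, assuming the stability
# condition n * fs * E[PT(Gp)] <= 1 (an assumed simplification of Eq. (6)).
pt_gp = {"Terminal 1": 0.314, "Terminal 2": 0.206, "Terminal 3": 0.110}  # sec, Table 1

def max_channels(pt: float, fs: float) -> int:
    """Largest n such that n * fs * pt <= 1, i.e. the per-second
    filtering workload stays within one second of processing time."""
    return int(1.0 / (fs * pt))

fs_min = 2.5  # minimum sampling rate (frames/sec) for >80% performance (Fig. 9)
for terminal, pt in pt_gp.items():
    print(f"{terminal}: up to {max_channels(pt, fs_min)} channel(s) at {fs_min} fps")
# Prints 1, 1, and 3 channels, broadly in line with the one /
# one-or-two / under-four channel budgets reported for Terminals 1-3.
```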
We implemented the real-time filtering system on our test-bed [27] as shown in Fig. 11. The main screen shows a drama channel, assumed to be the favorite station of the TV viewer, and the screen box at the bottom right of the figure shows the filtered broadcast from the channel of interest. In this case, a soccer video is selected as the channel of interest, and “Shooting” and “Goal” scenes are considered the meaningful scenes.
To perform the filtering algorithm on the soccer video, the CPU usage and memory consumption of each terminal should remain stable. Each terminal shows a memory consumption of between 32 and 38 Mbytes, and an average CPU usage of 85% (T1), 56% (T2), and 25% (T3), as measured by the Windows performance monitor.

Fig 11 Screen shot to run the real-time content filtering service with a single channel of interest
For practical purposes, we will discuss the design, implementation and integration of the proposed filtering system with a real set-top box. To realize the proposed system, the computing power to calculate and perform the filtering algorithm within the limited time is the most important element. We expect that TV terminals equipped with STB and PVR will evolve into multimedia centers in the home, with computing and home server connections [28, 29]. The terminal also requires a digital tuner enabling it to extract each broadcasting stream by time-division, or multiple tuners for the filtering of multiple channels. Next, practical implementation should be based on conditions such as buffer size, the number of channels, filtering performance, sampling rate, etc., in order to stabilize filtering performance. Finally, the terminal should know the genre of the input broadcasting video because the applied filtering algorithm depends on the video genre. This could be resolved by the time schedule of an electronic program guide.
The proposed filtering system is not without its limitations. As shown in previous works [21–24], the filtering algorithm requires more enhanced filtering performance with real-time processing. As well, it is necessary that the algorithm be extendable to other sport videos such as baseball, basketball, golf, etc.; and, to approach a real environment, we need to focus on the evaluation of the corresponding system utilization, e.g., CPU usage and memory consumption, as shown in [13] and [30].
Conclusion
In this chapter, we introduced a real-time content filtering system for live broadcasts to provide personalized scenes, and analyzed its requirements in TV terminals equipped with set-top boxes and personal video recorders. As a result of experiments based on the requirements, the effectiveness of the proposed filtering system has been verified. By applying queueing theory and a fast filtering algorithm, it is shown that the proposed system model and filtering requirements are suitable for real-time content filtering with multiple channel inputs. Our experimental results revealed that even a low-performance terminal with a 650 MHz CPU can perform the filtering function in real-time. Therefore, the proposed queueing system model and its requirements confirm that the real-time filtering of live broadcasts is possible with currently available set-top boxes.

References
1. TVAF, “Phase 2 Benchmark Features,” SP001v20, http://www.tv-anytime.org/, 2005, p. 9.
2. N. Dimitrova, H.-J. Zhang, B. Shahraray, I. Sezan, T. Huang, and A. Zakhor, “Applications of Video-Content Analysis and Retrieval,” IEEE Multimedia, Vol. 9, No. 3, 2002, pp. 42–55.
3. S. Yang, S. Kim, and Y. M. Ro, “Semantic Home Photo Categorization,” IEEE Trans. Circuits and Systems for Video Technology, Vol. 17, 2007, pp. 324–335.
4. C.-W. Ngo, Y.-F. Ma, and H.-J. Zhang, “Video Summarization and Scene Detection by Graph Modeling,” IEEE Trans. Circuits and Systems for Video Technology, Vol. 15, No. 2, 2005, pp. 296–305.
5. H. Li, G. Liu, Z. Zhang, and Y. Li, “Adaptive Scene-Detection Algorithm for VBR Video Stream,” IEEE Trans. Multimedia, Vol. 6, No. 4, 2004, pp. 624–633.
6. Y. Li, S. Narayanan, and C.-C. Jay Kuo, “Content-Based Movie Analysis and Indexing Based on AudioVisual Cues,” IEEE Trans. Circuits and Systems for Video Technology, Vol. 14.
9. S. H. Jin, T. M. Bae, and Y. M. Ro, “Intelligent Broadcasting System and Services for Personalized Semantic Contents Consumption,” Expert Systems with Applications, Vol. 31, 2006.
12. M. Bais, J. Cosmas, C. Dosch, A. Engelsberg, A. Erk, P. S. Hansen, P. Healey, G. K. Klungsoeyr, R. Mies, J.-R. Ohm, Y. Paker, A. Pearmain, L. Pedersen, A. Sandvand, R. Schafer, P. Schoonjans, and P. Stammnitz, “Customized television: standards compliant advanced digital television,” IEEE Trans. Broadcasting, Vol. 48, No. 2, 2002, pp. 151–158.
13. N. Dimitrova, T. McGee, H. Elenbaas, and J. Martino, “Video content management in consumer devices,” IEEE Trans. Knowledge and Data Engineering, Vol. 10, No. 6, 1998, pp. 988–995.
14. N. Dimitrova, H. Elenbass, T. McGee, and L. Agnihotri, “An architecture for video content filtering in consumer domain,” in Proc. Int. Conf. on Information Technology: Coding and Computing 2000, 27–29 March 2000, pp. 214–221.
15. D. Gross and C. M. Harris, Fundamentals of Queueing Theory, John Wiley & Sons: New York, NY, 1998.
16. L. Kleinrock, Queueing Systems, Wiley: New York, NY, 1975.
17. K. Lee and H. S. Park, “Approximation of the Queue Length Distribution of General Queues,” ETRI Journal, Vol. 15, No. 3, 1994, pp. 35–46.
18. A. Eckberg, Jr., “The Single Server Queue with Periodic Arrival Process and Deterministic Service Times,” IEEE Trans. Communications, Vol. 27, No. 3, 1979, pp. 556–562.
19. Y. Fu, A. Ekin, A. M. Tekalp, and R. Mehrotra, “Temporal segmentation of video objects for hierarchical object-based motion description,” IEEE Trans. Image Processing, Vol. 11, Feb.
23. M. Kumano, Y. Ariki, K. Tsukada, S. Hamaguchi, and H. Kiyose, “Automatic Extraction of PC Scenes Based on Feature Mining for a Real Time Delivery System of Baseball Highlight Scenes,” in Proc. IEEE Int. Conf. Multimedia and Expo 2004, 2004, pp. 277–280.
24. R. Leonardi, P. Migliorati, and M. Prandini, “Semantic indexing of soccer audio-visual sequences: a multimodal approach based on controlled Markov chains,” IEEE Trans. Circuits and Systems for Video Technology, Vol. 14, No. 5, 2004, pp. 634–643.
25. P. Meer and B. Georgescu, “Edge Detection with Embedded Confidence,” IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 23, No. 12, 2001, pp. 1351–1365.
26. C. Wolf, J.-M. Jolion, and F. Chassaing, “Text localization, enhancement and binarization in multimedia documents,” in Proc. 16th Int. Conf. Pattern Recognition, Vol. 2, 2002, pp. 1037–1040.
27. S. H. Jin, T. M. Bae, Y. M. Ro, and K. Kang, “Intelligent Agent-based System for Personalized Broadcasting Services,” in Proc. Int. Conf. Image Science, Systems and Technology ’04, 2004.
Chapter 19
Digital Theater: Dynamic Theatre Spaces
Sara Owsley Sood and Athanasios V. Vasilakos
Introduction
Digital technology has given rise to new media forms. Interactive theatre is one such new type of media, introducing new digital interaction methods into theatres. In a typical interactive theatre experience, people enter cyberspace and enjoy the development of a story in a non-linear manner by interacting with the characters in the story. Therefore, in contrast to conventional theatre, which presents predetermined scenes and story settings unilaterally, interactive theatre makes it possible for the viewer to actually take part in the play and enjoy a first-person experience.
In the “Interactive Theater” section, we are concerned with embodied mixed reality techniques using video-see-through HMDs (head-mounted displays). Our research goal is to explore the potential of embodied mixed reality space as an interactive theatre experience medium. What makes our system advantageous is that we, for the first time, combine embodied mixed reality, live 3D human actor capture and Ambient Intelligence, for an increased sense of presence and interaction.
We present an Interactive Theatre system using Mixed Reality, 3D Live, 3D sound and Ambient Intelligence. In this system, thanks to embodied Mixed Reality and Ambient Intelligence, audiences are totally submerged into an imaginative virtual world of the play in 3D form. They can walk around to view the show from any viewpoint, to see different parts and locations of the story scene, and to follow the story according to their own interests. Moreover, with 3D Live technology, which allows live 3D human capture, our Interactive Theatre system enables actors at different places all around the world to play together in the same place in real-time. Audiences can see the performance of these actors/actresses as if they were really in front of them. Furthermore, using Mixed Reality technologies, audiences can see both virtual objects
and the real world at the same time. Thus, they can see not only the actors/actresses of the play but the other audiences as well. All of them can also interact and participate in the play, which creates a unique experience.
Our system of Mixed Reality and 3D Live with Ambient Intelligence is intended to bring performance art to the people while offering performance artists a creative tool to extend the grammar of the traditional theatre. This Interactive Theatre also enables social networking and relations, which is the essence of the theatre, by supporting simultaneous participants in a human-to-human social manner.
While Interactive Theater engages patrons in an experience in which they drive the performance, a substantial number of systems have been built in which the performance is driven by a set of digital actors. That is, a team of digital actors autonomously generates a performance, perhaps with some input from the audience or from other human actors.
The challenge of generating novel and interesting performance content for digital actors differs greatly by the type of performance or interaction at hand. In cases where the digital actor is interacting with human actors, the digital actor must understand the context of the performance and respond with appropriate and original content in a time frame that keeps the tempo or beat of the performance intact. When performances are completely machine driven, the task is more like creating or generating a compelling story, a variant on a classic set of problems in the field of Artificial Intelligence. In the section “Automated Performance by Digital Actors” of this article, we survey various systems that automatically generate performance content for digital actors, both in human/machine hybrid performances and in completely automated performances.
Interactive Theater
The systematic study of the expressive resources of the body started in France with Francois Delsarte at the end of the 1800s [4, 5]. Delsarte studied how people gestured in real life and elaborated a lexicon of gestures, each of which was to have a direct correlation with the psychological state of man. Delsarte claimed that for every emotion, of whatever kind, there is a corresponding body movement. He also believed that a perfect reproduction of the outer manifestation of some passion will induce, by reflex, that same passion. Delsarte inspired us to have a lexicon of gestures as working material to start from. By providing automatic and unencumbering gesture recognition, technology offers a tool to study and rehearse theatre. It also provides us with tools that augment the actor’s action with synchronized digital multimedia presentations.

Delsarte’s “laws of expression” spread widely in Europe, Russia, and the United States. At the beginning of the century, Vsevolod Meyerhold at the Moscow Art Theatre developed a theatrical approach that moved away from the naturalism of Stanislavski. Meyerhold looked to the techniques of the Commedia dell’Arte, pantomime, the circus, and to the Kabuki and Noh theatres of Japan for inspiration, and created a technique of the actor, which he called “Biomechanics.” Meyerhold was fascinated by movement, and trained actors to be acrobats, clowns, dancers, singers, and jugglers, capable of rapid transitions from one role to another. He banished virtuosity in scene and costume decoration and focused on the actor’s body and his gestural skills to convey the emotions of the moment. By presenting to the public properly executed physical actions and by drawing upon their complicity of imagination, Meyerhold aimed at a theatre in which spectators would be invited to social and political insights by the strength of the emotional communication of gesture. Meyerhold’s work stimulated us to investigate the relationship between motion and emotion.
Later in the century, Bertolt Brecht elaborated a theory of acting and staging aimed at jolting the audience out of its uncritical stupor. Performers of his plays used physical gestures to illuminate the characters they played, and maintained a distance between the part and themselves. The search for an ideal gesture that distills the essence of a moment (Gestus) is an essential part of his technique. Brecht wanted actors to explore and heighten the contradictions in a character’s behavior. He would invite actors to stop at crucial points in the performance and have them explain to the audience the implications of a character’s choice. By doing so he wanted the public to become aware of the social implications of everyone’s life choices. Like Brecht, we are interested in performances that produce awakening and reflection in the public rather than uncritical immersion. We therefore have organized our technology to augment the stage in a way similar to how “Mixed Reality” enhances or completes our view of the real world. This contrasts with work on Virtual Reality, Virtual Theatre, or Virtual Actors, which aims at replacing the stage and actors with virtual ones, and at involving the public in an immersive narration similar to an open-eyes dream.

English director Peter Brook, a remarkable contemporary, has accomplished a creative synthesis of the century’s quest for a novel theory and practice of acting. Brook started his career directing “traditional” Shakespearean plays and later moved his stage and theatrical experimentation to hospitals, churches, and African tribes. He has explored audience involvement and influence on the play, preparation vs. spontaneity of acting, the relationship between physical and emotional energy, and the usage of space as a tool for communication. His work, centered on sound, voice, gestures, and movement, has been a constant source of inspiration to many contemporaries, together with his thought-provoking theories on theatrical research and discovery. We admire Brook’s research for meaning and its representation in theatre. In particular we would like to follow his path in bringing theatre out of the traditional stage and performing closer to people, in a variety of public and cultural settings. Our Virtual Theatre enables social networking by supporting simultaneous participants in a human-to-human social manner.
Flavia Sparacino at the MIT Media Lab created the Improvisational Theatre Space [1], [2], which embodied human actors and Media Actors to generate an emergent story through interaction among themselves and the public. An emergent story is one that is not strictly tied to a script. It is the analog of a “jam session” in music. Like musicians who play together, each with their unique musical personality, competency, and experience, to create a musical experience for which there is no score, a group of Media Actors and human actors perform a dynamically evolving story. Media Actors are autonomous agent-based text, images, movie clips, and audio. These are used to augment the play by expressing the actor’s inner thoughts, memory, or personal imagery, or by playing other segments of the script. Human actors use full body gestures, tone of voice, and simple phrases to interact with media actors. An experimental performance was presented in 1997 on the occasion of the Sixth Biennial Symposium on Arts and Technology [3].
Interactive Theater Architecture
In this section, we will introduce the design of our Interactive Theatre Architecture. The diagram in Fig. 3 shows the whole system architecture.
Embodied mixed reality space and Live 3D actors
In order to maintain an electrical theatre entertainment in a physical space, the actors and props will be represented by digital objects, which must seamlessly appear in the physical world. This can be achieved using the full mixed reality spectrum of physical reality, augmented reality and virtual reality. Furthermore, to implement human-to-human social interaction and physical interaction as essential features of the interactive theatre, the theory of embodied computing is applied in the system.

As mentioned above, this research aims to maintain human-to-human interaction such as gestures, body language and movement between users. Thus, we have developed a live 3D interaction system for viewers to view live human actors in the mixed reality environment. In fact, science fiction has presaged such interaction in computing and communication. In 2001: A Space Odyssey, Dr. Floyd calls home using a videophone, an early on-screen appearance of 2D video-conferencing. This technology is now commonplace.

More recently, the Star Wars films depicted 3D holographic communication. Using a similar philosophy in this paper, we apply computer graphics to create real-time 3D human actors for mixed reality environments. One goal of this work is to enhance the interactive theatre by developing a 3D human actor capture mixed reality system. The enabling technology is an algorithm for generating arbitrary novel views of a collaborator at video frame rate speeds (30 frames per second). We also apply these methods to communication in virtual spaces. We render the image of the collaborator from the viewpoint of the user, permitting very natural interaction.
Hardware setup
Figure 1 represents the overall structure of the 3D capture system. Eight Dragonfly FireWire cameras, operating at 30 fps and 640 by 480 resolution, are equally spaced around the subject, and one camera views him/her from above. Three Sync Units from Point Grey Research are used to synchronize image acquisition of these cameras across multiple FireWire buses [6]. Three Capture Server machines receive the three 640 by 480 video streams in Bayer format at 30 Hz from three cameras each, and pre-process the video streams. The Synchronization machine is connected with the three Capture Server machines through a Gigabit network. This machine receives nine processed images from the three Capture Server machines, synchronizes them, and sends them via gigabit Ethernet links to the Rendering machine. At the Rendering machine, the position of the virtual viewpoint is estimated. A novel view of the captured subject from this viewpoint is then generated and superimposed onto the mixed reality scene.

Fig 1 Hardware architecture [8]

Software components
All of the basic modules and the processing stages of the system are represented in Figure 2. The Capturing and Image Processing modules are placed at each Capture Server machine. After the Capturing module obtains raw images from the cameras, the Image Processing module extracts the foreground objects from the background scene to obtain the silhouettes, compensates for the radial distortion component of the camera model, and applies a simple compression technique. The Synchronization module, on the Synchronization machine, is responsible for getting the processed images from all the cameras and checking their timestamps to synchronize them. If those images are not synchronized, based on the timestamps, the Synchronization module will request the slowest camera to continuously capture and send back images until the images from all nine cameras appear to be captured at nearly the same time.

Fig 2 Software architecture [8]
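As a rough illustration of this timestamp check, the sketch below accepts a set of latest frames only when their timestamps fall within a small window, and otherwise asks the lagging camera for a newer frame. The Frame layout, the 5 ms tolerance, and the request_recapture callback are assumptions for illustration, not the system’s actual interfaces.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    camera_id: int
    timestamp: float  # seconds; hypothetical clock shared via the Sync Units
    data: bytes

TOLERANCE = 0.005  # assumed 5 ms window for "nearly the same time"

def synchronize(latest: dict[int, Frame], request_recapture) -> list[Frame] | None:
    """Return one frame per camera if all timestamps agree within TOLERANCE;
    otherwise ask the slowest (oldest-timestamp) camera for a newer frame."""
    frames = list(latest.values())
    newest = max(f.timestamp for f in frames)
    oldest = min(frames, key=lambda f: f.timestamp)
    if newest - oldest.timestamp <= TOLERANCE:
        return frames  # synchronized set, ready for the Rendering machine
    request_recapture(oldest.camera_id)  # keep capturing until it catches up
    return None
```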
The Tracking module calculates the Euclidean transformation matrix between a live 3D actor and the user’s viewpoint. This can be done either by marker-based tracking techniques [7] or by other tracking methods, such as the IS900. After receiving the images from the Synchronization module and the transformation matrix from the Tracking module, the Rendering module generates a novel view of the subject based on these inputs. The novel image is generated such that the virtual camera views the subject from exactly the same angle and position as the head-mounted camera views the marker. This simulated view of the remote collaborator is then superimposed on the original image and displayed to the user. In the interactive theatre, using this system, we capture live human models and present them via the augmented reality interface at a remote location. The result gives the strong impression that the model is a real three-dimensional part of the scene.
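A minimal sketch of the geometric idea, assuming poses are expressed as 4x4 homogeneous matrices: if the Tracking module gives the marker’s pose in head-mounted-camera coordinates, the virtual camera’s pose in the actor’s (marker-centered) frame is simply its inverse, so the rendered subject is viewed from the user’s position. The matrix convention below is an illustrative assumption; the chapter does not spell out the exact math.

```python
import numpy as np

def virtual_camera_pose(T_marker_in_camera: np.ndarray) -> np.ndarray:
    """Given the marker pose in head-mounted-camera coordinates (4x4
    homogeneous matrix from the Tracking module), return the virtual
    camera pose in the marker-centered frame: the inverse transform."""
    return np.linalg.inv(T_marker_in_camera)

# Example: marker seen 2 m in front of the camera, no rotation.
T = np.eye(4)
T[2, 3] = 2.0  # marker 2 m along the camera's viewing axis (assumed convention)
print(virtual_camera_pose(T))  # virtual camera sits 2 m from the actor
```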
Interactive Theatre system
In this section, we will introduce the design of our Interactive Theatre system. The diagram in Figure 3 shows the whole system architecture.
3D Live capture room
3D Live capture rooms are used to capture the actors in real-time. Basically, these are the capturing part of the 3D Live capture system, which has been described in the previous section. The actors play inside the 3D Live recording room, and their images are captured by nine surrounding cameras. After subtracting the background, those images are streamed to the synchronization server using RTP/IP multicast, the well-known protocols for transferring multimedia data streams over the network in real-time. Together with the images, the sound is also recorded and transferred to the synchronization server using RTP in real-time. This server synchronizes those sound packets and images, and streams the synchronized data to the render clients, also using the RTP protocol to guarantee the real-time constraint. While receiving the synchronized streams of images and sounds transferred from the synchronization server, each render client buffers the data and uses it to generate the 3D images and play back the 3D sound for each user.

Fig 3 Interactive Theatre system [8]

One important requirement of this system is that the actors in one recording room need to see the story context. They may need to follow and communicate with actors from other recording rooms, with the virtual characters generated by computers, or even with the audiences inside the theatre, to interact with them for our interactivity purpose. In order to achieve this, several monitors are put at specific positions inside the recording room to reflect the corresponding views of other recording rooms, the virtual computer-generated world, and the current images of the audiences inside the theatre. Those monitors are put at fixed positions so that the background subtraction algorithm can easily identify their fixed area in the captured images and eliminate them as parts of the background scene. Figure 4 shows an example of the recording room, where an actor is playing Hamlet.

Fig 4 Actor playing Hamlet is captured inside the 3D Live recording room
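Returning to the streaming path described above, here is a bare-bones sketch of sending background-subtracted frames over UDP/IP multicast. Real RTP adds sequence numbers, payload typing, fragmentation of frames larger than a UDP datagram, and RTCP feedback; the group address, port, and packet header below are assumptions for illustration only.

```python
import socket
import struct
import time

MCAST_GRP = "239.0.0.1"  # assumed multicast group for one capture room
MCAST_PORT = 5004        # assumed port; a real RTP session would negotiate this

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 2)

def send_frame(camera_id: int, frame_bytes: bytes) -> None:
    """Prefix each background-subtracted frame with its camera id and capture
    time so the synchronization server can align the nine camera streams."""
    header = struct.pack("!Bd", camera_id, time.time())  # 1-byte id + 8-byte timestamp
    sock.sendto(header + frame_bytes, (MCAST_GRP, MCAST_PORT))
```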
Interactive Theatre Space
The Interactive Theatre Space is where the audiences can view the story in high-resolution 3D MR and VR environments. Inside this space, we tightly couple the virtual world with the physical world.
The system uses IS900 (InterSense) inertial-acoustic hybrid tracking devices mounted on the ceiling. While visitors walk around in the room-size space, their head positions are tracked by the tracking devices. We use the user’s location information to interact with the system, so that the visitors can actually interact with the theatre context using their bodily movement in a room-size area, which incorporates the social context into the theatre experience. The Interactive Theatre Space supports two experience modes, VR and MR. Each user wears a wireless HMD and a wireless headphone connected to a render client. Based on the user’s head position in 3D, which is tracked by the IS900 system, the render client renders the image and sound of the corresponding view, so that the audience member can view the MR/VR environment and hear 3D sound seamlessly embedded around her.
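A minimal sketch of how the tracked head pose could drive each render client, under the assumption that the pose arrives as a position vector and a 3x3 head-to-world rotation; get_head_pose() stands in for the actual IS900 SDK call, which is not specified here.

```python
import numpy as np

def view_matrix(position: np.ndarray, rotation: np.ndarray) -> np.ndarray:
    """Build a world-to-camera matrix from a tracked head pose:
    position (3-vector) and rotation (3x3, head-to-world)."""
    view = np.eye(4)
    view[:3, :3] = rotation.T             # invert the rotation
    view[:3, 3] = -rotation.T @ position  # and the translation
    return view

# Each frame, a render client would do something like:
#   pos, rot = get_head_pose()            # hypothetical IS900 wrapper
#   render_scene(view_matrix(pos, rot))   # image follows the user's movement
```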
In VR experience mode, featuring fully immersive VR navigation, the visitors will see that they are in the virtual castle and that they need to navigate in it to find the story performed by the 3D live actors. For example, in Figure 5, we can see the live 3D images of the actor playing Hamlet in the Interactive Theatre Space in VR mode, with the virtual grass, boat, trees and castle. The real actors can also play with imaginative virtual characters generated by the computers, as shown in Figure 6. As a result, in VR mode, the users are surrounded by characters and story scenes. They are totally submerged into an imaginative virtual world of the play in 3D form. They can walk or turn around to view the virtual world from any viewpoint, to see different parts and locations of the story scene, and to follow the story according to their own interests. Besides the VR mode, users can also view the story in MR mode, where the virtual and the real world are mixed together. For example, the real scene is built inside the room, with a real castle, real chairs, tables, etc., but the actors are 3D live characters captured inside the 3D Live recording rooms at different places.
Moreover, our Interactive Theatre system enables actors at different places to play together in the same place in real-time. With the real-time capturing and rendering features of 3D Live technology, using RTP/IP multicast to stream 3D Live data in real-time, people at different places can see each other as if they were in the same location. With this feature, dancers from many places all over the world can dance together via an internet connection, and their 3D images are displayed at the Interactive Theatre Space corresponding to the users’ viewpoints, tracked by the IS900 system. The Content Maker module in Figure 3 defines the story outline and scene by specifying the locations and interactions of all 3D Live and virtual characters. In order to enable the interaction of the audiences and the actors at different places, several cameras and microphones are put inside the Interactive Theatre Space to capture the images and voice of the audiences. Those images and voice are captured by the camera and microphone near the place of a 3D Live actor, which is pre-defined by the
Fig 5 Interactive Theatre Space in VR mode: 3D Live actor as Hamlet in virtual environment