Technology to Realize Dokodemo Door
So that Anyone Can Easily Experience the Unknown

Shigeo Morishima
Professor, Faculty of Science and Engineering, Waseda University

Although the exact definition of the Dokodemo Door is not clear, if it is interpreted as technology that teleports someone instantaneously to another place, it is physically impossible in the first place. If, however, the Dokodemo Door is interpreted as achieving an effect that provides, without physical movement, emotions and impressions virtually equivalent to those that can be felt when moving to another place, it would be possible in a technical sense. Virtual reality (VR) was once all the rage. A six-wall screen system called CAVE displayed 3D images synchronized with the viewer's perspective, realizing immersive spaces and the interactive operation of objects. Because the images that viewers experienced were synthesized through CG, however, VR did not reach an impressive level, although it certainly achieved some commercial success, for example with the virtual experience of a kitchen or a walk-through of a house planned to be built. Also, although it evolved into teleconferencing with realistic sensations, which realizes communication in a virtual space, and into augmented reality (AR), which overlaps a virtual space and a real space, it is nothing more than a tool, not a means to convey emotions. That is because it is ultimately a fake world in which subjects are synthesized, vastly removed from the real world.

Example of replicating a 3D face contour from a frontal face image: Active Snap Shot

While it is physically impossible to teleport ourselves, technology for bringing to us the environment of a place that we wish to visit is the research theme we pursued in the project called Dive into the Movie, sponsored by Special Coordination Funds for Promoting Science and Technology from the Ministry of Education, Culture, Sports, Science and Technology. Human beings receive stimuli through the five senses. Focusing particularly on audiovisual information, these technologies aim to faithfully replicate, at the position where we are standing now, the stimuli that a camera operator or a main character feels wherever they are. Specifically, they are the technology of taking and reproducing panoramic images (FIPPO) and the technology of recording and reproducing a 3D sound field (wave field synthesis). These technologies provide an opportunity to actually experience locations that are inaccessible to ordinary people no matter how hard they try (for example, the top of Mt. Everest, the deep sea, the inside of a lion cage, a player's position during a game, or the location of the main character in a movie), something that even the Dokodemo Door cannot make happen. What sets these panoramic images apart from normal video images is that, while the latter purely reflect the camera operator's perspective, the former deliver a high-quality 360-degree view as seen from a standing position. In addition, because the imaging device is head-mounted, images are taken at the camera operator's eye level, so that when they are reproduced the viewer can make eye contact with anyone who was nearby, and the battery-powered, lightweight device can be carried anywhere.
Meanwhile, as for the sound field, unlike binaural recording or Dolby Surround, the wavefront of the sound field that developed in the original space is physically reproduced, so the sound heard while standing at that position reaches your ears naturally without headphones. Turn your head, and you can recognize the direction of the sound source and perceive the audio in 3D. With these two technologies combined, a world that has never been experienced before is made available to everyone easily and safely.
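
As a rough illustration of what physically reproducing a wavefront can mean, the sketch below computes, for a straight row of loudspeakers, the delay and gain each one would apply so that their combined output approximates the wavefront of a virtual point source behind the row. It is a simplified delay-and-attenuate model written for this article, not the driving function used in the Dive into the Movie project; the function name, the speaker layout, the speed of sound, and the 1/sqrt(r) gain are all assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def driving_parameters(source_xy, speaker_xs, speaker_y=0.0):
    """Return a (delay_seconds, gain) pair for each loudspeaker.

    source_xy  -- (x, y) of the virtual point source, in metres (hypothetical)
    speaker_xs -- x positions of loudspeakers placed on the line y = speaker_y
    """
    sx, sy = source_xy
    params = []
    for x in speaker_xs:
        # Distance the wavefront travels from the virtual source to this speaker.
        r = math.hypot(x - sx, speaker_y - sy)
        # Speakers farther from the source fire later and more quietly,
        # which is what makes the combined wavefront curve like the original.
        delay = r / SPEED_OF_SOUND
        gain = 1.0 / math.sqrt(max(r, 1e-6))  # simplified attenuation (assumption)
        params.append((delay, gain))
    return params


# Five speakers spaced 0.5 m apart, virtual source 2 m behind the centre one.
speakers = [-1.0, -0.5, 0.0, 0.5, 1.0]
params = driving_parameters((0.0, -2.0), speakers)
for x, (delay, gain) in zip(speakers, params):
    print(f"speaker at x={x:+.1f} m: delay {delay * 1000:.2f} ms, gain {gain:.3f}")
```

In this example the centre loudspeaker fires first and loudest, and the outer ones follow symmetrically, so a listener in front of the row hears the source as if it were located behind it, whichever way the head is turned.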

Panoramic imaging system (FIPPO) and the shot image (provided by Kazuaki Kondo, Kyoto University)

Future Cast System

Human desires never end, however, and we sometimes hope to be incorporated into the film we are watching instead of merely being immersed in the actual environment. This is, so to speak, teleportation into a story. The Future Cast System successfully realized this sensation. It is a visual entertainment system featuring viewer participation, which made its debut at the Mitsui-Toshiba Pavilion at the Aichi Expo in the summer of 2005. The visitors' faces are scanned first, and once the film starts, the faces of all the characters appearing in the movie are replaced with those of the visitors, who then act and change their facial expressions on screen. The system is characterized by its ability to let viewers see themselves from a third person's perspective, so to speak, and share the excitement with their families and friends. Unlike a purely personal pleasure, it allows us to bask in a certain kind of heroism, and various educational effects can also be expected because the system lets us look at ourselves objectively. After the Aichi Expo, its features for customizing individual characteristics were technologically advanced, and it was released as a commercial system for permanent exhibition at Huis Ten Bosch in 2007. The time required to scan a face was shortened from 3 minutes to 10 seconds, modeling errors were eliminated, and an optional service was added to let visitors buy their photographs if they like them. Although four years have already passed since the system made its debut there, it still ranks number one in popularity, and these days many visitors come from Taiwan, Korea, and China.
At present, with further technological progress, techniques have been established that replicate a 3D mesh model of a face from a snapshot without using complex devices. They have reached a level of precision and efficiency at which processing takes less than two seconds on a laptop and there is little deformation of the face contour. Customization has extended beyond facial contours to cover the voice, way of walking, changes in facial expression, skin texture, physique, and hair, which has dramatically improved the realism of the characters and made them a more familiar presence. Customization takes a few minutes from measurement to modeling. It is only a matter of time before this technology sees the light of day.

Example of character customization

Professor Morishima was born in Kushimoto, Wakayama Prefecture, in 1959. He completed his doctoral course at the School of Engineering, the University of Tokyo, in 1987 (Doctor of Engineering). After serving as a Full-Time Lecturer, Associate Professor, and Professor in the Department of Electrical Engineering, Faculty of Technology, Seikei University, he was appointed to his current post as Professor, Faculty of Science and Engineering, Waseda University, in 2004. At Seikei University, he conducted a study of intelligent image coding, which serves as a basis for teleconferencing with realistic sensations, and won the Achievement Award from the Institute of Electronics, Information and Communication Engineers. He became a Visiting Professor at the University of Toronto in 1994, where he conducted research on the modeling of muscles in facial expressions. He was a Visiting Researcher at ATR from 1999 to 2010, where he conducted a study of interactive media and the application of voice translation technologies. In 2010, he received the TELECOM System Technology Award from the Telecommunications Advancement Foundation for his research on Dive into the Movie. He engages in research on graphics, vision, and voice/music information processing. Practical applications include applied motion-capture technology for Nodame Cantabile: Paris Chapter [Nodame Kantabire Pari Hen] and the development of the real-time skin shader for Dynasty Warriors 7 and Samurai Warriors 3Z.