The Japan News by The Yomiuri Shimbun

Campus Now

New Year Issue (Jan. 2014)


A happy future with people and robots

Waseda University strives to become the “international research university” advocated in “Waseda Vision 150,” which states the ideal form of our university upon reaching its 150th anniversary (2032). Every day, Waseda gives back to human society by broadly transmitting its research activities throughout the world.
In the field of humanoid robots, the WABOT Project was started in 1970 as research that spans different departments in the School of Science and Engineering. The WABOT Project boasts over 40 years of activity and continues to produce world-leading technology daily. This article reports the successes and issues of Waseda research and technology aimed at coexistence between people and robots.

Aiming for coexistence with people
Humanoid robot technology at Waseda

From among Waseda’s advanced humanoid robot technology, this article introduces SCHEMA, a communication robot which is the culmination of knowledge at the Perceptual Computing Laboratory led by Professor Tetsunori Kobayashi (Faculty of Science and Engineering).

Towards realization of a robot system to enjoy conversation

In addition to the language information contained in voices, human conversation is composed of a variety of non-language information which the participants convey to each other unconsciously. This information includes intonation, line of sight, facial expressions and gestures. For example, a person may nod or give a questioning look while listening to someone talk. These actions are a simple way of showing whether or not the listener agrees with what the speaker is saying. Furthermore, if there are multiple listeners, the speaker’s line of sight or body position indicates the group member to whom the remarks are directed. In other words, there are various protocols for using non-language information in order to communicate language information effectively. Natural conversation is formed only when these protocols are followed. One focus of my research is to clarify conversational protocols by using robots. I am also developing the element technology needed to realize such protocols in robots. By combining these research themes, I seek to realize the comprehensive technology of a conversation system.

Research on conversation using robots is based on fulfilling one protocol layer: matching the medium of information transmission to the medium that human beings use (the human body). However, robots have one more important meaning: they are a treasure chest of errors. Unlike human beings, who can perform complicated actions unconsciously, robots cannot act correctly unless they have been explicitly programmed to do so. The strange actions of robots show us where our understanding of important protocols is lacking. Correct examples alone are not sufficient for learning. Instead, an appropriate number of both correct and incorrect examples is necessary. In 1999, I used “ROBITA” to conduct experiments on protocols for conversation in groups. In 2003, through “ROBISUKE,” I experimented with protocols for turn-taking in conversation. In 2009, I used “SCHEMA” to perform experiments focused on protocols for enlivening conversations. In these experiments, I compared good examples and bad ones in order to advance my research toward a correct understanding of conversation protocols.

Furthermore, there is value in each element technology used in realizing conversation robots. Facial image processing technology and sound source separation technology used in robots have been licensed to corporations for practical implementation. I am currently conducting several joint projects with corporations. I look forward to many different technologies being transferred and increasing the convenience of our daily lives.

My ideal is to realize a conversation system which can join human conversations naturally and act as a partner in enjoyable conversation. The conversation systems used in society today are nothing more than convenient methods of information input. However, the act of conversation is intrinsically fun in itself. By clarifying the structural elements of a system for realizing enjoyable conversation, I hope to develop a system which will be used for the act of conversation itself.

Tetsunori Kobayashi
Professor of the Faculty of Science and Engineering

Completed the Doctoral Program in engineering at the Waseda University Graduate School of Science and Engineering in 1985. Holds a PhD in engineering. After serving as Associate Professor at Hosei University, took the position of Associate Professor at Waseda University in 1991. Assumed his current position in 1997. His research themes include speech recognition and synthesis, time-varying image processing, microphone arrays, and conversational robots. Has held positions including Director of the Association for Natural Language Processing, chair of the IPSJ SLP, and managing editor of the Editorial Committee on Journal of the IEICE.

Multi-party interaction-oriented humanoid robot

Changes conversation by checking expressions, line of sight, rhythm and movement.

During a conversation between two people, a simple exchange in which one person speaks and the other only listens is nothing more than a “string telephone.” In such a conversation, smooth exchange of information is not possible. In addition to voice, natural conversation is based on the combined use of elements including expression, line of sight, rhythm and movement. “SCHEMA” checks non-language information and flexibly changes the conversation to suit the situation. For example, if the listener tilts their head to the side, SCHEMA recognizes that the “listener does not understand” and provides the information again using a different expression.
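
As a rough illustration only, the head-tilt example can be pictured as a small rule that maps a listener's cue to the robot's next move. The sketch below is not SCHEMA's actual software; the cue labels ("head_tilt", "nod"), the function name and the list of alternative phrasings are all assumptions made for this example.

```python
def respond_to_cue(cue, phrasings, current):
    """Choose the next move from a listener cue.

    cue       -- hypothetical label from a perception module, e.g. "head_tilt"
    phrasings -- alternative ways of saying the same information
    current   -- index of the phrasing that was just spoken
    Returns (action, phrasing_index).
    """
    if cue == "head_tilt":
        # Treated here as "listener does not understand": say it again
        # with the next phrasing, staying on the last one if none remain.
        return ("rephrase", min(current + 1, len(phrasings) - 1))
    if cue == "nod":
        # Treated here as understanding: carry on with the conversation.
        return ("continue", current)
    # Any other cue: make no change and keep observing.
    return ("wait", current)


phrasings = [
    "The train leaves at 9:40.",
    "Your train departs twenty minutes before ten.",
    "Please be on the platform by 9:35.",
]
print(respond_to_cue("head_tilt", phrasings, 0))
```

A real system would have to estimate such cues from camera and microphone input, which is far harder than the rule itself.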


Participates in multi-party conversation; leads conversation.

Sometimes, when three people are talking, two of the people will engage in lively conversation regarding a common subject, leaving the third person out of the conversation. In such situations, “SCHEMA” first joins the lively conversation between the two people. After taking control of the conversation, SCHEMA leads the conversation by addressing the remaining third person. By doing so, SCHEMA enables enjoyable conversation by all parties. This is made possible through a complicated system which recognizes gaps in conversation between two people, makes a statement in conjunction with the subject being discussed, and then provides a theme which enables the third person to join the conversation.
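
The three steps described above (finding a gap, commenting on the current subject, then drawing in the third person) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the system used in SCHEMA: the `Utterance` record, the function `plan_intervention`, the 0.8-second pause threshold and the six-utterance window are all hypothetical choices made for this example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker: str
    end_time: float  # seconds since the conversation started
    topic: str       # assumed to come from some topic-recognition step


def plan_intervention(utterances, participants, now, min_pause=0.8, window=6):
    """Return the robot's planned moves, or [] if it should stay quiet.

    participants lists the human speakers only (the robot is excluded).
    """
    if not utterances:
        return []
    last = utterances[-1]
    # Step 1: only speak in a gap, i.e. after a long enough silence.
    if now - last.end_time < min_pause:
        return []
    # Find who has not spoken at all in the recent window.
    turns = Counter(u.speaker for u in utterances[-window:])
    left_out = [p for p in participants if turns[p] == 0]
    if len(left_out) != 1:
        return []  # nobody (or more than one person) is left out
    # Step 2: join the lively conversation on its current topic.
    # Step 3: hand the topic to the person who was left out.
    return [("comment_on", last.topic), ("invite", left_out[0], last.topic)]


talk = [
    Utterance("A", 1.0, "soccer"),
    Utterance("B", 2.5, "soccer"),
    Utterance("A", 4.0, "soccer"),
]
print(plan_intervention(talk, ["A", "B", "C"], now=5.0))
```

Counting turns is only one crude way of judging who has been left out; line of sight and body position, mentioned earlier in this article, would be natural additional signals.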