Features Communication and the Japanese Language
“Seaman” Virtual Pet Creator Working on Chat-Capable AI

Voice recognition technologies are still in their infancy, responding to specific requests or following set patterns. Saitō Yutaka, the designer of the Seaman virtual pet games, wants to create a conversation engine for the Japanese language that will make talking with artificial intelligence a more natural experience.

Not Much of a Conversationalist

The humanoid robot Pepper was unveiled on June 5, 2014, winning initial popularity in part for its promise of boosting high street sales. When I visited a familiar store to check how its Pepper was getting on, however, there was no sign of it in its usual place. One of the assistants told me, “Recently nobody talks to it, and it’s not working properly, so we don’t put it out.” I got the same kind of comment when I called another outlet. Finally, at the third store, I found a Pepper.

Its chest display said that it was on duty, and it looked eager to get working, but it was true that nobody paid it any notice. Feeling sorry for it, I struck up a conversation.

Me: Hello.
Pepper: Hello.
Me: Are you on duty?
Pepper: I can’t understand you. How about a little practice?

I went through a timing and diction drill.

Pepper: Please ask me a question, like what music or films I like.
Me: What music do you like?
Pepper: I like “Sekai ni hitotsu dake no hana” [the SMAP megahit].
Me: You like SMAP?
Pepper: I don’t understand that. Is it OK if I speak first?

Apparently Pepper had never heard of SMAP.

Cute assistants like Pepper and Sharp’s combination robot/smartphone Robohon are known for being able to recognize human speech to hold simple conversations. As described above, however, a natural chat does not come easily, and talking with them requires a certain knack.

What is the situation in other countries? After Amazon produced the world’s first smart speaker, the Echo, in 2014, it soon faced competition from Google Home, Apple’s HomePod, and the Cortana-powered Harman Kardon Invoke, as the IT giants successively threw their hats in the ring. In Japan, there is the Clova Wave speaker marketed by the messaging giant Line, as well as local-language versions of Google Home and Amazon Echo.

Each has its own built-in virtual assistant: Amazon’s Alexa, Google Assistant, Apple’s Siri, Microsoft’s Cortana, and Clova. This last one, Line’s assistant, was developed to respond to both Japanese and Korean. No doubt, many readers have encountered these assistants in their smartphones or computers and have asked them for a weather update, used them to get information on local restaurants and shops, or instructed them to call or text a friend.

Smart speaker virtual assistants can read the news, play music, and tell jokes if required. If more assistants are installed in automotive systems, televisions, and air conditioning units, it will be possible to control those by voice as well. Amazon Echo is estimated to have sold 11 million units as of the end of 2016. The US survey firm eMarketer found that in May 2017, Amazon had a 70% share of the country’s market and that 35.7 million Americans used smart speakers at least once a month.

Pioneering Voice Recognition in a Virtual Pet Game

Now that the world is getting ready to switch from keyboards to voice recognition, there is a need for a Japanese-speaking AI conversation engine. Saitō “Yoot” Yutaka, creator of the bestselling Seaman virtual pet series, is among those working on the challenge. In 2015, he established the Seaman Artificial Intelligence Research Center.

“I’m getting on now, so I’ve thought about retiring. But it’s been eighteen years since I first made Seaman and I’ve still got all the knowledge from working on the later versions. I’m the only person who can build a Japanese conversation engine.”

Released in 1999, Seaman was one of the first games to use voice recognition. The titular creature has a fish body and a human head. As players raise it through various stages in its life cycle, the Seaman responds and reacts to what they say to it. At the time, however, voice recognition was quite primitive, so the creature often failed to understand what was said. As a last resort, Saitō designed it to get angry and criticize the player’s pronunciation when this happened. This successfully covered up the flaw, and the arrogance of the Seaman character became part of the game’s appeal.

Seaman (© 1998–2017 OpenBook Inc.)

With a little inventiveness, Seaman managed to give the superficial impression of natural conversation. The development team put together scenarios, imagining what the player would say and preparing appropriate replies for Seaman. The scenarios were said to be numerous enough to fill 20 telephone directories, and Saitō recorded all of Seaman’s lines himself. These scenarios represent the knowledge that he has built up.
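For readers curious how a scripted approach of this kind can work, here is a minimal sketch in Python. It is not Seaman's actual code: the `SCENARIOS` table, the `FALLBACK` line, and the `reply` function are all hypothetical names invented for illustration, and the keyword matching is a deliberately crude stand-in for real voice recognition. The idea is simply that each anticipated remark maps to a prepared line, and anything unrecognized triggers a scolding about pronunciation.

```python
import re

# Hypothetical scenario table: keywords the player might say, paired
# with a prepared reply. A real game would hold vastly more entries.
SCENARIOS = {
    ("hello", "hi"): "What do you want?",
    ("name",): "I'm Seaman. Remember it.",
    ("food", "hungry"): "About time you fed me.",
}

# Assumed fallback line: when nothing matches, blame the player.
FALLBACK = "I can't understand you. Work on your pronunciation."


def reply(utterance: str) -> str:
    """Return the prepared line for the first scenario whose keywords
    appear in the utterance, or the fallback if none match."""
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    for keywords, line in SCENARIOS.items():
        if words & set(keywords):
            return line
    return FALLBACK


if __name__ == "__main__":
    for remark in ["Hello there", "What is your name?", "Do you like SMAP?"]:
        print(f"Player: {remark}")
        print(f"Seaman: {reply(remark)}")
```

A lookup like this has no memory of earlier exchanges and no grasp of intent, which is why it can only give the impression of conversation.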

“Amazon Echo just answers individual questions. These kinds of products take the same approach as I did with Seaman—it’s not artificial intelligence at all. We’re still only halfway to a conversation engine that knows what someone talking to it actually wants to say. AI can currently deal with requests like ‘Buy me some tickets,’ but we’re working on an AI that can make broader responses, so if someone says, ‘I got 100%,’ it will reply, ‘That’s great. That’s the second time, right?’”

[2018.03.16]