Four keys to voice design for intelligent products

Date: 2021-08-25

We will eventually communicate with machines in a natural way. For intelligent products whose core function is voice interaction, the voice design suggestions are as follows.

① Naturalness;

Avoid monotony and sound as natural as human speech, with a tone that is proactive and willing. The words and sentences synthesized from individual phonemes should be clear, recognizable and natural.

Human speech carries both acoustic features and the semantics of the text. The acoustic features are mainly prosodic (that is, they concern how phonemes are combined into utterances) and include tone, stress, pauses and speaking rate. Chinese is a tonal language, and tone carries very important emotional information. Voice is a natural form of interaction, so it has to feel natural before users will find it usable.
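As an illustration of how these prosodic controls (pause, speaking rate, pitch) can be passed to a speech synthesizer, here is a minimal sketch that builds SSML markup in Python. The helper name `to_ssml` and its default values are assumptions made for this example, not part of any product's API, and support for individual `prosody` attribute values varies between speech engines.

```python
from xml.sax.saxutils import escape, quoteattr


def to_ssml(phrases, rate="medium", pitch="medium", pause_ms=300):
    """Wrap phrases in SSML, inserting a short pause between them.

    Hypothetical helper for illustration only.
    """
    if pause_ms < 0:
        raise ValueError("pause_ms must be non-negative")
    # A <break> element models the pause between phrases.
    pause = f'<break time="{int(pause_ms)}ms"/>'
    # Escape text so characters such as & and < stay valid XML.
    body = pause.join(escape(p) for p in phrases)
    # <prosody> carries speaking rate and pitch for the whole utterance.
    return (
        f"<speak><prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)}>"
        f"{body}</prosody></speak>"
    )


ssml = to_ssml(
    ["Good morning.", "You have two meetings today."],
    rate="95%",
    pitch="+5%",
)
print(ssml)
```

Slightly slowing the rate and raising the pitch is only one possible setting; which values sound natural depends on the voice and the language, and has to be judged by listening.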

How to make Siri sound more natural?

The goal of the Siri upgrade in iOS 11 was to "make Siri sound more natural, like a person". This was achieved through deep learning. Each expression has a slightly different sound wave, and each sentence contains dozens or hundreds of phonemes.

Siri finds the best combination of sounds for each voice. The phonemes are collected from speakers selected by Apple. The emotional corpus is gathered from recordings that Apple listens to anonymously and then uses in deep learning to train Siri.

② Once the "voice" is determined, it should not be changed at will;

Once a voice has taken root in the user's ears, it should not be changed at will. Changing the wallpaper of a phone interface is like a person putting on a new dress, but changing the "human voice" of an intelligent product whose core function is voice interaction is like getting to know a stranger all over again. As the old saying goes, "to hear the voice is to see the person": people naturally associate a voice with someone, so whoever the new voice is, users have to build up its character from scratch.

③ Converse as people do;

First, the dialogue should flow smoothly and give timely feedback. If there is a pause, it should not be too long. Keep the script short and effective. The product should not end the dialogue on its own initiative; it should encourage continued communication as much as possible. Of course, users should not have to complete a task by issuing commands. That is not a proper dialogue: it feels a bit like the relationship between superior and subordinate, which leads to resentment and resistance.

④ Try to initiate a conversation after sensing the user;

Before long, Amazon Echo may be able to recognize the emotion in a speaker's voice and, through prosodic characteristics (intonation, loudness, rhythm, voice quality, etc.), better understand the user's mood at the moment of speaking, just like the line "you sound a little unhappy today" in the movie Her. It can perceive you and try to initiate a conversation.
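To make the idea of prosodic characteristics concrete, the following minimal sketch computes two very rough cues, loudness and the share of pauses, from raw audio samples. It is an assumption-laden illustration rather than a description of how Amazon Echo or any real product works: the function names, the frame length and the silence threshold are all hypothetical, and real emotion recognition would need many more features (pitch contour, voice quality) and a trained model.

```python
import math


def frame_rms(samples, frame_len=400):
    """Root-mean-square energy of each non-overlapping frame."""
    frames = [
        samples[i:i + frame_len]
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]
    return [math.sqrt(sum(x * x for x in f) / frame_len) for f in frames]


def prosody_summary(samples, frame_len=400, silence_threshold=0.02):
    """Summarize loudness and pausing (hypothetical helper)."""
    rms = frame_rms(samples, frame_len)
    if not rms:
        raise ValueError("need at least one full frame of audio")
    # Frames quieter than the (assumed) threshold count as pauses.
    silent = sum(1 for r in rms if r < silence_threshold)
    return {
        "mean_loudness": sum(rms) / len(rms),
        "pause_ratio": silent / len(rms),
    }


# Synthetic input at 16 kHz: 0.1 s of a 200 Hz tone, then 0.1 s of silence.
sample_rate = 16000
tone = [0.5 * math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(1600)]
silence = [0.0] * 1600
summary = prosody_summary(tone + silence)
print(summary)
```

A downstream system could compare such values against a speaker's usual baseline, for example noticing quieter speech with more pauses than normal, before deciding whether to open a conversation.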