Voice technology, or speech recognition, is the ability of a machine or software to receive, interpret and carry out spoken commands or instructions. This technology is expanding quickly because it is easy to use, feels natural and provides instant answers and information. Consumers use it to shop, make reservations, get directions, book travel and more. To understand the effect and importance of voice technology on consumer behavior, it is important to know how voice affects our lives. It is also worth understanding how synthetic (AI) voice differs from natural (human) voice.
How does voice affect our lives?
Voice helps us navigate the social situations of life. As humans, our voice develops from before birth through adulthood and is influenced by our environment, especially how we interact and what we listen to or watch on TV.
It can develop to the point where it becomes part of our identity. The way we speak shapes how others perceive us and how we fit into different social groups or families, because accents and manners of speaking are interpreted differently within society. Through conscious effort, however, we can change the message behind our voice by adapting it to how we want to be perceived, as the former UK Prime Minister Margaret Thatcher did.
Besides creating a certain perception of us, voice can also affect our emotions or moods, as well as our cognitive processes, by altering the way we think, remember, judge, solve problems or learn.
How do synthetic (AI) voices influence us differently from real human voices?
A natural voice combines simultaneous changes in pitch, intensity and the duration of words and speech segments. It also carries redundant (repetitive) acoustic cues to each speech segment. These factors, together with the fact that people encounter natural voice in everyday life from childhood, make natural voice effortless to listen to and understand, and easy to trust.
Synthetic voice, however, is generally produced by a text-to-speech system and lacks the natural phonetic variation and the natural variation in pitch, level and intonation found in natural voice. In addition, it has minimal redundancy in its acoustic cues to speech segments. These factors make it difficult for some individuals to relate individual words or sentences to the overall meaning of a text, so listening to synthetic voice requires more effort. It can also be easily distinguished from a real human voice and may, by default, create a sense of unreliability in our minds. In practice, however, synthetic voice is well suited to voicing commands and to helping the visually impaired navigate through life.
Although listeners show comprehension error rates of 7.6 % for synthetic voice and 2.7 % for natural voice, no significant difference has been reported in a person’s ability to comprehend natural and synthetic speech. As such, both voices are generally regarded as equivalent. A recent study, however, showed that humans prefer to interact with AI voices that display human-like emotions, because this creates a sense of familiarity and puts them at ease.
How do AI voices impact consumer behavior or buying power?
According to Think with Google, voice technology affects consumer behavior and buying power because of the following advantages:
- It is quick and efficient, so you can easily multitask
- It provides instant answers and information
- It is always available
- It makes daily routines much easier
- It creates a new form of human relationship by engaging with its owner as if it were a friend. Owners even say words such as “please,” “thank you,” and “sorry.”
Sun et al. reported that consumers spend 23 % more time when shopping with AI voices. This is especially true for products that do not require active search or comparison. AI voice can also list features, facts and statistics that can quickly convince an individual to purchase a product. But AI voice cannot detect human emotions, and hence might not be able to create the sense of urgency needed to persuade an unconvinced customer to buy a product or service. This means AI voice cannot be reliable and trustworthy all the time.
The most important aspect of synthetic voice is its one-dimensional nature: the more fully certain attributes are met, the more satisfied customers will be, and vice versa.