2018 Hardware Review: The Fad of Translation Devices

Over the past 12 months, I have been working in Microsoft Global Solution Innovation to drive new projects powered by Cloud and AI. One of the exciting incubation categories is the translation device. As machine translation has advanced significantly with neural networks, it has become possible to ship AI speech translation on a device as a product form factor of its own.

More than 50 translation device products were launched in 2017/2018. A glimpse of translation products:

  • iFlytek: $450, standalone device, online speech translation (including Chinese dialects), photo translation, offline speech translation;
  • Xiaomi Moyu: $36, WiFi device, online speech translation, VPA (virtual personal assistant);
  • Cheetah Mobile: $30, Bluetooth device, online speech translation;
  • Sogou: $220, standalone device, online speech translation, meeting dictation, photo translation;
  • iLi: $189, standalone device, offline speech translation (Japanese/Chinese/English);
  • Google Pixel Buds: $159, earbuds, online speech translation.

Looking at the hardware solution tells us whether a product is closer to a phone accessory or a standalone device, how complicated it is, and therefore why prices range from $30 to $500. This article shares some thoughts on the supply chain behind translation devices.

Why a Translation Device?

Before jumping into the hardware discussion, many people would ask: why do we need a translation device when we already have translation apps such as Google Translate and Microsoft Translator?

This is a question my partners are asked almost every day. The essential reason to use a device instead of an app is that the device is designed specifically for the conversation scenario, in which the microphone and speaker must pick up and play back voice in the mid-field. Phone microphones are designed for near-field use, so they are not a good fit for face-to-face conversation at a comfortable distance.

Some other considerations:

Privacy

A phone is a private and important device. It feels awkward to hand your phone to a stranger.

Convenience

A translation device is easy and efficient to use when you are asking someone for help: it is usually a one-button experience that works out of the box. A phone, by contrast, takes several taps, from unlocking the screen to opening the app.

Power consumption

Nowadays most speech and machine translation processing runs in the cloud. On a phone, extra computing resources are still spent on running the app, recording and playing audio, converting formats, and so on.

Some people suggest a practical alternative: each person uses the app on their own phone and joins a group conversation, which solves the inconvenience and privacy problems. Microsoft Translator does indeed have this feature, but it requires everyone to have downloaded the app in advance, so it may not work when talking to strangers.

The Supply Chain Behind the Devices

2018 saw a sudden influx of translation devices: more than 50 companies relied on Shenzhen’s supply chain to enter the AI market quickly. Most of these manufacturers are typically “Shenzhen-style”, chasing each consumer electronics hot spot in an attempt to cash in and move on. For example, one Shenzhen company whose main products are PC and tablet peripherals launched an all-in-one translation device that integrates AI translation engines from Google, Microsoft, and iFlytek, to upsell through its PC channels.

There are three mainstream hardware solutions for translation devices.

Bluetooth accessory solution. This is the most common solution in the market, and most cheap translation devices (around $30) use it. The device picks up speech with a microphone array and connects to a phone app via Bluetooth. Technically it is a mobile accessory: the translation engine runs on the phone, either by connecting to the cloud or offline.

WiFi speaker solution. Another mainstream solution resembles a WiFi speaker: it needs a phone for the initial set-up to get online. After that, it works as a standalone device wherever WiFi is available.

Phone/Tablet solution. This is an increasingly popular way to differentiate from the two solutions above. It is a standalone device with a screen, with a camera for OCR translation and offline translation for poor network conditions. The price is of course higher, around $500. The trend is toward bigger and bigger screens, from 1.54-inch to 4-inch.

It can also take a SIM card, and some products go further and double as a mobile hotspot and power bank in one integrated solution.
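To make the comparison concrete, here is a minimal sketch that models the three solutions as plain data. The class, the field names, and the helper function are illustrative assumptions, not any vendor's specification. The values reflect only the descriptions above, except that "no phone needed for set-up" on the Phone/Tablet solution is my own assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HardwareSolution:
    name: str
    link: str                    # how the device reaches the translation engine
    needs_phone_in_use: bool     # a phone must be present during every conversation
    needs_phone_for_setup: bool  # a phone is needed once, to get the device online
    works_offline: bool          # translation is possible without a network


SOLUTIONS = [
    # The engine runs in the phone app (cloud or offline); the device is an accessory.
    HardwareSolution("Bluetooth accessory", "Bluetooth to phone app", True, True, True),
    # Standalone after set-up, but only where WiFi is available.
    HardwareSolution("WiFi speaker", "WiFi", False, True, False),
    # Standalone device with screen, camera, SIM slot, and offline translation.
    # Assumption: no phone is needed for set-up.
    HardwareSolution("Phone/Tablet", "WiFi or SIM", False, False, True),
]


def standalone(solutions: List[HardwareSolution]) -> List[str]:
    """Names of the solutions that work without a phone during a conversation."""
    return [s.name for s in solutions if not s.needs_phone_in_use]


if __name__ == "__main__":
    print(standalone(SOLUTIONS))
```

Read this way, the price gap follows the table: the more the device has to do without a phone, the more hardware it has to carry.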

Technology-driven, not hardware-driven

In 2018, the number of outbound tourists from China reached 140 million, a year-on-year increase of 13.5%. Last year, shipments of translation devices were about 600k units, and the market is expected to reach 30 to 40 million units within three to five years.
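As a rough sanity check on that forecast, the growth it implies can be worked out with the standard compound annual growth rate formula. This is only a sketch, assuming that both the 600k figure and the 30 to 40 million figure are annual shipments. The function name is illustrative.

```python
def implied_cagr(base_units: float, target_units: float, years: int) -> float:
    """Compound annual growth rate needed to grow from base to target in `years`."""
    if base_units <= 0 or target_units <= 0 or years <= 0:
        raise ValueError("all inputs must be positive")
    return (target_units / base_units) ** (1 / years) - 1


if __name__ == "__main__":
    base = 600_000  # last year's shipments, from the text above
    # The slowest and fastest cases within the forecast range.
    for target, years in [(30_000_000, 5), (40_000_000, 3)]:
        rate = implied_cagr(base, target, years)
        print(f"{base:,} -> {target:,} in {years} years: {rate:.0%} per year")
```

Even the slowest case (30 million units after five years) implies shipments more than doubling every year.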

So far there are no major differences among AI technology companies in their machine translation models. The key to winning in the future is the corpus. Translation devices serve specific communication scenarios, such as shopping, ordering food, or asking for directions when travelling abroad, or multilingual meetings. Details matter in the user experience: speech translation is not a simple concatenation of phrase translations. Whether users keep using a solution depends on whether the technology helps the parties understand each other across the language barrier. That includes how it handles sentence breaks, how well it tolerates recognition errors, how it recognizes accents, and how it makes sense of slang and technical terms.

There will be more scenario-based and more commercialized products and applications. Let’s look forward to more innovation in 2019.

Grace Peng

Digital Native Solution Specialist at Microsoft
