No longer lost in translation

The whole world had the same language and the same words. When they were migrating from the east, they came to a valley in the land of Shinar and settled there. They said to one another, “Come, let us mold bricks and harden them with fire.” They used bricks for stone, and bitumen for mortar. Then they said, “Come, let us build ourselves a city and a tower with its top in the sky, and so make a name for ourselves; otherwise we shall be scattered all over the earth.” The LORD came down to see the city and the tower that the people had built. Then the LORD said: If now, while they are one people and all have the same language, they have started to do this, nothing they presume to do will be out of their reach. Come, let us go down and there confuse their language, so that no one will understand the speech of another. So the LORD scattered them from there over all the earth, and they stopped building the city. That is why it was called Babel, because there the LORD confused the speech of all the world. From there the LORD scattered them over all the earth.

The quote above is from Genesis 11:1-9 and tells the story of the Tower of Babel, when humanity lost its ability to speak a common language. The Lord concluded that if we humans can all communicate flawlessly with each other, nothing will be out of our reach. I am not entirely sure why the Lord thought this was a bad thing. But we may find out soon ...

A few weeks ago, I spent 10 days in East Asia, specifically Hong Kong and Tokyo. As a Westerner who can't speak Cantonese or Japanese and can't read Kanji, I am grateful for the new technologies we have available on our phones to help translate written and spoken words in the most common languages. In fact, we are about to have access to instant translators that will allow us all to speak in our mother tongue while understanding every word spoken to us in whatever language - just as if we all spoke the same language again. We are on the brink of widely available technology, for example embedded in our earbuds, that will instantly translate any foreign language we hear into our preferred language. Or glasses that will automatically translate signs and text. This technology is already available. Just to illustrate, I uploaded the picture above to ChatGPT and asked it to identify and translate all Kanji signs it could make out and put them into a table. This is what I got back:

Widely available, multi-modal, real-time translation technology will have far-reaching consequences. You don't have to believe the Bible to realize that. I'll refrain here from making broad predictions. Instead, I want to focus on the very narrow area I know a little bit about - international tax, trade, and transfer pricing.

Translating TP Reports

Anyone working in this field deals with foreign languages all the time. A quick look at KPMG's Global Transfer Pricing Review shows that many countries either require or can ask for a local file report in their local language, including major jurisdictions like China, Japan, Mexico, and Italy. Historically, US multinationals that prepare their transfer pricing reports centrally in English would have wanted to get professional translation for reports for jurisdictions that require them to be in their local language. This can now be achieved by machine translation.

Machine translation was actually the original use case for the Transformer technology, the T in "GPT." In the seminal paper "Attention Is All You Need," Google researchers demonstrated that the Transformer achieved far superior results in machine translation compared to the other deep-learning approaches that had been used until then. This is the same technology that now underlies all major large language models (LLMs), like OpenAI's ChatGPT or Google's Gemini.

As practitioners, we are obviously concerned with the accuracy of any translation. LLMs are known to hallucinate. And we also know that context really matters. See the exchange in Gemini below (note I am using German in these examples because I am fluent in German, which makes it easier for me to evaluate the responses):

Tasse is the German word for (tea)cup. Note however what happens with a slight change in the prompt:

Capitalizing CUP is enough to make Gemini realize that this is a reference to the transfer pricing CUP method. Although, as you can see from the response, it's not 100 percent sure. But then again, if I asked a normal German/English bilingual person with limited or no knowledge of transfer pricing, they likely would still translate it to Tasse and assume the capitalization is just a misspelling. Note, I am using the free Gemini 1.5 Flash version and not Gemini Advanced.

Context matters, especially for LLMs. Hence a key piece of prompting advice is to give LLMs context and a role - e.g. start prompts about transfer pricing with a sentence like "You are an expert in intercompany transactions, transfer pricing and associated tax regulations."
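To make this concrete, here is a minimal Python sketch of how such a role sentence could be prepended to every translation request. The function name `build_translation_prompt` and the exact wording of the instruction are my own assumptions for illustration; the sketch only assembles the prompt text and is not tied to any particular LLM API.

```python
# Role sentence taken from the prompting advice above.
ROLE_CONTEXT = (
    "You are an expert in intercompany transactions, transfer pricing "
    "and associated tax regulations."
)


def build_translation_prompt(text: str, target_language: str,
                             role_context: str = ROLE_CONTEXT) -> str:
    """Return a prompt that states the role first, then the translation task.

    Hypothetical helper: the instruction wording is an assumption and
    should be adapted to the tool actually used.
    """
    return (
        f"{role_context}\n\n"
        f"Translate the following text into {target_language}:\n\n"
        f"{text}"
    )


prompt = build_translation_prompt("We applied the CUP method.", "German")
print(prompt)
```

The resulting string would then be sent to whichever model is in use. Keeping the role sentence in one constant means every request in a batch carries the same context.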

But even with that context, we can't be sure that LLMs always get translations right, especially with idiosyncratic or firm-specific terms or acronyms that often appear in a transfer pricing report.

For example, if I ask Gemini to translate "The inventory is valued based on the FIFO-method," it returns "Der Bestand wird nach der FIFO-Methode bewertet." Note that it correctly did not try to translate the acronym "FIFO," as that same English acronym is typically used in German. However, while the German terms Gemini uses for inventory and method are understandable and not wrong, a more typical German wording would be "Die Vorräte werden nach dem FIFO-Verfahren bewertet." The nice thing about LLMs is that you can instruct them in advance on how to handle certain cases. See the prompt below.

We can build out entire custom glossaries to instruct the LLM how to translate specific terms. This ensures the accuracy and readability of the translated reports. We can even train custom models on transfer pricing and business-specific terminology to ensure the highest accuracy.
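As a rough illustration of the glossary idea, the Python sketch below turns a small term list into prompt instructions and then checks whether a returned translation actually used the preferred terms. The glossary entries mirror the FIFO example; the names `GLOSSARY_EN_DE`, `KEEP_AS_IS`, `glossary_instructions`, and `missing_glossary_terms` are hypothetical, and the check is a plain substring match that will miss inflected German forms.

```python
# Hypothetical English-to-German glossary based on the FIFO example.
GLOSSARY_EN_DE = {
    "inventory": "Vorräte",
    "method": "Verfahren",
}
# Acronyms that should stay untranslated (assumed list).
KEEP_AS_IS = ["FIFO", "CUP"]


def glossary_instructions(glossary: dict[str, str], keep_as_is: list[str]) -> str:
    """Render the glossary as plain-text instructions for a prompt."""
    lines = ["Use the following glossary when translating:"]
    for source, target in glossary.items():
        lines.append(f'- translate "{source}" as "{target}"')
    if keep_as_is:
        lines.append("Do not translate these terms: " + ", ".join(keep_as_is))
    return "\n".join(lines)


def missing_glossary_terms(source_text: str, translated_text: str,
                           glossary: dict[str, str]) -> list[str]:
    """List source terms whose preferred translation is absent.

    Simple case-insensitive substring check; inflected forms are not handled.
    """
    src, tgt = source_text.lower(), translated_text.lower()
    return [term for term, preferred in glossary.items()
            if term.lower() in src and preferred.lower() not in tgt]


source = "The inventory is valued based on the FIFO-method."
print(glossary_instructions(GLOSSARY_EN_DE, KEEP_AS_IS))
print(missing_glossary_terms(
    source, "Der Bestand wird nach der FIFO-Methode bewertet.", GLOSSARY_EN_DE))
print(missing_glossary_terms(
    source, "Die Vorräte werden nach dem FIFO-Verfahren bewertet.", GLOSSARY_EN_DE))
```

The first check lists both glossary terms as missing, the second lists none. A check like this could serve as a cheap first filter before any human looks at the translation.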

Of course, even with all that, we may still be concerned about the accuracy. One could ask someone who speaks both languages to quickly review the translated version to ensure accuracy - the "human-in-the-loop" approach that is often recommended. The problem with the "human-in-the-loop" approach is that it also means "costs-in-the-loop," not to speak of "human-errors-in-the-loop." But there are ways to minimize that. First, there is a new ISO standard, ISO 5060:2024. Since the standard is so new, many tools have not yet been certified against it. But even an ISO certification doesn't provide an absolute guarantee.

Another approach is to retranslate the document back into the original language and compare the original version against the retranslated version (e.g. using MS Word's Document Compare function). The picture below shows the redline of a retranslated paragraph 1.2 of the OECD Guidelines. This was done using ChatGPT with no custom instructions or glossary.

There is a lot of red. But despite all that red, the translation is substantially correct, and the original meaning is preserved. If this were a long document, as many local files are, this sort of comparison would not be very helpful unless one is really concerned with granular details. There are also established translation metrics such as METEOR that can be used to calculate scores and define thresholds below which additional human review would be required.
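The scoring idea can be sketched in a few lines of Python. Note that this is not METEOR: as a stand-in I use the word-level similarity ratio from Python's standard `difflib` module, which only measures surface overlap and knows nothing about synonyms or meaning. The function names and the 0.8 threshold are assumptions for illustration, and the back-translated sentence is made up.

```python
import difflib
import re


def tokenize(text: str) -> list[str]:
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def round_trip_similarity(original: str, back_translated: str) -> float:
    """Word-level similarity between 0.0 and 1.0 (a crude stand-in for METEOR)."""
    matcher = difflib.SequenceMatcher(
        None, tokenize(original), tokenize(back_translated))
    return matcher.ratio()


def needs_human_review(original: str, back_translated: str,
                       threshold: float = 0.8) -> bool:
    """Flag the passage when similarity falls below the (assumed) threshold."""
    return round_trip_similarity(original, back_translated) < threshold


original = "The inventory is valued based on the FIFO method."
back = "The inventories are valued using the FIFO method."  # invented example
print(round(round_trip_similarity(original, back), 2))
print(needs_human_review(original, back))
```

Because a surface metric penalizes harmless rewording, any threshold would need to be calibrated on real report text before it is relied on.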

And finally, we can use the LLM to check on the translation by uploading both the original and retranslated version and prompting for a comparison. This is what I did, asking ChatGPT to identify any significant differences. This is the response it provided:

I could, of course, ask it in my prompt to provide a more condensed summary and a recommendation about the quality and fidelity of the translation. But the point here is that we can use the LLMs to check the LLMs. All this can be built into an automated workflow to ensure accuracy, with humans only alerted and asked to step into the "loop" if material deviations in the content are detected.
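A bare-bones version of such a workflow might look like the Python sketch below. The translation and scoring steps are passed in as plain functions because the actual tools (an LLM, a document translation service, a metric) will differ from company to company. Everything here, including `translate_with_checks`, `ReviewItem`, and the toy dictionaries standing in for real translators, is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

Translator = Callable[[str], str]
Scorer = Callable[[str, str], float]


@dataclass
class ReviewItem:
    """A paragraph whose round-trip score fell below the threshold."""
    index: int
    source: str
    translation: str
    back_translation: str
    score: float


def translate_with_checks(paragraphs: list[str], forward: Translator,
                          backward: Translator, scorer: Scorer,
                          threshold: float = 0.8):
    """Translate each paragraph, back-translate it, score it, flag low scores."""
    translations, flagged = [], []
    for i, source in enumerate(paragraphs):
        translation = forward(source)
        back = backward(translation)
        score = scorer(source, back)
        translations.append(translation)
        if score < threshold:
            flagged.append(ReviewItem(i, source, translation, back, score))
    return translations, flagged


# Toy stand-ins for real translators: simple dictionary lookups.
FAKE_EN_DE = {"Hello": "Hallo", "cup": "Tasse"}
FAKE_DE_EN = {"Hallo": "Hello", "Tasse": "mug"}


def fake_forward(text: str) -> str:
    return FAKE_EN_DE.get(text, text)


def fake_backward(text: str) -> str:
    return FAKE_DE_EN.get(text, text)


def exact_match(a: str, b: str) -> float:
    return 1.0 if a == b else 0.0


translations, flagged = translate_with_checks(
    ["Hello", "cup"], fake_forward, fake_backward, exact_match)
print(translations)
print([item.index for item in flagged])
```

In this toy run only the second paragraph is flagged, because "cup" comes back as "mug." In a real setup the flagged items are what would be routed to a human reviewer, while everything else passes through untouched.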

Finally, we want to ensure consistent formatting of the translated version with the source version. Most TP reports are written with word processing software, like Microsoft Word, and contain formatting, graphics, etc. If you just upload a TP report in Word or Adobe format to ChatGPT or Gemini, they will translate the content but won't provide a file back in the exact same format. However, there are tools that translate entire documents while maintaining all formatting. KPMG's Digital Gateway, for example, leverages Microsoft Azure's Document Translation to do just that.

In summary, between the ability of LLMs to understand context, the ability to define custom TP glossaries and train custom TP translation models and the use of metrics and LLMs to check on LLM translations, the reliable automation of TP local file translation, even with hundreds of pages, while maintaining the original formatting, is now possible.

The only remaining issue is security. No company will want to upload TP reports, which often contain sensitive information, to an LLM without ensuring that the data is safe and secure and isn't used for other purposes. At KPMG, our Digital Gateway GenAI platform meets those requirements. Some companies have set up trusted AI environments themselves. Either way, having a safe environment to deploy these technologies is critical.

Safe, reliable document translation has use cases well beyond just TP local files. But TP local files provide a good example of how this technology can be deployed to improve an existing process that all multi-nationals grapple with.

Enabling Intercompany Services

Beyond simplifying TP compliance processes, translation technologies will have a much bigger impact on TP through what they enable - leveraging even more off-shore locations for intercompany services. Traditionally, India and the Philippines have been favored locations for off-shore, low-cost service centers for US multi-nationals. A key reason is the abundance of trained employees who are highly fluent in English - a result of colonialism and their historic ties to the UK and the US. India has the second largest number of English speakers after the US. Providing services generally requires communication between the service provider and service recipient. That is obviously much easier when both speak the same language. Language is not the only reason for India's success, of course. But other countries that also have a large, talented and low-cost labor pool, like China, and that have been very successful manufacturing hubs, have been far less in demand as service delivery hubs.

The new translation technologies will change that. We will have video conference calls where the two parties can speak to each other in their native language and hear everything in their native language in real time. All documents will be auto-translated. Any PowerPoint slides or other documents shared on video calls will be visible in whatever the native language is of the video conference participant. Layer on evolving technologies like Google's Project Starline, Apple's Vision Pro, or Meta's Virtual Meeting Rooms and the collaboration between people thousands of miles apart and speaking different languages will become a lot easier.

These technologies will do two things:

  1. They will accelerate off-shoring in general because communication is one barrier to doing so, and it is being removed. They will also enable a broader scope of activities and services to be off-shored.
  2. They will expand the pool of countries and potential employees that can provide these services, including countries that have a time-zone advantage over India or the Philippines. One of the biggest drawbacks for US multi-nationals working with off-shore centers in Asia is the time-zone difference. That problem doesn't exist for countries in Latin America, where historically English has been far less prevalent. The three Latam countries with the most English speakers (Mexico, Brazil, and Argentina) combined still only have 15% of the English-speaking population of India. That's about to change.

As I argued in an earlier posting, the future of transfer pricing is bright, in part because of the growth of intercompany services. According to the BEA, import of intercompany services from foreign affiliates of US MNEs tripled from 2013 to 2023 while the export of intercompany services grew over 50%. Translation technologies will only super-charge this trend.

One argument against all this is that AI will allow us to automate more and more of the tasks we are shipping off-shore. Take call centers, as an example, something that has been powering the off-shoring boom to the Philippines. AI is already impacting call center processes and it is still early days. Many of the chatbots still rely on pre-GenAI technologies and have been met with customer frustration. GenAI will improve all this and will significantly impact this industry, including lessening the need for off-shoring of certain tasks. But at the same time, the increased capabilities will enable other tasks to be off-shored. This includes many tasks currently carried out by tax departments and tax consultants in the US. No service industry or activity will be immune to this.

Tempting God

I have no idea how much more powerful AI will become in the next few years or whether we will reach superintelligence. But I am with Ethan Mollick who wrote a few days ago: "If AI development stopped this week we would have 5-10 years of absorbing the impact of current models on education, culture, healthcare, and business." We have so many use cases, especially in tax and transfer pricing, we can apply this to. Translation is only one of many use cases for GenAI. But it's a powerful one because it illustrates well how we can use this technology in a reliable and safe way to drive productivity in our business. More importantly it highlights the transformative power of this technology by enabling better and deeper global cooperation.

If the Bible is to be believed, God gave us all different languages and scattered us all over the Earth because humans were using the power of communication and cooperation for less than noble purposes. Some will say God did what he did not because he feared humanity but because he wanted to save us from ourselves. The Tower of Babel was a symbol of humanity's hubris and pride.

It obviously is true that technologies can be used for good and bad. GenAI is no different. But there's no turning back on this technology. It's on us to figure out how to use it in a productive way so as not to tempt whatever higher power you believe in.
