Sanskrit & AI: languages, Ambiguity and Efficiency

Amrith Krishna

AI Researcher and Entrepreneur | Alum at UniCambridge, ITU | PhD at IITKgp | AI Researcher | Youtuber - 100K+ Subs

Published: May 2, 2024

I can't even begin to express how blown away I am by the response to my very first article on Sanskrit & AI. Seriously, you guys rock! So, here I am, back with a follow-up, all thanks to a super intriguing comment from Harsh Raj.

People often claim that Sanskrit is the least ambiguous language. So can you tell me about its intermediate representation that happens in the hidden layers? How is it different than that of English for the same semantic sentence? I am really curious whether we can augment the other language NLP training with Sanskrit if it is less ambiguous.

Here, let's look at the premise of the comment: "Sanskrit is the least ambiguous language". Now, that raises the question: what are the most ambiguous natural languages? More importantly, how do we measure ambiguity and then rank these languages? The nerdy computer scientist in me would be tempted to go into the literature on undecidability and build upon the work on context-free grammars (CFGs) and their languages by Ginsburg and Ullian, where they show that determining ambiguity in such languages is undecidable.
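Undecidability is a statement about arbitrary grammars. For one fixed grammar and one fixed string, we can still count parse trees directly. Here is a minimal sketch of that idea, assuming a toy grammar E -> E + E | a. The grammar and the name count_parses are my own illustrative choices, not something taken from Ginsburg and Ullian.

```python
from functools import lru_cache


def count_parses(tokens):
    """Count parse trees of `tokens` under the toy grammar E -> E '+' E | 'a'."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of distinct ways to derive tokens[i:j] from E.
        if j - i == 1:
            return 1 if tokens[i] == "a" else 0
        total = 0
        # Try every '+' inside the span as the top-level operator.
        for k in range(i + 1, j - 1):
            if tokens[k] == "+":
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens)) if tokens else 0


for sentence in ("a+a", "a+a+a", "a+a+a+a"):
    print(sentence, "->", count_parses(sentence), "parse tree(s)")
```

A string with more than one parse tree is ambiguous under this grammar. Counting like this works only per string; it does not tell us whether some other grammar for the same language would avoid the ambiguity altogether, which is where the hard theory begins.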

Let's step back and take a wider view. Is it truly ideal for a language to be completely devoid of ambiguity? From where I'm standing, definitely not. I would say, a natural language's expressive power should enable humans to produce both ambiguous and unambiguous statements. For instance, would anyone actually enjoy spending an evening or a vacation leisurely reading legal documents or contracts? I highly doubt it. But why are these texts often so painstakingly pedantic to the point of being a bit dull or excessive? It all boils down to this: these documents are being optimised to be as unambiguous as possible. An alternative interpretation of a sentence in a contract would be costly for the stakeholders.

In social contexts, unambiguity need not be the only objective to optimise for. Efficiency also comes into the picture. We want our communication to be smooth, with minimal effort from both the sender and the receiver. Often, we rely on context to iron out any potential misunderstandings. But balancing efficiency and clarity isn't always a walk in the park. As we can imagine, optimising for both efficiency and unambiguity often forces trade-offs in the way we communicate. Languages evolve and incorporate various linguistic tools and techniques to optimise for both. For instance, most languages keep their common words short, including function words. However, not every such property is universal across languages. English, for instance, is a vocabulary-heavy language that maintains a large inventory of static words. Languages like Sanskrit (or German), by contrast, rely more on productivity, the generative process of building new words from existing ones. For instance, the following is a single "word" in Sanskrit, or rather a compound word:

pravaramukuṭamaṇimarīcimañjarīcayacarcitacaraṇayugala

It is taken from the Panchatantra and means, "O! the one whose pair of feet is covered by the cluster of brilliant rays from the gems of the best crowns." It is created by combining 9 simple word stems to form a single compound word:

pravara-mukuṭa-maṇi-marīci-mañjarī-caya-carcita-caraṇa-yugala
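To make the stem-by-stem construction concrete, here is a small sketch that builds the compound from its nine stems and then recovers the split by matching against a stem list. It rests on a big simplifying assumption: it treats the compound as plain concatenation and ignores sandhi, the sound changes at word boundaries that make real Sanskrit segmentation much harder. The function names and the lexicon are hypothetical.

```python
import unicodedata

# The nine stems of the compound, in IAST transliteration.
STEMS = ["pravara", "mukuṭa", "maṇi", "marīci", "mañjarī",
         "caya", "carcita", "caraṇa", "yugala"]


def nfc(text):
    # Normalise so that accented letters compare equal however they were typed.
    return unicodedata.normalize("NFC", text)


def segmentations(word, lexicon):
    """Return every way to write `word` as a concatenation of lexicon entries."""
    word = nfc(word)
    lexicon = [nfc(stem) for stem in lexicon]
    results = []

    def extend(pos, parts):
        if pos == len(word):
            results.append(list(parts))
            return
        for stem in lexicon:
            if word.startswith(stem, pos):
                parts.append(stem)
                extend(pos + len(stem), parts)
                parts.pop()

    extend(0, [])
    return results


compound = "".join(STEMS)
splits = segmentations(compound, STEMS)
print(len(STEMS), "stems ->", compound)
print(len(splits), "segmentation(s):", "-".join(splits[0]))
```

With only these nine stems in the lexicon there is exactly one way to split the word. A larger lexicon can admit several competing splits for the same string, which is one place where ambiguity creeps back in.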

Sanskrit relies on productivity rather than on maintaining a large inventory of words. Now, which approach is more efficient? Which approach makes a language less ambiguous? It is quite difficult to say. English and Sanskrit use two different linguistic tools to achieve quite similar outcomes in communication. Similarly, take sentence structure. English leans on word order to convey meaning, while Sanskrit relies on its morphology. Hence one finds sentences with (seemingly) arbitrary, free word order in Sanskrit, especially in the literature of the classical era. Again, both languages tackle the efficiency-ambiguity trade-off in their own ways.
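The word order versus morphology contrast can be sketched in a few lines. The example sentence (rāmaḥ rāvaṇam hanti, "Rama kills Ravana"), the three-entry lexicon and the role labels are my own toy assumptions for illustration; a real analyser would need full morphological analysis.

```python
from itertools import permutations

# Hypothetical mini-lexicon: each inflected form announces its own role.
FORMS = {
    "rāmaḥ": ("Rama", "agent"),        # nominative singular
    "rāvaṇam": ("Ravana", "patient"),  # accusative singular
    "hanti": ("kills", "action"),      # third person singular verb
}


def roles(sentence):
    """Assign roles from the word forms alone, ignoring word positions."""
    return {FORMS[word][1]: FORMS[word][0] for word in sentence.split()}


def english_roles(sentence):
    """English-style reading of a three-word sentence: position decides the role."""
    subject, verb, obj = sentence.split()
    return {"agent": subject, "action": verb, "patient": obj}


orders = [" ".join(p) for p in permutations(FORMS)]
readings = {tuple(sorted(roles(s).items())) for s in orders}
print(len(orders), "word orders,", len(readings), "distinct reading(s)")

print(english_roles("Rama kills Ravana"))
print(english_roles("Ravana kills Rama"))
```

All six orderings of the Sanskrit words yield the same reading because the case endings carry the roles. Swapping the two nouns in the English sentence changes who does what to whom.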

Finally, let me showcase a few scenarios where ambiguity is not only desirable but celebrated.

Sri Raghava Yadhaveeyam is a 30-stanza "bidirectional" poem in Sanskrit, which narrates the story of Rāma when read forwards and, when read backwards, plunges into a story from Krishna's life. It is a display of linguistic and prosodic mastery. Similarly, Avadhanam is a literary improv performance that demands mastery of various cognitive capabilities, including observation, memory, multitasking, task switching, retrieval, reasoning and creativity. It is nothing short of mental gymnastics. It was a popular form of entertainment, performed in various Indian languages. Please refer to one such performance in the video given below.

Now, let's cap things off with one last example of celebrating ambiguity. Consider the following conversation between Sri Krishna and Satyabhama, his wife:

(Source)

अङ्गुल्या कः कवाटं प्रहरति कुटिले माधवः किं वसन्तो नो चक्री किं कुलालो न हि धरणिधरः किं द्विजिह्वः फणीन्द्रः। नाहं घोराहिमर्दी किमसि खगपतिर्नो हरिः किं कपीन्द्रः इत्येवं सत्यभामा प्रतिवचनजितः पातु वश्चक्रपाणिः॥

According to this shloka, Lord Krishna visits his wife Satyabhama when she is upset. Finding the door closed, he knocks. Pretending to not know, Satyabhama asks her aide Vishikha to check who it is. Krishna introduces himself with his name but Satyabhama finds another meaning for the word. Krishna starts describing himself with other words but each time Satyabhama teases him by finding the other meanings of the words.

The conversation goes this way:-

Satyabhama :- अङ्गुल्या कः कवाटं प्रहरति कुटिले? (O Vishikha, who knocks on the door?)
Krishna :- माधवः (I am Madhava) | Satyabhama :- किं वसन्तो? (Is it the spring season?)
Krishna :- नो चक्री (I am Chakri, the holder of a disc) | Satyabhama :- किं कुलालो? (A potter then?)
Krishna :- न हि धरणिधरः (No, I am the one who holds the Earth) | Satyabhama :- किं द्विजिह्वः फणीन्द्रः? (Is it Adi Shesha, the serpent king, who carries the earth on his head?)
Krishna :- नाहं घोराहिमर्दी (No. I am the one who suppressed the poisonous snake Kaliya) | Satyabhama :- किमसि खगपतिः? (Is it Garuda, the King of Birds?)
Krishna :- नो हरिः (No, I am Hari) | Satyabhama :- किं कपीन्द्रः? (Is it a monkey?)
Finally the poet says, इत्येवं सत्यभामा प्रतिवचनजितः पातु वश्चक्रपाणिः (May Lord Krishna, thus defeated by Satyabhama in a wordplay, protect you)

I hope I've covered most of the stuff Harsh brought up here. But there are still some AI/NLP bits in his comment that need addressing. Let's save that for next week. I've got two keywords to tease you with until then: Behaviourism and Cognitive processes.

Amrith Krishna

AI Researcher and Entrepreneur | Alum at UniCambridge, ITU | PhD at IITKgp | AI Researcher | Youtuber - 100K+ Subs

10 months ago

Sanskrit & AI Part 3: https://www.dhirubhai.net/pulse/sanskrit-ai-part-3-aint-thing-free-word-order-really-amrith-krishna-ggcfc

Kavya Manohar

Building AI for Justice Systems | PhD in Speech Technology | Language Technology | Research | Scientific Writing

10 months ago

Amrith Krishna, Loved how this article covered various aspects of the aesthetics of linguistic ambiguity. Thanks for introducing Sri Raghava Yadhaveeyam.

Nishant Jha

IEEE Member (Student) | Master's Student (Data Science) @ UNSW Sydney | Artificial Intelligence, Blockchain & Cryptography Researcher | Womanium Quantum Scholar' 23 | ACM ICPC' 23 (South Pacific) Regionalist

10 months ago

Hello Mr. Krishna, I'm currently working on the same topic, specifically on CFG design for Sanskrit languages and compilers, and I would be glad to get any resources or input from your end. Thanks.

1 reply

Raviraja Bhat

Data Science Engineer - AVP @ Swiss Re | AI/ML Specialist | LLM and NLP Expert

10 months ago

Amrith Krishna Thanks for these detailed insights. I am really curious to know your thoughts on using Vedic chanting recitation styles (Samhita, Pada, Krama, Jata, Maalaa, Sikha, Rekha, Dhwaja, Danda, Rathaa, Ghana) to enhance our approach to language understanding tasks.

1 reply
