My Early Days in R&D: Speech Synthesis and NLP

The Second Work 2000-2002:

My Early Days in R&D: Experimenting with Speech Synthesis and NLP

Employer:

Tata CSRL (Cognitive Systems Research Lab, now Tata Innovation Lab) along with research co-ordination with TIFR (Tata Institute of Fundamental Research)

The Love:

I absolutely cherished my time at CSRL! The cozy, intimate office in a charming small building was filled with incredibly supportive colleagues who felt like family. And the icing on the cake? The delightful trips to TIFR, exploring the most breathtaking places in Mumbai, and savoring the delights of the canteen: a true paradise for the taste buds!

The Reason:

As some of you may recall from my previous article, I embarked on a fascinating journey with D2K, and while doing so I used the Shusha.ttf font to translate various forms and reports into Hindi. Along the way, I taught myself how to enhance CSD forms and reports with some nifty features. I showed that work to my former manager, and she saw me as an ideal candidate for R&D projects.

The Work:

My initial R&D work was to learn the ins and outs of Klatt's text-to-speech system and how phonemes (the smallest units of sound in a language) work for Indian languages, especially Hindi and Sanskrit. Luckily, thanks to my Hindi-medium schooling and Sanskrit language studies, I was uniquely equipped for this task. Who knew my high school subjects would one day prove so useful in the wacky world of research and development? With all those Hindi essays and all that Sanskrit grammar, word formation, and phoneme relationships behind me, I was ready to dive into the English-to-Hindi phonemic waters of speech synthesis. It just goes to show you never know when your school studies might provide the perfect foundation for experimental R&D work! Though I have to admit that conjugating verbal roots in Sanskrit class was a lot more straightforward than figuring out how to code the sound of 'Mera Bharat Ek Mahaan Desh Hai'. I heard that line in so many computer-generated variations :). Did you know that the legendary Stephen Hawking kept the same synthesized voice for decades? That voice came from a synthesizer built on Klatt's speech synthesis work! So, if you've ever wondered how TTS sounded in those days, that is a good reference point :)

Key Learnings:

This early R&D work taught me the research process, the value of collaboration, and how to have fun tinkering with computer science and math. I got hands-on experience coding modifications to Klatt's TTS, especially tuning the filters and parameters in C++ to better generate Indian language sounds. Shout out to Prof. PVS Rao for his guidance, my manager Akhilesh Srivastava for recognizing my skills, and my colleague Baiju Mathew for making this project so enjoyable. In the end, I gained invaluable research and development experience that set me up for future success. I was even part of a research paper, a first of its kind for my resume :), and I presented our project and solution at NASSCOM 2000.
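For readers wondering what "tuning the filters and parameters" can look like in code, here is a minimal C++ sketch of the second-order digital resonator that Klatt-style formant synthesizers chain together, one per formant. It is an illustration only, not the CSRL code: the names Resonator and synthesizeVowel, the impulse-train voicing source, and any formant values you feed it are assumptions made for this example.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Second-order digital resonator, one per formant in a Klatt-style cascade:
//   y[n] = A*x[n] + B*y[n-1] + C*y[n-2]
struct Resonator {
    double a, b, c;   // filter coefficients
    double y1 = 0.0;  // previous output y[n-1]
    double y2 = 0.0;  // output before that, y[n-2]

    Resonator(double freqHz, double bandwidthHz, double sampleRateHz) {
        assert(sampleRateHz > 0.0 && freqHz >= 0.0 && bandwidthHz > 0.0);
        const double pi = std::acos(-1.0);
        const double t = 1.0 / sampleRateHz;  // sampling period
        c = -std::exp(-2.0 * pi * bandwidthHz * t);
        b = 2.0 * std::exp(-pi * bandwidthHz * t) * std::cos(2.0 * pi * freqHz * t);
        a = 1.0 - b - c;  // normalizes the gain to 1 at 0 Hz
    }

    double process(double x) {
        const double y = a * x + b * y1 + c * y2;
        y2 = y1;
        y1 = y;
        return y;
    }
};

// Hypothetical helper: drives a cascade of formant resonators with an
// impulse train, a very crude stand-in for a real voicing source.
std::vector<double> synthesizeVowel(std::vector<Resonator> formants,
                                    double pitchHz, double sampleRateHz,
                                    int numSamples) {
    assert(pitchHz > 0.0 && numSamples >= 0);
    const int period = static_cast<int>(sampleRateHz / pitchHz);
    assert(period > 0);
    std::vector<double> out;
    out.reserve(numSamples);
    for (int n = 0; n < numSamples; ++n) {
        double sample = (n % period == 0) ? 1.0 : 0.0;  // one pulse per pitch period
        for (Resonator& r : formants) {
            sample = r.process(sample);  // output of one formant feeds the next
        }
        out.push_back(sample);
    }
    return out;
}
```

Calling synthesizeVowel with three resonators at rough, illustrative formant frequencies (say 700, 1100, and 2500 Hz at a 10 kHz sample rate) gives a buzzy vowel-like waveform. Adjusting a sound for a particular language then comes down to changing the frequency and bandwidth passed to each Resonator.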

The Second Project:

NLP (Natural Language Processing) was another component of this project, and it came to me because my colleague was leaving to pursue a Master's degree in the US. These were still very early days for machine learning, pattern recognition, word dictionaries, and brain modeling, the area we now know as neural networks, with models like CNNs and RNNs. At the time, NLP felt like an infant field, and we were experimenting with foundational concepts like parsing sentences, categorizing words, and modeling language statistically. My colleague's departure meant I had to pick up and continue their NLP work, giving me firsthand experience with these nascent technologies. It was exciting to be involved in NLP at such an early stage, right as the field was starting to take shape.
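To give a flavor of what "modeling language statistically" means, here is a minimal C++ sketch of a bigram model, which estimates how likely one word is to follow another by counting word pairs. It is a generic textbook technique shown for illustration, not the code we worked on: the class name BigramModel, the selfCheck function, the sentence markers, and the toy transliterated sentences are all assumptions.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <utility>

// Bigram model: estimates P(next | prev) from counts of adjacent word pairs.
class BigramModel {
public:
    // Splits the sentence on whitespace and counts each adjacent word pair,
    // including a start marker "<s>" and an end marker "</s>".
    void train(const std::string& sentence) {
        std::istringstream in(sentence);
        std::string prev = "<s>";
        std::string word;
        while (in >> word) {
            ++pairCounts_[std::make_pair(prev, word)];
            ++contextCounts_[prev];
            prev = word;
        }
        ++pairCounts_[std::make_pair(prev, std::string("</s>"))];
        ++contextCounts_[prev];
    }

    // Maximum-likelihood estimate: count(prev, next) / count(prev).
    // Returns 0 for anything never seen in training (no smoothing).
    double probability(const std::string& prev, const std::string& next) const {
        const auto ctx = contextCounts_.find(prev);
        if (ctx == contextCounts_.end()) return 0.0;
        const auto pair = pairCounts_.find(std::make_pair(prev, next));
        if (pair == pairCounts_.end()) return 0.0;
        return static_cast<double>(pair->second) / ctx->second;
    }

private:
    std::map<std::pair<std::string, std::string>, int> pairCounts_;
    std::map<std::string, int> contextCounts_;
};

// Toy usage: two short sentences, then a couple of sanity checks.
void selfCheck() {
    BigramModel model;
    model.train("mera bharat mahaan hai");
    model.train("mera desh mahaan hai");
    // "mera" is followed by "bharat" in one of its two occurrences.
    assert(model.probability("mera", "bharat") == 0.5);
    // "hai" never follows "mera" in this corpus.
    assert(model.probability("mera", "hai") == 0.0);
}
```

A real system would add smoothing so that unseen word pairs do not get a probability of exactly zero, and would train on far more text than two sentences.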

Things have really changed in the world of NLP and AI. Back then we were building foundational linguistic models from scratch, but now we have powerful pre-trained Transformer models such as GPT readily available in libraries to use, refine, and extend. With the advances in attention mechanisms, open-source frameworks, cloud computing, and dataset availability, NLP has become an exciting field focused on leveraging massive models and computational power to understand and generate language. Gone are the days of coding basic statistical algorithms by hand - now we can tap into the knowledge contained in gigantic models pre-trained on vast amounts of text data. There's an incredible amount of momentum in areas like natural language understanding, language generation, and multimodal AI. The progress over the past decades has been astounding, and I'm eager to see what the coming years of AI research will uncover. Still, it makes me fondly recall those early pioneering days working on fundamental NLP challenges from the ground up!

That early research experience sparked a lifelong passion for innovation. Even now, the spirit of curiosity that drove my initial R&D work has never left me. I'm still a tinkerer at heart - whenever I get the chance, I love to dabble in small research "Kandi" projects just for the fun of learning something new. Though my career has taken me in various directions, I'll always be that eager scientist from my early days discovering the joys of linguistic analysis and programming synthesis. The research bug bit me hard back then, and I'll never lose that hands-on enthusiasm for delving into the unknown and pushing the boundaries of what technology can do. Those R&D skills I cultivated have become core to who I am.

I hope you enjoyed getting a glimpse into my early days of research and development work! If you made it this far, thank you for reading - I appreciate you taking the time to learn about my experiences. Feel free to leave a comment sharing your own learnings, or let me know if you'd be interested in hearing about more of my career journey. I always enjoy connecting with readers, swapping stories, and discussing the fascinating world of technology. Until next time, be well and keep following your own curiosity wherever it leads. The path of lifelong learning is an adventure!

More articles by Om Baghel