The Creative, Occasionally Messy World of Textual Data

For several years, the intersection of text and data stayed (more or less) within the realm of natural language processing (NLP): the wide range of machine learning tasks that leverage textual data for prediction, classification, and recommendation tools.

The rise of large language models has introduced a host of exciting new possibilities into the field, with novel use cases and innovative workflows popping up at a rapid clip. Our highlights this week represent a wide cross-section of concepts and approaches that dig deeper into this emerging area. From prompt engineering to text-to-image and text-to-speech applications, we’re thrilled to share work by authors who explore the creative possibilities of textual data as both inputs and outputs of these powerful models. Let’s dive in.

  • Lost in DALL-E 3 Translation. What happens when you use text-to-image tools like DALL-E 3 in languages other than English? Yennie Jun continues to explore the discrepancies in model performance for users working in under-resourced languages and the ways in which gender and other biases seep into the generated images.
  • How to Convert Any Text Into a Graph of Concepts. In his latest post, Rahul Nayak dives deep into the world of Knowledge-Graph Augmented Generation, walking us through the process of transforming a text corpus into a Graph of Concepts (GC) and then visualizing it to detect patterns and draw meaningful insights.
  • RAG: How to Talk to Your Data. We’ve covered retrieval-augmented generation many times in recent months, but Mariya Mansurova’s addition to the conversation is still very much worth your time: it presents a compelling, practical workflow for analyzing customer feedback using ChatGPT (a minimal sketch of the retrieval pattern follows this list).
  • FastSpeech: Paper Overview & Implementation. Text-to-speech tools have made major strides in recent years. To gain a solid understanding of how they work and how transformers are employed to improve their performance, don’t miss Essam Wisam’s accessible introduction to the FastSpeech paper from 2019, which facilitated much of the progress we’ve seen in this domain.
  • Unlocking the Power of Text Data with LLMs. If you’re a beginner who’d like to start experimenting with cutting-edge text-data techniques, Sofia Rosa’s step-by-step guide will get you rolling up your sleeves in no time. It walks us through an entire workflow, from downloading data to working with GPT-3 and analyzing results.
  • A Universal Roadmap for Prompt Engineering: The Contextual Scaffolds Framework (CSF). Prompt engineering has emerged as a crucial component in the interplay between human intuition and large language models’ capabilities. Giuseppe Scalamogna goes beyond basic prompting tips and tricks to present the contextual scaffolds framework (CSF), a “general purpose mental model for effective prompt engineering.”
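
For readers who want to see the retrieval-augmented pattern in miniature before diving into the articles above, here is a rough sketch: embed a few feedback snippets, retrieve the ones most similar to a question, and hand them to a chat model as context. The model names, the toy feedback strings, and the embed/answer helpers are illustrative assumptions, not code from any of the posts.

```python
# A minimal retrieval-augmented generation (RAG) loop over customer feedback.
# Model names and the toy feedback snippets are illustrative assumptions.
import numpy as np
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

feedback = [
    "The checkout page keeps timing out on mobile.",
    "Love the new dashboard, but exports are slow.",
    "Support resolved my billing issue within a day.",
]

def embed(texts):
    """Return one embedding vector per input string."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

doc_vectors = embed(feedback)

def answer(question, k=2):
    """Retrieve the k most similar snippets and ask the model to answer from them."""
    q_vec = embed([question])[0]
    # Cosine similarity between the question and every feedback snippet.
    sims = doc_vectors @ q_vec / (
        np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q_vec)
    )
    context = "\n".join(feedback[i] for i in sims.argsort()[::-1][:k])
    chat = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer using only the provided customer feedback."},
            {"role": "user", "content": f"Feedback:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return chat.choices[0].message.content

print(answer("What are users complaining about?"))
```

In a real workflow the retrieval step would run against a proper vector store rather than an in-memory array, but the shape of the loop (embed, retrieve, prompt) stays the same.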


We hope you have some time to branch out into other topics this week; here are some of our recent standouts on data visualization, generated-content detection, and more:


Thank you for supporting the work of our authors! If you enjoy the articles you read on TDS, consider becoming a Medium member; it unlocks our entire archive (and every other post on Medium, too).

Until the next Variable,

TDS Editors
