Haunted By Data Readiness


Welcome to Granica's October 2024 issue of The Data Foundation, where we'll cover key topics that show data readiness for AI doesn't have to be so scary!

1. Granica Screen now available on AWS Marketplace

We’re excited to announce that Screen, our Data Safety solution for AI and analytics, is now available on AWS Marketplace! This listing lets enterprise customers procure Screen through their existing AWS commitments, simplifying purchasing and keeping it aligned with their cloud budgets. It follows our recent listing on Google Cloud Marketplace, further extending Granica’s reach on the top cloud marketplaces.

In today’s landscape, deploying AI safely, ethically, and in compliance with regulations is more critical than ever. Screen addresses a key gap for data leaders by enabling smarter data management that drives real business impact, not just theoretical value.

Key highlights of Screen:

  • State-of-the-art detection accuracy: Unmatched precision and recall for both structured data and free text ensure no exposures or vulnerabilities go unnoticed in S3 buckets or lakehouse tables
  • Data detoxification: Filters out toxicity and bias, cleansing training datasets of harmful content such as hate, violence, and profanity
  • Synthetic data generation: Produces high-quality synthetic data to augment existing datasets, minimizing the risks of sensitive information leaks
  • LLM runtime protection: Dynamically masks sensitive and toxic content in LLM prompts and RAG-based applications, safeguarding both model inputs and outputs

"We're excited to join AWS Marketplace, making it easier for enterprises to seamlessly access our products and prioritize data safety while building and deploying AI" - Granica Founder and CEO, Rahul Ponnala

2. Granica makes finalist list for AI Awards

Following up on our previous edition, we’re thrilled to share that Granica has advanced to the finalist round in the 2024 AI Awards, where we’re competing for AI Solution of the Year and AI Startup of the Year, along with recognition in other categories including:

  • Best Use of AI in Manufacturing
  • Best Use of AI in Finance
  • Best Use of AI in Retail and eCommerce

Hosted by the Cloud Awards, the AI Awards celebrate breakthroughs in AI and machine learning across industries. This recognition highlights our pioneering efforts in data and AI, and our commitment to advancing AI safety, efficiency, and effectiveness. Winners will be revealed later this month, and we’ll share updates in the next edition. Stay tuned!

3. Scaling laws for learning with real and surrogate data

Our world-class research team continues its string of peer-recognized research with the acceptance of an additional paper to NeurIPS 2024! The paper explores how surrogate data, sourced from different contexts or generated by models, offers a promising way to overcome the shortage of high-quality data in machine learning. Here’s a brief summary:

  • Data Collection Challenges: Gathering large amounts of high-quality data for machine learning can often be costly or impractical.
  • Surrogate Data: Instead of relying solely on original data, you can supplement a small set of target data with data from easier-to-access sources, such as GenAI models or data collected in different contexts. This is referred to as “surrogate data”.
  • Weighted Approach: Our researchers propose a method using weighted empirical risk minimization (ERM) to effectively integrate surrogate data into the training process.

Key Findings

  1. Test Error Reduction: Using surrogate data can significantly lower the model’s test error, even when the surrogate data isn’t directly related to the original data.
  2. Optimal Weighting: Properly weighting the surrogate data is essential to fully benefit from it.
  3. Scaling Law: A mathematical relationship, or scaling law, helps predict the best way to mix real and surrogate data for optimal performance, allowing you to choose the right amount of surrogate data to include.

This method has been mathematically analyzed and tested across various datasets, empirically showing its potential to improve machine learning outcomes. Here’s an arXiv link where you can read the paper in full.
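To make the weighted approach concrete, here is a minimal sketch of weighted ERM on a toy one-parameter regression. It assumes the objective is a convex combination of the average loss on real data and the average loss on surrogate data, with a weight alpha between 0 and 1; the exact formulation in the paper may differ. The function names (weighted_erm_slope, pick_alpha), the squared loss, and the grid search over alpha are illustrative assumptions, not the researchers' implementation. In particular, the grid search on held-out real data is a simple stand-in for the scaling law the paper uses to predict the best mix.

```python
def weighted_erm_slope(real, surrogate, alpha):
    """Fit y ~ w * x by weighted empirical risk minimization.

    Minimizes (1 - alpha) * mean squared error on `real`
    plus alpha * mean squared error on `surrogate`.
    Each dataset is a list of (x, y) pairs. Hypothetical helper.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if not real or not surrogate:
        raise ValueError("both datasets must be non-empty")
    num = 0.0
    den = 0.0
    # Per-sample weights: each dataset's total weight is split evenly
    # across its samples, so dataset size does not change the balance.
    weighted_sets = (
        (real, (1.0 - alpha) / len(real)),
        (surrogate, alpha / len(surrogate)),
    )
    for data, weight in weighted_sets:
        for x, y in data:
            num += weight * x * y
            den += weight * x * x
    if den == 0.0:
        raise ValueError("inputs carry no signal (all weighted x are zero)")
    # Closed-form minimizer of the weighted squared loss.
    return num / den


def mean_squared_error(w, data):
    """Average squared error of the model y = w * x on `data`."""
    return sum((y - w * x) ** 2 for x, y in data) / len(data)


def pick_alpha(real_train, surrogate, real_val, grid):
    """Pick the alpha in `grid` with the lowest error on held-out real data.

    Assumption: a plain grid search, used here in place of the paper's
    scaling law.
    """
    return min(
        grid,
        key=lambda a: mean_squared_error(
            weighted_erm_slope(real_train, surrogate, a), real_val
        ),
    )


if __name__ == "__main__":
    real_train = [(1.0, 2.0), (2.0, 4.0)]   # toy real data, slope 2
    surrogate = [(1.0, 3.0), (2.0, 6.0)]    # toy surrogate data, slope 3
    real_val = [(3.0, 6.0)]                 # held-out real data
    print(weighted_erm_slope(real_train, surrogate, 0.5))
    print(pick_alpha(real_train, surrogate, real_val, [0.0, 0.5, 1.0]))
```

Setting alpha to 0 ignores the surrogate data entirely, and setting it to 1 ignores the real data. The values in between are where the weighting matters.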

4. What makes data “AI-Ready”?

Is your organization truly prepared for AI? Data readiness is a critical component of AI success and many enterprises struggle to align their data strategy with AI initiatives. According to Gartner, “Organizations that fail to realize the vast differences between AI-ready data requirements and traditional data management will endanger the success of their AI efforts.”

Learn how Gartner defines AI-ready data, the importance of aligning data contextually with AI use case requirements, and the role of data and AI safety and governance. Download the Gartner report “What Makes Data AI-Ready?” here, compliments of Granica.

5. AI gets nuclear

AWS, Microsoft, and Google are turning to nuclear power to build and operate mega data centers equipped to meet the increasingly hefty demands of generative AI. Many of the deals under discussion involve either buying energy from existing nuclear power providers or deploying small modular reactors (SMRs). Here’s why nuclear is so interesting for big tech:

  1. Constant, reliable power supply: Unlike solar or wind, nuclear energy provides a continuous and stable power source, ensuring AI data centers operate without disruptions or downtime
  2. High energy density: Nuclear reactors, especially SMRs, produce a massive amount of energy from a small footprint, making them ideal for powering energy-hungry data centers while using less land and fewer resources than renewables
  3. Lower carbon footprint: With AI’s growing environmental impact, nuclear power offers a cleaner alternative to fossil fuels, helping companies meet sustainability goals and reduce greenhouse emissions
  4. Scalability for future growth: As AI demands accelerate, nuclear energy, particularly from SMRs, offers a scalable solution that can grow alongside increasing computational requirements
  5. Energy security: As geopolitical tensions and supply chain issues affect global energy markets, nuclear energy provides a more secure and self-sufficient energy source, reducing reliance on fluctuating natural gas or oil supplies

How data centers and the energy sector can sate AI’s hunger for power, McKinsey

6. Industry news

OpenAI closes $6.6 billion in funding

  • The funding values the ChatGPT maker at $157 billion
  • The company is on pace to generate $3.6 billion in revenue this year, with losses totaling $5.0 billion

The Modernizing Data Practices to Improve Government Act (S.5109)

  • This bipartisan bill seeks to continue a dedicated Chief Data Officers (CDO) Council
  • The bill would require the CDO Council to focus on improving data management practices related to AI in emerging technologies, ensuring the government is AI-ready

Deloitte releases annual State of Ethics and Trust in Technology report

  • Over half (54%) of professionals surveyed believe technologies like GenAI pose the highest ethical risk compared to other emerging technologies


We hope you enjoyed this edition of The Data Foundation.

We're hiring and on the hunt for top talent in AI research, ML engineering, and data engineering. Check out our Careers page and apply today!

Curious to see our products in action? Sign up for a demo below.

