Data Phoenix Digest - ISSUE 8.2023
Dmytro Spodarets
DevOps Architect @ Grid Dynamics | Founder of Data Phoenix - The voice of AI and Data industry
Hey folks,
Welcome to this week's edition of Data Phoenix Digest! This newsletter keeps you up to date on what's happening in our community and summarizes the top research papers, articles, and news, so you can stay on top of trends in the Data & AI world!
Be active in our community and join our Slack to discuss the latest news of our community, top research papers, articles, events, jobs, and more...
Want to promote your company, conference, job, or event to the Data Phoenix community of Data & AI researchers and engineers? Click here for details.
Data Phoenix community news
Upcoming webinars:
Video recordings of past events:
Don't miss out! Subscribe to our YouTube channel now and be the first to receive notifications about video recordings of past events and other valuable content to help you stay ahead!
Summary of the top articles, papers, and courses
Articles
Describe For Me is a website that helps the visually impaired understand images through image captioning, facial recognition, and text-to-speech, a combination the authors refer to as "Image to Speech." This blog post walks you through the solution architecture behind "Describe For Me" and the design considerations that shaped it.
This article explores how to evaluate your AI or ML model, whatever its type and end application. It walks through loading the Wine dataset from sklearn, training a support vector classifier (SVC), and then inspecting the model's metrics. Check it out!
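For readers who want to try this workflow right away, here is a minimal sketch assuming scikit-learn's bundled Wine dataset and a standard SVC; the exact splits and metrics used in the article may differ.

```python
# Minimal sketch: train an SVC on the Wine dataset and report standard metrics.
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report

# Load the Wine dataset and split into training and test sets.
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Scale features and fit the support vector classifier in one pipeline.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
model.fit(X_train, y_train)

# Evaluate with standard classification metrics.
y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))
```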
End-to-end machine learning pipelines can save engineers a lot of precious time and resources and let them focus on deploying new models rather than maintaining existing ones. In this article, you will learn how to quickly build and deploy an end-to-end ML pipeline with Kubeflow Pipelines on AWS.
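As a rough illustration of what such a pipeline looks like in code, here is a hedged sketch using the Kubeflow Pipelines (kfp) v2 SDK; the component names, bodies, and AWS deployment details are placeholders, not the article's actual pipeline.

```python
# Hedged sketch of a two-step pipeline with the kfp v2 SDK.
from kfp import dsl, compiler

@dsl.component(base_image="python:3.10")
def preprocess(message: str) -> str:
    # Placeholder preprocessing step.
    return message.upper()

@dsl.component(base_image="python:3.10")
def train(data: str) -> str:
    # Placeholder training step.
    return f"model trained on: {data}"

@dsl.pipeline(name="demo-e2e-pipeline")
def demo_pipeline(message: str = "raw data"):
    prep_task = preprocess(message=message)
    train(data=prep_task.output)

if __name__ == "__main__":
    # Compile to a YAML spec that can be uploaded to a Kubeflow Pipelines
    # deployment, e.g. one running on AWS/EKS.
    compiler.Compiler().compile(demo_pipeline, "demo_pipeline.yaml")
```

The compiled YAML is then uploaded to a Kubeflow Pipelines instance, which is where the AWS-specific setup from the article comes in.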
In PyTorch, there are many activation functions available for use in your deep learning models. In this post, you will see how the choice of activation functions can impact the model. Take a deep dive into how activation functions work and how to use them.
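To make the idea concrete, here is a small, hedged example of swapping activation functions in a PyTorch MLP; the architecture and data are toy placeholders, not the post's exact setup.

```python
# Compare the effect of different activation functions in a tiny PyTorch MLP.
import torch
import torch.nn as nn

def make_mlp(activation: nn.Module) -> nn.Sequential:
    # Two-layer MLP whose hidden nonlinearity is passed in as an argument.
    return nn.Sequential(
        nn.Linear(10, 32),
        activation,
        nn.Linear(32, 1),
    )

x = torch.randn(4, 10)  # dummy batch: 4 samples, 10 features
for act in (nn.ReLU(), nn.Tanh(), nn.Sigmoid(), nn.LeakyReLU(0.1)):
    model = make_mlp(act)
    out = model(x)
    print(f"{act.__class__.__name__:10s} -> output mean {out.mean().item():+.4f}")
```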
This tutorial walks through the steps of using Comet to monitor a time-series forecasting model. The author explains how to carry out exploratory data analysis (EDA) on the dataset and then log the resulting visualizations to the Comet experimentation platform.
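As an illustration of the logging pattern, here is a hedged sketch using the comet_ml SDK; the API key, project name, and data are placeholders rather than the tutorial's actual setup.

```python
# Hedged sketch: log an EDA plot and a per-step metric to Comet.
from comet_ml import Experiment
import numpy as np
import matplotlib.pyplot as plt

experiment = Experiment(api_key="YOUR_API_KEY", project_name="ts-forecasting-demo")

# Toy time series standing in for the tutorial's dataset.
t = np.arange(200)
series = np.sin(t / 10.0) + 0.1 * np.random.randn(200)

# Log an EDA visualization.
fig, ax = plt.subplots()
ax.plot(t, series)
ax.set_title("Raw series")
experiment.log_figure(figure_name="raw_series", figure=fig)

# Log a simple rolling error metric step by step.
for step in range(50, 200, 50):
    naive_forecast = series[step - 1]
    mae = float(np.mean(np.abs(series[step:step + 10] - naive_forecast)))
    experiment.log_metric("rolling_mae", mae, step=step)

experiment.end()
```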
Papers & projects
This paper introduces DragGAN, a new way of controlling GANs in which users "drag" any points of an image to target points interactively. It can deform an image with precise control over where pixels go, manipulating the pose, shape, expression, and layout of diverse object categories.
Diff-Pruning is a compression method tailored for learning lightweight diffusion models from pre-existing ones, without extensive re-training. The essence of Diff-Pruning is encapsulated in a Taylor expansion over pruned timesteps, a process that disregards non-contributory diffusion steps and ensembles informative gradients to identify important weights.
BloombergGPT is a 50 billion parameter language model trained on a wide range of financial data. It is validated on standard LLM benchmarks, open financial benchmarks, and a suite of internal benchmarks that most accurately reflect Bloomberg's intended usage.
HuaTuo is a LLaMA-based model that has been supervised fine-tuned on generated Q&A instances. The experimental results demonstrate that HuaTuo generates responses with more reliable medical knowledge.
SAM is a promptable segmentation system with zero-shot generalization to unfamiliar objects and images, without the need for additional training. The model was trained on Meta AI’s SA-1B dataset for 3-5 days on 256 A100 GPUs. Make sure that you try it!
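If you want a quick start, here is a hedged sketch of prompting SAM with a single foreground point via the segment-anything repo's SamPredictor interface; the checkpoint and image paths are placeholders.

```python
# Hedged sketch: point-prompted segmentation with the segment-anything library.
import cv2
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

# Load a pretrained SAM checkpoint (downloaded separately from the repo).
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_checkpoint.pth")
predictor = SamPredictor(sam)

# Read an image as RGB and compute its embedding once.
image = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# Prompt with one foreground point; SAM returns candidate masks with scores.
point = np.array([[500, 375]])
label = np.array([1])  # 1 = foreground point
masks, scores, logits = predictor.predict(
    point_coords=point, point_labels=label, multimask_output=True
)
print(masks.shape, scores)
```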
Courses
Building a DL model is a complex task. The aim of this course is to start from a simple DL model implemented in a notebook, and port it to a ‘reproducible’ world by including code versioning, data versioning, experiment logging, hyper-parameter tuning, etc.
This course is designed to introduce you to several dimensions of Responsible AI with a focus on fairness criteria and bias mitigation. In 30 short videos, you will learn about different fairness criteria, bias measurements, and bias mitigation techniques.
If you enjoy our work, we would greatly appreciate your support by sharing our digest with your friends on Twitter, LinkedIn, or Facebook using the hashtag #dataphoenix. Your help in reaching a wider audience is invaluable to us!