Interesting Content in AI, Software, Business, and Tech - 6/7/2023
Devansh
Chocolate Milk Cult Leader | Machine Learning Engineer | Writer | AI Researcher | Computational Math, Data Science, Software Engineering, Computer Science
A lot of people reach out to me for reading recommendations. I figured I'd start sharing whatever AI papers/publications, interesting books, videos, etc. I came across each week. Some will be technical, others not really. I will add whatever content I found really informative (and remembered throughout the week). These won't always be the most recent publications- just the ones I'm paying attention to this week. Without further ado, here are interesting readings/viewings for 6/7/2023. If you missed last week's readings, you can find them here.
AI Papers/Writeups
0) Replacing Jobs with AI is not about competency, it's about power
Princeton professor and AI hype critic Arvind Narayanan had a great Twitter thread on how replacing jobs with AI is as much about political lobbying as it is about the job itself. Jobs with strong lobbying bases will be able to ride out the wave, while lower-paying jobs will be replaced because they lack that kind of base and are politically easier to target (even if the AI that comes in isn't that good).
Fav quote- The people firing workers and trying to replace them with AI don't even seem to particularly care if AI is any good at the job. So I think the question of which job categories will be affected is primarily a question about power, not the tech itself.
1) The importance of humanizing AI: using a behavioral lens to bridge the gaps between humans and machines
Abstract- One of the biggest challenges in Artificial Intelligence (AI) development and application is the lack of consideration for human enhancement as a cornerstone for its operationalization. Nor is there a universally accepted approach that guides best practices in this field. However, the behavioral science field offers suggestions on how to develop a sustainable and enriching relationship between humans and intelligent machines. This paper provides a three-level (micro, meso and macro) framework on how to humanize AI with the intention of enhancing human properties and experiences. It argues that humanizing AI will help make intelligent machines not just more efficient but will also make their application more ethical and human-centric. Suggestions to policymakers, organizations, and developers are made on how to implement this framework to fix existing issues in AI and create a more symbiotic relationship between humans and machines moving into the future.
Credit- Jasmine C. Bridges, M.S., MBA for this recommendation.
2) AI isn’t ‘hallucinating.’ We are.
An interesting take on who bears responsibility for AI models and their outputs. I'd want to have more of a conversation with Charley (the author) before I tell you whether I fully agree with the arguments, but the article is interesting enough that I think people should read it.
A quote I loved-
We need to accept that the complexity of the engineer-user-tech interaction does not absolve everyone from responsibility for error and harm. As Nissenbaum writes, “Instead of identifying a single individual whose faulty actions caused the injuries, we find we must systematically unravel a messy web of interrelated causes and decisions.” Right, we can start to disentangle these systems and assign partial blame and accountability to the appropriate stakeholder. ... But the first step along this path is to close the gap between our own expectations of what LLMs can do, and what they’re actually doing. They aren’t hallucinating — we are.
Author- Charles Johnson
3) FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance
Abstract- There is a rapidly growing number of large language models (LLMs) that users can query for a fee. We review the cost associated with querying popular LLM APIs, e.g. GPT-4, ChatGPT, J1-Jumbo, and find that these models have heterogeneous pricing structures, with fees that can differ by two orders of magnitude. In particular, using LLMs on large collections of queries and text can be expensive. Motivated by this, we outline and discuss three types of strategies that users can exploit to reduce the inference cost associated with using LLMs: 1) prompt adaptation, 2) LLM approximation, and 3) LLM cascade. As an example, we propose FrugalGPT, a simple yet flexible instantiation of LLM cascade which learns which combinations of LLMs to use for different queries in order to reduce cost and improve accuracy. Our experiments show that FrugalGPT can match the performance of the best individual LLM (e.g. GPT-4) with up to 98% cost reduction or improve the accuracy over GPT-4 by 4% with the same cost. The ideas and findings presented here lay a foundation for using LLMs sustainably and efficiently.
Authors- Lingjiao Chen, Matei Zaharia, James Zou
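To give you a feel for the cascade idea, here is a minimal sketch in Python. To be clear, this is my illustration of the concept rather than the authors' code- call_model and quality_score are hypothetical placeholders, the model names are made up, and FrugalGPT actually learns which model combinations and acceptance thresholds to use.

```python
# A minimal sketch of the LLM-cascade strategy, assuming hypothetical helpers.
CASCADE = ["cheap-model", "mid-model", "gpt-4"]  # ordered from cheapest to priciest
THRESHOLD = 0.8  # accept an answer once the scorer is confident enough

def call_model(model: str, query: str) -> str:
    """Hypothetical placeholder for an API call to the named model."""
    raise NotImplementedError

def quality_score(query: str, answer: str) -> float:
    """Hypothetical placeholder for a learned scorer rating reliability in [0, 1]."""
    raise NotImplementedError

def frugal_answer(query: str) -> str:
    # Walk up the cascade, stopping at the first answer the scorer trusts.
    for model in CASCADE[:-1]:
        answer = call_model(model, query)
        if quality_score(query, answer) >= THRESHOLD:
            return answer
    # Nothing cheaper was trusted, so fall back to the strongest model.
    return call_model(CASCADE[-1], query)
```

The cost savings come from the fact that most queries never reach the expensive model at the end of the list.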
4) Bot or Human? Detecting ChatGPT Imposters with A Single Question
Abstract- Large language models like ChatGPT have recently demonstrated impressive capabilities in natural language understanding and generation, enabling various applications including translation, essay writing, and chit-chatting. However, there is a concern that they can be misused for malicious purposes, such as fraud or denial-of-service attacks. Therefore, it is crucial to develop methods for detecting whether the party involved in a conversation is a bot or a human. In this paper, we propose a framework named FLAIR, Finding Large language model Authenticity via a single Inquiry and Response, to detect conversational bots in an online manner. Specifically, we target a single question scenario that can effectively differentiate human users from bots. The questions are divided into two categories: those that are easy for humans but difficult for bots (e.g., counting, substitution, positioning, noise filtering, and ASCII art), and those that are easy for bots but difficult for humans (e.g., memorization and computation). Our approach shows different strengths of these questions in their effectiveness, providing a new way for online service providers to protect themselves against nefarious activities and ensure that they are serving real users. We open-sourced our dataset on this https URL and welcome contributions from the community to enrich such detection datasets.
Authors- Hong Wang, Xuan L., Weizhi Wang, Xifeng Yan
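To make the first category concrete, here is a toy version of one question type (counting a letter inside a noisy string). This is my own sketch, not the authors' released dataset:

```python
import random
import string

# Toy FLAIR-style challenge: counting inside noise is quick for a human
# glancing at the string, but ChatGPT-era models often miscount characters.
def make_counting_challenge(length: int = 40) -> tuple:
    noise = "".join(random.choices(string.ascii_lowercase, k=length))
    target = random.choice(noise)  # pick a letter guaranteed to appear
    question = f'How many times does "{target}" appear in "{noise}"?'
    return question, noise.count(target)

question, expected = make_counting_challenge()
print(question)
# A real service would compare the user's reply against `expected`.
```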
5) Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust
Abstract- Watermarking the outputs of generative models is a crucial technique for tracing copyright and preventing potential harm from AI-generated content. In this paper, we introduce a novel technique called Tree-Ring Watermarking that robustly fingerprints diffusion model outputs. Unlike existing methods that perform post-hoc modifications to images after sampling, Tree-Ring Watermarking subtly influences the entire sampling process, resulting in a model fingerprint that is invisible to humans. The watermark embeds a pattern into the initial noise vector used for sampling. These patterns are structured in Fourier space so that they are invariant to convolutions, crops, dilations, flips, and rotations. After image generation, the watermark signal is detected by inverting the diffusion process to retrieve the noise vector, which is then checked for the embedded signal. We demonstrate that this technique can be easily applied to arbitrary diffusion models, including text-conditioned Stable Diffusion, as a plug-in with negligible loss in FID. Our watermark is semantically hidden in the image space and is far more robust than watermarking alternatives that are currently deployed. Code is available at this https URL.
Authors- Yuxin Wen, John Kirchenbauer, Jonas Geiping, Tom Goldstein
If you're looking for a breakdown of this paper, Yannic Kilcher has a good one-
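And if you want a feel for the mechanics in code, here is a rough sketch of the embed/detect loop as I read the abstract: plant a key in a ring of the Fourier spectrum of the initial noise, then look for that key later. This is not the authors' implementation- real detection first inverts the diffusion sampler (e.g., via DDIM) to recover the noise, and the key value and ring radii below are numbers I made up for illustration.

```python
import numpy as np

def ring_mask(size: int, r_in: float, r_out: float) -> np.ndarray:
    # Boolean mask selecting an annulus around the center of the spectrum.
    yy, xx = np.mgrid[:size, :size]
    r = np.hypot(yy - size / 2, xx - size / 2)
    return (r >= r_in) & (r < r_out)

def embed_tree_ring(noise: np.ndarray, key: complex, r_in=8.0, r_out=12.0) -> np.ndarray:
    # Write the key into a Fourier-space ring of the initial noise.
    spectrum = np.fft.fftshift(np.fft.fft2(noise))
    spectrum[ring_mask(noise.shape[0], r_in, r_out)] = key
    return np.real(np.fft.ifft2(np.fft.ifftshift(spectrum)))

def detect(noise: np.ndarray, key: complex, r_in=8.0, r_out=12.0, tol=1.0) -> bool:
    # The ring values of a marked noise vector should sit close to the key.
    spectrum = np.fft.fftshift(np.fft.fft2(noise))
    ring = spectrum[ring_mask(noise.shape[0], r_in, r_out)]
    return float(np.mean(np.abs(ring - key))) < tol

noise = np.random.randn(64, 64)
marked = embed_tree_ring(noise, key=50 + 0j)
print(detect(marked, key=50 + 0j), detect(noise, key=50 + 0j))  # True False
```

The ring shape is what buys the robustness: rotations and flips map the annulus onto itself, so the key survives transformations that would destroy a pixel-space watermark.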
6) RWKV: Reinventing RNNs for the Transformer Era
Abstract- Transformers have revolutionized almost all natural language processing (NLP) tasks but suffer from memory and computational complexity that scales quadratically with sequence length. In contrast, recurrent neural networks (RNNs) exhibit linear scaling in memory and computational requirements but struggle to match the same performance as Transformers due to limitations in parallelization and scalability. We propose a novel model architecture, Receptance Weighted Key Value (RWKV), that combines the efficient parallelizable training of Transformers with the efficient inference of RNNs. Our approach leverages a linear attention mechanism and allows us to formulate the model as either a Transformer or an RNN, which parallelizes computations during training and maintains constant computational and memory complexity during inference, leading to the first non-transformer architecture to be scaled to tens of billions of parameters. Our experiments reveal that RWKV performs on par with similarly sized Transformers, suggesting that future work can leverage this architecture to create more efficient models. This work presents a significant step towards reconciling the trade-offs between computational efficiency and model performance in sequence processing tasks.
Yannic Kilcher has been very busy this week. He did another great paper breakdown here-
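To show what the "RNN mode" means in practice, below is a simplified per-channel sketch of the WKV recurrence, the piece that lets inference run with constant memory. It leaves out the numerical-stability tricks and the surrounding time-mix/channel-mix blocks of the real architecture, so treat it as a reading aid rather than the model itself.

```python
import numpy as np

def wkv_recurrent(k: np.ndarray, v: np.ndarray, w: float, u: float) -> np.ndarray:
    """k, v: length-T key/value sequences for one channel; w: decay rate; u: bonus for the current token."""
    num, den = 0.0, 0.0  # running weighted sums- the entire state, regardless of T
    out = np.zeros(len(v))
    for t in range(len(k)):
        # The current token gets a bonus weight e^(u + k_t); history is pre-decayed.
        out[t] = (num + np.exp(u + k[t]) * v[t]) / (den + np.exp(u + k[t]))
        # Fold the token into the state, decaying older history by e^(-w).
        num = np.exp(-w) * num + np.exp(k[t]) * v[t]
        den = np.exp(-w) * den + np.exp(k[t])
    return out
```

The contrast with a Transformer is that nothing here grows with sequence length- the two running scalars per channel are the whole memory.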
7) Faster sorting algorithms discovered using deep reinforcement learning
Abstract- Fundamental algorithms such as sorting or hashing are used trillions of times on any given day. As demand for computation grows, it has become critical for these algorithms to be as performant as possible. Whereas remarkable progress has been achieved in the past, making further improvements on the efficiency of these routines has proved challenging for both human scientists and computational approaches. Here we show how artificial intelligence can go beyond the current state of the art by discovering hitherto unknown routines. To realize this, we formulated the task of finding a better sorting routine as a single-player game. We then trained a new deep reinforcement learning agent, AlphaDev, to play this game. AlphaDev discovered small sorting algorithms from scratch that outperformed previously known human benchmarks. These algorithms have been integrated into the LLVM standard C++ sort library. This change to this part of the sort library represents the replacement of a component with an algorithm that has been automatically discovered using reinforcement learning. We also present results in extra domains, showcasing the generality of the approach.
S/o to Eduardo C. Garrido Merchán, PhD for the find.
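For context on what is actually being discovered here: the headline wins were on tiny fixed-size kernels like sort3, which are branch-free networks of compare-and-swap operations. The sketch below is an ordinary 3-element sorting network written in Python, not AlphaDev's assembly- its contribution was finding shorter instruction sequences for kernels of exactly this shape.

```python
# A standard branch-free sorting network for 3 elements. Each line is one
# comparator (a min/max pair); AlphaDev searched for shorter instruction
# sequences implementing networks like this inside the LLVM libc++ sort.
def sort3(a, b, c):
    b, c = min(b, c), max(b, c)
    a, c = min(a, c), max(a, c)
    a, b = min(a, b), max(a, b)
    return a, b, c

assert sort3(3, 1, 2) == (1, 2, 3)
```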
8) How Vector Databases enable Generative AI
A quick introduction to Vector DBs and why this new space is already worth a billion dollars. Goes over what Vector DBs are, why they are good for AI, and more. Written by yours truly (Devansh).
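If you just want the one-screen version of what a vector DB does at its core, here is a toy store: keep normalized embeddings around and return the entries closest to a query by cosine similarity. This is purely my illustration- production systems layer approximate indexes (HNSW, IVF, etc.) on top so search scales past brute force.

```python
import numpy as np

class TinyVectorStore:
    """A brute-force stand-in for a vector database."""

    def __init__(self):
        self.vectors, self.payloads = [], []

    def add(self, embedding: np.ndarray, payload: str) -> None:
        # Store unit vectors so dot products equal cosine similarity.
        self.vectors.append(embedding / np.linalg.norm(embedding))
        self.payloads.append(payload)

    def search(self, query: np.ndarray, k: int = 3) -> list:
        query = query / np.linalg.norm(query)
        sims = np.stack(self.vectors) @ query  # cosine similarity to every entry
        return [self.payloads[i] for i in np.argsort(-sims)[:k]]
```

Swap in embeddings from any model you like; the store neither knows nor cares where the vectors came from, which is exactly why this pattern pairs so well with LLMs.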
Reader Spotlight- ByteBrief
ByteBrief: Get a byte-sized dose of AI, Tech, and Business news directly in your inbox. The only newsletter you need to stay on top of AI, Tech, and Biz. Join 22,000+ readers now.
If you're doing interesting work and would like to be featured in the spotlight section, just drop your introduction in the comments or reach out to me directly. There are no rules- you could talk about a paper you've written, an interesting project you've worked on, some personal challenge you're tackling, your content platform, or anything else you consider important. The goal is to get to know you better, and possibly connect you with interesting people in the community.
Other interesting Reads
1) Six ways to shoot yourself in the foot with healthchecks
Link- https://philbooth.me/blog/six-ways-to-shoot-yourself-in-the-foot-with-healthchecks
Summary- A hat I get to wear quite often is that of de facto devops consultant. And with that hat on, I've been surprised by how many times I was able to break production with seemingly innocuous healthcheck tweaks. I've managed to do it in six different ways so far, so I'm listing them here in the hope that others might learn from my mistakes.
Author- Phil Booth
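Without spoiling Phil's six, here is one classic footgun in this genre (which may or may not overlap with his list): wiring deep dependency checks into the probe your orchestrator uses to decide whether to restart a process. The sketch below is hypothetical- database_is_reachable is a stand-in for whatever cheap dependency check you'd actually run.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

def database_is_reachable() -> bool:
    # Hypothetical placeholder: e.g., a "SELECT 1" with a short timeout.
    return True

class Health(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/livez":
            # Liveness answers "is this process running?" and nothing more.
            # If it checked the database too, a brief DB blip could get every
            # healthy instance restarted at once.
            self.send_response(200)
        elif self.path == "/readyz":
            # Readiness may check dependencies; failing here just sheds traffic.
            self.send_response(200 if database_is_reachable() else 503)
        else:
            self.send_response(404)
        self.end_headers()

HTTPServer(("", 8080), Health).serve_forever()
```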
2) An educational side project
Summary- What does a great side project look like- one that helps you learn new technologies, but also helps you stand out when looking for a new job? An analysis of an Uber simulation app, built from scratch.
Author- Gergely Orosz
Cool Vids-
How HR Came To Rule Corporate America
Starbucks Makes $200 Million from Unused Gift Cards- Spencer Cornelia
Java is mounting a huge comeback- Fireship Tech
Peter Singer - ordinary people are evil- Jeffrey Kaplan from the University of North Carolina at Greensboro
How a Lottery Win Might Change You!- Patrick Boyle
I'll catch y'all with more of these next week. In the meantime, if you'd like to find me, here are my social links-
Reach out to me
Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.
Check out my other articles on Medium: https://rb.gy/zn1aiu
My YouTube: https://rb.gy/88iwdd
Reach out to me on LinkedIn. Let’s connect: https://rb.gy/m5ok2y
My Instagram: https://rb.gy/gmvuy9
My Twitter:?https://twitter.com/Machine01776819