#1SET: AI hallucinations, Github tips and Uber data analysis
Stay up to date with the latest advancements in technology! Learn about GitHub 's Codespaces, explore the evaluation of open-source Large Language Models and their tendency to hallucinate, discover the impact of language differences on tokenization lengths in the OpenAI API, and more! Subscribe to our newsletter for the our curated technology insights.
Golden tip: Are you familiar with GitHub's "web editor"? With a simple keystroke of '.', you can open a VS Code in your browser. And there's more: "Codespaces" allows you to run your project directly from a GitHub machine. It's a useful feature for those who are always on the go, enabling problem-solving from anywhere, even from a tablet. Without a doubt, GitHub's Codespaces is an indispensable tool for developers.
In this article published by one of our clients, Guruprasad Raghavan, co-founder of Yurts, describes findings from the evaluation of popular open-source Large Language Models (LLMs) to estimate both the frequency and degree of hallucinations. The Yurts team discovered that popular open-source models hallucinate about 55% of the time in a context-aware question-answering task when tested without any fine-tuning.
An article explaining how different languages represent distinct tokens in the OpenAI API, which can result in higher costs. Theoretically, Portuguese could be more affected compared to English in this aspect. A recent article titled "Language Model Tokenizers Introduce Unfairness Between Languages" by Petrov et al. demonstrated that "the same text translated into different languages can have drastically different tokenization lengths, with differences of up to 15 times in some cases." Check out the full explanation.
We recommend Perplexity AI as a multifunctional artificial intelligence chatbot for those seeking an alternative to Chat GPT. It functions as a real-time search engine, providing written answers with cited sources and highlighting relevant pages for user research. This is particularly noteworthy as real-time web search is a premium feature in many other chatbots. Perplexity AI is also a reliable content assistant, as it informs users of all the sources of its information. Additionally, it can be a useful SEO assistant, generating meta descriptions, title tags, and article titles, as well as assisting with keyword research and identifying sites for link building. It is a valuable tool for SEO professionals and content creators.
Ground News is a very interesting attempt to create news with political bias notifications. The platform facilitates the comparison of news sources and the identification of media biases. It evaluates the political bias of publications based on three news monitoring organizations: All Sides, Ad Fontes Media, and Media Bias Fact Check. The analysis considers aspects such as writing, story selection, and political affiliation. This evaluation is done at the publication level, not specific articles. The ratings are continuously updated to provide the most comprehensive analysis possible.
An article from Cornell University explores Reinforcement Learning from Human Feedback (RLHF) in improving the output quality of Large Language Models (LLMs) by aligning them with human preferences. The researchers proposed a simple algorithm called Reinforced Self-Training (ReST), which generates a dataset from initial LLM samples used to enhance the LLM policy with offline RL algorithms. ReST is more efficient than typical online RLHF methods as it allows for data reuse. Applied to machine translation, ReST significantly improves translation quality as measured by automated metrics and human evaluation.
With the recent updates to GPT-3.5 Turbo, developers now have the ability to customize the model for their own use cases by bringing their own data for fine-tuning. This represents a significant advancement in the flexibility and applicability of GPT-3.5 Turbo for a variety of scenarios.
领英推荐
Develop faster with Bun. With it, you can develop, test, run, and package JavaScript and TypeScript projects. Bun is an all-in-one JavaScript runtime and toolset designed for speed. It comes complete with a bundler, a test runner, and a Node.js-compatible package manager. With Bun, you have all the necessary tools to optimize your development workflow.
At the annual Defcon hacker convention in Las Vegas, 2,200 people competed to break barriers around language models. The contest, organized by the nonprofit AI security organizations Humane Intelligence and SeedAI and sponsored by the US White House and various tech companies, awarded the winners with an Nvidia RTX A6000 graphics card.
This article defined developer productivity, addressed common myths surrounding it, and discussed ways to measure it. The authors discussed developer productivity in the context of the SPACE Framework and provided ways to track it. Additionally, they correlated data from Git and project management tools. Even with basic metrics, it is possible to compare developer productivity against industry standards and start improving team productivity.
There will be no shortage of vector databases: Pinecone, Weaviate, Chroma, The Milvus Project, Supabase, Qdrant, Vespa.ai, Vercel.
How did Uber solve data consistency problem? Uber uses Spanner to store large volumes of data due to its complexity and the need for transactional consistency at a global scale. Previously, they used Cassandra for real-time data but faced difficulties with low-latency writes when dealing with millions of concurrent requests. The solution they found was to create an application layer framework to manage database operations using the Saga pattern.