Big Data Aggregation Is Paving The Way For Precision Oncology, & More
Want more research from the ARK Team? Have feedback on our publications? Click here to help inform our content creation.
1. Big Data Aggregation Is Paving The Way For Precision Oncology
By: Nemo Marjanovic, PhD
Last week, cancer researchers published[1] results from their whole-genome sequencing (WGS) study. The 100,000 Genomes Project (100kGP)[2] recruited 10,470 patients spanning 35 cancer types and identified 330 cancer driver genes, including 74 novel ones. Compared to existing testing methods, which typically surface potentially actionable mutations in 22% of tumors, this study discovered them in 55% of the tumors.
This seemingly robust approach to developing new targeted therapies highlights the importance of big data in precision oncology therapies. Aggregating big data—multiomic data including proteomics and transcriptomics, multimodal data including digital pathology and radiology, and longitudinal patient-outcome data—seems crucial to the comprehensive data analytics that will deepen our understanding of cancer biology.
Now that AI costs are falling dramatically, and in turn accelerating drug discovery cycles, researchers should be able to find and target many more novel driver genes and surface new cancer treatments.
2. Airdrops Should Give Way To A Better Token Distribution Method
By: Lorenzo Valente
Last week, two new crypto projects, ZkSync and LayerZero, launched their tokens, resurfacing skepticism around “airdrops,” a mechanism by which a protocol distributes free tokens to a set of addresses or users who have participated in and provided value to the bootstrapping phase of a product. Following the crackdown on initial coin offerings (ICOs) as illegal securities sales in late 2017,[3] airdrops became a popular method for distributing tokens, not only to create initial liquidity but also to align and reward a user base while decentralizing token ownership.
Uniswap was the first project to popularize the method significantly, in 2020,[4] when it retroactively airdropped a portion of the UNI supply to users who had interacted with its smart contracts at least once. Since then, airdrops have undergone experimental phases, which have led to a final design that is beginning to cause problems for most stakeholders. With thousands of dollars at stake, professional “Sybil farmers” are now exploiting airdrops, using dedicated software to create thousands of wallets that engage in fake usage in the hopes of qualifying for free tokens.
LayerZero is trying to conduct airdrops[5] while combating Sybil farmers. If they self-report, the farmers can keep 15% of the initial airdrop. If they do not self-report and if the LayerZero Team detects them, Sybil farmers receive nothing. LayerZero also has implemented a “bounty” program through which anyone can report Sybil farmers in exchange for a portion of their airdrop allocations.
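The incentive structure described above can be sketched as a simple payout rule. This is only an illustration of the rules as summarized here, not LayerZero's actual implementation: the function names are hypothetical, the treatment of undetected farmers is an assumption, and `bounty_share` is a placeholder because the text says only that reporters receive "a portion" of the allocation.

```python
# Share of the initial airdrop a farmer keeps after self-reporting (15%, per the text).
SELF_REPORT_KEEP = 0.15


def farmer_allocation(initial_tokens, self_reported, detected):
    """Tokens a Sybil farmer ends up with under the rules summarized above."""
    if self_reported:
        return initial_tokens * SELF_REPORT_KEEP
    if detected:
        return 0.0
    # Assumption: a farmer who neither self-reports nor is detected keeps everything.
    return float(initial_tokens)


def bounty_reward(reported_allocation, bounty_share):
    """Tokens paid to a bounty reporter.

    bounty_share is hypothetical: the text does not state the actual fraction.
    """
    if not 0.0 <= bounty_share <= 1.0:
        raise ValueError("bounty_share must be between 0 and 1")
    return reported_allocation * bounty_share


if __name__ == "__main__":
    initial = 1000.0  # illustrative allocation, not a real figure
    print("self-reported:", farmer_allocation(initial, True, False))
    print("detected:", farmer_allocation(initial, False, True))
    print("undetected:", farmer_allocation(initial, False, False))
```

Under these rules, self-reporting is the better choice for a farmer only if the perceived chance of detection is high enough, which is presumably why the bounty program exists alongside it.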
Although airdrops were an effective user-acquisition strategy in earlier days, exploitation has taken a toll on their legitimacy. In our view, founders, investors, and community members should create new distribution strategies that align user incentives with the long-term success of their projects.
3. Anthropic Has Taken Another Leap Forward With Claude 3.5 Sonnet
By: Jozef Soja
Last week, Anthropic released Claude 3.5 Sonnet,[6] the first of its new family of large language models (LLMs). Claude 3.5 Sonnet sets new records for frontier foundation models on both general knowledge benchmarks like MMLU (Massive Multitask Language Understanding) and coding benchmarks like HumanEval, as shown below. On many benchmarks, it rivals or surpasses the performance of OpenAI's GPT-4o. Anthropic also has expanded the model’s vision capabilities, enabling it to process charts and images with much greater precision.
The tools supporting large language models are also becoming more important. This week, Anthropic introduced Artifacts, a shared environment that allows users to work collaboratively with Claude 3.5 Sonnet to create and iterate on code or text. In our view, features like Artifacts will expand the applications of LLMs beyond simple chatbots, potentially creating LLM-first applications, if not operating systems.
Claude 3.5 Sonnet’s inference cost also seems to be lower than that of its competitors. Based on our research, if roughly three input tokens are processed for every output token produced, Sonnet should cost an average of $6 per million tokens, which would be 20% and 80% lower than GPT-4o's $7.50 per million tokens and Claude 3 Opus’s $30 per million tokens, respectively.[7] Importantly, Sonnet is the mid-sized model in the Claude 3.5 family, the rest of which is slated for release later this year. We look forward to analyzing the capabilities of the larger “Opus” and smaller “Haiku” versions.
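The blended-cost figures above can be reproduced with a short calculation. The sketch below assumes list prices, in dollars per million input and output tokens, of $3/$15 for Claude 3.5 Sonnet, $5/$15 for GPT-4o, and $15/$75 for Claude 3 Opus. Those per-token prices are not stated in this article; they are our reading of the vendors' published pricing at the time and should be treated as assumptions. The function and variable names are illustrative.

```python
def blended_cost(input_price, output_price, input_ratio=3.0):
    """Average USD cost per million tokens, weighting input_ratio input
    tokens for every one output token."""
    return (input_ratio * input_price + output_price) / (input_ratio + 1.0)


# Assumed list prices: (input, output) in USD per million tokens.
PRICES = {
    "Claude 3.5 Sonnet": (3.0, 15.0),
    "GPT-4o": (5.0, 15.0),
    "Claude 3 Opus": (15.0, 75.0),
}

blended = {name: blended_cost(inp, out) for name, (inp, out) in PRICES.items()}


def savings_vs(cheaper, pricier):
    """Fractional discount of the cheaper model relative to the pricier one."""
    return 1.0 - blended[cheaper] / blended[pricier]


if __name__ == "__main__":
    for name, cost in blended.items():
        print(f"{name}: ${cost:.2f} per million tokens")
    for rival in ("GPT-4o", "Claude 3 Opus"):
        print(f"Sonnet vs {rival}: {savings_vs('Claude 3.5 Sonnet', rival):.0%} lower")
```

Changing `input_ratio` shows how sensitive the comparison is to workload mix: the more output-heavy the workload, the more the output price dominates the blended figure.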
[1] Kinnersley, B. et al. 2024. “Analysis of 10,478 cancer genomes identifies candidate driver genes and opportunities for precision oncology.” Nature Genetics.
[2] This is the largest whole-genome study of cancer to date.
[3] Kornfeld, T.R. et al. 2017. “SEC Cracks Down on Fraudulent ICOs in Latest Enforcement Action.” Troutman Pepper.
[4] Haig, S. 2020. “Uniswap moves closer to a new five million UNI airdrop.” CoinTelegraph.
[5] LayerZero Labs. 2024. “We believe it is in the protocol’s best interest…” X.
[6] Anthropic. 2024. “Claude 3.5 Sonnet.”
[7] Our analyses base these estimates on data derived from Ibid. and OpenAI. 2024. “Pricing.”