Bard vs. ChatGPT; Jina Embeddings 2; Text2Structure; Does GPT-4 Pass the Turing Test?; Transformers as Graph2Graph; and More.
Danny Butvinik
Chief Data Scientist | 100K+ Followers | FinCrime | Writer | Author of AI Vanguard Newsletter
Editor's Paper Recommendations
From Text to Structure: Using Large Language Models to Support the Development of Legal Expert Systems: Encoding legislative text in a formal representation is an important prerequisite for many tasks in AI and law. For example, rule-based expert systems focused on legislation can help laypeople understand how legislation applies to them and provide helpful context and information. However, analyzing legislation and other sources to encode them in the desired formal representation is time-consuming and a bottleneck in developing such systems. Here, we investigate to what degree large language models (LLMs), such as GPT-4, can automatically extract structured representations from legislation. We use LLMs to create pathways from legislation following the JusticeBot methodology for legal decision-support systems, evaluate the generated pathways, and compare them to manually created ones. The results are promising, with 60% of the generated pathways rated as equivalent to or better than manually created ones in a blind comparison. The approach suggests a promising path for leveraging the capabilities of LLMs to ease the costly development of systems based on symbolic approaches that are transparent and explainable.
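To make the idea concrete, here is a minimal sketch of prompting GPT-4 to turn a single legislative provision into a structured decision pathway. The sample provision, the prompt wording, and the JSON schema are illustrative assumptions, not the paper's JusticeBot methodology or evaluation setup.

```python
# Illustrative sketch (not the paper's exact pipeline): ask GPT-4 to convert a
# legislative provision into a structured, layperson-facing decision pathway.
import json
from openai import OpenAI  # assumes openai>=1.0 and OPENAI_API_KEY in the environment

client = OpenAI()

# Hypothetical provision used only for illustration.
PROVISION = """A tenant may terminate the lease if the dwelling becomes unfit
for habitation, provided the tenant gives the landlord 10 days' written notice."""

PROMPT = f"""Convert the legislative provision below into a decision pathway.
Return only JSON with a list of yes/no "questions" to ask a layperson and the
"outcome" reached for each combination of answers.

Provision:
{PROVISION}"""

resp = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": PROMPT}],
    temperature=0,
)

# May need extra cleanup if the model wraps the JSON in prose.
pathway = json.loads(resp.choices[0].message.content)
print(json.dumps(pathway, indent=2))
```

In practice, pathways like this would still be reviewed by a legal expert before being deployed, which matches the paper's framing of LLMs easing, rather than replacing, manual encoding.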
Jina Embeddings 2: 8192-Token General-Purpose Text Embeddings for Long Documents: Text embedding models have emerged as powerful tools for transforming sentences into fixed-sized feature vectors that encapsulate semantic information. While these models are essential for tasks like information retrieval, semantic clustering, and text re-ranking, most existing open-source models, especially those built on architectures like BERT, struggle to represent lengthy documents and often resort to truncation. One common approach to mitigate this challenge involves splitting documents into smaller paragraphs for embedding. However, this strategy results in a much larger set of vectors, consequently leading to increased memory consumption and computationally intensive vector searches with elevated latency. To address these challenges, we introduce Jina Embeddings 2, an open-source text embedding model capable of accommodating up to 8192 tokens. This model is designed to transcend the conventional 512-token limit and adeptly process long documents. Jina Embeddings 2 achieves state-of-the-art performance on a range of embedding-related tasks in the MTEB benchmark and matches the performance of OpenAI's proprietary ada-002 model. Additionally, our experiments indicate that an extended context can enhance performance in tasks such as NarrativeQA.
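As a usage illustration, the sketch below embeds a long document and a query in a single call rather than chunking the document into many paragraph vectors. The model ID, the sentence-transformers loading path, and the file name are assumptions for the example.

```python
# Minimal sketch, assuming the model is published on Hugging Face as
# "jinaai/jina-embeddings-v2-base-en" and loads via sentence-transformers
# (>= 2.3) with remote code enabled.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("jinaai/jina-embeddings-v2-base-en", trust_remote_code=True)
model.max_seq_length = 8192  # lift the usual 512-token cap to the model's full context

long_document = open("contract.txt").read()        # placeholder: a document far longer than 512 tokens
query = "What are the termination conditions?"

# One vector per document instead of many chunk vectors to store and search.
doc_vec, query_vec = model.encode([long_document, query], normalize_embeddings=True)
similarity = float(doc_vec @ query_vec)            # cosine similarity of normalized vectors
print(f"cosine similarity: {similarity:.3f}")
```

The design trade-off the abstract describes is visible here: a single 8192-token embedding avoids the larger index, higher memory use, and slower search that per-paragraph chunking brings, at the cost of compressing more content into one vector.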
Does GPT-4 Pass the Turing Test?: We evaluated GPT-4 in a public online Turing Test. The best-performing GPT-4 prompt passed in 41% of games, outperforming baselines set by ELIZA (27%) and GPT-3.5 (14%) but falling short of chance and the baseline set by human participants (63%). Participants' decisions were based mainly on linguistic style (35%) and socio-emotional traits (27%), supporting the idea that intelligence alone is insufficient to pass the Turing Test. Participants' demographics, including education and familiarity with LLMs, did not predict detection rates, suggesting that even people who understand such systems deeply and interact with them frequently may be susceptible to deception. Despite its known limitations as a test of intelligence, we argue that the Turing Test remains relevant as an assessment of naturalistic communication and deception. AI models that can masquerade as humans could have widespread societal consequences, and we analyze the effectiveness of different strategies and criteria for judging humanlikeness.
Transformers as Graph-to-Graph Models: We argue that transformers are essentially graph-to-graph models, with sequences being a special case. Attention weights are functionally equivalent to graph edges. Our Graph-to-Graph Transformer architecture makes this ability explicit by inputting graph edges into the attention weight computations and predicting graph edges with attention-like functions, thereby integrating explicit graphs into the latent graphs learned by pre-trained Transformers. Adding iterative graph refinement provides a joint embedding of input, output, and latent graphs, allowing non-autoregressive graph prediction to optimize the complete graph without any bespoke pipeline or decoding strategy. Empirical results show that this architecture achieves state-of-the-art accuracies for modeling various linguistic structures, integrating very effectively with the latent linguistic representations learned by pretraining.
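The core idea of injecting explicit edges into attention can be sketched in a few lines: embed the labels of an input graph's edges and add them as a bias to the attention logits, so the attention matrix doubles as a latent graph over tokens. This is a simplified illustration with assumed shapes and names, not the paper's full Graph-to-Graph Transformer or its iterative refinement.

```python
# Simplified, illustrative single attention head conditioned on graph edges.
# Known edge labels bias the attention logits, so the explicit input graph
# shapes the latent graph the head induces; the attention matrix itself can
# be read as edge scores for graph prediction.
import math
import torch
import torch.nn as nn

class EdgeConditionedAttention(nn.Module):
    def __init__(self, d_model: int, n_edge_labels: int):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.edge_bias = nn.Embedding(n_edge_labels, 1)  # one scalar bias per edge label

    def forward(self, x, edge_labels):
        # x: (batch, seq, d_model); edge_labels: (batch, seq, seq) integers, 0 = no edge
        q, k, v = self.q(x), self.k(x), self.v(x)
        logits = q @ k.transpose(-2, -1) / math.sqrt(x.size(-1))
        logits = logits + self.edge_bias(edge_labels).squeeze(-1)  # inject input graph edges
        attn = logits.softmax(dim=-1)                              # latent graph over tokens
        return attn @ v, attn

x = torch.randn(2, 5, 64)                       # toy batch of 2 sequences, 5 tokens each
edges = torch.randint(0, 4, (2, 5, 5))          # toy edge labels between token pairs
out, attn = EdgeConditionedAttention(64, n_edge_labels=4)(x, edges)
print(out.shape, attn.shape)                    # (2, 5, 64) and (2, 5, 5)
```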
--
Are you looking to advertise a product, job opening, or event to an audience of over 40,000 AI researchers and engineers? Please reach out to us on LinkedIn to explore your options.
Enjoy the newsletter? Help us make it bigger and better by sharing it with colleagues and friends.
--
Industry Insights
Growth Zone
Expert Advice
NLP Developer and Prompt Engineer | Building NLP Models to Optimize Personal Branding | LLM | PyTorch | spaCy
9 months ago: The results of GPT-4's performance in the Turing Test, outperforming ELIZA and GPT-3.5, raise interesting questions about what truly constitutes "passing" such a test in the era of advanced LLMs.
Exciting updates in AI and analytics! Can't wait to dig in.
-
9 months ago: Great insights on the latest developments in AI, ML, DL, and analytics!
Full Stack Data Scientist | Quantitative Analysis | Entrepreneur | AI Researcher | Consultant
9 months ago: Exciting lineup! Always fascinated by advancements in AI and ML. Looking forward to diving into the latest trends and insights shared in this newsletter. Keep up the great work!
AI & Tech Product Engineer | Product Owner | Business Automation
9 months ago: Sounds like an informative read!