Interested in how we can align machine objectives with human values? Apply to the AI Alignment Seminar today! Applications are open to the Georgia Tech community until September 10th.

As AI systems become more powerful, our ability to understand and control these models remains limited. This program covers the technical problems that must be solved to ensure AI is developed for the benefit of our future. As a participant, you will join weekly discussion groups reviewing the AI alignment literature and coauthor a whitepaper surveying the motivations and directions for future research.

We will cover topics such as:
- How can we specify our values in objective functions?
- How do we ensure goals learned in training generalize to new distributions?
- How can we efficiently implement human oversight of models?
- How can we ensure models are robust to adversarial inputs?
- How can we develop mechanistic understandings of model behavior?

Applications are due September 10th! Learn more: https://lnkd.in/e2k7KMpf

The AI Safety Initiative at Georgia Tech is a community ensuring AI is developed for the benefit of our future. Learn more: https://aisi.dev
Subscribe to our newsletter: https://list.aisi.dev
Join our Discord: https://discord.aisi.dev

Georgia Institute of Technology · College of Computing at Georgia Tech
#ai #artificialintelligence #machinelearning
AI Safety Initiative at Georgia Tech
Research Services
Atlanta, Georgia · 107 followers
A community at Georgia Tech ensuring artificial intelligence is developed for the benefit of our future.
About us
A community at the Georgia Institute of Technology ensuring artificial intelligence is developed for the benefit of our future.
- Website
- https://aisi.dev
- Industry
- Research Services
- Company size
- 2-10 employees
- Headquarters
- Atlanta, Georgia
- Type
- Educational institution
- Founded
- 2022
Locations
- Primary
- Atlanta, Georgia, US
Employees at AI Safety Initiative at Georgia Tech
Updates
Congratulations to AISI members Abhay S. and Ayush Panda on the presentation of their paper "Eliciting Language Model Behaviors using Reverse Language Models" at NeurIPS 2023, spotlighted at the Socially Responsible Language Modelling Research workshop. Their state-of-the-art method identifies natural adversarial inputs to language models using a model trained on text in reverse order.

Read their paper here: https://lnkd.in/eXKXdWtW

If you are a member of the Georgia Institute of Technology community and are interested in contributing to similar research, visit: https://aisi.dev
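For readers curious how this kind of elicitation might look in practice, here is a minimal Python sketch; it is not the authors' released code, and the "reverse-gpt2" checkpoint name is a hypothetical stand-in for any causal language model trained on token sequences in reverse order. The idea: reverse the target behavior's tokens, let the reverse model "continue" them, and read the continuation back in normal order to obtain a candidate prompt that could precede the target.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical checkpoint: a causal LM trained on reversed token sequences.
REVERSE_MODEL = "reverse-gpt2"

tokenizer = AutoTokenizer.from_pretrained(REVERSE_MODEL)
model = AutoModelForCausalLM.from_pretrained(REVERSE_MODEL)

def elicit_prompt(target: str, max_new_tokens: int = 20) -> str:
    """Sample a candidate prompt that could precede `target`."""
    # Tokenize the target behavior, then reverse the token order so the
    # reverse LM sees it the way it was trained: back to front.
    target_ids = tokenizer(target, return_tensors="pt").input_ids
    reversed_ids = target_ids.flip(dims=[1])

    # "Continuing" the reversed target generates, right to left, the
    # context that would come *before* the target in normal reading order.
    out = model.generate(reversed_ids, max_new_tokens=max_new_tokens, do_sample=True)

    # Un-reverse the newly generated tokens to recover a natural-order prompt.
    prompt_ids = out[0, reversed_ids.shape[1]:].flip(dims=[0])
    return tokenizer.decode(prompt_ids, skip_special_tokens=True)

# Example: sample prompts that may elicit a chosen completion.
print(elicit_prompt("I hate you."))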