How to Advance #DeepSeek-#AI #ReinforcementLearning #Models: AIMLExchange.com: https://lnkd.in/gGMdpZ9D : #ChatGPT https://lnkd.in/gtb_ZfQy . #Perplexity https://lnkd.in/gFdeYrSH . #You https://lnkd.in/gMHHQKsN . More on #Dynamic #OutcomesDrivenAI https://lnkd.in/gsBxKh4f When #Static #InputDrivenAI & #DataDrivenAI Models Fail.

Advancing #OutcomesDrivenAI #Systems-#Networks #Uncertainty-#Complexity #Engineering: Building AI-#Business #Performance #Outcomes Beyond #DataDrivenAI Since 1993.

#PunishedByRewards: When #BestPractices Become #WorstPractices. Hence #RL #Models Are Suitable for #Static #Deterministic #Compliance-Driven Contexts and Fail in #Novel #Dynamic #NonDeterministic Contexts Where #Error #Detection-#Correction is More Critical than #Compliance: 30-Years R&D Advancing BEYOND Human-AI-Reinforcement Learning Models: R&D Impact Among #ArtificialIntelligence-#Quant #Finance Nobel Laureates: https://lnkd.in/epx6zV3

#Generative #AI-#Reinforcement #Learning #Models: #AI #RL #Models Beyond #ClosedSystems #Statics to #OpenSystems #Dynamics: SSRN: Post AI-Quantum Models: 128 Top-10 R&D Rankings: https://lnkd.in/ec99Zkd . How to Build #Agile-#Resilient-#Sustainable #AI-#RLModels for #OpenSystems #Dynamics of #Causal #Environments: https://lnkd.in/ejDCev2R

1993-2025: Our Journey from World's Top Digital Site-Search Engine-Social Network To World-Leading Post AI-Quantum Know-Build-Monetize Networks: Santa Fe Institute to Pentagon, US Department of Defense, Office of the Secretary of Defense for Policy: Global Digital, AI, Quantum, Post AI-Quantum Economies Pioneer, From Beta Version of First Web Browser in 1993, We Create the Digital Future. You Can Too! Let's Show You How!

Global Risk Management Network, LLC #QuantumValley Know-Build-Monetize Post AI-Quantum Networks: How To Future-Proof YOU Beyond AI-GenAI: Google AI #Podcasts: We Build #QuantumMinds for #Quantum #Uncertainty https://lnkd.in/eiuQZ__C

United States Air Force-Pentagon, US Department of Defense, Office of the Secretary of Defense for Policy MVP: Know-Build-Monetize Networks: 30-Years Advancing You Beyond #Digital #Search-#Learn-#Work: Why #Search When You Can #Know? Why #Learn When You Can #Build? Why #Work When You Can #Monetize?

YM-ABC: YogeshMalhotra.com
New York State: "Join Dr. Yogi Malhotra to get up to speed on Cloud Technology."
USAF-AFRL Ventures: "Do Something Epic: Save the World"
AIMLExchange: AIMLExchange.com: We Create the Digital Future
BRINT: BRINT.com: From Future of Finance to Future of Defense
C4I-Cyber: C4I-Cyber.com: Because the Future of the World Depends Upon It

AWS Quantum Valley Global Risk Management Network LLC: 30-Years Leading AI-Quantum Finance Practices
Silicon Valley's Next Big Thing: Know-Build-Monetize Networks: 30-Years Building Meaning-Aware AI
Silicon Valley-Wall Street-Pentagon Leader: 30 Years Building AI-Cyber-Crypto-Quantum Risk Networks
Why is DeepSeek R1 generating crazily long reasoning chains? A new paper, "Demystifying Long CoT Reasoning in LLMs", reveals a key insight: Long CoT isn't just longer, it fundamentally changes how models learn to reason.

Key takeaways:
- Long CoT scales better: on MATH-500, long CoT fine-tuning (SFT) keeps improving beyond 70% accuracy, while short CoT stagnates below 55%.
- Reinforcement learning (RL) needs Long CoT: models trained on short CoT barely improve with RL.
- Reward shaping matters: without it, long CoT models risk "reward hacking."

Check out the full paper: https://lnkd.in/eHFAP6Rd
#LLMs #ChainOfThought #DeepSeekR1
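To make the reward-shaping point concrete, here is a rough sketch of one possible length-aware reward. The post does not say which shaping function the paper uses, so everything here is an assumption for illustration: the cosine interpolation, the function names `cosine_interp` and `shaped_reward`, and all numeric values are hypothetical. The idea is that a plain "correct = +1" reward gives the model no reason to control length, whereas a shaped reward can favor shorter correct answers, penalize quick wrong answers more than long ones, and penalize hitting the length cap.

```python
import math


def cosine_interp(t: float, t_max: float, start: float, end: float) -> float:
    """Cosine interpolation: returns `start` at t=0 and `end` at t=t_max."""
    return end + 0.5 * (start - end) * (1.0 + math.cos(math.pi * t / t_max))


def shaped_reward(
    is_correct: bool,
    length: int,
    max_length: int,
    correct_short: float = 2.0,    # hypothetical: reward for a very short correct answer
    correct_long: float = 1.0,     # hypothetical: reward for a correct answer near the cap
    wrong_short: float = -10.0,    # hypothetical: penalty for a very short wrong answer
    wrong_long: float = 0.0,       # hypothetical: penalty for a wrong answer near the cap
    exceed_penalty: float = -10.0, # hypothetical: penalty for hitting the length cap
) -> float:
    """Length-aware reward for one sampled chain of thought (illustrative only)."""
    if length >= max_length:
        # Running into the cap is penalized regardless of correctness,
        # which discourages padding the chain to farm reward.
        return exceed_penalty
    if is_correct:
        # Correct answers earn more when they are shorter.
        return cosine_interp(length, max_length, correct_short, correct_long)
    # Wrong answers are penalized less when the model reasoned longer.
    return cosine_interp(length, max_length, wrong_short, wrong_long)


if __name__ == "__main__":
    for n in (0, 250, 500, 750, 1000):
        print(n, shaped_reward(True, n, 1000), shaped_reward(False, n, 1000))
```

With these assumed values, a correct answer is always worth more than a wrong one at the same length, so the length term only breaks ties within each group and does not override correctness.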