10 Deep Learning Trends and Predictions for 2017
Credit: https://unsplash.com/search/road?photo=c0I4ahyGIkA

I used to write predictions for the upcoming year in my older blog. The last one I recall writing was “Software Development Trends and Predictions for 2011”. That’s quite a long time ago. Just to recap, out of 10 predictions, I gather that I got 6 right (i.e. Javascript VM, NoSQL, Big Data Analytics, Private Clouds, Inversion of Desktop Services, Scala); however, the remaining 4 have not gained enough traction (i.e. Enterprise AppStores, Semantic Indexing, OAuth in the Enterprise, Proactive Agents). Actually, AppStores and OAuth didn’t happen in big enterprises, but small companies have adopted this SaaS model in full force. I’ll chalk that prediction failure up to not being able to predict how slowly enterprises actually change! The remaining two predictions, Semantic Indexing and Proactive Agents, have unfortunately not progressed as I had originally projected. I may have overestimated the AI technology of that time. Deep Learning had not yet broken through back then.

My Deep Learning predictions will not be at the same conceptual level as my previous predictions. I’m not going to predict enterprise adoption; rather, I’m going to focus on research trends and predictions. Without a doubt, Deep Learning will drive AI adoption into the enterprise. For those still living under a rock, it is a fact that Deep Learning is the primary driver and the most important approach to AI. However, what is not so obvious is what kind of new capabilities will arise in 2017 that will lead to exponential adoption.

So here come my fearless predictions for 2017.

1. Hardware will accelerate at double Moore’s Law (i.e. 2x in 2017)

This of course is entirely obvious if you track developments at Nvidia and Intel. Nvidia will dominate the space throughout 2017 simply because they have the richest Deep Learning ecosystem. Nobody in their right mind will jump to another platform until there is enough of an ecosystem developed for DL. Intel’s Xeon Phi solutions are dead on arrival, and Intel will likely only catch up in performance with Nvidia by mid-2017, when its Nervana-based chips come to market.

Intel’s FPGA solutions may see adoption by cloud providers simply because of economics. Power consumption is the number one variable that needs to be reduced. Intel’s Nervana-based chip will likely clock in at 30 teraflops by mid-2017. That’s my guesstimate, but given that Nvidia is already at 20 teraflops today, I wouldn’t bet on Intel having a major impact until 2018. The only big ace that Intel may have is its 3D XPoint technology. This will help improve the entire hardware stack, but not necessarily the core accelerator capabilities, considering that GPUs use HBM2 memory stacked in the same package for performance reasons.

Amazon has announced its FPGA-based cloud instances. These are based on Xilinx UltraScale+ technology, offering 6,800 DSP slices and 64 GB of memory on a single instance. That’s impressive capability; however, the offering may be I/O bound because it does not use the HBM version of UltraScale+. The lower memory bandwidth compared with Nvidia, Intel, and even AMD may give developers pause as to whether to invest in a more complicated development process (i.e. VHDL, Verilog, etc.).

In late-breaking news, AMD has revealed its new Radeon Instinct line of Deep Learning accelerators. The specifications are extremely competitive with Nvidia’s hardware. The offering is scheduled to be available in early 2017, which should be enough time for AMD’s ROCm software to mature.

2. Convolutional Networks (CNNs) will Dominate

CNNs will be the prevalent bread-and-butter model for DL systems. RNNs and LSTMs, with their recurrent configurations and embedded memory nodes, are going to be used less, simply because they will not be competitive with CNN-based solutions. Just as GOTO disappeared from the world of programming, I expect the same for RNNs/LSTMs. After all, parallel architectures trump sequential architectures in performance.

Differentiable memory networks will become more common. This is just a natural consequence of architectures where memory is extracted out of the core nodes and resides as a separate component, apart from the computational mechanism. I don’t see the need for the LSTM’s forget, input, and output gates when they can be replaced by auxiliary differentiable memory. We already see conversations about refactoring the LSTM to decouple memory (see Augmented Memory RNN).
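To make the idea concrete, here is a minimal sketch of content-based addressing over an external memory matrix. It is plain numpy with illustrative names of my own choosing (not any particular library’s API): the controller reads with a soft attention over memory slots and writes with an erase/add step, so the memory stays differentiable yet lives outside the recurrent cell.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def read_memory(memory, key, sharpness=10.0):
    """Content-based read: attend over memory rows by similarity to a query key."""
    sims = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8)
    weights = softmax(sharpness * sims)      # differentiable soft addressing
    return weights @ memory, weights         # weighted sum of slots

def write_memory(memory, weights, erase, add):
    """Soft write: erase then add, scaled by the same addressing weights."""
    memory = memory * (1 - np.outer(weights, erase))
    return memory + np.outer(weights, add)

# Toy usage: 8 memory slots of width 4.
rng = np.random.default_rng(0)
M = rng.normal(size=(8, 4))
read_vec, w = read_memory(M, key=rng.normal(size=4))
M = write_memory(M, w, erase=0.5 * np.ones(4), add=rng.normal(size=4))
print(read_vec.shape, M.shape)   # (4,) (8, 4)
```

The point of the refactoring is that the read/write mechanism is shared and external; the controller producing keys and write vectors can then be a much simpler feed-forward or convolutional module.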

3. Designers will rely more on Meta-Learning

When I began my Deep Learning journey, I had thought that optimization algorithms, particularly second-order ones, would lead to massive improvements. Today, the writing is on the wall: DL can now learn the optimization algorithm for you. It is the end of the line for anybody contemplating a better version of SGD. The better version of SGD is the one that is learned by a machine, specific to the problem at hand. Meta-learning is able to adaptively optimize its learning based on its domain. Further related to this is whether alternative algorithms to backpropagation will begin to emerge in practice. There is a real possibility that the hand-tweaked SGD algorithm is on its last legs in 2017.
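As a toy illustration of what “learning the optimizer” means, here is a deliberately crude sketch of my own construction (not any published meta-learning method): an outer loop searches for the coefficients of a momentum-SGD update rule so that an inner training loop, run over a small distribution of quadratic tasks, ends at the lowest loss. Real meta-learning systems replace the random outer search with a learned, often RNN-parameterized, update rule.

```python
import numpy as np

def inner_train(lr, momentum, steps=50, seed=0):
    """Inner loop: momentum SGD on a random quadratic task."""
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(5, 5))
    A = A.T @ A + np.eye(5)                  # random positive-definite quadratic
    x, v = rng.normal(size=5), np.zeros(5)
    for _ in range(steps):
        grad = A @ x
        v = momentum * v - lr * grad         # the update rule being meta-learned
        x = x + v
    return 0.5 * x @ A @ x                   # final loss on this task

def meta_learn(trials=200, tasks=5):
    """Outer loop: search for optimizer coefficients that minimize average final loss."""
    rng = np.random.default_rng(1)
    best_params, best_loss = None, np.inf
    for _ in range(trials):
        lr, mom = 10 ** rng.uniform(-3, -0.5), rng.uniform(0.0, 0.99)
        loss = np.mean([inner_train(lr, mom, seed=s) for s in range(tasks)])
        if loss < best_loss:
            best_params, best_loss = (lr, mom), loss
    return best_params, best_loss

params, loss = meta_learn()
print("meta-learned (lr, momentum):", params, "average final loss:", loss)
```

The “optimizer” found this way is specific to the task distribution it was tuned on, which is exactly the point made above.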

4. Reinforcement Learning will only become more creative

Observations about reality will always remain imperfect. There are plenty of problems where SGD is not applicable. This makes it essential that any practical deployment of DL systems will require some form of RL. In addition to this, we will see RL used in many places in DL training itself. Meta-learning, for example, is greatly enabled by RL. In fact, we’ve already seen RL used to find different kinds of neural network architectures. This is like hyper-parameter optimization on steroids. If you happen to be in the Gaussian Process business, then your lunch has just been eaten.
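Here is a stripped-down sketch of the RL-for-architecture-search idea: a softmax “controller” samples discrete architecture choices, receives a reward, and is updated with REINFORCE against a moving-average baseline. The reward function below is a stand-in I made up; in a real system it would be the validation accuracy of a trained child network.

```python
import numpy as np

# Toy search space: depth and width for a hypothetical child model.
LAYER_CHOICES = [2, 4, 8]
WIDTH_CHOICES = [32, 64, 128]

def fake_reward(layers, width):
    """Stand-in for validation accuracy; a real system would train a child model here."""
    return 1.0 - 0.1 * abs(layers - 4) - abs(width - 64) / 640.0

def reinforce_search(steps=500, lr=0.1, seed=0):
    rng = np.random.default_rng(seed)
    logits_l = np.zeros(len(LAYER_CHOICES))   # controller parameters (depth head)
    logits_w = np.zeros(len(WIDTH_CHOICES))   # controller parameters (width head)
    baseline = 0.0
    for _ in range(steps):
        p_l = np.exp(logits_l) / np.exp(logits_l).sum()
        p_w = np.exp(logits_w) / np.exp(logits_w).sum()
        i = rng.choice(len(LAYER_CHOICES), p=p_l)    # sample an architecture
        j = rng.choice(len(WIDTH_CHOICES), p=p_w)
        r = fake_reward(LAYER_CHOICES[i], WIDTH_CHOICES[j])
        baseline = 0.9 * baseline + 0.1 * r          # moving-average baseline
        advantage = r - baseline
        # REINFORCE: raise the log-probability of sampled choices, scaled by advantage.
        grad_l = -p_l; grad_l[i] += 1.0
        grad_w = -p_w; grad_w[j] += 1.0
        logits_l += lr * advantage * grad_l
        logits_w += lr * advantage * grad_w
    return LAYER_CHOICES[int(logits_l.argmax())], WIDTH_CHOICES[int(logits_w.argmax())]

print(reinforce_search())   # should drift toward (4, 64) under the toy reward
```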

5. Adversarial and Cooperative Learning will be King

In the old days we had monolithic DL systems with single analytic objective functions. In the new world, I expect to see systems with two or more networks cooperating or competing to arrive at an optimal solution that likely will not be in analytic form. See “Game Theory Reveals the Future of Deep Learning”. There will be a lot of research in 2017 on trying to manage non-equilibrium contexts. We already see this now, with researchers trying to find ways to handle the non-equilibrium situation in GANs.
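To give a feel for the two-player dynamics, below is a tiny one-dimensional adversarial sketch with hand-derived gradients and toy data (real GANs use deep networks; nothing here is a recommended recipe). A two-parameter generator tries to fool a logistic-regression discriminator, and the two are updated in alternation; training is a pursuit between two objectives rather than descent on a single analytic loss, which is where the non-equilibrium issues come from.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

a, b = 1.0, 0.0      # generator: g(z) = a*z + b, z ~ N(0, 1)
w, c = 0.1, 0.0      # discriminator: D(x) = sigmoid(w*x + c)
lr, batch = 0.05, 64

for _ in range(3000):
    # Discriminator step: push D up on real samples, down on generated ones.
    x_real = rng.normal(4.0, 1.0, batch)              # "real" data: N(4, 1)
    x_fake = a * rng.normal(size=batch) + b
    d_real, d_fake = sigmoid(w * x_real + c), sigmoid(w * x_fake + c)
    w -= lr * np.mean(-(1 - d_real) * x_real + d_fake * x_fake)
    c -= lr * np.mean(-(1 - d_real) + d_fake)

    # Generator step: update a, b to fool the (now fixed) discriminator.
    z = rng.normal(size=batch)
    x_fake = a * z + b
    d_fake = sigmoid(w * x_fake + c)
    dloss_dx = -(1 - d_fake) * w                      # non-saturating generator loss
    a -= lr * np.mean(dloss_dx * z)
    b -= lr * np.mean(dloss_dx)

print(f"generator output distribution: N({b:.2f}, {abs(a):.2f}^2)")
```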

6. Predictive Learning or Unsupervised Learning will not progress much

“Predictive Learning” is the new buzzword that Yann LeCun is pitching as a replacement for the more common term “Unsupervised Learning”. It is unclear whether this new terminology will be adopted. The important question, though, is whether Predictive Learning will make great strides in 2017. My current sense is that it will not, simply because there seems to be a massive conceptual disconnect as to how exactly it should work.

If you read my previous post about “5 Capabilities of Deep Learning Intelligence”, you get the feeling that Predictive Learning is some completely unknown capability that needs to be shoehorned into my proposed capability model. Predictive Learning is like the cosmologists’ Dark Matter. We know it is there; we just don’t know how to see it. My hunch is that it has something to do with high entropy, or otherwise randomness.

7. Transfer Learning leads to Industrialization

Andrew Ng thinks this is important; I think so too!

8. More Applications will use Deep Learning as a component

We saw this already in 2016, where Deep Learning was used as a function evaluation component inside a much larger search algorithm. AlphaGo employed Deep Learning in its value and policy evaluations. Google’s Gmail auto-reply system used DL in combination with beam search. I expect to see a lot more of these hybrid algorithms rather than new end-to-end trained DL systems. End-to-end Deep Learning is a fascinating area of research, but for now hybrid systems are going to be more effective in application domains.
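The hybrid pattern is simple to sketch: a classical search algorithm calls a learned model to score candidates. In the snippet below the “learned” scorer is a stub I invented purely for illustration (it is not AlphaGo’s value network or Gmail’s model); the interesting part is the interface, where the network is just one component inside a beam search.

```python
def learned_score(sequence):
    """Stand-in for a trained network's score of a partial sequence.
    In a real hybrid system this would be a forward pass through a DL model."""
    # Toy heuristic: prefer longer sequences made of alphabetically early tokens.
    return 0.1 * len(sequence) - 0.01 * sum(ord(tok[0]) for tok in sequence)

def beam_search(vocab, beam_width=3, max_len=5):
    """Classical beam search that ranks candidate expansions with the learned scorer."""
    beams = [([], 0.0)]                              # (partial sequence, score)
    for _ in range(max_len):
        candidates = []
        for seq, _score in beams:
            for token in vocab:
                new_seq = seq + [token]
                candidates.append((new_seq, learned_score(new_seq)))
        candidates.sort(key=lambda pair: pair[1], reverse=True)
        beams = candidates[:beam_width]              # keep only the best partial sequences
    return beams[0]

best_seq, best_score = beam_search(vocab=["a", "b", "c", "d"])
print(best_seq, round(best_score, 3))
```

Swapping in a real trained model for the stub changes nothing structurally, which is why these hybrids are so easy to deploy.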

9. Modularity in Deep Learning will require Design Patterns

Deep Learning is just one of those complex fields that needs a conceptual structure. Despite all the advanced mathematics involved, there is a lot of hand-waving and there are fuzzy concepts that can best be captured not by formal rigor but by a method that has proven effective in other complex domains like software development. I predict practitioners will finally “get it” with regard to Deep Learning and Design Patterns. This will be further motivated by the fact that Deep Learning architectures are becoming more modular rather than monolithic.
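As a small example of what “modular” means in practice (plain numpy, names purely illustrative): a reusable residual block, the kind of unit a design pattern would name and document, composed into a network by a generic pipeline. The pattern here is the skip-connection-plus-transform shape, independent of any particular framework.

```python
import numpy as np

def make_residual_block(width, rng):
    """One reusable 'pattern' unit: x -> x + f(x), where f is a small learned transform."""
    W = rng.normal(scale=0.1, size=(width, width))
    b = np.zeros(width)
    def block(x):
        return x + np.maximum(0.0, x @ W + b)   # identity skip plus a ReLU transform
    return block

def compose(blocks):
    """Modular composition: a network is just a pipeline of interchangeable blocks."""
    def net(x):
        for blk in blocks:
            x = blk(x)
        return x
    return net

rng = np.random.default_rng(0)
net = compose([make_residual_block(8, rng) for _ in range(4)])
print(net(rng.normal(size=8)).shape)   # (8,)
```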

10. Engineering will outpace Theory

The backgrounds of researchers and the mathematical tools they employ are a breeding ground for a kind of bias in their research approach. Deep Learning systems and Unsupervised Learning systems are likely new kinds of things that we have never encountered before. Therefore, there is no evidence that our traditional analytic tools are going to be any help in unraveling the mystery of how DL actually works. There are plenty of dynamical systems in physics that have remained perplexing for decades; I see the same situation with regard to dynamical learning systems.

This situation, however, will not prevent the engineering of even more advanced applications despite our lack of understanding of the fundamentals. Deep Learning is almost like biotechnology or genetic engineering. We have created simulated learning machines; we don’t know precisely how they work, but that’s not preventing anyone from innovating.

I’ll come back to these predictions in a year from now. Wish me luck!

To continue to be updated with the latest in Deep Learning, follow Intuition Machine at Medium or read the Deep Learning Playbook.


John Danskin

Sailing around, having adventures

Nice. As a parallel hardware vendor, I'd love for giant CNNs to replace skinny deep RNNs, but I'm not seeing evidence. The memory based networks that I've seen, including the differentiable memory article you cited, are using RNNs to access the memory. I didn't find non-RNN examples of augmented memory either, but my search wasn't exhaustive. Do you have more evidence? I'm all aboard on your other 9 predictions.

Eduardo Fernandez G.

Head of North America Engagement | Architecture & IT Strategy | Global CTO Division at Banco Santander

Good Article!

Andrey Cheremskuy

Senior Consultant at Deloitte

Regarding unsupervised learning, it seems to me that it will actually advance in 2017, taking into consideration the open source tools Google DeepMind and OpenAI released this year.
