Training for latent variable energy based models
Alfredo Canziani
Assistant Professor of Computer Science at NYU Courant Institute of Mathematical Sciences
This week we went through the second part of my lecture on latent variable energy based models.
We've warmed up the temperature a little, moving from the freezing zero-temperature free energy F∞(y) (the one you see spinning below) to a warmer Fβ(y).
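(In case the notation slipped by: here's a minimal sketch of the two quantities, assuming the energies E(y, z) have already been evaluated on a discrete grid of z's; the names are mine, not the lecture's.)

```python
import torch

def free_energies(E, beta):
    """Free energies from a vector of energies E(y, z), one entry per z.

    F_inf(y)  = min_z E(y, z)                           -- zero temperature
    F_beta(y) = -1/beta * log sum_z exp(-beta E(y, z))  -- finite temperature
    """
    F_inf = E.min()
    F_beta = -torch.logsumexp(-beta * E, dim=0) / beta
    return F_inf, F_beta
```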
Be careful with that thermostat! If it gets too hot, you'll end up killing your latents, averaging them all out indiscriminately, and you'll be left with plain boring MSE (fig 1.3)!
From fig 2.1–3, you can see how more z's contribute to Fβ(y).
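The "thermostat" is just the Gibbs weight each z receives inside Fβ (standard algebra, nothing specific to the lecture code):

```latex
w_\beta(z \mid y) = \frac{\exp(-\beta E(y, z))}{\sum_{z'} \exp(-\beta E(y, z'))},
\qquad
\begin{cases}
\beta \to \infty: & \text{one-hot on } \arg\min_z E(y, z) \text{ (only the best } z \text{ counts)}\\[2pt]
\beta \to 0: & \text{uniform over all } z \text{ (everything averaged out: MSE-land)}
\end{cases}
```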
This is nice, 'cos during training (fig 3.3, bottom) *The Force* will be strong with a wider region of your manifold, and no longer with the single Jedi. This in turn leads to a more even pull and avoids overfitting (fig 3.3, top). Still, we're fine here because z ∈ ℝ: a single scalar latent can't soak up too much information.
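Concretely, autograd through the logsumexp spreads the gradient over the z's according to those same weights, so every step pulls on the whole active region of the manifold rather than on one point. A hypothetical training step (decoder, z_grid, and the squared-error energy are my placeholders, not the lecture's code):

```python
import torch

def train_step(decoder, y, z_grid, beta, optimizer):
    # One reconstruction per candidate latent z, then a squared-error energy per z.
    E = ((decoder(z_grid) - y) ** 2).sum(dim=-1)          # shape: (num_z,)
    # Finite-temperature free energy: a soft minimum over the latents.
    F_beta = -torch.logsumexp(-beta * E, dim=0) / beta
    optimizer.zero_grad()
    F_beta.backward()   # dF_beta/dw = sum_z w_beta(z|y) * dE(y, z)/dw
    optimizer.step()
    return F_beta.item()
```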
Finally, switching to the conditional / self-supervised case, where we introduce an observed x, requires changing 1 line of code!
Basically, self-sup and un-sup are super close in programming space!
So, learning the horn was easy peasy!
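I won't pretend to know which line it is in the actual notebook, but conceptually the one-line diff is something like this (hypothetical names):

```python
# Unconditional: the energy only compares y against a decoded latent.
E = ((decoder(z) - y) ** 2).sum(dim=-1)

# Conditional / self-supervised: the observed x is fed to the decoder too.
E = ((decoder(z, x) - y) ** 2).sum(dim=-1)
```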
Made with @matplotlib as usual.
For reference, the (now correct) training data is shown below. Notice that ∞ y's (an entire ellipse) are associated with a given x. So, you cannot hope to train a plain neural net on (x, y) pairs. That model would collapse into the segment (0, 0, 0) → (1, 0, 0).
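If you want to see the collapse for yourself, here's a rough stand-in for the data (my own ellipse parameters, not the lecture's):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=10_000)
z = rng.uniform(0.0, 2 * np.pi, size=10_000)             # the scalar latent (an angle)
y = np.stack([0.5 * x * np.cos(z), 0.3 * x * np.sin(z)], axis=1)

# A least-squares regressor x -> y can only predict E[y | x], which is (0, 0)
# for every x: its predictions trace the segment (0, 0, 0) -> (1, 0, 0)
# in (x, y1, y2) space.
print(y[x > 0.9].mean(axis=0))   # ≈ [0, 0]
```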
One more note.
Fig 1.4 shows the *correct* terminology vs. what is currently commonly used.
Actual softmax → "logsumexp" (scalar).
Its derivative, softargmax → "softmax" (pseudo probability).
This is analogous to max and its derivative, argmax. But softer.
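You can check the "derivative" claim in a couple of lines of PyTorch:

```python
import torch

s = torch.randn(5, requires_grad=True)
torch.logsumexp(s, dim=0).backward()     # the actual soft max: a scalar
print(torch.allclose(s.grad, torch.softmax(s, dim=0)))
# True — the gradient of logsumexp is what frameworks call "softmax" (really softargmax).
```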