Europython 2018 - Day 4
Fourth day of the conference. Fatigue starts to take its toll, with some people taking naps wherever they can find a quiet place in the conference center! Here are my daily takeaways.
Deep Learning with PyTorch for Fun and Profit
Alexander Hendorf is a senior developer. After attending a presentation about AI in the arts, he decided to play with the idea. He chose PyTorch as his development framework because of:
- "research friendliness"
- accessibility to "Pythonistas"
- presence of a lively community
- support by Facebook
He started by applying "style transfer" between comics, or between comics and photographs. The results he showed were impressive.
He then started a project around "Die drei", a German audio series for teenagers, which has 200+ episodes with audio and transcripts available. He set out to create new "Die drei"-like episodes.
He first tried to generate realistic texts based on the style of the actual scripts. His results were mixed at best. He then tried voice synthesis. Same story: the results were not of commercial quality. When he tried to create illustrations, his results were much better. He has not tried to generate a plot yet, but sees this as a difficult exercise.
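To get a feel for what "generating text in the style of a corpus" means, here is a minimal sketch using a word-level Markov chain. This is a deliberately simple stand-in, not the deep-learning approach he used; the tiny corpus below is invented for illustration.

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words that follow it in the corpus."""
    chain = defaultdict(list)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        chain[current].append(nxt)
    return chain

def generate(chain, start, length=10, seed=42):
    """Walk the chain, picking a random successor at each step."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        successors = chain.get(out[-1])
        if not successors:
            break  # dead end: the last word never had a successor
        out.append(rng.choice(successors))
    return " ".join(out)

# Toy corpus -- a real experiment would use the actual transcripts.
corpus = "the detectives solve the case and the detectives find the clue"
chain = build_chain(corpus)
print(generate(chain, "the", length=6))
```

The output is locally plausible but globally meaningless, which matches his observation that generated text rarely reaches commercial quality.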
His remarks:
- do not hesitate to "play" with a toy project
- beware of cherry picking: usually, only the best results are shown. There can be a lot of chaff, which is rarely shown.
- there is a lot of material available on the Internet, but code quality is "not uniform": there is still a lot of Python 2.7 code (to be retired at the end of 2019), closures, ...
- Beware of hype
When to use Machine Learning: Tips, Tricks and Warnings
Pascal van Kooten is a senior data scientist working for Jibes Data Analytics. Over five years, he has contributed to a dozen projects in as many companies.
To him, machine learning is a subset of Artificial Intelligence. The objective is to generalise from observations.
He explained some projects, mostly personal, and drew the following conclusions:
- simpler is better than complex
- analyse before starting to apply machine learning
- machine learning is probably not very suitable in environments with strict rules/strong compliance requirements
- when you start using machine learning:
  - build a domain-specific platform
  - don't start with the most complex problems
  - do not over-optimise
  - do not underestimate the work to be done outside the models proper: data preparation, exploitation of results
  - use cross-validation and anomaly detection
  - the models you build must be able to go to production
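The cross-validation he recommends can be illustrated with a minimal hand-rolled k-fold loop. This sketch uses a trivial "predict the training mean" model purely to keep the example self-contained; in practice you would plug in a real model (e.g. via scikit-learn's `cross_val_score`).

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(ys, k=5):
    """K-fold cross-validation of a trivial mean predictor.

    Returns the mean absolute error averaged over the k folds."""
    folds = k_fold_indices(len(ys), k)
    errors = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_ys = [y for i, y in enumerate(ys) if i not in held_out]
        prediction = sum(train_ys) / len(train_ys)  # the "model": predict the mean
        fold_error = sum(abs(ys[i] - prediction) for i in test_idx) / len(test_idx)
        errors.append(fold_error)
    return sum(errors) / len(errors)

ys = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
print(round(cross_validate(ys, k=5), 2))
```

The point is that every observation is used for testing exactly once, giving a more honest error estimate than a single train/test split.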
Alisa Dammer works as a developer for Joblift. She presented a toy project: predicting the sleep cycle of a typical IT student. One of her friends recorded his sleep cycle (about 1,000 observations), and she tried to model the data.
Treating the dataset as a one-dimensional time series, she had little success: about 30% accuracy. She was able to improve the score with feature engineering: she added categorical variables linked to time, such as day/night, season, meal times, and finally the academic calendar (exams, holidays). She eventually reached 95% accuracy.
Her conclusions:
- feature engineering is important for small datasets/low dimensionality
- feature engineering is complex, time consuming and requires domain-specific expertise.
- she tested different kinds of models, and it appeared that feature engineering had more impact on the end result than the choice of model.
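The kind of time-based categorical features she describes can be derived directly from a timestamp. This is a small illustrative sketch; the exact hour boundaries and meal times are assumptions, not her actual feature definitions.

```python
from datetime import datetime

SEASONS = ["winter", "winter", "spring", "spring", "spring",
           "summer", "summer", "summer", "autumn", "autumn",
           "autumn", "winter"]  # month 1..12 -> season

def time_features(ts):
    """Derive simple categorical features from a timestamp,
    in the spirit of the day/night and season variables described above.
    Hour boundaries and meal times are illustrative guesses."""
    hour = ts.hour
    return {
        "is_night": hour < 7 or hour >= 22,
        "season": SEASONS[ts.month - 1],
        "is_mealtime": hour in (8, 12, 13, 19, 20),
        "weekday": ts.strftime("%A"),
    }

print(time_features(datetime(2018, 7, 26, 23, 30)))
```

Each raw timestamp thus becomes several low-cardinality categorical columns, which is exactly the sort of enrichment that can lift a small, low-dimensional dataset.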
More Than You Ever Wanted To Know About Python Functions
Mark Smith works as a developer advocate for Nexmo. He guided us through the subtleties of function definitions, closures and methods. The talk was full of practical tips, but difficult to summarise. His slides and code will be made available.
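One classic closure subtlety of the kind such talks cover (I cannot reproduce his exact examples, so this is a well-known one): closures capture variables, not values, so every lambda created in a loop sees the loop variable's final value.

```python
# Late binding: the loop variable is looked up when the closure is
# *called*, not when it is defined.
fns = [lambda: i for i in range(3)]
print([f() for f in fns])  # [2, 2, 2], not [0, 1, 2]

# The usual fix: bind the current value as a default argument,
# which is evaluated at definition time.
fns_fixed = [lambda i=i: i for i in range(3)]
print([f() for f in fns_fixed])  # [0, 1, 2]
```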
Heid Thorpe works as a data scientist for the Australian government.
This is probably the strangest talk I have seen so far. She explained how she "invents" data with a deep learning (LSTM) system. To me, with a scientific background, inventing data is a cardinal sin. But her objective is, in fact, to invent fake but realistic test data. This can be used to test applications, but also to test security. Testing, and creating test datasets, are boring activities that deserve to be automated.
She uses an LSTM neural network. She needs some examples of "real" data and, from there, she can create huge test datasets, which can be fed into automated test systems.
She first fed Shakespeare sonnets into the system and, after a few epochs, the system generated realistic (although meaningless) texts. She then showed how she could use it to generate the XML part of .docx files.
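The idea of fake-but-plausible test data can be conveyed with a far simpler template-based generator (no LSTM involved). The field names and value pools below are invented for illustration; her system learns such patterns from example data instead of hard-coding them.

```python
import random

# Hypothetical value pools -- a learned generator would infer these
# patterns from real example records instead of hard-coding them.
FIRST_NAMES = ["Anna", "Ben", "Carla", "David"]
LAST_NAMES = ["Meyer", "Smith", "Nguyen", "Rossi"]

def fake_record(rng):
    """Produce one fake but plausible-looking test record."""
    return {
        "name": f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}",
        "age": rng.randint(18, 90),
        "email": f"user{rng.randint(1000, 9999)}@example.com",
    }

rng = random.Random(7)  # seeded for reproducible test datasets
dataset = [fake_record(rng) for _ in range(100)]
print(dataset[0])
```

Even this crude version shows the payoff: thousands of structurally valid records on demand, with no real (and possibly sensitive) data involved.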