A wander into a Random Forest, and why it's never "too soon" to start learning
This last weekend, with the extra demands of Covid-related childcare closures finally appearing to be behind us, I resolved to spend time on some personal development and a good dose of the outdoors rather than catching up on work (which has become something of the norm for so many of us in recent months). It was bliss.
Earlier in the week I'd caught sight of a mini-course, literally a scratch-the-surface introduction to Python, Data Science and Machine Learning, running for free through analytics-link and Andrew Jones, and thought I'd sign up. I signed my husband up too, I'm nice like that; the Super Bowl is over and Everton won on Saturday, so what else could he want to do than spend a few hours breaking code with his wife? (It turns out he was trying to watch the NASCAR.)
I'm something of a continual learner: I went to university in my late 20s to study Applied Computing and have kept my technical skills up since moving out of industry a few years ago. I can spend hours racking up certificates on DataCamp, but I tend to stick to Statistics, SQL, Excel and business-related conceptual courses since I'm out of the hands-on world these days, as this helps me to better understand customer problems. I've used various viz tools, and for years I've looked at Python, though it always seemed a little out of reach, and I'm not entirely sure why.
If 2020 was my year for developing my skills in the tools I knew, 2021 is the year to try something different. Python it is; used across analytics and data science disciplines, it's a sensible choice for me. At this point I should say that Python has many other uses and is a great choice because it really is very flexible, but for this blog I'm just talking about data.
First up, installing Python, which is pretty straightforward: go to https://www.anaconda.com/products/individual and follow the wizard. Anaconda will preload pretty much all you'll need, at least for a novice.
Next, the IDE. The course introduced Spyder and I have to say, I really loved it. It's the IDE geared to scientific Python and it's very intuitive: the interface is smooth, and you can easily examine your objects and files and review your outputs at a glance.
Next it was time to get dirty with some coding. It's been a while, and I guess this has been part of my hesitation to get going. I can say it's the easiest language I've ever tried to pick up; I'm fortunate to have some background in Object Oriented languages, but much of the syntax makes sense even if you're more familiar with something more procedural. In the course the lessons were simple, designed to get you used to thinking about the language and its uses, and kept accessible. Nonetheless I'm impressed with myself for the simple plot below using pandas, and some of the other coding exercises were a lot of fun (I used sets with difference_update() to run a lightning-quick query for primes up to 1m).
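For anyone curious about the primes exercise, here is a minimal sketch of how a set-based sieve using difference_update() might look. It is not the course's code; the function name primes_up_to and the loop structure are my own assumptions, chosen purely for illustration.

```python
def primes_up_to(limit):
    """Return the set of primes <= limit (hypothetical helper, set-based sieve)."""
    # Start with every integer from 2 to limit as a candidate prime.
    candidates = set(range(2, limit + 1))
    n = 2
    while n * n <= limit:
        if n in candidates:
            # n is prime, so strike out its multiples in one call.
            # Starting at n * n is enough: smaller multiples were
            # already removed by smaller primes.
            candidates.difference_update(range(n * n, limit + 1, n))
        n += 1
    return candidates


primes = primes_up_to(1_000_000)
print(len(primes))         # 78498 primes below one million
print(sorted(primes)[:5])  # [2, 3, 5, 7, 11]
```

The appeal of difference_update() here is that it removes a whole range of multiples from the set in place, so there is no need to loop over them one at a time in Python.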
Next was a jump start into Machine Learning and actually training a model using a Random Forest algorithm (below). This is where my husband and I got hooked; outside of the professional sphere we are both avid NFL fans and members of a fantasy football league, so it will be interesting to see which of us wins next year using our newly acquired skills! We have around 7 months to up our predictive model game...
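To give a flavour of what training a Random Forest involves, here is a rough sketch using scikit-learn, which ships with Anaconda. I'm assuming scikit-learn here, and the data is synthetic (generated by make_classification), so this is not the course's dataset or code, just the general shape of the workflow: split the data, fit the model, then score it on rows it hasn't seen.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption for illustration only):
# 500 rows, 8 numeric features, a binary target.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, random_state=42
)

# Hold back 20% of the rows so the model is scored on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# A Random Forest is an ensemble of decision trees; here we build 100.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Predict on the held-back rows and measure how often we were right.
predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy on held-back data: {accuracy:.2f}")
```

For a fantasy football model, the synthetic X and y would be swapped for real player stats and whatever outcome you want to predict.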
The point of this post, really, is to say that something had held me back from attempting to learn Python, but start to finish this only took a Sunday evening. It's never too late to learn new skills, and you don't know where the journey will take you, either professionally or personally. I'm hoping for the top spot next fantasy season; I've just got to get going with web scraping for some stats and the viz libraries. This seems a perfect project to while away the boring lockdown evenings. Hopefully my marriage can handle the competition - I finished bottom last year! Wish me luck...
(With thanks to Andrew Jones at analytics-link for allowing me to share the screen grabs)
Love this, well done you!
Love this!!! It's your very own version of #PDbeforeTV
Amazing work Lucy G. (and your husband) this is such a cool write-up! Good luck with your NFL predictions - you'll have to let me know how you get on... Or, sign up to the full-course where the section on Deep Learning is soon to be released and annihilate your competition. All the best for your future studies, you're going to be amazing!