5 Ways to Get Started with Machine Learning

    Lucero del Alba

    Machine learning has taken off and it’s doing so with fury, bringing new insights to every single industry. If you want to be in demand, this is a skill that will put you on the front line. As intimidating as it may seem, it’s surprisingly easy if you approach it the right way.

    Machine learning (ML) is a fascinating practice and field of study. It’s what allowed the introduction of self-driving cars, robots that can clean your house, navigation systems for drones of all kinds, the recommendation systems behind YouTube and Netflix, face recognition systems, handwriting recognition, game playing, and lots more.

    And because of its incredibly high value and somewhat cryptic nature, it’s an expertise in very high demand that keeps expanding to different areas — which just five years ago would have seemed inconceivable. Through this article, we’ll see different practical ways to approach it.


    “Pardon Me … but What is Machine Learning?”

    ML is a branch of artificial intelligence (AI). As Arthur Samuel — one of the pioneers in the field — put it, ML gives “computers the ability to learn without being explicitly programmed”. That is, instead of programming a computer (or robot) to do something, you give information and set the framework to let the system program itself.

    Freaking fascinating? Yes. We won’t get into the details of this seemingly impossible thing here; instead, we’ll point you to the right places where you can find them for yourself.

    Before Starting, a Word of Caution

    ML is something of an advanced practice: you’ll need not only some foundations in computer science, but also the ability to code in at least one programming language. Some popular programming languages for ML are Python, R, Java, C, and MATLAB, among others.

    1. Start Very Quickly … Like, Really, in Less than Ten Minutes

    Sometimes, and for some people, it’s better to just get hands-on with something to have a first taste and develop an intuition of what this new art or skill is about, and then dig deeper into the specifics and details.

    Google’s Machine Learning Recipes with Josh Gordon is just that — a straightforward and practical approach to ML. Using the Python scikit-learn and TensorFlow libraries, Josh will walk you through very practical examples and down-to-earth explanations behind the very principles of ML.

    Here’s the first 7-minute video of the series, introducing a supervised learning algorithm in Python — in just six lines of code!:

    The publishing schedule is somewhat irregular, with videos published every month or two, covering topics such as decision trees, feature selection, pipelines, and classifiers. That’s not bad at all for 6-to-8 minute videos that anyone with a little foundation in programming can follow.
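    To give a flavor of what a supervised classifier does, here is a minimal sketch that uses only the Python standard library. It is not the code from the videos and it does not use scikit-learn: it is a hand-rolled nearest-neighbor classifier, and the fruit measurements and the `predict` helper are made up for illustration. The idea is the same, though: you hand over labelled examples instead of writing rules.

```python
import math

# Made-up training examples: [weight in grams, texture],
# where texture 1 means smooth and 0 means bumpy.
features = [[140, 1], [130, 1], [150, 0], [170, 0]]
labels = ["apple", "apple", "orange", "orange"]


def predict(sample):
    """Return the label of the training example closest to `sample`."""
    distances = [math.dist(sample, known) for known in features]
    return labels[distances.index(min(distances))]


# A heavy, bumpy fruit the classifier has never seen before.
prediction = predict([160, 0])
print(prediction)
```

    Nothing in the code says what an orange is; the answer comes from the examples. With a library such as scikit-learn, the hand-written `predict` function would be replaced by a ready-made classifier.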

    2. Take Courses from Top-Notch Universities, for Free

    If you’re hungry for quality knowledge, you may have heard by now about Coursera, edX, Udacity, and many, many others. We’re talking MOOCs, or massive open online courses.

    Let’s break it down quickly:

    • massive: they have no vacancy limits, and can be accessed by as many people as desired.
    • open: anyone can access them, regardless of their age and previous knowledge of the topic, and regardless of whether they’re able to pay for a certificate.
    • online: all you need is a device connected to the Internet; even a mobile phone would do.
    • course: these are actual courses with reading materials, practical exercises, and even deadlines.

    Let’s see some courses you could start with.

    Stanford’s Andrew Ng Machine Learning

    Stanford Prof. Ng is a leading researcher in the field of artificial intelligence, and is the person who pretty much started the MOOC spark that would later turn into a fire of knowledge when he first put his Machine Learning course online. The response was overwhelming, with many thousands of people from all around the world taking the course and discussing this topic. He later turned this course into what is today Coursera, the leading provider of MOOCs.

    The course is as fabulous as it is challenging. I remember having spent an hour or so just to read a 5-page assignment scope before I was able to grasp it. So unlike Josh Gordon’s series, this is a little more on the academic side, but with a lot of practical knowledge and advice that will be very useful later on in your ML practices. But it is doable, and the amount of feedback on the forums is truly overwhelming. Mind you, it was among the first MOOCs I ever took, and one of the best.

    Course details:

    • Approx. duration: 2–5 months
    • Difficulty: high
    • Workload: mid-to-heavy

    Sebastian Thrun’s Intro to Artificial Intelligence

    Also a professor and AI researcher at Stanford (in the field of robotics), co-founder of Google X Lab (the “semi-secret” R&D company behind Google’s self-driving cars, among other projects), Sebastian is also the founder of a major MOOC provider, Udacity. Along with Peter Norvig (Director of Research at Google), he put together the amazing Intro to Artificial Intelligence.

    This is pretty much the foundation to all things ML. It’s a lot lighter than Andrew’s course, with its content spread over more units to make it easier to digest, though it’s a long one.

    Course details:

    • Approx. duration: 4 months
    • Difficulty: intermediate
    • Workload: light

    Caltech’s Yaser S. Abu-Mostafa Learning from Data

    Prof. Abu-Mostafa is another pioneer in putting quality learning material online: he made his Learning from Data ML course available on his website, with all of its lectures, learning materials and exams, even before MOOCs were a thing. Later he would package these materials into a MOOC offered regularly by Caltech on edX.

    I took this one as well, and I can tell you that you’ll have to do some heavy lifting here. But if you’ve enjoyed Andrew’s course and are hungry for more foundations, this seems like a reasonable next step.

    Course details:

    • Approx. duration: 4 months
    • Difficulty: very high
    • Workload: very heavy (10–20 hours per week)

    Other Coursera, edX and Udacity Courses

    There’s a very extensive offering of ML and AI courses that you can take for free, not only at Coursera, edX, and Udacity, but at other MOOC providers as well, such as DataCamp, though data science seems to be something of a niche for the three providers we’ve discussed.

    3. Get Certified Education, for a Fraction of the Price

    So far, we’ve talked about free MOOCs. They’re awesome, and you don’t need to pay a cent to enroll in them and start learning. In the beginning, these providers used to offer free certificates or statements of accomplishment, some of them even verifiable online. These programs, however, have been discontinued, so in most cases you won’t get a certificate or any type of credential that you could use to demonstrate your education to a potential employer, or even to another higher education institution.

    This may not be a problem if you just want to learn for the sake of it, and even use this knowledge to leverage a successful career as a freelancer, as many professionals already do around the globe. But applying for work can be a different matter, and certs and degrees do ease the way in many cases, so let’s discuss them.

    Verified Courses

    A verified course might cost somewhere between $40 and $200, depending on the course and the institution. Basically, you pay a premium to get your identity and assignments verified (this is what a verified certificate looks like). You can find out more about Coursera’s Course Certificates and edX’s Verified Certificates. You’ll find they both have a huge offering of ML and data-science–related verified courses, as you can see on this edX search.

    Notice that, whether you pay or not, the contents and materials of the course are exactly the same. What you get by paying is the certification that you actually took and passed the course.

    Coursera Specializations

    Coursera took the concept of verified courses a step further by grouping some related courses and adding a capstone project to give you a specialization certificate.

    Some specializations of interest to us are:

    Specialization                          Courses   Institution
    Big Data                                6         UC San Diego
    Deep Learning                           5         deeplearning.ai
    Machine Learning                        4         University of Washington
    Recommender Systems                     5         University of Minnesota
    Introduction to Robotics                6         University of Pennsylvania
    Probabilistic Graphical Models (PGMs)   3         Stanford University

    Coursera Master’s Degree

    Coursera’s Master of Computer Science in Data Science (MCS-DS) is an actual, official master’s degree issued by an accredited university. Topics in the program are heavily ML-related, and include:

    • data visualization
    • machine learning
    • data mining
    • cloud computing
    • statistics
    • information science

    Course details:

    • Institution: University of Illinois at Urbana-Champaign
    • Price: $600 per credit hour, for $19,200 in total tuition
    • Duration: 32 credit hours

    edX XSeries and Professional Certificates

    edX has an XSeries Program for courses within a single topic, in pretty much the same fashion as Coursera’s Specializations. Such series of interest to us include:

    Series                                       Courses   Institution          Cost
    Microsoft Azure HDInsight Big Data Analyst   3         Microsoft            $49–99 per course
    Genomics Data Analysis                       3         Harvard University   $132.30
    Data Analysis for Life Sciences              4         Harvard University   $221.40
    Data Science and Engineering with Spark      3         UC Berkeley          $49–99 per course

    edX also has Professional Certificate Programs for “critical skills,” including Data Science and Big Data, both offered by Microsoft.

    edX MicroMasters and College Credit

    You also have credit-eligible courses, which are not only verified, but may also let you claim credit toward your B.S. or master’s degree. There are, naturally, a lot of details in the fine print, so you’ll have to do some extra research.

    edX MicroMasters are precisely in this vein. Here are some interesting ones (costs are higher here, as you also pay hours of tuition towards a degree):

    Program                   Courses   Institution                  Cost
    Artificial Intelligence   4         Columbia University          $1,200
    Big Data                  5         University of Adelaide       $1,215
    Data Science              4         UC San Diego                 $1,260
    Robotics                  4         University of Pennsylvania   $1,256

    Find out more about earning university credit on edX, and read the MOOCs for Credit report by Class Central.

    Udacity’s Nanodegrees

    A nanodegree is something of a degree, issued by Udacity. While Udacity isn’t itself an accredited educational institution, they went to great lengths to partner with tech industry leaders to deliver the most market-targeted education possible. In other words, the aim is to prepare you specifically with the skills that the labor market is demanding right now.

    And we’re really talking big names, here: Google, Amazon, IBM, Nvidia, Mercedes-Benz, DiDi, AT&T, among many others. And Udacity’s partners not only co-design the study programs, but even have hiring agreements with Udacity!

    Udacity and their partners even go as far as to publish estimated salary figures:

    Program                   Time                Estimated salary
    Artificial Intelligence   6 months            $59.4K to $250K
    Deep Learning             TBD                 TBD
    Machine Learning          6 months            $38.7K to $212K
    Robotics                  two 3-month terms   $42K to $156K
    Self-Driving Car          9 months            $67.8K to $265K

    Get a job or your money back!

    In fact, the ML nanodegree is part of the Nanodegree Plus program, which is probably one of the most reckless innovations in online learning: you study and graduate, and if you don’t get a high-paying job, Udacity refunds your tuition! Unbelievable.

    4. Enroll in Online Competitions: Learn and Win Money (If You Get Good at It)

    Kaggle is an online platform (now part of Google) for predictive modeling and analytic competitions, where companies and researchers from around the world post data sets and statistics, for competitors to find models that will make predictions and explain the data — more often than not, using ML.

    Competitions have improved the gesture recognition software for Microsoft Kinect, the search for the Higgs boson at CERN, and even made ground-breaking advancements in biology and medicine, among other fields. And it must be noted that many of the winners had no prior knowledge of physics, chemistry, or any of the fields of study of the competitions, as you’ll read in Kaggle’s winners’ interviews.

    And you can win money! In fact, big money (for details on the $3 million prize on a Kaggle competition, see “The latest incentive contest aims to predict hospitalizations by harnessing spare grey cells”).

    Individually or in teams

    You can join for free, choose the programming language and algorithms you prefer, and start participating immediately in any competition. There are very active forums where you can get a lot of insight into what competitors do on real ML challenges, and even partner with them to form teams and share the prize should your team win a competition.

    But even if you don’t win a competition, you’ll learn a lot in the process by approaching real data sets and discussing the ins and outs of data modeling for making predictions with other ML practitioners.

    Follow the leaderboard

    Kaggle has super cool live rankings for ongoing competitions, making the whole process feel like an actual competition.


    But beware! As you’ll learn sooner or later, a model that predicts the test data extremely accurately might get you some points on the leaderboard, but kill you later when new data is introduced (overfitting, hello!).
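    Overfitting is easier to see with numbers. The following is a minimal sketch on made-up toy data, not a Kaggle workflow: the `memorizer` and `threshold` models, the data points and the "label is 1 when x >= 5" rule are all assumptions chosen for illustration. One model memorizes the known examples, noise included; the other uses a single simple cut-off.

```python
# Made-up toy data: the true rule is "label is 1 when x >= 5",
# but two of the known examples (x=3 and x=7) carry noisy labels.
known = [(1, 0), (2, 0), (3, 1), (4, 0), (6, 1), (7, 0), (8, 1), (9, 1)]
# New examples, labelled by the true rule.
new = [(1.5, 0), (2.9, 0), (3.2, 0), (6.8, 1), (7.3, 1), (8.5, 1)]


def memorizer(x):
    """Overfit model: copy the label of the closest known example, noise included."""
    nearest = min(known, key=lambda pair: abs(pair[0] - x))
    return nearest[1]


def threshold(x):
    """Simple model: a single cut-off at x = 5."""
    return 1 if x >= 5 else 0


def accuracy(model, data):
    """Fraction of examples in `data` that `model` labels correctly."""
    return sum(model(x) == y for x, y in data) / len(data)


for name, model in [("memorizer", memorizer), ("threshold", threshold)]:
    print(name, accuracy(model, known), round(accuracy(model, new), 2))
```

    On this toy data the memorizer scores 1.0 on the examples it has already seen but only about 0.33 on the new ones, while the simple threshold scores 0.75 and 1.0. A perfect score on known data says little about how a model will do on data it hasn’t seen.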

    5. Apply for a Job!

    As with pretty much everything, you’ll get better the more you challenge yourself and work at it. Solo or as part of an organization, if you can do ML, you’ll be in demand.

    As a freelancer

    Working on ML as a freelancer is totally possible, and with time you could get to have a decent income by only working sparingly on ML projects.

    Sites like Freelancer, Upwork, or Guru can be a starting point for working on small to mid-size projects. But beware: this is an international and very competitive arena, and building a portfolio and your own network of clients from scratch when you start solo can prove very challenging in the beginning.

    In a startup

    We live in a data-abundant era, and this is a trend that will only increase. Startup companies, often working with technology, are especially eager for engineers who can manage data and gain valuable insight from it.

    Once you have built a solid foundation, search the local job boards for tech companies, and apply even if they aren’t openly looking for an ML engineer. Tell them how much value you can bring to their business with your data mining and analytics abilities.

    In a regular company

    ML engineers are also in high demand in industries such as finance, medicine, chemistry, and even in unexpected places such as social sciences if large datasets are available.

    Applying won’t be easy, as you’ll need not only some credentials for your engineering skills, but also some knowledge of whichever industry you’re applying to. (For example, a “risk management analyst” position in a bank will require not only ML skills, but also a B.S. or master’s degree in finance or credit.) However, if you’ve somehow built these skills, rest assured you’ll be aiming at a top-paying job.

    What to Do Next

    You wanted to start with ML, and fortunately you’ve got choices:

    • Want to have a quick intuition on ML? Watch Josh Gordon’s videos and start coding in minutes.
    • Want to be at the deep learning vanguard? Take a specialized course and apply those techniques to a specific challenge.
    • Want to build a career on ML? Get some credentials and apply for a job.
    • Interested in the field on an academic level? You’re in luck, as there’s plenty of quality material available!

    ML is one of the few disciplines in IT that we can predict will still be trending for some time into the future. The algorithms may change, the techniques may improve and new libraries and approaches may be introduced, but we’re just at the beginning of letting machines learn by themselves.