Python

Python is the preferred programming language for many data scientists. It is straightforward to use, especially for users with no prior programming experience. Many of the best data science and machine learning libraries are built primarily for Python. Python has an active development community, with new libraries and additions to the core codebases made often. There are dedicated environment tools for data scientists, including Anaconda and virtualenv, which allow users to activate a self-contained set of libraries for their work with a single line of code.

Python is lightweight and efficient at executing code. It is often faster than other commonly used tools such as MATLAB and Stata. It supports object-oriented, structured, and functional programming styles, meaning it is adaptable to a wide range of applications. As an example of its scalability and flexibility, YouTube uses Python for video, control templates, canonical data access, and more.
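To make the point about programming styles concrete, here is a minimal sketch that solves one small task (summing the squares of the even numbers in a list) in structured, functional, and object-oriented style. The function names, the class name, and the sample data are made up for illustration and do not come from any of the courses below.

```python
from functools import reduce

numbers = [1, 2, 3, 4, 5, 6]  # made-up sample data


# Structured (procedural) style: an explicit loop and an accumulator.
def sum_even_squares_procedural(values):
    total = 0
    for value in values:
        if value % 2 == 0:
            total += value * value
    return total


# Functional style: chain filter, map, and reduce without mutating state.
def sum_even_squares_functional(values):
    evens = filter(lambda v: v % 2 == 0, values)
    squares = map(lambda v: v * v, evens)
    return reduce(lambda acc, v: acc + v, squares, 0)


# Object-oriented style: keep the data and the behaviour together in a class.
class NumberCollection:
    def __init__(self, values):
        self.values = list(values)

    def sum_even_squares(self):
        return sum(v * v for v in self.values if v % 2 == 0)


print(sum_even_squares_procedural(numbers))         # 56
print(sum_even_squares_functional(numbers))         # 56
print(NumberCollection(numbers).sum_even_squares())  # 56
```

All three versions give the same answer; which one reads best usually depends on the size of the project and on personal taste.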

Another widely used programming language in the data science community is R. Both Python and R are easy to pick up and are among the simplest languages to use. Which language to start with boils down to personal preference, so start with the one that appears most intuitive to you. Becoming proficient in one language will carry over to other languages, as thinking like a programmer is a skill in itself. You will have access to more libraries and resources by learning Python first. In the long term, learning both Python and R will make you more adaptable and desirable for employment; plus, it never hurts to continue learning!

The courses I have listed below focus on both pure Python fundamentals and applications in data science. There is an emphasis on project building in each course as the best way to learn programming is by doing. It is helpful to have book companions alongside you as you work through your course of choice. The two books I recommend for companions are as follows:

  1. Python Data Science Handbook – Jake VanderPlas: If you have little or no knowledge of data science and want to dive into the field, this book is perfect for you. It goes into further depth than many online courses and covers the essentials for a data scientist using Python. The book includes the following sections: Interactive Python (IPython), NumPy, Pandas, Data Visualization, and Machine Learning. The book assumes that the reader knows the basics of Python, so use it after going through an introductory course.
  2. Python Crash Course – Eric Matthes: This book is great for Python beginners. It includes a perfect balance of teaching, coding examples, and code writing. You will be able to go through meaningful projects that have real-life applications.

Now on to the online courses!

TL;DR

  1. Complete Python Bootcamp: Go from Zero to Hero in Python 3 – Jose Portilla
  2. Python for Data Science and Machine Learning Bootcamp – Jose Portilla
  3. Python Fundamentals – Pluralsight
  4. Python for Data Science – University of California, San Diego
  5. Python for Data Science – Dataquest

Complete Python Bootcamp: Go from Zero to Hero in Python 3 – Jose Portilla, Udemy

Front page for Complete Python Bootcamp, Jose Portilla on Udemy

Pricing: £39.99 with frequent discounts

Course Materials:

  • Object and Data Structure Basics
  • Comparison Operators
  • Statements
  • Methods and Functions
  • Object Oriented Programming
  • Modules and Packages
  • Errors and Exceptions Handling
  • Decorators
  • Generators
  • Advanced Python Modules
  • Web Scraping
  • Working with Images
  • Working with PDFs, Spreadsheets, CSV files
  • Emails
  • Final Capstone Project
  • Advanced Python Objects and Data Structures
  • GUIs
  • Unit Testing

This course is one of the most widely used and highly recommended courses for learning Python from the basics to the intermediate level. It covers a wide variety of topics across more than 100 lectures. One of the standout features of this course is that the teacher points you to additional materials to explore alongside the lectures. Referring to extra reading material is a vital habit and ensures you get the most out of your course. The course ramps up in difficulty relatively quickly, so spend as much time as you need to understand the fundamentals. There are multiple milestone projects and a final capstone project, so you will have plenty of opportunities to test what you have learned.
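Decorators and generators, two of the topics in the list above, tend to be where the difficulty first ramps up. As a rough preview, here is a minimal sketch of each; it is not taken from the course, and the names `count_calls`, `square`, and `countdown` are made up for illustration.

```python
import functools


def count_calls(func):
    """Decorator that records how many times func has been called."""
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)

    wrapper.calls = 0
    return wrapper


@count_calls
def square(x):
    return x * x


def countdown(start):
    """Generator that lazily yields start, start - 1, ..., 1."""
    while start > 0:
        yield start
        start -= 1


squares = [square(n) for n in countdown(3)]
print(squares)       # [9, 4, 1]
print(square.calls)  # 3
```

The decorator wraps `square` without changing its body, and the generator produces its values one at a time instead of building a full list in memory.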

Python for Data Science and Machine Learning Bootcamp – Jose Portilla, Udemy

Front page for Python for Data Science and Machine Learning Bootcamp, Jose Portilla on Udemy

Pricing: £44.99 with frequent price reductions

Course Materials:

  • Predict Movie Box Office Revenue with Linear Regression
  • Python Programming for Data Science and Machine Learning
  • Introduction to Optimisation and the Gradient Descent Algorithm
  • Predict House Prices with Multivariable Linear Regression
  • Preprocess Text Data for a Naive Bayes Classifier to Filter Spam Emails: Part 1
  • Train a Naive Bayes Classifier to Create a Spam Filter: Part 2
  • Test and Evaluate a Naive Bayes Classifier: Part 3
  • Introduction to Neural Networks and How to Use Pre-trained Models
  • Build an Artificial Neural Network to Recognise Images using Keras & Tensorflow
  • Use Tensorflow to Classify Handwritten Digits
  • Serving a Tensorflow Model Through a Website
  • Next Steps

This course builds on the Complete Python Bootcamp and focuses on data science and machine learning essentials. It is a great course to take once you have a working knowledge of Python and want to apply it to real-world problems. It covers the popular tools you will come across as a data scientist and teaches how to build business solutions. Given the breadth of topics discussed in this course, I recommend using a book companion to help you when some concepts are not explained in as much detail as you need. You may also want to combine this course with one that includes more frequent projects or a substantial capstone project so that you get as much practice as possible.
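To give a feel for two of the topics listed above, linear regression and gradient descent, here is a minimal sketch in plain Python. It is not the course's code: the function name `fit_line`, the learning rate, the number of steps, and the data points are all assumptions chosen for illustration, and in practice you would more likely reach for a library such as scikit-learn.

```python
def fit_line(xs, ys, learning_rate=0.01, steps=5000):
    """Fit y = slope * x + intercept by gradient descent on mean squared error."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Error of the current line at every data point.
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error.
        grad_slope = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2 / n * sum(errors)
        # Step downhill, against the gradient.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Made-up data that lies exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(round(slope, 2), round(intercept, 2))
```

Because the data lies exactly on a line, the fitted slope and intercept should end up very close to 2 and 1. With noisy real-world data the fit would only approximate the underlying trend, and the learning rate would usually need tuning.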

Python Fundamentals – Pluralsight

Front page for Python Fundamentals on Pluralsight

Pricing: Free for ten-day trial, paid plans upwards from ~$30/month

Course Materials:

  • Strings and Collections
  • Modularity
  • Objects
  • Collections
  • Handling Errors
  • Iterables
  • Classes
  • Files
  • Resource Management

Pluralsight is an online learning resource focused on technology skills. It has a vast library of courses that you can access with a premium subscription. You will have live mentor support as you work through the courses, and you can earn course completion certificates. The course is very high quality and well structured. Pluralsight sets a high bar for author acceptance and course completion, so expect the course material to go into sufficient depth to build a strong foundation in Python. With Pluralsight, you can take advantage of the Role IQ and Skill IQ trackers, which give you a measure of your current level and enable you to map out your learning path. There are more advanced Python courses available that build on Python Fundamentals, including Advanced Python.

Python For Data Science – University of California San Diego

Front page for Python for Data Science, UC San Diego, edX

Pricing: Free to audit, $350 for certification

Course Materials:

  • Jupyter notebooks
  • Pandas
  • Numpy
  • Matplotlib
  • Git

This course teaches the fundamentals of Python and the tools that data scientists commonly use. It runs for 8-10 weeks and expects about 10 hours of work per week. I recommend pairing this course with reference material to help reinforce concepts and syntax. Doing so will help you progress smoothly through the mid-term and final projects of the course. By paying for the course, you can have your projects assessed and receive feedback on completed problem sets.

Python For Data Science – Dataquest

Front page for Python for Data Science Fundamentals

Pricing: Basic plan $29/month, Premium plan $49/month

There are multiple courses for Python on the Dataquest platform. The main differentiator of Dataquest is interactivity. Every aspect of the courses on this platform requires your immediate involvement, so there is no room for passive learning. The Python for Data Science: Fundamentals course provides an overview of the basics and best practices for data science workflows and building projects. The course works as an interactive textbook, in which you will also learn how to use the Jupyter notebook. The course finishes with a guided project, where you will build your own recommender system. There are no lectures in the Dataquest courses, so if you prefer to learn through lectures but want to take advantage of the project building in Dataquest, combine this course with others listed on this page. If you work through this course and believe you need a more advanced course, you can attempt the Python for Data Science: Intermediate course.

Make Learning Programming Easy

  • Use multiple resources at once. While it can be easy to get buried under all the possible courses to go through, choose a few that you believe are well suited to your learning process. Grab a book companion to add more clarity to course materials.
  • Learn by doing. Always apply what you have learned. Just like with learning human languages, the more you practice, the more intuitive the syntax and concepts will be.
  • Invest in the fundamentals, so that you can learn the more advanced concepts more quickly. Do not be tempted to skip past concepts that appear elementary.
  • Write out code by hand to help engrain intent and understanding of syntax.
  • Seek out support, whether online or in person. You will come across bugs or concepts you are not able to grasp; finding help allows you to solve them and progress faster.
  • Limit the amount of reading sample code that you do. Alter sample code you find in the course material to increase your engagement and to help understand how pieces of code work.
  • When you are stuck on a difficult programming problem, take time away from it. Doing so will help freshen your eyes, boost your motivation, and allow you to seek advice. I go into more detail about the importance of taking breaks during work in my blog post titled “7 Best Tips For Remote Working For Data Scientists”.

Python is a fantastic language to learn. It is easy to understand, and there are ample resources to get started. It is entirely open-source and actively developed. Becoming proficient in Python means you can leverage some of the most powerful scientific computing, machine learning, data wrangling, and visualization libraries available for data science. As you learn, you have the opportunity to build large projects that will boost your confidence and improve your career prospects.

If you want to learn another popular and powerful programming language used for data science and machine learning, click here for the list of the best online courses for R.

Have fun learning, and enjoy your journey!