7 Best Tips to Help Get a Data Scientist Job From Scratch

by | Data Science, Experience, Research, Tips

You have developed a passion for data science and now want to embark on a new career, but you are unsure where to start. This post will provide you with clear, practical steps to get on the road to a rewarding and stimulating career path. The critical message to carry when reading this is that your career is a journey, not a destination. There is no guarantee the first job you land will be the best one for you. With every successful and unsuccessful step taken, you will learn and grow. Finding the best job requires some trial and error. This post will help narrow your focus and give you more certainty in the next steps you take, so you can devote your energy to what is essential.

TL;DR

  1. Immerse yourself in the field and form habits that ensure you are learning something new every day.
  2. Find out what type of data scientist job you want and do the appropriate research on the roles you find.
  3. Build your CV: gain experience with projects and certifications.
  4. Become visible, build an online and physical presence, and form a network.
  5. Find jobs through recruiters and build relationships with them. Use LinkedIn and AngelList.
  6. Learn and practice what is typically asked in a data scientist/machine learning role and do role plays.
  7. Taking smaller steps to your ultimate goal is possible. Do not discount internships, short contracts, and temporary work if they contribute towards your goal.

Immerse Yourself In Your Field

If you are just starting in data science, getting fluent in the techniques and concepts used is essential. Ideally, you should go through one or more comprehensive courses to build a good foundation of knowledge. You can find some brilliant free online courses from renowned institutions like MIT, Harvard, and Stanford on the Online Courses page.

Aim to make staying up to date on the latest research a habit; I list some great sites to visit for this purpose on the Resources page. In addition to following new research, you can immerse yourself through practice. You can create your own projects through websites like:

  1. Kaggle
  2. HackerRank
  3. Google Colaboratory

Kaggle and HackerRank also serve as a channel for finding new data science positions and are actively used by companies. Using these platforms is a win-win, as you will be able to build your data science skills and demonstrate your work to companies on the same platform. Alongside practice, you should develop your theoretical understanding. Aim to learn one data science concept at a time to the level where you can explain it intuitively. Not only will doing this improve your confidence and understanding when putting the concepts into practice, but it will also make future technical interviews and discussions more successful. For an intuitive review of mathematical and statistical concepts for data science and machine learning, I recommend looking at 3Blue1Brown’s video collection and Brilliant.

There are plenty of online communities where data scientists can learn and stay up to date on the latest trends, for example:

  1. Hunch
  2. KDNuggets
  3. r/DataScience
  4. r/MachineLearning
  5. SmartDataCollective
  6. Distill

For offline immersion, if you are at university, you can seek out societies that explore data science and machine learning. Universities also present great networking opportunities through outreach and funded internships with companies. Positioning yourself within the groups that actively participate in industrial applications of research will boost your visibility and allow you to build your skills.

Researching Data Science Roles

Not all data science roles are the same in terms of requirements and challenges. The emphasis on specific skillsets changes depending on the challenges a company is facing. For example, an early-stage start-up might require a data scientist who has strong engineering skills to transform a prototype into a minimum viable product. A large, mature company might need a data scientist with more focus on analysis and algorithmic development skills, as its data pipelines will be more established and it may want to explore new ideas. The types of data scientists and the overlaps with other related roles are explored in my post titled: “Key Differences Between Data Scientist, Research and Machine Learning Engineer Roles”.

Before embarking on your search, you should do some self-reflection and decide on where you want to specialize. This reflection requires some understanding of your strengths; for example, if you have a background in software engineering and have solid coding skills, then you may find “product-building and engineering” data science roles more rewarding. On the other hand, if you have strong statistics skills, the ability to infer from complex datasets, and you can effectively communicate results, you may be more aligned with “analytical and research-focused” data science roles. You may be strong in both aspects; in that case, you could choose roles based on the problem or domain of the company.

To gain an understanding of the type of roles you will come across, make a note of the “need-to-haves” in the job description. Match these with the three categories: “Most Often Used By Data Scientists”, “Most Often Used By Data Engineers”, and “Most Often Used By Both”. If you find the need-to-haves falling into one category more than another, that will give you some indication of what you will focus on during the interview and on the job, should you land it. You can see the categories in the Venn diagram and the lists below.

Venn Diagram For Software Tools Used By Data Scientists/Engineers. Source: Me/The Research Scientist Pod

Tools Used Most Often by Data Scientists

  • Python
  • R
  • TensorFlow
  • Keras
  • MATLAB
  • Matplotlib
  • ggplot2
  • PyTorch
  • Deeplearning4j
  • RapidMiner
  • Pandas
  • NumPy
  • Tableau
  • Stata
  • Caffe
  • Julia
  • SAS

Tools Used Most Often by Data Engineers

  • MySQL
  • Ruby
  • Hive
  • Sqoop
  • Elasticsearch
  • PostgreSQL
  • Cassandra
  • MongoDB
  • Riak
  • Redis
  • SAP
  • Neo4j
  • AWS/Redshift
  • Kubernetes

Tools Used Most Often By Both

  • Hadoop
  • Spark
  • Storm
  • Scala
  • Java
  • Kafka
  • Mesos
  • Docker
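The matching exercise described above can be sketched in a few lines of Python. This is only an illustration, not a definitive tool: the function name `categorize`, the category labels, and the sample job description are all hypothetical, and the sets contain a lower-cased subset of the tool lists above.

```python
from collections import Counter

# Lower-cased subset of the three tool lists in this post.
DATA_SCIENTIST_TOOLS = {
    "python", "r", "tensorflow", "keras", "matlab", "matplotlib", "ggplot2",
    "pytorch", "rapidminer", "pandas", "numpy", "tableau", "stata", "caffe",
    "julia", "sas",
}
DATA_ENGINEER_TOOLS = {
    "mysql", "ruby", "hive", "sqoop", "elasticsearch", "postgresql",
    "cassandra", "mongodb", "riak", "redis", "sap", "neo4j", "aws",
    "redshift", "kubernetes",
}
SHARED_TOOLS = {
    "hadoop", "spark", "storm", "scala", "java", "kafka", "mesos", "docker",
}


def categorize(need_to_haves):
    """Count how many listed tools fall into each of the three categories."""
    counts = Counter()
    for tool in need_to_haves:
        key = tool.strip().lower()
        if key in DATA_SCIENTIST_TOOLS:
            counts["data scientist"] += 1
        elif key in DATA_ENGINEER_TOOLS:
            counts["data engineer"] += 1
        elif key in SHARED_TOOLS:
            counts["both"] += 1
        else:
            # Anything not in the lists above is tracked separately.
            counts["uncategorized"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical need-to-haves copied by hand from a job description.
    job_tools = ["Python", "Pandas", "Spark", "PostgreSQL", "Tableau"]
    for category, count in categorize(job_tools).most_common():
        print(f"{category}: {count}")
```

A job description whose tools land mostly in one category gives a rough hint about where the role sits; treat the counts as a prompt for further research rather than a verdict.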

In addition to skill sets, the educational or academic background requested is a vital indicator of the type of role offered. In general, data scientists, analysts, and engineers will have a computer science background. You will often see a requirement for mathematics, statistics, econometrics, or physics for data science roles. Candidates with these backgrounds will typically be more on the analytical side and favor experimentation. Data engineers are more likely to have engineering backgrounds, with education focused more on computer engineering.

There are opportunities to have a research-centric data science job. These types of jobs are becoming more common as companies realize the importance of algorithm innovation. To find companies with this focus, look at their teams. If they have a dedicated science or research team and a history of publications, it is highly likely that, as a data scientist there, you will have some input into research. An example of a company with a robust research foundation would be DeepMind.

For product-based companies, if you can download and try their products under trials, take advantage of this. Doing so will give you insight into the moving parts and whether or not there is any algorithmic depth to the product. It will also confirm if you are interested in what the company is doing and the customer pain-points they are addressing. The presence of an analytical data pipeline and visualization is a good sign that statistical data analysis is integral to the business solution. For consultancy-based companies, look at the use cases or previous solutions.

Build Your Resume

Your resume is a vital document for building your career. Aim to keep your resume to one page in length; recruiters go through many resumes per day and will spend roughly under a minute on each. You want your resume to be brief and impactful. To maintain brevity, you should prioritize relevant experience and projects. It is tempting to play tech-bingo and list every piece of software and programming language with which you have come into brief contact. Only list those with which you have working experience, i.e., you can discuss their key concepts and implement them without help. Avoid progress bars on your resume because these are open to misinterpretation. If a bar is 100% for Python, does that mean you know everything there is to know about Python? It is more effective to show rather than tell by including links to coding projects. GitHub and Kaggle are fantastic resources for portfolio building because they enable you to exhibit your coding abilities while telling a story and visualizing your results, which are two of the defining skills of data science. You should have a “baseline” resume detailing most of your experience, which you can customize to suit the job description and company. A resume tailored in this way is more likely to stand out to the hiring manager or recruiter.

If you are applying for a research-focused data science role, you can include any publications you have authored. Also include conferences at which you have presented. You may want to prioritize publications and talks over software projects depending on the emphasis on research in the job description.

Given data science is a computer science-related field, it is crucial to have an online personal brand. Gone are the days of focusing all your energy on your CV and cover letter. To stand out, you should focus on having an active online presence. Building a website, for example, will serve as a dynamic, online representative of you and your experience. You can use it to upload your CV, tell your story, and demonstrate your uniqueness through its design. Building a website also shows dedication and the ability to complete a project.

If you want to push yourself and have a creative flair, you can start a blog. A blog will enable you to build your brand further and to contribute your ideas online. One of the best ways to stand out from the crowd is to be an authority on a particular topic. You can center your blog on your strengths and the experience you have gained, for example. The closer the blog is to data science skillsets, the better. Link your website and blog on all of your social media accounts to get as much reach as possible.

Increase Your Visibility Through Networking

Your network is a valuable tool for developing your career. Networking provides a means of trading information and forming long-term relationships with mutual benefits. Use LinkedIn to find people in companies of high interest, particularly data scientists and human resources staff, and send them a message. Use your message as an introduction, to express your interest in their company and to learn more. Ask if they are available for a short call to get some information on the company and the role. Do not expect everyone to respond and do not expect an immediate response. You can send a follow-up message if there has been no response after a week or two. Send messages to different companies (but to one person per company at most). You can also use LinkedIn to reach out to existing friends, colleagues, or alumni from your school or university who are currently in data science roles or in companies that are actively looking for fellow alumni.

AngelList is another powerful networking tool centered around growing start-ups by connecting founders with investors and talent. You can network with AngelList in the same way as with LinkedIn. The advantage of focusing on start-ups is that they are actively looking to grow, and at a faster rate than more mature companies. Small start-ups can have more multi-faceted data science roles; you can diversify your skillset as the company grows and have a higher chance of finding a position that plays to your strengths.

I suggest taking some steps towards building your online presence through a website and portfolio before actively reaching out to build your network online. By taking those steps first, you have a higher likelihood of leaving a positive first impression and will have some discussion points for when you successfully arrange a meeting.

Offline networking is still very valuable in our increasingly online world. Meet-ups and conferences provide opportunities to be involved in the data science community and to increase your visibility. Presentations at these meetings often serve as advertising for companies; presenters expect to be approached and asked for more information, especially if they are hiring. You can ask presenters or other attendees of interest whether they have time for a discussion so you can learn more about what they do and their company culture. To make it easier to network offline, invest in business cards. Business cards are an efficient way to give all of your online details to potential additions to your network. They also convey professionalism and confidence. When involving yourself in any community, it is essential not to come across as needy or pushy; you are at a meeting to learn and to share your enthusiasm. Do state that you are actively looking for a position and build the conversation from there, but it is not necessary to push with a strong ask. If you remain active in your community, there will be opportunities for you to present your work!

Build Relationships With Recruiters

Making your job search as active and people-centric as possible is the key to success. Going the passive route and simply uploading a CV and cover letter to tech job boards may yield some success, and it is still worth doing. However, your submission is likely to get buried under a mountain of other applications. In addition to job boards, contact recruiters through LinkedIn and start to foster relationships. On LinkedIn, recruiters are actively looking for talent. If you set your status to “Actively Looking For New Roles” and put data science keywords in your profile description, you will crop up in searches, and recruiters will likely reach out to you. The good thing about LinkedIn is that it revolves around career building, so you do not have to worry about coming across as boastful. Invite recruiters to look at projects you have completed or blog posts you have written. You can also offer value to recruiters by suggesting potential candidates for other positions they are seeking to fill. By forming relationships of value with recruiters, you will be the first to come to mind when they see a new job opportunity. Also, if they know your strengths through interviewing you, they will have a better understanding of the positions suited to you and will send them your way. Investing time and effort into people will bolster your access to available roles while reducing your time spent scouring job boards in the future.

Practice Answering Interview Questions

There are many online practice interview guides that you can use for free. I list some you can get started with below:

Interview Questions

  1. Springboard 109 Data Science Interview Questions and Answers
  2. 100+ Data Science Interview Questions You Must Prepare For 2020
  3. Hackr.io Data Science Interview Questions
  4. Top 50 Data Science Interview Questions and Answers
  5. 21 Must-Know Data Science Interview Questions and Answers

There will be some overlap in the questions asked across the lists, which will help you pinpoint the most common questions. One of the complications of data science interviews is that the types of questions asked can vary widely compared with other software development positions. You can narrow down the kinds of questions you will be asked by understanding the type of role being offered (see the second point). If you have an interview lined up, do the necessary research on both the company’s product and the role before practicing, to use your time more efficiently. It is common to do a coding challenge as part of the interview process. LeetCode provides practice questions for SQL, data structures, and algorithm development.

As a data scientist, you should expect to have frequent contact with the product and use data to improve its performance. Gain a more profound knowledge of the company product. Define some key performance indicators that the company might want to optimize. Use your curiosity to answer questions about the product, including:

  • What aspects of the product do you enjoy?
  • What are the functionalities of the product?
  • What can be changed about the product?
  • What new features could be included to increase growth/engagement/brand value/revenue?

If you apply for more analytical data science roles, you should review experimental design, for example, A/B testing and how to interpret results statistically. If you apply for research-focused data science roles, you should understand the relevant research around the product or service and what future steps could improve upon existing findings.
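As a refresher on interpreting an A/B test statistically, here is a minimal sketch of a two-proportion z-test using only the Python standard library. The function name `two_proportion_z_test` and the conversion numbers are made up for illustration, and the test assumes large, independent samples; an interviewer may expect a different test or a library such as SciPy or statsmodels.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / conv_b are the numbers of conversions; n_a / n_b are the numbers
    of users exposed to variants A and B. The pooled rate must not be 0 or 1,
    otherwise the standard error is zero.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical experiment: 200/5000 conversions for A, 250/5000 for B.
    z, p = two_proportion_z_test(200, 5000, 250, 5000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the observed difference would be unlikely if the two variants truly converted at the same rate. In an interview, be ready to discuss the significance threshold chosen before the experiment, the sample size, and whether the difference matters in practice.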

Take Small Steps To Your Career Goal

It can be challenging to transition into data science from another field; fortunately, there are plenty of resources to put you in a more hirable position. If you are in academia, you can seek bootcamps that specialize in giving STEM academics the skills to move into data science. Science to Data Science is an excellent example of such a bootcamp. Here are some examples of bootcamps and courses that are not solely for scientists:

  1. Thinkful
  2. The Dev Masters
  3. General Assembly
  4. Springboard
  5. BrainStation

Internships provide an excellent opportunity to get your foot in the door. Do not be afraid of these positions, as they can and often do evolve into full-time job offers. With on-the-job learning, you and the company can see how you fit with the company culture and gauge your level of commitment. Even if you do not get a position with the company where you did the internship, you can get references and recommendations to use elsewhere. You can find the six best tips for finding and getting a machine learning internship in the blog post titled: “How to Get a Machine Learning Internship in 6 Easy Steps.”

LinkedIn, GradCracker, and Glassdoor are great options for searching for data science internships globally. As mentioned in the first point of this post, companies actively foster relationships with universities in order to recruit talent for internships and permanent positions. If you are currently at university, find out if there is a group focused on applying STEM research to data science problems or on Knowledge Transfer Partnerships, and get in touch with the person in charge of building projects with companies.

Your first job in the field is unlikely to be your dream one. This does not mean that you should not have requirements for a job, but you should remain flexible in your job hunt. For entry-level positions, you want to prioritize a work culture that is supportive and values employee development.

Concluding Remarks

Starting on a new career path can be a daunting experience. But there is an immense wealth of tools and resources to put yourself in an excellent position to stand out from the crowd and get closer to that dream job. By taking the time to invest in learning before job-hunting, you will be a more confident and valuable prospect. Build a formidable presence both online and offline while continuing to expand your network. Be active in your job-search and invest in people; this will pay dividends in terms of new opportunities. Do your research on the types of positions you are looking for and the requirements. Do not shy away from stepping stone positions that allow you to gain valuable experience. By taking these tips on board, you will create the best conditions to accelerate into the future that you are after.

Thank you for reading this blog post to the end. Share this post with others who will find it helpful, and join the mailing list if you have not already to keep up to date on the latest from the Research Scientist Pod.

Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.