How do I prepare for data science interviews?

Overview

Preparing for data science interviews is hard: candidates can be asked a wide range of questions, with little focus or guidance. Sometimes they are asked to do hard-core coding on data structures and machine learning; sometimes to solve business problems using data and experiments; sometimes to work through probability and statistics questions; and sometimes to figure out the right way to query data with SQL.

Things can be overwhelming, but stay calm: data science interviews can be tackled. Personally, I would prepare for the interview in the following three steps.

  1. Understand the industry and available positions
  2. Learn and review algorithms & data structures, machine learning, statistics, SQL, and business analytics
  3. Prepare specifically for the upcoming phone/on-site interviews

People often focus more on the technical side when preparing for interviews. There is nothing wrong with that. Still, it helps to build a bigger picture of the business and the industry by reading online, talking to data science practitioners, and looking at and applying for a variety of jobs. Since data science is such a new field, one that did not exist 6 years ago, there are not many standards or routines in the industry, so it pays to keep a fresh understanding of the trends and the job market. In addition, networking is one of the best ways to get into the data science world, since hiring through referrals, informal meetings, and reaching out tends to be more common in data science than in other fields.

My Experience

Getting into the data science world is not easy. I had been dreaming of being part of the cool data science world since my senior year of college, and it took me years to finally get there.

To get there, I got academic training in math, statistics, and biostatistics. My adventure into data science started two years ago. I did research in parallel data simulations and missing data imputation, worked as a statistician for a biotech startup, and learned about machine learning and big data infrastructure at Georgia Tech, Galvanize, Udacity and BitTiger. Now I will be working on data aggregation and engineering at a B2B marketing analytics company.

Along the way, I have been applying to and interviewing for data science positions, with lots of struggles. Sometimes you think you know regression, data structures and algorithms, but actually you don't. The hard way to figure that out is to do lots of interviews. Each failure is a new starting point. The old saying never goes wrong: practice makes perfect.

For more details on the resources I used for interview preparation, please see the Resources section.

Anyway, best of luck, data science fellows!

Resources

In this section, I will briefly introduce the resources and notes I used to prepare for data science interviews. The ones I used routinely in my interview preparation are marked with a ♡ symbol. Definitely check them out if you have not done so!

Coding Preparation

  • White board
    • You definitely need a white board to practice coding
  • Leetcode ♡
    • The No. 1 online coding practice site, which you should visit on a daily basis for coding interview preparation
    • The weekly coding contest is very good in terms of exercise and rewards
  • HackerRank
    • Many companies use HackerRank for online coding challenges, so make sure you are familiar with the environment.
  • Codewars
  • SQLZoo
  • Regexone ♡
    • Hands-on walk-through of regular expressions
  • PostgreSQL Exercises
    • Highly recommend using Postico ♡ to set up a local Postgres database to practice

Data Science/Statistics Preparation

Analytics Preparation

Books

School

  • Online

  • Bootcamp

    • Galvanize
    • Data Incubator
    • Insight Data Science
  • Master's Degree

    There are many choices: a lot of schools offer one-year or two-year MS programs in data science, data analytics, computer science, machine learning, or statistics, which are all related to data science. My advice here is to consider them if you do not have training or experience in computer science or statistics, and you really want to dig deep into the methodologies together with their applications. The cost of an MS degree is not negligible, and the data science field is evolving a lot. There is no guarantee that you will learn the latest data science and machine learning technologies or find the data science job you want afterward.

Data Science Challenges

  • Kaggle
  • Yelp Challenge
  • Udacity Didi Challenge
  • Other data team challenges and hackathons

Python 101

Coding/Debugging

  • Check variable names (spelling) and other typos
  • Check base/corner cases (when the input is None, [], -1,…)
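
As a small illustration of the corner-case check above, here is a minimal sketch. The function `safe_max` is a hypothetical example written for this note, not taken from any particular interview question. It shows how None, [] and negative values such as -1 can be handled up front and then verified with a few quick assertions.

```python
def safe_max(nums):
    """Return the largest value in nums, or None if there is nothing to compare.

    Hypothetical helper, used only to illustrate corner-case checks.
    """
    # Corner case: the input is None or an empty list.
    if not nums:
        return None
    # Start from the first element rather than 0, so that all-negative
    # inputs such as [-1] are handled correctly.
    best = nums[0]
    for value in nums[1:]:
        if value > best:
            best = value
    return best


# Quick checks: corner cases first, then a "normal" input.
assert safe_max(None) is None
assert safe_max([]) is None
assert safe_max([-1]) == -1
assert safe_max([-3, -1, -2]) == -1
assert safe_max([2, 7, 4]) == 7
```

Walking through these inputs out loud before declaring the code finished is usually enough to catch a naive `best = 0` initialization, which would silently return the wrong answer for `[-1]`.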