Machine Learning Job Interview Questions

Following in the theme of "Getting a Machine Learning Job - Six Top Machine Learning Books" and "How to get a Data Science Internship?", in this blog article we take a look at the interview and those dreaded machine learning engineer questions. So if you have your first machine learning job interview or you are in the process of looking for that killer data science internship, read on!

Many people ask me about interviews. How should they prepare, and could I give them some example questions?

I guess writing this article means that I am now going to have to think of a whole new set of questions! But here are 8 questions we have asked in machine learning job interviews at digiLab recently. Going through them one by one, we can look into the hidden questions behind the questions!

Q1: Suppose I have a function $f(x)$. How would you find the $x$ which minimises this function?

This is one of my favourite questions. It aims to understand how you think through a problem. There isn't enough information for you to give a good answer, so the best way to respond is with a whole lot of questions.

Is $f(x)$ an analytical function? What is the dimension of $x$? What is the 'type' of $x$?
Is $f(x)$ expensive to calculate? Can I easily compute a derivative of $f(x)$?

From here, we are really asking what you know about optimisation methods. Depending on your answers, we would typically follow up with a challenge which changes the problem. So if someone suggests using gradient descent, for example, we might say: "Well, $f(x)$ is very computationally expensive to evaluate and the gradient is impossible to calculate, so what would you suggest now?"
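
To make that follow-up concrete, here is a minimal sketch contrasting the two situations. It assumes a made-up one-dimensional objective `f` with a known derivative `grad_f` (both hypothetical, chosen only for illustration), and uses golden-section search as one possible derivative-free alternative. It is not the only valid answer, and it only applies when $x$ is a scalar and $f$ has a single minimum on the search interval.

```python
import math


def f(x):
    """Hypothetical objective for illustration only: minimum at x = 3."""
    return (x - 3.0) ** 2 + 1.0


def grad_f(x):
    """Analytical derivative of the hypothetical objective f."""
    return 2.0 * (x - 3.0)


def gradient_descent(grad, x0, learning_rate=0.1, steps=200):
    """Follow the negative gradient from x0 for a fixed number of steps."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x


def golden_section_search(func, lo, hi, tol=1e-6):
    """Derivative-free minimisation of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0  # roughly 0.618
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = func(c), func(d)
    while abs(b - a) > tol:
        if fc < fd:
            # The minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = func(c)
        else:
            # The minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = func(d)
    return (a + b) / 2.0


x_gd = gradient_descent(grad_f, x0=0.0)
x_gs = golden_section_search(f, lo=-10.0, hi=10.0)
print(f"gradient descent: {x_gd:.4f}, golden-section search: {x_gs:.4f}")
```

Note that the golden-section loop needs only one new evaluation of `f` per iteration, which is the kind of trade-off worth talking through when the function is expensive.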

Q2: Can you tell me the difference between supervised and unsupervised learning algorithms?

This is a conversation starter. If you would like to see a discussion of the answer, take a look at our article on Supervised vs. Unsupervised Learning.

After the answer, we might follow up with something related:

Can you give me three examples of supervised learning algorithms? Tell me how one of those methods works. What does it mean for a supervised learning algorithm to be non-parametric? Can you give an example? So what is semi-supervised learning?

Lots of avenues to go down. Don't make it up, be precise, and if you don't know what something is then ask.
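
As one possible answer to the non-parametric question, here is a minimal sketch of a k-nearest-neighbours classifier. It is non-parametric in the sense that there is no fixed set of fitted parameters: the "model" is simply the stored training data. The data points and labels below are made up for illustration.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy, made-up data: two well-separated clusters labelled "a" and "b".
train_points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_labels = ["a", "a", "a", "b", "b", "b"]

near_origin = knn_predict(train_points, train_labels, (0.1, 0.1))
near_five = knn_predict(train_points, train_labels, (4.9, 5.0))
print(near_origin, near_five)
```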

Q3: What is overfitting? How might you detect overfitting while training an ML model? What techniques do you know about to overcome overfitting?

This is a basic, well-known concept in ML. We would expect any machine learning candidate to know it, to articulate the problem clearly and to explain how to identify overfitting during training. A basic idea of the methods used for preventing overfitting is essential. This might be reducing the number of model parameters (e.g. reducing the order of a polynomial), collecting more data, adding a regularising term to the loss or applying dropout during training. There are lots of valid answers.
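
One common way to spot overfitting during training is to watch the validation loss: when it starts rising while the training loss keeps falling, the model is fitting noise. The sketch below uses early stopping with a patience window, which is my choice of illustration rather than something from the list above, and the loss values are made up purely to show the pattern.

```python
def best_epoch_with_patience(val_losses, patience=2):
    """Return the epoch with the lowest validation loss, stopping the scan
    after `patience` consecutive epochs without improvement."""
    best_epoch, best_loss = 0, float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_epoch, best_loss = epoch, loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_epoch


# Made-up loss curves for illustration only: training loss keeps falling,
# validation loss bottoms out and then climbs.
train_losses = [1.00, 0.70, 0.50, 0.35, 0.25, 0.18, 0.12, 0.08]
val_losses = [1.05, 0.80, 0.62, 0.55, 0.57, 0.63, 0.72, 0.85]

stop_epoch = best_epoch_with_patience(val_losses, patience=2)
final_gap = val_losses[-1] - train_losses[-1]
print(f"stop at epoch {stop_epoch}, final train/validation gap {final_gap:.2f}")
```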

Q4: How do you validate the performance of a machine learning model?

This, again, is about understanding how you go about your machine learning workflow. Do you understand the basic principles of training, testing and validation?

A typical follow-up: give an example of when it was hard to validate a model, or when you had to validate a complex model.
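
The basic principle of holding data back for validation can be sketched with k-fold cross-validation. This is a minimal illustration, assuming a simple straight-line model and synthetic data generated inside the example; the function names are hypothetical and nothing here reflects a particular library's API.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices once and split them into k roughly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def cross_validate(xs, ys, k=5):
    """Return the mean squared error measured on each held-out fold."""
    scores = []
    for held_out in k_fold_indices(len(xs), k):
        held_out_set = set(held_out)
        train = [i for i in range(len(xs)) if i not in held_out_set]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in held_out]
        scores.append(mean(errors))
    return scores


# Synthetic data for illustration: y = 2x + 1 plus a little Gaussian noise.
rng = random.Random(42)
xs = [i / 10 for i in range(50)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.1) for x in xs]

scores = cross_validate(xs, ys, k=5)
print("per-fold MSE:", [round(s, 4) for s in scores])
```

Every point is used for validation exactly once and never by the model that was trained on it, which is the property worth being able to explain in an interview.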

Q5: Tell us about an exciting area of ML or application that you have read about recently? (Follow up) How does that work? Why aren't you working on that (if they aren't)?

The idea here is to assess your level of interest in the broad field of machine learning. We want to hire people who are interested in the broader field, and passionate about what they (and others) do. I have a feeling that at the moment everyone would say ChatGPT. That is OK, but make sure you know how it works.

I personally think it is better to answer with something a bit different that you can teach the interviewers about. People always like it when they learn something new, and educating someone is a clear route to kudos. A good way of picking up new and interesting areas is via LinkedIn (see digiLab's page), where lots of people post about new techniques, challenges and applications. The most important thing is to demonstrate a passion for being a machine learning engineer which goes beyond a narrow area of interest.

Q6: What is the first thing you do with a new data set?

"Apply Principal Component Analysis . . ." is what Prof Neil Lawrence always says. Again, this is about understanding the way you think about data. There is no single correct way to answer this question. When you give an answer, make sure you can justify it. My answer to this question would be "If in doubt, plot it out!"

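If you do give the Principal Component Analysis answer, be ready to explain what it computes. Here is a minimal sketch, assuming a tiny made-up two-column data set, which estimates the first principal direction by power iteration on the sample covariance matrix. In practice you would reach for a library routine; this is only meant to show the idea.

```python
import math


def first_principal_component(rows, iterations=100):
    """Estimate the leading principal direction by power iteration on the
    sample covariance matrix. Assumes the data are not all identical."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centred = [[row[j] - means[j] for j in range(d)] for row in rows]
    cov = [
        [sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    v = [1.0] * d  # arbitrary starting vector
    for _ in range(iterations):
        # Multiply by the covariance matrix, then rescale to unit length.
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Made-up data lying close to the line y = 2x.
data = [(0.0, 0.1), (1.0, 1.9), (2.0, 4.1), (3.0, 5.9), (4.0, 8.1)]
direction = first_principal_component(data)
print("first principal direction:", [round(x, 3) for x in direction])
```

For this data the direction comes out close to the slope-2 line, which is exactly the structure a quick scatter plot would also reveal.
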
Q7: If I gave you a £10k training budget, what would you spend it on?

This question gives you an opportunity to highlight an area in which you would like to learn more, and to show the humility that you aren't above continuous learning and improvement. Say something interesting that people can align with, so that they can clearly see the value of those additional skills in the context of the company you are applying to.

As an example, I would like to learn some new programming languages, like Rust. The reason is that I think such languages could be transformational in building reliable and efficient software, reducing dependence on extensive testing. Lots of my team are looking at these languages, and I want to know the basics to help communication in tech meetings.

If I gave you £100, the answer would be to take our "AI in the Wild" course from the digiLab Academy!

Q8: How do you handle missing or corrupted data in a dataset?

No trick question here. Being a proper ML engineer means that you will have come across some real-world data. Understanding the sort of things that you should look out for, and how you might solve them, is all that is required. Data quality could be an article in its own right, but the issues might include missing data, outliers, noisy data, mislabelled data points and asynchronous data.
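
As a small illustration of two of those issues, the sketch below fills missing values with the median and flags outliers using a modified z-score based on the median absolute deviation. Both are just one reasonable choice among many, the readings are made up, and the 3.5 threshold is a commonly quoted rule of thumb rather than a universal setting.

```python
import math
from statistics import median


def is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or math.isnan(value)


def impute_with_median(values):
    """Replace missing entries with the median of the observed values."""
    fill = median(v for v in values if not is_missing(v))
    return [fill if is_missing(v) else v for v in values]


def flag_outliers(values, threshold=3.5):
    """Flag points whose modified z-score, based on the median absolute
    deviation (MAD), exceeds the threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [False] * len(values)  # no spread, so nothing can be flagged
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]


# Made-up sensor readings with two gaps and one suspicious spike.
readings = [10.1, None, 9.8, float("nan"), 10.3, 55.0, 9.9]

cleaned = impute_with_median(readings)
outliers = flag_outliers(cleaned)
print(cleaned)
print(outliers)
```

The median is used for both steps because, unlike the mean, it is not dragged around by the very spike you are trying to detect.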

Interviews aren't all about giving the right answers

It might be a surprise, but the correctness of the answers is not the most important thing in an interview. Really, interviewers want to know: could I work with this person? Are they passionate about our mission? Did we communicate effectively? We find the best interviews do not feel like an interview; they feel like an interesting conversation. So ask questions back, show interest in the company you are applying to and smile :D

About digiLab

digiLab is an agile, deep tech company operating as a trusted partner to help industry leaders make actionable decisions through digital twin solutions. We invent first-of-a-kind, world-leading machine learning solutions which empower senior decision-makers to exploit the value locked in their data and models.
