Python Tutorial: Conditional probabilities

Hi there and welcome to Preparing for Statistics Interview Questions in Python! My name is Conor Dewey. In this course, you’ll prepare for real-world statistics questions common to the data science interview. Since our goal here is interview prep, keep in mind that we’ll move a little faster than a typical DataCamp course.

We’ll touch on probability concepts, data analysis, statistical experiments, and some machine learning as well. This first chapter is focused on probability, a topic that interviewers like to lean on quite heavily.

The idea with conditional probabilities is that we want to figure out the probability of something happening, given that we have some additional information that may influence the outcome. In the Venn diagram shown, you can see the overlap between A and B, which represents the probability of both events occurring together.
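To make the overlap idea concrete, here is a minimal sketch. The die roll and the events A and B are a hypothetical example, not something from the video; it simply divides the probability of the overlap by the probability of B.

```python
from fractions import Fraction

# Hypothetical example (not from the video): one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}
event_a = {n for n in sample_space if n % 2 == 0}  # A: the roll is even
event_b = {n for n in sample_space if n > 3}       # B: the roll is greater than 3


def probability(event):
    """Probability of an event when every outcome is equally likely."""
    return Fraction(len(event), len(sample_space))


p_b = probability(event_b)                  # {4, 5, 6} -> 1/2
p_a_and_b = probability(event_a & event_b)  # overlap {4, 6} -> 1/3
p_a_given_b = p_a_and_b / p_b               # P(A|B) = P(A and B) / P(B)

print(p_a_given_b)  # 2/3
```

Knowing that the roll is greater than 3 raises the chance of an even number from one-half to two-thirds, which is exactly the "additional information" effect described above.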

While we’re on the topic of conditional probabilities, we have to go over Bayes’ theorem, a staple in data science interviews. Bayes’ theorem helps us tackle probability questions where we already know about the probability of B given A, but we want to find the probability of A given B.

This picture does a good job of breaking things down. You see that we’re solving for A given B by multiplying the probability of B given A by the probability of A in the numerator, which gives us the probability of A and B occurring together. Then we divide by the probability of B to get our answer.
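If it helps to see the formula as code, a small sketch might look like this. The function name `bayes` and the input values are made up for illustration; only the formula itself comes from the theorem.

```python
from fractions import Fraction


def bayes(p_b_given_a, p_a, p_b):
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    if p_b == 0:
        raise ValueError("P(B) must be non-zero")
    return p_b_given_a * p_a / p_b


# Made-up inputs, chosen only to exercise the formula.
p_a_given_b = bayes(
    p_b_given_a=Fraction(4, 5),
    p_a=Fraction(3, 10),
    p_b=Fraction(1, 2),
)

print(p_a_given_b)  # 12/25
```

Fractions are used here only to keep the arithmetic exact; plain floats work the same way.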

You should make sure you’re comfortable with this and have it memorized, since interviewers really love using Bayes’ theorem for rapid-fire screening questions.

A complementary technique that can be helpful for these questions is tree diagrams. Given a sequence of events, you can multiply the probabilities along a branch, each one conditional on the steps before it, to compute the overall probability of that outcome. Here we see that, because coin flips are independent, the probability of flipping a coin and landing on heads twice in a row is simply one-half times one-half.

Not only are sketches like this on a whiteboard or piece of paper useful for understanding, but they also help convey your thinking to your interviewer more clearly.
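The coin-flip tree described above can also be checked in a few lines of Python. This is just a sketch that assumes a fair coin: it lists every leaf of the two-flip tree and compares counting with multiplying along the heads-heads branch.

```python
from fractions import Fraction
from itertools import product

p_heads = Fraction(1, 2)  # assumption: a fair coin

# Every leaf of the two-flip tree: HH, HT, TH, TT.
leaves = list(product("HT", repeat=2))

# Count the leaves that are heads twice in a row.
p_by_counting = Fraction(
    sum(1 for leaf in leaves if leaf == ("H", "H")),
    len(leaves),
)

# Multiply the probabilities along the heads-heads branch.
p_by_chaining = p_heads * p_heads

print(p_by_counting, p_by_chaining)  # 1/4 1/4
```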

Let’s see what one of these interview questions might actually look like in practice, using both Bayes’ theorem and tree diagrams.

We’re given some information regarding interview results for two separate stages shown above. We want to find out the probability that the applicant passes the stats interview, given that he or she passes the coding interview as well.

First, let’s draw out the tree diagram and go from there. By multiplying the probabilities along each branch, we can compute the probability of each outcome. Once we have this, we can plug into Bayes’ theorem.

We can follow the top branch of the tree to get a 0.1 probability of the applicant passing both the stats and coding interviews. Next, we add the probability of the scenario where the applicant fails stats but passes coding, which gives us the overall probability of passing the coding interview.

Now that we have our numerator and denominator, we can divide 0.1 by 0.25. This comes out to applicants having a 40 percent chance of passing the stats interview, given they passed the coding interview.
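Here is a rough sketch of that last calculation in Python. It uses only the two numbers quoted above, 0.1 and 0.25; the individual branch probabilities from the slide are not reproduced here, and the variable names are my own.

```python
# Probabilities quoted in the walkthrough.
p_pass_stats_and_coding = 0.10  # top branch: passes stats and coding
p_pass_coding = 0.25            # passes coding, whether or not stats was passed

# Implied by the two numbers above: fails stats but passes coding.
p_fail_stats_pass_coding = p_pass_coding - p_pass_stats_and_coding

# P(pass stats | pass coding) = P(pass both) / P(pass coding)
p_stats_given_coding = p_pass_stats_and_coding / p_pass_coding

print(f"{p_stats_given_coding:.0%}")  # 40%
```

The subtraction is the same tree-diagram sum run in reverse: the two branches that end in a coding pass have to add up to 0.25.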

To summarize, we reviewed some basic conditional probability concepts and how to visualize them using Venn diagrams. We touched on Bayes’ theorem and emphasized its use in interviews. Then we discussed how to use tree diagrams to help break down and simplify interview questions.

Now let’s put this to work and take on a few potential interview questions in Python!

Post Author: hatefull