
Google Data Scientist Interview Questions

48 practice questions for Google Data Scientist interviews

Google data scientist interviews test statistical reasoning, ML model design, SQL proficiency, A/B testing methodology, and Python-based algorithm implementation.

coding Medium Verified Question #1

1. Dictionary of Sorted Letters


Category: Array coding problem
Given a string where letters are sorted in alphabetical order, identify all letters that appear more than twice and record their first and...
Input: Array
Output: Computed result
coding Medium Verified Question #2

2. GPS Error Tracking


Category: Algorithm coding problem
You are tracking GPS location errors by comparing measured GPS locations against a set of "golden" (reference) locations. Each location...
Input: List
Output: Computed result
coding Hard Verified Question #3

3. Minimum Boxing Area


Category: Binary search coding problem
Design a data structure to maintain a dynamic set of points on a 2D coordinate plane. Support operations to insert points, remove points,...
Input: List
Output: Integer
coding Medium Verified Question #4

4. Reverse Segment of Linked List


Category: Linked list coding problem
Given a singly linked list, reverse the second half of the list and then interleave the nodes from the first half and the reversed second...
Input: Linked list
Output: Computed result
coding Medium Verified Question #5

5. Unpainted Segments


Category: Binary search coding problem
You are given a range [A, B] and a sequence of painting operations. For each operation [L, R], calculate the total length of unpainted...
Input: Array of intervals
Output: Computed result
coding Medium Verified Question #6

6. Running Tests With Failing Pairs


Category: Algorithm coding problem
You are given a set of test cases and a black-box function runTests() that accepts a subset of these test cases and returns whether...
Input: List
Output: Integer
coding Medium Verified Question #7

7. Connected Crop Allocation


Category: Grid/matrix coding problem
You are given an M x N garden grid and a list of crops, each requiring a specific number of plots. The total number of plots required by...
Input: 2D grid
Output: Computed result
coding Medium Verified Question #8

8. [CodeSignal] Maximum Zero-Sum Triplets


Category: Array coding problem
You are given an array A of integers. A triplet is a sequence of three consecutive elements. A triplet is called zero-sum if the...
Input: Array
Output: Computed result
coding Easy Verified Question #9

9. [CodeSignal] Coin Table Game


Category: String coding problem
A player is playing a game in which coins are placed on and removed from a table. The game consists of multiple rounds. At the beginning...
Input: String
Output: Computed result
coding Medium Verified Question #10

10. Longest Match Tokenizer


Category: Array coding problem
You are given a text string text and a dictionary array where each element is in the format "<key>:<id>". Here key is a token string and id...
Input: Array
Output: Computed result
coding Hard Verified Question #11

11. Dual Extremes Queue


Category: Queue-based coding problem
Design a StreamBuffer class that buffers a stream of integer latency samples in FIFO order and supports O(1) access to both the minimum and maximum...
Input: Integer(s)
Output: Integer
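The preview above is truncated, so the full interface is not known. As a minimal sketch, assuming the buffer only needs to append at the back, remove from the front, and report the current minimum and maximum, two monotonic deques give amortized O(1) for every operation. The method names `push`, `pop`, `minimum`, and `maximum` are hypothetical and not taken from the original question.

```python
from collections import deque


class StreamBuffer:
    """FIFO buffer with amortized O(1) min and max (method names are assumed)."""

    def __init__(self) -> None:
        self._items = deque()  # all buffered samples in arrival order
        self._mins = deque()   # non-decreasing candidates for the minimum
        self._maxs = deque()   # non-increasing candidates for the maximum

    def push(self, sample: int) -> None:
        self._items.append(sample)
        # Drop candidates that can never be the minimum again.
        while self._mins and self._mins[-1] > sample:
            self._mins.pop()
        self._mins.append(sample)
        # Drop candidates that can never be the maximum again.
        while self._maxs and self._maxs[-1] < sample:
            self._maxs.pop()
        self._maxs.append(sample)

    def pop(self) -> int:
        sample = self._items.popleft()
        # The oldest sample leaves the candidate deques only if it leads them.
        if self._mins[0] == sample:
            self._mins.popleft()
        if self._maxs[0] == sample:
            self._maxs.popleft()
        return sample

    def minimum(self) -> int:
        return self._mins[0]

    def maximum(self) -> int:
        return self._maxs[0]
```

Equal values are kept in both candidate deques (the comparisons are strict), so duplicates survive until each copy is actually popped.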
coding Medium Verified Question #12

12. Daily Branch Pruning


Category: Tree coding problem
A file system manages a directory tree. Each day, all leaf directories (those with no child directories) are simultaneously removed. Directories that...
Input: Array
Output: Array
coding Medium Verified Question #13

13. Path Router


Category: Algorithm coding problem
Design a PathRouter class that maps URL-like path patterns to handler names. Patterns may contain wildcard segments (*) that match any...
Input: Number(s)
Output: Computed result
coding Medium Verified Question #14

14. Frequency Merge Tree


Category: Tree coding problem
Given a string, build a Frequency Merge Tree as follows: 1. Count the frequency of each character in the string. 2. Create a leaf node...
Input: String
Output: Computed result
coding Hard Verified Question #15

15. Expression Simplifier


Category: String coding problem
Given an algebraic expression string containing single lowercase-letter variables, the operators + and -, and parentheses ( and ), simplify...
Input: String
Output: Computed result
coding Medium Verified Question #16

16. Largest Island Perimeter


Category: Grid/matrix coding problem
You are given an m x n binary grid where each cell is either '1' (land) or '0' (water). A group of connected land cells (connected horizontally...
Input: 2D grid
Output: Computed result
coding Hard Verified Question #17

17. Interval Coverage Counter


Category: Interval-based coding problem
Given a list of closed intervals on the integer number line, build a data structure that efficiently answers point-coverage queries. A closed...
Input: List
Output: Computed result
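The preview is cut off, so the exact query interface is unknown. One common way to answer point-coverage queries over a static list of closed intervals is to sort the start and end points separately and use binary search: the number of intervals covering `x` equals the count of starts `<= x` minus the count of ends `< x`. The sketch below assumes the intervals do not change after construction; the class name `IntervalCoverage` and method `count` are hypothetical.

```python
from bisect import bisect_left, bisect_right
from typing import List, Tuple


class IntervalCoverage:
    """Counts closed intervals [start, end] that contain a query point."""

    def __init__(self, intervals: List[Tuple[int, int]]) -> None:
        self._starts = sorted(start for start, _ in intervals)
        self._ends = sorted(end for _, end in intervals)

    def count(self, x: int) -> int:
        started = bisect_right(self._starts, x)  # intervals with start <= x
        ended = bisect_left(self._ends, x)       # intervals with end < x
        return started - ended


coverage = IntervalCoverage([(1, 3), (2, 5), (4, 4)])
print(coverage.count(3))  # 2: covered by [1, 3] and [2, 5]
```

Construction costs O(n log n) and each query O(log n). If intervals had to be inserted or removed between queries, a different structure would be needed.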
coding Hard dynamic programming #1

1. [OA] Dynamic Programming — Build Efficient Advertisement Targeting for Google Ads

Google Ads needs a way to allocate budget efficiently to maximize conversions. Given a list of n campaigns with their individual expected conversions, calculate the maximum conversions achievable given a total budget of B.
Problem statement: Implement a function maxConversions(campaigns: List[int], B: int) -> int to return the maximum conversions that can be achieved without exceeding the budget.
- Input: A list of integers campaigns representing conversion estimates for each campaign and an integer B, the total budget available.
- Output: An integer representing the maximum conversions.
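Example 1:
Input: [200, 400, 300], B = 500
Output: 500
Explanation: Treating each value as both a campaign's cost and its expected conversions, campaigns 200 and 300 use the budget exactly: 200 + 300 = 500.
Example 2:
Input: [100, 150, 240, 300], B = 600
Output: 550
Explanation: All four campaigns total 790, which exceeds the budget. The best affordable subset is 100 + 150 + 300 = 550.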
Constraints:
- 1 <= n <= 100
- 1 <= campaigns[i] <= 10^4
- 1 <= B <= 10^6
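The statement gives only one number per campaign, so the sketch below rests on an assumption: each `campaigns[i]` is both the campaign's cost and its expected conversions, which makes the task a subset-sum problem (the largest achievable total not exceeding `B`). With `n <= 100` and `B <= 10^6`, a plain table of `n * B` cells is slow in Python, so this version stores the reachable totals as bits of a single integer.

```python
from typing import List


def maxConversions(campaigns: List[int], B: int) -> int:
    """Largest subset sum of `campaigns` that does not exceed B.

    Assumes campaigns[i] is both the cost and the conversions of campaign i.
    """
    reachable = 1                # bit t is set when total t is achievable; 0 always is
    mask = (1 << (B + 1)) - 1    # keeps only totals in the range 0..B
    for value in campaigns:
        # Either skip this campaign (old bits) or fund it (bits shifted by value).
        reachable |= (reachable << value) & mask
    # The highest set bit is the best total that fits in the budget.
    return reachable.bit_length() - 1


print(maxConversions([200, 400, 300], 500))  # 500, from 200 + 300
```

Under this reading, the input [100, 150, 240, 300] with B = 600 yields 550 (100 + 150 + 300), since funding all four campaigns would cost 790.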
coding Hard sliding window #2

2. [OA] Sliding Window — Optimize YouTube View Analytics

Google Analytics for YouTube needs an efficient way to process and optimize view counts over varying time windows. Given a list of n timestamps, determine the maximum number of views in any continuous time window of size k.
Problem statement: Implement a function maxViewsInWindow(timestamps: List[int], k: int) -> int that returns the maximum views within any continuous time window of size k.
- Input: A list timestamps of integers representing view timestamps and an integer k for the window size.
- Output: An integer representing the maximum number of views within any time window of size k.
Example 1:
Input: [1, 2, 3, 4, 5], k = 3
Output: 3
Explanation: The maximum views within any continuous window of size 3 are 3 views (from timestamps 3, 4, 5).
Example 2:
Input: [1, 2, 2, 2, 3, 3, 4], k = 2
Output: 5
Explanation: The window covering timestamps 2 and 3 contains three views at timestamp 2 and two views at timestamp 3.
Constraints:
- 1 <= n <= 10^6
- 0 <= timestamps[i] <= 10^9
- 1 <= k <= n
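The statement does not say whether the window is measured in time units or in list positions, nor whether the input is sorted. The sketch below assumes a window spans `k` consecutive integer time units (timestamps `t` through `t + k - 1`), which matches Example 1, and it sorts the input defensively. It then slides two pointers over the sorted timestamps.

```python
from typing import List


def maxViewsInWindow(timestamps: List[int], k: int) -> int:
    """Maximum number of views whose timestamps fit in k consecutive time units.

    Assumes a window covers timestamps t .. t + k - 1 inclusive.
    """
    ordered = sorted(timestamps)  # no-op cost aside, harmless if already sorted
    best = 0
    left = 0
    for right, current in enumerate(ordered):
        # Shrink from the left until the oldest view is inside the window.
        while current - ordered[left] >= k:
            left += 1
        best = max(best, right - left + 1)
    return best


print(maxViewsInWindow([1, 2, 3, 4, 5], 3))  # 3
```

Sorting dominates at O(n log n); the scan itself is O(n) because each pointer only moves forward. Under this reading, the input [1, 2, 2, 2, 3, 3, 4] with k = 2 yields 5.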
