Amazon Data Scientist Coding Questions

40 practice questions for Amazon Data Scientist interviews

Amazon data scientist interviews test statistical reasoning, ML model design, SQL proficiency, A/B testing methodology, and Python-based algorithm implementation.

coding Hard Verified Question #1

1. Binary Tree Cameras


Category: Binary tree coding problem
You are given the root of a binary tree. You need to install the minimum number of cameras on the tree nodes such that every node in the tree is...
Input: Binary tree
Output: Integer
coding Hard Verified Question #2

2. [CodeSignal] Warehouse Emergency Deliveries


Category: Array coding problem
Amazon has opened a new warehouse recently. There are no products in the warehouse currently. The warehouse is under inspection for n days. The...
Input: Array
Output: Integer
coding Hard Verified Question #3

3. [CodeSignal] Permutation Sorter


Category: Combinatorics coding problem
Amazon engineers are testing a new tool, the Permutation Sorter, built to reorder sequences using limited operations. Given a permutation of...
Input: Integer(s)
Output: Integer
coding Hard Verified Question #4

4. [CodeSignal] Maximum Product Rating


Category: Array coding problem
The engineers at Amazon are working on a new rating system for their products. For each product, an array customer_rating is maintained for the...
Input: Array
Output: Computed result
coding Medium Verified Question #5

5. [CodeSignal] Drone Hub Travel


Category: Array coding problem
Amazon is expanding its next-generation drone delivery network, consisting of m hubs arranged in a circular ring (Hub 1 is adjacent to Hub m)....
Input: Array
Output: Computed result
coding Medium Verified Question #6

6. [CodeSignal] Minimum Security Groups


Category: Array coding problem
A financial services company has requested AWS for a private deployment of its cloud network. There are n servers in the network where the security...
Input: Array
Output: Integer
coding Medium Verified Question #7

7. [CodeSignal] Maximum Secure Deliveries


Category: Array coding problem
You are given an array deliveryLogs of size n, where each element represents the number of parts delivered in the i-th log. You are also given...
Input: Array
Output: Integer
coding Medium Verified Question #8

8. Maximum Interval Overlap


Category: Interval-based coding problem
You are given a list of closed intervals on the number line, where each interval [start, end] includes both endpoints. Find the maximum number of...
Input: List
Output: Integer
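
The statement above is cut off, so the following is only a sketch under an assumed reading: that the task is to find the maximum number of intervals covering any single point. The function name max_interval_overlap is hypothetical. It uses a sweep over sorted endpoints; because the intervals are closed, a start is processed before an end at the same coordinate so that [1, 3] and [3, 5] count as overlapping at 3.

```python
from typing import List, Tuple


def max_interval_overlap(intervals: List[Tuple[int, int]]) -> int:
    """Assumed task: largest number of closed intervals sharing one point."""
    events = []
    for start, end in intervals:
        # Kind 0 (open) sorts before kind 1 (close) at the same coordinate,
        # which makes touching closed intervals count as overlapping.
        events.append((start, 0))
        events.append((end, 1))
    events.sort()

    best = current = 0
    for _, kind in events:
        if kind == 0:
            current += 1
            best = max(best, current)
        else:
            current -= 1
    return best
```

Sorting dominates the cost, so this runs in O(n log n) time for n intervals.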
coding Medium database #1

1. [OA] SQL Window Function — Analyze customer trends for Amazon Fresh

To improve customer targeting and marketing strategies, Amazon Fresh wants to analyze shopping trends over a sliding time window.
Problem statement: Write a SQL query that returns each customer's average purchase amount over the last 7 days. Assume a table named purchases with the columns customer_id, purchase_date, and amount.
Example 1:
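Input: The purchases table (customer_id, purchase_date, amount).
Output:
SELECT customer_id, AVG(amount) AS avg_purchase
FROM (
    SELECT customer_id, amount,
           ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY purchase_date DESC) AS rn
    FROM purchases
    WHERE purchase_date > DATE_SUB(CURRENT_DATE, INTERVAL 7 DAY)
) sub
WHERE rn <= 7
GROUP BY customer_id;
Explanation: The subquery (MySQL syntax) keeps purchases from the 7 calendar days ending today and ranks each customer's rows from newest to oldest. The strict > comparison matters: with >= the range would span 8 dates. Because a customer has at most one purchase per date, rn <= 7 never drops a row inside this range. The outer query then averages amount per customer.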
Constraints:
- Each customer_id has at most one purchase per purchase_date.
- purchase_date is of type DATE and has no NULL values.
coding Hard dynamic programming #2

2. [OA] Dynamic Programming — Build an Amazon Prime recommendation system

To enhance the user experience for Amazon Prime members, we need a dynamic programming solution that optimizes the personalized recommendations based on user purchase history.
Problem statement: Given a list of item_ids a user has purchased and a recommendation_count, determine the maximum number of unique items that can be recommended from the purchase history, using at most recommendation_count recommendations. Implement the function maxRecommendations(item_ids: List[str], recommendation_count: int) -> int, which returns that maximum.
Example 1:
Input: maxRecommendations(['A', 'B', 'C', 'A', 'E'], 3)
Output: 3
Explanation: The user can be recommended up to 3 unique items from their purchase history.
Example 2:
Input: maxRecommendations(['A', 'B', 'B', 'C'], 2)
Output: 2
Explanation: The user can be recommended up to 2 unique items from their purchase history.
Constraints:
- 1 <= item_ids.length <= 10^5
- 0 <= recommendation_count <= item_ids.length
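
The statement leaves the recommendation rule unspecified, so here is a minimal sketch under one assumed reading: every distinct purchased item is a candidate, and at most recommendation_count of them can be shown. Under that reading the answer is the smaller of the two numbers and no dynamic programming table is needed, despite the tag. Both examples above are consistent with this interpretation.

```python
from typing import List


def maxRecommendations(item_ids: List[str], recommendation_count: int) -> int:
    # Assumption: candidates are the distinct items in the purchase history,
    # capped by the number of recommendation slots.
    distinct_items = len(set(item_ids))
    return min(distinct_items, recommendation_count)


print(maxRecommendations(['A', 'B', 'C', 'A', 'E'], 3))  # 3, as in Example 1
print(maxRecommendations(['A', 'B', 'B', 'C'], 2))       # 2, as in Example 2
```

Building the set takes O(n) time and space, which fits the 10^5 length bound.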
coding Medium hash map #3

3. [OA] Sliding Window — Implement the session tracking for Amazon's Service Usage

Amazon often needs to monitor user sessions in a seamless manner to improve personalization and service recommendations. By keeping track of user activities in a sliding window, we can effectively manage and analyze session data.
Problem statement: You are tasked with implementing a SessionTracker class that maintains session usage data for users based on their active sessions within a specified time frame. The methods you need to implement are:
- start_session(user_id: str, timestamp: int) -> None: Start a session for a given user at the specified timestamp.
- end_session(user_id: str, timestamp: int) -> None: End the session for the user at the specified timestamp.
- get_active_users(current_time: int) -> List[str]: Return a list of users who have active sessions within the last 30 minutes of the current time.
Example 1:
Input: start_session('user1', 100)
Output: None
Explanation: A session is started for 'user1' at timestamp 100.
Example 2:
Input: start_session('user2', 200)
Output: None
Explanation: A session is started for 'user2' at timestamp 200.
Constraints:
- 1 <= user_id.length <= 100
- 0 <= timestamp <= 10^9
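
One possible shape for the class is sketched below. The statement does not say what unit the timestamps use or exactly when a session counts as active, so the sketch makes several assumptions: timestamps are in minutes (the window parameter is hypothetical and defaults to 30), only each user's most recent session is tracked, and a user is active if that session began at or before current_time and is either still open or ended no earlier than current_time - window. Results are returned sorted for determinism, which the statement does not require.

```python
from typing import Dict, List, Optional, Tuple


class SessionTracker:
    def __init__(self, window: int = 30) -> None:
        # Assumption: window is in the same unit as the timestamps.
        self.window = window
        # user_id -> (start, end); end is None while the session is open.
        self._sessions: Dict[str, Tuple[int, Optional[int]]] = {}

    def start_session(self, user_id: str, timestamp: int) -> None:
        # A new session replaces any earlier one for the same user.
        self._sessions[user_id] = (timestamp, None)

    def end_session(self, user_id: str, timestamp: int) -> None:
        session = self._sessions.get(user_id)
        if session is None or session[1] is not None:
            return  # No open session to end; ignore the call.
        self._sessions[user_id] = (session[0], timestamp)

    def get_active_users(self, current_time: int) -> List[str]:
        cutoff = current_time - self.window
        active = []
        for user_id, (start, end) in self._sessions.items():
            if start > current_time:
                continue  # Session had not begun yet at current_time.
            if end is None or end >= cutoff:
                active.append(user_id)
        return sorted(active)
```

A hash map keyed by user_id keeps start_session and end_session at O(1); get_active_users scans all tracked users, which is the simplest choice when queries are infrequent.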
