Salesforce data scientist interviews test statistical reasoning, ML model design, SQL proficiency, A/B testing methodology, and Python-based algorithm implementation.
No verified questions yet for Salesforce.
Implement a class FeatureAggregator that aggregates features from incoming streams of events over a specified time window.

Methods:
def add_event(self, event: Dict[str, Any]) -> None: accepts an event with features.
def get_aggregated_features(self, features: List[str]) -> Dict[str, float]: returns the average of the specified features over the time period.

Example 1:
aggregator = FeatureAggregator(time_window=60)
aggregator.add_event({'a': 10, 'b': 20, 'timestamp': 1})
aggregator.get_aggregated_features(['a', 'b']) -> {'a': 10.0, 'b': 20.0}

Example 2:
aggregator = FeatureAggregator(time_window=60)
aggregator.add_event({'a': 30, 'b': 40, 'timestamp': 1})
aggregator.add_event({'a': 10, 'b': 20, 'timestamp': 30})
aggregator.get_aggregated_features(['a', 'b']) -> {'a': 20.0, 'b': 30.0}

Implement a class LeadPrioritizer that uses a priority queue to manage leads. The class should support adding leads, returning the highest-priority lead, and removing that lead from the queue.

Methods:
def add_lead(self, lead: Tuple[str, int]) -> None: adds a new lead with a lead_id and priority_score.
def get_highest_priority(self) -> str: returns the lead_id with the highest priority.
def remove_highest_priority(self) -> None: removes the lead with the highest priority from the queue.

Example 1:
lead_prioritizer = LeadPrioritizer()
lead_prioritizer.add_lead(('lead1', 5))
lead_prioritizer.get_highest_priority() -> 'lead1' (highest priority because of score 5)

Example 2:
lead_prioritizer = LeadPrioritizer()
lead_prioritizer.add_lead(('lead2', 10))
lead_prioritizer.add_lead(('lead3', 3))
lead_prioritizer.remove_highest_priority()
lead_prioritizer.get_highest_priority() -> 'lead2'

Constraints: ... 10^5.

Given a table payments with the columns client_id, payment_date, and amount, write a query to compute a retention rate for each client on a month-over-month basis. Treat retention as whether a client who paid in the current month also paid in previous months, and return the client_id, month, and retention_rate (as a percentage).

Example input:
client_id | payment_date | amount
----------|--------------|-------
1         | 2021-01-10   | 100
1         | 2021-02-15   | 150
2         | 2021-01-20   | 200
2         | 2021-03-10   | 100
1         | 2021-03-12   | 200

Expected output:
client_id | month   | retention_rate
----------|---------|---------------
1         | 2021-01 | NULL
1         | 2021-02 | 100.00
1         | 2021-03 | 50.00
2         | 2021-01 | NULL
2         | 2021-02 | NULL
2         | 2021-03 | 100.00
Constraint: payment_date will be strictly in the format YYYY-MM-DD.

Given a list referrals, where each entry is a tuple containing customer_id, timestamp, and reward_points, implement the function track_referrals. The function should return a mapping of customer_id to total reward_points earned within a specified time_window (in seconds).

def track_referrals(referrals: List[Tuple[str, int, int]], time_window: int) -> Dict[str, int]
Returns a dictionary mapping customers to their total reward points within the time window.

Example 1:
Input: referrals = [('A', 1, 10), ('B', 2, 20), ('A', 3, 10), ('A', 4, 20)], time_window = 3
Output: {'A': 20, 'B': 20}

Example 2:
Input: referrals = [('A', 1, 10), ('B', 2, 20), ('A', 5, 30), ('C', 6, 40)], time_window = 5
Output: {'A': 10, 'B': 20, 'C': 40}

Constraints:
1 <= len(referrals) <= 10^4
0 <= timestamp <= 10^9
1 <= reward_points <= 100
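For the FeatureAggregator problem above, one possible approach is a sliding window backed by a deque. This is a minimal sketch, not a reference solution. It rests on assumptions the problem statement does not spell out: events arrive in non-decreasing timestamp order, the window ends at the most recent event's timestamp, an event is kept only if its timestamp is greater than that timestamp minus time_window, and a requested feature with no values in the window is left out of the result.

```python
from collections import deque
from typing import Any, Deque, Dict, List


class FeatureAggregator:
    """Averages numeric features over a sliding time window (sketch)."""

    def __init__(self, time_window: int) -> None:
        self.time_window = time_window
        # Events are stored oldest-first; assumes non-decreasing timestamps.
        self.events: Deque[Dict[str, Any]] = deque()

    def add_event(self, event: Dict[str, Any]) -> None:
        self.events.append(event)
        # Assumption: the window ends at the newest event's timestamp.
        cutoff = event["timestamp"] - self.time_window
        # Evict events that have fallen out of the window.
        while self.events and self.events[0]["timestamp"] <= cutoff:
            self.events.popleft()

    def get_aggregated_features(self, features: List[str]) -> Dict[str, float]:
        result: Dict[str, float] = {}
        for name in features:
            # Only events that actually carry the feature contribute.
            values = [e[name] for e in self.events if name in e]
            if values:  # Assumption: features with no data are omitted.
                result[name] = sum(values) / len(values)
        return result
```

Eviction happens on insert, so each event is appended and removed at most once; the averaging step still scans the events currently in the window for each requested feature.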
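For the LeadPrioritizer problem above, a binary heap from Python's heapq module is one natural fit. The sketch below assumes that a lower priority_score means higher priority; the problem statement does not say this outright, but it is the reading under which the second example returns 'lead2' after a removal. Raising IndexError on an empty queue and breaking ties in insertion order are also assumptions made for illustration.

```python
import heapq
from typing import List, Tuple


class LeadPrioritizer:
    """Priority queue of leads backed by a binary min-heap (sketch)."""

    def __init__(self) -> None:
        # Heap entries are (priority_score, insertion_order, lead_id).
        self._heap: List[Tuple[int, int, str]] = []
        self._counter = 0  # Tie-breaker: earlier leads come first.

    def add_lead(self, lead: Tuple[str, int]) -> None:
        lead_id, priority_score = lead
        # Assumption: the smallest score is the highest priority.
        heapq.heappush(self._heap, (priority_score, self._counter, lead_id))
        self._counter += 1

    def get_highest_priority(self) -> str:
        if not self._heap:
            raise IndexError("no leads in the queue")
        # The heap root is always the smallest entry.
        return self._heap[0][2]

    def remove_highest_priority(self) -> None:
        if not self._heap:
            raise IndexError("no leads in the queue")
        heapq.heappop(self._heap)
```

With this layout, add_lead and remove_highest_priority each take O(log n) time and get_highest_priority takes O(1). If a higher score should rank first instead, pushing the negated score would flip the ordering.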