Overview
Our Signals Modeling team at Microsoft AI develops the core intelligence that understands and predicts user interactions with advertisements, from initial impressions to clicks, post-click engagement, and ultimately downstream business outcomes. We design and train transformer-based models with billions of parameters that are crucial for ad ranking, pricing, and optimization across large-scale consumer platforms. These models go beyond simple click prediction: they reason over extensive user histories, rich ad and content representations, and heterogeneous event streams to infer user intent and advertiser value, even when ground-truth signals are sparse or partially unobservable.
The team maintains end-to-end ML systems, encompassing large-scale data and label construction, representation learning, multi-task and proxy objectives, calibration, and rigorous offline and online evaluation. We develop sophisticated training pipelines that transform weak signals, such as page visits, dwell time, or engagement events, into high-quality learning targets. The deployed models are designed to remain robust despite delayed conversions and dynamic marketplace shifts.
As an Applied Scientist on this team, you will operate at the intersection of deep learning, large-scale experimentation, and marketplace economics. This hands-on role offers significant ownership: you will help shape next-generation transformer architectures, push the boundaries of scalable training and serving, and see the measurable impact of your models within one of the world’s largest ads ecosystems.
Responsibilities
- Drive modeling and data innovations for ad interaction outcome prediction, particularly under partial and noisy feedback conditions.
- Focus on building estimated conversion models, designing data-driven attribution and weak-label generation pipelines, and developing robust learning and calibration methods for scenarios where true user outcomes are sparse, delayed, or unobservable.
- Design and evaluate multi-task and proxy-signal models, enhance offline and online measurement frameworks, and translate modeling advancements into production-ready systems that directly influence ad ranking, bidding, advertiser ROI, and user experience at web scale.
Required Qualifications
- Bachelor's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or a related field AND 4+ years of related experience (e.g., statistics, predictive analytics, research); OR
- Master's Degree in one of the above fields AND 3+ years of related experience; OR
- Doctorate in one of the above fields AND 1+ year(s) of related experience; OR
- Equivalent experience.
Preferred Qualifications
- Master's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or a related field AND 6+ years of related experience (e.g., statistics, predictive analytics, research); OR
- Doctorate in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or a related field AND 3+ years of related experience (e.g., statistics, predictive analytics, research); OR
- Equivalent experience.
- 3+ years of experience creating publications (e.g., patents, libraries, peer-reviewed academic papers).
- 3+ years of experience conducting research as part of a research program (in academic or industry settings).
- 1+ year(s) of experience developing and deploying live production systems as part of a product team.
- 1+ year(s) of experience developing and deploying products or systems at multiple points in the product cycle, from ideation to shipping.
- Experience working with noisy, weak, or proxy labels, including building training signals from indirect user behavior.
- Experience with conversion, outcome, or funnel modeling (e.g., post-click modeling, engagement modeling, attribution, or similar problems).
- Familiarity with model calibration, reliability analysis, or uncertainty estimation in production systems.
- Background in causal inference, attribution, or counterfactual evaluation.
- Experience with large-scale online marketplaces or ads/recommendation systems.
- Experience designing or operating multi-task / auxiliary-task learning systems.
- Proven technical leadership in cross-team modeling efforts or platform-level ML systems.
- 4+ years of industry experience building and shipping machine learning models in production.
- Solid hands-on experience with modern ML models (e.g., deep learning, tree-based models, or linear models) and feature engineering.
- Solid understanding of supervised learning and multi-task learning.
- Practical experience working with large-scale, real-world data and building end-to-end modeling pipelines (data preparation, training, validation, deployment).
- Experience with offline evaluation and online A/B experimentation for ML systems.
- Solid programming skills in Python and at least one major ML framework (e.g., PyTorch or TensorFlow).
- Ability to independently drive modeling projects from problem definition through production and iteration.