Overview
Be the voice behind the future—join us to build transformative speech technology for multilingual, intelligent experiences that reach billions.
Microsoft is pioneering next-generation AI-driven speech solutions across global languages for voice agents, video translation, and call centre analytics.
As a Senior Applied Scientist in Microsoft’s Azure Speech team, you will develop advanced multilingual speech models, LLM speech finetuning, and multimodal generative AI that power real-time transcription, intelligent voice agents, and multilingual speech solutions across Microsoft products and enterprise offerings. Your work will impact millions, enabling next-generation human–machine experiences for diverse markets, with a keen focus on India.
In this strategic role, you will have significant ownership of technical direction and drive innovation in speech recognition, AOAI customisation, and generative AI. You’ll collaborate with top scientists and engineers and influence cross-functional teams to scale model quality and deliver breakthrough technologies for voice agents, video translation, and call centre analytics.
Based in Hyderabad, this on-site role offers opportunities to mentor, grow, and shape the future of multimodal interaction for Indian and global audiences.
Microsoft’s mission is to empower every person and organisation to achieve more. We embrace a growth mindset and encourage teams and leaders to bring their best. Join us to shape the future of speech and multimodal LLM technology.
Responsibilities
- Deliver world-class, transformative speech solutions for Microsoft first-party and third-party products and services.
- Set technical direction in multilingual speech models, speech LLMs, and model customisation, and drive improvements in accuracy, latency, and compute cost.
- Build novel data generation solutions to synthesize complex speech scenarios and finetune models.
- Build data analysis metrics and solutions to understand the model results, identify gaps, and guide solutions.
- Collaborate with the global Microsoft team, drive innovative solutions for significant customer asks, and deliver sustained, large-scale impact.
- Mentor and influence peers, sharing expertise and fostering a growth-oriented inclusive team culture.
- Contribute to patents and publications at top-tier conferences and represent the team’s technical leadership within and outside Microsoft.
Required Qualifications
- BS/MS/PhD degree in CS/EE or a related field, with a strong focus on speech recognition systems, machine learning, and AI technology innovation.
- 8+ years of experience in speech or machine learning in an academic or industrial setting, or 8 years of software development experience with an aptitude for software design, coding, and quality.
- Demonstrated excellent problem-solving skills in speech and machine learning.
- Proven track record of delivering impactful results and high-quality solutions in complex technical environments.
- Strong programming skills in Python, C++ or similar languages, with experience in large-scale data processing and distributed computing.
- Effective communication skills, both verbal and written.
Preferred Qualifications
- Experience with speech/audio processing, multilingual model development, or voice agent technologies.
- Familiarity with Azure, cloud-based AI platforms, or enterprise-scale deployment of speech solutions.
- Contributions to open-source projects, patents, or publications in top-tier conferences/journals.
- Demonstrated leadership in driving technical direction, influencing cross-functional teams, and mentoring peers.