Job Title: ML Engineer (Speech-to-Speech) — Subject Matter Expert

Level: Senior SME

Department: Software Development

Status: Contract (10-15 hours/week)

Work location: Fully Remote

Compensation: Hourly ($250)

Company Overview: At Vosyn, we embrace the exciting, game-changing world of Artificial Intelligence, driving innovation and pioneering impactful projects across various industries. Our incubator, AI Venture Lab, nestled in the heart of Office146.com, is a crucible of entrepreneurial spirit, supported by intelligent processes and industry-leading best practices. We believe in fostering a culture of flexibility, continuous improvement, and solution-focused strategies. Here, every idea is welcomed and nurtured, and has the potential to scale to new heights. We are currently pursuing a significant IPO and are truly a unicorn in the making. We invite you to be part of our journey and leave your imprint on the future of AI. At Vosyn, you will have the opportunity to engage with a fast-growing global organization that values diversity of thought, experience, and culture.

About the Role: We are seeking an experienced ML Engineer SME to provide strategic guidance and technical leadership on key components of our end-to-end speech-to-speech (S2S) pipeline. As a senior project advisor, you will collaborate with the Vosyn Core team, identifying solutions to complex challenges, particularly in text-to-speech (TTS) model development and optimization. Your expertise will be crucial in driving project progress and ensuring our S2S pipeline meets or exceeds industry standards for quality and performance.

Key Responsibilities:

Provide expert-level advice and mentorship on the architecture, training, and production of text-to-speech (TTS) models
Guide the implementation of robust testing methodologies for TTS models using industry-standard methods such as Mean Opinion Score (MOS) testing
Share expertise in distributed training, monitoring, and deployment of large-scale ML models on cloud platforms
Lead latency optimization initiatives in real-time systems for high-quality speech-to-speech conversion
Provide guidance on tuning TTS models for precise control over speech characteristics
Share in-depth knowledge of various TTS model architectures and waveform generation methods
Mentor the team on implementing advanced deep learning models for audio processing
Guide the development of transformer architectures for complex TTS models

Required Qualifications:

5+ years of proven experience in machine learning development focused on audio generation and TTS systems
Extensive expertise in audio signal processing, particularly for human voices
Deep experience with TTS models, including waveform generation and spectrogram-based methods
Proven expertise in tuning TTS models
