Vosyn

ML Engineer Speech-to-Speech - Subject Matter Expert

Department: Engineering
Job Type / Location: Remote
Experience Required: 5+ years

Job Title: ML Engineer (Speech-to-Speech) — Subject Matter Expert

Level: Senior SME

Department: Software Development

Status: Contract (10-15 hours/week)

Work location: Fully Remote

Compensation: Hourly ($250)

Company Overview: At Vosyn, we embrace the exciting, game-changing world of Artificial Intelligence, driving innovation and pioneering impactful projects across various industries. Our incubator, AI Venture Lab, nestled in the heart of Office146.com, is a crucible of entrepreneurial spirit, supported by intelligent processes and industry-leading best practices. We believe in fostering a culture of flexibility, continuous improvement, and solution-focused strategies. Here, every idea is welcomed and nurtured, and has the potential to scale to new heights. Currently, we're at the forefront of a significant IPO endeavor, truly a unicorn in the making. We invite you to be part of our journey and leave your imprint on the future of AI. At Vosyn, you will have the opportunity to engage with a fast-growing global organization with diversity of thought, experience, and cultures.

About the Role: We are seeking an experienced ML Engineer SME to provide strategic guidance and technical leadership on key components of our end-to-end speech-to-speech (S2S) pipeline. As a senior project advisor, you will collaborate with the Vosyn Core team, identifying solutions to complex challenges, particularly in text-to-speech (TTS) model development and optimization. Your expertise will be crucial in driving project progress and ensuring our S2S pipeline meets or exceeds industry standards for quality and performance.

Key Responsibilities:

  • Provide expert-level advice and mentorship on the architecture, training, and production of text-to-speech (TTS) models
  • Guide the implementation of robust testing methodologies for TTS models using industry-standard measures such as Mean Opinion Score (MOS) testing
  • Share expertise in distributed training, monitoring, and deployment of large-scale ML models on cloud platforms
  • Lead latency optimization initiatives in real-time systems for high-quality speech-to-speech conversion
  • Provide guidance on tuning TTS models for precise control over speech characteristics
  • Share in-depth knowledge of various TTS model architectures and waveform generation methods
  • Mentor the team on implementing advanced deep learning models for audio processing
  • Guide the design of transformer architectures for complex TTS models

Required Qualifications:

  • 5+ years of proven experience in machine learning development focused on audio generation and TTS systems
  • Extensive expertise in audio signal processing, particularly for human voices
  • Deep experience with TTS models, including waveform generation and spectrogram-based methods
  • Proven expertise in tuning TTS models
