Key Responsibilities:
- Design, develop, and fine-tune deep learning models for voice synthesis (e.g., TTS, voice cloning).
- Implement and optimize neural network architectures such as Tacotron, FastSpeech, WaveNet, or similar.
- Collect, preprocess, and augment speech datasets.
- Collaborate with product and engineering teams to integrate voice models into production systems.
- Perform evaluation and quality assurance of voice model outputs.
- Research and stay current on advancements in speech processing, audio generation, and machine learning.

Required Qualifications:
- Bachelor's or Master's degree in Computer Science, Electrical Engineering, or related field.
- Strong experience with Python and machine learning libraries (e.g., PyTorch, TensorFlow).
- Hands-on experience with speech/audio processing and relevant toolkits (e.g., Librosa, ESPnet, Kaldi).
- Familiarity with voice model architectures (TTS, ASR, vocoders).
- Understanding of deep learning concepts and model training processes.

Preferred Qualifications:
- Experience with deploying models to real-time applications or mobile devices.
- Knowledge of data labeling, voice dataset creation, and noise handling techniques.
- Experience with cloud-based AI/ML infrastructure (e.g., AWS, GCP).
- Contributions to open-source projects or published papers in speech/voice-related domains.

Key Skills: AI applications, deep learning, PyTorch, GCP, voice model architectures, machine learning, AWS, Python