We are looking for 3 ML models whose output looks realistic enough that viewers cannot tell the results are computer-generated. Delivering them in a short period of time while maintaining quality is a key priority.
The use case is producing rendered videos of the results. The person in the videos would be sitting close to the camera, with no hand movement or major body movement (as seen in the attached image).
The process would be as follows:
1) We provide a video of a person as input to the model.
2) The model makes the changes in X amount of time (it doesn't have to be incredibly fast).
3) We render the result with the changes at 1920x1080 resolution (1080p), in high quality and definition.
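To make the three steps above concrete, here is a minimal Python sketch of how the workflow could be wired together. It is only an illustration under stated assumptions: `run_pipeline` and `build_render_command` are hypothetical names, each model is treated as a function that takes a video path and returns the path of the edited video, and the final render is assumed to go through the ffmpeg command-line tool with H.264 encoding. The actual models and rendering setup are up to the developer.

```python
from pathlib import Path
from typing import Callable, List, Sequence

# Assumption: each model (lip sync, hairstyle, clothes) is wrapped as a
# function that takes an input video path and returns the edited video path.
Stage = Callable[[Path], Path]


def run_pipeline(video: Path, stages: Sequence[Stage]) -> Path:
    """Steps 1 and 2: feed the video through each model in order."""
    current = video
    for stage in stages:
        current = stage(current)
    return current


def build_render_command(src: Path, dst: Path,
                         width: int = 1920, height: int = 1080) -> List[str]:
    """Step 3: build an ffmpeg command that renders the result at 1080p.

    Assumption: ffmpeg with libx264 is the renderer; a low CRF value
    favors quality over file size.
    """
    return [
        "ffmpeg", "-y",
        "-i", str(src),
        "-vf", f"scale={width}:{height}",  # resize to 1920x1080
        "-c:v", "libx264",
        "-crf", "18",                      # visually high quality
        "-preset", "slow",
        "-c:a", "copy",                    # keep the audio track unchanged
        str(dst),
    ]
```

The command list can be passed to `subprocess.run(cmd, check=True)` once the model stages have produced their output. Keeping each model behind the same path-in, path-out interface would let the three models be delivered and tested independently.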
- Audio lip-syncing model: a multilingual (mainly Spanish) model that works in the same way as the Wav2Lip GitHub repository.
However, Wav2Lip doesn't provide an HD model, so the output would need to be in the highest definition possible for realistic results.
- Hairstyle model: it would replace the hairstyle of the person in the video with a new one while maintaining realistic results.
- Clothes-changing model: it would replace the clothes of the person in the video with new ones. Keep in mind that only the top part needs to be replaced, because that's the only part visible to the camera. One possible approach would be to provide an image of any person wearing the target piece of clothing as an input alongside the video.
You have the freedom to test any type of hairstyle or clothing in the video.
We're looking forward to creating a great result and are open to offers, so feel free to contact us for more information.