maeblythe edited this page Jun 29, 2024 · 1 revision

Embeds

Embeds are optional features that you can enable during training to add extra functionality to your model. The best example is pitch: without it, your model must be tuned manually, in the same way that an UTAU must be tuned.

Precautions

Enabling too many embeds can make your model unstable. For this reason, the Google Colab notebook only allows you to select certain combinations of embeds. You can override this restriction by editing the config.yaml file.
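
As a rough illustration of what editing config.yaml by hand amounts to, here is a minimal Python sketch that flips one boolean setting in the file's text and lists which embed settings are switched on. The key names (use_energy_embed, use_breathiness_embed, use_tension_embed) and the "_embed" suffix are assumptions made for this example, not names confirmed by this page, so check your own config.yaml for the real ones and keep a backup before changing anything.

```python
import re


def set_flag(config_text: str, key: str, enabled: bool) -> str:
    """Return config_text with a top-level `key: true|false` line set to `enabled`."""
    value = "true" if enabled else "false"
    # Match the key at the start of a line, followed by a boolean value.
    pattern = re.compile(rf"^({re.escape(key)}\s*:\s*)(true|false)\b", re.MULTILINE)
    new_text, count = pattern.subn(rf"\g<1>{value}", config_text)
    if count == 0:
        raise KeyError(f"{key!r} not found as a top-level boolean")
    return new_text


def enabled_flags(config_text: str, suffix: str = "_embed") -> list[str]:
    """List top-level keys ending in `suffix` whose value is true."""
    pattern = re.compile(rf"^(\w+{re.escape(suffix)})\s*:\s*true\b", re.MULTILINE)
    return pattern.findall(config_text)


# Hypothetical config contents; the key names are placeholders.
example = """\
use_energy_embed: false
use_breathiness_embed: true
use_tension_embed: false
"""

updated = set_flag(example, "use_energy_embed", True)
print(enabled_flags(updated))
```

Listing the enabled settings after each change is a simple way to notice when you have switched on more embeds than you intended, which matters given the stability warning above.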

The author of this page does not understand the concept of embeds fully and hopes that someone who does will correct this page in the future.

Types of Embeds

Pitch

The pitch embed lets your model produce its own pitch, based on the pitch found in your dataset. It only works with Variance models and requires selecting "estimate midi" while preparing your data.

Gender

This was less an embed than an option to augment data. It seems to have been removed from the Colab.

Energy and Breathiness

This was one of the first available embeds. In Synth V, breathiness determines the amount of airflow in the voice.

Tension

Tension in Synth V determines how sharp or relaxed a voice will sound.

Voicing

Voicing in Synth V determines whether a voice sings normally or whispers.
