
System Requirements

maeblythe edited this page Aug 4, 2024 · 2 revisions

Usage Requirements via OpenUtau

It is very likely that DiffSinger can be used via OpenUtau on systems with very low specs; it has been reported to work on a processor as weak as an Intel Atom. However, rendering pitch with "load rendered" will often fail if the section of pitch to be rendered is too long for the computer to handle. Splitting a USTx into several parts can be a workaround in that case.

Training requirements via Google Colab

Any modern web browser can be used to train via Google Colab. This can be done for free, but be aware that you will need to regularly empty the trash in your Google Drive so that checkpoints have space to save.

Results of experiments training locally with the Colab

Some people may wonder why you would use the Google Colab with Docker instead of training locally. The Colab uses the most recent version every time you load it, and because each container is isolated, you never have to worry about accidentally breaking your own computer's setup.

Containers are only allocated 2GB of RAM by default, but you can edit the configuration to raise this limit.
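Where this limit is raised depends on how Docker runs on your machine. As one hedged example, with Docker Desktop on Windows using the WSL 2 backend, container memory is capped by the `.wslconfig` file in your user profile folder (keys per Microsoft's WSL documentation; the 16GB value here is only an illustration, adjust it to your machine):

```ini
; .wslconfig in %UserProfile% (Windows, Docker Desktop, WSL 2 backend)
[wsl2]
; RAM made available to WSL 2, and therefore to Docker containers
memory=16GB
```

On Linux, the equivalent is usually the `--memory` flag on `docker run` rather than a config file.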

If you keep the batch size at 9 or below, a graphics card with only 8GB of VRAM will be enough. It is only at a batch size of 15 that I have seen VRAM usage reach 8GB.

At a batch size of 15, system RAM usage is still under 6GB.
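The batch size is set in the training configuration file. As a hedged sketch, in OpenVPI-style DiffSinger configs the relevant key is commonly `max_batch_size`; the exact key name may differ between forks, so verify against the base config shipped with yours:

```yaml
# Excerpt from a DiffSinger training config (key name assumed; check
# your fork's base config before relying on it).
max_batch_size: 9   # 9 or below has been observed to fit in 8GB of VRAM
```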

If you wish to build your own computer to create a DiffSinger model, you will be fine with 16GB of system RAM and a graphics card with 8GB of its own VRAM.

Training locally

Training a DiffSinger model is an extremely demanding task for your computer, and the training code requires CUDA. This means that even if your computer otherwise has the highest specs possible, you will be locked out of training if you do not have an Nvidia GPU.
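A quick way to confirm your machine qualifies is to ask PyTorch (which DiffSinger is built on) whether it can see a CUDA device. This is a minimal sketch assuming a standard PyTorch install:

```python
# Check whether PyTorch can see a CUDA-capable GPU before starting training.
try:
    import torch
    has_cuda = torch.cuda.is_available()
    device_name = torch.cuda.get_device_name(0) if has_cuda else None
except ImportError:
    has_cuda = False   # PyTorch is not installed at all
    device_name = None

print("CUDA available:", has_cuda)
if not has_cuda:
    print("No CUDA GPU detected; DiffSinger training will not run on this machine.")
```

If this prints `False` on a machine with an Nvidia card, the usual cause is a CPU-only PyTorch build or a missing driver.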

In the past it was possible to use workarounds, such as running the Google Colab with Docker, to lower memory requirements on systems that have little RAM but still have a GPU with CUDA cores. However, the last time this was confirmed to work was December 2023.

It seems that the amount of RAM and VRAM you have access to matters more than the exact model of your GPU.

Local vs Paying Google

If you wish to complete your training quickly and do not plan on creating multiple banks, you can buy Compute Units from Google to use the Colab without usage limits. The price has been informally quoted at $100 USD, which is much more affordable than even buying a capable GPU.

Warning

The process of training may permanently damage your computer's hardware. You should only train on hardware that you can live without or replace in the event of failure.