Unexpected Performance for AMD vs NVIDIA GPUs Running Disco Diffusion and Stable Diffusion Part 2

Disco Diffusion

This section of tests covers the AI script Disco Diffusion. Disco Diffusion is a very popular image generation AI, and many people have made custom models to use with it. It is also very customizable, and it can make relatively large images with little VRAM. Unlike Stable Diffusion, Disco Diffusion is very slow, since it can make much larger images and uses many more models.

FP16 Precision

FP16 means 16-bit floating-point precision. It is similar to the more common FP32 but stores each number in half as many bits, so it is less precise and usually faster. However, NVIDIA GPUs are far more optimized for FP16, while AMD GPUs are more optimized for FP32, so speed does not depend on FP16/FP32 FLOPS alone. FLOPS stands for floating-point operations per second. Many GPUs have processing power over one teraFLOPS, which is 10^12, or 1,000,000,000,000, operations per second. Floating point means the numbers have a fractional (decimal) part, which makes that figure seem even higher. Generation time is in minutes:seconds.
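To make the precision difference concrete, here is a minimal sketch that stores the same value as FP16 and as FP32 and reads it back. It uses only Python's standard `struct` module, runs on the CPU, and says nothing about GPU speed; the helper name `round_trip` is made up for this illustration.

```python
import struct


def round_trip(value: float, fmt: str) -> float:
    """Pack a number into the given IEEE 754 format and unpack it again."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


value = 0.1
as_fp16 = round_trip(value, "<e")  # "e" = 16-bit half precision
as_fp32 = round_trip(value, "<f")  # "f" = 32-bit single precision

print(f"FP16: {as_fp16!r} (error {abs(as_fp16 - value):.2e})")
print(f"FP32: {as_fp32!r} (error {abs(as_fp32 - value):.2e})")
print("bytes per value:", struct.calcsize("<e"), "vs", struct.calcsize("<f"))
```

The FP16 copy takes half the memory but lands further from 0.1 than the FP32 copy does, which is the trade-off described above.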

Models #1 = VITB32, VITB16, and RN50
Models #2 = VITB32, VITL14, VITB32_LAION2B_E16

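Settings                       | Generation time on AMD Radeon Instinct MI25 | Generation time on NVIDIA M40
-------------------------------|---------------------------------------------|------------------------------
1024×576, 100 steps, models #1 | 10:41                                       | 10:43
512×320, 1000 steps, models #1 | 57:11                                       | 71:55
512×512, 100 steps, models #2  | 11:05                                       | 14:06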
Time to generate images using the Disco Diffusion AI script on an AMD Radeon Instinct MI25 and an NVIDIA M40
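Since the times are given as minutes:seconds, it can help to convert them to seconds before comparing the two cards. The sketch below does that for the FP16 results above; the helper names `to_seconds` and `m40_slowdown` are made up for this illustration, and the figures are copied straight from the results.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string such as '57:11' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def m40_slowdown(mi25_time: str, m40_time: str) -> float:
    """How many times longer the M40 took than the MI25."""
    return to_seconds(m40_time) / to_seconds(mi25_time)


# (settings, MI25 time, M40 time) from the FP16 results
results = [
    ("1024×576, 100 steps, models #1", "10:41", "10:43"),
    ("512×320, 1000 steps, models #1", "57:11", "71:55"),
    ("512×512, 100 steps, models #2", "11:05", "14:06"),
]

for settings, mi25_time, m40_time in results:
    ratio = m40_slowdown(mi25_time, m40_time)
    print(f"{settings}: M40 took {ratio:.2f}x as long as the MI25")
```

In these FP16 runs the two cards are nearly tied on the first setting, while the M40 takes roughly a quarter longer on the other two.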

FP32 Precision

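Settings                       | Generation time on AMD Radeon Instinct MI25 | Generation time on NVIDIA M40
-------------------------------|---------------------------------------------|------------------------------
1024×576, 100 steps, models #1 |                                             |
512×320, 1000 steps, models #1 |                                             |
512×512, 100 steps, models #2  |                                             |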
