Post Processing, OpenCL, and your computer hardware

mikeee

Senior Member
I've recently been reading about a framework called "OpenCL", which lets software on your computer
use the memory and compute power of your graphics card. Why do that?
Well, a graphics card is a lot faster than a CPU at highly parallel math! Rendering graphics, for games for example, requires
millions of math calculations. I have a Ryzen 5 2600 CPU that's overclocked to 4 GHz, DDR4 RAM running at 3200 MHz,
and a mid-range graphics card (AMD RX 570 with 4 GB of GDDR5 RAM).

I have a benchmark called "clpeak" that shows how well the CPU and GPU do math:

GPU
Platform: AMD Accelerated Parallel Processing
Device: gfx803
Driver version : 3212.0 (HSA1.1,LC) (Linux x64)
Compute units : 32
Clock frequency : 1280 MHz

Global memory bandwidth (GBPS)
float : 171.47
float2 : 166.89
float4 : 168.08
float8 : 168.09
float16 : 75.21

Single-precision compute (GFLOPS)
float : 5278.92
float2 : 5225.89
float4 : 5163.39
float8 : 5055.43
float16 : 4981.63

CPU
Platform: Portable Computing Language
Device: pthread-AMD Ryzen 5 2600 Six-Core Processor
Driver version : 1.5 (Linux x64)
Compute units : 12
Clock frequency : 3989 MHz
48 warnings generated.

Global memory bandwidth (GBPS)
float : 31.95
float2 : 38.94
float4 : 36.29
float8 : 30.67
float16 : 25.03

Single-precision compute (GFLOPS)
float : 12.47
float2 : 24.80
float4 : 49.45
float8 : 85.08
float16 : 19.94

The memory bandwidth and GFLOPS of my graphics card are way higher than those of my CPU and main RAM (and those are fairly modern/fast tech!)
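
To put rough numbers on "way higher", here is a small sketch that divides the best GPU figures by the best CPU figures from the clpeak output above. The variable names are just mine for illustration. Keep in mind the CPU numbers are what the Portable Computing Language driver managed in this run, which may be less than the chip can do with hand-tuned code, so treat the ratios as "what this benchmark showed" and not as a hard limit.

```python
# Best-case figures copied from the clpeak output above.
gpu_gflops = 5278.92     # RX 570 (gfx803), "float" row
cpu_gflops = 85.08       # Ryzen 5 2600 via Portable Computing Language, "float8" row
gpu_bandwidth = 171.47   # GB/s, "float" row
cpu_bandwidth = 38.94    # GB/s, "float2" row

# How many times faster the GPU was in each category.
compute_ratio = gpu_gflops / cpu_gflops
bandwidth_ratio = gpu_bandwidth / cpu_bandwidth

print(f"compute: GPU is about {compute_ratio:.0f}x the CPU")
print(f"memory bandwidth: GPU is about {bandwidth_ratio:.1f}x the CPU")
```

So in this benchmark the card did roughly 62 times the single-precision math and moved memory about 4.4 times as fast.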

Well, I was curious about what software uses OpenCL, and found that most post-processing software does!
Before I go further: I am using Fedora Linux on my computer, not Windows or a Mac. I am not sure if those
platforms support OpenCL, but I imagine they do.
I have started using darktable, and there's a way to run it from the command line to see profiling/performance info:

darktable -d opencl -d perf

The manual describes a lot of options for configuring OpenCL behaviour in darktable.

Starting darktable at the command line prints a lot of stuff:

0.052982 [opencl_init] found opencl runtime library '/opt/rocm/lib/libOpenCL.so'
0.052997 [opencl_init] opencl library '/opt/rocm/lib/libOpenCL.so' found on your system and loaded
0.086635 [opencl_init] found 1 platform
0.086648 [opencl_init] found 1 device
0.086669 [opencl_init] device 0 `gfx803' supports image sizes of 16384 x 16384
0.086674 [opencl_init] device 0 `gfx803' allows GPU memory allocations of up to 3481MB
[opencl_init] device 0: gfx803
GLOBAL_MEM_SIZE: 4096MB
MAX_WORK_GROUP_SIZE: 256
MAX_WORK_ITEM_DIMENSIONS: 3
MAX_WORK_ITEM_SIZES: [ 1024 1024 1024 ]
DRIVER_VERSION: 3212.0 (HSA1.1,LC)
DEVICE_VERSION: OpenCL 1.2

It sees my card. It is working!

Further in the output, we can see if it's using the CPU or the GPU (graphics card). It prints all the things it does.
Here is a photo export:

48.303760 [dev] took 0.000 secs (0.000 CPU) to load the image.
48.345436 [export] creating pixelpipe took 0.037 secs (0.053 CPU)
48.345461 [pixelpipe_process] [export] using device 0
48.345489 [dev_pixelpipe] took 0.000 secs (0.000 CPU) initing base buffer [export]
48.356296 [dev_pixelpipe] took 0.011 secs (0.007 CPU) processed `raw black/white point' on GPU, blended on GPU [export]
48.359516 [dev_pixelpipe] took 0.003 secs (0.000 CPU) processed `white balance' on GPU, blended on GPU [export]
48.363161 [dev_pixelpipe] took 0.004 secs (0.000 CPU) processed `highlight reconstruction' on GPU, blended on GPU [export]
48.379319 [dev_pixelpipe] took 0.016 secs (0.007 CPU) processed `demosaic' on GPU, blended on GPU [export]
48.391465 [dev_pixelpipe] took 0.012 secs (0.008 CPU) processed `base curve' on GPU, blended on GPU [export]
48.403628 [dev_pixelpipe] took 0.012 secs (0.005 CPU) processed `input color profile' on GPU, blended on GPU [export]
image colorspace transform Lab-->RGB took 0.002 secs (0.002 GPU) [filmicrgb ]
53.199572 [dev_pixelpipe] took 4.796 secs (1.049 CPU) processed `filmic rgb' on GPU, blended on GPU [export]
image colorspace transform RGB-->Lab took 0.002 secs (0.003 GPU) [colorcorrection ]
53.267203 [dev_pixelpipe] took 0.068 secs (0.021 CPU) processed `color correction' on GPU, blended on GPU [export]
53.316746 [dev_pixelpipe] took 0.050 secs (0.015 CPU) processed `sharpen' on GPU, blended on GPU [export]
53.350306 [dev_pixelpipe] took 0.034 secs (0.010 CPU) processed `output color profile' on GPU, blended on GPU [export]
53.498589 [dev_pixelpipe] took 0.148 secs (0.579 CPU) processed `display encoding' on CPU, blended on CPU [export]
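
If you want to see at a glance where the time went, the log is regular enough to pick apart with a short script. This is only a sketch: the regular expression is my own guess at the line layout based on the output above (it is not an official darktable format description), and I pasted in just four of the lines to keep it short.

```python
import re

# A few lines copied from the export log above.
LOG = """\
48.379319 [dev_pixelpipe] took 0.016 secs (0.007 CPU) processed `demosaic' on GPU, blended on GPU [export]
53.199572 [dev_pixelpipe] took 4.796 secs (1.049 CPU) processed `filmic rgb' on GPU, blended on GPU [export]
53.316746 [dev_pixelpipe] took 0.050 secs (0.015 CPU) processed `sharpen' on GPU, blended on GPU [export]
53.498589 [dev_pixelpipe] took 0.148 secs (0.579 CPU) processed `display encoding' on CPU, blended on CPU [export]
"""

# Assumed layout: took <wall secs> secs (<cpu secs> CPU) processed `<module>' on <device>
PATTERN = re.compile(
    r"took ([\d.]+) secs \([\d.]+ CPU\) processed `([^']+)' on (GPU|CPU)"
)

def module_timings(log_text):
    """Return (module, wall_seconds, device) for each processed module."""
    return [
        (m.group(2), float(m.group(1)), m.group(3))
        for m in PATTERN.finditer(log_text)
    ]

timings = module_timings(LOG)
slowest = max(timings, key=lambda t: t[1])
total = sum(seconds for _, seconds, _ in timings)
on_cpu = [name for name, _, device in timings if device == "CPU"]

# Print modules from slowest to fastest.
for name, seconds, device in sorted(timings, key=lambda t: t[1], reverse=True):
    print(f"{seconds:7.3f}s  {device}  {name}")
print(f"slowest: {slowest[0]}; total: {total:.3f}s; ran on CPU: {on_cpu}")
```

For this export, `filmic rgb` accounts for almost all of the time, and `display encoding` is the only listed module that ran on the CPU.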

Mostly sharing because when buying a computer, you might think you don't need a big gaming graphics card, but having a good GPU
could make post processing a lot faster!
 

Needa

Senior Member
Challenge Team
The graphics in my computer, which came over on the Mayflower, don't support OpenCL. It is something I will consider going forward.
 

mikeee

Senior Member
Needa said: "The graphics in my computer which came over on the Mayflower, doesn't support openCL. It is something I will consider going forward."

I can't remember how well my old Phenom II box did when I putzed with it years ago, but it's definitely very fast on this one.
 