Thank you for your explanation and for suggesting the benchmark, Oshyan. I've just rendered the Terragen 3 benchmark in 3:37 minutes. So I think the driver settings are fine now, since this time matches your score in the benchmark results. I've also had a closer look at thread usage while rendering. Masonspappy pointed out an interesting idea about thread allocation, and I think this may contribute to the mismatch between render time and core usage. With my little test scene, the render starts at 100% usage on all cores, but shortly afterwards drops to 70% and then gradually lower, even though there are still unrendered parts of the image. Strangely, I would have expected some cores to sit "idle" at that point, but the lower usage is spread equally over all cores. My expectation may be wrong, though, depending on how threading is implemented.
Ok, now I'm scratching my head. I just disabled "Defer atmo/cloud" and rendered my test scene again, with all cores. Whoosh - 3:33 minutes (it was 9:38 minutes with the option on). Rendered again with only 8 cores - 5:20 minutes. So the overall render is much faster, but the timing difference between 8 and 32 cores matches my previous measurements (factor of ~1.5).
The TG3 benchmark renders as follows:
32 cores - 3:37 minutes
8 cores - 7:03 minutes (factor ~2.0)
So either the 32 cores are much slower than one would expect due to the guts of the Z820 or the 8 cores render incredibly fast.
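As a quick sanity check on those factors, here is a minimal Python sketch of the arithmetic. The helper names (`to_seconds`, `speedup`) are made up for illustration, and the 4x "ideal" figure simply assumes render time would scale linearly with core count, which real renders rarely do.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' render time such as '3:37' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def speedup(slow: str, fast: str) -> float:
    """How many times faster the 'fast' render is than the 'slow' one."""
    return to_seconds(slow) / to_seconds(fast)


# (8-core time, 32-core time) as reported in this post
timings = {
    "test scene, defer atmo/cloud off": ("5:20", "3:33"),
    "TG3 benchmark": ("7:03", "3:37"),
}

# Assumption: perfect linear scaling from 8 to 32 cores would be 4x
IDEAL = 32 / 8

for name, (t8, t32) in timings.items():
    factor = speedup(t8, t32)
    print(f"{name}: factor {factor:.2f} ({factor / IDEAL:.0%} of ideal {IDEAL:.0f}x)")
```

With the timings above this gives factors of roughly 1.5 and 1.95, so in both cases less than half of what linear scaling would predict.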
But I think my initial issue is solved - the more cores used, the faster the render.
Since disabling "Defer atmo/cloud" gives such a big improvement in speed, are there any drawbacks to leaving this option off?
Thanks again,
Norman