Hi Richard,
I've just checked this out. I can confirm your observation; however, your interpretation is incorrect :-). That is to say, while you're right that in this case rendering 2 individual frames side by side is quicker than rendering two frames in sequence, you are wrong about why that's happening.
I tested this on OS X using my dual quad core machine (8 real cores + 8 hyperthreads). I was also using a newer build than the public release, in 64-bit mode. The reason I mention that is that multithreaded rendering is more efficient in the newer version on the Mac (bringing it into line with the Windows version), so if any Mac users try to reproduce this with the current public version (v2.2.something) they won't be able to.
In any case, here are my results:
Sequence with 16 threads: 18:20
Two frames at once with 16 threads: 12:46
Two frames at once with 8 threads: 14:33
Right off the bat we can see that 8 threads is not actually faster than 16 threads (in your case 4 vs 8). While 16 threads is not twice as fast as 8, it's still considerably faster. I'll talk a bit more about that later. First off I'll address why I think it's quicker to render two frames at once instead.
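To put rough numbers on those three timings, here is a small Python sketch that converts them to seconds and compares them. It assumes each figure is the total wall-clock time for both frames; the function and variable names are just for illustration.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string into a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

# The three timings from the results above.
sequence_16 = to_seconds("18:20")   # two frames in sequence, 16 threads
parallel_16 = to_seconds("12:46")   # two frames at once, 16 threads
parallel_8 = to_seconds("14:33")    # two frames at once, 8 threads

def speedup(slower: int, faster: int) -> float:
    """How many times faster the second timing is than the first."""
    return slower / faster

# Two frames at once (16 threads) vs. in sequence (16 threads).
print(f"at once vs sequence: {speedup(sequence_16, parallel_16):.2f}x")   # 1.44x
# Two frames at once: 16 threads vs. 8 threads.
print(f"16 vs 8 threads:     {speedup(parallel_8, parallel_16):.2f}x")    # 1.14x
```

So on these figures, going from 8 to 16 threads is worth about 14%, well short of double but still a real gain.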
TG2 divides renders up into buckets. Each bucket is assigned to a thread. By default TG2 creates one thread for each core on your CPU. When a thread finishes rendering a bucket it's assigned a new bucket if there are any remaining to be rendered. As the render gets closer to being finished there are fewer buckets available to be assigned to threads. When a thread has no buckets it stops. This means that as the render gets closer to being finished fewer and fewer threads are actually running. You can see this in the Task Manager/Activity Monitor by watching the CPU usage and thread count decrease as the render nears the end.
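The way threads drop off towards the end can be illustrated with a toy simulation. This is only a sketch of the general idea (each bucket goes to whichever thread frees up first), not TG2's actual scheduler, and the bucket costs are made-up numbers: 30 quick buckets followed by 2 very slow ones.

```python
import heapq

def schedule(bucket_costs, num_threads):
    """Greedy scheduling: each bucket goes to the thread that frees up first.

    Returns one (start, end) interval per bucket, in bucket order.
    """
    free_at = [0.0] * num_threads   # time at which each thread becomes idle
    heapq.heapify(free_at)
    intervals = []
    for cost in bucket_costs:
        start = heapq.heappop(free_at)      # earliest available thread
        end = start + cost
        intervals.append((start, end))
        heapq.heappush(free_at, end)
    return intervals

def active_threads(intervals, t):
    """Number of buckets (and so threads) being worked on at time t."""
    return sum(1 for start, end in intervals if start <= t < end)

# Hypothetical costs in arbitrary time units, not measured from TG2.
costs = [1.0] * 30 + [20.0] * 2
intervals = schedule(costs, num_threads=16)
finish = max(end for _, end in intervals)

for t in (0.5, 1.5, 2.5, 10.0):
    print(f"t={t:>4}: {active_threads(intervals, t)} threads busy")
print(f"render finishes at t={finish}")
```

With these numbers all 16 threads are busy at first, but from t=2 onwards only 2 are, and those 2 account for most of the total render time.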
If you watch the benchmark scene you will see that the first 3/4 or so of it renders at a pretty steady pace. However it slows down dramatically when it gets to the lower right corner. I did quite a lot of work with this scene earlier in the year when I was investigating scaling performance. I had assumed that it was the black sphere which slows things down but the grass population also makes a big difference. If you turn on Ray trace objects and also Ray trace atmosphere the scene actually renders quite a lot quicker, with improved scaling.
In any case, that lower right corner really slows things down. As the rest of the image renders a lot quicker a big part of the render time is actually taken up by just a few threads, down to 2 from 16, working away on that part of the image. This is actually why it becomes faster to render two frames at the same time. When rendering in sequence only 2 threads are working on that part of the image and CPU usage is, let's say, 200%. However when you render two frames at the same time then 4 threads are working on that part of the image (2 per frame) and CPU usage is 400%. This means that on average more CPU is used while rendering the frames and therefore it finishes more quickly.
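The same argument can be written as a very simplified model. It assumes each frame has a "busy" phase that keeps every hardware thread occupied, followed by a slow tail where only 2 threads per frame are working, and that two simultaneous frames share the machine perfectly and hit their tails together. All the numbers are hypothetical, and it ignores the fact that hyperthreads are slower than real cores.

```python
HW_THREADS = 16        # hardware threads on the machine
TAIL_THREADS = 2       # threads still working during the slow tail of a frame

def sequence_time(busy_work, tail_time, frames=2):
    """Frames one after another: every frame pays for its own tail."""
    return frames * (busy_work / HW_THREADS + tail_time)

def simultaneous_time(busy_work, tail_time, frames=2):
    """Frames at once: busy phases share the machine, tails overlap."""
    return frames * busy_work / HW_THREADS + tail_time

def average_cpu_percent(busy_work, tail_time, total_time, frames=2):
    """Average CPU usage, where 100% means one thread fully busy."""
    work = frames * (busy_work + TAIL_THREADS * tail_time)
    return 100.0 * work / total_time

# Hypothetical frame: 3200 thread-seconds of well-spread work,
# then a 100 second tail with only 2 threads running.
busy_work, tail_time = 3200.0, 100.0

seq = sequence_time(busy_work, tail_time)
sim = simultaneous_time(busy_work, tail_time)
print(f"sequence:     {seq:.0f}s, avg CPU {average_cpu_percent(busy_work, tail_time, seq):.0f}%")
print(f"simultaneous: {sim:.0f}s, avg CPU {average_cpu_percent(busy_work, tail_time, sim):.0f}%")
```

In this model the total amount of work is identical either way. The simultaneous run is quicker only because the two tails overlap, so the saving is one tail's worth of time, and a scene with no slow tail would see no benefit.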
So you're absolutely right that rendering 2 frames at the same time is quicker. However I think this will depend in large part on your scene. The reason this "works" is down to that slow part of the scene where few threads are rendering. If you had a scene which was more "balanced" you might find that rendering two frames was slower. Another aspect could be population time - if you weren't repopulating every frame then only populating once at the start of a sequence could be a win in terms of render time overall. That's all kind of educated guesswork though. I do have a stripped back version of the benchmark scene which is more suited to testing scaling and I will try this again with that. IIRC scaling with that scene is something like 25% better than with the original benchmark scene.
Now we get to the part where you're incorrect :-). I don't mean to seem rude here but people do have the impression that TG2 doesn't scale well and that isn't really the case, so I think it's important to dispel that myth. Earlier in the year I looked into scaling, prompted by the fact that the Mac version was actually getting a lot slower once you moved from real cores to hyperthreads. For example it would scale pretty well up to 8 cores on my machine but once it got past that it really slowed down with hyperthreads. I'm happy to say that's fixed for the next release and Mac and Windows versions have the same sort of performance.
I posted a graph in this message (where I also talk about scaling):
http://forums.planetside.co.uk/index.php?topic=11545.msg121102#msg121102
You can see from that graph that on my machine scaling at 8 cores still has a little way to go to match the ideal, but it's not as bad as I think people commonly believe. One of the alpha testers has a 12 core machine and he sees the same sort of scaling out to 12 real cores. Once you get on to hyperthreads performance still improves but not as much. That's because hyperthreads are not real cores and they're not nearly as fast.
This is also demonstrated by my results. Two frames rendering at the same time with 16 threads is faster than using 8 threads. I think you made an incorrect assumption when you thought 4 threads would be faster than 8. I would be interested to see what happens if you tried this again but using 8 threads.
Touching on the render farm aspect, I think you reached the wrong conclusion that it would be faster to render 2 frames simultaneously on each render farm machine using half the number of threads. My results show it would be faster to render using the maximum number of threads available, even if you were running two instances (depending on the scene ;-).
Like I say, I don't mean to get down on you but I thought it was important to correct the idea that this has something to do with TG2 scaling poorly. There are still improvements which can be made, but it's pretty good. One thing I would like to see happen is the subdivision of buckets so that as the render nears completion the remaining buckets get divided up and more threads stay working for longer.
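To show why subdividing buckets would help, here is a toy comparison. To be clear, this is not something TG2 does at the moment and not a design for how it would; it's a sketch that assumes a slow bucket splits into equal parts with no overhead, using made-up costs.

```python
import heapq

def makespan(bucket_costs, num_threads):
    """Total render time when each bucket goes to the first free thread."""
    free_at = [0.0] * num_threads   # time at which each thread becomes idle
    heapq.heapify(free_at)
    for cost in bucket_costs:
        heapq.heappush(free_at, heapq.heappop(free_at) + cost)
    return max(free_at)

def subdivide(cost, parts):
    """Split one bucket into equal sub-buckets (idealised: no overhead)."""
    return [cost / parts] * parts

# Hypothetical costs: 30 quick buckets, then 2 slow ones at the end.
quick = [1.0] * 30
slow = [20.0, 20.0]

whole = makespan(quick + slow, num_threads=16)
split = makespan(quick + [part for s in slow for part in subdivide(s, 8)],
                 num_threads=16)

print(f"slow buckets left whole:     {whole}")   # 21.0
print(f"slow buckets split 8 ways:   {split}")   # 4.5
```

In practice the gain would be smaller, since the cost within a bucket isn't evenly spread and splitting has its own overhead, but it shows how much time those last couple of threads can account for.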
Congratulations on 2300 posts BTW :-). I notice that Kadri has also reached that milestone today, spooky!
Regards,
Jo