T2 and 2 quad-core CPUs

Started by antti, October 17, 2009, 01:14:47 PM


antti

I have already learned from the forum that T2 does not necessarily get faster the more cores you have (beyond maybe 4-6 cores) because of the increased overhead. I also know that you can manually set the preferred number of cores to use in the T2 preferences. However, this leaves me with a couple of questions. I am currently building (= waiting for components for) a new Windows workstation with 2 Xeon CPUs, which will give me 8 physical cores plus 8 Hyper-Threading (logical) cores, and I would like to know how to optimise their use. (I know, I will run tests of my own once the computer is up and running, but I would like to get some hints now.)

If I set the number of cores to 4, will T2 use all the real cores of CPU1, or will it use, e.g., 2 real cores of CPU1 and 2 real cores of CPU2? Can this be defined somehow? I think that, at least in theory, it could be faster to use only 2 cores per CPU, because that would allow Intel Turbo Boost to kick in and increase the CPU speed.

Also, if I set the number of cores to, e.g., 6, is the distribution 4+2 or 3+3? And if the number of cores to use is somewhere between 4 and 8, will only the real cores be used, with HT remaining inactive?

And, is it possible to open two T2 applications so that the first one uses the cores of CPU1 and the second one uses the cores of CPU2, i.e. is it possible to use the second CPU as a separate "rendering node" within the same computer? 


Henry Blewer

This may not be helpful, but...
I have a Pentium 4 HT. If I use both logical processors (2 threads) to render, the size of the render block decreases a little. If there are four regions of an image to render, then it becomes six regions of the image. I have not noticed more memory overhead until I start adding object populations.
There could be a difference with a dual core or quad core, since each core would then use the same amount of cache. The P4 seems to shrink the cache. If the cache is 400 MB, four cores would need 1600 MB.
http://flickr.com/photos/njeneb/
Forget Tuesday; It's just Monday spelled with a T

Oshyan

There is no way to specify the cores assigned within TG2 (or any application as far as I know). But you can do this through Task Manager, to a certain extent, with "processor affinity". I'm not sure if you can differentiate between physical and logical cores there, but you can probably find a reference that will tell you.
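As a rough illustration of what a processor affinity setting encodes: it is a bitmask with one bit per logical CPU. The sketch below only computes such masks; it does not change any process. The layout it assumes (logical CPUs 0-7 on the first socket, 8-15 on the second) and the helper name `affinity_mask` are hypothetical, and the real numbering on a given machine has to be checked. The cmd `start` command's `/affinity` option takes a mask like this in hexadecimal.

```python
def affinity_mask(logical_cpus):
    """Return a bitmask with one bit set for each logical CPU index."""
    mask = 0
    for cpu in logical_cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative")
        mask |= 1 << cpu
    return mask


# ASSUMPTION: logical CPUs 0-7 belong to the first socket and 8-15 to the
# second. The actual numbering depends on the OS and hardware.
first_cpu = affinity_mask(range(0, 8))
second_cpu = affinity_mask(range(8, 16))

print(format(first_cpu, "X"))   # FF
print(format(second_cpu, "X"))  # FF00
```

With masks like these, two instances could in principle each be pinned to one socket, though whether that helps render times is something to measure rather than assume.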

For a Nehalem-based quad core (4 cores, 8 threads), setting it to 8 threads still gets you better performance than 4 threads in most cases. However, it doubles the memory requirements compared to 4 threads, so it can sometimes be disadvantageous, particularly when working with very memory-intensive scenes. I have a Core i7 and generally use 8 threads with an 800MB cache (100MB per thread). In some cases I need to reduce that to 600MB or even 400MB (50MB per thread), which may affect performance, but I seldom go down to 4 threads. That being said, in severely memory-constrained situations it would make sense, as 4 physical cores without memory constraints ought to be faster than 4 physical and 4 logical cores all starved for memory.
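The per-thread figures above are just the total cache size divided by the thread count. A minimal sketch of that arithmetic, assuming the cache is split evenly between render threads (the function name is made up for illustration):

```python
def cache_per_thread_mb(total_cache_mb, threads):
    """Even split of the total cache between render threads (an assumption)."""
    if threads <= 0:
        raise ValueError("threads must be positive")
    return total_cache_mb / threads


# The three cache sizes mentioned above, all at 8 threads.
for total in (800, 600, 400):
    print(f"{total} MB / 8 threads = {cache_per_thread_mb(total, 8):.0f} MB per thread")
```

Run the other way, keeping 100MB per thread at 16 threads would mean a 1600MB cache, which is why the thread count and the memory limit have to be considered together.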

Because you'll have 2 CPUs, you could theoretically use up to 16 threads, but this would definitely run into performance limits from overhead. 8 is the maximum I would use. There are also, as I said, memory issues with this, since TG2 is still a 32-bit app and can only use up to 4GB of memory on a 64-bit OS (less on a 32-bit OS). So even without the overhead, assuming that 16 threads were faster than 8, it would probably still be a good idea to limit it for memory reasons, at least until a 64-bit renderer is available.

You can run 2 instances of TG2 simultaneously, as long as they're working on different tasks. You could be rendering a scene in one, while tweaking another scene in the other. Or rendering multiple frames of the same animation in parallel.

- Oshyan

jo

Hi,

I don't think it's possible to predict which cores will be used by any particular thread. In fact, the OS will more than likely shift them around. Some people have seen render speed improvements from setting processor affinity, as Oshyan describes. Generally speaking, though, it's not really a good idea to try to outsmart the OS when it comes to scheduling threads.

I have a dual quad core Nehalem Xeon machine, actually a Mac Pro. I did some render tests on Vista 64 with a simple scene, just a heightfield and one layer of 3d clouds. Here is what I got for each number of threads:

16 threads : 5:58
8 threads : 6:10
6 threads : 6:59
4 threads : 9:41

So you can see it's worth experimenting with 16 threads when you get your machine together. It might be a different story with a more complex scene. I should try this with the TG2 benchmark.
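To make the comparison easier to read, the times above can be converted to seconds and expressed as a speedup over the 4-thread run. This is only a sketch over the four figures posted here, reading the 16-thread entry as 5 minutes 58 seconds; the helper name `to_seconds` is made up for illustration.

```python
def to_seconds(mmss):
    """Convert a 'minutes:seconds' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# Render times from the test above, keyed by thread count.
timings = {16: "5:58", 8: "6:10", 6: "6:59", 4: "9:41"}

baseline = to_seconds(timings[4])
for threads in sorted(timings):
    secs = to_seconds(timings[threads])
    print(f"{threads:>2} threads: {secs} s, {baseline / secs:.2f}x vs 4 threads")
```

Going from 8 to 16 threads saves 12 seconds on this scene, a much smaller step than going from 4 to 8.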

To use 16 threads on Windows you will need to set that specifically in the preferences. By default TG2 will detect just 8 cores on your new machine because it ignores Hyper-Threading/virtual cores. However, on Nehalem-family CPUs Hyper-Threading is much better than it used to be and is much more of an asset.

Regards,

Jo

antti

OK. Thank you very much for the answers. I will most certainly try different combinations to see how to get the best results, even though, as said here, the optimal settings may depend on the task at hand. I will report back with my findings.