TG2 reports the number of cores it has detected in its startup/about box. The detected number may be wrong, so TG2 lets you override it in Preferences.
Even if the detected number is correct, you may get better results by reducing it. If TG2 detects 8 cores, it will render with 8 threads at once unless the min/max thread settings in the render node prevent this. You may find that 4, 5 or 6 threads are more efficient, so it is worth experimenting. You can either use the override in Preferences (which acts as a guide for how many threads a render should use) or reduce the max threads in the render settings to the number of threads you want to use.
I would be very interested to see what results you get. Maybe even try as low as 3 threads.
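If you want to keep track of the experiment, something like the rough Python sketch below may help. It is only an illustration, not part of TG2: `time_runs`, `fastest` and `fake_render` are hypothetical names, and the `run` callable stands in for however you actually start a render with a given thread limit (for example, changing the max threads in the render settings by hand and rendering the same scene each time).

```python
import time
from typing import Callable, Dict, Iterable


def time_runs(run: Callable[[int], None], thread_counts: Iterable[int]) -> Dict[int, float]:
    """Call run(n) once per thread count and record elapsed wall-clock seconds."""
    timings = {}
    for n in thread_counts:
        start = time.perf_counter()
        run(n)  # hypothetical: render with n threads and wait until it finishes
        timings[n] = time.perf_counter() - start
    return timings


def fastest(timings: Dict[int, float]) -> int:
    """Return the thread count with the shortest time (lowest count wins ties)."""
    return min(sorted(timings), key=lambda n: timings[n])


if __name__ == "__main__":
    # Stand-in workload so the script runs as-is; replace it with your own render step.
    def fake_render(threads: int) -> None:
        time.sleep(0.01)

    results = time_runs(fake_render, [3, 4, 5, 6, 8])
    for n, secs in sorted(results.items()):
        print(f"{n} threads: {secs:.2f} s")
    print("fastest:", fastest(results))
```

Render the same scene at the same settings for every thread count, otherwise the times are not comparable. If you time the renders by hand instead, you can skip `time_runs` and pass your own numbers straight to `fastest`.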
Regarding TP3 vs. TP4: The speed difference between TP3 and TP4 may not be noticeable on all machines. There was a bug which seemed to afflict some configurations more than others.
Matt