Yes, at this stage it would be premature to conclude that 32 threads are not worthwhile. But I made that statement in my original post, not after you pointed out the flaw of including the 24 and 32 thread results.
However, I'm still confident that it holds true once we can get our hands on true 24 and 32 core results. More on that later.
Please look at this improved graph where I excluded the 24 and 32 thread data.
The flattening of the curve improves, logically, but there's way more to this than you may think. More on that below.
Then, using linear regression with the line forced through the origin (to satisfy @Matt), we get a black line that tries to describe our data linearly.
The slope of the ideal curve is y/x = 1/4 = 0.25, and this is also what my software (GraphPad Prism) tells me.
The measured slope of our data = 0.1949
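For anyone who wants to check the through-origin fit without Prism, here is a minimal sketch. It assumes the usual least-squares formula for a line with no intercept, slope = sum(x*y) / sum(x*x). The thread counts 4, 8, 12 and 16 are my assumption about which points remain in the graph, and only the ideal values are used, not the measured data.

```python
def slope_through_origin(xs, ys):
    """Least-squares slope of y = m * x (no intercept term)."""
    sxx = sum(x * x for x in xs)
    if sxx == 0:
        raise ValueError("need at least one non-zero x value")
    return sum(x * y for x, y in zip(xs, ys)) / sxx


# Assumed thread counts (not confirmed by the post).
threads = [4, 8, 12, 16]
# Ideal performance index: normalized so that 4 threads = 1.
ideal_index = [t / 4 for t in threads]

ideal_slope = slope_through_origin(threads, ideal_index)
print(ideal_slope)  # 0.25
```

Feeding the measured performance indices into the same function should give a slope close to the 0.1949 that Prism reports, if Prism uses the same unweighted fit.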
Say we assume the linear fit is robust and fine. Then we could safely extrapolate to predict the performance at any given number of threads and back-calculate the predicted render time for that number of threads.
So let's fill in 32 threads:
Ideal: 32 x 0.25 = 8
Ours: 32 x 0.1949 = 6.2368
Then calculate the expected render time: the render time in seconds at 4 threads for each participant, divided by the performance index, then averaged (+/- standard deviation) over the 4 participants:
Ideal: 170 seconds +/- 32 seconds
Ours: 218 seconds +/- 41 seconds
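The arithmetic above can be sketched as follows. The function names are just for illustration, and the per-participant 4-thread times are not listed here, so the sketch only checks that the two averages are consistent with the two slopes.

```python
def performance_index(slope, threads):
    """Predicted performance relative to 4 threads (4 threads = 1 on the ideal line)."""
    return slope * threads


def predicted_render_time(time_at_4_threads, slope, threads):
    """Render time at 4 threads divided by the predicted performance index."""
    return time_at_4_threads / performance_index(slope, threads)


IDEAL_SLOPE = 0.25
MEASURED_SLOPE = 0.1949

ideal_idx = performance_index(IDEAL_SLOPE, 32)     # 8.0
ours_idx = performance_index(MEASURED_SLOPE, 32)   # about 6.2368

# Both predictions divide the same 4-thread times, so they differ by a fixed ratio.
slowdown = ideal_idx / ours_idx
print(round(slowdown, 3))     # 1.283
print(round(170 * slowdown))  # 218
```

So the 218 seconds is simply the ideal 170 seconds scaled by 8 / 6.2368, and the standard deviation scales the same way (32 becomes about 41).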
Hurray! Yes? No!?
No! Why no Martin, you critical sucker!?
Linear regression is not fitting our data well (RMSE = 0.25, which is high for a dependent variable that is a normalized value with a unit of 1).
You can clearly see that the linear regression is under-estimating performance at low thread counts and over-estimating the performance at higher thread counts, starting at around 12 cores.
That's not subjective, that's objective.
The linear fit intersects the non-linear exponential fit just after 12 cores.
I'm too lazy to calculate the intersection of the two, but it looks to be at about 13 threads. We cannot buy such CPUs, and 13 is closer to 12 than to 16, so saying the decline starts at 12'ish makes discussion easier.
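For what it's worth, the intersection could be found numerically with a few lines of bisection. This is only a sketch: I'm assuming the exponential fit has the saturating form y = plateau * (1 - exp(-k * x)), and the plateau and k values below are placeholders, not the fitted parameters from my graph.

```python
import math


def saturating(x, plateau, k):
    """Assumed curve shape: rises quickly, then flattens toward `plateau`."""
    return plateau * (1.0 - math.exp(-k * x))


def crossover(slope, plateau, k, lo=1.0, hi=64.0, tol=1e-9):
    """Thread count where the line y = slope * x crosses the curve (bisection)."""
    def gap(x):
        return saturating(x, plateau, k) - slope * x

    # The curve must be above the line at `lo` and below it at `hi`.
    if gap(lo) <= 0 or gap(hi) >= 0:
        raise ValueError("no sign change inside the bracket")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# PLACEHOLDER parameters, for illustration only (not the real fit).
x_cross = crossover(slope=0.1949, plateau=5.0, k=0.06)
print(x_cross)
```

With the real fitted parameters plugged in, the same function would show whether the crossover really sits near 13 threads.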
Or another way: I can perform a t-test on the 12 thread data point and show it is significantly worse than ideal.
The 8 thread point could probably be statistically different too.
In fact, 8 threads is already enormously statistically different from theory, with p = 0.0001.
Is that meaningful? No. The 8 thread data point is almost right on top of the ideal situation, you can clearly see that in the graph, but statistics would make you believe otherwise! Wrong!
The observed values are just very tightly clustered, and that makes it easy for a statistical test to isolate them and tell you that 8 threads is significantly worse than the ideal situation.
So this test is not the right one for this purpose. A statistical difference does not always mean a practical/noticeable difference.
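To show how this happens, here is a small sketch of a one-sample t-test done by hand. The four values are made up for illustration (they are NOT the measured 8 thread results and will not reproduce p = 0.0001); they are just tightly clustered and sit slightly below the ideal value of 2.

```python
import math
import statistics


def one_sample_t(values, reference):
    """t statistic for testing whether the mean of `values` differs from `reference`."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    return (mean - reference) / (sd / math.sqrt(n))


# HYPOTHETICAL performance indices at 8 threads for 4 participants.
hypothetical_index = [1.94, 1.95, 1.96, 1.95]
IDEAL_AT_8 = 8 * 0.25  # 2.0 on the ideal line

# Two-sided critical t value for alpha = 0.05 with 3 degrees of freedom.
T_CRIT = 3.182

t_stat = one_sample_t(hypothetical_index, IDEAL_AT_8)
shortfall = 1 - statistics.mean(hypothetical_index) / IDEAL_AT_8

print(abs(t_stat) > T_CRIT)  # True: "statistically significant"
print(round(shortfall, 3))   # 0.025: only 2.5% below ideal
```

The spread is so small that the standard error becomes tiny, so even a 2.5% shortfall produces a huge t statistic.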
Back to the curve fits.
Linear regression is over-estimating performance from 12'ish threads on.
The non-linear regression, which fits the data much better, describes a flattening curve which predicts declining performance as the number of threads increases.
I'll see if I can extrapolate this curve up to 32 threads, but it doesn't take a lot of imagination to realize that 32 threads will not reach a score of 4, the equivalent of 16 ideal cores.
This brings me back to my prior statement about 32 threads not being worthwhile.
However, as you said, we need a native 24 and 32 core machine to verify this and construct an experimentally derived plateau to prove me wrong, but so far I'm very confident that 32 threads is on a dead plateau.
That's probably the case for all software, by the way; this is in no way meant to criticize TG's architecture!
I just want to make a very informed decision about whether to go for a sub 3k dollar render machine or a just sub 4k one with all bells and whistles.
Thank you all again!