That is incredibly strange! My first thought when you mentioned flipping the render upside down was that the hardest work for the renderer was at the bottom of the image, and that by the time the render process gets down there, only a few render buckets are left but they hold most of the remaining render work. In other words, you end up with a few render buckets, each assigned to a single CPU thread, that take proportionally longer while the other threads have nothing left to do, and thus the total render time is longer. This does happen on occasion, and I have seen similar situations where flipping the camera fixes it: the difficult buckets then start rendering early on and can be crunching away while the rest of the image renders, so they have more time to finish before the rest are done. Basically, you put the hardest-to-render stuff at the beginning of the render process.
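To make that scheduling effect concrete, here is a rough sketch in Python. It is only a toy model, not the renderer's actual scheduler: I am assuming each free thread simply grabs the next bucket in order, and the bucket costs (12 easy buckets at 1 time unit, 2 hard ones at 6 units) and the thread count of 4 are made-up numbers for illustration.

```python
import heapq

def render_time(bucket_costs, threads):
    """Wall-clock time when each free thread takes the next bucket in order.

    Toy model only: real bucket scheduling may differ.
    """
    # Min-heap of the times at which each thread next becomes free.
    free_at = [0.0] * threads
    heapq.heapify(free_at)
    finish = 0.0
    for cost in bucket_costs:
        start = heapq.heappop(free_at)   # earliest available thread
        end = start + cost
        finish = max(finish, end)
        heapq.heappush(free_at, end)
    return finish

# Hypothetical image: the two expensive buckets sit at the bottom.
easy_first = [1] * 12 + [6] * 2
# Flipping the camera puts the expensive buckets at the start instead.
hard_first = list(reversed(easy_first))

top_down = render_time(easy_first, threads=4)
flipped = render_time(hard_first, threads=4)
print(f"hard buckets last:  {top_down} time units")
print(f"hard buckets first: {flipped} time units")
```

In this toy case the same total work finishes sooner when the hard buckets come first, because with the hard buckets last, two threads end up idle while the other two grind through them.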
But... your situation isn't that way at all! For some reason the *beginning* of the render is slower (lower CPU utilization) than the later part. Which is really... odd. This is definitely worth looking into further, and thank you for all the in-depth testing and reporting back.
- Oshyan