I feel that the video is quite misleading. Denoising in post is a good strategy and good advice, and that part of the video is useful. It's the stacking that I have a problem with. Stacking images in 'mean' mode is not much different from what C4D's physical renderer does when you let it render for longer. He compares a stack of 15 images (total render time 12m51s) with a single C4D render that took 30m32s, and the 30 minute render is clearly much better. But he never shows what a 13 minute C4D render would look like. I suspect it would be similar in quality to the stacked version, or perhaps even better: the 30 minute render beats the 13 minute stack by such a margin that C4D's own sampling appears to be doing something cleverer than simple averaging.
He suggests using a Denoise filter on the stacked version to clean up the remaining noise, and it ends up about as smooth as the 30 minute render, of course. But my question would be: why not render for 13 minutes in C4D and then denoise that instead? Would the result be better or worse?
In most of the examples he showed, the stacked version was noisier (before the Denoise filter was applied), so where's the gain? But since he never compared stacking against a native render of the same total render time, we can't be sure.
It's the renderer's job to produce the lowest noise render in the time it's given, and if stacking were more efficient then the renderer would work like that already.
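To make the "mean stacking is just more samples" point concrete, here's a toy numpy sketch (not C4D code, and the sample counts and noise level are made-up illustrative numbers): averaging 15 short renders in 'mean' mode is statistically the same estimator as one long render with 15x the samples, so the noise level is identical either way.

```python
import numpy as np

rng = np.random.default_rng(0)
true_value = 1.0                 # ground-truth pixel brightness
sigma = 0.5                      # per-sample Monte Carlo noise (illustrative)
n_frames, samples_each = 15, 16  # 15 short renders vs one long render

# One long render: all 240 samples averaged in a single pass.
long_render = rng.normal(true_value, sigma, n_frames * samples_each).mean()

# Stacking: 15 short renders of 16 samples each, combined in 'mean' mode.
short_renders = rng.normal(true_value, sigma, (n_frames, samples_each)).mean(axis=1)
stacked = short_renders.mean()

# Both are averages of the same 240 independent samples' worth of work,
# so they share the same standard error: sigma / sqrt(240).
print(abs(long_render - true_value), abs(stacked - true_value))
```

Both errors come out around sigma/sqrt(240), which is why I'd expect a 13 minute native render to match a 13 minute stack, before the renderer does anything cleverer than plain averaging.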
I'm being a bit harsh. There are some cases where stacking can reduce certain types of noise better than the renderer would natively, such as 'fireflies': isolated super-bright pixels in very bright regions. The reason is that each frame can be clamped to low dynamic range before stacking, which helps a little bit.
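Here's a toy numpy sketch of that clamping trick (the pixel values and firefly brightness are made-up illustrative numbers, not measurements from the video): one extreme outlier frame wrecks a plain mean, but clamping each frame to LDR first keeps the average close to the true value.

```python
import numpy as np

rng = np.random.default_rng(0)
# One pixel across 15 frames: a stable value around 0.5, plus one
# 'firefly' -- a rare, extremely bright Monte Carlo outlier.
frames = rng.normal(0.5, 0.05, 15)
frames[3] = 40.0                                  # firefly in a single frame

plain_mean = frames.mean()                        # outlier dominates the mean
clamped_mean = np.clip(frames, 0.0, 1.0).mean()   # clamp each frame to LDR first

print(plain_mean, clamped_mean)
```

The renderer accumulates the unclamped HDR values internally, so this particular fix is something the image editor can do that the native average can't.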
Modes other than 'mean', e.g. 'median', might sometimes be a better choice for stacking, and that's where stacking in the image editor could perhaps come out smoother than the renderer's native output.
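Same toy setup as above (again with made-up numbers) to show why 'median' mode can win: the mean is dragged far off by a single firefly frame, while the median throws it away entirely.

```python
import numpy as np

rng = np.random.default_rng(0)
frames = rng.normal(0.5, 0.05, 15)   # 15 renders of one pixel, true value ~0.5
frames[7] = 25.0                     # a single firefly frame

# 'Mean' stacking is pulled far from the true value by one outlier;
# 'median' stacking discards it completely.
print(frames.mean(), np.median(frames))
```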
Matt