TG3 Crashes

Started by jaf, December 24, 2014, 02:56:16 PM


jaf

Quote from: WASasquatch on January 02, 2015, 05:57:31 PM

Also, as far as nodes and rendering.... I don't think the preview should be restarted unless you are plugging/unplugging nodes, as simply moving a node from a group, or
I just lost a PSU from Terragen spiking CPU/RAM usage (specifically, I forgot a render was going and changed a PF scale, which is when it audibly went "POP") because of these issues, and I should have a whole 250 watts to spare. The fact that TG can send my computer way past its hardware limits and programmed restraints is very bad, and it can kill a computer from something as simple as accidentally changing a node while forgetting a render is underway. Luckily all that seemed to have popped was the PSU and a single RAM card.

I started using a program today, which I've used in the past, called Process Lasso Pro.  It's free today at: http://www.giveawayoftheday.com/  It can be a bit complicated but has potential and has a nice Task Manager type view.  I believe you could set limits to keep those "spikes" from frying your system.

Below is a screen capture of Lasso when I'm running TG3 "full bore".  The CPUs are up at 97% and it's using a pretty high RAM load.  I believe you can "throttle" down processes with Lasso.  My CPU temp is 38C and rarely goes above 40, but that's with water cooling.
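
For anyone curious what this kind of "throttling" boils down to, here is a minimal Python sketch. It is only an illustration and makes no claim about how Process Lasso works internally: `pick_cores` and `limit_current_process` are made-up names, and the affinity calls (`os.sched_getaffinity` / `os.sched_setaffinity`) exist on Linux only, so on other systems the second function simply does nothing.

```python
import os


def pick_cores(available, reserve=1):
    """Return the subset of core IDs a heavy job may use.

    `available` is an iterable of core IDs; `reserve` cores are left
    free for the rest of the system. At least one core is always kept.
    """
    cores = sorted(available)
    if not cores:
        raise ValueError("no cores available")
    keep = max(1, len(cores) - max(0, reserve))
    return set(cores[:keep])


def limit_current_process(reserve=1):
    """Restrict the current process to fewer cores (Linux only).

    Returns the set of allowed cores, or None where the OS does not
    expose the affinity calls.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = pick_cores(os.sched_getaffinity(0), reserve)
    os.sched_setaffinity(0, allowed)  # 0 means "this process"
    return allowed
```

With 8 cores and `reserve=1`, the job is confined to 7 of them, so total CPU load tops out below 100% and the desktop stays responsive.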

(04Dec20) Ryzen 1800x, 970 EVO 1TB M.2 SSD, Corsair Vengeance 64GB DDR4 3200 Mem,  EVGA GeForce GTX 1080 Ti FTW3 Graphics 457.51 (04Dec20), Win 10 Pro x64, Terragen Pro 4.5.43 Frontier, BenchMark 0:10:02

Oshyan

As I've mentioned, the fact that the 3D preview restarts its update cycle when you simply move a node in the node network is a definite bug. Some of these other issues and concerns are also valid, and we're aware of them and hopefully can fix the most clear issues in the future. But there is nothing Terragen can do that will break your hardware unless the hardware itself is not properly configured (e.g. power supply too small for the given CPU/graphics card combo, CPU not properly cooled, etc.). Any 3D rendering program that is multithreaded will use your hardware as much as TG is, if not more (for example renderers that make use of GPU+CPU, TG only uses CPU).

- Oshyan

WAS

#17
Quote from: Oshyan on January 02, 2015, 08:23:33 PM
As I've mentioned, the fact that the 3D preview restarts its update cycle when you simply move a node in the node network is a definite bug. Some of these other issues and concerns are also valid, and we're aware of them and hopefully can fix the most clear issues in the future. But there is nothing Terragen can do that will break your hardware unless the hardware itself is not properly configured (e.g. power supply too small for the given CPU/graphics card combo, CPU not properly cooled, etc.). Any 3D rendering program that is multithreaded will use your hardware as much as TG is, if not more (for example renderers that make use of GPU+CPU, TG only uses CPU).

- Oshyan

Well, that is not entirely true. A lot of programs have governors in place that prevent 100% load; TG uses literally all it can, and anything to spare. For example, rendering a scene in Cinema 4D, I will never see my CPU spike above 91%. This was definitely not a hardware issue. The CPU runs hot at 52C, the safe zone is 32-60C, and the max temp is 73C for a 24 hour rotation. It never hit 43C before dying. Everything was well in the green in temperatures, not even in the yellows or oranges. Also, my system has a full 250 watts of energy to spare on a GPU-less rig with a 5x original OEM PSU (on a 3 year warranty). That indicates nothing was strained whatsoever power-wise, or overheating.

However, within a fraction of a second, while the load was at 100%, I changed a PF, and that creates immense strain on a system that has no room to recompute (it's at 100% already).  Like your computer manual states, "anything above 90% CPU usage can cause harm to your computer". They don't state "your hardware may suck"; it's the fact that the hardware is not supposed to run at those levels consistently. That's why rendering voids a lot of warranties, if you actually dive into your computer's manual further.

I've worked in computer repair with data centers (including a render farm) for several years (mainly servers and render nodes for web dev), so I know when something has been killed by a program, and when something under warranty shouldn't be dying, especially well short of its max temp. It's how we charge people. In data centers, we expect every rig to die from a user not correctly managing the 'application', not the computer, as the application is the source of the "kill". By the end of a season there's usually only 50% of the original rigs left, because of people sending in projects with whack settings. It's a very dangerous field. I think that's why a lot of applications have all those warnings and correct-usage notes, and try to prevent stuff like that from happening.

It's no biggie. I'm not blaming anyone, as I do know the risks very well, but of all things I am familiar with, it's code and hardware. I'd put a hideable warning on the render button saying it's highly discouraged to edit the project while rendering. Maybe the consistent reminder will help me remember when I've minimized a render. xD I've been rendering on a Raspberry Pi and a netbook, for crying out loud. I just haven't accidentally messed up and edited a project while it's rendering yet. If the simple render was causing my system a problem, both those machines should be long dead. Lol

Quote from: jaf on January 02, 2015, 07:21:18 PM
Quote from: WASasquatch on January 02, 2015, 05:57:31 PM

Also, as far as nodes and rendering.... I don't think the preview should be restarted unless you are plugging/unplugging nodes, as simply moving a node from a group, or
I just lost a PSU from Terragen spiking CPU/RAM usage (specifically, I forgot a render was going and changed a PF scale, which is when it audibly went "POP") because of these issues, and I should have a whole 250 watts to spare. The fact that TG can send my computer way past its hardware limits and programmed restraints is very bad, and it can kill a computer from something as simple as accidentally changing a node while forgetting a render is underway. Luckily all that seemed to have popped was the PSU and a single RAM card.

I started using a program today, which I've used in the past, called Process Lasso Pro.  It's free today at: http://www.giveawayoftheday.com/  It can be a bit complicated but has potential and has a nice Task Manager type view.  I believe you could set limits to keep those "spikes" from frying your system.

Below is a screen capture of Lasso when I'm running TG3 "full bore".  The CPUs are up at 97% and it's using a pretty high RAM load.  I believe you can "throttle" down processes with Lasso.  My CPU temp is 38C and rarely goes above 40, but that's with water cooling.


I use ProcessHacker. I wonder if they're based on each other, because they look almost identical. I'm going to have a look at that.

jaf

Yes, please don't take my input as hating on TG -- I love it.  I understand that what I think of as a problem, others might not, and I'm okay with that.  And it's the user's responsibility to make sure their system can handle the loads that most rendering programs put on the hardware, and to do the necessary preventative maintenance (cleaning) and temperature monitoring.  I learned the hard way a couple of times when I used to run SETI on three computers 24/7 (except when I was using one for something else).

WASasquatch -- I'm not sure you can force a processor over 100% -- it can only execute instructions up to its maximum clock speed and the throughput of memory and other I/O.  If your system can run Prime95 full blast, Terragen is not going to be able to stress it any harder.

WAS

#19
Quote from: jaf on January 02, 2015, 10:09:16 PM
WASasquatch -- I'm not sure you can force a processor over 100% -- it can only execute instructions up to its maximum clock speed and the throughput of memory and other I/O.  If your system can run Prime95 full blast, Terragen is not going to be able to stress it any harder.

The upside of a CPU is that it is meant to hit 100%; the downside is that it's not supposed to be given any more computations at that point (which is why, even on a multi-threaded CPU, they don't advise running multiple programs at once), as that can cause iteration failures and core failures. Each computation has a priority, and if it's coming from the same program/method it can get picked up in the bypass stage in the load/store unit (LSU).

And that's probably where anomalies like this (image attachment) come from when you haven't changed anything but moved a node, causing TG to act like something has changed.

Oshyan

I don't think anyone is hating on TG. ;) I'm just trying to avoid misinformation. I am less concerned with individual convictions, more so about others getting the wrong idea.

Fully utilizing multi-core CPUs is just fine, and doesn't void your warranty. Same with using your CPU at 100% for extended periods of time. Here's Intel's current official warranty documentation:
http://download.intel.com/support/processors/sb/warranty_processor_english.pdf
As far as I can see the only expressed limitation is on overclocking, which agrees with my prior long-held understanding.

Modern CPUs have various built-in protection mechanisms for over temperature situations and other variances. They will automatically shut down the machine if they go out of spec to avoid damage. Damage to motherboards and PSUs is more common.

- Oshyan

WAS

#21
Quote from: Oshyan on January 03, 2015, 01:05:08 AM
I don't think anyone is hating on TG. ;) I'm just trying to avoid misinformation. I am less concerned with individual convictions, more so about others getting the wrong idea.

Fully utilizing multi-core CPUs is just fine, and doesn't void your warranty. Same with using your CPU at 100% for extended periods of time. Here's Intel's current official warranty documentation:
http://download.intel.com/support/processors/sb/warranty_processor_english.pdf
As far as I can see the only expressed limitation is on overclocking, which agrees with my prior long-held understanding.

Modern CPUs have various built-in protection mechanisms for over temperature situations and other variances. They will automatically shut down the machine if they go out of spec to avoid damage. Damage to motherboards and PSUs is more common.

- Oshyan

Unfortunately those safeguards are never perfect, and people lose cores daily. What happens in a fraction of a second happens in a fraction of a second. Transistors and emergency shutoffs don't work as fast as the CPU. Separate SOCs read the CPU sensors and determine whether to shut off or not. If the failure happens immediately, there is no time for that to happen.

Also, the CPU warranty is completely separate from a motherboard and full computer warranty. For example, "g) There is damage from use outside of the operation or storage parameters or environment detailed in the User's Manual or reasonably acceptable for similar product usage models deemed industry standard best practices;" in the ASUS warranty translated to them saying that me running a CPU at 100% for more than 1-2 hours is neglect, and that voided my netbook's warranty. So effectively, rendering voided my warranty, as of course it's going to take ages, and of course it's going to spike the CPU.

Also, a CPU is only as good as its mobo, and every mobo is different and has different operating standards. That's why we have "Gaming" boards, "Server" boards, "Budget" boards, and "OEM" (most restricted) boards.

Matt

#22
Although I haven't seen it, it's possible that a bug is allowing the preview to continue during a render. If I recall correctly there may be a short window of time where the preview may still be running threads after you start a render, but the threads should terminate after a short while (only a few seconds). If this isn't happening, that's a fault we need to correct.

I have seen the preview fail to update after changing parameters, so I know there are issues, and they might all be related. Now that we've confirmed that moving nodes can also halt the preview or trigger an update (which it's not designed to do), we'll fix that.
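
To make the intended behaviour concrete, here is a minimal Python sketch of a preview that restarts only when something render-relevant changes. This is not Terragen's actual code: `Node`, `Preview`, `render_key` and `notify_change` are hypothetical names, and the idea of comparing a "render key" is just one assumed way of doing it.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    params: dict = field(default_factory=dict)  # affects the rendered result
    inputs: tuple = ()                          # names of connected nodes
    position: tuple = (0, 0)                    # layout only, never rendered

    def render_key(self):
        # Deliberately excludes `position`: moving a node is cosmetic.
        return (self.name, tuple(sorted(self.params.items())), self.inputs)


class Preview:
    def __init__(self, nodes):
        self.nodes = nodes
        self.restarts = 0
        self._last_key = self._key()

    def _key(self):
        return tuple(node.render_key() for node in self.nodes)

    def notify_change(self):
        """Call after any edit; restarts only if render state changed."""
        key = self._key()
        if key == self._last_key:
            return False
        self._last_key = key
        self.restarts += 1
        return True
```

In this sketch, dragging a node (changing `position`) leaves the key untouched, while changing a parameter or plugging/unplugging an input changes it and triggers exactly one restart.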

The image you posted with the lighting anomaly in the clouds or atmosphere looks similar to what I've seen in a particular situation. This is when using the voxel options or 2D shadow map options in clouds, then I pause a render part way through, allow the preview to update a little, then unpause the render. Perhaps you did this, or perhaps if the preview is still running in the background when it shouldn't then I suppose this lighting anomaly could happen. The pausing/unpausing problem with the voxels and 2D shadow maps is something I've been meaning to fix anyway.

On the subject of CPU usage: Terragen, or even the system as a whole, cannot use more than 100% of your CPU's capacity. There is no such thing as spiking the CPU load above 100%.
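
A small worked example may help here. The Python sketch below is a simplified model of a scheduler, not a measurement tool, and `simulate_interval` is a made-up name. It assumes each thread can run on only one core at a time and that the OS shares cores out freely. Whatever work does not fit in the interval waits in a backlog; it does not push utilisation past 100%.

```python
def simulate_interval(cores, work_seconds, interval=1.0):
    """Model one scheduling interval.

    cores        -- number of CPU cores
    work_seconds -- CPU seconds each thread would like to run
    interval     -- length of the interval in seconds

    Returns (utilisation_percent, backlog_seconds).
    """
    if cores < 1 or interval <= 0:
        raise ValueError("need at least one core and a positive interval")
    capacity = cores * interval
    # A single thread cannot use more than one core's worth of time.
    runnable = sum(min(w, interval) for w in work_seconds)
    done = min(runnable, capacity)
    backlog = sum(work_seconds) - done
    return 100.0 * done / capacity, backlog
```

For example, on 4 cores, 4 busy threads give 100% with no backlog, and 8 busy threads still give 100%, just with 4 seconds of work left waiting for the next interval.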

Matt
Just because milk is white doesn't mean that clouds are made of milk.

archonforest

On my PC, when I hit the render button the preview render stops right away. If I pause the main render, the preview render starts rendering again until I un-pause the main one. This makes sense, so I guess TG is designed like this. Anything else is probably a hardware problem or OS crap... ???
Dell T5500 with Dual Hexa Xeon CPU 3Ghz, 32Gb ram, GTX 1080
Amiga 1200 8Mb ram, 8Gb ssd

Matt

Quote from: jaf on January 02, 2015, 05:30:17 PM
Okay, here's another "pet peeve."  I make a change in the preview window that requires I click on "Copy this view to the current render camera."  Okay, I remember most of the time, but there are times I don't, or I'm interrupted and forget.  Well, I can just look at the icon, right?  The images below show the difference in the left-most icon.  Maybe a slow "blink" or a more radical change in the icon?

I agree that this needs to be clearer. It used to be clearer on Windows XP (but maybe only in the Classic theme, I forget), but on Vista and newer the distinction is very hard to see.

Matt

Matt

Quote from: archonforest on January 09, 2015, 11:13:41 AM
On my PC, when I hit the render button the preview render stops right away. If I pause the main render, the preview render starts rendering again until I un-pause the main one. This makes sense, so I guess TG is designed like this. Anything else is probably a hardware problem or OS crap... ???

Yes, that's what it's designed to do.

Matt

jaf

#26
Quote from: Matt on January 09, 2015, 11:18:58 AM
Quote from: jaf on January 02, 2015, 05:30:17 PM
Okay, here's another "pet peeve."  I make a change in the preview window that requires I click on "Copy this view to the current render camera."  Okay, I remember most of the time, but there are times I don't, or I'm interrupted and forget.  Well, I can just look at the icon, right?  The images below show the difference in the left-most icon.  Maybe a slow "blink" or a more radical change in the icon?

I agree that this needs to be clearer. It used to be clearer on Windows XP (but maybe only in the Classic theme, I forget), but on Vista and newer the distinction is very hard to see.

Matt

Hi Matt.  Yes, I experimented a bit and it is clearer using some different themes.  Maybe just adding a "!" in the icon.  Or making it "blink".