Command line exit codes and rendering errors

jadson · February 08, 2016, 01:29:30 PM

A while back I wrote a simple distributed render manager that includes support for Terragen, and while working on some improvements I noticed a couple of extremely odd and, in my opinion, problematic behaviors when rendering from the command line.

On Linux and OS X, Terragen (v3.3 build 3.3.03.0) does not treat an incorrect or nonexistent path to a project file, or a corrupted project file, as a reason to exit with a non-zero code. If the project file isn't found or isn't readable, Terragen prints an error message to stdout (not stderr), then quietly opens the default project file, renders it, and exits with code zero.

A similar issue occurs when Terragen is unable to save the rendered file. It prints an error message to stdout (again, not stderr), then exits with code zero. That's been an issue in the past because we were rendering to a network drive that went offline, but Terragen kept going frame after frame for a couple days until we noticed the files weren't showing up where they were supposed to.

I don't expect this to be a high priority issue because it doesn't significantly affect the usability of the software, but it does violate a couple of fundamental expectations about how software on UNIX-like operating systems should behave. It also complicates scripting because the only way to tell if a file was successfully rendered and saved is to parse stdout and look for error messages that could potentially change with new versions.
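As a stopgap on the scripting side, a wrapper can ignore the message text entirely and check whether the expected file actually appeared. The following is only a sketch under assumptions: `run_and_verify` is a hypothetical helper name, `cmd` stands for whatever argument list launches the render (no Terragen flags are shown or implied), and `output_path` is wherever the render manager expects that frame to land.

```python
import os
import subprocess


def run_and_verify(cmd, output_path):
    """Run a render command and confirm it really produced output_path.

    Returns (ok, log). ok is True only if the process exited with code
    zero AND a non-empty file exists at output_path that was not simply
    left over, untouched, from an earlier run.
    """
    # Remember the timestamp of any stale file from a previous run.
    try:
        before = os.stat(output_path).st_mtime_ns
    except OSError:
        before = None

    # Capture stdout and stderr together so the log is complete
    # regardless of which stream the messages are written to.
    result = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )

    try:
        after = os.stat(output_path)
    except OSError:
        # Missing file or unreachable location (e.g. an offline share).
        return False, result.stdout

    fresh = before is None or after.st_mtime_ns != before
    ok = result.returncode == 0 and after.st_size > 0 and fresh
    return ok, result.stdout
```

Because the check looks at the file rather than the wording of the output, it should keep working if the messages change between versions. It would also catch the offline network drive case after the first frame instead of days later.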

Matt · February 08, 2016, 06:03:13 PM

Welcome to the forum!

We'd like to improve this in future. The first thing we'll do is redirect error messages to stderr instead of stdout. This should apply to anything that's currently considered an error. Such errors are already printed with the prefix "ERROR:" and we don't intend to change that prefix, so they should be easy to identify in any version, but we'll send them to stderr in future.
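For anyone scripting around this in the meantime, a minimal sketch of picking out those lines from captured output might look like the following. The function name is hypothetical, and the only thing it relies on is the "ERROR:" prefix described above.

```python
def error_lines(output):
    """Return every line of captured output that begins with 'ERROR:'.

    Leading whitespace is ignored, and the returned lines are stripped.
    """
    return [
        line.strip()
        for line in output.splitlines()
        if line.lstrip().startswith("ERROR:")
    ]
```

Feeding it the combined stdout and stderr of the render process means the same filter should keep working whether the messages arrive on stdout, as they do now, or on stderr later.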

Unfortunately it's possible that some "errors" may be harmless. For example, if you have an image map shader that's not contributing to the render, it still attempts to load the image when the project loads and if it can't find the image it will report an error. Currently, it's up to your render manager to decide whether you want to take some action when you get these errors.

In future we also want to exit early and return a non-zero code whenever an error occurs. And we also want to improve Terragen's understanding of whether a particular asset is really needed to render, to avoid errors from unused nodes (and of course wasting resources).

Matt

jadson · February 08, 2016, 08:24:45 PM

Thanks for the info and I look forward to the improvements you mentioned. It's not a big problem to parse stdout for error messages (I do that anyway to get progress info), but I had noticed that both critical and trivial errors start with "ERROR:". Until now I've just been ignoring them because of that problem, but that's burned us a couple of times. I could try matching the full message string, but that's less than ideal for obvious reasons. I haven't thought of a better way to deal with it than just manually checking that the frames are showing up on the server. The only problem is, if there are thousands of frames and just one machine out of many is having a problem, it's pretty easy to overlook a missing frame or two until a big chunk of the sequence is done.
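The manual check could be automated with something along these lines. This is a rough sketch, and the `frame_%04d.tif` naming used in the example call is purely an assumption about how frames are named, so the pattern would need to match the real output names.

```python
import os


def missing_frames(directory, pattern, first, last):
    """List frame numbers in [first, last] with no usable file on disk.

    pattern is a printf-style filename template such as "frame_%04d.tif".
    A frame counts as missing if its file is absent, unreadable, or empty.
    """
    missing = []
    for frame in range(first, last + 1):
        path = os.path.join(directory, pattern % frame)
        try:
            if os.path.getsize(path) == 0:
                missing.append(frame)
        except OSError:
            missing.append(frame)
    return missing


# Example call (hypothetical directory and naming):
# missing_frames("/mnt/renders/shot01", "frame_%04d.tif", 1, 2000)
```

Running it periodically over the frames that have already been handed out should flag a single misbehaving machine long before a big chunk of the sequence is done.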

Most if not all of the trivial errors I've seen were because of unlinked and forgotten image or DEM shaders. One thing I've noticed (that's probably in the documentation and I just didn't see it) is that when you delete a node from the menu on the left, it just unlinks it from the node network but doesn't delete it. It's a bit counterintuitive, and unlinked nodes are easy to lose in a complicated network. I found some of our files that have been through many iterations were loading 5-10 unused DEMs at startup.

Changing the default behavior might cut down on a lot of unnecessary errors. I think Blender's approach to this is pretty good. It keeps unlinked datablocks around during a session, but when the project is closed any unlinked datablocks are removed unless the user has marked them for preservation. Alternatively, just a menu option to purge all unused nodes would streamline the cleanup process quite a bit.
