TG2 Render farms

Started by jwdenzel, December 21, 2006, 04:55:16 PM

sashley

Took a quick look at it and I'm pretty excited about it. Spoke to the guys in IT and we'll try to set it up officially on the network on Monday (it's Friday night here and they're all out the door). So we'll definitely give it a workout. Thanks so much for putting the time into this, appreciate it!

-S

Dark Fire

I will hopefully get some time during the weekend to look at that command line problem and to clean up the interface a bit (there are too many input boxes, output fields and variables, now that Terrain and Script files are not needed). I will also work on reading the total number of frames from the TGD file (for when no stop frame is specified). This will replace the much more complex frame counter for the old Terragen scripts.

swiftstream

Dark Fire,

Awesome! I was considering doing something like this myself, but haven't gotten around to it yet... in part because, as you found, command line rendering isn't working at the moment. Apparently it accidentally got broken, and should be fixed for the next update. So it's not anything you're doing wrong--we just have to wait for Planetside before this will work properly.

ShockFire

Quote from: Dark Fire on January 12, 2007, 05:17:41 PM
Terragen totally ignores my commands. What is going on?
I had to render 25 .tgd files and ran into the same problem.

From http://forums.planetside.co.uk/index.php?topic=56.msg310#msg310 :
"Ah, sorry! It's not working for me either - something has changed recently.
I will try to get this fixed in the next update."

Dark Fire

That's a relief. I am now pretty sure that my software works OK except for the problems when it tries to run Terragen (which is basically the whole point). We shall have to wait until the next update arrives, which I hope is soon. Meanwhile, I shall continue cleaning up the program and give it a new version number - currently all of the programs report version 1.3, because that was the version of the software for the old Terragen that the software for the new Terragen was built from. I shall also work on getting it put into the Emptosoft Installer, so it can be updated more easily.

Ricowan

I've been thinking about how to split up a single image between PCs, and I think it would be relatively easy (once the issue with TGD not working from the command line is fixed).  Here's how I think it would go:

1) Server app loads the given tgd, queries the lan to see how many client apps are available for work.
2) Server app splits the given width (or height) of the render as given in the tgd, makes a new (in memory) version of the tgd for each client with the crop settings in each one set for its piece.
3) Server app sends the modified tgd to each client.
4) Client saves the tgd to a location specified in the client setup, renders using tgdcli, then returns the finished image.
5) Server receives each finished image, stitching them together when all pieces have been returned.

This works best if all client PCs are roughly equal in power. If the client network was composed of PCs of differing power, the server could break up the image into a set number of pieces (specified in the GUI, and more than the number of available clients) and the faster PCs could get extra chunks while the slower PCs are rendering theirs.
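
A rough Python sketch of the splitting in step 2, working in pixel columns. How the crop settings are actually stored in the tgd is not covered here, so the function only produces the column ranges; `split_width` is a made-up name for illustration, not part of any existing tool.

```python
def split_width(width, num_pieces):
    """Split `width` pixel columns into contiguous (start, end) ranges.

    `end` is exclusive. Any leftover columns are handed to the first few
    pieces, so piece widths never differ by more than one pixel.
    """
    if not 1 <= num_pieces <= width:
        raise ValueError("num_pieces must be between 1 and width")
    base, extra = divmod(width, num_pieces)
    ranges = []
    start = 0
    for i in range(num_pieces):
        end = start + base + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges
```

For the mixed-speed case, the same function can be called with more pieces than there are clients, and the server then hands out the next unrendered range whenever a client reports back.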

Is this even close to how your software works?  :)

Rich

Dark Fire

That summary is virtually the same as how my software works, but my software currently takes the easy route of splitting frames between computers and letting those that are faster do more frames. This solution means that the TGD file does not need to be changed. However, I do like your idea of modifying the TGD file to split individual frames up into chunks - I hadn't thought of that before because you simply couldn't do it with the old Terragen. However, at the moment such a solution would cause problems if the global illumination is in use. I will have no problem splitting the frames into chunks because that's just a load of maths - it should take me a few weeks to make that feature. However, the automated stitching together of crops is unlike anything I have ever attempted to program before...
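
To illustrate the "faster computers do more frames" idea, here is a small Python simulation. It is only a sketch of the general technique (each frame goes to whichever machine becomes free first), not the actual code of the software discussed here; the function name and the per-frame timings are invented for the example.

```python
import heapq


def simulate_frame_farm(num_frames, seconds_per_frame):
    """Hand out frames one at a time to whichever machine is free first.

    `seconds_per_frame` maps a machine name to its (assumed constant)
    render time per frame. Returns a dict of machine name -> frame numbers.
    """
    # Heap of (time at which the machine becomes free, machine name).
    free_at = [(0.0, name) for name in sorted(seconds_per_frame)]
    heapq.heapify(free_at)
    assignment = {name: [] for name in seconds_per_frame}
    for frame in range(1, num_frames + 1):
        now, name = heapq.heappop(free_at)
        assignment[name].append(frame)
        heapq.heappush(free_at, (now + seconds_per_frame[name], name))
    return assignment
```

With one machine three times faster than the other, the fast one ends up with most of the frames without anyone having to work out the split in advance.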

Ricowan

Bummer about the global illumination issue, but maybe they'll have that fixed in the same release that fixes command line rendering?  :)

Rich

sashley

Alright, so from what I understand, this issue is basically on hold until the command line is fixed in a new update. Well, I'll just have to wait then. Thanks and I'll keep my eyes open for any updates.

-S

Dark Fire

Quote from: sashley on January 16, 2007, 11:12:41 AM
Alright, so from what I understand, this issue is basically on hold until the command line is fixed in a new update. Well, I'll just have to wait then. Thanks and I'll keep my eyes open for any updates.

-S
That is correct. Render farms are stuck with TG 0.9 until the command line is fixed in T2TP. It is slightly annoying that T2TP is so amazing, and yet simple things like the command line are broken. Oh well...

sashley

Sooo, with the command line being fixed in this new update, is there a tool in the works to set up for network rendering? ;)

-S

Dark Fire

I have no idea. I need somebody with a paid-for copy of T2TP to test my software. I am particularly annoyed by the fact that I can no longer test my own software. I have requested a version with a working command line but broken render system for development purposes (see here), but Oshyan has not replied yet.

For the moment all I require, before I continue development, is the confirmation that the command line is working - I can do virtually everything else (like the division of frames and more feedback statistics) without using the T2TP command line (the division of frames should just require some fiddling with the project file, and the statistics are calculated from internal measurements). There is only one thing I know of that I simply can't do without a copy of T2TP with a working command line: use the feedback from tgdcli.exe (I don't even know what sort of feedback it gives).

Anyway, here is the list of feature additions and removals, and bug fixes I will be working on first:
1. Clean up and optimise the interface (especially the job generator, which now has far too many interface features that are no longer needed because a single project file is used, as opposed to the world, terrain and script files used by TG 0.9).
2. Potential bug: warn about, and potentially offer to fix, problems due to local (rather than network) paths being used in the project file.
3. Tweak features that I simply ripped out before, so that they work for T2TP (such as the frame verification).
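
For item 2, one possible way to spot local paths is to treat the project file as plain text and look for drive-letter paths. This Python sketch assumes nothing about the layout of the project file beyond paths appearing as ordinary text; the function name is hypothetical and the pattern is deliberately simple.

```python
import re

# A drive letter, a colon and a slash, not preceded by another letter
# (so that "http://" is not mistaken for a drive "p:").
LOCAL_PATH = re.compile(r'(?<![A-Za-z])[A-Za-z]:[\\/][^"<>|\r\n]*')


def find_local_paths(project_text):
    """Return every drive-letter path found in the project file text.

    UNC paths such as \\\\server\\share\\file are not reported, because
    every machine on the network can resolve them the same way.
    """
    return LOCAL_PATH.findall(project_text)
```

Anything this returns is a candidate for a warning. Note that a mapped network drive also shows up as a drive letter, so a hit is not necessarily a mistake.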

bigben

@DarkFire: I'll be more than happy to test your software. I utilise some of the PCs at work overnight to speed things up occasionally and the IT guy is quite happy for me to try it (oh wait.. that's me ;))  My previous "render farms" have been crude but effective using a database to generate a batch file to be started manually on each computer, outputting to a mapped network drive to store everything in the same folder. Each queue was then copied to its relevant machine and started manually via VNC. If one machine finished before the others I just reallocated the remaining frames and restarted rendering.

As for splitting large renders into multiple frames... well I'm doing that at the moment with my hi-res QTVR test. This was split into 10 degree tiles because this was the largest tile that I could successfully render in a portion of the image with the most objects. Larger tiles crashed due to running out of RAM. I'm considering adapting my database to manage tiled output for large images as well, taking the camera rotation settings and fov and outputting new camera settings to import into the TGD to create a tiled output (including animated masks to reduce populations to just those objects influencing the current tile). Setting up the animation files is more painful than importing the data manually into the TGD and using a database takes all of the pain out of that process.
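
As a rough illustration of generating the per-tile camera settings, this Python sketch lists the centre heading and pitch of each tile for a full 360-degree turn. It is an assumption-laden simplification: it ignores tile overlap and the way tiles crowd together towards straight up or down, and the function name and angle conventions are invented for the example rather than taken from TG2.

```python
import math


def pano_tiles(tile_fov_deg, pitch_min_deg, pitch_max_deg):
    """Return (heading, pitch) tile centres in degrees, row by row.

    Headings cover 0-360; pitches cover pitch_min_deg to pitch_max_deg.
    Tiles are made no larger than tile_fov_deg and spaced evenly.
    """
    cols = math.ceil(360 / tile_fov_deg)
    rows = math.ceil((pitch_max_deg - pitch_min_deg) / tile_fov_deg)
    h_step = 360 / cols
    p_step = (pitch_max_deg - pitch_min_deg) / rows
    tiles = []
    for r in range(rows):
        pitch = pitch_min_deg + (r + 0.5) * p_step
        for c in range(cols):
            tiles.append(((c + 0.5) * h_step, pitch))
    return tiles
```

With 10 degree tiles and a pitch range of -10 to 10 this gives 72 tiles, which also shows how quickly the tile count grows for a hi-res panorama.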

While I'm doing everything on the fly at the moment, using multiple computers would cause some potential synchronisation problems, so I'd simplify it by just rendering the tiles first and then stitching them together (stitching commands also output from my database using the same data).

I'll also add an animated crop for those renders where RAM usage wasn't an issue as this is a lot simpler for single images.

The CLI produces a lot of output (most probably for debugging given it's a TP). Attached a screengrab of the end of a render for you (above the echo commands which are creating my PTStitcher script)

Dark Fire

Thanks for the screenshot. Now that I can see the data contained in the output and its structure I will also start working on capturing the output and extracting useful information from it (using the render times from T2TP and the running times of T2TP from my software, I could probably work out the load on the computer and/or network from the difference between the times - if the time T2TP is running for is much longer than the render time, obviously T2TP has been very slow to open and close for various reasons).
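
The time comparison described above comes down to a subtraction. A minimal Python sketch, assuming both times are already available in seconds (parsing the render time out of the tgdcli output is not shown, since its format is not described here); the function name is made up.

```python
def startup_overhead(wall_seconds, render_seconds):
    """Return (overhead in seconds, overhead as a fraction of wall time).

    `wall_seconds` is how long the process ran as measured from outside;
    `render_seconds` is the render time the renderer itself reported.
    """
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    overhead = max(0.0, wall_seconds - render_seconds)
    return overhead, overhead / wall_seconds
```

A large fraction suggests the time is going on opening, loading or closing rather than on rendering, which may point at a busy machine or a slow network share.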

The software that needs testing can be found here. At the moment I only need confirmation that the command line is working so just stick to using one computer (there will be some path problems with the TGD files on a network because the paths are not relative), create a job and render a frame. If T2TP stays hidden but renders and outputs the frame, the software is running it correctly. If T2TP just runs like it does when you normally start it, or it closes immediately, I will need to try alternative command line formats.

bigben

Quote from: Dark Fire on January 12, 2007, 05:17:41 PM
$TGLoc = Location of tgdcli.exe

This should be "%TERRAGEN_PATH%\tgdcli.exe" if we've followed the docs, so you shouldn't need to specify this. It should avoid some potential path issues. I usually install programs that use commandlines in simplified folders off root (e.g. c:\TG2 in this case)
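
In Python terms, falling back to the environment variable might look something like the sketch below. `find_tgdcli` and its parameters are hypothetical; only the `TERRAGEN_PATH` variable and the `tgdcli.exe` file name come from the discussion above.

```python
import ntpath
import os


def find_tgdcli(explicit_location=None, environ=os.environ):
    """Return the path to tgdcli.exe.

    An explicitly configured location wins; otherwise the path is built
    from the TERRAGEN_PATH environment variable.
    """
    if explicit_location:
        return explicit_location
    base = environ.get("TERRAGEN_PATH")
    if not base:
        raise LookupError("TERRAGEN_PATH is not set and no location was given")
    # ntpath joins with backslashes regardless of where this code runs.
    return ntpath.join(base, "tgdcli.exe")
```

Keeping the explicit setting as an override means a machine without the variable set can still be configured by hand.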

Setting this environment variable throws up some errors for TG0.9 when it opens. Not sure if anything is broken as a result as I've only used it to open .TER files to determine sizing and find locations (3D view still works).

TGDCLI has been happily rendering non-stop for a couple of days now without any problems even though I've been using the computer for other stuff, with other application crashes.

Just reading the docs for now. Seems relatively straightforward. I can use network names OK, but you might also consider the case of files on a mapped network drive. I usually use mapped drives to simplify commandline stuff but it's not critical if this works.