Planetside Software Forums

Support => Terragen Support => Topic started by: WAS on November 13, 2019, 07:54:06 PM

Title: Linux Render Node - Memory Error during Process2dPostEffects
Post by: WAS on November 13, 2019, 07:54:06 PM
There is a memory exception happening during Process2dPostEffects (or one of the processes right after it) on the 44440 node. I can't say much more as there is no debugging output; it just throws: free(): invalid next size (fast), which I guess is something trying to free memory that was no longer allocated (or never was).
Title: Re: Linux Render Node - Memory Error during Process2dPostEffects
Post by: Matt on November 13, 2019, 08:17:39 PM
Can you send us a TGD to test this with?
Title: Re: Linux Render Node - Memory Error during Process2dPostEffects
Post by: WAS on November 13, 2019, 10:01:51 PM
It's the v4 benchmark. V1.0
Title: Re: Linux Render Node - Memory Error during Process2dPostEffects
Post by: WAS on November 13, 2019, 10:09:54 PM
Under Ubuntu 18.04 Bionic Beaver. I can set up a KVM you guys can test on if you don't have access to a similar distro.
Title: Re: Linux Render Node - Memory Error during Process2dPostEffects
Post by: WAS on November 14, 2019, 10:34:57 PM
Was this able to be reproduced?
Title: Re: Linux Render Node - Memory Error during Process2dPostEffects
Post by: WAS on November 16, 2019, 01:14:45 PM
???
Title: Re: Linux Render Node - Memory Error during Process2dPostEffects
Post by: Matt on November 17, 2019, 04:36:43 PM
On my CentOS 6.8 install it works correctly.

Does it finish correctly if you disable bloom?
Does it happen every time?
How much RAM do you have?
How many threads are you rendering with?
Does it work at lower resolutions?
Title: Re: Linux Render Node - Memory Error during Process2dPostEffects
Post by: WAS on November 18, 2019, 12:19:31 AM

Do the post effects rely on anything that might be silenced?

CentOS is a very different distro from Debian/Ubuntu.

Linux node has no debugging/log?
Title: Re: Linux Render Node - Memory Error during Process2dPostEffects
Post by: WAS on November 18, 2019, 01:58:41 AM
So it's the same scenario with resolution 800x450 and no bloom.

RAM was never an issue. I checked periodically and didn't see anything higher than 16 GB. It fluctuated between about 14.5 GB and 16 GB.

I do notice that core usage between the two Xeons is not consistent. One Xeon suffers low usage. Multi-CPU support seems iffy, and rendering seems to suffer for it. The dual CPU setup is very comparable to my current CPU. The benchmark, at normal resolution, only takes about 13-14 minutes on my home system. In fact, at full resolution the dual Xeon is performing a little slower than the A10-5800K at 4c/4t.

<<<### APP RUN STARTED ###>>>
Terragen 4 build 4.4.44
Licensed to Jordan Thompson

Receiving maintenance until 2020-08-13
Maintenance days remaining: 270

No EDD key

License key file: Set by an administrator

Found license for Professional Edition
Found 24 processor cores
Using 24 processor cores
Loading plugins in: /home/was/Terragen/TG-44440-Node/
No files matching *.tgp in this directory
Loaded 0 modules in this directory
Loading plugins in: /home/was/Terragen/TG-44440-Node/
No files matching *.cpp in this directory
Loaded 0 modules in this directory
Loading plugins in: /home/was/Terragen/TG-44440-Node/plugins/
Loaded 3 modules in this directory
Loading plugins in: /home/was/Terragen/TG-44440-Node/plugins/
No files matching *.cpp in this directory
Loaded 0 modules in this directory
Loading plugins in: /home/was/Terragen/TG-44440-Node/../../tgdplugins/
No files matching *.tgp in this directory
Loaded 0 modules in this directory
Loading plugins in: /home/was/Terragen/TG-44440-Node/../../tgdplugins/
No files matching *.cpp in this directory
Loaded 0 modules in this directory
Loading plugins in: /home/was/Terragen/TG-44440-Node/../../tgdplugins/linux_intel/
No files matching *.tgp in this directory
Loaded 0 modules in this directory
Loading plugins in: /home/was/Terragen/TG-44440-Node/../../tgdplugins/linux_intel/
No files matching *.cpp in this directory
Loaded 0 modules in this directory
Loaded a total of 3 plugin modules
ReadXML: Attempting to read file: "/home/was/Terragen/Projects/TG4Bench/terragen-4-benchmark_v1.0-nobloom.tgd"
Content path for children of "/Billboard_Dune.tgo" set to "Project_Assets/"
trImage attempting to read file Project_Assets/grijshout-small.jpg
trImage attempting to read file Project_Assets/Dunes-Valley.jpg
trImage attempting to read file Project_Assets/Dunes-Valley.jpg
Content path for children of "/Pop tundraheather-2.tgo/tundraheather-2.tgo" set to "Project_Assets/"
trImage attempting to read file Project_Assets/heidetakje-bloei-small.png
trImage attempting to read file Project_Assets/heidetakje-bloei-small.png
trImage attempting to read file Project_Assets/heidetakje-3+alpha.png
trImage attempting to read file Project_Assets/heidetakje-3+alpha.png
trImage attempting to read file Project_Assets/heidetakje-2+alpha.png
trImage attempting to read file Project_Assets/heidetakje-2+alpha.png
trImage attempting to read file Project_Assets/heidetakje-bloei-small.png
trImage attempting to read file Project_Assets/heidetakje-bloei-small.png
Content path for children of "/Pop bush_5m--v1.tgo/bush_5m--v1.tgo" set to "Project_Assets/"
trImage attempting to read file Project_Assets/lijsterbes-blad-herfst.png
trImage attempting to read file Project_Assets/lijsterbes-blad-herfst.png
trImage attempting to read file Project_Assets/bushbark.jpg
trImage attempting to read file Project_Assets/lijsterbesblad-groen.png
trImage attempting to read file Project_Assets/lijsterbesblad-groen.png
Content path for children of "/Pop pijpestro_27-08-13_v2.tgo/pijpestro_27-08-13_v2.tgo" set to "Project_Assets/"
trImage attempting to read file Project_Assets/pijpestro-aar-opp1.png
trImage attempting to read file Project_Assets/pijpestro-aar-opp1.png
trImage attempting to read file Project_Assets/pijpestro-aar_opp.png
trImage attempting to read file Project_Assets/pijpestro-aar_opp.png
ReadXML: done
Content path for children of "Project" set to "/home/was/Terragen/Projects/TG4Bench/"
Starting render...
Output filename (-o) = ./800-400_no-bloom.tif
No -f specified, so rendering the project's current frame
Preparing to render frame 1
Number of buckets:  12 x 7 between 24 threads
Largest bucket size: 107 x 103
TGOReader: Attempting to open file: /home/was/Terragen/Projects/TG4Bench/Project_Assets/Billboard_Dune.tgo
Billboard_Dune.tgo: Loaded 444 triangles, 0 particles
TGOReader: Attempting to open file: /home/was/Terragen/Projects/TG4Bench/Project_Assets/tundraheather-2.tgo
tundraheather-2.tgo: Loaded 51744 triangles, 0 particles
33507 objects loaded from instance cache for "Pop tundraheather-2.tgo"
33507 objects (1.73 billion triangles) inserted by Populator "Pop tundraheather-2.tgo"
TGOReader: Attempting to open file: /home/was/Terragen/Projects/TG4Bench/Project_Assets/bush_5m--v1.tgo
bush_5m--v1.tgo: Loaded 121247 triangles, 0 particles
2134 objects loaded from instance cache for "Pop bush_5m--v1.tgo"
2134 objects (0.259 billion triangles) inserted by Populator "Pop bush_5m--v1.tgo"
TGOReader: Attempting to open file: /home/was/Terragen/Projects/TG4Bench/Project_Assets/pijpestro_27-08-13_v2.tgo
pijpestro_27-08-13_v2.tgo: Loaded 31080 triangles, 0 particles
17947 objects loaded from instance cache for "Pop pijpestro_27-08-13_v2.tgo"
17947 objects (0.558 billion triangles) inserted by Populator "Pop pijpestro_27-08-13_v2.tgo"
Starting pre pass
Rendered 100% of pre pass
Starting final pass
Number of buckets:  12 x 7 between 24 threads
Largest bucket size: 107 x 103
Rendering final pass... 0:00:30s, 0% of final pass, 0 micro-triangles
Rendering final pass... 0:01:00s, 0% of final pass, 0 micro-triangles
Rendering final pass... 0:01:30s, 0% of final pass, 0 micro-triangles
Rendering final pass... 0:02:00s, 0% of final pass, 0 micro-triangles
Rendering final pass... 0:02:30s, 7% of final pass, 97512 micro-triangles
Rendering final pass... 0:03:00s, 9% of final pass, 129659 micro-triangles
Rendering final pass... 0:03:30s, 13% of final pass, 177791 micro-triangles
Rendering final pass... 0:04:00s, 14% of final pass, 193995 micro-triangles
Rendering final pass... 0:04:30s, 14% of final pass, 193995 micro-triangles
Rendering final pass... 0:05:00s, 14% of final pass, 193995 micro-triangles
Rendering final pass... 0:05:30s, 14% of final pass, 193995 micro-triangles
Rendering final pass... 0:06:00s, 15% of final pass, 214094 micro-triangles
Rendering final pass... 0:06:30s, 17% of final pass, 258162 micro-triangles
Rendering final pass... 0:07:00s, 22% of final pass, 404830 micro-triangles
Rendering final pass... 0:07:30s, 26% of final pass, 546236 micro-triangles
Rendering final pass... 0:08:00s, 32% of final pass, 688827 micro-triangles
Rendering final pass... 0:08:30s, 32% of final pass, 688827 micro-triangles
Rendering final pass... 0:09:00s, 33% of final pass, 776360 micro-triangles
Rendering final pass... 0:09:30s, 33% of final pass, 776360 micro-triangles
Rendering final pass... 0:10:00s, 38% of final pass, 1002786 micro-triangles
Rendering final pass... 0:10:30s, 40% of final pass, 1148230 micro-triangles
Rendering final pass... 0:11:00s, 44% of final pass, 1360359 micro-triangles
Rendering final pass... 0:11:30s, 46% of final pass, 1427838 micro-triangles
Rendering final pass... 0:12:00s, 48% of final pass, 1492021 micro-triangles
Rendering final pass... 0:12:30s, 52% of final pass, 1606176 micro-triangles
Rendering final pass... 0:13:00s, 55% of final pass, 1716406 micro-triangles
Rendering final pass... 0:13:30s, 58% of final pass, 1785318 micro-triangles
Rendering final pass... 0:14:00s, 59% of final pass, 1817142 micro-triangles
Rendering final pass... 0:14:30s, 61% of final pass, 1892890 micro-triangles
Rendering final pass... 0:15:00s, 64% of final pass, 1945688 micro-triangles
Rendering final pass... 0:15:30s, 69% of final pass, 2066424 micro-triangles
Rendering final pass... 0:16:00s, 70% of final pass, 2100663 micro-triangles
Rendering final pass... 0:16:30s, 72% of final pass, 2156434 micro-triangles
Rendering final pass... 0:17:00s, 75% of final pass, 2211587 micro-triangles
Rendering final pass... 0:17:30s, 84% of final pass, 2421976 micro-triangles
Rendering final pass... 0:18:00s, 85% of final pass, 2448288 micro-triangles
Rendering final pass... 0:18:30s, 91% of final pass, 2565410 micro-triangles
Rendering final pass... 0:19:00s, 95% of final pass, 2633242 micro-triangles
Rendering final pass... 0:19:30s, 98% of final pass, 2712385 micro-triangles
Rendered 100% of final pass
Process2dPostEffects...
free(): invalid next size (fast)
Aborted (core dumped)

Have you looked into the error and use of C regarding the step?
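
For anyone comparing runs, a log like the one above can be summarized with a small script. This is only a rough sketch: the list of glibc heap messages is illustrative rather than exhaustive, and the function names `summarize_log` and `longest_stall` are made up for this example. It reports the last line printed before the abort, any heap error text it recognizes, and the longest period during which the final pass percentage did not move.

```python
import re

# glibc messages seen when heap corruption is detected (illustrative, not exhaustive).
HEAP_ERRORS = (
    "free(): invalid next size",
    "free(): invalid pointer",
    "corrupted size vs. prev_size",
)

# Matches lines such as:
# "Rendering final pass... 0:02:30s, 7% of final pass, 97512 micro-triangles"
PROGRESS = re.compile(
    r"Rendering final pass\.\.\. (\d+):(\d\d):(\d\d)s, (\d+)% of final pass"
)


def longest_stall(samples):
    """Longest time in seconds the reported percentage stayed unchanged."""
    longest = 0
    start_t, start_pct = None, None
    for t, pct in samples:
        if pct != start_pct:
            start_t, start_pct = t, pct  # progress moved, restart the clock
        longest = max(longest, t - start_t)
    return longest


def summarize_log(text):
    """Return the last normal line, recognized heap errors, and the longest stall."""
    last_line = None
    errors = []
    samples = []  # (elapsed_seconds, percent)
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        found = [e for e in HEAP_ERRORS if e in line]
        if found:
            errors.extend(found)
            continue
        if line.startswith("Aborted"):
            continue
        m = PROGRESS.match(line)
        if m:
            h, mi, s, pct = (int(g) for g in m.groups())
            samples.append((h * 3600 + mi * 60 + s, pct))
        last_line = line
    return {
        "last_line": last_line,
        "errors": errors,
        "longest_stall_s": longest_stall(samples),
    }
```

Feeding it the captured console output of each build makes it easier to see whether the abort comes before or after the stage prints its "done" line.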
Title: Re: Linux Render Node - Memory Error during Process2dPostEffects
Post by: Matt on November 18, 2019, 04:34:34 AM
I've looked into the error. I don't think I'm double-freeing because that should show up on CentOS, Windows and Mac. But there may be some memory corruption at an earlier stage, and different runtime environments may play it out differently.

I've looked at my code. There is just one function call between two calls to print "Process2dPostEffects..." and "Process2dPostEffects: done". But when both bloom and starburst are disabled, that function does literally nothing. So this is very puzzling.

Can you email me that exact .TGD file, no matter how simple it is?

If it only occurs on Ubuntu I'll have to look at that after the 4.4 launch.
Title: Re: Linux Render Node - Memory Error during Process2dPostEffects
Post by: WAS on November 18, 2019, 01:19:12 PM
Why would you wait until after release to fix a program-breaking error that blocks me, and probably anyone using Ubuntu 18.x? This puts a pause on everything, as my PC is used for working on the next steps and I can only do one or the other in a timely manner. That doesn't even make sense from a development standpoint. You discovered it before launch, so why would you wait until after launch? This takes time from my active "maintenance" days.

The fact this is Terragen's benchmark failing, and you seem to think I've done something to it (besides edits you suggested), and not that this is a fault of TG is a little disconcerting too.

nobloom tgd was resaved from 4.44 so may give warnings on lower versions.

Also, if you do not provide a correct project path to the node, instead of telling the user that no project file was found, it just renders the default project under the name of the file that doesn't exist. A little confusing... And the default blank project renders fine. So it's settings related, or something in the scene like pops.
Title: Re: Linux Render Node - Memory Error during Process2dPostEffects
Post by: Matt on November 18, 2019, 04:00:15 PM
Thanks for the TGD file.

When debugging something, the variables are numerous. It's important to eliminate as many as possible, and having the exact TGD file is a small but important part of that, no matter how obvious the changes seem to the one reporting the bug. Debugging is such a time-consuming process that I have to eliminate as many variables as possible.

I'll dig into it ASAP this week, but in the meantime can you help by testing older builds? If it broke some time between 4.3 and 4.4 then there's a good chance we can find the cause and fix it quickly. Do you have the 4.3 release build? Does that work?

I'd love to get all our builds working on every Linux distro but we can't guarantee that because of the variety of Linux distros out there. I know Ubuntu is a big one but it's also not close to CentOS on the family tree, as far as I can tell. I'll do my best to fix it with the resources we have. I suspect it's probably going to turn out to be a simple fix, but the unpredictable part is how long it takes to find the cause.
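
Testing older builds to find where a bug first appeared can be sketched as a binary search. This is a minimal illustration, not a description of how builds are actually tested here: the build numbers in the usage note are hypothetical placeholders, `first_bad_build` is a made-up helper, and it assumes that once a build fails, every later build fails too.

```python
def first_bad_build(builds, is_bad):
    """Return the earliest failing build, or None if every build passes.

    builds: build identifiers sorted oldest to newest.
    is_bad: callable that runs the test on one build and returns True on failure.
    Assumes failures are monotonic: once a build is bad, all later ones are bad.
    """
    lo, hi = 0, len(builds)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid  # the first bad build is at mid or earlier
        else:
            lo = mid + 1  # the first bad build is after mid
    return builds[lo] if lo < len(builds) else None
```

With hypothetical builds `[100, 200, 300, 400]` and a test that fails from 300 onward, the function returns 300 after testing two builds instead of four. If even the oldest build fails, it returns that oldest build, which means the search has to go back further.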
Title: Re: Linux Render Node - Memory Error during Process2dPostEffects
Post by: WAS on November 18, 2019, 05:23:45 PM
Quote from: Matt on November 18, 2019, 04:00:15 PM
Thanks for the TGD file.

When debugging something, the variables are numerous. It's important to eliminate as many as possible, and having the exact TGD file is a small but important part of that, no matter how obvious the changes seem to the one reporting the bug. Debugging is such a time-consuming process that I have to eliminate as many variables as possible.

I'll dig into it ASAP this week, but in the meantime can you help by testing older builds? If it broke some time between 4.3 and 4.4 then there's a good chance we can find the cause and fix it quickly. Do you have the 4.3 release build? Does that work?

I'd love to get all our builds working on every Linux distro but we can't guarantee that because of the variety of Linux distros out there. I know Ubuntu is a big one but it's also not close to CentOS on the family tree, as far as I can tell. I'll do my best to fix it with the resources we have. I suspect it's probably going to turn out to be a simple fix, but the unpredictable part is how long it takes to find the cause.

Ubuntu is based on Debian, whereas CentOS is a free flavor of RHEL, meant for running enterprise-level software developed for the RHEL platforms. In the age of information, their monopoly on Linux is seen as just a proprietary distro, which could add a level of insecurity to the use of bleeding-edge versions and past versions. CentOS is community driven and open source.

I'd think for something like Terragen, focusing on CentOS may not be the best option. It's not really geared towards multimedia. I'm pretty sure its repos for things like libpng, libjpeg, etc. are not as up-to-date as Debian/Ubuntu. Ubuntu and Debian have started ditching old repos, in fact. For example, you would have to manually install anything under PHP 7.2 in Bionic Beaver.

Can you or Oshyan send me some links to stable and later versions leading up to 4440? I'll test them all and see if I can narrow down what version this shows up in.

I'm also still very concerned that TG underutilizes the first CPU on the board. (It's a 2x6c - 24t system). The first CPU barely gets over 60-75% usage while the second CPU sees 100% max across cores. According to benchmarks (CPU benchmarks and specs) I should be pulling almost exactly the same times as my home desktop.
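
To put numbers on the per-CPU usage, one option is to compare two snapshots of /proc/stat taken during a render. The following is a rough sketch assuming a Linux system; the helper names are made up for this example, and mapping logical CPUs to physical sockets (for instance via /sys/devices/system/cpu/cpuN/topology/physical_package_id) is left out.

```python
import time


def parse_proc_stat(text):
    """Map 'cpuN' -> (busy_ticks, total_ticks) from the contents of /proc/stat."""
    out = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep per-core lines (cpu0, cpu1, ...) and skip the aggregate "cpu" line.
        if not parts or not parts[0].startswith("cpu") or parts[0] == "cpu":
            continue
        # Fields: user nice system idle iowait irq softirq steal.
        # Guest time is already included in user, so it is not added again.
        ticks = [int(x) for x in parts[1:9]]
        idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)
        total = sum(ticks)
        out[parts[0]] = (total - idle, total)
    return out


def utilization(before, after):
    """Percent busy per core between two parse_proc_stat() snapshots."""
    usage = {}
    for cpu, (busy1, total1) in after.items():
        busy0, total0 = before.get(cpu, (0, 0))
        dt = total1 - total0
        usage[cpu] = 100.0 * (busy1 - busy0) / dt if dt > 0 else 0.0
    return usage


def sample(interval=1.0):
    """Take two live snapshots 'interval' seconds apart (Linux only)."""
    with open("/proc/stat") as f:
        first = parse_proc_stat(f.read())
    time.sleep(interval)
    with open("/proc/stat") as f:
        second = parse_proc_stat(f.read())
    return utilization(first, second)
```

Calling `sample(5.0)` a few times during the final pass and grouping the cores by socket would show whether one Xeon really sits at 60-75% while the other is pegged.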
Title: Re: Linux Render Node - Memory Error during Process2dPostEffects
Post by: Matt on November 18, 2019, 07:01:55 PM
I've sent you a PM with links to previous Linux builds.
Title: Re: Linux Render Node - Memory Error during Process2dPostEffects
Post by: WAS on November 18, 2019, 11:14:07 PM
Received. We'll continue this over PM or email.
Title: Re: Linux Render Node - Memory Error during Process2dPostEffects
Post by: Matt on November 19, 2019, 12:23:17 PM
The TGD file you attached named "terragen-4-benchmark_v1.0-nobloom.tgd" still has "Bloom" enabled. See the attached screenshot. I hope you won't find it disconcerting when I ask for TGD files, because sometimes they are the only way to be sure about these things.

Can you test with "Bloom" turned OFF to find out if that is where the bug occurs?
Title: Re: Linux Render Node - Memory Error during Process2dPostEffects
Post by: WAS on November 19, 2019, 01:02:08 PM
The disconcerting issue is I'm doing all the testing for you, regardless of "bloom" on or off, from a file hosted on your server. Why haven't you tested this? As a business and developer of the software, you should be far better equipped than I am to diagnose your software. I shouldn't have to change anything when I alerted you to YOUR benchmark failing on YOUR software.

This could have been sent via PM. But it seems you want it public to try to save face over EVERY build failing on Ubuntu since 41250 and probably before, as if it's just my fault for bloom's sake. If it's that hard to get the info you need, you need to do it yourself, because oopsies happen. The resolution wasn't changed either, because nano had no permission to write since I didn't log in as the Terragen user.

The fact is that, in reality, a user makes a report and YOU do all the testing, including the hassle of installing the distro. Not having tested your benchmark on a widely standardized distro before shipping to begin with was, well. Wow. Regardless of the excuse: "we can't test every distro". No, but you can test widely used ones, especially for compatibility's sake.

41250
Process2dPostEffects...
Process2dPostEffects: done
free(): invalid pointer
Aborted (core dumped)


42200
Process2dPostEffects...
free(): invalid next size (fast)free(): invalid next size (fast)

Aborted (core dumped)


44181
Process2dPostEffects...
Process2dPostEffects: done
corrupted size vs. prev_size
Aborted (core dumped)

44440
Process2dPostEffects...
free(): invalid next size (fast)
Aborted (core dumped)

Regardless of "BLOOM" On or Off, and exactly why you cannot rely on users to test for you, your shit is broken. And yeah how you handle this all as the developer, and business owner, is disconcerting, very disconcerting.

I'll spend another hour and a half testing all the builds with the file edited from desktop...
Title: Re: Linux Render Node - Memory Error during Process2dPostEffects
Post by: WAS on November 19, 2019, 01:31:47 PM
TG 44440 is successful at resolution 800x450 with no Anti-aliasing Bloom (it was enabled in 44440) or post bloom.

I will now try 41250 (the lowest in line I have).

Edit: 41250 also finished with no bloom / AA bloom, and finished quicker by a minute. Not sure what that's about; on Windows the latest versions are faster.

I also had my datacenter run memtest86 overnight, and I ran stress this morning and have no issues with my RAM or CPU, but I still have the issue with TG not utilizing one CPU well.

UPDATE: Never mind. Failed tests. TG was rendering a blank project again instead of warning about the missing project and aborting. Ugh. So dumb.
Title: Re: Linux Render Node - Memory Error during Process2dPostEffects
Post by: WAS on November 19, 2019, 02:00:39 PM
Alright, with an absolute path it rendered. I guess using "../" to back up and head into the project folder doesn't work.

It still is rendering with Bloom disabled on 44440
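
Since the node silently falls back to the default project when it can't find the file, one workaround is to resolve and check the project path before launching the render. A minimal sketch, assuming a wrapper script is acceptable; `resolve_project` is a made-up helper and nothing here is part of Terragen itself.

```python
from pathlib import Path


def resolve_project(path, cwd=None):
    """Return an absolute path to the project file, or raise if it is missing.

    path: project path as typed, possibly relative (e.g. "../Projects/x.tgd").
    cwd:  directory to resolve relative paths against (defaults to the current one).
    """
    base = Path(cwd) if cwd is not None else Path.cwd()
    p = Path(path).expanduser()
    if not p.is_absolute():
        p = base / p
    p = p.resolve()  # collapses "../" segments and follows symlinks
    if not p.is_file():
        raise FileNotFoundError(f"project file not found: {p}")
    return p
```

Passing the returned absolute path to the node avoids both the "../" problem and the case where a typo quietly produces a render of the blank default project.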
Title: Re: Linux Render Node - Memory Error during Process2dPostEffects
Post by: WAS on November 19, 2019, 02:27:25 PM
Lowest, 41250 also renders without bloom.

Also just a note that wow, your new robust AA is far superior to the old standard. The difference in noise between 41250 and 44440 is crazy.
Title: Re: Linux Render Node - Memory Error during Process2dPostEffects
Post by: Matt on November 19, 2019, 06:41:19 PM
Thanks for doing those tests. I'll work on it this week.
Title: Re: Linux Render Node - Memory Error during Process2dPostEffects
Post by: Matt on November 22, 2019, 02:01:09 PM
This has been reported as fixed in Build 4.4.45 (Release).