Square Voronoi?

Started by René, July 18, 2019, 12:15:55 PM

Previous topic - Next topic

Hetzen

Quote from: René on July 27, 2019, 04:04:23 AM
Quote from: Matt on July 27, 2019, 12:52:31 AM
Manhattan distance could be added to the existing Voronoi functions as an option. I'll see if I can squeeze that in to an update. Unfortunately it doesn't really solve the squareness problem on slopes.

That would be great, even if it is slow (Alpine fractal was also slow 10 years ago). Even if you don't get great results right away, it would be very good to be able to experiment with it to assess the pros and cons. And you never know what people will come up with.

The reason the blue node method is really slow is that the network has to re-evaluate each branch every time it hits the Conditional Scalars, and when you're checking 9 zones, each Conditional asks for the result 4 times....

In code, you can cache each of the 9 zones so the Conditional just reads those values from memory, speeding up the process considerably.
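In code terms, the idea can be sketched as a tiny memo cache. This is just a Python illustration of the principle, not Terragen's actual node code; `zone_value` is a hypothetical stand-in for one zone's expensive sub-network:

```python
import math

def zone_value(p):
    """Hypothetical stand-in for one expensive zone sub-network."""
    x, y = p
    return math.sin(x * 12.9898 + y * 78.233) % 1.0

class CacheNode:
    """Stores each result so repeated Conditional lookups read memory
    instead of re-evaluating the whole branch."""
    def __init__(self, fn):
        self.fn = fn
        self.cache = {}
        self.evals = 0  # counts real evaluations of the branch

    def __call__(self, p):
        if p not in self.cache:
            self.evals += 1
            self.cache[p] = self.fn(p)
        return self.cache[p]

node = CacheNode(zone_value)
p = (1.5, 2.5)
results = [node(p) for _ in range(4)]  # a Conditional asking 4 times
# Only one real evaluation happened; the other three were memory reads.
```

Without the cache, all four lookups would walk the whole branch again, which is exactly the 9-zones-times-4-lookups blowup described above.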

There was talk of a cache node a while back. Is that still a possibility Matt?

WAS

#46
Quote from: Hetzen on July 27, 2019, 09:36:14 AM
Quote from: René on July 27, 2019, 04:04:23 AM
Quote from: Matt on July 27, 2019, 12:52:31 AM
Manhattan distance could be added to the existing Voronoi functions as an option. I'll see if I can squeeze that in to an update. Unfortunately it doesn't really solve the squareness problem on slopes.

That would be great, even if it is slow (Alpine fractal was also slow 10 years ago). Even if you don't get great results right away, it would be very good to be able to experiment with it to assess the pros and cons. And you never know what people will come up with.

The reason the blue node method is really slow is that the network has to re-evaluate each branch every time it hits the Conditional Scalars, and when you're checking 9 zones, each Conditional asks for the result 4 times....

In code, you can cache each of the 9 zones so the Conditional just reads those values from memory, speeding up the process considerably.

There was talk of a cache node a while back. Is that still a possibility Matt?

That would be cool. Though I'm specifically talking about calculations in general within TG's GUI (like the 3D previews, etc.) and when rendering. Sure, it's all split per core, but each core is just working on its slice of geometry; hasn't the math already been worked out before anything is rendered? Lol, I'm not sure how it could calculate random parts of the math as it goes, lol.

When working within blue nodes, and previews of shaders, this is all 32-bit. There's no way the software can just dynamically pick apart your formula and decide what thread to throw it at (and then competently combine the results back); you'd have to do that manually, or it's built into the shaders, like multi-threaded voronoi pathing, which is an algorithm divided per thread (similar to how rendering is divided up).

I'm not sure of any math tool that can do multi-threaded math without being specifically programmed to. Think of PC calculators: they don't know how to dynamically break up your formulas, and are thus single-threaded.

This may be why, outside of rendering and holding objects in memory, Terragen sees little benefit from 64-bit.

WAS

Additionally, taking a look at the function you shared, you'd think the math for both trees would already have been fired before ever being compared, and thus it's just comparing raster results? That shouldn't be too heavy. And in fact, the slowdown before rendering a preview is there, but the rest of the process is still incredibly slow, long after the comparisons have finished.

Hetzen

Everything is 64-bit WASasquatch, as far as I know. All 64-bit gives you over 32-bit is the capacity to hold a lot more digits in a number. It has nothing to do with multi-threading.
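For what it's worth, that precision difference is easy to demonstrate directly. This Python sketch (my own illustration) round-trips a 64-bit double through 32-bit storage to show the lost digits; it says nothing about threading, which is the point:

```python
import struct

def to_float32(x):
    """Round-trip a Python double (64-bit) through 32-bit float storage."""
    return struct.unpack('f', struct.pack('f', x))[0]

x64 = 1.0 + 1e-10       # a tiny offset a 64-bit double can hold
x32 = to_float32(x64)   # the same value squeezed into 32 bits

# 64-bit keeps the offset; 32-bit rounds it away entirely,
# because 1e-10 is below float32's ~7 significant decimal digits.
print(x64 > 1.0)        # True
print(x32 == 1.0)       # True
```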

As for my network, there is nothing to tell the Conditional Scalars that they are doing the same calculation several times. The tree just sees the links as another set of nodes that need re-working. Hence the reason for sticking a 'Cache Node' at the end of each of my nine zones, so when the Conditional calls each branch of its inputs, it hits the Cache Node without calculating the whole thing again.

Manhattan noise would need another set of conditionals on top of the clip I posted. I can post that too for you to have a look at, but it really is unwieldy, not just for processing but also for the number of nodes in the scene. It was more of a thought experiment for myself, to see if I understood the math and could translate it into TG blues. It really isn't very useful.
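For anyone following along, the underlying math is just a change of distance metric. A minimal Python sketch (my own illustration, not the node network) of the Voronoi F1 value under a Minkowski p-norm, where p=1 gives Manhattan and p=2 gives Euclidean:

```python
import random

def minkowski(a, b, p):
    """Minkowski distance: p=1 is Manhattan, p=2 is Euclidean."""
    return sum(abs(ax - bx) ** p for ax, bx in zip(a, b)) ** (1.0 / p)

def voronoi_f1(pt, features, p=2.0):
    """Distance to the nearest feature point (the classic F1 value)."""
    return min(minkowski(pt, f, p) for f in features)

random.seed(7)
features = [(random.random(), random.random()) for _ in range(16)]
q = (0.5, 0.5)

d2 = voronoi_f1(q, features, p=2.0)  # round-ish cells
d1 = voronoi_f1(q, features, p=1.0)  # diamond/square-ish cells
# Manhattan distance is >= Euclidean for every point pair,
# so the F1 value under p=1 can never be smaller than under p=2.
```

Swapping the metric changes the iso-distance contours from circles to diamonds, which is where the squarer cell shapes come from.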

Matt

Was, 64-bit vs 32-bit has nothing to do with threading or the parallelism of a particular algorithm. I would be more motivated to explain how things work if you were to simply ask, without making pages of assertions on this forum about a whole range of things which you don't understand, and if you had given me much confidence over the years that after I explain it you wouldn't try to say that I'm wrong about the bits you don't understand.

But, in a nutshell, the voronoi algorithms we use in shaders in CG are completely parallelizable - they can be evaluated at many different locations by different threads without (much) loss of efficiency - because they are calculated using a deterministic noise pattern. This is generally true of all the blue function nodes, and most of the shaders too.
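The "deterministic noise" point is the key. As a rough illustration (a hash-based sketch, not TG's actual noise function), any thread evaluating the same lattice point gets the same value, with no shared state to synchronize:

```python
import hashlib

def lattice_value(ix, iy, seed=0):
    """Deterministic pseudo-random value in [0, 1) for an integer
    lattice point. Pure: it depends only on its inputs and touches
    no shared state, so threads never need to coordinate."""
    h = hashlib.sha256(f"{seed}:{ix}:{iy}".encode()).digest()
    return int.from_bytes(h[:8], "big") / 2**64

# Two "threads" asking for the same point always agree,
# regardless of which one asks first.
a = lattice_value(3, 7)
b = lattice_value(3, 7)
```

A voronoi shader built on a function like this can scatter its feature points reproducibly from any thread, which is why the evaluation parallelizes so cleanly.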
Just because milk is white doesn't mean that clouds are made of milk.

Matt

Quote from: Hetzen on July 27, 2019, 09:36:14 AM
There was talk of a cache node a while back. Is that still a possibility Matt?

The last time I looked at this it was more difficult than I thought it would be, with lots of corner cases to account for. I might revisit it as part of a refactoring effort but I don't expect it will be very soon.
Just because milk is white doesn't mean that clouds are made of milk.

Hetzen

Quote from: Matt on July 27, 2019, 03:39:37 PM
Quote from: Hetzen on July 27, 2019, 09:36:14 AM
There was talk of a cache node a while back. Is that still a possibility Matt?

The last time I looked at this it was more difficult than I thought it would be, with lots of corner cases to account for. I might revisit it as part of a refactoring effort but I don't expect it will be very soon.

I can imagine there's all sorts of situations this could go wrong and after all it's a pretty niche request so no worries.

WAS

#52
Quote from: Matt on July 27, 2019, 03:31:24 PM
Was, 64-bit vs 32-bit has nothing to do with threading or the parallelism of a particular algorithm. I would be more motivated to explain how things work if you were to simply ask, without making pages of assertions on this forum about a whole range of things which you don't understand, and if you had given me much confidence over the years that after I explain it you wouldn't try to say that I'm wrong about the bits you don't understand.

But, in a nutshell, the voronoi algorithms we use in shaders in CG are completely parallelizable - they can be evaluated at many different locations by different threads without (much) loss of efficiency - because they are calculated using a deterministic noise pattern. This is generally true of all the blue function nodes, and most of the shaders too.

Where have I ever said you were wrong? Rather, I'm trying to point out that what I'm describing seems to be something else entirely from what you're addressing.

There seems to be absolutely zero difference in your Voronoi algorithm when forced onto one thread versus all threads. Can you explain that?

You seem to be explaining the rendering process itself, which already has all the calculations needed to render anything?

Hetzen, is there a way to simply make the seams of your square noise soft rather than hard (the square noise function you shared a while back)? That would solve the blockiness issue in the pattern being augmented. I tried your current function in several instances and it seems to have issues with random hard parallel lines occurring in the soft squares.

The square noise function works well other than its seams being hard straights (again, the past one you shared).

WAS

And to make things perfectly clear, Matt, I'm not talking about render time, which I couldn't care less about whether it takes an hour or 2 days. I'm talking about working with the software and the shaders: the 3D Preview window, 3D previews of shaders, working with the actual shaders and manipulating them via other shaders, and the responsiveness of TG itself.

Hetzen

WAS, I'm not sure why you're getting those edges; maybe I missed something when clipping the network. Below is the full Minkowski clip I made.

I'd also like to back up Matt a little here, in that you do have a habit of talking in certainties when actually you've been wrong about the terminology you've used to describe your problem/issue. There have been quite a few posts of yours where I've not understood what the fuck you're trying to talk about. A picture is worth a thousand words. We all want to help. So just show it, without assuming something is not working in the background. I've been guilty myself.

WAS

#55
Quote from: Hetzen on July 27, 2019, 05:19:49 PM
WAS, I'm not sure why you're getting those edges; maybe I missed something when clipping the network. Below is the full Minkowski clip I made.

I'd also like to back up Matt a little here, in that you do have a habit of talking in certainties when actually you've been wrong about the terminology you've used to describe your problem/issue. There have been quite a few posts of yours where I've not understood what the fuck you're trying to talk about. A picture is worth a thousand words. We all want to help. So just show it, without assuming something is not working in the background. I've been guilty myself.

I am scaling the noise down 0.5 in x, y, z, 3 times, into their own displacements. Perhaps it's being affected by that. I noticed if I just add a new larger scale, the blocks' spacing seems to become exponentially larger from their original positions.

I have dyslexia bad, so terminology in textual form is always going to be an issue. I also often use the literal dictionary meanings of words where people have slang understandings of them, or meanings that pertain to some specific field they're part of.

And perhaps sometimes I do not know what I'm talking about, but I spent several hours reading documentation from some university about incorporating the voro++ library into a ray-tracing renderer and how they divided the pathing up across different threads -- entirely independent from the modeling, object creation and rendering of the final scene (the objects were based on the voronoi, similar to how we use displacement). From what Matt said, it sounds like it just renders the voronoi in its entirety in that bucket/thread.

Hetzen

#56
Yes, I understand.

The reason for those edges is that the zones need to be redefined when the scaling on the Get Position in Texture multiplier I put in that clip is off 1,1,1. So really it shouldn't have been in the first clip.

An easy fix is to put a Transform Input at the end of the tree to stretch things.

WAS

Quote from: Hetzen on July 27, 2019, 05:44:37 PM
Yes, I understand.

The reason for those edges is that the zones need to be redefined when the scaling on the Get Position in Texture multiplier I put in that clip is off 1,1,1. So really it shouldn't have been in the first clip.

An easy fix is to put a Transform Input at the end of the tree to stretch things.

That's what I did actually; scaled up 10,10,10, then down-scaled by 0.5 3 times.

Hetzen

Yep. I was re-playing things opening up that old scene. I should have left it as is. It's difficult getting back into the mindset of something from a while ago.

Matt

Quote from: WASasquatch on July 27, 2019, 04:16:10 PM
There seems to be absolutely zero difference in your Voronoi algorithm when forced onto one thread versus all threads. Can you explain that?

If you mean in the code, then yeah, that's right. The same code is executed on each thread, and the only thing that's different between threads is the position being evaluated for. This is how the algorithm can scale up to an arbitrary number of threads. It is what's known as a "pure function" in the functional programming literature. That is, its behaviour depends only on the input values and, what's critically important here, it doesn't have any effect on anything else in the system. Its scalability comes from the fact that each invocation of the function is its own island. It doesn't have to interact with any other parallel invocation. 100 parallel invocations will get approximately 100 times as much work done in the same time.
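This property is easy to demonstrate. In the Python sketch below (a stand-in noise function, not TG code), splitting the same pure function across a thread pool produces results identical to a serial loop, because each invocation is its own island:

```python
from concurrent.futures import ThreadPoolExecutor
import math

def noise(p):
    """A pure stand-in shader: output depends only on the input position."""
    x, y = p
    return math.sin(x * 12.9898 + y * 78.233) % 1.0

positions = [(x * 0.1, y * 0.1) for x in range(20) for y in range(20)]

serial = [noise(p) for p in positions]
with ThreadPoolExecutor(max_workers=8) as ex:
    parallel = list(ex.map(noise, positions))  # order-preserving map

# Pure functions make the split safe: the results are identical,
# no matter how the positions were divided among threads.
assert serial == parallel
```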

If you mean in terms of render/preview time, then yes, it may be that for quick-to-evaluate functions the bottlenecks occur elsewhere, as in my next answer.

Quote
You seem to be explaining the rendering process itself, which already has all the calculations needed to render anything?
...
And to make things perfectly clear, Matt, I'm not talking about render time, which I couldn't care less about whether it takes an hour or 2 days. I'm talking about working with the software and the shaders: the 3D Preview window, 3D previews of shaders, working with the actual shaders and manipulating them via other shaders, and the responsiveness of TG itself.

We do have some issues with responsiveness when we are multi-threading the previews. Previews involve some things which are less cleanly parallel, and the fact that all running threads have to be able to respond to the user interaction is where things get more tricky. Thankfully, in many cases the shaders themselves don't need to know anything about this complexity, as long as they are "pure functions" like the various noise functions are. But it all runs through a rendering system that has some baggage from being optimized for the days before we had threading. And then there are some shaders which are not pure functions because they have to cache stuff (e.g. clouds), and that introduces extra lag. Definitely there could be some improvements. I think every minor update in 4.x has chipped away at some of these problems, bit by bit, but I'm still working on it.
Just because milk is white doesn't mean that clouds are made of milk.