Adding TShapeNode to RootNode from additional thread

kedzoo · July 29, 2020, 6:53am

Hi.
I am trying to dynamically load multiple shapes (TShapeNode) into an already displayed scene (into the active TX3DRootNode).
If I do it from the same thread, everything works fine (but slowly).
But if I do the same from an additional thread, I get exceptions from the CGE sources.
Is there any way to fix this?

eugeneloza · July 29, 2020, 8:14am

Hello, @kedzoo and welcome to the forum!

As a rule of thumb, you should never use any Castle Game Engine features from a thread other than the main thread (just as with other game engines).

You may try an unsafe hack - (re)construct the whole TX3DRootNode in a thread and load it into a TCastleScene in the main thread. However, note that loading something operates on cache, and that may mess with the main thread.

A much safer approach: preload all the assets you need in the main thread, and only after that (re)construct the whole TX3DRootNode in a thread. This should work.

If you need to add/remove multiple nodes to/from a scene, consider A) using multiple scenes and setting Scene.Enabled := true/false; and/or B) using TSwitchNode.WhichChoice, setting it to -1 to hide the element and to 0 to show the first child node (children are indexed from 0). This way you can operate on huge numbers of elements safely and lightning-fast.
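The two toggling options above can be sketched as a small helper unit. This is only an illustration: the unit name and both procedure names are made up for this example, and it assumes the scene's visibility is controlled through the Exists property (the property that the TMX3DScene code later in this thread also sets). It needs Castle Game Engine on the unit path to compile.

```pascal
unit ChunkVisibilitySketch; // hypothetical unit name, for illustration only

{$mode objfpc}{$H+}

interface

uses
  CastleScene, X3DNodes;

{ Show or hide a whole, already loaded scene. Call from the main thread. }
procedure SetSceneShown(const Scene: TCastleScene; const Shown: Boolean);

{ Show or hide the first child of a Switch node. Call from the main thread. }
procedure SetSwitchShown(const SwitchNode: TSwitchNode; const Shown: Boolean);

implementation

procedure SetSceneShown(const Scene: TCastleScene; const Shown: Boolean);
begin
  // Assumption: Exists is the property that makes the scene
  // rendered and processed (or not) without unloading it.
  Scene.Exists := Shown;
end;

procedure SetSwitchShown(const SwitchNode: TSwitchNode; const Shown: Boolean);
begin
  if Shown then
    SwitchNode.WhichChoice := 0   // index of the first child
  else
    SwitchNode.WhichChoice := -1; // no child is chosen, nothing is rendered
end;

end.
```

Neither call loads or frees anything, which is why toggling is cheap compared to adding and removing nodes.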

kedzoo · July 30, 2020, 3:35am

Hello, Eugene.
Thank you for the advice. The idea was to load a part of the map when the camera moves, and to unload the part that is far from the camera. This is required because I want to support extra-huge maps (up to 48kk, i.e. 48 million, hexagons with inner geometry). So I cannot just preload everything, and I cannot decompose it into a reasonable number of preloaded assets. Anyway, it seems I found a solution (thanks again for your advice). A simple extension of the scene class should work for the two-scene case (at least it works now in my tests with a single scene).

  TMX3DScene = class(TCastleScene)
  private
    FEnabled: Boolean;
    FSceneBusy: TCriticalSection;
    procedure SetEnabled(AValue: Boolean);
  protected
    procedure Render(const Params: TRenderParams); override; overload;
  public
    constructor Create(AOwner: TComponent); override;
    destructor Destroy; override;

    property Enabled: Boolean read FEnabled write SetEnabled;
  end;
  
procedure TMX3DScene.SetEnabled(AValue: Boolean);
begin
  if FEnabled = AValue then
    Exit;

  FSceneBusy.Acquire;
  try
    FEnabled := AValue;
    if FEnabled then
      Enable
    else
      Disable;
    Exists := FEnabled;
    Visible := FEnabled;
  finally
    FSceneBusy.Release;
  end;
end;

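procedure TMX3DScene.Render(const Params: TRenderParams);
begin
  // Hold the lock for the whole render call, so that SetEnabled
  // cannot change the scene state in the middle of a frame.
  FSceneBusy.Acquire;
  try
    inherited Render(Params);
  finally
    FSceneBusy.Release;
  end;
end;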

UPD: No. Each time the scene is “enabled”, it allocates some memory, for no apparent reason.

kedzoo · July 31, 2020, 7:05pm

Finally (actually, not sure about “finally”) I implemented it like this: the geometry is calculated and the shape nodes are created in an additional thread, and the root node is updated in the main thread (wnd.OnBeforeRender).

But I can easily see that the screen hangs while the root node is updated. And sometimes the scene recalculates everything (in the task manager I can see a lot of memory being released and allocated again; I guess it is the Scene.ChangedAll method). This also causes lags.
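One way to keep such a main-thread update from stalling a frame is to let the worker produce only plain data, and to apply a bounded number of finished chunks per frame. The sketch below shows that split using only FPC RTL classes. All names (TChunkData, TBuilderThread, ApplyPendingChunks, the budget of 3 chunks per frame) are hypothetical; the place where real code would create nodes and attach them to the root node is marked with a comment, and the `while` loop stands in for a per-frame callback such as OnBeforeRender.

```pascal
program ChunkQueueSketch;

{$mode objfpc}{$H+}

uses
  {$ifdef unix}cthreads,{$endif}
  Classes, SysUtils, SyncObjs, Generics.Collections;

type
  { Plain data only: the worker creates no engine objects. }
  TChunkData = class
    ChunkIndex: Integer;
    Heights: array of Single;
  end;

  TChunkQueue = specialize TQueue<TChunkData>;

  TBuilderThread = class(TThread)
  private
    FQueue: TChunkQueue;
    FLock: TCriticalSection;
    FCount: Integer;
  protected
    procedure Execute; override;
  public
    constructor Create(AQueue: TChunkQueue; ALock: TCriticalSection;
      ACount: Integer);
  end;

constructor TBuilderThread.Create(AQueue: TChunkQueue;
  ALock: TCriticalSection; ACount: Integer);
begin
  FQueue := AQueue;
  FLock := ALock;
  FCount := ACount;
  inherited Create(False); // not suspended
end;

procedure TBuilderThread.Execute;
var
  I, J: Integer;
  Chunk: TChunkData;
begin
  for I := 0 to FCount - 1 do
  begin
    if Terminated then Exit;
    // Stand-in for the expensive geometry calculation.
    Chunk := TChunkData.Create;
    Chunk.ChunkIndex := I;
    SetLength(Chunk.Heights, 100);
    for J := 0 to High(Chunk.Heights) do
      Chunk.Heights[J] := Sin(I + J);
    FLock.Acquire;
    try
      FQueue.Enqueue(Chunk);
    finally
      FLock.Release;
    end;
  end;
end;

{ Call once per frame from the main thread.
  Consumes at most MaxPerFrame chunks and returns how many were consumed. }
function ApplyPendingChunks(Queue: TChunkQueue; Lock: TCriticalSection;
  MaxPerFrame: Integer): Integer;
var
  Chunk: TChunkData;
begin
  Result := 0;
  while Result < MaxPerFrame do
  begin
    Chunk := nil;
    Lock.Acquire;
    try
      if Queue.Count > 0 then
        Chunk := Queue.Dequeue;
    finally
      Lock.Release;
    end;
    if Chunk = nil then Break;
    // Real code would build the nodes from Chunk here, on the main thread,
    // and add them to the root node.
    Chunk.Free;
    Inc(Result);
  end;
end;

var
  Queue: TChunkQueue;
  Lock: TCriticalSection;
  Builder: TBuilderThread;
  Applied, Frames: Integer;
begin
  Queue := TChunkQueue.Create;
  Lock := TCriticalSection.Create;
  Builder := TBuilderThread.Create(Queue, Lock, 20);
  try
    Applied := 0;
    Frames := 0;
    while Applied < 20 do // stands in for the per-frame callback
    begin
      Inc(Applied, ApplyPendingChunks(Queue, Lock, 3));
      Inc(Frames);
      Sleep(1);
    end;
    WriteLn('Applied ', Applied, ' chunks over ', Frames, ' frames');
  finally
    Builder.WaitFor;
    Builder.Free;
    Lock.Free;
    Queue.Free;
  end;
end.
```

The budget trades latency for smoothness: a smaller MaxPerFrame spreads the cost over more frames. It does not help if a single update triggers a full recalculation of one huge scene, which is a separate problem.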

[dynamic loading of 5kk tiles map]

kedzoo · August 3, 2020, 8:19am

I returned to experiments with a full load in a separate thread. Yeah, I remember the rule “do not use other threads”. But what can I do? And it seems some engines support async loading (Unreal, Unity, Panda3D and some others, according to Google).

Back to my tests. This works in many ways, and works fast. But there is always one problem: something in memory is not freed. This is not a memory leak; these are some registered objects that are freed correctly afterwards. But the allocated memory grows with each iteration.

Wondering what it is and how to release it …

Reconstructing/recreating the root node, Scene.FreeResources, Scene.ChangedAll, RootNode.UnregisterScene, RootNode.FdChildren.Changed and anything else I found in the sources does not help.

eugeneloza · August 3, 2020, 8:49am

Be cautious of unexpected (they may be rare, or may be very rare, but I wouldn’t count on that) errors though. I’ve had a (very) painful experience of almost completely rewriting a large project because of that.

Also see “Asynchronous (non-blocking) downloading using TCastleDownload class, and other HTTP communication features – Castle Game Engine”. However, as I haven’t tried the feature yet, I’m not sure how it’ll work with actually loading models, but at least preloading them can be done from a thread.

Note that this may be a feature of Windows/Linux memory management. The memory is freed correctly, but is still reserved for the program. This way the program runs faster, because the memory is not defragmented when not needed.
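A rough way to tell these two situations apart is to compare what the FPC heap manager reports as used with what it keeps reserved. The sketch below assumes the default FPC heap manager (with a replacement such as cmem the numbers may be zero or meaningless); the block count and block size are arbitrary.

```pascal
program HeapStatusSketch;

{$mode objfpc}{$H+}

{ Print the bytes currently used by the program and the bytes
  the heap manager currently holds. }
procedure Report(const Caption: string);
var
  Status: TFPCHeapStatus;
begin
  Status := GetFPCHeapStatus;
  WriteLn(Caption, ': used=', Status.CurrHeapUsed,
    ' reserved=', Status.CurrHeapSize);
end;

var
  Blocks: array[0..999] of Pointer;
  I: Integer;
begin
  Report('start');
  for I := Low(Blocks) to High(Blocks) do
    GetMem(Blocks[I], 64 * 1024);
  Report('after allocating');
  for I := Low(Blocks) to High(Blocks) do
    FreeMem(Blocks[I]);
  Report('after freeing');
end.
```

If the "used" value returns to its baseline after each load/unload iteration while the process size shown by the task manager stays high, the memory was freed and is only being retained. If "used" itself grows with each iteration, some objects are still alive.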

kedzoo · August 3, 2020, 8:54am

No, Eugene. If you are right, then in the single-thread case it must allocate memory in the same way. But it’s totally different.

Sure. That’s why I’m trying to solve it now, before doing anything else. But I think my final extension of TCastleScene works quite fine. I can manage any collision, and waiting until all Scene methods in the main thread finish works after removing the scene from the Viewport. Yeah, maybe something is still there, but there are no exceptions…

kedzoo · August 3, 2020, 8:59am

Thank you, I will check it.

eugeneloza · August 3, 2020, 9:04am

Ah, ok. That makes sense.

Do you do FreeOnTerminate := true?

Sorry, I somehow missed your two previous messages… Looking at the image, it looks like not the best way to render that many tiles. Sorry, I’m busy right now and can’t go down into the details, but my first guess would be creating some sort of “LODs” (they can be autogenerated, but overall the idea is similar to GoogleEarth/GoogleMaps) and loading those instead of thousands of shapes which is heavy both for rendering and loading. Also splitting the whole world into multiple scenes can help greatly (to avoid recalculating one huge scene in ChangedAll), but I think you’ve done that already.

kedzoo · August 3, 2020, 9:22am

I use FreeOnTerminate := False and destroy the thread myself in the Loader object destructor, because the thread is implemented for many purposes, not just removing or adding shape nodes.

Previously I implemented it in 2D and it works fine, but the number of prepared sprites would be about 480k… I planned a lot of landscape variations: different indents depending on neighboring tiles, varying river width, etc. So I turned to 3D (orthogonal camera for now). This scale is just for testing, so it is easy to see how it loads. I have already planned what to do for such huge scales, but that will come later.

Not exactly. I glue hexagons into groups (the group size is configurable, now using 10x10 groups) and it works well enough. It even works faster than I expected and (normally) takes just about 800 MB of memory (video + RAM, because I’m testing on a notebook without a discrete graphics card) for a fully loaded 5kk map. What is on the screen will, I think, be LOD 1.
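Deciding which 10x10 groups to load or unload as the camera moves comes down to integer arithmetic on tile coordinates. The sketch below is one way to do it, under assumptions that are not from the post: tiles are addressed by offset (column, row) coordinates, a group is kept while it lies within a square radius (in groups) of the camera's group, and the camera positions and the 6x6 map are made up.

```pascal
program GroupWindowSketch;

{$mode objfpc}{$H+}

const
  GroupSize = 10; // tiles per group side (10x10 groups, as in the post)

{ Floor division, so negative tile coordinates map consistently too. }
function TileToGroup(const Tile: Integer): Integer;
begin
  if Tile >= 0 then
    Result := Tile div GroupSize
  else
    Result := -((-Tile + GroupSize - 1) div GroupSize);
end;

{ True when group (GX, GY) lies within Radius groups of the camera's group. }
function GroupWanted(const GX, GY, CamTileX, CamTileY,
  Radius: Integer): Boolean;
begin
  Result := (Abs(GX - TileToGroup(CamTileX)) <= Radius) and
            (Abs(GY - TileToGroup(CamTileY)) <= Radius);
end;

var
  GX, GY: Integer;
  WasWanted, IsWanted: Boolean;
begin
  // The camera moves from tile (25, 25) to tile (36, 25); radius is 1 group.
  for GX := 0 to 5 do
    for GY := 0 to 5 do
    begin
      WasWanted := GroupWanted(GX, GY, 25, 25, 1);
      IsWanted := GroupWanted(GX, GY, 36, 25, 1);
      if IsWanted and not WasWanted then
        WriteLn('load   ', GX, ',', GY)
      else if WasWanted and not IsWanted then
        WriteLn('unload ', GX, ',', GY);
    end;
end.
```

With these numbers the sketch reports groups 1,1 to 1,3 for unloading and groups 4,1 to 4,3 for loading. Only the difference between the old and the new window is touched, so the work per camera move stays proportional to the window edge rather than to the map size.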

michalis · August 14, 2020, 7:45am

You just cannot use Castle Game Engine API from separate threads. In particular, operating on the X3D node graph accesses some caches and associations – some are per-scene (limited to the containing TCastleScene instance), but some are global (or static in some classes, e.g. temporary instances for passing events).

All the calls to CGE should be done from one single (“main”) thread. You’re welcome to do threads to perform some work in the background of course (in fact CGE does it too, internally – for streaming music, for downloading http/https stuff). But the interaction with CGE API must be done from the main thread.

There is no way around it, and I absolutely very very very do not advise trying to work around it by “trial and error”, and patching particular places selectively. I pretty much guarantee that you will end in endless problems. And thread-related problems are hard, as they seem like random crashes, completely unexpected values for some stuff etc.

Note that TCastleDownload, for asynchronous downloading, will use threads (for some protocols, on some platforms) under the hood. But that is hidden from you. Also it doesn’t initialize any graphic resources in thread, it just downloads the stream. Likewise some other engine operations work in thread (music streaming), and some libraries underneath use threads (OpenAL, and OpenGL of course does stuff on GPU and may as well use threads, that’s up to the implementation).

I know it sounds “final”, that you “definitely should not use CGE API from multiple threads”. But there’s no way to easily hide it / work around it. We could change some caches underneath, to make them thread-safe, and then some limited operations could be done from multiple threads. But that’s a significant amount of work, and it could hurt performance of applications that don’t need threads if done carelessly, so in the end it’s not my priority in CGE now.

Note that at least Unity also doesn’t do it. You have to use Unity API from main thread. You can download stuff in background (just like our TCastleDownload) but in general loading stuff and adding it to world is something you do in main thread.

Note that some graphic APIs (like OpenGL(ES)) also require usage from a main thread. So even if CGE were more open to multi-threaded usage, in the end all loading on GPU would have to be done in the main thread anyway. (Vulkan would solve it, but that’s a big task in itself, and using Vulkan with multi-threading makes it a really big task.)

kedzoo · August 15, 2020, 8:02am

Thanks for the answer, michalis.

Yeah, but as I understood from my testing with threads (OK, it was a wrong idea, but anyway), OpenGL was updated quite fast and the main load was “on the shoulders” of the engine.

What about something like that: Unity - Scripting API: SceneManagement.SceneManager.LoadSceneAsync
Unity - Scripting API: AsyncOperation

OK, but maybe there is some other way to optimize dynamic loading? Maybe some caches can be prepared dynamically to reduce the load on the main thread? Or is the only way to preload everything at start? I’m afraid that is impossible in my case (for huge maps). BTW, what about the idea with a buffer scene (load into a hidden scene in a thread and then switch it with the visible one)?

michalis · August 16, 2020, 9:37pm

In cases when the load is on the engine CPU work (not on uploading stuff to GPU), there is definitely a possibility to make it in a thread in the future.

And yes, I would prefer to expose this through an API like Unity (which you call from main thread, but it works asynchronously). It is then asynchronous, and also easier/safer to use than explicit threads in user code. ( Quite like current TCastleDownload that uses threads internally, but exposes a simple asynchronous API. )

However, that is just not ready yet. It was not tested, there are some global caches and some things created “on-demand” accessed underneath.

As for a hidden scene loaded in a thread: Whether the scene is hidden or not, adding/initializing it right now would interact with some things assuming that we’re always in one thread. So, that is also not something possible for now.

In the end, I’m not saying we will never enable doing something in threads (maybe we will). And I would definitely want to have something like LoadSceneAsync one day! But it’s not available yet.
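The API shape described above (started and polled from the main thread, with the thread hidden inside) can be illustrated with plain FPC RTL classes. This is not CGE API and not how CGE implements anything: TAsyncLoad, TLoadStatus and TWorker are hypothetical names, and the summing loop is a stand-in for CPU-only loading work that touches no engine objects.

```pascal
program AsyncPollSketch;

{$mode objfpc}{$H+}

uses
  {$ifdef unix}cthreads,{$endif}
  Classes, SysUtils;

type
  TLoadStatus = (lsLoading, lsDone);

  { Hidden worker: CPU-only work, no engine calls. }
  TWorker = class(TThread)
  protected
    procedure Execute; override;
  public
    Total: Int64;
  end;

  { Hypothetical asynchronous operation, used only from the main thread. }
  TAsyncLoad = class
  private
    FWorker: TWorker;
  public
    constructor Create;
    destructor Destroy; override;
    function Status: TLoadStatus;
    { Read this only after Status has returned lsDone. }
    function Value: Int64;
  end;

procedure TWorker.Execute;
var
  I: Integer;
begin
  Total := 0;
  for I := 1 to 1000000 do
    Total := Total + I;
end;

constructor TAsyncLoad.Create;
begin
  inherited Create;
  FWorker := TWorker.Create(False); // starts working immediately
end;

destructor TAsyncLoad.Destroy;
begin
  FWorker.WaitFor;
  FWorker.Free;
  inherited Destroy;
end;

function TAsyncLoad.Status: TLoadStatus;
begin
  if FWorker.Finished then
    Result := lsDone
  else
    Result := lsLoading;
end;

function TAsyncLoad.Value: Int64;
begin
  Result := FWorker.Total;
end;

var
  Load: TAsyncLoad;
  Frames: Integer;
begin
  Load := TAsyncLoad.Create;
  try
    Frames := 0;
    // Stands in for the game loop: poll once per frame, never block.
    while Load.Status = lsLoading do
    begin
      Inc(Frames);
      Sleep(1);
    end;
    WriteLn('Value ', Load.Value, ' ready after ', Frames, ' polled frames');
  finally
    Load.Free;
  end;
end.
```

The calling code never sees a thread or a lock; it only asks "is it done yet" once per frame, which is what makes this style easier to use safely than explicit threads in user code.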