Is there a more efficient image blending function inside the engine for blending large batches of images?

a8265348 · October 26, 2023, 8:57am

The engine's own blending runs in a loop on the CPU, and it takes a while. I would appreciate any suggestions. Thank you very much.

function BlendBytes(const Dest, Source, Opacity: Byte): Byte; {$ifdef SUPPORTS_INLINE} inline; {$endif}
var
  W: Word;
begin
  W := byte(Dest) * (255 - Opacity) div 255 + byte(Source) * Opacity div 255;
  if W > 255 then W := 255;
  Result := W;
end;

function AddBytes(const Dest, Source, Opacity: Byte): Byte; {$ifdef SUPPORTS_INLINE} inline; {$endif}
var
  W: Word;
begin
  W := Dest + Word(Source) * Opacity div 255;
  if W > 255 then W := 255;
  Result := W;
end;

function AddBytesPremultiplied(const Dest, Source: Byte): Byte; {$ifdef SUPPORTS_INLINE} inline; {$endif}
var
  W: Word;
begin
  W := Dest + Source;
  if W > 255 then W := 255;
  Result := W;
end;

function MultiplyBytes(const Dest, Source: Byte): Byte; {$ifdef SUPPORTS_INLINE} inline; {$endif}
var
  W: Word;
begin
  W := Dest * Word(Source) div 255;
  if W > 255 then W := 255;
  Result := W;
end;
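
One CPU-side micro-optimization for helpers like BlendBytes is to replace each "div 255" with shifts and adds. The following is only a minimal sketch: Div255 and BlendBytesFast are hypothetical names, not engine functions, and the identity is only valid for inputs below 65535 (the largest product here is 255 * 255 = 65025). The program checks the variant against the original formula for every input combination.

```pascal
program Div255Sketch;
{$mode objfpc}{$H+}

{ Hypothetical helper (not an engine function): equals X div 255
  for X in 0..65534, using only shifts and adds. }
function Div255(const X: Cardinal): Cardinal; inline;
begin
  Result := (X + 1 + (X shr 8)) shr 8;
end;

{ Same formula as BlendBytes, with both "div 255" replaced by Div255.
  The two terms sum to at most 255, so no clamp is needed. }
function BlendBytesFast(const Dest, Source, Opacity: Byte): Byte; inline;
begin
  Result := Div255(Cardinal(Dest) * Cardinal(255 - Opacity)) +
            Div255(Cardinal(Source) * Opacity);
end;

var
  D, S, O, Expected, Mismatches: Integer;
begin
  { Exhaustive check against the original "div 255" formula. }
  Mismatches := 0;
  for D := 0 to 255 do
    for S := 0 to 255 do
      for O := 0 to 255 do
      begin
        Expected := D * (255 - O) div 255 + S * O div 255;
        if BlendBytesFast(Byte(D), Byte(S), Byte(O)) <> Expected then
          Inc(Mismatches);
      end;
  Writeln('Mismatches: ', Mismatches);
end.
```

Whether this is actually faster depends on the compiler and target, since an optimizing compiler may already turn division by a constant into cheaper instructions. It is worth measuring on the real workload before adopting it.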

michalis · October 26, 2023, 10:55am

Indeed there is a faster and more recommended method.

The idea is to not use the CPU-based drawing methods in CastleImages, like TCastleImage.DrawFrom and TCastleImage.DrawTo. As you found, this approach loops over pixels on the CPU, so it’s never really going to be efficient for large images.

The more recommended approach is to use TDrawableImage and draw there, for example with TDrawableImage.DrawFrom. More generally, do any drawing you want between TDrawableImage.RenderToImageBegin and TDrawableImage.RenderToImageEnd. This way you can draw other images, or any other UI controls, even 3D stuff, into an image. This approach is:

Fully GPU-accelerated (underneath we use OpenGL(ES) FBOs to draw into a texture).
More powerful, as you can really draw anything.
Different in how it works: you rely on the blending methods provided by OpenGL(ES), see TDrawableImage.Alpha, TDrawableImage.BlendingSourceFactor, TDrawableImage.BlendingDestinationFactor.

See examples/images_videos/draw_images_on_gpu/draw_images_on_gpu.dpr for an example.

In the long term, we plan to either deprecate the existing CPU-based drawing methods on TCastleImage or reimplement them on top of TDrawableImage.

TODO: I should document it better, so that others don’t fall into this trap. Maybe we should deprecate existing CPU-based drawing methods sooner. We should also have more examples of this.

a8265348 · October 26, 2023, 2:52pm

I don’t think it’s necessary to remove them. Instead, we should add more efficient batch CPU routines, so that everyone can use a hybrid approach. On mobile (especially iOS), the CPU is strong while the GPU is noticeably weaker than on other platforms, so you need to run some large batch work on the CPU and then hand the result to the GPU for display. The method you suggested above did not perform well for me on iOS.
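
To illustrate what a batch-oriented CPU routine could look like, here is a minimal sketch that assumes one opacity value is shared by a whole row of bytes. BlendRow is a hypothetical helper, not an engine function. It builds two 256-entry lookup tables once per call, so the per-byte work is two table lookups and an add, and it gives the same result as the BlendBytes formula. It does not cover per-pixel alpha.

```pascal
program BlendRowSketch;
{$mode objfpc}{$H+}

{ Hypothetical helper (not an engine function): blends Count bytes of
  Source into Dest, using a single Opacity for the whole row. }
procedure BlendRow(const Dest, Source: PByte; const Count: Integer;
  const Opacity: Byte);
var
  DestTable, SourceTable: array [Byte] of Byte;
  I: Integer;
begin
  { Precompute both terms of the blend formula for every byte value. }
  for I := 0 to 255 do
  begin
    DestTable[I] := I * (255 - Opacity) div 255;
    SourceTable[I] := I * Opacity div 255;
  end;
  { Per byte: two lookups and one add. The sum never exceeds 255. }
  for I := 0 to Count - 1 do
    Dest[I] := DestTable[Dest[I]] + SourceTable[Source[I]];
end;

const
  Opacity = 128;
var
  Dest: array [0..3] of Byte = (0, 64, 200, 255);
  Source: array [0..3] of Byte = (255, 64, 10, 255);
  Expected: array [0..3] of Byte;
  I: Integer;
  AllMatch: Boolean;
begin
  { Reference values from the per-byte formula, computed before blending. }
  for I := 0 to 3 do
    Expected[I] := Dest[I] * (255 - Opacity) div 255 +
                   Source[I] * Opacity div 255;

  BlendRow(@Dest[0], @Source[0], Length(Dest), Opacity);

  AllMatch := True;
  for I := 0 to 3 do
    if Dest[I] <> Expected[I] then
      AllMatch := False;
  Writeln('Matches per-byte formula: ', AllMatch);
end.
```

The table setup costs 256 iterations per call, so this only pays off for rows much longer than that. Whether it beats the GPU path on a given device is something to measure, not assume.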