Implement Bounding Box Hierarchy in `Path` #6630

EVAST9919 · 2025-08-30T18:21:14Z

Split from #6613; serves as the base implementation of the BBH inside a Path. On its own, this PR only improves Path.ReceivePositionalInputAt performance.

Structure

When path vertices are passed to PathBBH, a binary tree is built from the bottom up. The bottom row holds the path segments and their bounding boxes; every parent node is then updated with the combined bounds of its left and right children.

Of course, not every path has 2^n segments, so the tree has to scale to any number of segments without containing empty nodes, which keeps the memory needed to store it down. All nodes are stored in a single array of roughly 2x the segment count, in the following order: [nodes at depth 0, left to right][nodes at depth 1, left to right]...[nodes at depth n-1, left to right][leaves (segments)]. The array is populated from end to start: leaves first, then their parents, then their parents, and so on. As a result, the whole tree is built in a single loop.
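To make the layout concrete, here is a minimal sketch in C#. It is not the PathBBH code from this PR: it assumes the classic iterative segment-tree layout (leaves at indices n..2n-1, children of node i at 2i and 2i+1), which shares the properties described above: no empty nodes for any segment count, roughly 2n entries, and a single loop that fills the array from end to start. `Box`, `FlatBoxTree`, `Build` and `CountLeafHits` are hypothetical names chosen for illustration.

```csharp
using System;

// Axis-aligned bounding box. Hypothetical stand-in for whatever bounds type PathBBH uses.
public readonly record struct Box(float MinX, float MinY, float MaxX, float MaxY)
{
    public static Box Union(Box a, Box b) => new Box(
        Math.Min(a.MinX, b.MinX), Math.Min(a.MinY, b.MinY),
        Math.Max(a.MaxX, b.MaxX), Math.Max(a.MaxY, b.MaxY));

    public bool Contains(float x, float y) =>
        x >= MinX && x <= MaxX && y >= MinY && y <= MaxY;
}

public static class FlatBoxTree
{
    // Assumed layout: index 0 unused, internal nodes at [1, n), leaves at [n, 2n).
    public static Box[] Build(ReadOnlySpan<Box> leaves)
    {
        int n = leaves.Length;
        var nodes = new Box[2 * n];

        // Leaves go at the end of the array.
        leaves.CopyTo(nodes.AsSpan(n));

        // Walk backwards so both children are already filled when a parent is visited.
        for (int i = n - 1; i >= 1; i--)
            nodes[i] = Box.Union(nodes[2 * i], nodes[2 * i + 1]);

        return nodes;
    }

    // Counts the leaves whose box contains the point, skipping every subtree
    // whose combined box does not contain it.
    public static int CountLeafHits(Box[] nodes, float x, float y, int i = 1)
    {
        int n = nodes.Length / 2;

        if (n == 0 || !nodes[i].Contains(x, y))
            return 0;

        if (i >= n)
            return 1; // reached a leaf

        return CountLeafHits(nodes, x, y, 2 * i) + CountLeafHits(nodes, x, y, 2 * i + 1);
    }
}
```

In a real hit test, each reported leaf would still need an exact distance check against its segment; the tree only prunes segments whose boxes cannot contain the point.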

`Path.SetVertices` benchmark

master:

Method	Mean	Error	StdDev	Allocated
Compute100Segments	592.5 ns	1.81 ns	1.69 ns	-
Compute1KSegments	5,887.3 ns	110.78 ns	98.20 ns	-
Compute10KSegments	59,171.8 ns	127.50 ns	113.03 ns	-
Compute100KSegments	569,454.3 ns	2,341.02 ns	2,075.25 ns	-
Compute1MSegments	5,895,614.8 ns	22,614.86 ns	20,047.48 ns	-

pr:

Method	Mean	Error	StdDev	Allocated
Compute100Segments	2.234 us	0.0027 us	0.0021 us	56 B
Compute1KSegments	22.013 us	0.0453 us	0.0424 us	56 B
Compute10KSegments	365.738 us	2.3118 us	2.0493 us	56 B
Compute100KSegments	4,013.083 us	7.9266 us	7.4145 us	-
Compute1MSegments	41,526.195 us	65.9597 us	61.6988 us	-

pr (after 06d895d):

Method	Mean	Error	StdDev	Gen0	Allocated
Compute100Segments	1.508 us	0.0053 us	0.0049 us	0.0019	56 B
Compute1KSegments	15.007 us	0.0308 us	0.0288 us	-	56 B
Compute10KSegments	152.082 us	0.5901 us	0.5231 us	-	56 B
Compute100KSegments	1,535.200 us	2.7969 us	2.6163 us	-	57 B
Compute1MSegments	21,892.779 us	80.7825 us	63.0697 us	-	-

The PR timings are about 2.5x slower than master. More micro-optimisations could improve this further; the target would be 2x, given that the number of nodes processed is ~2x the segment count. It's worth noting, though, that on master (with snaking sliders) these timings are paid every frame, whereas with #6613 they are paid only once, after which only the timings from the Path.SetStartProgress table apply (which are basically nothing).

`Path.ReceivePositionalInputAt` benchmark

master:

Method	Mean	Error	StdDev	Allocated
Contains100	302.0 ns	0.63 ns	0.56 ns	-
Contains1K	2,906.9 ns	12.67 ns	11.85 ns	-
Contains10K	76,607.3 ns	267.95 ns	250.64 ns	-
Contains100K	868,608.0 ns	2,824.05 ns	2,641.61 ns	-
Contains1M	8,683,885.7 ns	106,430.14 ns	99,554.82 ns	-

pr:

Method	Mean	Error	StdDev	Allocated
Contains100	195.1 ns	0.20 ns	0.19 ns	-
Contains1K	289.8 ns	0.21 ns	0.20 ns	-
Contains10K	444.2 ns	0.41 ns	0.37 ns	-
Contains100K	594.2 ns	1.17 ns	1.04 ns	-
Contains1M	945.1 ns	2.87 ns	2.54 ns	-

Input performance showcase

master	pr
https://github.com/user-attachments/assets/32cfe1e8-d9b3-4ace-a200-8ab1f85179b3	https://github.com/user-attachments/assets/f46a9a52-e76c-40df-abee-59fb23ead737

smoogipoo

Wanna resolve conflicts here and we can try getting this merged in?

smoogipoo · 2025-11-04T03:08:10Z

osu.Framework/Utils/MathUtils.cs

+        public static float BranchlessMin(float value1, float value2)
+        {
+            int b = Convert.ToInt32(value1 < value2);
+            return b * value1 + (1 - b) * value2;
+        }
+
+        public static float BranchlessMax(float value1, float value2)
+        {
+            int b = Convert.ToInt32(value1 > value2);
+            return b * value1 + (1 - b) * value2;
+        }


I don't think this is doing much, as it's still branching internally: https://source.dot.net/#System.Private.CoreLib/src/libraries/System.Private.CoreLib/src/System/Convert.cs,992

Would probably rather not do this and leave it to the JIT to hopefully do things correctly.

https://sharplab.io/#v2:C4LghgzgtgPgAgJgIwFgBQ6BmAbA9mYAAgEFCBeQgJTADsATXKAOgGUALMAJwFM6mA5bgA9gLAJY0A5tm4AKAJQBuLHgKEAQuSq0Gzdl14DhoidLlL0K/EQCyC9AG90hF4TgB2QjYJsmNibLEADQaFmgAvpZoONaE/PZoTmiubp7qnLQAxmwyEBD+NIEh6mGRGOVwSABshDFq6Vk53HkFsnVEAG5g2ACu3Egh7YRdvdwI8o7OrhJEAEZaAMK4NB3cnMBMACq4AJI0wADMCLIjfUiEADzD3X3jysmuHoTzAFTXo+cA1ISy5wC0z3khDepzG93CQA=

Program.<<Main>$>g__M|0_0(<>c__DisplayClass0_0 ByRef)
    L0000: push eax
    L0001: vmovss xmm0, [ecx+4]
    L0006: vmovss xmm1, [ecx]
    L000a: vrangess xmm2, xmm1, xmm0, 4
    L0011: vmovups xmm3, [Program.<<Main>$>g__M|0_0(<>c__DisplayClass0_0 ByRef)]
    L0019: vfixupimmss xmm1, xmm0, xmm3, 0
    L0020: vfixupimmss xmm2, xmm1, xmm3, 0
    L0027: vmovss [esp], xmm2
    L002c: fld st, dword ptr [esp]
    L002f: pop ecx
    L0030: ret

Program.<<Main>$>g__N|0_1(<>c__DisplayClass0_0 ByRef)
    L0000: push eax
    L0001: push dword ptr [ecx]
    L0003: push dword ptr [ecx+4]
    L0006: call 0x2ad60048
    L000b: fstp dword ptr [esp], st
    L000e: vmovss xmm0, [esp]
    L0013: vmovss [esp], xmm0
    L0018: fld st, dword ptr [esp]
    L001b: pop ecx
    L001c: ret
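Separately from the codegen question, the arithmetic select used by BranchlessMin in the diff above is not a drop-in replacement for Math.Min on every input. The sketch below copies that helper into a hypothetical `MinComparison` class and shows one case where the two disagree: multiplying an infinite operand by zero yields NaN under IEEE 754. Whether PathBBH ever feeds infinite bounds into these helpers is an assumption, not something the PR states.

```csharp
using System;

public static class MinComparison
{
    // Same arithmetic select as the BranchlessMin helper in the diff.
    public static float BranchlessMin(float value1, float value2)
    {
        int b = Convert.ToInt32(value1 < value2);
        return b * value1 + (1 - b) * value2;
    }

    // True when both implementations agree (treating NaN as equal to NaN).
    public static bool Agrees(float value1, float value2)
    {
        float expected = Math.Min(value1, value2);
        float actual = BranchlessMin(value1, value2);
        return expected.Equals(actual);
    }
}
```

For finite inputs such as (1f, 2f) the two agree. For (1f, float.PositiveInfinity) the comparison selects value1, but the unselected term is 0 * infinity, which is NaN, so the sum is NaN while Math.Min returns 1.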

EVAST9919 · 2026-01-05

I don't mind reverting since I'm not a huge fan of how it looks either. The only reason I pushed this is that it does affect performance, and quite noticeably (hence the benchmarks in the OP before and after the commit). But sure, let's revert for now and maybe think about further improvements later.

smoogipoo · 2026-01-05

I also made a microbenchmark for this, and the results are documented in the file: smoogipoo/Benchmarks@7004938

There's going to be more to this, and it likely has to do with CPU & branch prediction.

I'd be interested to see what the results are for you with that benchmark, but in general I would always assume the JIT is knowledgeable of tricks like this.

EVAST9919 · 2026-01-05

Yeah, with your benchmark my results are the same as well:

Method	Job	Runtime	Mean	Error	StdDev
Min	.NET 10.0	.NET 10.0	7.134 us	0.0021 us	0.0019 us
BranchlessMin	.NET 10.0	.NET 10.0	7.161 us	0.0480 us	0.0449 us
Min	.NET 8.0	.NET 8.0	7.134 us	0.0041 us	0.0032 us
BranchlessMin	.NET 8.0	.NET 8.0	7.132 us	0.0033 us	0.0026 us

osu.Framework/Graphics/Lines/PathBBH.cs

smoogipoo · 2026-01-06T06:23:11Z

I'm struggling a little bit with the algorithm here because it looks like you've made a conscious effort to build things in reverse order. Is there a reason you couldn't build the binary tree left-to-right?

That would also play better with CPU prefetching.

EVAST9919 added 24 commits July 9, 2025 03:49

Implement Path BBH

d1e25e7

Implement native start/end progress

a80cf02

Don't copy segments on range change

fe29508

Merge branch 'master' into path-bbh

d179b23

Expose PathBBH.CurvePositionAt

a569eaa

Trim array allocated by the tree

2ad3811

Avoid array copy if new segment count is smaller

ec87d99

Remove not needed checks

eaaebee

Make node bounds non-nullable

39865b4

Cleanup pass

64a40b4

Use ArrayPool for tree array

739adaf

Add benchmarks for tree creation and progress update

e23c7a1

Add xmldoc

3fc26e4

Minor cleanup pass

61b60fe

Use BitOperations.RoundUpToPowerOf2 instead of custom implementation

17ca24f

Cleanup bbh benchmarks

aadfe8a

Add BenchmarkPathSegmentCreation

18200b3

Make PathBBH IDisposable

c88baef

Add BenchmarkPathContains

1683f52

Remove duplicate benchmark

2ee5bdf

Rework bounding box collection

e952218

Implement FastMin and FastMax in hot paths

06d895d

Merge branch 'master' into path-bbh

f18fa94

Remove start/end progress logic

dbba05e

pull-request-size bot added the size/XL label Aug 30, 2025

EVAST9919 mentioned this pull request Aug 30, 2025

Implement performant Start/End progress for Path using BBH #6613

Draft

2 tasks

EVAST9919 added 4 commits September 1, 2025 21:33

Add more xmldoc

5893541

Merge branch 'master' into path-bbh-no-progress

f60f06f

Fix incorrect segment point positions

5bfff1f

Regressed with ppy#6638

Adjust naming

d744496

EVAST9919 added 3 commits September 16, 2025 22:12

Improve nodes array renting

8955393

Merge branch 'master' into path-bbh-no-progress

34bfd5d

Avoid re-adding same segments in the draw node

973d464

peppy requested a review from smoogipoo October 16, 2025 07:29

EVAST9919 added 4 commits November 7, 2025 03:39

Merge branch 'master' into path-bbh-no-progress

f946154

Merge branch 'master' into path-bbh-no-progress

20c9ddc

Merge branch 'master' into path-bbh-no-progress

f8c2a58

Simplify test scene boxes drawing

3429d37

smoogipoo requested changes Jan 5, 2026


EVAST9919 added 3 commits January 5, 2026 16:04

Remove custom min/max implementation

7ee8827

Improve PathBBH disposal

d5b0575

Merge branch 'master' into path-bbh-no-progress

1771064

EVAST9919 requested a review from smoogipoo January 5, 2026 13:14

smoogipoo added 2 commits January 6, 2026 15:12

Add disposal check

fdc595c

Remove unnecessary null check

1ee4eb1


