Buffer mapping patterns
-----------------------

There are two main strategies the driver has for CPU access to GL buffer
objects. One is that the GL calls allocate temporary storage and blit to the GPU
at
``glBufferSubData()``/``glBufferData()``/``glFlushMappedBufferRange()``/``glUnmapBuffer()``
time. This makes it easy to match the ordering GL requires, since each upload
executes in sequence with the surrounding draws. However, this may be more
costly than the second strategy, direct mapping of the GL BO, on some
platforms, and is essentially not available to tiling GPUs (since tiling
involves running through the command stream multiple times). Thus, GL has
additional interfaces that let apps access memory directly while avoiding
implicit blocking on the GPU rendering from those BOs.

Rendering engines have a variety of knobs to set on those GL interfaces for data
upload, and as a whole they seem to take just about every path available. Let's
look at some examples to see how they might constrain GL driver buffer upload
behavior.

Portal 2
========

.. code-block:: console

    1030842 glXSwapBuffers(dpy = 0x82a8000, drawable = 20971540)
    1030876 glBufferDataARB(target = GL_ELEMENT_ARRAY_BUFFER, size = 65536, data = NULL, usage = GL_DYNAMIC_DRAW)
    1030877 glBufferSubData(target = GL_ELEMENT_ARRAY_BUFFER, offset = 0, size = 576, data = blob(576))
    1030896 glDrawRangeElementsBaseVertex(mode = GL_TRIANGLES, start = 0, end = 526, count = 252, type = GL_UNSIGNED_SHORT, indices = NULL, basevertex = 0)
    1030915 glDrawRangeElementsBaseVertex(mode = GL_TRIANGLES, start = 0, end = 19657, count = 36, type = GL_UNSIGNED_SHORT, indices = 0x1f8, basevertex = 0)
    1030917 glBufferDataARB(target = GL_ARRAY_BUFFER, size = 1572864, data = NULL, usage = GL_DYNAMIC_DRAW)
    1030918 glBufferSubData(target = GL_ARRAY_BUFFER, offset = 0, size = 128, data = blob(128))
    1030919 glBufferSubData(target = GL_ELEMENT_ARRAY_BUFFER, offset = 576, size = 12, data = blob(12))
    1030936 glDrawRangeElementsBaseVertex(mode = GL_TRIANGLES, start = 0, end = 3, count = 6, type = GL_UNSIGNED_SHORT, indices = 0x240, basevertex = 0)
    1030937 glBufferSubData(target = GL_ARRAY_BUFFER, offset = 128, size = 128, data = blob(128))
    1030938 glBufferSubData(target = GL_ELEMENT_ARRAY_BUFFER, offset = 588, size = 12, data = blob(12))
    1030940 glDrawRangeElementsBaseVertex(mode = GL_TRIANGLES, start = 4, end = 7, count = 6, type = GL_UNSIGNED_SHORT, indices = 0x24c, basevertex = 0)
    [... repeated draws at increasing offsets]
    1033097 glXSwapBuffers(dpy = 0x82a8000, drawable = 20971540)

From this sequence, we can see that it is important that the driver either
implement ``glBufferSubData()`` as a blit from a streaming uploader in sequence with
the ``glDraw*()`` calls (a common behavior for non-tiled GPUs, particularly those with
dedicated memory), or that it:

1) Track the valid range of the buffer so that it doesn't have to flush the draws
   and synchronize on each following ``glBufferSubData()``.

2) Reallocate the buffer storage on ``glBufferData()`` so that the first
   ``glBufferSubData()`` of the frame doesn't stall on the last frame's
   rendering completing.

You can't just empty your valid range on ``glBufferData()`` unless you know that
the GPU access from the previous frame has completed. This pattern of
incrementing ``glBufferSubData()`` offsets interleaved with draws from that data
is common among newer Valve games.
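
As a rough illustration of the two rules above, the following C sketch models a single buffer with a "bytes valid from 0" counter and a busy flag. The ``buf_state`` structure and the ``buf_*`` helpers are hypothetical names invented for this example; they are not Mesa or Gallium interfaces, and a real driver would track busyness through its fence or batch machinery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-buffer bookkeeping, not Mesa's actual structures. */
struct buf_state {
    size_t size;       /* size of the current storage in bytes */
    size_t valid_end;  /* bytes [0, valid_end) have been written */
    bool gpu_busy;     /* submitted draws may still read this storage */
};

/* glBufferData(size, NULL): returns true if fresh storage had to be
 * allocated, either because the size changed or the GPU may still be
 * reading the old storage. */
static bool buf_data_null(struct buf_state *b, size_t size)
{
    bool reallocated = b->gpu_busy || size != b->size;
    b->size = size;
    b->valid_end = 0;     /* safe: storage is either new or known idle */
    b->gpu_busy = false;
    return reallocated;
}

/* A draw that sources data from the buffer. */
static void buf_draw(struct buf_state *b)
{
    b->gpu_busy = true;
}

/* glBufferSubData(): returns true if the write must first wait for the GPU. */
static bool buf_sub_data(struct buf_state *b, size_t offset, size_t len)
{
    assert(offset <= b->size && len <= b->size - offset);
    /* Only bytes inside the valid range can be in use by earlier draws. */
    bool must_sync = b->gpu_busy && offset < b->valid_end;
    if (must_sync)
        b->gpu_busy = false;  /* after waiting, the storage is idle */
    if (offset + len > b->valid_end)
        b->valid_end = offset + len;
    return must_sync;
}
```

Replaying the element array uploads from the trace (offsets 0, 576, 588) through ``buf_sub_data()`` with a ``buf_draw()`` between them never reports a sync, because each write lands past the bytes written so far.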

.. code-block:: console

    [ during setup ]

    679259 glGenBuffersARB(n = 1, buffers = &1314)
    679260 glBindBufferARB(target = GL_ELEMENT_ARRAY_BUFFER, buffer = 1314)
    679261 glBufferDataARB(target = GL_ELEMENT_ARRAY_BUFFER, size = 3072, data = NULL, usage = GL_STATIC_DRAW)
    679264 glMapBufferRange(target = GL_ELEMENT_ARRAY_BUFFER, offset = 0, length = 3072, access = GL_MAP_WRITE_BIT | GL_MAP_FLUSH_EXPLICIT_BIT) = 0xd7384000
    679269 glFlushMappedBufferRange(target = GL_ELEMENT_ARRAY_BUFFER, offset = 0, length = 3072)
    679270 glUnmapBuffer(target = GL_ELEMENT_ARRAY_BUFFER) = GL_TRUE

    [... setup of other buffers on this binding point]

    679343 glBindBufferARB(target = GL_ELEMENT_ARRAY_BUFFER, buffer = 1314)
    679344 glMapBufferRange(target = GL_ELEMENT_ARRAY_BUFFER, offset = 0, length = 768, access = GL_MAP_WRITE_BIT | GL_MAP_FLUSH_EXPLICIT_BIT) = 0xd7384000
    679346 glFlushMappedBufferRange(target = GL_ELEMENT_ARRAY_BUFFER, offset = 0, length = 768)
    679347 glUnmapBuffer(target = GL_ELEMENT_ARRAY_BUFFER) = GL_TRUE
    679348 glMapBufferRange(target = GL_ELEMENT_ARRAY_BUFFER, offset = 768, length = 768, access = GL_MAP_WRITE_BIT | GL_MAP_FLUSH_EXPLICIT_BIT) = 0xd7384300
    679350 glFlushMappedBufferRange(target = GL_ELEMENT_ARRAY_BUFFER, offset = 0, length = 768)
    679351 glUnmapBuffer(target = GL_ELEMENT_ARRAY_BUFFER) = GL_TRUE
    679352 glMapBufferRange(target = GL_ELEMENT_ARRAY_BUFFER, offset = 1536, length = 768, access = GL_MAP_WRITE_BIT | GL_MAP_FLUSH_EXPLICIT_BIT) = 0xd7384600
    679354 glFlushMappedBufferRange(target = GL_ELEMENT_ARRAY_BUFFER, offset = 0, length = 768)
    679355 glUnmapBuffer(target = GL_ELEMENT_ARRAY_BUFFER) = GL_TRUE
    679356 glMapBufferRange(target = GL_ELEMENT_ARRAY_BUFFER, offset = 2304, length = 768, access = GL_MAP_WRITE_BIT | GL_MAP_FLUSH_EXPLICIT_BIT) = 0xd7384900
    679358 glFlushMappedBufferRange(target = GL_ELEMENT_ARRAY_BUFFER, offset = 0, length = 768)
    679359 glUnmapBuffer(target = GL_ELEMENT_ARRAY_BUFFER) = GL_TRUE

    [... setup completes and we start drawing later]

    761845 glBindBufferARB(target = GL_ELEMENT_ARRAY_BUFFER, buffer = 1314)
    761846 glDrawRangeElementsBaseVertex(mode = GL_TRIANGLES, start = 0, end = 323, count = 384, type = GL_UNSIGNED_SHORT, indices = NULL, basevertex = 0)

This suggests that, for non-blitting drivers, resetting your "might be used on
the GPU" range after a stall could save a number of additional GPU stalls
during setup: once the driver has waited, the buffer is known to be idle, so the
following maps of the same buffer don't need to wait again.

Terraria
========

.. code-block:: console

    167581 glXSwapBuffers(dpy = 0x3004630, drawable = 25165844)

    167585 glBufferData(target = GL_ARRAY_BUFFER, size = 196608, data = NULL, usage = GL_STREAM_DRAW)
    167586 glBufferSubData(target = GL_ARRAY_BUFFER, offset = 0, size = 1728, data = blob(1728))
    167588 glDrawRangeElementsBaseVertex(mode = GL_TRIANGLES, start = 0, end = 71, count = 108, type = GL_UNSIGNED_SHORT, indices = NULL, basevertex = 0)
    167589 glBufferData(target = GL_ARRAY_BUFFER, size = 196608, data = NULL, usage = GL_STREAM_DRAW)
    167590 glBufferSubData(target = GL_ARRAY_BUFFER, offset = 0, size = 27456, data = blob(27456))
    167592 glDrawRangeElementsBaseVertex(mode = GL_TRIANGLES, start = 0, end = 7, count = 12, type = GL_UNSIGNED_SHORT, indices = NULL, basevertex = 0)
    167594 glDrawRangeElementsBaseVertex(mode = GL_TRIANGLES, start = 0, end = 3, count = 6, type = GL_UNSIGNED_SHORT, indices = NULL, basevertex = 8)
    167596 glDrawRangeElementsBaseVertex(mode = GL_TRIANGLES, start = 0, end = 3, count = 6, type = GL_UNSIGNED_SHORT, indices = NULL, basevertex = 12)
    [...]

In this game, we can see ``glBufferData()`` being called on the same array
buffer before each upload, to get new storage so that the following
``glBufferSubData()`` doesn't cause synchronization.

Don't Starve
============

.. code-block:: console

    7251917 glGenBuffers(n = 1, buffers = &115052)
    7251918 glBindBuffer(target = GL_ARRAY_BUFFER, buffer = 115052)
    7251919 glBufferData(target = GL_ARRAY_BUFFER, size = 144, data = blob(144), usage = GL_STREAM_DRAW)
    7251921 glBindBuffer(target = GL_ARRAY_BUFFER, buffer = 115052)
    7251928 glDrawArrays(mode = GL_TRIANGLES, first = 0, count = 6)
    7251930 glBindBuffer(target = GL_ARRAY_BUFFER, buffer = 114872)
    7251936 glDrawArrays(mode = GL_TRIANGLES, first = 0, count = 18)
    7251938 glGenBuffers(n = 1, buffers = &115053)
    7251939 glBindBuffer(target = GL_ARRAY_BUFFER, buffer = 115053)
    7251940 glBufferData(target = GL_ARRAY_BUFFER, size = 144, data = blob(144), usage = GL_STREAM_DRAW)
    7251942 glBindBuffer(target = GL_ARRAY_BUFFER, buffer = 115053)
    7251949 glDrawArrays(mode = GL_TRIANGLES, first = 0, count = 6)
    7251973 glXSwapBuffers(dpy = 0x86dd860, drawable = 20971540)
    [... drawing next frame]
    7252388 glDeleteBuffers(n = 1, buffers = &115052)
    7252389 glDeleteBuffers(n = 1, buffers = &115053)
    7252390 glXSwapBuffers(dpy = 0x86dd860, drawable = 20971540)

In this game we have a lot of tiny ``glBufferData()`` calls, suggesting that we
could see working set wins and possibly CPU overhead reduction by packing small
GL buffers in the same BO. Interestingly, the deletes of the temporary buffers
always happen at the end of the next frame.
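
One way to picture packing small GL buffers into a shared BO is a bump allocator over a fixed-size slab. The sketch below is only an illustration under assumed parameters: ``SLAB_SIZE``, ``SLAB_ALIGN``, ``struct slab`` and both helpers are hypothetical, and it assumes the caller releases a sub-allocation only once the GPU has finished with it.

```c
#include <assert.h>
#include <stddef.h>

#define SLAB_SIZE  4096  /* assumed size of the shared BO */
#define SLAB_ALIGN 64    /* assumed alignment for each sub-allocation */

/* Hypothetical slab state: a bump pointer plus a count of live users. */
struct slab {
    size_t used;    /* bytes handed out so far */
    unsigned live;  /* sub-allocations not yet released */
};

/* Returns the byte offset of the new sub-allocation inside the slab,
 * or (size_t)-1 if the request does not fit. */
static size_t slab_alloc(struct slab *s, size_t size)
{
    size_t start = (s->used + SLAB_ALIGN - 1) & ~(size_t)(SLAB_ALIGN - 1);
    if (size > SLAB_SIZE || start > SLAB_SIZE - size)
        return (size_t)-1;
    s->used = start + size;
    s->live++;
    return start;
}

/* Release one sub-allocation; the slab is recycled once nothing is live.
 * The caller must only do this after the GPU is done with the data. */
static void slab_release(struct slab *s)
{
    assert(s->live > 0);
    if (--s->live == 0)
        s->used = 0;
}
```

With these assumed sizes, the two 144-byte uploads from the trace would land at offsets 0 and 192 of one slab instead of occupying two separate BOs.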

Euro Truck Simulator
====================

.. code-block:: console

    [usage of VBO 14,15]
    [...]
    885199 glXSwapBuffers(dpy = 0x379a3e0, drawable = 20971527)
    885203 glInvalidateBufferData(buffer = 14)
    885204 glInvalidateBufferData(buffer = 15)
    [...]
    889330 glXSwapBuffers(dpy = 0x379a3e0, drawable = 20971527)
    889334 glInvalidateBufferData(buffer = 12)
    889335 glInvalidateBufferData(buffer = 16)
    [...]
    893461 glXSwapBuffers(dpy = 0x379a3e0, drawable = 20971527)
    893462 glClientWaitSync(sync = 0x77eee10, flags = 0x0, timeout = 0) = GL_ALREADY_SIGNALED
    893463 glDeleteSync(sync = 0x780a630)
    893464 glFenceSync(condition = GL_SYNC_GPU_COMMANDS_COMPLETE, flags = 0) = 0x78ec730
    893465 glInvalidateBufferData(buffer = 13)
    893466 glInvalidateBufferData(buffer = 17)
    893505 glBindBuffer(target = GL_COPY_READ_BUFFER, buffer = 14)
    893506 glMapBufferRange(target = GL_COPY_READ_BUFFER, offset = 0, length = 788, access = GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT | GL_MAP_UNSYNCHRONIZED_BIT) = 0x7b034efd1000
    893508 glUnmapBuffer(target = GL_COPY_READ_BUFFER) = GL_TRUE
    893509 glBindBuffer(target = GL_COPY_READ_BUFFER, buffer = 15)
    893510 glMapBufferRange(target = GL_COPY_READ_BUFFER, offset = 0, length = 32, access = GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT | GL_MAP_UNSYNCHRONIZED_BIT) = 0x7b034e5df000
    893512 glUnmapBuffer(target = GL_COPY_READ_BUFFER) = GL_TRUE
    893532 glBindVertexBuffers(first = 0, count = 2, buffers = {10, 15}, offsets = {0, 0}, strides = {52, 16})
    893552 glDrawElementsInstancedBaseVertex(mode = GL_TRIANGLES, count = 18, type = GL_UNSIGNED_SHORT, indices = 0x13f280, instancecount = 1, basevertex = 25131)
    893609 glDrawArrays(mode = GL_TRIANGLES, first = 0, count = 6)
    893732 glBindVertexBuffers(first = 0, count = 1, buffers = &14, offsets = &0, strides = &48)
    893733 glBindBuffer(target = GL_ELEMENT_ARRAY_BUFFER, buffer = 14)
    893744 glDrawElementsBaseVertex(mode = GL_TRIANGLES, count = 6, type = GL_UNSIGNED_SHORT, indices = 0xf0, basevertex = 0)
    893759 glDrawElementsBaseVertex(mode = GL_TRIANGLES, count = 24, type = GL_UNSIGNED_SHORT, indices = 0x2e0, basevertex = 6)
    893786 glDrawElementsBaseVertex(mode = GL_TRIANGLES, count = 600, type = GL_UNSIGNED_SHORT, indices = 0xe87b0, basevertex = 21515)
    893822 glDrawArrays(mode = GL_TRIANGLES, first = 0, count = 6)
    893845 glBindBuffer(target = GL_COPY_READ_BUFFER, buffer = 14)
    893846 glMapBufferRange(target = GL_COPY_READ_BUFFER, offset = 788, length = 788, access = GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_RANGE_BIT | GL_MAP_UNSYNCHRONIZED_BIT) = 0x7b034efd1314
    893848 glUnmapBuffer(target = GL_COPY_READ_BUFFER) = GL_TRUE
    893886 glDrawElementsInstancedBaseVertex(mode = GL_TRIANGLES, count = 18, type = GL_UNSIGNED_SHORT, indices = 0x13f280, instancecount = 1, basevertex = 25131)
    893943 glDrawArrays(mode = GL_TRIANGLES, first = 0, count = 6)

At the start of this frame, buffers 14 and 15 haven't been used in the previous 2
frames, and the :ext:`GL_ARB_sync` fence has ensured that the GPU has at least started
frame n-1 as the CPU starts the current frame. The first map is ``offset = 0,
INVALIDATE_BUFFER | UNSYNCHRONIZED``, which suggests that the driver should
reallocate storage for the mapping even in the ``UNSYNCHRONIZED`` case, except
that the buffer is definitely going to be idle, making reallocation unnecessary
(you may need to empty your valid range, though, to prevent unnecessary batch
flushes).

Also note the use of a totally unrelated binding point for the mapping of the
vertex array -- you can't effectively use it as a hint for any buffer placement
in memory. The game does also use ``glCopyBufferSubData()``, but only on a
different buffer.
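
One possible reading of the ``INVALIDATE_BUFFER | UNSYNCHRONIZED`` discussion above can be written as a small decision function. This is a sketch under assumptions rather than what any particular driver does: the ``SKETCH_MAP_*`` bits stand in for the GL access flags, and ``enum map_action`` and ``choose_map_action()`` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the GL_MAP_* access bits. */
#define SKETCH_MAP_UNSYNCHRONIZED    (1u << 0)
#define SKETCH_MAP_INVALIDATE_BUFFER (1u << 1)

enum map_action {
    MAP_DIRECT,       /* hand out a pointer to the existing storage */
    MAP_REALLOCATE,   /* give the buffer fresh storage, then map that */
    MAP_WAIT_FOR_GPU  /* stall until the GPU is done, then map */
};

/* Decide how to service a write mapping. valid_end is the driver's
 * "bytes valid from 0" counter for this buffer. */
static enum map_action choose_map_action(unsigned flags, bool gpu_busy,
                                         size_t *valid_end)
{
    assert(valid_end != NULL);
    if (flags & SKETCH_MAP_INVALIDATE_BUFFER) {
        /* Old contents are discarded either way, so the valid range
         * restarts from empty; only a busy buffer needs new storage. */
        *valid_end = 0;
        return gpu_busy ? MAP_REALLOCATE : MAP_DIRECT;
    }
    if (flags & SKETCH_MAP_UNSYNCHRONIZED)
        return MAP_DIRECT;  /* the app has promised not to race the GPU */
    return gpu_busy ? MAP_WAIT_FOR_GPU : MAP_DIRECT;
}
```

For the first map of buffer 14 in the trace, where the buffer is idle, this yields ``MAP_DIRECT`` and an emptied valid range, matching the "reallocation unnecessary" case.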

Plague Inc
==========

.. code-block:: console

    1640732 glXSwapBuffers(dpy = 0xb218f20, drawable = 23068674)
    1640733 glClientWaitSync(sync = 0xb4141430, flags = 0x0, timeout = 0) = GL_ALREADY_SIGNALED
    1640734 glDeleteSync(sync = 0xb4141430)
    1640735 glFenceSync(condition = GL_SYNC_GPU_COMMANDS_COMPLETE, flags = 0) = 0xb4141430

    1640780 glBindBuffer(target = GL_ARRAY_BUFFER, buffer = 78)
    1640787 glBindBuffer(target = GL_ELEMENT_ARRAY_BUFFER, buffer = 79)
    1640788 glDrawElements(mode = GL_TRIANGLES, count = 9636, type = GL_UNSIGNED_SHORT, indices = NULL)
    1640795 glDrawElements(mode = GL_TRIANGLES, count = 9636, type = GL_UNSIGNED_SHORT, indices = NULL)
    1640813 glBindBuffer(target = GL_COPY_WRITE_BUFFER, buffer = 1096)
    1640814 glMapBufferRange(target = GL_COPY_WRITE_BUFFER, offset = 0, length = 67584, access = GL_MAP_WRITE_BIT | GL_MAP_FLUSH_EXPLICIT_BIT | GL_MAP_UNSYNCHRONIZED_BIT) = 0xbfef4000
    1640815 glBindBuffer(target = GL_COPY_WRITE_BUFFER, buffer = 1091)
    1640816 glMapBufferRange(target = GL_COPY_WRITE_BUFFER, offset = 0, length = 12, access = GL_MAP_WRITE_BIT | GL_MAP_FLUSH_EXPLICIT_BIT | GL_MAP_UNSYNCHRONIZED_BIT) = 0xc3998000
    1640817 glBindBuffer(target = GL_COPY_WRITE_BUFFER, buffer = 1096)
    1640819 glFlushMappedBufferRange(target = GL_COPY_WRITE_BUFFER, offset = 0, length = 352)
    1640820 glUnmapBuffer(target = GL_COPY_WRITE_BUFFER) = GL_TRUE
    1640821 glBindBuffer(target = GL_COPY_WRITE_BUFFER, buffer = 1091)
    1640823 glFlushMappedBufferRange(target = GL_COPY_WRITE_BUFFER, offset = 0, length = 12)
    1640824 glUnmapBuffer(target = GL_COPY_WRITE_BUFFER) = GL_TRUE
    1640825 glBindBuffer(target = GL_ARRAY_BUFFER, buffer = 1096)
    1640831 glBindBuffer(target = GL_ELEMENT_ARRAY_BUFFER, buffer = 1091)
    1640832 glDrawElements(mode = GL_TRIANGLES, count = 6, type = GL_UNSIGNED_SHORT, indices = NULL)

    1640847 glBindBuffer(target = GL_COPY_WRITE_BUFFER, buffer = 1096)
    1640848 glMapBufferRange(target = GL_COPY_WRITE_BUFFER, offset = 352, length = 67584, access = GL_MAP_WRITE_BIT | GL_MAP_FLUSH_EXPLICIT_BIT | GL_MAP_UNSYNCHRONIZED_BIT) = 0xbfef4160
    1640849 glBindBuffer(target = GL_COPY_WRITE_BUFFER, buffer = 1091)
    1640850 glMapBufferRange(target = GL_COPY_WRITE_BUFFER, offset = 88, length = 12, access = GL_MAP_WRITE_BIT | GL_MAP_FLUSH_EXPLICIT_BIT | GL_MAP_UNSYNCHRONIZED_BIT) = 0xc3998058
    1640851 glBindBuffer(target = GL_COPY_WRITE_BUFFER, buffer = 1096)
    1640853 glFlushMappedBufferRange(target = GL_COPY_WRITE_BUFFER, offset = 0, length = 352)
    1640854 glUnmapBuffer(target = GL_COPY_WRITE_BUFFER) = GL_TRUE
    1640855 glBindBuffer(target = GL_COPY_WRITE_BUFFER, buffer = 1091)
    1640857 glFlushMappedBufferRange(target = GL_COPY_WRITE_BUFFER, offset = 0, length = 12)
    1640858 glUnmapBuffer(target = GL_COPY_WRITE_BUFFER) = GL_TRUE
    1640863 glDrawElementsBaseVertex(mode = GL_TRIANGLES, count = 6, type = GL_UNSIGNED_SHORT, indices = 0x58, basevertex = 4)

At the start of this frame, the VBOs haven't been used in about 6 frames, and
the :ext:`GL_ARB_sync` fence has ensured that the GPU has started frame n-1.

Note the use of ``glFlushMappedBufferRange()`` on a small fraction of the size
of the VBO -- it is important that a blitting driver make use of the flush
ranges when in explicit mode.
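
To make the cost difference concrete, the sketch below computes which bytes of the buffer a blitting driver would need to upload. It relies on the GL rule that the ``offset`` passed to ``glFlushMappedBufferRange()`` is relative to the start of the mapped range; ``struct region`` and the two helper functions are hypothetical names for this example.

```c
#include <assert.h>
#include <stddef.h>

/* A byte range expressed in buffer coordinates. */
struct region {
    size_t offset;
    size_t length;
};

/* Explicit-flush mode: translate a flush, whose offset is relative to the
 * start of the mapping, into the buffer range that must be uploaded. */
static struct region flush_to_buffer_region(size_t map_offset,
                                            size_t map_length,
                                            size_t flush_offset,
                                            size_t flush_length)
{
    assert(flush_offset <= map_length);
    assert(flush_length <= map_length - flush_offset);
    return (struct region){ map_offset + flush_offset, flush_length };
}

/* Without explicit flushing, the whole mapped range is uploaded at unmap. */
static struct region unmap_region(size_t map_offset, size_t map_length)
{
    return (struct region){ map_offset, map_length };
}
```

For the second map of buffer 1096 in the trace (``offset = 352, length = 67584``, flushed with ``offset = 0, length = 352``), the explicit path uploads 352 bytes starting at byte 352, where uploading the mapped range would copy all 67584 bytes.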

Darkest Dungeon
===============

.. code-block:: console

    938384 glXSwapBuffers(dpy = 0x377fcd0, drawable = 23068692)

    938385 glBindBuffer(target = GL_ARRAY_BUFFER, buffer = 2)
    938386 glBufferData(target = GL_ARRAY_BUFFER, size = 1048576, data = NULL, usage = GL_STREAM_DRAW)
    938511 glBindBuffer(target = GL_ARRAY_BUFFER, buffer = 2)
    938512 glMapBufferRange(target = GL_ARRAY_BUFFER, offset = 0, length = 1048576, access = GL_MAP_WRITE_BIT | GL_MAP_FLUSH_EXPLICIT_BIT | GL_MAP_UNSYNCHRONIZED_BIT) = 0x7a73fcaa7000
    938514 glFlushMappedBufferRange(target = GL_ARRAY_BUFFER, offset = 0, length = 512)
    938515 glUnmapBuffer(target = GL_ARRAY_BUFFER) = GL_TRUE
    938523 glBindBuffer(target = GL_ELEMENT_ARRAY_BUFFER, buffer = 1)
    938524 glBindBuffer(target = GL_ARRAY_BUFFER, buffer = 2)
    938525 glDrawElements(mode = GL_TRIANGLES, count = 24, type = GL_UNSIGNED_SHORT, indices = NULL)
    938527 glBindBuffer(target = GL_ARRAY_BUFFER, buffer = 2)
    938528 glMapBufferRange(target = GL_ARRAY_BUFFER, offset = 0, length = 1048576, access = GL_MAP_WRITE_BIT | GL_MAP_FLUSH_EXPLICIT_BIT | GL_MAP_UNSYNCHRONIZED_BIT) = 0x7a73fcaa7000
    938530 glFlushMappedBufferRange(target = GL_ARRAY_BUFFER, offset = 512, length = 512)
    938531 glUnmapBuffer(target = GL_ARRAY_BUFFER) = GL_TRUE
    938539 glBindBuffer(target = GL_ELEMENT_ARRAY_BUFFER, buffer = 1)
    938540 glBindBuffer(target = GL_ARRAY_BUFFER, buffer = 2)
    938541 glDrawElements(mode = GL_TRIANGLES, count = 24, type = GL_UNSIGNED_SHORT, indices = 0x30)
    [... more maps and draws at increasing offsets]

An interesting note for this game: after the initial ``glBufferData()`` in the
frame to reallocate the storage, it maps the whole buffer unsynchronized each
time, and just changes which region it flushes. The same GL buffer name is used
in every frame.
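
Because every map here covers the whole 1 MiB buffer, a driver that grew its valid range by the mapped range would consider the entire buffer written after the first upload. The sketch below contrasts that with growing the range by the flushed region only. ``struct valid_range`` and its helpers are hypothetical and only model the bookkeeping, not any driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tracker: bytes [0, valid_end) hold data the GPU may read. */
struct valid_range {
    size_t valid_end;
};

static void valid_range_add(struct valid_range *v, size_t offset, size_t len)
{
    assert(len <= (size_t)-1 - offset);  /* guard against overflow */
    if (offset + len > v->valid_end)
        v->valid_end = offset + len;
}

/* Record one map/flush/unmap cycle. With explicit flushing, only the
 * flushed bytes count as written; otherwise the whole mapping does. */
static void note_mapped_write(struct valid_range *v, bool flush_explicit,
                              size_t map_offset, size_t map_len,
                              size_t flush_offset, size_t flush_len)
{
    if (flush_explicit)
        valid_range_add(v, map_offset + flush_offset, flush_len);
    else
        valid_range_add(v, map_offset, map_len);
}
```

Feeding the two uploads from the trace through the explicit path leaves ``valid_end`` at 1024, so later writes beyond that point can still be treated as touching undefined data.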

Tabletop Simulator
==================

.. code-block:: console

    1287594 glXSwapBuffers(dpy = 0x3e10810, drawable = 23068692)
    1287595 glClientWaitSync(sync = 0x7abf554e37b0, flags = 0x0, timeout = 0) = GL_ALREADY_SIGNALED
    1287596 glDeleteSync(sync = 0x7abf554e37b0)
    1287597 glFenceSync(condition = GL_SYNC_GPU_COMMANDS_COMPLETE, flags = 0) = 0x7abf56647490

    1287614 glBindBuffer(target = GL_COPY_WRITE_BUFFER, buffer = 480)
    1287615 glMapBufferRange(target = GL_COPY_WRITE_BUFFER, offset = 0, length = 384, access = GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_RANGE_BIT | GL_MAP_FLUSH_EXPLICIT_BIT | GL_MAP_UNSYNCHRONIZED_BIT) = 0x7abf2e79a000
    1287642 glBindBuffer(target = GL_ARRAY_BUFFER, buffer = 614)
    1287650 glBindBuffer(target = GL_COPY_WRITE_BUFFER, buffer = 5)
    1287651 glBufferSubData(target = GL_COPY_WRITE_BUFFER, offset = 0, size = 1088, data = blob(1088))
    1287652 glBindBuffer(target = GL_ELEMENT_ARRAY_BUFFER, buffer = 615)
    1287653 glDrawElements(mode = GL_TRIANGLES, count = 1788, type = GL_UNSIGNED_SHORT, indices = NULL)
    [... more draw calls]
    1289055 glBindBuffer(target = GL_COPY_WRITE_BUFFER, buffer = 480)
    1289057 glFlushMappedBufferRange(target = GL_COPY_WRITE_BUFFER, offset = 0, length = 384)
    1289058 glUnmapBuffer(target = GL_COPY_WRITE_BUFFER) = GL_TRUE
    1289059 glBindBuffer(target = GL_ARRAY_BUFFER, buffer = 480)
    1289066 glDrawArrays(mode = GL_TRIANGLE_STRIP, first = 12, count = 4)
    1289068 glDrawArrays(mode = GL_TRIANGLE_STRIP, first = 8, count = 4)
    1289553 glXSwapBuffers(dpy = 0x3e10810, drawable = 23068692)

In this app, buffer 480 gets used like this every other frame. The :ext:`GL_ARB_sync`
fence ensures that frame n-1 has started on the GPU before CPU work starts on
the current frame, so the unsynchronized access to the buffers is safe.

Hollow Knight
=============

.. code-block:: console

    1873034 glXSwapBuffers(dpy = 0x28609d0, drawable = 23068692)
    1873035 glClientWaitSync(sync = 0x7b1a5ca6e130, flags = 0x0, timeout = 0) = GL_ALREADY_SIGNALED
    1873036 glDeleteSync(sync = 0x7b1a5ca6e130)
    1873037 glFenceSync(condition = GL_SYNC_GPU_COMMANDS_COMPLETE, flags = 0) = 0x7b1a5ca6e130
    1873038 glBindBuffer(target = GL_COPY_WRITE_BUFFER, buffer = 29)
    1873039 glMapBufferRange(target = GL_COPY_WRITE_BUFFER, offset = 0, length = 8640, access = GL_MAP_WRITE_BIT | GL_MAP_FLUSH_EXPLICIT_BIT | GL_MAP_UNSYNCHRONIZED_BIT) = 0x7b1a04c7e000
    1873040 glBindBuffer(target = GL_COPY_WRITE_BUFFER, buffer = 30)
    1873041 glMapBufferRange(target = GL_COPY_WRITE_BUFFER, offset = 0, length = 720, access = GL_MAP_WRITE_BIT | GL_MAP_FLUSH_EXPLICIT_BIT | GL_MAP_UNSYNCHRONIZED_BIT) = 0x7b1a07430000
    1873065 glBindBuffer(target = GL_COPY_WRITE_BUFFER, buffer = 29)
    1873067 glFlushMappedBufferRange(target = GL_COPY_WRITE_BUFFER, offset = 0, length = 8640)
    1873068 glUnmapBuffer(target = GL_COPY_WRITE_BUFFER) = GL_TRUE
    1873069 glBindBuffer(target = GL_COPY_WRITE_BUFFER, buffer = 30)
    1873071 glFlushMappedBufferRange(target = GL_COPY_WRITE_BUFFER, offset = 0, length = 720)
    1873072 glUnmapBuffer(target = GL_COPY_WRITE_BUFFER) = GL_TRUE
    1873073 glBindBuffer(target = GL_COPY_WRITE_BUFFER, buffer = 29)
    1873074 glMapBufferRange(target = GL_COPY_WRITE_BUFFER, offset = 8640, length = 576, access = GL_MAP_WRITE_BIT | GL_MAP_FLUSH_EXPLICIT_BIT | GL_MAP_UNSYNCHRONIZED_BIT) = 0x7b1a04c801c0
    1873075 glBindBuffer(target = GL_COPY_WRITE_BUFFER, buffer = 30)
    1873076 glMapBufferRange(target = GL_COPY_WRITE_BUFFER, offset = 720, length = 72, access = GL_MAP_WRITE_BIT | GL_MAP_FLUSH_EXPLICIT_BIT | GL_MAP_UNSYNCHRONIZED_BIT) = 0x7b1a074302d0
    1873077 glBindBuffer(target = GL_COPY_WRITE_BUFFER, buffer = 29)
    1873079 glFlushMappedBufferRange(target = GL_COPY_WRITE_BUFFER, offset = 0, length = 576)
    1873080 glUnmapBuffer(target = GL_COPY_WRITE_BUFFER) = GL_TRUE
    1873081 glBindBuffer(target = GL_COPY_WRITE_BUFFER, buffer = 30)
    1873083 glFlushMappedBufferRange(target = GL_COPY_WRITE_BUFFER, offset = 0, length = 72)
    1873084 glUnmapBuffer(target = GL_COPY_WRITE_BUFFER) = GL_TRUE
    1873085 glBindBuffer(target = GL_ARRAY_BUFFER, buffer = 29)
    1873096 glBindBuffer(target = GL_ELEMENT_ARRAY_BUFFER, buffer = 30)
    1873097 glDrawElementsBaseVertex(mode = GL_TRIANGLES, count = 36, type = GL_UNSIGNED_SHORT, indices = 0x2d0, basevertex = 240)

In this app, buffers 29 and 30 get used like this starting from offset 0 every
other frame. The :ext:`GL_ARB_sync` fence is used to make sure that the GPU has
reached the start of the previous frame before we go unsynchronized writing over
the n-2 frame's buffer.
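
The safety argument can be reduced to frame arithmetic. The sketch below assumes the pattern visible in the trace: the fence waited on at the start of frame n was inserted at the start of frame n-1, so once it signals, all GPU work for frames up to n-2 has completed. ``NUM_BUFFER_SETS`` and both helpers are hypothetical names, and the two-set alternation is inferred from the "every other frame" observation.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_BUFFER_SETS 2  /* assumed: buffers alternate every other frame */

/* Which set of streaming buffers a given frame writes into. */
static unsigned buffer_set_for_frame(unsigned long frame)
{
    return (unsigned)(frame % NUM_BUFFER_SETS);
}

/* After waiting on the fence inserted at the start of the previous frame,
 * GPU work recorded in frame `used_in` is complete if it is at least two
 * frames older than the frame the CPU is about to build. */
static bool gpu_done_with_frame(unsigned long used_in, unsigned long current)
{
    assert(used_in <= current);
    return current - used_in >= 2;
}
```

Frame 10 reuses the set last written in frame 8, and ``gpu_done_with_frame(8, 10)`` holds, which is why the unsynchronized maps cannot race the GPU. With only one set, frame 10 would overwrite data from frame 9, which may still be in flight.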

Borderlands 2
=============

.. code-block:: console

    3561998 glFlush()
    3562004 glXSwapBuffers(dpy = 0xbaf0f90, drawable = 23068705)
    3562006 glClientWaitSync(sync = 0x231c2ab0, flags = GL_SYNC_FLUSH_COMMANDS_BIT, timeout = 10000000000) = GL_ALREADY_SIGNALED
    3562007 glDeleteSync(sync = 0x231c2ab0)
    3562008 glFenceSync(condition = GL_SYNC_GPU_COMMANDS_COMPLETE, flags = 0) = 0x231aadc0

    3562050 glBindBufferARB(target = GL_ARRAY_BUFFER, buffer = 1193)
    3562051 glMapBufferRange(target = GL_ARRAY_BUFFER, offset = 0, length = 1792, access = GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT) = 0xde056000
    3562053 glUnmapBufferARB(target = GL_ARRAY_BUFFER) = GL_TRUE
    3562054 glBindBufferARB(target = GL_ARRAY_BUFFER, buffer = 1194)
    3562055 glMapBufferRange(target = GL_ARRAY_BUFFER, offset = 0, length = 1280, access = GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT) = 0xd9426000
    3562057 glUnmapBufferARB(target = GL_ARRAY_BUFFER) = GL_TRUE
    [... unrelated draws]
    3563051 glBindBufferARB(target = GL_ARRAY_BUFFER, buffer = 1193)
    3563064 glBindBufferARB(target = GL_ELEMENT_ARRAY_BUFFER, buffer = 875)
    3563065 glDrawElementsInstancedARB(mode = GL_TRIANGLES, count = 72, type = GL_UNSIGNED_SHORT, indices = NULL, instancecount = 28)

The :ext:`GL_ARB_sync` fence ensures that the GPU has started frame n-1 before the CPU
starts on the current frame.

This sequence of buffer uploads appears in each frame with the same buffer
names, so you do need to handle the ``GL_MAP_INVALIDATE_BUFFER_BIT`` as a
reallocate if the buffer is GPU-busy (it wasn't in this trace capture) to avoid
stalls on the n-1 frame completing.

Note that this is just one small buffer. Most of the vertex data goes through a
``glBufferSubData()``/``glDraw*()`` path with the VBO used across multiple
frames, with a ``glBufferData()`` when needing to wrap.

Buffer mapping conclusions
--------------------------

* Non-blitting drivers must track the valid range of a freshly allocated buffer
  as it gets uploaded in ``pipe_transfer_map()`` and avoid stalling on the GPU
  when mapping an undefined portion of the buffer when ``glBufferSubData()`` is
  interleaved with drawing.

* Non-blitting drivers must reallocate storage on ``glBufferData(NULL)`` so that
  the following ``glBufferSubData()`` won't stall. That ``glBufferData(NULL)``
  call will appear in the driver as an ``invalidate_resource()`` call if
  ``PIPE_CAP_INVALIDATE_BUFFER`` is available. (If that flag is not set, then
  mesa/st will create a new ``pipe_resource`` for you.) Storage reallocation may
  be skipped if you for some reason know that the buffer is idle, in which case
  you can just empty the valid region.

* Blitting drivers must use the ``transfer_flush_region()`` region
  instead of the mapped range when ``PIPE_MAP_FLUSH_EXPLICIT`` is set, to avoid
  blitting too much data. (When that bit is unset, you just blit the whole
  mapped range at unmap time.)

* Buffer valid range tracking in non-blitting drivers must use the
  ``transfer_flush_region()`` region instead of the mapped range when
  ``PIPE_MAP_FLUSH_EXPLICIT`` is set, to avoid excess stalls.

* Buffer valid range tracking doesn't need to be fancy: "number of bytes
  valid starting from 0" is sufficient for all examples found.

* Use the ``util_debug_callback`` to report stalls on buffer mapping to ease
  debugging.

* Buffer binding points are not useful for tuning buffer placement (see all the
  ``GL_COPY_WRITE_BUFFER`` instances in the traces); you have to track the
  actual usage history of a GL BO name. mesa/st does this for optimizing its
  state updates on reallocation in the ``!PIPE_CAP_INVALIDATE_BUFFER`` case, and
  if you set ``PIPE_CAP_INVALIDATE_BUFFER`` then you have to flag your own
  internal state updates (VBO addresses, XFB addresses, texture buffer
  addresses, etc.) on reallocation based on usage history.