2024-02-20 102a0743326a03cd1a1202ceda21e175b7d3575c
kernel/Documentation/driver-api/dma-buf.rst
@@ -11,7 +11,7 @@
 The three main components of this are: (1) dma-buf, representing a
 sg_table and exposed to userspace as a file descriptor to allow passing
 between devices, (2) fence, which provides a mechanism to signal when
-one device as finished access, and (3) reservation, which manages the
+one device has finished access, and (3) reservation, which manages the
 shared or exclusive fence(s) associated with the buffer.
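
For orientation, here is a minimal importer-side sketch of the flow described
above: a buffer arrives as a file descriptor, gets attached to a device, and
is mapped to an sg_table. ``fd``, ``dev`` and the elided error handling are
illustrative assumptions; the dma_buf_* calls are the kernel's real API.

.. code-block:: c

   /* Importer sketch: fd and dev are assumed; no error handling. */
   struct dma_buf *dmabuf = dma_buf_get(fd);
   struct dma_buf_attachment *attach = dma_buf_attach(dmabuf, dev);
   struct sg_table *sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);

   /* ... device DMA runs against the sg_table ... */

   dma_buf_unmap_attachment(attach, sgt, DMA_BIDIRECTIONAL);
   dma_buf_detach(dmabuf, attach);
   dma_buf_put(dmabuf);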

 Shared DMA Buffers
@@ -31,7 +31,7 @@
  - implements and manages operations in :c:type:`struct dma_buf_ops
    <dma_buf_ops>` for the buffer,
  - allows other users to share the buffer by using dma_buf sharing APIs,
- - manages the details of buffer allocation, wrapped int a :c:type:`struct
+ - manages the details of buffer allocation, wrapped in a :c:type:`struct
    dma_buf <dma_buf>`,
  - decides about the actual backing storage where this allocation happens,
  - and takes care of any migration of scatterlist - for all (shared) users of
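
To make the exporter duties in this hunk concrete, a hedged sketch of the
usual export pattern follows; ``my_buffer`` and ``my_buf_ops`` are
hypothetical driver names, while DEFINE_DMA_BUF_EXPORT_INFO() and
dma_buf_export() are the real dma-buf API.

.. code-block:: c

   /* Hypothetical exporter: my_buffer and my_buf_ops are placeholders. */
   static struct dma_buf *my_export(struct my_buffer *buf)
   {
           DEFINE_DMA_BUF_EXPORT_INFO(exp_info);

           exp_info.ops = &my_buf_ops;   /* struct dma_buf_ops callbacks */
           exp_info.size = buf->size;    /* size of the backing storage */
           exp_info.flags = O_CLOEXEC;   /* flags for the dma-buf file */
           exp_info.priv = buf;          /* driver-private backpointer */

           return dma_buf_export(&exp_info);
   }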
@@ -85,7 +85,7 @@
 - Memory mapping the contents of the DMA buffer is also supported. See the
   discussion below on `CPU Access to DMA Buffer Objects`_ for the full details.

-- The DMA buffer FD is also pollable, see `Fence Poll Support`_ below for
+- The DMA buffer FD is also pollable, see `Implicit Fence Poll Support`_ below for
   details.
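
Seen from userspace, the two bullets above combine roughly as follows; this
sketch assumes an already received ``dmabuf_fd`` and a known ``size``, and
uses poll(2) plus the DMA_BUF_IOCTL_SYNC ioctl from the dma-buf uapi.

.. code-block:: c

   /* Userspace sketch; needs <poll.h>, <sys/mman.h>, <sys/ioctl.h> and
    * <linux/dma-buf.h>. dmabuf_fd and size are assumed. */
   struct pollfd pfd = { .fd = dmabuf_fd, .events = POLLIN };
   poll(&pfd, 1, -1);                     /* wait for implicit fences */

   void *p = mmap(NULL, size, PROT_READ, MAP_SHARED, dmabuf_fd, 0);

   struct dma_buf_sync sync = {
           .flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ,
   };
   ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);   /* begin CPU access */
   /* ... read the buffer contents through p ... */
   sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_READ;
   ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);   /* end CPU access */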

 Basic Operation and Device DMA Access
@@ -100,11 +100,11 @@
 .. kernel-doc:: drivers/dma-buf/dma-buf.c
    :doc: cpu access

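As a small kernel-side illustration of the CPU access rules documented above,
CPU reads are bracketed by the begin/end calls; ``dmabuf`` is an assumed,
already imported buffer.

.. code-block:: c

   /* Sketch: CPU access must be bracketed by begin/end_cpu_access. */
   int ret = dma_buf_begin_cpu_access(dmabuf, DMA_FROM_DEVICE);

   if (!ret) {
           /* ... CPU reads through a kernel mapping of the buffer ... */
           dma_buf_end_cpu_access(dmabuf, DMA_FROM_DEVICE);
   }
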
-Fence Poll Support
-~~~~~~~~~~~~~~~~~~
+Implicit Fence Poll Support
+~~~~~~~~~~~~~~~~~~~~~~~~~~~

 .. kernel-doc:: drivers/dma-buf/dma-buf.c
-   :doc: fence polling
+   :doc: implicit fence polling

 Kernel Functions and Structures Reference
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -118,13 +118,13 @@
 Reservation Objects
 -------------------

-.. kernel-doc:: drivers/dma-buf/reservation.c
+.. kernel-doc:: drivers/dma-buf/dma-resv.c
    :doc: Reservation Object Overview

-.. kernel-doc:: drivers/dma-buf/reservation.c
+.. kernel-doc:: drivers/dma-buf/dma-resv.c
    :export:

-.. kernel-doc:: include/linux/reservation.h
+.. kernel-doc:: include/linux/dma-resv.h
    :internal:
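
As a usage sketch for the renamed reservation object: attaching a fence with
the present-day dma_resv API (not something this hunk adds) looks roughly
like this, where ``obj->resv`` and ``fence`` are hypothetical driver state.

.. code-block:: c

   /* Sketch: publish a fence for a pending write on the buffer. */
   dma_resv_lock(obj->resv, NULL);
   if (!dma_resv_reserve_fences(obj->resv, 1))
           dma_resv_add_fence(obj->resv, fence, DMA_RESV_USAGE_WRITE);
   dma_resv_unlock(obj->resv);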

 DMA Fences
@@ -132,6 +132,18 @@

 .. kernel-doc:: drivers/dma-buf/dma-fence.c
    :doc: DMA fences overview
+
+DMA Fence Cross-Driver Contract
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. kernel-doc:: drivers/dma-buf/dma-fence.c
+   :doc: fence cross-driver contract
+
+DMA Fence Signalling Annotations
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. kernel-doc:: drivers/dma-buf/dma-fence.c
+   :doc: fence signalling annotation
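
A minimal sketch of how the signalling annotations added above are used on a
driver's completion path; ``fence`` is an assumed, already initialized
&dma_fence.

.. code-block:: c

   /* Sketch: annotate the fence signalling critical section. */
   bool cookie = dma_fence_begin_signalling();

   /*
    * Everything between begin and end must be able to make forward
    * progress without waiting on memory reclaim or on other fences;
    * lockdep reports violations of this contract.
    */
   dma_fence_signal(fence);

   dma_fence_end_signalling(cookie);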

 DMA Fences Functions Reference
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -166,3 +178,73 @@
 .. kernel-doc:: include/linux/sync_file.h
    :internal:

+Indefinite DMA Fences
+~~~~~~~~~~~~~~~~~~~~~
+
+At various times, fences (&dma_fence) with an indefinite time until
+dma_fence_wait() finishes have been proposed. Examples include:
+
+* Future fences, used in HWC1 to signal when a buffer isn't used by the display
+  any longer, and created with the screen update that makes the buffer visible.
+  The time this fence completes is entirely under userspace's control.
+
+* Proxy fences, proposed to handle &drm_syncobj for which the fence has not yet
+  been set. Used to asynchronously delay command submission.
+
+* Userspace fences or gpu futexes, fine-grained locking within a command buffer
+  that userspace uses for synchronization across engines or with the CPU, which
+  are then imported as a DMA fence for integration into existing winsys
+  protocols.
+
+* Long-running compute command buffers, while still using traditional end of
+  batch DMA fences for memory management instead of context preemption DMA
+  fences which get reattached when the compute job is rescheduled.
+
+Common to all these schemes is that userspace controls the dependencies of
+these fences and controls when they fire. Mixing indefinite fences with normal
+in-kernel DMA fences does not work, even when a fallback timeout is included to
+protect against malicious userspace:
+
+* Only the kernel knows about all DMA fence dependencies; userspace is not
+  aware of dependencies injected due to memory management or scheduler
+  decisions.
+
+* Only userspace knows about all dependencies in indefinite fences and when
+  exactly they will complete; the kernel has no visibility.
+
+Furthermore, the kernel has to be able to hold up userspace command submission
+for memory management needs, which means we must support indefinite fences
+being dependent upon DMA fences. If the kernel were also to treat indefinite
+fences like DMA fences, as any of the above proposals would, there is the
+potential for deadlocks.
+
+.. kernel-render:: DOT
+   :alt: Indefinite Fencing Dependency Cycle
+   :caption: Indefinite Fencing Dependency Cycle
+
+   digraph "Fencing Cycle" {
+      node [shape=box bgcolor=grey style=filled]
+      kernel [label="Kernel DMA Fences"]
+      userspace [label="userspace controlled fences"]
+      kernel -> userspace [label="memory management"]
+      userspace -> kernel [label="Future fence, fence proxy, ..."]
+
+      { rank=same; kernel userspace }
+   }
+
+This means that the kernel might accidentally create deadlocks through memory
+management dependencies which userspace is unaware of, randomly hanging
+workloads until the timeout kicks in, even though from userspace's perspective
+those workloads contain no deadlock. In such a mixed fencing architecture
+there is no single entity with knowledge of all dependencies. Therefore
+preventing such deadlocks from within the kernel is not possible.
+
+The only solution to avoid dependency loops is to not allow indefinite fences
+in the kernel. This means:
+
+* No future fences, proxy fences or userspace fences imported as DMA fences,
+  with or without a timeout.
+
+* No DMA fences that signal end of batchbuffer for command submission where
+  userspace is allowed to use userspace fencing or long-running compute
+  workloads. This also means no implicit fencing for shared buffers in these
+  cases.