tracing_mutex/lib.rs
//! Mutexes can deadlock each other, but you can avoid this by always acquiring your locks in a
//! consistent order. This crate provides tracing to ensure that you do.
//!
//! This crate tracks a virtual "stack" of locks that the current thread holds, and whenever a
//! new lock is acquired, a dependency is created from the last lock to the new one. These
//! dependencies together form a graph. As long as that graph does not contain any cycles, your
//! program is guaranteed to never deadlock.
//!
//! # Panics
//!
//! The primary way this crate signals an invalid lock acquisition order is by panicking: when
//! acquiring a lock would create a cycle in the dependency graph, the acquiring thread panics
//! instead. This panic does not poison the underlying mutex.
//!
//! The conflicting dependency is not added to the graph, so future attempts at locking should
//! succeed as normal.
//!
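//! As a minimal sketch (assuming the wrappers in [`stdsync`] mirror the `std::sync` API):
//!
//! ```
//! use tracing_mutex::stdsync::Mutex;
//!
//! let a = Mutex::new(());
//! let b = Mutex::new(());
//!
//! // Locking `a` before `b` records the dependency a → b.
//! {
//!     let _a = a.lock().unwrap();
//!     let _b = b.lock().unwrap();
//! }
//!
//! // Acquiring `b` before `a` at this point would close the cycle a → b → a, and the
//! // acquiring thread would panic rather than risk a deadlock.
//! ```
//!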
//! # Structure
//!
//! Each module in this crate exposes wrappers for a specific base mutex with dependency tracking
//! added. This includes [`stdsync`], which provides wrappers for the base locks in the standard
//! library, and more depending on enabled compile-time features. More back-ends may be added as
//! features in the future.
//!
//! # Feature flags
//!
//! `tracing-mutex` uses feature flags to reduce the impact of this crate on both your compile
//! time and runtime overhead. Below are the available flags; modules are annotated with the
//! features they require.
//!
//! - `backtraces`: Enables capturing backtraces of mutex dependencies, to make it easier to
//!   determine what sequence of events would trigger a deadlock. This is enabled by default,
//!   but if the performance overhead is unacceptable, it can be disabled by disabling default
//!   features, as shown below.
//!
//! - `lockapi`: Enables the wrapper lock for [`lock_api`][lock_api] locks.
//!
//! - `parkinglot`: Enables wrapper types for [`parking_lot`][parking_lot] mutexes.
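//!
//! For example, the default `backtraces` feature can be switched off in `Cargo.toml` (version
//! number illustrative):
//!
//! ```toml
//! [dependencies]
//! tracing-mutex = { version = "0.3", default-features = false }
//! ```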
//!
//! # Performance considerations
//!
//! Tracing a mutex adds overhead to certain mutex operations in order to do the required
//! bookkeeping. The affected operations and their overhead are listed below.
//!
//! - **Acquiring a lock** temporarily locks the global dependency graph to check whether the
//!   new lock would introduce a cyclic dependency. This crate uses the algorithm proposed in
//!   ["A Dynamic Topological Sort Algorithm for Directed Acyclic Graphs" by David J. Pearce and
//!   Paul H.J. Kelly][paper] to detect cycles as efficiently as possible. In addition, a
//!   thread-local lock set is updated with the new lock.
//!
//! - **Releasing a lock** updates a thread-local lock set to remove the released lock.
//!
//! - **Allocating a lock** performs an atomic update to a shared counter.
//!
//! - **Deallocating a mutex** temporarily locks the global dependency graph to remove the
//!   lock's entry from the dependency graph.
//!
//! These operations have been reasonably optimized, but the performance penalty may yet be too
//! much for production use. In those cases, it may be beneficial to instead use debug-only
//! versions (such as [`stdsync::Mutex`]), which evaluate to a tracing mutex when debug
//! assertions are enabled, and to the underlying mutex when they are not.
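//!
//! With such a debug-only type, no call sites need to change between builds; a minimal sketch:
//!
//! ```
//! // Tracing when debug assertions are enabled; effectively the plain std mutex otherwise.
//! use tracing_mutex::stdsync::Mutex;
//!
//! let counter = Mutex::new(0);
//! *counter.lock().unwrap() += 1;
//! ```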
//!
//! For ease of debugging, this crate will, by default, capture a backtrace when establishing a
//! new dependency between two mutexes. This has an additional overhead of over 60%. If this
//! additional debugging aid is not required, it can be disabled by disabling default features.
//!
//! [paper]: https://whileydave.com/publications/pk07_jea/
//! [lock_api]: https://docs.rs/lock_api/0.4/lock_api/index.html
//! [parking_lot]: https://docs.rs/parking_lot/0.12.1/parking_lot/
#![cfg_attr(docsrs, feature(doc_cfg))]
use std::cell::RefCell;
use std::fmt;
use std::marker::PhantomData;
use std::ops::Deref;
use std::ops::DerefMut;
use std::sync::atomic::AtomicUsize;
use std::sync::atomic::Ordering;
use std::sync::Mutex;
use std::sync::MutexGuard;
use std::sync::OnceLock;
use std::sync::PoisonError;

#[cfg(feature = "lockapi")]
#[cfg_attr(docsrs, doc(cfg(feature = "lockapi")))]
pub use lock_api;
#[cfg(feature = "parkinglot")]
#[cfg_attr(docsrs, doc(cfg(feature = "parkinglot")))]
pub use parking_lot;
use reporting::Dep;
use reporting::Reportable;

use crate::graph::DiGraph;

mod graph;
#[cfg(feature = "lockapi")]
#[cfg_attr(docsrs, doc(cfg(feature = "lockapi")))]
pub mod lockapi;
#[cfg(feature = "parkinglot")]
#[cfg_attr(docsrs, doc(cfg(feature = "parkinglot")))]
pub mod parkinglot;
mod reporting;
pub mod stdsync;

thread_local! {
    /// Stack to track which locks are held
    ///
    /// Assuming that locks are roughly released in the reverse order in which they were acquired,
    /// a stack should be more efficient to keep track of the current state than a set would be.
    static HELD_LOCKS: RefCell<Vec<usize>> = RefCell::new(Vec::new());
}

/// Dedicated ID type for Mutexes
///
/// # Unstable
///
/// This type is currently private to prevent usage while the exact implementation is figured
/// out, but it will likely be public in the future.
struct MutexId(usize);

impl MutexId {
    /// Get a new, unique, mutex ID.
    ///
    /// This ID is guaranteed to be unique within the runtime of the program.
    ///
    /// # Panics
    ///
    /// This function may panic when there are no more mutex IDs available. The number of mutex
    /// IDs is `usize::MAX - 1`, which should be plenty for most practical applications.
    pub fn new() -> Self {
        // Counter for Mutex IDs. Atomic avoids the need for locking.
        static ID_SEQUENCE: AtomicUsize = AtomicUsize::new(0);

        // `checked_add` returns `None` on overflow, so `fetch_update` fails rather than reusing
        // IDs, and the `expect` below turns that into a panic.
        ID_SEQUENCE
            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |id| id.checked_add(1))
            .map(Self)
            .expect("Mutex ID wraparound happened, results unreliable")
    }

    pub fn value(&self) -> usize {
        self.0
    }

    /// Get a borrowed guard for this lock.
    ///
    /// This method adds this Mutex ID to the dependency graph as needed, and adds the lock to
    /// the list of locks held by the current thread.
    ///
    /// # Panics
    ///
    /// This method panics if the new dependency would introduce a cycle.
    pub fn get_borrowed(&self) -> BorrowedMutex {
        self.mark_held();
        BorrowedMutex {
            id: self,
            _not_send: PhantomData,
        }
    }

    /// Mark this lock as held for the purposes of dependency tracking.
    ///
    /// # Panics
    ///
    /// This method panics if the new dependency would introduce a cycle.
    pub fn mark_held(&self) {
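        // If another lock is already held, record an edge from the most recently acquired lock
        // to this one. `add_edge` reports an error when the new edge would close a cycle.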
        let opt_cycle = HELD_LOCKS.with(|locks| {
            if let Some(&previous) = locks.borrow().last() {
                let mut graph = get_dependency_graph();

                graph.add_edge(previous, self.value(), Dep::capture).err()
            } else {
                None
            }
        });

        if let Some(cycle) = opt_cycle {
            panic!("{}", Dep::panic_message(&cycle))
        }

        HELD_LOCKS.with(|locks| locks.borrow_mut().push(self.value()));
    }

    /// Mark this lock as released for the purposes of dependency tracking.
    ///
    /// # Safety
    ///
    /// This method may only be called from the thread that currently holds this lock, i.e. after
    /// a matching call to [`MutexId::mark_held`].
    pub unsafe fn mark_released(&self) {
        HELD_LOCKS.with(|locks| {
            let mut locks = locks.borrow_mut();

            // Search from the back, as locks are usually released in reverse acquisition order.
            for (i, &lock) in locks.iter().enumerate().rev() {
                if lock == self.value() {
                    locks.remove(i);
                    return;
                }
            }

            // Drop impls shouldn't panic, but if this happens something is seriously broken.
            unreachable!("Tried to drop lock for mutex {:?} but it wasn't held", self)
        });
    }
}

impl Default for MutexId {
    fn default() -> Self {
        Self::new()
    }
}

impl fmt::Debug for MutexId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "MutexID({:?})", self.0)
    }
}

impl Drop for MutexId {
    fn drop(&mut self) {
        get_dependency_graph().remove_node(self.value());
    }
}

/// `const`-compatible version of [`crate::MutexId`].
///
/// This struct can be used similarly to the normal mutex ID, but to be const-compatible its ID
/// is generated on first use. This allows it to be used as the mutex ID for mutexes with a
/// `const` constructor.
///
/// This type can be largely replaced once std::lazy gets stabilized.
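///
/// A minimal sketch of how a `const` constructor can use it (hypothetical wrapper, not the
/// crate's actual definition):
///
/// ```ignore
/// struct TracingMutex<T> {
///     id: LazyMutexId,
///     inner: std::sync::Mutex<T>,
/// }
///
/// impl<T> TracingMutex<T> {
///     // Possible because `LazyMutexId::new` is `const`; the real ID is allocated lazily.
///     pub const fn new(value: T) -> Self {
///         Self {
///             id: LazyMutexId::new(),
///             inner: std::sync::Mutex::new(value),
///         }
///     }
/// }
/// ```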
struct LazyMutexId {
    inner: OnceLock<MutexId>,
}

impl LazyMutexId {
    pub const fn new() -> Self {
        Self {
            inner: OnceLock::new(),
        }
    }
}

impl fmt::Debug for LazyMutexId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:?}", self.deref())
    }
}

impl Default for LazyMutexId {
    fn default() -> Self {
        Self::new()
    }
}

impl Deref for LazyMutexId {
    type Target = MutexId;

    fn deref(&self) -> &Self::Target {
        self.inner.get_or_init(MutexId::new)
    }
}

/// Borrowed mutex ID
///
/// This type should be used as part of a mutex guard wrapper. It can be acquired through
/// [`MutexId::get_borrowed`] and will automatically mark the mutex as not borrowed when it is
/// dropped.
///
/// This type is intentionally [`!Send`](std::marker::Send) because the ownership tracking is
/// based on a thread-local stack, which doesn't work if a guard is released on a different
/// thread from the one where it was acquired.
#[derive(Debug)]
struct BorrowedMutex<'a> {
    /// Reference to the mutex we're borrowing from
    id: &'a MutexId,
    /// This value serves no purpose but to make the type [`!Send`](std::marker::Send)
    _not_send: PhantomData<MutexGuard<'static, ()>>,
}

/// Drop a lock held by the current thread.
///
/// # Panics
///
/// This function panics if the lock did not appear to be held by this thread. If that happens,
/// it is an indication of a serious design flaw in this library.
impl<'a> Drop for BorrowedMutex<'a> {
    fn drop(&mut self) {
        // Safety: the only way to get a BorrowedMutex is by locking the mutex.
        unsafe { self.id.mark_released() };
    }
}

/// Get a reference to the current dependency graph
fn get_dependency_graph() -> impl DerefMut<Target = DiGraph<usize, Dep>> {
    static DEPENDENCY_GRAPH: OnceLock<Mutex<DiGraph<usize, Dep>>> = OnceLock::new();

    // A thread may have panicked while holding the graph lock, but the graph data itself is
    // still consistent, so recover the guard instead of propagating the poison.
    DEPENDENCY_GRAPH
        .get_or_init(Default::default)
        .lock()
        .unwrap_or_else(PoisonError::into_inner)
}

#[cfg(test)]
mod tests {
    use rand::seq::SliceRandom;
    use rand::thread_rng;

    use super::*;

    #[test]
    fn test_next_mutex_id() {
        let initial = MutexId::new();
        let next = MutexId::new();

        // Can't assert `next.0 == initial.0 + 1` because other tests may allocate IDs
        // concurrently.
        assert!(initial.0 < next.0);
    }

    #[test]
    fn test_lazy_mutex_id() {
        let a = LazyMutexId::new();
        let b = LazyMutexId::new();
        let c = LazyMutexId::new();

        let mut graph = get_dependency_graph();
        assert!(graph.add_edge(a.value(), b.value(), Dep::capture).is_ok());
        assert!(graph.add_edge(b.value(), c.value(), Dep::capture).is_ok());

        // Creating an edge c → a should fail as it introduces a cycle.
        assert!(graph.add_edge(c.value(), a.value(), Dep::capture).is_err());

        // Drop graph handle so we can drop vertices without deadlocking
        drop(graph);

        drop(b);

        // If b's destructor ran correctly, we can now add an edge from c to a.
        assert!(get_dependency_graph()
            .add_edge(c.value(), a.value(), Dep::capture)
            .is_ok());
    }

    /// Test creating a cycle, then panicking.
    #[test]
    #[should_panic]
    fn test_mutex_id_conflict() {
        let ids = [MutexId::new(), MutexId::new(), MutexId::new()];

        for i in 0..3 {
            let _first_lock = ids[i].get_borrowed();
            let _second_lock = ids[(i + 1) % 3].get_borrowed();
        }
    }

    /// Fuzz the global dependency graph by fake-acquiring lots of mutexes in a valid order.
    ///
    /// This test generates all possible forward edges in a 100-node graph consisting of natural
    /// numbers, shuffles them, then adds them to the graph. This will always be a valid directed
    /// acyclic graph because there is a trivial order (the natural numbers), but because the
    /// edges are added in a random order the DiGraph will still occasionally need to reorder
    /// nodes.
    #[test]
    fn fuzz_mutex_id() {
        const NUM_NODES: usize = 100;

        let ids: Vec<MutexId> = (0..NUM_NODES).map(|_| Default::default()).collect();

        let mut edges = Vec::with_capacity(NUM_NODES * NUM_NODES);
        for i in 0..NUM_NODES {
            for j in i..NUM_NODES {
                if i != j {
                    edges.push((i, j));
                }
            }
        }

        edges.shuffle(&mut thread_rng());

        for (x, y) in edges {
            // Acquire the mutexes, smallest first, to ensure a cycle-free graph. Bind both
            // guards so they stay held until the end of the iteration.
            let _first = ids[x].get_borrowed();
            let _second = ids[y].get_borrowed();
        }
    }
}