tracing_mutex/lib.rs
//! Mutexes can deadlock each other, but you can avoid this by always acquiring your locks in a
//! consistent order. This crate provides tracing to ensure that you do.
//!
//! This crate tracks a virtual "stack" of locks that the current thread holds, and whenever a new
//! lock is acquired, a dependency is created from the last lock to the new one. These dependencies
//! together form a graph. As long as that graph does not contain any cycles, your program is
//! guaranteed to never deadlock.
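/*!

The dependency graph idea can be illustrated with a minimal, self-contained sketch. This is not
this crate's implementation: `LockOrderGraph` and its methods are hypothetical names, and the
sketch uses a plain depth-first reachability check rather than anything optimized.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, simplified lock-order graph: an edge `a -> b` records that
/// lock `b` was acquired while lock `a` was held.
#[derive(Default)]
pub struct LockOrderGraph {
    edges: HashMap<usize, HashSet<usize>>,
}

impl LockOrderGraph {
    /// Returns true if `to` can be reached from `from` by following edges.
    fn reaches(&self, from: usize, to: usize) -> bool {
        let mut stack = vec![from];
        let mut seen = HashSet::new();
        while let Some(node) = stack.pop() {
            if node == to {
                return true;
            }
            // Only expand each node once so the search terminates.
            if seen.insert(node) {
                if let Some(next) = self.edges.get(&node) {
                    stack.extend(next.iter().copied());
                }
            }
        }
        false
    }

    /// Try to record "`to` was acquired while holding `from`".
    ///
    /// Returns `false` and leaves the graph unchanged if the edge would close a cycle.
    pub fn add_edge(&mut self, from: usize, to: usize) -> bool {
        if self.reaches(to, from) {
            return false;
        }
        self.edges.entry(from).or_default().insert(to);
        true
    }
}
```

In this sketch, recording `1 -> 2` and then `2 -> 3` succeeds, while `3 -> 1` is rejected because
it would close a cycle. The rejected edge is not stored.
*/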
//!
//! # Panics
//!
//! The primary method by which this crate signals an invalid lock acquisition order is by
//! panicking. If acquiring a lock would create a cycle in the dependency graph, the thread
//! panics instead. This panic does not poison the underlying mutex.
//!
//! The conflicting dependency is not added to the graph, so future attempts at locking should
//! succeed as normal.
//!
//! You can suppress panics by calling [`suppress_panics`]. This will cause the crate to print the
//! cycle to stderr instead of panicking. This is useful for incrementally adopting tracing-mutex in
//! a large codebase compiled with `panic=abort`, as it allows you to continue running your program
//! even when a cycle is detected.
//!
//! # Structure
//!
//! Each module in this crate exposes wrappers for a specific base mutex with dependency tracking
//! added. This includes [`stdsync`], which provides wrappers for the base locks in the standard
//! library, and more depending on enabled compile-time features. More back-ends may be added as
//! features in the future.
//!
//! # Feature flags
//!
//! `tracing-mutex` uses feature flags to reduce the impact of this crate on both your compile time
//! and runtime overhead. Below are the available flags. Modules are annotated with the features
//! they require.
//!
//! - `backtraces`: Enables capturing backtraces of mutex dependencies, to make it easier to
//!   determine what sequence of events would trigger a deadlock. This is enabled by default, but if
//!   the performance overhead is unacceptable, it can be disabled by disabling default features.
//!
//! - `lock_api`: Enables the wrapper lock for [`lock_api`][lock_api] locks. The former name of this
//!   feature, `lockapi`, is deprecated.
//!
//! - `parking_lot`: Enables wrapper types for [`parking_lot`][parking_lot] mutexes. The former name
//!   of this feature, `parkinglot`, is deprecated.
//!
//! - `experimental`: Enables experimental features. Experimental features are intended for testing
//!   and experimenting with new APIs before committing to them. As such, breaking changes may be
//!   introduced in them between otherwise semver-compatible versions, and the MSRV does not apply
//!   to experimental features.
//!
//! # Performance considerations
//!
//! Tracing a mutex adds overhead to certain mutex operations in order to do the required
//! bookkeeping. Each of the following operations incurs the overhead described.
//!
//! - **Acquiring a lock** locks the global dependency graph temporarily to check if the new lock
//!   would introduce a cyclic dependency. This crate uses the algorithm proposed in ["A Dynamic
//!   Topological Sort Algorithm for Directed Acyclic Graphs" by David J. Pearce and Paul H.J.
//!   Kelly][paper] to detect cycles as efficiently as possible. In addition, a thread-local lock
//!   set is updated with the new lock.
//!
//! - **Releasing a lock** updates a thread-local lock set to remove the released lock.
//!
//! - **Allocating a lock** performs an atomic update to a shared counter.
//!
//! - **Deallocating a lock** temporarily locks the global dependency graph to remove the lock's
//!   entry from the dependency graph.
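/*!

The thread-local and counter bookkeeping described in the list above can be sketched roughly as
follows. `next_id`, `push_held`, and `pop_held` are hypothetical helpers for illustration only,
not part of this crate's API, and the global graph update is left out entirely.

```rust
use std::cell::RefCell;
use std::sync::atomic::{AtomicUsize, Ordering};

thread_local! {
    // Per-thread stack of held lock IDs (a stand-in for the crate's own state).
    static HELD: RefCell<Vec<usize>> = RefCell::new(Vec::new());
}

/// Allocate a fresh ID with a single atomic update; no lock is required.
pub fn next_id() -> usize {
    static NEXT: AtomicUsize = AtomicUsize::new(0);
    NEXT.fetch_update(Ordering::SeqCst, Ordering::SeqCst, |id| id.checked_add(1))
        .expect("ran out of IDs")
}

/// Record that the current thread now holds `id`.
///
/// Returns the lock that was on top of the stack before, which would be the
/// source of the new dependency edge, if there is one.
pub fn push_held(id: usize) -> Option<usize> {
    HELD.with(|held| {
        let mut held = held.borrow_mut();
        let previous = held.last().copied();
        held.push(id);
        previous
    })
}

/// Remove the most recent occurrence of `id`; returns whether it was held.
///
/// Searching from the back is cheap when locks are released in roughly the
/// reverse order of acquisition.
pub fn pop_held(id: usize) -> bool {
    HELD.with(|held| {
        let mut held = held.borrow_mut();
        match held.iter().rposition(|&lock| lock == id) {
            Some(index) => {
                held.remove(index);
                true
            }
            None => false,
        }
    })
}
```

Releasing out of order still works in this sketch; it only costs a slightly longer search.
*/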
//!
//! These operations have been reasonably optimized, but the performance penalty may yet be too much
//! for production use. In those cases, it may be beneficial to instead use the debug-only versions
//! (such as [`stdsync::Mutex`]), which resolve to a tracing mutex when debug assertions are
//! enabled, and to the underlying mutex when they are not.
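/*!

One way such a debug-only alias can be built is with `cfg(debug_assertions)`. The sketch below
assumes a hypothetical `CountingMutex` as a stand-in for a tracing wrapper; `DebugMutex` and
`increment` are likewise illustrative names and not items of this crate.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{LockResult, Mutex, MutexGuard};

/// Hypothetical stand-in for a tracing wrapper: it only counts acquisitions.
pub struct CountingMutex<T> {
    inner: Mutex<T>,
    acquisitions: AtomicUsize,
}

impl<T> CountingMutex<T> {
    pub fn new(value: T) -> Self {
        Self {
            inner: Mutex::new(value),
            acquisitions: AtomicUsize::new(0),
        }
    }

    /// Same signature as `std::sync::Mutex::lock`, so callers compile against either type.
    pub fn lock(&self) -> LockResult<MutexGuard<'_, T>> {
        self.acquisitions.fetch_add(1, Ordering::Relaxed);
        self.inner.lock()
    }

    /// Number of times `lock` has been called.
    pub fn acquisitions(&self) -> usize {
        self.acquisitions.load(Ordering::Relaxed)
    }
}

// With debug assertions enabled, the alias selects the instrumented wrapper.
#[cfg(debug_assertions)]
pub type DebugMutex<T> = CountingMutex<T>;

// Without them, the alias is the plain standard library mutex.
#[cfg(not(debug_assertions))]
pub type DebugMutex<T> = Mutex<T>;

/// Code written against the alias works unchanged in both configurations.
pub fn increment(counter: &DebugMutex<u32>) -> u32 {
    let mut guard = counter.lock().unwrap();
    *guard += 1;
    *guard
}
```

Because both types expose the same `new` and `lock` signatures, release builds pay nothing for the
instrumentation in this sketch.
*/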
//!
//! For ease of debugging, this crate will, by default, capture a backtrace when establishing a new
//! dependency between two mutexes. This has an additional overhead of over 60%. If this additional
//! debugging aid is not required, it can be disabled by disabling default features.
//!
//! [paper]: https://whileydave.com/publications/pk07_jea/
//! [lock_api]: https://docs.rs/lock_api/0.4/lock_api/index.html
//! [parking_lot]: https://docs.rs/parking_lot/0.12.1/parking_lot/
#![cfg_attr(docsrs, feature(doc_cfg))]
use std::cell::RefCell;
use std::fmt;
use std::marker::PhantomData;
use std::ops::Deref;
use std::ops::DerefMut;
use std::sync::Mutex;
use std::sync::MutexGuard;
use std::sync::OnceLock;
use std::sync::PoisonError;
use std::sync::atomic::AtomicUsize;
use std::sync::atomic::Ordering;

#[cfg(feature = "lock_api")]
#[cfg_attr(docsrs, doc(cfg(feature = "lock_api")))]
#[deprecated = "The top-level re-export `lock_api` is deprecated. Use `tracing_mutex::lockapi::raw` instead"]
pub use lock_api;
#[cfg(feature = "parking_lot")]
#[cfg_attr(docsrs, doc(cfg(feature = "parking_lot")))]
#[deprecated = "The top-level re-export `parking_lot` is deprecated. Use `tracing_mutex::parkinglot::raw` instead"]
pub use parking_lot;

100
101use graph::DiGraph;
102use reporting::Dep;
103use reporting::Reportable;
104
105mod graph;
106#[cfg(any(feature = "lock_api", feature = "lockapi"))]
107#[cfg_attr(docsrs, doc(cfg(feature = "lock_api")))]
108#[cfg_attr(
109 all(not(docsrs), feature = "lockapi", not(feature = "lock_api")),
110 deprecated = "The `lockapi` feature has been renamed `lock_api`"
111)]
112pub mod lockapi;
113#[cfg(any(feature = "parking_lot", feature = "parkinglot"))]
114#[cfg_attr(docsrs, doc(cfg(feature = "parking_lot")))]
115#[cfg_attr(
116 all(not(docsrs), feature = "parkinglot", not(feature = "parking_lot")),
117 deprecated = "The `parkinglot` feature has been renamed `parking_lot`"
118)]
119pub mod parkinglot;
120mod reporting;
121pub mod stdsync;
122pub mod util;
123
124pub use reporting::suppress_panics;
125
thread_local! {
    /// Stack to track which locks are held
    ///
    /// Assuming that locks are roughly released in the reverse order in which they were acquired,
    /// a stack should be more efficient to keep track of the current state than a set would be.
    static HELD_LOCKS: RefCell<Vec<usize>> = const { RefCell::new(Vec::new()) };
}

/// Dedicated ID type for Mutexes
///
/// # Unstable
///
/// This type is currently private to prevent usage while the exact implementation is figured out,
/// but it will likely be public in the future.
struct MutexId(usize);

impl MutexId {
    /// Get a new, unique, mutex ID.
    ///
    /// This ID is guaranteed to be unique within the runtime of the program.
    ///
    /// # Panics
    ///
    /// This function panics when there are no more mutex IDs available. IDs range from `0` to
    /// `usize::MAX - 1`, which should be plenty for most practical applications.
    pub fn new() -> Self {
        // Counter for Mutex IDs. Atomic avoids the need for locking.
        static ID_SEQUENCE: AtomicUsize = AtomicUsize::new(0);

        ID_SEQUENCE
            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |id| id.checked_add(1))
            .map(Self)
            .expect("Mutex ID wraparound happened, results unreliable")
    }

    /// Get the raw numeric value of this ID.
    pub fn value(&self) -> usize {
        self.0
    }

    /// Get a borrowed guard for this lock.
    ///
    /// This method adds this mutex ID to the dependency graph as needed, and adds the lock to the
    /// list of locks held by the current thread.
    ///
    /// # Panics
    ///
    /// This method panics if the new dependency would introduce a cycle.
    pub fn get_borrowed(&self) -> BorrowedMutex<'_> {
        self.mark_held();
        BorrowedMutex {
            id: self,
            _not_send: PhantomData,
        }
    }

    /// Mark this lock as held for the purposes of dependency tracking.
    ///
    /// # Panics
    ///
    /// This method panics if the new dependency would introduce a cycle.
    pub fn mark_held(&self) {
        let Ok(opt_cycle) = HELD_LOCKS.try_with(|locks| {
            if let Some(&previous) = locks.borrow().last() {
                let mut graph = get_dependency_graph();

                graph.add_edge(previous, self.value(), Dep::capture).err()
            } else {
                None
            }
        }) else {
            // The thread-local stack is no longer accessible, so there is nothing to track.
            return;
        };

        if let Some(cycle) = opt_cycle {
            reporting::report_cycle(&cycle);
        }

        HELD_LOCKS.with(|locks| locks.borrow_mut().push(self.value()));
    }

    /// Mark this lock as released for the purposes of dependency tracking.
    ///
    /// # Safety
    ///
    /// This method assumes that the lock is currently marked as held by the current thread.
    pub unsafe fn mark_released(&self) {
        let _ = HELD_LOCKS.try_with(|locks| {
            let mut locks = locks.borrow_mut();

            // Search from the top of the stack, as the most recent lock is usually released first.
            for (i, &lock) in locks.iter().enumerate().rev() {
                if lock == self.value() {
                    locks.remove(i);
                    return;
                }
            }

            // Drop impls shouldn't panic but if this happens something is seriously broken.
            unreachable!("Tried to drop lock for mutex {:?} but it wasn't held", self)
        });
    }

    /// Execute the given closure while the guard is held.
    pub fn with_held<T>(&self, f: impl FnOnce() -> T) -> T {
        // Note: we MUST construct the RAII guard, we cannot simply mark held + mark released, as
        // f() may panic and corrupt our state.
        let _guard = self.get_borrowed();
        f()
    }
}

impl Default for MutexId {
    fn default() -> Self {
        Self::new()
    }
}

impl fmt::Debug for MutexId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "MutexID({:?})", self.0)
    }
}

impl Drop for MutexId {
    fn drop(&mut self) {
        get_dependency_graph().remove_node(self.value());
    }
}

/// `const`-compatible version of [`crate::MutexId`].
///
/// This struct can be used similarly to the normal mutex ID, but to be const-compatible its ID is
/// generated on first use. This allows it to be used as the mutex ID for mutexes with a `const`
/// constructor.
///
/// This type can be largely replaced once std::lazy gets stabilized.
struct LazyMutexId {
    inner: OnceLock<MutexId>,
}

impl LazyMutexId {
    pub const fn new() -> Self {
        Self {
            inner: OnceLock::new(),
        }
    }
}

impl fmt::Debug for LazyMutexId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:?}", self.deref())
    }
}

impl Default for LazyMutexId {
    fn default() -> Self {
        Self::new()
    }
}

impl Deref for LazyMutexId {
    type Target = MutexId;

    fn deref(&self) -> &Self::Target {
        self.inner.get_or_init(MutexId::new)
    }
}

/// Borrowed mutex ID
///
/// This type should be used as part of a mutex guard wrapper. It can be acquired through
/// [`MutexId::get_borrowed`] and will automatically mark the mutex as not borrowed when it is
/// dropped.
///
/// This type intentionally is [`!Send`](std::marker::Send) because the ownership tracking is based
/// on a thread-local stack, which doesn't work if a guard gets released in a different thread from
/// the one where it was acquired.
#[derive(Debug)]
struct BorrowedMutex<'a> {
    /// Reference to the mutex we're borrowing from
    id: &'a MutexId,
    /// This value serves no purpose but to make the type [`!Send`](std::marker::Send)
    _not_send: PhantomData<MutexGuard<'static, ()>>,
}

/// Drop a lock held by the current thread.
///
/// # Panics
///
/// This function panics if the lock did not appear to be held by this thread. If that happens,
/// that is an indication of a serious design flaw in this library.
impl Drop for BorrowedMutex<'_> {
    fn drop(&mut self) {
        // Safety: the only way to get a BorrowedMutex is by locking the mutex.
        unsafe { self.id.mark_released() };
    }
}

/// Get a reference to the current dependency graph
fn get_dependency_graph() -> impl DerefMut<Target = DiGraph<usize, Dep>> {
    static DEPENDENCY_GRAPH: OnceLock<Mutex<DiGraph<usize, Dep>>> = OnceLock::new();

    DEPENDENCY_GRAPH
        .get_or_init(Default::default)
        .lock()
        .unwrap_or_else(PoisonError::into_inner)
}

#[cfg(test)]
mod tests {
    use rand::seq::SliceRandom;
    use rand::thread_rng;

    use super::*;

    #[test]
    fn test_next_mutex_id() {
        let initial = MutexId::new();
        let next = MutexId::new();

        // Can't assert N + 1 because multiple threads running tests
        assert!(initial.0 < next.0);
    }

    #[test]
    fn test_lazy_mutex_id() {
        let a = LazyMutexId::new();
        let b = LazyMutexId::new();
        let c = LazyMutexId::new();

        let mut graph = get_dependency_graph();
        assert!(graph.add_edge(a.value(), b.value(), Dep::capture).is_ok());
        assert!(graph.add_edge(b.value(), c.value(), Dep::capture).is_ok());

        // Creating an edge c → a should fail as it introduces a cycle.
        assert!(graph.add_edge(c.value(), a.value(), Dep::capture).is_err());

        // Drop graph handle so we can drop vertices without deadlocking
        drop(graph);

        drop(b);

        // If b's destructor ran correctly we can now add an edge from c to a.
        assert!(
            get_dependency_graph()
                .add_edge(c.value(), a.value(), Dep::capture)
                .is_ok()
        );
    }

    /// Test creating a cycle, then panicking.
    #[test]
    #[should_panic]
    fn test_mutex_id_conflict() {
        let ids = [MutexId::new(), MutexId::new(), MutexId::new()];

        for i in 0..3 {
            let _first_lock = ids[i].get_borrowed();
            let _second_lock = ids[(i + 1) % 3].get_borrowed();
        }
    }

    /// Fuzz the global dependency graph by fake-acquiring lots of mutexes in a valid order.
    ///
    /// This test generates all possible forward edges in a 100-node graph consisting of natural
    /// numbers, shuffles them, then adds them to the graph. This will always be a valid directed,
    /// acyclic graph because there is a trivial order (the natural numbers) but because the edges
    /// are added in a random order the DiGraph will still occasionally need to reorder nodes.
    #[test]
    fn fuzz_mutex_id() {
        const NUM_NODES: usize = 100;

        let ids: Vec<MutexId> = (0..NUM_NODES).map(|_| Default::default()).collect();

        let mut edges = Vec::with_capacity(NUM_NODES * NUM_NODES);
        for i in 0..NUM_NODES {
            // Only forward edges (i < j), so the natural order is always a valid lock order.
            for j in (i + 1)..NUM_NODES {
                edges.push((i, j));
            }
        }

        edges.shuffle(&mut thread_rng());

        for (x, y) in edges {
            // Acquire the mutexes, smallest first to ensure a cycle-free graph
            let _ignored = ids[x].get_borrowed();
            let _ = ids[y].get_borrowed();
        }
    }
}
411}