Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Call site bkpts #9747

Merged
merged 4 commits into from
Jan 14, 2025
Merged

Conversation

jimingham
Copy link

This is a reworking of the PR I originally submitted as:

#9493

Turns out that patch could cause a A/B lock inversion - the SwiftREPL/BreakpointSimple.test was the only instance of that.

Fixing that required reworking the locking strategy for ThreadPlanStack and StackFrameList. The deadlock was artificial since in fact everybody acquiring these lock where it deadlocked were readers, not writers, so there wasn't in fact any contention.

This PR is a combination of the three patches on the llvm.org side, two that converted the two exclusive mutexes to shared mutexes, and the reapplication of the call site patch.

I'm doing this as a combo patch because the original patch caused no problems on llvm.org and so was not reverted, making putting this in as three separate patches troublesome.

I have some reports of A/B inversion deadlocks between the
ThreadPlanStack and the StackFrameList accesses. There's a fair bit of
reasonable code in lldb that does "While accessing the ThreadPlanStack,
look at that threads's StackFrameList", and also plenty of "While
accessing the ThreadPlanStack, look at the StackFrameList."

In all the cases I've seen so far, there was at most one of the locks
taken that were trying to mutate the list, the other three were just
reading. So we could solve the deadlock by converting the two mutexes
over to shared mutexes.

This patch is the easy part, the ThreadPlanStack mutex.

The tricky part was because these were originally recursive mutexes, and
recursive access to shared mutexes is undefined behavior according to
the C++ standard, I had to add a couple NoLock variants to make sure it
didn't get used recursively. Then since the only remaining calls are out
to ThreadPlans and ThreadPlans don't have access to their containing
ThreadPlanStack, converting this to a non-recursive lock should be safe.

(cherry picked from commit b42a816)
…t-call-site-breakpoints""

This reverts commit c93fb6d.
I have some reports of A/B inversion deadlocks between the
ThreadPlanStack and the StackFrameList accesses. There's a fair bit of
reasonable code in lldb that does "While accessing the ThreadPlanStack,
look at that threads's StackFrameList", and also plenty of "While
accessing the ThreadPlanStack, look at the StackFrameList."

In all the cases I've seen so far, there was at most one of the locks
taken that were trying to mutate the list, the other three were just
reading. So we could solve the deadlock by converting the two mutexes
over to shared mutexes.

This patch is the easy part, the ThreadPlanStack mutex.

The tricky part was because these were originally recursive mutexes, and
recursive access to shared mutexes is undefined behavior according to
the C++ standard, I had to add a couple NoLock variants to make sure it
didn't get used recursively. Then since the only remaining calls are out
to ThreadPlans and ThreadPlans don't have access to their containing
ThreadPlanStack, converting this to a non-recursive lock should be safe.

(cherry picked from commit b42a816)
In fact, there's only one public API in StackFrameList that changes
 the list explicitly.  The rest only change the list if you happen to
ask for more frames than lldb has currently fetched and that
always adds frames "behind the user's back".  So we were
much more prone to deadlocking than we needed to be.

This patch uses a shared_mutex instead, and when we have to add more
frames (in GetFramesUpTo) we switches to exclusive long enough to add
the frames, then goes back to shared.

Most of the work here was actually getting the stack frame list locking
to not
require a recursive mutex (shared mutexes aren't recursive).

I also added a test that has 5 threads progressively asking for more
frames simultaneously to make sure we get back valid frames and don't
deadlock.

(cherry picked from commit 186fac3)
@jimingham jimingham requested a review from a team as a code owner December 14, 2024 00:20
@jimingham
Copy link
Author

@swift-ci please test

2 similar comments
@jimingham
Copy link
Author

@swift-ci please test

@jimingham
Copy link
Author

@swift-ci please test

@JDevlieghere JDevlieghere merged commit b81995b into swiftlang:swift/release/6.1 Jan 14, 2025
3 checks passed
@jimingham jimingham deleted the call-site-bkpts branch January 14, 2025 23:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants