[lldb] Override UpdateBreakpointSites in ProcessGDBRemote to use MultiBreakpoint
This concludes the implementation of MultiBreakpoint by actually using
the new packet to batch breakpoint requests.
https://github.com/llvm/llvm-project/pull/192910
[X86] lock opt ptr const inconsistencies (#185195)
Resolves: https://github.com/llvm/llvm-project/issues/147280
The linked issue mentions cases of atomic arithmetic followed by a test
whose result can be inferred from flags, but which are currently emitted
as CAS loops instead of `lock` + op.
There's one fold that handles the issue's code: `lock and` sets ZF based on the
result `old & C`, so any comparison of `new = old & C` against zero can be
answered with ZF. This fold does just that, reducing the pattern to a `!= 0` or
`== 0` check on the flags.
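As a rough illustration (a hypothetical C reproduction, not the exact test case from the issue): the result of the atomic AND is only used in a zero test, so ZF from `lock and` suffices and no cmpxchg loop is needed.

```c
/* Hedged sketch of the kind of pattern described above: the only use of
 * the atomic AND's result is a comparison against zero. With the fold,
 * the compiler can branch on ZF of `lock and` instead of emitting a
 * cmpxchg loop. */
#include <stdatomic.h>
#include <stdbool.h>

bool any_masked_bits_set(_Atomic unsigned *p, unsigned mask) {
    /* new = old & mask; testing new != 0 only needs the ZF that
     * `lock and` already produces. */
    return (atomic_fetch_and(p, mask) & mask) != 0;
}
```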
I also refactored `shouldExpandCmpArithRMWInIR` into a dispatching
function and made `getCmpArithCC` return `X86::CondCode`s directly,
which deleted the dispatching switch later in the code.
I also broke out the different cases of `getCmpArithWithCC` into helper
functions for each case (add, sub, and, xor, or, add with overflow, sub
[3 lines not shown]
[LangRef] Specify that syncscopes can affect the monotonic modification order
If a target specifies that atomics with mismatching syncscopes appear
non-atomic to each other, there is no point in requiring them to be ordered in
the monotonic modification order. Notably, the [AMDGPU target user
guide](https://llvm.org/docs/AMDGPUUsage.html#memory-scopes) has specified
syncscopes to relax the modification order for years.
So far, I haven't found an example where this less constrained ordering would
be observable (at least with the AMDGPU inclusive scope rules). Whenever a load
would be able to see two monotonic stores with non-inclusive scope, that's
considered a data race (i.e., the load would return `undef`), so it cannot be
used to observe the order of the stores.
[X86] combineVTRUNCSAT - don't split 128-bit concatenated vectors when folding to PACKSS/US (#194347)
If the VTRUNCS/US node has 128-bit src and dst types, then ensure we
don't split into sub-128-bit vectors - just treat it as padded with
zeros to match VTRUNC behaviour.
Fixes #194344
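For reference, a scalar sketch of the per-lane operation involved: a signed-saturating truncate, which is what PACKSS performs lane-wise (PACKUS is the unsigned analogue). This is illustrative only, not LLVM's implementation.

```c
/* Scalar model of a signed-saturating i32 -> i16 truncate, the per-lane
 * semantics of VTRUNCS when lowered to PACKSS. Illustrative sketch. */
#include <stdint.h>

static int16_t truncs_i32_to_i16(int32_t v) {
    if (v > INT16_MAX) return INT16_MAX;   /* clamp to the high bound */
    if (v < INT16_MIN) return INT16_MIN;   /* clamp to the low bound */
    return (int16_t)v;                     /* in range: plain truncate */
}
```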
[VPlan] Remove unused PhisToFix member from VPRecipeBuilder. NFC (#194451)
The field is not referenced anywhere; the only PhisToFix in the codebase
is a local variable in VPlanConstruction.cpp.
[X86] combineKSHIFT - fold kshift(logicop(X,C1),C2) -> logicop(kshift(X,C2),kshift(C1,C2)) (#194343)
Attempt to push KSHIFTs up through logic ops in the DAG to expose additional folds.
This requires adding constant-folding handling for KSHIFTL/R nodes as well.
Yak shaving for #193700
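The identity behind the fold can be sanity-checked on plain integers standing in for mask registers (an illustrative model, assuming a 16-bit k-register and a left shift):

```c
/* Model of the fold on a 16-bit mask: shifting a logic op of X and a
 * constant equals the logic op of the shifted operands. The same holds
 * for or/xor. Illustrative only; KSHIFTL on a k-register behaves like
 * this shift with truncation to the mask width. */
#include <stdint.h>

static uint16_t fold_lhs(uint16_t x, uint16_t c1, unsigned c2) {
    return (uint16_t)((x & c1) << c2);          /* kshiftl(and(X,C1),C2) */
}
static uint16_t fold_rhs(uint16_t x, uint16_t c1, unsigned c2) {
    return (uint16_t)((x << c2) & (c1 << c2));  /* and(kshiftl(X,C2),kshiftl(C1,C2)) */
}
```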
[AArch64][GlobalISel] Update fp legalization mir tests. NFC (#194561)
This updates a number of the floating point MIR legalization tests to
use `f` types instead of generic `s` types.
Hash.cpp: include ErrorHandling.h (#194553)
Hash.cpp uses llvm_unreachable but currently picks up ErrorHandling.h
only transitively through xxhash.h -> ArrayRef.h -> Hashing.h.
[AMDGPUUsage] Specify what one-as syncscopes do (#189016)
This matches the currently implemented and (as far as I could determine)
intended semantics of these syncscopes.
The sync scope table is unchanged except for removing its indentation;
otherwise it would be rendered as part of the preceding note.
[LangRef][AMDGPU] Specify that syncscope can cause atomic operations to race (#189015)
Targets should be able to specify that the syncscope of atomic operations
influences whether they participate in data races with each other.
For example, in AMDGPU, we want (and already implement) the load in the
following case to be in a data race (i.e., return `undef` according to the
current definition), because there is an atomic store with workgroup syncscope
executing in a different workgroup:
```
; workgroup 0:
store atomic i32 1, ptr %p syncscope("workgroup") monotonic, align 4
; workgroup 1:
store atomic i32 2, ptr %p syncscope("workgroup") monotonic, align 4
load atomic i32, ptr %p syncscope("workgroup") monotonic, align 4
```
[3 lines not shown]
[libc][math] Refactor fmin family to header-only (#194549)
Refactors the fmin math family to be header-only.
part of: #147386
Target Functions:
- fmin
- fminbf16
- fminf
- fminf128
- fminf16
- fminl
[clang][bytecode] Add a missing fallthrough() call (#194537)
When the local variable is enabled but we don't emit any dtor
instructions, we need to fallthrough to the `EndLabel`.
[mlir][tosa] Make tosa-attach-target optional in addTosaToLinalgPasses (#193467)
addTosaToLinalgPasses unconditionally scheduled tosa-attach-target,
which adds a `tosa.target_env` attribute to the module, so callers had
no way to opt out of the attribute. The attribute is consumed only when
validationOptions is enabled, which is optional. So if the caller
disables validationOptions, the `tosa.target_env` attribute remains on
the module even after TosaToLinalg, and consumers that don't load the
TOSA dialect can't even parse the resulting module.
This PR makes sure that we schedule tosa-attach-target only when the
caller actually needs it, i.e. when validationOptions or
attachTargetOptions is set. The default values stay inside the
`!attachTargetOptions` branch so callers that set validationOptions
still get a target env to validate against, preserving existing
behavior.
Also add a `validation` pipeline option (default `true`) to the
registered `tosa-to-linalg-pipeline`, so it can be invoked without
scheduling either `tosa-attach-target` or `tosa-validate`. A lit test is
also added.