Add mmu_range_is_tt() helper function that consults the TT register
configuration to determine if a physical address range is transparently-
translated for the specified access.
(missed "cvs add")
[MachineSSAUpdater][AMDGPU] Add faster version of MachineSSAUpdater class. (#145722)
This is a port of SSAUpdaterBulk to machine IR, minus the "bulk" part.
Phi deduplication and simplification are not yet implemented but can be
added if needed.
When used in AMDGPU to replace MachineSSAUpdater for i1 copy lowering,
it reduced compilation time from 417 to 180 seconds for the pass on a
large test case (56% improvement).
[flang][OpenMP] Move branching verification to semantic checks (#193324)
Move the check for branching into and out of an OpenMP construct from
symbol resolution into semantic checks.
Instead of using directive contexts to check for crossing a construct
boundary, use construct pointers and source ranges.
[flang][OpenMP] Make OpenMPLoopConstruct inherit from OmpBlockConstruct (#193823)
Conceptually OpenMPLoopConstruct has the exact same structure as
OmpBlockConstruct: directive specification for the begin directive,
optional one for the end directive, and a block of code. The reason why
OpenMPLoopConstruct was not originally made to be a descendant of
OmpBlockConstruct was to preserve the behavior of AST visitors, where a
separate (type-based) visitor could be defined for the begin/end
directives of a block construct, and for a loop construct. The AST nodes
representing the begin/end directives in block and loop construct had
different types: Omp{Begin|End}Directive for block constructs, and
Omp{Begin|End}LoopDirective for loop constructs.
Today this distinction is not needed anywhere, and so the loop construct
will be represented in the same way as a block construct.
[AArch64][GlobalISel] Add a variant of gi_extract_high_v8bf16 (#193345)
This allows the upper extract_high to match for bf16 types, letting us
generate a sshl2 instruction.
[DAG] visitIS_FPCLASS - fold to constant when result is fully determined by KnownFPClass (#193737)
This PR teaches `DAGCombiner::visitIS_FPCLASS` to fold directly to
constant `true` based on the source's `KnownFPClass`, instead of only
narrowing the test mask.
Prep work to help with https://github.com/llvm/llvm-project/pull/193672
py-wheel: updated to 0.47.0
0.47.0
- Added the ``wheel info`` subcommand to display metadata about wheel files without
unpacking them
- Fixed ``WheelFile`` raising ``Missing RECORD file`` when the wheel filename contains
uppercase characters (e.g. ``Django-3.2.5.whl``) but the ``.dist-info`` directory
inside uses normalized lowercase naming
[AArch64][llvm] Remove support for FEAT_MPAMv2_VID (#193191)
The `FEAT_MPAMv2_VID` instructions and system registers introduced in
change d30f18d2c are being removed, as they have been dropped from the
latest Arm ARM (which does not preclude them returning in some form in
future).
Other system registers introduced with `FEAT_MPAMv2` are unaffected and
remain ungated, but since the `+mpamv2` gating is now empty, I'm
removing this superfluous gating code.
[Clang][Sema] Change `ExtnameUndeclaredIdentifiers` to MapVector. (#193924)
Iteration order of this map does not matter for compilation, except that
since 475f71e8fa15ee71f99e450a0e1c90d3961005f9, this data is dumped into
precompiled header files and thus affects content of those files.
To make precompiled header file contents deterministic, change its type
to one with a deterministic iteration order, matching the nearby
`WeakUndeclaredIdentifiers`.
Fixes #193923
[X86] masked div/rem tests - fix avx512 and add sse4/avx2 test coverage (#193933)
Noticed the incorrect "-mattr=+avx512" attribute, and replaced it with proper x86-64-v* level test coverage
Reland "[lldb][Linux] Read memory protection keys for memory regions (#193934)" (#193936)
This reverts commit 390a29ea833965f481a7011b07deed9612229d6e.
Two tests failed on the X86 buildbot but not in GitHub CI because the
buildbot has protection keys and the CI machines do not. I ran the tests
on an AArch64 host without protection keys, and only selected tests on a
simulated AArch64 machine with protection keys, so I did not find this
earlier.
The fix was to add "protection-key" to the list of possible
qMemoryRegionInfo response keys.
[clang][bytecode] Fix `MemberExpr`s with a static member (#193902)
We need logic to load from the reference pointer, similar to the one we
have for regular `DeclRefExpr`s.
[AAEval] Print ModRefInfo for atomic operations (#193935)
Print ModRefInfo for fence, atomicrmw, etc. Also for atomic
load and store, as these may have additional effects beyond
what is implied by the simple alias result.
[LangRef] Specify that syncscopes can affect the monotonic modification order
If a target specifies that atomics with mismatching syncscopes appear
non-atomic to each other, there is no point in requiring them to be ordered in
the monotonic modification order. Notably, the [AMDGPU target user
guide](https://llvm.org/docs/AMDGPUUsage.html#memory-scopes) has specified
syncscopes to relax the modification order for years.
So far, I haven't found an example where this less constrained ordering would
be observable (at least with the AMDGPU inclusive scope rules). Whenever a load
would be able to see two monotonic stores with non-inclusive scope, that's
considered a data race (i.e., the load would return `undef`), so it cannot be
used to observe the order of the stores.
[AMDGPUUsage] Specify what one-as syncscopes do
This matches the currently implemented and (as far as I could determine)
intended semantics of these syncscopes.
The sync scope table is unchanged except for removing its indentation;
otherwise it would be rendered as part of the preceding note.
[LangRef][AMDGPU] Specify that syncscope can cause atomic operations to race
Targets should be able to specify that the syncscope of atomic operations
influences whether they participate in data races with each other.
For example, in AMDGPU, we want (and already implement) the load in the
following case to be in a data race (i.e., return `undef` according to the
current definition), because there is an atomic store with workgroup syncscope
executing in a different workgroup:
```
; workgroup 0:
store atomic i32 1, ptr %p syncscope("workgroup") monotonic, align 4
; workgroup 1:
store atomic i32 2, ptr %p syncscope("workgroup") monotonic, align 4
load atomic i32, ptr %p syncscope("workgroup") monotonic, align 4
```
[3 lines not shown]
[LangRef] Allow monotonic & seq_cst accesses to inter-operate with other accesses (#189014)
Currently, the LangRef says that atomic operations (which includes `unordered`
operations, which don't participate in the monotonic modification order) must
read a value from the modification order of monotonic operations.
In the following example, this means that the load does not have a store it
could read from, because all stores it may see do not participate in the
monotonic modification order:
```
; thread 0:
store atomic i32 1, ptr %p unordered, align 4
; thread 1:
store atomic i32 2, ptr %p unordered, align 4
load atomic i32, ptr %p unordered, align 4
```
[19 lines not shown]
py-simplejson: updated to 4.1.0
Version 4.1.0 released 2026-04-22
* The C extension now accelerates encoding when ``indent=`` is set.
Previously the encoder fell back to the pure-Python implementation
whenever a non-None ``indent`` was passed; now the C encoder emits
the newline-plus-indent prefix, the level-aware item separator, and
the closing indent directly. A representative nested-dict workload
benchmarks about 4-5x faster end-to-end, and the ``indent=0`` and
empty-container edge cases continue to match the Python output
byte-for-byte.
* The C extension now emits PEP 678 ``exc.add_note()`` annotations on
serialization failures, matching the pure-Python encoder. A chained
error on ``{'a': [1, object(), 3]}`` produces the same three notes
(``when serializing object object``, ``when serializing list item 1``,
``when serializing dict item 'a'``) whether the speedups are loaded
or not, so the add_note assertions in ``test_errors.py`` no longer
need ``indent=2`` to force the Python path.
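The indent behavior described above can be illustrated with a small sketch. It prefers ``simplejson`` when available but falls back to the stdlib ``json`` module, whose indented output format is the same; the sample data is hypothetical, chosen only to show the newline-plus-indent prefix and the empty-container edge case:

```python
try:
    import simplejson as json  # uses the C speedups when installed
except ImportError:
    import json  # stdlib fallback; indented output format matches

data = {"outer": {"inner": [1, 2]}}

# With indent=2, each nesting level gets a newline plus two spaces.
encoded = json.dumps(data, indent=2)
print(encoded)

# Empty containers are emitted compactly even when indent is set.
print(json.dumps({}, indent=2))   # -> {}
```

The same calls exercise the path the changelog says now stays in the C encoder instead of falling back to pure Python.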