[lldb] Fix wrong buffer size when fetching Objective-C classes (#197389)
LLDB calls objc_getRealizedClassList_trylock to fetch the list of
realized Objective-C classes.
Jim spotted that we currently pass the buffer length in *bytes*, when
this API actually takes the buffer length as a number of elements. As a
result, the Objective-C runtime can write more memory than we allocated
for it. This can crash the expression that calls the function and leave
the Objective-C runtime mutex locked.
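
The size mismatch can be illustrated with a small, self-contained sketch. It does not call the real runtime function; `ClassPtr`, `bytesCalleeMayWrite`, and the class count are hypothetical names and values chosen for illustration, and the only assumption taken from the text is that the length parameter counts elements.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for one pointer-sized entry in the class list.
using ClassPtr = std::uint64_t;

// Models an API whose length parameter counts elements: given `count`,
// the callee may write up to `count * sizeof(ClassPtr)` bytes.
constexpr std::size_t bytesCalleeMayWrite(std::size_t count) {
  return count * sizeof(ClassPtr);
}

// Illustrative allocation: room for 1024 entries.
constexpr std::size_t kNumClasses = 1024;
constexpr std::size_t kBufferBytes = kNumClasses * sizeof(ClassPtr);

// Passing the byte size as the element count lets the callee write
// sizeof(ClassPtr) times more than was allocated.
static_assert(bytesCalleeMayWrite(kBufferBytes) ==
                  sizeof(ClassPtr) * kBufferBytes,
              "byte count used as element count overstates the buffer");

// Passing the element count keeps the callee within the allocation.
static_assert(bytesCalleeMayWrite(kBufferBytes / sizeof(ClassPtr)) ==
                  kBufferBytes,
              "element count matches the allocation");
```

With 8-byte entries, the mistaken call in this sketch advertises eight times the allocated space.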
[AArch64][SVE] Use truncating stores whenever possible (#196029)
For fixed length SVE and fixed length vectors x/y, fold
```
store(concat_vector(truncate(x), truncate(y)))
--> store(truncate(x))
store(truncate(y))
```
[Flang][Driver] Add per-target search path for modules (#196558)
Adds the version- and target-specific path
../lib/clang/<version>/finclude/flang/<target>
to the intrinsic module search path in addition to
../finclude/flang
with the former taking precedence if a module file should exist in both.
The version/target-specific path is added by the driver by passing
`-fintrinsic-modules-path` to the `-fc1` invocation. This is consistent
with gfortran and the usual pattern that the driver resolves paths into
the resource path, not the frontend.
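
The precedence rule can be sketched as a first-match lookup over an ordered list of directories. This is an illustration only, not the driver's code: `resolveModule`, `intrinsicModuleDirs`, the in-memory set of existing files, and the module file name in the usage note are all hypothetical, and the `<version>` and `<target>` placeholders are kept literally.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Returns the first "<dir>/<file>" found in `existing`, or an empty string if
// no search directory contains the file. `existing` stands in for the file
// system so the sketch stays self-contained.
std::string resolveModule(const std::vector<std::string> &searchDirs,
                          const std::set<std::string> &existing,
                          const std::string &file) {
  assert(!file.empty() && "module file name must not be empty");
  for (const std::string &dir : searchDirs) {
    const std::string candidate = dir + "/" + file;
    if (existing.count(candidate) != 0)
      return candidate;
  }
  return std::string();
}

// Search order from the text: the version- and target-specific directory
// comes first, so it wins when a module file exists in both places.
std::vector<std::string> intrinsicModuleDirs() {
  return {"../lib/clang/<version>/finclude/flang/<target>",
          "../finclude/flang"};
}
```

Looking up a hypothetical `m.mod` that exists in both directories returns the copy under `../lib/clang/<version>/finclude/flang/<target>`; if it exists only under `../finclude/flang`, that copy is returned.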
This PR adds nothing into that directory, which will be done in #171515.
Extracted out of #171515 as requested by
[4 lines not shown]
Add peer to qwx(4) firmware after starting the vdev, not before.
From mglocker@ via qwz(4)
On ath12k this fixed a firmware crash by avoiding the peer getting
created with a half-initialized vdev. The fix does not hurt on ath11k
so apply it to qwx(4) as well.
[MIPS][GlobalISel] Remove dependency on legal ruleset (#197379)
This fills in always-legal rules, to remove the dependency on the legacy
ruleset. This is not guaranteed to be all the rules, just the ones that
appear in tests.
[MLIR][NVGPU] Use NVVM enums in NVGPU dialect (#195812)
Updates the `nvgpu.rcp` Op to use the NVVM `FPRoundingModeAttr`
attribute instead of redefining the attribute in the NVGPU dialect.
[LoopPeel] Peel last iteration to enable load widening
In loops that contain multiple consecutive small loads (e.g., three i8
loads covering 3 bytes), peeling the last iteration makes it safe to read
beyond the accessed region, enabling the use of a wider load (e.g., i32)
for the other N-1 iterations.
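
At the source level, the idea can be sketched in C++ as follows. This is a hand-written illustration of the reasoning, not the pass itself; `sumRecords` and the 3-byte record layout are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Sums the three bytes of each of `n` packed 3-byte records starting at `p`.
// The buffer is assumed to be exactly 3 * n bytes long.
std::uint32_t sumRecords(const unsigned char *p, std::size_t n) {
  assert((p != nullptr || n == 0) && "non-empty input needs a valid pointer");
  std::uint32_t sum = 0;
  if (n == 0)
    return sum;

  // First n-1 iterations: one 4-byte read per record. The fourth byte belongs
  // to the following record, so the read stays inside the buffer.
  for (std::size_t i = 0; i + 1 < n; ++i) {
    unsigned char wide[4];
    std::memcpy(wide, p + 3 * i, sizeof(wide));
    sum += static_cast<std::uint32_t>(wide[0]) + wide[1] + wide[2];
  }

  // Peeled last iteration: narrow loads only, because a 4-byte read here
  // would touch one byte past the end of the buffer.
  const unsigned char *last = p + 3 * (n - 1);
  sum += static_cast<std::uint32_t>(last[0]) + last[1] + last[2];
  return sum;
}
```

The wide read is only valid while another record follows, which is why the final record is handled separately.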
Patterns such as:
```
%a = load i8, ptr %p
%b = load i8, ptr %p+1
%c = load i8, ptr %p+2
...
%p.next = getelementptr i8, ptr %p, 3
```
can be transformed to:
```
%wide = load i32, ptr %p ; Read 4 bytes
[9 lines not shown]
[VPlan] Expand simple SCEVs directly to VPInstructions. (#189455)
Add initial simple SCEV expansion directly to VPInstructions. To start
with, just support expanding SCEV expressions for the vector step (VF *
UF). This requires expanding VScale, constants and multiply expressions.
This enables CSE for some redundant vscale calls as a first step
and also enables expanding SCEV expressions in blocks other than the
header as follow-ups. For example, this could be useful to avoid some
code movement with https://github.com/llvm/llvm-project/pull/189372.
[ARM] Reject unencodable Thumb2 LDRD/STRD post-index offsets (#197228)
Thumb2 post-index LDRD/STRD immediates are encoded as scaled-by-4
values. Without the Imm8s4 parser class, plain assembly can accept byte
offsets that cannot be represented exactly.
Reject those operands during asm matching so invalid input does not
reach later MC paths where debug builds assert and release builds can
silently encode the rounded-down value.
The t2am_imm8s4_offset_asmoperand parser is updated to accommodate
constructs such as #-0, which preserve the negative sentinel on zero.
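
The encodability condition can be sketched as a simple predicate. This is not the parser's code: `isEncodableImm8s4` is a hypothetical name, the ±1020 range is inferred from an 8-bit magnitude scaled by 4, and the `#-0` sentinel is not modeled.

```cpp
// A byte offset is representable only if it is a multiple of 4 and its
// magnitude, divided by 4, fits in 8 bits (so at most 255 * 4 = 1020).
constexpr bool isEncodableImm8s4(int byteOffset) {
  return byteOffset % 4 == 0 && byteOffset >= -1020 && byteOffset <= 1020;
}

static_assert(isEncodableImm8s4(8), "multiple of 4 within range");
static_assert(isEncodableImm8s4(-1020), "largest negative magnitude");
static_assert(!isEncodableImm8s4(6), "6 is not a multiple of 4");
static_assert(!isEncodableImm8s4(1024), "magnitude exceeds 8 bits after scaling");
```

An offset such as 6 is the kind of input the text describes: scaling it down by 4 loses the remainder, so rejecting it up front avoids encoding 4 instead.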