[NewPM] Port AArch64RedundantCondBranch to the new pass manager (#190897)
Adds a newPM pass for AArch64RedundantCondBranch
- Refactors base logic into an Impl class
- Renames old pass with the "Legacy" suffix
- Adds the new pass manager pass using refactored logic
- Updated existing .mir tests to also test with the New Pass Manager.
Context and motivation in
https://llvm.org/docs/NewPassManager.html#status-of-the-new-and-legacy-pass-managers
[NVPTX] Constant fold blockDim when reqntid is specified (#191575)
Currently, NVPTX cannot fold the `ntid.x/y/z` intrinsic calls into const
values when `reqntid` is specified, which prevents the code from further
optimization.
Therefore, in this change, we extend the `NVVMIntrRange` pass to:
- Tighten `ntid.x/y/z` intrinsic calls to one value range, which can be
const folded in later InstCombine pass
- Tighten `tid.x/y/z` range attributes to use per-dimension reqntid
bounds
- When .reqntid exceeds hardware limits, garbage-in/garbage-out
[AMDGPU] Report only local per-function resource usage when object linking is enabled
With object linking the linker aggregates resource usage across TUs via
`.amdgpu.info`, so compile-time pessimism and call-graph propagation duplicate
the linker's work or pollute its inputs.
In this mode, skip the per-callsite conservative bumps in
`AMDGPUResourceUsageAnalysis` and assign each resource symbol in
`AMDGPUMCResourceInfo` a concrete local constant instead of building call-graph
max/or expressions.
[AMDGPU] Add `.amdgpu.info` section for per-function metadata
AMDGPU object linking requires the linker to propagate resource usage
(registers, stack, LDS) across translation units. To support this, the compiler
must emit per-function metadata and call graph edges in the relocatable object
so the linker can compute whole-program resource requirements.
This PR introduces a `.amdgpu.info` ELF section using a tagged, length-prefixed
binary format: each entry is encoded as:
```
[kind: u8] [len: u8] [payload: <len> bytes]
```
A function scope is opened by an `INFO_FUNC` entry (containing a symbol
reference), followed by per-function attributes (register counts, flags, private
segment size) and relational edges (direct calls, LDS uses, indirect call
signatures). String data such as function type signatures is stored in a
companion `.amdgpu.strtab` section.
[4 lines not shown]
Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Pull bpf fixes from Alexei Starovoitov:
"Most of the diff stat comes from Xu Kuohai's fix to emit ENDBR/BTI,
since all JITs had to be touched to move constant blinding out and
pass bpf_verifier_env in.
- Fix use-after-free in arena_vm_close on fork (Alexei Starovoitov)
- Dissociate struct_ops program with map if map_update fails (Amery
Hung)
- Fix out-of-range and off-by-one bugs in arm64 JIT (Daniel Borkmann)
- Fix precedence bug in convert_bpf_ld_abs alignment check (Daniel
Borkmann)
- Fix arg tracking for imprecise/multi-offset in BPF_ST/STX insns
(Eduard Zingerman)
[46 lines not shown]
Merge tag 'cxl-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull CXL (Compute Express Link) updates from Dave Jiang:
"The significant change of interest is the handling of soft reserved
memory conflict between CXL and HMEM. In essence CXL will be the first
to claim the soft reserved memory ranges that belongs to CXL and
attempt to enumerate them with best effort. If CXL is not able to
enumerate the ranges it will punt them to HMEM.
There are also MAINTAINERS email changes from Dan Williams and
Jonathan Cameron"
* tag 'cxl-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (37 commits)
MAINTAINERS: Update Jonathan Cameron's email address
cxl/hdm: Add support for 32 switch decoders
MAINTAINERS: Update address for Dan Williams
tools/testing/cxl: Enable replay of user regions as auto regions
cxl/region: Add a region sysfs interface for region lock status
tools/testing/cxl: Test dax_hmem takeover of CXL regions
[15 lines not shown]
18036 sys/ccompile.h: want __nonstring attribute
Reviewed by: Andy Fiddaman <illumos at fiddaman.net>
Reviewed by: Bill Sommerfeld <sommerfeld at hamachi.org>
Approved by: Robert Mustacchi <rm at fingolfin.org>
tests/netinet6: Add test for route information option
Test handling of receiving multiple route information options in RA.
Reviewed by: glebius
Differential Revision: https://reviews.freebsd.org/D56216