[NVPTX] Constant fold blockDim when reqntid is specified (#191575)
Currently, NVPTX cannot fold `ntid.x/y/z` intrinsic calls into constant
values when `reqntid` is specified, which blocks further optimization.
This change therefore extends the `NVVMIntrRange` pass to:
- Tighten `ntid.x/y/z` intrinsic calls to a single-value range, which a
later InstCombine pass can constant-fold
- Tighten `tid.x/y/z` range attributes to use per-dimension `reqntid`
bounds
- Treat a `.reqntid` that exceeds hardware limits as
garbage-in/garbage-out
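The range tightening described above can be sketched as follows. This is a minimal illustration in Python, not the pass itself: the helper names are hypothetical, ranges are modeled as half-open `[lo, hi)` pairs in the style of LLVM range attributes, and the hardware-limit case is not modeled.

```python
from typing import Optional, Tuple

Range = Tuple[int, int]  # half-open [lo, hi)


def ntid_range(reqntid: int) -> Range:
    """Block dimension is pinned to exactly reqntid: a single-value range."""
    return (reqntid, reqntid + 1)


def tid_range(reqntid: int) -> Range:
    """A thread index along a dimension is always below that dimension's size."""
    return (0, reqntid)


def fold_if_constant(r: Range) -> Optional[int]:
    """A range that holds exactly one value can be replaced by that constant."""
    lo, hi = r
    return lo if hi - lo == 1 else None


# Hypothetical kernel annotated with reqntid = (128, 2, 1).
reqntid = (128, 2, 1)
ntid = [ntid_range(d) for d in reqntid]
tid = [tid_range(d) for d in reqntid]
folded = [fold_if_constant(r) for r in ntid]
```

Under these assumptions every `ntid` dimension folds to its `reqntid` value, while a `tid` dimension only folds when the required size is 1.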
[AMDGPU] Report only local per-function resource usage when object linking is enabled
With object linking the linker aggregates resource usage across TUs via
`.amdgpu.info`, so compile-time pessimism and call-graph propagation duplicate
the linker's work or pollute its inputs.
In this mode, skip the per-callsite conservative bumps in
`AMDGPUResourceUsageAnalysis` and assign each resource symbol in
`AMDGPUMCResourceInfo` a concrete local constant instead of building call-graph
max/or expressions.
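The difference between the two modes can be sketched as below. This is an illustrative Python model only: `FuncInfo`, `propagated_vgpr`, and `local_vgpr` are hypothetical names, only a single register count is modeled, and the per-callsite conservative bumps are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class FuncInfo:
    num_vgpr: int  # registers used by the function body itself
    callees: List[str] = field(default_factory=list)


def propagated_vgpr(name: str, funcs: Dict[str, FuncInfo]) -> int:
    """Without object linking: max over the function and everything it reaches."""
    seen: Set[str] = set()

    def visit(n: str) -> int:
        if n in seen:  # already counted; also guards against recursion cycles
            return 0
        seen.add(n)
        f = funcs[n]
        return max([f.num_vgpr] + [visit(c) for c in f.callees])

    return visit(name)


def local_vgpr(name: str, funcs: Dict[str, FuncInfo]) -> int:
    """With object linking: report only the function's own usage."""
    return funcs[name].num_vgpr


# Hypothetical call chain: kernel -> helper -> leaf.
funcs = {
    "kernel": FuncInfo(24, ["helper"]),
    "helper": FuncInfo(40, ["leaf"]),
    "leaf": FuncInfo(64),
}
```

In this model the linker would be the one to combine the local values across translation units, so the compiler-side number for `kernel` stays at its own 24 rather than the propagated 64.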
[AMDGPU] Add `.amdgpu.info` section for per-function metadata
AMDGPU object linking requires the linker to propagate resource usage
(registers, stack, LDS) across translation units. To support this, the compiler
must emit per-function metadata and call graph edges in the relocatable object
so the linker can compute whole-program resource requirements.
This PR introduces a `.amdgpu.info` ELF section using a tagged, length-prefixed
binary format. Each entry is encoded as:
```
[kind: u8] [len: u8] [payload: <len> bytes]
```
A function scope is opened by an `INFO_FUNC` entry (containing a symbol
reference), followed by per-function attributes (register counts, flags, private
segment size) and relational edges (direct calls, LDS uses, indirect call
signatures). String data such as function type signatures is stored in a
companion `.amdgpu.strtab` section.
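A minimal sketch of reading and writing entries in the `[kind: u8] [len: u8] [payload]` layout follows. The kind numbers and payload encodings here are hypothetical placeholders chosen for illustration; the actual values are defined by the patch and are not reproduced here.

```python
import struct
from typing import Iterator, Tuple

# Hypothetical kind values, for illustration only.
HYPOTHETICAL_KIND_FUNC = 1
HYPOTHETICAL_KIND_NUM_VGPR = 2


def encode_entry(kind: int, payload: bytes) -> bytes:
    """Emit one entry: kind byte, length byte, then the payload."""
    if not 0 <= kind <= 0xFF:
        raise ValueError("kind must fit in one byte")
    if len(payload) > 0xFF:
        raise ValueError("payload must fit a one-byte length")
    return bytes([kind, len(payload)]) + payload


def decode_entries(data: bytes) -> Iterator[Tuple[int, bytes]]:
    """Walk the buffer entry by entry, yielding (kind, payload)."""
    pos = 0
    while pos < len(data):
        if pos + 2 > len(data):
            raise ValueError("truncated entry header")
        kind, length = data[pos], data[pos + 1]
        pos += 2
        if pos + length > len(data):
            raise ValueError("truncated entry payload")
        yield kind, data[pos:pos + length]
        pos += length


# A function scope followed by one attribute. The 4 zero bytes stand in for
# a symbol reference; the little-endian u32 encoding is an assumption.
blob = (encode_entry(HYPOTHETICAL_KIND_FUNC, b"\x00" * 4)
        + encode_entry(HYPOTHETICAL_KIND_NUM_VGPR, struct.pack("<I", 64)))
entries = list(decode_entries(blob))
```

Because every entry carries its own length, a reader written this way can skip kinds it does not recognize without knowing their payload layout.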
[4 lines not shown]
Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Pull bpf fixes from Alexei Starovoitov:
"Most of the diff stat comes from Xu Kuohai's fix to emit ENDBR/BTI,
since all JITs had to be touched to move constant blinding out and
pass bpf_verifier_env in.
- Fix use-after-free in arena_vm_close on fork (Alexei Starovoitov)
- Dissociate struct_ops program with map if map_update fails (Amery
Hung)
- Fix out-of-range and off-by-one bugs in arm64 JIT (Daniel Borkmann)
- Fix precedence bug in convert_bpf_ld_abs alignment check (Daniel
Borkmann)
- Fix arg tracking for imprecise/multi-offset in BPF_ST/STX insns
(Eduard Zingerman)
[46 lines not shown]
Merge tag 'cxl-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull CXL (Compute Express Link) updates from Dave Jiang:
"The significant change of interest is the handling of soft reserved
memory conflicts between CXL and HMEM. In essence, CXL will be the first
to claim the soft reserved memory ranges that belong to CXL and
attempt to enumerate them on a best-effort basis. If CXL is not able to
enumerate the ranges, it will punt them to HMEM.
There are also MAINTAINERS email changes from Dan Williams and
Jonathan Cameron"
* tag 'cxl-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (37 commits)
MAINTAINERS: Update Jonathan Cameron's email address
cxl/hdm: Add support for 32 switch decoders
MAINTAINERS: Update address for Dan Williams
tools/testing/cxl: Enable replay of user regions as auto regions
cxl/region: Add a region sysfs interface for region lock status
tools/testing/cxl: Test dax_hmem takeover of CXL regions
[15 lines not shown]
18036 sys/ccompile.h: want __nonstring attribute
Reviewed by: Andy Fiddaman <illumos at fiddaman.net>
Reviewed by: Bill Sommerfeld <sommerfeld at hamachi.org>
Approved by: Robert Mustacchi <rm at fingolfin.org>
tests/netinet6: Add test for route information option
Test handling of multiple route information options received in a single RA.
Reviewed by: glebius
Differential Revision: https://reviews.freebsd.org/D56216
tests/netinet6: Add SLAAC and RA validation tests to ndp
* RA hop limit validation
* RA source address validation
* Multi router RA validation
* Two hour rule RA validation
* SLAAC onlink prefix switching test
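For context, the "two hour rule" comes from RFC 4862 section 5.5.3 (e). A minimal Python sketch of the rule as the RFC states it is shown here; it is not the test code from this change, the function name is hypothetical, and the RFC's exception for authenticated RAs is omitted.

```python
TWO_HOURS = 2 * 60 * 60  # seconds


def updated_valid_lifetime(received: int, remaining: int) -> int:
    """Return the new valid lifetime of an existing SLAAC address.

    received:  Valid Lifetime advertised in the Prefix Information option
    remaining: time left on the address's current valid lifetime
    """
    # 1. Long or extending lifetimes are accepted as advertised.
    if received > TWO_HOURS or received > remaining:
        return received
    # 2. A short lifetime cannot shorten an address that already has
    #    two hours or less left: the option is ignored for this purpose.
    if remaining <= TWO_HOURS:
        return remaining
    # 3. Otherwise the lifetime is cut down to two hours, no lower.
    return TWO_HOURS
```

For example, an RA advertising 60 seconds against an address with a day remaining leaves the address with two hours, which limits how quickly a spoofed RA can expire it.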
Reviewed by: glebius
Differential Revision: https://reviews.freebsd.org/D56128
nd6: Remove goto and unused condition in prelist_update
While here, style it.
Reviewed by: markj, zlei
Differential Revision: https://reviews.freebsd.org/D56136
nd6: Break nd6_prefix_lifetime_update out of prelist_update
The logic for updating the prefix lifetime is big enough that it deserves
its own function.
While here, fix style.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D56135
nd6: Remove anycast check in prelist_update
RFC 2462 is obsoleted by RFC 4862, which states the requirements more
clearly than before.
Since SLAAC can't create anycast addresses by itself, remove
the anycast check.
While here, update comments based on RFC 4862.
Reviewed by: markj, zlei
Differential Revision: https://reviews.freebsd.org/D56134
nd6: Ignore entire PI if violates RFC 4862 section 5.5.3
Ignore the prefix information (PI) update earlier in `prelist_update()`.
If the PI is invalid or its autonomous bit is unset, it is better to let our
SLAAC address expire, and if we have no previously matching
prefix, it is better not to create a new one.
Either our router doesn't want us to have one anymore, or
the RA itself is malicious.
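The early checks from RFC 4862 section 5.5.3 (a) through (c) can be sketched as below. This Python model is illustrative and does not mirror the kernel code: `PrefixInfo` and `should_ignore_pi` are hypothetical names, and the kernel may apply further validity checks that are not shown.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class PrefixInfo:
    prefix: str               # e.g. "2001:db8::/64"
    autonomous: bool          # the A flag of the option
    valid_lifetime: int       # seconds
    preferred_lifetime: int   # seconds


def should_ignore_pi(pi: PrefixInfo) -> bool:
    """True if the whole Prefix Information option should be dropped."""
    # (a) Autonomous flag not set: not usable for SLAAC.
    if not pi.autonomous:
        return True
    # (b) The link-local prefix is never configured through SLAAC.
    net = ipaddress.ip_network(pi.prefix)
    if net.network_address.is_link_local:
        return True
    # (c) A preferred lifetime longer than the valid lifetime is malformed.
    if pi.preferred_lifetime > pi.valid_lifetime:
        return True
    return False
```

Running these checks before any lookup means a rejected option neither refreshes an existing prefix nor creates a new one.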
Reviewed by: ae
Differential Revision: https://reviews.freebsd.org/D56133
nd6: Change prelist_update return type to void
The return value of `prelist_update()` is unused.
Reviewed by: markj, zlei
Differential Revision: https://reviews.freebsd.org/D56132
nd6: Break pfxrtr_add out of nd6_prelist_add
Updating the defrouter is only required by `prelist_update()`.
Since `nd6_prelist_add()` is a public function, exclude the unused
dr logic from it.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D56131
nd6: Break nd6_prefix_update out of prelist_update
If the PI exists, call prefix_update instead of doing the update inside
prelist_update.
No functional change intended.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D56130
Merge tag 'stop-machine.2026.04.16a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull stop-machine update from Paul McKenney:
- kernel-doc updates for stop_machine() and stop_machine_cpuslocked()
functions
* tag 'stop-machine.2026.04.16a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
stop_machine: Fix the documentation for a NULL cpus argument
[AMDGPU][GlobalIsel] Add regbank support for cvt_scalef32_sr_pk_f6_f116/32 intrinsics (#192745)
This patch adds register bank legalization rules for
cvt_scalef32_sr_pk_f6_f116/32 intrinsics in the AMDGPU GlobalISel
pipeline.