[AMDGPU] InstCombine: fold invalid calls to amdgcn intrinsics into poison values (#191904)
Replace a call to amdgpu intrinsic into a poison value when the call is
invalid because of "amdgpu-no-<xyz>" attribute in the caller function.
Upon
https://github.com/llvm/llvm-project/pull/186925#pullrequestreview-3983414064
Assisted by claude-4.6-sonnet-medium through CURSOR.
pkru.3: Note that the kernel may not respect PKRU protections
There are cases where the kernel will be able to access memory covered
by a PKRU key which nomially prohibits accesses. I believe regular
copyin()/copyout() are subject to the contents of PKRU, but memory
accesses via uiomove_fromphys() will not be. This can arise when
performing fault I/O, for instance. I didn't test, but I suspect AIO is
another case.
Update the man page to acknowledge this.
Reviewed by: alc, kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D56416
pkru.3: Remove a qualifier
Now that i386 kernels are deprecated, we don't really need to mention
this limitation. It's also a bit dated since PKRU is supported with
5-level paging as well.
Reviewed by: alc, kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D56415
[CIR] Fix InlineAsmOp roundtrip parse crash on cir.asm (#186588)
Fix InlineAsmOp parser/printer roundtrip for cir.asm and avoid null
operand_attrs entries that crash alias printing during
--verify-roundtrip.
- Parse attr-dict before optional result arrow to match print order.
- Use non-null sentinel attributes for non-maybe_memory operands and
check UnitAttr explicitly.
- Keep lowering semantics by treating only UnitAttr as maybe_memory
marker.
- Update inline-asm CIR IR test to run with --verify-roundtrip and add
an attr+result coverage case.
Fix https://github.com/llvm/llvm-project/issues/161441
[CIR] Add noundef to __cxx_global_array_dtor parameter (#191529)
The synthetic __cxx_global_array_dtor helper created by
LoweringPrepare was missing noundef on its ptr parameter,
causing a mismatch with classic codegen.
[CIR][NFC] Remove redundant global-var-simple.cpp test (#192354)
This early smoke test is fully covered by
`clang/test/CIR/CodeGen/globals.cpp` and is no longer needed.
Per @andykaylor's feedback on #191521.
Made with [Cursor](https://cursor.com)
[mlir][spirv][nfc] Move GroupNonUniformBallotBitCount tests to `non-uniform-ops.mlir` (#192115)
Tests were incorrectly placed in `group-ops.mlir` since the op is
defined in `SPIRVNonUniformOps.td`.
AMDGPU: Add NextUseAnalysis Pass (#178873)
Based on
- https://github.com/llvm/llvm-project/pull/156079 and
- https://github.com/llvm/llvm-project/pull/171520
See those PRs for background.
Provides a compatibility mode option
`--amdgpu-next-use-analysis-compatibility-mode` that produces results
that match either PR #156079 (`compute`) or PR #171520 (`graphics`).
Co-authored-by: alex-t <atimofee at amd.com>
Co-authored-by: Konstantina Mitropoulou <KonstantinaMitropoulou at amd.com>
---------
Co-authored-by: Konstantina Mitropoulou <KonstantinaMitropoulou at amd.com>
[SLP] Normalize copyable operand order via majority voting
When building operands for entries with copyable elements, non-copyable
lanes of commutative ops may have inconsistent operand order (e.g. some
lanes have load,add while others have add,load). This prevents
VLOperands::reorder() from grouping consecutive loads on one side,
degrading downstream vectorization.
Add majority-voting normalization during buildOperands: track the
(ValueID, ValueID) pair frequency across non-copyable lanes and swap
any lane whose operand types are the exact inverse of the most common
pattern. This makes operand order consistent, enabling better load
grouping.
This is part 1 of #189181.
Reviewers: RKSimon, hiraditya
Pull Request: https://github.com/llvm/llvm-project/pull/191631
Update ietf-cli 1.31pre0, ok job kn sthen (with a tweak)
This updates the tool to the latest commit which includes version 1.31,
but isn't tagged: https://github.com/paulehoffman/ietf-cli/issues/8
1.30 adds an index subcommand for bcp and std (just like for rfc)
1.31 prints the current document status on exit