[libc] Add posix_memalign as external entrypoint on Linux x86/ARM. (#185310)
`posix_memalign` is provided by the Scudo allocator and is part of the
POSIX standard, so we can safely declare it in the `<stdlib.h>` header on
Linux systems.
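For reference, a minimal sketch of how a caller might use the newly declared
entrypoint. It relies only on the standard POSIX signature
`int posix_memalign(void **memptr, size_t alignment, size_t size)`; the helper
names `alloc_aligned`, `is_aligned`, and `self_check` are hypothetical and are
not part of this change.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate `size` bytes aligned to `alignment`.
 * Returns NULL on failure. posix_memalign reports errors through its
 * return value (e.g. EINVAL, ENOMEM) rather than through errno. */
static void *alloc_aligned(size_t alignment, size_t size) {
    void *ptr = NULL;
    int rc = posix_memalign(&ptr, alignment, size);
    return rc == 0 ? ptr : NULL;
}

/* Hypothetical helper: returns 1 if `ptr` is a multiple of `alignment`. */
static int is_aligned(const void *ptr, size_t alignment) {
    return ((uintptr_t)ptr % alignment) == 0;
}

/* Hypothetical self-check: a 64-byte-aligned request should succeed and
 * yield an aligned pointer that is released with free(). */
static void self_check(void) {
    void *buf = alloc_aligned(64, 256);
    assert(buf != NULL);
    assert(is_aligned(buf, 64));
    free(buf);
}
```

The alignment must be a power of two and a multiple of `sizeof(void *)`;
memory obtained this way is released with plain `free()`.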
[CIR][AMDGPU] Add module flags for AMDGPU target (#186081)
Upstreaming clangIR PR: https://github.com/llvm/clangir/pull/2100
This PR adds support for emitting the AMDGPU-specific module flags
`amdhsa_code_object_version` and `amdgpu_printf_kind` to match OGCG
behavior.
In `CIRGenModule`, the flags are stored as CIR module attributes:
`cir.amdhsa_code_object_version` (integer)
`cir.amdgpu_printf_kind` (string: "hostcall" or "buffered")
During lowering to LLVM IR (in LowerToLLVMIR.cpp), these attributes are
converted to LLVM module flags.
[XeVM] Add translation for XeVM cache-control attributes. (#181856)
Use `llvm.intr.ptr.annotation` to attach cache-control metadata to a
pointer. Each cache-control attribute produces its own annotation call;
multiple attributes are chained so every annotation sits on the same
pointer.
This approach keeps the metadata attached to the pointer across optimizations.
make: ensure .MAKE.SAVE_DOLLARS is set.
This makes it possible for macros to be set so as to work correctly
whether .MAKE.SAVE_DOLLARS is "yes" (NetBSD) or "no" (bmake).
[offload] Remove LIBOMPTARGET_SHARED_MEMORY_SIZE envar (#186231)
This commit removes the `LIBOMPTARGET_SHARED_MEMORY_SIZE` envar and
emits a runtime warning if it is defined. Dynamic shared memory should
instead be requested through the `dyn_groupprivate` clause (OpenMP 6.1) or
through the launch arguments of a liboffload kernel launch.
Exclude known failure case (#186305)
External resources do not produce the same result on big-endian targets.
Keeping this test scoped to regressions of the encoding keeps it simple,
and the failure does not affect the usage there, so just mark it as XFAIL.
[RISCV] Fix crash in getShuffleCost for P-extension without V extension (#186149)
RISCVTTIImpl::getShuffleCost() crashes when querying the cost of a
reverse shufflevector on a target with the P-extension but without V/Zve
extensions. The SK_Reverse case calls
getContainerForFixedLengthVector(), which asserts hasVInstructions().
The P-extension uses fixed-width packed SIMD in GPRs, not RVV registers,
so the V extension is typically not enabled.
Add an early return for P-extension fixed vectors in getShuffleCost,
consistent with the existing guards in getScalarizationOverhead,
getCastInstrCost, and getVectorInstrCost.
[RISCV] Fix crash in combinePExtTruncate for truncate(srl) without MUL/SUB (#186141)
combinePExtTruncate is called from performTRUNCATECombine when the
P-extension is enabled. It attempts to match patterns like
truncate(srl(mul/sub(...), shamt)) and combine them into P-extension
narrowing shift instructions (e.g. PNSRLI, PNSRAI).
However, after extracting the shift input operand `Op` from the SRL
node, the function unconditionally accessed Op.getOperand(0) and
Op.getOperand(1) without first verifying that Op has at least two
operands. For example, when combining:
```
truncate(v2i16
srl(v2i32
bitcast(v2i32 i64), <-- Op = bitcast, a unary op with 1 operand
BUILD_VECTOR <8, 8>))
```
[7 lines not shown]
[MLIR][XeGPU] Enhance Layout Propagation for broadcasting both leading dimensions and inner unit dimensions (#185583)
This PR enhances the layout propagation rules for broadcast operations.
The source layout is derived from the result layout based on the
broadcast pattern:
1. Broadcast on leading dimensions
The source layout is the slice layout of the result layout.
2. Broadcast on inner unit dimensions
The source layout matches the result layout, with sg_data and lane_data
set to 1.
3. Broadcast on both leading dimensions and inner unit dimensions
The source layout is derived by combining the above two rules.
drm/sched: Fix kernel-doc warning for drm_sched_job_done()
From Yujie Liu
da09dfc90cb7ed1ab40d675234382f151eeb0563 in linux-6.18.y/6.18.17
61ded1083b264ff67ca8c2de822c66b6febaf9a8 in mainline linux
drm/syncobj: Fix handle <-> fd ioctls with dirty stack
From Julian Orth
7196a1ff7b9a2ab6d973fe3c1dfc426d8d8ed4d2 in linux-6.18.y/6.18.17
2e3649e237237258a08d75afef96648dd2b379f7 in mainline linux
drm/amd/display: Use GFP_ATOMIC in dc_create_stream_for_sink
From Natalie Vock
0381584929791c4b989fb0a36a466ae20aea1608 in linux-6.18.y/6.18.17
28dfe4317541e57fe52f9a290394cd29c348228b in mainline linux
drm/i915/dp: Fix pipe BPP clamping due to HDR
From Imre Deak
9498fa25a0b0d8c095ce3d1f15d7864228692822 in linux-6.18.y/6.18.17
fe26ae6ac8b88fcdac5036b557c129a17fe520d2 in mainline linux
drm/i915/dp: Fail state computation for invalid DSC source input BPP values
From Imre Deak
99f617ea2ff017b0ba10d5371d83345331091afa in linux-6.18.y/6.18.17
338465490cf7bd4a700ecd33e4855fee4622fa5f in mainline linux
drm/amd: Fix hang on amdgpu unload by using pci_dev_is_disconnected()
From Mario Limonciello
378dff71efddd15f34124bf9d7c98cd69cd05286 in linux-6.18.y/6.18.17
f7afda7fcd169a9168695247d07ad94cf7b9798f in mainline linux
drm/amdgpu: Fix error handling in slot reset
From Lijo Lazar
73e8bdf14248136459753252a438177df7ed8c7c in linux-6.18.y/6.18.17
b57c4ec98c17789136a4db948aec6daadceb5024 in mainline linux