LLVM/project b5a0aa7clang/include/clang/Basic DiagnosticSemaKinds.td, clang/lib/Sema SemaLifetimeSafety.h

helpful-destroyed-here
DeltaFile
+157-157clang/test/Sema/LifetimeSafety/safety.cpp
+44-44clang/test/Sema/LifetimeSafety/nocfg.cpp
+7-7clang/test/Sema/LifetimeSafety/annotation-suggestions.cpp
+2-2clang/test/Sema/LifetimeSafety/cfg-bailout.cpp
+2-1clang/lib/Sema/SemaLifetimeSafety.h
+1-1clang/include/clang/Basic/DiagnosticSemaKinds.td
+213-2126 files

LLVM/project 0579490llvm/include/llvm/IR Intrinsics.td Intrinsics.h, llvm/lib/IR Intrinsics.cpp

[NFC][LLVM] Refactor IIT_ANY payload for vector/element constraint (#203506)

Change `IIT_ANY` payload from a single packed OverloadIndex + AnyKind
byte to 2 bytes:
- An 8-bit OverloadIndex
- An 8-bit packed vector + element type constraint.
This will enable `IIT_ANY` to express constraints on the overload type
in a more general fashion compared to a flat `AnyKind` enum.

Also fixed a latent bug in fixed encodings generated by the intrinsic
emitter (exposed by this change). Existing `encodePacked` packs the
type-signature as 8 nibbles into a 32-bit word and then checks if the
MSB bit position (i.e., bit 15) is 0 (to allow its use in fixed
encoding). This effectively drops any 0-valued bytes in the encoding in
the upper 4 nibbles. Fix this by changing `encodePacked` to use the
actual fixed encoding type and its size.
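
The failure mode described above can be illustrated with a small standalone sketch. This is not the emitter's code: `FixedEncodingTy` being 16 bits wide, and the helper names `encodePackedLossy` and `encodePackedStrict`, are assumptions made only for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Assumption for this sketch: the fixed encoding is a 16-bit word.
using FixedEncodingTy = uint16_t;

// Lossy variant (hypothetical): packs up to 8 nibbles into a 32-bit word and
// only checks that bit 15 and everything above it are clear. Trailing
// zero-valued nibbles beyond the fourth vanish without being noticed.
std::optional<FixedEncodingTy>
encodePackedLossy(const std::vector<uint8_t> &Sig) {
  if (Sig.size() > 8)
    return std::nullopt;
  uint32_t Word = 0;
  for (size_t I = 0; I < Sig.size(); ++I) {
    if (Sig[I] > 0xF)
      return std::nullopt; // Not a nibble.
    Word |= static_cast<uint32_t>(Sig[I]) << (4 * I);
  }
  if (Word >> 15)
    return std::nullopt;
  return static_cast<FixedEncodingTy>(Word);
}

// Strict variant (hypothetical): sizes the check from the encoding type
// itself, so a signature longer than the type can hold is rejected.
std::optional<FixedEncodingTy>
encodePackedStrict(const std::vector<uint8_t> &Sig) {
  constexpr size_t MaxNibbles = sizeof(FixedEncodingTy) * 2;
  constexpr uint32_t MSB = 1u << (sizeof(FixedEncodingTy) * 8 - 1);
  if (Sig.size() > MaxNibbles)
    return std::nullopt;
  uint32_t Word = 0;
  for (size_t I = 0; I < Sig.size(); ++I) {
    if (Sig[I] > 0xF)
      return std::nullopt; // Not a nibble.
    Word |= static_cast<uint32_t>(Sig[I]) << (4 * I);
  }
  if (Word & MSB)
    return std::nullopt; // MSB must stay clear for the fixed encoding.
  return static_cast<FixedEncodingTy>(Word);
}
```

In this sketch the signatures `{1, 2, 3}` and `{1, 2, 3, 0, 0}` produce the same word in the lossy variant, while the strict variant rejects the five-entry signature.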
DeltaFile
+67-56llvm/include/llvm/IR/Intrinsics.td
+66-15llvm/lib/IR/Intrinsics.cpp
+42-26llvm/utils/TableGen/Basic/IntrinsicEmitter.cpp
+38-0llvm/test/TableGen/intrinsic-overload-index-oor.td
+15-12llvm/test/TableGen/intrinsic-struct.td
+13-9llvm/include/llvm/IR/Intrinsics.h
+241-1186 files

LLVM/project 7670d88flang/lib/Optimizer/Transforms/CUDA CUFDeviceFuncTransform.cpp, flang/test/Fir/CUDA cuda-device-func-transform.mlir

[flang][cuda] Set kernel intent(in) as const __restrict__ (#203652)

Set attributes on `intent(in)` so `ld.global.nc` is generated by the
backend.
DeltaFile
+38-0flang/lib/Optimizer/Transforms/CUDA/CUFDeviceFuncTransform.cpp
+16-2flang/test/Fir/CUDA/cuda-device-func-transform.mlir
+54-22 files

LLVM/project 1b58516clang/lib/Sema SemaLifetimeSafety.h, clang/test/Sema/LifetimeSafety safety.cpp nocfg.cpp

[LifetimeSafety] Improve aliasing notes to include callee name (#203606)
DeltaFile
+56-51clang/test/Sema/LifetimeSafety/safety.cpp
+39-25clang/test/Sema/LifetimeSafety/nocfg.cpp
+14-1clang/lib/Sema/SemaLifetimeSafety.h
+6-6clang/test/Sema/LifetimeSafety/annotation-suggestions.cpp
+115-834 files

LLVM/project f676da3clang/include/clang/Basic DiagnosticSemaKinds.td, clang/lib/Sema SemaLifetimeSafety.h

users/usx95/helpful-invalidations
DeltaFile
+74-74clang/test/Sema/LifetimeSafety/invalidations.cpp
+14-6clang/lib/Sema/SemaLifetimeSafety.h
+4-4clang/test/Sema/LifetimeSafety/safety.cpp
+4-4clang/include/clang/Basic/DiagnosticSemaKinds.td
+96-884 files

LLVM/project af60d56flang/lib/Semantics expression.cpp, flang/test/Semantics cuf28.cuf cuf-generic-literal-host.cuf

[flang][CUDA] Keep host literals from using unified-memory generic distance (#201257)

Fix CUDA generic resolution under `-gpu=mem:unified` so unattributed
literals and expression temporaries are not treated as unified-memory
actuals.

Previously, a host scalar literal such as `1.0` could score as
compatible with a `DEVICE` dummy and incorrectly select the
device-scalar overload. This could pass a host stack address to a device
helper and fail at runtime. The fix applies the unified/managed memory
distance columns only to symbol-backed actuals.
DeltaFile
+37-0flang/test/Semantics/cuf28.cuf
+34-0flang/test/Semantics/cuf-generic-literal-host.cuf
+15-2flang/lib/Semantics/expression.cpp
+86-23 files

LLVM/project 3203867llvm/test/CodeGen/AMDGPU fcanonicalize.ll maximumnum.ll, llvm/test/Transforms/LICM vector-insert.ll

Merge branch 'main' into users/usx95/06-12-users_usx95_helpful-invalidations
DeltaFile
+2,760-227llvm/test/CodeGen/AMDGPU/fcanonicalize.ll
+1,357-0llvm/test/CodeGen/AMDGPU/maximumnum.ll
+1,317-0llvm/test/CodeGen/AMDGPU/minimumnum.ll
+1,313-0llvm/test/CodeGen/AMDGPU/packed-u64.ll
+736-0llvm/test/CodeGen/AMDGPU/shl.v2i64.ll
+0-572llvm/test/Transforms/LICM/vector-insert.ll
+7,483-799172 files not shown
+14,680-2,140178 files

LLVM/project 04f1175clang/lib/Sema SemaLifetimeSafety.h, clang/test/Sema/LifetimeSafety safety.cpp nocfg.cpp

[LifetimeSafety] Improve aliasing notes to include callee name (#203606)
DeltaFile
+56-51clang/test/Sema/LifetimeSafety/safety.cpp
+39-25clang/test/Sema/LifetimeSafety/nocfg.cpp
+14-1clang/lib/Sema/SemaLifetimeSafety.h
+6-6clang/test/Sema/LifetimeSafety/annotation-suggestions.cpp
+115-834 files

LLVM/project 6b82a04flang/lib/Optimizer/Transforms/CUDA CUFOpConversion.cpp, flang/test/Fir/CUDA cuda-global-addr.mlir

[flang][cuda] Fix host loads from CUDA constant globals (#203064)

This fixes CUDA Fortran lowering for scalar module variables with the
constant attribute that are read from host code, such as launch
configuration expressions or CUF kernel loop bounds.

Previously, host-side declarations for these globals could be rewritten
to device constant-memory addresses, causing host loads to dereference
the result of _FortranACUFGetDeviceAddress. The fix preserves host reads
from the host-visible global while still using the device address for
host-to-device assignment updates.

A FIR regression test covers host reads and assignment updates for
scalar CUDA constant globals.
DeltaFile
+36-0flang/lib/Optimizer/Transforms/CUDA/CUFOpConversion.cpp
+25-0flang/test/Fir/CUDA/cuda-global-addr.mlir
+61-02 files

LLVM/project 0a713dbllvm/test/CodeGen/AMDGPU fcanonicalize.ll llvm.amdgcn.sched.group.barrier.ll

Merge branch 'main' into users/petar-avramovic/pk-f64
DeltaFile
+2,760-227llvm/test/CodeGen/AMDGPU/fcanonicalize.ll
+1,813-654llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.ll
+1,357-0llvm/test/CodeGen/AMDGPU/maximumnum.ll
+1,317-0llvm/test/CodeGen/AMDGPU/minimumnum.ll
+1,313-0llvm/test/CodeGen/AMDGPU/packed-u64.ll
+784-230llvm/test/CodeGen/AMDGPU/llvm.amdgcn.iglp.opt.ll
+9,344-1,111279 files not shown
+20,659-3,533285 files

LLVM/project baf76a8bolt/lib/Profile DataAggregator.cpp, bolt/test/perf2bolt perf_test.test

[BOLT] Change DataAggregator error types (#203651)

1. In `filterBinaryMMapInfo`, replace `inconvertibleErrorCode` with an errc
   code, as `parseMainEvents` converts the returned Error to std::error_code.
2. In `parsePerfData`, pass through the Error returned by `prepareToParse`
   for memory events.

Test Plan: updated perf_test.test
DeltaFile
+4-0bolt/test/perf2bolt/perf_test.test
+2-2bolt/lib/Profile/DataAggregator.cpp
+6-22 files

LLVM/project 52a3108llvm/test/CodeGen/RISCV clmul.ll clmulr.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll clmul-sdnode.ll

Merge branch 'main' into users/shiltian/reqd_work_group_size-verifier
DeltaFile
+38,494-84,026llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+22,388-22,086llvm/test/CodeGen/RISCV/rvv/clmul-sdnode.ll
+19,087-24,391llvm/test/CodeGen/RISCV/clmul.ll
+10,473-12,572llvm/test/CodeGen/RISCV/clmulr.ll
+10,281-12,374llvm/test/CodeGen/RISCV/clmulh.ll
+8,361-8,920llvm/test/CodeGen/RISCV/rvv/expandload.ll
+109,084-164,3694,730 files not shown
+509,330-370,8814,736 files

LLVM/project a8e3c08libc/src/__support/threads raw_rwlock.h raw_mutex.h, libc/src/__support/threads/linux futex_utils.h

[libc] fix EAGAIN being treated as timeout in mutex and rwlock (#203574)

Fixes #203411.

This PR addresses the problem that `EAGAIN` may be treated as timeout in
mutex and rwlock. Two changes are applied:

1. timeout sites always explicitly check for timeout now to make the
logic more robust;
2. the futex wait now discards the error of `EAGAIN/EWOULDBLOCK` and
returns 0;

We don't distinguish waking up from a signal and waking up from a mismatch
for the following 3 reasons:
- We have a userspace guard to avoid the futex syscall if we already know
the value would match; it seems awkward to make that check return an error,
as we may wake up and loop back to the check, where the signal is consumed
but we still return an error;
- futex syscall can spuriously wake up anyway, there is no way to tell

    [3 lines not shown]
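
The two changes listed above can be sketched as follows. This is a minimal illustration, not the libc implementation: the helper names `normalizeFutexWaitResult` and `isTimeout` are hypothetical, and the raw result is assumed to be either 0 or a negated errno value.

```cpp
#include <cerrno>

// Hypothetical helper: a value mismatch (EAGAIN/EWOULDBLOCK) is treated like
// any other wake-up and reported as 0; other results pass through unchanged.
int normalizeFutexWaitResult(int Raw) {
  if (Raw == -EAGAIN || Raw == -EWOULDBLOCK)
    return 0;
  return Raw;
}

// Hypothetical helper: timeout sites test for ETIMEDOUT explicitly rather
// than treating every non-zero result as a timeout.
bool isTimeout(int Result) { return Result == -ETIMEDOUT; }
```

With this shape, a caller that loops on the lock state only gives up when `isTimeout` is true and otherwise re-checks the state, regardless of why the wait returned.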
DeltaFile
+4-2libc/src/__support/threads/raw_rwlock.h
+5-0libc/src/__support/threads/linux/futex_utils.h
+2-2libc/src/__support/threads/raw_mutex.h
+1-1libc/test/integration/src/__support/threads/futex_requeue_test.cpp
+12-54 files

LLVM/project 92d7a7fmlir/include/mlir/Dialect/Quant/IR QuantDialectBytecode.td QuantBase.td, mlir/lib/Dialect/Quant/IR QuantDialectBytecode.cpp

QuantileType bytecode patch (#203495)

Since the merge of PR https://github.com/llvm/llvm-project/pull/190321,
some issues were identified, such as QuantileType not being added to the
bytecode files. This PR fixes these missing pieces, which should make
QuantileType a complete and functional type.
DeltaFile
+23-0mlir/lib/Dialect/Quant/IR/QuantDialectBytecode.cpp
+15-1mlir/include/mlir/Dialect/Quant/IR/QuantDialectBytecode.td
+16-0mlir/test/Dialect/Quant/Bytecode/types.mlir
+10-0mlir/include/mlir/Dialect/Quant/IR/QuantBase.td
+1-0mlir/include/mlir/Dialect/Quant/IR/Quant.h
+65-15 files

LLVM/project c9b25a6libc/include stdlib.yaml, libc/src/stdlib mkstemp.cpp mkstemp.h

[libc] implement mkstemp (#199220)

Fixes #191266
Implements `mkstemp` as specified in POSIX.
Currently Linux-only, since it relies on the Linux syscall wrappers for
`getrandom` and `open`.
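
For reference, a minimal usage sketch of the POSIX `mkstemp` interface on a POSIX host. The wrapper name `makeTempFile` and the `/tmp` path used in the note below are assumptions for this example; the sketch exercises whatever libc it is built against.

```cpp
#include <stdlib.h> // mkstemp (POSIX)
#include <string>
#include <unistd.h> // close, unlink

// Hypothetical wrapper: creates a unique file from a template whose last six
// characters are "XXXXXX", then closes and removes it again. Returns the
// generated name, or an empty string if mkstemp fails.
std::string makeTempFile(std::string Template) {
  int Fd = mkstemp(&Template[0]); // mkstemp rewrites the template in place.
  if (Fd < 0)
    return std::string();
  close(Fd);
  unlink(Template.c_str());
  return Template;
}
```

Calling `makeTempFile("/tmp/exampleXXXXXX")` returns a name of the same length with the trailing `XXXXXX` replaced, while a template that does not end in `XXXXXX` makes `mkstemp` fail and yields an empty string.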
DeltaFile
+207-0libc/test/src/stdlib/mkstemp_test.cpp
+87-0libc/src/stdlib/mkstemp.cpp
+31-0libc/src/stdlib/mkstemp.h
+21-0libc/test/src/stdlib/CMakeLists.txt
+17-0libc/src/stdlib/CMakeLists.txt
+6-0libc/include/stdlib.yaml
+369-03 files not shown
+372-09 files

LLVM/project 7430170clang/test/Analysis/Scalable/PointerFlow lref-to-rref-cast.test

add ' --ssaf-compilation-unit-id'
DeltaFile
+2-1clang/test/Analysis/Scalable/PointerFlow/lref-to-rref-cast.test
+2-11 files

LLVM/project 81a81d7llvm/lib/Target/AMDGPU SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU dynamic_stackalloc.ll amdgpu-cs-chain-fp-nosave.ll

Revert "[AMDGPU] In `LowerDYNAMIC_STACKALLOC`, hoist the `readfirstlane` up one instruction" (#203645)

Reverts llvm/llvm-project#201528

Reverting due to change causing "illegal VGPR to SGPR copy"
DeltaFile
+210-180llvm/test/CodeGen/AMDGPU/dynamic_stackalloc.ll
+49-36llvm/test/CodeGen/AMDGPU/amdgpu-cs-chain-fp-nosave.ll
+7-5llvm/test/CodeGen/AMDGPU/llvm.sponentry.ll
+6-5llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+272-2264 files

LLVM/project d2163f7flang/lib/Semantics check-cuda.cpp, flang/test/Semantics cuf09.cuf

[flang][cuda] Error out if pause statement is used in device code (#203642)
DeltaFile
+7-0flang/test/Semantics/cuf09.cuf
+4-0flang/lib/Semantics/check-cuda.cpp
+11-02 files

LLVM/project 4d5862cllvm/lib/Target/RISCV RISCVInstrInfoXqci.td, llvm/lib/Target/RISCV/Disassembler RISCVDisassembler.cpp

address feedback

Created using spr 1.3.8-beta.1
DeltaFile
+112-128llvm/utils/TableGen/Common/CodeGenHwModes.cpp
+14-1llvm/utils/TableGen/Common/CodeGenHwModes.h
+4-4llvm/lib/Target/RISCV/RISCVInstrInfoXqci.td
+2-3llvm/utils/TableGen/Common/SubtargetFeatureInfo.h
+0-1llvm/lib/Target/RISCV/Disassembler/RISCVDisassembler.cpp
+132-1375 files

LLVM/project 0a6e021llvm/lib/Target/SPIRV SPIRVEmitIntrinsics.cpp, llvm/test/CodeGen/SPIRV freeze-aggregate.ll

[SPIR-V] Lower freeze instructions with aggregate operands (#203584)

An aggregate freeze takes its result type from its operand, like a PHI
or select, but was handled by neither the up-front value-id mutation nor
replaceMemInstrUses, so the pass aborted with "illegal aggregate
intrinsic user". Mutate aggregate freezes to the i32 value-id type and
replace their operands alongside PHIs and selects.
DeltaFile
+47-0llvm/test/CodeGen/SPIRV/extensions/SPV_KHR_poison_freeze/freeze-aggregate.ll
+40-0llvm/test/CodeGen/SPIRV/freeze-aggregate.ll
+12-11llvm/lib/Target/SPIRV/SPIRVEmitIntrinsics.cpp
+99-113 files

LLVM/project 4c057fellvm/lib/Target/AMDGPU SIRegisterInfo.cpp, llvm/test/CodeGen/AMDGPU spillv16Kernel.ll

[AMDGPU][true16] extract 16bit for scratch_load_ubyte_st when spilling (#203589)

In sramecc mode, scratch_load_ubyte_st is selected for 16-bit spilling.
This needs a temporary vgpr32, from which the lo16 half is extracted.
DeltaFile
+46-0llvm/test/CodeGen/AMDGPU/spillv16Kernel.ll
+2-1llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+48-12 files

LLVM/project 0618f10llvm/test/CodeGen/RISCV clmul.ll clmulr.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll clmul-sdnode.ll

Merge branch 'main' of github.com:llvm/llvm-project into users/ziqingluo/PR-179173940

 Conflicts:
        clang/unittests/ScalableStaticAnalysisFramework/Analyses/PointerFlow/PointerFlowTest.cpp
DeltaFile
+38,494-84,026llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+22,388-22,086llvm/test/CodeGen/RISCV/rvv/clmul-sdnode.ll
+19,087-24,391llvm/test/CodeGen/RISCV/clmul.ll
+10,473-12,572llvm/test/CodeGen/RISCV/clmulr.ll
+10,281-12,374llvm/test/CodeGen/RISCV/clmulh.ll
+8,361-8,920llvm/test/CodeGen/RISCV/rvv/expandload.ll
+109,084-164,3693,338 files not shown
+387,072-351,1923,344 files

LLVM/project 181d808llvm/lib/Target/AArch64 AArch64PointerAuth.cpp, llvm/test/CodeGen/AArch64 swifttail-ptrauth.ll pauth-lr-tail-call-fpdiff.ll

[AArch64][PAuth] Fix return-address auth for swifttailcc with FPDiff > 0 (#203340)

When a swifttailcc tail call has FPDiff > 0 (the caller received more
stack argument space than the callee pops), the epilogue contains an SP
adjustment to discard the leftover argument space. The existing code
treated both FPDiff < 0 and FPDiff > 0 uniformly in a single 'FPDiff !=
0' block, using AUTI[AB]1716 with a reconstructed entry-SP in x16 for
both cases.

For FPDiff < 0 (callee pops more) that reconstruction is necessary and
correct. For FPDiff > 0 it is wrong: by the time we enter the block the
post-index LDP has already adjusted SP back to the frame base, but the
'add sp, sp, #N' argument pop has not yet run. Entry SP equals the
current SP at that point, so AUTI[AB]SP would work directly, but instead
the combined block bumped SP via StackOffset::getFixed(-FPDiff) which
overshoots, and then emits AUTIA1716 with a wrong discriminator. Worse
yet, the SP restore had already been emitted *before* the auth, leaving
the live argument stack below SP and outside the red-zone during the
authentication window.

    [9 lines not shown]
DeltaFile
+202-0llvm/test/CodeGen/AArch64/swifttail-ptrauth.ll
+38-8llvm/lib/Target/AArch64/AArch64PointerAuth.cpp
+6-6llvm/test/CodeGen/AArch64/pauth-lr-tail-call-fpdiff.ll
+1-1llvm/test/CodeGen/AArch64/arm64e-tail-call-autib.ll
+247-154 files

LLVM/project 00b39efclang/lib/ScalableStaticAnalysisFramework/Analyses SSAFAnalysesCommon.h, clang/test/Analysis/Scalable/PointerFlow lref-to-rref-cast.test

[SSAF][PointerFlow] Recognize reference-to-pointer/array Decls

Decls of reference-to-pointer/array types are now treated the same as
those of pointer/array type.

rdar://179173940
DeltaFile
+62-2clang/unittests/ScalableStaticAnalysisFramework/Analyses/PointerFlow/PointerFlowTest.cpp
+40-0clang/test/Analysis/Scalable/PointerFlow/lref-to-rref-cast.test
+8-1clang/lib/ScalableStaticAnalysisFramework/Analyses/SSAFAnalysesCommon.h
+110-33 files

LLVM/project 8fb9963llvm/lib/Target/AMDGPU VOP3PInstructions.td, llvm/test/CodeGen/AMDGPU shl.v2i64.ll pk-lshl-add-u64.ll

[AMDGPU] Add gfx1251 V_PK_LSHL_ADD_U64 (#203612)
DeltaFile
+736-0llvm/test/CodeGen/AMDGPU/shl.v2i64.ll
+241-0llvm/test/CodeGen/AMDGPU/pk-lshl-add-u64.ll
+52-0llvm/test/MC/AMDGPU/gfx1251_asm_vop3p.s
+46-0llvm/lib/Target/AMDGPU/VOP3PInstructions.td
+39-0llvm/test/MC/Disassembler/AMDGPU/gfx1251_dasm_vop3p.txt
+34-0llvm/test/MC/AMDGPU/gfx1251_err.s
+1,148-04 files not shown
+1,167-210 files

LLVM/project 3c846b2llvm/include/llvm/Transforms/Vectorize SLPVectorizer.h, llvm/lib/Transforms/Vectorize SLPVectorizer.cpp

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
DeltaFile
+695-17llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+90-0llvm/test/Transforms/SLPVectorizer/X86/runtime-alias-checks.ll
+41-10llvm/test/Transforms/SLPVectorizer/AArch64/loadi8.ll
+13-0llvm/include/llvm/Transforms/Vectorize/SLPVectorizer.h
+839-274 files

LLVM/project e882286bolt/lib/Profile DataAggregator.cpp

[BOLT] Fix perf data return identification (#203628)

If perf data doesn't have the branch type recorded, the missing value would
incorrectly be interpreted as not-a-return. Only populate the Returns map if
the branch type is available.
Fixes a bug introduced in #202813.
DeltaFile
+2-1bolt/lib/Profile/DataAggregator.cpp
+2-11 files

LLVM/project d0cd530llvm/lib/Transforms/Scalar LoopInterchange.cpp

[LoopInterchange] Mark getAddRecCoefficient with static (#203624)

As this function is a file-scope non-member function, it's better to
mark it with static.
DeltaFile
+2-2llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+2-21 files

LLVM/project 2f8a39dllvm/lib/Transforms/Scalar LoopInterchange.cpp, llvm/test/Transforms/LoopInterchange lcssa-incoming-value-is-not-instr.ll

[LoopInterchange] Fix crash when followLCSSA returns constant (#203515)

Similar to the case in #201069, `followLCSSA` may return a constant
value, but it was cast to Instruction unconditionally. We need to
explicitly check whether the returned value is an Instruction or not.
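
The pattern can be sketched in plain C++. The `Value`, `Instruction`, and `Constant` classes below are simplified stand-ins, not the LLVM classes, and `dynamic_cast` stands in for LLVM's own checked-cast helpers; `isInstruction` is a hypothetical name.

```cpp
// Simplified stand-ins for the IR class hierarchy (assumption for this sketch).
struct Value {
  virtual ~Value() = default;
};
struct Instruction : Value {};
struct Constant : Value {};

// Checked query: true only if the value really is an Instruction. An
// unconditional downcast of a Constant to Instruction would be invalid.
bool isInstruction(const Value &V) {
  return dynamic_cast<const Instruction *>(&V) != nullptr;
}
```

A caller would branch on this check and handle the constant case separately rather than casting first.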

Fix #203375.
DeltaFile
+70-0llvm/test/Transforms/LoopInterchange/lcssa-incoming-value-is-not-instr.ll
+7-5llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+77-52 files

LLVM/project ae026a5llvm/lib/Target/AMDGPU AMDGPU.td, llvm/test/CodeGen/AMDGPU branch-relaxation-gfx1250.ll

[AMDGPU] Enable S_ADD_PC_I64 on gfx1251 (#203613)
DeltaFile
+2-1llvm/lib/Target/AMDGPU/AMDGPU.td
+1-1llvm/test/CodeGen/AMDGPU/branch-relaxation-gfx1250.ll
+3-22 files