LLVM/project d48575f clang/include/clang/Driver Distro.h, clang/lib/Driver Distro.cpp

Add support for Ubuntu 26.10 - Stonking Stingray (#196896)

Co-authored-by: Oliver Reiche <oliver.reiche at canonical.com>
Delta  File
+2 -1  clang/include/clang/Driver/Distro.h
+1 -0  clang/lib/Driver/Distro.cpp
+3 -1  2 files

LLVM/project ef6fd03 llvm/lib/Target/X86 X86InstrCompiler.td X86ISelLowering.cpp, llvm/test/CodeGen/X86 atomic-load-store.ll

[X86] Cast atomic vectors in IR to support floats

This commit casts floats to ints in an atomic load during AtomicExpand to support
floating-point types. It is also required to support 128-bit vectors in SSE/AVX.
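
As a source-level analogy for the cast described above (not the actual IR transformation), the sketch below loads a 32-bit pattern atomically as an integer and then reinterprets the bits as a float. The helper names `atomic_load_float_via_int` and `atomic_store_float_via_int` are hypothetical and do not exist in LLVM; the sketch assumes a 32-bit IEEE 754 `float`.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: atomically load the raw 32-bit pattern as an integer,
// then reinterpret those bits as a float.
inline float atomic_load_float_via_int(const std::atomic<std::uint32_t> &bits) {
  // The analogy assumes the integer atomic is lock-free on this target.
  assert(bits.is_lock_free());
  std::uint32_t raw = bits.load(std::memory_order_acquire);
  float value;
  static_assert(sizeof(value) == sizeof(raw), "float must be 32 bits");
  std::memcpy(&value, &raw, sizeof(value));
  return value;
}

// Hypothetical companion: reinterpret the float as integer bits and store
// those bits atomically.
inline void atomic_store_float_via_int(std::atomic<std::uint32_t> &bits,
                                       float value) {
  std::uint32_t raw;
  std::memcpy(&raw, &value, sizeof(raw));
  bits.store(raw, std::memory_order_release);
}
```

The atomic operation itself only ever sees an integer; the floating-point interpretation is applied to the loaded bits afterwards.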
Delta  File
+98 -288  llvm/test/CodeGen/X86/atomic-load-store.ll
+15 -0  llvm/lib/Target/X86/X86InstrCompiler.td
+7 -0  llvm/lib/Target/X86/X86ISelLowering.cpp
+2 -0  llvm/lib/Target/X86/X86ISelLowering.h
+122 -288  4 files

LLVM/project b71b576 llvm/include/llvm/Target TargetSelectionDAG.td, llvm/lib/CodeGen/SelectionDAG LegalizeVectorTypes.cpp LegalizeTypes.h

[SelectionDAG] Split vector types for atomic load (#165818)

Vector types that aren't widened are split so that a single ATOMIC_LOAD
is issued for the entire vector at once. This change utilizes the load
vectorization infrastructure in SelectionDAG in order to group the
vectors. This enables SelectionDAG to translate vectors with element types
bfloat and half.
Delta  File
+349 -4  llvm/test/CodeGen/X86/atomic-load-store.ll
+34 -0  llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
+14 -0  llvm/include/llvm/Target/TargetSelectionDAG.td
+1 -0  llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
+398 -4  4 files

LLVM/project c319640 flang/lib/Semantics resolve-names.cpp, flang/test/Semantics stmt-func01.f90

[flang] dummy arguments used as function calls (#196426)

Adding an error when a dummy argument is used as a statement function.
 
```
SUBROUTINE a(foo)
foo(c) = 0
END SUBROUTINE a
```
This PR now points out:
   1) Dummy argument 'foo' may not be used as a statement function
   2) 'foo' is not a callable procedure
   
Handles issue
[196424](https://github.com/llvm/llvm-project/issues/196424)

---------

Co-authored-by: Sunil Kuravinakop <kuravina at pe31.hpc.amslabs.hpecorp.net>
Delta  File
+19 -0  flang/test/Semantics/stmt-func01.f90
+5 -0  flang/lib/Semantics/resolve-names.cpp
+24 -0  2 files

LLVM/project e2e2529 llvm/utils/git github-automation.py

Update GitHub PR Greeter (#194307)

Following these two discussions:
* https://discourse.llvm.org/t/rfc-mention-our-ai-policy-in-the-greeting-message-for-first-time-contributors/,
* https://discourse.llvm.org/t/concerns-about-influx-of-ai-generated-bug-fixes/,

add a reference to the LLVM AI policy in the GH greeter. 

In addition:
* Update the message to include links to other relevant policies as
  well, since these are often shared during PR review.
* Add FAQ section and move some of the original content there.
* Include a request for people to confirm that they have familiarised themselves with
  the policies.
* Add `Hello @{self.author} :wave:` to make the greeting more personal.
Delta  File
+29 -7  llvm/utils/git/github-automation.py
+29 -7  1 file

LLVM/project 8ae9471 llvm/test/CodeGen/AArch64 bf16-v8-instructions.ll bf16-instructions.ll, llvm/test/CodeGen/AMDGPU ctlz_zero_poison.ll ctlz_zero_undef.ll

Merge upstream/main into users/mariusz-sikora-at-amd/gfx13/add-vop3
Delta  File
+4,634 -367  llvm/test/CodeGen/RISCV/rvv/fixed-vectors-reduction-fp.ll
+3,071 -1,257  llvm/test/CodeGen/AArch64/bf16-v8-instructions.ll
+2,614 -0  llvm/test/CodeGen/AMDGPU/ctlz_zero_poison.ll
+0 -2,614  llvm/test/CodeGen/AMDGPU/ctlz_zero_undef.ll
+1,660 -649  llvm/test/CodeGen/AArch64/bf16-instructions.ll
+1,440 -725  llvm/test/CodeGen/AArch64/bf16-v4-instructions.ll
+13,419 -5,612  2,308 files not shown
+70,306 -33,800  2,314 files

LLVM/project 3fb3383 llvm/lib/Analysis AliasAnalysis.cpp, llvm/test/Analysis/BasicAA atomics.ll

Revert "[AA] No synchronization effects for never-escaping identified local" (#196890)

Reverts llvm/llvm-project#193939

Caused buildbot failure.
Delta  File
+22 -24  llvm/test/Analysis/BasicAA/atomics.ll
+11 -28  llvm/lib/Analysis/AliasAnalysis.cpp
+8 -0  llvm/test/Transforms/DeadStoreElimination/fence.ll
+3 -3  llvm/test/Transforms/LICM/atomics.ll
+2 -2  llvm/test/Transforms/GVN/fence.ll
+2 -0  llvm/test/Transforms/GVN/simplify-icf-cache-invalidation.ll
+48 -57  1 file not shown
+49 -58  7 files

LLVM/project 9f58135 llvm/test/CodeGen/AArch64 itofp.ll sve-fixed-vector-llrint.ll

[AArch64] Use dup (lane mov) over ext for high-half extract (#195010)

This changes the instruction we use to extract the high half of a vector
register from an `ext v0, v1, v1, 8` to a `dup d0, v1.d[1]`. This is
apparently slightly quicker on certain CPUs and is generally a simpler
instruction. This matches the instruction that GlobalISel produced.

Some of the old patterns for extract_subvector with index of 1 seem
incorrect but were never used as we do not reach selection with such
instructions. They have been repurposed to emit the new DUPi64
instructions.
Delta  File
+100 -332  llvm/test/CodeGen/AArch64/itofp.ll
+112 -112  llvm/test/CodeGen/AArch64/sve-fixed-vector-llrint.ll
+112 -112  llvm/test/CodeGen/AArch64/sve-fixed-vector-lrint.ll
+40 -126  llvm/test/CodeGen/AArch64/fptoi.ll
+65 -65  llvm/test/CodeGen/AArch64/arm64-neon-2velem.ll
+56 -65  llvm/test/CodeGen/AArch64/neon-scalar-copy.ll
+485 -812  94 files not shown
+1,315 -1,791  100 files

LLVM/project 0a181a1 llvm/lib/CodeGen TwoAddressInstructionPass.cpp TailDuplicator.cpp, llvm/lib/CodeGen/SelectionDAG SelectionDAGBuilder.cpp

Use auto for DenseMap/SmallDenseMap iterator variables. NFC (#196883)

To match the prevailing style.
Delta  File
+8 -16  llvm/lib/Transforms/IPO/IROutliner.cpp
+6 -8  llvm/lib/IR/LegacyPassManager.cpp
+5 -7  llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+6 -6  llvm/lib/CodeGen/TwoAddressInstructionPass.cpp
+4 -6  llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+3 -6  llvm/lib/CodeGen/TailDuplicator.cpp
+32 -49  43 files not shown
+95 -137  49 files

LLVM/project 47495f4 clang/lib/AST/ByteCode EvalEmitter.cpp Compiler.cpp, clang/test/AST/ByteCode builtin-object-size-codegen.c

[clang][bytecode] Visit `tryEvaluateObjectSize` expr as lvalue (#196010)

Just like we do with the first parameter of a regular
`__builtin_object_size` call.

This still doesn't fix the larger `__builtin_object_size` test cases, since e.g.
```c++
int NoViableOverloadObjectSize3(void *const p PS(3))
    __attribute__((overloadable)) {
  return __builtin_object_size(p, 3);
}
void test4(struct Foo *t) {
  gi = NoViableOverloadObjectSize3(&t[1].t[1]);
}
```
is still broken because we don't have special handling for
`&t[1].t[1]` here and we usually can't access a one-past-end
pointer.
Delta  File
+15 -2  clang/lib/AST/ByteCode/EvalEmitter.cpp
+9 -0  clang/lib/AST/ByteCode/Compiler.cpp
+6 -0  clang/test/AST/ByteCode/builtin-object-size-codegen.c
+2 -0  clang/lib/AST/ByteCode/EvalEmitter.h
+1 -1  clang/lib/AST/ByteCode/Context.cpp
+1 -0  clang/lib/AST/ByteCode/ByteCodeEmitter.h
+34 -3  1 file not shown
+35 -3  7 files

LLVM/project 3eab15a lldb/test/API/functionalities/breakpoint/delayed_breakpoints TestDelayedBreakpoint.py

[lldb] Fix TestDelayedBreakpoint on ARM Thumb (#196888)

The original address used for the "fake breakpoint" is not valid in
Thumb mode. To be safe, change it to have 0's in the LSBs.
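
To illustrate what "0's in the LSBs" means, here is a small sketch that masks off the low bits of an address and checks halfword alignment, which Thumb instruction addresses require. The helper names are hypothetical, and the addresses are made-up values rather than the ones used in the test.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: clear the `n` least-significant bits of an address.
inline std::uint64_t clear_low_bits(std::uint64_t addr, unsigned n) {
  assert(n < 64 && "shift amount must be smaller than the address width");
  return addr & ~((std::uint64_t{1} << n) - 1);
}

// Thumb instructions are at least halfword aligned, so bit 0 of an
// instruction address must be clear.
inline bool is_halfword_aligned(std::uint64_t addr) {
  return (addr & 1) == 0;
}
```

For example, `clear_low_bits(0x1003, 2)` yields `0x1000`, which passes the alignment check while `0x1003` does not.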
Delta  File
+1 -1  lldb/test/API/functionalities/breakpoint/delayed_breakpoints/TestDelayedBreakpoint.py
+1 -1  1 file

LLVM/project 52b6343 clang/lib/CIR/CodeGen CIRGenBuiltinAMDGPU.cpp, clang/test/CIR/CodeGenHIP builtins-amdgcn.hip

[CIR][AMDGPU] Add lowering for amdgcn ds swizzle builtin. (#196011)

Upstreaming clangIR PR: https://github.com/llvm/clangir/pull/2052

This PR adds support for lowering the `__builtin_amdgcn_ds_swizzle*` AMDGPU
builtin to ClangIR.
Delta  File
+7 -1  clang/lib/CIR/CodeGen/CIRGenBuiltinAMDGPU.cpp
+8 -0  clang/test/CIR/CodeGenHIP/builtins-amdgcn.hip
+15 -1  2 files

LLVM/project 23b7a13 clang/test/CodeGen c-strings.c

[clang][NFC] Remove alignment checks from test/CodeGen/c-strings.c (#196501)

and re-enable it on more targets.

I don't think this test was intended to check for alignment. Those
expectations were added as part of FileCheck-izing the test in
e29dadb6403c8b0d3658f9bbbe2f5fbde5431fdb and we've been working around
them or xfailing the test since.
Delta  File
+4 -26  clang/test/CodeGen/c-strings.c
+4 -26  1 file

LLVM/project 8503d7a llvm/lib/Analysis AliasAnalysis.cpp, llvm/test/Analysis/BasicAA atomics.ll

Revert "[AA] No synchronization effects for never-escaping identified local (…"

This reverts commit 8a230212a560a60bef18e576ad62b0554158b3b3.
Delta  File
+22 -24  llvm/test/Analysis/BasicAA/atomics.ll
+11 -28  llvm/lib/Analysis/AliasAnalysis.cpp
+8 -0  llvm/test/Transforms/DeadStoreElimination/fence.ll
+3 -3  llvm/test/Transforms/LICM/atomics.ll
+2 -2  llvm/test/Transforms/GVN/fence.ll
+2 -0  llvm/test/Transforms/GVN/simplify-icf-cache-invalidation.ll
+48 -57  1 file not shown
+49 -58  7 files

LLVM/project 8e9493f clang/lib/CIR/CodeGen CIRGenFunction.cpp

fix Address emission
Delta  File
+1 -1  clang/lib/CIR/CodeGen/CIRGenFunction.cpp
+1 -1  1 file

LLVM/project c1084a8 llvm/test/Transforms/InstSimplify call.ll

update test
Delta  File
+3 -12  llvm/test/Transforms/InstSimplify/call.ll
+3 -12  1 file

LLVM/project a419733 llvm/test/Transforms/InstSimplify call.ll

add test
Delta  File
+81 -0  llvm/test/Transforms/InstSimplify/call.ll
+81 -0  1 file

LLVM/project b5371ac llvm/lib/Analysis InstructionSimplify.cpp

[InstSimplify] Fold fshl/fshr of complementary shifts to identity
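
The commit title does not spell out the exact pattern, so the following is only one plausible reading, sketched for 32-bit values: when the two funnel-shift operands are complementary shifts of the same value `x`, the funnel shift reassembles `x`. The function names are hypothetical, and the reference semantics are restricted to shift amounts strictly between 0 and 32.

```cpp
#include <cassert>
#include <cstdint>

// Reference 32-bit funnel shift left: concatenate hi:lo, shift left by
// `amt`, keep the upper 32 bits.
inline std::uint32_t fshl32(std::uint32_t hi, std::uint32_t lo, unsigned amt) {
  assert(amt > 0 && amt < 32 && "sketch only covers 0 < amt < 32");
  return (hi << amt) | (lo >> (32 - amt));
}

// Reference 32-bit funnel shift right: same concatenation, shifted right
// by `amt`, keep the lower 32 bits.
inline std::uint32_t fshr32(std::uint32_t hi, std::uint32_t lo, unsigned amt) {
  assert(amt > 0 && amt < 32 && "sketch only covers 0 < amt < 32");
  return (hi << (32 - amt)) | (lo >> amt);
}

// Assumed pattern (not confirmed by the commit): fshl(x >> c, x << (32 - c), c).
// The first operand supplies the high bits of x, the second the low bits.
inline std::uint32_t fshl_of_complementary_shifts(std::uint32_t x, unsigned c) {
  return fshl32(x >> c, x << (32 - c), c);
}

// Assumed mirror pattern for fshr: fshr(x >> (32 - c), x << c, c).
inline std::uint32_t fshr_of_complementary_shifts(std::uint32_t x, unsigned c) {
  return fshr32(x >> (32 - c), x << c, c);
}
```

Under these assumptions both helpers return `x` unchanged for every shift amount in the covered range, which is the kind of identity a simplification could exploit.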
Delta  File
+11 -0  llvm/lib/Analysis/InstructionSimplify.cpp
+11 -0  1 file

LLVM/project 58936f7 utils/bazel/llvm-project-overlay/mlir BUILD.bazel

[Bazel] Fixes ce6605a (#196880)

This fixes ce6605a4931a294bd17b5e56658b701b18d2bcf9.

Co-authored-by: Google Bazel Bot <google-bazel-bot at google.com>
Delta  File
+5 -1  utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
+5 -1  1 file

LLVM/project 8a23021 llvm/lib/Analysis AliasAnalysis.cpp, llvm/test/Analysis/BasicAA atomics.ll

[AA] No synchronization effects for never-escaping identified local (#193939)

Fences and other synchronizing operations (such as atomic accesses
stronger than monotonic) are modelled as reading and writing all memory,
in order to enforce their implied ordering constraints.

Currently, this happens even for identified function locals that do not
escape. This patch excludes those objects.

Notably, we can *not* reason based on captures-before here, because the
synchronizing operation still has an effect even if the object only
escapes *later*.

The hope here is that with this restriction in place, it may be viable
to respect potential synchronization inside non-nosync function calls.
Delta  File
+24 -22  llvm/test/Analysis/BasicAA/atomics.ll
+28 -11  llvm/lib/Analysis/AliasAnalysis.cpp
+0 -8  llvm/test/Transforms/DeadStoreElimination/fence.ll
+3 -3  llvm/test/Transforms/LICM/atomics.ll
+2 -2  llvm/test/Transforms/GVN/fence.ll
+0 -2  llvm/test/Transforms/GVN/simplify-icf-cache-invalidation.ll
+57 -48  1 file not shown
+58 -49  7 files

LLVM/project 4ef4f90 libc/src/__support/File file.cpp, libc/test/src/__support/File file_test.cpp

[libc] Fix partial multi-byte write detection in File (#196402)

File::write_unlocked(const wchar_t*, size_t) checked 'write_res.value <
1' after writing a converted UTF-8 sequence. For multi-byte characters,
a short platform write (e.g. 2 of 3 bytes for a 3-byte character) passed
this check and was counted as a successful write. The output stream
would then contain an incomplete UTF-8 sequence with no error reported
to the caller.

Changed the check to 'write_res.value < char_size' and set the error
indicator on the stream when it triggers.

Added a regression test using a mock File subclass that limits
platform_write to 2 bytes per call, simulating short writes on pipes and
sockets.
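
The difference between the two checks can be sketched with a toy sink that accepts at most a fixed number of bytes per call. Everything here (`ShortWriteSink`, `wrote_char_old`, `wrote_char_new`) is a hypothetical stand-in written for illustration; it is not the libc `File` class or its test mock.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a platform write that may accept fewer bytes
// than requested, as pipes and sockets can.
struct ShortWriteSink {
  std::size_t max_per_call;
  std::string data;

  // Appends at most `max_per_call` bytes and returns how many were taken.
  std::size_t write(const char *buf, std::size_t len) {
    assert(buf != nullptr);
    std::size_t n = std::min(len, max_per_call);
    data.append(buf, n);
    return n;
  }
};

// Old-style check: any write of at least one byte counts as success.
inline bool wrote_char_old(std::size_t written, std::size_t /*char_size*/) {
  return !(written < 1);
}

// New-style check: the whole encoded character must have been written.
inline bool wrote_char_new(std::size_t written, std::size_t char_size) {
  return !(written < char_size);
}
```

With a sink limited to 2 bytes per call and a 3-byte UTF-8 character, the old check reports success even though only a fragment reached the sink, while the new check reports failure.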

Assisted-by: Automated tooling, human reviewed.

---------

Co-authored-by: Michael Jones <michaelrj at google.com>
Delta  File
+86 -0  libc/test/src/__support/File/file_test.cpp
+4 -2  libc/src/__support/File/file.cpp
+90 -2  2 files

LLVM/project 1a683f6 clang-tools-extra/test/clang-tidy/checkers/readability redundant-casting.cpp

[clang-tidy][NFC] Fix tests on 32bit ARM (#196873)

Should fix
https://github.com/llvm/llvm-project/pull/191386#issuecomment-4408294981.
Delta  File
+1 -1  clang-tools-extra/test/clang-tidy/checkers/readability/redundant-casting.cpp
+1 -1  1 file

LLVM/project 422678d llvm/lib/Transforms/Scalar LoopFuse.cpp, llvm/test/Transforms/LoopFusion loop_invariant.ll

[LoopFusion] Remove SCEV-based dependence analysis path (#195864)

Loop Fusion has used Dependence Analysis (DA) as the default dependence
check since the option default was flipped in #187309. The SCEV-based
strategy and the combined "all" mode were retained only for fallback and
experimentation, with a comment noting that the SCEV code would be
removed in a follow-up.

This patch removes the SCEV-based dependence path and the now-unused
selector machinery.

Fixes #194821.

Assisted by Cursor.
Delta  File
+65 -199  llvm/lib/Transforms/Scalar/LoopFuse.cpp
+1 -8  llvm/test/Transforms/LoopFusion/loop_invariant.ll
+66 -207  2 files

LLVM/project eba2d58 llvm/test/CodeGen/AArch64 bf16-v8-instructions.ll bf16-instructions.ll, llvm/test/CodeGen/AMDGPU ctlz_zero_poison.ll ctlz_zero_undef.ll

Merge upstream/main into users/mariusz-sikora-at-amd/add-mov-b64-feature
Delta  File
+4,634 -367  llvm/test/CodeGen/RISCV/rvv/fixed-vectors-reduction-fp.ll
+3,071 -1,257  llvm/test/CodeGen/AArch64/bf16-v8-instructions.ll
+2,614 -0  llvm/test/CodeGen/AMDGPU/ctlz_zero_poison.ll
+0 -2,614  llvm/test/CodeGen/AMDGPU/ctlz_zero_undef.ll
+1,660 -649  llvm/test/CodeGen/AArch64/bf16-instructions.ll
+1,440 -725  llvm/test/CodeGen/AArch64/bf16-v4-instructions.ll
+13,419 -5,612  2,055 files not shown
+64,172 -30,882  2,061 files

LLVM/project 54a4861 clang/include/clang/CIR MissingFeatures.h, clang/lib/CIR/CodeGen Address.h CIRGenModule.cpp

Fix tests
Delta  File
+8 -5  clang/lib/CIR/CodeGen/Address.h
+5 -3  clang/lib/CIR/CodeGen/CIRGenModule.cpp
+4 -1  clang/lib/CIR/CodeGen/CIRGenExprCXX.cpp
+1 -0  clang/include/clang/CIR/MissingFeatures.h
+18 -9  4 files

LLVM/project e9197e8 clang/include/clang/CIR/Dialect/Builder CIRBaseBuilder.h, clang/lib/CIR/CodeGen CIRGenFunction.h Address.h

fix fmt and some coding conventions
Delta  File
+7 -8  clang/lib/CIR/CodeGen/CIRGenFunction.h
+2 -2  clang/lib/CIR/CodeGen/Address.h
+1 -1  clang/lib/CIR/CodeGen/CIRGenDecl.cpp
+1 -1  clang/lib/CIR/CodeGen/CIRGenExpr.cpp
+1 -1  clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h
+12 -13  5 files

LLVM/project f62f8d8 clang/lib/CIR/CodeGen CIRGenExpr.cpp Address.h, clang/test/CIR/CodeGen amdgpu-stack-alloca-array-decay.cpp

[CIR][CIRGen] Cast stack allocas to the language-visible address space
Delta  File
+45 -20  clang/lib/CIR/CodeGen/CIRGenExpr.cpp
+42 -0  clang/test/CIR/CodeGen/amdgpu-stack-alloca-array-decay.cpp
+14 -0  clang/lib/CIR/CodeGen/Address.h
+7 -6  clang/lib/CIR/CodeGen/CIRGenFunction.cpp
+12 -0  clang/lib/CIR/CodeGen/CIRGenFunction.h
+5 -4  clang/lib/CIR/CodeGen/CIRGenDecl.cpp
+125 -30  3 files not shown
+134 -34  9 files

LLVM/project ed50ea5 llvm/lib/IR LLVMContextImpl.h

[DebugInfo] Pack DILocation hash inputs (#196556)

Pack DILocation fields before hashing. Now that column is 16 bits,
Line/Column/ImplicitCode fit in one 64-bit value (32 + 16 + 1 = 49 bits),
and AtomGroup and AtomRank also fit cleanly in one 64-bit value (61 + 3
= 64 bits).
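
The bit budget above can be sketched as two packing helpers. The function names and the bit positions chosen for each field are assumptions made for illustration; the commit message only states the field widths, not the layout used in `LLVMContextImpl.h`.

```cpp
#include <cassert>
#include <cstdint>

// Pack Line (32 bits), Column (16 bits) and ImplicitCode (1 bit) into one
// 64-bit value: 32 + 16 + 1 = 49 bits used. The layout is an assumption.
inline std::uint64_t pack_line_column_implicit(std::uint32_t line,
                                               std::uint16_t column,
                                               bool implicit_code) {
  return static_cast<std::uint64_t>(line) |
         (static_cast<std::uint64_t>(column) << 32) |
         (static_cast<std::uint64_t>(implicit_code) << 48);
}

// Pack AtomGroup (61 bits) and AtomRank (3 bits) into one 64-bit value:
// 61 + 3 = 64 bits used. The layout is an assumption.
inline std::uint64_t pack_atom(std::uint64_t atom_group,
                               std::uint8_t atom_rank) {
  assert(atom_group < (std::uint64_t{1} << 61) && "AtomGroup must fit in 61 bits");
  assert(atom_rank < 8 && "AtomRank must fit in 3 bits");
  return atom_group | (static_cast<std::uint64_t>(atom_rank) << 61);
}
```

Because the fields occupy disjoint bit ranges, distinct field tuples map to distinct packed values, so a hash function can take two 64-bit inputs instead of five separate ones.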

Fewer hash_combine inputs on the hot DILocation path is a small
compile-time improvement.

CTMark geomean:
- stage1-ReleaseLTO-g: -0.10%
- stage1-O0-g: -0.23%
- stage1-aarch64-O0-g: -0.19%
- stage2-O0-g: -0.07%

https://llvm-compile-time-tracker.com/compare.php?from=71fef6d5a306d1adf8bf7d30d2fe9e286380fecf&to=1d80b5f5aa98561d2ba09adc3f20c3eacd24cb88&stat=instructions%3Au

Assisted-by: codex
Delta  File
+5 -3  llvm/lib/IR/LLVMContextImpl.h
+5 -3  1 file

LLVM/project f16e1b3 llvm/include/llvm/CodeGen/GlobalISel CombinerHelper.h, llvm/lib/CodeGen/GlobalISel CombinerHelper.cpp

[GlobalISel] Avoid repeated target info queries in combiners (#196530)

tryCombineAllImpl queries target info for every instruction. Cache
TargetInstrInfo/TargetRegisterInfo/RegisterBankInfo in CombinerHelper
and pass them to executeMatchTable instead.

This avoids repeated virtual calls on the combiner executeMatchTable
path.

CTMark -0.08% geomean improvement on aarch64-O0-g.

https://llvm-compile-time-tracker.com/compare.php?from=71fef6d5a306d1adf8bf7d30d2fe9e286380fecf&to=13bc49510657450402c066098e3a4b7d1af9d0e6&stat=instructions%3Au

Assisted-by: codex
Delta  File
+8 -0  llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h
+2 -3  llvm/utils/TableGen/GlobalISelCombinerEmitter.cpp
+1 -2  llvm/test/TableGen/GlobalISelCombinerEmitter/match-table.td
+1 -0  llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
+12 -5  4 files

LLVM/project a7e4e25 llvm/include/llvm/CodeGen/GlobalISel GIMatchTableExecutorImpl.h, llvm/test/TableGen/GlobalISelCombinerEmitter match-table.td

[GlobalISel] Delay match table builder initialization (#196506)

MachineIRBuilder::setInstrAndDebugLoc is expensive, so delay it until needed.
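
The general idea of delaying a costly setup call until first use can be sketched as follows. `CountingBuilder` and `LazyBuilder` are hypothetical types invented for this illustration; they do not model `MachineIRBuilder` or the match table executor.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a builder whose setup call is costly.
struct CountingBuilder {
  std::size_t setup_calls = 0;
  std::size_t emitted = 0;

  void expensive_setup() { ++setup_calls; }

  // Emitting before setup has run would be a bug in this sketch.
  void emit() {
    assert(setup_calls > 0 && "setup must run before the builder is used");
    ++emitted;
  }
};

// Defers the setup until the first time the builder is actually requested.
class LazyBuilder {
public:
  explicit LazyBuilder(CountingBuilder &b) : builder(b) {}

  CountingBuilder &get() {
    if (!initialized) {
      builder.expensive_setup();
      initialized = true;
    }
    return builder;
  }

private:
  CountingBuilder &builder;
  bool initialized = false;
};
```

If no rule ever needs the builder, the setup cost is never paid; if several do, it is paid once.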

CTMark -0.10% geomean improvement on aarch64-O0-g.

https://llvm-compile-time-tracker.com/compare.php?from=71fef6d5a306d1adf8bf7d30d2fe9e286380fecf&to=8a87845dfde9de9d141b42d2fce92fcf3be02276&stat=instructions%3Au

Assisted-by: codex
Delta  File
+12 -0  llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h
+0 -1  llvm/test/TableGen/GlobalISelCombinerEmitter/match-table.td
+0 -1  llvm/utils/TableGen/GlobalISelCombinerEmitter.cpp
+12 -2  3 files