LLVM/project ae0ed3e flang/include/flang/Parser parse-tree.h, flang/lib/Lower/OpenMP OpenMP.cpp

[flang][OpenMP] Rename some "construct" AST nodes to directives, NFC

Change
  OpenMPAssumeConstruct    -> OmpAssumeDirective
  OpenMPDeclarativeAssumes -> OmpAssumesDirective
  OpenMPSectionConstruct   -> OmpSectionDirective
Delta    File
+10 -11  flang/include/flang/Parser/parse-tree.h
+9 -9    flang/test/Parser/OpenMP/sections.f90
+7 -7    flang/test/Parser/OpenMP/assumption.f90
+7 -7    flang/lib/Parser/openmp-parsers.cpp
+6 -6    flang/lib/Semantics/resolve-directives.cpp
+5 -5    flang/lib/Lower/OpenMP/OpenMP.cpp
+44 -45  10 files not shown
+66 -68  16 files

LLVM/project 5d51698 llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/PhaseOrdering/X86 avg.ll

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
Delta    File
+31 -42  llvm/test/Transforms/PhaseOrdering/X86/avg.ll
+25 -24  llvm/test/Transforms/SLPVectorizer/X86/gathered-loads-non-full-reg.ll
+17 -0   llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+6 -10   llvm/test/Transforms/SLPVectorizer/buildvector-nodes-dependency.ll
+3 -7    llvm/test/Transforms/SLPVectorizer/X86/complex-fma-combine.ll
+82 -83  5 files

LLVM/project 6ae15a1 llvm/utils/lit/lit TestingConfig.py main.py, llvm/utils/lit/tests timeout-config.py

[𝘀𝗽𝗿] changes to main this commit is based on

Created using spr 1.3.7

[skip ci]
Delta    File
+20 -0   llvm/utils/lit/lit/TestingConfig.py
+13 -0   llvm/utils/lit/tests/Inputs/timeout-config/lit.cfg
+1 -12   llvm/utils/lit/lit/main.py
+9 -0    llvm/utils/lit/tests/timeout-config.py
+4 -4    llvm/utils/lit/lit/TestRunner.py
+4 -1    llvm/utils/lit/lit/LitConfig.py
+51 -17  8 files not shown
+60 -25  14 files

LLVM/project 390fa4e llvm/utils/lit/lit TestingConfig.py main.py, llvm/utils/lit/tests timeout-config.py

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
Delta    File
+20 -0   llvm/utils/lit/lit/TestingConfig.py
+1 -12   llvm/utils/lit/lit/main.py
+13 -0   llvm/utils/lit/tests/Inputs/timeout-config/lit.cfg
+12 -0   llvm/utils/lit/tests/Inputs/lit-config-readonly/lit.cfg
+9 -1    llvm/utils/lit/lit/LitConfig.py
+9 -0    llvm/utils/lit/tests/timeout-config.py
+64 -13  11 files not shown
+82 -25  17 files

LLVM/project 4f90d1d llvm/utils/lit/lit TestingConfig.py main.py, llvm/utils/lit/tests timeout-config.py

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
Delta    File
+20 -0   llvm/utils/lit/lit/TestingConfig.py
+13 -0   llvm/utils/lit/tests/Inputs/timeout-config/lit.cfg
+1 -12   llvm/utils/lit/lit/main.py
+9 -0    llvm/utils/lit/tests/timeout-config.py
+4 -4    llvm/utils/lit/lit/TestRunner.py
+4 -1    llvm/utils/lit/lit/LitConfig.py
+51 -17  8 files not shown
+60 -25  14 files

LLVM/project dc3f4f4 llvm/utils/lit/tests timeout-config.py, llvm/utils/lit/tests/Inputs/timeout-config lit.cfg test.py

[𝘀𝗽𝗿] changes to main this commit is based on

Created using spr 1.3.7

[skip ci]
Delta   File
+13 -0  llvm/utils/lit/tests/Inputs/timeout-config/lit.cfg
+9 -0   llvm/utils/lit/tests/timeout-config.py
+1 -0   llvm/utils/lit/tests/Inputs/timeout-config/test.py
+23 -0  3 files

LLVM/project 050eb5a llvm/utils/lit/tests timeout-config.py, llvm/utils/lit/tests/Inputs/timeout-config lit.cfg test.py

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
Delta   File
+13 -0  llvm/utils/lit/tests/Inputs/timeout-config/lit.cfg
+9 -0   llvm/utils/lit/tests/timeout-config.py
+1 -0   llvm/utils/lit/tests/Inputs/timeout-config/test.py
+23 -0  3 files

LLVM/project cce57dc llvm/utils/lit/lit TestingConfig.py main.py, llvm/utils/lit/tests timeout-config.py

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
Delta    File
+20 -0   llvm/utils/lit/lit/TestingConfig.py
+1 -12   llvm/utils/lit/lit/main.py
+13 -0   llvm/utils/lit/tests/Inputs/timeout-config/lit.cfg
+12 -0   llvm/utils/lit/tests/Inputs/lit-config-readonly/lit.cfg
+9 -1    llvm/utils/lit/lit/LitConfig.py
+9 -0    llvm/utils/lit/tests/timeout-config.py
+64 -13  11 files not shown
+82 -25  17 files

LLVM/project 0e680fb llvm/utils/lit/lit TestingConfig.py main.py, llvm/utils/lit/tests timeout-config.py

[𝘀𝗽𝗿] changes to main this commit is based on

Created using spr 1.3.7

[skip ci]
Delta    File
+20 -0   llvm/utils/lit/lit/TestingConfig.py
+1 -12   llvm/utils/lit/lit/main.py
+13 -0   llvm/utils/lit/tests/Inputs/timeout-config/lit.cfg
+9 -0    llvm/utils/lit/tests/timeout-config.py
+4 -4    llvm/utils/lit/lit/TestRunner.py
+4 -1    llvm/utils/lit/lit/LitConfig.py
+51 -17  8 files not shown
+60 -25  14 files

LLVM/project 8299ebb llvm/utils/lit/lit TestingConfig.py main.py, llvm/utils/lit/tests timeout-config.py

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
Delta    File
+20 -0   llvm/utils/lit/lit/TestingConfig.py
+13 -0   llvm/utils/lit/tests/Inputs/timeout-config/lit.cfg
+1 -12   llvm/utils/lit/lit/main.py
+9 -0    llvm/utils/lit/tests/timeout-config.py
+4 -4    llvm/utils/lit/lit/TestRunner.py
+4 -1    llvm/utils/lit/lit/LitConfig.py
+51 -17  8 files not shown
+60 -25  14 files

LLVM/project 714c2cf llvm/utils/lit/tests timeout-config.py, llvm/utils/lit/tests/Inputs/timeout-config lit.cfg test.py

[𝘀𝗽𝗿] changes to main this commit is based on

Created using spr 1.3.7

[skip ci]
Delta   File
+13 -0  llvm/utils/lit/tests/Inputs/timeout-config/lit.cfg
+9 -0   llvm/utils/lit/tests/timeout-config.py
+1 -0   llvm/utils/lit/tests/Inputs/timeout-config/test.py
+23 -0  3 files

LLVM/project 6a31413 llvm/utils/lit/tests timeout-config.py, llvm/utils/lit/tests/Inputs/timeout-config lit.cfg test.py

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
Delta   File
+13 -0  llvm/utils/lit/tests/Inputs/timeout-config/lit.cfg
+9 -0   llvm/utils/lit/tests/timeout-config.py
+1 -0   llvm/utils/lit/tests/Inputs/timeout-config/test.py
+23 -0  3 files

LLVM/project 195c622 llvm/test/Transforms/SLPVectorizer/X86 complex-fma-combine.ll

[SLP][NFC] Add a test with the complex fma



Pull Request: https://github.com/llvm/llvm-project/pull/198186
Delta   File
+55 -0  llvm/test/Transforms/SLPVectorizer/X86/complex-fma-combine.ll
+55 -0  1 file

LLVM/project 0518e54 llvm/lib/Transforms/Vectorize VPlanTransforms.cpp

[LV] Update stale comment for partial reduction operands (NFC) (#198118)

The `neg` form was removed in #187228 (this case now uses the
out-of-loop sub, which is preferable, see #189739).
Delta  File
+0 -2  llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+0 -2  1 file

LLVM/project a264335 clang/lib/AST ItaniumMangle.cpp

[Clang][ItaniumMangle][NFC] Refactor FunctionTypeDepthState (#196240)

This patch refactors `FunctionTypeDepthState` to use bit-fields and moves
the `getNestingDepth` logic into it. It also renames
`{enter,leave}ResultType` to `{enter,leave}FunctionDeclSuffix`, since the
old names no longer match their current role.
Delta    File
+28 -42  clang/lib/AST/ItaniumMangle.cpp
+28 -42  1 file

LLVM/project e462584 clang/lib/CIR/CodeGen CIRGenBuiltinAMDGPU.cpp CIRGenTypes.cpp, clang/test/CIR/CodeGenHIP builtins-amdgcn-image.hip

[CIR][AMDGPU] Adds lowering for amdgcn image load/store builtins
Delta     File
+466 -0   clang/test/CIR/CodeGenHIP/builtins-amdgcn-image.hip
+106 -12  clang/lib/CIR/CodeGen/CIRGenBuiltinAMDGPU.cpp
+17 -0    clang/lib/CIR/CodeGen/CIRGenTypes.cpp
+589 -12  3 files

LLVM/project f45f3ce llvm/test/CodeGen/X86 avx512-calling-conv.ll

[X86] avx512-calling-conv.ll - ensure we check stack math (#198182)

Calling-convention tests should verify that the stack offsets are correct.
Delta          File
+1,052 -1,052  llvm/test/CodeGen/X86/avx512-calling-conv.ll
+1,052 -1,052  1 file

LLVM/project 571a106 llvm/lib/Target/AMDGPU SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU load-atomic-global.ll load-atomic-flat.ll

[AMDGPU][AtomicExpand] Support sub naturally aligned 64 bit atomic load/store
Delta    File
+87 -0   llvm/test/CodeGen/AMDGPU/load-atomic-global.ll
+87 -0   llvm/test/CodeGen/AMDGPU/load-atomic-flat.ll
+49 -0   llvm/test/CodeGen/AMDGPU/store-atomic-flat.ll
+48 -0   llvm/test/CodeGen/AMDGPU/store-atomic-global.ll
+44 -0   llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+36 -0   llvm/test/Transforms/AtomicExpand/AMDGPU/unaligned-atomic.ll
+351 -0  6 files not shown
+421 -6  12 files

LLVM/project 2f79d41 llvm/test/CodeGen/X86 avx512-calling-conv.ll avx512-masked_memop-16-8.ll

[X86] LowerBUILD_VECTORvXi1 - attempt to fold as VPTESTMB(BUILD_VECTOR_vXi8(X),1) (#198166)

i1 scalar elements will be legalised to i8 (and the BUILD_VECTOR relies
on implicit truncation) - but it will often be cheaper to perform the
BUILD_VECTOR as a vXi8 and then perform a comparison to convert to the
vXi1 mask, assuming we're inserting more than one non-constant i1
element.

Without BWI we have to extend this to vXi32 types to perform the
comparison.
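
The idea above can be modelled in a few lines of Python. This is only an illustrative sketch of the semantics, not the lowering code: the function name is hypothetical, and the model relies on the documented behaviour of VPTESTMB, which sets a mask bit when the bitwise AND of the two byte lanes is non-zero.

```python
def build_mask_via_bytes(elements):
    """Model of: build a vXi8 vector, then VPTESTMB it against 1 to get a vXi1 mask.

    `elements` are the i1 values after legalisation to i8; because the
    BUILD_VECTOR relies on implicit truncation, only bit 0 of each is meaningful.
    """
    byte_vector = [e & 0xFF for e in elements]  # the vXi8 BUILD_VECTOR
    mask = 0
    for lane, byte in enumerate(byte_vector):
        # VPTESTMB: mask bit is set when (byte AND 1) != 0
        if byte & 1:
            mask |= 1 << lane
    return mask
```

Testing against the constant 1 ignores any garbage in the upper seven bits of each byte, which is what makes the implicit truncation safe in this model.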

There is probably a lot more we can do here (e.g. for v2i8/v4i8/v8i8
types), but this patch at least addresses the worst codegen cases.

Fixes #179334
Delta          File
+749 -4,307    llvm/test/CodeGen/X86/avx512-calling-conv.ll
+207 -1,438    llvm/test/CodeGen/X86/avx512-masked_memop-16-8.ll
+197 -1,083    llvm/test/CodeGen/X86/avx512-load-store.ll
+203 -915      llvm/test/CodeGen/X86/vector-compress.ll
+158 -868      llvm/test/CodeGen/X86/avx512-ext.ll
+154 -866      llvm/test/CodeGen/X86/avx512-mask-op.ll
+1,668 -9,477  1 file not shown
+1,688 -9,478  7 files

LLVM/project 1907b58 llvm/lib/Target/PowerPC PPCISelLowering.cpp, llvm/test/CodeGen/PowerPC ppc-i128-cmp.ll

[PowerPC] Fix i128 vcmpequb optimization for loads with range metadata and small constants (#196801)

The combine introduced in 55aff64d2c6ef50d2ed725d7dd1fb34080486237
lowers scalar i128 compares into vector compares by reissuing the
original loads as v16i8 loads. However, the combine was reusing the
original MachineMemOperand without modification.

If the original i128 load carries !range metadata, the MMO encodes that
range using i128 values. Reusing this MMO for a v16i8 load is incorrect
as range metadata is only valid for integer scalar types and its
bitwidth must match the memory VT.

This patch fixes this by creating a new MachineMemOperand for the vector
load. Additionally, we restrict the combine for constant operands to
avoid cases that are better handled by scalar lowering. Small constants
(those that fit within 16 bits) are excluded to prevent generating
suboptimal vector compares.
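
A rough sketch of the constant restriction, for illustration only. Both function names are hypothetical, and treating "fits within 16 bits" as a signed 16-bit range is an assumption; the commit message does not say whether the check is signed or unsigned.

```python
def fits_in_signed_16(value: int) -> bool:
    # Assumption: "fits within 16 bits" means the signed range [-32768, 32767].
    return -(1 << 15) <= value < (1 << 15)


def should_use_vector_compare(constant_operand=None) -> bool:
    """Decide whether the i128 compare is a candidate for the vector combine.

    `constant_operand` is None when neither side of the compare is a constant.
    """
    if constant_operand is None:
        return True
    # Small constants are left to scalar lowering.
    return not fits_in_signed_16(constant_operand)
```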
Delta    File
+282 -0  llvm/test/CodeGen/PowerPC/ppc-i128-cmp.ll
+28 -8   llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+310 -8  2 files

LLVM/project 24b5f1d llvm/include/llvm/Transforms/Vectorize SLPVectorizer.h, llvm/lib/Transforms/Vectorize SLPVectorizer.cpp

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
Delta      File
+53 -28    llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+15 -24    llvm/test/Transforms/SLPVectorizer/RISCV/vec3-base.ll
+14 -20    llvm/test/Transforms/SLPVectorizer/AArch64/commute.ll
+10 -16    llvm/test/Transforms/SLPVectorizer/X86/dot-product.ll
+16 -9     llvm/include/llvm/Transforms/Vectorize/SLPVectorizer.h
+8 -4      llvm/test/Transforms/SLPVectorizer/X86/slp-fma-loss.ll
+116 -101  1 file not shown
+119 -104  7 files

LLVM/project 6931a33 llvm/test/CodeGen/X86 avx512-masked_memop-16-8.ll avx512-load-store.ll

[X86] avx512-load-store.ll - add test coverage for #198154 and #198165 (#198169)
Delta       File
+1,450 -0   llvm/test/CodeGen/X86/avx512-masked_memop-16-8.ll
+1,105 -16  llvm/test/CodeGen/X86/avx512-load-store.ll
+2,555 -16  2 files

LLVM/project aa2f124 llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/PhaseOrdering/AArch64 reduce_submuladd.ll

[SLP] Enable full non-power-of-2 vectorization by default

Default slp-vectorize-non-power-of-2 to true and broaden the set of
supported widths beyond NumElts + 1 == bit_ceil(NumElts) to include
small widths (<= 5), widths where NumElts - 1 is also non-power of two
(e.g. 6, 7, 10..15), and any width when the elements being vectorized
are themselves vectors (REVEC). Adjust gathered loads, stores, and
reduction support to handle the non-power-of-2 vector factors.
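
The width rule described above can be read as a small predicate. The sketch below is one interpretation of the commit message, not the code in SLPVectorizer.cpp; the function names are hypothetical, and treating power-of-two widths as always supported is an assumption.

```python
def bit_ceil(n: int) -> int:
    """Smallest power of two that is >= n, for n >= 1."""
    return 1 << (n - 1).bit_length()


def is_power_of_2(n: int) -> bool:
    return n > 0 and (n & (n - 1)) == 0


def is_supported_width(num_elts: int, revec: bool = False) -> bool:
    if is_power_of_2(num_elts):
        return True  # assumption: power-of-two widths were already supported
    if revec:
        return True  # any width when the elements are themselves vectors
    if num_elts + 1 == bit_ceil(num_elts):
        return True  # the previously supported shape, e.g. 3, 7, 15
    if num_elts <= 5:
        return True  # small widths
    # widths where NumElts - 1 is also not a power of two, e.g. 6, 7, 10..15
    return not is_power_of_2(num_elts - 1)
```

Under this reading, 9 is the only width between 2 and 16 that is rejected without REVEC, because 8 is a power of two and 9 + 1 is not.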

Reviewers: hiraditya, bababuck, RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/196825
Delta      File
+140 -76   llvm/test/Transforms/SLPVectorizer/X86/horizontal-minmax.ll
+137 -42   llvm/test/Transforms/SLPVectorizer/X86/dot-product.ll
+120 -29   llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+31 -98    llvm/test/Transforms/PhaseOrdering/AArch64/reduce_submuladd.ll
+44 -56    llvm/test/Transforms/SLPVectorizer/X86/horizontal-list.ll
+24 -60    llvm/test/Transforms/SLPVectorizer/RISCV/revec.ll
+496 -361  33 files not shown
+703 -701  39 files

LLVM/project 265dcdd llvm/test/CodeGen/X86 avx512-calling-conv.ll

[X86] avx512-calling-conv.ll - add test coverage for #179334 (#198163)
Delta      File
+3,451 -0  llvm/test/CodeGen/X86/avx512-calling-conv.ll
+3,451 -0  1 file

LLVM/project d926f39 flang/include/flang/Optimizer/Dialect/CUF/Attributes CUFAttr.h, flang/lib/Optimizer/Transforms CompilerGeneratedNames.cpp

[CUF] Fix CompilerGeneratedNamesConversion renaming managed companion globals

CUFAddConstructor creates a companion pointer global (e.g. foo.managed.ptr)
for each non-allocatable managed variable. When CompilerGeneratedNamesConversion
ran after CUFAddConstructor, it replaced the dots with 'X',
so CUFOpConversionLate could no longer find the companion by name and fell back
to CUFGetDeviceAddress with the wrong host pointer, causing cudaErrorInvalidSymbol.

Fix: mark the companion global with a cuf.managed_ptr unit attribute in
CUFAddConstructor and skip it in CompilerGeneratedNamesConversionPass.
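
The failure and the fix can be illustrated with a toy model. This is a sketch only: representing globals as a name-to-attributes dictionary and the function name are hypothetical; the `cuf.managed_ptr` attribute and the dot-to-'X' replacement come from the description above.

```python
def rename_compiler_generated(globals_by_name):
    """Toy model of the renaming pass over {global name: set of attribute names}."""
    renamed = {}
    for name, attrs in globals_by_name.items():
        if "cuf.managed_ptr" in attrs:
            # Companion pointer global: keep the name so later lookups by name succeed.
            renamed[name] = attrs
        else:
            renamed[name.replace(".", "X")] = attrs
    return renamed
```

Without the attribute check, `foo.managed.ptr` would become `fooXmanagedXptr`, and a later lookup by the original name would find nothing.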

Co-authored-by: Claude Sonnet 4.6 <noreply at anthropic.com>
Delta   File
+51 -0  flang/test/Fir/CUDA/cuda-managed-ptr-companion.mlir
+7 -0   flang/include/flang/Optimizer/Dialect/CUF/Attributes/CUFAttr.h
+2 -2   flang/test/Fir/CUDA/cuda-constructor-2.f90
+3 -1   flang/lib/Optimizer/Transforms/CompilerGeneratedNames.cpp
+4 -0   flang/lib/Optimizer/Transforms/CUDA/CUFAddConstructor.cpp
+67 -3  5 files

LLVM/project d90a802 llvm/lib/IR PrintPasses.cpp

[IR][PrintPasses] Disable IO Sandbox in doSystemDiff (#198151)

Fix `fatal error: error in backend: IO sandbox violation` when executing
`clang -cc1 -print-after-all -print-changed=diff`.

doSystemDiff performs temporary file I/O and executes an external diff
program. This conflicts with the IO sandbox.
Delta  File
+3 -0  llvm/lib/IR/PrintPasses.cpp
+3 -0  1 file

LLVM/project e925b35 libcxx/include/__algorithm copy_backward.h move.h

[libc++] Introduce a private version of in_out_result and use it for copy/move algorithms (#198086)

This patch introduces `__in_out_result`, an internal back-port of
`in_out_result` that is convertible to `in_out_result` where that type
exists. This improves readability, since uses of `first` and `second` are
replaced with `__in_` and `__out_`, making it clear which iterator is
accessed.

Other algorithms will be updated in separate patches.
Delta      File
+18 -17    libcxx/include/__algorithm/copy_backward.h
+16 -15    libcxx/include/__algorithm/move.h
+16 -15    libcxx/include/__algorithm/copy.h
+16 -14    libcxx/include/__algorithm/move_backward.h
+9 -12     libcxx/include/__algorithm/copy_move_common.h
+14 -0     libcxx/include/__algorithm/in_out_result.h
+89 -73    17 files not shown
+137 -140  23 files

LLVM/project 4dc415f clang/lib/CodeGen CGCall.cpp

[CGCall] Initially store arg attrs using AttrBuilder (NFCI) (#197906)

Make argument attribute handling more similar to fn/ret handling by first
populating an AttrBuilder and converting it to an AttributeSet once at the
end, instead of using many intermediate AttrBuilders. This also ensures
that no attributes are lost because one code path overwrites another.
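
The accumulate-then-convert pattern can be sketched with a toy builder. This is not LLVM's AttrBuilder API; `ToyAttrBuilder`, `build_arg_attrs`, and the contribution format are hypothetical names chosen for illustration.

```python
class ToyAttrBuilder:
    """Toy stand-in for an attribute builder: collects attributes for one argument."""

    def __init__(self):
        self._attrs = {}

    def add(self, name, value=True):
        self._attrs[name] = value
        return self

    def freeze(self):
        # Convert to an immutable set once, at the very end.
        return frozenset(self._attrs.items())


def build_arg_attrs(num_args, contributions):
    """`contributions` is an iterable of (arg_index, attr_name) from separate code paths."""
    builders = [ToyAttrBuilder() for _ in range(num_args)]
    for arg_index, attr_name in contributions:
        # Every path adds to the same per-argument builder, so nothing is replaced.
        builders[arg_index].add(attr_name)
    return [builder.freeze() for builder in builders]
```

Because each code path only adds to a shared per-argument builder, a later path cannot discard what an earlier one recorded, which is the property the commit message highlights.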
Delta    File
+16 -20  clang/lib/CodeGen/CGCall.cpp
+16 -20  1 file

LLVM/project e6a1278 llvm/lib/Target/AArch64 AArch64ISelLowering.cpp AArch64Arm64ECCallLowering.cpp, llvm/test/CodeGen/AArch64 arm64ec-exit-thunks.ll arm64ec-hybrid-patchable.ll

[AArch64] Copy x4/x5 vararg payload into the x64 stack in Arm64EC exit thunks (#190933)

Currently, x4/x5 in variadic Arm64EC exit thunks are treated by LLVM like
any other outgoing arguments. However, x4/x5 contain a pointer to the
first stack parameter and the size of the parameters passed on the
stack, and the generated exit thunk must memcpy these parameters to the
x86-64 stack. Current MSVC does this correctly.

Rather than introducing a new entry to the CallingConv enum, we mark the
call as vararg in AArch64Arm64ECCallLowering so that the lowering logic in
AArch64ISelLowering.cpp can recognise this case, perform the necessary
memcpy, and drop the x4/x5 arguments.

LLVM should additionally ensure that x0-x3 are mirrored to f0-f3 in
order to match the Windows x86-64 vararg ABI, but that change is left
for a follow-up patch.
Delta     File
+208 -6   llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll
+62 -4    llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+11 -9    llvm/test/CodeGen/AArch64/arm64ec-hybrid-patchable.ll
+9 -1     llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
+290 -20  4 files

LLVM/project 606a570 llvm/lib/Target/RISCV/Disassembler RISCVDisassembler.cpp

[𝘀𝗽𝗿] initial version

Created using spr 1.3.8-beta.1
Delta    File
+17 -31  llvm/lib/Target/RISCV/Disassembler/RISCVDisassembler.cpp
+17 -31  1 file