LLVM/project b577049llvm/include/llvm/Transforms/InstCombine InstCombiner.h, llvm/lib/Transforms/InstCombine InstructionCombining.cpp InstCombineInternal.h

[spr] initial version

Created using spr 1.3.8-wip
DeltaFile
+27-8llvm/include/llvm/Transforms/InstCombine/InstCombiner.h
+18-14llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+8-8llvm/lib/Transforms/InstCombine/InstCombineInternal.h
+53-303 files

LLVM/project b1d1bb2llvm/test/Transforms/LoopVectorize/VPlan/AArch64 single-scalar-cast.ll

[LV] Add VPlan printing test with casts converted to single scalar (NFC) (#202200)

Add test coverage for additional paths that can create single-scalar
casts: sinkScalarOperands and induction optimization.
DeltaFile
+134-0llvm/test/Transforms/LoopVectorize/VPlan/AArch64/single-scalar-cast.ll
+134-01 files

LLVM/project 86940d7llvm/lib/Transforms/Utils LoopUtils.cpp, llvm/lib/Transforms/Vectorize VectorCombine.cpp

[VectorCombine] foldShuffleChainsToReduce - add FADD/FMUL handling (#201302)

Extend `foldShuffleChainsToReduce` to fold shuffle-reduction chains of
fadd/fmul into the corresponding vector reduction intrinsics
(llvm.vector.reduce.fadd / llvm.vector.reduce.fmul).

The transformation requires the `reassoc` fast-math flag on every binop
in the chain, per the
[LangRef](https://llvm.org/docs/LangRef.html#rewrite-based-flags). The
output intrinsic receives the intersection of all binops' FMF, and the
identity start value is selected via ConstantExpr::getBinOpIdentity
(-0.0 for fadd, 1.0 for fmul, respecting nsz for the sign of zero).

Fixes #199030.
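
As a rough illustration of why the start value matters, here is a scalar C++ model of such a reduction. The helpers `reduceFAdd`, `reduceFMul`, `isNegativeZero`, and `checkIdentities` are hypothetical names made up for this sketch and are not part of LLVM. It only shows that -0.0 acts as an identity for fadd while +0.0 does not, and that 1.0 is an identity for fmul.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical scalar model of an fadd reduction with an explicit start value.
double reduceFAdd(double start, const std::vector<double> &lanes) {
  double acc = start;
  for (double x : lanes)
    acc += x;
  return acc;
}

// Hypothetical scalar model of an fmul reduction with an explicit start value.
double reduceFMul(double start, const std::vector<double> &lanes) {
  double acc = start;
  for (double x : lanes)
    acc *= x;
  return acc;
}

// True when x is a zero with the sign bit set.
bool isNegativeZero(double x) { return x == 0.0 && std::signbit(x); }

// Checks the identity claims; assumes default IEEE semantics (no -ffast-math).
void checkIdentities() {
  // -0.0 as start value preserves an all-negative-zero input.
  assert(isNegativeZero(reduceFAdd(-0.0, {-0.0, -0.0})));
  // +0.0 as start value flips the sign of the result to +0.0.
  assert(!isNegativeZero(reduceFAdd(0.0, {-0.0, -0.0})));
  // 1.0 leaves a product unchanged.
  assert(reduceFMul(1.0, {2.0, 3.0}) == 6.0);
}
```

The difference between the two fadd start values is only observable through the sign of zero, which is where the nsz flag mentioned above comes in.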
DeltaFile
+134-0llvm/test/Transforms/VectorCombine/fold-shuffle-chains-to-reduce.ll
+69-0llvm/test/Transforms/VectorCombine/X86/fold-shuffle-chains-to-reduce-fp.ll
+43-4llvm/lib/Transforms/Vectorize/VectorCombine.cpp
+4-0llvm/lib/Transforms/Utils/LoopUtils.cpp
+250-44 files

LLVM/project 7df3d92llvm/lib/Transforms/Vectorize VPlan.h VPlanTransforms.cpp

[VPlan] Add VPReplicateRecipe::operandsWithoutMask() (NFC). (#202115)

Add a helper to access a VPReplicateRecipe's operands while excluding
the mask of a predicated recipe, and use it in createReplicateRegion.

Split off from https://github.com/llvm/llvm-project/pull/201676.
DeltaFile
+5-0llvm/lib/Transforms/Vectorize/VPlan.h
+1-1llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+6-12 files

LLVM/project e6bd788llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp, llvm/test/CodeGen/X86 avx512-maxnum-minnum-masked-store.ll

[DAG] Narrow vselect mask to vXi1 in foldToMaskedStore (#201609)

foldToMaskedStore (added in
https://github.com/llvm/llvm-project/commit/1c0ac80d4a9ef6c21914f2317003979952c2a2c3)
rewrites
  store(vselect(cond, x, load(ptr)), ptr) -> masked_store(x, ptr, cond)
passing the vselect condition straight through as the store mask. A
masked store follows the IR convention of a vXi1 mask, but the condition
can be a wider boolean vector. On AVX512F targets without VLX, a
maxnum/minnum store-back lowers the NaN test with a legacy packed (CMPP)
comparison whose result is a vXi32/vXi64 vector, so the masked store is
created with a wide mask and LowerMSTORE asserts:

  Assertion `Mask.getSimpleValueType().getScalarType() == MVT::i1 &&
             "Unexpected mask type"' failed.

    [13 lines not shown]
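
The rewrite and the mask-width issue can be modelled on plain arrays. This is a minimal sketch, not the DAGCombiner code: the types and the functions `narrowMask`, `storeOfSelect`, and `maskedStore` are hypothetical, and the narrowing step assumes each wide lane is either all-ones or zero, since the actual narrowing logic is in the part of the message not shown here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLanes = 4;
using Vec = std::array<float, kLanes>;
using WideMask = std::array<std::int32_t, kLanes>; // models a vXi32 boolean vector
using BitMask = std::array<bool, kLanes>;          // models a vXi1 mask

// Narrow a wide boolean vector to one bit per lane.
// Assumption: every wide lane is all-ones (true) or zero (false).
BitMask narrowMask(const WideMask &wide) {
  BitMask m{};
  for (std::size_t i = 0; i < kLanes; ++i) {
    assert(wide[i] == 0 || wide[i] == -1);
    m[i] = wide[i] != 0;
  }
  return m;
}

// Models store(vselect(cond, x, load(ptr)), ptr).
void storeOfSelect(Vec &mem, const BitMask &cond, const Vec &x) {
  Vec loaded = mem;
  Vec selected{};
  for (std::size_t i = 0; i < kLanes; ++i)
    selected[i] = cond[i] ? x[i] : loaded[i];
  mem = selected;
}

// Models masked_store(x, ptr, mask): only enabled lanes are written.
void maskedStore(Vec &mem, const BitMask &mask, const Vec &x) {
  for (std::size_t i = 0; i < kLanes; ++i)
    if (mask[i])
      mem[i] = x[i];
}
```

Both forms leave memory in the same state for any mask, which is what makes the fold legal once the mask has the expected one-bit-per-lane type.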
DeltaFile
+151-0llvm/test/CodeGen/X86/avx512-maxnum-minnum-masked-store.ll
+13-0llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+164-02 files

LLVM/project 18ad6a5clang/lib/AST/ByteCode Program.cpp InterpBuiltin.cpp, clang/test/SemaCXX constant-expression-p2280r4.cpp

[clang][bytecode] Register global constexpr-unknown variables with their pointee type (#201347)
DeltaFile
+6-0clang/lib/AST/ByteCode/Program.cpp
+1-2clang/test/SemaCXX/constant-expression-p2280r4.cpp
+3-0clang/lib/AST/ByteCode/InterpBuiltin.cpp
+10-23 files

LLVM/project 3b5f8felibcxx/test/libcxx/containers/views/views.span nodiscard.iterator.verify.cpp

[libc++][span] Test `[[nodiscard]]` applied to `span::iterator` (#202068)

Adds test coverage.

`[[nodiscard]]` applied in:
- https://github.com/llvm/llvm-project/pull/198489
- https://github.com/llvm/llvm-project/pull/198492

Towards #172124
DeltaFile
+43-0libcxx/test/libcxx/containers/views/views.span/nodiscard.iterator.verify.cpp
+43-01 files

LLVM/project a383c1alibcxx/test/libcxx/containers/sequences/array nodiscard.iterator.verify.cpp

[libc++][array] Test `[[nodiscard]]` with `array::const_iterator` (#202070)

Added tests with `array::const_iterator` for completeness.

Implemented in https://github.com/llvm/llvm-project/pull/198492

Towards #172124
DeltaFile
+14-1libcxx/test/libcxx/containers/sequences/array/nodiscard.iterator.verify.cpp
+14-11 files

LLVM/project efaed42llvm/lib/Transforms/Utils SimplifyCFG.cpp, llvm/test/Transforms/PhaseOrdering/X86 merge-functions2.ll

[SimplifyCFG] Shrink integer lookup tables (#202071)

After #200664, we generate lookup tables in more cases, leading to
higher memory use and larger binaries. Partially alleviate this by
shrinking the lookup tables if all elements are small integers. The
underlying idea is that an extra integer extension can typically be
folded into a load instruction at no extra cost.

This reduces the size of stage2-clang by 0.13%.
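
The idea can be sketched at the source level as follows. The table contents and the names `kWideTable`, `kNarrowTable`, `lookupWide`, and `lookupNarrow` are made up for illustration; the actual transform operates on IR globals inside SimplifyCFG.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kNumCases = 4;

// Before: every entry occupies 32 bits although all values fit in 8 bits.
const std::int32_t kWideTable[kNumCases] = {3, 17, 42, 200};

// After: the same values stored as 8-bit entries, a quarter of the size.
const std::uint8_t kNarrowTable[kNumCases] = {3, 17, 42, 200};

std::int32_t lookupWide(std::size_t i) {
  assert(i < kNumCases);
  return kWideTable[i];
}

// The load is followed by a zero extension back to the original width.
std::int32_t lookupNarrow(std::size_t i) {
  assert(i < kNumCases);
  return kNarrowTable[i];
}
```

Both lookups return the same values; only the storage footprint of the table changes.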
DeltaFile
+44-33llvm/test/Transforms/SimplifyCFG/X86/switch_to_lookup_table.ll
+28-21llvm/test/Transforms/SimplifyCFG/X86/switch_to_lookup_table_big.ll
+15-11llvm/test/Transforms/SimplifyCFG/X86/switch-of-powers-of-two.ll
+21-2llvm/lib/Transforms/Utils/SimplifyCFG.cpp
+5-4llvm/test/Transforms/SimplifyCFG/X86/debugloc-switch-powers-of-two.ll
+4-3llvm/test/Transforms/PhaseOrdering/X86/merge-functions2.ll
+117-747 files not shown
+136-8713 files

LLVM/project 825b3c7compiler-rt/lib/asan asan_allocator.h asan_mapping.h

[ASan] Improve qemu-alpha shadow mapping (#201861)

With a 1T fixed shadow offset the usable app memory is split between
LowMem (0-1T) and HighMem (1.5T-4T). This works on real Alpha hardware
where all addresses stay within TASK_SIZE (4T). However, under
qemu-alpha user mode mmap(NULL) returns addresses from the host x86-64
address space (~127T), outside both regions, causing AddrIsInMem() CHECK
failures in PoisonShadow.

Switch to a fixed shadow offset of 0x70000000000 (7 TiB). TASK_SIZE is
well below the shadow offset so HighMem is empty: kHighMemBeg =
MEM_TO_SHADOW(kHighMemEnd) + 1 > kHighMemEnd. All app memory fits in
LowMem [0, 7T), a simpler layout with no HighMem split. On qemu-alpha,
-R 0x80000000000 constrains guest mappings to [0, 8T), keeping them
within LowMem.
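
The arithmetic behind both layouts can be checked at compile time. This sketch assumes the usual ASan shadow scale of 3 (eight application bytes per shadow byte) and takes kHighMemEnd to be TASK_SIZE - 1 with TASK_SIZE = 4 TiB; both values and the helper names are assumptions for illustration, not copied from asan_mapping.h.

```cpp
#include <cstdint>

constexpr std::uint64_t kTiB = 1ULL << 40;

// Assumption: the usual ASan shadow scale (8 app bytes per shadow byte).
constexpr unsigned kShadowScale = 3;
// Assumption: app memory ends at TASK_SIZE - 1, with TASK_SIZE = 4 TiB.
constexpr std::uint64_t kHighMemEnd = 4 * kTiB - 1;

constexpr std::uint64_t memToShadow(std::uint64_t addr, std::uint64_t offset) {
  return (addr >> kShadowScale) + offset;
}

// kHighMemBeg = MEM_TO_SHADOW(kHighMemEnd) + 1, as stated in the message.
constexpr std::uint64_t highMemBeg(std::uint64_t offset) {
  return memToShadow(kHighMemEnd, offset) + 1;
}

constexpr std::uint64_t kOldOffset = 1 * kTiB;
constexpr std::uint64_t kNewOffset = 0x70000000000ULL;

static_assert(kNewOffset == 7 * kTiB, "new offset is 7 TiB");
static_assert(highMemBeg(kOldOffset) == kTiB + kTiB / 2,
              "old layout: HighMem starts at 1.5T");
static_assert(highMemBeg(kNewOffset) > kHighMemEnd,
              "new layout: HighMem is empty");
```

Under these assumptions the old offset reproduces the 1.5T start of HighMem, and the new offset pushes kHighMemBeg to 7.5T, above kHighMemEnd, so the HighMem range is empty.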
DeltaFile
+3-2compiler-rt/lib/asan/asan_allocator.h
+1-1compiler-rt/lib/asan/asan_mapping.h
+4-32 files

LLVM/project 5f5642dclang/lib/CIR/CodeGen CIRGenModule.cpp CIRGenModule.h, clang/test/CIR/CodeGenOpenCL kernel-arg-info.cl kernel-arg-info-single-as.cl

[CIR][OpenCL] Attach kernel argument metadata to CIR functions

Emit the CIR OpenCL kernel argument metadata attribute for kernel functions. Preserve CIR language address-space kinds until lowering and include argument names only when `-cl-kernel-arg-info` is enabled.
DeltaFile
+152-0clang/test/CIR/CodeGenOpenCL/kernel-arg-info.cl
+83-0clang/lib/CIR/CodeGen/CIRGenModule.cpp
+19-0clang/test/CIR/CodeGenOpenCL/kernel-arg-info-single-as.cl
+12-0clang/test/CIR/CodeGenOpenCL/kernel-arg-metadata.cl
+4-0clang/lib/CIR/CodeGen/CIRGenModule.h
+3-0clang/lib/CIR/CodeGen/CIRGenFunction.cpp
+273-06 files

LLVM/project 3e4bf54clang/include/clang/CIR/Dialect/IR CIRAttrConstraints.td CIROpenCLAttrs.td, clang/lib/CIR/Dialect/IR CIROpenCLAttrs.cpp

fix: constrain CIR OpenCL metadata arrays in TableGen
DeltaFile
+25-6clang/include/clang/CIR/Dialect/IR/CIRAttrConstraints.td
+0-23clang/lib/CIR/Dialect/IR/CIROpenCLAttrs.cpp
+6-6clang/include/clang/CIR/Dialect/IR/CIROpenCLAttrs.td
+6-6clang/test/CIR/IR/invalid-opencl-kernel-arg-metadata.cir
+6-2clang/include/clang/CIR/Dialect/IR/CIRAttrs.td
+43-435 files

LLVM/project 6cfd155clang/include/clang/CIR/Dialect/IR CIREnumAttr.td, clang/lib/CIR/Dialect/IR CIROpenCLAttrs.cpp CIRTypes.cpp

fix: Use CIR_LangAddressSpace instead of a raw integer
DeltaFile
+7-29clang/test/CIR/IR/invalid-opencl-kernel-arg-metadata.cir
+4-12clang/lib/CIR/Dialect/IR/CIROpenCLAttrs.cpp
+4-4clang/test/CIR/IR/opencl-kernel-arg-metadata.cir
+3-1clang/include/clang/CIR/Dialect/IR/CIREnumAttr.td
+2-0clang/lib/CIR/Dialect/Transforms/TargetLowering/Targets/AMDGPU.cpp
+2-0clang/lib/CIR/Dialect/IR/CIRTypes.cpp
+22-462 files not shown
+25-468 files

LLVM/project cadc32fclang/test/CIR/IR opencl-kernel-arg-metadata.cir

fix: Add zero-argument kernel arg metadata test
DeltaFile
+12-0clang/test/CIR/IR/opencl-kernel-arg-metadata.cir
+12-01 files

LLVM/project 1ac5d63clang/include/clang/CIR/Dialect/IR CIROpenCLAttrs.td CIRAttrs.td, clang/lib/CIR/Dialect/IR CIROpenCLAttrs.cpp

[CIR][OpenCL] Add kernel argument metadata attribute

Add a CIR attribute that carries OpenCL kernel argument metadata in source argument order. Verify that each metadata field has the expected element type and that all present arrays describe the same number of arguments.
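
The argument-count part of that verification might look roughly like the following. This is a toy model in plain C++, not the CIR verifier: `ToyKernelArgMetadata`, its field names, and `sameArgCount` are hypothetical, and a null pointer stands in for an absent metadata array.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of the attribute; a null pointer means "field absent".
struct ToyKernelArgMetadata {
  const std::vector<int> *addrSpaces = nullptr;
  const std::vector<std::string> *typeNames = nullptr;
  const std::vector<std::string> *argNames = nullptr;
};

// Returns true when all present arrays describe the same number of arguments.
bool sameArgCount(const ToyKernelArgMetadata &md) {
  bool seen = false;
  std::size_t expected = 0;
  auto check = [&](std::size_t n) -> bool {
    if (!seen) {
      seen = true;
      expected = n;
      return true;
    }
    return n == expected;
  };
  if (md.addrSpaces && !check(md.addrSpaces->size()))
    return false;
  if (md.typeNames && !check(md.typeNames->size()))
    return false;
  if (md.argNames && !check(md.argNames->size()))
    return false;
  return true;
}

// Small usage example for the check above.
void demoSameArgCount() {
  std::vector<int> spaces = {1, 1};
  std::vector<std::string> names = {"a", "b"};
  ToyKernelArgMetadata md;
  md.addrSpaces = &spaces;
  md.argNames = &names;
  assert(sameArgCount(md));
}
```

Absent fields are skipped, so metadata with no arrays at all trivially passes the count check in this model.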
DeltaFile
+78-0clang/test/CIR/IR/invalid-opencl-kernel-arg-metadata.cir
+60-0clang/lib/CIR/Dialect/IR/CIROpenCLAttrs.cpp
+46-0clang/include/clang/CIR/Dialect/IR/CIROpenCLAttrs.td
+27-0clang/test/CIR/IR/opencl-kernel-arg-metadata.cir
+1-0clang/include/clang/CIR/Dialect/IR/CIRAttrs.td
+1-0clang/include/clang/CIR/Dialect/IR/CIRDialect.td
+213-01 files not shown
+214-07 files

LLVM/project 23b5ed0clang/test/CIR/IR invalid-addrspace.cir

fix: Update CIR invalid address space diagnostic
DeltaFile
+1-1clang/test/CIR/IR/invalid-addrspace.cir
+1-11 files

LLVM/project 06fa129clang/lib/CIR/Dialect/IR CIROpenCLAttrs.cpp, clang/test/CIR/IR invalid-opencl-kernel-arg-metadata.cir

fix: Verify kernel arg addr_space values
DeltaFile
+23-1clang/test/CIR/IR/invalid-opencl-kernel-arg-metadata.cir
+14-5clang/lib/CIR/Dialect/IR/CIROpenCLAttrs.cpp
+37-62 files

LLVM/project a0344e9clang-tools-extra/clang-doc JSONGenerator.cpp MDGenerator.cpp

[clang-doc] Move Generator classes into the anonymous namespace (#202058)

Clang-Tidy suggests moving these classes into the anonymous namespace,
to enforce internal linkage.
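
For readers unfamiliar with the pattern, a minimal sketch follows. `ToyMarkdownGenerator` and `renderHeading` are hypothetical names and do not correspond to the actual clang-doc classes.

```cpp
#include <cassert>
#include <string>

namespace {
// Hypothetical generator class. Being in the anonymous namespace gives it
// internal linkage, so it cannot clash with names in other translation units.
class ToyMarkdownGenerator {
public:
  std::string heading(const std::string &name) const { return "# " + name; }
};
} // namespace

// Only this entry point has external linkage.
std::string renderHeading(const std::string &name) {
  assert(!name.empty() && "expected a non-empty name");
  return ToyMarkdownGenerator().heading(name);
}
```

Code in the same file uses the class as usual, while other translation units only see `renderHeading`.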
DeltaFile
+24-26clang-tools-extra/clang-doc/JSONGenerator.cpp
+15-11clang-tools-extra/clang-doc/MDGenerator.cpp
+10-11clang-tools-extra/clang-doc/HTMLGenerator.cpp
+8-4clang-tools-extra/clang-doc/MDMustacheGenerator.cpp
+57-524 files

LLVM/project 84debf4clang-tools-extra/clang-doc Representation.cpp

[clang-doc] Clean up implementation with better casting (#202060)

Having access to RTTI style casting lets us use slightly nicer
structures to clean up the overly complicated dispatch logic in merging
and other places.
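
As a sketch of what such casting buys, the following toy hierarchy imitates LLVM-style RTTI, which relies on a `classof` predicate per class. All names here (`ToyInfo`, `ToyRecord`, `ToyFunction`, `toyDynCast`, `toyMerge`) are hypothetical, and `toyDynCast` is a simplified stand-in for `llvm::dyn_cast`, not clang-doc's actual code.

```cpp
#include <cassert>

struct ToyInfo {
  enum Kind { K_Record, K_Function };
  explicit ToyInfo(Kind k) : kind(k) {}
  Kind getKind() const { return kind; }

private:
  Kind kind;
};

struct ToyRecord : ToyInfo {
  ToyRecord() : ToyInfo(K_Record) {}
  static bool classof(const ToyInfo *i) { return i->getKind() == K_Record; }
  int members = 0;
};

struct ToyFunction : ToyInfo {
  ToyFunction() : ToyInfo(K_Function) {}
  static bool classof(const ToyInfo *i) { return i->getKind() == K_Function; }
  int params = 0;
};

// Simplified stand-in for llvm::dyn_cast: nullptr when the kind does not match.
template <typename To> To *toyDynCast(ToyInfo *i) {
  assert(i && "toyDynCast on a null pointer");
  return To::classof(i) ? static_cast<To *>(i) : nullptr;
}

// Merge src into dst when both have the same dynamic kind.
bool toyMerge(ToyInfo &dst, ToyInfo &src) {
  if (auto *d = toyDynCast<ToyRecord>(&dst))
    if (auto *s = toyDynCast<ToyRecord>(&src)) {
      d->members += s->members;
      return true;
    }
  if (auto *d = toyDynCast<ToyFunction>(&dst))
    if (auto *s = toyDynCast<ToyFunction>(&src)) {
      d->params += s->params;
      return true;
    }
  return false;
}
```

Compared with a switch over the kind followed by unchecked static casts, each branch here both tests and converts in one step, which is the kind of cleanup the message describes.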
DeltaFile
+19-51clang-tools-extra/clang-doc/Representation.cpp
+19-511 files

LLVM/project 048c7f8clang-tools-extra/clang-doc Representation.cpp

[clang-doc] Clean up implementation with better casting

Having access to RTTI style casting lets us use slightly nicer
structures to clean up the overly complicated dispatch logic in merging
and other places.
DeltaFile
+19-51clang-tools-extra/clang-doc/Representation.cpp
+19-511 files

LLVM/project dfc0c22clang-tools-extra/clang-doc JSONGenerator.cpp MDGenerator.cpp

[clang-doc] Move Generator classes into the anonymous namespace

Clang-Tidy suggests moving these classes into the anonymous namespace,
to enforce internal linkage.
DeltaFile
+24-26clang-tools-extra/clang-doc/JSONGenerator.cpp
+15-11clang-tools-extra/clang-doc/MDGenerator.cpp
+10-11clang-tools-extra/clang-doc/HTMLGenerator.cpp
+8-4clang-tools-extra/clang-doc/MDMustacheGenerator.cpp
+57-524 files

LLVM/project baa69e9clang/test lit.cfg.py, clang/test/Driver driverkit-path.c

Updating test clang/test/Driver/driverkit-path.c for usage with CLANG_RESOURCE_DIR (#197154)

When the CMake option CLANG_RESOURCE_DIR is specified, it changes
the path to various tools and thus breaks some tests that look for things
in the "standard" location. This change updates one of the tests to take
into account the CLANG_RESOURCE_DIR value, if specified, by querying the
compiler with `-print-resource-dir` to more accurately find the expected
directory in tests.
DeltaFile
+4-4clang/test/Driver/driverkit-path.c
+7-0clang/test/lit.cfg.py
+11-42 files

LLVM/project b7086e6llvm/test/Transforms/LoopInterchange fp-reductions.ll reductions-across-inner-and-outer-loop.ll

[LoopInterchange] Use UTC as much as possible (NFC)
DeltaFile
+364-49llvm/test/Transforms/LoopInterchange/fp-reductions.ll
+209-117llvm/test/Transforms/LoopInterchange/reductions-across-inner-and-outer-loop.ll
+251-37llvm/test/Transforms/LoopInterchange/reductions-non-wrapped-operations.ll
+188-40llvm/test/Transforms/LoopInterchange/legality-for-scalar-deps.ll
+148-33llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll
+97-45llvm/test/Transforms/LoopInterchange/currentLimitation.ll
+1,257-32116 files not shown
+2,063-56222 files

LLVM/project b542c92flang/lib/Optimizer/CodeGen CodeGen.cpp, flang/test/Fir/CUDA cuda-code-gen.mlir

[flang][CUDA] Allocate converted kernel descriptors in device-accessible storage (#201950)

Fix CUDA descriptor lowering when an `fir.embox` result reaches a
`gpu.launch_func` through an intermediate `fir.convert`.

CodeGen previously failed to recognize this use chain and could place
the descriptor in host stack storage. Since CUDA kernels may dereference
assumed-shape descriptors on the device, such descriptors must be
allocated through the CUDA descriptor allocation path. Teach the
GPU-launch-use check to look through `fir.convert` so these descriptors
are lowered with `_FortranACUFAllocDescriptor`.

Also adds a regression test for the `fir.embox -> fir.convert ->
gpu.launch_func` case.
DeltaFile
+30-0flang/test/Fir/CUDA/cuda-code-gen.mlir
+24-5flang/lib/Optimizer/CodeGen/CodeGen.cpp
+54-52 files

LLVM/project 2aa5210.ci compute_projects_test.py compute_projects.py, .github/workflows libclang-python-tests.yml

CI: move libclang python bindings tests to main CI

This removes the separate python bindings CI, which runs on the GitHub free
runners and takes more than one hour to build libclang.

The tests are executed instead in the monolithic pipelines,
whenever clang would be tested.

This is fine in terms of resources because all the dependencies are
built anyway, and the tests themselves take less than one second to
run on the free runners.
DeltaFile
+0-60.github/workflows/libclang-python-tests.yml
+13-12.ci/compute_projects_test.py
+1-1.ci/compute_projects.py
+2-0clang/bindings/python/tests/cindex/test_source_range.py
+1-0clang/bindings/python/tests/cindex/test_translation_unit.py
+17-735 files

LLVM/project aca0ce5clang/include/clang/AST DeclTemplate.h, clang/lib/AST DeclTemplate.cpp

[clang] Reland: fix getTemplateInstantiationArgs (#202088)

Relands https://github.com/llvm/llvm-project/pull/199528
Previous: #201373

This implements a new strategy for collecting the template arguments, by
relying on the qualifiers and template parameter lists to navigate the
template context of out-of-line definitions.

This greatly simplifies the signature of that function, by removing a
bunch of workarounds, and simplifying a couple that weren't removed yet.

Since this now relies on qualifiers and template parameter lists,
this patch expends most of its effort making sure these are placed,
transformed and propagated to template instantiations.

Also makes the explicit specialization AST nodes stop abusing the

    [2 lines not shown]
DeltaFile
+194-429clang/lib/Sema/SemaTemplateInstantiate.cpp
+275-165clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
+151-147clang/lib/Sema/SemaTemplate.cpp
+96-95clang/include/clang/AST/DeclTemplate.h
+59-129clang/lib/Sema/SemaConcept.cpp
+60-92clang/lib/AST/DeclTemplate.cpp
+835-1,05756 files not shown
+1,505-1,71762 files

LLVM/project a148c11.ci compute_projects_test.py compute_projects.py, .github/workflows libclang-python-tests.yml

CI: move libclang python bindings tests to main CI

This removes the separate python bindings CI, which runs on the GitHub free
runners and takes more than one hour to build libclang.

The tests are executed instead in the monolithic pipelines,
whenever clang would be tested.

This is fine in terms of resources because all the dependencies are
built anyway, and the tests themselves take less than one second to
run on the free runners.
DeltaFile
+0-60.github/workflows/libclang-python-tests.yml
+13-12.ci/compute_projects_test.py
+8-3clang/bindings/python/tests/cindex/test_source_range.py
+3-1clang/bindings/python/tests/cindex/test_translation_unit.py
+1-1.ci/compute_projects.py
+25-775 files

LLVM/project dfae3c0.ci compute_projects_test.py compute_projects.py, .github/workflows libclang-python-tests.yml

CI: move libclang python bindings tests to main CI

This removes the separate python bindings CI, which runs on the GitHub free
runners and takes more than one hour to build libclang.

The tests are executed instead in the monolithic pipelines,
whenever clang would be tested.

This is fine in terms of resources because all the dependencies are
built anyway, and the tests themselves take less than one second to
run on the free runners.
DeltaFile
+0-60.github/workflows/libclang-python-tests.yml
+13-12.ci/compute_projects_test.py
+6-3clang/bindings/python/tests/cindex/test_source_range.py
+3-1clang/bindings/python/tests/cindex/test_translation_unit.py
+1-1.ci/compute_projects.py
+23-775 files

LLVM/project 197282cllvm/test/Transforms/LoopInterchange reduction-anyof.ll reductions-non-wrapped-operations.ll

[LoopInterchange] Add test for a loop containing an AnyOf reduction (NFC)
DeltaFile
+90-0llvm/test/Transforms/LoopInterchange/reduction-anyof.ll
+0-42llvm/test/Transforms/LoopInterchange/reductions-non-wrapped-operations.ll
+90-422 files

LLVM/project 498307bllvm/lib/Transforms/Scalar LoopInterchange.cpp, llvm/test/Transforms/LoopInterchange reduction-anyof.ll

[LoopInterchange] Reject interchange when AnyOf reduction exists
DeltaFile
+10-22llvm/test/Transforms/LoopInterchange/reduction-anyof.ll
+0-1llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+10-232 files