LLVM/project 32aca4cmlir/include/mlir/Interfaces ValueBoundsOpInterface.h, mlir/lib/Dialect/Affine/Transforms ReifyValueBounds.cpp

[mlir][Interfaces] Allow integer types for `ValueBoundsOpInterface`
DeltaFile
+74-18mlir/lib/Dialect/Arith/Transforms/ReifyValueBounds.cpp
+44-29mlir/lib/Interfaces/ValueBoundsOpInterface.cpp
+47-9mlir/lib/Dialect/Affine/Transforms/ReifyValueBounds.cpp
+35-14mlir/include/mlir/Interfaces/ValueBoundsOpInterface.h
+41-0mlir/test/Dialect/Arith/value-bounds-op-interface-impl.mlir
+14-7mlir/test/lib/Dialect/Affine/TestReifyValueBounds.cpp
+255-7714 files not shown
+301-11120 files

LLVM/project dc78a52clang/test/CXX/drs cwg13xx.cpp, clang/www cxx_dr_status.html

[clang][NFC] Mark CWG1336 as implemented and add a test (#196000)

[CWG1336](https://wg21.link/cwg1336) clarifies that, as long as it isn't
explicit, a constructor is still a converting constructor even if it has
multiple arguments. Clang seems to have implemented this since 3.1:
https://godbolt.org/z/919zdMd3h (I checked a few versions following 3.1
as well, and didn't notice any regressions).
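
As a rough illustration of the rule (not the test added in this commit), the sketch below uses hypothetical types `Point` and `Strict` to show a non-explicit two-parameter constructor taking part in implicit conversions from a braced list, while the explicit one must be spelled out:

```cpp
#include <cassert>

// Hypothetical type for illustration; not taken from the Clang test.
struct Point {
  int x;
  int y;
  // Not declared explicit, so this two-parameter constructor is a
  // converting constructor and can be used in copy-list-initialization.
  Point(int px, int py) : x(px), y(py) {}
};

// Hypothetical counterpart whose constructor is explicit.
struct Strict {
  int x;
  int y;
  // explicit: not a converting constructor, so `Strict s = {1, 2};`
  // and `sumStrict({1, 2})` would be ill-formed.
  explicit Strict(int px, int py) : x(px), y(py) {}
};

// The braced list at the call site is implicitly converted to Point.
int sum(Point p) { return p.x + p.y; }

// Returning a braced list also relies on the converting constructor.
Point origin() { return {0, 0}; }

// Callers have to name the type: sumStrict(Strict{3, 4}).
int sumStrict(Strict s) { return s.x + s.y; }
```

With these definitions, `sum({1, 2})` compiles and yields 3, whereas `sumStrict` only accepts an explicitly constructed `Strict`.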
DeltaFile
+19-0clang/test/CXX/drs/cwg13xx.cpp
+1-1clang/www/cxx_dr_status.html
+20-12 files

LLVM/project 6020a97mlir/lib/Dialect/SCF/Transforms ParallelLoopFusion.cpp

[MLIR] Fix use-after-scope when interchanging ploops (#196076)

getInductionVars returns a SmallVector, so going through zip+reverse
gets us a dangling reference. Quite a footgun.

Found by asan.
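
A minimal sketch of the hazard, using only the standard library. `getInductionVarsLike` and `ReversedRef` are hypothetical stand-ins for a by-value getter and a reference-holding range adaptor; this is not the LLVM ADT code or the actual fix. It assumes C++17 rules, where temporaries in a range-for initializer are not lifetime-extended:

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a getter that returns a container by value.
std::vector<int> getInductionVarsLike() { return {1, 2, 3}; }

// Hypothetical adaptor that only stores a reference to the wrapped range.
struct ReversedRef {
  const std::vector<int> &range;
  std::vector<int>::const_reverse_iterator begin() const {
    return range.rbegin();
  }
  std::vector<int>::const_reverse_iterator end() const { return range.rend(); }
};

ReversedRef reversedRef(const std::vector<int> &v) { return ReversedRef{v}; }

// Dangerous pattern (kept as a comment on purpose):
//   for (int v : reversedRef(getInductionVarsLike())) { ... }
// In C++17 the temporary vector is destroyed at the end of the
// full-expression that builds the adaptor, so the loop body would read
// freed memory.

// Safe pattern: bind the container to a name that outlives the loop.
int lastVisited() {
  std::vector<int> vars = getInductionVarsLike();
  int last = 0;
  for (int v : reversedRef(vars))
    last = v;
  return last;
}
```

Keeping the returned container in a named local is one straightforward way to avoid the dangling reference.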
DeltaFile
+3-2mlir/lib/Dialect/SCF/Transforms/ParallelLoopFusion.cpp
+3-21 files

LLVM/project 459d174llvm/lib/Target/X86/GISel X86CallLowering.cpp, llvm/test/CodeGen/X86/GlobalISel irtranslator-callingconv.ll

[X86][GlobalISel] Extend scalar float values to s80 when returning in FP0/FP1 registers (#196009)

**Reference PR** - https://github.com/llvm/llvm-project/pull/167919
DeltaFile
+6-3llvm/test/CodeGen/X86/GlobalISel/irtranslator-callingconv.ll
+3-0llvm/lib/Target/X86/GISel/X86CallLowering.cpp
+9-32 files

LLVM/project 6a9dac2clang/lib/Driver Driver.cpp, clang/test/Driver aix-object-mode.c

[AIX][Clang][Driver] Fix OBJECT_MODE bug on AIX (#193550)

If `--target` is specified it should take precedence over `OBJECT_MODE`.
This is important, for example, for lit tests which want to specify an
explicitly 32-bit or 64-bit triple on AIX, or they may get the wrong bit
mode depending on the environment they run in.
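
The intended precedence can be sketched as follows. `pickBitMode` and its parameters are hypothetical and do not mirror the code in `clang/lib/Driver/Driver.cpp`; the sketch also assumes that only the values `32` and `64` of `OBJECT_MODE` are of interest and falls back to the default for anything else:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not the driver's actual logic.
//   targetBits:  bit width implied by an explicit --target, 0 if none given.
//   objectMode:  value of the OBJECT_MODE environment variable, empty if unset.
//   defaultBits: bit width of the default triple.
int pickBitMode(int targetBits, const std::string &objectMode,
                int defaultBits) {
  // An explicit --target takes precedence over the environment.
  if (targetBits != 0)
    return targetBits;
  // Otherwise OBJECT_MODE may select the bit mode.
  if (objectMode == "64")
    return 64;
  if (objectMode == "32")
    return 32;
  // Assumption of this sketch: any other value leaves the default untouched.
  return defaultBits;
}
```

For example, `pickBitMode(32, "64", 32)` stays at 32 because the explicit target wins.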
DeltaFile
+10-3clang/test/Driver/aix-object-mode.c
+4-1clang/lib/Driver/Driver.cpp
+14-42 files

LLVM/project af626e7clang/test/AST ast-dump-templates.cpp, llvm/test/CodeGen/RISCV rvp-simd-64.ll atomic-rmw.ll

Merge branch 'main' into users/jdenny-ornl/filecheck-diaglist
DeltaFile
+652-9,343clang/test/AST/ast-dump-templates.cpp
+5,061-4,162llvm/test/CodeGen/Thumb2/mve-clmul.ll
+6,873-0llvm/test/tools/llvm-mca/AArch64/Cortex/C1Premium-sve-instructions.s
+4,652-0llvm/test/CodeGen/RISCV/rvp-simd-64.ll
+2,420-2,120llvm/test/CodeGen/RISCV/atomic-rmw.ll
+2,940-1,458llvm/test/CodeGen/X86/vector-reduce-smin.ll
+22,598-17,0835,314 files not shown
+243,030-113,5005,320 files

LLVM/project aa2fe53llvm/utils/gn/secondary/llvm/lib/Transforms/IPO BUILD.gn

[gn build] Port 7b4175c17f16 (#196073)
DeltaFile
+2-0llvm/utils/gn/secondary/llvm/lib/Transforms/IPO/BUILD.gn
+2-01 files

LLVM/project 5f43760llvm/include/llvm/Support UndefPoison.h, llvm/lib/Analysis ValueTracking.cpp

[Support] Move UndefPoisonKind enum to a shared header (#195523)

This patch moves the **`UndefPoisonKind`** enum to a shared header in
`llvm/include/llvm/Support/UndefPoison.h` to resolve the dependency
issues identified in #194818.

Changes:

- Created the new header `llvm/include/llvm/Support/UndefPoison.h`.

- Removed duplicate local definitions from
`llvm/lib/Analysis/ValueTracking.cpp` and
`llvm/lib/CodeGen/GlobalISel/Utils.cpp`.
DeltaFile
+40-0llvm/include/llvm/Support/UndefPoison.h
+1-16llvm/lib/CodeGen/GlobalISel/Utils.cpp
+1-14llvm/lib/Analysis/ValueTracking.cpp
+42-303 files

LLVM/project 97c6c00mlir/include/mlir/Interfaces ValueBoundsOpInterface.h, mlir/lib/Dialect/Affine/Transforms ReifyValueBounds.cpp

[mlir][Interfaces] Allow integer types for `ValueBoundsOpInterface`
DeltaFile
+71-19mlir/lib/Dialect/Arith/Transforms/ReifyValueBounds.cpp
+44-29mlir/lib/Interfaces/ValueBoundsOpInterface.cpp
+35-14mlir/include/mlir/Interfaces/ValueBoundsOpInterface.h
+15-8mlir/lib/Dialect/Affine/Transforms/ReifyValueBounds.cpp
+16-0mlir/test/Dialect/Arith/value-bounds-op-interface-impl.mlir
+11-5mlir/test/lib/Dialect/Affine/TestReifyValueBounds.cpp
+192-7514 files not shown
+232-10820 files

LLVM/project 480ba1eclang/lib/Driver/ToolChains HIPAMD.cpp, clang/test/Driver hip-spirv-linker-crash.c

[Driver][HIP/SPIRV] Fix crash when llvm-link is executed.

A design limitation causes flags to be forwarded to llvm-link when they
should not be. This commit fixes the issue by sanitizing the arguments
forwarded to llvm-link.

This may happen when clang-linker-wrapper eventually calls clang.
Crash reproducer is here: https://gcc.godbolt.org/z/rxvWcvan3.

The fix is based on MrSidims's old PR (#183492).

Co-authored-by: Dmitry Sidorov <18708689+MrSidims at users.noreply.github.com>
Co-authored-by: Manuel Carrasco <manuel.carrasco at amd.com>
DeltaFile
+16-0clang/test/Driver/hip-spirv-linker-crash.c
+5-2clang/lib/Driver/ToolChains/HIPAMD.cpp
+21-22 files

LLVM/project 728f99fllvm/lib/Target/PowerPC PPCInstrInfo.td PPCInstr64Bit.td

[PowerPC] Remove duplicate patterns for atomic_swap (#195936)

The definitions and implementations of atomic_load_* and atomic_swap are
essentially the same. Changing how the operations are enumerated
makes it possible to remove the separate patterns for atomic_swap.
DeltaFile
+5-23llvm/lib/Target/PowerPC/PPCInstrInfo.td
+6-12llvm/lib/Target/PowerPC/PPCInstr64Bit.td
+11-352 files

LLVM/project bc4ffe8mlir/include/mlir/Dialect/SPIRV/IR SPIRVTosaTypes.td SPIRVTosaOps.td, mlir/lib/Dialect/SPIRV/IR SPIRVTosaOps.cpp

[mlir][spirv] Improve verification for SPIR-V TOSA ops (#195624)

Add shape and attribute verification for several SPIR-V TOSA ops:
reductions, FFT2D, RFFT2D, MatMul, Clamp, Concat, and Resize.

Add negative parser/verification tests for the new checks.

Signed-off-by: Davide Grohmann <davide.grohmann at arm.com>
DeltaFile
+152-20mlir/include/mlir/Dialect/SPIRV/IR/SPIRVTosaTypes.td
+137-0mlir/test/Dialect/SPIRV/IR/tosa-ops-verification.mlir
+49-0mlir/lib/Dialect/SPIRV/IR/SPIRVTosaOps.cpp
+17-5mlir/include/mlir/Dialect/SPIRV/IR/SPIRVTosaOps.td
+355-254 files

LLVM/project 51698f1llvm/lib/Target/AArch64 AArch64ISelLowering.cpp, llvm/test/CodeGen/AArch64 cmtst-neg-and-one.ll

[AArch64] Match vector neg(and X, 1) as CMTST (#194833)

AArch64 already recognizes vector icmp/sext forms such as
sext(icmp ne (and X, C), 0) as CMTST.

However, for bit-zero mask idioms, the middle-end can canonicalize the
expression to sub 0, (and X, 1). This produces a 0/-1 vector mask, but
currently lowers to and+neg instead of CMTST.

Recognize vector neg(and X, splat(1)) / sub 0, (and X, splat(1)) as a
CMTST idiom.

The match is intentionally limited to exact splat(1). For example,
neg(and X, 2) produces 0/-2, not a 0/-1 mask, and is not equivalent to
CMTST.

Fixes #107093.
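
The reasoning about splat(1) versus other constants can be checked with a per-lane scalar model. The sketch below is only an arithmetic illustration on 8-bit lanes with hypothetical helper names; it is not the DAG combine added in `AArch64ISelLowering.cpp`:

```cpp
#include <cassert>
#include <cstdint>

// Per-lane reference for a CMTST-style test: all ones if (x & mask) != 0,
// otherwise zero.
std::uint8_t testBits(std::uint8_t x, std::uint8_t mask) {
  return (x & mask) != 0 ? 0xFF : 0x00;
}

// Per-lane model of sub 0, (and X, C), wrapping around in 8 bits.
std::uint8_t negAnd(std::uint8_t x, std::uint8_t c) {
  return static_cast<std::uint8_t>(0u - (x & c));
}

// Returns true if neg(and x, c) equals the test mask for every 8-bit x.
bool matchesTestForAllInputs(std::uint8_t c) {
  for (unsigned x = 0; x < 256; ++x) {
    std::uint8_t lane = static_cast<std::uint8_t>(x);
    if (negAnd(lane, c) != testBits(lane, c))
      return false;
  }
  return true;
}
```

In this model `matchesTestForAllInputs(1)` holds, while `matchesTestForAllInputs(2)` does not, because `negAnd(2, 2)` is `0xFE` rather than `0xFF`.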
DeltaFile
+109-0llvm/test/CodeGen/AArch64/cmtst-neg-and-one.ll
+25-0llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+134-02 files

LLVM/project b130c80clang/test/AST ast-dump-templates.cpp, llvm/test/CodeGen/RISCV atomic-rmw.ll

Merge branch 'users/s-perron/constantbuffer-spirv-getbasepointer' into users/s-perron/constantbuffer-constantbuffer-t
DeltaFile
+652-9,343clang/test/AST/ast-dump-templates.cpp
+6,873-0llvm/test/tools/llvm-mca/AArch64/Cortex/C1Premium-sve-instructions.s
+2,420-2,120llvm/test/CodeGen/RISCV/atomic-rmw.ll
+2,940-1,458llvm/test/CodeGen/X86/vector-reduce-smin.ll
+2,936-1,457llvm/test/CodeGen/X86/vector-reduce-smax.ll
+2,969-1,160llvm/test/CodeGen/X86/vector-reduce-mul.ll
+18,790-15,5383,419 files not shown
+157,322-69,8233,425 files

LLVM/project 56dea64clang/test/AST ast-dump-templates.cpp, llvm/test/CodeGen/RISCV atomic-rmw.ll

Merge branch 'main' into users/s-perron/constantbuffer-spirv-getbasepointer
DeltaFile
+652-9,343clang/test/AST/ast-dump-templates.cpp
+6,873-0llvm/test/tools/llvm-mca/AArch64/Cortex/C1Premium-sve-instructions.s
+2,420-2,120llvm/test/CodeGen/RISCV/atomic-rmw.ll
+2,940-1,458llvm/test/CodeGen/X86/vector-reduce-smin.ll
+2,936-1,457llvm/test/CodeGen/X86/vector-reduce-smax.ll
+2,969-1,160llvm/test/CodeGen/X86/vector-reduce-mul.ll
+18,790-15,5383,419 files not shown
+157,322-69,8233,425 files

LLVM/project f08b4ffclang/lib/CodeGen CGHLSLBuiltins.cpp CGHLSLRuntime.h, clang/lib/Sema SemaHLSL.cpp

[HLSL] Allow __builtin_hlsl_resource_getpointer to take no indices (#195151)

In preparation for adding ConstantBuffer<T>, we will need to be able to
access the base pointer for the data that the constant buffer resource
handle is pointing to. This is done by:

1. Making the index operand in __builtin_hlsl_resource_getpointer
   optional.
2. Modifying the codegen for __builtin_hlsl_resource_getpointer to emit a
   call to resource.getbasepointer when no index is provided.
3. Adding the resource.getbasepointer intrinsic for the dx and spv targets.

Another issue is that the address space for the pointer returned by
__builtin_hlsl_resource_getpointer is not always hlsl_device any more.
Changes are made to get the correct address space based on the resource
class of the handle.

Note that we cannot implement codegen for

    [17 lines not shown]
DeltaFile
+23-7clang/lib/Sema/SemaHLSL.cpp
+11-3clang/lib/CodeGen/CGHLSLBuiltins.cpp
+5-5clang/test/SemaHLSL/BuiltIns/resource_getpointer-errors.hlsl
+4-0llvm/include/llvm/IR/IntrinsicsSPIRV.td
+4-0llvm/include/llvm/IR/IntrinsicsDirectX.td
+2-0clang/lib/CodeGen/CGHLSLRuntime.h
+49-156 files

LLVM/project 4d30700utils/bazel/llvm-project-overlay/mlir BUILD.bazel, utils/bazel/llvm-project-overlay/mlir/unittests BUILD.bazel

[Bazel] Fixes 0ff5c32 (#196065)

This fixes 0ff5c32c28219ea7e75869678fb5fe3b1b4b0e0d.

Co-authored-by: Google Bazel Bot <google-bazel-bot at google.com>
DeltaFile
+1-0utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
+1-0utils/bazel/llvm-project-overlay/mlir/unittests/BUILD.bazel
+2-02 files

LLVM/project 24346a3bolt/lib/Passes StokeInfo.cpp, bolt/test/AArch64 unsupported-passes.test

[BOLT][AArch64] Refuse to run Stoke analysis on AArch64 (#195878)

`--stoke` and `--stoke-out` yield an UNIMPLEMENTED crash on AArch64.
Stoke is fundamentally an X86 pass.

- Add a non-X86 guard
- Add the error message to unsupported-passes.test.
DeltaFile
+5-0bolt/lib/Passes/StokeInfo.cpp
+2-0bolt/test/AArch64/unsupported-passes.test
+7-02 files

LLVM/project ccdf56allvm/utils/gn/build sync_dir.py write_file.py

[gn] Add +x bit on scripts missing it (#196064)

rg -l '#!' llvm/utils/gn/build/*.py | xargs chmod +x

No effective behavior change. Makes it easier to run these scripts
manually.
DeltaFile
+0-0llvm/utils/gn/build/sync_dir.py
+0-0llvm/utils/gn/build/write_file.py
+0-0llvm/utils/gn/build/write_library_dependencies.py
+0-0llvm/utils/gn/build/remove_if_exists.py
+0-04 files

LLVM/project 7b4175cllvm/include/llvm/Transforms/IPO Instrumentor.h, llvm/lib/Transforms/IPO Instrumentor.cpp InstrumentorConfigFile.cpp

[Instrumentor] Add Instrumentor pass (#138958)

This commit adds the basic infrastructure for the Instrumentor pass, which
allows instrumenting code in a simple and customizable way. This commit
adds support for instrumenting load and store instructions. The
Instrumentor can be configured with a JSON file that describes what
should be instrumented, or can be used programmatically from another
pass.

The default JSON config file can be found in:
`llvm/test/Instrumentation/Instrumentor/default_config.json`. More
information about Instrumentor in the
[RFC](https://discourse.llvm.org/t/rfc-introducing-instrumentor-easily-customizable-code-instrumentation/86020).

This is only a squash commit of several contributions to the
Instrumentor. The authors and contributors of this pass are:

- Johannes Doerfert @jdoerfert
- Kevin Sala @kevinsala

    [7 lines not shown]
DeltaFile
+767-0llvm/lib/Transforms/IPO/Instrumentor.cpp
+707-0llvm/include/llvm/Transforms/IPO/Instrumentor.h
+226-0llvm/test/Instrumentation/Instrumentor/load_store_args.ll
+225-0llvm/lib/Transforms/IPO/InstrumentorConfigFile.cpp
+217-0llvm/test/Instrumentation/Instrumentor/load_store.ll
+208-0llvm/test/Instrumentation/Instrumentor/load_store_noreplace.ll
+2,350-014 files not shown
+2,984-020 files

LLVM/project 66276d9flang/lib/Optimizer/OpenMP MapInfoFinalization.cpp

[flang] Fix unused variable (NFC) (#195994)
DeltaFile
+1-1flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp
+1-11 files

LLVM/project 4f8d785clang/test/CodeGen/AArch64 lit.local.cfg neon-intrinsics.c, clang/test/CodeGen/AArch64/neon bf16-getset.c fullfp16.c

[clang][test] Add `%clang_cc1_cg_arm64_neon` substitution (#188547)

Add a LIT substitution `%clang_cc1_cg_arm64_neon` expanding to:
```shell
  clang -cc1 -internal-isystem <path> \
    -triple arm64-none-linux-gnu \
    -target-feature +neon -o -
```
This invocation is repeated across multiple tests. Introducing a
substitution reduces duplication, shortens RUN lines, and ensures
consistency across `clang -cc1` invocations.

Shorter RUN lines also make test-specific flags easier to spot.
DeltaFile
+13-0clang/test/CodeGen/AArch64/lit.local.cfg
+3-3clang/test/CodeGen/AArch64/neon/bf16-getset.c
+3-3clang/test/CodeGen/AArch64/neon/fullfp16.c
+3-3clang/test/CodeGen/AArch64/neon/intrinsics.c
+1-5clang/test/CodeGen/AArch64/neon-intrinsics.c
+1-1clang/test/CodeGen/AArch64/bf16-getset-intrinsics.c
+24-156 files

LLVM/project 22f70cbllvm/utils/gn/build sync_dir.py sync_source_dir.py, llvm/utils/gn/secondary/libcxx/include BUILD.gn

[gn] Rename sync_source_dir.py to just sync_dir.py (#196059)

No behavior change.
DeltaFile
+59-0llvm/utils/gn/build/sync_dir.py
+0-59llvm/utils/gn/build/sync_source_dir.py
+1-1llvm/utils/gn/secondary/libcxx/include/BUILD.gn
+60-603 files

LLVM/project 9d9f1eemlir/lib/Dialect/GPU/IR GPUDialect.cpp, mlir/test/Dialect/GPU invalid.mlir

[mlir][gpu] Reject conflicting async operands on gpu.launch_func (#196012)

Reject gpu.launch_func ops that have both async dependencies and an
explicit async object.
DeltaFile
+19-0mlir/test/Dialect/GPU/invalid.mlir
+4-0mlir/lib/Dialect/GPU/IR/GPUDialect.cpp
+23-02 files

LLVM/project 0ff5c32mlir/include/mlir/IR BuiltinAttributes.td BuiltinAttributes.h, mlir/include/mlir/Support Complex.h

[mlir] Use custom mlir::Complex type for non-float complex numbers (#191821)

Instantiating std::complex for types where std::is_floating_point<T> is
false is not allowed, and throws warnings when building with MSSTL. This
patch fixes those warnings by introducing an mlir::Complex type, which
is a typedef to std::complex when T satisfies is_floating_point, and a
custom complex type otherwise.

The std::complex implementation from libc++ has been used as a guide for
implementing the custom type.

Fixes #65255
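
The type selection described above can be sketched with `std::conditional_t`. The names in namespace `sketch` are hypothetical and the fallback only models storage and addition; the real `mlir::Complex` in `mlir/include/mlir/Support/Complex.h` may be structured differently:

```cpp
#include <cassert>
#include <complex>
#include <type_traits>

namespace sketch {

// Hypothetical minimal fallback for non-floating-point element types.
template <typename T>
class BasicComplex {
public:
  constexpr BasicComplex(T re = T(), T im = T()) : re_(re), im_(im) {}
  constexpr T real() const { return re_; }
  constexpr T imag() const { return im_; }
  friend constexpr BasicComplex operator+(BasicComplex a, BasicComplex b) {
    return BasicComplex(a.re_ + b.re_, a.im_ + b.im_);
  }

private:
  T re_;
  T im_;
};

// Select std::complex for floating-point T and the fallback otherwise.
// Merely naming std::complex<T> here does not instantiate it.
template <typename T>
using Complex = std::conditional_t<std::is_floating_point<T>::value,
                                   std::complex<T>, BasicComplex<T>>;

} // namespace sketch

static_assert(
    std::is_same<sketch::Complex<double>, std::complex<double>>::value,
    "floating-point types map to std::complex");
static_assert(
    std::is_same<sketch::Complex<int>, sketch::BasicComplex<int>>::value,
    "integer types map to the fallback");
```

With this alias, code written against `sketch::Complex<T>` never instantiates `std::complex` for an integer `T`, which is the property the warning fix relies on.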
DeltaFile
+269-0mlir/include/mlir/Support/Complex.h
+256-0mlir/unittests/Support/ComplexTest.cpp
+15-15mlir/include/mlir/IR/BuiltinAttributes.td
+14-12mlir/include/mlir/IR/BuiltinAttributes.h
+8-8mlir/lib/IR/BuiltinAttributes.cpp
+4-4mlir/unittests/IR/AttributeTest.cpp
+566-396 files not shown
+579-4712 files

LLVM/project bb128b7openmp/device/src Reduction.cpp

[OpenMP][offload] Inline target reductions

Significantly reduces register usage and removes register spilling in
`offload/test/offloading/multiple_reductions.cpp`, for example.
Provides speedups of up to 5-10x for many reductions in such a larger
setup.
DeltaFile
+16-5openmp/device/src/Reduction.cpp
+16-51 files

LLVM/project 325463foffload/test/offloading multiple_reductions.cpp

[OpenMP][offload] Add enhanced cross-team reduction test

Tests different patterns of OpenMP cross-team reductions, for multiple
data types.
If run with `LIBOMPTARGET_INFO=16`, shows current register spilling due
to dispatch jump chains (which grow for every reduction in the same
translation unit) for indirect function calls in the reduction runtime.
DeltaFile
+129-0offload/test/offloading/multiple_reductions.cpp
+129-01 files

LLVM/project 3ed76d0utils/bazel/llvm-project-overlay/libc BUILD.bazel, utils/bazel/llvm-project-overlay/libc/test/src/sys/socket BUILD.bazel

[Bazel] Fixes 7cea026 (#196033)

This fixes 7cea026109ab3308cafae38dc5b1b89d8770fdab.

Co-authored-by: Google Bazel Bot <google-bazel-bot at google.com>
DeltaFile
+11-0utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+5-0utils/bazel/llvm-project-overlay/libc/test/src/sys/socket/BUILD.bazel
+16-02 files

LLVM/project cf24489llvm/lib/Target/AArch64 AArch64InstrInfo.cpp, llvm/unittests/Target/AArch64 InstSizes.cpp

[AArch64] Report accurate sizes for MOVaddr and MOVimm pseudos
DeltaFile
+89-0llvm/unittests/Target/AArch64/InstSizes.cpp
+28-16llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+117-162 files

LLVM/project 60edd8cllvm/lib/Target/AArch64 AArch64ExpandPseudo.cpp AArch64ExpandImm.cpp

[NFC][AArch64] Extract MOVaddr* expansion model into common header

This makes the expansion logic reusable by getInstSizeInBytes in a
follow-up patch.
DeltaFile
+742-0llvm/lib/Target/AArch64/AArch64ExpandPseudo.cpp
+0-722llvm/lib/Target/AArch64/AArch64ExpandImm.cpp
+75-56llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
+42-0llvm/lib/Target/AArch64/AArch64ExpandPseudo.h
+0-35llvm/lib/Target/AArch64/AArch64ExpandImm.h
+10-9llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+869-8226 files not shown
+889-84212 files