LLVM/project f146677llvm/lib/CodeGen/SelectionDAG TargetLowering.cpp

[TargetLowering] Refactor expandDIVREMByConstant to share more code. NFC (#187582)

Make the (1 << HBitWidth) % Divisor == 1 path a special case within
the recently added chunk summing algorithm. This allows us to
share the trailing zero shifting code.

While there, make some comment improvements and avoid creating
unnecessary nodes.
DeltaFile
+76-93llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
+76-931 files

LLVM/project 34203a5llvm/docs RISCVUsage.rst

[RISCV][Docs] Removed 'specified in' text from SiFive custom instruction links. NFC (#187817)

The URL isn't printed, the text in the backticks is the link text.
DeltaFile
+3-3llvm/docs/RISCVUsage.rst
+3-31 files

LLVM/project 0adf4efflang/lib/Lower/OpenMP ClauseProcessor.cpp, flang/test/Lower/OpenMP declare-simd.f90

Fix declare simd linear stride rescaling and arg_types verifier

1. Rescale constant linear steps from source-level element counts to byte
   strides in Flang's processLinear(). For reference-like parameters
   (pointers or non-VALUE dummy arguments) with Linear or LinearRef ABI
   kind, the step must be multiplied by the element size in bytes. This
   matches Clang's rescaling in CGOpenMPRuntime.cpp. Val and UVal kinds
   are not rescaled as they describe value changes, not pointer strides.
   Var-strides are also not rescaled as the value is an argument index.

2. Add a verifier check in DeclareSimdOp to ensure 'arg_types' length
   matches the number of function arguments, preventing out-of-bounds
   access during MLIR-to-LLVM IR translation.

Also restructure processLinear() to compute stepOperand per-variable
instead of appending the same operand for all objects in the clause,
enabling per-variable rescaling.

Assisted with copilot.
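
The rescaling rule in item 1 can be sketched as a small decision function. This is an illustration only, written in plain C++ rather than against the Flang or MLIR APIs; `LinearKind`, `LinearParam` and `rescaleLinearStep` are hypothetical names chosen to mirror the terms used in the message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the ABI kinds named in the commit message.
enum class LinearKind { Linear, LinearRef, Val, UVal };

// Hypothetical description of one variable in a linear clause.
struct LinearParam {
  LinearKind Kind;
  bool IsReferenceLike;    // pointer or non-VALUE dummy argument
  bool IsVarStride;        // step is an argument index, not a constant
  int64_t ElemSizeInBytes; // size of one element of the variable's type
};

// Convert a source-level step (element count) into the step to record.
constexpr int64_t rescaleLinearStep(int64_t Step, const LinearParam &P) {
  if (P.IsVarStride)
    return Step; // an argument index, not a stride
  if (!P.IsReferenceLike)
    return Step; // passed by value: the step is a value change
  if (P.Kind == LinearKind::Val || P.Kind == LinearKind::UVal)
    return Step; // describes value changes, not pointer strides
  return Step * P.ElemSizeInBytes; // Linear / LinearRef: byte stride
}

// Small runtime self-check of the three cases.
inline void checkRescale() {
  assert(rescaleLinearStep(2, LinearParam{LinearKind::Linear, true, false, 4}) == 8);
  assert(rescaleLinearStep(2, LinearParam{LinearKind::Val, true, false, 4}) == 2);
  assert(rescaleLinearStep(1, LinearParam{LinearKind::LinearRef, true, true, 8}) == 1);
}
```

Under these assumptions, a step of 2 on a reference-like 4-byte integer with the Linear kind becomes a byte stride of 8, while the same step with the Val kind stays 2.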
DeltaFile
+49-6flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+17-17mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+7-7flang/test/Lower/OpenMP/declare-simd.f90
+8-0mlir/test/Dialect/OpenMP/invalid.mlir
+7-0mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+2-2mlir/test/Dialect/OpenMP/ops.mlir
+90-322 files not shown
+93-358 files

LLVM/project df9eb79clang/lib/CodeGen CodeGenTypes.cpp, clang/test/CodeGen builtins-extended-image.c builtins-image-load.c

[Clang][AMDGPU] Lower `__amdgpu_texture_t` to `<8 x i32>` instead of ptr addrspace(0) (#187774)

Fix the IR lowering for `__amdgpu_texture_t` to generate a single
256-bit load instead of a double indirection through a flat pointer.

Previously, `__amdgpu_texture_t` was lowered to `ptr addrspace(0)`
(64-bit flat pointer), which caused the double load and indirection.
The reproducer is the same as in #187697:

```c
#define TSHARP __constant uint *

// Old tsharp handling:
// #define LOAD_TSHARP(I) *(__constant uint8 *)I

#define LOAD_TSHARP(I) *(__constant __amdgpu_texture_t *)I

float4 test_image_load_1D(TSHARP i, int c) {
  return __builtin_amdgcn_image_load_1d_v4f32_i32(15, c, LOAD_TSHARP(i), 0, 0);
```

    [24 lines not shown]
DeltaFile
+220-264clang/test/CodeGen/builtins-extended-image.c
+210-252clang/test/CodeGen/builtins-image-load.c
+140-168clang/test/CodeGen/builtins-image-store.c
+5-5clang/test/CodeGen/amdgpu-image-rsrc-type-debug-info.c
+7-2clang/lib/CodeGen/CodeGenTypes.cpp
+582-6915 files

LLVM/project d818fa4mlir/python/mlir/dialects ext.py, mlir/test/python/dialects ext.py transform_op_interface.py

[MLIR][Python] Make init parameters follow the field definition order (#186574)

Currently, Python-defined operations automatically generate an
`__init__` function to serve as the operation builder. Previously, the
parameters of this `__init__` function followed a fairly complex set of
rules. For example:

* All result fields were moved to the front to align with other op
builders.
* Fields of `Optional` type were automatically moved to the end and
treated as keyword parameters.
* If the types of all result fields could be inferred automatically,
then all result fields were removed from the parameter list.
* Other than that, the parameter order followed the field definition
order.

These rules may seem reasonable, and they have worked well in practice,
but they have one major drawback: users cannot easily tell what the
actual `__init__` parameter list will look like when writing code,

    [28 lines not shown]
DeltaFile
+111-48mlir/python/mlir/dialects/ext.py
+72-31mlir/test/python/dialects/ext.py
+8-8mlir/test/python/dialects/transform_op_interface.py
+3-3mlir/test/python/integration/dialects/bf.py
+194-904 files

LLVM/project 93d256bllvm/lib/DebugInfo/PDB/Native TpiStreamBuilder.cpp, llvm/test/DebugInfo/PDB pdbdump-mergetypes.test

[llvm-pdbutil] Hash type records in yaml2pdb (#187593)

The TPI and IPI streams didn't include a hash map for the generated
types, because the types never got hashes. A hash map is necessary to
resolve forward references when dumping the PDB (checks for
`TpiStream::supportsTypeLookup` which checks the hash map).

With this PR, the hashes are generated. There's no good test that we do
this for the IPI stream as well, because it doesn't have forward
references.
DeltaFile
+64-0llvm/test/tools/llvm-pdbutil/Inputs/forward-refs2.yaml
+63-0llvm/test/tools/llvm-pdbutil/Inputs/forward-refs1.yaml
+35-0llvm/test/tools/llvm-pdbutil/merged-forward-refs.test
+9-4llvm/tools/llvm-pdbutil/llvm-pdbutil.cpp
+2-5llvm/lib/DebugInfo/PDB/Native/TpiStreamBuilder.cpp
+2-2llvm/test/DebugInfo/PDB/pdbdump-mergetypes.test
+175-111 files not shown
+176-137 files

LLVM/project c1df693llvm/lib/CodeGen/SelectionDAG TargetLowering.cpp, llvm/test/CodeGen/RISCV urem-lkk.ll

[TargetLowering] Use legally typed shifts to split chunks in expandDIVREMByConstant. (#187567)

This replaces LegalVT with HiLoVT and LegalWidth with HBitWidth as
they are the same for all current uses.
    
Then we rewrite the shifts to operate on LL and LH.
    
There's a slight regression on RISC-V due to different node creation
order leading to different DAG combine order. I have other refactoring
I'd like to explore first; after that I may try to fix this.
DeltaFile
+48-36llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
+13-11llvm/test/CodeGen/RISCV/urem-lkk.ll
+61-472 files

LLVM/project c2ff8f4flang-rt CMakeLists.txt

Remove default-disable on GPGPU
DeltaFile
+1-9flang-rt/CMakeLists.txt
+1-91 files

LLVM/project 7d7cd74libc/shared/math atanbf16.h, libc/src/__support/math atanbf16.h

[libc][math][c23] Add atanbf16 function (#184019)

This PR adds the atanbf16 higher math function for the BFloat16 type,
along with tests.
DeltaFile
+105-0libc/src/__support/math/atanbf16.h
+56-0libc/test/src/math/atanbf16_test.cpp
+44-0libc/test/src/math/smoke/atanbf16_test.cpp
+26-0libc/shared/math/atanbf16.h
+21-0libc/src/math/atanbf16.h
+20-0utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+272-022 files not shown
+360-228 files

LLVM/project 49ebf38flang/lib/Semantics openmp-utils.cpp

Fix message for absent LOOPRANGE
DeltaFile
+1-1flang/lib/Semantics/openmp-utils.cpp
+1-11 files

LLVM/project 82eee26clang/bindings/python/clang cindex.py

[libclang/python] Fix Type.get_offset annotation (#187841)

As discussed in
https://github.com/llvm/llvm-project/pull/180876#discussion_r2934372753,
`Type.get_offset` can process `bytes` arguments as well. For consistency
with other functions taking `str` arguments, its type annotation is
adapted to reflect this.
DeltaFile
+1-1clang/bindings/python/clang/cindex.py
+1-11 files

LLVM/project 573bbefllvm/lib/Frontend/OpenMP OMPIRBuilder.cpp

[OpenMP][flang] Fix crash in host offload

Guard `getGridValue` in `OMPIRBuilder` to avoid reaching the
`unreachable` in `getGridValue` when offloading to host device without
an explicit num_threads clause.

Reproducer (`-fopenmp -fopenmp-targets=x86_64-unknown-linux-gnu`):
```
program test
  implicit none

  !$omp target
  !$omp end target
end program test
```

(Note: the linker still fails, but that's another issue.)
DeltaFile
+13-3llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+13-31 files

LLVM/project dac1c43flang-rt CMakeLists.txt, flang-rt/lib CMakeLists.txt

Support building no library
DeltaFile
+29-23flang-rt/lib/runtime/CMakeLists.txt
+21-12flang-rt/CMakeLists.txt
+10-5flang-rt/lib/CMakeLists.txt
+60-403 files

LLVM/project 4d058ae

[lldb] Fix LLVMSupportHTTP linkage against libLLVM (#187848)

Regression introduced in 39d6bb21804d21abe2fa0ec019919d72104827ac.

Signed-off-by: Michał Górny <mgorny at gentoo.org>
DeltaFile
+0-00 files

LLVM/project 849038aflang-rt CMakeLists.txt, flang-rt/lib CMakeLists.txt

Support building no library
DeltaFile
+26-23flang-rt/lib/runtime/CMakeLists.txt
+21-12flang-rt/CMakeLists.txt
+10-5flang-rt/lib/CMakeLists.txt
+57-403 files

LLVM/project efdb981flang/include/flang/Semantics openmp-utils.h, flang/lib/Semantics openmp-utils.cpp check-omp-loop.cpp

[flang][OpenMP] Provide reasons for calculated sequence length

If the length was limited by some factor, include the reason for the
reduction.

Issue: https://github.com/llvm/llvm-project/issues/185287
DeltaFile
+33-22flang/lib/Semantics/openmp-utils.cpp
+9-7flang/lib/Semantics/check-omp-loop.cpp
+5-5flang/include/flang/Semantics/openmp-utils.h
+2-0flang/test/Semantics/OpenMP/loop-transformation-construct04.f90
+1-0flang/test/Semantics/OpenMP/loop-transformation-construct02.f90
+1-0flang/test/Semantics/OpenMP/fuse1.f90
+51-346 files

LLVM/project 6162403llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/RISCV split-vectorize-parent-for-copyables.ll

[SLP]Do not consider copyable node with SplitVectorize parent

If the copyables are schedulable and the parent node is a SplitVectorize
node, skip the scheduling analysis for such nodes to avoid a compiler
crash.
DeltaFile
+53-0llvm/test/Transforms/SLPVectorizer/RISCV/split-vectorize-parent-for-copyables.ll
+10-0llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+63-02 files

LLVM/project f5de28fclang/include/clang/CIR/Dialect/IR CIROps.td, clang/lib/CIR/Dialect/IR CIRDialect.cpp

[CIR] Add RecursiveMemoryEffects to region-bearing ops

Add the RecursiveMemoryEffects trait to cir.if, cir.case, loop ops
(cir.while/cir.do/cir.for), cir.ternary, cir.await,
cir.array.ctor, cir.array.dtor, and cir.try. Without this trait,
MLIR conservatively assumes unknown memory effects for ops with
regions, preventing DCE of ops whose bodies are provably pure.

Also fix a crash in ConditionOp::getSuccessorRegions where the
missing early return after the loop-op case would fall through to
cast<AwaitOp>(...), which asserts when the parent is a loop rather
than an await op.

Add tests verifying that region ops with pure bodies are eliminated
and ops with stores or calls are preserved, including two-level nested
propagation (cir.if inside cir.while).
DeltaFile
+304-0clang/test/CIR/Transforms/recursive-memory-effects.cir
+10-8clang/include/clang/CIR/Dialect/IR/CIROps.td
+1-0clang/lib/CIR/Dialect/IR/CIRDialect.cpp
+315-83 files

LLVM/project acf9eedllvm/include/llvm/Object BBAddrMap.h

[Object] Fix issues in BBAddrMap.h (#187704)
DeltaFile
+6-9llvm/include/llvm/Object/BBAddrMap.h
+6-91 files

LLVM/project 656fce8clang/bindings/python/clang cindex.py, clang/bindings/python/tests/cindex test_version.py

[libclang/python] export libclang version to the bindings (#86931)

It's useful to know which clang library the Python bindings are running against.

---------

Co-authored-by: Vlad Serebrennikov <serebrennikov.vladislav at gmail.com>
DeltaFile
+11-0clang/bindings/python/tests/cindex/test_version.py
+6-0clang/bindings/python/clang/cindex.py
+2-0clang/docs/ReleaseNotes.rst
+19-03 files

LLVM/project 3b91061lldb/source/Plugins/SymbolLocator/SymStore CMakeLists.txt

[lldb] Fix linking liblldb in a dylib build after 39d6bb21804d21ab

Referencing libSupportHTTP under LINK_LIBS of add_lldb_library() pulls
in the static archive even in a build configuration with
LLVM_LINK_LLVM_DYLIB=On, where libSupportHTTP is part of libLLVM. This
patch moves it to LINK_COMPONENTS to fix the issue.

This is the same fix as in
036429881f8d3037894042c6268b2a94eac8c950, applied on another
library.
DeltaFile
+3-1lldb/source/Plugins/SymbolLocator/SymStore/CMakeLists.txt
+3-11 files

LLVM/project 1a9de93clang/include/clang/CIR/Dialect/IR CIROps.td, clang/test/CIR/Transforms bit.cir

[CIR] Add Involution trait to BitReverseOp and ByteSwapOp
DeltaFile
+20-0clang/test/CIR/Transforms/bit.cir
+4-0clang/include/clang/CIR/Dialect/IR/CIROps.td
+24-02 files

LLVM/project db143fbllvm/lib/Transforms/Vectorize SLPVectorizer.cpp

[SLP][NFC]Use block number instead of pointer for stable sorting, NFC
DeltaFile
+3-3llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+3-31 files

LLVM/project a5dd902clang/lib/AST ASTContext.cpp DeclCXX.cpp, clang/lib/CIR/CodeGen CIRGenClass.cpp

[CIR] Implement isMemcpyEquivalentSpecialMember for trivial copy/move ctors
DeltaFile
+41-0clang/test/CIR/CodeGen/copy-constructor-memcpy.cpp
+7-31clang/lib/CodeGen/CGClass.cpp
+20-7clang/lib/CIR/CodeGen/CIRGenClass.cpp
+22-0clang/lib/AST/ASTContext.cpp
+6-5clang/test/CIR/CodeGen/cxx-special-member-attr.cpp
+11-0clang/lib/AST/DeclCXX.cpp
+107-438 files not shown
+121-6014 files

LLVM/project 2d01df1clang/lib/CIR/CodeGen CIRGenCall.cpp CIRGenModule.cpp, clang/test/CIR/CodeGen arg-attrs.cpp invoke-attrs.cpp

[CIR] Fix reference alignment to use pointee type (#186667)

getNaturalTypeAlignment on a reference type returned pointer alignment
instead of pointee alignment. Pass the pointee type with
forPointeeType=true to match traditional codegen's
getNaturalPointeeTypeAlignment behavior. Fix applies to both argument
and return type attribute construction paths.
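
The distinction between pointer alignment and pointee alignment can be illustrated in standard C++, where `alignof` applied to a reference type yields the alignment of the referenced type. This is only a sketch of the concept, not the CIR code; `Wide` and `pointeeAlignment` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical over-aligned type, so the pointee alignment is easy to
// tell apart from the alignment of a pointer on typical targets.
struct alignas(16) Wide {
  char Bytes[16];
};

// Hypothetical helper: the alignment a reference parameter of type T
// should advertise, i.e. that of the type it refers to.
template <typename T>
constexpr std::size_t pointeeAlignment() {
  return alignof(std::remove_reference_t<T>);
}

// alignof on a reference type already reports the referenced type.
static_assert(alignof(Wide &) == alignof(Wide),
              "reference alignment is the pointee alignment");
static_assert(pointeeAlignment<Wide &>() == 16, "pointee is 16-byte aligned");

// Small runtime self-check.
inline void checkAlignment() {
  assert(pointeeAlignment<Wide &>() == alignof(Wide));
  assert(pointeeAlignment<const Wide &>() == alignof(Wide));
}
```

An alignment attribute derived from `alignof(Wide *)` would instead describe the pointer itself, which is the kind of mismatch the fix above addresses.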
DeltaFile
+27-4clang/test/CIR/CodeGen/arg-attrs.cpp
+9-7clang/lib/CIR/CodeGen/CIRGenCall.cpp
+7-0clang/lib/CIR/CodeGen/CIRGenModule.cpp
+2-4clang/lib/CIR/CodeGen/CIRGenExpr.cpp
+3-3clang/test/CIR/CodeGen/invoke-attrs.cpp
+3-0clang/lib/CIR/CodeGen/CIRGenModule.h
+51-186 files

LLVM/project de00349llvm/lib/Transforms/InstCombine InstCombineSelect.cpp

Address comments
DeltaFile
+9-13llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
+9-131 files

LLVM/project dba9d90llvm/lib/Transforms/InstCombine InstCombineSelect.cpp, llvm/test/Transforms/InstCombine nanless-canonicalize-combine.ll

InstCombine: Fold out nanless canonicalize pattern

Pattern match a wrapper around llvm.canonicalize which
weakens the semantics to not require quieting signaling
nans. Depending on the denormal mode and FP type, we can
either drop the pattern entirely or reduce it only to
a canonicalize call. I'm inventing this pattern to deal
with LLVM's lax canonicalization model in math library
code.

The math library code currently has explicit checks for
the denormal mode, and conditionally canonicalizes the
result if there is flushing. Semantically, this could be
directly replaced with a simple call to llvm.canonicalize,
but doing so would incur an additional cost when using
standard IEEE behavior. If we do not care about quieting
a signaling nan, this should be a no-op unless the denormal
mode may flush. This will allow replacement of the
conditional code with a zero cost abstraction utility

    [17 lines not shown]
DeltaFile
+51-155llvm/test/Transforms/InstCombine/nanless-canonicalize-combine.ll
+103-0llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
+154-1552 files

LLVM/project 291ca72llvm/test/Transforms/InstCombine nanless-canonicalize-combine.ll

InstCombine: Add baseline test for nanless canonicalize combine
DeltaFile
+832-0llvm/test/Transforms/InstCombine/nanless-canonicalize-combine.ll
+832-01 files

LLVM/project c825177clang/lib/CIR/CodeGen CIRGenCall.cpp CIRGenModule.cpp, clang/test/CIR/CodeGen arg-attrs.cpp invoke-attrs.cpp

[CIR] Fix reference alignment to use pointee type

getNaturalTypeAlignment on a reference type returned pointer alignment
instead of pointee alignment. Pass the pointee type with
forPointeeType=true to match traditional codegen's
getNaturalPointeeTypeAlignment behavior. Fix applies to both argument
and return type attribute construction paths.
DeltaFile
+27-4clang/test/CIR/CodeGen/arg-attrs.cpp
+9-7clang/lib/CIR/CodeGen/CIRGenCall.cpp
+7-0clang/lib/CIR/CodeGen/CIRGenModule.cpp
+3-3clang/test/CIR/CodeGen/invoke-attrs.cpp
+2-4clang/lib/CIR/CodeGen/CIRGenExpr.cpp
+3-0clang/lib/CIR/CodeGen/CIRGenModule.h
+51-186 files

LLVM/project 618b9b2llvm/lib/Frontend/OpenMP OMPIRBuilder.cpp

[OpenMP][flang] Fix crash in host offload

Guard `getGridValue` in `OMPIRBuilder` to avoid reaching the
`unreachable` in `getGridValue` when offloading to host device without
an explicit num_threads clause.

Reproducer (`-fopenmp -fopenmp-targets=x86_64-unknown-linux-gnu`):
```
program test
  implicit none

  !$omp target
  !$omp end target
end program test
```

(Note: the linker still fails, but that's another issue.)
DeltaFile
+13-3llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+13-31 files