LLVM/project 05a2b14 llvm/include/llvm/Analysis IVDescriptors.h, llvm/lib/Analysis IVDescriptors.cpp

[LV] Optimize FindLast recurrences to FindIV (NFCI). (#177870)

This patch restructures Find(First|Last)IV handling. Instead of
differentiating between FindLast, FindFirstIV and FindLastIV up front,
the logic in IVDescriptor now only identifies the FindLast pattern.

It then adds a new VPlan transformation to optimize FindLast reductions
to FindIV reductions if there is a suitable sentinel value. It also
merges the Find(Last|First)IV recurrence kinds into a single FindIV kind.

This is simpler and more accurate, since whether the final IV reduction
selects the first or the last induction value is controlled directly by
the corresponding recurrence kind of the ComputeReductionResult.

The new structure also allows further optimizations, like vectorizing
FindLastIV with another boolean reduction that tracks if the condition
in the loop was ever true, if there is no suitable sentinel value.
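
The sentinel idea described above can be illustrated with a scalar sketch.
This is not the VPlan transformation itself; the function names, the
`a[i] > threshold` condition and the choice of the minimum `int64_t` as
sentinel are assumptions made for illustration. It shows why a "last
matching index" search can become a plain max reduction when a sentinel
exists, and how a separate boolean reduction can stand in when none does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// FindLast shape: keep the induction value of the last iteration whose
// condition held, or `start` if it never held.
int64_t findLastScalar(const std::vector<int> &a, int threshold,
                       int64_t start) {
  int64_t result = start;
  for (int64_t i = 0, n = static_cast<int64_t>(a.size()); i < n; ++i)
    result = a[i] > threshold ? i : result;
  return result;
}

// Sentinel form: the induction variable only grows, so the last matching
// index is the maximum matching index. Non-matching iterations contribute
// a sentinel that is smaller than any valid index (assumed: INT64_MIN).
int64_t findLastViaSentinel(const std::vector<int> &a, int threshold,
                            int64_t start) {
  constexpr int64_t kSentinel = std::numeric_limits<int64_t>::min();
  int64_t best = kSentinel;
  for (int64_t i = 0, n = static_cast<int64_t>(a.size()); i < n; ++i)
    best = std::max(best, a[i] > threshold ? i : kSentinel);
  // If the sentinel survived, no iteration matched: use the start value.
  return best == kSentinel ? start : best;
}

// No-sentinel form: a second, boolean reduction records whether the
// condition was ever true, so the index reduction needs no reserved value.
int64_t findLastViaAnyOf(const std::vector<int> &a, int threshold,
                         int64_t start) {
  int64_t best = 0;
  bool any = false;
  for (int64_t i = 0, n = static_cast<int64_t>(a.size()); i < n; ++i) {
    const bool hit = a[i] > threshold;
    any = any || hit;
    best = std::max(best, hit ? i : int64_t{0});
  }
  return any ? best : start;
}

// Small self-check that the three formulations agree on one input.
void selfCheck() {
  const std::vector<int> a{1, 5, 2, 7, 3};
  assert(findLastScalar(a, 4, -1) == findLastViaSentinel(a, 4, -1));
  assert(findLastScalar(a, 4, -1) == findLastViaAnyOf(a, 4, -1));
}
```

The max-based loops no longer carry a select on the previous result, which
is the property that makes the reduction easier to vectorize.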

PR: https://github.com/llvm/llvm-project/pull/177870
Delta  File
+95 -0  llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+2 -91  llvm/lib/Analysis/IVDescriptors.cpp
+10 -57  llvm/include/llvm/Analysis/IVDescriptors.h
+5 -27  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+3 -12  llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+13 -1  llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
+128 -188  4 files not shown
+138 -192  10 files

LLVM/project 3b8876b mlir/include/mlir/IR BuiltinTypeInterfaces.td BuiltinTypeInterfaces.h, mlir/lib/IR BuiltinAttributes.cpp BuiltinTypes.cpp

getter / iterator via interface
Delta  File
+36 -112  mlir/lib/IR/BuiltinAttributes.cpp
+68 -56  mlir/include/mlir/IR/BuiltinTypeInterfaces.td
+121 -0  mlir/lib/IR/BuiltinTypes.cpp
+26 -32  mlir/lib/IR/AsmPrinter.cpp
+34 -0  mlir/lib/IR/BuiltinTypeInterfaces.cpp
+24 -0  mlir/include/mlir/IR/BuiltinTypeInterfaces.h
+309 -200  4 files not shown
+332 -214  10 files

LLVM/project 5b5fdd0 llvm/test/CodeGen/AArch64 clmul-fixed.ll, llvm/test/CodeGen/AMDGPU mad-mix.ll

Merge branch 'main' into users/usx95/01-29-revisit_handling_moved_origins
Delta  File
+4,545 -0  llvm/test/CodeGen/AArch64/clmul-fixed.ll
+3,338 -71  llvm/test/CodeGen/AMDGPU/mad-mix.ll
+3,137 -0  llvm/test/CodeGen/NVPTX/atomicrmw-sm60.ll
+3,111 -0  llvm/test/CodeGen/NVPTX/atomicrmw-sm70.ll
+2,983 -0  llvm/test/CodeGen/NVPTX/atomicrmw-sm90.ll
+2,360 -0  llvm/test/MC/AMDGPU/gfx13_asm_sopc.s
+19,474 -71  2,123 files not shown
+83,978 -28,311  2,129 files

LLVM/project 3ee7a2f llvm/test/CodeGen/AArch64 clmul-fixed.ll clmul-scalable.ll

Add clmul zext AArch64 lowering tests (#179641)

Delta  File
+4,100 -13  llvm/test/CodeGen/AArch64/clmul-fixed.ll
+2,212 -1,142  llvm/test/CodeGen/AArch64/clmul-scalable.ll
+756 -0  llvm/test/CodeGen/AArch64/clmul.ll
+7,068 -1,155  3 files

LLVM/project d3e5e9b llvm/test/DebugInfo/Generic debuginfofinder-macros.ll

Simplify test.
Delta  File
+17 -18  llvm/test/DebugInfo/Generic/debuginfofinder-macros.ll
+17 -18  1 files

LLVM/project 80569fe llvm/include/llvm/ADT FloatingPointMode.h

ADT: Mark DenormalMode comparison operators as constexpr (#179939)

Try to fix buildbot error with gcc.
Delta  File
+5 -7  llvm/include/llvm/ADT/FloatingPointMode.h
+5 -7  1 files

LLVM/project b9f3710 clang/include/clang/CIR/Dialect Passes.td, clang/lib/CIR/Dialect/Transforms TargetLowering.cpp CXXABILowering.cpp

[CIR] Add TargetLowering pass (#179245)

This patch adds a new TargetLowering pass to the CIR pipeline. The new
pass runs immediately before CXXABILowering. It does not perform any
heavy transformations yet -- for now it only converts sync scopes
attached to load and store operations according to the target info,
which was previously done in the LLVM lowering pass.

Related to #175968.
Delta  File
+68 -0  clang/lib/CIR/Dialect/Transforms/TargetLowering.cpp
+60 -3  clang/include/clang/CIR/Dialect/Passes.td
+6 -20  clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp
+10 -13  clang/lib/CIR/Dialect/Transforms/CXXABILowering.cpp
+15 -5  clang/test/CIR/CodeGen/atomic-scoped.c
+10 -8  clang/lib/CIR/Dialect/Transforms/TargetLowering/LowerModule.cpp
+169 -49  6 files not shown
+182 -59  12 files

LLVM/project f002505 lldb/packages/Python/lldbsuite/test/tools/lldb-dap dap_server.py, lldb/test/API/tools/lldb-dap/stopped-events TestDAP_stopped_events.py main.cpp

[lldb-dap] Fix flaky TestDAP_stopped_events.py (#179689)

We wait for both stopped events at once, but we may not get both events
within the 0.25-second interval used to fetch more events. Retry with the
`DEFAULT TIMEOUT` if we got only one of the events.

Increase the `EVENT_QUIET_PERIOD` value for ASAN mode.

Fixes #179648
Delta  File
+40 -14  lldb/test/API/tools/lldb-dap/stopped-events/TestDAP_stopped_events.py
+4 -1  lldb/test/API/tools/lldb-dap/stopped-events/main.cpp
+1 -1  lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py
+45 -16  3 files

LLVM/project 2502e3b llvm/lib/AsmParser LLParser.cpp, llvm/test/Assembler denormal_fpenv.ll invalid_denormal_fpenv.ll

IR: Promote "denormal-fp-math" to a first class attribute (#174293)

Convert "denormal-fp-math" and "denormal-fp-math-f32" into a first
class denormal_fpenv attribute. Previously the query for the effective
denormal mode involved two string attribute queries with parsing. I'm
introducing more uses of this, so it makes sense to convert this
to a more efficient encoding. The old representation was also awkward
since it was split across two separate attributes. The new encoding
just stores the default and float modes as bitfields, largely avoiding
the need to consider if the other mode is set.
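
To make the bitfield idea concrete, here is a minimal sketch of packing a
default mode and a float mode into one byte. The type names, the two-bit
kind values and the bit layout are all assumptions chosen for
illustration; they are not taken from the patch and may differ from the
encoding LLVM actually uses.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical denormal handling kinds; four values fit in two bits.
enum class Kind : uint8_t {
  IEEE = 0,
  PreserveSign = 1,
  PositiveZero = 2,
  Dynamic = 3
};

// One mode is an (output, input) pair, as in `preservesign,preservesign`.
struct Mode {
  Kind Output;
  Kind Input;
};

constexpr bool operator==(Mode a, Mode b) {
  return a.Output == b.Output && a.Input == b.Input;
}

// Assumed layout for one mode: bits 0-1 = output kind, bits 2-3 = input.
constexpr uint8_t packMode(Mode m) {
  return static_cast<uint8_t>(static_cast<uint8_t>(m.Output) |
                              (static_cast<uint8_t>(m.Input) << 2));
}

constexpr Mode unpackMode(uint8_t bits) {
  return Mode{static_cast<Kind>(bits & 0x3),
              static_cast<Kind>((bits >> 2) & 0x3)};
}

// Assumed layout for the attribute payload: low nibble = default mode,
// high nibble = float mode. Both modes are always present, so a query
// for either one is a mask and a shift instead of two string lookups.
struct FPEnv {
  uint8_t Bits;
};

constexpr FPEnv makeEnv(Mode def, Mode f32) {
  return FPEnv{static_cast<uint8_t>(packMode(def) | (packMode(f32) << 4))};
}

constexpr Mode defaultMode(FPEnv e) { return unpackMode(e.Bits & 0xF); }
constexpr Mode floatMode(FPEnv e) { return unpackMode(e.Bits >> 4); }

static_assert(defaultMode(makeEnv({Kind::Dynamic, Kind::Dynamic},
                                  {Kind::PreserveSign, Kind::PreserveSign})) ==
                  Mode{Kind::Dynamic, Kind::Dynamic},
              "default mode round-trips");

// Runtime self-check mirroring the static_assert for the float mode.
void selfCheck() {
  const FPEnv e = makeEnv({Kind::Dynamic, Kind::Dynamic},
                          {Kind::PreserveSign, Kind::PreserveSign});
  assert((floatMode(e) == Mode{Kind::PreserveSign, Kind::PreserveSign}));
}
```

Because each mode occupies its own nibble in this sketch, reading one mode
never requires checking whether the other was set.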

The syntax in the common cases looks like this:
  `denormal_fpenv(preservesign,preservesign)`
  `denormal_fpenv(float: preservesign,preservesign)`
  `denormal_fpenv(dynamic,dynamic float: preservesign,preservesign)`

I wasn't sure about reusing the float type name instead of adding a
new keyword. It's parsed as a type but only accepts float. I'm also
debating switching the name to subnormal to match the current

    [17 lines not shown]
Delta  File
+378 -0  llvm/test/Bitcode/auto_upgrade_denormal_fp_math.ll
+295 -0  llvm/test/Assembler/denormal_fpenv.ll
+228 -2  llvm/test/Bitcode/compatibility.ll
+201 -0  llvm/test/Assembler/invalid_denormal_fpenv.ll
+57 -57  llvm/test/Transforms/Attributor/nofpclass-canonicalize.ll
+105 -0  llvm/lib/AsmParser/LLParser.cpp
+1,264 -59  230 files not shown
+2,520 -1,034  236 files

LLVM/project 96d2cb4 llvm/lib/Target/DirectX DXILDataScalarization.cpp, llvm/lib/Target/DirectX/DXILWriter DXILBitcodeWriter.cpp

[LLVM][CodeGen][DirectX] Fix scalarisation when vector ConstantFP is used. (#172684)

When using -use-constant-fp-for-fixed-length-splat `splat (float C)`
becomes ConstantFP(C) rather than ConstantVector(C, C, C...).
Delta  File
+26 -0  llvm/test/CodeGen/DirectX/scalarize-static-array-of-float-vectors.ll
+0 -25  llvm/test/CodeGen/DirectX/scalar-bug-117273.ll
+7 -11  llvm/lib/Target/DirectX/DXILDataScalarization.cpp
+1 -1  llvm/lib/Target/DirectX/DXILWriter/DXILBitcodeWriter.cpp
+34 -37  4 files

LLVM/project f3bd1b9 llvm/include/llvm/CodeGen TargetLoweringObjectFileImpl.h, llvm/lib/CodeGen TargetLoweringObjectFileImpl.cpp

[SystemZ][z/OS] Use the text section for jump tables (#179793)

Jump tables are read only data, and the text section is the best choice
for them.
Delta  File
+34 -0  llvm/test/CodeGen/SystemZ/zos-jumptable.ll
+5 -0  llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
+2 -0  llvm/include/llvm/CodeGen/TargetLoweringObjectFileImpl.h
+41 -0  3 files

LLVM/project f0a4a91 clang/lib/CodeGen ItaniumCXXABI.cpp, clang/test/CodeGenCXX zos-typename.cpp

[SystemZ][z/OS] Support both EBCDIC & ASCII form of type_info::name() (#179687)

On z/OS, the type name is stored in two encodings: EBCDIC (the default
system encoding) followed by ASCII.
Delta  File
+14 -0  clang/test/CodeGenCXX/zos-typename.cpp
+12 -2  clang/lib/CodeGen/ItaniumCXXABI.cpp
+26 -2  2 files

LLVM/project 3726e8d mlir/include/mlir/IR BuiltinTypeInterfaces.td BuiltinTypeInterfaces.h, mlir/lib/IR BuiltinTypes.cpp BuiltinAttributes.cpp

getter / iterator via interface
Delta  File
+153 -0  mlir/lib/IR/BuiltinTypes.cpp
+32 -110  mlir/lib/IR/BuiltinAttributes.cpp
+68 -56  mlir/include/mlir/IR/BuiltinTypeInterfaces.td
+26 -32  mlir/lib/IR/AsmPrinter.cpp
+53 -0  mlir/lib/IR/BuiltinTypeInterfaces.cpp
+16 -0  mlir/include/mlir/IR/BuiltinTypeInterfaces.h
+348 -198  4 files not shown
+371 -212  10 files

LLVM/project 2d10684 libcxx/docs/ReleaseNotes 23.rst, libcxx/include/__algorithm ranges_fold.h

[libcxx] Optimize `ranges::fold_left_with_iter` for segmented iterators (#177853)

Part of https://github.com/llvm/llvm-project/issues/102817.

This patch attempts to optimize the performance of
`ranges::fold_left_with_iter` for segmented iterators.
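
The general idea behind a segmented fold can be sketched as follows. This
is not libc++'s implementation: the `Segments` alias is a hypothetical
stand-in for a container stored in contiguous blocks (as `deque` is), and
`foldLeftSegmented` is an illustrative name. The point is that the fold
walks one contiguous segment at a time, so the inner loop is a simple
loop over contiguous memory with no per-element block-boundary check.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for a segmented sequence: a list of contiguous
// blocks whose concatenation is the logical range.
using Segments = std::vector<std::vector<int>>;

// Left fold over all elements, in order: f(f(f(init, x0), x1), x2) ...
// The outer loop steps between segments; the inner loop runs over one
// contiguous block at a time.
template <class T, class F>
T foldLeftSegmented(const Segments &segs, T init, F f) {
  for (const std::vector<int> &seg : segs)
    for (int x : seg)
      init = f(std::move(init), x);
  return init;
}

// Self-check with a non-commutative operation, so that element order and
// left-associativity are both exercised.
void selfCheck() {
  const Segments segs{{1, 2, 3}, {4, 5}, {}, {6}};
  const long digits =
      foldLeftSegmented(segs, 0L, [](long acc, int x) { return acc * 10 + x; });
  assert(digits == 123456);
}
```

Empty segments are skipped naturally, and the result matches a flat left
fold over the concatenated elements.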

- before

```
# | rng::fold_left(vector<int>)/8             2.78 ns         2.78 ns    241953718
# | rng::fold_left(vector<int>)/32            12.2 ns         12.2 ns     57579851
# | rng::fold_left(vector<int>)/50            19.2 ns         19.2 ns     36487764
# | rng::fold_left(vector<int>)/8192          3226 ns         3226 ns       216811
# | rng::fold_left(vector<int>)/1048576     441842 ns       441839 ns         1592
# | rng::fold_left(deque<int>)/8              2.83 ns         2.83 ns    243888678
# | rng::fold_left(deque<int>)/32             16.6 ns         16.6 ns     42297458
# | rng::fold_left(deque<int>)/50             22.3 ns         22.3 ns     31387998
# | rng::fold_left(deque<int>)/8192           2492 ns         2492 ns       281637
```

    [26 lines not shown]
Delta  File
+12 -4  libcxx/include/__algorithm/ranges_fold.h
+7 -0  libcxx/test/std/algorithms/alg.nonmodifying/alg.fold/left_folds.pass.cpp
+4 -0  libcxx/docs/ReleaseNotes/23.rst
+23 -4  3 files

LLVM/project 823e3e0 libc/src/__support/math exp.h expm1.h

[libc][math] Resolve size issues on baremetal and cleanup code. (#179707)

Delta  File
+21 -20  libc/src/__support/math/exp.h
+17 -18  libc/src/__support/math/expm1.h
+16 -18  libc/src/__support/math/exp10.h
+24 -9  libc/src/__support/math/sincosf_utils.h
+14 -13  libc/src/__support/math/acosf.h
+12 -14  libc/src/__support/math/exp2.h
+104 -92  110 files not shown
+336 -322  116 files

LLVM/project 744827e llvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 pr179489.ll

[X86] Fixed truncated masked stores (#179853)

Fixes: #179489
Delta  File
+57 -0  llvm/test/CodeGen/X86/pr179489.ll
+10 -2  llvm/lib/Target/X86/X86ISelLowering.cpp
+67 -2  2 files

LLVM/project db07843 clang/lib/CodeGen/TargetBuiltins AMDGPU.cpp

Comment
Delta  File
+0 -2  clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
+0 -2  1 files

LLVM/project f0dcb2f mlir/include/mlir/IR BuiltinTypeInterfaces.td BuiltinTypeInterfaces.h, mlir/lib/IR BuiltinTypes.cpp BuiltinAttributes.cpp

getter / iterator via interface
Delta  File
+153 -0  mlir/lib/IR/BuiltinTypes.cpp
+32 -110  mlir/lib/IR/BuiltinAttributes.cpp
+68 -56  mlir/include/mlir/IR/BuiltinTypeInterfaces.td
+53 -0  mlir/lib/IR/BuiltinTypeInterfaces.cpp
+20 -13  mlir/lib/IR/AsmPrinter.cpp
+16 -0  mlir/include/mlir/IR/BuiltinTypeInterfaces.h
+342 -179  4 files not shown
+365 -193  10 files

LLVM/project 3e240e0 llvm/include/llvm/IR DebugInfo.h, llvm/lib/Analysis ModuleDebugInfoPrinter.cpp

[DebugInfo] Add macro tracking support to DebugInfoFinder

Extend DebugInfoFinder to collect and expose macro debug information
(DIMacro and DIMacroFile nodes).

Also update ModuleDebugInfoPrinter to display macro information including
the macro type, name, value, and source location.
Delta  File
+44 -0  llvm/lib/IR/DebugInfo.cpp
+32 -0  llvm/test/DebugInfo/Generic/debuginfofinder-macros.ll
+25 -0  llvm/lib/Analysis/ModuleDebugInfoPrinter.cpp
+8 -0  llvm/include/llvm/IR/DebugInfo.h
+109 -0  4 files

LLVM/project 31a0195 mlir/include/mlir/IR BuiltinTypeInterfaces.td BuiltinTypeInterfaces.h, mlir/lib/IR BuiltinTypes.cpp BuiltinAttributes.cpp

getter / iterator via interface
Delta  File
+153 -0  mlir/lib/IR/BuiltinTypes.cpp
+32 -110  mlir/lib/IR/BuiltinAttributes.cpp
+68 -56  mlir/include/mlir/IR/BuiltinTypeInterfaces.td
+53 -0  mlir/lib/IR/BuiltinTypeInterfaces.cpp
+20 -13  mlir/lib/IR/AsmPrinter.cpp
+16 -0  mlir/include/mlir/IR/BuiltinTypeInterfaces.h
+342 -179  5 files not shown
+363 -190  11 files

LLVM/project 238ccd0 llvm/include/llvm/ADT ScopeExit.h

[llvm][ADT] Mark scope_exit constructors [[nodiscard]] (#179720)
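
As a rough illustration of why a scope guard's constructor benefits from
`[[nodiscard]]`, the sketch below uses a hypothetical `ScopeGuard` class
(not `llvm::scope_exit` itself, whose definition is not shown here). A
guard that is never bound to a name is destroyed at the end of the
statement, so its cleanup runs immediately rather than at scope exit. The
attribute lets the compiler warn about that mistake; the explicit cast to
`void` below only exists to demonstrate the behavior without a warning.
The example assumes C++17 for class template argument deduction.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical minimal scope guard: runs `Fn` when destroyed.
template <class F>
class ScopeGuard {
  F Fn;

public:
  [[nodiscard]] explicit ScopeGuard(F fn) : Fn(std::move(fn)) {}
  ScopeGuard(const ScopeGuard &) = delete;
  ScopeGuard &operator=(const ScopeGuard &) = delete;
  ~ScopeGuard() { Fn(); }
};

// Intended use: the named guard lives until the end of the block, so the
// cleanup (event 2) runs after the body (event 1).
std::vector<int> namedGuardOrder() {
  std::vector<int> events;
  {
    ScopeGuard guard([&] { events.push_back(2); });
    events.push_back(1);
  }
  return events;
}

// The mistake the attribute is meant to catch: the guard is a discarded
// temporary, so the cleanup (event 2) runs before the body (event 1).
std::vector<int> discardedGuardOrder() {
  std::vector<int> events;
  {
    static_cast<void>(ScopeGuard([&] { events.push_back(2); }));
    events.push_back(1);
  }
  return events;
}

// Self-check of both orderings.
void selfCheck() {
  assert((namedGuardOrder() == std::vector<int>{1, 2}));
  assert((discardedGuardOrder() == std::vector<int>{2, 1}));
}
```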

Delta  File
+3 -2  llvm/include/llvm/ADT/ScopeExit.h
+3 -2  1 files

LLVM/project df38810 llvm/lib/Target/AMDGPU AMDGPUPromoteAlloca.cpp, llvm/test/CodeGen/AMDGPU promote-alloca-non-volatile-accesses.ll promote-alloca-vgpr-ratio.ll

[AMDGPU][PromoteAlloca] Set !amdgpu.non.volatile if promotion fails

I thought about doing this in a separate pass, but this pass already has all the necessary analysis for this to be a trivial addition.
We can simply set `!amdgpu.non.volatile` if all other attempts to promote the operation failed.
Delta  File
+45 -0  llvm/test/CodeGen/AMDGPU/promote-alloca-non-volatile-accesses.ll
+23 -18  llvm/test/CodeGen/AMDGPU/promote-alloca-vgpr-ratio.ll
+29 -2  llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
+2 -2  llvm/test/CodeGen/AMDGPU/promote-alloca-memset.ll
+99 -22  4 files

LLVM/project 33cb864 llvm/lib/Target/AMDGPU SIISelLowering.cpp

Rename to MOThreadPrivate
Delta  File
+1 -1  llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+1 -1  1 files

LLVM/project 28c76cf llvm/docs AMDGPUUsage.rst, llvm/lib/Target/AMDGPU SIISelLowering.cpp

Pull metadata impl at the top of the patch stack
Delta  File
+218 -0  llvm/test/CodeGen/AMDGPU/memory-legalizer-non-volatile.ll
+23 -0  llvm/docs/AMDGPUUsage.rst
+2 -0  llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+243 -0  3 files

LLVM/project e619011 llvm/test/CodeGen/AMDGPU pei-build-spill-offset-overflow-gfx950.mir

fix test
Delta  File
+2 -2  llvm/test/CodeGen/AMDGPU/pei-build-spill-offset-overflow-gfx950.mir
+2 -2  1 files

LLVM/project c493141 llvm/test/Bitcode compatibility.ll, llvm/test/CodeGen/AMDGPU default-fp-mode.ll

Fix test merge
Delta  File
+34 -67  llvm/test/Transforms/Attributor/nofpclass.ll
+9 -9  llvm/test/Transforms/Attributor/denormal-fp-math.ll
+8 -8  llvm/test/Bitcode/compatibility.ll
+5 -5  llvm/test/CodeGen/AMDGPU/default-fp-mode.ll
+2 -2  llvm/test/Transforms/InstSimplify/constant-fold-fp-denormal.ll
+0 -3  llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-maxnum.ll
+58 -94  12 files not shown
+69 -108  18 files

LLVM/project 54eb7b8 llvm/test/Transforms/Attributor nofpclass.ll

Fix test
Delta  File
+10 -10  llvm/test/Transforms/Attributor/nofpclass.ll
+10 -10  1 files

LLVM/project 99858b1 llvm/test/Bitcode auto_upgrade_denormal_fp_math.ll

Clean up upgrade test
Delta  File
+8 -5  llvm/test/Bitcode/auto_upgrade_denormal_fp_math.ll
+8 -5  1 files

LLVM/project 38ce2c1 clang/test/CodeGen denormalfpmode-f32.c, llvm/docs LangRef.rst

Address comments
Delta  File
+49 -48  llvm/test/Bitcode/auto_upgrade_denormal_fp_math.ll
+12 -6  llvm/test/Assembler/invalid_denormal_fpenv.ll
+4 -13  mlir/test/Target/LLVMIR/Import/function-attributes.ll
+9 -5  llvm/lib/AsmParser/LLParser.cpp
+8 -4  llvm/docs/LangRef.rst
+6 -6  clang/test/CodeGen/denormalfpmode-f32.c
+88 -82  6 files not shown
+103 -96  12 files

LLVM/project a204d2d llvm/include/llvm-c Core.h, llvm/lib/IR Core.cpp

Add C API to set this
Delta  File
+34 -0  llvm/include/llvm-c/Core.h
+30 -0  llvm/unittests/IR/AttributesTest.cpp
+16 -0  llvm/lib/IR/Core.cpp
+80 -0  3 files