LLVM/project 3f0d0b7 llvm/test/CodeGen/AMDGPU machine-sink-cycle.mir global_atomics_scan_fmin.ll

[AMDGPU] Take into account amdgpu-waves-per-eu in getRegPressureLimit

The minimum occupancy computed by `getOccupancyWithWorkGroupSizes`
doesn't take into account that the user may have provided a
low-occupancy target through the amdgpu-waves-per-eu attribute.

Use getWavesPerEU which gives the proper occupancy bounds.

When the user specifies a small amdgpu-waves-per-eu range (like "1,1"), this
results in higher VGPR limits.
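
To illustrate why a lower occupancy target raises the VGPR limit, here is a toy model. The register file size, per-wave cap, and allocation granule are illustrative assumptions rather than values from the patch or from any particular subtarget, and `maxVGPRsForWaves` is a hypothetical helper, not an LLVM API.

```cpp
#include <algorithm>

// Toy model of the VGPR budget per wave when `waves` waves share one SIMD.
// All three constants below are assumptions chosen for illustration only.
const unsigned TotalVGPRs = 512; // assumed register file size per SIMD
const unsigned MaxPerWave = 256; // assumed cap on VGPRs a single wave can use
const unsigned Granule = 8;      // assumed allocation granularity

// Hypothetical helper: not part of LLVM.
unsigned maxVGPRsForWaves(unsigned waves) {
  if (waves == 0)
    return 0;                        // no waves requested, no budget to report
  unsigned share = TotalVGPRs / waves; // split the register file evenly
  share -= share % Granule;            // round down to the allocation granule
  return std::min(share, MaxPerWave);  // a wave cannot exceed the per-wave cap
}
```

Under these assumptions, one wave per EU gets the full per-wave cap while eight waves get a quarter of it, which is the direction of the change described above.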
Delta      File
+106 -294  llvm/test/CodeGen/AMDGPU/machine-sink-cycle.mir
+48 -51    llvm/test/CodeGen/AMDGPU/global_atomics_scan_fmin.ll
+48 -51    llvm/test/CodeGen/AMDGPU/global_atomics_scan_fmax.ll
+34 -50    llvm/test/CodeGen/AMDGPU/agpr-copy-no-free-registers.ll
+10 -10    llvm/test/CodeGen/AMDGPU/atomic_optimizations_global_pointer.ll
+6 -6      llvm/test/CodeGen/AMDGPU/licm-regpressure.mir
+252 -462  1 files not shown
+254 -463  7 files

LLVM/project 8b75dd2 llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp, llvm/test/Transforms/InstCombine simplify-demanded-fpclass.ll

InstCombine: Stop preserving undef in SimplifyDemandedFPClass

If we know there are no valid values, fold to poison. Previously this
would leave values that started as undef alone.
Delta   File
+17 -0  llvm/test/Transforms/InstCombine/simplify-demanded-fpclass.ll
+2 -2   llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+19 -2  2 files

LLVM/project 21ba594 llvm/test/CodeGen/AMDGPU licm-regpressure.mir

Pre-commit test: [AMDGPU] Take into account amdgpu-waves-per-eu in getRegPressureLimit
Delta    File
+301 -2  llvm/test/CodeGen/AMDGPU/licm-regpressure.mir
+301 -2  1 files

LLVM/project 34f434e llvm/test/Transforms/PreISelIntrinsicLowering/AMDGPU lit.local.cfg

[NFC][AMDGPU] Add missing `lit.local.cfg` to `PreISelIntrinsicLowering` tests (#178154)

Add `lit.local.cfg` to restrict the `PreISelIntrinsicLowering/AMDGPU`
tests to AMDGPU only.

These tests were previously being run for all targets.
Delta  File
+2 -0  llvm/test/Transforms/PreISelIntrinsicLowering/AMDGPU/lit.local.cfg
+2 -0  1 files

LLVM/project 9e09c4e llvm/test/Transforms/InstCombine simplify-demanded-fpclass-log.ll

InstCombine: Add more log nnan/ninf log intrinsic inference tests

These got lost in various merges.
Delta    File
+110 -0  llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-log.ll
+110 -0  1 files

LLVM/project e74e970 flang/lib/Optimizer/OpenMP DoConcurrentConversion.cpp, flang/test/Transforms/DoConcurrent multiple_iteration_ranges.f90

[flang][OpenMP][DoConcurrent] Add `collapse` clause to generated `omp.loop_nest` op (#178138)

Adds the collapse clause to the generated loop nest both on host and
device.
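
As a rough analogy for what a `collapse` clause means, the sketch below uses the C++ OpenMP pragma rather than Fortran `do concurrent` or the `omp.loop_nest` op that the patch touches. `fillGrid` is a hypothetical function written for this illustration; without OpenMP enabled the pragma is ignored and the loops run serially with the same result.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: fills an n-by-m grid stored in row-major order.
// With OpenMP enabled, collapse(2) treats the two perfectly nested loops as
// a single iteration space of n*m iterations before distributing the work.
std::vector<int> fillGrid(int n, int m) {
  std::vector<int> grid(static_cast<std::size_t>(n) * m);
  #pragma omp parallel for collapse(2)
  for (int i = 0; i < n; ++i)
    for (int j = 0; j < m; ++j)
      grid[static_cast<std::size_t>(i) * m + j] = i * 10 + j;
  return grid;
}
```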
Delta  File
+2 -0  flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp
+1 -1  flang/test/Transforms/DoConcurrent/multiple_iteration_ranges.f90
+3 -1  2 files

LLVM/project 0fbcafb llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp, llvm/test/Transforms/InstCombine simplify-demanded-fpclass.ll

InstCombine: Apply demanded mask at recursion limit in SimplifyDemandedFPClass

This fixes missed flag inference in some cases, caused by not inferring
that a no-nan result implies a no-nan source. Also start treating explicit
nofpclass attributes as a leaf value, like a constant or argument.
Delta  File
+5 -3  llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+1 -1  llvm/test/Transforms/InstCombine/simplify-demanded-fpclass.ll
+6 -4  2 files

LLVM/project b7c68c3 llvm/test/CodeGen/AMDGPU release-vgprs.mir

[AMDGPU] Add test for a bug in the early release VGPRs optimization (#178141)

Delta   File
+63 -0  llvm/test/CodeGen/AMDGPU/release-vgprs.mir
+63 -0  1 files

LLVM/project 00fb401 llvm/test/Transforms/InstCombine simplify-demanded-fpclass-exp.ll

InstCombine: Add a few more tests for SimplifyDemandedFPClass exp handling (#178147)

These got lost in various merges. Test a few cases where flags are
inferred from context.
Delta   File
+40 -0  llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-exp.ll
+40 -0  1 files

LLVM/project eaf6c14 clang/www get_started.html hacking.html

[Clang] Remove gnuwin32 documentation references (#177557)

Remove the documentation references to GnuWin32. The project is no
longer maintained, and as LLVM is now using Git, `llvm-lit` is now using
the GNU core utilities packaged with it rather than requiring a separate
installation.

This appears to have been discussed on
[discourse](https://discourse.llvm.org/t/gnuwin32-alternatives-for-tests-of-msvc-build/42846/3)
but was never implemented.
Delta    File
+6 -12   clang/www/get_started.html
+5 -7    clang/www/hacking.html
+11 -19  2 files

LLVM/project 7287f95 utils/bazel/llvm-project-overlay/mlir BUILD.bazel

[bazel] Port ec57636ae447247683716c00437552645a52ba68 (#178151)

Delta   File
+13 -1  utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
+13 -1  1 files

LLVM/project 3278ff7 clang/lib/Analysis/LifetimeSafety LifetimeAnnotations.cpp, clang/test/Sema warn-lifetime-safety.cpp warn-lifetime-analysis-nocfg.cpp

[LifetimeSafety] Fix 'clang::lifetimebound' ignored on template method definition (#178000)

Closes https://github.com/llvm/llvm-project/issues/177798.

Same pattern with `getTemplateInstantiationPattern()` for attrs checking
is already used in other LLVM places, e.g.:


https://github.com/llvm/llvm-project/blob/b19238d0d0a7d026de8e2ad28775db57afccb01d/clang/lib/CodeGen/CodeGenModule.cpp#L2830-L2839
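
As a hedged illustration of the kind of code the title refers to, the sketch below puts the attribute on the in-class declaration of a template method and leaves it off the out-of-line definition. The actual reproducer is in the linked issue and may differ; `Holder` and `safeUse` are hypothetical names, and compilers other than Clang will simply ignore the attribute.

```cpp
#include <cstddef>
#include <string>

// Hypothetical class template used only for illustration.
template <typename T>
struct Holder {
  T value;
  // The attribute is written on the declaration: the returned reference
  // is tied to the lifetime of the implicit object (*this).
  const T &get() const [[clang::lifetimebound]];
};

// Out-of-line definition of the template method, without the attribute.
template <typename T>
const T &Holder<T>::get() const { return value; }

std::size_t safeUse() {
  Holder<std::string> h{"abc"};
  const std::string &r = h.get(); // fine: h outlives r
  return r.size();
}

// By contrast, binding a reference to Holder<std::string>{"tmp"}.get()
// would leave it dangling once the temporary is destroyed, which is the
// kind of use the annotation is meant to let the analysis flag.
```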
Delta   File
+42 -1  clang/test/Sema/warn-lifetime-safety.cpp
+12 -5  clang/lib/Analysis/LifetimeSafety/LifetimeAnnotations.cpp
+1 -2   clang/test/Sema/warn-lifetime-analysis-nocfg.cpp
+55 -8  3 files

LLVM/project e8f9501 llvm/test/Transforms/InstCombine simplify-demanded-fpclass-exp.ll

InstCombine: Add a few more tests for SimplifyDemandedFPClass exp handling

These got lost in various merges. Test a few cases where flags are inferred
from context.
Delta   File
+40 -0  llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-exp.ll
+40 -0  1 files

LLVM/project ec57636 mlir/include/mlir/Dialect/Bufferization/Extensions AllExtensions.h ShardingExtensions.h, mlir/include/mlir/Dialect/Tensor/IR ShardingInterfaceImpl.h

[mlir][shard, bufferization] Adding sharding extensions for bufferization ops (#177378)

Adding trivial sharding support for `bufferization.alloc_tensor`,
`bufferization.dealloc_tensor` and
`bufferization.materialize_in_destination`.

include/mlir/Dialect/Tensor/IR/ShardingInterfaceImpl.h -> mlir/include/mlir/Dialect/Bufferization/Extensions/ShardingExtensions.h

---------

Co-authored-by: Adam Siemieniuk <adam.siemieniuk at intel.com>
Delta     File
+55 -0    mlir/test/Dialect/Bufferization/shard-partition.mlir
+33 -0    mlir/lib/Dialect/Bufferization/Extensions/ShardingExtensions.cpp
+30 -0    mlir/include/mlir/Dialect/Bufferization/Extensions/AllExtensions.h
+26 -0    mlir/lib/Dialect/Bufferization/Extensions/CMakeLists.txt
+0 -23    mlir/include/mlir/Dialect/Tensor/IR/ShardingInterfaceImpl.h
+22 -0    mlir/include/mlir/Dialect/Bufferization/Extensions/ShardingExtensions.h
+166 -23  4 files not shown
+187 -25  10 files

LLVM/project 08654ad flang/lib/Lower/OpenMP OpenMP.cpp Clauses.cpp, flang/test/Lower/OpenMP num-threads-dims.f90

[OpenMP][MLIR] Add num_threads clause with dims modifier support (#171767)

This PR adds support for the OpenMP 6.1 feature num_threads with dims modifier.
LLVM IR translation for num_threads with dims modifier is marked as NYI.
Delta     File
+61 -0    flang/test/Lower/OpenMP/num-threads-dims.f90
+35 -4    mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
+13 -5    mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+7 -7     flang/lib/Lower/OpenMP/OpenMP.cpp
+10 -3    flang/lib/Lower/OpenMP/Clauses.cpp
+12 -0    mlir/test/Dialect/OpenMP/ops.mlir
+138 -19  5 files not shown
+165 -29  11 files

LLVM/project 13c2934 clang-tools-extra/clang-tidy/bugprone UseAfterMoveCheck.cpp, clang-tools-extra/docs ReleaseNotes.rst

[clang-tidy] Add invalidation function name to bugprone-use-after-move (#178042)

Make the messages clearer in response to the report at
https://github.com/llvm/llvm-project/pull/170346#issuecomment-3798583117.

---------

Co-authored-by: EugeneZelenko <eugene.zelenko at gmail.com>
Delta    File
+15 -7   clang-tools-extra/test/clang-tidy/checkers/bugprone/use-after-move.cpp
+7 -5    clang-tools-extra/clang-tidy/bugprone/UseAfterMoveCheck.cpp
+5 -0    clang-tools-extra/docs/ReleaseNotes.rst
+27 -12  3 files

LLVM/project 995ad3c flang/lib/Lower/OpenMP OpenMP.cpp Clauses.cpp, flang/test/Lower/OpenMP thread-limit-dims.f90

[OpenMP][MLIR] Add thread_limit with dims modifier support
Delta     File
+61 -0    flang/test/Lower/OpenMP/thread-limit-dims.f90
+35 -3    mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
+17 -6    mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+8 -7     flang/lib/Lower/OpenMP/OpenMP.cpp
+10 -3    flang/lib/Lower/OpenMP/Clauses.cpp
+13 -0    mlir/test/Dialect/OpenMP/ops.mlir
+144 -19  5 files not shown
+169 -29  11 files

LLVM/project 56666d9 llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll, polly/lib/External/isl/include/isl typed_cpp.h cpp.h

Merge branch 'main' into users/meinersbur/runtimes_resource-dir
Delta              File
+47,161 -55,379    llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+52,760 -0         polly/lib/External/isl/include/isl/typed_cpp.h
+17,188 -14,558    llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+12,842 -18,547    llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+30,864 -0         polly/lib/External/isl/include/isl/cpp.h
+11,654 -16,786    llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+172,469 -105,270  3,291 files not shown
+517,937 -324,896  3,297 files

LLVM/project 443e4e8 cmake/Modules GetToolchainDirs.cmake

Address review comments by @petrhosek
Delta  File
+4 -7  cmake/Modules/GetToolchainDirs.cmake
+4 -7  1 files

LLVM/project fcb96d8 llvm/test/CodeGen/AArch64 aarch64-mixed-ptr-sizes.ll stack-probing-dynamic.ll, llvm/test/CodeGen/AArch64/GlobalISel pr57349.ll

[AArch64][GlobalISel] Remove -global-isel-abort=2 from a number of tests. NFC

This cleans up some -global-isel-abort=2 uses, either removing the unnecessary
flags or cleaning up the tests that use them.
Delta     File
+66 -26   llvm/test/CodeGen/AArch64/aarch64-mixed-ptr-sizes.ll
+6 -4     llvm/test/CodeGen/AArch64/stack-probing-dynamic.ll
+7 -1     llvm/test/CodeGen/AArch64/win64-fpowi.ll
+5 -2     llvm/test/CodeGen/AArch64/vararg-tallcall.ll
+3 -1     llvm/test/CodeGen/AArch64/GlobalISel/pr57349.ll
+2 -2     llvm/test/CodeGen/AArch64/itofp.ll
+89 -36   12 files not shown
+101 -48  18 files

LLVM/project 4946906 llvm/lib/Target/SPIRV SPIRVEmitIntrinsics.cpp, llvm/test/CodeGen/SPIRV/extensions/SPV_KHR_float_controls2 exec_mode3.ll

[SPIRV] Emit intrinsics for globals only in function that references them

In the SPIRV backend, the SPIRVEmitIntrinsics::processGlobalValue
function adds intrinsic calls for every global variable of the module,
on every function.

These intrinsics are used to keep track of global variables, their types and
initializers.

In SPIRV everything is an instruction (even globals/constants). We currently
represent these global entities as individual instructions on every function.
Later, the `SPIRVModuleAnalysis` collects these entities and maps function _local_ registers
to _global_ registers. The `SPIRVAsmPrinter` is in charge of mapping back the _local_
registers to the appropriate _global_ register.

Emitting these instructions in functions that do not reference the associated global entities
leads to a bloated intermediate representation and high memory consumption (as happened
in https://github.com/llvm/llvm-project/issues/170339).


    [25 lines not shown]
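
The size difference described above can be pictured with a toy model. The sketch below does not use any LLVM or SPIRV backend API; `FunctionUses`, `entriesForAllGlobals`, and `entriesForReferencedGlobals` are hypothetical names, and each function is reduced to the set of global names it references.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model: function name -> names of the globals that function references.
using FunctionUses = std::map<std::string, std::set<std::string>>;

// Behaviour before the change, as described: every function carries one
// tracking entry per global in the module, referenced or not.
std::size_t entriesForAllGlobals(const FunctionUses &funcs,
                                 const std::vector<std::string> &globals) {
  return funcs.size() * globals.size();
}

// Behaviour after the change, as described: a function only carries
// entries for the globals it actually references.
std::size_t entriesForReferencedGlobals(const FunctionUses &funcs) {
  std::size_t total = 0;
  for (const auto &entry : funcs)
    total += entry.second.size();
  return total;
}
```

In this model the first count grows with the product of functions and globals, while the second grows only with the number of actual references.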
Delta      File
+48 -38    llvm/test/CodeGen/SPIRV/pointers/fun-with-aggregate-arg-in-const-init.ll
+46 -30    llvm/test/CodeGen/SPIRV/extensions/SPV_KHR_float_controls2/exec_mode3.ll
+38 -2     llvm/lib/Target/SPIRV/SPIRVEmitIntrinsics.cpp
+15 -15    llvm/test/CodeGen/SPIRV/extensions/SPV_NV_shader_atomic_fp16_vector/atomicrmw_faddfsub_vec_float16.ll
+15 -15    llvm/test/CodeGen/SPIRV/extensions/SPV_NV_shader_atomic_fp16_vector/atomicrmw_fminfmax_vec_float16.ll
+162 -100  5 files

LLVM/project 8dfca82 libcxx/include/__algorithm equal.h

[libc++][NFC] Don't use std::distance in std::equal (#177113)

We don't need to use `std::distance`, since we know for a fact that we
have random access iterators in that place. Instead, we can just
subtract the iterators, avoiding a bunch of template machinery and
improving compile times a bit.
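
A minimal sketch of the idea, not the libc++ implementation: when both ranges are known to be random access, the lengths can be compared with plain iterator subtraction instead of `std::distance`. `equalRandomAccess` is a hypothetical name chosen for this example.

```cpp
#include <iterator>
#include <type_traits>

// Hypothetical helper: compares two random-access ranges for equality.
template <class It1, class It2>
bool equalRandomAccess(It1 first1, It1 last1, It2 first2, It2 last2) {
  using Cat1 = typename std::iterator_traits<It1>::iterator_category;
  using Cat2 = typename std::iterator_traits<It2>::iterator_category;
  static_assert(std::is_base_of<std::random_access_iterator_tag, Cat1>::value,
                "first range must be random access");
  static_assert(std::is_base_of<std::random_access_iterator_tag, Cat2>::value,
                "second range must be random access");

  // Subtraction is O(1) for random-access iterators, so no std::distance
  // dispatch is needed to compare the lengths.
  if (last1 - first1 != last2 - first2)
    return false;

  // Lengths match, so walking the first range covers both.
  for (; first1 != last1; ++first1, ++first2)
    if (!(*first1 == *first2))
      return false;
  return true;
}
```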
Delta  File
+1 -4  libcxx/include/__algorithm/equal.h
+1 -4  1 files

LLVM/project 7687a14 llvm/docs GettingInvolved.rst

[docs] Update ics for my office hours
Delta  File
+1 -1  llvm/docs/GettingInvolved.rst
+1 -1  1 files

LLVM/project c2d510f flang/include/flang/Optimizer/Builder HLFIRTools.h, flang/include/flang/Optimizer/Dialect FIROpsSupport.h

[flang] fix DIR IVDEP for array assignments inside loops (#177940)

The access attribute set on hlfir.assign for arrays was lost in
InlineHLFIRAssign.cpp. This patch propagates it to the created loads and
stores.
Delta    File
+22 -10  flang/lib/Optimizer/Builder/HLFIRTools.cpp
+20 -0   flang/test/Lower/ivdep-array.f90
+7 -4    flang/include/flang/Optimizer/Builder/HLFIRTools.h
+6 -1    flang/lib/Optimizer/HLFIR/Transforms/InlineHLFIRAssign.cpp
+5 -0    flang/include/flang/Optimizer/Dialect/FIROpsSupport.h
+2 -1    flang/lib/Lower/Bridge.cpp
+62 -16  1 files not shown
+63 -17  7 files

LLVM/project 14cd8f0 clang/lib/AST/ByteCode InterpState.h State.h

[clang][bytecode][NFC] Clean up InterpState includes (#178130)

Delta  File
+0 -7  clang/lib/AST/ByteCode/InterpState.h
+1 -0  clang/lib/AST/ByteCode/State.h
+1 -7  2 files

LLVM/project e3284b9 llvm/lib/Transforms/Utils LowerMemIntrinsics.cpp, llvm/test/Transforms/PreISelIntrinsicLowering/AMDGPU expand-mem-intrinsics.ll

[LowerMemIntrinsics][AMDGPU] Propagate Debug Value (#178131)

Propagate debug value to expanded loops for `memcpy`, `memmove` and
`memset` intrinsics.
Delta    File
+330 -0  llvm/test/Transforms/PreISelIntrinsicLowering/AMDGPU/expand-mem-intrinsics.ll
+34 -6   llvm/lib/Transforms/Utils/LowerMemIntrinsics.cpp
+364 -6  2 files

LLVM/project ba833a6 mlir/include/mlir/Dialect/EmitC/Transforms Transforms.h, mlir/lib/Dialect/EmitC/IR EmitC.cpp

Revert "[mlir][emitc] Fix recurring operands in expression (#175535)"

This reverts commit 4a50d99a50ef10da020cc7de6d9f10a07398b25a.

Fails the buildbot.
Delta     File
+0 -52    mlir/lib/Dialect/EmitC/Transforms/Transforms.cpp
+10 -34   mlir/lib/Dialect/EmitC/IR/EmitC.cpp
+1 -23    mlir/test/Dialect/EmitC/ops.mlir
+0 -19    mlir/test/Dialect/EmitC/form-expressions.mlir
+0 -13    mlir/test/Dialect/EmitC/invalid_ops.mlir
+0 -4     mlir/include/mlir/Dialect/EmitC/Transforms/Transforms.h
+11 -145  1 files not shown
+11 -146  7 files

LLVM/project b232970 libcxx/include/__ranges subrange.h join_with_view.h, libcxx/test/libcxx/ranges/range.adaptors/range.join.with nodiscard.verify.cpp

[libc++][ranges] Updated `[[nodiscard]]` implementation for `subrange` and `join_with_view` (#176936)

Added or removed `[[nodiscard]]` according to the guidelines and updated
the tests.

 - https://libcxx.llvm.org/CodingGuidelines.html
 - https://wg21.link/range.subrange
 - https://wg21.link/range.join.with.view

Towards #172124
Delta     File
+24 -65   libcxx/test/libcxx/ranges/range.adaptors/range.join.with/nodiscard.verify.cpp
+75 -0    libcxx/test/libcxx/ranges/range.utility/range.subrange/nodiscard.verify.cpp
+5 -5     libcxx/include/__ranges/subrange.h
+2 -3     libcxx/include/__ranges/join_with_view.h
+106 -73  4 files

LLVM/project 8488263 compiler-rt/lib/builtins CMakeLists.txt, compiler-rt/lib/builtins/aarch64 sme-libc-opt-memcpy-memmove-sve.S sme-libc-opt-memcpy-memmove.S

[compiler-rt][aarch64][sme] Add SVE/FP variant of `__arm_sc_memcpy` (#127093)

When SVE is available use the `-sve` variant of memcpy from AOR for
`__arm_sc_memcpy`. From:
https://github.com/ARM-software/optimized-routines/blob/71e36403858ab3ff743fcde336fb31890e57af7e/string/aarch64/memcpy-sve.S

This implementation uses FPR/ZPR load/store instructions to do the copy,
so should not cause memory hazards if called in streaming mode (with the
memory later being accessed in the streaming mode with SVE/SME
instructions).

The implementation has been slightly modified from AOR to use local
labels (matching other compiler-rt functions) but still passes the
memcpy and memmove tests from AOR.
Delta    File
+180 -0  compiler-rt/lib/builtins/aarch64/sme-libc-opt-memcpy-memmove-sve.S
+3 -0    compiler-rt/lib/builtins/aarch64/sme-libc-opt-memcpy-memmove.S
+1 -1    compiler-rt/lib/builtins/CMakeLists.txt
+184 -1  3 files

LLVM/project c8010da clang/docs/analyzer checkers.rst

[analyzer][docs] Add basic description of checker 'core.CallAndMessage' (#177179)

The checker had very little documentation. This adds a more detailed
(though still brief) description of its features and options.
Delta   File
+43 -1  clang/docs/analyzer/checkers.rst
+43 -1  1 files