LLVM/project ecf4927lldb/source/Host/windows ProcessLauncherWindows.cpp

[lldb] Fix Windows build after 6b51e26d39fa (#170917)

DeltaFile
+1-1lldb/source/Host/windows/ProcessLauncherWindows.cpp
+1-11 files

LLVM/project 93d64a5llvm/lib/Target/SPIRV SPIRVModuleAnalysis.cpp SPIRVLegalizerInfo.cpp, llvm/test/CodeGen/SPIRV/extensions/SPV_NV_shader_atomic_fp16_vector atomicrmw_faddfsub_vec_float16.ll atomicrmw_fminfmax_vec_float16.ll

[SPIRV] Add `<2 x half>` and `<4 x half>` atomics via `SPV_NV_shader_atomic_fp16_vector` (#170213)

This adds support for the `SPV_NV_shader_atomic_fp16_vector` extension,
and then uses it to enable lowering of atomic add, sub, min and max on 2
and 4 component vectors of FP16, which are rather common options in ML
workloads. Even though `bfloat16` also works in practice, we do not
enable it since it's not specified in the extension (which might need
updating / promoting to KHR at least). A `TODO` is also inserted in
`SPIRVModuleAnalysis.cpp` regarding the need to upgrade its ample usage
of `report_fatal_error`; I have a WiP patch for that, but it still needs
a bit of baking. Finally, a paired patch will be necessary in the
Translator, as it's not aware of the extension either - I'll update this
review to reference the PR once I create it.
DeltaFile
+47-0llvm/test/CodeGen/SPIRV/extensions/SPV_NV_shader_atomic_fp16_vector/atomicrmw_faddfsub_vec_float16.ll
+45-0llvm/test/CodeGen/SPIRV/extensions/SPV_NV_shader_atomic_fp16_vector/atomicrmw_fminfmax_vec_float16.ll
+41-0llvm/lib/Target/SPIRV/SPIRVModuleAnalysis.cpp
+3-2llvm/lib/Target/SPIRV/SPIRVLegalizerInfo.cpp
+3-1llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
+3-0llvm/lib/Target/SPIRV/SPIRVSymbolicOperands.td
+142-32 files not shown
+146-38 files

LLVM/project 4c50d83clang/test/Driver darwin-link-libcxx.cpp

Require darwin
DeltaFile
+3-0clang/test/Driver/darwin-link-libcxx.cpp
+3-01 files

LLVM/project 3ac8417clang/lib/Driver/ToolChains Darwin.cpp, clang/test/Driver darwin-link-libcxx.cpp experimental-library-flag.cpp

Reapply "[clang][Darwin] Prefer the toolchain-provided libc++.dylib if there i…"

This reverts commit 12a532cc430c3b89483ce9cc89bbfc7bea8541e5.
DeltaFile
+81-0clang/test/Driver/darwin-link-libcxx.cpp
+41-3clang/lib/Driver/ToolChains/Darwin.cpp
+15-0compiler-rt/cmake/config-ix.cmake
+5-3clang/test/Driver/experimental-library-flag.cpp
+2-2clang/test/Driver/darwin-header-search-libcxx.cpp
+0-0clang/test/Driver/Inputs/basic_darwin_toolchain_static/usr/lib/libc++experimental.a
+144-85 files not shown
+144-811 files

LLVM/project 9dc9c14.github CODEOWNERS

[NFC] Become CODEOWNER of AMDGPULowerBufferFatPointers (#167953)

DeltaFile
+3-0.github/CODEOWNERS
+3-01 files

LLVM/project 7470d72llvm/lib/Target/AArch64 AArch64TargetTransformInfo.cpp AArch64Subtarget.h, llvm/test/Transforms/LoopUnroll/AArch64 apple-unrolling.ll

[AArch64] Add isAppleMLike helper to check for M cores and aligned CPUs. (#170553)

Add a new isAppleMLike helper that returns true if the core is part of
the Apple M core family or is Apple A14 or later. It is used to apply
cost decisions consistently to those groups of cores.

The function is now a single place to update when new cores are added.
It also makes sure we apply unrolling decisions for newer Apple cores to
Apple A17.

PR: https://github.com/llvm/llvm-project/pull/170553
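
A minimal sketch of the idea, not the actual LLVM code: one predicate groups the cores, and cost decisions query it instead of repeating the list. The `Cpu` enumerators, `unrollBudget`, and the budget values are hypothetical names and numbers chosen for illustration.

```cpp
// Hypothetical CPU identifiers for this sketch; not LLVM's real enumerators.
enum class Cpu { Generic, AppleA13, AppleA14, AppleA17, AppleM1 };

// Single place to update when a new core is added.
bool isAppleMLike(Cpu C) {
  switch (C) {
  case Cpu::AppleA14:
  case Cpu::AppleA17:
  case Cpu::AppleM1:
    return true;
  case Cpu::Generic:
  case Cpu::AppleA13:
    return false;
  }
  return false; // not reached for valid enumerators
}

// Made-up cost decision that relies on the helper rather than on a
// hand-maintained list of cores at each call site.
unsigned unrollBudget(Cpu C) { return isAppleMLike(C) ? 16u : 4u; }
```

With this shape, a core such as the hypothetical `AppleA17` entry picks up the same unrolling decision as the M cores as soon as it is listed in the helper.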
DeltaFile
+2-201llvm/test/Transforms/LoopUnroll/AArch64/apple-unrolling.ll
+4-13llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+15-0llvm/lib/Target/AArch64/AArch64Subtarget.h
+1-10llvm/lib/Target/AArch64/AArch64Subtarget.cpp
+22-2244 files

LLVM/project 12a532cclang/lib/Driver/ToolChains Darwin.cpp, clang/test/Driver darwin-link-libcxx.cpp experimental-library-flag.cpp

Revert "[clang][Darwin] Prefer the toolchain-provided libc++.dylib if there i…"

This reverts commit 190b8d0b4f19e1c3d68c5d153ec7be71a9969192.
DeltaFile
+0-81clang/test/Driver/darwin-link-libcxx.cpp
+3-41clang/lib/Driver/ToolChains/Darwin.cpp
+0-15compiler-rt/cmake/config-ix.cmake
+3-5clang/test/Driver/experimental-library-flag.cpp
+2-2clang/test/Driver/darwin-header-search-libcxx.cpp
+0-0clang/test/Driver/Inputs/basic_darwin_toolchain_static/usr/lib/libc++experimental.a
+8-1445 files not shown
+8-14411 files

LLVM/project 03c3716flang/include/flang/Parser parse-tree.h, flang/lib/Semantics canonicalize-do.cpp check-omp-loop.cpp

[flang][OpenMP] Reject END DO on construct that crosses label-DO (#169714)

In a label-DO construct where two or more loops share the same
terminator, an OpenMP construct must enclose all the loops if an
end-directive is present. E.g.

```
  do 100 i = 1,10
!$omp do
    do 100 j = 1,10
    100 continue
!$omp end do    ! Error, but ok if this line is removed
```

Fixes https://github.com/llvm/llvm-project/issues/169536.
DeltaFile
+28-5flang/lib/Semantics/canonicalize-do.cpp
+19-0flang/lib/Semantics/check-omp-loop.cpp
+1-1flang/include/flang/Parser/parse-tree.h
+1-1flang/test/Parser/OpenMP/atomic-label-do.f90
+1-1flang/test/Parser/OpenMP/cross-label-do.f90
+1-0flang/test/Semantics/OpenMP/loop-association.f90
+51-86 files

LLVM/project d07af13llvm/lib/MC GOFFObjectWriter.cpp

Fix formatting
DeltaFile
+1-2llvm/lib/MC/GOFFObjectWriter.cpp
+1-21 files

LLVM/project acb9742lldb/source/Commands CommandObjectBreakpoint.cpp

[lldb] Fix a warning

This patch fixes:

  lldb/source/Commands/CommandObjectBreakpoint.cpp:1266:21: error:
  unused variable 'expr' [-Werror,-Wunused-variable]
DeltaFile
+0-1lldb/source/Commands/CommandObjectBreakpoint.cpp
+0-11 files

LLVM/project 4febd61clang/test/CodeGenObjC expose-direct-method-consumed.m

fix mac test
DeltaFile
+0-11clang/test/CodeGenObjC/expose-direct-method-consumed.m
+0-111 files

LLVM/project 5d714bdclang/lib/CodeGen CGCall.cpp, clang/test/CodeGen lifetime-invoke-c.c

Add cleanups for the error path and add more tests
DeltaFile
+12-13clang/test/CodeGenCXX/aggregate-lifetime-invoke.cpp
+5-7clang/test/CodeGen/lifetime-invoke-c.c
+3-0clang/lib/CodeGen/CGCall.cpp
+20-203 files

LLVM/project aa32d94clang/lib/CodeGen CGCall.cpp

Avoid checking NoLifetimeMarkersForTemporaries
DeltaFile
+5-4clang/lib/CodeGen/CGCall.cpp
+5-41 files

LLVM/project 2560fa6clang/test/CodeGen lifetime-invoke-c.c

Fix test checks
DeltaFile
+2-2clang/test/CodeGen/lifetime-invoke-c.c
+2-21 files

LLVM/project 8be2cadclang/lib/CodeGen CGCall.cpp

Update comment to be more accurate
DeltaFile
+3-5clang/lib/CodeGen/CGCall.cpp
+3-51 files

LLVM/project 78523ecclang/test/CodeGenCXX aggregate-lifetime-invoke.cpp

Update invoke test for tighter lifetimes
DeltaFile
+12-14clang/test/CodeGenCXX/aggregate-lifetime-invoke.cpp
+12-141 files

LLVM/project 2513476clang/lib/CodeGen CGCall.cpp CGCall.h, clang/test/CodeGen stack-usage-lifetimes.c

[clang] Use tighter lifetime bounds for C temporary arguments

In C, consecutive statements in the same scope are under
CompoundStmt/CallExpr, while in C++ they typically fall under
CompoundStmt/ExprWithCleanup. This leads to different behavior with
respect to where pushFullExprCleanUp inserts the lifetime end markers
(e.g., at the end of scope).

For these cases, we can track and insert the lifetime end markers right
after the call completes, allowing the stack space to be reused
immediately. This partially addresses #109204 and #43598 for improving
stack usage.
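
The source pattern in question looks roughly like the sketch below: consecutive call statements that each pass a large aggregate temporary by value. The snippet is written in the C-compatible subset of C++ so it can be compiled either way; `Big`, `makeBig`, `firstByte`, and `twoCalls` are hypothetical names. Whether the two temporaries actually share a stack slot depends on the compiler version and optimization level, so the comments describe the intent only.

```cpp
#include <cassert>
#include <cstring>

struct Big {
  char Data[1024];
};

// Returns a large aggregate by value, which makes the call site
// materialize a temporary.
Big makeBig(char Fill) {
  Big B;
  std::memset(B.Data, Fill, sizeof(B.Data));
  return B;
}

int firstByte(Big B) { return B.Data[0]; }

int twoCalls() {
  // The temporary for this argument is no longer needed once the call returns.
  int A = firstByte(makeBig(1));
  // If the first temporary's lifetime ends right after its call, this
  // temporary may reuse the same stack space instead of getting a fresh slot.
  int B = firstByte(makeBig(2));
  assert(A == 1 && B == 2);
  return A + B;
}
```

If the end marker were only emitted at the end of the enclosing scope, both 1 KiB temporaries would be live at once in this function.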
DeltaFile
+89-0clang/test/CodeGen/stack-usage-lifetimes.c
+15-5clang/lib/CodeGen/CGCall.cpp
+19-0clang/lib/CodeGen/CGCall.h
+1-1clang/test/CodeGenCXX/stack-reuse-miscompile.cpp
+124-64 files

LLVM/project 5daad5bclang/docs ReleaseNotes.rst, clang/lib/CodeGen CGCall.cpp

[clang] Limit lifetimes of temporaries to the full expression (#170517)

We have several issues describing suboptimal stack usage related to the
lifetimes of temporary objects, such as #68747, #43598, and #109204.

Previously, https://reviews.llvm.org/D74094 tried to address this. In
that review, a few issues were brought up, particularly a concern about
the lifetimes of the temporaries needing to be extended to end of the
full expression. While there are arguably more optimal lifetime bounds
we could enforce, for now we can conservatively make them extend to the
end of the full expression, and later refine the optimization to use
tighter bounds (or perhaps a better mechanism in the middle end?).

Fixes #68747

Co-authored-by: Nick Desaulniers <nick.desaulniers at gmail.com>
Co-authored-by: Erik Pilkington <erik.pilkington at gmail.com>

---------

    [2 lines not shown]
DeltaFile
+98-0clang/test/CodeGen/lifetime-call-temp.c
+50-0clang/test/CodeGen/lifetime-invoke-c.c
+47-0clang/test/CodeGenCXX/aggregate-lifetime-invoke.cpp
+21-1clang/lib/CodeGen/CGCall.cpp
+19-0clang/test/CodeGenCXX/amdgcn-call-with-aggarg.cc
+9-0clang/docs/ReleaseNotes.rst
+244-14 files not shown
+261-110 files

LLVM/project 8c08de3flang/lib/Semantics check-omp-loop.cpp, flang/test/Semantics/OpenMP loop-association.f90

Fix typo in error message
DeltaFile
+1-1flang/test/Semantics/OpenMP/loop-association.f90
+1-1flang/lib/Semantics/check-omp-loop.cpp
+2-22 files

LLVM/project bb17dfamlir/include/mlir/Transforms RegionUtils.h, mlir/lib/Dialect/Bufferization/Transforms EmptyTensorElimination.cpp

[mlir][bufferization] Enable moving dependent values in eliminate-empty-tensors (#169718)

Currently, empty tensor elimination, which constructs a
SubsetExtractionOp to match a SubsetInsertionOp at the end of a DPS
chain, fails if any operands required by the insertion op don't dominate
the insertion point for the extraction op.

This change improves the transformation by attempting to move all pure
producers of required operands to the insertion point of the extraction
op. In the process this improves a number of tests for empty tensor
elimination.
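
The hoisting step can be pictured with a toy model, which is only a sketch and not MLIR's API: a block is a vector of operations in program order, each producing one value. `Op` and `moveDepsBefore` are hypothetical names. The function moves the transitive producers of the needed values in front of the op at the insertion point, and gives up if any of them is impure.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Toy operation: produces value `Id` and consumes the values in `Operands`.
struct Op {
  int Id;
  std::vector<int> Operands;
  bool Pure;
};

// Tries to make every value in `Needed` available before the op currently at
// `InsertPos` by hoisting its transitive producers. Values defined before
// `InsertPos` already dominate it and are skipped. Returns false and leaves
// the block untouched if an impure producer would have to move.
bool moveDepsBefore(std::vector<Op> &Block, std::size_t InsertPos,
                    const std::vector<int> &Needed) {
  assert(InsertPos <= Block.size());
  std::set<int> ToMove;
  std::vector<int> Worklist(Needed.begin(), Needed.end());
  while (!Worklist.empty()) {
    int V = Worklist.back();
    Worklist.pop_back();
    for (std::size_t I = InsertPos; I < Block.size(); ++I) {
      if (Block[I].Id != V || ToMove.count(V))
        continue;
      if (!Block[I].Pure)
        return false;
      ToMove.insert(V);
      Worklist.insert(Worklist.end(), Block[I].Operands.begin(),
                      Block[I].Operands.end());
    }
  }
  // Hoist the selected ops to the front of the tail, keeping relative order.
  std::stable_partition(Block.begin() + InsertPos, Block.end(),
                        [&](const Op &O) { return ToMove.count(O.Id) != 0; });
  return true;
}
```

Restricting the move to pure producers is what keeps the reordering safe in this model: an op with side effects cannot be hoisted past other ops without possibly changing behavior.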
DeltaFile
+89-87mlir/test/Transforms/move-operation-deps.mlir
+80-19mlir/test/Dialect/Bufferization/Transforms/one-shot-bufferize-empty-tensor-elimination.mlir
+12-5mlir/lib/Transforms/Utils/RegionUtils.cpp
+8-2mlir/lib/Dialect/Bufferization/Transforms/EmptyTensorElimination.cpp
+3-1mlir/test/lib/Transforms/TestTransformsOps.td
+2-1mlir/include/mlir/Transforms/RegionUtils.h
+194-1156 files

LLVM/project 5506a4dflang/lib/Semantics check-omp-loop.cpp

Remove unused variable
DeltaFile
+0-1flang/lib/Semantics/check-omp-loop.cpp
+0-11 files

LLVM/project ee781e8flang/lib/Semantics check-omp-loop.cpp canonicalize-do.cpp, flang/test/Parser/OpenMP cross-label-do.f90

The END DO crossing loop boundary is always illegal
DeltaFile
+16-13flang/lib/Semantics/check-omp-loop.cpp
+6-5flang/lib/Semantics/canonicalize-do.cpp
+2-3flang/test/Parser/OpenMP/cross-label-do.f90
+0-1flang/test/Semantics/OpenMP/loop-association.f90
+24-224 files

LLVM/project 699e634llvm/lib/CodeGen/AsmPrinter AsmPrinter.cpp, llvm/lib/MC MCAsmInfoGOFF.cpp

Remove loop with type information
DeltaFile
+0-12llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+3-3llvm/test/CodeGen/SystemZ/zos-symbol-1.ll
+0-4llvm/lib/Target/SystemZ/SystemZAsmPrinter.cpp
+2-2llvm/test/CodeGen/SystemZ/zos-section-1.ll
+1-1llvm/test/CodeGen/SystemZ/zos-section-2.ll
+0-1llvm/lib/MC/MCAsmInfoGOFF.cpp
+6-236 files

LLVM/project 29fa151mlir/lib/Conversion/AMDGPUToROCDL AMDGPUToROCDL.cpp

[mlir] Fix a warning

This patch fixes:

  mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp:2666:10: error:
  unused variable 'v4i32' [-Werror,-Wunused-variable]
DeltaFile
+2-1mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp
+2-11 files

LLVM/project 5dfd9c4mlir/include/mlir/Dialect/AMDGPU/IR AMDGPU.td, mlir/lib/Conversion/AMDGPUToROCDL AMDGPUToROCDL.cpp

[mlir][amdgpu] Add lowering for make_dma_descriptor (#169955)

* Add initial lowering for make_dma_descriptor supporting tensors of
rank 2.
* Add folders for make_dma_descriptor allowing statically known
operands to be folded into attributes.
* Add AllElementTypesMatch<["lds", "global"]> to make_dma_base.
* Rename pad to pad_amount.
* Rename pad_every to pad_interval.
DeltaFile
+343-1mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp
+150-0mlir/test/Conversion/AMDGPUToROCDL/gfx1250.mlir
+66-21mlir/lib/Dialect/AMDGPU/IR/AMDGPUDialect.cpp
+24-24mlir/test/Dialect/AMDGPU/ops.mlir
+32-5mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPU.td
+19-0mlir/test/Dialect/AMDGPU/amdgpu-make-dma-descriptor-fold.mlir
+634-511 files not shown
+640-567 files

LLVM/project 6b51e26lldb/include/lldb/Host FileAction.h, lldb/source/Host/common FileAction.cpp

[lldb][NFCI] Remove FileAction::GetPath (#170764)

This method puts strings into the ConstString pool and vends them as
llvm::StringRefs. Most of the uses only require a `std::string` or a
`const char *`. This can be achieved without wasting memory.
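
To illustrate the memory concern, here is a small sketch using a hypothetical `InternPool` as a stand-in for a process-wide string pool; it is not lldb's ConstString implementation, and `FileActionSketch` with its two accessors is likewise made up. The pool-backed accessor leaves an entry behind for every distinct path, while the direct accessor hands out the string the object already owns.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for a process-wide string pool. Entries are never
// removed, so every distinct string stays alive as long as the pool does.
class InternPool {
  std::unordered_set<std::string> Strings;

public:
  const std::string &intern(const std::string &S) {
    return *Strings.insert(S).first;
  }
  std::size_t size() const { return Strings.size(); }
};

struct FileActionSketch {
  std::string Path;

  // Pool-backed accessor: creates (or finds) a pool entry for the path.
  const std::string &getPathInterned(InternPool &Pool) const {
    return Pool.intern(Path);
  }

  // Direct accessor: no pool entry is created.
  const std::string &getPath() const { return Path; }
};
```

Callers that only need a `std::string` or a `const char *` can use the direct accessor (and `c_str()` on its result) without growing the pool.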
DeltaFile
+24-15lldb/source/Host/macosx/objcxx/Host.mm
+3-3lldb/source/Target/Target.cpp
+0-4lldb/source/Host/common/FileAction.cpp
+2-2lldb/source/Host/posix/ProcessLauncherPosixFork.cpp
+0-2lldb/include/lldb/Host/FileAction.h
+29-265 files

LLVM/project ad94fdellvm/include/llvm/MC MCSymbolGOFF.h, llvm/lib/CodeGen/AsmPrinter AsmPrinter.cpp

Remove MCSA_WeakReference/MCSA_Global from the loop
DeltaFile
+4-3llvm/include/llvm/MC/MCSymbolGOFF.h
+1-3llvm/lib/MC/MCSymbolGOFF.cpp
+0-3llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+1-0llvm/lib/MC/MCAsmInfoGOFF.cpp
+6-94 files

LLVM/project c5bdc21llvm/include/llvm/IR DebugLoc.h, llvm/lib/IR DebugLoc.cpp

[NFC][LLVM] Minor code cleanup in DebugLoc (#170757)

Remove indentation of code in the llvm namespace in the header file.
Remove {} around a single-statement if in the .cpp file.
DeltaFile
+236-237llvm/include/llvm/IR/DebugLoc.h
+3-5llvm/lib/IR/DebugLoc.cpp
+239-2422 files

LLVM/project f57f338flang/include/flang/Optimizer/Dialect/CUF CUFOps.td, flang/lib/Lower CUDA.cpp Allocatable.cpp

[flang][cuda] Add double descriptor information in allocate/deallocate operations (#170901)

After https://github.com/llvm/llvm-project/pull/169740, the allocate and
deallocate cuf operations can be converted later. Update the way the
double descriptor case is recognized by adding this information directly
on the operation itself.
DeltaFile
+2-24flang/lib/Optimizer/Transforms/CUDA/CUFAllocationConversion.cpp
+18-0flang/test/Lower/CUDA/cuda-allocatable.cuf
+14-0flang/lib/Lower/CUDA.cpp
+6-6flang/test/Fir/CUDA/cuda-allocate.fir
+7-3flang/lib/Lower/Allocatable.cpp
+5-4flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td
+52-371 files not shown
+55-377 files

LLVM/project ad1edc9mlir/lib/Analysis/DataFlow IntegerRangeAnalysis.cpp, mlir/test/Interfaces/InferIntRangeInterface infer-int-range-test-ops.mlir

[mlir][IntegerRangeAnalysis] Handle multi-dimensional loops (#170765)

Since LoopLikeInterface has (for some time) been extended to handle
multiple induction variables (and thus lower and upper bounds), handle
those bounds one at a time.
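
A simplified sketch of handling the bounds one at a time, under loud assumptions: bounds are compile-time constants, upper bounds are exclusive, steps are positive, and every loop is non-empty. The real analysis works on inferred ranges of the bounds rather than constants; `Range`, `LoopBounds`, and `inferIvRanges` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Range {
  std::int64_t Min;
  std::int64_t Max;
};

// Hypothetical description of a loop with one lower and one exclusive upper
// bound per induction variable.
struct LoopBounds {
  std::vector<std::int64_t> Lower;
  std::vector<std::int64_t> Upper;
};

// Computes one range per induction variable: IV number I takes values in
// [Lower[I], Upper[I] - 1] under the assumptions stated above.
std::vector<Range> inferIvRanges(const LoopBounds &B) {
  assert(B.Lower.size() == B.Upper.size());
  std::vector<Range> Out;
  for (std::size_t I = 0; I < B.Lower.size(); ++I)
    Out.push_back({B.Lower[I], B.Upper[I] - 1});
  return Out;
}
```

Pairing each induction variable with its own bounds avoids assuming that a loop has exactly one lower and one upper bound.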
DeltaFile
+45-43mlir/lib/Analysis/DataFlow/IntegerRangeAnalysis.cpp
+16-0mlir/test/Interfaces/InferIntRangeInterface/infer-int-range-test-ops.mlir
+61-432 files