LLVM/project 53cd4ab llvm/lib/Target/AArch64 AArch64InstrInfo.cpp AArch64RedundantCondBranchPass.cpp, llvm/test/CodeGen/AArch64 arm64-shrink-wrapping.ll pr164181.ll

Revert "[AArch64] Run optimizeTerminators earlier too." (#171505)

Reverts llvm/llvm-project#170907

Causes crashes, see
https://github.com/llvm/llvm-project/pull/170907#issuecomment-3634271414
Delta  File
+52 -22  llvm/test/CodeGen/AArch64/arm64-shrink-wrapping.ll
+37 -29  llvm/test/CodeGen/AArch64/pr164181.ll
+0 -47  llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+45 -1  llvm/lib/Target/AArch64/AArch64RedundantCondBranchPass.cpp
+24 -12  llvm/test/CodeGen/AArch64/block-placement-optimize-branches.ll
+16 -9  llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/aarch64_generated_funcs.ll.generated.expected
+174 -120  6 files not shown
+203 -148  12 files

LLVM/project 687986e llvm/docs CMake.rst

[Docs] Add documentation for LLVM_ENABLE_CURL (#170928)

Delta  File
+6 -0  llvm/docs/CMake.rst
+6 -0  1 files

LLVM/project 21147e7 mlir/include/mlir/Dialect/OpenACC OpenACCUtilsTiling.h, mlir/lib/Dialect/OpenACC/Utils OpenACCUtilsTiling.cpp CMakeLists.txt

[mlir][acc] Add loop tiling utilities for OpenACC (#171490)

Add utilities in OpenACCUtilsTiling.h/.cpp to support tiling
transformations on acc.loop operations:

- uncollapseLoops: Expand collapsed loops with multiple IVs into nested
loop structures when tile count exceeds collapse count
- tileACCLoops: Transform loop nests into tile and element loops based
on provided tile sizes, with automatic resolution of unknown tile sizes
(tile(*) represented as -1)

These utilities prepare for the ACCLoopTiling pass which handles the
OpenACC loop tile directive.
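
The tile/element split described above can be sketched on a plain scalar loop. This is not the MLIR implementation: the name `tiledIndices` is hypothetical, clamping the last partial tile is an assumption, and the resolution of `tile(*)` (-1) is not modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: visits [lb, ub) with unit step using an outer "tile"
// loop and an inner "element" loop, and records the visit order.
std::vector<int64_t> tiledIndices(int64_t lb, int64_t ub, int64_t tileSize) {
  // Assumption: an unknown tile size (-1) has been resolved before this point.
  assert(tileSize > 0 && "tile size must be resolved and positive");
  std::vector<int64_t> visited;
  for (int64_t tile = lb; tile < ub; tile += tileSize) { // tile loop
    int64_t tileEnd = std::min(tile + tileSize, ub);     // clamp the last tile
    for (int64_t i = tile; i < tileEnd; ++i)             // element loop
      visited.push_back(i);
  }
  return visited;
}
```

Calling `tiledIndices(0, 5, 2)` visits the same indices as the untiled loop, grouped into tiles of at most two elements.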

---------

Co-authored-by: Vijay Kandiah <vkandiah at nvidia.com>
Delta  File
+348 -0  mlir/unittests/Dialect/OpenACC/OpenACCUtilsTilingTest.cpp
+311 -0  mlir/lib/Dialect/OpenACC/Utils/OpenACCUtilsTiling.cpp
+83 -0  mlir/include/mlir/Dialect/OpenACC/OpenACCUtilsTiling.h
+3 -0  mlir/lib/Dialect/OpenACC/Utils/CMakeLists.txt
+1 -0  mlir/unittests/Dialect/OpenACC/CMakeLists.txt
+746 -0  5 files

LLVM/project 29760ce lldb/source/Interpreter CommandInterpreter.cpp, lldb/test/API/functionalities/abbreviation TestAbbreviations.py

[lldb] Fix capitalization in ambiguous command error (#171519)

We follow LLVM's style guide for diagnostics, which instructs to start
the first sentence with a lowercase letter, and finish the last sentence
without a period, if it would end in one otherwise.
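
As a rough illustration of the two rules quoted above, the sketch below normalizes a message string. `normalizeDiagnostic` is a hypothetical helper, not an LLDB or LLVM API, and it blindly lowercases the first character, which would be wrong for messages that start with an identifier or proper noun.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (not part of LLDB): lowercases the first character and
// drops a single trailing period, mirroring the style rules described above.
std::string normalizeDiagnostic(std::string msg) {
  if (!msg.empty())
    msg[0] = static_cast<char>(std::tolower(static_cast<unsigned char>(msg[0])));
  if (!msg.empty() && msg.back() == '.')
    msg.pop_back();
  return msg;
}
```

With this sketch, an illustrative input such as `"Ambiguous command."` becomes `"ambiguous command"`.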
Delta  File
+2 -4  lldb/source/Interpreter/CommandInterpreter.cpp
+1 -1  lldb/test/API/functionalities/abbreviation/TestAbbreviations.py
+1 -1  lldb/test/API/functionalities/ambigous_commands/TestAmbiguousCommands.py
+1 -1  lldb/test/API/functionalities/wrong_commands/TestWrongCommands.py
+5 -7  4 files

LLVM/project 2da1699 flang-rt/lib/runtime cudadevice.f90 __ppc_intrinsics.f90, flang/module cudadevice.f90 __ppc_intrinsics.f90

Merge branch 'main' into revert-170907-gh-a64-cbzwzrearly
Delta  File
+0 -2,242  flang-rt/lib/runtime/cudadevice.f90
+2,242 -0  flang/module/cudadevice.f90
+1,911 -0  flang/module/__ppc_intrinsics.f90
+0 -1,911  flang-rt/lib/runtime/__ppc_intrinsics.f90
+0 -1,122  flang-rt/lib/runtime/mma.f90
+1,122 -0  flang/module/mma.f90
+5,275 -5,275  153 files not shown
+9,009 -9,087  159 files

LLVM/project dda715d bolt/lib/Core BinaryContext.cpp, bolt/test dwarf5-missing-dwo.c

[BOLT][DWARF] Improve reporting on missing DWOs (#171506)

List all required missing DWO files and report a summary with
recommendations on how to proceed.
Delta  File
+20 -5  bolt/lib/Core/BinaryContext.cpp
+18 -0  bolt/test/dwarf5-missing-dwo.c
+38 -5  2 files

LLVM/project 782f507 lldb/source/Plugins/ObjectFile/wasm ObjectFileWasm.cpp ObjectFileWasm.h, lldb/test/Shell/Symtab symtab-wasm.test

[lldb][Wasm] Handle imports when parsing Wasm name sections (#170960)

LLDB can use the wasm name section to populate its symbol table and get
names for functions. However the index space used in the name section is
the "function index space" which includes imported as well as locally
defined functions.
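
The index arithmetic this implies can be sketched as follows. In the WebAssembly function index space, imported functions come first and locally defined functions follow. `toDefinedFunctionIndex` is a hypothetical helper for illustration and is not taken from ObjectFileWasm.cpp.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper: maps an index from the function index space to an
// index into the locally defined functions. Imports occupy
// [0, numImportedFunctions), so they have no local index.
std::optional<uint32_t> toDefinedFunctionIndex(uint32_t functionIndex,
                                               uint32_t numImportedFunctions) {
  if (functionIndex < numImportedFunctions)
    return std::nullopt; // refers to an imported function
  return functionIndex - numImportedFunctions;
}
```

A name-section entry with index 2 in a module with two imported functions would therefore name the first locally defined function.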
Delta  File
+74 -10  lldb/source/Plugins/ObjectFile/wasm/ObjectFileWasm.cpp
+47 -27  lldb/test/Shell/Symtab/Inputs/simple.wasm.yaml
+6 -4  lldb/test/Shell/Symtab/symtab-wasm.test
+1 -0  lldb/source/Plugins/ObjectFile/wasm/ObjectFileWasm.h
+128 -41  4 files

LLVM/project 93404e0 mlir/lib/Conversion/XeGPUToXeVM XeGPUToXeVM.cpp, mlir/test/Conversion/XeGPUToXeVM loadstore_matrix.mlir loadstoreprefetch.mlir

add more tests
Delta  File
+159 -90  mlir/test/Conversion/XeGPUToXeVM/loadstore_matrix.mlir
+25 -0  mlir/test/Conversion/XeGPUToXeVM/loadstoreprefetch.mlir
+1 -1  mlir/lib/Conversion/XeGPUToXeVM/XeGPUToXeVM.cpp
+185 -91  3 files

LLVM/project a9bcedf llvm/lib/Target/SPIRV SPIRVLegalizerInfo.cpp SPIRVPreLegalizer.cpp

[NFC][SPIRV] Fix breakage introduced by #170798 (#171513)

Adding support for i128 missed a few quirks of legalisation, which were
masked previously by early erroring out on bitwidth > 64. i128 uses
should be legal, we decide whether or not the resulting module is viable
(i.e. if the required extensions are present) in the ModuleAnalysis
pass.
Delta  File
+9 -8  llvm/lib/Target/SPIRV/SPIRVLegalizerInfo.cpp
+2 -2  llvm/lib/Target/SPIRV/SPIRVPreLegalizer.cpp
+11 -10  2 files

LLVM/project 7062cd6 clang/lib/Driver/ToolChains Clang.cpp, clang/lib/Frontend CompilerInvocation.cpp

[Matrix] Add a row\col major toggle in the clang driver (#167628)

fixes #167621

- define the new options in `Options.td` limit the naming to row-major
or column-major.
- In `ToolChains/Clang.cpp` limit the opt usage to only when
`-fenable-matrix` is used.
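
For readers unfamiliar with the two layouts being toggled, the sketch below shows how the flat offset of an element differs between them. The enum and function names are hypothetical and unrelated to the option names defined in `Options.td`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names for illustration only.
enum class MatrixLayout { RowMajor, ColumnMajor };

// Flat offset of element (row, col) in a rows x cols matrix.
constexpr std::size_t flatIndex(MatrixLayout layout, std::size_t row,
                                std::size_t col, std::size_t rows,
                                std::size_t cols) {
  return layout == MatrixLayout::RowMajor ? row * cols + col  // rows contiguous
                                          : col * rows + row; // columns contiguous
}
```

For a 2 x 3 matrix, element (1, 0) sits at offset 3 in row-major order and at offset 1 in column-major order.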

---------

Co-authored-by: Florian Hahn <flo at fhahn.com>
Delta  File
+49 -0  clang/test/Driver/fmatrix-memory-layout.c
+36 -0  clang/unittests/Frontend/CompilerInvocationTest.cpp
+30 -0  clang/lib/Frontend/CompilerInvocation.cpp
+19 -0  clang/lib/Sema/SemaChecking.cpp
+13 -0  clang/test/Sema/matrix-col-major-builtin-disable.c
+12 -0  clang/lib/Driver/ToolChains/Clang.cpp
+159 -0  8 files not shown
+190 -0  14 files

LLVM/project 2765113 llvm/lib/Target/AMDGPU BUFInstructions.td, llvm/test/Analysis/UniformityAnalysis/AMDGPU atomics.ll

AMDGPU: Drop and upgrade llvm.amdgcn.atomic.csub/cond.sub to atomicrmw (#105553)

These both perform conditional subtraction, returning the minuend and
zero respectively, if the difference is negative.
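
The two behaviors can be modeled on plain integers as below. This is a non-atomic sketch that follows the `usub_cond` and `usub_sat` `atomicrmw` operations as described in the LLVM Language Reference; the function names are hypothetical, and which intrinsic is upgraded to which operation is not stated here and not assumed.

```cpp
#include <cassert>
#include <cstdint>

// Non-atomic models of the value stored back to memory.

// Subtract only when the unsigned difference does not wrap; otherwise keep
// the old value (the minuend).
constexpr uint32_t usubCond(uint32_t oldValue, uint32_t operand) {
  return oldValue >= operand ? oldValue - operand : oldValue;
}

// Subtract with unsigned clamping: store zero when the difference would wrap.
constexpr uint32_t usubSat(uint32_t oldValue, uint32_t operand) {
  return oldValue >= operand ? oldValue - operand : 0u;
}
```

Both functions agree when no wrap occurs and differ only in what is stored when the operand exceeds the old value.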
Delta  File
+169 -125  llvm/test/CodeGen/AMDGPU/atomics_cond_sub.ll
+0 -270  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.global.atomic.csub.ll
+0 -219  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.atomic.cond.sub.ll
+147 -0  llvm/test/Bitcode/amdgcn-atomic.ll
+0 -57  llvm/test/Analysis/UniformityAnalysis/AMDGPU/atomics.ll
+18 -36  llvm/lib/Target/AMDGPU/BUFInstructions.td
+334 -707  10 files not shown
+346 -762  16 files

LLVM/project 4af81d6 llvm/lib/Transforms/IPO GlobalOpt.cpp, llvm/test/Transforms/GlobalOpt disable-globals-aa.ll

[GlobalOpt][profcheck] Mark as `unknown` the branch weights of global shrunk to boolean
Delta  File
+24 -4  llvm/test/Transforms/GlobalOpt/disable-globals-aa.ll
+0 -6  llvm/utils/profcheck-xfail.txt
+4 -1  llvm/lib/Transforms/IPO/GlobalOpt.cpp
+28 -11  3 files

LLVM/project 0a2e56d clang/include/clang/CIR/Dialect/Builder CIRBaseBuilder.h, clang/include/clang/CIR/Dialect/IR CIROps.td

[CIR] Add support for thread-local storage (TLS) (#168662)

This commit adds full support for thread-local storage variables in
ClangIR, including code generation, lowering to LLVM IR, and
comprehensive testing.

Changes include:
- Added CIR_TLSModel enum with 4 TLS models (GeneralDynamic,
LocalDynamic, InitialExec, LocalExec) to CIROps.td
- Extended GlobalOp with optional tls_model attribute
- Extended GetGlobalOp with thread_local unit attribute
- Added verification to ensure thread_local GetGlobalOp references
globals with tls_model set
- Implemented GetDefaultCIRTLSModel() and setTLSMode() in CIRGenModule
- Updated getAddrOfGlobalVar() to handle TLS access
- Removed MissingFeatures assertions for TLS operations
- Added lowering of GetGlobalOp with TLS to llvm.threadlocal.address
intrinsic
- Added lowering of GlobalOp with tls_model to LLVM thread_local globals

    [6 lines not shown]
Delta  File
+34 -5  clang/lib/CIR/CodeGen/CIRGenModule.cpp
+29 -0  clang/test/CIR/CodeGen/tls.c
+16 -1  clang/include/clang/CIR/Dialect/IR/CIROps.td
+13 -0  clang/test/CIR/IR/invalid-tls.cir
+7 -5  clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h
+7 -5  clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp
+106 -16  4 files not shown
+121 -21  10 files

LLVM/project 84b9e44 clang/lib/CIR/CodeGen CIRGenFunction.cpp CIRGenFunction.h, clang/test/CIR/CodeGen kr-func-promote.c

[CIR] Add Function Argument Demotion support (#170915)

This PR migrates the Function Argument Demotion feature from the
incubator repository to upstream. The feature handles K&R-style function
parameters that are promoted (e.g., short->int, float->double) and
demotes them back to their declared types.
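
The promote-then-demote round trip can be modeled in C++ as below. K&R-style definitions are a C feature, so this sketch only imitates them: each function takes the promoted type a caller would pass and converts it back to the declared parameter type, which is the step the prolog is described as performing. The function names are hypothetical.

```cpp
#include <cassert>

// Models a K&R-style callee whose parameter is declared `short`: the caller
// passes the promoted `int`, and the prolog demotes it to the declared type.
short calleeWithShortParam(int promotedArg) {
  short declared = static_cast<short>(promotedArg); // int -> short demotion
  return declared;
}

// Same idea for a parameter declared `float`, which arrives as `double`.
float calleeWithFloatParam(double promotedArg) {
  return static_cast<float>(promotedArg); // double -> float demotion
}
```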

## Changes
- Add emitArgumentDemotion helper function for type demotion
- Create emitFunctionProlog function to handle function prologue setup
(addresses existing TODO to move parameter handling logic)
- Move parameter handling logic into emitFunctionProlog
- Add test case kr-func-promote.c to verify the feature

Tested: All CIR tests pass (320/321, 99.69%). The one unsupported test
is an expected failure.
Delta  File
+72 -20  clang/lib/CIR/CodeGen/CIRGenFunction.cpp
+42 -0  clang/test/CIR/CodeGen/kr-func-promote.c
+5 -0  clang/lib/CIR/CodeGen/CIRGenFunction.h
+119 -20  3 files

LLVM/project fb29a6e llvm/lib/Target/RISCV RISCVInstrInfoXSf.td

[RISCV] Fix formatting in RISCVInstrInfoXSf.td. NFC (#171500)

Delta  File
+1 -1  llvm/lib/Target/RISCV/RISCVInstrInfoXSf.td
+1 -1  1 files

LLVM/project 794218b clang/lib/ExtractAPI DeclarationFragments.cpp, clang/test/ExtractAPI typedef.c

[ExtractAPI] Format typedef params correctly (#171516)

Typically, pointer types are formatted in a way where the identifier
comes right after the type definition without a space separating them,
e.g. `int *foo`, where the type is `int *` and the identifier is `foo`.
However, if a type alias to a pointer type is used, the emitted
declaration fragments are incorrect due to the missing space between the
type and identifier, like in the below example:

```
typedef int *T;
// The declaration fragment contains `Tbar` instead of `T bar`
void foo(T bar);
```

This patch checks if pointer types are aliased, and inserts the space
correctly if so.
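
The spacing rule can be illustrated with a small string helper. This is only a sketch: `joinTypeAndName` is hypothetical and inspects the type's spelling, whereas the patch is described as checking whether the pointer type is aliased.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not the ExtractAPI implementation): joins a type
// spelling and an identifier, omitting the space only when the spelling
// itself ends in a pointer or reference sigil.
std::string joinTypeAndName(const std::string &typeSpelling,
                            const std::string &name) {
  bool endsWithSigil =
      !typeSpelling.empty() &&
      (typeSpelling.back() == '*' || typeSpelling.back() == '&');
  return typeSpelling + (endsWithSigil ? "" : " ") + name;
}
```

Under this rule `int *` and `foo` join as `int *foo`, while the alias `T` and `bar` join as `T bar`.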

rdar://132022003
Delta  File
+80 -1  clang/test/ExtractAPI/typedef.c
+4 -1  clang/lib/ExtractAPI/DeclarationFragments.cpp
+84 -2  2 files

LLVM/project 9e2b8b0 lldb/include/lldb/Interpreter CommandReturnObject.h, lldb/source/Commands CommandObjectMultiword.cpp

[lldb] Remove CommandReturnObject::AppendRawError (#171517)

Remove `CommandReturnObject::AppendRawError` and replace its two uses
with `AppendError`, which correctly prefixes the message with `error:`.
The comment for the method is outdated and the prefixing is clearly
desired in both situations.
Delta  File
+0 -9  lldb/source/Interpreter/CommandReturnObject.cpp
+2 -2  lldb/test/Shell/Commands/command-wrong-subcommand-error-msg.test
+3 -1  lldb/test/API/functionalities/wrong_commands/TestWrongCommands.py
+1 -1  lldb/source/Interpreter/CommandInterpreter.cpp
+0 -2  lldb/include/lldb/Interpreter/CommandReturnObject.h
+1 -1  lldb/source/Commands/CommandObjectMultiword.cpp
+7 -16  2 files not shown
+9 -18  8 files

LLVM/project 2e4e026 mlir/lib/Conversion/ArithAndMathToAPFloat ArithToAPFloat.cpp MathToAPFloat.cpp, mlir/lib/Conversion/ArithToAPFloat ArithToAPFloat.cpp

[mlir][math] Add FP software implementation lowering pass: math-to-apfloat
Delta  File
+0 -665  mlir/lib/Conversion/ArithToAPFloat/ArithToAPFloat.cpp
+623 -0  mlir/lib/Conversion/ArithAndMathToAPFloat/ArithToAPFloat.cpp
+96 -0  mlir/lib/Conversion/ArithAndMathToAPFloat/MathToAPFloat.cpp
+49 -0  mlir/lib/Conversion/ArithAndMathToAPFloat/CMakeLists.txt
+39 -0  mlir/lib/Dialect/Func/Utils/Utils.cpp
+32 -0  mlir/test/Integration/Dialect/Math/CPU/test-apfloat-emulation.mlir
+839 -665  10 files not shown
+952 -691  16 files

LLVM/project 8eab1fd mlir/include/mlir/Conversion/MathToAPFloat MathToAPFloat.h, mlir/lib/Conversion/ArithAndMathToAPFloat ArithToAPFloat.cpp MathToAPFloat.cpp

[mlir][math] Add FP software implementation lowering pass: math-to-apfloat
Delta  File
+0 -665  mlir/lib/Conversion/ArithToAPFloat/ArithToAPFloat.cpp
+623 -0  mlir/lib/Conversion/ArithAndMathToAPFloat/ArithToAPFloat.cpp
+96 -0  mlir/lib/Conversion/ArithAndMathToAPFloat/MathToAPFloat.cpp
+49 -0  mlir/lib/Conversion/ArithAndMathToAPFloat/CMakeLists.txt
+39 -0  mlir/lib/Dialect/Func/Utils/Utils.cpp
+21 -0  mlir/include/mlir/Conversion/MathToAPFloat/MathToAPFloat.h
+828 -665  7 files not shown
+877 -691  13 files

LLVM/project 488ebbc clang/lib/Driver/ToolChains Clang.cpp, clang/test/Driver clang_f_opts.c

update driver behavior and a test
Delta  File
+8 -0  clang/test/Driver/clang_f_opts.c
+2 -1  clang/lib/Driver/ToolChains/Clang.cpp
+10 -1  2 files

LLVM/project b0028ca mlir/include/mlir/Conversion/MathToAPFloat MathToAPFloat.h, mlir/lib/Conversion/ArithAndMathToAPFloat ArithToAPFloat.cpp MathToAPFloat.cpp

[mlir][math] Add FP software implementation lowering pass: math-to-apfloat
Delta  File
+0 -665  mlir/lib/Conversion/ArithToAPFloat/ArithToAPFloat.cpp
+623 -0  mlir/lib/Conversion/ArithAndMathToAPFloat/ArithToAPFloat.cpp
+96 -0  mlir/lib/Conversion/ArithAndMathToAPFloat/MathToAPFloat.cpp
+49 -0  mlir/lib/Conversion/ArithAndMathToAPFloat/CMakeLists.txt
+39 -0  mlir/lib/Dialect/Func/Utils/Utils.cpp
+21 -0  mlir/include/mlir/Conversion/MathToAPFloat/MathToAPFloat.h
+828 -665  7 files not shown
+877 -691  13 files

LLVM/project 53fadc2 flang-rt/lib/runtime cudadevice.f90 __ppc_intrinsics.f90, flang/module cudadevice.f90 __ppc_intrinsics.f90

Reapply "[Flang] Move builtin .mod generation into runtimes (Reapply #137828)"

This reverts commit d233e787f0adfa2acd2e6c67aa2c362f9cf47ab4.
Delta  File
+0 -2,242  flang/module/cudadevice.f90
+2,242 -0  flang-rt/lib/runtime/cudadevice.f90
+1,911 -0  flang-rt/lib/runtime/__ppc_intrinsics.f90
+0 -1,911  flang/module/__ppc_intrinsics.f90
+1,122 -0  flang-rt/lib/runtime/mma.f90
+0 -1,122  flang/module/mma.f90
+5,275 -5,275  82 files not shown
+8,051 -7,769  88 files

LLVM/project c8b6bb7 flang-rt/lib/runtime cudadevice.f90 __ppc_intrinsics.f90, flang/module cudadevice.f90 __ppc_intrinsics.f90

Revert "[Flang] Move builtin .mod generation into runtimes (Reapply #137828) …"

This reverts commit 7675fc79c802cf7f6a95660f6ee59bf6cb62102f.
Delta  File
+2,242 -0  flang/module/cudadevice.f90
+0 -2,242  flang-rt/lib/runtime/cudadevice.f90
+1,911 -0  flang/module/__ppc_intrinsics.f90
+0 -1,911  flang-rt/lib/runtime/__ppc_intrinsics.f90
+1,122 -0  flang/module/mma.f90
+0 -1,122  flang-rt/lib/runtime/mma.f90
+5,275 -5,275  82 files not shown
+7,769 -8,051  88 files

LLVM/project d233e78 flang-rt/lib/runtime cudadevice.f90 __ppc_intrinsics.f90, flang/module cudadevice.f90 __ppc_intrinsics.f90

Revert "[Flang] Move builtin .mod generation into runtimes (Reapply #137828) (#169638)"

This reverts commit 7675fc79c802cf7f6a95660f6ee59bf6cb62102f.

Requested in PR:
https://github.com/llvm/llvm-project/pull/169638#issuecomment-3634227707
Delta  File
+2,242 -0  flang/module/cudadevice.f90
+0 -2,242  flang-rt/lib/runtime/cudadevice.f90
+1,911 -0  flang/module/__ppc_intrinsics.f90
+0 -1,911  flang-rt/lib/runtime/__ppc_intrinsics.f90
+1,122 -0  flang/module/mma.f90
+0 -1,122  flang-rt/lib/runtime/mma.f90
+5,275 -5,275  82 files not shown
+7,769 -8,051  88 files

LLVM/project 3310c0b llvm/lib/Transforms/Vectorize VPlan.h VPlanRecipes.cpp

[VPlan] Strip TODO to consolidate (ActiveLaneMask|Widen)PHI (#171392)

They cannot be consolidated, as WidenPHI is not a header PHI, while
ActiveLaneMaskPHI is.
Delta  File
+0 -2  llvm/lib/Transforms/Vectorize/VPlan.h
+0 -2  llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+0 -4  2 files

LLVM/project d86bc19 clang-tools-extra/clang-doc JSONGenerator.cpp, clang-tools-extra/test/clang-doc basic-project.mustache.test

[clang-doc] Do not serialize empty text comments (#169087)

Delta  File
+0 -97  clang-tools-extra/test/clang-doc/basic-project.mustache.test
+40 -3  clang-tools-extra/clang-doc/JSONGenerator.cpp
+0 -3  clang-tools-extra/test/clang-doc/json/class.cpp
+40 -103  3 files

LLVM/project f29d060 llvm/lib/Transforms/Vectorize LoopVectorize.cpp, llvm/test/Transforms/LoopVectorize runtime-check-threshold-with-force-metadata.ll

Revert "[LV] Mark checks as never succeeding for high cost cutoff."

This reverts commit 8a115b6934a90441d77ea54af73e7aaaa1394b38.

This broke premerge. https://lab.llvm.org/staging/#/builders/192/builds/13326

/home/gha/llvm-project/clang/test/Frontend/optimization-remark-options.c:10:11: remark: loop not vectorized: cannot prove it is safe to reorder floating-point operations; allow reordering by specifying '#pragma clang loop vectorize(enable)' before the loop or by providing the compiler option '-ffast-math'
Delta  File
+25 -14  llvm/test/Transforms/LoopVectorize/runtime-check-threshold-with-force-metadata.ll
+1 -5  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+26 -19  2 files

LLVM/project 64ee4bf compiler-rt/lib/scudo/standalone tsd_shared.h tsd_exclusive.h, compiler-rt/lib/scudo/standalone/tests combined_test.cpp

[scudo] Refactor initialization of TSDs. (#169738)

Instead of getting a lock and then checking/modifying the Initialization
variable, make it an atomic. Doing this, we can remove one of the
mutexes in shared TSDs and avoid any potential lock contention in both
shared TSDs and exclusive TSDs if multiple threads do allocation
operations at the same time.
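
The general pattern of replacing a lock-protected flag with an atomic one can be sketched as below. This is not scudo code: `LazyRegistry` and its members are hypothetical, and the sketch keeps a mutex on the slow path only, which may differ from how the patch arranges its remaining locks.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a lazily initialized registry.
class LazyRegistry {
public:
  void ensureInitialized() {
    // Fast path: once initialization is published, no lock is taken.
    if (initialized.load(std::memory_order_acquire))
      return;
    std::lock_guard<std::mutex> guard(initMutex);
    // Re-check under the lock in case another thread finished first.
    if (initialized.load(std::memory_order_relaxed))
      return;
    ++initCount; // stands in for the real setup work
    initialized.store(true, std::memory_order_release);
  }

  int timesInitialized() const { return initCount; }

private:
  std::atomic<bool> initialized{false};
  std::mutex initMutex;
  int initCount = 0;
};
```

Repeated calls to `ensureInitialized()` run the setup once, and callers arriving after initialization only pay for an atomic load.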

Add two new tests that make sure no crashes occur if multiple threads
try and do allocations at the same time.
Delta  File
+88 -0  compiler-rt/lib/scudo/standalone/tests/combined_test.cpp
+28 -22  compiler-rt/lib/scudo/standalone/tsd_shared.h
+11 -8  compiler-rt/lib/scudo/standalone/tsd_exclusive.h
+127 -30  3 files

LLVM/project 1c199c0 llvm/test/CodeGen/AMDGPU maximumnum.bf16.ll minimumnum.bf16.ll, llvm/test/CodeGen/X86 wide-scalar-shift-by-byte-multiple-legalization.ll bitcnt-big-integer.ll

merge with main
Delta  File
+17,522 -20,773  llvm/test/CodeGen/X86/wide-scalar-shift-by-byte-multiple-legalization.ll
+8,857 -10,952  llvm/test/CodeGen/AMDGPU/maximumnum.bf16.ll
+8,840 -10,957  llvm/test/CodeGen/AMDGPU/minimumnum.bf16.ll
+4,091 -0  llvm/test/CodeGen/AMDGPU/atomicrmw_usub_sat.ll
+3,175 -413  llvm/test/CodeGen/X86/bitcnt-big-integer.ll
+1,541 -1,541  llvm/test/tools/llvm-mca/RISCV/SpacemitX60/vlseg-vsseg.s
+44,026 -44,636  2,732 files not shown
+160,939 -117,343  2,738 files

LLVM/project e3ff827 compiler-rt/lib/sanitizer_common sanitizer_symbolizer_posix_libcdep.cpp, mlir/lib/Bytecode/Reader BytecodeReader.cpp

Merge branch 'main' into revert-170907-gh-a64-cbzwzrearly
Delta  File
+4 -3  compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cpp
+3 -2  mlir/lib/Bytecode/Reader/BytecodeReader.cpp
+7 -5  2 files