LLVM/project 4911812llvm/lib/Target/AMDGPU AMDGPUMCInstLower.cpp, llvm/lib/Target/AMDGPU/Utils AMDGPUBaseInfo.cpp AMDGPUBaseInfo.h

[AMDGPU] Add asm comments if setreg changes MSBs
DeltaFile
+45-0llvm/test/CodeGen/AMDGPU/vgpr-setreg-mode-swar.mir
+19-0llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+12-5llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp
+7-0llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
+2-1llvm/test/CodeGen/AMDGPU/code-size-estimate.ll
+85-65 files

LLVM/project b02ef5aclang/include/clang/Analysis/Scalable/Serialization SerializationFormat.h, clang/lib/Analysis/Scalable/Serialization/JSONFormat JSONFormatImpl.cpp

[clang][ssaf] Add ssaf-format to validate and convert summaries

This PR introduces the `ssaf-format` command-line tool, which validates
and converts translation-unit (TU) and link-unit (LU) summaries between
registered serialization formats in the SSAF framework. After the
serialization format registry and the JSON format were introduced, there
was no standalone tool to inspect, validate, or convert summary files
outside of a full compilation pipeline. `ssaf-format` fills that gap: it
serves as both a format validator (read without writing) and a format
converter (read then write to a different format or path).
DeltaFile
+483-0clang/tools/ssaf-format/SSAFFormat.cpp
+14-0clang/tools/ssaf-format/CMakeLists.txt
+9-0clang/test/Analysis/Scalable/ssaf-format/list.test
+7-0clang/include/clang/Analysis/Scalable/Serialization/SerializationFormat.h
+6-0clang/unittests/Analysis/Scalable/Registries/MockSerializationFormat.cpp
+6-0clang/lib/Analysis/Scalable/Serialization/JSONFormat/JSONFormatImpl.cpp
+525-05 files not shown
+537-011 files

LLVM/project f35042aflang/lib/Optimizer/OpenACC/Support FIROpenACCOpsInterfaces.cpp RegisterOpenACCExtensions.cpp, flang/test/Transforms/OpenACC offload-target-verifier.fir

[flang][openacc] Attach IndirectGlobalAccessModel to fir.use_stmt (#185767)

In some cases, a `fir.use_stmt` operation can end up in an offload
region, for example in an acc routine. Make sure we can validate the
symbols associated with the `fir.use_stmt` operation.
DeltaFile
+34-0flang/test/Transforms/OpenACC/offload-target-verifier.fir
+17-0flang/lib/Optimizer/OpenACC/Support/FIROpenACCOpsInterfaces.cpp
+2-0flang/lib/Optimizer/OpenACC/Support/RegisterOpenACCExtensions.cpp
+53-03 files

LLVM/project 8d0c686clang/lib/Headers/hlsl hlsl_alias_intrinsics.h, clang/lib/Sema SemaHLSL.cpp

[HLSL][DXIL][SPIRV] Added WaveActiveBitOr HLSL intrinsic (#165156)

Adds the WaveActiveBitOr intrinsic from issue #99167. This intrinsic
required a bit more work than the last intrinsics that I have done.

There are some peculiarities, which I verified with dxcompiler:
- WaveActiveBitOr only works on uint and uint64_t, no other types are
allowed
- There is no 16 bit version of WaveActiveBitOr

Followed the checklist:
- [x] Implement WaveActiveBitOr clang builtin,
- [x] Link WaveActiveBitOr clang builtin with hlsl_intrinsics.h
- [x] Add sema checks for WaveActiveBitOr to
CheckHLSLBuiltinFunctionCall in SemaChecking.cpp
- [x] Add codegen for WaveActiveBitOr to EmitHLSLBuiltinExpr in
CGBuiltin.cpp
- [x] Add codegen tests to
clang/test/CodeGenHLSL/builtins/WaveActiveBitOr.hlsl

    [15 lines not shown]
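
As a rough model of the semantics, the sketch below ORs a per-lane value across the active lanes of a wave. The function name `wave_active_bit_or`, the list-of-lanes representation, and the default 32-bit width are illustrative assumptions; the real intrinsic is lowered by the compiler and is not implemented this way.

```python
from functools import reduce
from operator import or_


def wave_active_bit_or(values, active, bits=32):
    """Bitwise OR of values[i] over every lane i where active[i] is true.

    `values` holds one unsigned integer per lane; inactive lanes are ignored.
    Returning 0 when no lane is active is an assumption of this sketch.
    """
    mask = (1 << bits) - 1
    contributing = (v & mask for v, a in zip(values, active) if a)
    return reduce(or_, contributing, 0)
```

Passing `bits=64` models the `uint64_t` overload; there is no 16-bit case to model, matching the restriction noted above.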
DeltaFile
+82-0clang/test/CodeGenHLSL/builtins/WaveActiveBitOr.hlsl
+34-0clang/lib/Headers/hlsl/hlsl_alias_intrinsics.h
+32-0llvm/test/CodeGen/SPIRV/hlsl-intrinsics/WaveActiveBitOr.ll
+27-0clang/lib/Sema/SemaHLSL.cpp
+23-0clang/test/SemaHLSL/BuiltIns/WaveActiveBitOr-errors.hlsl
+22-0llvm/lib/Target/DirectX/DXIL.td
+220-010 files not shown
+284-216 files

LLVM/project 12fcde1clang/lib/CodeGen CGExpr.cpp, clang/test/CodeGenHLSL/resources CBufferMatrixSingleSubscriptSwizzle.hlsl

[Matrix] Copy Row data from padded cbuffer offsets before swizzle (#185346)

fixes https://github.com/llvm/llvm-project/issues/184849

The fix is to copy the data before a swizzle can happen.
DeltaFile
+28-0clang/test/CodeGenHLSL/resources/CBufferMatrixSingleSubscriptSwizzle.hlsl
+10-3clang/lib/CodeGen/CGExpr.cpp
+38-32 files

LLVM/project 9d65b65clang/lib/Headers/hlsl hlsl_intrinsics.h hlsl_alias_intrinsics.h, clang/test/CodeGenHLSL/builtins mul.hlsl

[HLSL][Matrix] Add `half` type overloads to `mul` and exercise them (#185506)

PR #184882 was missing `half` type-specific overloads for `mul`. 
This PR introduces `half` type-specific overloads for `mul` and
additional codegen tests for the half type.
Also added f16 tests for the lowering of llvm.matrix.multiply.

The offload test suite already has a `mul.fp16` test for exercising half
types at runtime, so no change is needed there.

Assisted-by: claude-opus-4.6
DeltaFile
+79-0llvm/test/CodeGen/DirectX/matrix-multiply.ll
+58-2clang/test/CodeGenHLSL/builtins/mul.hlsl
+41-2clang/lib/Headers/hlsl/hlsl_intrinsics.h
+15-0clang/lib/Headers/hlsl/hlsl_alias_intrinsics.h
+1-1clang/test/SemaHLSL/BuiltIns/mul-errors.hlsl
+194-55 files

LLVM/project 048106bclang/lib/Headers/hlsl hlsl_alias_intrinsics.h

[HLSL] Fix intrinsics header file 16 bit attribute macro to use version 6.2 (#185757)

A couple of builtins declared in a header file specify 16-bit
availability for shader model 6.0.
This is incorrect; it should be 6.2.
This bug was propagated to many of the waveops and should be
corrected.

Fixes https://github.com/llvm/llvm-project/issues/185756
DeltaFile
+32-32clang/lib/Headers/hlsl/hlsl_alias_intrinsics.h
+32-321 files

LLVM/project a4244bcllvm/test/CodeGen/X86 sdiv_fix_sat.ll scmp.ll

[LegalizeTypes] Emit FSHL/FSHR from ExpandShiftByConstant when Legal. (#180888)

This avoids needing to combine the SHL/SHR/OR pattern later.
    
This improves code quality on RISC-V where our slx/srx instructions
clobber the destination register but we don't have an immediate form.
We can't recover the original direction from the SHL/SHR/OR pattern
and we can't commute it during the TwoAddressInstruction pass like X86
due to the shift amount being in a register.
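
To illustrate why the SHL/SHR/OR pattern and a funnel shift are interchangeable, here is a minimal Python sketch of expanding a double-width left shift by a constant into two halves. The helper names (`fshl`, `expand_shl_by_constant`) and the 32-bit half width are assumptions for illustration; this models the arithmetic only, not the actual `ExpandShiftByConstant` code.

```python
BITS = 32
MASK = (1 << BITS) - 1


def fshl(hi, lo, amt):
    """Funnel shift left: concatenate hi:lo, shift left by amt, keep the top half."""
    amt %= BITS
    wide = ((hi & MASK) << BITS) | (lo & MASK)
    return ((wide << amt) >> BITS) & MASK


def expand_shl_by_constant(hi, lo, amt):
    """Shift the double-width value hi:lo left by a constant 0 < amt < BITS.

    Returns (new_hi, new_lo). The new high half is written as the
    SHL/SHR/OR pattern that a single funnel shift can replace.
    """
    assert 0 < amt < BITS
    new_lo = (lo << amt) & MASK
    new_hi = ((hi << amt) & MASK) | ((lo & MASK) >> (BITS - amt))
    return new_hi, new_lo
```

For any in-range constant, `new_hi` equals `fshl(hi, lo, amt)`, which is why the funnel shift can be emitted directly when it is legal for the target.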
DeltaFile
+198-202llvm/test/CodeGen/X86/sdiv_fix_sat.ll
+200-185llvm/test/CodeGen/X86/scmp.ll
+172-173llvm/test/CodeGen/X86/pr43820.ll
+55-53llvm/test/CodeGen/X86/fold-tied-op.ll
+54-52llvm/test/CodeGen/X86/shift-i256.ll
+44-62llvm/test/CodeGen/X86/vector-sext.ll
+723-72723 files not shown
+1,094-1,10429 files

LLVM/project ffdf216clang/docs ReleaseNotes.rst, clang/include/clang/Analysis/Analyses UnsafeBufferUsage.h

Revert "[Clang][UnsafeBufferUsage] Warn about two-arg string_view constructors. (#180471)" (#185692)

This reverts commit 75b2ea57d5f4a5ae0de1b3ca1ca7eec464811b45.
Makes clang assert, see:
https://github.com/llvm/llvm-project/pull/180471#issuecomment-4033081814
DeltaFile
+0-131clang/lib/Analysis/UnsafeBufferUsage.cpp
+0-44clang/test/SemaCXX/warn-unsafe-buffer-usage-string-view.cpp
+3-33clang/lib/Sema/AnalysisBasedWarnings.cpp
+1-5clang/include/clang/Basic/DiagnosticSemaKinds.td
+0-4clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h
+0-3clang/docs/ReleaseNotes.rst
+4-2202 files not shown
+4-2238 files

LLVM/project a585f45lldb/test/API/functionalities/data-formatter/data-formatter-objc TestDataFormatterObjCNSDate.py

[lldb] Make date test handle host-target time difference (#185759)

It seems there may be a formatter bug when there's a time zone
difference between the target machine being debugged, and the host the
debugger is running on.
DeltaFile
+2-1lldb/test/API/functionalities/data-formatter/data-formatter-objc/TestDataFormatterObjCNSDate.py
+2-11 files

LLVM/project 687e66cclang/tools/libclang CMakeLists.txt, lldb/source/API CMakeLists.txt

build: adjust LLDB and clang library naming on Windows (#185084)

Ensure that use of the GNU driver does not change the library name on
Windows. Previously, we checked whether the build tools were MSVC,
rather than whether the target was Windows, to select the output name.
DeltaFile
+1-1clang/tools/libclang/CMakeLists.txt
+1-1lldb/source/API/CMakeLists.txt
+2-22 files

LLVM/project 67094a4llvm/lib/Target/X86 X86ISelLowering.cpp X86SelectionDAGInfo.cpp, llvm/test/CodeGen/X86 vector-half-conversions.ll

[X86] Fix assertion when lowering FP_ROUND (#185562)

443ce5569ee9854cfef1139cf6b9cf05165e0902 caused us to start hitting
assertions with non-standard vector widths (<3 x float>) in this case
now that node types are actually enforced. There was a place in
X86ISelLowering.cpp where we just passed along a 64-bit integer whereas
other places constructing a CVTPS2PH node specifically construct a new
integer.
DeltaFile
+34-0llvm/test/CodeGen/X86/vector-half-conversions.ll
+8-4llvm/lib/Target/X86/X86ISelLowering.cpp
+0-2llvm/lib/Target/X86/X86SelectionDAGInfo.cpp
+42-63 files

LLVM/project 08cef69lldb/unittests/DAP TestUtilities.h VariablesTest.cpp, lldb/unittests/TestingSupport TestUtilities.cpp TestUtilities.h

[lldb] Consolidating platform support checks in tests. (#184656)

Moving the platform support check into
`lldb/unittests/TestingSupport/TestUtilities.h` so it can be reused
across tests.

Also skipping 'VariablesTest' cases that load a core dump if the
platform is not supported.
DeltaFile
+69-0lldb/unittests/DAP/TestUtilities.h
+17-42lldb/unittests/DAP/VariablesTest.cpp
+3-56lldb/unittests/DAP/TestBase.h
+52-1lldb/unittests/TestingSupport/TestUtilities.cpp
+8-40lldb/unittests/DAP/TestBase.cpp
+15-0lldb/unittests/TestingSupport/TestUtilities.h
+164-1392 files not shown
+171-1468 files

LLVM/project 639f47cllvm/lib/Target/RISCV RISCVInstrInfo.cpp

[RISCV] Use RISCVCC::getInverseBranchOpcode in RISCVInstrInfo::reverseBranchCondition. NFC (#185752)
DeltaFile
+18-82llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+18-821 files

LLVM/project 6dbcbccllvm/cmake/modules HandleLLVMOptions.cmake

[SystemZ] Disable PCH on z/OS. (#185750)

The compiler supports PCH in principle, but there are occasionally
hiccups. Therefore it is better to disable PCH by default.
DeltaFile
+7-0llvm/cmake/modules/HandleLLVMOptions.cmake
+7-01 files

LLVM/project 32c59aeclang/include/clang/CIR MissingFeatures.h, clang/lib/CIR/CodeGen CIRGenTypes.cpp

[CIR] Remove diagnostic when handling incomplete record types (#185715)

Following the example of classic codegen, we were checking for
incomplete record types in function signatures and issuing a diagnostic
in a place where it appeared that the type conversion needed to be
delayed. However, because CIR defers ABI processing until after codegen,
we don't actually need special handling for incomplete types.

This change removes the diagnostic and adds a comment explaining the
difference in behavior.
DeltaFile
+36-0clang/test/CIR/CodeGen/convert-incomplete-type.cpp
+8-12clang/lib/CIR/CodeGen/CIRGenTypes.cpp
+0-1clang/include/clang/CIR/MissingFeatures.h
+44-133 files

LLVM/project 3878131clang/lib/Driver/ToolChains Flang.cpp Flang.h, flang/lib/Optimizer/CodeGen CodeGen.cpp CMakeLists.txt

[FLANG][MLIR][OpenMP] add MathToNVVM conversion pass to NVPTX MLIR (#180060)

This commit adds the MLIR MathToNVVM conversion pass to flang's
NVPTX codegen, lowering Math and Arith operations to libdevice library calls.
This allows calls to Fortran math intrinsics in OpenMP offload
for NVIDIA targets. To support this, support for -nogpulib was added for
NVIDIA targets.

Fix #147023  
Fix #179347
DeltaFile
+338-0flang/test/Lower/OpenMP/math-nvptx.f90
+41-0clang/lib/Driver/ToolChains/Flang.cpp
+8-8flang/test/Driver/omp-driver-offload.f90
+7-1flang/lib/Optimizer/CodeGen/CodeGen.cpp
+3-0clang/lib/Driver/ToolChains/Flang.h
+1-0flang/lib/Optimizer/CodeGen/CMakeLists.txt
+398-96 files

LLVM/project afdfbd2flang/lib/Optimizer/Transforms/CUDA CUFPredefinedVarToGPU.cpp, flang/test/Fir/CUDA predefined-variables.mlir

[flang][cuda] Support predefined conversion in inlined function (#185723)

Only fir.declare operations at the top level were converted. Update the
pass to loop through all fir.declare operations.
DeltaFile
+48-0flang/test/Fir/CUDA/predefined-variables.mlir
+2-2flang/lib/Optimizer/Transforms/CUDA/CUFPredefinedVarToGPU.cpp
+50-22 files

LLVM/project 7615f58llvm/lib/Target/AArch64 AArch64BranchTargets.cpp AArch64.h

[NewPM] Port AArch64BranchTargets (#185585)
DeltaFile
+34-13llvm/lib/Target/AArch64/AArch64BranchTargets.cpp
+8-1llvm/lib/Target/AArch64/AArch64.h
+1-1llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
+1-0llvm/lib/Target/AArch64/AArch64PassRegistry.def
+44-154 files

LLVM/project 1c228a0lldb/include/lldb/Host Host.h, lldb/source/Host/common Host.cpp

[lldb] Have Host::RunShellCommand return stderr & stdout separately (#184548)

Host::RunShellCommand takes a std::string *command_output argument and a
bool hide_stderr=false defaulted argument. If the shell command returns
stderr and stdout text, it is intermixed in the same command_output,
unless hide_stderr=true.

In SymbolLocatorDebugSymbols::DownloadObjectAndSymbolFile we call an
external program to find a binary and dSYM by uuid, and the external
program returns a plist (xml) output. In some cases, it printed a
(harmless) warning message to stderr, and then a complete plist output
to stdout. We attempt to parse the combination of these two streams, and
the parse fails - we don't get the output.

This patch removes hide_stderr and instead adds a `std::string
*separated_error_output` argument. If `separated_error_output` is
nullptr, output and error texts are returned combined in the
`command_output` argument. If a std::string object address is provided
for `separated_error_output`, then standard error output is separated

    [4 lines not shown]
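
The failure mode described above can be reproduced outside lldb. The sketch below is a Python analogy, not the `Host::RunShellCommand` API: the child script, the `path` key, and the helper names `run` and `try_parse` are all hypothetical stand-ins. It shows that a plist parser rejects combined stderr and stdout output but accepts stdout once the two streams are captured separately.

```python
import plistlib
import subprocess
import sys

# Hypothetical stand-in for the external lookup tool: it prints a harmless
# warning to stderr and a complete plist to stdout.
CHILD = r'''
import plistlib, sys
sys.stderr.write("warning: something harmless\n")
sys.stderr.flush()
sys.stdout.buffer.write(plistlib.dumps({"path": "/tmp/a.out"}))
'''


def run(separate_stderr):
    """Run the child and return (stdout_bytes, stderr_bytes_or_None)."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        # Either capture stderr on its own pipe or merge it into stdout.
        stderr=subprocess.PIPE if separate_stderr else subprocess.STDOUT,
        check=True,
    )
    return proc.stdout, proc.stderr


def try_parse(data):
    """Return the parsed plist, or None if the bytes are not a valid plist."""
    try:
        return plistlib.loads(data)
    except Exception:
        return None
```

With `run(separate_stderr=False)` the warning text is mixed into the plist bytes and `try_parse` returns `None`; with `run(separate_stderr=True)` the plist parses and the warning is still available for logging.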
DeltaFile
+89-37lldb/include/lldb/Host/Host.h
+59-15lldb/source/Host/common/Host.cpp
+9-7lldb/source/Target/RemoteAwarePlatform.cpp
+8-2lldb/source/Target/Platform.cpp
+5-5lldb/source/Plugins/Platform/POSIX/PlatformPOSIX.cpp
+7-2lldb/source/Plugins/SymbolLocator/DebugSymbols/SymbolLocatorDebugSymbols.cpp
+177-6813 files not shown
+214-8519 files

LLVM/project 3578807llvm/lib/MC MCGOFFStreamer.cpp GOFFObjectWriter.cpp

Fix formatting
DeltaFile
+4-3llvm/lib/MC/MCGOFFStreamer.cpp
+2-1llvm/lib/MC/GOFFObjectWriter.cpp
+6-42 files

LLVM/project 3829fdbmlir/lib/Pass PassRegistry.cpp, mlir/test/Pass invalid-pass.mlir

[mlir][Pass] Report error when passing options to pipelines via shorthand syntax (#185738)

When passing options to a pass pipeline that doesn't accept options,
mlir-opt exits with error code 1, but prints no error message when using
shorthand CLI syntax:

```
# Silent failure (no error message):
$ mlir-opt --tosa-to-linalg-pipeline=foo /dev/null
$ echo $?
1

# Same pipeline via --pass-pipeline syntax reports error:
$ mlir-opt --pass-pipeline='builtin.module(tosa-to-linalg-pipeline{foo})' /dev/null
<unknown>:0: error: failed to add `tosa-to-linalg-pipeline` with options `foo`
```

This PR replaces the silent call to `failure` with `errorHandler`
in `PassPipelineCLParser::addToPipeline`, matching the existing pattern
in `TextualPipeline::addToPipeline`.
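
The pattern can be sketched in a few lines. This is a Python analogy with hypothetical names (`add_to_pipeline`, `accepted_options`), not the MLIR C++ code: the error handler both records a message and returns the failure value, so returning its result reports the error instead of failing silently.

```python
def add_to_pipeline(name, options, accepted_options, error_handler):
    """Return True on success; on failure, report through error_handler.

    error_handler takes a message and returns False, so that
    `return error_handler(...)` both reports and signals failure.
    """
    unknown = [opt for opt in options if opt not in accepted_options]
    if unknown:
        return error_handler(
            f"failed to add `{name}` with options `{','.join(options)}`"
        )
    return True
```

A caller that previously saw only a bare failure now receives a message such as the one printed by the `--pass-pipeline` form above.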
DeltaFile
+3-1mlir/lib/Pass/PassRegistry.cpp
+2-0mlir/test/Pass/invalid-pass.mlir
+5-12 files

LLVM/project 6bdd17dllvm/lib/Target/X86 X86SelectionDAGInfo.cpp

[X86] Add stop-gap for SDAG failure in fptrunc lowering

```
define <3 x half> @err_420(<3 x float> %0) {
entry:
  %1 = fptrunc <3 x float> %0 to <3 x half>
  ret <3 x half> %1
}
```

currently crashes.

#185562 will fix this, but needs another round of review. Land a
stop-gap for now to unblock our integrate process.
DeltaFile
+2-0llvm/lib/Target/X86/X86SelectionDAGInfo.cpp
+2-01 files

LLVM/project 88c33f8mlir/include/mlir/Dialect/SPIRV/IR SPIRVMatrixOps.td, mlir/lib/Dialect/SPIRV/IR SPIRVOps.cpp

[mlir][spirv] Move remaining verification from C++ to ODS for Matrix ops (#185702)

This adds two new custom constraints to enforce matrix and vector
dimensions with respect to one another.

Assisted-by: Codex
DeltaFile
+0-88mlir/lib/Dialect/SPIRV/IR/SPIRVOps.cpp
+43-4mlir/include/mlir/Dialect/SPIRV/IR/SPIRVMatrixOps.td
+9-9mlir/test/Dialect/SPIRV/IR/matrix-ops.mlir
+52-1013 files

LLVM/project ffa128dmlir/include/mlir/Dialect/AMDGPU/IR AMDGPUOps.td, mlir/lib/Conversion/AMDGPUToROCDL AMDGPUToROCDL.cpp

[AMDGPU] Added support for Sparse WMMA ops (#183360)

This PR adds support for Sparse WMMA ops (gfx12 and gfx1250).

---------

Co-authored-by: Jakub Kuderski <kubakuderski at gmail.com>
DeltaFile
+255-15mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp
+112-0mlir/test/Dialect/AMDGPU/invalid.mlir
+98-0mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPUOps.td
+85-0mlir/test/Conversion/AMDGPUToROCDL/swmmac-gfx12.mlir
+71-0mlir/lib/Dialect/AMDGPU/IR/AMDGPUOps.cpp
+51-0mlir/test/Conversion/AMDGPUToROCDL/swmmac-gfx1250.mlir
+672-156 files

LLVM/project a158b0bllvm/cmake/modules HandleLLVMOptions.cmake

[SystemZ] Disable PCH on z/OS.

The compiler supports PCH in principle, but there are occasionally
hiccups. Therefore it is better to disable PCH by default.
DeltaFile
+7-0llvm/cmake/modules/HandleLLVMOptions.cmake
+7-01 files

LLVM/project d4c7630llvm/utils lldbDataFormatters.py

[lldb] Fix type checking in lldbDataFormatters (NFC) (#185706)
DeltaFile
+17-16llvm/utils/lldbDataFormatters.py
+17-161 files

LLVM/project 9c464eeutils/bazel/llvm-project-overlay/mlir BUILD.bazel

[Bazel] Add SCFDialect dep to OpenACCDialect

Needed to satisfy some internal layering checks.
DeltaFile
+1-0utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
+1-01 files

LLVM/project 8963edbllvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer consecutive-access.ll reduced-gathered-vectorized.ll

[SLP] Loop aware cost model/tree building

Currently, the SLP vectorizer does not take loops and their trip counts
into account, which may lead to inefficient vectorization in some cases.
This patch adds loop nest-aware tree building and cost estimation.
When it comes to tree building, it now checks that the tree does not span
across different loop nests. The nodes from other loop nests are
immediate buildvector nodes.
The cost model adds the knowledge about loop trip count. If it is
unknown, the default value is used, controlled by the
-slp-cost-loop-min-trip-count=<value> option. The cost of the vector
nodes in the loop is multiplied by the number of iterations (trip count),
because each vector node will be executed the trip count number of
times. This allows better cost estimation.
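
A minimal sketch of the scaling idea follows. The function name, the placeholder default of 2, and the choice to multiply across every enclosing loop of a nest are assumptions made for illustration; they are not taken from SLPVectorizer.cpp.

```python
PLACEHOLDER_DEFAULT_TRIP_COUNT = 2  # illustrative only; not LLVM's actual default


def scaled_node_cost(node_cost, loop_trip_counts,
                     default_trip_count=PLACEHOLDER_DEFAULT_TRIP_COUNT):
    """Scale a vector node's cost by how often the node executes.

    `loop_trip_counts` lists the trip count of each enclosing loop,
    outermost first; None means the trip count is unknown, in which
    case the default value is used instead.
    """
    scale = 1
    for trip_count in loop_trip_counts:
        scale *= default_trip_count if trip_count is None else trip_count
    return node_cost * scale
```

Under this model a node outside any loop keeps its cost, while a saving of 3 inside a loop with a known trip count of 8 counts as a saving of 24, so in-loop nodes weigh more heavily in the final decision.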

Reviewers: jdenny-ornl, vporpo, hiraditya, RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/150450
DeltaFile
+183-13llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+69-92llvm/test/Transforms/SLPVectorizer/RISCV/buildvector-all-external-scalars.ll
+34-45llvm/test/Transforms/SLPVectorizer/AArch64/getelementptr.ll
+24-52llvm/test/Transforms/SLPVectorizer/consecutive-access.ll
+49-20llvm/test/Transforms/SLPVectorizer/reduced-gathered-vectorized.ll
+33-36llvm/test/Transforms/SLPVectorizer/X86/deleted-instructions-clear.ll
+392-25822 files not shown
+530-43328 files

LLVM/project d89c2b2llvm/lib/Analysis DependenceAnalysis.cpp

add another assertion
DeltaFile
+7-0llvm/lib/Analysis/DependenceAnalysis.cpp
+7-01 files