LLVM/project 23a13d0 llvm/tools CMakeLists.txt

[CMake][NFC] Remove dead code add_llvm_external_project(libclc) (#196241)

It was added in 72f9881c3ffcf. libclc has now switched to a runtime build.
DeltaFile
+0 -3 llvm/tools/CMakeLists.txt
+0 -3 1 files

LLVM/project fb20976 mlir/lib/Dialect/XeGPU/Transforms XeGPULayoutImpl.cpp XeGPUBlocking.cpp, mlir/test/Dialect/XeGPU propagate-layout-inst-data.mlir

[MLIR][XeGPU] Fix layout inference issues blocking MXFP_GEMM test  (#196243)

This branch fixes layout inference issues in XeGPU passes that were
blocking MXFP (microscaled floating point) GEMM workloads:
                                                        
- Fix bitcast layout adjustment to use result shape instead of source
shape. The setupBitCastResultLayout function was incorrectly bounding
the layout adjustment loop against the source shape. Added tests.
- Fix blocking pass to drop inst_data from anchor operations. Operations
whose shape already matches inst_data don't get unrolled, so their
layout attributes retained stale inst_data that broke downstream passes.
Now inst_data is unconditionally stripped from all op attributes after
blocking.
- Propagate layout to both results of vector.deinterleave. The layout
recovery pass was only setting the layout on result 0, leaving result 1
without a layout.
                  
Test plan
   

    [9 lines not shown]
DeltaFile
+73 -0 mlir/test/Integration/Dialect/XeGPU/WG/simple_mxfp_gemm.mlir
+46 -0 mlir/test/Dialect/XeGPU/propagate-layout-inst-data.mlir
+11 -5 mlir/lib/Dialect/XeGPU/Transforms/XeGPULayoutImpl.cpp
+6 -0 mlir/lib/Dialect/XeGPU/Transforms/XeGPUBlocking.cpp
+136 -5 4 files

LLVM/project 0ed2eb4 llvm/lib/Target/AMDGPU AMDGPUAttributor.cpp

[NFC][AMDGPU] Use a worklist and remember results in AMDGPUAttributor

This was a recursive function with a Map to cache things that was never filled.
Now it's a worklist and the map is actually used.

Co-authored-by: Johannes Doerfert <johannes at jdoerfert.de>
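
The recursion-to-worklist change described above can be illustrated with a small standalone sketch. This is not the AMDGPUAttributor code: `Node`, `reachesFlagged`, and the "flagged" property are hypothetical stand-ins, and the graph is assumed to be acyclic. The point is only that the cache is written on every resolved node, so later queries reuse earlier results.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an expression node: a flag plus operand indices.
struct Node {
  bool flagged;               // property of the node itself
  std::vector<int> operands;  // indices of operand nodes (assumed acyclic)
};

// Returns true if `root`, or anything reachable through its operands, is
// flagged. Every resolved node is recorded in `cache`, so repeated queries
// over shared subgraphs do no repeated work.
bool reachesFlagged(const std::vector<Node> &graph, int root,
                    std::unordered_map<int, bool> &cache) {
  std::vector<int> worklist{root};
  while (!worklist.empty()) {
    int cur = worklist.back();
    if (cache.count(cur)) {  // already resolved through another path
      worklist.pop_back();
      continue;
    }
    const Node &n = graph[cur];
    bool result = n.flagged;
    std::vector<int> unresolved;
    for (int op : n.operands) {
      if (result)
        break;
      auto it = cache.find(op);
      if (it == cache.end())
        unresolved.push_back(op);
      else
        result = it->second;
    }
    if (!result && !unresolved.empty()) {
      // Resolve the operands first; `cur` stays below them on the worklist
      // and is revisited once they are cached.
      worklist.insert(worklist.end(), unresolved.begin(), unresolved.end());
      continue;
    }
    cache[cur] = result;
    worklist.pop_back();
  }
  return cache.at(root);
}
```

Passing the same `cache` to later calls is what makes the map "actually used": a second query that touches an already-visited node returns from the cache without walking its operands again.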
DeltaFile
+38 -19 llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp
+38 -19 1 files

LLVM/project 589faed llvm/lib/Target/RISCV RISCVFrameLowering.cpp, llvm/test/CodeGen/RISCV stack-probing-dynamic-nonentry.ll

[CodeGen][RISCV] Inline stack probes immediately after `allocateStack` in `eliminateCallFramePseudoInstr` (#195456)

This PR adds a call to `inlineStackProbe` immediately after
`allocateStack` in `eliminateCallFramePseudoInstr`. This allows code
generation for stack probe pseudoinstructions in non-entry BBs.

Fixes #195454.
DeltaFile
+115 -0 llvm/test/CodeGen/RISCV/stack-probing-dynamic-nonentry.ll
+1 -0 llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
+116 -0 2 files

LLVM/project 8b8fdfd llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/X86 expanded-operand-already-scheduled.ll expanded-binop-doesnotneedschedule-user.ll

[SLP]Bail out on non-schedulable expanded binop with stale operand deps

In tryScheduleBundle's DoesNotRequireScheduling path, an expanded binop
(shl X, 1 modeled as add X, X) doubles the dependency count of the
duplicated operand. If the operand has a single IR use, yet its
ScheduleData already has Dependencies populated by an earlier calculation
that did not see the expanded duplicate use, the double decrement exceeds
calculateDependencies' single increment and UnscheduledDeps goes negative.
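
The counter arithmetic can be sketched with a toy model. The types and functions below are hypothetical and are not the SLPVectorizer data structures; they only mirror the bookkeeping described above: one increment per IR use, one decrement per operand slot of the scheduled user.

```cpp
#include <cassert>

// Toy stand-in for per-instruction scheduling data (not the real ScheduleData).
struct ToySchedData {
  int unscheduledDeps = 0;
};

// An earlier dependency calculation counts one dependency per IR use.
void countDepsFromIRUses(ToySchedData &operand, int numIRUses) {
  operand.unscheduledDeps += numIRUses;
}

// Scheduling a user decrements the operand once per operand slot that
// refers to it.
void scheduleUser(ToySchedData &operand, int operandSlots) {
  operand.unscheduledDeps -= operandSlots;
}

// Counter value for X after scheduling `shl X, 1`, either as written (one
// slot for X) or modeled as `add X, X` (two slots for X).
int depsAfterScheduling(bool expandedToAdd) {
  ToySchedData x;
  countDepsFromIRUses(x, /*numIRUses=*/1);  // the IR uses X exactly once
  scheduleUser(x, expandedToAdd ? 2 : 1);
  return x.unscheduledDeps;
}
```

In this model `depsAfterScheduling(false)` ends at 0, while `depsAfterScheduling(true)` ends at -1, which is the negative count the commit message describes.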

Fixes #196281.

Reviewers: 

Pull Request: https://github.com/llvm/llvm-project/pull/196449
DeltaFile
+50 -0 llvm/test/Transforms/SLPVectorizer/X86/expanded-operand-already-scheduled.ll
+11 -5 llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+3 -3 llvm/test/Transforms/SLPVectorizer/X86/expanded-binop-doesnotneedschedule-user.ll
+64 -8 3 files

LLVM/project 6627acc mlir/lib/Dialect/Vector/Transforms LowerVectorContract.cpp

Keep contract lowering filter patch focused
DeltaFile
+28 -42 mlir/lib/Dialect/Vector/Transforms/LowerVectorContract.cpp
+28 -42 1 files

LLVM/project 055d49c mlir/include/mlir/Dialect/Vector/Transforms LoweringPatterns.h, mlir/test/Dialect/Vector vector-contract-composable-lowering.mlir

add more tests and parallelarith

Signed-off-by: Eric Feng <Eric.Feng at amd.com>
DeltaFile
+37 -10 mlir/test/Dialect/Vector/vector-contract-composable-lowering.mlir
+6 -2 mlir/test/lib/Dialect/Vector/TestVectorTransforms.cpp
+5 -0 mlir/include/mlir/Dialect/Vector/Transforms/LoweringPatterns.h
+48 -12 3 files

LLVM/project 3d20376 clang/lib/Sema SemaHLSL.cpp

[Clang][HLSL] Fix -Wunused-variable (#196445)

LookupSucceeded is only used in an assertion. Mark it [[maybe_unused]]
so we do not get -Wunused-variable in non-assertion builds.
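
A minimal standalone sketch of the pattern follows. Only the variable name `LookupSucceeded` comes from the commit message; `lookup` and `mustFind` are hypothetical helpers and are not taken from SemaHLSL.cpp.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: returns true and fills `out` if `key` is present.
bool lookup(const std::map<std::string, int> &table, const std::string &key,
            int &out) {
  auto it = table.find(key);
  if (it == table.end())
    return false;
  out = it->second;
  return true;
}

int mustFind(const std::map<std::string, int> &table, const std::string &key) {
  int value = 0;
  // The variable is read only by assert(). With NDEBUG defined the assert
  // expands to nothing, so without the attribute a compiler may report the
  // variable as unused.
  [[maybe_unused]] bool LookupSucceeded = lookup(table, key, value);
  assert(LookupSucceeded && "key must exist");
  return value;
}
```

The attribute keeps the call and its side effects in both build modes, which an `#ifndef NDEBUG` guard around the whole statement would not.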
DeltaFile
+2 -1 clang/lib/Sema/SemaHLSL.cpp
+2 -1 1 files

LLVM/project 7e5ba54 llvm/lib/Target/AMDGPU SIInsertWaitcnts.cpp, llvm/test/CodeGen/AMDGPU d16-write-vgpr32.ll function-args.ll

[AMDGPU][True16] relax d16-write-vgpr32 condition (#194477)

Patch https://github.com/llvm/llvm-project/pull/157795 works around a D16
load HW issue.

We found that the condition for this workaround can be relaxed for
instructions from the same order groups. Downstream testing looks OK.
DeltaFile
+139 -0 llvm/test/CodeGen/AMDGPU/d16-write-vgpr32.ll
+70 -19 llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+20 -10 llvm/test/CodeGen/AMDGPU/function-args.ll
+2 -2 llvm/test/CodeGen/AMDGPU/branch-relaxation-inst-size-gfx11.ll
+0 -1 llvm/test/CodeGen/AMDGPU/chain-hi-to-lo.ll
+231 -32 5 files

LLVM/project 0b9f8df clang/lib/CIR/FrontendAction CIRGenAction.cpp, clang/test/CIR/CodeGen link-bitcode-file.c

[CIR] Add Support for linking modules on cc1
DeltaFile
+84 -0 clang/lib/CIR/FrontendAction/CIRGenAction.cpp
+52 -0 clang/test/CIR/CodeGen/link-bitcode-file.c
+136 -0 2 files

LLVM/project 4c7fd84 llvm/include/llvm/IR Module.h, llvm/lib/Bitcode/Writer BitcodeWriterPass.cpp

[DebugInfo] Remove old decls when converting DI (#194964)

We were trying to remove declarations of old debug intrinsics whenever
printing modules or writing them to file. This is no longer necessary as
we use the new-style debug values exclusively now, other than when a
target pass specifically converts back to the old style. If a target
pass does that, removing the intrinsics is not right as the intrinsics'
users will still linger.

This change should be NFC except for the experimental DirectX target
where we do exactly that.

Fixes #194884
DeltaFile
+59 -0 llvm/test/CodeGen/DirectX/debug-info.ll
+0 -4 llvm/lib/IRPrinter/IRPrintingPasses.cpp
+0 -4 llvm/lib/Bitcode/Writer/BitcodeWriterPass.cpp
+0 -4 llvm/lib/IR/IRPrintingPasses.cpp
+2 -0 llvm/include/llvm/IR/Module.h
+0 -2 llvm/tools/llvm-as/llvm-as.cpp
+61 -14 5 files not shown
+61 -20 11 files

LLVM/project 8f6eaae llvm/include/llvm/IR Intrinsics.h

[NFC][LLVM] Remove extra indentation in Intrinsics.h (#196360)

Remove extra indentation for Intrinsics.h in conformance with
https://llvm.org/docs/CodingStandards.html#namespace-indentation
DeltaFile
+244 -245 llvm/include/llvm/IR/Intrinsics.h
+244 -245 1 files

LLVM/project 4f98539 flang/lib/Lower/OpenMP OpenMP.cpp, flang/test/Lower/OpenMP metadirective-loop.f90

Fix metadirective loop variant lowering

Preserve the associated DO evaluation when a dynamic metadirective can
select either a loop-associated directive or a standalone fallback, so
the fallback still lowers the original loop body.

Scope temporary loop-IV data-sharing attributes to the selected variant.
Use the selected variant's collapse clause to determine how many loop IVs
to mark, avoiding DSA state leaking between alternatives.
DeltaFile
+84 -23 flang/lib/Lower/OpenMP/OpenMP.cpp
+49 -1 flang/test/Lower/OpenMP/metadirective-loop.f90
+133 -24 2 files

LLVM/project 363e904 llvm/include/llvm/IR Module.h, llvm/lib/IR AsmWriter.cpp

[𝘀𝗽𝗿] changes to main this commit is based on

Created using spr 1.3.7

[skip ci]
DeltaFile
+30 -0 llvm/unittests/IR/AsmWriterTest.cpp
+12 -5 llvm/lib/IR/AsmWriter.cpp
+2 -2 llvm/include/llvm/IR/Module.h
+44 -7 3 files

LLVM/project 75dc872 llvm/include/llvm/IR Module.h, llvm/lib/IR AsmWriter.cpp

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
DeltaFile
+30 -0 llvm/unittests/IR/AsmWriterTest.cpp
+12 -5 llvm/lib/IR/AsmWriter.cpp
+3 -1 llvm/tools/llvm-reduce/ReducerWorkItem.cpp
+2 -2 llvm/include/llvm/IR/Module.h
+0 -1 llvm/test/tools/llvm-reduce/remove-unused-declarations.ll
+0 -1 llvm/test/tools/llvm-reduce/remove-attributes-from-intrinsics.ll
+47 -10 1 files not shown
+47 -11 7 files

LLVM/project 2c7ee41 llvm/include/llvm/IR Module.h, llvm/lib/IR AsmWriter.cpp

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
DeltaFile
+30 -0 llvm/unittests/IR/AsmWriterTest.cpp
+12 -5 llvm/lib/IR/AsmWriter.cpp
+2 -2 llvm/include/llvm/IR/Module.h
+44 -7 3 files

LLVM/project f5cef67 llvm/lib/Target/PowerPC PPCInstrInfo.td PPCInstr64Bit.td

[PowerPC] Hardcode LDAT/LWAT_CSNE constant immediate (#196115)

The FC field in LDAT/LWAT_CSNE instructions is always 16, so hardcode it
in the TableGen definition instead of passing it as an explicit operand.
DeltaFile
+4 -4 llvm/lib/Target/PowerPC/PPCInstrInfo.td
+3 -3 llvm/lib/Target/PowerPC/PPCInstr64Bit.td
+0 -1 llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
+7 -8 3 files

LLVM/project 3f46fa2 flang/lib/Lower/OpenMP OpenMP.cpp, flang/test/Lower/OpenMP metadirective-loop.f90

Fix metadirective loop variant lowering

Preserve the associated DO evaluation when a dynamic metadirective can
select either a loop-associated directive or a standalone fallback, so
the fallback still lowers the original loop body.

Scope temporary loop-IV data-sharing attributes to the selected variant.
Use the selected variant's collapse clause to determine how many loop IVs
to mark, avoiding DSA state leaking between alternatives.
DeltaFile
+84 -23 flang/lib/Lower/OpenMP/OpenMP.cpp
+49 -1 flang/test/Lower/OpenMP/metadirective-loop.f90
+133 -24 2 files

LLVM/project 497ebfc flang/lib/Lower/OpenMP OpenMP.cpp DataSharingProcessor.cpp, flang/test/Lower/OpenMP metadirective-loop.f90

[flang][OpenMP] Support loop-associated metadirective variants (part 3)

Enable metadirective lowering for loop-associated variants such as
`do`, `simd`, `parallel do`, and `do simd`.

When a metadirective resolves to a loop-associated directive, the
sibling DO evaluation is spliced into the metadirective's evaluation
list so existing loop lowering finds it. Loop IV data-sharing
attributes are marked at lowering time since semantic analysis cannot
know which variant will be selected. The DataSharingProcessor is also
extended to handle spliced evaluations.

This patch is part of the feature work for #188820 and stacked on top
of #194424.

Assisted with copilot and GPT-5.4
DeltaFile
+203 -0 flang/test/Lower/OpenMP/metadirective-loop.f90
+101 -1 flang/lib/Lower/OpenMP/OpenMP.cpp
+83 -2 flang/lib/Lower/OpenMP/DataSharingProcessor.cpp
+15 -0 flang/test/Lower/OpenMP/Todo/metadirective-target-loop.f90
+14 -0 flang/lib/Lower/OpenMP/Utils.cpp
+12 -0 flang/test/Lower/OpenMP/Todo/metadirective-no-loop.f90
+428 -3 2 files not shown
+430 -15 8 files

LLVM/project a8321f3 llvm/test/CodeGen/X86 vector-reduce-smin.ll vector-reduce-smax.ll, llvm/test/tools/llvm-mca/AArch64/Cortex C1Premium-sve-instructions.s C1Premium-writeback.s

rebase

Created using spr 1.3.4
DeltaFile
+6,873 -0 llvm/test/tools/llvm-mca/AArch64/Cortex/C1Premium-sve-instructions.s
+2,928 -1,388 llvm/test/CodeGen/X86/vector-reduce-smin.ll
+2,924 -1,389 llvm/test/CodeGen/X86/vector-reduce-smax.ll
+2,969 -1,160 llvm/test/CodeGen/X86/vector-reduce-mul.ll
+3,979 -0 llvm/test/tools/llvm-mca/AArch64/Cortex/C1Premium-writeback.s
+2,677 -1,279 llvm/test/CodeGen/X86/vector-reduce-umax.ll
+22,350 -5,216 3,419 files not shown
+140,008 -44,710 3,425 files

LLVM/project c19f83d llvm/test/Instrumentation/MemorySanitizer ftrunc.ll

[NFCI][msan] Add test case for llvm.fptoui.sat/llvm.fptosi.sat (#196416)

Forked from llvm/test/Instrumentation/MemorySanitizer/ftrunc.ll

PR #191365 lowered NEON fcvtz[us] intrinsics into fpto[us]i.sat,
exposing a gap in MSan's instrumentation. A follow-up patch will add
support in MSan for fpto[us]i.sat, propagating the shadow (similar to
its handling of fcvtz[us]) rather than strictly handling them.
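
For readers unfamiliar with the intrinsics, the sketch below models the documented semantics of `llvm.fptosi.sat` for a 32-bit result: NaN maps to 0, out-of-range inputs clamp to the integer limits, and everything else truncates toward zero. The shadow rule next to it is an assumption made for illustration only; the commit defers the real MSan handling to a follow-up patch, and `propagateShadow` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Models llvm.fptosi.sat.i32.f32: saturating float-to-signed-int conversion.
int32_t fptosiSat32(float x) {
  constexpr int32_t kMin = std::numeric_limits<int32_t>::min();
  constexpr int32_t kMax = std::numeric_limits<int32_t>::max();
  if (std::isnan(x))
    return 0;
  // static_cast<float>(kMax) rounds up to 2^31, so `>=` catches every
  // float that does not fit in int32_t.
  if (x >= static_cast<float>(kMax))
    return kMax;
  if (x <= static_cast<float>(kMin))
    return kMin;
  return static_cast<int32_t>(x);  // in range: truncates toward zero
}

// Illustrative shadow rule (assumption, not MSan's implementation): a set
// bit means "uninitialized", and any poisoned operand bit poisons the whole
// result instead of triggering an immediate report.
uint32_t propagateShadow(uint32_t operandShadow) {
  return operandShadow != 0 ? ~uint32_t{0} : uint32_t{0};
}
```

Propagating instead of checking means a use of an uninitialized float is reported only if the converted integer later reaches a point MSan checks, such as a branch condition.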
DeltaFile
+278 -0 llvm/test/Instrumentation/MemorySanitizer/ftrunc.ll
+278 -0 1 files

LLVM/project 583853c llvm/test/CodeGen/X86 vector-reduce-smin.ll vector-reduce-smax.ll, llvm/test/tools/llvm-mca/AArch64/Cortex C1Premium-sve-instructions.s C1Premium-writeback.s

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.4

[skip ci]
DeltaFile
+6,873 -0 llvm/test/tools/llvm-mca/AArch64/Cortex/C1Premium-sve-instructions.s
+2,928 -1,388 llvm/test/CodeGen/X86/vector-reduce-smin.ll
+2,924 -1,389 llvm/test/CodeGen/X86/vector-reduce-smax.ll
+2,969 -1,160 llvm/test/CodeGen/X86/vector-reduce-mul.ll
+3,979 -0 llvm/test/tools/llvm-mca/AArch64/Cortex/C1Premium-writeback.s
+2,677 -1,279 llvm/test/CodeGen/X86/vector-reduce-umax.ll
+22,350 -5,216 3,418 files not shown
+139,996 -44,679 3,424 files

LLVM/project 5c72b10 mlir/lib/Dialect/AMDGPU/Transforms VectorReductionToDot.cpp, mlir/test/Dialect/AMDGPU vector-reduction-to-dot.mlir

test

Signed-off-by: Eric Feng <Eric.Feng at amd.com>
DeltaFile
+7 -66 mlir/test/Dialect/AMDGPU/vector-reduction-to-dot.mlir
+0 -16 mlir/lib/Dialect/AMDGPU/Transforms/VectorReductionToDot.cpp
+7 -82 2 files

LLVM/project c507e20 llvm/include/llvm/Transforms/IPO InstrumentorStubPrinter.h Instrumentor.h, llvm/lib/Transforms/IPO InstrumentorStubPrinter.cpp

[Instrumentor] Allow printing a runtime stub (#138978)

This commit extends the Instrumentor with the option
`configuration.runtime_stubs_file` to generate a runtime stub file with
the configured instrumentation. The stub prints all parameters passed to
each enabled instrumentation function.
DeltaFile
+212 -0 llvm/lib/Transforms/IPO/InstrumentorStubPrinter.cpp
+105 -0 llvm/test/Instrumentation/Instrumentor/rt_config.json
+105 -0 llvm/test/Instrumentation/Instrumentor/bad_rt_config.json
+37 -0 llvm/test/Instrumentation/Instrumentor/default_rt
+32 -0 llvm/include/llvm/Transforms/IPO/InstrumentorStubPrinter.h
+16 -0 llvm/include/llvm/Transforms/IPO/Instrumentor.h
+507 -0 7 files not shown
+520 -2 13 files

LLVM/project 54e1afc llvm/lib/Target/AArch64 AArch64SLSHardening.cpp AArch64.h

[NewPM] Port for AArch64SLSHardening (#196378)

AArch64.h: Declared the AArch64SLSHardeningPass class.
AArch64PassRegistry.def: Registered the pass under the name
aarch64-sls-hardening.
AArch64SLSHardening.cpp: Implemented the run method to bridge the NewPM
with the existing pass logic, ensuring MachineModuleAnalysis is
correctly retrieved.
DeltaFile
+28 -6 llvm/lib/Target/AArch64/AArch64SLSHardening.cpp
+8 -2 llvm/lib/Target/AArch64/AArch64.h
+2 -2 llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
+1 -0 llvm/lib/Target/AArch64/AArch64PassRegistry.def
+39 -10 4 files

LLVM/project 119f338 clang/lib/Basic Targets.cpp, clang/lib/Basic/Targets OSTargets.h

[clang][RISCV] Remove some of the bits added with RISC-V big endian support (#192903)

- FreeBSD will not have any new 32-bit archs
- *BSDs are unlikely to touch BE RISC-V
- Keep the BE and LE targets separate
DeltaFile
+16 -2 clang/lib/Basic/Targets.cpp
+0 -8 clang/lib/Driver/ToolChains/FreeBSD.cpp
+3 -0 clang/test/Driver/freebsd.c
+0 -2 clang/lib/Basic/Targets/OSTargets.h
+19 -12 4 files

LLVM/project a2e0ee2 bolt/docs BinaryAnalysis.md, llvm/docs/AMDGPU AMDGPUAsmGFX950.rst

Merge branch 'main' into users/s-perron/constantbuffer-type-trait
DeltaFile
+5,910 -880 llvm/test/CodeGen/AArch64/bf16-v8-instructions.ll
+3,306 -504 llvm/test/CodeGen/AArch64/bf16-v4-instructions.ll
+581 -920 llvm/docs/AMDGPU/AMDGPUAsmGFX950.rst
+1,287 -0 llvm/test/tools/dsymutil/AArch64/typedef-different-types.test
+0 -775 llvm/utils/Reviewing/find_interesting_reviews.py
+672 -100 bolt/docs/BinaryAnalysis.md
+11,756 -3,179 1,890 files not shown
+42,586 -17,101 1,896 files

LLVM/project d279247 llvm/lib/Target/AMDGPU/AsmParser AMDGPUAsmParser.cpp, llvm/test/MC/AMDGPU gfx1250_asm_vop3_err.s

[AMDGPU] Also disable lit64() from VOP3 and inline constant
DeltaFile
+5 -3 llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
+5 -0 llvm/test/MC/AMDGPU/gfx1250_asm_vop3_err.s
+10 -3 2 files

LLVM/project cc79831 clang/include/clang/CIR/Interfaces CIROpInterfaces.td, clang/lib/CIR/CodeGen CIRGenModule.cpp

[CIR] Implement weak ref and alias attribute handling (#195972)

This adds handling for globals with the WeakRefAttr (not emitted) or
AliasAttr attributes set. CIR already had support for function aliases,
but we weren't handling the explicit alias attribute, and we didn't have
any support for global variable aliases. This change adds the global
variable alias support and adds the code to handle the explicit
attribute for variables and functions.

Assisted-by: Cursor / claude-opus-4.7-thinking-xhigh
DeltaFile
+91 -3 clang/lib/CIR/CodeGen/CIRGenModule.cpp
+79 -0 clang/test/CIR/CodeGen/attr-alias.c
+32 -22 clang/lib/CIR/Dialect/IR/CIRDialect.cpp
+42 -0 clang/test/CIR/IR/invalid-global.cir
+26 -0 clang/include/clang/CIR/Interfaces/CIROpInterfaces.td
+26 -0 clang/test/CIR/CodeGen/attr-weakref.c
+296 -25 4 files not shown
+324 -26 10 files

LLVM/project 93c6562 clang/lib/Sema SemaHLSL.cpp HLSLBuiltinTypeDeclBuilder.cpp, clang/test/AST/HLSL ConstantBuffers-AST.hlsl

[HLSL] Add ConstantBuffer<T> (#195153)

ConstantBuffer<T> is a standard resource type in HLSL. This commit
follows the design in wg-hlsl proposal
[0046](https://github.com/llvm/wg-hlsl/blob/main/proposals/0046-constantbuffer-t.md).

The type constraints will be left to a follow-up PR.

Assisted-by: Gemini

<!-- branch-stack-start -->

-------------------------
- main
  - https://github.com/llvm/llvm-project/pull/195151
    - https://github.com/llvm/llvm-project/pull/195152
      - users/s-perron/constantbuffer-constantbuffer-t :point_left:
        - https://github.com/llvm/llvm-project/pull/195154


    [3 lines not shown]
DeltaFile
+120 -0 clang/test/AST/HLSL/ConstantBuffers-AST.hlsl
+68 -0 clang/test/CodeGenHLSL/builtins/ConstantBuffer-layout.hlsl
+65 -0 clang/test/CodeGenHLSL/builtins/ConstantBuffer.hlsl
+47 -0 clang/lib/Sema/SemaHLSL.cpp
+35 -0 clang/test/SemaHLSL/BuiltIns/ConstantBuffers.hlsl
+29 -2 clang/lib/Sema/HLSLBuiltinTypeDeclBuilder.cpp
+364 -2 7 files not shown
+478 -3 13 files