LLVM/project c7ca704 flang/docs index.md

[flang][docs] Reorganize the table of contents (#171240)

This patch creates a section for user guidance.
Delta File
+16-8 flang/docs/index.md
+16-8 1 files

LLVM/project 1335a05 mlir/python/mlir/dialects affine.py, mlir/test/python/dialects affine.py

[MLIR][Python] Fix AffineIfOp insertion point (#171957)

This bug was introduced by #108323, where the loc and ip were not
properly set. It may lead to errors when the operations are not linearly
inserted into the IR.
Delta File
+28-0 mlir/test/python/dialects/affine.py
+1-1 mlir/python/mlir/dialects/affine.py
+29-1 2 files

LLVM/project ecaf673 libcxx/include/__format format_context.h format_parse_context.h, libcxx/test/libcxx/diagnostics format.nodiscard.verify.cpp

[libc++][format] Applied `[[nodiscard]]` to more classes (#170808)

`[[nodiscard]]` should be applied to functions where discarding the
return value is most likely a correctness issue.

- https://libcxx.llvm.org/CodingGuidelines.html

Some classes in `<format>` were already annotated. This patch annotates
the remaining ones.
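
As a minimal illustration of the guideline above, the sketch below annotates two member functions of a toy class. `ToyParseContext` and its members are hypothetical names invented for this example; they are not the libc++ declarations touched by the patch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a parse-context-like class; this is not libc++ code.
class ToyParseContext {
public:
  explicit ToyParseContext(std::size_t num_args) : num_args_(num_args) {}

  // Ignoring the result would consume an argument slot and lose its index,
  // so the compiler is asked to warn when the return value is discarded.
  [[nodiscard]] std::size_t next_arg_id() {
    assert(next_ < num_args_ && "no argument slots left");
    return next_++;
  }

  // A pure query: calling it without using the result does nothing useful.
  [[nodiscard]] std::size_t remaining() const { return num_args_ - next_; }

private:
  std::size_t num_args_;
  std::size_t next_ = 0;
};
```

With these annotations, a bare statement such as `ctx.remaining();` is expected to trigger a discarded-value warning, which is the kind of diagnostic a `.verify.cpp` test can check for.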
Delta File
+23-0 libcxx/test/libcxx/diagnostics/format.nodiscard.verify.cpp
+3-3 libcxx/include/__format/format_context.h
+2-2 libcxx/include/__format/format_parse_context.h
+1-1 libcxx/include/__format/format_args.h
+29-6 4 files

LLVM/project cea9813 llvm/lib/Target/RISCV RISCVAsmPrinter.cpp RISCVInstrInfoV.td, llvm/lib/Target/RISCV/MCTargetDesc RISCVBaseInfo.h

[RISCV] Add an OperandType to VMaskOp. NFC (#171926)

Use that instead of register class to detect the mask operand in
lowerRISCVVMachineInstrToMCInst.

There are other instructions like vmerge and vadc that have a VMV0
operand that isn't optional and do not reach this code. Having a
dedicated marker for the optional mask is more precise.
Delta File
+2-2 llvm/lib/Target/RISCV/RISCVAsmPrinter.cpp
+2-0 llvm/lib/Target/RISCV/MCTargetDesc/RISCVBaseInfo.h
+2-0 llvm/lib/Target/RISCV/RISCVInstrInfoV.td
+6-2 3 files

LLVM/project 8deb422 llvm/lib/Target/RISCV RISCVInstrInfoVPseudos.td RISCVInstrInfoXAndes.td

[RISCV] Use VMV0 instead of VMaskOp in masked vector pseudoinstructions. NFC (#171924)

VMaskOp primarily exists for parsing/printing in the MC layer where the
mask is optional. The vector pseudos are split into masked and unmasked
versions. The mask is always required for the masked version.
Delta File
+26-26 llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td
+1-1 llvm/lib/Target/RISCV/RISCVInstrInfoXAndes.td
+27-27 2 files

LLVM/project 673383a llvm/lib/CodeGen WinEHPrepare.cpp

Merge remote-tracking branch 'upstream/users/Enna1/WinEH-removeIncomingValueIf' into users/Enna1/Opt-Phi-removeIncomingValue
Delta File
+5-7 llvm/lib/CodeGen/WinEHPrepare.cpp
+5-7 1 files

LLVM/project b8491af llvm/lib/Target/ARM MVEGatherScatterLowering.cpp

Merge remote-tracking branch 'upstream/users/Enna1/MVE-Phi-removeIncomingValue' into users/Enna1/Opt-Phi-removeIncomingValue
Delta File
+19-16 llvm/lib/Target/ARM/MVEGatherScatterLowering.cpp
+19-16 1 files

LLVM/project 8df0265 llvm/lib/Transforms/Utils LoopRotationUtils.cpp

Merge remote-tracking branch 'zjt/remove_incomming_value_loop_rotation' into users/Enna1/Opt-Phi-removeIncomingValue
Delta File
+1-1 llvm/lib/Transforms/Utils/LoopRotationUtils.cpp
+1-1 1 files

LLVM/project a7784c4 llvm/lib/Transforms/Utils CodeExtractor.cpp

Merge remote-tracking branch 'zjt/remove_incomming_value_code_extractor' into users/Enna1/Opt-Phi-removeIncomingValue
Delta File
+1-1 llvm/lib/Transforms/Utils/CodeExtractor.cpp
+1-1 1 files

LLVM/project d7f9dd0 llvm/lib/Transforms/Utils CloneFunction.cpp

Merge remote-tracking branch 'zjt/remove_incomming_value_clone_function' into users/Enna1/Opt-Phi-removeIncomingValue
Delta File
+4-5 llvm/lib/Transforms/Utils/CloneFunction.cpp
+4-5 1 files

LLVM/project deb8969 llvm/test/Transforms/DFAJumpThreading dfa-jump-threading-transform.ll

update llvm/test/Transforms/DFAJumpThreading/dfa-jump-threading-transform.ll
Delta File
+8-10 llvm/test/Transforms/DFAJumpThreading/dfa-jump-threading-transform.ll
+8-10 1 files

LLVM/project 1d5fa2b llvm/lib/IR Instructions.cpp, llvm/test/Transforms/LoopUnroll runtime-loop-multiple-exits.ll

[IR] Optimize `PHINode::removeIncomingValue()` by swapping with the last incoming value.

Add an optional argument `KeepIncomingOrder`, which defaults to true. When `KeepIncomingOrder` is true,
the new implementation simply moves the last incoming value and block into the position of the element being removed.

This improves compile time for PHI nodes with many predecessors.
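
To make the cost difference concrete, here is a minimal sketch that contrasts order-preserving removal with swap-with-last removal. `ToyPhi` is a hypothetical structure with parallel arrays and is not LLVM's `PHINode`. How `KeepIncomingOrder` selects between the two strategies is deliberately not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a PHI node: parallel arrays of incoming values and blocks.
// Illustrative only; this is not LLVM's PHINode.
struct ToyPhi {
  std::vector<int> values;
  std::vector<int> blocks;

  // Order-preserving removal: every later entry shifts down by one, O(n).
  void removeShift(std::size_t idx) {
    assert(idx < values.size());
    values.erase(values.begin() + static_cast<std::ptrdiff_t>(idx));
    blocks.erase(blocks.begin() + static_cast<std::ptrdiff_t>(idx));
  }

  // Swap-with-last removal: O(1), but the last entry moves into slot idx,
  // so the relative order of the remaining entries can change.
  void removeSwap(std::size_t idx) {
    assert(idx < values.size());
    values[idx] = values.back();
    blocks[idx] = blocks.back();
    values.pop_back();
    blocks.pop_back();
  }
};
```

The reordering performed by the constant-time variant is one plausible reason a change like this touches the expected output of many tests.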
Delta File
+18-18 llvm/test/Transforms/LoopVectorize/single_early_exit_live_outs.ll
+12-12 llvm/test/Transforms/SimplifyCFG/UnreachableEliminate.ll
+13-9 llvm/lib/IR/Instructions.cpp
+11-11 llvm/test/Transforms/PGOProfile/chr.ll
+10-10 llvm/test/Transforms/SimpleLoopUnswitch/inject-invariant-conditions.ll
+10-10 llvm/test/Transforms/LoopUnroll/runtime-loop-multiple-exits.ll
+74-70 57 files not shown
+173-166 63 files

LLVM/project 5dfa72f llvm/test/Transforms/DFAJumpThreading dfa-unfold-select.ll

update llvm/test/Transforms/DFAJumpThreading/dfa-unfold-select.ll
Delta File
+77-84 llvm/test/Transforms/DFAJumpThreading/dfa-unfold-select.ll
+77-84 1 files

LLVM/project 88ef263 llvm/lib/Transforms/Utils CloneFunction.cpp

clang format
Delta File
+1-1 llvm/lib/Transforms/Utils/CloneFunction.cpp
+1-1 1 files

LLVM/project 853327a llvm/lib/IR Instructions.cpp

[IR] Optimize PHINode::removeIncomingValueIf() using two-pointer
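
The two-pointer idea named in the title can be sketched as a single in-place compaction pass. The helper `compactRemoveIf` and its index-based predicate are assumptions made for this example; this is not the code in `Instructions.cpp`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Removes every entry whose index satisfies pred, keeping the survivors in
// their original order. `read` scans all entries; `write` marks where the
// next survivor goes. Returns the number of entries removed.
template <typename Pred>
std::size_t compactRemoveIf(std::vector<int>& values, std::vector<int>& blocks,
                            Pred pred) {
  assert(values.size() == blocks.size());
  std::size_t write = 0;
  for (std::size_t read = 0; read < values.size(); ++read) {
    if (pred(read)) continue;  // entry is dropped, write pointer stays put
    if (write != read) {
      values[write] = values[read];
      blocks[write] = blocks[read];
    }
    ++write;
  }
  const std::size_t removed = values.size() - write;
  values.resize(write);
  blocks.resize(write);
  return removed;
}
```

Each entry is visited once and moved at most once, so the pass is linear in the number of incoming entries no matter how many are removed.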
Delta File
+15-16 llvm/lib/IR/Instructions.cpp
+15-16 1 files

LLVM/project 85f8b76 llvm/lib/Transforms/Utils LoopRotationUtils.cpp

[LoopRotate] Simplify PHINode::removeIncomingValue usage
Delta File
+1-1 llvm/lib/Transforms/Utils/LoopRotationUtils.cpp
+1-1 1 files

LLVM/project 50d833a llvm/lib/Transforms/Utils CodeExtractor.cpp

[CodeExtractor] Optimize PHI incoming value removal using reverse iteration
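
A sketch of why reverse iteration pairs well with index-based removal, assuming removal is done by swapping with the last entry. The helpers `swapRemove` and `removeAllReverse` are hypothetical and do not reflect the actual `CodeExtractor.cpp` change.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Removes entry idx in O(1) by moving the last entry into its slot.
inline void swapRemove(std::vector<int>& blocks, std::size_t idx) {
  assert(idx < blocks.size());
  blocks[idx] = blocks.back();
  blocks.pop_back();
}

// Drops every entry equal to `dead`, walking from the back. Any entry that
// swapRemove moves into slot i comes from a higher index, which the loop has
// already visited and kept, so no entry is skipped or examined twice.
inline void removeAllReverse(std::vector<int>& blocks, int dead) {
  for (std::size_t i = blocks.size(); i-- > 0;) {
    if (blocks[i] == dead) swapRemove(blocks, i);
  }
}
```

A forward loop over the same data would need to re-check slot `i` after each removal, because an unvisited entry would have just been moved into it.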
Delta File
+1-1 llvm/lib/Transforms/Utils/CodeExtractor.cpp
+1-1 1 files

LLVM/project 4bd036a mlir/include/mlir/Bindings/Python IRCore.h, mlir/lib/Bindings/Python IRAttributes.cpp MainModule.cpp

rebase
Delta File
+7-7 mlir/include/mlir/Bindings/Python/IRCore.h
+4-5 mlir/python/CMakeLists.txt
+0-8 mlir/lib/Bindings/Python/IRAttributes.cpp
+8-0 mlir/lib/Bindings/Python/MainModule.cpp
+19-20 4 files

LLVM/project 238a970 llvm/lib/Target/AMDGPU SIInsertWaitcnts.cpp, llvm/test/CodeGen/AMDGPU waitcnt-loop-ds-opt-same-iter-overwrite.mir waitcnt-loop-ds-opt-no-improvement.mir

[AMDGPU] DS loop wait relaxation -- more test cases and improvements to handle them (4/4)

Add handling for same-iteration use/overwrite of DS load results:
- Track DS load destinations and detect when results are used or
  overwritten within the same iteration
- Compute FloorWaitCount for WMMAs that only use flushed loads
Add bailout for tensor_load_to_lds and LDS DMA writes after barrier
Add negative test based on profitability criteria

Assisted-by: Cursor / claude-4.5-opus-high
Delta File
+111-0 llvm/test/CodeGen/AMDGPU/waitcnt-loop-ds-opt-same-iter-overwrite.mir
+109-0 llvm/test/CodeGen/AMDGPU/waitcnt-loop-ds-opt-no-improvement.mir
+107-0 llvm/test/CodeGen/AMDGPU/waitcnt-loop-ds-opt-same-iter-use.mir
+93-6 llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+97-0 llvm/test/CodeGen/AMDGPU/waitcnt-loop-ds-opt-tensor-load.mir
+1-1 llvm/test/CodeGen/AMDGPU/waitcnt-loop-ds-opt-eligible.mir
+518-7 6 files

LLVM/project 70826b7 llvm/lib/Transforms/Utils CloneFunction.cpp

[CloneFunction] Optimize PHI incoming value removal using reverse iteration
Delta File
+4-5 llvm/lib/Transforms/Utils/CloneFunction.cpp
+4-5 1 files

LLVM/project 9d5403c llvm/lib/Target/RISCV RISCVSchedSpacemitX60.td

[RISCV] Fix incorrect chapter number in comments in RISCVSchedSpacemitX60.td. (#171765)

Delta File
+6-6 llvm/lib/Target/RISCV/RISCVSchedSpacemitX60.td
+6-6 1 files

LLVM/project 8378ec4 libcxx/include set, libcxx/test/libcxx/diagnostics set.nodiscard.verify.cpp

[libc++][set] Applied `[[nodiscard]]` (#169982)

`[[nodiscard]]` should be applied to functions where discarding the
return value is most likely a correctness issue.

- https://libcxx.llvm.org/CodingGuidelines.html
- https://wg21.link/set
Delta File
+94-10 libcxx/test/libcxx/diagnostics/set.nodiscard.verify.cpp
+49-39 libcxx/include/set
+143-49 2 files

LLVM/project 96b6594 clang/docs ReleaseNotes.rst, clang/lib/Sema SemaDeclCXX.cpp

[Clang] Remove the early-check for anonymous struct in ShouldDeleteSpecialMember (#171799)

That check doesn't seem very useful. For non-dependent context records,
ShouldDeleteSpecialMember is called when checking implicitly defined
member functions, before the anonymous flag which the check relies on is
set. (One could notice that in ParseCXXClassMemberDeclaration,
ParseDeclarationSpecifiers ends up calling
ShouldDeleteSpecialMember, while the flag is only set later in
ParsedFreeStandingDeclSpec.)

For dependent contexts, this check actually breaks correctness: since we
don't create those special members until the template is instantiated,
their deletion checks are skipped because of the anonymity.

There's only one regression, in an ObjC test about notes; we are more
explanatory now.

Fixes https://github.com/llvm/llvm-project/issues/167217
Delta File
+31-0 clang/test/SemaCXX/anonymous-struct.cpp
+0-7 clang/lib/Sema/SemaDeclCXX.cpp
+1-1 clang/test/SemaObjCXX/arc-0x.mm
+1-0 clang/docs/ReleaseNotes.rst
+33-8 4 files

LLVM/project b0d3405 llvm/lib/Transforms/Scalar SROA.cpp, llvm/test/Transforms/SROA protected-field-pointer.ll

SROA: Recognize llvm.protected.field.ptr intrinsics.

When an alloca slice's users include llvm.protected.field.ptr intrinsics
and their discriminators are consistent, drop the intrinsics in order
to avoid unnecessary pointer sign and auth operations.

Reviewers: nikic

Reviewed By: nikic

Pull Request: https://github.com/llvm/llvm-project/pull/151650
Delta File
+93-0 llvm/test/Transforms/SROA/protected-field-pointer.ll
+78-5 llvm/lib/Transforms/Scalar/SROA.cpp
+171-5 2 files

LLVM/project 2b1fa68 clang/include/clang/Basic DiagnosticDriverKinds.td, clang/include/clang/Options Options.td

[HLSL] Add the DXC matrix orientation flags (#171550)

Fixes #58676

- Make /Zpr and /Zpc turn on the -fmatrix-memory-layout= row-major and
  column-major flags
- Add the new DXC driver flags to Options.td
- Error in the HLSL toolchain when both flags are specified
- Add the new error diagnostic to DiagnosticDriverKinds.td
- Propagate the flag via the Clang toolchain
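
The flag handling listed above can be sketched as follows, together with what the two layouts mean for element addressing. `MatrixLayout`, `resolveLayout`, and `flatIndex` are hypothetical names for this illustration; they are not the Clang driver code, and the behavior when neither flag is given is left as `Unspecified` rather than assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

enum class MatrixLayout { Unspecified, RowMajor, ColumnMajor };

// Hypothetical helper: maps the two packing flags to a layout.
// Returns std::nullopt when both are given, modelling the driver error.
inline std::optional<MatrixLayout> resolveLayout(bool hasZpr, bool hasZpc) {
  if (hasZpr && hasZpc) return std::nullopt;
  if (hasZpr) return MatrixLayout::RowMajor;
  if (hasZpc) return MatrixLayout::ColumnMajor;
  return MatrixLayout::Unspecified;
}

// Flat offset of element (row, col) in a rows x cols matrix for a concrete layout.
inline std::size_t flatIndex(MatrixLayout layout, std::size_t row, std::size_t col,
                             std::size_t rows, std::size_t cols) {
  assert(layout != MatrixLayout::Unspecified);
  assert(row < rows && col < cols);
  return layout == MatrixLayout::RowMajor ? row * cols + col : col * rows + row;
}
```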
Delta File
+18-0 clang/lib/Driver/ToolChains/HLSL.cpp
+8-0 clang/test/Driver/hlsl_matrix_pack_order.hlsl
+2-0 clang/include/clang/Basic/DiagnosticDriverKinds.td
+2-0 clang/include/clang/Options/Options.td
+1-0 clang/lib/Driver/ToolChains/Clang.cpp
+31-0 5 files

LLVM/project e13998f utils/bazel/llvm-project-overlay/llvm/include/llvm/Config llvm-config.h, utils/bazel/llvm_configs llvm-config.h.cmake

[bazel] Port 8e999e3d7857ce131d03bab4fd5c42b0e8edd980 (#171946)

Added a new preprocessor macro to llvm-config.h which needs to be
reflected on the bazel side.
Delta File
+3-0 utils/bazel/llvm-project-overlay/llvm/include/llvm/Config/llvm-config.h
+3-0 utils/bazel/llvm_configs/llvm-config.h.cmake
+6-0 2 files

LLVM/project a1b3586 flang/docs OptionComparison.md

[flang][docs] Remove stale inline links to Intel and IBM compiler option

Remove all inline links to Intel and IBM compiler options from the
comparison tables, as these links have become stale (Intel links
redirect to generic pages, IBM links redirect to PDF-only pages).

Option names are preserved for readability. The Data sources section
still contains links to the main documentation pages.

Details:

- Removed 43 Intel compiler option links
- Removed 35 IBM compiler option links
- Removed 2 stale links in notes section
- Updated documentation text accordingly

Fixes #171464

---------

Co-authored-by: Tarun Prabhu <tarun at lanl.gov>
Delta File
+90-90 flang/docs/OptionComparison.md
+90-90 1 files

LLVM/project e760d06 llvm/lib/Target/AMDGPU AMDGPUPromoteAlloca.cpp, llvm/test/CodeGen/AMDGPU promote-alloca-scoring.ll promote-alloca-negative-index.ll

AMDGPU/PromoteAlloca: Refactor into analysis / commit phases (#170512)

This change is motivated by the overall goal of finding alternative ways
to promote allocas to VGPRs. The current solution is effectively limited
to allocas whose size matches a register class, and we can't keep adding
more register classes. We have some downstream work in this direction,
and I'm currently looking at cleaning that up to bring it upstream.

This refactor paves the way to adding a third way of promoting allocas,
on top of the existing alloca-to-vector and alloca-to-LDS. Much of the
analysis can be shared between the different promotion techniques.

Additionally, the idea behind splitting the pass into an analysis
phase and a commit phase is that it ought to allow us to more easily
make better "big picture" decisions about which allocas to promote, and
how, in the future.
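
The analysis/commit split can be pictured with a small sketch in which decisions are recorded first and applied afterwards. Every name here (`Plan`, `AllocaInfo`, `analyze`, `commit`, and the size thresholds) is hypothetical; the code does not mirror `AMDGPUPromoteAlloca.cpp`.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// All names are hypothetical; this is not the AMDGPUPromoteAlloca code.
enum class Plan { Keep, ToVector, ToLDS };

struct AllocaInfo {
  std::string name;
  unsigned sizeBytes;
  Plan plan = Plan::Keep;
  AllocaInfo(std::string n, unsigned s) : name(std::move(n)), sizeBytes(s) {}
};

// Analysis phase: record a decision for every candidate, change nothing else.
inline void analyze(std::vector<AllocaInfo>& allocas, unsigned vectorLimit,
                    unsigned ldsBudget) {
  for (AllocaInfo& a : allocas) {
    if (a.sizeBytes <= vectorLimit) {
      a.plan = Plan::ToVector;
    } else if (a.sizeBytes <= ldsBudget) {
      a.plan = Plan::ToLDS;
      ldsBudget -= a.sizeBytes;  // later candidates see the reduced budget
    }
  }
}

// Commit phase: apply the recorded decisions; returns a log of what was done.
inline std::vector<std::string> commit(const std::vector<AllocaInfo>& allocas) {
  std::vector<std::string> log;
  for (const AllocaInfo& a : allocas) {
    if (a.plan == Plan::ToVector) log.push_back(a.name + " -> vector");
    if (a.plan == Plan::ToLDS) log.push_back(a.name + " -> LDS");
  }
  return log;
}
```

Because nothing is rewritten until `commit` runs, the analysis step is free to look at all candidates together, for example to rank them, before any promotion becomes irreversible.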
Delta File
+347-304 llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
+34-30 llvm/test/CodeGen/AMDGPU/promote-alloca-scoring.ll
+2-4 llvm/test/CodeGen/AMDGPU/promote-alloca-negative-index.ll
+383-338 3 files

LLVM/project 70beea8 llvm/lib/Target/AMDGPU SIInsertWaitcnts.cpp, llvm/test/CodeGen/AMDGPU waitcnt-loop-ds-opt-eligible.mir

[AMDGPU] Add DS loop preheader flush (3/4)

Add insertDSPreheaderFlushes() to insert S_WAIT_DSCNT 0 in loop preheaders
when DS wait relaxation was applied.

Assisted-by: Cursor / claude-4.5-opus-high
Delta File
+67-0 llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+4-2 llvm/test/CodeGen/AMDGPU/waitcnt-loop-ds-opt-eligible.mir
+71-2 2 files

LLVM/project 0ef5d73 mlir/lib/Bindings/Python MainModule.cpp, mlir/python CMakeLists.txt

rebase
Delta File
+8-0 mlir/lib/Bindings/Python/MainModule.cpp
+2-5 mlir/python/CMakeLists.txt
+10-5 2 files