LLVM/project ad3d9fbllvm/lib/Target/RISCV RISCVSchedTTAscalonD8.td, llvm/test/tools/llvm-mca/RISCV/tt-ascalon-d8 vlseg-vsseg.s vdiv_vsqrt.s

[RISCV] tt-ascalon-d8 vector scheduling (#167066)

Add the vector scheduling model for tt-ascalon-d8 and corresponding
llvm-mca tests.

---------

Co-authored-by: Craig Topper <craig.topper at sifive.com>
DeltaFile
+4,734-0llvm/test/tools/llvm-mca/RISCV/tt-ascalon-d8/vlseg-vsseg.s
+1,016-0llvm/test/tools/llvm-mca/RISCV/tt-ascalon-d8/vdiv_vsqrt.s
+900-0llvm/test/tools/llvm-mca/RISCV/tt-ascalon-d8/vmv.s
+714-6llvm/lib/Target/RISCV/RISCVSchedTTAscalonD8.td
+595-0llvm/test/tools/llvm-mca/RISCV/tt-ascalon-d8/vlxe-vsxe.s
+549-0llvm/test/tools/llvm-mca/RISCV/tt-ascalon-d8/vle-vse-vlm.s
+8,508-67 files not shown
+9,672-5813 files

LLVM/project 0f941f6flang/lib/Optimizer/Transforms CUFOpConversion.cpp, flang/test/Fir/CUDA cuda-alloc-free.fir

[flang][cuda] Add support to allocate scalar character types (#169550)

Add support for scalar character variables declared like:

```
subroutine sub1()
  character*4, device :: b
end subroutine
```
DeltaFile
+20-0flang/test/Fir/CUDA/cuda-alloc-free.fir
+5-0flang/lib/Optimizer/Transforms/CUFOpConversion.cpp
+25-02 files

LLVM/project 3694798lldb/packages/Python/lldbsuite/test/tools/lldb-dap dap_server.py, lldb/test/API/tools/lldb-dap/evaluate TestDAP_evaluate.py

[lldb-dap] Add format support for evaluate request (#169132)

This patch adds support for the `format` option in the `evaluate` request
according to the
[DAP](https://microsoft.github.io/debug-adapter-protocol/specification#Requests_Evaluate)
specification. It also fixes a typo in the `LLDB_DAP_INVALID_VARRERF` constant.
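
For illustration, a minimal sketch of what an `evaluate` request carrying the
optional `format` argument could look like on the wire. The field names follow
the DAP specification (`format` is a `ValueFormat` object with a `hex` flag);
the helper `make_evaluate_request` is hypothetical and is not the API of
`dap_server.py`.

```python
import json


def make_evaluate_request(seq, expression, frame_id=None, hex_format=False):
    """Build a DAP `evaluate` request as a plain dict (hypothetical helper)."""
    arguments = {"expression": expression, "context": "repl"}
    if frame_id is not None:
        arguments["frameId"] = frame_id
    if hex_format:
        # `format` is optional; when present it is a ValueFormat object.
        arguments["format"] = {"hex": True}
    return {
        "seq": seq,
        "type": "request",
        "command": "evaluate",
        "arguments": arguments,
    }


request = make_evaluate_request(1, "my_var", frame_id=0, hex_format=True)
print(json.dumps(request, indent=2))
```

Omitting `hex_format` leaves the `format` key out entirely, which matches the
field being optional in the specification.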
DeltaFile
+10-1lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py
+10-1lldb/test/API/tools/lldb-dap/evaluate/TestDAP_evaluate.py
+5-3lldb/tools/lldb-dap/Handler/EvaluateRequestHandler.cpp
+2-2lldb/tools/lldb-dap/Protocol/ProtocolTypes.h
+27-74 files

LLVM/project af0fcf8mlir/include/mlir/TableGen Pattern.h, mlir/lib/TableGen Pattern.cpp

[mlir][tblgen] Don't echo absolute paths into rewrite pattern source (#168984)

Currently, the declarative pattern rewrite generator will always print
the [source]:[line](s) from which a pattern came. This is a useful
debugging hint, but it causes problems when absolute paths are used as
arguments to mlir-tblgen (which LLVM's build rules automatically do).
Specifically, it causes the source to be tied to the build location,
harming reproducibility and our collective ability to get ccache hits
from, say, separate worktrees.

This commit resolves the issue by replacing absolute paths in these
"Generated from:" comments with their filenames. (The alternative would
have been to implement an entire file-prefix-map the way the C compilers
do, but since this is an isolated incident, I chose to resolve it
locally.)
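
A rough sketch of the idea in Python (the actual change lives in the C++
TableGen code and is not reproduced here). The function names and the exact
comment layout are assumptions for illustration; only the "Generated from:"
prefix comes from the message above.

```python
import posixpath


def location_for_comment(path: str) -> str:
    """Return the path to echo into generated source (hypothetical helper).

    Absolute paths are reduced to their file name so the output no longer
    depends on where the tree is checked out; relative paths are kept.
    """
    if posixpath.isabs(path):
        return posixpath.basename(path)
    return path


def generated_from_comment(path: str, line: int) -> str:
    # The comment layout here is an assumption, not the generator's format.
    return f"/* Generated from: {location_for_comment(path)}:{line} */"


# Two worktrees of the same source now yield byte-identical comments.
a = generated_from_comment("/home/alice/wt1/mlir/Patterns.td", 42)
b = generated_from_comment("/home/bob/wt2/mlir/Patterns.td", 42)
print(a)
print(a == b)
```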
DeltaFile
+17-4mlir/lib/TableGen/Pattern.cpp
+4-2mlir/include/mlir/TableGen/Pattern.h
+1-1mlir/tools/mlir-tblgen/RewriterGen.cpp
+22-73 files

LLVM/project b2619beclang/tools/clang-scan-deps ClangScanDeps.cpp

[clang][deps][NFC] Replace a vector with an array

`ResourceDirectoryCache::findResourceDir` uses a `std::vector` when a `std::array` would do.
DeltaFile
+3-5clang/tools/clang-scan-deps/ClangScanDeps.cpp
+3-51 files

LLVM/project 0917a38llvm/lib/Target/PowerPC PPCISelLowering.cpp

[PowerPC] Fix a warning

This patch fixes:

  llvm/lib/Target/PowerPC/PPCISelLowering.cpp:15676:17: error: unused
  variable 'CC' [-Werror,-Wunused-variable]
DeltaFile
+2-1llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+2-11 files

LLVM/project 0be7741clang-tools-extra/clangd/test index-tools.test

rebase

Created using spr 1.3.7
DeltaFile
+2-0clang-tools-extra/clangd/test/index-tools.test
+2-01 files

LLVM/project d4526fbclang-tools-extra/clangd/test index-tools.test

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.7

[skip ci]
DeltaFile
+2-0clang-tools-extra/clangd/test/index-tools.test
+2-01 files

LLVM/project 2fd03c2clang-tools-extra/clangd/test index-tools.test

rebase

Created using spr 1.3.7
DeltaFile
+2-0clang-tools-extra/clangd/test/index-tools.test
+2-01 files

LLVM/project 6c48fbcbolt/test lit.local.cfg, bolt/test/X86 lit.local.cfg

[BOLT][Tests] Use AT&T assembler syntax only for X86 tests (#169541)

Enabling AT&T syntax for all tests breaks when the X86 target is not
enabled, as reported in #167225.
DeltaFile
+1-1bolt/test/X86/lit.local.cfg
+1-1bolt/test/lit.local.cfg
+2-22 files

LLVM/project ebe4006mlir/lib/Dialect/GPU/Pipelines CMakeLists.txt

[mlir] Fix build failure with BUILD_SHARED_LIBS=ON

/usr/bin/ld: tools/mlir/lib/Dialect/GPU/Pipelines/CMakeFiles/obj.MLIRGPUPipelines.dir/GPUToXeVMPipeline.cpp.o: in function `mlir::gpu::buildLowerToXeVMPassPipeline(mlir::OpPassManager&, mlir::gpu::GPUToXeVMPipelineOptions const&)':
GPUToXeVMPipeline.cpp:(.text._ZN4mlir3gpu28buildLowerToXeVMPassPipelineERNS_13OpPassManagerERKNS0_24GPUToXeVMPipelineOptionsE+0x1293): undefined reference to `mlir::createConvertVectorToLLVMPass()'
DeltaFile
+1-0mlir/lib/Dialect/GPU/Pipelines/CMakeLists.txt
+1-01 files

LLVM/project 38948b4llvm/lib/Target/AArch64 AArch64AsmPrinter.cpp AArch64InstrInfo.h

[AArch64][PAC] Factor out printing real AUT/PAC/BLRA encodings (NFC)

Separate the low-level emission of the appropriate variants of `AUT*`,
`PAC*` and `B(L)RA*` instructions from the high-level logic of pseudo
instruction expansion.

Introduce `getBranchOpcodeForKey` helper function by analogy to
`get(AUT|PAC)OpcodeForKey`.
DeltaFile
+70-105llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
+18-0llvm/lib/Target/AArch64/AArch64InstrInfo.h
+88-1052 files

LLVM/project 5961ae2llvm/lib/Target/AArch64 AArch64AsmPrinter.cpp

Update the comments, rename MayUseAddrAsScratch
DeltaFile
+15-9llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
+15-91 files

LLVM/project f936493llvm/lib/Target/AArch64 AArch64AsmPrinter.cpp

[AArch64][PAC] Cleanup AArch64AsmPrinter::emitPtrauthDiscriminator (NFC)

Refactor emitPtrauthDiscriminator function: introduce `isPtrauthRegSafe`
function, update the comments and assertions for readability.
DeltaFile
+25-21llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
+25-211 files

LLVM/project eab23e1llvm/lib/Target/RISCV RISCVRegisterInfo.cpp

[RISCV] Don't add Zilsd pairing hints if other part of the pair is reserved. (#169538)

DeltaFile
+3-5llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
+3-51 files

LLVM/project 3a27fc4llvm/lib/Target/RISCV RISCVInsertVSETVLI.cpp

[RISCV] Omit VTYPE in VSETVLIInfo::print() when state is uninit or unknown. (#169459)

DeltaFile
+19-17llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
+19-171 files

LLVM/project 2ee12f1llvm/lib/Target/AMDGPU AMDGPUInstructionSelector.cpp AMDGPUISelDAGToDAG.cpp, llvm/test/CodeGen/AMDGPU gws_agpr.ll verify-ds-gws-align.mir

AMDGPU: Use RegClassByHwMode to manage GWS operand special case (#169373)

On targets that require even aligned 64-bit VGPRs, GWS operands
require even alignment of a 32-bit operand. Previously we had a hacky
post-processing which added an implicit operand to try to manage
the constraint. This would require special casing in other passes
to avoid breaking the operand constraint. This moves the handling
into the instruction definition, so other passes no longer need
to consider this edge case. MC still does need to special case this,
to print/parse as a 32-bit register. This also still ends up net
less work than introducing even aligned 32-bit register classes.

This also should be applied to the image special case.
DeltaFile
+108-234llvm/test/CodeGen/AMDGPU/gws_agpr.ll
+41-9llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
+31-2llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
+31-2llvm/test/CodeGen/AMDGPU/verify-ds-gws-align.mir
+32-0llvm/test/MC/AMDGPU/ds_gws_sgpr_err.s
+15-15llvm/test/MC/AMDGPU/gfx90a_ldst_acc.s
+258-2627 files not shown
+302-26513 files

LLVM/project 20ca85blld/MachO InputFiles.cpp, lld/MachO/Arch X86_64.cpp

[lld] macho: Support section branch relocations, including the 1-byte form (#169062)

I noticed that we had a hardcoded value of 4 for the pcrel section
relocations, which seems like an issue given that we recently added
support for 1-byte branch relocations in
https://github.com/llvm/llvm-project/pull/164439. The code included an
assert that the relevant relocation had the BYTE4 attribute, but that is
actually not enough to use a hardcoded value of 4: we need to assert
that the *other* `BYTE<n>` attributes are not set either.

However, since we did not support local branch relocations, that doesn't
seem to have mattered in practice. That said, local branch relocations
can be emitted by compilers, and ld64 does handle the 4-byte version of
them, so I've added support for it here.

ld64 actually seems to reject 1-byte section relocations, so the
questionable code is actually probably fine (minus the incorrect
assert). So we have two options: add an equivalent check in LLD, or just
support 1-byte local branch relocations. Supporting it actually requires
less code, so I've gone with that option here.
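
To make the point about the assert concrete, here is a small Python sketch. It
is not LLD's code: `RelocAttr`, `allowed_widths`, and `pcrel_width` are
hypothetical stand-ins modeled on the `BYTE<n>` attributes mentioned above. The
one external fact used is that the Mach-O `r_length` field encodes the
relocated field's width as a power of two (`1 << r_length` bytes).

```python
from enum import IntFlag


class RelocAttr(IntFlag):
    # Hypothetical attribute bits, for illustration only.
    PCREL = 1
    BYTE1 = 2
    BYTE4 = 4
    BYTE8 = 8


BYTE_WIDTHS = {RelocAttr.BYTE1: 1, RelocAttr.BYTE4: 4, RelocAttr.BYTE8: 8}


def allowed_widths(attrs):
    """List every field width (in bytes) the relocation type permits."""
    return sorted(w for flag, w in BYTE_WIDTHS.items() if attrs & flag)


def only_four_byte(attrs):
    # Stricter check: BYTE4 is set and no other BYTE<n> attribute is.
    return allowed_widths(attrs) == [4]


def pcrel_width(attrs, r_length):
    """Derive the width from r_length instead of hardcoding 4."""
    width = 1 << r_length
    if width not in allowed_widths(attrs):
        raise ValueError(f"width {width} not allowed for this relocation type")
    return width


branch = RelocAttr.PCREL | RelocAttr.BYTE1 | RelocAttr.BYTE4
print(bool(branch & RelocAttr.BYTE4))  # the weaker check passes
print(only_four_byte(branch))          # the stricter check does not
print(pcrel_width(branch, 0), pcrel_width(branch, 2))
```

Deriving the width from `r_length` covers both the 1-byte and 4-byte forms
without needing a separate rejection path.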
DeltaFile
+17-1lld/test/MachO/x86-64-relocs.s
+2-2lld/MachO/InputFiles.cpp
+2-1lld/MachO/Arch/X86_64.cpp
+21-43 files

LLVM/project 622dbb3llvm/test/CodeGen/AMDGPU constant-address-space-32bit.ll

AMDGPU: Add more tests for 32-bit constant address space (#168976)

The sub-dword cases just assert now, so comment those out.
DeltaFile
+1,560-19llvm/test/CodeGen/AMDGPU/constant-address-space-32bit.ll
+1,560-191 files

LLVM/project 2d78b14clang/include/clang/Basic OpenMPKinds.def OpenMPKinds.h, clang/lib/Parse ParseOpenMP.cpp

[OpenMP][Clang] Parsing/Sema support for `need_device_ptr(fb_nullify/fb_preserve)`. (#168905)

This patch adds parsing, semantic handling, and diagnostics for the
OpenMP 6.1 `fb_nullify` and `fb_preserve` fallback modifiers used with
the `need_device_ptr` map modifier.
DeltaFile
+31-0clang/lib/Parse/ParseOpenMP.cpp
+26-0clang/test/OpenMP/need_device_ptr_kind_messages.cpp
+24-0clang/test/OpenMP/need_device_ptr_kind_ast_print.cpp
+8-0clang/include/clang/Basic/OpenMPKinds.def
+7-0clang/include/clang/Basic/OpenMPKinds.h
+4-0clang/include/clang/Basic/DiagnosticParseKinds.td
+100-01 files not shown
+102-07 files

LLVM/project a8e0afeclang/lib/CIR/CodeGen CIRGenExpr.cpp CIRGenFunction.h, clang/test/CIR/CodeGen vector-ext-element.cpp

[CIR] ArraySubscriptExpr on ExtVectorElementExpr (#169158)

Implement ArraySubscriptExpr support for ExtVectorElementExpr
DeltaFile
+43-10clang/lib/CIR/CodeGen/CIRGenExpr.cpp
+24-0clang/test/CIR/CodeGen/vector-ext-element.cpp
+2-0clang/lib/CIR/CodeGen/CIRGenFunction.h
+69-103 files

LLVM/project dce95b2clang/lib/CIR/CodeGen CIRGenStmtOpenACC.cpp CIRGenOpenACCClause.cpp

[OpenACC][CIR][NFC] Remove 'NYI' diagnostics, since we're done with these (#169543)

We've finished all of the clauses/etc that we're going to use this
visitor for, so we can remove the SourceLocation we used just for that,
and replace all NYI with unreachables.
DeltaFile
+25-46clang/lib/CIR/CodeGen/CIRGenStmtOpenACC.cpp
+21-48clang/lib/CIR/CodeGen/CIRGenOpenACCClause.cpp
+7-9clang/lib/CIR/CodeGen/CIRGenFunction.h
+1-2clang/lib/CIR/CodeGen/CIRGenDeclOpenACC.cpp
+1-2clang/lib/CIR/CodeGen/CIRGenStmtOpenACCLoop.cpp
+55-1075 files

LLVM/project 6c8ff4fllvm/lib/Target/NVPTX NVPTXISelLowering.cpp

[NVPTX] Fix maybe unused variable in 17852ded (#169542)

DeltaFile
+1-1llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
+1-11 files

LLVM/project 8f1bb92llvm/test/CodeGen/NVPTX masked-store-vectors-256.ll masked-load-vectors.ll

[NVPTX] Fix lit test issues from masked load/store implementation (#169535)

The build was broken by this commit:
https://github.com/llvm/llvm-project/commit/17852deda7fb9dabb41023e2673025c630b9369d
as seen here:
https://lab.llvm.org/buildbot/#/builders/155/builds/15135/steps/7/logs/stdio
I think this should fix things.
DeltaFile
+3-3llvm/test/CodeGen/NVPTX/masked-store-vectors-256.ll
+3-3llvm/test/CodeGen/NVPTX/masked-load-vectors.ll
+1-1llvm/test/CodeGen/NVPTX/masked-store-variable-mask.ll
+7-73 files

LLVM/project 5c6608cllvm/lib/Target/AMDGPU AMDGPUInstructionSelector.cpp AMDGPUISelDAGToDAG.cpp, llvm/test/CodeGen/AMDGPU gws_agpr.ll verify-ds-gws-align.mir

AMDGPU: Use RegClassByHwMode to manage GWS operand special case

On targets that require even aligned 64-bit VGPRs, GWS operands
require even alignment of a 32-bit operand. Previously we had a hacky
post-processing which added an implicit operand to try to manage
the constraint. This would require special casing in other passes
to avoid breaking the operand constraint. This moves the handling
into the instruction definition, so other passes no longer need
to consider this edge case. MC still does need to special case this,
to print/parse as a 32-bit register. This also still ends up net
less work than introducing even aligned 32-bit register classes.

This also should be applied to the image special case.
DeltaFile
+108-234llvm/test/CodeGen/AMDGPU/gws_agpr.ll
+41-9llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
+31-2llvm/test/CodeGen/AMDGPU/verify-ds-gws-align.mir
+31-2llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
+32-0llvm/test/MC/AMDGPU/ds_gws_sgpr_err.s
+15-15llvm/test/MC/AMDGPU/gfx90a_ldst_acc.s
+258-2627 files not shown
+302-26513 files

LLVM/project 1d30ae6llvm/lib/Target/AMDGPU AMDGPUTargetMachine.cpp

AMDGPU: Stop forcing RequiresCodeGenSCCOrder (#169522)

This hasn't been strictly necessary since c897c13dde.
Practically this makes little difference; we still enable IPRA
by default which implies this option. By removing this explicit
force, -enable-ipra=0 has the expected change in the pass pipeline
to remove the DummyCGSCC runs.
DeltaFile
+0-4llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+0-41 files

LLVM/project d7dcc10llvm/docs DeveloperPolicy.rst, llvm/utils/git github-automation.py

[GitHub] Add review instructions for commit access requests (#168971)

As discussed in
https://discourse.llvm.org/t/clarification-on-how-to-accept-commit-access-requests/88728,
clarify reviewer instructions for how to accept commit access requests.
DeltaFile
+4-1llvm/docs/DeveloperPolicy.rst
+2-0llvm/utils/git/github-automation.py
+6-12 files

LLVM/project 8cbaffcllvm/lib/Target/AMDGPU AMDGPULegalizerInfo.cpp, llvm/test/CodeGen/AMDGPU codegen-prepare-addrspacecast-non-null.ll

AMDGPU: Try to use zext to implement constant-32-bit addrspacecast

If the high bits are assumed 0 for the cast, use zext. Previously
we would emit a build_vector and a bitcast with the high element
as 0. The zext is more easily optimized. I'm less convinced this is
good for globalisel, since you still need to have the inttoptr back
to the original pointer type.

The default value is 0, though I'm not sure if this is meaningful
in the real world. The real uses might always override the high
bit value with the attribute.
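
As a sanity check of the equivalence this relies on, a small Python sketch:
building a two-element 32-bit vector with a zero high element and
reinterpreting it as a 64-bit integer gives the same value as zero-extending
the low element. The function names are illustrative, and little-endian
element order is assumed.

```python
import struct


def via_build_vector(lo32, hi32=0):
    """Pack <lo, hi> as two 32-bit elements, then reinterpret as one i64."""
    return struct.unpack("<Q", struct.pack("<II", lo32, hi32))[0]


def via_zext(lo32):
    """Zero-extend a 32-bit value to 64 bits."""
    return lo32 & 0xFFFFFFFF


samples = [0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]
print(all(via_build_vector(v) == via_zext(v) for v in samples))
```

With a non-zero high element the two forms differ, which is why the zext form
only applies when the high bits are assumed to be 0.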
DeltaFile
+24-24llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-zextload-constant-32bit.mir
+18-18llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-sextload-constant-32bit.mir
+16-16llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-constant-32bit.mir
+18-9llvm/test/CodeGen/AMDGPU/codegen-prepare-addrspacecast-non-null.ll
+6-6llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-addrspacecast.mir
+8-2llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+90-752 files not shown
+95-778 files

LLVM/project 882ea7allvm/test/CodeGen/AMDGPU constant-address-space-32bit.ll

AMDGPU: Add more tests for 32-bit constant address space

The sub-dword cases just assert now, so comment those out.
DeltaFile
+1,560-19llvm/test/CodeGen/AMDGPU/constant-address-space-32bit.ll
+1,560-191 files

LLVM/project 0c9c62allvm/lib/Target/PowerPC PPCISelLowering.cpp, llvm/test/CodeGen/PowerPC memCmpUsedInZeroEqualityComparison.ll

[PowerPC] Convert `(setcc (and X, 1), 0, eq)` to `XORI (and X, 1), 1` (#168384)

Convert `(setcc (and X, 1), 0, eq)` to `XORI (and X, 1), 1`, which saves one instruction.
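
The identity behind the rewrite can be checked with a short Python sketch. The
function names are illustrative and model only the arithmetic, not the
SelectionDAG nodes.

```python
def setcc_eq_zero(x):
    """Model of (setcc (and X, 1), 0, eq): 1 if the low bit is clear, else 0."""
    return int((x & 1) == 0)


def xori_form(x):
    """Model of XORI (and X, 1), 1: flip the isolated low bit."""
    return (x & 1) ^ 1


# (and X, 1) is always 0 or 1, so comparing it with 0 equals flipping it.
print(all(setcc_eq_zero(x) == xori_form(x) for x in range(-16, 17)))
```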
DeltaFile
+69-0llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+1-2llvm/test/CodeGen/PowerPC/memCmpUsedInZeroEqualityComparison.ll
+70-22 files