LLVM/project 281f4eallvm/utils/gn/secondary/clang-tools-extra/clangd/test BUILD.gn

[gn] port f65c19982d2a
DeltaFile
+1-0llvm/utils/gn/secondary/clang-tools-extra/clangd/test/BUILD.gn
+1-01 files

LLVM/project 4c4236eopenmp/runtime/src kmp_ftn_entry.h

fix formatting
DeltaFile
+2-2openmp/runtime/src/kmp_ftn_entry.h
+2-21 files

LLVM/project fd8bf3clldb/source/Plugins/ScriptInterpreter/Python/Interfaces ScriptedPythonInterface.h

[lldb/ScriptInterpreter] Fix typo in AbstractMethodCheckerPayload (NFC) (#170187)

This fixes a typo in ScriptedPythonInterface and changes
`AbstrackMethodCheckerPayload` to `AbstractMethodCheckerPayload`.

Signed-off-by: Med Ismail Bennani <ismail at bennani.ma>
DeltaFile
+6-6lldb/source/Plugins/ScriptInterpreter/Python/Interfaces/ScriptedPythonInterface.h
+6-61 files

LLVM/project 64c9a4fopenmp/runtime/src kmp_ftn_entry.h exports_so.txt

use KMP_EXPAND_NAME and add versioned symbol
DeltaFile
+10-4openmp/runtime/src/kmp_ftn_entry.h
+2-0openmp/runtime/src/exports_so.txt
+2-0openmp/runtime/src/exports_test_so.txt
+14-43 files

LLVM/project 6397e2fllvm/lib/Target/BPF BPFISelLowering.cpp BPFISelLowering.h, llvm/test/CodeGen/BPF builtin_calls.ll atomic-oversize.ll

Revert "[BPF] Allow libcalls behind a feature gate (#168442)" (#169733)

**Problem**

As mentioned in
https://github.com/llvm/llvm-project/pull/168442#pullrequestreview-3501502448,
#168442 is not the right solution for the problem.

I'll follow @arsenm's suggestions, starting with
https://github.com/llvm/llvm-project/pull/169537, to properly allow safe
math operations and i128 support in BPF.

**Solution**

Revert #168442.
DeltaFile
+0-39llvm/test/CodeGen/BPF/builtin_calls.ll
+3-20llvm/lib/Target/BPF/BPFISelLowering.cpp
+0-10llvm/lib/Target/BPF/BPFISelLowering.h
+0-4llvm/lib/Target/BPF/BPF.td
+0-3llvm/lib/Target/BPF/BPFSubtarget.h
+2-0llvm/test/CodeGen/BPF/atomic-oversize.ll
+5-763 files not shown
+7-799 files

LLVM/project d200a51flang/lib/Optimizer/Transforms CUFOpConversion.cpp, flang/lib/Optimizer/Transforms/CUDA CUFAllocationConversion.cpp

Merge branch 'main' into users/rampitec/11-17-_amdgpu_allow_hazard_checks_for_wmma_co-exec
DeltaFile
+468-0flang/lib/Optimizer/Transforms/CUDA/CUFAllocationConversion.cpp
+14-375flang/lib/Optimizer/Transforms/CUFOpConversion.cpp
+94-100llvm/test/CodeGen/X86/combine-fround.ll
+66-42lldb/source/Host/windows/ProcessLauncherWindows.cpp
+52-37llvm/test/CodeGen/X86/combine-ffloor.ll
+76-0lldb/test/Shell/SymbolFile/NativePDB/find-pdb-next-to-exe.test
+770-55421 files not shown
+1,067-60027 files

LLVM/project 31d0e47flang/lib/Lower/OpenMP ClauseProcessor.cpp, flang/test/Integration/OpenMP map-types-and-sizes.f90

Update test.
DeltaFile
+38-24offload/test/offloading/fortran/target-is-device-ptr.f90
+9-2mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+4-1flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+1-1flang/test/Integration/OpenMP/map-types-and-sizes.f90
+52-284 files

LLVM/project e6ae246llvm/lib/Target/AMDGPU GCNHazardRecognizer.cpp GCNHazardRecognizer.h

[AMDGPU] Refactor hazard recognizer for VALU-pipeline hazards. NFCI. (#168801)

This is in preparation for handling these in the scheduler. I do not
expect any changes to the produced code here; it is just infrastructure.
Our current problem with the VALU pipeline hazards is that we only
insert V_NOP instructions in the hazard recognizer mode, but ignore
the hazards during scheduling. This patch is meant to create a mechanism
to actually account for them during scheduling.
DeltaFile
+47-38llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
+8-1llvm/lib/Target/AMDGPU/GCNHazardRecognizer.h
+55-392 files

LLVM/project 0ff0f52llvm/include/llvm/DebugInfo/DWARF DWARFAcceleratorTable.h, llvm/lib/DebugInfo/DWARF DWARFAcceleratorTable.cpp

Fix __apple_XXX iterator that iterates over all entries. (#157538)

The previous iterator for __apple_XXX sections was assuming that all
entries in the table would be contiguous and it wasn't using the offsets
table to access each chain of entries for a given name. This patch fixes
it so the iterator does the right thing.

This issue became apparent after a modification to strip template names
from DW_AT_name entries, which allows adding both the template class base
name and the full name with template arguments as entries. The
commit hash is 2e7ee4dc21430b0fe4c9ee306dc1d8c7986a6646. The problem is
that if the name starts with a "<", it will still try to split the name.
So if the name is `"<get-size>"`, it will return an empty string as the
function name, and this empty string gets added to the __apple_names
table and causes large delays when using the iterators.
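
How the empty name arises can be illustrated with a minimal sketch. The helper below is hypothetical (its name, signature, and the guard are assumptions, not the LLVM implementation), and the patch itself fixes the iterator rather than the splitting; the sketch only shows why a name that begins with "<" has no usable base name:

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper, not the LLVM code: returns the part of `name` that
// precedes the first '<', or std::nullopt when there is no base name worth
// adding to an accelerator table.
std::optional<std::string_view>
baseNameBeforeTemplateArgs(std::string_view name) {
  const std::size_t pos = name.find('<');
  if (pos == std::string_view::npos)
    return std::nullopt; // No '<' at all: nothing to strip.
  if (pos == 0)
    return std::nullopt; // "<get-size>" would otherwise yield "".
  return name.substr(0, pos);
}
```

With this guard, `baseNameBeforeTemplateArgs("vector<int>")` yields `vector`, while `baseNameBeforeTemplateArgs("<get-size>")` yields no value instead of an empty string.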
DeltaFile
+21-8llvm/include/llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h
+18-6llvm/lib/DebugInfo/DWARF/DWARFAcceleratorTable.cpp
+39-142 files

LLVM/project 9edbf83lldb/source/Host/windows ProcessLauncherWindows.cpp

[lldb][windows] fix environment handling in CreateProcessW setup (#168733)

This patch refactors and documents the setup of the `CreateProcessW`
invocation in `ProcessLauncherWindows`. It's a dependency of
https://github.com/llvm/llvm-project/pull/168729.

`CreateEnvironmentBufferW` now sorts the environment variable keys
before concatenating them into a string. From [the `CreateProcess`
documentation](https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-createprocessw):
> An application must manually pass the current directory information to
the new process. To do so, the application must explicitly create these
environment variable strings, sort them alphabetically (because the
system uses a sorted environment), and put them into the environment
block. Typically, they will go at the front of the environment block,
due to the environment block sort order.

`GetFlattenedWindowsCommandStringW` now returns an error which will be
surfaced, instead of failing silently.
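
The sorting step described above can be sketched roughly as follows. This is a simplified illustration, not lldb's `CreateEnvironmentBufferW`: the function name, the container type, and the plain lexicographic comparison are all assumptions.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: flattens key/value pairs into a block of the form
// "KEY=VALUE\0KEY=VALUE\0\0", with entries sorted by key.
std::wstring buildEnvironmentBlock(
    std::vector<std::pair<std::wstring, std::wstring>> vars) {
  // Assumption: ordinary lexicographic ordering of the keys is enough for
  // illustration; a real implementation may need a different comparison.
  std::sort(vars.begin(), vars.end(),
            [](const auto &a, const auto &b) { return a.first < b.first; });

  std::wstring block;
  for (const auto &[key, value] : vars) {
    block += key;
    block += L'=';
    block += value;
    block += L'\0'; // Each entry is NUL-terminated.
  }
  block += L'\0'; // An extra NUL marks the end of the block.
  return block;
}
```

Because the keys are sorted before concatenation, entries whose keys start with `=` (such as the per-drive current directory variables mentioned in the quoted documentation) naturally end up ahead of alphabetic keys.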


    [2 lines not shown]
DeltaFile
+66-42lldb/source/Host/windows/ProcessLauncherWindows.cpp
+66-421 files

LLVM/project b73385dcompiler-rt/cmake config-ix.cmake

[TySan] Attempt to unbreak build after #169036

If tysan was not in COMPILER_RT_SANITIZERS_TO_BUILD, we used to
get an error after #169036, see comments there for details.
DeltaFile
+1-0compiler-rt/cmake/config-ix.cmake
+1-01 files

LLVM/project f65c199clang-tools-extra/clangd/test index-tools.test include-cleaner-batch-fix.test

Reapply "[clangd] Make lit tests work with the internal shell" (#169972)

This reverts commit bd04ef6df50e8e6e5212762fc798ea9fbdcfc897.
    
This reapply fixes the broken case where we would fail at CMake
configuration time if LLVM_INCLUDE_BENCHMARKS was explicitly turned off.
DeltaFile
+5-4clang-tools-extra/clangd/test/index-tools.test
+3-1clang-tools-extra/clangd/test/include-cleaner-batch-fix.test
+4-0clang-tools-extra/clangd/test/CMakeLists.txt
+4-0clang-tools-extra/clangd/test/lit.cfg.py
+2-1clang-tools-extra/clangd/test/system-include-extractor.test
+1-0clang-tools-extra/clangd/test/lit.site.cfg.py.in
+19-66 files

LLVM/project 4a48740clang/lib/CodeGen CGExpr.cpp, clang/test/CodeGenHLSL BoolVector.hlsl

[HLSL] Update indexed vector elements individually (#169144)

When an individual element of a vector is updated via indexing into the vector, it needs to be handled as a store operation on that one vector element.

Clang treats vectors as one unit, so when a vector element needs to be updated, the whole vector is loaded, the element is modified, and then the whole vector is stored. In HLSL, vector elements are handled individually. We need to avoid this load/modify/store sequence to prevent overwriting other vector elements that might be getting updated in parallel.

Fixes #167729

Contributes to #160208.
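
The hazard described above can be illustrated with a small sequential simulation. This is a sketch in plain C++ rather than HLSL or generated IR, and every name in it is hypothetical; it only shows why a whole-vector store can discard another writer's update while a per-element store cannot.

```cpp
#include <array>
#include <cstddef>

using Vec4 = std::array<int, 4>;

// Whole-vector update: `snapshot` is the value the writer loaded earlier.
// One lane is changed and then all four lanes are written back.
void storeWholeVector(Vec4 &memory, Vec4 snapshot, std::size_t lane,
                      int value) {
  snapshot[lane] = value;
  memory = snapshot; // Also rewrites lanes that may be stale by now.
}

// Per-element update: only the addressed lane is written.
void storeElement(Vec4 &memory, std::size_t lane, int value) {
  memory[lane] = value;
}

// Two writers both load the vector before either of them stores.
Vec4 interleavedWholeVectorStores() {
  Vec4 memory{0, 0, 0, 0};
  const Vec4 loadedByA = memory;
  const Vec4 loadedByB = memory;
  storeWholeVector(memory, loadedByA, 0, 1);
  storeWholeVector(memory, loadedByB, 1, 2); // Lane 0 reverts to stale 0.
  return memory;
}

// The same two updates expressed as element stores.
Vec4 interleavedElementStores() {
  Vec4 memory{0, 0, 0, 0};
  storeElement(memory, 0, 1);
  storeElement(memory, 1, 2);
  return memory;
}
```

In the whole-vector variant the second store writes back a stale copy of lane 0, so the first writer's update is lost; in the element variant both updates survive.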
DeltaFile
+41-0clang/test/CodeGenHLSL/builtins/VectorElementStore.hlsl
+26-0clang/lib/CodeGen/CGExpr.cpp
+6-9clang/test/CodeGenHLSL/BoolVector.hlsl
+4-2clang/test/CodeGenHLSL/builtins/lit.hlsl
+77-114 files

LLVM/project 56d061clibcxx/include optional

[libc++][NFC] Add optional<T&> synopsis (#170043)

DeltaFile
+65-0libcxx/include/optional
+65-01 files

LLVM/project c103d61lldb/source/Core Statusline.cpp, lldb/test/API/functionalities/statusline TestStatusline.py

[lldb] Fix a bug when disabling the statusline. (#169127)

Currently, disabling the statusline with `settings set show-statusline
false` leaves LLDB in a broken state. The same is true when trying to
toggle the setting again.

The issue was that setting the scroll window to 0 is apparently not
identical to setting it to the correct number of rows, even though some
documentation online incorrectly claims so.

Fixes #166608
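
As a rough illustration of the difference, assuming the scroll window is controlled with the standard DECSTBM escape sequence (`ESC [ top ; bottom r`), the reset can name the last row explicitly instead of passing 0. The helper names below are hypothetical and lldb's actual code may look different:

```cpp
#include <string>

// DECSTBM ("set top and bottom margins"): ESC [ <top> ; <bottom> r.
std::string setScrollRegion(unsigned top, unsigned bottom) {
  return "\x1b[" + std::to_string(top) + ";" + std::to_string(bottom) + "r";
}

// Restores scrolling to the full screen by giving the real row count
// rather than relying on 0 being treated as "the whole window".
std::string resetScrollRegion(unsigned terminalRows) {
  return setScrollRegion(1, terminalRows);
}
```

For a 24-row terminal, `resetScrollRegion(24)` produces the sequence `ESC [ 1 ; 24 r`.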
DeltaFile
+5-4lldb/source/Core/Statusline.cpp
+4-2lldb/test/API/functionalities/statusline/TestStatusline.py
+9-62 files

LLVM/project c9d9dddlldb/source/Plugins/SymbolFile/NativePDB SymbolFileNativePDB.cpp, lldb/test/Shell/SymbolFile/NativePDB find-pdb-next-to-exe.test

[LLDB][NativePDB] Look for PDBs in `target.debug-file-search-paths` (#169719)

Similar to DWARF's DWO, we should look for PDBs in
`target.debug-file-search-paths` if the PDB isn't at the original
location or next to the executable.

With this PR, the search order is as follows:

1. PDB path specified in the PE/COFF file
2. Next to the executable
3. In `target.debug-file-search-paths`

This roughly matches [the order Visual Studio
uses](https://learn.microsoft.com/en-us/visualstudio/debugger/specify-symbol-dot-pdb-and-source-files-in-the-visual-studio-debugger?view=vs-2022#where-the-debugger-looks-for-symbols),
except that we don't have a project folder and don't support symbol
servers.

Closes #125355 (though I think this is already fixed in the native
plugin).
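
The search order listed above can be sketched as a simple candidate list. This is a hypothetical illustration, not the code in `SymbolFileNativePDB.cpp`: the function name, its parameters, and the injected `exists` predicate are assumptions made to keep the sketch self-contained and testable.

```cpp
#include <filesystem>
#include <functional>
#include <optional>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical sketch of the lookup order. `exists` is injected so the
// logic can be exercised without touching the real file system.
std::optional<fs::path>
findPdb(const fs::path &pdbPathFromExe, const fs::path &exePath,
        const std::vector<fs::path> &searchPaths,
        const std::function<bool(const fs::path &)> &exists) {
  const fs::path fileName = pdbPathFromExe.filename();

  std::vector<fs::path> candidates;
  candidates.push_back(pdbPathFromExe);                   // 1. Recorded path.
  candidates.push_back(exePath.parent_path() / fileName); // 2. Next to exe.
  for (const fs::path &dir : searchPaths)                 // 3. Search paths.
    candidates.push_back(dir / fileName);

  for (const fs::path &candidate : candidates)
    if (exists(candidate))
      return candidate;
  return std::nullopt;
}
```

Passing `[](const fs::path &p) { return fs::exists(p); }` as the predicate would consult the real file system; the first existing candidate wins, so earlier entries in the order take priority.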
DeltaFile
+76-0lldb/test/Shell/SymbolFile/NativePDB/find-pdb-next-to-exe.test
+43-9lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
+119-92 files

LLVM/project 21e64d1flang/include/flang/Optimizer/Transforms Passes.td, flang/include/flang/Optimizer/Transforms/CUDA CUFAllocationConversion.h

[flang][cuda][NFC] Split allocation related operation conversion from other cuf operations (#169740)

Split AllocOp, FreeOp, AllocateOp and DeallocateOp from the other
conversions. Patterns are currently added to the base CUFOpConversion
when the option is enabled.
This split is a prerequisite for being more flexible about where in the
pipeline the allocation-related operations are converted.
DeltaFile
+468-0flang/lib/Optimizer/Transforms/CUDA/CUFAllocationConversion.cpp
+14-375flang/lib/Optimizer/Transforms/CUFOpConversion.cpp
+33-0flang/include/flang/Optimizer/Transforms/CUDA/CUFAllocationConversion.h
+8-0flang/include/flang/Optimizer/Transforms/Passes.td
+1-0flang/lib/Optimizer/Transforms/CMakeLists.txt
+524-3755 files

LLVM/project d1899acllvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 combine-fround.ll combine-ffloor.ll

[X86] combineConcatVectorOps - add handling to concat ISD::FROUND/FFLOOR intrinsics together (#170176)

These were missed in #170160
DeltaFile
+94-100llvm/test/CodeGen/X86/combine-fround.ll
+52-37llvm/test/CodeGen/X86/combine-ffloor.ll
+2-0llvm/lib/Target/X86/X86ISelLowering.cpp
+148-1373 files

LLVM/project 6d5beb9llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeHelper.cpp AMDGPURegBankLegalizeRules.cpp

AMDGPU/GlobalISel: Report RegBankLegalize errors using reportGISelFailure

Use standard GlobalISel error reporting with reportGISelFailure
and have the pass return false instead of calling llvm_unreachable.
This also enables -global-isel-abort=0 or 2 for -global-isel -new-reg-bank-select.
Note: new-reg-bank-select with abort 0 or 2 runs LCSSA,
while the "intended use" without abort or with abort 1 does not run LCSSA.
DeltaFile
+47-23llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+11-16llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+6-3llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.h
+4-2llvm/lib/Target/AMDGPU/AMDGPURegBankLegalize.cpp
+2-2llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.h
+70-465 files

LLVM/project 7ab14fcllvm/lib/Target/AMDGPU GCNHazardRecognizer.cpp, llvm/test/CodeGen/AMDGPU misched-into-wmma-hazard-shadow.mir

[AMDGPU] Allow hazard checks for WMMA co-exec

Currently we just insert V_NOP instructions; instead, try to schedule
something into the shadow.

It is still somewhat imprecise, for example AdvanceCycle() will
use TII.getNumWaitStates() anyway, but in a scheduling mode
we are not required to be precise. We must be finally precise
in the hazard recognizer mode. Then EmittedInstrs buffer is also
limited to MaxLookAhead even though VALU only hazards may actually
never expire and require an endless buffer. But that's OK, we can
at least mitigate what the buffer can hold. The buffer is also
currently much bigger than any of VALU hazards may need.

That said, the rest of the 'fix*' functions here that use V_NOPs can be
changed the same way. This one is just the worst because it may require
up to 9 nops.
DeltaFile
+56-0llvm/test/CodeGen/AMDGPU/misched-into-wmma-hazard-shadow.mir
+6-0llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
+62-02 files

LLVM/project 88388ballvm/lib/Target/AMDGPU GCNHazardRecognizer.cpp GCNHazardRecognizer.h

[AMDGPU] Refactor hazard recognizer for VALU-pipeline hazards. NFCI.

This is in preparation for handling these in the scheduler. I do not
expect any changes to the produced code here; it is just infrastructure.
Our current problem with the VALU pipeline hazards is that we only
insert V_NOP instructions in the hazard recognizer mode, but ignore
the hazards during scheduling. This patch is meant to create a mechanism
to actually account for them during scheduling.
DeltaFile
+47-38llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
+8-1llvm/lib/Target/AMDGPU/GCNHazardRecognizer.h
+55-392 files

LLVM/project 9c641ballvm/include/llvm/CodeGen/GlobalISel Utils.h RegBankSelect.h, llvm/lib/CodeGen/GlobalISel Utils.cpp IRTranslator.cpp

GlobalISel: Stop using TPC to check if GlobalISelAbort is enabled

The new pass manager does not use TargetPassConfig.
GlobalISel requires TargetPassConfig for reportGISelFailure,
and its only actual use is to check if GlobalISelAbort is enabled.
TargetPassConfig uses TargetMachine to check if GlobalISelAbort is
enabled, but TargetMachine is also available from MachineFunction.
DeltaFile
+16-12llvm/lib/CodeGen/GlobalISel/Utils.cpp
+10-10llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
+6-9llvm/lib/CodeGen/GlobalISel/InstructionSelect.cpp
+5-4llvm/lib/CodeGen/GlobalISel/RegBankSelect.cpp
+3-3llvm/include/llvm/CodeGen/GlobalISel/Utils.h
+2-2llvm/include/llvm/CodeGen/GlobalISel/RegBankSelect.h
+42-404 files not shown
+45-4810 files

LLVM/project dec77e4llvm/lib/Transforms/Vectorize VPlanRecipes.cpp

[VPlan] Improve code in VPInstruction::generate (NFC) (#169470)

Make miscellaneous improvements including inlining some expressions and
re-using the existing State.Builder reference.
DeltaFile
+17-22llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+17-221 files

LLVM/project 61881c3clang/lib/CodeGen CGCUDARuntime.cpp, clang/lib/Sema SemaCUDA.cpp SemaDecl.cpp

[CUDA] Add device-side kernel launch support (#165519)

- CUDA's dynamic parallelism extension allows device-side kernel
launches, which share the same syntax as host-side launches, e.g.,

    kernel<<<Dg, Db, Ns, S>>>(arguments);

but differ in code generation. A device-side kernel launch is
eventually translated into the following sequence:

    config = cudaGetParameterBuffer(alignment, size);
    // setup arguments by copying them into `config`.
    cudaLaunchDevice(func, config, Dg, Db, Ns, S);

- To support the device-side kernel launch, 'CUDAKernelCallExpr' is
reused but its config expr is set to a call to 'cudaLaunchDevice'.
During code generation, 'CUDAKernelCallExpr' is expanded into the
aforementioned sequence.


    [2 lines not shown]
DeltaFile
+106-0clang/lib/CodeGen/CGCUDARuntime.cpp
+93-6clang/lib/Sema/SemaCUDA.cpp
+23-14clang/lib/Serialization/ASTWriter.cpp
+35-0clang/test/CodeGenCUDA/device-kernel-call.cu
+24-8clang/lib/Sema/SemaDecl.cpp
+8-18clang/test/SemaCUDA/function-overload.cu
+289-4616 files not shown
+409-6022 files

LLVM/project a7c1f46mlir/lib/Target/SPIRV/Deserialization Deserializer.cpp Deserializer.h, mlir/test/Target/SPIRV selection_switch.spvasm

[mlir][spirv] Enable block splitting for `spirv.Switch` (#170147)

This is not strictly necessary as now selection regions can yield
values, however splitting the block simplifies the code as it avoids
unnecessary values being sunk just to be later yielded.
DeltaFile
+69-0mlir/test/Target/SPIRV/selection_switch.spvasm
+7-7mlir/lib/Target/SPIRV/Deserialization/Deserializer.cpp
+5-5mlir/lib/Target/SPIRV/Deserialization/Deserializer.h
+81-123 files

LLVM/project 25ab47bllvm/lib/Transforms/Vectorize VPlanRecipes.cpp VPlanTransforms.cpp, llvm/test/Transforms/LoopVectorize/AArch64 pr60831-sve-inv-store-crash.ll

[VPlan] Use wide IV if scalar lanes > 0 are used with scalable vectors. (#169796)

For scalable vectors, VPScalarIVStepsRecipe cannot create all scalar
step values. At the moment, it creates a vector, in addition to the
first lane. The only supported case for this is when only the last lane
is used. A recipe should not set both scalar and vector values.

Instead, we can simply use a vector induction. It would also be possible
to preserve the current vector code-gen, by creating VPInstructions
based on the first lane of VPScalarIVStepsRecipe, but using a vector
induction seems simpler.

PR: https://github.com/llvm/llvm-project/pull/169796
DeltaFile
+58-99llvm/test/Transforms/LoopVectorize/AArch64/pr60831-sve-inv-store-crash.ll
+0-22llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+12-5llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+70-1263 files

LLVM/project 3d862cfllvm/include/llvm/CodeGen/GlobalISel LegalizerInfo.h, llvm/lib/CodeGen/GlobalISel LegalityPredicates.cpp

[SPIRV] Add legalization for long vectors (#169665)

This patch introduces the necessary infrastructure to legalize vector
operations on vectors that are longer than what the SPIR-V target
supports. For instance, shaders only support vectors up to 4 elements.

The legalization is done by splitting the long vectors into smaller
vectors of a legal size.
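
The splitting idea can be sketched on element counts alone. The helper below is hypothetical and is not taken from the patch; in particular, the greedy "full pieces first, remainder last" strategy is an assumption, and the actual legalization rules may divide vectors differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: breaks a vector of `numElements` lanes into pieces
// of at most `maxLegal` lanes each and returns the piece sizes in order.
std::vector<std::size_t> splitElementCount(std::size_t numElements,
                                           std::size_t maxLegal) {
  assert(maxLegal > 0 && "legal vector size must be non-zero");
  std::vector<std::size_t> pieces;
  for (std::size_t remaining = numElements; remaining > 0;) {
    const std::size_t piece = std::min(remaining, maxLegal);
    pieces.push_back(piece);
    remaining -= piece;
  }
  return pieces;
}
```

For a 10-element vector and a 4-element limit (the shader limit mentioned above), this yields pieces of 4, 4, and 2 elements.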

Specifically, this patch does the following:
- Introduces `vectorElementCountIsGreaterThan` and
  `vectorElementCountIsLessThanOrEqualTo` legality predicates.
- Adds legalization rules for `G_SHUFFLE_VECTOR`,
  `G_EXTRACT_VECTOR_ELT`, `G_BUILD_VECTOR`, `G_CONCAT_VECTORS`,
  `G_SPLAT_VECTOR`, and `G_UNMERGE_VALUES`.
- Handles `G_BITCAST` of long vectors by converting them to
  `@llvm.spv.bitcast` intrinsics which are then legalized.
- Updates `selectUnmergeValues` to handle extraction of both scalars

    [3 lines not shown]
DeltaFile
+181-11llvm/lib/Target/SPIRV/SPIRVLegalizerInfo.cpp
+133-0llvm/test/CodeGen/SPIRV/legalization/vector-legalization-shader.ll
+69-0llvm/test/CodeGen/SPIRV/legalization/vector-legalization-kernel.ll
+37-13llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
+20-0llvm/lib/CodeGen/GlobalISel/LegalityPredicates.cpp
+10-0llvm/include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
+450-241 files not shown
+454-247 files

LLVM/project 8a3891cllvm/lib/Target/AMDGPU AMDGPURewriteAGPRCopyMFMA.cpp

[AMDGPU][NPM] Preserve analyses in AMDGPURewriteAGPRCopyMFMA for NPM (#170130)

The pass preserved LiveStacksAnalysis but failed to preserve
LiveIntervalsAnalysis, LiveRegMatrixAnalysis, VirtRegMapAnalysis, and
SlotIndexesAnalysis under NPM. This caused these analyses to be
invalidated and recomputed, leading to incorrect behavior in subsequent
passes like VirtRegRewriter.

Fix by explicitly preserving all required analyses in the NPM version,
matching the legacy pass manager behavior.

---------

Co-authored-by: vikhegde <vikram.hegde at amd.com>
DeltaFile
+7-2llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp
+7-21 files

LLVM/project ad187e1clang/docs ReleaseNotes.rst, clang/include/clang/Basic DiagnosticCommonKinds.td

[Clang] [C++26] Expansion Statements (Part 11)
DeltaFile
+104-0clang/test/AST/ast-print-expansion-stmts.cpp
+49-0clang/test/AST/ast-dump-expansion-stmt.cpp
+0-4clang/include/clang/Basic/DiagnosticCommonKinds.td
+1-1clang/www/cxx_status.html
+2-0clang/docs/ReleaseNotes.rst
+156-55 files

LLVM/project b874722clang/include/clang/Basic DiagnosticSemaKinds.td, clang/include/clang/Sema Sema.h

[Clang] [C++26] Expansion Statements (Part 6)
DeltaFile
+96-5clang/lib/Sema/SemaExpand.cpp
+52-11clang/lib/Sema/TreeTransform.h
+3-0clang/include/clang/Sema/Sema.h
+2-0clang/include/clang/Basic/DiagnosticSemaKinds.td
+153-164 files