LLVM/project 03f488allvm/lib/CodeGen/AsmPrinter AsmPrinter.cpp, llvm/test/MC/AArch64 global-tagging.ll

[AsmPrinter][MTE] Support memtag-globals for all AArch64 targets (#187065)

This change ensures that all AArch64 targets can use memtag globals, not
only Android.
DeltaFile
+7-3llvm/test/MC/AArch64/global-tagging.ll
+2-2llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+9-52 files

LLVM/project da47edellvm/include/llvm/CodeGen RegisterScavenging.h, llvm/lib/Target/AArch64 AArch64FrameLowering.cpp

[AArch64] Fix register scavenger crash when merging MTE stack tags (#186934)

When `-fsanitize=memtag-stack` is enabled, `TagStoreEdit::emitLoop`
optimizes contiguous ST2Gi instructions into an STGloop. Because this
runs during PEI (post-register allocation), it spawns two new virtual
registers: BaseReg and SizeReg.

Under high register pressure (e.g., Swift async continuation thunks
where almost all registers are kept live), the Register Scavenger must
rely on emergency spill slots to assign physical registers to BaseReg
and SizeReg.

Previously, the compiler assumed at most one emergency spill slot was
needed. If PEI found an unused Callee-Saved Register (`ExtraCSSpill`),
it bypassed allocating an emergency slot entirely. If no CSRs were free,
it allocated exactly one slot. Because STGloop requires TWO scratch
locations, the scavenger would crash trying to fulfill the second
allocation.
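
The shortfall described above can be illustrated with a toy model. This is only a sketch, not the code in AArch64FrameLowering.cpp: the function names and the idea of counting "scratch locations" are assumptions made for illustration, and the actual fix is in the elided part of the message.

```python
def scratch_locations_before_fix(free_csr_available: bool) -> int:
    """Toy model of the old policy as described in the commit message."""
    if free_csr_available:
        # An unused callee-saved register (ExtraCSSpill) was found, so no
        # emergency spill slot was allocated; the spare CSR is the only
        # scratch location.
        return 1
    # No free CSR: exactly one emergency spill slot was allocated.
    return 1


def stg_loop_can_be_scavenged(scratch_locations: int) -> bool:
    # STGloop introduces two virtual registers: BaseReg and SizeReg.
    required = 2
    return scratch_locations >= required
```

In this model both branches of the old policy yield one scratch location, so `stg_loop_can_be_scavenged` is false in either case, which corresponds to the crash on the second allocation.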


    [11 lines not shown]
DeltaFile
+551-0llvm/test/CodeGen/AArch64/memtag-stg-loop-reg-pressure.mir
+28-0llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
+2-0llvm/include/llvm/CodeGen/RegisterScavenging.h
+581-03 files

LLVM/project 7f1be73llvm/include/llvm/Remarks RemarkStreamer.h, llvm/lib/CodeGen/AsmPrinter AsmPrinter.cpp

[spr] initial version

Created using spr 1.3.8-wip
DeltaFile
+5-6llvm/lib/Remarks/RemarkStreamer.cpp
+5-5llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+3-3llvm/test/CodeGen/X86/remarks-section.ll
+3-1llvm/include/llvm/Remarks/RemarkStreamer.h
+16-154 files

LLVM/project 2408c2blibclc/clc/lib/generic/math clc_nextdown.inc clc_nextup.inc

libclc: Fix nextafter with -cl-denorms-are-zero

Follow the suggested behavior of returning +/-FLT_MIN for logical
zeros.
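
A minimal Python sketch of the described behavior, modeling float32 with `struct`. It assumes that "logical zero" means a value that is zero or subnormal when denormals are flushed; the function names are hypothetical and this is not the libclc implementation.

```python
import math
import struct

FLT_MIN = 2.0 ** -126  # smallest positive normal float32


def _bits(x: float) -> int:
    """Return the float32 bit pattern of x as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def _from_bits(b: int) -> float:
    """Return the float32 value with bit pattern b."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def nextup_daz(x: float) -> float:
    """Next float32 above x, treating zero and subnormals as zero."""
    if math.isnan(x) or x == math.inf:
        return x
    if abs(x) < FLT_MIN:
        # Logical zero (assumption: zero or subnormal): step to +FLT_MIN.
        return FLT_MIN
    b = _bits(x)
    # Positive values grow with the bit pattern; negative values shrink
    # in magnitude as the bit pattern decreases.
    # Results that land in the subnormal range are left as-is here; the
    # commit message does not say how those are treated.
    return _from_bits(b + 1 if x > 0 else b - 1)


def nextdown_daz(x: float) -> float:
    """Next float32 below x, by symmetry with nextup_daz."""
    return -nextup_daz(-x)
```

With this model, `nextup_daz(0.0)` gives `FLT_MIN` and `nextdown_daz(0.0)` gives `-FLT_MIN`, matching the suggested behavior for logical zeros.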
DeltaFile
+5-2libclc/clc/lib/generic/math/clc_nextdown.inc
+4-0libclc/clc/lib/generic/math/clc_nextup.inc
+2-0libclc/clc/lib/generic/math/clc_nextafter.cl
+1-1libclc/clc/lib/generic/math/clc_nextafter.inc
+1-0libclc/clc/lib/generic/math/clc_nextdown.cl
+1-0libclc/clc/lib/generic/math/clc_nextup.cl
+14-36 files

LLVM/project ea2ac04libclc/clc/include/clc/math clc_flush_if_daz.h clc_canonicalize.h, libclc/clc/lib/generic/math clc_flush_if_daz.inc clc_flush_if_daz.cl

libclc: Add canonicalize utility functions

This is mostly to work around spirv's canonicalize still
being broken.
DeltaFile
+43-0libclc/clc/lib/generic/math/clc_flush_if_daz.inc
+19-0libclc/clc/include/clc/math/clc_flush_if_daz.h
+19-0libclc/clc/include/clc/math/clc_canonicalize.h
+16-0libclc/clc/lib/generic/math/clc_flush_if_daz.cl
+15-0libclc/clc/lib/generic/math/clc_canonicalize.cl
+0-10libclc/clc/include/clc/math/math.h
+112-104 files not shown
+121-1510 files

LLVM/project 08848cdlibclc/clc/include/clc/math gentype.inc clc_subnormal_config.h, libclc/clc/lib/generic subnormal_config.cl

libclc: Really implement denormal config checks

These should be implementable by checking the behavior of
the canonicalize intrinsic. Hack around spirv still failing
on canonicalize by overriding and assuming DAZ for float.
DeltaFile
+24-3libclc/clc/lib/generic/subnormal_config.cl
+19-0libclc/clc/lib/spirv/subnormal_config.cl
+7-1libclc/clc/include/clc/math/gentype.inc
+3-0libclc/clc/include/clc/math/clc_subnormal_config.h
+1-0libclc/clc/lib/spirv/CMakeLists.txt
+54-45 files

LLVM/project eabcfceclang/lib/Headers/hlsl hlsl_alias_intrinsics.h, clang/test/CodeGenHLSL/builtins QuadReadAcrossX.hlsl

[HLSL][DXIL][SPIRV] QuadReadAcrossX intrinsic support (#184360)

This PR adds QuadReadAcrossX intrinsic support in HLSL with codegen for
both DirectX and SPIRV backends. Resolves
https://github.com/llvm/llvm-project/issues/99175.

- [x] Implement QuadReadAcrossX clang builtin
- [x] Link QuadReadAcrossX clang builtin with hlsl_intrinsics.h
- [x] Add sema checks for QuadReadAcrossX to
CheckHLSLBuiltinFunctionCall in SemaChecking.cpp
- [x] Add codegen for QuadReadAcrossX to EmitHLSLBuiltinExpr in
CGBuiltin.cpp
- [x] Add codegen tests to
clang/test/CodeGenHLSL/builtins/QuadReadAcrossX.hlsl
- [x] Add sema tests to
clang/test/SemaHLSL/BuiltIns/QuadReadAcrossX-errors.hlsl
- [x] Create the int_dx_QuadReadAcrossX intrinsic in
IntrinsicsDirectX.td
- [x] Create the DXILOpMapping of int_dx_QuadReadAcrossX to 123 in

    [8 lines not shown]
DeltaFile
+99-0clang/lib/Headers/hlsl/hlsl_alias_intrinsics.h
+87-0llvm/test/CodeGen/DirectX/QuadReadAcrossX.ll
+46-0clang/test/CodeGenHLSL/builtins/QuadReadAcrossX.hlsl
+44-0llvm/test/CodeGen/SPIRV/hlsl-intrinsics/QuadReadAcrossX.ll
+30-0llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
+28-0clang/test/SemaHLSL/BuiltIns/QuadReadAcrossX-errors.hlsl
+334-012 files not shown
+401-218 files

LLVM/project 17158b2llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp

[InstCombine] Fix comment in SimplifyDemandedUseBits (NFC) (#187126)

Fix the values in the truth table comment for the combine

  add iN (sext i1 X), (sext i1 Y) --> sext (X | Y) to iN
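
The truth table for this combine can be reproduced with a short Python sketch. It assumes the combine is only applied when bit 0 of the result is not demanded; the commit message does not state this, so treat it as an assumption. The helper names are hypothetical.

```python
def sext_i1(bit: int, width: int) -> int:
    """Sign-extend a 1-bit value to `width` bits (unsigned bit pattern)."""
    return (1 << width) - 1 if bit else 0


def compare_add_and_or(width: int = 8) -> list:
    """Return rows (X, Y, add result, sext(X | Y)) for all i1 inputs."""
    mask = (1 << width) - 1
    rows = []
    for x in (0, 1):
        for y in (0, 1):
            add = (sext_i1(x, width) + sext_i1(y, width)) & mask
            ored = sext_i1(x | y, width)
            rows.append((x, y, add, ored))
    return rows
```

For `width=8` the two columns agree except for X = Y = 1, where the add yields 254 (-2) and the sext yields 255 (-1); they differ only in bit 0.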
DeltaFile
+2-2llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+2-21 files

LLVM/project e9799e5lldb/test/API/tools/lldb-dap/variables TestDAP_variables.py main.cpp, lldb/tools/lldb-dap Variables.cpp ProtocolUtils.cpp

[lldb-dap] Improve support for variables with anonymous fields and types (#186482)

While looking at the '[raw]' value of a std::vector I noticed we didn't
handle the anonymous inner struct very well. The 'evaluateName' was
incorrect (e.g. the evaluateName would return `<var>.` for the anonymous
struct).

This improves support for variables with anonymous fields and anonymous
types.

* Changed the name of anonymous fields from `<null>` to `(anonymous)`,
which matches other tooling like clangd's representation and how types
are presented if the field is not defined.
* Adjusted variables to not return an 'evaluateName' for anonymous
fields.
* Adjusted '[raw]' values to be marked as 'internal' which deemphasizes
them in the UI.

While working in this area, I also consolidated some helpers that are

    [10 lines not shown]
DeltaFile
+194-109lldb/tools/lldb-dap/Variables.cpp
+153-24lldb/test/API/tools/lldb-dap/variables/TestDAP_variables.py
+0-73lldb/tools/lldb-dap/ProtocolUtils.cpp
+18-25lldb/tools/lldb-dap/Variables.h
+42-1lldb/test/API/tools/lldb-dap/variables/main.cpp
+0-42lldb/tools/lldb-dap/ProtocolUtils.h
+407-2746 files not shown
+440-29812 files

LLVM/project 88a6589clang/include/clang/AST TemplateBase.h, clang/lib/AST TemplateBase.cpp TypeLoc.cpp

[clang] fix crash related to missing source locations for converted template arguments

This adds a way to attach source locations to trivially created template
arguments such as packs, or converted expressions when there is no
expression anymore.

This also avoids crashes due to missing source locations.

In a few places where this matters, we already create expressions
from the converted arguments, but this requires access to Sema,
whereas creating trivial typelocs currently only requires access to
the ASTContext.

So this creates a new storage kind for TemplateArgumentLocs, where
a single SourceLocation is stored, embedded in the pointer where
possible.

As a drive-by, strengthen asserts by enforcing that the TemplateArgumentLocs
are created with the right kinds of locations.

    [2 lines not shown]
DeltaFile
+54-3clang/include/clang/AST/TemplateBase.h
+19-0clang/lib/AST/TemplateBase.cpp
+4-4clang/lib/Sema/SemaExpr.cpp
+7-0clang/test/SemaCXX/type_pack_element.cpp
+2-5clang/lib/AST/TypeLoc.cpp
+3-3clang/lib/Sema/SemaTemplate.cpp
+89-152 files not shown
+91-168 files

LLVM/project 0ea2e58llvm/lib/Transforms/Vectorize VPlan.cpp, llvm/test/Transforms/LoopVectorize early_exit_with_outer_loop.ll single_early_exit_with_outer_loop.ll

[VPlan] Account for early-exit dispatch blocks when updating LI. (#185618)

Now that we can vectorize loops with multiple early exits, we emit
dispatch blocks after the middle block to go to a specific exit or
continue in the dispatch chain.

With that, we need to be a bit more careful when it comes to picking the
loop the dispatch block belongs to. The dispatch block will belong to
the innermost loop of all exit blocks reachable from the current block.

Fixes https://github.com/llvm/llvm-project/issues/185362

PR: https://github.com/llvm/llvm-project/pull/185618
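
The selection rule can be sketched as follows. This is a toy model rather than the VPlan or LoopInfo code: it reads "innermost" as "most deeply nested", and the function, block, and loop names are hypothetical.

```python
from typing import Dict, Iterable, Optional


def loop_for_dispatch_block(
    reachable_exits: Iterable[str],
    loop_of: Dict[str, Optional[str]],
    depth: Dict[str, int],
) -> Optional[str]:
    """Pick the most deeply nested loop among the reachable exit blocks.

    loop_of maps an exit block to the loop containing it (None if the
    block is outside every loop); depth maps a loop to its nesting depth.
    """
    best: Optional[str] = None
    for exit_block in reachable_exits:
        loop = loop_of.get(exit_block)
        if loop is None:
            continue  # this exit block is not inside any loop
        if best is None or depth[loop] > depth[best]:
            best = loop
    return best
```

For example, if one reachable exit lives in an outer loop and another lies outside all loops, the model places the dispatch block in the outer loop; if no reachable exit is in a loop, it returns `None`.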
DeltaFile
+417-0llvm/test/Transforms/LoopVectorize/early_exit_with_outer_loop.ll
+0-121llvm/test/Transforms/LoopVectorize/single_early_exit_with_outer_loop.ll
+26-0llvm/lib/Transforms/Vectorize/VPlan.cpp
+443-1213 files

LLVM/project b2c2422clang/include/clang/CIR/Dialect/IR CIRAttrs.td CIROps.td, clang/lib/CIR/CodeGen CIRGenExprAggregate.cpp

[CIR] Upstream ThreeWayCmpOp (#169963)

This PR upstreams the three-way compare op from the incubator repo.

---------

Co-authored-by: Hendrik Hübner <hhuebner at Hendriks-MacBook-Pro.local>
DeltaFile
+307-0clang/test/CIR/CodeGen/Inputs/std-compare.h
+95-0clang/test/CIR/CodeGen/three-way-cmp.cpp
+83-0clang/include/clang/CIR/Dialect/IR/CIRAttrs.td
+68-0clang/include/clang/CIR/Dialect/IR/CIROps.td
+62-1clang/lib/CIR/CodeGen/CIRGenExprAggregate.cpp
+60-0clang/lib/CIR/Dialect/IR/CIRAttrs.cpp
+675-14 files not shown
+777-310 files

LLVM/project 81950f6mlir/lib/Dialect/GPU/IR InferIntRangeInterfaceImpls.cpp, mlir/test/Dialect/GPU int-range-interface-cluster.mlir

[mlir][GPU] Bump static bound on cluster IDs (#187106)

Hardware (like AMD's gfx1250) allows 16 workgroups per cluster, but the
static bound of 8 from many years ago hasn't been updated. This commit
adds such an update and adds a test for that bound.
DeltaFile
+25-1mlir/test/Dialect/GPU/int-range-interface-cluster.mlir
+1-1mlir/lib/Dialect/GPU/IR/InferIntRangeInterfaceImpls.cpp
+26-22 files

LLVM/project 480eba3lldb/source/Core Module.cpp, lldb/test/API/functionalities/compilation-prefix-map TestCompilationPrefixMap.py

[lldb][PrefixMap] follow up fixes to #187145 (#187337)

Fix and improve #187145 for the following issues:
* Fix an unhandled error.
* Align the log type with the file in which it is used.
* The added test doesn't work on a Windows host for remote debugging; add a
  decorator to skip it when host and target do not match.
DeltaFile
+14-12lldb/source/Core/Module.cpp
+3-2lldb/test/API/functionalities/compilation-prefix-map/TestCompilationPrefixMap.py
+17-142 files

LLVM/project 872247cllvm/lib/Target/NVPTX NVPTXISelDAGToDAG.cpp NVPTXISelLowering.cpp, mlir/lib/Dialect/LLVMIR/IR NVVMDialect.cpp

[NVPTX] Split Param address space into EntryParam and DeviceParam (NFC) (#186636)

This change begins clarifying and cleaning up some oddities around the
param address-space in NVPTX. PTX supports ".param" loads and stores
referring to both entry (kernel) and device parameters; however, these
spaces are actually quite different. Entry param space supports
pointers and addrspace-casting to generic, while device parameter space
can only be referenced by a parameter plus an immediate offset. This
change accounts for this fact with the following refactors:

- Rename `ADDRESS_SPACE_PARAM` -> `ADDRESS_SPACE_ENTRY_PARAM`. This
reflects the fact that only entry parameter space can be meaningfully
modeled in LLVM IR and that pointers with this AS in LLVM IR are always
referring to entry parameters.

- Add `NVPTX::AddressSpace::DeviceParam` for NVPTX MIR instructions.
This is used in NVPTX MIR instructions to signify that they load/store
device parameters. It has a distinct value from
`NVPTX::AddressSpace::EntryParam` so that in the future we can print
these differently on supported PTX versions.
DeltaFile
+18-26llvm/lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp
+19-16llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
+14-9llvm/lib/Target/NVPTX/NVPTX.h
+9-8llvm/lib/Target/NVPTX/NVPTXLowerArgs.cpp
+1-3mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp
+2-2llvm/lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp
+63-647 files not shown
+73-7113 files

LLVM/project befaa35libcxx/include string __tree, libcxx/test/libcxx/type_traits is_transparently_comparable.compile.pass.cpp

[libc++] Fix passing through object to comparisons in __tree (#186341)

Fixes #180659
DeltaFile
+22-11libcxx/include/string
+32-0libcxx/test/libcxx/type_traits/is_transparently_comparable.compile.pass.cpp
+2-2libcxx/include/__tree
+56-133 files

LLVM/project 3e09538libcxx/test/std/containers/associative lookup_with_converting_comparator.pass.cpp, libcxx/test/std/containers/associative/map/map.ops find.pass.cpp

[libc++] Expand test coverage for converting comparators in associative containers (#187133)

This is in preparation for fixing #187105.
DeltaFile
+87-0libcxx/test/std/containers/associative/lookup_with_converting_comparator.pass.cpp
+0-13libcxx/test/std/containers/associative/map/map.ops/find.pass.cpp
+87-132 files

LLVM/project a33e9e5lldb/source/Target StackFrameList.cpp

Move the call frame edges log messages to the verbose channel. (#187324)

The messages about searching for call edges can be really verbose and
they are only useful if you are explicitly debugging the call edges
feature. Most of the time they are irrelevant and just make the step log
output hard to read.
DeltaFile
+10-9lldb/source/Target/StackFrameList.cpp
+10-91 files

LLVM/project ee75b37mlir/lib/Dialect/OpenMP/IR OpenMPDialect.cpp, mlir/test/Dialect/OpenMP ops.mlir

Add placeholder if linear modifier is not specified
DeltaFile
+11-0mlir/test/Dialect/OpenMP/ops.mlir
+3-2mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+14-22 files

LLVM/project 61af888llvm/test/tools/llvm-remarkutil/filter filter-sort-dedupe.test, llvm/test/tools/llvm-remarkutil/filter/Inputs filter-unsorted.yaml

[spr] initial version

Created using spr 1.3.8-wip
DeltaFile
+48-1llvm/tools/llvm-remarkutil/RemarkFilter.cpp
+46-0llvm/test/tools/llvm-remarkutil/filter/filter-sort-dedupe.test
+28-0llvm/test/tools/llvm-remarkutil/filter/Inputs/filter-unsorted.yaml
+122-13 files

LLVM/project 45857cflldb/source/Core Module.cpp, lldb/test/API/functionalities/compilation-prefix-map TestCompilationPrefixMap.py

[spr] initial version

Created using spr 1.3.7
DeltaFile
+14-12lldb/source/Core/Module.cpp
+3-2lldb/test/API/functionalities/compilation-prefix-map/TestCompilationPrefixMap.py
+17-142 files

LLVM/project a2891ffllvm/include/llvm/Transforms/Utils UnrollLoop.h, llvm/lib/Target/AMDGPU AMDGPUTargetTransformInfo.cpp

Reapply "[LoopUnroll] Remove computeUnrollCount()'s return value" (#187104)

Address
https://github.com/llvm/llvm-project/pull/184529#issuecomment-4074393657
by checking the loop's metadata prior to unrolling.
DeltaFile
+75-54llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
+54-0llvm/test/Transforms/LoopUnroll/disable-after-count.ll
+0-31llvm/lib/Transforms/Utils/LoopUnroll.cpp
+10-14llvm/include/llvm/Transforms/Utils/UnrollLoop.h
+6-16llvm/lib/Transforms/Scalar/LoopUnrollAndJamPass.cpp
+2-3llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
+147-1186 files

LLVM/project f8db5dbflang/include/flang/Optimizer/Dialect FIROps.td, flang/lib/Optimizer/Dialect FIROps.cpp

[flang] Fix fir.call setCalleeFromCallable (#187124)

The CallOpInterface setCalleeFromCallable allows either a value or a
SymbolRef to be passed in. However, the implementation had a bug:
although it was able to set the attribute, it would fall through and
also try to set the value.

This PR improves the implementation to handle updating the callee even
when switching modes (direct vs indirect) and adds testing for these
APIs.
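
The intended behavior can be sketched with a small Python model. It is not the FIR implementation: `CallModel`, `SymbolRef`, and `Value` are hypothetical stand-ins used only to show an early return for the symbol case and a clean switch between direct and indirect modes.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class SymbolRef:
    name: str


@dataclass(frozen=True)
class Value:
    name: str


class CallModel:
    """Toy call op with either a symbol callee or a value callee."""

    def __init__(self) -> None:
        self.callee_symbol: Optional[SymbolRef] = None
        self.callee_value: Optional[Value] = None

    def set_callee_from_callable(self, callee: Union[SymbolRef, Value]) -> None:
        if isinstance(callee, SymbolRef):
            # Direct call: set the attribute, drop any value callee, and
            # return so the value path below is not also executed.
            self.callee_symbol = callee
            self.callee_value = None
            return
        # Indirect call: set the value and drop any symbol attribute.
        self.callee_symbol = None
        self.callee_value = callee

    def is_direct(self) -> bool:
        return self.callee_symbol is not None
```

Switching an instance from a `SymbolRef` callee to a `Value` callee (or back) leaves exactly one of the two fields set.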
DeltaFile
+414-0flang/unittests/Optimizer/FIRCallInterfaceTest.cpp
+50-0flang/lib/Optimizer/Dialect/FIROps.cpp
+1-7flang/include/flang/Optimizer/Dialect/FIROps.td
+1-0flang/unittests/Optimizer/CMakeLists.txt
+466-74 files

LLVM/project ba231aaclang-tools-extra/clang-doc/assets clang-doc-mustache.css

[clang-doc] Enclose documented entities in a card (#185121)

This patch adds a card that encompasses the whole documented entity
instead of just the description. This helps to visually separate the
documentation which was previously more difficult to distinguish. The
description card is also changed to only show a left border to create
less visual noise within the card.

The light theme colors are also changed slightly to not be completely
white.
DeltaFile
+11-8clang-tools-extra/clang-doc/assets/clang-doc-mustache.css
+11-81 files

LLVM/project 20a6cf9libsycl/src/detail/offload offload_utils.cpp

default -> specific switch case

Signed-off-by: Tikhomirova, Kseniya <kseniya.tikhomirova at intel.com>
DeltaFile
+1-1libsycl/src/detail/offload/offload_utils.cpp
+1-11 files

LLVM/project d54da68llvm/test/tools/llvm-remarkutil/filter filter-exclude.test, llvm/tools/llvm-remarkutil RemarkFilter.cpp

[llvm-remarkutil] filter: Add --exclude flag (#187163)

Add --exclude to invert filter behavior, keeping all remarks excluding
those matching the filter.

Pull Request: https://github.com/llvm/llvm-project/pull/187163
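
The inversion can be sketched in a few lines of Python. This models only the idea; `filter_remarks`, the predicate, and the dictionary keys are hypothetical and do not reflect the code in RemarkFilter.cpp.

```python
from typing import Callable, Dict, List

Remark = Dict[str, str]


def filter_remarks(
    remarks: List[Remark],
    matches: Callable[[Remark], bool],
    exclude: bool = False,
) -> List[Remark]:
    """Keep matching remarks, or everything except them when exclude is set."""
    # A remark is kept when its match result differs from the exclude flag:
    # exclude=False keeps matches, exclude=True keeps non-matches.
    return [r for r in remarks if matches(r) != exclude]
```

With `exclude=False` this behaves like a plain filter; with `exclude=True` it returns the complement, so the two results together cover the whole input.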
DeltaFile
+9-0llvm/test/tools/llvm-remarkutil/filter/filter-exclude.test
+8-1llvm/tools/llvm-remarkutil/RemarkFilter.cpp
+17-12 files

LLVM/project cfb1fa0libclc/clc/include/clc/math clc_subnormal_config.h math.h, libclc/clc/lib/generic subnormal_config.cl

libclc: Invert subnormal checks

The base case is correct denormal handling, not flushing. This
also matches the spec controls, which start at IEEE behavior, with
flushing enabled by -cl-denorms-are-zero.

Also fix wrong defaults for half and double. Denormal support is
not optional for these.
DeltaFile
+3-11libclc/clc/lib/generic/subnormal_config.cl
+3-4libclc/clc/include/clc/math/clc_subnormal_config.h
+1-1libclc/clc/include/clc/math/math.h
+1-1libclc/clc/lib/generic/math/clc_ldexp.cl
+8-174 files

LLVM/project 7166468flang/lib/Optimizer/OpenACC/Transforms ACCUseDeviceCanonicalizer.cpp, flang/test/Fir/OpenACC use-device-canonicalizer.mlir

[flang][acc] Handle deduplicated use_device (part 2) (#187305)

After https://github.com/llvm/llvm-project/pull/186855 there was still
one additional part of the pass that assumed it was able to erase
acc.use_device. Thus extend the same solution and add a test.
DeltaFile
+39-0flang/test/Fir/OpenACC/use-device-canonicalizer.mlir
+8-3flang/lib/Optimizer/OpenACC/Transforms/ACCUseDeviceCanonicalizer.cpp
+47-32 files

LLVM/project 731969clibsycl/include/sycl/__impl usm_functions.hpp

add more articles

Signed-off-by: Tikhomirova, Kseniya <kseniya.tikhomirova at intel.com>
DeltaFile
+8-8libsycl/include/sycl/__impl/usm_functions.hpp
+8-81 files

LLVM/project bfaf9e5libclc/clc/lib/generic subnormal_config.cl CMakeLists.txt, libclc/opencl/lib/clspv CMakeLists.txt

libclc: Move subnormal config file to clc

Fix layering violation.
DeltaFile
+0-22libclc/opencl/lib/generic/subnormal_config.cl
+21-0libclc/clc/lib/generic/subnormal_config.cl
+1-0libclc/clc/lib/generic/CMakeLists.txt
+0-1libclc/opencl/lib/clspv/CMakeLists.txt
+0-1libclc/opencl/lib/generic/CMakeLists.txt
+22-245 files