LLVM/project 26493fcllvm/lib/Target/RISCV RISCVISelLowering.cpp RISCVISelLowering.h

[RISCV][NFC] Turn lowerVECTOR_SHUFFLE into a member function of RISCVTargetLowering (#194299)

Convert lowerVECTOR_SHUFFLE into a member function of
RISCVTargetLowering, aligning it with other lowerXXX member functions in
RISCVTargetLowering and matching other targets like AArch64.
DeltaFile
+4-4llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+1-0llvm/lib/Target/RISCV/RISCVISelLowering.h
+5-42 files

LLVM/project 955bb5cclang-tools-extra/docs/clang-tidy/checks/cert flp37-c.rst exp42-c.rst

[clang-tidy][Docs] Remove all auto-redirects in documentation. NFC. (#193408)

RFC:
https://discourse.llvm.org/t/rfc-remove-automatic-redirects-from-clang-tidy-documentation/90633
DeltaFile
+1-2clang-tools-extra/docs/clang-tidy/checks/cert/flp37-c.rst
+1-2clang-tools-extra/docs/clang-tidy/checks/cert/exp42-c.rst
+1-2clang-tools-extra/docs/clang-tidy/checks/cert/exp45-c.rst
+0-2clang-tools-extra/docs/clang-tidy/checks/cert/ctr56-cpp.rst
+0-2clang-tools-extra/docs/clang-tidy/checks/cert/dcl03-c.rst
+0-2clang-tools-extra/docs/clang-tidy/checks/cert/dcl16-c.rst
+3-1289 files not shown
+3-19095 files

LLVM/project 61adeccmlir/include/mlir/Dialect/XeGPU/uArch IntelGpuXe2.h, mlir/lib/Dialect/XeGPU/Transforms XeGPULayoutImpl.cpp XeGPUPropagateLayout.cpp

[MLIR][XeGPU] XeGPU DpasMx Op Definition adds Layout Support (#194117)

This PR extends the DpasMx operation to support MXFP (microscaling
floating point) matrix multiply with separate scale factor layouts.

1. Op Definition
- Added layout_a_scale and layout_b_scale attributes to DpasMx op
- Removed AllElementTypesMatch<["a", "b"]> trait to allow different types
  for A/B with scales
2. Layout Infrastructure
- setupDpasMxLayout(): Creates anchor layouts for all 5 operands (A, B,
  C/D, scale_a, scale_b)
- Derives scale layouts from parent matrix layouts by dividing innermost
  dimension
- Supports all layout kinds: Subgroup, InstData, Lane
- Fix a bug in getupDpasSubgroupLayouts(): sg_data of A/B matrix should
  keep the full K dimension.
3. Layout Propagation


    [7 lines not shown]
DeltaFile
+303-112mlir/lib/Dialect/XeGPU/Transforms/XeGPULayoutImpl.cpp
+196-3mlir/include/mlir/Dialect/XeGPU/uArch/IntelGpuXe2.h
+135-0mlir/lib/Dialect/XeGPU/Transforms/XeGPUPropagateLayout.cpp
+97-15mlir/test/Dialect/XeGPU/propagate-layout-subgroup.mlir
+86-1mlir/test/Dialect/XeGPU/propagate-layout-inst-data.mlir
+85-1mlir/test/Dialect/XeGPU/propagate-layout.mlir
+902-1324 files not shown
+930-14310 files

LLVM/project 3c66b32libc/shared/math llogbbf16.h, libc/src/__support/math llogbbf16.h CMakeLists.txt

[libc][math] Refactor llogbbf16 to header-only (#194509)

Refactor llogbbf16 to be header-only.

part of: #147386
DeltaFile
+26-0libc/src/__support/math/llogbbf16.h
+23-0libc/shared/math/llogbbf16.h
+17-0utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+11-0libc/src/__support/math/CMakeLists.txt
+2-7libc/src/math/generic/llogbbf16.cpp
+1-5libc/src/math/generic/CMakeLists.txt
+80-124 files not shown
+85-1210 files

LLVM/project fe46959libc/shared/math ilogbbf16.h, libc/src/__support/math ilogbbf16.h CMakeLists.txt

[libc][math] Refactor ilogbbf16 to header-only (#194503)

Refactors ilogbbf16 to be header-only.
DeltaFile
+26-0libc/src/__support/math/ilogbbf16.h
+23-0libc/shared/math/ilogbbf16.h
+17-0utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+2-7libc/src/math/generic/ilogbbf16.cpp
+9-0libc/src/__support/math/CMakeLists.txt
+1-5libc/src/math/generic/CMakeLists.txt
+78-124 files not shown
+84-1210 files

LLVM/project 1228142clang/lib/CIR/CodeGen CIRGenExprAggregate.cpp, clang/test/CIR/CodeGen consteval-aggregate.cpp

[CIR] Implement PredefinedExpr in aggregate emitter and add consteval… (#194484)

… aggregate test

Handle PredefinedExpr by delegating to emitAggLoadOfLValue, removing the
NYI fallback. Also add a test for ConstantExpr aggregate emission
(consteval functions returning structs), which was already implemented
but lacked test coverage.

This unblocks ~206 libcxx test failures that involve aggregate
ConstantExpr and PredefinedExpr.

Note on LLVM IR divergence (will be addressed in follow-up PRs): For
consteval functions returning aggregates, CIR currently emits a global
constant + cir.copy that lowers to llvm.memcpy from the global, while
OGCG decomposes the constant into per-field stores. The added CIR / LLVM
/ OGCG CHECK lines in consteval-aggregate.cpp document this difference.
Convergence will come from a follow-up that decomposes the consteval
aggregate stores into per-field stores in LoweringPrepare (and related
GEP-index handling for padded structs).
DeltaFile
+44-0clang/test/CIR/CodeGen/consteval-aggregate.cpp
+1-4clang/lib/CIR/CodeGen/CIRGenExprAggregate.cpp
+45-42 files

LLVM/project eec2249llvm/lib/Target/RISCV RISCVTargetTransformInfo.cpp, llvm/test/Transforms/LoopVectorize/RISCV tail-folding-interleave.ll

[RISCV] Improve getInterleavedMemoryOpCost for interleave groups with tail gaps. (#192074)

For interleaved access groups where gaps are only at the tail (i.e.
members are contiguous starting from index 0 but do not fill the entire
factor), the interleaved memory access pass can lower them to
vlsseg/vssseg intrinsics with NF equal to the number of group members
rather than the factor after #151612 and #154647.

Previously, these groups fell through to the generic fixed-vector shuffle
cost model. This patch adds a dedicated cost path that checks legality
and estimates an appropriate cost for them.

TODO: Support scalable vector type.
Fix #151497
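
As a rough illustration of the "gaps only at the tail" condition described above, here is a minimal sketch. The helper name `tailGapSegmentCount` and its signature are hypothetical and are not taken from RISCVTargetTransformInfo.cpp. It assumes the member indices are given sorted in ascending order.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper (not an LLVM API). Indices holds the member indices
// present in the interleave group, sorted ascending. Returns the segment
// count NF when the members are exactly 0..N-1 with N <= Factor, so that any
// gaps sit at the tail. Returns std::nullopt when the group is empty, has
// more members than Factor, or has a gap before its last member.
std::optional<unsigned>
tailGapSegmentCount(unsigned Factor, const std::vector<unsigned> &Indices) {
  if (Indices.empty() || Indices.size() > Factor)
    return std::nullopt;
  for (std::size_t I = 0; I < Indices.size(); ++I)
    if (Indices[I] != I) // A member is missing before the tail.
      return std::nullopt;
  // NF is the number of members, not the interleave factor.
  return static_cast<unsigned>(Indices.size());
}
```

With a factor of 4 and members {0, 1, 2}, the sketch yields NF = 3. Members {0, 2} are rejected because the gap is not at the tail.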
DeltaFile
+11-18llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-interleave.ll
+28-0llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+39-182 files

LLVM/project 085f240clang/lib/CIR/CodeGen CIRGenExprConstant.cpp CIRGenModule.cpp, clang/test/CIR/CodeGen temp-param-obj-decl.cpp

[CIR] Lower constant NTTP objects (#194496)

Like my previous patch, this just stores an NTTP object as a global
(using the same code, with 1 level of indirection stripped off), and
initializes it as a const. This patch also fleshes out the
CIRGenExprConstant.cpp area, leaving just 2 'NYI's in the area, 1 of
which is the MSGuidAttr again.
DeltaFile
+15-4clang/test/CIR/CodeGen/temp-param-obj-decl.cpp
+11-6clang/lib/CIR/CodeGen/CIRGenExprConstant.cpp
+3-3clang/lib/CIR/CodeGen/CIRGenModule.cpp
+2-1clang/lib/CIR/CodeGen/CIRGenExpr.cpp
+1-1clang/lib/CIR/CodeGen/CIRGenModule.h
+32-155 files

LLVM/project 5281d4clibc/shared/math bf16subl.h, libc/src/__support/math bf16subl.h CMakeLists.txt

[libc][math] Refactor bf16subl to header-only (#194498)

Refactors the bf16subl math family to be header-only.
DeltaFile
+26-0libc/src/__support/math/bf16subl.h
+23-0libc/shared/math/bf16subl.h
+15-0utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+9-0libc/src/__support/math/CMakeLists.txt
+2-5libc/src/math/generic/bf16subl.cpp
+1-5libc/src/math/generic/CMakeLists.txt
+76-104 files not shown
+81-1010 files

LLVM/project eac6d03clang/test/Driver modules-driver-import-std.cpp

[clang][modules-driver] Fix failing import-std regression test (#194502)

See
https://github.com/llvm/llvm-project/pull/194475#issuecomment-4331347690.
This constrains the test to not run on aarch64, where it fails on
`clang-aarch64-quick` and `llvm-clang-aarch64-darwin` builders.
The failing builders don't show any output, and the test will be
re-enabled for aarch64 in a later follow-up.

Co-authored-by: Naveen Seth Hanig <naveen.hanig at oulook.com>
DeltaFile
+3-0clang/test/Driver/modules-driver-import-std.cpp
+3-01 files

LLVM/project 3ad8184llvm/docs LangRef.rst ReleaseNotes.md, llvm/include/llvm/IR DataLayout.h

[DataLayout] Add null pointer value infrastructure

Add support for specifying the null pointer bit representation per address space
in DataLayout via new pointer spec flags:
- 'z': null pointer is all-zeros
- 'o': null pointer is all-ones

When neither flag is present, the address space inherits the default set by the
new 'N<null-value>' top-level specifier ('Nz' or 'No'). If that is also absent,
the null pointer value is zero.

No target DataLayout strings are updated in this change. This is pure
infrastructure for a future ConstantPointerNull semantic change to support
targets with non-zero null pointers (e.g. AMDGPU).
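
The fallback order described above (per-address-space flag, then the top-level default, then zero) can be sketched as follows. This is only an illustration under stated assumptions: `NullKind`, `resolveNullKind`, and `nullBits` are hypothetical names, not the DataLayout API, and the sketch assumes pointer widths of at most 64 bits.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical model of the two null pointer representations.
enum class NullKind { AllZeros, AllOnes };

// Resolve the null representation for one address space. PerAddrSpace models
// an explicit 'z' or 'o' flag on the pointer spec. TopLevelDefault models the
// 'Nz' or 'No' top-level specifier. Either may be absent.
constexpr NullKind resolveNullKind(std::optional<NullKind> PerAddrSpace,
                                   std::optional<NullKind> TopLevelDefault) {
  if (PerAddrSpace)
    return *PerAddrSpace;
  if (TopLevelDefault)
    return *TopLevelDefault;
  return NullKind::AllZeros; // Neither is present: null is zero.
}

// Bit pattern of the null pointer for a pointer of PtrBits bits
// (assumption: PtrBits <= 64).
constexpr std::uint64_t nullBits(NullKind Kind, unsigned PtrBits) {
  if (Kind == NullKind::AllZeros)
    return 0;
  return PtrBits >= 64 ? ~std::uint64_t{0}
                       : ((std::uint64_t{1} << PtrBits) - 1);
}
```

Under this model, an address space with no flag in a layout whose top-level default is all-ones resolves to all-ones, while an explicit all-zeros flag on the address space overrides that default.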
DeltaFile
+136-1llvm/unittests/IR/DataLayoutTest.cpp
+61-6llvm/lib/IR/DataLayout.cpp
+23-1llvm/include/llvm/IR/DataLayout.h
+17-1llvm/docs/LangRef.rst
+8-0llvm/docs/ReleaseNotes.md
+245-95 files

LLVM/project c1e78edclang/include/clang/Driver ModulesDriver.h, clang/lib/Driver ModulesDriver.cpp

Reland "[clang][modules-driver] Add support for C++ named modules and import std" (2nd attempt) (#194475)

This reverts #193857 and relands #193312.

This adds basic support for explicit C++ named module builds, managed
natively by the Clang driver, including support for use of the Standard
library modules. This follows #187606, which adds the same for Clang
modules.

Current limitations:
- Standard library modules are still compiled to object files instead of
using the provided shared library. (This will be addressed in a
follow-up soon.)
- Caching is not supported yet (but likely to be added during the
upcoming GSoC cycle).
- Importing C++ standard library modules into Clang modules is not
supported (and not expected in the near term).

RFC:

https://discourse.llvm.org/t/rfc-modules-support-simple-c-20-modules-use-from-the-clang-driver-without-a-build-system
DeltaFile
+111-0clang/test/Driver/modules-driver-both-modules-types.cpp
+89-11clang/lib/Driver/ModulesDriver.cpp
+88-0clang/test/Driver/modules-driver-cxx-modules-only.cpp
+63-0clang/test/Driver/modules-driver-import-std.cpp
+10-0clang/test/Driver/modules-driver-incompatible-options.cpp
+4-0clang/include/clang/Driver/ModulesDriver.h
+365-112 files not shown
+369-118 files

LLVM/project 94a6a6bclang/include/clang/CIR/Dialect/Builder CIRBaseBuilder.h, clang/lib/CIR/CodeGen CIRGenModule.cpp CIRGenExpr.cpp

[CIR] Handle DeclRefExpr's to NTTP Objects (#194482)

NTTP objects are represented as globals so that you can refer to them,
take their address, etc., but most accesses to them should result in
constant expressions. This patch implements the creation of these
globals, and allows compilation to continue.

This should fix up the last DeclRefExpr LValue that appears other than
MSGuids and named global registers, both of which are specific to
individual attributes.
DeltaFile
+52-0clang/test/CIR/CodeGen/temp-param-obj-decl.cpp
+43-0clang/lib/CIR/CodeGen/CIRGenModule.cpp
+15-2clang/lib/CIR/CodeGen/CIRGenExpr.cpp
+4-3clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h
+4-0clang/lib/CIR/CodeGen/CIRGenModule.h
+118-55 files

LLVM/project 2774ab1llvm/lib/CodeGen InlineSpiller.cpp, llvm/test/CodeGen/AMDGPU regalloc-hoist-spill-live-range-upd.ll

[InlineSpiller] Fix live-range update in hoisting within bb (#193880)

The InlineSpiller tries to shorten the live-ranges used when storing a
value that is defined by a sibling register by performing the following
transformation:
```
a = copy b
store a
```
=>
```
store b
```
That is, it eliminates the copy and stores the original value at the copy
location.

As far as `b`'s live-range is concerned, this transformation is neutral
as long as the store is inserted in place of the copy being removed.


    [37 lines not shown]
DeltaFile
+2,870-0llvm/test/CodeGen/AMDGPU/regalloc-hoist-spill-live-range-upd.ll
+8-0llvm/lib/CodeGen/InlineSpiller.cpp
+2,878-02 files

LLVM/project 5d44230llvm/docs AMDGPUUsage.rst, llvm/docs/AMDGPU DeveloperGuideline.rst

[NFC][AMDGPU][Doc] Add developer guideline

This guideline covers topics on top of the existing LLVM guidelines.
DeltaFile
+297-0llvm/docs/AMDGPU/DeveloperGuideline.rst
+1-0llvm/docs/AMDGPUUsage.rst
+298-02 files

LLVM/project 0b82418flang/lib/Optimizer/Transforms/CUDA CUFAddConstructor.cpp, flang/test/Fir/CUDA cuda-constructor-2.f90

[flang][cuda] Restore constructor for global only module (#194466)
DeltaFile
+36-11flang/lib/Optimizer/Transforms/CUDA/CUFAddConstructor.cpp
+29-0flang/test/Fir/CUDA/cuda-constructor-2.f90
+65-112 files

LLVM/project a9f550cllvm/lib/Target/PowerPC PPCISelLowering.cpp PPCInstr64Bit.td

[PowerPC] Simplify lowering for ldat intrinsics

This change defines 2 new output patterns, `PAIR8` and `EVEN8`,
and uses them to implement the lowering of the intrinsics
`int_ppc_amo_ldat` and `int_ppc_amo_ldat_cond` in TableGen.
As a result, the generated instructions are much clearer, and the
C++ code is also simplified.
DeltaFile
+11-27llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+15-12llvm/lib/Target/PowerPC/PPCInstr64Bit.td
+26-392 files

LLVM/project 6722bb7compiler-rt/lib/asan_abi asan_abi_tbd.txt

[asan_abi] Skip new __asan_get_report_* from ABI (#194463)

PR #181446 ("[asan] API for getting multiple pointer ranges") added five
new __asan_get_report_{dealloc,dest,first,second,src}_address entries to
compiler-rt/lib/asan/asan_interface.inc without updating asan_abi_tbd.txt
or implementing them in compiler-rt/lib/asan_abi/asan_abi.cpp. This broke
the AddressSanitizerABI-arm64-darwin :: Darwin/llvm_interface_symbols.cpp
test, which diffs asan_interface.inc (minus asan_abi_tbd.txt) against the
symbols actually exported by libclang_rt.asan_abi_osx.a.

List the new symbols alongside the existing __asan_get_report_* entries
so the stable-ABI test passes. The symbols remain unimplemented in the
stable ABI library; this change only reflects that they are intentionally
not part of the stable ABI surface.

    [5 lines not shown]
DeltaFile
+5-0compiler-rt/lib/asan_abi/asan_abi_tbd.txt
+5-01 files

LLVM/project e44ea65llvm/lib/Target/RISCV RISCVISelLowering.cpp, llvm/test/CodeGen/RISCV/rvv fixed-vectors-sext-vp.ll fixed-vectors-zext-vp.ll

[RISCV] Remove codegen for vp_zext, vp_sext (#194295)

Part of the work to remove trivial VP intrinsics from the RISC-V
backend, see
https://discourse.llvm.org/t/rfc-remove-codegen-support-for-trivial-vp-intrinsics-in-the-risc-v-backend/87999

This splits off 2 intrinsics from #179622. The vp_truncate combine needs
to be updated to keep the vaaddu patterns for now, but it will be
removed in an upcoming PR.
DeltaFile
+37-61llvm/test/CodeGen/RISCV/rvv/fixed-vectors-sext-vp.ll
+37-60llvm/test/CodeGen/RISCV/rvv/fixed-vectors-zext-vp.ll
+26-53llvm/test/CodeGen/RISCV/rvv/vzext-vp.ll
+26-53llvm/test/CodeGen/RISCV/rvv/vsext-vp.ll
+6-50llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+12-20llvm/test/CodeGen/RISCV/rvv/vpscatter-sdnode.ll
+144-29712 files not shown
+188-35718 files

LLVM/project 822392dllvm/lib/Transforms/Vectorize VectorCombine.cpp, llvm/test/Transforms/VectorCombine/X86 extract-extract-oob.ll

[VectorCombine] reject out-of-bounds extract indexes in foldExtractExtract (#194381)

Fixes #194355

`VectorCombine::foldExtractExtract()` matches any constant-index
`extractelement` operands, but it never verifies that they are in range
for a fixed-width vector.
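
The missing range check can be pictured with a small sketch. `extractIndexesInRange` is a hypothetical predicate written for illustration only; it is not the code added to VectorCombine.cpp, and it assumes the element count of the fixed-width vector is already known.

```cpp
#include <cstdint>

// Hypothetical predicate (not the VectorCombine code): a fold over two
// constant-index extracts is only considered when both indexes are below the
// element count of the fixed-width vector.
constexpr bool extractIndexesInRange(std::uint64_t Index0,
                                     std::uint64_t Index1,
                                     std::uint64_t NumElements) {
  return Index0 < NumElements && Index1 < NumElements;
}

// For a 4-element vector, lanes 0..3 are valid and index 4 is out of bounds.
static_assert(extractIndexesInRange(0, 3, 4), "both indexes are in range");
static_assert(!extractIndexesInRange(1, 4, 4), "index 4 is out of bounds");
```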
DeltaFile
+15-0llvm/test/Transforms/VectorCombine/X86/extract-extract-oob.ll
+7-0llvm/lib/Transforms/Vectorize/VectorCombine.cpp
+22-02 files

LLVM/project 043e778lldb/source/Target Process.cpp Target.cpp

[lldb] Fix two issues causing TestEvents.py flakiness (#194438)

This PR fixes two issues that contribute to `TestEvents.py` being flaky
in CI:

1. `ProcessEventData::DoOnRemoval` runs the full stop-handling logic
(like `ShouldStop` and `RunStopHooks`) every time an event is consumed
from any listener. When the primary listener consumes an event and then
the shadow listener consumes the same event, the logic runs twice. The
second execution can race with subsequent event processing. Fix this by
incrementing `m_update_state` after the first successful run so
secondary listeners skip the full logic.

2. `Target::RunStopHooks` updates `m_latest_stop_hook_id` (marking a stop
as "handled") before checking whether any threads have stop reasons. If
the check fails and hooks don't run, the stop ID is already consumed,
preventing hooks from ever running for that stop. Fix this by deferring
the update until we're certain we'll actually run hooks.
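
The ordering fix in item 2 can be sketched with a toy model. `StopHookRunner` and its members are hypothetical and greatly simplified; they are not the lldb `Target` class. The point is only that the "handled" stop ID is recorded after the stop-reason check passes, not before.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in for the stop-hook bookkeeping.
struct StopHookRunner {
  std::uint32_t LatestHandledStopID = 0; // Last stop whose hooks actually ran.
  unsigned HooksRun = 0;                 // Number of times hooks were run.

  // Returns true if the hooks ran for this stop.
  bool run(std::uint32_t StopID, bool AnyThreadHasStopReason) {
    if (StopID == LatestHandledStopID)
      return false; // Hooks already ran for this stop.
    if (!AnyThreadHasStopReason)
      return false; // Bail out without consuming the stop ID.
    // Only now is it certain the hooks will run, so record the stop ID.
    LatestHandledStopID = StopID;
    ++HooksRun;
    return true;
  }
};
```

In this model, a call that finds no thread with a stop reason leaves the stop ID unconsumed, so a later call for the same stop can still run the hooks exactly once.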
DeltaFile
+10-5lldb/source/Target/Process.cpp
+2-2lldb/source/Target/Target.cpp
+12-72 files

LLVM/project 3c7128bllvm/utils/gn/secondary/llvm/unittests/ADT BUILD.gn

[gn build] Port fe587c5ad92b (#194480)
DeltaFile
+1-0llvm/utils/gn/secondary/llvm/unittests/ADT/BUILD.gn
+1-01 files

LLVM/project 1106bd9llvm/utils/gn/secondary/clang/unittests/Tooling BUILD.gn

[gn build] Port ef18c253321f (#194479)
DeltaFile
+1-0llvm/utils/gn/secondary/clang/unittests/Tooling/BUILD.gn
+1-01 files

LLVM/project ccc0d54llvm/utils/gn/secondary/llvm/lib/ProfileData BUILD.gn

[gn build] Port 0eaa1f5884bf (#194478)
DeltaFile
+1-0llvm/utils/gn/secondary/llvm/lib/ProfileData/BUILD.gn
+1-01 files

LLVM/project 75bef3eclang/tools/clang-sycl-linker CMakeLists.txt

[clang-sycl-linker] Add FrontendOffloading dependency to clang-sycl-linker (#194471)

This PR fixes the build after
https://github.com/llvm/llvm-project/commit/b862554a70eeb9a8315f45406a7e13c771af1eb7

Fixes missing dependency in `clang-sycl-linker/CMakeLists.txt`
DeltaFile
+1-0clang/tools/clang-sycl-linker/CMakeLists.txt
+1-01 files

LLVM/project 8427cb7clang/test/CIR/CodeGen if.cpp cmp.cpp

[CIR] Fix failing tests after bool load change (#194468)

This fixes CIR tests that were failing as a result of the change in how
bool load values are truncated in
https://github.com/llvm/llvm-project/pull/193783
DeltaFile
+5-5clang/test/CIR/CodeGen/if.cpp
+4-4clang/test/CIR/CodeGen/cmp.cpp
+4-4clang/test/CIR/CodeGen/binop.cpp
+4-4clang/test/CIR/CodeGen/ternary-throw.cpp
+3-3clang/test/CIR/CodeGen/ternary.cpp
+2-2clang/test/CIR/CodeGen/complex.cpp
+22-221 files not shown
+23-237 files

LLVM/project 0eaa1f5llvm/include/llvm/ProfileData ETMTraceDecoder.h, llvm/lib/ProfileData ETMTraceDecoder.cpp

Reland "[llvm-profgen] Add support for ETM trace decoding" (#194465)

This relands commit e3bd61890e68303a33fdd33fbdd9abeda (#191584), which
was reverted in commit ec9d7d18bdfe21c30c94c02f14f3613f7b69a17b (#194087)
due to bot failures on ppc64le and Windows.

This reland incorporates the following fixes:

1) Rename member variable InputFile to InputFilePath inside struct
InputFile to resolve MSVC shadow conflicts.

2) Add arm-registered-target to ETM tests REQUIRES directive to prevent
failures on builders that do not have the ARM target enabled.
DeltaFile
+251-0llvm/lib/ProfileData/ETMTraceDecoder.cpp
+71-36llvm/tools/llvm-profgen/llvm-profgen.cpp
+72-17llvm/tools/llvm-profgen/PerfReader.cpp
+81-0llvm/test/tools/llvm-profgen/etm-arch.test
+48-0llvm/test/tools/llvm-profgen/Inputs/etm-opencsd.yaml
+46-0llvm/include/llvm/ProfileData/ETMTraceDecoder.h
+569-538 files not shown
+672-7214 files

LLVM/project 6f2e1a1libc/test/src/sys/socket/linux sendmsg_recvmsg_test.cpp

[libc][test] Remove non-proxy header in sendmsg_recvmsg_test.cpp (#194467)

The header `include/llvm-libc-macros/linux/sys-socket-macros.h` should
be included via `hdr/sys_socket_macros.h`, which proxies based on whether
LIBC_FULL_BUILD is enabled. Otherwise, we mix LLVM-libc internal headers
and system headers.
DeltaFile
+8-3libc/test/src/sys/socket/linux/sendmsg_recvmsg_test.cpp
+8-31 files

LLVM/project 5aa71adllvm/lib/Transforms/Scalar SROA.cpp, llvm/test/Transforms/SROA protected-field-pointer.ll

Revert "SROA: Recognize llvm.protected.field.ptr intrinsics."

This commit turns out to also not be needed after #186548; it now makes
a very small difference (~40 bytes) to Fleetbench binary size.

This reverts commit b0d340557841e0c3f72d19be89aebef2a8ce02e5.

Reviewers: fmayer, nikic

Pull Request: https://github.com/llvm/llvm-project/pull/194109
DeltaFile
+0-93llvm/test/Transforms/SROA/protected-field-pointer.ll
+5-78llvm/lib/Transforms/Scalar/SROA.cpp
+5-1712 files

LLVM/project 11f1a35lldb/examples/python lldbtk.py armv7_cortex_m_target_defintion.py

[lldb] Unify python shebangs (#187257)

As per PEP-0394[1], there is no real consensus over what binary names
Python has; specifically, 'python' could be Python 3, Python 2, or not
exist.

However, everyone has a python3 interpreter and the scripts are all
written for Python 3. Unify the shebangs so that the ~50% of shebangs
that use python now use python3.

[1] https://peps.python.org/pep-0394/
DeltaFile
+1-1lldb/examples/python/lldbtk.py
+1-1lldb/examples/python/armv7_cortex_m_target_defintion.py
+1-1lldb/examples/python/bsd.py
+1-1lldb/examples/python/cmdtemplate.py
+1-1lldb/examples/python/delta.py
+1-1lldb/examples/python/disasm-stress-test.py
+6-635 files not shown
+41-4141 files