LLVM/project a70f82fclang/lib/Analysis/FlowSensitive TypeErasedDataflowAnalysis.cpp, clang/unittests/Analysis/FlowSensitive TypeErasedDataflowAnalysisTest.cpp

[clang][FlowSensitive] Do a quick check and bail early for massive CFGs (#186808)

Bail out early if visiting each reachable basic block once would
exceed the MaxBlockVisits limit. In that case, actually running the
dataflow analysis would hit the limit anyway, but only after wasting
a lot of time.

Another possibility is that we run out of memory (OOM) and the process
crashes. We've seen examples of CFGs whose block counts are 2-8x the
visit limit. Those examples also have lots of `Locs`, which we track in
MapVectors for each BB. Since the maps do not share memory across BBs,
this leads to non-linear memory usage and OOMing before hitting the max
visit limit. With this check, we can avoid OOMing and at least get some
results for the other CFGs in the TU, instead of losing all results when
the process crashes.
DeltaFile
+36-0clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
+31-2clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
+67-22 files
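
The early-bail idea in this commit can be illustrated with a small standalone sketch. It does not use clang's real CFG or dataflow types: `ToyCFG`, `countReachableBlocks`, and `shouldBailEarly` are hypothetical names, and the exact comparison against the limit is an assumption. The sketch counts the blocks reachable from the entry and reports whether one visit per block would already exceed the limit.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical stand-in for a CFG: Succs[B] lists the successors of block B.
using ToyCFG = std::vector<std::vector<std::size_t>>;

// Counts the blocks reachable from Entry with a breadth-first walk.
std::size_t countReachableBlocks(const ToyCFG &Succs, std::size_t Entry) {
  if (Entry >= Succs.size())
    return 0;
  std::vector<bool> Seen(Succs.size(), false);
  std::queue<std::size_t> Work;
  Seen[Entry] = true;
  Work.push(Entry);
  std::size_t Count = 0;
  while (!Work.empty()) {
    std::size_t B = Work.front();
    Work.pop();
    ++Count;
    for (std::size_t S : Succs[B]) {
      if (S < Succs.size() && !Seen[S]) {
        Seen[S] = true;
        Work.push(S);
      }
    }
  }
  return Count;
}

// Returns true when visiting every reachable block once would already
// exceed the visit budget (assumed comparison: strictly greater than).
bool shouldBailEarly(const ToyCFG &Succs, std::size_t Entry,
                     std::size_t MaxBlockVisits) {
  return countReachableBlocks(Succs, Entry) > MaxBlockVisits;
}
```

The walk is linear in the number of blocks and edges, so it is cheap compared with running the analysis until the visit limit trips.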

LLVM/project 4daf632libclc/clc/lib/generic/math clc_tan.inc

libclc: Fix vector float tan

This needs to be converted to a boolean for the vector
select to work correctly.
DeltaFile
+1-1libclc/clc/lib/generic/math/clc_tan.inc
+1-11 files

Linux/linux 8a30aebfs/nfsd export.c nfsctl.c, net/sunrpc cache.c

Merge tag 'nfsd-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd fixes from Chuck Lever:

 - Fix cache_request leak in cache_release()

 - Fix heap overflow in the NFSv4.0 LOCK replay cache

 - Hold net reference for the lifetime of /proc/fs/nfs/exports fd

 - Defer sub-object cleanup in export "put" callbacks

* tag 'nfsd-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  nfsd: fix heap overflow in NFSv4.0 LOCK replay cache
  sunrpc: fix cache_request leak in cache_release
  NFSD: Hold net reference for the lifetime of /proc/fs/nfs/exports fd
  NFSD: Defer sub-object cleanup in export put callbacks
DeltaFile
+54-9fs/nfsd/export.c
+21-5net/sunrpc/cache.c
+19-3fs/nfsd/nfsctl.c
+12-5fs/nfsd/state.h
+7-2fs/nfsd/nfs4xdr.c
+5-2fs/nfsd/export.h
+118-266 files

LLVM/project 2a89e24flang-rt/lib/runtime namelist.cpp

[flang] [flang-rt] Subscript overrun could occur in namelists during a READ command. (#176959)

NOTE: This is a new pull request, as the prior one didn't have labels
properly applied.

If a bad subscript is provided in a namelist record, the
HandleSubscripts() routine can read subscripts without bound. This patch
ensures that a read will not go beyond the rank of the expected
variable.

The failure will then be captured in the return status (IOSTAT) of the
READ.

The small test demonstrates the failure before and after the fix.

---------

Co-authored-by: Kevin Wyatt <kwyatt at hpe.com>
DeltaFile
+3-1flang-rt/lib/runtime/namelist.cpp
+3-11 files
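
A rank guard of the kind described in this commit might look like the following sketch. It is not the flang-rt code: `parseSubscripts`, `SubscriptStatus`, and the simplified `(i,j,...)` input syntax are assumptions made for illustration. The point is that the subscript count is checked against the rank before each store, and the failure is returned as a status rather than running past the end.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

enum class SubscriptStatus { Ok, TooManySubscripts, Malformed };

// Parses "(i,j,...)" into Out, refusing to accept more than Rank subscripts.
// Only the upper bound is checked here; fewer subscripts are accepted.
SubscriptStatus parseSubscripts(const std::string &Text, std::size_t Rank,
                                std::vector<long> &Out) {
  Out.clear();
  std::size_t Pos = 0;
  if (Text.empty() || Text[Pos] != '(')
    return SubscriptStatus::Malformed;
  ++Pos;
  while (true) {
    bool Negative = false;
    if (Pos < Text.size() && Text[Pos] == '-') {
      Negative = true;
      ++Pos;
    }
    if (Pos >= Text.size() ||
        !std::isdigit(static_cast<unsigned char>(Text[Pos])))
      return SubscriptStatus::Malformed;
    long Value = 0;
    while (Pos < Text.size() &&
           std::isdigit(static_cast<unsigned char>(Text[Pos]))) {
      Value = Value * 10 + (Text[Pos] - '0');
      ++Pos;
    }
    // The guard: never store a subscript beyond the variable's rank.
    if (Out.size() >= Rank)
      return SubscriptStatus::TooManySubscripts;
    Out.push_back(Negative ? -Value : Value);
    if (Pos >= Text.size())
      return SubscriptStatus::Malformed;
    if (Text[Pos] == ')')
      return SubscriptStatus::Ok;
    if (Text[Pos] != ',')
      return SubscriptStatus::Malformed;
    ++Pos;
  }
}
```

A caller would map a non-`Ok` status to its own error reporting, in the same spirit as the commit surfacing the failure through IOSTAT.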

LLVM/project 80f11cbclang-tools-extra/clang-tidy/modernize TypeTraitsCheck.cpp, clang-tools-extra/clang-tidy/readability RedundantTypenameCheck.cpp

release/22.x: Backport workarounds for certain addMatcher overloads ignoring traversal kind
DeltaFile
+14-0clang-tools-extra/test/clang-tidy/checkers/modernize/type-traits.cpp
+3-1clang-tools-extra/clang-tidy/modernize/TypeTraitsCheck.cpp
+1-1clang-tools-extra/clang-tidy/readability/RedundantTypenameCheck.cpp
+18-23 files

LLVM/project 8ac0a12llvm/lib/Target/PowerPC PPCISelLowering.cpp, llvm/test/CodeGen/PowerPC load-i128-eq-chain.ll

[PowerPC] Preserve load output chain in vcmpequb combine (#187010)

Replace uses of the old load output chain with the new load output
chain. A plain replacement here is fine because the transform verifies
the load is one-use.

Fixes https://github.com/llvm/llvm-project/issues/186549.

(cherry picked from commit 7404a5dbe0ca971e0f312a366019361fc9d576e0)
DeltaFile
+47-0llvm/test/CodeGen/PowerPC/load-i128-eq-chain.ll
+5-2llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+52-22 files

LLVM/project 20795d7clang/lib/Sema SemaConcept.cpp SemaTemplateDeduction.cpp, clang/test/CXX/expr/expr.prim/expr.prim.req nested-requirement.cpp

[clang] Backport: use canonical arguments for checking function template constraints

Backport from #186889

This is a partial revert of #161671, restoring the original behaviour
where the canonical template arguments are used for function template
constraint checking in diagnostics.

This reverts the fix from #183010, which attempted to fix #182344
but caused regressions. These regressions now have test cases
included.

The attempt at #183010 is flawed because in the general case we can't
check satisfaction for constraints which have unsubstituted template
arguments, even if they don't affect the canonical type (i.e. they are
purely syntactical), because these types can still turn out to be
invalid after substitution.


    [20 lines not shown]
DeltaFile
+43-0clang/test/SemaTemplate/concepts.cpp
+4-4clang/test/SemaCXX/cxx2b-deducing-this.cpp
+6-1clang/lib/Sema/SemaConcept.cpp
+3-4clang/lib/Sema/SemaTemplateDeduction.cpp
+3-3clang/test/SemaTemplate/concepts-recursive-inst.cpp
+1-1clang/test/CXX/expr/expr.prim/expr.prim.req/nested-requirement.cpp
+60-136 files

LLVM/project 160377dlld/ELF/Arch RISCV.cpp LoongArch.cpp, lld/test/ELF riscv-relax-synthetic-in-text.s loongarch-relax-synthetic-in-text.s

[lld][ELF] Fix crash when relaxation pass encounters synthetic sections

In LoongArch and RISC-V, the relaxation pass iterates over input sections
within executable output sections. When a linker script places a synthetic
section (e.g., .got) into such an output section, the linker would crash
because synthetic sections do not have the relaxAux field initialized.

The relaxAux data structure is only allocated for non-synthetic sections
in initSymbolAnchors. This patch adds the necessary null checks in the
relaxation loops (relaxOnce and finalizeRelax) to skip sections that
do not require relaxation.

A null check is also added to elf::initSymbolAnchors to ensure the
subsequent sorting of anchors is safe.

Fixes: #184757

Reviewers: MaskRay


    [3 lines not shown]
DeltaFile
+33-0lld/test/ELF/riscv-relax-synthetic-in-text.s
+31-0lld/test/ELF/loongarch-relax-synthetic-in-text.s
+8-1lld/ELF/Arch/RISCV.cpp
+4-1lld/ELF/Arch/LoongArch.cpp
+76-24 files
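
The null-check pattern from this commit can be sketched without any lld types. `ToySection`, `ToyRelaxAux`, `initAux`, and `relaxOnceSketch` are hypothetical stand-ins, and the "relaxation work" is a placeholder. The sketch only shows that auxiliary data is allocated for non-synthetic sections and that the loop skips sections where it is missing.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical, heavily simplified stand-ins for the linker's data structures.
struct ToyRelaxAux {
  std::vector<int> Anchors;
};

struct ToySection {
  bool Synthetic = false;
  std::unique_ptr<ToyRelaxAux> RelaxAux; // stays null for synthetic sections
};

// Allocates auxiliary data only for non-synthetic sections.
void initAux(std::vector<ToySection> &Sections) {
  for (ToySection &S : Sections)
    if (!S.Synthetic)
      S.RelaxAux = std::make_unique<ToyRelaxAux>();
}

// Visits every section but skips those without auxiliary data instead of
// dereferencing a null pointer. Returns how many sections were processed.
std::size_t relaxOnceSketch(std::vector<ToySection> &Sections) {
  std::size_t Processed = 0;
  for (ToySection &S : Sections) {
    if (!S.RelaxAux)
      continue; // e.g. a synthetic section placed in an executable output section
    S.RelaxAux->Anchors.push_back(0); // placeholder for real relaxation work
    ++Processed;
  }
  return Processed;
}
```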

LLVM/project dc8750dlibcxx/lib/abi i686-linux-android21.libcxxabi.v1.stable.exceptions.nonew.abilist x86_64-linux-android21.libcxxabi.v1.stable.exceptions.nonew.abilist, libcxx/src iostream.cpp

[libc++] Fix iostream size ABI break (#185839)

In #124103 we changed the size of various iostream objects, which turns
out to be ABI breaking when compiling non-PIE code.

This ABI break is safe to fix, since for any programs allocating more
memory for the iostream objects, the remaining bytes are simply unused
now.

Fixes #185724

(cherry picked from commit c1d26c3c25106be2bc5b2b5a440faa5b93488de5)
DeltaFile
+55-36libcxx/src/iostream.cpp
+8-8libcxx/lib/abi/i686-linux-android21.libcxxabi.v1.stable.exceptions.nonew.abilist
+8-8libcxx/lib/abi/x86_64-linux-android21.libcxxabi.v1.stable.exceptions.nonew.abilist
+8-8libcxx/lib/abi/x86_64-unknown-freebsd.libcxxabi.v1.stable.exceptions.nonew.abilist
+8-8libcxx/lib/abi/x86_64-unknown-linux-gnu.libcxxabi.v1.stable.exceptions.nonew.abilist
+8-8libcxx/lib/abi/x86_64-unknown-linux-gnu.libcxxabi.v1.stable.noexceptions.nonew.abilist
+95-766 files

LLVM/project 3611dc1llvm/test/CodeGen/AArch64 framelayout-fpr128-csr.ll

Fix comment in FPR128 test (NFC)

(cherry picked from commit 4c31b6f93c7d8499b93cd6d29b8874a62f2cfed0)
DeltaFile
+3-3llvm/test/CodeGen/AArch64/framelayout-fpr128-csr.ll
+3-31 files

LLVM/project 34bf276llvm/lib/Target/AArch64 AArch64FrameLowering.cpp, llvm/test/CodeGen/AArch64 framelayout-fpr128-spill.mir framelayout-fpr128-csr.ll

[AArch64] Ensure FPR128 callee-save stack offsets are aligned (#184314)

This was benign for Linux targets (as when dividing by the scale the
offset would be correctly truncated), so only resulted in failures with
`-DLLVM_ENABLE_ASSERTIONS=On`. On Windows, this was a miscompile as the
lack of alignment would result in the FPR128 callee-save getting
assigned to the same offset as the previous GPR.

Fixes: #183708
(cherry picked from commit 327f1adef8df6afc07f6c88cfa380c97399af3dc)
DeltaFile
+38-0llvm/test/CodeGen/AArch64/framelayout-fpr128-spill.mir
+33-0llvm/test/CodeGen/AArch64/framelayout-fpr128-csr.ll
+15-5llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
+86-53 files
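
As a rough illustration of the arithmetic involved, the sketch below rounds an offset up to a 16-byte boundary and shows how a truncating division by the scale can hide a misaligned offset. The helper names are hypothetical, and the real frame layout (stack growth direction, Windows unwind constraints) is not modeled.

```cpp
#include <cassert>
#include <cstdint>

// Rounds Value up to the next multiple of Align (Align must be a power of two).
constexpr std::uint64_t alignUp(std::uint64_t Value, std::uint64_t Align) {
  return (Value + Align - 1) & ~(Align - 1);
}

// Hypothetical helper: offset for a 16-byte FPR128 slot placed after
// BytesUsedSoFar bytes of earlier callee-saves.
constexpr std::uint64_t fpr128SlotOffset(std::uint64_t BytesUsedSoFar) {
  return alignUp(BytesUsedSoFar, 16);
}

// A scaled immediate stores Offset / Scale; integer division truncates, so a
// misaligned offset silently encodes the same value as the aligned one below it.
constexpr std::uint64_t scaledImm(std::uint64_t Offset, std::uint64_t Scale) {
  return Offset / Scale;
}
```

For example, offsets 16 and 24 both produce a scaled immediate of 1 with a scale of 16, whereas rounding 24 up first yields offset 32 and immediate 2.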

LLVM/project d70ebc8lldb/examples/python formatter_bytecode.py, lldb/test/Shell/ScriptInterpreter/Python/Inputs/FormatterBytecode RigidArrayLLDBFormatterC.txt RigidArrayLLDBFormatterSwift.txt

[lldb][bytecode] Compile pick ops using unsigned literal (#187376)

The `pick` op requires an unsigned integer index. Use the `u` suffix
when generating `pick` operations in the Python->formatter-bytecode
compiler.
DeltaFile
+3-3lldb/test/Shell/ScriptInterpreter/Python/Inputs/FormatterBytecode/RigidArrayLLDBFormatterC.txt
+3-3lldb/test/Shell/ScriptInterpreter/Python/Inputs/FormatterBytecode/RigidArrayLLDBFormatterSwift.txt
+2-2lldb/examples/python/formatter_bytecode.py
+8-83 files

LLVM/project c1c8337clang/lib/Headers __clang_cuda_runtime_wrapper.h

[clang][CUDA] Define _NV_RSQRT_SPECIFIER for glibc-2.42/cuda-13.2 compatibility (#185701)

CUDA-13.2 defines _NV_RSQRT_SPECIFIER to make its headers compilable
with glibc 2.42+. However, clang does not include the header that
defines the macro, and has to define it itself.

(cherry picked from commit ab048ac6c0339c631ea4a1064b675318867a3853)
DeltaFile
+12-0clang/lib/Headers/__clang_cuda_runtime_wrapper.h
+12-01 files

LLVM/project b73f72ellvm/docs ReleaseNotes.md

Update ptrtoaddr provenance reference in ReleaseNotes
DeltaFile
+1-1llvm/docs/ReleaseNotes.md
+1-11 files

LLVM/project 227ced1flang/lib/Optimizer/Builder IntrinsicCall.cpp, flang/lib/Optimizer/HLFIR/Transforms SimplifyHLFIRIntrinsics.cpp

[flang] Use integer arith.max/min operations for max/min lowering. (#186466)

arith.maxsi/maxui/minsi/minui are more concise than cmp+select
and probably allow more folding, so we should use them in Flang lowering.
DeltaFile
+20-40flang/test/Lower/HLFIR/custom-intrinsic.f90
+43-9flang/test/Lower/fp-maxmin-behavior.f90
+9-39flang/lib/Optimizer/HLFIR/Transforms/SimplifyHLFIRIntrinsics.cpp
+24-15flang/lib/Optimizer/Builder/IntrinsicCall.cpp
+10-20flang/test/Lower/Intrinsics/max.f90
+4-8flang/test/Lower/HLFIR/array-ctor-as-inlined-temp.f90
+110-13117 files not shown
+136-18223 files

FreeBSD/src 85cf26csys/netinet in_var.h

in_var.h: fixup comments that mention use by ifmcstat(8)
DeltaFile
+1-4sys/netinet/in_var.h
+1-41 files

FreeBSD/src ad0e698usr.sbin/ifmcstat ifmcstat.c ifmcstat.8

ifmcstat: remove libkvm(3) code

It has been broken and disabled for over 10 years.  Remove mentions of
kvm(3) from the manual page.
DeltaFile
+2-500usr.sbin/ifmcstat/ifmcstat.c
+1-44usr.sbin/ifmcstat/ifmcstat.8
+3-5442 files

LLVM/project 752ccf7flang/lib/Lower OpenACC.cpp, flang/lib/Semantics resolve-names.cpp

[flang][openacc][cuda] Add implicit device attribute for use_device unconditionally (#186844)

For interoperability between CUDA Fortran and OpenACC.
DeltaFile
+33-34flang/lib/Lower/OpenACC.cpp
+5-13flang/lib/Semantics/resolve-names.cpp
+1-1flang/test/Semantics/OpenACC/bug1583.f90
+39-483 files

LLVM/project e044c4aclang/include/clang/Basic BuiltinsAMDGPU.td, clang/test/CodeGenOpenCL builtins-amdgcn-swmmac-gfx1250-err.cl builtins-amdgcn-swmmac-w32-gfx10-err.cl

[AMDGPU] Add target features for SWMMAC instructions (#185785)

Introduce `swmmac-gfx1200-insts` and `swmmac-gfx1250-insts`
DeltaFile
+36-36clang/include/clang/Basic/BuiltinsAMDGPU.td
+36-0clang/test/CodeGenOpenCL/builtins-amdgcn-swmmac-gfx1250-err.cl
+31-0clang/test/CodeGenOpenCL/builtins-amdgcn-swmmac-w32-gfx10-err.cl
+30-0clang/test/CodeGenOpenCL/builtins-amdgcn-swmmac-w64-gfx10-err.cl
+13-1llvm/lib/Target/AMDGPU/AMDGPU.td
+6-1llvm/lib/TargetParser/TargetParser.cpp
+152-381 files not shown
+156-387 files

LLVM/project f42df8cclang/include/clang/AST TemplateBase.h, clang/lib/AST TemplateBase.cpp TypeLoc.cpp

[clang] fix crash related to missing source locations for converted template arguments

This adds a way to attach source locations to trivially created template
arguments such as packs, or converted expressions when there is no
expression anymore.

This also avoids crashes due to missing source locations.

In a few places where this matters, we already create expressions
from the converted arguments, but this requires access to Sema,
whereas currently creating trivial typelocs only requires access to
the ASTContext.

So this creates a new storage kind for TemplateArgumentLocs, where
a single SourceLocation is stored, embedded in the pointer where
possible.

As a drive-by, strengthen asserts by enforcing that TemplateArgumentLocs
are created with the right kinds of locations.

    [2 lines not shown]
DeltaFile
+54-3clang/include/clang/AST/TemplateBase.h
+19-0clang/lib/AST/TemplateBase.cpp
+4-4clang/lib/Sema/SemaExpr.cpp
+2-5clang/lib/AST/TypeLoc.cpp
+7-0clang/test/SemaCXX/type_pack_element.cpp
+3-3clang/lib/Sema/SemaTemplate.cpp
+89-152 files not shown
+91-168 files

LLVM/project 3de7814mlir/lib/Conversion/XeVMToLLVM XeVMToLLVM.cpp, mlir/test/Conversion/XeVMToLLVM legalize_large_vector.mlir

[MLIR][XeVM] Update HandleVectorExtractPattern (#186247)

isExtractContiguousSlice:
- Check that the mask size is not greater than the vector size of the operand.
- Check that mask values do not exceed the vector size.

HandleVectorExtractPattern:
- Narrow the scope of matching to:
  - Source shuffle doing contiguous extract
  - Source shuffle with at least the same mask size.
DeltaFile
+24-0mlir/test/Conversion/XeVMToLLVM/legalize_large_vector.mlir
+19-2mlir/lib/Conversion/XeVMToLLVM/XeVMToLLVM.cpp
+43-22 files
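
The two mask checks listed for isExtractContiguousSlice can be sketched as a standalone predicate. This is not the MLIR implementation: `isContiguousSliceMask` is a hypothetical name, the mask is a plain vector of indices, and treating an index equal to the vector size as out of range is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical check: does Mask select a contiguous run of elements from a
// vector holding VecSize elements?
bool isContiguousSliceMask(const std::vector<std::int64_t> &Mask,
                           std::size_t VecSize) {
  // The mask must not be larger than the source vector.
  if (Mask.empty() || Mask.size() > VecSize)
    return false;
  for (std::size_t I = 0; I < Mask.size(); ++I) {
    // Every index must refer to an element inside the source vector.
    if (Mask[I] < 0 || static_cast<std::size_t>(Mask[I]) >= VecSize)
      return false;
    // Indices must increase by exactly one to form a contiguous slice.
    if (I > 0 && Mask[I] != Mask[I - 1] + 1)
      return false;
  }
  return true;
}
```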

LLVM/project a9181e8clang/lib/CIR/Dialect/Transforms FlattenCFG.cpp, clang/test/CIR/CodeGen loop-cond-cleanup.cpp

[CIR] Fix CFG flattening for loops with cleanup in special regions (#187369)

If a loop required a cleanup scope in the condition or step region of
the loop, we crashed during CFG flattening because the flattening of the
cleanup scope created multiple blocks in the region, but we were
assuming there would only be one block.

This change updates the CFG flattening code to look for the
cir.condition or cir.yield operation in the last block of the region.
DeltaFile
+243-0clang/test/CIR/CodeGen/loop-cond-cleanup.cpp
+12-5clang/lib/CIR/Dialect/Transforms/FlattenCFG.cpp
+255-52 files

LLVM/project c95af40mlir/include/mlir/Dialect/LLVMIR XeVMOps.td, mlir/lib/Dialect/LLVMIR/IR XeVMDialect.cpp

[MLIR][XeVM] Add truncf and mma_mx op. (#180055)

truncf op converts 16-bit floats to 8-bit or 4-bit floats.
mma_mx op does cooperative matrix multiply accumulate on
8-bit or 4-bit float types with an 8-bit scale value.
DeltaFile
+100-4mlir/include/mlir/Dialect/LLVMIR/XeVMOps.td
+32-0mlir/test/Dialect/LLVMIR/xevm.mlir
+31-0mlir/lib/Dialect/LLVMIR/IR/XeVMDialect.cpp
+25-0mlir/test/Dialect/LLVMIR/invalid.mlir
+188-44 files

LLVM/project 0b49adcllvm/lib/Target/AMDGPU AMDGPUMachineFunctionInfo.cpp AMDGPUMachineFunction.cpp

[AMDGPU] Rename AMDGPUMachineFunction to AMDGPUMachineFunctionInfo. NFC. (#187276)

This is derived from MachineFunctionInfo not MachineFunction.
DeltaFile
+237-0llvm/lib/Target/AMDGPU/AMDGPUMachineFunctionInfo.cpp
+0-235llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.cpp
+0-137llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.h
+125-0llvm/lib/Target/AMDGPU/AMDGPUMachineFunctionInfo.h
+6-6llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
+5-4llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
+373-38216 files not shown
+399-40722 files

LLVM/project fce100ellvm/lib/Transforms/Vectorize VPlanTransforms.cpp, llvm/test/Transforms/LoopVectorize predicated-multiple-exits.ll predicated-early-exits-interleave.ll

[VPlan] Fix masked_cond expansion.

masked_cond is used to combine early-exit conditions with masks from
predication. The early-exit condition should only be evaluated if the mask
is true. Emit the mask first, to avoid incorrect poison propagation.

Fixes https://github.com/llvm/llvm-project/issues/187061.
DeltaFile
+24-24llvm/test/Transforms/LoopVectorize/predicated-multiple-exits.ll
+8-8llvm/test/Transforms/LoopVectorize/predicated-early-exits-interleave.ll
+5-5llvm/test/Transforms/LoopVectorize/predicated-single-exit.ll
+1-1llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+38-384 files
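
Why operand order matters for poison can be shown with a toy model. This is not the VPlan expansion itself: `Val`, `selectVal`, and `logicalAnd` are hypothetical names, and `std::nullopt` stands in for poison. The model follows LLVM's `select` semantics, where a poison condition yields poison but a poison value in the arm that is not chosen is ignored.

```cpp
#include <cassert>
#include <optional>

// std::nullopt models a poison value; otherwise the bool is a defined i1.
using Val = std::optional<bool>;

// select C, T, F: poison if C is poison, otherwise the chosen arm
// (poison in the arm that is not chosen does not propagate).
Val selectVal(Val C, Val T, Val F) {
  if (!C)
    return std::nullopt;
  return *C ? T : F;
}

// Logical and expressed as "select A, B, false": B only matters when A is true.
Val logicalAnd(Val A, Val B) { return selectVal(A, B, Val(false)); }
```

With the mask as the first operand, `logicalAnd(mask, cond)` is `false` whenever the mask is `false`, even if `cond` is poison. With the operands swapped, a poison `cond` makes the whole result poison.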

LLVM/project d1e625cclang-tools-extra/docs ReleaseNotes.rst, clang-tools-extra/docs/clang-tidy/checks/bugprone unchecked-optional-access.rst

[clang-tidy] `bugprone-unchecked-optional-access`: Add support for GTest asserts like `ASSERT_TRUE` and `ASSERT_FALSE` (#186363)

Resolves https://github.com/llvm/llvm-project/issues/181737

Addresses false positives reported in
https://github.com/llvm/llvm-project/issues/181737.

This PR is heavily inspired by
https://github.com/llvm/llvm-project/pull/170947.

Many thanks to @fmayer for the prior work.

---------

Co-authored-by: EugeneZelenko <eugene.zelenko at gmail.com>
DeltaFile
+90-0clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp
+67-0clang/unittests/Analysis/FlowSensitive/UncheckedOptionalAccessModelTest.cpp
+19-0clang-tools-extra/docs/clang-tidy/checks/bugprone/unchecked-optional-access.rst
+5-0clang-tools-extra/docs/ReleaseNotes.rst
+181-04 files

LLVM/project aeff312llvm/test/CodeGen/AMDGPU machine-scheduler-sink-trivial-remats-attr.mir

[AMDGPU] Remastered 1 test now that TargetOccupancy is clamped.
DeltaFile
+2-2llvm/test/CodeGen/AMDGPU/machine-scheduler-sink-trivial-remats-attr.mir
+2-21 files

LLVM/project 5a5c317mlir/include/mlir-c/Target ExportSMTLIB.h, mlir/include/mlir/Target/SMTLIB ExportSMTLIB.h

[MLIR][Python] Add optional emit reset to exportSMTLIB (#187366)

Previously, MLIR's Python binding `smt.export_smtlib(...)` always
emitted `(reset)` at the end of the SMT-LIB string as a solver terminator.
This PR adds an option to suppress this trailing command, as downstream
users like the Python z3 module don't need it.
DeltaFile
+12-14mlir/lib/CAPI/Target/ExportSMTLIB.cpp
+9-8mlir/lib/Bindings/Python/DialectSMT.cpp
+8-3mlir/test/CAPI/smt.c
+4-4mlir/include/mlir-c/Target/ExportSMTLIB.h
+2-1mlir/lib/Target/SMTLIB/ExportSMTLIB.cpp
+2-0mlir/include/mlir/Target/SMTLIB/ExportSMTLIB.h
+37-306 files
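
The shape of such an option can be sketched as follows. This is not the MLIR C or C++ API: `exportSmtlibSketch` and its `EmitReset` parameter are hypothetical, and the input is assumed to be a list of already-rendered SMT-LIB commands.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical exporter: joins already-rendered SMT-LIB commands, one per
// line, and appends "(reset)" only when EmitReset is true.
std::string exportSmtlibSketch(const std::vector<std::string> &Commands,
                               bool EmitReset) {
  std::string Out;
  for (const std::string &C : Commands) {
    Out += C;
    Out += '\n';
  }
  if (EmitReset)
    Out += "(reset)\n";
  return Out;
}
```

Callers that feed the text to a solver API rather than a solver process can pass `false` and get the commands without the trailing terminator.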

OpenBSD/ports fJfY9kAwww/webkitgtk4 distinfo Makefile, www/webkitgtk4/patches patch-Source_WebCore_platform_graphics_gbm_MemoryMappedGPUBuffer_cpp

   Update to webkitgtk{41,60}-2.52.0.
VersionDeltaFile
1.10+1-8www/webkitgtk4/pkg/PFRAG.webkitgtk60
1.4+2-2www/webkitgtk4/patches/patch-Source_WebCore_platform_graphics_gbm_MemoryMappedGPUBuffer_cpp
1.143+2-2www/webkitgtk4/distinfo
1.254+1-1www/webkitgtk4/Makefile
1.37+0-1www/webkitgtk4/pkg/PLIST
1.9+1-0www/webkitgtk4/pkg/PFRAG.no-webkitgtk60
+7-146 files

LLVM/project 360fab6llvm/lib/Target/RISCV RISCVSchedSiFive7.td RISCVInstrPredicates.td, llvm/test/tools/llvm-mca/RISCV/Inputs mul-div-rv32.s

[RISCV] Fix IDiv/IRem scheduling data for RV32 cores that use the SiFive7 model (#187331)

The integer division and remainder instructions on a 32-bit core that
uses the SiFive7 scheduling model should have the same latency and
throughput as their word counterparts on a 64-bit SiFive7 core.

This patch fixes those scheduling entries by adding a new SchedPred that
predicates on `Feature64Bit` to toggle the SchedVariant that is attached
to the affected integer division / remainder instructions.
DeltaFile
+59-0llvm/test/tools/llvm-mca/RISCV/SiFive7/mul-div-rv32.test
+15-6llvm/lib/Target/RISCV/RISCVSchedSiFive7.td
+10-0llvm/test/tools/llvm-mca/RISCV/Inputs/mul-div-rv32.s
+4-0llvm/lib/Target/RISCV/RISCVInstrPredicates.td
+88-64 files