LLVM/project 1e10f9aclang/lib/Lex ModuleMap.cpp, clang/unittests/Lex ModuleMapTest.cpp CMakeLists.txt

[clang][Lex] Collapse relative extern module paths when recursing to prevent unbounded path length growth. (#193691)

Ref #147220.

### Problem Description
Bazel's use of clang modules for its `layering_check` emits `extern
module` declarations relative to some base path, meaning those paths
usually include long sequences of `../` followed by the path to the
module itself.
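
As an illustration of the kind of path described above, here is a minimal sketch of lexically collapsing `../` sequences. It uses `std::filesystem` (C++17) and is not the code added to `ModuleMap.cpp`; the helper names `collapseRelative` and `resolveExtern` and the example paths are hypothetical.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

// Hypothetical helper: collapse "." and ".." components purely lexically,
// without consulting the filesystem.
std::string collapseRelative(const std::string &raw) {
  return std::filesystem::path(raw).lexically_normal().generic_string();
}

// Hypothetical helper: resolve an `extern module` path against the directory
// of the module map that mentions it, collapsing the result at each step so
// repeated recursion cannot keep accumulating "../" sequences.
std::string resolveExtern(const std::string &currentDir,
                          const std::string &externPath) {
  const std::filesystem::path joined =
      std::filesystem::path(currentDir) / externPath;
  return collapseRelative(joined.string());
}
```

For example, `assert(collapseRelative("a/b/../c") == "a/c");` holds under this sketch. Leading `..` components that cannot be cancelled are preserved.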

When parsing `extern module` in the module file, we (I believe
intentionally) silently ignore missing module files. Currently, in the
problem case, a file existence check that fails for any _other_ reason
is also silently ignored. This means that `-fmodules-strict-decluse`,
which Bazel uses for `layering_check`, can throw a spurious
`err_undeclared_use_of_module` error, which is the problem reported in
#147220.

Clang's `extern module` parsing chooses to concatenate these relative

    [17 lines not shown]
DeltaFile
+132-0clang/unittests/Lex/ModuleMapTest.cpp
+7-0clang/lib/Lex/ModuleMap.cpp
+1-0clang/unittests/Lex/CMakeLists.txt
+140-03 files

LLVM/project 9b4f445flang/test/Lower assignment.f90 allocatable-callee.f90, flang/test/Lower/Intrinsics count.f90

[flang][NFC] Converted five tests from old lowering to new lowering (part 53) (#194772)

Convert five tests to use new HLFIR lowering instead of legacy FIR
lowering:
Lower/allocatable-callee.f90, Lower/allocatable-caller.f90,
Lower/assignment.f90, Lower/assumed-shape-caller.f90,
Lower/Intrinsics/count.f90
DeltaFile
+157-115flang/test/Lower/assignment.f90
+51-45flang/test/Lower/allocatable-callee.f90
+35-40flang/test/Lower/Intrinsics/count.f90
+33-35flang/test/Lower/assumed-shape-caller.f90
+25-17flang/test/Lower/allocatable-caller.f90
+301-2525 files

LLVM/project 7d0308cllvm/lib/Transforms/Vectorize LoopVectorize.cpp, llvm/test/Transforms/LoopVectorize epilog-vectorization-fixed-order-recurrences.ll

[VPlan] Check FOR/FMinMaxNum epilogue restrictions in VPlan. (#191815)

Move the FOR/FMinMaxNum restriction checks for epilogue
vectorization to hasUnsupportedHeaderPhiRecipe and perform them
directly on VPlan.

This unifies the checking code and enables epilogue vectorization of
VPlans with dead FORs, although the latter should be cleaned up by
scalar optimizations earlier in practice.

PR: https://github.com/llvm/llvm-project/pull/191815
DeltaFile
+72-23llvm/test/Transforms/LoopVectorize/epilog-vectorization-fixed-order-recurrences.ll
+32-34llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+104-572 files

LLVM/project 6446435llvm/lib/Transforms/InstCombine InstCombineSelect.cpp, llvm/test/Transforms/InstCombine fcmp-select.ll

[InstCombine] Fold select of ordered fcmps of fabs over NaN-scrubber selects to a single select (#192182)

Fold `select (fcmp <ordered> (fabs (select isKnownNeverNaN X, X, Y)),
K), ...` into a single compare/select directly on `X`. The outer fcmp is
limited to ordered predicates, since only they preserve the original
non-NaN behavior.


Fixes #143649
alive2: https://alive2.llvm.org/ce/z/G8UmjY


Generalized proof (needs local alive2 build):
```alive2
declare double @llvm.fabs.f64(double)

define double @src(double %x, double %y, double %k) {
entry:
  %ord = fcmp ord double %x, 0.000000e+00

    [14 lines not shown]
DeltaFile
+248-0llvm/test/Transforms/InstCombine/fcmp-select.ll
+77-0llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
+325-02 files

LLVM/project aa190f5mlir/include/mlir/Dialect/X86 X86.td, mlir/lib/Dialect/X86/IR X86Dialect.cpp

[mlir][x86] Support for `f8` AMX tiled dot-product. (#194786)

This patch enables AMX tiled dot-product support for `f8E4M3FN` and
`f8E5M2` types in MLIR by lowering to the following LLVM intrinsics:

- `llvm.x86.tdpbf8ps`
- `llvm.x86.tdpbhf8ps`
- `llvm.x86.tdphbf8ps`
- `llvm.x86.tdphf8ps`
DeltaFile
+64-0mlir/test/Dialect/X86/AMX/legalize-for-llvm.mlir
+28-9mlir/include/mlir/Dialect/X86/X86.td
+34-0mlir/test/Target/LLVMIR/amx.mlir
+9-2mlir/lib/Dialect/X86/IR/X86Dialect.cpp
+135-114 files

LLVM/project 669df4dmlir/lib/Analysis/Presburger Utils.cpp, mlir/unittests/Analysis/Presburger Utils.h

[mlir][Presburger] Fix inlining failure for dynamicAPIntFromInt64 in debug builds (#194820)

When `dynamicAPIntFromInt64` is passed as a function pointer to
`llvm::transform`, it becomes an indirect call. This causes the compiler
to fail to inline the function despite the
`LLVM_ATTRIBUTE_ALWAYS_INLINE` annotation, resulting in a compilation
error in debug builds:
```
error: inlining failed in call to 'always_inline' 'llvm::DynamicAPInt llvm::dynamicAPIntFromInt64(int64_t)': indirect function call with a yet undetermined callee
  250 | LLVM_ATTRIBUTE_ALWAYS_INLINE DynamicAPInt dynamicAPIntFromInt64(int64_t X) {
```

Fix this by wrapping `dynamicAPIntFromInt64` in a lambda, turning the
indirect call into a direct call that the compiler can inline at the
call site.
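
The pattern can be sketched in standalone C++ as follows. This is a minimal illustration with `std::transform` and a hypothetical helper `widen`, not the actual Presburger code; the `SKETCH_ALWAYS_INLINE` macro is an assumed stand-in for `LLVM_ATTRIBUTE_ALWAYS_INLINE`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Assumed stand-in for LLVM_ATTRIBUTE_ALWAYS_INLINE.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_ALWAYS_INLINE inline __attribute__((always_inline))
#else
#define SKETCH_ALWAYS_INLINE inline
#endif

// Hypothetical always-inline conversion helper.
SKETCH_ALWAYS_INLINE long long widen(std::int32_t x) {
  return static_cast<long long>(x);
}

std::vector<long long> widenAll(const std::vector<std::int32_t> &in) {
  std::vector<long long> out(in.size());
  // Passing `widen` itself would hand std::transform a function pointer,
  // i.e. an indirect call. The lambda has its own closure type, so the
  // call to `widen` inside its body is a direct call.
  std::transform(in.begin(), in.end(), out.begin(),
                 [](std::int32_t x) { return widen(x); });
  return out;
}
```

The lambda adds no state and no extra logic; its only purpose is to give the compiler a direct call to the annotated function.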
DeltaFile
+5-3mlir/unittests/Analysis/Presburger/Utils.h
+4-1mlir/lib/Analysis/Presburger/Utils.cpp
+9-42 files

LLVM/project 6617aaclldb/include/lldb/Host/windows ConPTYUtils.h ConnectionConPTYWindows.h, lldb/source/Host CMakeLists.txt

[lldb][windows] add unit tests for the StripConPTYSequences method (#194654)

Co-authored-by: Nerixyz <nero.9 at hotmail.de>
DeltaFile
+132-0lldb/unittests/Host/StripConPTYSequencesTest.cpp
+0-81lldb/source/Host/windows/ConnectionConPTYWindows.cpp
+78-0lldb/source/Host/common/ConPTYUtils.cpp
+32-0lldb/include/lldb/Host/windows/ConPTYUtils.h
+1-0lldb/include/lldb/Host/windows/ConnectionConPTYWindows.h
+1-0lldb/source/Host/CMakeLists.txt
+244-811 files not shown
+245-817 files

LLVM/project 7b58716llvm/lib/Transforms/Scalar LoopFuse.cpp, llvm/test/Transforms/LoopFusion pr193641.ll

[LoopFusion] Remove DT edge from ExitBlock to ExitBlockSuc (#193641)

To remove the exit block, it cannot have successors. If this edge is not
removed, the following assertion fires when the updates are applied to
the DT:
"Assertion `Node->isLeaf() && "Node is not a leaf node."' failed"

This assertion does not always fail because, before applying the updates,
the "ApplyUpdates" function in "GenericDomTreeConstruction" runs
CalculateFromScratch in some situations:

    // Make unittests of the incremental algorithm work
    if (DT.DomTreeNodes.size() <= 100) {
      if (BUI.NumLegalized > DT.DomTreeNodes.size())
        CalculateFromScratch(DT, &BUI);
    } else if (BUI.NumLegalized > DT.DomTreeNodes.size() / 40)
      CalculateFromScratch(DT, &BUI);
DeltaFile
+45-0llvm/test/Transforms/LoopFusion/pr193641.ll
+2-0llvm/lib/Transforms/Scalar/LoopFuse.cpp
+47-02 files

LLVM/project 61b0677lld/COFF Driver.cpp Driver.h, lld/test/COFF arm64ec-thin-lib.s

[LLD][COFF] Use lazy object mechanism instead of relying on the archive map for thin archives on ARM64EC (#194349)

On ARM64EC/ARM64X, an archive may contain both native and EC symbols in
the symbol table, which can potentially conflict. Regular archives
handle this using the extended archive format, which stores the EC
symbol table in a separate section, but this is not available for thin
archives.
    
Work around this limitation by lazily parsing all thin archive members
instead of relying on the archive symbol table. This uses the same
mechanism as when thin archive members are passed with
-start-lib/-end-lib, where symbols are added to the symbol table without
pulling in the object file unless it is referenced.
    
Fixing this at the archive format level would require changes to the
format. Currently, the ECSYMBOLS section is supported only by the COFF
archive format, while thin archives require the GNU format. We would
either need to extend the COFF format to support thin archives or
introduce ECSYMBOLS support in the GNU format.
DeltaFile
+105-0lld/test/COFF/arm64ec-thin-lib.s
+19-12lld/COFF/Driver.cpp
+4-2lld/COFF/Driver.h
+128-143 files

LLVM/project dbdbf1elld/ELF LinkerScript.cpp

[LLD][ELF] Fix performance regression when using linker scripts (#194668)

The addition of the support for `--enable-non-contiguous-regions` from
PR #90007 moved an "early out" condition in
`LinkerScript::computeInputSections()`. This could result in other
relatively expensive checks, i.e. `pat.sectionPat.match`,
`cmd->matchesFile`, `pat.excludesFile` and `flagsMatch`, to be performed
unnecessarily in the default situation where
`--enable-non-contiguous-regions` is disabled.

This fix restores the "early out" condition and shows an ~14%
improvement for the Linux kernel benchmark link and has been seen to
improve performance by up to ~30% for a large UE5 link.
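
The effect of the check ordering can be sketched as below. This is a simplified illustration, not lld's code: the `Section` type, the `alreadyAssigned` flag (an assumed meaning for the cheap "early out" test), and the prefix match standing in for the expensive pattern checks are all hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical section record, not lld's InputSection.
struct Section {
  std::string name;
  bool alreadyAssigned; // assumed meaning of the cheap "early out" test
};

struct MatchStats {
  int expensiveCalls = 0;
  int matched = 0;
};

// Stand-in for the relatively expensive pattern and file checks.
bool expensiveMatch(const Section &s, const std::string &prefix,
                    MatchStats &stats) {
  ++stats.expensiveCalls;
  return s.name.compare(0, prefix.size(), prefix) == 0;
}

MatchStats matchSections(const std::vector<Section> &sections,
                         const std::string &prefix, bool spillingEnabled) {
  MatchStats stats;
  for (const Section &s : sections) {
    // Early out: in the default mode, skip the section before doing any
    // expensive matching work.
    if (!spillingEnabled && s.alreadyAssigned)
      continue;
    if (expensiveMatch(s, prefix, stats))
      ++stats.matched;
  }
  return stats;
}
```

With the cheap test placed first, the number of expensive calls in the default mode scales with the sections still unassigned rather than with all sections.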
DeltaFile
+7-6lld/ELF/LinkerScript.cpp
+7-61 files

LLVM/project 6cbe31c.github CODEOWNERS

[mlir] Update CODEOWNERS after x86 dialects refactoring (#194388)

The two separate x86 dialects ('amx' and 'x86vector') have been merged
into a single 'x86' dialect.
Relevant paths are updated accordingly.

Also, adding myself to 'x86' dialect to enable notifications.
DeltaFile
+1-2.github/CODEOWNERS
+1-21 files

LLVM/project 239189clibcxx/include optional, libcxx/test/std/utilities/optional/optional.object optional_helper_types.h

[libc++] Disable mistakenly enabled `optional<T&>` constructors for `optional<T>` (#194446)

Resolves #194415 

- A constructor specifically meant for `optional<T&>` was left enabled
for `optional<T>`
- Fix it, and add a test to check for regression.
- This patch also corrects the constraints for `optional(optional<U>&)`
and `optional(const optional<U>&)`, as they were incorrectly
disallowing [valid conversions](https://godbolt.org/z/1r5Ea7z5M)
- Also, correct the `noexcept` specification.
- Add tests for both corrections.
DeltaFile
+38-1libcxx/test/std/utilities/optional/optional.object/optional_helper_types.h
+35-0libcxx/test/std/utilities/optional/optional.object/optional.object.ctor/optional_U.pass.cpp
+31-0libcxx/test/std/utilities/optional/optional.object/optional.object.ctor/const_optional_U.pass.cpp
+15-11libcxx/include/optional
+3-5libcxx/test/std/utilities/optional/optional.object/optional.object.ctor/ref_constructs_from_temporary.verify.cpp
+122-175 files

LLVM/project d8ef5bcclang/lib/CIR/CodeGen CIRGenModule.cpp, clang/test/CIR/CodeGen optsize-func-attr.cpp function-target-features.c

[CIR] Emit target-cpu, target-features, and tune-cpu attrs on cir.func (#193458)

Add `getCPUAndFeaturesAttributes` to `CIRGenModule`, mirroring OGCG's
`GetCPUAndFeaturesAttributes`.
This sets `cir.target-cpu`, `cir.target-features` and `cir.tune-cpu`
string attributes on `cir.func`.
For AMDGPU, only features that differ from the target CPU's defaults are
emitted, matching OGCG.
DeltaFile
+91-2clang/lib/CIR/CodeGen/CIRGenModule.cpp
+65-0clang/test/CIR/CodeGenHIP/target-features.hip
+12-18clang/test/CIR/CodeGen/optsize-func-attr.cpp
+27-0clang/test/CIR/CodeGen/function-target-features.c
+8-8clang/test/CIR/CodeGen/misc-attrs.cpp
+13-0clang/test/CIR/CodeGen/tune-cpu.c
+216-2816 files not shown
+246-5522 files

LLVM/project 0a4798aoffload/libomptarget device.cpp, offload/libomptarget/OpenMP/OMPT Callback.cpp

[OMPT][OpenMP] Use omp_initial_device for host in callbacks (#192924)

The OpenMP specification offers different ways for identifying the host
device. While users of the OpenMP API can use `omp_get_initial_device()`
or the constant `omp_initial_device` (available since OpenMP v5.2), a
tool needs to rely on the `initial_device_num` passed by the OpenMP
runtime during the `initialize` callback.

In #134451, it was discovered that the `initial_device_num` passed is
always `0`, regardless of whether any devices are available for offload
execution. For host-only OpenMP code, this matches the result of
`omp_get_num_devices()`, and is a valid result. In the case of devices
being available though, this passed identifier is incorrect. While
`libomp` calls `omp_get_num_devices()`, `libomptarget` has not fully
initialized its PluginManager at that point, hence returning no
available devices. Tools relying on `initial_device_num` might therefore
incorrectly assume host-side execution when some code runs on a device.
Since the `ompt_get_num_devices()` entry point is also not fully
implemented, tools currently need to do on-the-fly handling for the host

    [10 lines not shown]
DeltaFile
+18-14offload/libomptarget/OpenMP/OMPT/Callback.cpp
+4-3openmp/runtime/src/ompt-general.cpp
+2-2offload/libomptarget/device.cpp
+1-1offload/test/ompt/target_memcpy.c
+1-1offload/test/ompt/target_memcpy_emi.c
+26-215 files

LLVM/project eb4568clldb/source/Target Process.cpp

fixup! cosmetic changes
DeltaFile
+7-6lldb/source/Target/Process.cpp
+7-61 files

LLVM/project 39c49a8lldb/include/lldb/Target Process.h, lldb/source/Target Process.cpp ThreadPlanStepOverBreakpoint.cpp

fixup! use Error instead of Status
DeltaFile
+14-12lldb/source/Target/Process.cpp
+4-4lldb/source/Target/ThreadPlanStepOverBreakpoint.cpp
+2-2lldb/include/lldb/Target/Process.h
+20-183 files

LLVM/project 74b6eb5lldb/include/lldb/Target Process.h, lldb/source/Target Process.cpp

fixup! fix typos
DeltaFile
+1-1lldb/include/lldb/Target/Process.h
+1-1lldb/source/Target/Process.cpp
+2-22 files

LLVM/project 0e8eb8blldb/include/lldb/Target Process.h, lldb/source/Target Process.cpp

fixup! Remove delayed breakpoints
DeltaFile
+7-11lldb/source/Target/Process.cpp
+2-4lldb/include/lldb/Target/Process.h
+9-152 files

LLVM/project 0949dc6lldb/source/Target Process.cpp

fixup! don't enqueue actions that won't change the site status
DeltaFile
+7-1lldb/source/Target/Process.cpp
+7-11 files

LLVM/project 7107bb3clang/docs OpenMPSupport.rst

[OpenMP][NFC] Update OpenMP Support doc for Tools Interface (#193173)

All enum values for OpenMP v5.1 are implemented.
Add entries for added and deprecated OpenMP Tools Interface features in
OpenMP v6.0.

Also fix link to PR for `transparent clause (hull tasks)`.

Signed-off-by: Jan André Reuter <j.reuter at fz-juelich.de>
DeltaFile
+44-7clang/docs/OpenMPSupport.rst
+44-71 files

LLVM/project 89133afclang/test/CXX/drs cwg28xx.cpp, clang/www cxx_dr_status.html

[clang][NFC] Mark CWG2807 as implemented and add a test (#194755)

CWG2807 (https://wg21.link/cwg2807): One part of the standard correctly
said destructors can't be `consteval`, but another incorrectly said they
can be.

Clang diagnosed this in 9.0, for some reason started accepting it in
10.0, then went back to diagnosing in 11.0:
https://godbolt.org/z/6sWTYT38M. I've marked it as implemented since
11.0.

The issue that prompted the DR: #65665
DeltaFile
+10-0clang/test/CXX/drs/cwg28xx.cpp
+1-1clang/www/cxx_dr_status.html
+11-12 files

LLVM/project c09bc10llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll, llvm/test/CodeGen/RISCV/rvv fixed-vectors-fp-setcc.ll fixed-vectors-setcc-fp-vp.ll

Merge remote-tracking branch 'upstream/main' into users/ssahasra/av-intrinsics
DeltaFile
+4,811-4,818llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+5,061-4,162llvm/test/CodeGen/Thumb2/mve-clmul.ll
+326-4,626llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-setcc.ll
+1,872-1,883llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+3,230-456llvm/test/CodeGen/WebAssembly/strided-int-mac.ll
+565-2,727llvm/test/CodeGen/RISCV/rvv/fixed-vectors-setcc-fp-vp.ll
+15,865-18,6725,333 files not shown
+237,953-123,9035,339 files

LLVM/project a2c59a5lldb/include/lldb/Target Process.h

fixup! make iteration order independent of pointers
DeltaFile
+10-1lldb/include/lldb/Target/Process.h
+10-11 files

LLVM/project 6e13041lldb/source/Target TargetProperties.td Process.cpp

fixup! review comments
DeltaFile
+2-2lldb/source/Target/TargetProperties.td
+1-0lldb/source/Target/Process.cpp
+3-22 files

LLVM/project 2b427e9lldb/include/lldb/Target Process.h, lldb/source/Target Process.cpp

fixup! fix order of class declaration
DeltaFile
+14-12lldb/include/lldb/Target/Process.h
+2-3lldb/source/Target/Process.cpp
+16-152 files

LLVM/project 1a81f3blldb/include/lldb/Target Process.h, lldb/source/Plugins/Process/Utility StopInfoMachException.cpp

[lldb] Implement delayed breakpoints

This patch changes the Process class so that it delays *physically*
enabling/disabling breakpoints until the process is about to
resume/detach/be destroyed, potentially reducing the packets transmitted
by batching all breakpoints together.

Most classes only need to know whether a breakpoint is "logically"
enabled, as opposed to "physically" enabled (i.e. the remote server has
actually enabled the breakpoint). However, lower level classes like
derived Process classes, or StopInfo may actually need to know whether
the breakpoint was physically enabled. As such, this commit also adds a
"IsPhysicallyEnabled" API.

https://github.com/llvm/llvm-project/pull/192910
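
The distinction between "logically" and "physically" enabled can be sketched as follows. This is a minimal model under stated assumptions, not lldb's `Process` implementation: the class `BreakpointSites` and every method except the name `IsPhysicallyEnabled` are hypothetical, and a "batch" here is just a counter standing in for the packets sent to the remote server.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical sketch of delayed breakpoint updates.
class BreakpointSites {
public:
  // Records the requested state only; nothing is sent yet.
  void SetLogicallyEnabled(std::uint64_t addr, bool enabled) {
    m_logical[addr] = enabled;
  }

  bool IsLogicallyEnabled(std::uint64_t addr) const {
    auto it = m_logical.find(addr);
    return it != m_logical.end() && it->second;
  }

  bool IsPhysicallyEnabled(std::uint64_t addr) const {
    auto it = m_physical.find(addr);
    return it != m_physical.end() && it->second;
  }

  // Would run before resume/detach/destroy: apply every pending change in
  // one batch and return how many sites actually changed state.
  std::size_t Flush() {
    std::size_t changed = 0;
    for (const auto &entry : m_logical) {
      bool &physical = m_physical[entry.first];
      if (physical != entry.second) {
        physical = entry.second;
        ++changed;
      }
    }
    if (changed != 0)
      ++m_batchesSent;
    return changed;
  }

  std::size_t BatchesSent() const { return m_batchesSent; }

private:
  // std::map keeps iteration ordered by address, so it is deterministic.
  std::map<std::uint64_t, bool> m_logical;
  std::map<std::uint64_t, bool> m_physical;
  std::size_t m_batchesSent = 0;
};
```

In this model, toggling a site off and on again before a flush produces no work at all, because only the final logical state is compared with the physical one.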
DeltaFile
+96-6lldb/source/Target/Process.cpp
+30-1lldb/include/lldb/Target/Process.h
+6-6lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp
+5-3lldb/source/Plugins/Process/Utility/StopInfoMachException.cpp
+4-2lldb/source/Target/ThreadPlanStepOverBreakpoint.cpp
+5-0lldb/source/Target/TargetProperties.td
+146-183 files not shown
+150-229 files

LLVM/project d4b7b73llvm/test/tools/llvm-ir2vec/bindings lit.local.cfg ir2vec-getFuncEmbMap.py, llvm/tools/llvm-ir2vec/Bindings CMakeLists.txt

[llvm-ir2vec] Place IR2Vec Python bindings in the tools/llvm-ir2vec/Bindings build directory (#194301)

## Place IR2Vec Python bindings `.so` in the Bindings build directory

Without an explicit output directory, CMake places the nanobind
extension module in `<build>/lib/`, alongside unrelated LLVM libraries.

- This change adds `set_target_properties` to redirect the output to
`<build>/tools/llvm-ir2vec/Bindings/`, keeping the Python bindings
isolated within their tool's own build tree. This mirrors MLIR's
convention, where Python extension modules are placed under
`<build>/tools/mlir/python_packages/` rather than the global `lib/`
directory.


- %llvm_lib_dir was pointing to build-llvm/lib but the .so actually
lives at build-llvm/tools/llvm-ir2vec/Bindings/. The tests were silently

    [7 lines not shown]
DeltaFile
+5-0llvm/tools/llvm-ir2vec/Bindings/CMakeLists.txt
+1-2llvm/test/tools/llvm-ir2vec/bindings/lit.local.cfg
+1-1llvm/test/tools/llvm-ir2vec/bindings/ir2vec-getFuncEmbMap.py
+1-1llvm/test/tools/llvm-ir2vec/bindings/ir2vec-getBBEmbMap.py
+1-1llvm/test/tools/llvm-ir2vec/bindings/ir2vec-getFuncEmb.py
+1-1llvm/test/tools/llvm-ir2vec/bindings/ir2vec-getFuncNames.py
+10-64 files not shown
+14-810 files

LLVM/project 9d332aallvm/test/TableGen/GlobalISelEmitter metadata-operand.td, llvm/utils/TableGen/Common/GlobalISel GlobalISelMatchTable.cpp

[GlobalISel] skip type check when matching metadata operand (#191389)

Assisted-by: Claude Opus 4.6

---------

Co-authored-by: macurtis-amd <macurtis at amd.com>
DeltaFile
+32-0llvm/test/TableGen/GlobalISelEmitter/metadata-operand.td
+6-0llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.cpp
+38-02 files

LLVM/project c758623llvm/lib/Transforms/Vectorize VPlanTransforms.cpp, llvm/test/Transforms/LoopVectorize/AArch64 partial-reduce-chained.ll

[VPlan] Don't create sub(ext(mul(...))) partial reductions (#194660)

Currently if we have a loop that does a sub(ext(mul(...))) reduction
then createPartialReductions will try to transform it to a partial
reduction but then crash due to hitting an llvm_unreachable in
createPartialReductionExpression.

It looks like handling this in createPartialReductionExpression would
require adding a new expression recipe kind, so for now just don't try
to use a partial reduction so we avoid the crash.

Fixes #194000
DeltaFile
+178-0llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-chained.ll
+3-0llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+181-02 files

LLVM/project 30fa415llvm/lib/Target/X86 X86FastISel.cpp, llvm/test/CodeGen/X86 fast-isel-struct-ret.ll bf16-fast-isel.ll

[X86][FastISel] Restore support for struct returns (#194586)

After #180322, X86 FastISel forces SDAG fallback for any call with a
struct return. This caused major compile-time regressions for debug
builds in Rust, where struct returns are very common.

The type legality check should work on the de-aggregated types, not on
the return type directly.
DeltaFile
+58-0llvm/test/CodeGen/X86/fast-isel-struct-ret.ll
+30-0llvm/test/CodeGen/X86/bf16-fast-isel.ll
+16-11llvm/lib/Target/X86/X86FastISel.cpp
+104-113 files