LLVM/project cddc09bclang/lib/CIR/CodeGen CIRGenExpr.cpp

WIR [CIR][CodeGen] Remove dead srcAS code in emitCastLValue address spacecast (#197016)

The srcAS variable was computed but never used since upstream's
performAddrSpaceCast only takes (value, destType). Remove the dead code
and its errorNYI for non-target address spaces.

Fixes part of #192314
DeltaFile
+0-10clang/lib/CIR/CodeGen/CIRGenExpr.cpp
+0-101 files

LLVM/project 3ea7398clang/lib/CIR/CodeGen CIRGenExprAggregate.cpp, clang/test/CIR/CodeGen implicit-value-init-expr.cpp

[CIR] Implement implicit value init for aggregates (#197029)

This implements the AggExprEmitter::VisitImplicitValueInitExpr function
for CIR. The code to emit a zero-initializer was already present. We
just needed to hook it up to the visitor.
DeltaFile
+49-0clang/test/CIR/CodeGen/implicit-value-init-expr.cpp
+5-2clang/lib/CIR/CodeGen/CIRGenExprAggregate.cpp
+54-22 files

LLVM/project 0562d17clang/lib/CIR/CodeGen CIRGenException.cpp, clang/test/CIR/CodeGen try-catch-non-trivial-copy.cpp

[CIR] Implement copy construction of EH catch values (#196419)

This change implements handling of exception variables that require copy
construction (on Itanium targets) before they can be used in a catch
handler, using the cir.construct_catch_param operation.

Some targets, such as MSABI, do not need to perform an explicit copy.
The construct_catch_param operation is effectively a noop for those
cases and will be lowered as such when the EHABI lowering is implemented
for those targets.

Assisted-by: Cursor / claude-opus-4.7-thinking-xhigh
DeltaFile
+541-0clang/test/CIR/CodeGen/try-catch-non-trivial-copy.cpp
+102-3clang/lib/CIR/CodeGen/CIRGenException.cpp
+643-32 files

LLVM/project 1209df8clang/test/Instrumentor StackUsageRT.cpp StackUsageRT.json, llvm/include/llvm/Transforms/IPO Instrumentor.h

[Instrumentor] Add Alloca and Function support; stack usage example

This adds support for alloca instrumentation and function pre/post
instrumentation. Alloca support follows load/store support directly.
Functions require special care to determine the insertion points.

Together, we can showcase how the stack high watermark can be profiled,
see InstrumentorStackUsage.cpp.
DeltaFile
+296-7llvm/lib/Transforms/IPO/Instrumentor.cpp
+118-8llvm/include/llvm/Transforms/IPO/Instrumentor.h
+59-0clang/test/Instrumentor/StackUsageRT.cpp
+59-0llvm/test/Instrumentation/Instrumentor/default_config.json
+56-0llvm/test/Instrumentation/Instrumentor/alloca_and_function.ll
+54-0clang/test/Instrumentor/StackUsageRT.json
+642-152 files not shown
+681-158 files

LLVM/project a430576llvm/include/llvm/Transforms/IPO Instrumentor.h InstrumentorConfigFile.h, llvm/lib/Passes PassBuilderPipelines.cpp

[Instrumentor] Use the pass builder's FileSystem for reading files

In the IO sandbox, the old read calls caused the CI to fail. This
change uses the PassBuilder's FileSystem the same way other passes
read files from disk (during CI).
DeltaFile
+16-5llvm/lib/Transforms/IPO/InstrumentorConfigFile.cpp
+12-1llvm/lib/Transforms/IPO/Instrumentor.cpp
+7-3llvm/include/llvm/Transforms/IPO/Instrumentor.h
+2-2llvm/lib/Passes/PassBuilderPipelines.cpp
+1-1llvm/include/llvm/Transforms/IPO/InstrumentorConfigFile.h
+38-125 files

LLVM/project 7be448eflang/lib/Semantics check-declarations.cpp, flang/test/Semantics/OpenACC acc-host-data-common.f90

[flang][cuda][openacc] Don't apply CUDA Fortran COMMON/EQUIVALENCE rule to internal UseDevice marker (#197036)

`CUDADataAttr::UseDevice` is not user-spellable; the symbol that
actually lives in COMMON/EQUIVALENCE carries no CUDA attribute. The CUDA
Fortran restriction (CUDA Fortran Programming Guide §3.2) does not apply
to it.

Exclude `UseDevice` from the COMMON/EQUIVALENCE check alongside the
existing `Pinned` exclusion, and add a Semantics regression test.
DeltaFile
+36-0flang/test/Semantics/OpenACC/acc-host-data-common.f90
+8-1flang/lib/Semantics/check-declarations.cpp
+44-12 files

LLVM/project ebfb808clang/test/Instrumentor StackUsageRT.cpp StackUsageRT.json, llvm/include/llvm/Transforms/IPO Instrumentor.h

[Instrumentor] Add Alloca and Function support; stack usage example

This adds support for alloca instrumentation and function pre/post
instrumentation. Alloca support follows load/store support directly.
Functions require special care to determine the insertion points.

Together, we can showcase how the stack high watermark can be profiled,
see InstrumentorStackUsage.cpp.
DeltaFile
+296-7llvm/lib/Transforms/IPO/Instrumentor.cpp
+118-8llvm/include/llvm/Transforms/IPO/Instrumentor.h
+59-0clang/test/Instrumentor/StackUsageRT.cpp
+59-0llvm/test/Instrumentation/Instrumentor/default_config.json
+56-0llvm/test/Instrumentation/Instrumentor/alloca_and_function.ll
+54-0clang/test/Instrumentor/StackUsageRT.json
+642-152 files not shown
+681-158 files

LLVM/project 2030e46llvm/include/llvm/Transforms/IPO Instrumentor.h InstrumentorConfigFile.h, llvm/lib/Passes PassBuilderPipelines.cpp

[Instrumentor] Use the pass builder's FileSystem for reading files

In the IO sandbox, the old read calls caused the CI to fail. This
change uses the PassBuilder's FileSystem the same way other passes
read files from disk (during CI).
DeltaFile
+16-5llvm/lib/Transforms/IPO/InstrumentorConfigFile.cpp
+12-1llvm/lib/Transforms/IPO/Instrumentor.cpp
+6-3llvm/include/llvm/Transforms/IPO/Instrumentor.h
+2-2llvm/lib/Passes/PassBuilderPipelines.cpp
+1-1llvm/include/llvm/Transforms/IPO/InstrumentorConfigFile.h
+37-125 files

LLVM/project b32f982llvm/include/llvm/DebugInfo/DWARF DWARFTypePrinter.h, llvm/unittests/DebugInfo/GSYM CMakeLists.txt

[gsymutil] Fix build error in 196448 and remove warning message (#197028)

**Problem:**
#196448 broke the Linux build of the `DebugInfoGSYMTests` test. See this
[buildbot](https://lab.llvm.org/buildbot/#/builders/10/builds/28337).

**Root cause:**
`BinaryFormat` is a dependency that is required when the build is
done with `-DBUILD_SHARED_LIBS=ON`. This explains why some of the Linux
builds pass, while the above buildbot fails.

**Fix:**
This patch fixes this by adding that dependency.

This patch also removes the warning message that was added by the same
patch, which should be added in a different way, as pointed out by this
[comment](https://github.com/llvm/llvm-project/pull/196448#discussion_r3221162626).



    [10 lines not shown]
DeltaFile
+1-6llvm/include/llvm/DebugInfo/DWARF/DWARFTypePrinter.h
+1-0llvm/unittests/DebugInfo/GSYM/CMakeLists.txt
+2-62 files

LLVM/project 8f854d8lldb/source/Commands CommandObjectProcess.cpp, lldb/test/Shell/Commands command-process-plugin-no-plugin.test

[lldb] Add specific error message for "process plugin" with no plugin loaded (#196933)

Fixes #196535.

The error was:
> command is not implemented

Which is incorrect. It is now:
> no process plugin commands are currently registered

Which is not very helpful either but it's not wrong at least. We could
expand it but I'm not sure what would help anyone here, given how rare
it is that anyone encounters this anyway.
DeltaFile
+4-0lldb/source/Commands/CommandObjectProcess.cpp
+3-0lldb/test/Shell/Commands/command-process-plugin-no-plugin.test
+7-02 files

LLVM/project 437803ecompiler-rt/lib/asan asan_thread.cpp, compiler-rt/lib/sanitizer_common sanitizer_posix_libcdep.cpp sanitizer_win.cpp

[compiler-rt][common] Only unmap stacks the runtime has actually mapped (#179000)

When the sanitizer hasn't mapped the alternate signal stack, but the
host program has (like LLVM), the runtime still tries to unilaterally
unmap the alternate stack. Instead, the runtime should just check if
it's actually mmaped the alternate stack, and only unmap it if it has.
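
The ownership check described above can be sketched as follows. This is a minimal illustration, not the compiler-rt code: `AltStackState` and `shouldUnmap` are hypothetical names, and the bookkeeping is an assumption about how such a check might be structured.

```cpp
#include <cstddef>

// Hypothetical bookkeeping: the runtime records the alternate stack it mapped
// itself. A null pointer means the runtime never mapped one.
struct AltStackState {
  void *mappedByRuntime = nullptr;
  std::size_t size = 0;
};

// Returns true only when the currently installed alternate stack is the one
// the runtime mapped. A stack installed by the host program is left alone.
inline bool shouldUnmap(const AltStackState &state,
                        const void *currentAltStack) {
  return state.mappedByRuntime != nullptr &&
         state.mappedByRuntime == currentAltStack;
}
```

With this shape, a host program that installs its own alternate signal stack keeps it mapped when the runtime tears down a thread.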

---------

Co-authored-by: Vitaly Buka <vitalybuka at google.com>
DeltaFile
+24-0compiler-rt/test/asan/TestCases/Posix/multiple_sigaltstack.cpp
+8-4compiler-rt/lib/sanitizer_common/sanitizer_posix_libcdep.cpp
+3-2compiler-rt/lib/sanitizer_common/sanitizer_win.cpp
+2-2compiler-rt/lib/sanitizer_common/sanitizer_common.h
+2-2compiler-rt/lib/sanitizer_common/sanitizer_fuchsia.cpp
+2-2compiler-rt/lib/asan/asan_thread.cpp
+41-121 files not shown
+42-127 files

LLVM/project bbe046bllvm/lib/Target/WebAssembly/GISel WebAssemblyCallLowering.cpp, llvm/test/CodeGen/WebAssembly/GlobalISel/irtranslator call-basics.ll

[WebAssembly][GlobalISel] Fix ordering of operands for calls and other issues (#196898)

Fixes a few issues with `WebAssemblyCallLowering::lowerCall`.

- Fixes the ordering of operands on the call instruction. Defs (so call
returns) must come before uses (call target and args).
- Prevents the tail-call bail out from null derefing when the call base
is empty (e.g. for libcalls).
- Ensures that the reg class is always set for the return registers of
the call instruction (before, if the regs didn't need splitting, it
wouldn't assign a reg-class to the existing reg, causing failures later
down the pipeline).
DeltaFile
+33-26llvm/test/CodeGen/WebAssembly/GlobalISel/irtranslator/call-basics.ll
+18-11llvm/lib/Target/WebAssembly/GISel/WebAssemblyCallLowering.cpp
+51-372 files

LLVM/project 14cc641llvm/lib/Transforms/Instrumentation MemorySanitizer.cpp, llvm/test/Instrumentation/MemorySanitizer ftrunc.ll

[msan] Strengthen LLVM/NEON floating-point<->int propagation (#196875)

This generalizes handleNEONVectorConvertIntrinsic() to apply it to LLVM
cross-platform floating point<->int conversion intrinsics. The handler
uses an all-or-nothing approach: if any bit of an input element is
uninitialized, the corresponding output element is fully uninitialized.
This approximates how a single bit flip in an integer can affect
multiple bits of the equivalent floating-point (likewise for FP to int).
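
As a rough model of the all-or-nothing rule, the sketch below computes output shadow per element. It assumes 32-bit elements and the usual MSan convention that a set shadow bit means "uninitialized". `propagateConvertShadow` is a hypothetical name and operates on plain vectors rather than LLVM IR.

```cpp
#include <cstdint>
#include <vector>

// Shadow convention assumed here: a set bit means "uninitialized".
// Each output element is fully poisoned if any bit of the matching input
// element's shadow is set, and fully clean otherwise.
inline std::vector<std::uint32_t>
propagateConvertShadow(const std::vector<std::uint32_t> &inputShadow) {
  std::vector<std::uint32_t> outputShadow;
  outputShadow.reserve(inputShadow.size());
  for (std::uint32_t s : inputShadow)
    outputShadow.push_back(s != 0 ? ~std::uint32_t{0} : std::uint32_t{0});
  return outputShadow;
}
```

A single poisoned bit in one input lane poisons that whole output lane, while neighbouring lanes stay clean.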

This implements the future work suggested in
https://github.com/llvm/llvm-project/pull/196429.
DeltaFile
+323-314llvm/test/Instrumentation/MemorySanitizer/X86/avx512-intrinsics-upgrade.ll
+60-44llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
+74-18llvm/test/Instrumentation/MemorySanitizer/ftrunc.ll
+24-8llvm/test/Instrumentation/MemorySanitizer/AArch64/arm64-vcvt.ll
+14-10llvm/test/Instrumentation/MemorySanitizer/X86/avx512-intrinsics.ll
+12-4llvm/test/Instrumentation/MemorySanitizer/AArch64/arm64-vcvt_f32_su32.ll
+507-3981 files not shown
+511-4007 files

LLVM/project 185305dllvm/lib/Target/AMDGPU FLATInstructions.td SMInstructions.td, llvm/test/CodeGen/AMDGPU llvm.amdgcn.flat.prefetch.ll llvm.amdgcn.global.prefetch.ll

[AMDGPU] Prevent prefetch and load reordering (#197025)

Mark prefetches as having side effects; otherwise the scheduler
reorders them with loads.
DeltaFile
+19-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.flat.prefetch.ll
+19-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.global.prefetch.ll
+14-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.prefetch.data.ll
+7-7llvm/test/CodeGen/AMDGPU/loop-prefetch-data.ll
+1-0llvm/lib/Target/AMDGPU/FLATInstructions.td
+1-0llvm/lib/Target/AMDGPU/SMInstructions.td
+61-76 files

LLVM/project 5645d72flang/lib/Lower/Support ReductionProcessor.cpp, flang/test/Lower loops3.f90

[flang] Correct MIN/MAX bug with DO CONCURRENT and REDUCE (#196708)
DeltaFile
+36-1flang/test/Lower/loops3.f90
+16-2flang/lib/Lower/Support/ReductionProcessor.cpp
+52-32 files

LLVM/project 839647dllvm/lib/DWARFLinker/Parallel SyntheticTypeNameBuilder.cpp, llvm/test/tools/dsymutil/X86 dwarflinker-template-parameter-pack.test

[DWARFLinker] Preserve children of DW_TAG_GNU_template_parameter_pack (#196439)

Pack children were not getting ordered synthetic keys, so TypePool
deduplicated them by name and TypesComparator sorted the survivors
alphabetically. Register the two missing tags with
SyntheticTypeNameBuilder.
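
The failure mode can be modelled with a toy example. This is only an analogy using `std::map`, not the actual TypePool or TypesComparator logic. The function names and the (position, name) key are assumptions made for illustration.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy model: key children by name only. std::map drops equal keys and
// iterates in sorted order, so duplicates vanish and the original order is
// lost.
inline std::vector<std::string>
keyByName(const std::vector<std::string> &children) {
  std::map<std::string, std::string> pool;
  for (const std::string &name : children)
    pool.emplace(name, name);
  std::vector<std::string> result;
  for (const auto &entry : pool)
    result.push_back(entry.second);
  return result;
}

// Toy model: key children by (position, name). Every child gets a distinct,
// ordered key, so all of them survive in their original order.
inline std::vector<std::string>
keyByPosition(const std::vector<std::string> &children) {
  std::map<std::pair<std::size_t, std::string>, std::string> pool;
  for (std::size_t i = 0; i < children.size(); ++i)
    pool.emplace(std::make_pair(i, children[i]), children[i]);
  std::vector<std::string> result;
  for (const auto &entry : pool)
    result.push_back(entry.second);
  return result;
}
```

For a pack whose children are `int`, `char`, `int`, the name-only key yields two entries in alphabetical order, while the positional key keeps all three in place.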
DeltaFile
+181-0llvm/test/tools/dsymutil/X86/dwarflinker-template-parameter-pack.test
+7-0llvm/test/tools/dsymutil/X86/Inputs/dwarflinker-template-parameter-pack.map
+3-1llvm/lib/DWARFLinker/Parallel/SyntheticTypeNameBuilder.cpp
+191-13 files

LLVM/project 541a8c5lldb/source/Symbol Variable.cpp, lldb/test/API/functionalities/completion TestCompletion.py

[lldb] Add completion support for direct ivars (#195187)

Fixes the current shortcoming where `v m_na<TAB>` will not complete the
member `m_name` on `this`. This implements tab completion to complement
direct ivar access support in `frame variable`.

Assisted-by: claude

---------

Co-authored-by: Jonas Devlieghere <jonas at devlieghere.com>
DeltaFile
+37-0lldb/source/Symbol/Variable.cpp
+8-0lldb/test/API/functionalities/completion/TestCompletion.py
+45-02 files

LLVM/project 23e647ebolt/lib/Rewrite RewriteInstance.cpp

[BOLT] Fix EH data encoding checks in relocateEHFrameSection (#196777)

Previously committed in 7ab26d7c3a16 (#195691) and later reverted
in bc654b438ffe (#196672) due to failures in the extended bolt-tests.

The problem was that the mask should be `0x70` instead of `0xf0`,
so that `DW_EH_PE_indirect` is allowed to pass through.

The `DW_EH_PE_*rel` constants are not defined as values that each
have only one distinctive bit set, so we rewrote the conditions to
check encoding scheme explicitly.
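
A small sketch of why both points matter, using the standard `DW_EH_PE_*` values. The helper names `applicationOf` and `isPCRel` are hypothetical and are not taken from the BOLT sources.

```cpp
#include <cstdint>

// Pointer-encoding "application" values and the indirect flag.
constexpr std::uint8_t DW_EH_PE_pcrel = 0x10;
constexpr std::uint8_t DW_EH_PE_textrel = 0x20;
constexpr std::uint8_t DW_EH_PE_datarel = 0x30;
constexpr std::uint8_t DW_EH_PE_funcrel = 0x40;
constexpr std::uint8_t DW_EH_PE_aligned = 0x50;
constexpr std::uint8_t DW_EH_PE_indirect = 0x80;

// Extract the application field. Masking with 0x70 drops the indirect bit
// (0x80); masking with 0xf0 would keep it and break the comparison below.
constexpr std::uint8_t applicationOf(std::uint8_t encoding) {
  return static_cast<std::uint8_t>(encoding & 0x70);
}

// Compare the whole field for equality. A bit test such as
// (encoding & DW_EH_PE_pcrel) would also match datarel (0x30) and
// aligned (0x50), because those values share the 0x10 bit.
constexpr bool isPCRel(std::uint8_t encoding) {
  return applicationOf(encoding) == DW_EH_PE_pcrel;
}
```

For example, `0x9b` (indirect, pcrel, sdata4) is recognised as PC-relative with the `0x70` mask, whereas `0x9b & 0xf0` gives `0x90`, which matches none of the application values.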
DeltaFile
+5-7bolt/lib/Rewrite/RewriteInstance.cpp
+5-71 files

LLVM/project 916445bmlir/include/mlir/Dialect/MemRef/Transforms Transforms.h, mlir/include/mlir/Dialect/MemRef/Utils MemRefUtils.h

[MLIR][MemRef] Extend narrow-type emulation for dynamic offsets (#196945)

This patch adds three related extensions to the MemRef narrow-type
emulation patterns.

* `ConvertMemRefSubview` now accepts a dynamic innermost offset.
* `ConvertMemRefReinterpretCast` is generalized from the previous
static-rank-1, static-offset shape to accept any rank and dynamic
offsets, with the same alignment contract as the subview pattern.
* A new `ConvertMemRefCast` pattern handles `memref.cast` between
equivalent narrow-typed memref types so that emulation does not get
blocked by trivial casts.
DeltaFile
+150-52mlir/lib/Dialect/MemRef/Transforms/EmulateNarrowType.cpp
+102-2mlir/test/Dialect/MemRef/emulate-narrow-type.mlir
+22-5mlir/include/mlir/Dialect/MemRef/Utils/MemRefUtils.h
+23-0mlir/test/Dialect/MemRef/emulate-narrow-type-no-assume-aligned.mlir
+10-4mlir/lib/Dialect/MemRef/Utils/MemRefUtils.cpp
+7-1mlir/include/mlir/Dialect/MemRef/Transforms/Transforms.h
+314-641 files not shown
+316-667 files

LLVM/project 8ac8754libc/src/stdio/printf_core float_inf_nan_converter.h

[libc] Remove global printf_core StorageType declarations in float_inf_nan_converter.h (#196859)

fixed_converter.h and float_hex_converter.h have local declarations with
the same name shadowing it, causing -Wshadow warnings. The using
declaration is used in only one function, so just make it local.
DeltaFile
+6-6libc/src/stdio/printf_core/float_inf_nan_converter.h
+6-61 files

LLVM/project 0a1472aflang/lib/Semantics check-omp-structure.cpp check-omp-loop.cpp

Use ultimate symbol everywhere
DeltaFile
+10-8flang/lib/Semantics/check-omp-structure.cpp
+9-1flang/lib/Semantics/check-omp-loop.cpp
+2-1flang/lib/Semantics/openmp-utils.cpp
+21-103 files

LLVM/project fbb6e2eclang/include/clang/Basic DiagnosticLexKinds.td, clang/test/Lexer __counter__-system-include.c

[clang] Don't warn on __COUNTER__ in system macros

The introduction of extension and compatibility warnings means
that __COUNTER__ has started causing warnings (and -Werror= build
failures) due to use of system APIs.

This PR simply ensures that these diagnostics don't get reported
for system macro expansions as well.
DeltaFile
+18-0clang/test/Lexer/__counter__-system-include.c
+7-0clang/test/Lexer/Inputs/__counter__-system-header.h
+2-2clang/include/clang/Basic/DiagnosticLexKinds.td
+27-23 files

LLVM/project 7ddee0bllvm/lib/Target/AMDGPU AMDGPUAsmPrinter.cpp GCNSubtarget.h, llvm/lib/Target/AMDGPU/MCTargetDesc AMDGPUMCExpr.cpp

[AMDGPU] Account for inline asm size in inst_pref_size calculation (#192306)

`SIProgramInfo::getFunctionCodeSize()` with `IsLowerBound=true` was
completely skipping inline assembly instructions, treating them as zero
bytes. This caused `amdhsa_inst_pref_size` to be severely underestimated
for kernels containing inline asm, defeating instruction prefetch on
gfx11+.

Use MCExpr label subtraction (`.Lfunc_end - func_sym`) to compute exact
function code size, resolved at assembly time. This avoids inline asm
string parsing which cannot reliably estimate code size and risks
overestimation (which causes prefetch of unmapped memory and a fatal
segfault).

Add a new `AMDGPUMCExpr` variant (`AGVK_InstPrefSize`) to compute
`min(divideCeil(codeSize, cacheLineSize), maxFieldVal)` as a custom
MCExpr, following the same pattern as `AGVK_Occupancy` and
`AGVK_AlignTo`. The cache line size and field width are derived from the
subtarget via `IsaInfo::getInstCacheLineSize` and feature-bit checks

    [24 lines not shown]
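
The formula quoted above, `min(divideCeil(codeSize, cacheLineSize), maxFieldVal)`, can be sketched in plain C++ as follows. This only illustrates the arithmetic, not the `AMDGPUMCExpr` implementation. `instPrefSize` is a hypothetical name, and the cache line size and field maximum in the usage note are illustrative assumptions rather than values for any particular subtarget.

```cpp
#include <algorithm>
#include <cstdint>

// Round-up integer division; d must be non-zero.
constexpr std::uint64_t divideCeil(std::uint64_t n, std::uint64_t d) {
  return (n + d - 1) / d;
}

// Number of cache lines covering the function's code, clamped to the largest
// value the descriptor field can hold.
constexpr std::uint64_t instPrefSize(std::uint64_t codeSizeBytes,
                                     std::uint64_t cacheLineBytes,
                                     std::uint64_t maxFieldVal) {
  return std::min(divideCeil(codeSizeBytes, cacheLineBytes), maxFieldVal);
}
```

Assuming a 128-byte cache line and a field maximum of 63, a 300-byte function needs 3 lines, and a 100000-byte function is clamped to 63.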
DeltaFile
+154-0llvm/test/CodeGen/AMDGPU/inst-prefetch-inline-asm.ll
+42-41llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
+41-9llvm/test/CodeGen/AMDGPU/inst-prefetch-hint.ll
+45-0llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.cpp
+18-0llvm/lib/Target/AMDGPU/GCNSubtarget.h
+3-14llvm/lib/Target/AMDGPU/SIProgramInfo.cpp
+303-644 files not shown
+324-6810 files

LLVM/project a761e2elibc/src/__support/math sqrtf128.h

[libc] Fix -Wshadow warning in sqrtf128.h (#196851)

sqrtf128() contained both `using namespace sqrtf128_internal;` and
`using FPBits = fputil::FPBits<float128>;`, but sqrtf128_internal also
had a `using FPBits = fputil::FPBits<float128>;`. The outer `using`
wasn't actually used, so remove that one.
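
The shape of the problem can be sketched with a reduced example. The names `sqrt_sketch_internal`, `Bits`, and `compute` are hypothetical stand-ins, not the libc code.

```cpp
namespace sqrt_sketch_internal {
// Alias that becomes visible in compute() through the using-directive.
using Bits = unsigned long long;
inline Bits twice(Bits x) { return 2 * x; }
} // namespace sqrt_sketch_internal

inline unsigned long long compute(unsigned long long x) {
  using namespace sqrt_sketch_internal;
  // The pattern the commit removes would repeat the alias here:
  //   using Bits = unsigned long long;
  // Without it, Bits refers to the single declaration in the namespace.
  Bits value = x;
  return twice(value);
}
```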
DeltaFile
+0-2libc/src/__support/math/sqrtf128.h
+0-21 files

LLVM/project c4e932flibc/test/IntegrationTest test.cpp

[libc] Fix -Wshadow warning in IntegrationTest/test.cpp (#196858)
DeltaFile
+2-2libc/test/IntegrationTest/test.cpp
+2-21 files

LLVM/project 64a8f36lldb/source/Commands CommandObjectBreakpointCommand.cpp

[lldb] Fix missing status in CommandObjectBreakpointCommand (#197024)

This should fix Breakpoint/breakpoint-command.test after #196589.
DeltaFile
+11-2lldb/source/Commands/CommandObjectBreakpointCommand.cpp
+11-21 files

LLVM/project 1ac5083libc/src/__support/HashTable table.h

[libc] Fix -Wshadow warning in HashTable/table.h (#196857)
DeltaFile
+3-3libc/src/__support/HashTable/table.h
+3-31 files

LLVM/project b5ce180llvm/lib/Target/AMDGPU FLATInstructions.td SMInstructions.td, llvm/test/CodeGen/AMDGPU llvm.amdgcn.flat.prefetch.ll llvm.amdgcn.global.prefetch.ll

[AMDGPU] Prevent prefetch and load reordering

Mark prefetches as having side effects; otherwise the scheduler
reorders them with loads.
DeltaFile
+19-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.flat.prefetch.ll
+19-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.global.prefetch.ll
+14-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.s.prefetch.data.ll
+7-7llvm/test/CodeGen/AMDGPU/loop-prefetch-data.ll
+1-0llvm/lib/Target/AMDGPU/FLATInstructions.td
+1-0llvm/lib/Target/AMDGPU/SMInstructions.td
+61-76 files

LLVM/project 95f5a88llvm/test/CodeGen/AArch64 vec-combine-trunc-dup-ext.ll minmax.ll

[AArch64][GlobalISel] More test updates for vector select. NFC (#197023)
DeltaFile
+164-42llvm/test/CodeGen/AArch64/vec-combine-trunc-dup-ext.ll
+118-52llvm/test/CodeGen/AArch64/minmax.ll
+91-11llvm/test/CodeGen/AArch64/neon-anyof-splat.ll
+86-1llvm/test/CodeGen/AArch64/icmp.ll
+15-0llvm/test/CodeGen/AArch64/arm64-mul.ll
+474-1065 files

LLVM/project b1a4e08llvm/docs AMDGPUUsage.rst, llvm/lib/Target/AMDGPU AMDGPUAsmPrinter.cpp

[AMDGPU] Add `.amdgpu.info` section for per-function metadata (#192384)

AMDGPU object linking requires the linker to propagate resource usage
(registers, stack, LDS) across translation units. To support this, the
compiler must emit per-function metadata and call graph edges in the
relocatable object so the linker can compute whole-program resource
requirements.

This PR introduces a `.amdgpu.info` ELF section using a tagged,
length-prefixed binary format: each entry is encoded as:

```
[kind: u8] [len: u8] [payload: <len> bytes]
```

A function scope is opened by an `INFO_FUNC` entry (containing a symbol
reference), followed by per-function attributes (register counts, flags,
private segment size) and relational edges (direct calls, LDS uses,
indirect call signatures). String data such as function type signatures

    [4 lines not shown]
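
A reader for the `[kind: u8] [len: u8] [payload]` layout shown above might look like the sketch below. It assumes only that framing. `InfoEntry` and `parseEntries` are hypothetical names, and no real kind values from the PR are used.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One decoded entry: a kind tag plus its raw payload bytes.
struct InfoEntry {
  std::uint8_t kind;
  std::vector<std::uint8_t> payload;
};

// Walks the byte stream entry by entry. Returns false if the header or the
// payload of any entry is truncated; 'out' then holds the entries decoded
// before the error.
inline bool parseEntries(const std::vector<std::uint8_t> &bytes,
                         std::vector<InfoEntry> &out) {
  std::size_t pos = 0;
  while (pos < bytes.size()) {
    if (bytes.size() - pos < 2)
      return false; // not enough bytes left for kind + len
    InfoEntry entry;
    entry.kind = bytes[pos];
    const std::size_t len = bytes[pos + 1];
    pos += 2;
    if (bytes.size() - pos < len)
      return false; // payload runs past the end of the section
    const auto first = bytes.begin() + static_cast<std::ptrdiff_t>(pos);
    entry.payload.assign(first, first + static_cast<std::ptrdiff_t>(len));
    out.push_back(entry);
    pos += len;
  }
  return true;
}
```

Because each entry carries its own length, a reader can skip kinds it does not recognise without knowing their payload layout.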
DeltaFile
+257-0llvm/test/CodeGen/AMDGPU/lds-link-time-codegen-typeid.ll
+179-0llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
+158-2llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
+126-0llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
+126-0llvm/test/MC/AMDGPU/amdgpu-info-roundtrip.s
+106-0llvm/docs/AMDGPUUsage.rst
+952-29 files not shown
+1,261-1415 files