LLVM/project fc4aad7 clang/lib/CodeGen CGDecl.cpp, clang/test/CodeGenCoroutines coro-param-fake-use.cpp

[Clang][Coroutines] Don't emit fake uses for coroutine parameters (#194690)

Fixes issue: https://github.com/llvm/llvm-project/issues/192351

The combination of coroutines with -fextend-variable-liveness has
resulted in use-after-free, caused by the fact that we insert fake uses
of coroutine parameters at the end of the coroutine. While this is fine
for normal functions, in coroutines these variables are stored in the
coroutine frame, which is freed before the end of the function; this
results in us loading from the deleted frame.

This patch fixes this by no longer emitting fake uses for most coroutine
parameters. Since coroutine parameters will be saved back to the frame
when we suspend, and currently may not be optimized out, fake uses are
not needed in this case, and so by not emitting them we avoid dealing
with the complexity of updating fake uses in the CoroSplit pass. The
exception to this is 'this', which is not saved to the frame.

(cherry picked from commit efb01c1bf558eaaf8ec64e1a54110584e827f21b)
Delta File
+42 -0 clang/test/CodeGenCoroutines/coro-param-fake-use.cpp
+7 -2 clang/lib/CodeGen/CGDecl.cpp
+49 -2 2 files

LLVM/project 7625a2f lldb/source/Plugins/LanguageRuntime/ObjC ObjCLanguageRuntime.h

fixup! change small size to 2
Delta File
+1 -1 lldb/source/Plugins/LanguageRuntime/ObjC/ObjCLanguageRuntime.h
+1 -1 1 file

LLVM/project 5fb52fc clang/lib/CodeGen CoverageMappingGen.cpp, clang/test/CoverageMapping system_macro_switch.cpp

[Coverage] Fix assertion failure when a -isystem header invokes a user macro (#195427)

```
  // a.cc
  static void foo(int x) {
    switch (x) {
  #define GENERIC(n) case n:
  #include "types.def"   // -isystem header invokes a user macro
      break;
    }
  }

  // sys/types.def
  #define MID(name) GENERIC(name)
  MID(0)
  MID(1)
  MID(2)
```


    [20 lines not shown]
Delta File
+42 -0 clang/test/CoverageMapping/system_macro_switch.cpp
+16 -11 clang/lib/CodeGen/CoverageMappingGen.cpp
+58 -11 2 files

LLVM/project 13a6287 llvm/unittests/Target/AArch64 InstSizes.cpp

[AArch64][test] Fix use-after-scope in createInstrInfo (#197622)

https://github.com/llvm/llvm-project/pull/183506 revealed a pre-existing
use-after-scope in createInstrInfo (MSan bot:
https://lab.llvm.org/buildbot/#/builders/164/builds/21562 [*]).

This patch fixes the issue by replacing the stack-allocated
AArch64Subtarget (which goes out of scope once createInstrInfo()
returns) with a heap-allocated one, so that it can be safely stored in
the returned AArch64InstrInfo.
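
The ownership change described above can be sketched roughly as follows. This is only an illustration, not the code from InstSizes.cpp: `Subtarget`, `InstrInfo`, `InstrInfoBundle` and `createInstrInfoSketch` are hypothetical stand-ins, and the real test may keep the subtarget alive in a different way.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for AArch64Subtarget.
struct Subtarget {
  int InstSize = 4;
};

// Hypothetical stand-in for AArch64InstrInfo: it keeps a non-owning
// reference, so the subtarget must outlive it.
struct InstrInfo {
  explicit InstrInfo(const Subtarget &ST) : ST(ST) {}
  int instSize() const { return ST.InstSize; }
  const Subtarget &ST;
};

// Owns both objects. Members are destroyed in reverse declaration order,
// so II is destroyed before the Subtarget it refers to.
struct InstrInfoBundle {
  std::unique_ptr<Subtarget> ST;
  std::unique_ptr<InstrInfo> II;
};

InstrInfoBundle createInstrInfoSketch() {
  // Buggy shape, for contrast: `Subtarget ST; return InstrInfo(ST);`
  // leaves the returned object referring to a dead stack slot.
  auto ST = std::make_unique<Subtarget>();
  auto II = std::make_unique<InstrInfo>(*ST);
  assert(&II->ST == ST.get() && "InstrInfo must refer to the heap object");
  return {std::move(ST), std::move(II)};
}
```

Moving a `std::unique_ptr` transfers ownership without moving the pointee, so the reference held by `InstrInfo` stays valid after the function returns.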

-----

[*] WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x55555666fabd in
llvm::AArch64InstrInfo::getInstSizeInBytes(llvm::MachineInstr const&)
const
/home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:247:5
...

    [19 lines not shown]
Delta File
+27 -21 llvm/unittests/Target/AArch64/InstSizes.cpp
+27 -21 1 file

LLVM/project a8ff36a llvm/test/CodeGen/AMDGPU srem.ll load-global-i8.ll, llvm/test/CodeGen/AMDGPU/GlobalISel sdivrem.ll udivrem.ll

Merge upstream/main into users/mariusz-sikora-at-amd/gfx13/add-more-types-to-permlane
Delta File
+6,862 -0 llvm/test/tools/llvm-mca/AArch64/Cortex/C1Nano-sve-instructions.s
+3,436 -2,769 llvm/test/CodeGen/AMDGPU/GlobalISel/sdivrem.ll
+4,686 -918 llvm/test/CodeGen/X86/vector-reduce-ctpop.ll
+2,801 -2,109 llvm/test/CodeGen/AMDGPU/GlobalISel/udivrem.ll
+2,144 -2,147 llvm/test/CodeGen/AMDGPU/srem.ll
+1,647 -1,991 llvm/test/CodeGen/AMDGPU/load-global-i8.ll
+21,576 -9,934 2,028 files not shown
+107,220 -47,001 2,034 files

LLVM/project 7ae25fb llvm/test/CodeGen/AArch64 neon-dotreduce.ll fsh.ll

[AArch64] Keep MMO when converting gather lane to LDRSui. (#197522)

We were losing the MMO when converting the load. Make sure we copy them
over, which apparently alters codegen more than I expected and helps
keep postinc generation after #196305.
Delta File
+183 -183 llvm/test/CodeGen/AArch64/neon-dotreduce.ll
+70 -70 llvm/test/CodeGen/AArch64/fsh.ll
+58 -58 llvm/test/CodeGen/AArch64/complex-deinterleaving-uniform-cases.ll
+32 -32 llvm/test/CodeGen/AArch64/fp-maximumnum-minimumnum.ll
+26 -26 llvm/test/CodeGen/AArch64/nontemporal-store.ll
+17 -8 llvm/test/CodeGen/AArch64/concat-vector.ll
+386 -377 2 files not shown
+401 -390 8 files

LLVM/project e5b06d8 llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeRules.cpp

Use StandardB
Delta File
+4 -7 llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+4 -7 1 file

LLVM/project 3272c56 llvm/test/CodeGen/AMDGPU amdgpu-codegenprepare-idiv.ll udiv.ll

[AMDGPU] Remove RCP_IFLAG combine (#197426)

The combine was added in D48569 8 years ago with the aim of preserving
flags, but the current LangRef says the status flags are not observable
in the default FP environment.

The main motivation for this change is to enable scalar float reciprocal
generation v_s_rcp_f32 on newer hardware. There is no v_s_rcp_iflag_f32,
so the combine effectively blocks the selection.
See: pseudo-scalar-transcendental.ll.
Delta File
+160 -160 llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll
+52 -52 llvm/test/CodeGen/AMDGPU/udiv.ll
+40 -46 llvm/test/CodeGen/AMDGPU/udivrem24.ll
+77 -8 llvm/test/CodeGen/AMDGPU/rcp_iflag.ll
+33 -33 llvm/test/CodeGen/AMDGPU/sdiv.ll
+32 -32 llvm/test/CodeGen/AMDGPU/permute_i8.ll
+394 -331 19 files not shown
+577 -538 25 files

LLVM/project 1d93fc4 libc CMakeLists.txt, libc/config/linux/aarch64 entrypoints.txt

[libc] Add LLVM_LIBC_ENABLE_EXPERIMENTAL_ENTRYPOINTS CMake flag (#197537)

Adds a new CMake option, OFF by default, to gate entrypoints with
known-incomplete implementations. This lets developers build and test
partially-implemented functions without exposing them to production
users.

The motivating case is `sysconf`, which only handles three of the
required `_SC_*` constants (`_SC_PAGESIZE`, `_SC_NPROCESSORS_CONF`,
`_SC_NPROCESSORS_ONLN`) and returns `EINVAL` for everything else.
Functions like this are useful to have in a build for testing progress,
but shouldn't be part of a default full build until the implementation
is complete.
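
To make the "known-incomplete" behaviour concrete, here is a minimal sketch of a `sysconf`-like function that handles three names and rejects the rest with `EINVAL`. It is not the LLVM libc implementation: `SysconfName`, `sysconfSketch`, `demoSysconfSketch` and the returned values are hypothetical placeholders standing in for the `_SC_*` constants and real system queries.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical stand-ins for _SC_PAGESIZE, _SC_NPROCESSORS_CONF,
// _SC_NPROCESSORS_ONLN, plus one name the sketch does not support.
enum class SysconfName { PageSize, NProcessorsConf, NProcessorsOnln, OpenMax };

// Returns a value for the three supported names; for anything else it
// sets errno to EINVAL and returns -1.
long sysconfSketch(SysconfName Name) {
  switch (Name) {
  case SysconfName::PageSize:
    return 4096; // placeholder, not queried from the system
  case SysconfName::NProcessorsConf:
  case SysconfName::NProcessorsOnln:
    return 1; // placeholder, not queried from the system
  default:
    errno = EINVAL;
    return -1;
  }
}

// Usage: callers must clear errno first to tell an error from a real -1.
void demoSysconfSketch() {
  errno = 0;
  long Result = sysconfSketch(SysconfName::OpenMax);
  assert(Result == -1 && errno == EINVAL);
  (void)Result; // silence unused-variable warnings when NDEBUG is set
}
```

A function in this state is useful for tracking progress in tests, which is the situation the new option is meant to gate.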

Changes:
- `libc/CMakeLists.txt`: adds
`option(LLVM_LIBC_ENABLE_EXPERIMENTAL_ENTRYPOINTS ... OFF)`
- `libc/cmake/modules/LLVMLibCCompileOptionRules.cmake`: propagates
`-DLIBC_EXPERIMENTAL_ENTRYPOINTS` when ON

    [6 lines not shown]
Delta File
+6 -1 libc/config/linux/aarch64/entrypoints.txt
+6 -1 libc/config/linux/riscv/entrypoints.txt
+6 -1 libc/config/linux/x86_64/entrypoints.txt
+2 -0 libc/CMakeLists.txt
+20 -3 4 files

LLVM/project 29206d7 clang/include/clang/Sema Sema.h, clang/lib/Parse ParseOpenMP.cpp

[OpenMP] Fix launch_bounds for OpenMP ompx_attribute (#195665)

This commit fixes the handling of `launch_bounds` within OpenMP's
`ompx_attribute`. The third attribute value, the maximum blocks, was not
parsed correctly.
Delta File
+16 -9 clang/lib/Sema/SemaDeclAttr.cpp
+10 -4 clang/test/OpenMP/thread_limit_gpu.c
+5 -3 clang/include/clang/Sema/Sema.h
+3 -2 clang/lib/Parse/ParseOpenMP.cpp
+34 -18 4 files

LLVM/project 35f5d7e llvm/lib/Target/AArch64/GISel AArch64RegisterBankInfo.cpp

[AArch64][GlobalISel] Fast-path common G_CONSTANT/G_BRCOND/G_FRAME_INDEX regbank mappings (#197383)

Returning the default register-bank mapping directly for these opcodes
is a -0.17% compile-time improvement on aarch64-O0-g.

https://llvm-compile-time-tracker.com/compare.php?from=b4aa4d4dcb6f1c8a00d1d1e53d2b353c97ec98b7&to=0779891fc6bf6a01e4f14d3f359e212c6ec52c0d&stat=instructions%3Au

Assisted-by: codex
Delta File
+20 -0 llvm/lib/Target/AArch64/GISel/AArch64RegisterBankInfo.cpp
+20 -0 1 file

LLVM/project 131d66c bolt/lib/Core DIEBuilder.cpp, bolt/test/X86 dwarf5-locexpr-regval-type.s dwarf5-form-ref-udata.s

[BOLT][DWARF] Support DW_FORM_ref_udata and DW_OP_regval_type (#197565)

Add support for DWARF opcodes seen in GCC-generated binaries:

- DW_FORM_ref_udata: ULEB128-encoded CU-relative DIE reference.

- DW_OP_regval_type (0xa5): DWARF5 expression opcode with operands
(SizeLEB, BaseTypeRef). The BaseTypeRef was not being updated when DIEs
were relocated because cloneExpression only handled (Size1, BaseTypeRef)
patterns. Generalized the first-operand copying to use raw bytes from
the data stream instead of assuming a single byte.
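
Both items rely on ULEB128, a variable-length encoding, which is why a fixed one-byte copy of the first operand is not enough. The sketch below decodes a single ULEB128 value and reports how many bytes it occupied. `ULEB128Result` and `decodeULEB128Sketch` are hypothetical names for illustration only and are not BOLT's or LLVM's API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct ULEB128Result {
  uint64_t Value; // decoded value
  size_t Length;  // bytes consumed; 0 means the input was malformed
};

// Decodes one ULEB128 value from Data[0..Size). Each byte contributes its
// low 7 bits, least significant group first; a clear high bit ends the value.
ULEB128Result decodeULEB128Sketch(const uint8_t *Data, size_t Size) {
  assert((Data != nullptr || Size == 0) && "non-empty input needs a buffer");
  uint64_t Value = 0;
  unsigned Shift = 0;
  size_t I = 0;
  while (I < Size && Shift < 64) {
    uint8_t Byte = Data[I++];
    Value |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    if ((Byte & 0x80) == 0)
      return {Value, I};
    Shift += 7;
  }
  return {0, 0}; // truncated input or more than 64 bits of payload
}
```

Knowing the consumed length is what lets a caller copy the operand's raw bytes verbatim and then continue with the next operand at the right offset.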

Fixes #188250

Assisted-by: Claude Opus 4.6/4.7
Delta File
+83 -0 bolt/test/X86/dwarf5-locexpr-regval-type.s
+70 -0 bolt/test/X86/dwarf5-form-ref-udata.s
+8 -5 bolt/lib/Core/DIEBuilder.cpp
+161 -5 3 files

LLVM/project ac8361d llvm/lib/Target/X86 X86InstrCompiler.td, llvm/test/CodeGen/X86 atomic-load-store.ll

[X86] Remove extra MOV after widening atomic store

This change adds patterns to optimize out an extra MOV present after
widening the atomic store. Covers <2 x i8> (SSE4.1+), <2 x i16>,
<4 x i8>, <2 x i32>, <2 x float>, <4 x i16>, <2 x ptr addrspace(270)>.
Delta File
+47 -64 llvm/test/CodeGen/X86/atomic-load-store.ll
+99 -0 llvm/lib/Target/X86/X86InstrCompiler.td
+146 -64 2 files

LLVM/project 6cdd328 llvm/lib/AsmParser LLParser.cpp, llvm/test/Assembler thinlto-vtable-skip.ll thinlto-bad-summary1.ll

Handle typeidCompatibleVTable in skipModuleSummaryEntry (#196849)

This method needs to match the set of cases handled in parseSummaryEntry.
Delta File
+11 -0 llvm/test/Assembler/thinlto-vtable-skip.ll
+6 -5 llvm/lib/AsmParser/LLParser.cpp
+1 -1 llvm/test/Assembler/thinlto-bad-summary1.ll
+18 -6 3 files

LLVM/project 12e06d7 mlir/lib/Dialect/OpenMP/IR OpenMPDialect.cpp

Remove unrelated empty line
Delta File
+0 -1 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+0 -1 1 file

LLVM/project 83ae5cc flang/include/flang/Semantics semantics.h, flang/lib/Semantics resolve-directives.cpp rewrite-parse-tree.cpp

[flang][openacc] allow duplicate data sharing clauses (#197019)

This PR allows duplicate OpenACC `private` and `firstprivate` clauses
while maintaining the restriction on `reduction` clauses.
Delta File
+122 -0 flang/test/Semantics/OpenACC/acc-dataclause-dedup.f90
+63 -0 flang/test/Lower/OpenACC/acc-dedup-private.f90
+27 -16 flang/lib/Semantics/resolve-directives.cpp
+28 -0 flang/test/Parser/acc-dedup-unparse.f90
+11 -0 flang/include/flang/Semantics/semantics.h
+10 -0 flang/lib/Semantics/rewrite-parse-tree.cpp
+261 -16 1 file not shown
+262 -17 7 files

LLVM/project 4f60fb9 flang/docs Directives.md, flang/lib/Semantics expression.cpp

[flang][cuda] Honor !dir$ ignore_tkr(m) under -gpu=mem:{unified,managed} (#197518)

A device-typed dummy with `!dir$ ignore_tkr(m)` is meant to be an
overload discriminator (only selected for actuals with an explicit
`device/managed/unified` attribute). Skip the host->device relaxation in
AreCompatibleCUDADataAttrs when `IgnoreTKR::Managed` is set so
unattributed host actuals no longer bind to such a dummy.

Also document the §3.2.3 matching distance table next to
GetMatchingDistance and add LIT tests for the full Table 2 grid
and the ignore_tkr(m) carve-out.
Delta File
+90 -0 flang/test/Semantics/cuf-matching-distance.cuf
+56 -0 flang/test/Semantics/cuf-ignore-tkr-m-generic.cuf
+36 -0 flang/docs/Directives.md
+32 -0 flang/test/Semantics/cuf-ignore-tkr-m-error.cuf
+23 -2 flang/lib/Semantics/expression.cpp
+13 -5 flang/lib/Support/Fortran.cpp
+250 -7 6 files

LLVM/project e2b5048 llvm/lib/Target/AMDGPU/AsmParser AMDGPUAsmParser.cpp, llvm/test/MC/AMDGPU literals.s

[AMDGPU] Validate forced lit() immediate (#196623)

Right now a forced lit() immediate takes the validation path of an
inline constant if the value fits, even though it is forced to literal
encoding.
Delta File
+7 -8 llvm/test/MC/AMDGPU/literals.s
+7 -1 llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
+14 -9 2 files

LLVM/project 1ee6e9c llvm/lib/ProfileData InstrProf.cpp, llvm/lib/Transforms/Instrumentation PGOMemOPSizeOpt.cpp

fix

Created using spr 1.3.7
Delta File
+0 -58 llvm/test/Transforms/PGOProfile/consecutive-zeros.ll
+33 -16 llvm/lib/ProfileData/InstrProf.cpp
+0 -47 llvm/test/Transforms/PGOProfile/Inputs/consecutive-zeros.proftext
+4 -38 llvm/test/Transforms/JumpTableToSwitch/profile-no-guid-metadata.ll
+0 -7 llvm/lib/Transforms/Instrumentation/PGOMemOPSizeOpt.cpp
+37 -166 5 files

LLVM/project e780eb0 llvm/lib/ProfileData InstrProf.cpp, llvm/lib/Transforms/Instrumentation PGOMemOPSizeOpt.cpp

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.7

[skip ci]
Delta File
+0 -58 llvm/test/Transforms/PGOProfile/consecutive-zeros.ll
+33 -16 llvm/lib/ProfileData/InstrProf.cpp
+0 -47 llvm/test/Transforms/PGOProfile/Inputs/consecutive-zeros.proftext
+4 -38 llvm/test/Transforms/JumpTableToSwitch/profile-no-guid-metadata.ll
+0 -7 llvm/lib/Transforms/Instrumentation/PGOMemOPSizeOpt.cpp
+37 -166 5 files

LLVM/project 5ad848e llvm/lib/ProfileData InstrProf.cpp, llvm/test/Transforms/JumpTableToSwitch profile-no-guid-metadata.ll

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
Delta File
+0 -58 llvm/test/Transforms/PGOProfile/consecutive-zeros.ll
+0 -47 llvm/test/Transforms/PGOProfile/Inputs/consecutive-zeros.proftext
+4 -38 llvm/test/Transforms/JumpTableToSwitch/profile-no-guid-metadata.ll
+33 -4 llvm/lib/ProfileData/InstrProf.cpp
+25 -0 llvm/test/Transforms/PGOProfile/Inputs/consecutive-zeros-metadata.proftext
+21 -0 llvm/test/Transforms/PGOProfile/consecutive-zeros-metadata.ll
+83 -147 2 files not shown
+86 -155 8 files

LLVM/project 0f43f70 llvm/lib/ProfileData InstrProf.cpp, llvm/lib/Transforms/Instrumentation PGOMemOPSizeOpt.cpp

[𝘀𝗽𝗿] changes to main this commit is based on

Created using spr 1.3.7

[skip ci]
Delta File
+0 -58 llvm/test/Transforms/PGOProfile/consecutive-zeros.ll
+0 -47 llvm/test/Transforms/PGOProfile/Inputs/consecutive-zeros.proftext
+33 -4 llvm/lib/ProfileData/InstrProf.cpp
+25 -0 llvm/test/Transforms/PGOProfile/Inputs/consecutive-zeros-metadata.proftext
+21 -0 llvm/test/Transforms/PGOProfile/consecutive-zeros-metadata.ll
+0 -7 llvm/lib/Transforms/Instrumentation/PGOMemOPSizeOpt.cpp
+79 -116 1 file not shown
+82 -117 7 files

LLVM/project d1eeb00 llvm/lib/ProfileData InstrProf.cpp, llvm/lib/Transforms/Instrumentation PGOMemOPSizeOpt.cpp

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
Delta File
+0 -58 llvm/test/Transforms/PGOProfile/consecutive-zeros.ll
+0 -47 llvm/test/Transforms/PGOProfile/Inputs/consecutive-zeros.proftext
+33 -4 llvm/lib/ProfileData/InstrProf.cpp
+25 -0 llvm/test/Transforms/PGOProfile/Inputs/consecutive-zeros-metadata.proftext
+21 -0 llvm/test/Transforms/PGOProfile/consecutive-zeros-metadata.ll
+0 -7 llvm/lib/Transforms/Instrumentation/PGOMemOPSizeOpt.cpp
+79 -116 1 file not shown
+82 -117 7 files

LLVM/project 3f6536e llvm/docs LangRef.rst, llvm/lib/ProfileData InstrProf.cpp

[𝘀𝗽𝗿] changes to main this commit is based on

Created using spr 1.3.7

[skip ci]
Delta File
+0 -58 llvm/test/Transforms/PGOProfile/consecutive-zeros.ll
+0 -47 llvm/test/Transforms/PGOProfile/Inputs/consecutive-zeros.proftext
+33 -4 llvm/lib/ProfileData/InstrProf.cpp
+25 -0 llvm/test/Transforms/PGOProfile/Inputs/consecutive-zeros-metadata.proftext
+21 -0 llvm/test/Transforms/PGOProfile/consecutive-zeros-metadata.ll
+3 -1 llvm/docs/LangRef.rst
+82 -110 6 files

LLVM/project 44c2207 llvm/lib/ProfileData InstrProf.cpp, llvm/test/Transforms/PGOProfile consecutive-zeros.ll

fix

Created using spr 1.3.7
Delta File
+0 -58 llvm/test/Transforms/PGOProfile/consecutive-zeros.ll
+33 -16 llvm/lib/ProfileData/InstrProf.cpp
+0 -47 llvm/test/Transforms/PGOProfile/Inputs/consecutive-zeros.proftext
+33 -121 3 files

LLVM/project a2214e3 llvm/test/Transforms/PGOProfile consecutive-zeros.ll, llvm/test/Transforms/PGOProfile/Inputs consecutive-zeros.proftext

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.7

[skip ci]
Delta File
+0 -58 llvm/test/Transforms/PGOProfile/consecutive-zeros.ll
+0 -47 llvm/test/Transforms/PGOProfile/Inputs/consecutive-zeros.proftext
+0 -105 2 files

LLVM/project e7852b5 llvm/docs LangRef.rst, llvm/test/Transforms/PGOProfile consecutive-zeros.ll

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
Delta File
+0 -58 llvm/test/Transforms/PGOProfile/consecutive-zeros.ll
+0 -47 llvm/test/Transforms/PGOProfile/Inputs/consecutive-zeros.proftext
+3 -1 llvm/docs/LangRef.rst
+3 -106 3 files

LLVM/project 70ac2f7 llvm/docs LangRef.rst

[𝘀𝗽𝗿] changes to main this commit is based on

Created using spr 1.3.7

[skip ci]
Delta File
+3 -1 llvm/docs/LangRef.rst
+3 -1 1 file

LLVM/project 409d4a6 llvm/test/CodeGen/AMDGPU splitkit-copy-live-lanes.mir ra-inserted-scalar-instructions.mir, llvm/test/CodeGen/X86 statepoint-invoke-ra-inline-spiller.mir

[MIR] Serialize/Deserialize MachineInstr::LRSplit attribute

The LRSplit MachineInstr flag is set by SplitKit on copies inserted for
live-range splitting.
Until now the flag had no MIR-text representation.

This patch fixes that, making it easier to reproduce and capture issues
that involve SplitKit.

Round-trip coverage in
llvm/test/CodeGen/MIR/AMDGPU/lr-split-flag.mir.
Delta File
+168 -168 llvm/test/CodeGen/AMDGPU/splitkit-copy-live-lanes.mir
+36 -36 llvm/test/CodeGen/AMDGPU/ra-inserted-scalar-instructions.mir
+32 -32 llvm/test/CodeGen/AMDGPU/splitkit-copy-bundle.mir
+27 -27 llvm/test/CodeGen/AMDGPU/ran-out-of-sgprs-allocation-failure.mir
+22 -22 llvm/test/CodeGen/AMDGPU/inflated-reg-class-snippet-copy-use-after-free.mir
+22 -22 llvm/test/CodeGen/X86/statepoint-invoke-ra-inline-spiller.mir
+307 -307 31 files not shown
+436 -402 37 files

LLVM/project 8baf11a llvm/test/CodeGen/AMDGPU regalloc-hoist-spill-live-range-upd.ll regalloc-hoist-spill-live-range-upd.mir

[AMDGPU][test] Use mir test for regalloc issue

Use the newly introduced split-from flag to produce a more robust test case
for the hoistSpillInsideBB live-range update issue.

NFC

Delta File
+0 -2,870 llvm/test/CodeGen/AMDGPU/regalloc-hoist-spill-live-range-upd.ll
+71 -0 llvm/test/CodeGen/AMDGPU/regalloc-hoist-spill-live-range-upd.mir
+71 -2,870 2 files