LLVM/project ad2bf49 clang/lib/Serialization ASTWriter.cpp, clang/unittests/Serialization NoCommentsTest.cpp CommentsTest.cpp

[C++20] [Modules] Write comments in C++20 modules' module file (#192398)

Previously we avoided writing comments to C++20 modules' module files.

But this prevents LSP tools from reading the comments in them. Although we
considered adding a new option for this and asking LSP tools to use it,
the cost of comments seems to be low and a new option adds complexity,
so I prefer to write comments to C++20 modules' module files by default
now.
Delta  File
+0 -130  clang/unittests/Serialization/NoCommentsTest.cpp
+130 -0  clang/unittests/Serialization/CommentsTest.cpp
+0 -6  clang/lib/Serialization/ASTWriter.cpp
+1 -1  clang/unittests/Serialization/CMakeLists.txt
+131 -137  4 files

LLVM/project 84d5a9c llvm/lib/Target/LoongArch LoongArchISelDAGToDAG.cpp

Address wanglei's comments
Delta  File
+3 -4  llvm/lib/Target/LoongArch/LoongArchISelDAGToDAG.cpp
+3 -4  1 file

LLVM/project 0dee745 lldb/include/lldb/Interpreter OptionGroupVariable.h, lldb/source/Interpreter OptionGroupVariable.cpp

[lldb] Reformat OptionGroupVariable.{h,cpp}, NFC.

This patch runs clang-format on OptionGroupVariable.{h,cpp}.

stack-info: PR: https://github.com/llvm/llvm-project/pull/192395, branch: users/bzcheeseman/stack/10
Delta  File
+84 -21  lldb/source/Interpreter/OptionGroupVariable.cpp
+3 -3  lldb/include/lldb/Interpreter/OptionGroupVariable.h
+87 -24  2 files

LLVM/project 72a3cd9 llvm/lib/CodeGen MacroFusion.cpp, llvm/test/CodeGen/AArch64 macro-fusion-cluster-conflict.mir

[MacroFusion] Early return when insts already clustered (#191710)

This patch adds an early return to `fuseInstructionPair()` when macro
fused instructions are already clustered, either by an earlier fusion or
another clustering like ld/st clustering, removing the assert.

The assert is generally wrong - there are edge cases where an earlier
ld/st clustering (before macro fusion) reaches the assert, which fails
because the clustering has already set `ParentClusterIdx`. For example,
ADRP+LOAD/STORE on AArch64, though it seems to be a rare case because the
addresses are usually unknown at compile time.

It doesn't effectively change how fusions are prioritized - early
fusions still win on fusion-fusion conflicts, like before. But it
changes how we resolve the edge case of ld/st-fusion conflicts:

Previously, fusions would effectively override ld/st clustering in this
case, given that we currently limit instruction membership to at most a
single cluster through `ParentClusterIdx`. Macro fusion runs after ld/st

    [10 lines not shown]
Delta  File
+162 -0  llvm/test/CodeGen/AArch64/macro-fusion-cluster-conflict.mir
+17 -16  llvm/lib/CodeGen/MacroFusion.cpp
+179 -16  2 files
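
The early return described in the MacroFusion commit above can be sketched roughly as follows. This is a minimal standalone model, not the code from `MacroFusion.cpp`: `Unit`, `kNoCluster`, and `tryFusePair` are hypothetical names, and the only behavior modeled is "decline to fuse when either instruction already belongs to a cluster".

```cpp
// Hypothetical sentinel meaning "not part of any cluster".
constexpr unsigned kNoCluster = ~0u;

// Hypothetical stand-in for a scheduling unit; this is not LLVM's SUnit.
struct Unit {
  unsigned ParentClusterIdx = kNoCluster;
  bool isClustered() const { return ParentClusterIdx != kNoCluster; }
};

// Try to place First and Second in cluster NewClusterIdx.
// If either unit is already clustered (by an earlier fusion or by some
// other clustering), return false and leave both units untouched instead
// of asserting.
inline bool tryFusePair(Unit &First, Unit &Second, unsigned NewClusterIdx) {
  if (First.isClustered() || Second.isClustered())
    return false;
  First.ParentClusterIdx = NewClusterIdx;
  Second.ParentClusterIdx = NewClusterIdx;
  return true;
}
```

Under this model the earliest clustering always wins, which matches the stated behavior for fusion-fusion conflicts; how the real patch treats ld/st-fusion conflicts is described in the part of the message not shown here.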

LLVM/project eea8fb3 mlir/lib/Target/LLVMIR/Dialect/OpenMP OpenMPToLLVMIRTranslation.cpp, mlir/test/Target/LLVMIR openmp-target-launch-device.mlir openmp-target-launch-host.mlir

add else block for maxTeamsVals
Delta  File
+5 -3  mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+1 -1  mlir/test/Target/LLVMIR/openmp-target-launch-device.mlir
+1 -1  mlir/test/Target/LLVMIR/openmp-target-launch-host.mlir
+7 -5  3 files

LLVM/project cb448f7 bolt/unittests/Profile DataAggregator.cpp

format
Delta  File
+1 -2  bolt/unittests/Profile/DataAggregator.cpp
+1 -2  1 file

LLVM/project 178ef93 mlir/lib/Target/LLVMIR/Dialect/OpenMP OpenMPToLLVMIRTranslation.cpp, mlir/test/Target/LLVMIR openmp-target-launch-host.mlir openmp-target-launch-device.mlir

update
Delta  File
+30 -40  mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+24 -4  mlir/test/Target/LLVMIR/openmp-target-launch-host.mlir
+19 -0  mlir/test/Target/LLVMIR/openmp-target-launch-device.mlir
+73 -44  3 files

LLVM/project 3a85815 mlir/lib/Target/LLVMIR/Dialect/OpenMP OpenMPToLLVMIRTranslation.cpp, mlir/test/Target/LLVMIR openmp-todo.mlir openmp-target-launch-host.mlir

[OpenMP][MLIR] Add num_teams mlir to llvm lowering
Delta  File
+83 -34  mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+18 -2  mlir/test/Target/LLVMIR/openmp-todo.mlir
+3 -3  mlir/test/Target/LLVMIR/openmp-target-launch-host.mlir
+104 -39  3 files

LLVM/project 13cae27 flang/include/flang/Evaluate tools.h, flang/lib/Evaluate tools.cpp

Revert "[flang][cuda] Avoid false positive on multi device symbol with components" (#192393)

Reverts llvm/llvm-project#192177

This breaks some downstream testing.
Delta  File
+0 -34  flang/lib/Evaluate/tools.cpp
+0 -26  flang/test/Lower/CUDA/cuda-data-transfer.cuf
+0 -11  flang/include/flang/Evaluate/tools.h
+1 -2  flang/lib/Semantics/check-cuda.cpp
+1 -73  4 files

LLVM/project 80776e6 flang/include/flang/Evaluate tools.h, flang/lib/Evaluate tools.cpp

Revert "[flang][cuda] Avoid false positive on multi device symbol with compon…"

This reverts commit a3af640a1b5c61e7f31c7212338cd348d9b4a132.
Delta  File
+0 -34  flang/lib/Evaluate/tools.cpp
+0 -26  flang/test/Lower/CUDA/cuda-data-transfer.cuf
+0 -11  flang/include/flang/Evaluate/tools.h
+1 -2  flang/lib/Semantics/check-cuda.cpp
+1 -73  4 files

LLVM/project 6b36d39 bolt/include/bolt/Profile DataAggregator.h, bolt/unittests/Profile DataAggregator.cpp

[𝘀𝗽𝗿] changes to main this commit is based on

Created using spr 1.3.4

[skip ci]
Delta  File
+208 -2  bolt/unittests/Profile/DataAggregator.cpp
+1 -0  bolt/include/bolt/Profile/DataAggregator.h
+209 -2  2 files

LLVM/project 29cecd0 bolt/include/bolt/Profile DataAggregator.h, bolt/lib/Profile DataReader.cpp DataAggregator.cpp

[𝘀𝗽𝗿] initial version

Created using spr 1.3.4
Delta  File
+248 -2  bolt/unittests/Profile/DataAggregator.cpp
+8 -3  bolt/lib/Profile/DataReader.cpp
+4 -2  bolt/lib/Profile/DataAggregator.cpp
+1 -0  bolt/include/bolt/Profile/DataAggregator.h
+261 -7  4 files

LLVM/project e6ef412 bolt/include/bolt/Profile DataAggregator.h, bolt/unittests/Profile DataAggregator.cpp

[𝘀𝗽𝗿] initial version

Created using spr 1.3.4
Delta  File
+208 -2  bolt/unittests/Profile/DataAggregator.cpp
+1 -0  bolt/include/bolt/Profile/DataAggregator.h
+209 -2  2 files

LLVM/project 98449cb compiler-rt/lib/hwasan hwasan_allocator.cpp hwasan_flags.inc, compiler-rt/test/hwasan/TestCases tag_mask_smoke.c

[HWASan] [compiler-rt] Add tag_bits option to HWASan alloc (#192386)

This can be used to make sure the allocator does not use the top bit of
the pointer. This is useful when HWASan is used in combination with
signed-integer-overflow detection. Some code uses arithmetic on intptr_t
that overflows for sufficiently large pointers.
Delta  File
+21 -0  compiler-rt/test/hwasan/TestCases/tag_mask_smoke.c
+14 -2  compiler-rt/lib/hwasan/hwasan_allocator.cpp
+2 -0  compiler-rt/lib/hwasan/hwasan_flags.inc
+37 -2  3 files
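
To illustrate why restricting the tag width keeps the top pointer bit clear, here is a small sketch. It assumes the tag occupies the top byte of a 64-bit pointer (bits 56 to 63) and that the option limits tags to their low `tagBits` bits; the exact semantics of `tag_bits` are not given in the message, and `tagMask`, `applyTag`, and `topBitSet` are hypothetical helpers, not HWASan functions.

```cpp
#include <cstdint>

// Assumption: the tag is stored in the top byte of a 64-bit pointer.
constexpr unsigned kTagShift = 56;

// Mask keeping only the low `tagBits` bits of an 8-bit tag (tagBits in 1..8).
constexpr std::uint8_t tagMask(unsigned tagBits) {
  return static_cast<std::uint8_t>((1u << tagBits) - 1u);
}

// Replace the top byte of `addr` with `tag`, truncated to `tagBits` bits.
constexpr std::uint64_t applyTag(std::uint64_t addr, std::uint8_t tag,
                                 unsigned tagBits) {
  return (addr & ~(0xffull << kTagShift)) |
         (static_cast<std::uint64_t>(tag & tagMask(tagBits)) << kTagShift);
}

// True if bit 63 is set, i.e. the value is negative when viewed as intptr_t.
constexpr bool topBitSet(std::uint64_t p) { return (p >> 63) != 0; }
```

With 8 tag bits a tag such as `0xff` sets bit 63, so the pointer is negative when viewed as `intptr_t` and signed arithmetic on it is more likely to overflow; with 7 tag bits the same tag is truncated to `0x7f` and bit 63 stays clear.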

LLVM/project 57e973b llvm/lib/Target/AMDGPU AMDGPUAsmPrinter.cpp, llvm/lib/Target/AMDGPU/AsmParser AMDGPUAsmParser.cpp

[AMDGPU] Add `.amdgpu.info` section for per-function metadata

AMDGPU object linking requires the linker to propagate resource usage
(registers, stack, LDS) across translation units. To support this, the compiler
must emit per-function metadata and call graph edges in the relocatable object
so the linker can compute whole-program resource requirements.

This PR introduces a `.amdgpu.info` ELF section using a tagged, length-prefixed
binary format: each entry is encoded as:

```
[kind: u8] [len: u8] [payload: <len> bytes]
```

A function scope is opened by an `INFO_FUNC` entry (containing a symbol
reference), followed by per-function attributes (register counts, flags, private
segment size) and relational edges (direct calls, LDS uses, indirect call
signatures). String data such as function type signatures is stored in a
companion `.amdgpu.strtab` section.

    [4 lines not shown]
Delta  File
+197 -0  llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
+172 -2  llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
+121 -0  llvm/test/MC/AMDGPU/amdgpu-info-roundtrip.s
+117 -0  llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
+83 -0  llvm/test/CodeGen/AMDGPU/lds-link-time-codegen-prototype.ll
+63 -0  llvm/test/CodeGen/AMDGPU/lds-link-time-codegen-indirect.ll
+753 -2  8 files not shown
+1,033 -12  14 files
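
The `[kind: u8] [len: u8] [payload: <len> bytes]` layout described above can be walked with a few lines of code. The following is only a sketch of a generic reader for that layout: `InfoEntry` and `parseInfoEntries` are hypothetical names, no real kind values are assumed, and the actual emitter and parser in the patch may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// One decoded entry: a kind tag plus its raw payload bytes.
struct InfoEntry {
  std::uint8_t Kind = 0;
  std::vector<std::uint8_t> Payload;
};

// Decode a buffer of [kind][len][payload] entries into Out.
// Returns false if the buffer ends in the middle of an entry.
inline bool parseInfoEntries(const std::vector<std::uint8_t> &Bytes,
                             std::vector<InfoEntry> &Out) {
  std::size_t Pos = 0;
  while (Pos < Bytes.size()) {
    if (Bytes.size() - Pos < 2)
      return false; // truncated kind/len header
    InfoEntry E;
    E.Kind = Bytes[Pos];
    const std::size_t Len = Bytes[Pos + 1];
    Pos += 2;
    if (Bytes.size() - Pos < Len)
      return false; // payload shorter than the declared length
    const auto First = Bytes.begin() + static_cast<std::ptrdiff_t>(Pos);
    E.Payload.assign(First, First + static_cast<std::ptrdiff_t>(Len));
    Out.push_back(std::move(E));
    Pos += Len;
  }
  return true;
}
```

Because every entry carries its own length, a reader can skip kinds it does not understand, which is the usual reason for choosing a tagged, length-prefixed encoding.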

LLVM/project 8521c90 llvm/lib/Target/AMDGPU AMDGPUAsmPrinter.cpp SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU lds-link-time-codegen.ll lds-link-time-codegen-named-barrier.ll

[AMDGPU] Emit the relocation symbol for LDS and named barrier when object linking is enabled
Delta  File
+50 -0  llvm/test/CodeGen/AMDGPU/lds-link-time-codegen.ll
+35 -0  llvm/test/CodeGen/AMDGPU/lds-link-time-codegen-named-barrier.ll
+12 -3  llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
+10 -0  llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+107 -3  4 files

LLVM/project db3c8cd compiler-rt/lib/hwasan hwasan_allocator.cpp hwasan_flags.inc, compiler-rt/test/hwasan/TestCases tag_mask_smoke.c

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
Delta  File
+21 -0  compiler-rt/test/hwasan/TestCases/tag_mask_smoke.c
+14 -2  compiler-rt/lib/hwasan/hwasan_allocator.cpp
+2 -0  compiler-rt/lib/hwasan/hwasan_flags.inc
+37 -2  3 files

LLVM/project 9816a91 compiler-rt/lib/hwasan hwasan_allocator.cpp hwasan_flags.inc, compiler-rt/test/hwasan/TestCases tag_mask_smoke.c

[HWASan] [compiler-rt] Add tag_bits option to HWASan alloc (#191089)

This can be used to make sure the allocator does not use the top bit of
the pointer. This is useful when HWASan is used in combination with
signed-integer-overflow detection. Some code uses arithmetic on intptr_t
that overflows for sufficiently large pointers.
Delta  File
+21 -0  compiler-rt/test/hwasan/TestCases/tag_mask_smoke.c
+14 -2  compiler-rt/lib/hwasan/hwasan_allocator.cpp
+2 -0  compiler-rt/lib/hwasan/hwasan_flags.inc
+37 -2  3 files

LLVM/project b848e99 clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/zvzip/policy/non-overloaded vpaire.c vpairo.c, clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/zvzip/policy/overloaded vpairo.c

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.7

[skip ci]
Delta  File
+6,877 -0  llvm/test/tools/llvm-mca/AArch64/Cortex/C1Ultra-sve-instructions.s
+5,336 -0  llvm/test/tools/llvm-mca/AArch64/Cortex/C1Ultra-writeback.s
+3,167 -0  llvm/test/tools/llvm-mca/AArch64/Cortex/C1Ultra-neon-instructions.s
+2,723 -0  clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/zvzip/policy/overloaded/vpairo.c
+2,723 -0  clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/zvzip/policy/non-overloaded/vpaire.c
+2,723 -0  clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/zvzip/policy/non-overloaded/vpairo.c
+23,549 -0  646 files not shown
+86,047 -9,587  652 files

LLVM/project b06f25f clang/include/clang/AST ASTContext.h, clang/lib/AST ASTContext.cpp ItaniumMangle.cpp

[clang] implement CWG2064: ignore value dependence for decltype

The 'decltype' of a value-dependent (but non-type-dependent) expression
should be known, so this patch makes such types non-opaque instead.

This patch also implements what's necessary to allow overloading
on pure differences in instantiation dependence, making `std::void_t`
usable for SFINAE purposes.

This also re-adds a few test cases from da98651, which was a previous attempt
at resolving CWG2064.

Fixes #8740
Fixes #61818
Fixes #190388
Delta  File
+889 -175  clang/lib/AST/ASTContext.cpp
+312 -12  clang/test/SemaTemplate/instantiation-dependence.cpp
+151 -93  clang/lib/AST/ItaniumMangle.cpp
+76 -68  clang/lib/AST/Type.cpp
+77 -48  clang/lib/Sema/SemaTemplate.cpp
+93 -16  clang/include/clang/AST/ASTContext.h
+1,598 -412  81 files not shown
+2,323 -757  87 files
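
As a small illustration of the terms used in the CWG2064 commit above, the sketch below shows an expression that is value-dependent but not type-dependent, followed by the familiar `std::void_t` detection idiom. It assumes C++17, all names are hypothetical, and it does not reproduce the overloading cases the patch addresses; it should compile with or without the change.

```cpp
#include <cstddef>
#include <type_traits>

// sizeof(T) is value-dependent (its value depends on T) but not
// type-dependent: whatever T is, the expression has type std::size_t.
template <typename T>
constexpr bool sizeofHasKnownType() {
  return std::is_same<decltype(sizeof(T)), std::size_t>::value;
}

// Detection idiom: the partial specialization is selected only when
// `typename T::value_type` is well-formed (SFINAE through std::void_t).
template <typename T, typename = void>
struct HasValueType : std::false_type {};

template <typename T>
struct HasValueType<T, std::void_t<typename T::value_type>>
    : std::true_type {};

// Hypothetical type used to exercise the trait.
struct WithValueType {
  using value_type = int;
};
```

Here `HasValueType<WithValueType>::value` is true and `HasValueType<int>::value` is false.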

LLVM/project 78cefa3 llvm/lib/Target/AMDGPU AMDGPUAsmPrinter.cpp, llvm/lib/Target/AMDGPU/AsmParser AMDGPUAsmParser.cpp

[AMDGPU] Add `.amdgpu.info` section for per-function metadata

AMDGPU object linking requires the linker to propagate resource usage
(registers, stack, LDS) across translation units. To support this, the compiler
must emit per-function metadata and call graph edges in the relocatable object
so the linker can compute whole-program resource requirements.

This PR introduces a `.amdgpu.info` ELF section using a tagged, length-prefixed
binary format: each entry is encoded as:

```
[kind: u8] [len: u8] [payload: <len> bytes]
```

A function scope is opened by an `INFO_FUNC` entry (containing a symbol
reference), followed by per-function attributes (register counts, flags, private
segment size) and relational edges (direct calls, LDS uses, indirect call
signatures). String data such as function type signatures is stored in a
companion `.amdgpu.strtab` section.

    [4 lines not shown]
Delta  File
+197 -0  llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
+172 -2  llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
+121 -0  llvm/test/MC/AMDGPU/amdgpu-info-roundtrip.s
+117 -0  llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
+83 -0  llvm/test/CodeGen/AMDGPU/lds-link-time-codegen-prototype.ll
+63 -0  llvm/test/CodeGen/AMDGPU/lds-link-time-codegen-indirect.ll
+753 -2  8 files not shown
+1,038 -13  14 files

LLVM/project 9ae35fc llvm/lib/Target/AMDGPU AMDGPUAsmPrinter.cpp, llvm/lib/Target/AMDGPU/AsmParser AMDGPUAsmParser.cpp

[AMDGPU] Add `.amdgpu.info` section for per-function metadata

AMDGPU object linking requires the linker to propagate resource usage
(registers, stack, LDS) across translation units. To support this, the compiler
must emit per-function metadata and call graph edges in the relocatable object
so the linker can compute whole-program resource requirements.

This PR introduces a `.amdgpu.info` ELF section using a tagged, length-prefixed
binary format: each entry is encoded as:

```
[kind: u8] [len: u8] [payload: <len> bytes]
```

A function scope is opened by an `INFO_FUNC` entry (containing a symbol
reference), followed by per-function attributes (register counts, flags, private
segment size) and relational edges (direct calls, LDS uses, indirect call
signatures). String data such as function type signatures is stored in a
companion `.amdgpu.strtab` section.

    [4 lines not shown]
Delta  File
+192 -0  llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
+172 -2  llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
+121 -0  llvm/test/MC/AMDGPU/amdgpu-info-roundtrip.s
+117 -0  llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
+83 -0  llvm/test/CodeGen/AMDGPU/lds-link-time-codegen-prototype.ll
+63 -0  llvm/test/CodeGen/AMDGPU/lds-link-time-codegen-indirect.ll
+748 -2  8 files not shown
+1,033 -13  14 files

LLVM/project 0e5e975 llvm/lib/CodeGen InlineSpiller.cpp LiveRangeEdit.cpp, llvm/lib/Target/X86 X86InstrInfo.cpp

[X86][APX] Reset SubReg for dst and check isVirtual before getInterval/getPhys (#191765)

We have made sure the dst operand never has a SubReg. We also need to make
sure the register is virtual before calling getInterval/getPhys.
Delta  File
+287 -0  llvm/test/CodeGen/X86/apx/add.ll
+219 -0  llvm/test/CodeGen/X86/apx/sub.ll
+9 -5  llvm/lib/Target/X86/X86InstrInfo.cpp
+5 -2  llvm/lib/CodeGen/InlineSpiller.cpp
+5 -2  llvm/lib/CodeGen/LiveRangeEdit.cpp
+525 -9  5 files

LLVM/project 3f5a247 clang/include/clang/AST ASTContext.h, clang/lib/AST ASTContext.cpp ItaniumMangle.cpp

[clang] implement CWG2064: ignore value dependence for decltype

The 'decltype' of a value-dependent (but non-type-dependent) expression
should be known, so this patch makes such types non-opaque instead.

This patch also implements what's necessary to allow overloading
on pure differences in instantiation dependence, making `std::void_t`
usable for SFINAE purposes.

This also re-adds a few test cases from da98651, which was a previous attempt
at resolving CWG2064.

Fixes #8740
Fixes #61818
Fixes #190388
Delta  File
+889 -175  clang/lib/AST/ASTContext.cpp
+313 -12  clang/test/SemaTemplate/instantiation-dependence.cpp
+151 -93  clang/lib/AST/ItaniumMangle.cpp
+76 -68  clang/lib/AST/Type.cpp
+77 -48  clang/lib/Sema/SemaTemplate.cpp
+93 -16  clang/include/clang/AST/ASTContext.h
+1,599 -412  81 files not shown
+2,324 -757  87 files

LLVM/project d85709c llvm/lib/Target/AMDGPU AMDGPUAsmPrinter.cpp SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU lds-link-time-codegen.ll lds-link-time-codegen-named-barrier.ll

[AMDGPU] Emit the relocation symbol for LDS and named barrier when object linking is enabled
Delta  File
+51 -0  llvm/test/CodeGen/AMDGPU/lds-link-time-codegen.ll
+35 -0  llvm/test/CodeGen/AMDGPU/lds-link-time-codegen-named-barrier.ll
+12 -3  llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
+10 -0  llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+108 -3  4 files

LLVM/project c340f60 offload/libomptarget omptarget.cpp, offload/plugins-nextgen/common/src RecordReplay.cpp

[offload] Fix asserts in kernel record replay (#192379)

This commit fixes issues introduced in PR #190588
Delta  File
+2 -2  offload/libomptarget/omptarget.cpp
+2 -2  offload/plugins-nextgen/common/src/RecordReplay.cpp
+4 -4  2 files

LLVM/project 90fc0cc llvm/lib/Target/AMDGPU AMDGPUAsmPrinter.cpp SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU lds-link-time-codegen.ll lds-link-time-codegen-named-barrier.ll

[AMDGPU] Emit the relocation symbol for LDS and named barrier when object linking is enabled
Delta  File
+43 -0  llvm/test/CodeGen/AMDGPU/lds-link-time-codegen.ll
+36 -0  llvm/test/CodeGen/AMDGPU/lds-link-time-codegen-named-barrier.ll
+12 -3  llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
+10 -0  llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+101 -3  4 files

LLVM/project c87a60f llvm/lib/Target/RISCV/MCTargetDesc RISCVMatInt.cpp

[RISCV] Replace Unsigned flag in generateInstSeqImpl with ShiftOpc. NFC (#192363)

Changed ShiftAmount from int to unsigned.
Delta  File
+5 -6  llvm/lib/Target/RISCV/MCTargetDesc/RISCVMatInt.cpp
+5 -6  1 file

LLVM/project de5a7f1 llvm/lib/Target/RISCV/MCTargetDesc RISCVMatInt.cpp, llvm/test/MC/RISCV rv32p-aliases-valid.s

[RISCV] Prefer LUI over PLUI.H on RV32. (#192340)

I don't think any of the cases PLUI.H can handle would be eligible for
C.LUI, but still figured it was best to use base ISA instructions when
possible.
Delta  File
+7 -3  llvm/test/MC/RISCV/rv32p-aliases-valid.s
+3 -1  llvm/lib/Target/RISCV/MCTargetDesc/RISCVMatInt.cpp
+10 -4  2 files

LLVM/project 53cf0d5 clang/test/Driver serenity.cpp

[clang] Make serenity.cpp tests pass on clang-with-thin-lto-ubuntu (#192231)

LTO_FULL-NOT was definitely too generic and prone to matching unrelated
content. It would, as an example, match against the build path on the
clang-with-thin-lto-ubuntu builder [1].

Making the match more restrictive should avoid this kind of issue.

[1] https://lab.llvm.org/buildbot/#/builders/127/builds/6956
Delta  File
+1 -1  clang/test/Driver/serenity.cpp
+1 -1  1 file
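
The failure mode described in the serenity.cpp commit above can be modeled with plain substring matching. This is a simplified stand-in, not FileCheck itself: the actual `LTO_FULL-NOT` pattern is not quoted in the message, so the pattern strings, the path, and the flag name below are all hypothetical.

```cpp
#include <string>

// True if `line` contains `pattern` as a plain substring. A negative
// ("NOT") check fails whenever this returns true for any checked line.
inline bool matches(const std::string &line, const std::string &pattern) {
  return line.find(pattern) != std::string::npos;
}

// Hypothetical linker command line whose tool path happens to include
// the builder name.
inline std::string exampleCommandLine() {
  return "\"/b/clang-with-thin-lto-ubuntu/build/bin/ld.lld\" \"-o\" \"a.out\"";
}
```

A generic pattern such as `thin` matches the builder name inside the path even though no thin-LTO flag is present, while a pattern anchored on a complete quoted argument (here the made-up `"--hypothetical-thin-flag"`) does not.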