LLVM/project 7a37eac flang/test/Fir/FirToSCF do-extra.fir iterate-while-extra.fir

[flang] [test] add tests for FIRToSCF (#176026)

Increase the coverage of FIRToSCF testing.
Delta     File
+152 -0   flang/test/Fir/FirToSCF/do-extra.fir
+124 -0   flang/test/Fir/FirToSCF/iterate-while-extra.fir
+122 -0   flang/test/Fir/FirToSCF/if-extra.fir
+69 -0    flang/test/Fir/FirToSCF/normalize.fir
+50 -0    flang/test/Fir/FirToSCF/sum.fir
+42 -0    flang/test/Fir/FirToSCF/any.fir
+559 -0   1 files not shown
+585 -0   7 files

LLVM/project abba7ee clang/include/clang/Analysis/Analyses/LifetimeSafety Facts.h, clang/lib/Analysis/LifetimeSafety FactsGenerator.cpp

Detect dangling fields
Delta       File
+313 -315   clang/lib/Sema/AnalysisBasedWarnings.cpp
+151 -0     clang/test/Sema/warn-lifetime-safety-dangling-field.cpp
+48 -4      clang/include/clang/Analysis/Analyses/LifetimeSafety/Facts.h
+45 -1      clang/lib/Analysis/LifetimeSafety/FactsGenerator.cpp
+0 -28      clang/test/Analysis/lifetime-cfg-output.cpp
+16 -9      clang/test/Sema/warn-lifetime-analysis-nocfg.cpp
+573 -357   13 files not shown
+661 -380   19 files

LLVM/project c63a744 clang/lib/CodeGen CodeGenFunction.cpp CGExpr.cpp, clang/test/CodeGen lifetime-sanitizer.c

[CodeGen][InstCombine][Sanitizers] Emit lifetimes when compiling with memtag-stack (#177130)

Currently we do not emit lifetimes by default when compiling with
memtag-stack - which means we don't catch use-after-scope (when
compiling without optimization).

This patch fixes that by mirroring ASan, HWASan and MSan, and always
emitting lifetime markers. The patch is based on the changes made in
aeca569.

rdar://163713381
Delta    File
+15 -0   llvm/test/Transforms/InstCombine/lifetime-sanitizer.ll
+3 -0    clang/test/CodeGen/lifetime-sanitizer.c
+3 -0    clang/test/CodeGenCXX/lifetime-sanitizer.cpp
+2 -1    llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
+2 -1    clang/lib/CodeGen/CodeGenFunction.cpp
+1 -0    clang/lib/CodeGen/CGExpr.cpp
+26 -2   6 files

LLVM/project 87751d3 .github CODEOWNERS

[mlir] Add myself as CODEOWNER for `ControlFlowInterfaces`
Delta   File
+1 -0   .github/CODEOWNERS
+1 -0   1 files

LLVM/project 584e772 llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.960bit.ll

AMDGPU: Change ABI of 16-bit element vectors on gfx6/7

Fix ABI on old subtargets to match new subtargets, packing
16-bit element subvectors into 32-bit registers. Previously
this would be scalarized and promoted to i32/float.

Note this only changes the vector cases. Scalar i16/half are
still promoted to i32/float for now. I've unsuccessfully tried
to make that switch in the past, so leave that for later.

This will help with removal of softPromoteHalfType.
Delta               File
+47,697 -51,378     llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+14,474 -16,242     llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+16,328 -12,881     llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+13,036 -14,705     llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+11,668 -13,311     llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.832bit.ll
+10,558 -11,908     llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.768bit.ll
+113,761 -120,425   151 files not shown
+200,132 -204,069   157 files
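
To make the packing in 584e772 concrete, here is a minimal sketch of two 16-bit elements sharing one 32-bit register. It is not code from the patch: the helper names are hypothetical, and placing element 0 in the low 16 bits is an assumption of this sketch.

```cpp
#include <cstdint>

// Hypothetical helper: pack a <2 x i16> value into one 32-bit register.
// Assumption: element 0 occupies the low 16 bits, element 1 the high 16 bits.
constexpr std::uint32_t packV2I16(std::uint16_t elt0, std::uint16_t elt1) {
  return static_cast<std::uint32_t>(elt0) |
         (static_cast<std::uint32_t>(elt1) << 16);
}

// Hypothetical helper: read element 0 or 1 back out of the packed register.
constexpr std::uint16_t unpackElt(std::uint32_t reg, unsigned idx) {
  return static_cast<std::uint16_t>(reg >> (16u * (idx & 1u)));
}

static_assert(packV2I16(0x1234, 0xABCD) == 0xABCD1234u,
              "both elements share one 32-bit value");
static_assert(unpackElt(0xABCD1234u, 1) == 0xABCD,
              "the high element survives the round trip");
```

The previous scalarize-and-promote scheme would instead spend one full 32-bit register per element.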

LLVM/project 8d2c40c llvm/test/CodeGen/AMDGPU/GlobalISel orn2.ll andn2.ll

cleanup regression
Delta    File
+8 -3    llvm/test/CodeGen/AMDGPU/GlobalISel/orn2.ll
+8 -3    llvm/test/CodeGen/AMDGPU/GlobalISel/andn2.ll
+16 -6   2 files

LLVM/project 81ec09c llvm/test/CodeGen/AMDGPU fneg-combines.f16.ll bf16.ll

AMDGPU: Change ABI of 16-bit scalar values for gfx6/gfx7

Keep bf16/f16 values encoded as the low half of a 32-bit register,
instead of promoting to float. This avoids unwanted FP effects
from the fpext/fptrunc which should not be implied by just
passing an argument. This also fixes ABI divergence between
SelectionDAG and GlobalISel.

I've wanted to make this change for ages, and failed the last
few times. The main complication was the hack to return
shader integer types in SGPRs, which now needs to inspect
the underlying IR type.
Delta           File
+372 -419       llvm/test/CodeGen/AMDGPU/fneg-combines.f16.ll
+247 -430       llvm/test/CodeGen/AMDGPU/bf16.ll
+116 -174       llvm/test/CodeGen/AMDGPU/fcopysign.bf16.ll
+139 -139       llvm/test/CodeGen/AMDGPU/constant-address-space-32bit.ll
+112 -153       llvm/test/CodeGen/AMDGPU/select-fabs-fneg-extract.f16.ll
+140 -114       llvm/test/CodeGen/AMDGPU/fcopysign.f16.ll
+1,126 -1,429   81 files not shown
+3,579 -4,360   87 files

LLVM/project 85d881b llvm/lib/CodeGen/GlobalISel CallLowering.cpp

suggestion cleanup
Delta     File
+12 -29   llvm/lib/CodeGen/GlobalISel/CallLowering.cpp
+12 -29   1 files

LLVM/project a6418f0 llvm/lib/CodeGen/GlobalISel CallLowering.cpp

suggestion
Delta    File
+10 -0   llvm/lib/CodeGen/GlobalISel/CallLowering.cpp
+10 -0   1 files

LLVM/project 0fec035 llvm/lib/CodeGen/GlobalISel CallLowering.cpp

Suggestion with TypeSize
Delta   File
+9 -4   llvm/lib/CodeGen/GlobalISel/CallLowering.cpp
+9 -4   1 files

LLVM/project 9232a0b llvm/lib/CodeGen/GlobalISel CallLowering.cpp

GlobalISel: Fix mishandling vector-as-scalar in return values

This fixes 2 cases when the AMDGPU ABI is fixed to pass <2 x i16>
values as packed on gfx6/gfx7. The ABI does not pack values
currently; this is a pre-fix for that change.

Insert a bitcast if there is a single part with a different size.
Previously this would miscompile by going through the scalarization
and extend path, dropping the high element.

Also fix assertions in odd cases, like <3 x i16> -> i32. This needs
to unmerge with excess elements from the widened source vector.

All of this code is in need of a cleanup; this should look more
like the DAG version using getVectorTypeBreakdown.
Delta    File
+24 -2   llvm/lib/CodeGen/GlobalISel/CallLowering.cpp
+24 -2   1 files

LLVM/project af4f5ee llvm/include/llvm/CodeGen SDPatternMatch.h, llvm/unittests/CodeGen SelectionDAGPatternMatchTest.cpp

[DAG] SDPatternMatch - add m_Negative/m_StrictlyPositive/m_NonNegative/m_NonPositive/m_NonZero matchers (#175191)

Resolves #174327
Delta     File
+120 -0   llvm/unittests/CodeGen/SelectionDAGPatternMatchTest.cpp
+63 -0    llvm/include/llvm/CodeGen/SDPatternMatch.h
+183 -0   2 files

LLVM/project 735f7b3 llvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 combine-bzhi.ll

[X86] computeKnownBitsForTargetNode - add basic X86ISD::BZHI handling (#177347)

Currently limited to constant masks: if the mask (truncated to i8) is
less than the bitwidth then it will zero the upper bits.

So far it mainly just handles BZHI(X,0) -> 0 and BZHI(C1,C2) constant
folding.

All the BMI node combines seem to just call SimplifyDemandedBits - so
I've merged them into a single combineBMI.
Delta     File
+30 -22   llvm/lib/Target/X86/X86ISelLowering.cpp
+0 -6     llvm/test/CodeGen/X86/combine-bzhi.ll
+30 -28   2 files
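
As a rough illustration of the BZHI behaviour that 735f7b3 reasons about, the sketch below models the 32-bit instruction in plain C++ and derives which result bits are known zero for a constant index. The function names are hypothetical and this is not the code added to X86ISelLowering.cpp.

```cpp
#include <cstdint>

// Software model of 32-bit BZHI: keep bits [n-1:0] of src and zero the rest,
// where n is the low 8 bits of the index operand. If n >= 32, src is unchanged.
constexpr std::uint32_t bzhi32(std::uint32_t src, std::uint32_t index) {
  return (index & 0xFFu) < 32u
             ? (src & ((std::uint32_t{1} << (index & 0xFFu)) - 1u))
             : src;
}

// Hypothetical helper: mask of result bits known to be zero when the index
// is a compile-time constant (the case the commit message says is handled).
constexpr std::uint32_t bzhi32KnownZero(std::uint32_t constIndex) {
  return (constIndex & 0xFFu) < 32u
             ? ~((std::uint32_t{1} << (constIndex & 0xFFu)) - 1u)
             : 0u;
}

static_assert(bzhi32(0xFFFFFFFFu, 0) == 0u, "BZHI(X,0) -> 0");
static_assert(bzhi32KnownZero(8) == 0xFFFFFF00u, "upper 24 bits known zero");
```

With index 0 every bit is known zero, which matches the `BZHI(X,0) -> 0` fold mentioned in the message.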

LLVM/project 9f536c7 llvm/lib/Target/AMDGPU AMDGPU.td GCNSubtarget.h

[NFC][AMDGPU] Remove unused `FeatureDisable` (#177288)

Delta   File
+0 -6   llvm/lib/Target/AMDGPU/AMDGPU.td
+0 -3   llvm/lib/Target/AMDGPU/GCNSubtarget.h
+0 -9   2 files

LLVM/project 02d34a7 llvm/lib/Target/AMDGPU GCNSubtarget.h AMDGPU.td

[NFCI][AMDGPU] Remove more redundant code from `GCNSubtarget.h` (#177297)

We are getting pretty close to using `GET_SUBTARGETINFO_MACRO` in the
header with this cleanup.
Delta      File
+22 -58    llvm/lib/Target/AMDGPU/GCNSubtarget.h
+30 -46    llvm/lib/Target/AMDGPU/AMDGPU.td
+11 -11    llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+3 -3      llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+3 -3      llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
+2 -4      llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
+71 -125   5 files not shown
+78 -133   11 files

LLVM/project a81d2bf llvm/lib/Transforms/InstCombine InstCombineLoadStoreAlloca.cpp, llvm/test/Transforms/InstCombine load-addrspacecast-select.ll

[InstCombine] Propagate profiles when folding addrscast through loads (#177214)

#176352 introduced a new fold and a new test for this functionality.
Given the select condition is the same before and after, we can
propagate any profile information that may be attached to the select
instruction. We should not need to explicitly drop any metadata off the
select.
Delta    File
+11 -4   llvm/test/Transforms/InstCombine/load-addrspacecast-select.ll
+6 -1    llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
+0 -1    llvm/utils/profcheck-xfail.txt
+17 -6   3 files

LLVM/project d5899cc clang/docs LanguageExtensions.rst, clang/include/clang/Basic Builtins.td

[Clang] Rename `uinc_wrap` and add normal atomic builtin (#177253)

Summary:
The `__scoped_atomic` builtins are supposed to match the standard
GNU-flavored `__atomic` builtins. We added a scoped builtin without a
corresponding standard one before the fork so this should be added in
the release candidate. These were originally added in
https://github.com/llvm/llvm-project/pull/168666

Also, the name `uinc_wrap` does not follow the naming convention. The
GNU atomics use `fetch_xyz` to indicate that the builtin returns the
previous location's value as part of the RMW operation, which these do.
This PR renames it and its uses.
Delta     File
+14 -12   clang/docs/LanguageExtensions.rst
+16 -4    clang/include/clang/Basic/Builtins.td
+10 -10   clang/test/Sema/scoped-atomic-ops.c
+12 -6    clang/lib/CodeGen/CGAtomic.cpp
+17 -0    clang/test/Sema/atomic-ops.c
+12 -0    clang/test/CodeGen/atomic-ops.c
+81 -32   7 files not shown
+98 -43   13 files
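
The `fetch_xyz` convention referenced in d5899cc means the operation returns the value that was stored before the update. The sketch below emulates a wrapping unsigned increment in that style with `std::atomic`, following the `uinc_wrap` semantics in the LLVM Language Reference (store 0 if the old value is `>=` the limit, else old + 1). The function name is hypothetical and is not the renamed Clang builtin.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical emulation of a fetch-style wrapping increment.
// Stores (old >= limit ? 0 : old + 1) and returns the previous value.
inline std::uint32_t fetchUincWrap(std::atomic<std::uint32_t>& obj,
                                   std::uint32_t limit) {
  std::uint32_t old = obj.load(std::memory_order_relaxed);
  // On failure compare_exchange_weak reloads `old`, so the desired value
  // is recomputed from the fresh contents on every iteration.
  while (!obj.compare_exchange_weak(old, old >= limit ? 0u : old + 1u,
                                    std::memory_order_seq_cst,
                                    std::memory_order_relaxed)) {
  }
  return old;
}
```

Called repeatedly with limit 2 on a counter starting at 0, the stored value cycles through 1, 2, 0 while each call returns the value it replaced.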

LLVM/project f508e9e clang/include/clang/Analysis/Analyses/LifetimeSafety Facts.h, clang/lib/Analysis/LifetimeSafety FactsGenerator.cpp

Detect dangling fields
Delta       File
+313 -315   clang/lib/Sema/AnalysisBasedWarnings.cpp
+110 -0     clang/test/Sema/warn-lifetime-safety-dangling-field.cpp
+43 -4      clang/include/clang/Analysis/Analyses/LifetimeSafety/Facts.h
+39 -1      clang/lib/Analysis/LifetimeSafety/FactsGenerator.cpp
+0 -28      clang/test/Analysis/lifetime-cfg-output.cpp
+16 -9      clang/test/Sema/warn-lifetime-analysis-nocfg.cpp
+521 -357   12 files not shown
+600 -378   18 files

LLVM/project 925b033 llvm/lib/Transforms/Coroutines CoroFrame.cpp

[CoroFrame][NFC] Create more helper functions for insertSpills (#177149)

This allows us to delete some variables and simplify the core loop of
insertSpills.
Delta     File
+37 -28   llvm/lib/Transforms/Coroutines/CoroFrame.cpp
+37 -28   1 files

LLVM/project 9568772 llvm/test/CodeGen/AMDGPU llvm.amdgcn.mfma.scale.f32.32x32x64.f8f6f4.ll mfma-loop.ll

AMDGPU: Select VGPR MFMAs by default (#159493)

AGPRs are undesirable since they are only usable by a
handful of instructions like loads, stores and mfmas and everything
else requires copies to/from VGPRs. Using the AGPR form should be
a measure of last resort if we must use more than 256 VGPRs.
Delta             File
+2,436 -4,283     llvm/test/CodeGen/AMDGPU/llvm.amdgcn.mfma.scale.f32.32x32x64.f8f6f4.ll
+548 -1,962       llvm/test/CodeGen/AMDGPU/mfma-loop.ll
+2,297 -2         llvm/test/CodeGen/AMDGPU/llvm.amdgcn.smfmac.gfx950.ll
+1,018 -1,120     llvm/test/CodeGen/AMDGPU/llvm.amdgcn.mfma.ll
+540 -740         llvm/test/CodeGen/AMDGPU/llvm.amdgcn.mfma.gfx942.ll
+168 -1,050       llvm/test/CodeGen/AMDGPU/llvm.amdgcn.mfma.scale.f32.16x16x128.f8f6f4.ll
+7,007 -9,157     22 files not shown
+10,151 -11,733   28 files

LLVM/project 9357c59 llvm/lib/Target/SPIRV SPIRVPrepareGlobals.cpp SPIRVGlobalRegistry.cpp, llvm/test/CodeGen/SPIRV fembed-bitcode-marker.ll hip_dyn_lds.ll

[SPIRV] Unify unsized array handling for AMDGCN flavoured SPIR-V (#175848)

Currently we handle 0-sized arrays in multiple places, non-uniformly,
either via `SPIRVLegalizeZeroSizeArrays` or via `SPIRVPrepareGlobals`.
For AMDGCN flavoured SPIR-V we have a singular, simpler solution: set
all 0-sized arrays to be `UINT64_MAX` sized. This is an unambiguous
token that we can use during reverse translation to restore the intended
0 size.
Delta     File
+0 -55    llvm/lib/Target/SPIRV/SPIRVPrepareGlobals.cpp
+12 -9    llvm/test/CodeGen/SPIRV/fembed-bitcode-marker.ll
+12 -0    llvm/lib/Target/SPIRV/SPIRVGlobalRegistry.cpp
+3 -4     llvm/lib/Target/SPIRV/SPIRVLegalizeZeroSizeArrays.cpp
+3 -2     llvm/test/CodeGen/SPIRV/hip_dyn_lds.ll
+30 -70   5 files
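
The sentinel idea in 9357c59 can be sketched as a pair of mapping functions: a 0-sized array is emitted with size `UINT64_MAX`, and reverse translation maps that token back to 0. This is only an illustration of the round trip under that description; the names are hypothetical and the real logic lives in the SPIR-V backend and the reverse translator.

```cpp
#include <cstdint>

// Token standing in for "this array really has zero elements".
constexpr std::uint64_t kZeroSizeToken = UINT64_MAX;

// Hypothetical forward mapping applied when emitting the array type.
constexpr std::uint64_t encodeArraySize(std::uint64_t numElements) {
  return numElements == 0 ? kZeroSizeToken : numElements;
}

// Hypothetical reverse mapping applied during reverse translation.
constexpr std::uint64_t decodeArraySize(std::uint64_t emittedSize) {
  return emittedSize == kZeroSizeToken ? 0 : emittedSize;
}

static_assert(decodeArraySize(encodeArraySize(0)) == 0,
              "zero size survives the round trip");
static_assert(decodeArraySize(encodeArraySize(16)) == 16,
              "ordinary sizes pass through unchanged");
```

The scheme relies on no real array having `UINT64_MAX` elements, which is what makes the token unambiguous.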

LLVM/project de51bce clang/lib/CIR/Dialect/Transforms LoweringPrepareItaniumCXXABI.cpp LoweringPrepareCXXABI.h, clang/lib/CIR/Dialect/Transforms/TargetLowering LowerItaniumCXXABI.cpp

[CIR][NFC] Move ABI lowering of dynamic_cast to CXXABILowering (#176931)

This patch moves the ABI lowering for `dynamic_cast` from
LoweringPrepare to the new CXXABILowering pass. This effectively removes
ABI lowering code away from LoweringPrepare, thus the patch also removes
the LoweringPrepareCXXABI classes and files.

Related to #175968.
Delta       File
+0 -170     clang/lib/CIR/Dialect/Transforms/LoweringPrepareItaniumCXXABI.cpp
+167 -0     clang/lib/CIR/Dialect/Transforms/TargetLowering/LowerItaniumCXXABI.cpp
+0 -38      clang/lib/CIR/Dialect/Transforms/LoweringPrepareCXXABI.h
+1 -36      clang/lib/CIR/Dialect/Transforms/LoweringPrepare.cpp
+10 -0      clang/lib/CIR/Dialect/Transforms/CXXABILowering.cpp
+3 -3       clang/test/CIR/CodeGen/dynamic-cast.cpp
+181 -247   3 files not shown
+185 -248   9 files

LLVM/project ec28be3 lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx-simulators/optional TestDataFormatterLibcxxOptionalSimulator.py

[lldb][test] TestDataFormatterLibcxxOptionalSimulator.py: skip on older Clang versions
Delta   File
+1 -8   lldb/test/API/functionalities/data-formatter/data-formatter-stl/libcxx-simulators/optional/TestDataFormatterLibcxxOptionalSimulator.py
+1 -8   1 files

LLVM/project 2be02dc lldb/test/API/lang/cpp/template-alias TestTemplateAlias.py

[lldb][test] TestTemplateAlias.py: skip on older Clang version

The `-g[no-]template-alias` flag is not available on older versions.
Delta   File
+1 -0   lldb/test/API/lang/cpp/template-alias/TestTemplateAlias.py
+1 -0   1 files

LLVM/project 69fbba1 lldb/test/API/lang/cpp/template-alias TestTemplateAlias.py

[lldb][test] TestTemplateAlias.py: skip on older Clang versions

The `-g[no-]template-alias` flag is not available on older versions.
Delta   File
+3 -0   lldb/test/API/lang/cpp/template-alias/TestTemplateAlias.py
+3 -0   1 files

LLVM/project 088f88d lldb/source/ValueObject DILEval.cpp

[lldb] Fix crash when there is no compile unit. (#177278)

The crash occurred in lldb-dap when we are in a shared library with no
debug information and we are trying to get the expression path for an
address.
Delta   File
+3 -0   lldb/source/ValueObject/DILEval.cpp
+3 -0   1 files

LLVM/project dcbe8f1 clang/lib/CIR/CodeGen CIRGenExprAggregate.cpp CIRGenExprScalar.cpp, clang/test/CIR/CodeGen builtin-bit-cast.cpp

Address comments from Andy
Delta     File
+7 -8     clang/lib/CIR/CodeGen/CIRGenExprAggregate.cpp
+2 -8     clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp
+4 -0     clang/test/CIR/CodeGen/builtin-bit-cast.cpp
+13 -16   3 files

LLVM/project 1a9b901 clang/lib/CIR/CodeGen CIRGenExprAggregate.cpp CIRGenExprScalar.cpp, clang/test/CIR/CodeGen builtin-bit-cast.cpp

[Clang][CIR] Implement CIRGen logic for __builtin_bit_cast

NOTE: This patch merely upstreams code from
  * https://github.com/llvm/clangir.

This Op was originally implemented by Sirui Mu in #762. Further
modifications were made by other ClangIR contributors.

co-authored-by: Sirui Mu <msrlancern at gmail.com>
Delta     File
+135 -0   clang/test/CIR/CodeGen/builtin-bit-cast.cpp
+24 -0    clang/lib/CIR/CodeGen/CIRGenExprAggregate.cpp
+17 -0    clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp
+176 -0   3 files
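
For readers unfamiliar with the builtin that 1a9b901 teaches CIRGen to handle, here is a small source-level use of `__builtin_bit_cast`, which reinterprets an object's bytes as another type of the same size. The helper names are hypothetical, and the bit patterns mentioned in the comments assume an IEEE-754 binary32 `float`.

```cpp
#include <cstdint>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Hypothetical helper: view the object representation of a float as an integer.
// Assuming IEEE-754 binary32, 1.0f maps to 0x3F800000.
constexpr std::uint32_t floatBits(float value) {
  return __builtin_bit_cast(std::uint32_t, value);
}

// Hypothetical helper: the inverse direction, integer bits back to a float.
constexpr float bitsToFloat(std::uint32_t bits) {
  return __builtin_bit_cast(float, bits);
}
```

Unlike a pointer cast, the builtin copies the representation by value, so it can be used in constant expressions.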

LLVM/project 699792b clang/include/clang/CIR/Dialect/IR CIROps.td, clang/lib/CIR/CodeGen CIRGenBuilder.h

[CIR] Add cir.libc.memcpy Op (#176781)

The operation is a 1:1 mapping to libc's memcpy.

NOTE: This patch upstreams code from
  * https://github.com/llvm/clangir.

This Op was originally implemented by Vinicius Couto Espindola
in https://github.com/llvm/clangir/pull/237. Further
modifications were made by other ClangIR contributors.

Co-authored-by: Vinicius Couto Espindola <vini.couto.e at gmail.com>
Delta     File
+52 -0    clang/include/clang/CIR/Dialect/IR/CIROps.td
+37 -0    clang/test/CIR/IR/invalid-memcpy.cir
+12 -0    clang/test/CIR/Lowering/libc.cir
+10 -0    clang/lib/CIR/CodeGen/CIRGenBuilder.h
+10 -0    clang/test/CIR/IR/libc-memcpy.cir
+9 -0     clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp
+130 -0   6 files

LLVM/project 6024031 llvm/test/CodeGen/AMDGPU llvm.amdgcn.mfma.scale.f32.32x32x64.f8f6f4.ll mfma-loop.ll

AMDGPU: Select VGPR MFMAs by default

AGPRs are undesirable since they are only usable by a
handful of instructions like loads, stores and mfmas and everything
else requires copies to/from VGPRs. Using the AGPR form should be
a measure of last resort if we must use more than 256 VGPRs.
Delta             File
+2,436 -4,283     llvm/test/CodeGen/AMDGPU/llvm.amdgcn.mfma.scale.f32.32x32x64.f8f6f4.ll
+548 -1,962       llvm/test/CodeGen/AMDGPU/mfma-loop.ll
+2,297 -2         llvm/test/CodeGen/AMDGPU/llvm.amdgcn.smfmac.gfx950.ll
+1,018 -1,120     llvm/test/CodeGen/AMDGPU/llvm.amdgcn.mfma.ll
+540 -740         llvm/test/CodeGen/AMDGPU/llvm.amdgcn.mfma.gfx942.ll
+168 -1,050       llvm/test/CodeGen/AMDGPU/llvm.amdgcn.mfma.scale.f32.16x16x128.f8f6f4.ll
+7,007 -9,157     22 files not shown
+10,151 -11,733   28 files