LLVM/project b1c4b55 llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.960bit.ll

RenameIndependentSubregs: try to only implicit def used subregs (#167486)

Attempt to define only the used subregisters when creating IMPLICIT_DEF
fixups for live interval subranges. This avoids entire (wide) registers
appearing to become live at the MIR level, instead of relying only
on transient LiveIntervals dead definitions for the unused subregisters.
Delta  File
+5,420 -8,636  llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+768 -2,280  llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+768 -2,256  llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+752 -2,232  llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.832bit.ll
+736 -2,168  llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.768bit.ll
+712 -2,104  llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.704bit.ll
+9,156 -19,676  6 files not shown
+10,077 -21,947  12 files

LLVM/project 94e4ee3 llvm/lib/Target/AMDGPU GCNSchedStrategy.cpp, llvm/test/CodeGen/AMDGPU machine-scheduler-sink-trivial-remats.mir

[AMDGPU] Fixed crash in getLastMIForRegion when the region is empty. (#168653)

PreRARematStage builds region live-outs if GCN trackers are enabled. If
rematerialization leads to empty regions, this can cause a crash because
of dereference of an invalid iterator in getLastMIForRegion. The fix is
to skip calling getLastMIForRegion for empty regions.

This patch also fixes another bug in the same code region. getLastMIForRegion
calls skipDebugInstructionsBackward, which may immediately return
RegionEnd if it is not the begin instruction and it is a non-debug
instruction. That would mean considering an instruction that is outside
the relevant region. The fix is to always pass the iterator preceding
RegionEnd to skipDebugInstructionsBackward.
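
The off-by-one described above can be illustrated with a small Python model. This is only a sketch, not the LLVM code: instructions are strings in a list, a region is the half-open index range `[begin, end)`, and `is_debug`, `skip_debug_backward`, and `last_mi_for_region` are hypothetical stand-ins that mimic the behavior the commit message describes.

```python
def is_debug(instr):
    # Hypothetical convention for this sketch: debug instructions
    # are the ones whose name starts with "DBG_".
    return instr.startswith("DBG_")


def skip_debug_backward(instrs, it, begin):
    # Walk backward from index `it` while it points at a debug
    # instruction, never moving past `begin`. If `it` already points at
    # a non-debug instruction, it is returned unchanged.
    while it != begin and is_debug(instrs[it]):
        it -= 1
    return it


def last_mi_for_region(instrs, begin, end):
    # Empty region: there is no last instruction, so do not step back.
    if begin == end:
        return None
    # Start from the element before `end`, so `end` is never considered.
    return skip_debug_backward(instrs, end - 1, begin)


block = ["v_add", "DBG_VALUE", "s_endpgm"]
# Region [0, 2) covers "v_add" and "DBG_VALUE"; index 2 lies outside it.
buggy = skip_debug_backward(block, 2, 0)  # starts at `end` itself
fixed = last_mi_for_region(block, 0, 2)   # starts at `end - 1`
print(block[buggy], block[fixed])
```

Starting the backward walk at `end` returns the out-of-region instruction, while starting at `end - 1` returns the last non-debug instruction inside the region.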

This bug was found while using GCN trackers on the existing LIT test
machine-scheduler-sink-trivial-remats.mir. Here's the assertion failure.

llvm-project/llvm/include/llvm/ADT/ilist_iterator.h:168:
llvm::ilist_iterator<OptionsT, IsReverse, IsConst>::reference

    [4 lines not shown]
Delta  File
+4,325 -0  llvm/test/CodeGen/AMDGPU/machine-scheduler-sink-trivial-remats.mir
+12 -9  llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+4,337 -9  2 files

LLVM/project af73aea mlir/include/mlir/Dialect/Vector/IR VectorOps.td, mlir/lib/Dialect/Vector/IR VectorOps.cpp

[MLIR][Vector] Add unroll pattern for vector.shape_cast (#167738)

This PR adds a pattern for unrolling shape_cast given a targetShape. It
is a follow-up to #164010, which was very general and used
inserts and extracts on each element (which is also what
LowerVectorShapeCast.cpp does).
After doing some more research on use cases, we (me and @Jianhui-Li)
realized that the previous version in #164010 is unnecessarily generic
and doesn't fit our performance needs.

Our use case requires that targetShape is contiguous in both source and
result vector.

This pattern only applies when contiguous slices can be extracted from
the source vector and inserted into the result vector such that each
slice remains in vector form with targetShape (and is not decomposed to
scalars). In these cases, the unrolling proceeds as:

vector.extract_strided_slice -> vector.shape_cast (on the slice
unrolled) -> vector.insert_strided_slice
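
The extract, cast, insert sequence above can be modeled on nested Python lists. This is a sketch of one special case only, assuming a 2-D row-major source whose result tile is a single full row; `unroll_shape_cast_rows` is a hypothetical helper, not the pattern added in `VectorUnroll.cpp`.

```python
def unroll_shape_cast_rows(src, result_cols):
    """Cast a 2-D row-major vector into rows of `result_cols` elements,
    one tile at a time. Hypothetical helper for illustration only."""
    src_cols = len(src[0])
    # The tile must be contiguous in both source and result: each result
    # row has to be made of a whole number of complete source rows.
    if result_cols <= 0 or result_cols % src_cols != 0:
        raise ValueError("tile is not contiguous in the source")
    rows_per_tile = result_cols // src_cols
    if len(src) % rows_per_tile != 0:
        raise ValueError("source does not split into whole tiles")

    result = []
    for start in range(0, len(src), rows_per_tile):
        # Models vector.extract_strided_slice: take a block of full rows.
        tile = src[start:start + rows_per_tile]
        # Models vector.shape_cast on the slice: flatten it to one row.
        cast_tile = [element for row in tile for element in row]
        # Models vector.insert_strided_slice: place the row in the result.
        result.append(cast_tile)
    return result


source = [[0, 1], [2, 3], [4, 5], [6, 7]]  # stands in for a 4x2 vector
print(unroll_shape_cast_rows(source, 4))   # a 2x4 result, built per tile
```

Each tile stays a vector throughout; no step touches individual scalars, which is the property the commit message asks for.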
Delta  File
+191 -2  mlir/lib/Dialect/Vector/Transforms/VectorUnroll.cpp
+79 -0  mlir/test/Dialect/Vector/vector-unroll-options.mlir
+22 -0  mlir/test/lib/Dialect/Vector/TestVectorTransforms.cpp
+4 -0  mlir/lib/Dialect/Vector/IR/VectorOps.cpp
+1 -0  mlir/include/mlir/Dialect/Vector/IR/VectorOps.td
+297 -2  5 files

LLVM/project 7de59f0 mlir/lib/Conversion/XeGPUToXeVM XeGPUToXeVM.cpp, mlir/test/Conversion/XeGPUToXeVM loadstore_nd.mlir loadstore_1d.mlir

[MLIR][Conversion] XeGPU to XeVM: Use adaptor for getting base address from memref. (#168610)

The adaptor already lowers the memref to its base address.
Conversion patterns should use it instead of generating code to get the
base address from the memref.
Delta  File
+9 -3  mlir/test/Conversion/XeGPUToXeVM/loadstore_nd.mlir
+5 -5  mlir/test/Conversion/XeGPUToXeVM/loadstore_1d.mlir
+3 -4  mlir/lib/Conversion/XeGPUToXeVM/XeGPUToXeVM.cpp
+3 -3  mlir/test/Conversion/XeGPUToXeVM/create_nd_tdesc.mlir
+20 -15  4 files

LLVM/project ef0cd1d clang/lib/CIR/CodeGen CIRGenItaniumCXXABI.cpp CIRGenCoroutine.cpp, clang/lib/CIR/Dialect/Transforms LoweringPrepare.cpp

[CIR][NFC] Fix warnings in release builds (#168791)

This fixes several warnings that occur in CIR release builds.
Delta  File
+2 -1  clang/lib/CIR/CodeGen/CIRGenItaniumCXXABI.cpp
+1 -1  clang/lib/CIR/CodeGen/CIRGenCoroutine.cpp
+1 -1  clang/lib/CIR/CodeGen/CIRGenException.cpp
+1 -1  clang/lib/CIR/CodeGen/CIRGenExprCXX.cpp
+1 -1  clang/lib/CIR/Dialect/Transforms/LoweringPrepare.cpp
+6 -5  5 files

LLVM/project ff39d59 compiler-rt/test/asan/TestCases stack_container_dynamic_lib.cpp

Disable test under GCC (#168792)

The new test stack_container_dynamic_lib.cpp has errors under GCC.

Require clang while a better fix is investigated.
Delta  File
+3 -0  compiler-rt/test/asan/TestCases/stack_container_dynamic_lib.cpp
+3 -0  1 file

LLVM/project 8359513 clang/lib/Tooling/DependencyScanning DependencyScanningFilesystem.cpp, clang/unittests/Tooling/DependencyScanning DependencyScanningFilesystemTest.cpp

[clang][deps] Enable calling `DepScanFile::getBuffer()` repeatedly (#168789)

This PR makes it possible to call `getBuffer()` on `DepScanFile` (a
`llvm::vfs::File`) repeatedly. Previously, this function would return a
moved-from `unique_ptr`. This doesn't fix any existing bugs; I
discovered it while experimenting with the VFSs in the scanner. Note
that the returned instances of `llvm::MemoryBuffer` are non-owning and
share the underlying buffer storage.
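
The difference between the two behaviors can be sketched in Python, with `memoryview` standing in for a non-owning buffer. Both classes are hypothetical models written for this illustration; they are not the `DepScanFile` implementation, and Python's `None` only loosely approximates a moved-from `unique_ptr`.

```python
class OneShotFile:
    """Models the old behavior: the buffer is handed over exactly once."""

    def __init__(self, data):
        self._buffer = memoryview(data)

    def get_buffer(self):
        # Hand out the stored buffer and leave nothing behind, so a
        # second call has nothing useful to return.
        buffer, self._buffer = self._buffer, None
        return buffer


class RepeatableFile:
    """Models the new behavior: each call returns a fresh non-owning view."""

    def __init__(self, data):
        self._data = data

    def get_buffer(self):
        # A new view object every time, all sharing the same storage.
        return memoryview(self._data)


contents = b"#include <a.h>\n"
old = OneShotFile(contents)
new = RepeatableFile(contents)
first_old, second_old = old.get_buffer(), old.get_buffer()
first_new, second_new = new.get_buffer(), new.get_buffer()
print(second_old, bytes(second_new))
```

In the repeatable variant the two views are distinct objects over one underlying byte string, which mirrors the note that the returned buffers are non-owning and share storage.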
Delta  File
+33 -0  clang/unittests/Tooling/DependencyScanning/DependencyScanningFilesystemTest.cpp
+2 -1  clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
+35 -1  2 files

LLVM/project 80f862b clang/lib/CIR/CodeGen CIRGenBuiltinX86.cpp, clang/test/CIR/CodeGen/X86 lzcnt-builtins.c bmi-builtins.c

[CIR] Upstream CIR codegen for `lzcnt` and `tzcnt` x86 builtins (#168479)

Support CIR codegen for x86 builtins `__builtin_ia32_lzcnt` and
`__builtin_ia32_tzcnt`.
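
For readers unfamiliar with the operations, the arithmetic of the x86 LZCNT and TZCNT instructions can be sketched in Python. This models only the instruction semantics (including the result for a zero input, which is the operand width); the function names and the `width` parameter are illustrative and say nothing about how the CIR codegen is written.

```python
def lzcnt(value, width=32):
    """Count leading zero bits of `value` viewed as a `width`-bit integer."""
    value &= (1 << width) - 1
    # bit_length() is the position of the highest set bit (0 for zero),
    # so a zero input yields `width`.
    return width - value.bit_length()


def tzcnt(value, width=32):
    """Count trailing zero bits of `value` viewed as a `width`-bit integer."""
    value &= (1 << width) - 1
    if value == 0:
        return width
    # value & -value isolates the lowest set bit.
    return (value & -value).bit_length() - 1


print(lzcnt(1), lzcnt(0), tzcnt(8), tzcnt(0))
```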
Delta  File
+67 -0  clang/test/CIR/CodeGen/X86/lzcnt-builtins.c
+49 -0  clang/test/CIR/CodeGen/X86/bmi-builtins.c
+16 -3  clang/lib/CIR/CodeGen/CIRGenBuiltinX86.cpp
+132 -3  3 files

LLVM/project 82380f3 llvm/lib/Target/AMDGPU SIRegisterInfo.h SIRegisterInfo.td, llvm/test/CodeGen/AMDGPU regalloc-spill-wmma-scale.ll

[AMDGPU] Prioritize allocation of low 256 VGPR classes (#167978)

If we have 1024 VGPRs available, we need to give allocation priority to
the register classes whose operands can only use the low 256.
These are notably the scale operands of V_WMMA_SCALE instructions.
Otherwise large tuples will be allocated first and take all the low
registers, so we would have to spill to make room for these
scale registers.

Allocation priority by itself does not eliminate spilling completely
in large kernels, although it helps to some degree. Increasing the spill
weight of a restricted class on top of it helps.
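
The ordering effect described above can be seen in a toy model. The sketch below assumes a simple bump allocator and scaled-down numbers (16 registers, of which the low 4 are the restricted range); `allocate` and the request tuples are hypothetical and unrelated to LLVM's actual register allocator.

```python
def allocate(requests, total_regs, low_limit, restricted_first):
    """Toy bump allocator. Each request is (name, size, low_only).
    Returns {name: first_register}, with None modeling a spill."""
    order = list(requests)
    if restricted_first:
        # Stable sort: requests restricted to the low range go first.
        order.sort(key=lambda request: not request[2])

    next_free = 0
    assignment = {}
    for name, size, low_only in order:
        limit = low_limit if low_only else total_regs
        if next_free + size <= limit:
            assignment[name] = next_free
            next_free += size
        else:
            assignment[name] = None  # no room left in the allowed range
    return assignment


requests = [("tuple", 4, False), ("scale", 1, True)]
naive = allocate(requests, total_regs=16, low_limit=4, restricted_first=False)
prioritized = allocate(requests, total_regs=16, low_limit=4, restricted_first=True)
print(naive, prioritized)
```

When the wide tuple is placed first it consumes the whole low range and the restricted operand has nowhere to go; placing the restricted operand first lets both fit.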
Delta  File
+11 -0  llvm/lib/Target/AMDGPU/SIRegisterInfo.h
+2 -3  llvm/test/CodeGen/AMDGPU/regalloc-spill-wmma-scale.ll
+1 -1  llvm/lib/Target/AMDGPU/SIRegisterInfo.td
+14 -4  3 files

LLVM/project 03f4d4d clang/include/clang/CIR/Dialect/IR CIRAttrs.td CIROps.td, clang/lib/CIR/CodeGen CIRGenModule.cpp

[CIR] Add CxxCTorAttr, CxxDTorAttr, CxxAssignAttr, CxxSpecialMemberAttr to cir::FuncOp (#167975)

This PR adds a special member attribute to `cir::FuncOp`. This attribute
is also present in the incubator repo. Additionally, I added an
"is_trivial" flag to mark trivial members. I think that might be useful
when trying to replace calls to the copy constructor with memcpy, for
example, but please let me know your thoughts on this. [Here in the
incubator
repo](https://github.com/llvm/clangir/blob/823e943d1b9aaba0fc46f880c5a6ac8c29fc761d/clang/lib/CIR/Dialect/Transforms/LoweringPrepare.cpp#L1537-L1550)
this function is called `LowerTrivialConstructorCall`, but I don't see a
check that ensures the constructor is actually trivial.
Delta  File
+113 -0  clang/include/clang/CIR/Dialect/IR/CIRAttrs.td
+76 -0  clang/lib/CIR/Dialect/IR/CIRDialect.cpp
+59 -0  clang/test/CIR/CodeGen/cxx-special-member-attr.cpp
+55 -0  clang/lib/CIR/CodeGen/CIRGenModule.cpp
+34 -0  clang/test/CIR/IR/func.cir
+30 -3  clang/include/clang/CIR/Dialect/IR/CIROps.td
+367 -3  4 files not shown
+383 -4  10 files

LLVM/project 1278d47 clang/include/clang/CIR MissingFeatures.h, clang/include/clang/CIR/Dialect/IR CIROps.td

[CIR] Upstream isfpclass op (#166037)

Ref commit in incubator: ee17ff67f3e567585db991cdad1159520c516bb4
 
There is a minor change in the assumption for emitting a direct callee.
In the incubator, `bool hasAttributeNoBuiltin = false`
(`llvm-project/clang/lib/CIR/CodeGen/CIRGenExpr.cpp:1671`), while
upstream it is `true`; therefore, the call to `finite(...)` is no longer
converted to a builtin.

Fixes #163892
Delta  File
+174 -0  clang/test/CIR/CodeGen/builtin-isfpclass.c
+92 -0  clang/lib/CIR/CodeGen/CIRGenBuiltin.cpp
+66 -0  clang/include/clang/CIR/Dialect/IR/CIROps.td
+12 -0  clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp
+5 -0  clang/lib/CIR/CodeGen/CIRGenBuilder.h
+1 -0  clang/include/clang/CIR/MissingFeatures.h
+350 -0  6 files

LLVM/project 2aa2290 clang-tools-extra/clang-tidy/readability DuplicateIncludeCheck.cpp DuplicateIncludeCheck.h, clang-tools-extra/docs ReleaseNotes.rst

[clang-tidy] Add `IgnoredFilesList` option to `readability-duplicate-include` (#168196)

Closes [#166938](https://github.com/llvm/llvm-project/issues/166938)
Delta  File
+37 -7  clang-tools-extra/clang-tidy/readability/DuplicateIncludeCheck.cpp
+24 -0  clang-tools-extra/test/clang-tidy/checkers/readability/duplicate-include-ignored-files.cpp
+7 -2  clang-tools-extra/clang-tidy/readability/DuplicateIncludeCheck.h
+9 -0  clang-tools-extra/docs/clang-tidy/checks/readability/duplicate-include.rst
+5 -0  clang-tools-extra/docs/ReleaseNotes.rst
+1 -0  clang-tools-extra/test/clang-tidy/checkers/readability/Inputs/duplicate-include/pack_end.h
+83 -9  1 file not shown
+84 -9  7 files

LLVM/project 8830525 llvm/lib/Analysis ConstantFolding.cpp, llvm/test/Transforms/InstSimplify/ConstProp vector-calls.ll

[ConstantFolding] Add constant folding for scalable vector interleave intrinsics. (#168668)

We can constant fold interleave of identical splat vectors to a larger
splat vector.
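
The identity behind this fold can be checked with plain Python lists. This is a sketch under simplifying assumptions: the lists have a fixed length (a scalable vector's length is not known at compile time), and `interleave` and `fold_interleave_of_splats` are hypothetical helpers rather than the code in `ConstantFolding.cpp`.

```python
def interleave(*vectors):
    """Element-wise interleave: a0, b0, a1, b1, ... for equal-length inputs."""
    return [element for group in zip(*vectors) for element in group]


def fold_interleave_of_splats(*vectors):
    """Return the folded larger splat, or None when the fold does not apply."""
    first = vectors[0]
    if not first:
        return None
    value = first[0]
    for vector in vectors:
        # Every operand must be a splat of the same value and length.
        if len(vector) != len(first) or any(e != value for e in vector):
            return None
    return [value] * (len(first) * len(vectors))


splat = [7, 7, 7, 7]
print(fold_interleave_of_splats(splat, splat) == interleave(splat, splat))
```

Because every lane holds the same value, the order in which lanes are interleaved cannot be observed, so the result is simply a splat with more lanes.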
Delta  File
+122 -0  llvm/test/Transforms/InstSimplify/ConstProp/vector-calls.ll
+16 -0  llvm/lib/Analysis/ConstantFolding.cpp
+138 -0  2 files

LLVM/project db1e73e clang/lib/CodeGen CGDebugInfo.cpp, clang/test/DebugInfo/CXX simple-template-names.cpp

[clang][DebugInfo] Mark _BitInt's as reconstitutable when emitting -gsimple-template-names (#168383)

Depends on:
* https://github.com/llvm/llvm-project/pull/168382

LLVM recently started including the bit-size as a `DW_AT_bit_size` (and
as part of `DW_AT_name`) of `_BitInt`s in DWARF. This allows us to mark
`_BitInt`s as "reconstitutable" when compiling with
`-gsimple-template-names`. We still only omit template parameters that
are `<= 64` bits wide, so support for `_BitInt`s larger than 64 bits is
not part of this patch.
Delta  File
+19 -3  clang/test/DebugInfo/CXX/simple-template-names.cpp
+0 -9  clang/lib/CodeGen/CGDebugInfo.cpp
+19 -12  2 files

LLVM/project be955e5 clang/lib/Sema SemaOpenACCClause.cpp, clang/test/SemaOpenACC declare-construct-ast.cpp

[OpenACC] Make sure 'link' gets the right node in the AST with ASE

Another miss when working through 'link': we didn't properly handle
giving the whole array-section expression or array index expression,
and instead allowed it to only get the decl-ref-expr. This patch makes
sure we don't add the wrong thing.
Delta  File
+6 -0  clang/test/SemaOpenACC/declare-construct-ast.cpp
+3 -1  clang/lib/Sema/SemaOpenACCClause.cpp
+9 -1  2 files

LLVM/project 7e85b79 llvm/lib/Target/SystemZ SystemZISelLowering.cpp

[SystemZ] Fix linux s390x main can't bootstrap itself on SanitizerSpecialCaseList.cpp #168088 (#168779)

This test has a long call chain in recursion. The search tree can be pruned
early by swapping the CC test and the recursive simplifyAssumingCCVal call.
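
The general effect of evaluating a cheap test before a recursive call can be shown with a toy search. This sketch is not the SystemZ code: `search`, the two-children-per-level tree, and the `cc_test` callback are all hypothetical, and only the short-circuit ordering is the point.

```python
def search(depth, cc_test, check_cc_first, counter):
    """Toy recursion with two children per level. `counter[0]` records
    how many nodes were visited."""
    counter[0] += 1
    if depth == 0:
        return True

    def recurse():
        return all(
            search(depth - 1, cc_test, check_cc_first, counter)
            for _ in range(2)
        )

    if check_cc_first:
        # Cheap test first: a failure prunes the whole subtree.
        return cc_test(depth) and recurse()
    # Recursion first: the subtree is explored even if the test fails.
    return recurse() and cc_test(depth)


def fails_at_root(depth):
    # Hypothetical cheap test that rejects only the top-level node.
    return depth != 3


slow_calls, fast_calls = [0], [0]
slow_result = search(3, fails_at_root, False, slow_calls)
fast_result = search(3, fails_at_root, True, fast_calls)
print(slow_calls[0], fast_calls[0])
```

Both orders compute the same answer, but only the test-first order avoids walking the subtree when the test rejects the node.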

Fixes: https://github.com/llvm/llvm-project/issues/168088
Co-authored-by: anoopkg6 <anoopkg6 at github.com>
Delta  File
+6 -4  llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
+6 -4  1 file

LLVM/project 3f6cbde llvm/utils/lit/lit TestRunner.py, llvm/utils/lit/tests shtest-env-positive.py

[lit] Add LIT_CURRENT_TESTCASE environment variable when running tests (#168762)

I'm not aware of any way for `%run` wrapper scripts like
`iossim_run.py`
([ref](https://github.com/llvm/llvm-project/blob/d2c7c6064259320def7a74e111079725958697d4/compiler-rt/test/sanitizer_common/ios_commands/iossim_run.py#L4))
to know what testcase they are currently running. This can be useful if
these wrappers need to create a (potentially remote) temporary directory
for each test case.

This adds the `LIT_CURRENT_TESTCASE` environment variable to both the
internal shell and the external shell, containing the full name of the
current test being run.
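
A wrapper script could consume the variable roughly as follows. This is a sketch, not part of the patch: the helper names, the `unknown-test` fallback, and the sanitizing rule are assumptions, and the sample value passed in the usage line is made up because the exact format of the test name is not shown here.

```python
import os
import re
import tempfile


def per_test_dir_name(environ=None):
    """Derive a filesystem-safe directory name from LIT_CURRENT_TESTCASE."""
    environ = os.environ if environ is None else environ
    # Fall back to a fixed name when the wrapper runs outside of lit.
    test_name = environ.get("LIT_CURRENT_TESTCASE", "unknown-test")
    # Collapse every run of characters that is unsafe in a path component.
    return re.sub(r"[^A-Za-z0-9._-]+", "_", test_name)


def make_per_test_dir(base_dir=None, environ=None):
    """Create a fresh temporary directory named after the current test."""
    prefix = per_test_dir_name(environ) + "-"
    return tempfile.mkdtemp(prefix=prefix, dir=base_dir)


# Hypothetical value, for illustration only.
sample_env = {"LIT_CURRENT_TESTCASE": "suite :: dir/test.cpp"}
print(per_test_dir_name(sample_env))
```

Passing the environment as a parameter keeps the helper easy to exercise without modifying `os.environ`.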
Delta  File
+8 -3  llvm/utils/lit/tests/shtest-env-positive.py
+6 -0  llvm/utils/lit/tests/Inputs/shtest-env-positive/env-current-testcase.txt
+5 -1  llvm/utils/lit/lit/TestRunner.py
+1 -0  llvm/utils/lit/tests/Inputs/shtest-env-positive/env-no-subcommand.txt
+20 -4  4 files

LLVM/project e99c83f cross-project-tests/debuginfo-tests/clang_llvm_roundtrip/Inputs simplified_template_names.cpp, llvm/include/llvm/DebugInfo/DWARF DWARFTypePrinter.h

[llvm][DebugInfo] Add support for _BitInt in DWARFTypePrinter (#168382)

LLVM recently started including the bit-size as a `DW_AT_bit_size` (and
as part of `DW_AT_name`) of `_BitInt`s in DWARF. This allows us to mark
`_BitInt`s as "reconstitutable" when compiling with
`-gsimple-template-names`. However, before doing so we need to make sure
the `DWARFTypePrinter` can reconstruct template parameter values that
have `_BitInt` type. This patch adds support for printing
`DW_TAG_template_value_parameter`s that have `_BitInt` type. Since
`-gsimple-template-names` only omits template parameters that are
`<= 64` bits wide, we don't support `_BitInt`s larger than 64 bits.
Delta  File
+7,387 -7,087  llvm/test/tools/llvm-dwarfdump/X86/simplified-template-names.s
+39 -4  llvm/include/llvm/DebugInfo/DWARF/DWARFTypePrinter.h
+6 -2  cross-project-tests/debuginfo-tests/clang_llvm_roundtrip/Inputs/simplified_template_names.cpp
+7,432 -7,093  3 files

LLVM/project 19fe9b4 clang/include/clang/Basic Builtins.td, clang/lib/Sema SemaHLSL.cpp SemaOverload.cpp

[HLSL][TableGen] Add `__hlsl_resource_t` to known built-in function types (#163465)

This change adds the resource handle type `__hlsl_resource_t` to the list of types recognized in Clang's built-in function prototype strings.

HLSL has built-in resource classes and some of them have many methods, such as
[Texture2D](https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/sm5-object-texture2d).
Most of these methods will be implemented by built-in functions that take a resource handle as an argument. This change enables us to move from a generic `void(...)` prototype string for these methods and explicit argument checking in `SemaHLSL.cpp` to a prototype string with explicit argument types. Argument checking in `SemaHLSL.cpp` can be reduced to handle just the rules that cannot be expressed in the prototype string (for example, verifying that the offset value in `__builtin_hlsl_buffer_update_counter` is `1` or `-1`).

To make this work, we now allow conversions from an attributed resource handle type such as `__hlsl_resource_t [[hlsl::resource_class(UAV)]] [[hlsl::contained_type(float)]]` to a plain non-attributed `__hlsl_resource_t` type.
Delta  File
+11 -54  clang/lib/Sema/SemaHLSL.cpp
+26 -26  clang/test/AST/HLSL/StructuredBuffers-AST.hlsl
+28 -14  clang/lib/Sema/SemaOverload.cpp
+8 -8  clang/test/AST/HLSL/ByteAddressBuffers-AST.hlsl
+8 -8  clang/include/clang/Basic/Builtins.td
+8 -8  clang/test/AST/HLSL/TypedBuffers-AST.hlsl
+89 -118  3 files not shown
+96 -120  9 files

LLVM/project 9b7fd00 clang/lib/Sema SemaOpenACCClause.cpp, clang/test/SemaOpenACC declare-construct.cpp

[OpenACC] Fix crash when checking a section in a 'link' clause (#168783)

I saw this while doing lowering: we were not properly looking into the
array sections for the variable. Presumably we didn't do a good job of
making sure we did this right when making this extension, and missed
this spot.
Delta  File
+8 -7  clang/lib/Sema/SemaOpenACCClause.cpp
+7 -0  clang/test/SemaOpenACC/declare-construct.cpp
+15 -7  2 files

LLVM/project 7d7cabd llvm/lib/Target/AMDGPU AMDGPUISelDAGToDAG.cpp, llvm/test/CodeGen/AMDGPU invariant-load-no-alias-store.ll

AMDGPU: Handle invariant loads when considering if a load can be scalar

Doesn't touch the globalisel version because the handling
there looks a bit broken.
Delta  File
+14 -1  llvm/test/CodeGen/AMDGPU/invariant-load-no-alias-store.ll
+2 -1  llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
+16 -2  2 files

LLVM/project 7bad49b llvm/test/CodeGen/AMDGPU load-select-ptr.ll select-vectors.ll, llvm/test/CodeGen/NVPTX i1-select.ll fast-math.ll

Reapply "DAG: Allow select ptr combine for non-0 address spaces" (#168292)

This reverts commit 6d5f87fc4284c4c22512778afaf7f2ba9326ba7b.

Previously this failed due to treating the unknown MachineMemOperand
value as known uniform.
Delta  File
+71 -76  llvm/test/CodeGen/AMDGPU/load-select-ptr.ll
+43 -34  llvm/test/CodeGen/NVPTX/i1-select.ll
+36 -28  llvm/test/CodeGen/NVPTX/fast-math.ll
+34 -29  llvm/test/CodeGen/AMDGPU/select-vectors.ll
+19 -38  llvm/test/CodeGen/NVPTX/lower-byval-args.ll
+15 -12  llvm/test/CodeGen/AMDGPU/select-load-to-load-select-ptr-combine.ll
+218 -217  6 files not shown
+260 -249  12 files

LLVM/project 3bf6501 llvm/lib/Target/AMDGPU AMDGPUISelDAGToDAG.cpp AMDGPUInstrInfo.cpp, llvm/test/CodeGen/AMDGPU load-select-ptr.ll

AMDGPU: Fix treating divergent loads as uniform

Avoids regression which caused the revert 6d5f87fc42.

This is a hack on a hack. We currently have isUniformMMO,
which improperly treats an unknown source value as known uniform.
This is a hack from before we had divergence information in the
DAG, and should be removed. This is the minimum change to avoid
the regression; removing the aggressive handling of the unknown
case (or dropping isUniformMMO entirely) are more involved fixes.
Delta  File
+84 -0  llvm/test/CodeGen/AMDGPU/load-select-ptr.ll
+11 -3  llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
+1 -0  llvm/lib/Target/AMDGPU/AMDGPUInstrInfo.cpp
+96 -3  3 files

LLVM/project 308185e llvm/utils/TableGen CallingConvEmitter.cpp

[NFC][TableGen] Use `IfGuardEmitter` in CallingConvEmitter (#168763)

Use `IfGuardEmitter` in CallingConvEmitter. Additionally refactor the
code a bit to extract duplicated code to emit the CC function prototype
into a helper function.
Delta  File
+21 -31  llvm/utils/TableGen/CallingConvEmitter.cpp
+21 -31  1 file

LLVM/project 3f55f8b clang/lib/CIR/CodeGen CIRGenExprConstant.cpp CIRGenExprCXX.cpp, clang/test/CIR/CodeGen ctor-null-init.cpp

[CIR] Handle non-empty null base class initialization (#168646)

This implements null base class initialization for non-empty bases.
Delta  File
+112 -8  clang/lib/CIR/CodeGen/CIRGenExprConstant.cpp
+60 -0  clang/test/CIR/CodeGen/ctor-null-init.cpp
+40 -2  clang/lib/CIR/CodeGen/CIRGenExprCXX.cpp
+10 -0  clang/lib/CIR/CodeGen/CIRGenBuilder.h
+6 -0  clang/lib/CIR/CodeGen/CIRGenModule.h
+5 -0  clang/lib/CIR/CodeGen/CIRGenRecordLayout.h
+233 -10  1 file not shown
+236 -10  7 files

LLVM/project 90ea49a llvm/lib/Analysis ConstantFolding.cpp, llvm/test/Transforms/InstSimplify/ConstProp vector-calls.ll

[ConstantFolding] Generalize constant folding for vector_deinterleave2 to deinterleave3-8. (#168640)
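
What a deinterleave by an arbitrary factor computes on a constant can be sketched with Python slicing. This models only the element selection (result `i` takes every `factor`-th element starting at index `i`); `deinterleave` is a hypothetical helper, not the folding code in `ConstantFolding.cpp`.

```python
def deinterleave(vector, factor):
    """Split `vector` into `factor` vectors; result i holds the elements
    at positions i, i + factor, i + 2 * factor, ..."""
    if factor < 2 or len(vector) % factor != 0:
        raise ValueError("length must be a multiple of the factor")
    return [vector[lane::factor] for lane in range(factor)]


print(deinterleave(list(range(8)), 2))
print(deinterleave(list(range(6)), 3))
```

The same slicing rule covers every factor, which is why the fold for a factor of 2 generalizes naturally to larger factors.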

Delta  File
+192 -0  llvm/test/Transforms/InstSimplify/ConstProp/vector-calls.ll
+33 -16  llvm/lib/Analysis/ConstantFolding.cpp
+225 -16  2 files

LLVM/project f2c9c7d clang/include/clang/AST StmtOpenACC.h, clang/lib/AST StmtOpenACC.cpp

[OpenACC][CIR] Fix atomic-capture single-line-postfix (#168717)

In my last patch, it became clear during code review that the postfix
operation was actually a read THEN update, not update/read like other
single line versions. It wasn't clear at the time how much additional
work this would be to make postfix work correctly (and they are a bit of
a 'special' thing in codegen anyway), so this patch adds some
functionality to sense this and special-cases it when generating the
statement info for capture.
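
The ordering difference can be made concrete with a small model of the two single-line capture forms. This is a sketch of the language-level semantics only (`v = ++x` versus `v = x++`); the helper names and the dictionary used as shared state are illustrative and unrelated to the Clang AST code.

```python
def capture_prefix(state):
    # Models "v = ++x": update first, then read the updated value.
    state["x"] += 1
    return state["x"]


def capture_postfix(state):
    # Models "v = x++": read first, then update.
    old_value = state["x"]
    state["x"] += 1
    return old_value


prefix_state = {"x": 5}
postfix_state = {"x": 5}
prefix_v = capture_prefix(prefix_state)
postfix_v = capture_postfix(postfix_state)
print(prefix_v, postfix_v, prefix_state["x"], postfix_state["x"])
```

Both forms leave `x` with the same final value; only the captured value differs, which is why the postfix form needs its read and update emitted in the opposite order.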
Delta  File
+18 -17  clang/lib/AST/StmtOpenACC.cpp
+9 -3  clang/include/clang/AST/StmtOpenACC.h
+2 -2  clang/test/CIR/CodeGenOpenACC/atomic-capture.cpp
+29 -22  3 files

LLVM/project 3b49c92 clang/lib/CIR/CodeGen CIRGenBuiltinX86.cpp

Fix build breakage from: #167948 (#168781)

It appears that this broke the build by not using the 'correct' name for
the expression. This is probably something that crossed in review.
Delta  File
+8 -8  clang/lib/CIR/CodeGen/CIRGenBuiltinX86.cpp
+8 -8  1 file

LLVM/project 040d9c9 llvm/lib/Transforms/Vectorize LoopVectorize.cpp

[VPlan] Collect FMFs for in-loop reduction chain in VPlan. (NFC)

Replace retrieving FMFs for in-loop reduction via underlying instruction
+ legal by collecting the flags during reduction chain traversal in
VPlan.
Delta  File
+14 -7  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+14 -7  1 file

LLVM/project 5c43385 clang/include/clang/CIR/Dialect/IR CIROps.td, clang/lib/CIR/CodeGen CIRGenCoroutine.cpp

[CIR] Upstream CIR await op (#168133)

This PR upstreams `cir.await` and adds initial codegen for emitting a
skeleton of the ready, suspend, and resume branches. Codegen for these
branches is left for a future PR. It also adds a test for the invalid
case where a `cir.func` is marked as a coroutine but does not contain a
`cir.await` op in its body.
Delta  File
+111 -0  clang/lib/CIR/CodeGen/CIRGenCoroutine.cpp
+96 -2  clang/include/clang/CIR/Dialect/IR/CIROps.td
+73 -3  clang/lib/CIR/Dialect/IR/CIRDialect.cpp
+39 -0  clang/test/CIR/CodeGen/coro-task.cpp
+21 -0  clang/test/CIR/IR/await.cir
+19 -0  clang/test/CIR/IR/invalid-await.cir
+359 -5  6 files not shown
+383 -5  12 files