LLVM/project 832285aclang/include/clang/Lex Preprocessor.h MacroInfo.h, clang/lib/Lex PPMacroExpansion.cpp

[clang][NFC] Unify `MacroState` `isAmbiguous` and `getModuleInfo` (#197867)

Every call to `MacroState::getModuleInfo` is paired with a call to
`MacroState::isAmbiguous` in the same function. Rather than doing the
same work twice, unify them into a single function, `getModuleInfo`,
that returns both pieces of information in a new type, `ModuleMacroInfo`.

Unfortunately, `getModuleInfo` and `ModuleMacroInfo` already exist, so
rename the existing ones to `getFullModuleInfo` and `FullModuleMacroInfo`,
respectively, since the new type is a subset of the old type. The new
type contains just the pieces consumers care about.

While we're there, use the range constructor of `llvm::DenseSet` instead
of default constructing and calling `insert` in a loop.
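
As a rough illustration of the two ideas above, here is a minimal sketch
using only the standard library. `FullInfo`, `Info`, `getInfo` and `toSet`
are hypothetical stand-ins, not the clang types, and `std::unordered_set`
stands in for `llvm::DenseSet`.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for the full per-macro state; not clang's type.
struct FullInfo {
  std::vector<std::string> ActiveModuleMacros;
  bool IsAmbiguous = false;
  unsigned Generation = 0; // bookkeeping that callers never look at
};

// The subset of FullInfo that callers actually consume.
struct Info {
  const std::vector<std::string> *ActiveModuleMacros;
  bool IsAmbiguous;
};

// One lookup returns both pieces, instead of two calls that each
// repeat the same work.
Info getInfo(const FullInfo &Full) {
  return Info{&Full.ActiveModuleMacros, Full.IsAmbiguous};
}

// Range construction replaces "default-construct, then insert in a loop".
std::unordered_set<std::string> toSet(const std::vector<std::string> &Names) {
  return std::unordered_set<std::string>(Names.begin(), Names.end());
}
```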
DeltaFile
+23-30clang/include/clang/Lex/Preprocessor.h
+6-6clang/lib/Lex/PPMacroExpansion.cpp
+8-3clang/include/clang/Lex/MacroInfo.h
+37-393 files

LLVM/project 18332f1llvm/test/CodeGen/AMDGPU extract_vector_dynelt.ll vgpr-large-tuple-alloc-error.ll, llvm/test/CodeGen/X86 pcsections-atomics.ll

[RegAllocFast] Eliminate dead copies (#196056)

GitHub issue: https://github.com/llvm/llvm-project/issues/168201

This patch extends copy elimination in **RegAllocFast** to catch an
additional class of redundant copies. Previously, only identity copies
(where source and destination registers are the same) were marked for
removal.

Now, we check whether the current instruction is dead and remove it if
it is.
The change:

- Updates the copy-elimination condition to include dead destination
operands.

- Improves debug output to be more generic.

This reduces unnecessary instructions and can lead to slightly better
codegen by eliminating dead copies earlier in the fast register
allocation pass.
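
The widened condition can be sketched on a toy instruction model. The
`Instr` fields and function names below are hypothetical and do not mirror
the `MachineInstr` API; the sketch only contrasts the old identity-copy
check with one that also accepts copies whose destination is dead.

```cpp
#include <cassert>
#include <vector>

// Toy instruction model; the fields are hypothetical.
struct Instr {
  bool IsCopy;
  unsigned Dst;
  unsigned Src;
  bool DstDead; // the defined register is never read afterwards
};

// Old condition: only identity copies (Dst == Src) were removable.
bool isIdentityCopy(const Instr &I) { return I.IsCopy && I.Dst == I.Src; }

// Widened condition: copies with a dead destination are removable too.
bool isRemovableCopy(const Instr &I) {
  return I.IsCopy && (I.Dst == I.Src || I.DstDead);
}

// Keep every instruction that is not a removable copy.
std::vector<Instr> eliminateCopies(const std::vector<Instr> &In) {
  std::vector<Instr> Out;
  for (const Instr &I : In)
    if (!isRemovableCopy(I))
      Out.push_back(I);
  return Out;
}
```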
DeltaFile
+628-728llvm/test/CodeGen/X86/pcsections-atomics.ll
+389-427llvm/test/CodeGen/AMDGPU/extract_vector_dynelt.ll
+296-312llvm/test/CodeGen/AMDGPU/vgpr-large-tuple-alloc-error.ll
+136-139llvm/test/CodeGen/AMDGPU/mubuf-legalize-operands-non-ptr-intrinsics.ll
+114-121llvm/test/CodeGen/AMDGPU/indirect-addressing-si.ll
+109-125llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll
+1,672-1,85238 files not shown
+1,750-2,75344 files

LLVM/project 3c8d104llvm/lib/CodeGen/SelectionDAG TargetLowering.cpp

[DAG] SimplifyMultipleUseDemandedBits - use isIdentityElement to detect identity / fall through operands (#197952)

Now that isIdentityElement uses computeKnownBits, we don't have to handle
this locally and can handle all binops (including smax/smin/umax/umin
etc.) at the same time.
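
For readers unfamiliar with the idea, a minimal sketch of identity
elements and "fall through" operands on 32-bit values follows. The `BinOp`
enum and both functions are hypothetical, operands are modelled as
optional known constants, and none of this is the SelectionDAG API.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

enum class BinOp { Add, Or, Xor, And, Mul, UMax, UMin, SMax, SMin };

// True if C is the identity for Op on 32-bit values: (X op C) == X.
bool isIdentity(BinOp Op, uint32_t C) {
  switch (Op) {
  case BinOp::Add:
  case BinOp::Or:
  case BinOp::Xor:
  case BinOp::UMax:
    return C == 0;
  case BinOp::And:
  case BinOp::UMin:
    return C == 0xFFFFFFFFu;
  case BinOp::Mul:
    return C == 1;
  case BinOp::SMax:
    return C == 0x80000000u; // bit pattern of INT32_MIN
  case BinOp::SMin:
    return C == 0x7FFFFFFFu; // bit pattern of INT32_MAX
  }
  return false;
}

// If one operand is a known identity constant, the node falls through
// to the other operand. Returns that operand's index (0 or 1).
std::optional<int> fallThroughOperand(BinOp Op, std::optional<uint32_t> LHS,
                                      std::optional<uint32_t> RHS) {
  if (RHS && isIdentity(Op, *RHS))
    return 0;
  if (LHS && isIdentity(Op, *LHS))
    return 1; // every op listed here is commutative
  return std::nullopt;
}
```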
DeltaFile
+10-15llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
+10-151 files

LLVM/project c998fcbllvm/include/llvm/CodeGen TargetLowering.h, llvm/lib/CodeGen InterleavedAccessPass.cpp

[IA][RISCV] Support gap mask for loads that are de-interleaved through intrinsics (#197062)

In the context of (de)interleaved loads and stores, a gap mask is a mask
that effectively skips an entire component / field. Starting from
#151612, the InterleavedAccessPass gained support for recognizing masks
of this kind and passing them to the TLI hook. RISC-V originally
supported gap masks only on fixed vectors; this patch adds support for
recognizing gap masks on loads that are de-interleaved through the
`llvm.vector.deinterleaveN` intrinsics, with both scalable vectors and
fixed vectors.
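
To make the term concrete, here is a small scalar sketch of a
de-interleaving load with a gap mask. The function name and the bit
convention (bit i set means field i is loaded, clear means it is a gap)
are assumptions made for illustration, not the pass's actual interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Split an interleaved buffer with `Factor` fields per element into one
// vector per field. Assumed convention: bit i of GapMask set means field
// i is loaded; a clear bit marks a gap and the whole field is skipped.
std::vector<std::vector<int>> deinterleave(const std::vector<int> &Mem,
                                           unsigned Factor,
                                           uint32_t GapMask) {
  assert(Factor > 0 && Factor <= 32 && Mem.size() % Factor == 0);
  std::vector<std::vector<int>> Fields(Factor);
  for (std::size_t I = 0; I < Mem.size(); ++I) {
    unsigned Field = static_cast<unsigned>(I % Factor);
    if (GapMask & (1u << Field))
      Fields[Field].push_back(Mem[I]);
  }
  return Fields;
}
```

With a factor of 3 and a mask of 0x3, the third field is never touched,
which is what lets a target skip that component entirely.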
DeltaFile
+77-38llvm/test/CodeGen/RISCV/rvv/fixed-vectors-interleaved-access.ll
+79-1llvm/test/CodeGen/RISCV/rvv/vp-vector-interleaved-access.ll
+62-11llvm/lib/Target/RISCV/RISCVInterleavedAccess.cpp
+4-7llvm/lib/CodeGen/InterleavedAccessPass.cpp
+7-1llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+4-1llvm/include/llvm/CodeGen/TargetLowering.h
+233-592 files not shown
+237-618 files

LLVM/project 131446allvm/lib/Transforms/AggressiveInstCombine AggressiveInstCombine.cpp, llvm/test/Transforms/AggressiveInstCombine popcount.ll

[AggressiveInstCombine] Loosen some conditions in the popcount pattern (#197536)

This PR refines and loosens some conditions regarding the last AND mask
in the popcount pattern introduced by #180917.
More specifically, this AND mask only needs to fulfill two conditions:
1. All ones for the lower NumLenBits bits, where NumLenBits is the
number of bits needed to represent the size of the integer
2. Zeros from bit 8 onward

The rest of the bits can have arbitrary values.

The compute known bits infrastructure is supposed to reason about this
but it was depth-limited in this particular case.
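
The two mask conditions can be checked with a short sketch. It assumes
NumLenBits = floor(log2(BitWidth)) + 1 (6 for i32, since the count ranges
over 0..32), and it uses a common bit-twiddling popcount as a stand-in;
the exact instruction sequence the pass matches may differ.

```cpp
#include <cassert>
#include <cstdint>

// Assumed reading of the rule: low NumLenBits bits all ones, and no bit
// set at position 8 or above. Bits in between are unconstrained.
bool isValidFinalMask(uint64_t Mask, unsigned BitWidth) {
  unsigned NumLenBits = 0;
  for (unsigned W = BitWidth; W != 0; W >>= 1)
    ++NumLenBits; // 32 -> 6, 64 -> 7
  uint64_t Low = (uint64_t{1} << NumLenBits) - 1;
  return (Mask & Low) == Low && (Mask >> 8) == 0;
}

// Shift-and-add popcount for 32-bit values with a caller-chosen final
// mask. After the last add, the low byte holds the count (at most 32)
// and the bits above bit 7 hold leftover partial sums.
uint32_t popcountWithMask(uint32_t X, uint32_t FinalMask) {
  uint32_t V = X - ((X >> 1) & 0x55555555u);
  V = (V & 0x33333333u) + ((V >> 2) & 0x33333333u);
  V = (V + (V >> 4)) & 0x0F0F0F0Fu;
  V = V + (V >> 8);
  V = V + (V >> 16);
  return V & FinalMask;
}
```

Because the count never exceeds 32, bits 6 and 7 of the low byte are
already zero, so 0x3F, 0x7F, 0xBF and 0xFF all produce the same result.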
DeltaFile
+155-6llvm/test/Transforms/AggressiveInstCombine/popcount.ll
+12-3llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
+167-92 files

LLVM/project af8454dllvm/lib/Target/SystemZ SystemZAsmPrinter.cpp

SystemZ: Remove always-true conditional (#197729)

About 20 lines above, the DSA64Bit is unconditionally set, so switching
on it later is not needed. Instead, we can print that bit
unconditionally.

Resolves #170125
DeltaFile
+1-4llvm/lib/Target/SystemZ/SystemZAsmPrinter.cpp
+1-41 files

LLVM/project e11afedclang/include/clang/DependencyScanning ModuleDepCollector.h DependencyScannerImpl.h, clang/lib/DependencyScanning ModuleDepCollector.cpp DependencyScannerImpl.cpp

[clang][deps] Allow using collector with different consumers (#197287)
DeltaFile
+13-14clang/lib/DependencyScanning/ModuleDepCollector.cpp
+7-7clang/lib/DependencyScanning/DependencyScannerImpl.cpp
+3-5clang/include/clang/DependencyScanning/ModuleDepCollector.h
+2-2clang/include/clang/DependencyScanning/DependencyScannerImpl.h
+2-2clang/lib/Tooling/DependencyScanningTool.cpp
+27-305 files

LLVM/project 0c02e10llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/AArch64 spillcost-noreturn-block.ll

[SLP] Skip no-return blocks in spill cost analysis

Treat basic blocks containing a non-terminator no-return call as
dead-end paths in getSpillCost. Execution doesn't exit such blocks,
so their non-vec calls don't keep loop-invariant vector values live.
Terminator no-return calls (invoke/callbr) are excluded because
their unwind/indirect successors remain reachable.
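
A toy version of the dead-end test might look like the sketch below. The
`Inst` fields and both function names are hypothetical; the real analysis
works on LLVM IR and does considerably more than count calls.

```cpp
#include <cassert>
#include <vector>

// Toy IR; hypothetical fields.
struct Inst {
  bool IsCall;
  bool IsNoReturn;
  bool IsTerminator; // true for invoke/callbr-style calls
};
using Block = std::vector<Inst>;

// A block is a dead end if it contains a no-return call that is not a
// terminator. Terminator calls still have reachable successors.
bool isDeadEndBlock(const Block &BB) {
  for (const Inst &I : BB)
    if (I.IsCall && I.IsNoReturn && !I.IsTerminator)
      return true;
  return false;
}

// Count the calls a spill-cost estimate would still consider, skipping
// every call that sits in a dead-end block.
unsigned countConsideredCalls(const std::vector<Block> &Blocks) {
  unsigned N = 0;
  for (const Block &BB : Blocks) {
    if (isDeadEndBlock(BB))
      continue;
    for (const Inst &I : BB)
      if (I.IsCall)
        ++N;
  }
  return N;
}
```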

Reviewers: bababuck, hiraditya, RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/197279
DeltaFile
+31-0llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+1-1llvm/test/Transforms/SLPVectorizer/AArch64/spillcost-noreturn-block.ll
+32-12 files

LLVM/project 8af0fc6compiler-rt/lib/xray xray_trampoline_hexagon.S xray_hexagon.cpp, llvm/lib/Target/Hexagon HexagonAsmPrinter.cpp HexagonISelLowering.cpp

[Hexagon] Add XRay custom and typed event support (#191749)

Add support for XRay custom events (llvm.xray.customevent) and typed
events (llvm.xray.typedevent) for Hexagon.

LLVM:
* Add Hexagon to the architecture gate in SelectionDAGBuilder for
xray_customevent and xray_typedevent intrinsic lowering
* Implement EmitInstrWithCustomInserter for PATCHABLE_EVENT_CALL and
PATCHABLE_TYPED_EVENT_CALL pseudo instructions
* Implement LowerPATCHABLE_EVENT_CALL in HexagonAsmPrinter that emits
inline sleds with jump-over, allocframe/deallocframe for LR:FP save,
argument register save/restore, and call to the event handler
* Add event pseudo dispatch in HexagonMCInstLower
* Prevent event pseudos from being packetized (solo instructions)

compiler-rt:
* Implement patchCustomEvent and patchTypedEvent in xray_hexagon.cpp to
patch the sled jump to nop (enable) or back (disable)
* Add __xray_CustomEvent and __xray_TypedEvent trampolines in the
Hexagon XRay trampoline assembly
DeltaFile
+175-0llvm/lib/Target/Hexagon/HexagonAsmPrinter.cpp
+83-0llvm/test/CodeGen/Hexagon/xray-custom-event.ll
+50-12compiler-rt/lib/xray/xray_trampoline_hexagon.S
+32-3compiler-rt/lib/xray/xray_hexagon.cpp
+12-0llvm/lib/Target/Hexagon/HexagonISelLowering.cpp
+8-0llvm/lib/Target/Hexagon/HexagonMCInstLower.cpp
+360-154 files not shown
+372-1810 files

LLVM/project ef5d264clang/test/Tooling clang-sycl-linker.ll, clang/tools/clang-sycl-linker ClangSYCLLinker.cpp

[clang-sycl-linker] Remove dead and unnecessary check for no symbols image (#197596)

The `if (SI.Symbols.empty()) continue;` guard in `runSYCLLink` was
unreachable: `collectEntryPoints` always calls
`llvm::offloading::sycl::writeSymbolTable`, which emits at least a
4-byte `SymbolTableHeader` (`Count=0`) even when the input has no entry
points.
The check could never fire and was misleading, suggesting that modules
without kernels would be dropped.
I checked other similar tools and did not find logic to filter out
images that have no entry points. This doesn't look like a
responsibility of clang-sycl-linker.
If this scenario is required, we can consider adding it later with a
proper check.
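
The reason the guard was dead can be shown with a small sketch. Only the
4-byte header with `Count=0` comes from the description above; the
little-endian encoding, the NUL-terminated names and the function
signature are assumptions, not the real `writeSymbolTable`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical layout: a 4-byte count header (assumed little-endian),
// followed by NUL-terminated names.
std::vector<uint8_t> writeSymbolTable(const std::vector<std::string> &Names) {
  std::vector<uint8_t> Buf;
  uint32_t Count = static_cast<uint32_t>(Names.size());
  for (int Shift = 0; Shift < 32; Shift += 8)
    Buf.push_back(static_cast<uint8_t>(Count >> Shift));
  for (const std::string &N : Names) {
    Buf.insert(Buf.end(), N.begin(), N.end());
    Buf.push_back(0);
  }
  return Buf; // never empty: the header is always written
}
```

Since the buffer always holds at least the header, an `empty()` test on
it cannot detect "no entry points"; a check on the count would be needed.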

---------

Co-authored-by: Alexey Bader <alexey.bader at intel.com>
DeltaFile
+15-0clang/test/Tooling/clang-sycl-linker.ll
+0-2clang/tools/clang-sycl-linker/ClangSYCLLinker.cpp
+15-22 files

LLVM/project 85e01ffllvm/lib/Target/AArch64 AArch64ISelLowering.cpp, llvm/test/CodeGen/AArch64 bf16-instructions.ll f16-instructions.ll

[AArch64] Lower i16/f16 bitcast via vector operations. (#196341)

This removes one of the uses of getTargetExtractSubreg, which creates a
Machine Node during DAG lowering. We instead use a scalar_to_vector to
convert to a vector and let the extract element legalize to a legal
type.
DeltaFile
+23-5llvm/test/CodeGen/AArch64/bf16-instructions.ll
+23-5llvm/test/CodeGen/AArch64/f16-instructions.ll
+15-3llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+8-8llvm/test/CodeGen/AArch64/atomicrmw-fmax.ll
+8-8llvm/test/CodeGen/AArch64/atomicrmw-fmin.ll
+8-8llvm/test/CodeGen/AArch64/atomicrmw-fadd.ll
+85-378 files not shown
+108-6714 files

LLVM/project 5a48b29llvm/lib/Transforms/Vectorize VPlan.h VPlanTransforms.cpp

[VPlan] Make VPWidenLoad(EVL)Recipe VPSingleDefs (NFC). (#197496)

This updates VPWidenMemoryRecipe to no longer inherit directly from
VPRecipeBase. Instead VPWidenMemoryRecipe is turned into a mixin, like
VPIRMetadata and VPPhiAccessors. This in turn allows updating
VPWidenLoad(EVL)Recipe to inherit directly from VPSingleDefRecipe. This
brings them in line with all other recipes defining a single VPValue.
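
The shape of this refactoring can be sketched with stand-in classes. All
names below are hypothetical and much simpler than the VPlan hierarchy;
the point is only that the mixin no longer derives from the recipe base,
so the load can pick the single-def base itself.

```cpp
#include <cassert>

// Hypothetical stand-ins for the recipe hierarchy.
struct RecipeBase {
  virtual ~RecipeBase() = default;
};
struct SingleDefRecipe : RecipeBase {
  int definedValue() const { return Value; }
  int Value = 0;
};

// Mixin: carries memory-access state but does not derive from
// RecipeBase, so it can be combined with any recipe base class.
class WidenMemoryMixin {
public:
  WidenMemoryMixin(bool Consecutive, bool Reverse)
      : Consecutive(Consecutive), Reverse(Reverse) {}
  bool isConsecutive() const { return Consecutive; }
  bool isReverse() const { return Reverse; }

protected:
  ~WidenMemoryMixin() = default; // never deleted through the mixin

private:
  bool Consecutive;
  bool Reverse;
};

// A load defines exactly one value, so it derives from the single-def
// base and adds the memory state via the mixin.
struct WidenLoad : SingleDefRecipe, WidenMemoryMixin {
  WidenLoad(int V, bool Consecutive, bool Reverse)
      : WidenMemoryMixin(Consecutive, Reverse) {
    Value = V;
  }
};
```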

Depends on https://github.com/llvm/llvm-project/pull/197494

PR: https://github.com/llvm/llvm-project/pull/197496
DeltaFile
+78-54llvm/lib/Transforms/Vectorize/VPlan.h
+17-15llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+12-11llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+2-2llvm/lib/Transforms/Vectorize/VPlanUtils.cpp
+1-1llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
+110-835 files

LLVM/project d9a2928lldb/test/API/functionalities/postmortem/elf-core TestLinuxCore.py

[lldb] Skip new tests in TestLinuxCore on Windows and non-x86 targets (#197887)

This patch should fix the tests broken by #197341 on the buildbots
lldb-remote-linux-win, lldb-x86_64-win and lldb-aarch64-windows.
DeltaFile
+6-0lldb/test/API/functionalities/postmortem/elf-core/TestLinuxCore.py
+6-01 files

LLVM/project d83fdd0lldb/bindings interfaces.swig, lldb/bindings/interface SBProcessInfoExtensions.i

[lldb] Expose the process arguments in SBProcessInfo (#197719)

Add GetNumArguments() and GetArgumentAtIndex() to SBProcessInfo. Add a
convenience property `arguments` in the Python API.

lldb-dap uses this to show more information to the user when picking a
process.

Fixes: #197509
DeltaFile
+20-0lldb/source/API/SBProcessInfo.cpp
+18-0lldb/bindings/interface/SBProcessInfoExtensions.i
+10-0lldb/test/API/python_api/process/TestProcessAPI.py
+7-0lldb/include/lldb/API/SBProcessInfo.h
+2-0lldb/test/API/python_api/default-constructor/sb_process_info.py
+1-0lldb/bindings/interfaces.swig
+58-06 files

LLVM/project f7b575bclang/include/clang/DependencyScanning ModuleDepCollector.h, clang/lib/DependencyScanning ModuleDepCollector.cpp DependencyScannerImpl.cpp

[clang][deps] Move logic out of the PP callback (#197270)

This PR moves the main scanner logic from
`ModuleDepCollectorPP::EndOfMainFile()` to the new
`ModuleDepCollector::run()`. In a follow-up PR, this will allow invoking
the main logic with different `DependencyConsumer` objects and reusing
the `ModuleDepCollector` state, skipping some work on subsequent calls.
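
In outline, the split looks something like the sketch below: the callback
only records state, and a separate `run()` drives a consumer. `Collector`,
`Consumer` and their members are hypothetical and far simpler than the
real `ModuleDepCollector` and `DependencyConsumer`.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical consumer that just remembers what it was told.
struct Consumer {
  std::vector<std::string> Seen;
  void handle(const std::string &Dep) { Seen.push_back(Dep); }
};

class Collector {
public:
  // What a preprocessor callback would do: record state only.
  void noteDependency(std::string Dep) { Deps.push_back(std::move(Dep)); }

  // The main logic lives outside the callback, so it can be invoked
  // more than once and with different consumers.
  void run(Consumer &C) const {
    for (const std::string &D : Deps)
      C.handle(D);
  }

private:
  std::vector<std::string> Deps;
};
```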
DeltaFile
+13-11clang/include/clang/DependencyScanning/ModuleDepCollector.h
+15-7clang/lib/DependencyScanning/ModuleDepCollector.cpp
+1-4clang/lib/Tooling/DependencyScanningTool.cpp
+3-1clang/lib/DependencyScanning/DependencyScannerImpl.cpp
+32-234 files

LLVM/project 204da32llvm/test/CodeGen/AMDGPU llvm.amdgcn.permlane.ll, llvm/test/CodeGen/AMDGPU/GlobalISel sdivrem.ll udivrem.ll

Rebase

Created using spr 1.3.7
DeltaFile
+7,377-7,311llvm/test/CodeGen/Thumb2/mve-clmul.ll
+3,436-2,769llvm/test/CodeGen/AMDGPU/GlobalISel/sdivrem.ll
+2,801-2,109llvm/test/CodeGen/AMDGPU/GlobalISel/udivrem.ll
+3,706-328llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir
+3,435-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.permlane.ll
+2,429-995llvm/test/CodeGen/RISCV/rvp-simd-64.ll
+23,184-13,5122,088 files not shown
+114,362-52,2612,094 files

LLVM/project 7011d2allvm/lib/Target/AMDGPU SIFrameLowering.cpp SIMachineFunctionInfo.h, llvm/test/CodeGen/AMDGPU amdgpu-spill-cfi-saved-regs.ll

[AMDGPU] Implement -amdgpu-spill-cfi-saved-regs

These spills need special CFI anyway, so implementing them directly
where CFI is emitted avoids the need to invent a mechanism to track them
from ISel.

Change-Id: If4f34abb3a8e0e46b859a7c74ade21eff58c4047
Co-authored-by: Scott Linder <scott.linder at amd.com>
Co-authored-by: Venkata Ramanaiah Nalamothu <VenkataRamanaiah.Nalamothu at amd.com>
DeltaFile
+2,926-0llvm/test/CodeGen/AMDGPU/amdgpu-spill-cfi-saved-regs.ll
+12-0llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
+10-0llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h
+9-0llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+2-0llvm/lib/Target/AMDGPU/SIRegisterInfo.h
+2,959-05 files

LLVM/project 297d4dbllvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll gfx-callable-argument-types.ll

[AMDGPU] Implement CFI for CSR spills

Introduce new SPILL pseudos to allow CFI to be generated for only CSR
spills, and to make ISA-instruction-level accurate information.

Other targets either generate slightly incorrect information or rely on
conventions for how spills are placed within the entry block. The
approach in this change produces larger unwind tables, with the
increased size being spent on additional DW_CFA_advance_location
instructions needed to describe the unwinding accurately.

Change-Id: I9b09646abd2ac4e56eddf5e9aeca1a5bebbd43dd
Co-authored-by: Scott Linder <scott.linder at amd.com>
Co-authored-by: Venkata Ramanaiah Nalamothu <VenkataRamanaiah.Nalamothu at amd.com>
DeltaFile
+3,568-2,598llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+1,912-1,913llvm/test/CodeGen/AMDGPU/gfx-callable-argument-types.ll
+2,700-12llvm/test/CodeGen/AMDGPU/accvgpr-spill-scc-clobber.mir
+631-631llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+505-510llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+394-399llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+9,710-6,063108 files not shown
+14,825-9,527114 files

LLVM/project 93e6e56llvm/include/llvm/CodeGen MachineFunction.h, llvm/lib/CodeGen MachineFunction.cpp

[AMDGPU][MC] Replace shifted registers in CFI instructions

Change-Id: I0d99e9fe43ec3b6fecac20531119956dca2e4e5c
DeltaFile
+67-67llvm/test/CodeGen/AMDGPU/sgpr-spill-overlap-wwm-reserve.mir
+33-0llvm/lib/MC/MCDwarf.cpp
+15-15llvm/test/CodeGen/AMDGPU/dwarf-multi-register-use-crash.ll
+10-0llvm/lib/CodeGen/MachineFunction.cpp
+4-4llvm/test/CodeGen/AMDGPU/debug-frame.ll
+4-0llvm/include/llvm/CodeGen/MachineFunction.h
+133-865 files not shown
+143-9011 files

LLVM/project 0e425dellvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.960bit.ll

[AMDGPU] Use register pair for PC spill

Change-Id: Ibedeef926f7ff235a06de65a83087c151f66a416
DeltaFile
+4,331-4,331llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+1,742-1,740llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+1,562-1,560llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+1,462-1,460llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+1,238-1,236llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.832bit.ll
+1,030-1,028llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.768bit.ll
+11,365-11,35589 files not shown
+18,153-18,04495 files

LLVM/project 2d3461ellvm/test/CodeGen/AMDGPU accvgpr-spill-scc-clobber.mir pei-build-av-spill.mir

[AMDGPU] Implement CFI for non-kernel functions

This does not implement CSR spills other than those AMDGPU handles
during PEI. The remaining spills are handled in a subsequent patch.

Change-Id: I5e3a9a62cf9189245011a82a129790d813d49373
Co-authored-by: Scott Linder <scott.linder at amd.com>
Co-authored-by: Venkata Ramanaiah Nalamothu <VenkataRamanaiah.Nalamothu at amd.com>
DeltaFile
+5,568-0llvm/test/CodeGen/AMDGPU/accvgpr-spill-scc-clobber.mir
+3,000-96llvm/test/CodeGen/AMDGPU/pei-build-av-spill.mir
+2,208-72llvm/test/CodeGen/AMDGPU/pei-build-spill.mir
+2,196-0llvm/test/CodeGen/AMDGPU/eliminate-frame-index-s-mov-b32.mir
+2,136-0llvm/test/CodeGen/AMDGPU/vgpr-spill-scc-clobber.mir
+1,671-1llvm/test/CodeGen/AMDGPU/debug-frame.ll
+16,779-16993 files not shown
+22,893-1,13299 files

LLVM/project 8efdab5llvm/lib/Target/AMDGPU SIFrameLowering.cpp, llvm/test/CodeGen/AMDGPU debug-frame.ll eliminate-frame-index-v-add-u32.mir

[AMDGPU] Emit entry function Dwarf CFI

Entry functions represent the end of unwinding, as they are the
outer-most frame. This implies they can only have a meaningful
definition for the CFA, which AMDGPU defines using a memory location
description with a literal private address space address. The return
address is set to undefined as a sentinel value to signal the end of
unwinding.

Change-Id: I21580f6a24f4869ba32939c9c6332506032cc654
Co-authored-by: Scott Linder <scott.linder at amd.com>
Co-authored-by: Venkata Ramanaiah Nalamothu <VenkataRamanaiah.Nalamothu at amd.com>
DeltaFile
+1,405-0llvm/test/CodeGen/AMDGPU/debug-frame.ll
+204-12llvm/test/CodeGen/AMDGPU/eliminate-frame-index-v-add-u32.mir
+134-6llvm/test/CodeGen/AMDGPU/eliminate-frame-index-v-add-co-u32.mir
+114-10llvm/test/CodeGen/AMDGPU/eliminate-frame-index-s-add-i32.mir
+42-5llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
+34-0llvm/test/CodeGen/AMDGPU/entry-function-cfi.mir
+1,933-3322 files not shown
+2,044-5028 files

LLVM/project 22dc26eclang/lib/Driver/ToolChains Gnu.cpp, clang/test/Driver amdgpu-unwind.cl

[Clang] Default to async unwind tables for amdgcn

To avoid codegen changes when enabling debug-info (see
https://bugs.llvm.org/show_bug.cgi?id=37240) we want to
enable unwind tables by default.

There is some pessimization in post-prologepilog scheduling, and a
general solution to the problem of CFI_INSTRUCTION-as-scheduling-barrier
should be explored.

Change-Id: I83625875966928c7c4411cd7b95174dc58bda25a
DeltaFile
+26-0clang/test/Driver/amdgpu-unwind.cl
+1-0clang/lib/Driver/ToolChains/Gnu.cpp
+27-02 files

LLVM/project f277e94

[MIR] Error on signed integer in getUnsigned

Previously we effectively took the absolute value of the APSInt; instead,
diagnose the unexpected negative value.
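
A minimal sketch of the stricter behaviour, assuming a helper that
receives an already-parsed signed value. The signature and the error text
are hypothetical, not the MIR parser's.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical helper: returns the value, or nullopt with Error set
// when the input is negative, rather than silently using |Parsed|.
std::optional<uint64_t> getUnsigned(int64_t Parsed, std::string &Error) {
  if (Parsed < 0) {
    Error = "expected unsigned integer";
    return std::nullopt;
  }
  return static_cast<uint64_t>(Parsed);
}
```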

Change-Id: I4efe961e7b29fdf1d5f97df12f8139aac12c9219
DeltaFile
+0-00 files

LLVM/project 1f672c7llvm/include/llvm/MC MCDwarf.h, llvm/lib/CodeGen MachineOperand.cpp

[MC][Dwarf] Add custom CFI pseudo-ops for use in AMDGPU

While these can be represented with .cfi_escape, using these pseudo-cfi
instructions makes .s/.mir files more readable, and it is necessary to
support updating registers in CFI instructions (something that the
AMDGPU backend requires).

Change-Id: I763d0cabe5990394670281d4afb5a170981e55d0
DeltaFile
+186-0llvm/lib/MC/MCDwarf.cpp
+106-0llvm/lib/MC/MCParser/AsmParser.cpp
+91-1llvm/include/llvm/MC/MCDwarf.h
+75-0llvm/lib/MC/MCAsmStreamer.cpp
+75-0llvm/lib/CodeGen/MIRParser/MIParser.cpp
+58-0llvm/lib/CodeGen/MachineOperand.cpp
+591-115 files not shown
+995-721 files

LLVM/project cce03d2mlir/lib/Conversion/XeGPUToXeVM XeGPUToXeVM.cpp, mlir/lib/Conversion/XeVMToLLVM XeVMToLLVM.cpp

[MLIR][XeGPU][XeVM] XeGPU to XeVM: Add lowering for xegpu.dpas_mx (#196981)

Also add XeGPU dpas_mx lane-level integration tests.
DeltaFile
+147-0mlir/test/Integration/Dialect/XeGPU/LANE/xegpu_dpas_mx_prepacked_e2m1.mlir
+110-33mlir/lib/Conversion/XeGPUToXeVM/XeGPUToXeVM.cpp
+129-0mlir/test/Integration/Dialect/XeGPU/LANE/xegpu_dpas_mx_prepacked_bf8.mlir
+85-0mlir/test/Conversion/XeGPUToXeVM/dpas_mx.mlir
+13-2mlir/lib/Conversion/XeVMToLLVM/XeVMToLLVM.cpp
+14-0mlir/test/Conversion/XeVMToLLVM/xevm-to-llvm.mlir
+498-351 files not shown
+502-387 files

LLVM/project 68e8dd6clang/lib/AST ItaniumMangle.cpp, clang/test/CodeGenCXX mangle-cxx2c.cpp

[Clang] Mangling of pack indexing type and expression for itanium (#123513)

See https://github.com/itanium-cxx-abi/cxx-abi/pull/198
and #112003
DeltaFile
+30-16clang/lib/AST/ItaniumMangle.cpp
+43-0llvm/include/llvm/Demangle/ItaniumDemangle.h
+43-0libcxxabi/src/demangle/ItaniumDemangle.h
+34-0clang/test/CodeGenCXX/mangle-cxx2c.cpp
+4-0llvm/include/llvm/Testing/Demangle/DemangleTestCases.inc
+4-0libcxxabi/test/DemangleTestCases.inc
+158-163 files not shown
+162-189 files

LLVM/project 4f52e22utils/bazel/llvm-project-overlay/libc BUILD.bazel

[BAZEL] Remove duplicated hdr_errno_macros dep (#197945)
DeltaFile
+0-1utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+0-11 files

LLVM/project b25c916llvm/lib/CodeGen/SelectionDAG SelectionDAG.cpp, llvm/test/CodeGen/RISCV/rvv vwadd-sdnode.ll

[DAG] isIdentityElement - use KnownBits instead of constant splat to match identity constants (#197455)

This works better with the DemandedElts mask to match hidden identity
constants (zero in particular).

I need this for the ongoing work to improve VECREDUCE simplification to
match identity elements (legalisation pads with identity elements) in an
expanded reduction shuffle chain.
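
Why known bits can find an identity that constant matching misses is
easiest to see on a scalar sketch. The `KnownBits` struct below is a
simplified stand-in for the LLVM class, and the per-element
`DemandedElts` aspect is not modelled.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in: bit i of Zero/One set means bit i of the value
// is known to be 0/1.
struct KnownBits {
  uint32_t Zero;
  uint32_t One;
};

KnownBits constant(uint32_t C) { return {~C, C}; }
KnownBits unknown() { return {0, 0}; }

// AND: a result bit is 0 if it is 0 on either side, 1 only if it is 1
// on both sides.
KnownBits knownAnd(KnownBits L, KnownBits R) {
  return {L.Zero | R.Zero, L.One & R.One};
}

// Zero is the identity for add/or/xor; all-ones is the identity for and.
bool isKnownZero(KnownBits K) { return K.Zero == 0xFFFFFFFFu; }
bool isKnownAllOnes(KnownBits K) { return K.One == 0xFFFFFFFFu; }
```

For example, `(x & 0xF0) & 0x0F` is not a constant node, yet every bit is
known to be zero, so it can still be treated as the identity of an add.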
DeltaFile
+3,699-3,716llvm/test/CodeGen/Thumb2/mve-clmul.ll
+32-32llvm/test/CodeGen/X86/vector-lzcnt-512.ll
+28-26llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+7-17llvm/test/CodeGen/RISCV/rvv/vwadd-sdnode.ll
+3,766-3,7914 files

LLVM/project 541b9cdllvm/lib/Transforms/Scalar LoopInterchange.cpp, llvm/test/Transforms/LoopInterchange inner-preheader-single-phi.ll

[LoopInterchange] Handle PHI nodes in inner loop preheader (#196691)

Fixes #196242

LoopInterchange crashes when the inner loop preheader contains a
single-incoming PHI node.

This fix folds single-incoming PHI nodes by replacing them with their
incoming value and then erasing the PHI nodes.
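
The folding step can be sketched on a toy block in which values are
plain names. The `Phi` and `Block` types are hypothetical, and the sketch
does not handle a folded PHI feeding another PHI in the same block.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy IR: values are identified by name.
struct Phi {
  std::string Result;
  std::vector<std::string> Incoming;
};
struct Block {
  std::vector<Phi> Phis;
  std::vector<std::string> Uses; // operand names used by later code
};

// Replace each single-incoming PHI with its incoming value, then erase
// it. PHIs with more than one incoming value are left untouched.
void foldSingleIncomingPhis(Block &BB) {
  std::vector<Phi> Kept;
  for (const Phi &P : BB.Phis) {
    if (P.Incoming.size() != 1) {
      Kept.push_back(P);
      continue;
    }
    for (std::string &U : BB.Uses)
      if (U == P.Result)
        U = P.Incoming.front();
  }
  BB.Phis = std::move(Kept);
}
```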

Added a regression test under llvm/test/Transforms/LoopInterchange/
using the reproducer from issue #196242.
DeltaFile
+33-0llvm/test/Transforms/LoopInterchange/inner-preheader-single-phi.ll
+8-0llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+41-02 files