LLVM/project 396e319llvm/include/llvm/IR DiagnosticInfo.h, llvm/unittests/IR DiagnosticInfoTest.cpp CMakeLists.txt

[IR] Fix invalid debug metadata diagnostic kind
DeltaFile
+36-0llvm/unittests/IR/DiagnosticInfoTest.cpp
+3-3llvm/include/llvm/IR/DiagnosticInfo.h
+1-0llvm/unittests/IR/CMakeLists.txt
+40-33 files

LLVM/project 89ff264bolt/include/bolt/Core BinaryFunction.h MCPlusBuilder.h, bolt/lib/Core BinaryFunction.cpp

[BOLT] Replace partial instructions with traps in patched entries (#205211)

Overwriting a function entry with a jump is likely to not perfectly
align with the instruction stream. If the end of the patch does not
fall onto an instruction boundary, the bytes following the jump are
orphaned and will have nonsensical interpretations. This can leave
other tools confused, especially since these orphaned bytes can
decode to instructions that do not nicely rejoin the still intact
part of the instruction stream. Overwrite these bytes with traps
in the PatchEntry pass.
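
As a rough illustration of the idea above (not BOLT's actual code), the
sketch below fills the bytes between the end of a patch and the next
original instruction boundary with 0xCC, the x86 INT3 opcode. The helper
name `fillOrphanedBytes` and the flat list of boundary offsets are
assumptions made for this example only.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of BOLT. `boundaries` holds the sorted start
// offsets of the original instructions, followed by the function's end
// offset. Bytes between the end of the patch and the next boundary belong to
// a partially overwritten instruction; they are replaced with 0xCC (INT3).
// Returns the number of bytes that were filled.
std::size_t fillOrphanedBytes(std::vector<std::uint8_t> &code,
                              const std::vector<std::size_t> &boundaries,
                              std::size_t patchSize) {
  for (std::size_t boundary : boundaries) {
    if (boundary < patchSize)
      continue; // This instruction start lies inside the patch itself.
    // First boundary at or after the patch end: everything before it is
    // the orphaned tail of an instruction the patch cut in half.
    std::size_t filled = 0;
    for (std::size_t i = patchSize; i < boundary && i < code.size(); ++i) {
      code[i] = 0xCC;
      ++filled;
    }
    return filled;
  }
  return 0; // The patch extends past every known boundary.
}
```

For instructions starting at offsets 0, 3 and 7 in an 8-byte function, a
5-byte jump ends inside the second instruction, so bytes 5 and 6 become
traps and the instruction at offset 7 is left intact.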

Fixes #198455.
DeltaFile
+34-3bolt/lib/Passes/PatchEntries.cpp
+23-0bolt/lib/Core/BinaryFunction.cpp
+9-0bolt/test/X86/patch-entries.test
+8-0bolt/include/bolt/Core/BinaryFunction.h
+5-0bolt/lib/Target/X86/X86MCPlusBuilder.cpp
+5-0bolt/include/bolt/Core/MCPlusBuilder.h
+84-31 files not shown
+85-37 files

LLVM/project aa07daeutils/bazel/llvm-project-overlay/mlir BUILD.bazel

[mlir][bazel]: Remove GPU dialect deps from MemRefTransforms (#205624)

This change removes dead GPU dependencies (`GPUDialect` and
`NVGPUDialect`) from the `MemRefTransforms` target. These dependencies
are not needed by the transforms themselves and greatly increase the
build time (e.g., NVVMDialect.cpp alone requires two minutes to build).
This aligns the bazel build with the CMake configuration.
DeltaFile
+0-2utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
+0-21 files

LLVM/project e53c48bllvm/lib/Target/AMDGPU SOPInstructions.td SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU usubsat.ll

[AMDGPU] Lower uniform usubsat to SOP (#203155)

Prefer scalar (SALU) lowering for uniform `usubsat`, since usubsat(a, b)
= max(a, b) - b.
* i32: add a GCNPat matching uniform `usubsat` to S_MAX_U32 + S_SUB_I32
* i16: route uniform `usubsat` through `promoteUniformOpToI32` instead
of a TableGen pattern that hard-codes the 0xffff masks. This exposes the
zero-extends as real DAG nodes so KnownBits can fold the masks when the
high bits are already known zero; the promoted i32 usubsat then reuses
the scalar pattern. Promote-and-truncate is safe for usubsat because the
result always fits in the narrow type (unlike uaddsat).

Register USUBSAT with `setTargetDAGCombine` and the promotion dispatch,
return ZERO_EXTEND in `getExtOpcodeForPromotedOp`, and add it to
`isNarrowingProfitable` so divergent i16/i32 keep their native VALU
clamp form.
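
The identity and the i16 promotion argument can be checked in plain C++.
This is only a sketch of the arithmetic, not of the DAG lowering; the
function names are made up for the example.

```cpp
#include <algorithm>
#include <cstdint>

// Reference semantics: unsigned subtraction that clamps at zero.
constexpr std::uint32_t usubsatRef(std::uint32_t a, std::uint32_t b) {
  return a > b ? a - b : 0;
}

// The form used in the commit message: usubsat(a, b) = max(a, b) - b.
constexpr std::uint32_t usubsatViaMax(std::uint32_t a, std::uint32_t b) {
  return std::max(a, b) - b;
}

// i16 case: zero-extend both operands, reuse the 32-bit form, truncate.
// The result never exceeds `a`, so the truncation loses nothing.
constexpr std::uint16_t usubsat16ViaPromotion(std::uint16_t a,
                                              std::uint16_t b) {
  return static_cast<std::uint16_t>(
      usubsatViaMax(std::uint32_t{a}, std::uint32_t{b}));
}

static_assert(usubsatViaMax(7, 3) == usubsatRef(7, 3), "a > b");
static_assert(usubsatViaMax(3, 7) == usubsatRef(3, 7), "a < b clamps to 0");
static_assert(usubsatViaMax(5, 5) == 0, "a == b");
static_assert(usubsat16ViaPromotion(0xFFFF, 1) == 0xFFFE, "fits in 16 bits");
static_assert(usubsat16ViaPromotion(1, 0xFFFF) == 0, "clamps to 0");
```

The same promote-and-truncate approach would not work for uaddsat:
0xFFFF + 1 computed in 32 bits is 0x10000, which truncates to 0 instead
of saturating to 0xFFFF.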

Co-authored-by: Jeffrey Byrnes
DeltaFile
+577-356llvm/test/CodeGen/AMDGPU/usubsat.ll
+17-0llvm/lib/Target/AMDGPU/SOPInstructions.td
+5-1llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+1-0llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
+600-3574 files

LLVM/project 2c4c268llvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 shrink_vmul.ll madd.ll

[X86] combineMulToPMADDWD - match 256/512-bit SIGN_EXTEND nodes (#205606)

Now that the X86ISD::VPMADDWD handling is improving, we can remove some
of the limits that we had in place to prevent regressions.
DeltaFile
+36-36llvm/test/CodeGen/X86/shrink_vmul.ll
+38-28llvm/test/CodeGen/X86/madd.ll
+1-1llvm/lib/Target/X86/X86ISelLowering.cpp
+75-653 files

LLVM/project 847d396clang/lib/ScalableStaticAnalysisFramework/Analyses SSAFAnalysesCommon.h SSAFAnalysesCommon.cpp, clang/lib/ScalableStaticAnalysisFramework/Analyses/PointerFlow PointerFlowExtractor.cpp

[SSAF] Properly handle contributors with multiple declarations (#204482)

A contributor entity can have multiple declarations all contributing
interesting facts. For example, a function declaration (not definition)
may have default arguments, which may provide pointer flow or unsafe
buffer usage facts. This commit groups declarations by their canonical
decls. The entity summary of a contributor will be collected from all
its decls.

In addition, this commit includes the following minor changes:
- Factor the common procedure of summary extraction and insertion into a
template function in SSAFAnalysesCommon.h.
- Convert the no-duplicate contributor assertion into a debug warning.
We need the release build to not crash.

rdar://179150798
DeltaFile
+56-3clang/lib/ScalableStaticAnalysisFramework/Analyses/SSAFAnalysesCommon.h
+15-35clang/lib/ScalableStaticAnalysisFramework/Analyses/PointerFlow/PointerFlowExtractor.cpp
+14-32clang/lib/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsageExtractor.cpp
+40-0clang/test/Analysis/Scalable/PointerFlow/multi-decl-contributor.cpp
+27-0clang/test/Analysis/Scalable/PointerFlow/entity-name-no-conflict.cpp
+6-4clang/lib/ScalableStaticAnalysisFramework/Analyses/SSAFAnalysesCommon.cpp
+158-746 files

LLVM/project 1514123clang/include/clang/AST Mangle.h, clang/lib/AST Mangle.cpp

[CIR] Skip trivially-recursive available_externally function bodies (#198363)

CIR was emitting available_externally bodies for glibc-style inline
wrappers whose sole call is back to the same asm-named symbol (via
__builtin_*).  LLVM then treats the function as non-terminating and
can fold away surrounding null checks — the same failure mode as
classic CodeGen PR9614 (basename-style if (cwd) paths).

Port isTriviallyRecursive / shouldEmitFunction from CodeGenModule,
including the isInlineBuiltinDeclaration exemption, and skip emitting
those definitions.  isTriviallyRecursive (and its
FunctionIsDirectlyRecursive visitor) lives on MangleContext, so both
classic CodeGen and CIRGen call getMangleContext().isTriviallyRecursive(FD).
DeltaFile
+56-107clang/lib/CodeGen/CodeGenModule.cpp
+55-0clang/lib/AST/Mangle.cpp
+38-0clang/test/CIR/CodeGen/trivially-recursive-skip.cpp
+20-2clang/lib/CIR/CodeGen/CIRGenModule.cpp
+10-0clang/lib/CIR/CodeGen/CIRGenModule.h
+8-0clang/include/clang/AST/Mangle.h
+187-1091 files not shown
+187-1107 files

LLVM/project de78b7ellvm/lib/Target/AArch64 AArch64TargetMachine.cpp, llvm/test/CodeGen/AArch64 aarch64-neon-vector-insert-uaddlv.ll fabs-fp128.ll

Revert "[AArch64] Run cleanup one final time after peephole (#199711)" (#205633)

This reverts commit 448c3d54df7bcd5e5be2b5d051832ad00b4cc89c as it causes
compile-time regressions for little gain, and it sounds like the dead
instructions can be removed in a better way.
DeltaFile
+22-22llvm/test/CodeGen/AArch64/aarch64-neon-vector-insert-uaddlv.ll
+3-2llvm/test/CodeGen/AArch64/fabs-fp128.ll
+1-3llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
+0-1llvm/test/CodeGen/AArch64/O3-pipeline.ll
+26-284 files

LLVM/project f5aa4b6compiler-rt/lib/instrumentor-tools/pointer-tracking pointer_tracking_runtime.cpp pointer_tracking_config.json, compiler-rt/test/instrumentor-tools pointer_tracking_test.c simple_pointer_tracking.c

[Instrumentor] Add runtime examples: [3/N] Pointer tracking

The example shows how globals and stack allocations can be tracked. For
each, we record whether it was read or written, the time between
creation and first use, and the time between last use and deallocation.
These statistics are reported at the end.
DeltaFile
+384-0compiler-rt/lib/instrumentor-tools/pointer-tracking/pointer_tracking_runtime.cpp
+98-0compiler-rt/test/instrumentor-tools/pointer_tracking_test.c
+95-0compiler-rt/lib/instrumentor-tools/pointer-tracking/pointer_tracking_config.json
+86-0compiler-rt/lib/instrumentor-tools/pointer-tracking/README.md
+67-0compiler-rt/lib/instrumentor-tools/pointer-tracking/CMakeLists.txt
+37-0compiler-rt/test/instrumentor-tools/simple_pointer_tracking.c
+767-04 files not shown
+776-110 files

LLVM/project f1f5d9cllvm/lib/Target/Xtensa/AsmParser XtensaAsmParser.cpp, llvm/lib/Target/Xtensa/MCTargetDesc XtensaMCTargetDesc.cpp XtensaTargetStreamer.h

[Xtensa] Implement XtensaNullTargetStreamer (#203819)

This fixes a crash in Xtensa AsmParser::run() during the
ModuleSummaryIndexAnalysis pass.
DeltaFile
+8-0llvm/lib/Target/Xtensa/MCTargetDesc/XtensaMCTargetDesc.cpp
+7-0llvm/test/CodeGen/Xtensa/null-streamer.ll
+3-4llvm/lib/Target/Xtensa/MCTargetDesc/XtensaTargetStreamer.h
+2-0llvm/lib/Target/Xtensa/AsmParser/XtensaAsmParser.cpp
+20-44 files

LLVM/project f932147compiler-rt/lib/instrumentor-tools/fp-precision-analysis fp_precision_analysis_runtime.cpp CMakeLists.txt, compiler-rt/test/instrumentor-tools precision_fp16_overflow.c precision_detailed.c

[Instrumentor] Add runtime examples: [2/N] A FP precision analysis

Second example:
Check all floating point operations and track if they could be done at
lower precision.

Partially developed by Claude (AI), tested and verified by me.
DeltaFile
+603-0compiler-rt/lib/instrumentor-tools/fp-precision-analysis/fp_precision_analysis_runtime.cpp
+92-0compiler-rt/test/instrumentor-tools/precision_fp16_overflow.c
+76-0compiler-rt/test/instrumentor-tools/precision_detailed.c
+67-0compiler-rt/lib/instrumentor-tools/fp-precision-analysis/CMakeLists.txt
+66-0compiler-rt/test/instrumentor-tools/precision_mixed.c
+56-0compiler-rt/test/instrumentor-tools/simple_precision.c
+960-04 files not shown
+1,012-210 files

LLVM/project 52f3126mlir/lib/Conversion/FuncToEmitC FuncToEmitC.cpp, mlir/test/Conversion/ConvertToEmitC func.mlir

[mlir][emitc]: use converted result types when func.call has one result (#205191)

The lowering for `func.call` to emitc properly uses converted result
types when there are multiple return values from the called func, but
not when there is a single one.
DeltaFile
+8-0mlir/test/Conversion/ConvertToEmitC/func.mlir
+3-3mlir/lib/Conversion/FuncToEmitC/FuncToEmitC.cpp
+11-32 files

LLVM/project f217862llvm/include/llvm/Transforms/IPO Instrumentor.h, llvm/lib/Transforms/IPO Instrumentor.cpp

[Instrumentor] Move common instruction IO functions into a class (#205460)

This commit moves several instruction-related IO functions into a class
instead of having them defined in the instrumentor namespace. We add the
BaseInstructionIO non-templated class because InstructionIO is a
templated class. Adding the common functions into InstructionIO would
force us to define them in the header.
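
A minimal sketch of the pattern described above, assuming a hypothetical
helper `describeOpcode` and an `unsigned` template parameter (the commit
message names neither the moved functions nor the template signature).
Everything is shown in one file here; in a real project the base-class
definition would live in a .cpp file.

```cpp
#include <string>

// Non-templated base class. Only the declaration needs to be in the header.
class BaseInstructionIO {
public:
  // Hypothetical shared helper, used here purely for illustration.
  static std::string describeOpcode(unsigned opcode);
};

// Out-of-line definition. Because the class is not a template, this can be
// compiled once in a source file instead of in every includer.
std::string BaseInstructionIO::describeOpcode(unsigned opcode) {
  return "opcode#" + std::to_string(opcode);
}

// Templated class. Its own members must stay visible in the header, but the
// shared helpers are inherited rather than defined per instantiation.
template <unsigned Opcode>
class InstructionIO : public BaseInstructionIO {
public:
  static std::string name() { return describeOpcode(Opcode); }
};
```

With this layout, `InstructionIO<7>::name()` calls into the single
compiled copy of `describeOpcode`.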
DeltaFile
+28-18llvm/include/llvm/Transforms/IPO/Instrumentor.h
+19-19llvm/lib/Transforms/IPO/Instrumentor.cpp
+47-372 files

LLVM/project f8bfc17llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeHelper.cpp

AMDGPU/GlobalISel: Fix get.rounding s_getreg lowering (#205601)

Use llvm.amdgcn.s.getreg instead of emitting S_GETREG_B32 directly so
instruction selection applies the required SReg_32 operand constraint.

This was done for setreg but missed for getreg.

Fixes https://github.com/llvm/llvm-project/pull/205265 when expensive
checks are enabled.
DeltaFile
+4-2llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+4-21 files

LLVM/project 15b1424compiler-rt/lib/instrumentor-tools instrumentor_runtime.h, compiler-rt/lib/instrumentor-tools/flop-counter flop_counter_runtime.cpp README.md

[Instrumentor] Add runtime examples: [1/N] A flop counter

This adds an instrumentor-tools folder into compiler RT to showcase
use cases of the instrumentor. The initial example is a program that,
via instrumentation, counts the number of flops performed. Call and
intrinsic support will follow after #198042.

Partially developed by Claude (AI), tested and verified by me.
DeltaFile
+293-0compiler-rt/lib/instrumentor-tools/instrumentor_runtime.h
+164-0compiler-rt/lib/instrumentor-tools/flop-counter/flop_counter_runtime.cpp
+77-0compiler-rt/lib/instrumentor-tools/flop-counter/README.md
+75-0compiler-rt/test/instrumentor-tools/lit.cfg.py
+67-0compiler-rt/lib/instrumentor-tools/flop-counter/CMakeLists.txt
+54-0compiler-rt/test/instrumentor-tools/CMakeLists.txt
+730-010 files not shown
+941-116 files

LLVM/project 2b4bbc1lldb/include/lldb/Breakpoint Breakpoint.h, lldb/source/Breakpoint Breakpoint.cpp

[lldb][NFC] Change type of Breakpoint's name list (#205429)

This is currently a `std::unordered_set<std::string>`. The downside of
this is that you need to have a `std::string` to perform a lookup of any
kind. This may require an allocation whenever we want to query the name
list. Even using `std::string_view` is not sufficient to perform a
lookup.

I propose that this instead be a `llvm::StringSet` which uses StringRefs
as its primary currency for insertions, lookups, and more.
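
To illustrate the lookup cost with the standard container only (this is
not lldb code, and `hasName` is a made-up helper): with the default hash
and equality, `find` takes a `const std::string &`, so a caller holding
a `std::string_view` has to build a temporary string first.

```cpp
#include <string>
#include <string_view>
#include <unordered_set>

// Hypothetical helper. The explicit std::string(name) is required because
// the default std::unordered_set<std::string> has no overload of find()
// that accepts a std::string_view. Constructing the temporary may allocate
// when the name does not fit in the small-string buffer.
bool hasName(const std::unordered_set<std::string> &names,
             std::string_view name) {
  return names.find(std::string(name)) != names.end();
}
```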

---------

Co-authored-by: Jonas Devlieghere <jonas at devlieghere.com>
DeltaFile
+7-6lldb/include/lldb/Breakpoint/Breakpoint.h
+3-3lldb/source/Breakpoint/Breakpoint.cpp
+10-92 files

LLVM/project 95e1a39flang/lib/Semantics check-omp-structure.cpp, flang/test/Semantics/OpenMP cancel.f90

Change wording of cancel directives
DeltaFile
+2-2flang/test/Semantics/OpenMP/cancel.f90
+1-1flang/lib/Semantics/check-omp-structure.cpp
+3-32 files

LLVM/project aedca64flang/lib/Semantics check-omp-structure.cpp, flang/test/Semantics/OpenMP cancel.f90

Merge branch 'users/kparzysz/c01-clause-check' into users/kparzysz/c02-directive-wording
DeltaFile
+2-2flang/test/Semantics/OpenMP/cancel.f90
+1-1flang/lib/Semantics/check-omp-structure.cpp
+3-32 files

LLVM/project 821d709llvm/include/llvm/Transforms/IPO Instrumentor.h, llvm/lib/Transforms/IPO Instrumentor.cpp

[Instrumentor] Add subtype IDs to complement type IDs for vectors/arrays

If the type of an argument passed to the instrumentation is a vector or
array, we still want to filter on the underlying type, and the
instrumentation might also need to know. Thus, we can now pass a subtype
ID, which is -1 except if it's a vector or array, then it's the element
type ID. Structs need to be handled differently.
DeltaFile
+89-9llvm/lib/Transforms/IPO/Instrumentor.cpp
+22-22llvm/test/Instrumentation/Instrumentor/default_rt.c
+39-0llvm/test/Instrumentation/Instrumentor/numeric_subtypeid.ll
+20-0llvm/include/llvm/Transforms/IPO/Instrumentor.h
+20-0llvm/test/Instrumentation/Instrumentor/default_config.json
+10-10llvm/test/Instrumentation/Instrumentor/module_and_globals.ll
+200-419 files not shown
+255-5315 files

LLVM/project 8770aecclang-tools-extra/clang-tidy/modernize TypeTraitsCheck.cpp, clang-tools-extra/docs ReleaseNotes.rst

[clang-tidy] Extend `modernize-type-traits` to fold remove_cv_t<remove_reference_t<...>> into remove_cvref_t (#204789)
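
The equivalence behind this fold can be checked with a few static
assertions. This is a sketch of the type-level identity only, not of the
check's matcher; the alias `NestedT` is introduced just for the example,
and the C++20 part is guarded so the snippet also compiles in older
language modes.

```cpp
#include <type_traits>

// The nested spelling, available since C++14.
template <typename T>
using NestedT = std::remove_cv_t<std::remove_reference_t<T>>;

static_assert(std::is_same<NestedT<const int &>, int>::value,
              "reference is stripped first, then const");
static_assert(std::is_same<NestedT<volatile long &&>, long>::value,
              "works for rvalue references and volatile too");

#ifdef __cpp_lib_remove_cvref
// The C++20 trait std::remove_cvref_t names the same type in one step.
static_assert(std::is_same<NestedT<const int &>,
                           std::remove_cvref_t<const int &>>::value,
              "both spellings agree");
#endif

// The order of the nested traits matters: a reference type has no
// top-level cv-qualifiers, so remove_cv_t on `const int &` is a no-op and
// the const survives once the reference is removed.
static_assert(
    std::is_same<std::remove_reference_t<std::remove_cv_t<const int &>>,
                 const int>::value,
    "swapped order keeps const");
```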
DeltaFile
+39-0clang-tools-extra/test/clang-tidy/checkers/modernize/type-traits-remove-cvref.cpp
+32-0clang-tools-extra/clang-tidy/modernize/TypeTraitsCheck.cpp
+8-0clang-tools-extra/docs/clang-tidy/checks/modernize/type-traits.rst
+4-0clang-tools-extra/docs/ReleaseNotes.rst
+83-04 files

LLVM/project 0e99161flang/lib/Semantics check-omp-structure.cpp, flang/test/Semantics/OpenMP cancel.f90

Remove unnecessary wording change
DeltaFile
+2-2flang/test/Semantics/OpenMP/cancel.f90
+1-1flang/lib/Semantics/check-omp-structure.cpp
+3-32 files

LLVM/project f3a1bfdllvm/lib/Transforms/Scalar LoopInterchange.cpp, llvm/test/Transforms/LoopInterchange pr57148.ll pr43326-ideal-access-pattern.ll

[LoopInterchange] Prevent the transformation stage from stopping partway
DeltaFile
+19-36llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+18-14llvm/test/Transforms/LoopInterchange/pr57148.ll
+12-14llvm/test/Transforms/LoopInterchange/pr43326-ideal-access-pattern.ll
+11-13llvm/test/Transforms/LoopInterchange/interchanged-loop-nest-3.ll
+11-9llvm/test/Transforms/LoopInterchange/guarded-inner-loop.ll
+6-9llvm/test/Transforms/LoopInterchange/transform-stop-partway.ll
+77-953 files not shown
+91-1039 files

LLVM/project c7221f4llvm/lib/Target/AMDGPU SIInstrInfo.cpp GCNHazardRecognizer.cpp, llvm/test/CodeGen/AMDGPU ds-latency-mode-default-scheduler.mir ds-latency-mode-attr.mir

[AMDGPU] Add DSLatencyMode flag + attr to control LDS latency

Change-Id: Ia5fee7983adffdb837ba2e876a64d805263b60c3
DeltaFile
+119-0llvm/test/CodeGen/AMDGPU/ds-latency-mode-default-scheduler.mir
+61-0llvm/test/CodeGen/AMDGPU/ds-latency-mode-attr.mir
+59-2llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+53-0llvm/test/CodeGen/AMDGPU/ds-latency-mode-branch-cost.mir
+29-0llvm/test/CodeGen/AMDGPU/ds-latency-mode-flag.mir
+10-13llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
+331-158 files not shown
+373-2714 files

LLVM/project 4a81134llvm/lib/Transforms/Scalar LoopInterchange.cpp, llvm/test/Transforms/LoopInterchange transform-stop-partway.ll

[LoopInterchange] Add test for IR modification stops partway
DeltaFile
+88-0llvm/test/Transforms/LoopInterchange/transform-stop-partway.ll
+4-1llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+92-12 files

LLVM/project bc2c2a9llvm/lib/Transforms/Scalar LoopInterchange.cpp

[LoopInterchange] Remove some early exits in transform phase (NFCI) (#205563)

This patch removes some unnecessary early exits from the transformation
phase in LoopInterchange. Some of them are simply removed because they
are trivially unsatisfiable. Others are replaced with assertions. These
conditions should be checked in the legality check phase, so it should
be safe to add those asserts.
DeltaFile
+6-10llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+6-101 files

LLVM/project a92d346clang/test/AST ast-dump-openmp-teams-distribute-parallel-for-simd.c ast-dump-openmp-teams-distribute-parallel-for.c

[OpenMP] Remove AST dump tests for non-variant clauses (#204493)

As was suggested during discussion of #200077, and supported by Johannes
in our discussion during his office hours today, this PR removes OpenMP
AST dump tests that do not test the `variant` clause. The full
motivation can be found in the description of the aforementioned PR, but
the short version is that they are a maintenance burden that holds off
improvements to `TextNodeDumper` for other parts of Clang, because they
match too many unrelated details.
DeltaFile
+0-2,193clang/test/AST/ast-dump-openmp-teams-distribute-parallel-for-simd.c
+0-2,193clang/test/AST/ast-dump-openmp-teams-distribute-parallel-for.c
+0-1,970clang/test/AST/ast-dump-openmp-target-teams-distribute-parallel-for-simd.c
+0-1,970clang/test/AST/ast-dump-openmp-target-teams-distribute-parallel-for.c
+0-1,233clang/test/AST/ast-dump-openmp-teams-distribute-simd.c
+0-1,233clang/test/AST/ast-dump-openmp-teams-distribute.c
+0-10,79242 files not shown
+0-18,70648 files

LLVM/project 8291161flang/test/Semantics/OpenMP atomic01.f90 combined-constructs.f90

[flang][OpenMP] Unify wording of directive names in diagnostics
DeltaFile
+63-63flang/test/Semantics/OpenMP/atomic01.f90
+27-27flang/test/Semantics/OpenMP/combined-constructs.f90
+27-27flang/test/Semantics/OpenMP/clause-validity01.f90
+24-24flang/test/Semantics/OpenMP/if-clause-45.f90
+14-14flang/test/Semantics/OpenMP/device-constructs.f90
+10-10flang/test/Semantics/OpenMP/atomic-compare.f90
+165-16542 files not shown
+251-25148 files

LLVM/project b3db32fflang/lib/Semantics check-omp-structure.cpp check-omp-structure.h, flang/test/Semantics/OpenMP clause-validity01.f90 cancel.f90

[flang][OpenMP] Move clause validity checks into OpenMP-specific code

The checks for syntactic properties of clauses (e.g. uniqueness, being
required, etc.) were originally handled by infrastructure common to
OpenMP and OpenACC. That infrastructure, however, is not fully equipped
to handle OpenMP needs: being unable to express version-based properties
or clause set properties being two prominent examples.

The first step towards fulfilling the OpenMP requirements is to
transfer the handling of clause validity checks into OpenMP-specific
code, which can then be modified without interfering with OpenACC.

In addition to that, this PR also changes the way that clauses on end-
directives are handled: first, a clause appearing on an end-directive
is checked to be allowed to appear on an end-directive, then all clauses
from the begin- and the end-directives are tested together. This unifies
checks for uniqueness of clauses that can appear in both places.
DeltaFile
+191-92flang/lib/Semantics/check-omp-structure.cpp
+14-6flang/lib/Semantics/check-omp-structure.h
+6-7flang/test/Semantics/OpenMP/clause-validity01.f90
+10-0llvm/include/llvm/Frontend/OpenMP/OMP.h
+6-0flang/lib/Semantics/check-omp-loop.cpp
+2-2flang/test/Semantics/OpenMP/cancel.f90
+229-1076 files not shown
+235-11212 files

LLVM/project 7920821llvm/include/llvm/IR IntrinsicsNVVM.td, llvm/lib/Target/NVPTX NVPTXLowerArgs.cpp

[NVPTX] Rewrite kernel signatures in param AS (#204192)

Rewrite the kernel signatures moving byval parameters directly into
entry parameter address space (similar to how ExpandVariadics handles
va_arg functions). This avoids the need for the somewhat hacky
nvvm_internal_addrspace_wrap intrinsic and enables better support for
parameter short pointers.
DeltaFile
+140-210llvm/lib/Target/NVPTX/NVPTXLowerArgs.cpp
+177-142llvm/test/CodeGen/NVPTX/lower-byval-args.ll
+43-47llvm/test/CodeGen/NVPTX/lower-args-gridconstant.ll
+45-0llvm/test/CodeGen/NVPTX/lower-byval-args-dbg.ll
+21-0llvm/test/CodeGen/NVPTX/lower-byval-args-idempotent.ll
+0-15llvm/include/llvm/IR/IntrinsicsNVVM.td
+426-4146 files not shown
+443-43112 files

LLVM/project b6649cfflang/lib/Lower/OpenMP OpenMP.cpp, mlir/lib/Dialect/OpenMP/IR OpenMPDialect.cpp

[flang][OpenMP] Lower target in_reduction for host fallback

Enable host-fallback lowering for target in_reduction in Flang and MLIR OpenMP translation.

Model target in_reduction through the matching map entry, force address-preserving implicit mapping for Flang in_reduction list items, and emit the host-side task-reduction lookup with __kmpc_task_reduction_get_th_data. The runtime entry point takes and returns a generic, default-address-space pointer, so normalize a non-default-address-space captured pointer to the generic address space before the call and cast the returned private pointer back to the map block argument's address space, mirroring the in_reduction handling on omp.taskloop. Unsupported device/offload-entry and richer reduction forms remain diagnosed.

Add Flang lowering, MLIR verifier/translation, and LLVM IR tests for the supported host-fallback path, including a non-default-address-space case, and the remaining unsupported cases.
DeltaFile
+131-14mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+95-21mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+110-3mlir/test/Target/LLVMIR/openmp-todo.mlir
+107-0mlir/test/Target/LLVMIR/openmp-target-in-reduction.mlir
+77-0mlir/test/Target/LLVMIR/openmp-target-in-reduction-multi.mlir
+60-15flang/lib/Lower/OpenMP/OpenMP.cpp
+580-5312 files not shown
+913-8218 files