[PowerPC] Hardcode LDAT/LWAT_CSNE constant immediate (#196115)
The FC field in LDAT/LWAT_CSNE instructions is always 16, so hardcode it
in the TableGen definition instead of passing it as an explicit operand.
Fix metadirective loop variant lowering
Preserve the associated DO evaluation when a dynamic metadirective can
select either a loop-associated directive or a standalone fallback, so
the fallback still lowers the original loop body.
Scope temporary loop-IV data-sharing attributes to the selected variant.
Use the selected variant's collapse clause to determine how many loop IVs
to mark, avoiding DSA state leaking between alternatives.
[flang][OpenMP] Support loop-associated metadirective variants (part 3)
Enable metadirective lowering for loop-associated variants such as
`do`, `simd`, `parallel do`, and `do simd`.
When a metadirective resolves to a loop-associated directive, the
sibling DO evaluation is spliced into the metadirective's evaluation
list so existing loop lowering finds it. Loop IV data-sharing
attributes are marked at lowering time since semantic analysis cannot
know which variant will be selected. The DataSharingProcessor is also
extended to handle spliced evaluations.
This patch is part of the feature work for #188820 and stacked on top
of #194424.
Assisted with copilot and GPT-5.4
[NFCI][msan] Add test case for llvm.fptoui.sat/llvm.fptosi.sat (#196416)
Forked from llvm/test/Instrumentation/MemorySanitizer/ftrunc.ll
PR #191365 lowered NEON fcvtz[us] intrinsics into fpto[us]i.sat,
exposing a gap in MSan's instrumentation. A follow-up patch will add
support in MSan for fpto[us]i.sat, propagating the shadow (similar to
its handling of fcvtz[us]) rather than strictly handling them.
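For reference, the saturating semantics of these intrinsics can be modeled in scalar C++. This is only a sketch of the value semantics for the f32 to i32 case (NaN maps to 0, out-of-range inputs clamp); it does not model MSan's shadow propagation, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Scalar model of llvm.fptoui.sat.i32.f32 (hypothetical helper name).
inline uint32_t fptouiSat32(float X) {
  if (std::isnan(X))
    return 0; // NaN converts to 0
  if (X <= 0.0f)
    return 0; // negative inputs clamp to the minimum (0)
  if (X >= 4294967296.0f) // 2^32, exactly representable as a float
    return std::numeric_limits<uint32_t>::max();
  return static_cast<uint32_t>(X); // in range: truncate toward zero
}

// Scalar model of llvm.fptosi.sat.i32.f32 (hypothetical helper name).
inline int32_t fptosiSat32(float X) {
  if (std::isnan(X))
    return 0; // NaN converts to 0
  if (X <= -2147483648.0f) // -2^31
    return std::numeric_limits<int32_t>::min();
  if (X >= 2147483648.0f) // 2^31
    return std::numeric_limits<int32_t>::max();
  return static_cast<int32_t>(X); // in range: truncate toward zero
}

// Usage example: in-range values truncate, out-of-range values clamp.
inline void fpToIntSatUsageExample() {
  assert(fptouiSat32(3.9f) == 3u);
  assert(fptosiSat32(-3.9f) == -3);
}
```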
[Instrumentor] Allow printing a runtime stub (#138978)
This commit extends the Instrumentor with the option
`configuration.runtime_stubs_file` to generate a runtime stub file with
the configured instrumentation. The stub prints all parameters passed to
each enabled instrumentation function.
[NewPM] Port for AArch64SLSHardening (#196378)
AArch64.h: Declared the AArch64SLSHardeningPass class.
AArch64PassRegistry.def: Registered the pass under the name
aarch64-sls-hardening.
AArch64SLSHardening.cpp: Implemented the run method to bridge the NewPM
with the existing pass logic, ensuring MachineModuleAnalysis is
correctly retrieved.
[clang][RISCV] Remove some of the bits added with RISC-V big endian support (#192903)
- FreeBSD will not have any new 32-bit archs
- *BSDs are unlikely to touch BE RISC-V
- Keep the BE and LE targets separate
[CIR] Implement weak ref and alias attribute handling (#195972)
This adds handling for globals with the WeakRefAttr (not emitted) or
AliasAttr attributes set. CIR already had support for function aliases,
but we weren't handling the explicit alias attribute, and we didn't have
any support for global variable aliases. This change adds the global
variable alias support and adds the code to handle the explicit
attribute for variables and functions.
Assisted-by: Cursor / claude-opus-4.7-thinking-xhigh
Fix PowerPC test failure from [AsmWriter] Change the output syntax of floating-point literals. (#196407)
The root cause of the failure is that the output syntax only outputs the
+/-snan syntax for ppc_fp128 if the trailing double is 0. The clang test
here is triggering -LDBL_SNAN, which is actually an fneg(snan constant),
and the fneg causes the signs of both doubles in the ppc_fp128 to flip.
As a result, the ppc_fp128 value is output only in hexadecimal format
rather than in the -snan format, necessitating a change to the test output.
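The sign flip can be illustrated with a small standalone model. This sketch assumes a ppc_fp128 value behaves like a pair of doubles and that "trailing double is 0" means a bitwise +0.0; the struct and helper names are hypothetical and are not the AsmWriter code. An ordinary leading value is used instead of a signaling NaN to keep the example portable.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical model of ppc_fp128 as a (leading, trailing) pair of doubles.
struct DoubleDouble {
  double Hi; // leading double
  double Lo; // trailing double
};

// Return the raw IEEE-754 bit pattern of a double.
inline uint64_t bitsOf(double D) {
  uint64_t Bits;
  std::memcpy(&Bits, &D, sizeof(Bits));
  return Bits;
}

// Negation flips the sign bit of both halves.
inline DoubleDouble negate(DoubleDouble V) { return {-V.Hi, -V.Lo}; }

// Assumed printing precondition: the trailing double is bitwise +0.0.
inline bool trailingIsPositiveZero(DoubleDouble V) { return bitsOf(V.Lo) == 0; }

// Usage example: the precondition holds before negation but not after it.
inline void doubleDoubleUsageExample() {
  DoubleDouble V{1.0, 0.0};
  assert(trailingIsPositiveZero(V));
  assert(!trailingIsPositiveZero(negate(V)));
}
```

Under these assumptions, negating a value whose trailing double is +0.0 produces a trailing -0.0, whose bit pattern is non-zero, so a check for a zero trailing half no longer matches.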
[VPlan] Directly use masks on recipes in dropPoisonGen (#193978)
dropPoisonGeneratingRecipes currently uses convoluted and incorrect
logic to determine whether a recipe is masked. Use the masks that are
set on the recipes directly instead.
[AMDGPU] Fix LowerDIVREM24 lowering for the unsigned case
The code was not properly checking that the operands were
24-bit integers for the unsigned case.
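As a rough illustration of the missing precondition, the sketch below checks at runtime that both operands fit in 24 bits before a 24-bit unsigned division path would be valid. The helper names are hypothetical, and the actual lowering presumably reasons about compile-time information on the operands rather than concrete values, which this sketch does not model.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true when an unsigned 32-bit value fits in 24 bits,
// i.e. its top 8 bits are all zero.
inline bool fitsInUnsigned24(uint32_t V) { return (V >> 24) == 0; }

// Both operands must fit in 24 bits for the unsigned 24-bit path to apply.
inline bool canUseUnsignedDivRem24(uint32_t LHS, uint32_t RHS) {
  return fitsInUnsigned24(LHS) && fitsInUnsigned24(RHS);
}

// Usage example: the largest 24-bit value qualifies, 2^24 does not.
inline void divRem24UsageExample() {
  assert(canUseUnsignedDivRem24(0x00FFFFFFu, 3u));
  assert(!canUseUnsignedDivRem24(0x01000000u, 3u));
}
```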
[VPlan] Fold removeRedundantCanonicalIVs into replaceWideCanIVs. (#195545)
Unify logic to replace VPWidenCanonicalIVRecipes with scalar-steps or
VPWidenIntOrFpInductionRecipe in replaceWideCanonicalIVWithWideIV. This
reduces the code a bit, and running later has the benefit that we do the
replacement after the wide mask compare has been replaced by
active-lane-mask/EVL. This means we do not need to drop wrap flags in
some cases, as the wide IV is not used for the mask.
PR: https://github.com/llvm/llvm-project/pull/195545
Add llvm-extract-bundle-entry to extend llvm-objcopy (#169386)
This commit creates llvm-extract-bundle-entry as a wrapper to
llvm-objcopy, to allow extracting HIP offload fatbin bundles given a URI
argument.
---------
Co-authored-by: dsalinas_amdeng <david.salinas at amd.com>
[HLSL] Add type traits for ConstantBuffers templates
This commit adds the type traits to restrict the template type in a
ConstantBuffer to structs or classes that do not contain a resource
type.
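The kind of restriction described here can be sketched in standard C++. Everything below is a hypothetical illustration: `Resource`, `ContainsResource`, `IsValidConstantBufferType`, and `ConstantBufferSketch` are invented names, and the "contains a resource" check is an opt-in specialization because standard C++ cannot inspect members generically. It is not the HLSL implementation.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for an HLSL resource type.
struct Resource {};

// Hypothetical trait: specialized to true_type for types holding a resource.
template <typename T> struct ContainsResource : std::false_type {};

struct PlainData {
  float Scale;
  int Count;
};

struct WithResource {
  Resource Tex;
  float Scale;
};
template <> struct ContainsResource<WithResource> : std::true_type {};

// Valid element types are structs or classes that contain no resource.
template <typename T>
struct IsValidConstantBufferType
    : std::integral_constant<bool, std::is_class<T>::value &&
                                       !ContainsResource<T>::value> {};

// Sketch of a template that rejects invalid element types at compile time.
template <typename T> struct ConstantBufferSketch {
  static_assert(IsValidConstantBufferType<T>::value,
                "element type must be a struct or class without resources");
  T Data;
};

// Usage example: a plain struct is accepted by the sketch.
inline void constantBufferUsageExample() {
  ConstantBufferSketch<PlainData> Buffer{{2.0f, 4}};
  assert(Buffer.Data.Count == 4);
}
```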
Assisted-by: Gemini
[SystemZ] Fix internal error with single-element vector types (#196127)
The special treatment of single-element 128-bit vector types in
SystemZTargetLowering::getRegisterTypeForCallingConv is not appropriate
if vector types are not supported, and can lead to internal compiler
errors.
Fixes: https://github.com/llvm/llvm-project/issues/194256
(cherry picked from commit 48346f2352eaf25373e1a6204c0c7f9fdce92a85)
[lldb-dap] Fix crash in source request handler (#195847)
Check optional argument source has a value before getting the source
reference.
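The pattern of the fix, checking an optional before dereferencing it, can be sketched as follows. The structs are simplified, hypothetical stand-ins for the request arguments, and the fallback to a top-level reference is an assumption for illustration rather than the handler's confirmed behavior.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Simplified, hypothetical stand-ins for the request argument types.
struct Source {
  std::optional<int64_t> sourceReference;
};

struct SourceArguments {
  std::optional<Source> source; // may be absent in a request
  int64_t sourceReference = 0;  // assumed fallback value
};

// Read the reference from `source` only when it is present; dereferencing an
// empty optional is undefined behavior.
inline int64_t resolveSourceReference(const SourceArguments &Args) {
  if (Args.source.has_value() && Args.source->sourceReference.has_value())
    return *Args.source->sourceReference;
  return Args.sourceReference;
}

// Usage example: an absent `source` falls back to the top-level value.
inline void sourceRequestUsageExample() {
  SourceArguments Args;
  Args.sourceReference = 7;
  assert(resolveSourceReference(Args) == 7);
}
```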
(cherry picked from commit fa8724beccad53be2d39d065be5db11917f94bac)
[CIR] Lower cir.construct_catch_param on Itanium (#195904)
Implement Itanium-ABI lowering of the `cir.construct_catch_param`
operation. This operation encapsulates the target-specific work that
must happen before `__cxa_begin_catch` to bind an in-flight exception
object to a non-trivially-copyable catch parameter.
In order to allow the full copy-constructor call generation handling,
including call site attribute generation, to be reused during codegen,
we will be generating a thunk function to perform the copy construction
when it is needed. This function gets inlined during EHABI lowering.
This allows us to generate a target-independent representation during
the initial CIR code generation without having to duplicate the copy
construction logic in the EHABI lowering pass.
The actual generation of the thunk function and the
construct_catch_param operation will be added in a follow-up change.
Assisted-by: Cursor / claude-opus-4.7-thinking-xhigh
[DAG] Use UndefPoisonKind enum in isGuaranteedNotToBeUndefOrPoison/canCreateUndefOrPoison/getFreeze (#196145)
Replace the PoisonOnly flag and allow discrimination between
undef/poison values - to more closely match ValueTracking / GISel
implementations.
This patch is mainly a drop-in replacement for the PoisonOnly logic, and
hasn't added anything to match UndefOnly logic (e.g. for SelfMultiply
patterns) - we can improve upon this later on with proper test coverage.
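A minimal standalone sketch of replacing a boolean flag with such an enum is shown below. The enumerator names and bit values are assumptions for illustration, as are the helper functions; the sketch is not the SelectionDAG code.

```cpp
#include <cassert>

// Assumed shape of the enum: one bit per kind, plus the combination.
enum class UndefPoisonKind {
  PoisonOnly = 1 << 0,
  UndefOnly = 1 << 1,
  UndefOrPoison = PoisonOnly | UndefOnly,
};

// Hypothetical helper: does the query cover poison values?
inline bool includesPoison(UndefPoisonKind Kind) {
  return (static_cast<unsigned>(Kind) &
          static_cast<unsigned>(UndefPoisonKind::PoisonOnly)) != 0;
}

// Hypothetical helper: does the query cover undef values?
inline bool includesUndef(UndefPoisonKind Kind) {
  return (static_cast<unsigned>(Kind) &
          static_cast<unsigned>(UndefPoisonKind::UndefOnly)) != 0;
}

// Drop-in mapping from the old boolean flag: `true` meant poison only,
// `false` meant both undef and poison.
inline UndefPoisonKind fromPoisonOnlyFlag(bool PoisonOnly) {
  return PoisonOnly ? UndefPoisonKind::PoisonOnly
                    : UndefPoisonKind::UndefOrPoison;
}

// Usage example: the old `PoisonOnly = false` query covers both kinds.
inline void undefPoisonKindUsageExample() {
  UndefPoisonKind Kind = fromPoisonOnlyFlag(false);
  assert(includesPoison(Kind) && includesUndef(Kind));
}
```

Unlike the boolean, the enum can also express an undef-only query, which the old flag had no way to encode.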
Fixes #194818