[MLIR] Add missing dialects to C API (#82190)
We are building a Julia wrapper for MLIR at
https://github.com/JuliaLabs/MLIR.jl, but some dialects are missing from
`libMLIR-C`. This PR adds them.
[AMDGPU] Improve llvm.amdgcn.wave.shuffle handling for pre-GFX8 (#174845)
Previously, GlobalISel reported success when lowering the intrinsic
for GFX7 and earlier, even though the required ds_bpermute_b32
instruction is not supported on those targets. With this change,
GlobalISel properly reports a failure to select in this case. Tests are
updated accordingly.
Signed-off-by: Domenic Nutile <domenic.nutile at gmail.com>
[VPlan] Add specialized VPValue subclasses for different types (NFC) (#172758)
This patch adds VPValue sub-classes for the different cases we currently
have:
* VPIRValue: A live-in VPValue that wraps an underlying IR value
* VPSymbolicValue: A symbolic VPValue not tied to an underlying value,
e.g. the vector trip count or VF VPValues
* VPRecipeValue: A VPValue defined by a VPDef/VPRecipeBase.
This has multiple benefits:
* clearer constructors for each kind of VPValue
* limited scope: for example, the VPDef member can move to VPRecipeValue,
reducing the size of the other VPValues.
* stricter type checking for member variables (e.g. using VPIRValue in
the Value -> live-in map in VPlan, or using VPSymbolicValue for symbolic
member VPValues)
There probably are additional opportunities for cleanups as follow-ups.
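The split described above can be sketched as a small class hierarchy. This
is a minimal, self-contained illustration, not the actual VPlan code:
`IRValue`, `Recipe`, the `Kind` tag, `kindName`, and all member names are
hypothetical stand-ins. Only the three subclass names come from this
description.

```cpp
#include <cassert>
#include <string>

// Hypothetical, heavily simplified stand-ins; not the real LLVM classes.
struct IRValue { std::string Name; };  // stand-in for an underlying IR value
struct Recipe { std::string Opcode; }; // stand-in for VPDef/VPRecipeBase

class VPValue {
public:
  enum class Kind { IRValue, Symbolic, RecipeValue };
  Kind getKind() const { return K; }
  virtual ~VPValue() = default;

protected:
  // Only subclasses can construct a VPValue, so every value has a kind.
  explicit VPValue(Kind K) : K(K) {}

private:
  Kind K;
};

// A live-in that wraps an underlying IR value.
class VPIRValue : public VPValue {
  const IRValue *Underlying;

public:
  explicit VPIRValue(const IRValue *V)
      : VPValue(Kind::IRValue), Underlying(V) {}
  const IRValue *getValue() const { return Underlying; }
};

// A symbolic value with no underlying IR value (e.g. the VF).
class VPSymbolicValue : public VPValue {
public:
  VPSymbolicValue() : VPValue(Kind::Symbolic) {}
};

// A value defined by a recipe. Only this subclass stores the defining
// recipe, so the other kinds do not pay for that member.
class VPRecipeValue : public VPValue {
  Recipe *Def;

public:
  explicit VPRecipeValue(Recipe *R) : VPValue(Kind::RecipeValue), Def(R) {}
  Recipe *getDefiningRecipe() const { return Def; }
};

// Dispatch on the kind tag.
inline const char *kindName(const VPValue &V) {
  switch (V.getKind()) {
  case VPValue::Kind::IRValue:
    return "live-in";
  case VPValue::Kind::Symbolic:
    return "symbolic";
  case VPValue::Kind::RecipeValue:
    return "recipe-defined";
  }
  return "unknown";
}
```

In this sketch, a map from IR values to live-ins could be typed on
`VPIRValue *` rather than the base class, which is the kind of stricter
member typing listed above.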
PR: https://github.com/llvm/llvm-project/pull/172758
[LV] Add tests for argmin/argmax with epilogue vectorization. (NFC)
Extend test coverage for vectorizing argmin/argmax with epilogue
vectorization.
[LLVM][NVPTX] Mark ldmatrix/stmatrix intrinsics convergent (#174669)
NVVM ldmatrix and stmatrix intrinsics map to corresponding PTX
instructions that have .sync.aligned behavior. Mark these intrinsics
as convergent to prevent control flow transformations that can break
these semantics. This is similar to other .sync.aligned intrinsics.
[llvm] Bypass sandbox for `getMainExecutable()` (#174816)
Getting the executable path is a fairly common operation in LLVM tools
that doesn't affect their outputs. Allow calling it under the sandbox.
[DirectX] Specify NegZero as signed (#174840)
#171456 set `ImplicitTrunc` to false by default, so the `NegZero` value
was no longer created as a signed integer.
This caused a crash during `DXILIntrinsicExpansion` similar to the one
reported here:
https://github.com/llvm/llvm-project/pull/171456#issuecomment-3718690088.
This change stops the test case from crashing in the DirectX backend by
explicitly specifying the value as a signed integer.
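Why signedness matters once implicit truncation is off can be sketched with
a small model of a width check. This is an illustration under assumptions,
not LLVM's implementation: `fitsUnsigned`, `fitsSigned`, and
`acceptsConstant` are hypothetical helpers. The source does not show how
`NegZero` is computed, so the usage note below picks a value purely for
illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a width check on integer constants.
// Both helpers assume 1 <= Bits.

// True if Val, read as unsigned, fits in Bits bits.
inline bool fitsUnsigned(uint64_t Val, unsigned Bits) {
  return Bits >= 64 || (Val >> Bits) == 0;
}

// True if Val, read as a two's-complement signed value, fits in Bits bits.
inline bool fitsSigned(uint64_t Val, unsigned Bits) {
  if (Bits >= 64)
    return true;
  const int64_t S = static_cast<int64_t>(Val);
  const int64_t Min = -(int64_t(1) << (Bits - 1));
  const int64_t Max = (int64_t(1) << (Bits - 1)) - 1;
  return S >= Min && S <= Max;
}

// With implicit truncation disabled, a value is accepted only if it fits
// under the signedness the caller declares.
inline bool acceptsConstant(uint64_t Val, unsigned Bits, bool IsSigned) {
  return IsSigned ? fitsSigned(Val, Bits) : fitsUnsigned(Val, Bits);
}
```

For example, a negative 32-bit quantity that has been sign-extended to 64
bits has its upper bits set. In this model,
`acceptsConstant(V, 32, /*IsSigned=*/false)` rejects it, while the same call
with `IsSigned` set to true accepts it.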
InstCombine: Handle fadd in SimplifyDemandedFPClass
Note that some of the tests currently fail with alive, but not
due to this patch. The failures occur when performing the
fadd x, 0 -> x simplification in functions with non-IEEE denormal
handling. The existing instsimplify ignores the denormals-are-zero
hazard by checking cannotBeNegativeZero instead of
isKnownNeverLogicalZero.
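One way to see the hazard is to model denormals-are-zero input handling
directly. The sketch below is an assumption-laden illustration, not LLVM
code: `flushInput` and `faddZeroDAZ` are hypothetical helpers, and the
flushing rule (a subnormal input becomes a zero of the same sign) is the
assumed model. It relies on default IEEE round-to-nearest arithmetic and no
fast-math flags.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical model of denormals-are-zero input handling: a subnormal
// input is replaced by a zero of the same sign before the operation.
inline double flushInput(double X) {
  return std::fpclassify(X) == FP_SUBNORMAL ? std::copysign(0.0, X) : X;
}

// Models "fadd x, +0.0" when inputs are flushed as above.
inline double faddZeroDAZ(double X) {
  // volatile discourages the compiler from folding the addition away.
  volatile double Flushed = flushInput(X);
  return Flushed + 0.0;
}
```

Take `X = -std::numeric_limits<double>::denorm_min()`. As an IEEE value, `X`
is not negative zero, so a check that only rules out negative zero passes.
Under the modeled flushing it behaves as -0.0, and -0.0 + 0.0 is +0.0, so
`faddZeroDAZ(X)` has a clear sign bit while `X` has it set. Replacing the
add with `X` would change the sign of the result.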
Also note that the self handling doesn't really do anything yet, other
than propagate consistent known-fpclass information, until there is
multiple-use support.
This also leaves the original ValueTracking support in place, without
switching to the new KnownFPClass::fadd utility. This will be easier
to clean up after the subsequent fsub support patch.