[AMDGPU] Add test for v_fmamk_f16/v_fmaak_f16 in real-true16. NFC (#173307)
This test demonstrates a bug in real true16 mode: there is no
allocatable 16-bit VGPR class that these instructions can use, and they
have no VOP3 forms that would let the allocatable VGPR_16 class be used.
To use these instructions, 'VGPR_16_Lo128' must be allocatable.
[AMDGPU] Add optimization for llvm.amdgcn.wave.shuffle in uniform cases (#174795)
When the llvm.amdgcn.wave.shuffle intrinsic is called with a uniform
Index operand, it is effectively the same as the llvm.amdgcn.readlane
intrinsic. This change detects that case and replaces the call with the
readlane intrinsic.
---------
Signed-off-by: Domenic Nutile <domenic.nutile at gmail.com>
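The equivalence this change relies on can be illustrated with a toy model of a wave. This is only a sketch: `wave_shuffle`, `readlane`, and `is_uniform` are hypothetical Python stand-ins, not the intrinsics themselves. They model a wave as a list with one value per lane and ignore details such as out-of-range indices and inactive lanes.

```python
def wave_shuffle(values, index):
    """Toy model: lane i reads the value held by lane index[i]."""
    return [values[index[i]] for i in range(len(values))]


def readlane(values, lane):
    """Toy model: a single scalar, the value held by the given lane."""
    return values[lane]


def is_uniform(index):
    """True when every lane supplies the same index."""
    return all(i == index[0] for i in index)


values = [10, 11, 12, 13]      # one value per lane
uniform_index = [2, 2, 2, 2]   # every lane asks for lane 2

shuffled = wave_shuffle(values, uniform_index)

# With a uniform index, every lane ends up holding the one readlane result.
assert is_uniform(uniform_index)
assert shuffled == [readlane(values, 2)] * len(values)
```

In this model the shuffle with a uniform index does no per-lane work that a single lane read would not cover, which is the situation the commit message describes.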
Revert "[OpenCL] Set KHR extensions minimum version to OpenCL 1.0" (#175993)
Reverts llvm/llvm-project#175120
llvm-project/amd/device-libs/opencl/src/image/imwrap.cl:461:1: error: no
matching function for call to 'get_image_height'
461 | GD2GEN(_2Dad)
| ^~~~~~~~~~~~~
llvm-project/amd/device-libs/opencl/src/image/imwrap.cl:460:1: error: no
matching function for call to 'get_image_width'
460 | GD2GEN(_2Dd)
| ^~~~~~~~~~~~
"Depth images are required with other image support for OpenCL 2.0."
InstCombine: Implement SimplifyDemandedFPClass for fma
This can't do much filtering on the sources, except for nans.
We can also attempt to introduce ninf/nnan.
ValueTracking: Improve nan tracking for fma square special case
In the square multiply case, we can infer if the add of opposite
sign infinities can occur.
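The reasoning for the square case can be sketched as a small conservative query. This is an illustration under stated assumptions, not the ValueTracking code: the function name and its boolean "may be" flags are hypothetical, and the model assumes the fused semantics of fma, where the product `x * x` is exact before the add.

```python
import math


def fma_square_may_be_nan(x_may_be_nan, x_may_be_inf,
                          z_may_be_nan, z_may_be_neg_inf):
    """Conservative answer to: can fma(x, x, z) produce NaN? (hypothetical helper)"""
    # NaN inputs propagate to the result.
    if x_may_be_nan or z_may_be_nan:
        return True
    # x * x is never negative, and it cannot be 0 * inf because both
    # factors are the same value. The only remaining NaN source is
    # (+inf) + (-inf), which needs x infinite and z equal to -inf.
    return x_may_be_inf and z_may_be_neg_inf


# Spot checks with real infinities. A plain multiply-add is enough here
# because inf * inf is exactly +inf, fused or not.
inf = math.inf
assert math.isnan(inf * inf + (-inf))        # opposite-sign infinities
assert (-inf) * (-inf) + inf == inf          # same-sign infinities are fine
```

For example, if `z` is known never to be -inf and neither input can be NaN, the query returns `False` even when `x` may be infinite.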
ValueTracking: Improve handling for fma/fmuladd
The handling for fma was very basic and only handled the
repeated input case. Re-use the fmul and fadd handling for more
accurate sign bit and nan handling.
InstCombine: Improve SimplifyDemandedFPClass min/max handling
Refine handling of minimum/maximum and minimumnum/maximumnum. The
previous folds to an input were based on sign bit checks, which was too
conservative with zeros. This can now treat -0 as less than or equal
to +0 as appropriate and account for nsz. It can additionally handle
cases where one operand is known positive normal and the other subnormal.
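The zero ordering and the normal/subnormal case can be checked with a small model. This is a sketch, assuming IEEE 754-2019 style minimum/maximum semantics (NaN-propagating, with -0 ordered below +0); `ieee_maximum` and `ieee_minimum` are hypothetical helpers written for this illustration, not LLVM APIs.

```python
import math
import sys


def ieee_maximum(a, b):
    """Model of an IEEE 754-2019 style maximum: NaN-propagating, -0 < +0."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    if a == b:
        # Equal values include the +0/-0 pair; prefer the positive sign.
        return a if math.copysign(1.0, a) > 0 else b
    return a if a > b else b


def ieee_minimum(a, b):
    """Model of an IEEE 754-2019 style minimum: NaN-propagating, -0 < +0."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    if a == b:
        # Equal values include the +0/-0 pair; prefer the negative sign.
        return a if math.copysign(1.0, a) < 0 else b
    return a if a < b else b


smallest_normal = sys.float_info.min   # smallest positive normal double
subnormal = 5e-324                     # smallest positive subnormal double

# Zeros are ordered: maximum picks +0 and minimum picks -0.
assert math.copysign(1.0, ieee_maximum(-0.0, 0.0)) == 1.0
assert math.copysign(1.0, ieee_minimum(0.0, -0.0)) == -1.0

# Any positive normal value is greater than any subnormal value, so the
# maximum of the two can only be the normal operand.
assert ieee_maximum(smallest_normal, subnormal) == smallest_normal
```

Under this model, a maximum whose left operand is known to be +0 or positive and whose right operand is known to be -0 or negative always yields the left operand, which is the kind of fold a pure sign-bit check has to be careful about when zeros are involved.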
InstCombine: Add more tests for min/max SimplifyDemandedFPClass (#175381)
Test some more refined cases, such as ordering with 0s and within
known positive and known negative cases.
[flang][OpenMP] Fix LINEAR clause validation to report all errors (#175938)
Fixes #175688
After #175383 was merged, test failures occurred because removing the
early return exposed additional errors that tests weren't expecting.
This PR fixes the issue with the following changes:
1. **Removes the early return** in check-omp-loop.cpp (line 767) after
detecting a modifier error on DO/SIMD directives. Previously, when a
modifier error was found, the function would return immediately without
checking other restrictions like the scalar requirement. Now all
applicable errors are reported, improving diagnostics.
2. **Updates linear-clause01.f90** to expect both the modifier error AND
the scalar error for Case 1 and Case 2, where arrays are used
incorrectly in LINEAR clauses.
[8 lines not shown]
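The effect of dropping the early return can be sketched in a few lines. This is a hypothetical Python model, not the code in check-omp-loop.cpp: the function name, its parameters, and the diagnostic strings are invented for illustration.

```python
def check_linear_clause(on_do_or_simd, has_modifier, is_scalar):
    """Return every diagnostic that applies, not only the first one."""
    errors = []
    if on_do_or_simd and has_modifier:
        errors.append("modifier error")   # hypothetical message text
        # An early `return errors` here would hide the scalar error below.
    if not is_scalar:
        errors.append("scalar error")     # hypothetical message text
    return errors


# An array used with a modifier on a DO/SIMD directive now gets both errors.
assert check_linear_clause(True, True, False) == ["modifier error",
                                                  "scalar error"]
```

Tests written against the early-return behavior expect only the first diagnostic, which is why their expectations have to be updated once both are reported.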
[flang][NFC] Converted five tests from old lowering to new lowering (part 6) (#175485)
Modified the following tests: array-elemental-calls-2.f90,
array-expression-assumed-size.f90, array-temp.f90,
array-user-def-assignments.f90, array.f90
[NFC][IRBuilder] Reuse CreateGEP for helpers (#175979)
Many helper functions for single-index GEPs exist, but each implements the
same logic to create the GetElementPtrInst. Refactor them to call a
single function.
This is some groundwork to prepare the SGEP implementation.
[clang][ssaf][docs] Document the Summary Extraction pipeline (#172876)
This patch adds some documentation about the design of the Scalable
Static Analysis Framework (SSAF) Summary Extraction part.
This mainly focuses on how the custom FrontendAction would load
different analyses (their extraction part), and the different formats it
should export into.
Each FrontendAction call would process a single TU by extracting
summaries from it and serializing the results into a file in the
desired format.
The details are not polished yet, but I think it's still beneficial to
have some guidance on how the upcoming components would fit together,
hence this document.
I'll come back to this document to keep it up-to-date as we proceed with
the upstreaming.
[NFC][PowerPC] add test cases for millicode (#175559)
In this PR, we do the following:
1. Simplify the test case for the millicode function `___memmove`.
2. Add test cases for the millicode functions `___memcpy`,
`___memset`, `___memmove` which are supported in the patch
https://reviews.llvm.org/D143997.
3. Add pre-commit test cases for the functions `___strstr`,
`___memccpy`, `___strcmp`