[SYCL] Correct emission status reporting for function templates declared with SYCL attributes. (#185522)
Commit cf6cc662eeee2b1416430f517850be9032788e39 ([OpenMP][SYCL] Improve
diagnosing of unsupported types usage) customized
`Sema::getEmissionStatus()` to return `Emitted` for a function declared
with the `sycl_kernel` attribute during device compilation. That change
is appropriate, but was inserted before a check for a dependent context
and resulted in `Emitted` being returned instead of `TemplateDiscarded`
for templated functions declared with the attribute. That appears to be
incorrect; templated functions are still discarded.
The customization was extended to include the `sycl_kernel_entry_point`
and `sycl_external` attributes in commit
23e4fe040b67e2dd419652830a87093a93ea1a97 ([SYCL] SYCL host kernel launch
support for the sycl_kernel_entry_point attribute). Those additions are
appropriate, but the effect on templated functions (as opposed to their
instantiations) resulted in the incorrect status being observed in a
downstream fork of Clang.
This change corrects `Sema::getEmissionStatus()` to once again
unconditionally return `TemplateDiscarded` for templated functions.
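The ordering problem described above can be illustrated with a toy model. This is only a sketch, not Clang's code: `EmissionStatus`, `FunctionInfo`, and this `getEmissionStatus` are simplified, hypothetical stand-ins, and the only thing assumed from the message is that the dependent-context check has to run before the SYCL attribute check.

```cpp
#include <cassert>

// Hypothetical, reduced set of states; the real enum in Clang has more.
enum class EmissionStatus { Emitted, TemplateDiscarded, Unknown };

// Hypothetical summary of the two properties the message talks about.
struct FunctionInfo {
  bool isTemplated;      // declared in a dependent context
  bool hasSyclAttribute; // sycl_kernel, sycl_kernel_entry_point, or sycl_external
};

EmissionStatus getEmissionStatus(const FunctionInfo &fn,
                                 bool isDeviceCompilation) {
  // Templated functions are discarded no matter which attributes they carry,
  // so this check comes first.
  if (fn.isTemplated)
    return EmissionStatus::TemplateDiscarded;
  // Assumption for this sketch: the attribute only forces emission on device.
  if (isDeviceCompilation && fn.hasSyclAttribute)
    return EmissionStatus::Emitted;
  return EmissionStatus::Unknown;
}

void demo() {
  // A template with the attribute is still discarded ...
  assert(getEmissionStatus({true, true}, true) ==
         EmissionStatus::TemplateDiscarded);
  // ... while a non-templated function with the attribute is emitted.
  assert(getEmissionStatus({false, true}, true) == EmissionStatus::Emitted);
}
```

Swapping the two `if` statements in the sketch reproduces the reported behavior, where `Emitted` is returned for the template itself.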
[lldb][test] Fix TestLongjmp on Linux (#185464)
Patch fixes llvm.org/pr20231.
The original test expected clock() to return 0 while stepping in the
debugger, which can never happen in practice.
[SPIR-V] Make SPIRVModuleAnalysis::MAI a non static member (#160956)
Otherwise multiple translation units in the same process could run into
ID reuse collisions, causing invalid SPIR-V to be generated because of
multiple definitions of the same SPIR-V SSA value.
Closes: https://github.com/llvm/llvm-project/issues/160613
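As a rough illustration of why a static member is a problem here, the sketch below shows state leaking between two analysis objects when it is stored in a static member, and staying isolated when it is a regular member. All names (`ModuleInfo`, `SharedStateAnalysis`, `PerInstanceAnalysis`, `idFor`) are hypothetical and the ID scheme is invented for the example; it does not reproduce the actual SPIR-V backend data structures.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical per-module bookkeeping: maps a value name to an ID.
struct ModuleInfo {
  std::map<std::string, unsigned> ids;

  unsigned idFor(const std::string &name) {
    auto it = ids.find(name);
    if (it != ids.end())
      return it->second; // reuse an ID that is already recorded
    unsigned id = static_cast<unsigned>(ids.size()) + 1;
    ids.emplace(name, id);
    return id;
  }
};

// Static member: every analysis object in the process shares one ModuleInfo.
struct SharedStateAnalysis {
  static ModuleInfo info;
};
ModuleInfo SharedStateAnalysis::info;

// Non-static member: each analysis object owns its own ModuleInfo.
struct PerInstanceAnalysis {
  ModuleInfo info;
};

void demo() {
  SharedStateAnalysis first, second;
  first.info.idFor("x");
  assert(second.info.ids.size() == 1); // state from `first` is visible

  PerInstanceAnalysis a, b;
  a.info.idFor("x");
  assert(b.info.ids.empty()); // `b` starts from a clean state
}
```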
[AArch64] Remove dangling function declaration in AArch64PointerAuth (#185439)
Function `checkAuthenticatedLR` was declared but not defined anywhere.
This patch removes the dangling declaration.
[flang][NFC] Converted five tests from old lowering to new lowering (part 28) (#185549)
Tests converted from test/Lower/Intrinsics: dreal.f90, dshiftl.f90,
dshiftr.f90, eoshift.f90, erfc_scaled.f90
[LLVM][CodeGen][SVE] Implement isel for maximumnum/minimumnum. (#185074)
Patch to add custom lowering for FCANONICALIZE, FMAXNUM_IEEE, and
FMINNUM_IEEE, all of which are required when relying on default
expansion of FMAXIMUMNUM and FMINIMUMNUM.
The lowering is very simple because AArch64's FMAXNM and FMINNM
instructions are IEEE754-2008 compliant, so the implementation
effectively follows the same path taken for NEON.
NOTE: Bfloat support will be provided separately.
[lldb][test] PlatformDarwinTest.cpp: add full error message to expected assertion
I'm about to reword the error message. Having test coverage for the
message will make that change easier to review/reason about.
[Clang] Restrict AMDGCN image built-ins (#180949)
Introduced validation for the `dmask` argument of the aforementioned
built-ins to match LLVM IR verifier behavior that is being changed in
llvm/llvm-project#179511.
[SystemZ] Add a SystemZ specific pre-RA scheduling strategy. (#135076)
This is a relatively simple strategy as it is omitting any heuristics for
liveness and register pressure reduction. This works well as the SystemZ ISel
scheduler is using Sched::RegPressure which gives a good input order to begin
with.
It is trying harder with biasing phys regs than GenericScheduler as it also
considers other instructions such as immediate loads directly into phys-regs
produced by the register coalescer. This can hopefully be refactored into
MachineScheduler.cpp.
It has a latency heuristic that is slightly different from the one in
GenericScheduler: it is activated for a specific type of region that has
many "data sequences" consisting of SUs connected only by a single
data-edge that are next to each other in the input order. This is only 3% of
all the scheduling regions, but when activated it is applied on all the
candidates (not just once per cycle). At the same time it is a bit more
careful by checking not only the SU Height against the scheduled latency but
[22 lines not shown]
[AMDGPU] Poison invalid globals after emitting error in LowerBufferFatPointers pass (#184662)
After the change from `report_fatal_error` to `Ctx.emitError` in #142014,
unsupported globals need to be removed. Otherwise a secondary crash
occurs during ISel when processing them.
Fixes SWDEV-511241
[TableGen] Do not order register classes based on heap addresses (#185644)
Compare registers using their enum values instead, which I suspect was
the intention in the first place, since we already have lexicographical
ordering defined for CodeGenRegisters.
This does not cause any changes in .inc files and is likely NFC, but
it's still best to have it be deterministic.
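The idea can be sketched in isolation as follows. This is a minimal illustration, not the TableGen code: `Reg`, `lessByEnumValue`, and `sortedByEnum` are hypothetical names, and the only assumption taken from the message is that each register has a stable enum value to compare on.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a register record with a stable enum value.
struct Reg {
  unsigned enumValue;
};

// Deterministic comparison: depends only on the enum values. Comparing the
// pointers themselves would depend on where the objects were allocated.
bool lessByEnumValue(const Reg *a, const Reg *b) {
  return a->enumValue < b->enumValue;
}

std::vector<const Reg *> sortedByEnum(std::vector<const Reg *> regs) {
  std::sort(regs.begin(), regs.end(), lessByEnumValue);
  return regs;
}

void demo() {
  Reg r3{3}, r1{1}, r2{2};
  std::vector<const Reg *> out = sortedByEnum({&r3, &r1, &r2});
  // The result is the same regardless of where r1..r3 live in memory.
  assert(out[0]->enumValue == 1);
  assert(out[1]->enumValue == 2);
  assert(out[2]->enumValue == 3);
}
```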
[libc] 185136 - added iswlower entry point (#185221)
Changes include:
- Added iswlower entrypoint in wctype.yaml to expose the function
- Created iswlower.h header and iswlower.cpp implementation
- Added CMake entrypoint object for iswlower
- Created unit test in iswlower_test.cpp
- Added test entry to wctype CMakeLists.txt
This PR exposes iswlower, which internally calls islower on the wide
character.
Built using: ninja -C build libc
Tested using: ninja libc_wctype_unittests (all 3 tests passed)
Resolves issue #185136
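For reference, the standard behavior of `iswlower` can be exercised as in the sketch below. It calls `std::iswlower` from whatever C library the host toolchain provides, so it only illustrates the expected results for ASCII input in the default "C" locale, not the LLVM libc build itself; `isLowerWide` is a hypothetical wrapper name.

```cpp
#include <cassert>
#include <cwctype>

// Thin wrapper that converts the int result of iswlower into a bool.
bool isLowerWide(wchar_t c) {
  return std::iswlower(static_cast<std::wint_t>(c)) != 0;
}

void demo() {
  assert(isLowerWide(L'a'));  // lowercase letter
  assert(!isLowerWide(L'A')); // uppercase letter
  assert(!isLowerWide(L'1')); // digit
}
```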
[lldb][PlatformDarwin][NFC] Use formatv-style format string in LocateExecutableScriptingResourcesFromDSYM warning message (#185640)
About to make changes in this area and using `formatv` instead of
`printf` style format specifiers makes those easier to follow.
[Hexagon] Fix 64-bit funnel shift miscompilation with register shift amounts (#183669)
64-bit regpair shift amounts are treated as signed 7-bits, so a complement
shift amount of 64 (when the primary amount is 0) is sign-extended to -64,
reversing the shift direction and producing incorrect results. This affected
any 64-bit rotate or funnel shift where the runtime shift amount could be 0
(making the complement 64) or >= 64.
Fix by masking the shift amount to [0, 63] and computing the complement as
(m - 64), which is always in [-64, -1]. Using lsl/lsr (logical shift)
instructions with this negative amount causes the hardware to reverse the
shift direction while zero-filling vacated positions:
[12 lines not shown]
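The arithmetic behind the fix can be checked with a small software model. This is only a sketch under the assumptions stated in the message: `lslSigned` is a hypothetical helper that models a logical left shift whose negative amounts shift right with zero fill, and it assumes that a shift by 64 in either direction yields 0. `fshl64` is likewise a hypothetical name for a 64-bit funnel shift left built from the masked amount `m` and the complement `m - 64`.

```cpp
#include <cassert>
#include <cstdint>

// Models a logical shift left with a signed amount: a negative amount
// reverses the direction, and vacated bits are filled with zeros.
uint64_t lslSigned(uint64_t x, int amt) {
  if (amt >= 0)
    return amt >= 64 ? 0 : x << amt;
  int right = -amt;
  return right >= 64 ? 0 : x >> right;
}

// Funnel shift left of the 128-bit value hi:lo, returning the high 64 bits.
uint64_t fshl64(uint64_t hi, uint64_t lo, uint64_t shift) {
  int m = static_cast<int>(shift & 63); // masked amount in [0, 63]
  int complement = m - 64;              // always in [-64, -1]
  return lslSigned(hi, m) | lslSigned(lo, complement);
}

void demo() {
  const uint64_t top = 0x8000000000000000ULL;
  assert(fshl64(1, top, 0) == 1);  // amount 0: lo contributes nothing
  assert(fshl64(1, top, 1) == 3);  // top bit of lo moves into bit 0
  assert(fshl64(1, top, 64) == 1); // 64 is masked to 0
}
```

Passing the same value as `hi` and `lo` turns the model into a rotate, which covers the other case the message mentions.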
[libc][math] Implement an integer-only version of double precision sin and cos with 1 ULP errors. (#184752)
Size of `sin` for armv8m:
Before the patch:
```
$ ls -l libc/src/math/generic/CMakeFiles/libc.src.math.generic.sin.dir/
total 16
-rw-r----- 1 lntue primarygroup 13408 Mar 5 07:38 sin.cpp.obj
$ llvm-nm-19 --radix=d --print-size --size-sort --reverse-sort libc/src/math/generic/CMakeFiles/libc.src.math.generic.sin.dir/sin.cpp.obj
0000000000002048 V _ZN22__llvm_libc_23_0_0_git4math31range_reduction_double_internal24ONE_TWENTY_EIGHT_OVER_PIE
0000000000001632 W _ZN22__llvm_libc_23_0_0_git4math31range_reduction_double_internal19LargeRangeReduction4fastEdRNS_10NumberPairIdEE
0000000000001412 W _ZN22__llvm_libc_23_0_0_git4math3sinEd
0000000000001048 W _ZN22__llvm_libc_23_0_0_git4math20sincos_eval_internal11sincos_evalERKNS_10NumberPairIdEERS3_S6_
0000000000001040 V _ZN22__llvm_libc_23_0_0_git4math31range_reduction_double_internal17SIN_K_PI_OVER_128E
0000000000000528 W _ZN22__llvm_libc_23_0_0_git4math31range_reduction_double_internal21range_reduction_smallEdRNS_10NumberPairIdEE
0000000000000004 T sin
0000000000000004 V _ZZN22__llvm_libc_23_0_0_git6fputil7generic15quick_get_roundEvE1x
[26 lines not shown]
[OpenMP] Add definitions of FLATTEN and SPLIT to OMP.td (#185642)
Add the definitions of the "flatten" and the "split" constructs to the
OMP.td file. This will allow the implementation efforts in clang and
flang to proceed independently.
There is no other functionality added in this patch.
The "flatten" construct is defined in the OpenMP Technical Report 14:
https://www.openmp.org/wp-content/uploads/openmp-TR14.pdf
[CIR] Ensure strings are null-terminated, better deal with trailing null (#185513)
Our current implementation of string lowering already did some work to
remove extra trailing zeros and to produce a 'zero' constant. That is
unchanged by this patch. However, this patch ALSO ensures that the
trailing-zero removal strips ALL trailing zeros, which likely has
canonicalization benefits later on.
However, the real benefit of this patch is to make string emission emit
a null terminator by default, which fixes the lowering of the virtual
table 'name' field. We do this by making the builder::getString
function take an argument (true by default) that will ensure we add a
null terminator if necessary.
This reflects the llvm::ConstantDataArray::getString function, which has
the same functionality. However, doing this during lowering seems
incorrect, since the FE is the one that knows whether these null
terminators are necessary. There is not currently an 'opt out' use of
the behavior, but the functionality is left in place to better reflect
[3 lines not shown]
[CIR][AArch64] Add lowering for remaining `vabd_*` builtins (#185478)
Implement the missing CIR lowerings for the AdvSIMD (Neon) `vabd_*`
(absolute difference) intrinsic group.
Most `vabd` variants were already supported (see
https://github.com/llvm/llvm-project/pull/183595); this patch
completes the remaining cases listed in [1].
Move the corresponding tests from:
* clang/test/CodeGen/AArch64/neon_intrinsics.c
to:
* clang/test/CodeGen/AArch64/neon/intrinsics.c
The implementation mirrors the existing lowering in
CodeGen/TargetBuiltins/ARM.cpp. To support this, add the
`emitCommonNeonSISDBuiltinExpr` helper.
Reference:
[1] https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#absolute-difference
[mlir] Fix tests not to depend on `llvm-strings` in standalone build (#185187)
Move the `llvm-strings` test dependency into the non-standalone test
dependency block, to fix standalone builds after #182846. While at it,
reformat the block to make it more visible.
Signed-off-by: Michał Górny <mgorny at gentoo.org>