[LoopUnrollAndJam] Fix out-of-date LoopInfo being used during unroll and jam (#191250)
Fixed issue #190671, where loop unroll and jam did not update LoopInfo
entirely correctly.
The stale LoopInfo was passed into `simplifyLoopAfterUnroll()` and then
used by SCEV at the beginning of
`ScalarEvolution::createSCEVIter()`, which triggered hidden bugs. To
fix this, LoopInfo is now updated correctly before it is used.
The loop blocks that `simplifyLoopAfterUnroll()` iterates through
become unavailable after the LoopInfo update. Therefore we store the
loop blocks beforehand so that `simplifyLoopAfterUnroll()` can use them
later.
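The "store the blocks beforehand" step can be sketched with a toy model.
This is only an illustration: `ToyLoop`, `eraseLoop()` and
`blocksForSimplification()` are hypothetical names, not LLVM types or
functions, and the real patch works on LoopInfo.

```cpp
#include <string>
#include <vector>

// Toy stand-in for a loop record; not LLVM's Loop or LoopInfo.
struct ToyLoop {
  std::vector<std::string> Blocks;
};

// Hypothetical update step that drops the loop's block list.
inline void eraseLoop(ToyLoop &L) { L.Blocks.clear(); }

// Snapshot the blocks, update the loop record, then hand back the snapshot
// so later cleanup can still iterate over the old blocks.
inline std::vector<std::string> blocksForSimplification(ToyLoop &L) {
  std::vector<std::string> Saved = L.Blocks; // copy before the update
  eraseLoop(L);                              // L.Blocks is now empty
  return Saved;                              // still lists the old blocks
}
```

After the call, the loop record is empty while the returned copy still
names every block that was in the loop.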
[LifetimeSafety] Flow origins from lifetimebound args in `gsl::Pointer` construction (#189907)
This PR adds origin flow from `[[clang::lifetimebound]]` constructor
arguments during `gsl::Pointer` construction.
Fixes #175898
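A minimal sketch of the code shape this concerns, assuming a small
hypothetical view type `IntView`. The macros only keep the example
portable to compilers without the Clang attributes; no particular
diagnostic output is implied.

```cpp
// Use the Clang attributes when available; expand to nothing elsewhere.
#if defined(__clang__)
#define LIFETIMEBOUND [[clang::lifetimebound]]
#define GSL_POINTER [[gsl::Pointer]]
#else
#define LIFETIMEBOUND
#define GSL_POINTER
#endif

// Hypothetical pointer-like type, for illustration only.
struct GSL_POINTER IntView {
  // The constructed view refers to V, so V must outlive the view.
  explicit IntView(const int &V LIFETIMEBOUND) : Ptr(&V) {}
  int get() const { return *Ptr; }
  const int *Ptr;
};
```

Here an `IntView` built from a local `int` is tied to that local through
the annotated constructor argument.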
Stop using spir_kernel calling convention on non-SPIR targets. (#191090)
This behavior traces back to fc2629a65a05fa05bc5c5bc37cf910c8e41cdac3,
but neither the commit message nor the reviews actually justify using
this calling convention. The actual behavior which is important for that
change is the way clang calling convention lowering works.
There isn't really any other reason to use spir_kernel: every non-SPIR
target either rejects it, or treats it as the C calling convention. So
let's stop doing it.
Fixes #157028.
[ADT][NFC] Make po iterator stack entry trivially copyable (#191290)
std::tuple is not trivially copyable, leading to the use of less
efficient SmallVector implementations. Additionally, named members are
more readable than std::get<N>.
Also make sure that successors() is called only once per traversed basic
block -- this is difficult here: when the begin iterator is stored in
the vector between the calls, the second call can't be eliminated due to
the potentially visible store. When copying the entry into the vector,
SmallVector exposes the address of the alloca via ptrtoint to ensure
that the object indeed doesn't reside in the vector. We're missing
some optimization here... so very carefully work around this problem.
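As a rough illustration of the first point, a struct with named members
can be checked for trivial copyability at compile time. `StackEntry`,
its fields and `makeEntry()` are hypothetical and do not match the names
in the patch.

```cpp
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a traversal stack entry.
struct StackEntry {
  int Node;           // node being visited
  const int *SuccIt;  // next successor to visit
  const int *SuccEnd; // end of the successor range
};

static_assert(std::is_trivially_copyable<StackEntry>::value,
              "entry should be copyable with a plain memcpy");

// Compute the successor range once and keep both ends in the entry.
inline StackEntry makeEntry(int Node, const std::vector<int> &Succs) {
  const int *Begin = Succs.data();
  return {Node, Begin, Begin + Succs.size()};
}
```

Named members also make call sites read as `E.SuccIt` rather than
`std::get<1>(E)`.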
Strip .llvm. suffix after removing the coroutine suffixes to avoid breaking pseudo probe (#191354)
Pseudo probe is currently broken when a coroutine function is promoted
with a global name during ThinLTO import. The top-level function GUIDs in
the .pseudo_probe section are computed from the promoted name (with the
".llvm.xxxx" suffix) instead of the original function name. This leaves
a dangling top-level GUID that has no reference in the pseudo probe
desc, and can hurt profile quality.
The root causes of the issue are:
1) ThinLTO post-link imports and promotes a local coroutine function,
creating a global function with ".llvm.xxxx" suffix.
2) https://github.com/llvm/llvm-project/pull/141889 introduces a change
in CoroSplit pass that updates the coroutine functions linkage name with
the ".cleanup", ".destroy", ".resume" suffixes, and this creates
top-level functions with ".llvm.xxxx.cleanup", ".llvm.xxxx.destroy",
".llvm.xxxx.resume" suffixes.
3) PseudoProbePrinter and PseudoProbeInserter only strip the coroutine
suffix and do not consider the ".llvm." suffix.
This patch fixes the issue in step 3).
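The stripping order can be sketched as follows. This is a simplified
illustration using `std::string`; `stripCoroThenLLVMSuffix()` is a
hypothetical helper and not the code in PseudoProbePrinter or
PseudoProbeInserter.

```cpp
#include <initializer_list>
#include <string>

// Hypothetical helper: recover the original name from a promoted
// coroutine clone name such as "foo.llvm.1234.resume".
inline std::string stripCoroThenLLVMSuffix(std::string Name) {
  // Step 1: drop one trailing coroutine suffix, if present.
  for (const char *Suffix : {".cleanup", ".destroy", ".resume"}) {
    const std::string S(Suffix);
    if (Name.size() >= S.size() &&
        Name.compare(Name.size() - S.size(), S.size(), S) == 0) {
      Name.erase(Name.size() - S.size());
      break;
    }
  }
  // Step 2: drop the ThinLTO promotion suffix and everything after it.
  const std::string::size_type Pos = Name.rfind(".llvm.");
  if (Pos != std::string::npos)
    Name.erase(Pos);
  return Name;
}
```

With this sketch, "foo.llvm.1234.resume" and "foo.destroy" both map back
to "foo", and a name without either suffix is returned unchanged.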
japanese/font-takao: Update to 003.03.01 and take maintainership
In this release, Takao and TakaoEx fonts are distributed separately.
Update MASTER_SITES and DISTNAME.
Lint with portclippy.
Refactor do-install.
Changelog: https://launchpad.net/takao-fonts/trunk/15.03
PR: 277679
Approved by: hrs (maintainer timeout > 3 months)
Approved by: fluffy (mentor)
[AMDGPU] Always update SETREG MSBs if offset is 0
We can always update the immediate if Offset is zero, because the bits
the HW writes are always at the same position in that case.
In particular, this removes redundant mode changes, as seen in
hazard-setreg-vgpr-msb-gfx1250.mir.
This still relies on the wrong behaviour that SETREG updates
MSBs, so it will have to be changed later. For that reason, test
immediates in this patch may differ from the desired values.
Add SwitchableSimpleService base class
Subclasses can override select_systemd_unit_name() to switch between
systemd units at runtime, or return None when no unit is involved.
select_etc() allows mode-dependent config generation. Intended to
support services with alternative kernel/userspace implementations.
(cherry picked from commit fb396ad0d74bdd90796b7f682c359f0c666050ce)
[Clang] Permit '--target=amdgcn--' for binaries (#191451)
Summary:
We have always accepted `--target=amdgcn--` to create IR object files,
but it did not allow creating actual binaries without user intervention.
This is because it would fall through to the GCC toolchain, which does
not know how to handle AMDGCN / AMDGPU targets. This PR just adds a single
line to handle it, which effectively allows this as a 'bare' target.
Perhaps the argument could be made that AMDGPU should not support
anything but strictly HSA because it has many assumptions in the
compiler itself, such as implicit arguments, but I feel like it is
relatively harmless to support this case if users decide they really do
not need it.
NAS-140642 / 27.0.0-BETA.1 / Add SwitchableSimpleService base class (#18716)
Subclasses can override select_systemd_unit_name() to switch between
systemd units at runtime, or return None when no unit is involved.
select_etc() allows mode-dependent config generation. Intended to
support services with alternative kernel/userspace implementations.
[flang][OpenMP] Rename GetRequiredCount to GetMinimumSequenceCount
The new name better describes the calculated value.
Also adjust a diagnostic message to say that *at least* N loops are
expected in the sequence.
[compiler-rt] Address dlvsym not found compilation error when targeting certain platforms (try 2) (#191458)
The previous attempt #191444 was incomplete.
[LifetimeSafety] Fix crash on ternary operator when one of the expressions is `throw` (#190345)
The ternary operator no longer flows origins from `[[noreturn]]` arms,
including `throw` expressions.
Closes #183895
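For reference, this is the kind of expression involved: a conditional
operator where one arm is a `throw`. `pickOrThrow()` is a hypothetical
example function; it only shows the code shape, not the analysis itself.

```cpp
#include <stdexcept>

// Only the non-throwing arm can supply the returned pointer, so that is
// the only arm an origin could come from.
inline const int *pickOrThrow(bool Ok, const int *P) {
  return Ok ? P : throw std::invalid_argument("no value");
}
```

Calling it with `Ok == true` returns `P` unchanged; with `Ok == false`
it throws and never produces a pointer.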
CodeGen: Fix double counting bundles in inst size verification
The AMDGPU implementation handles bundles by summing the
member instructions. This was starting with the size of the
bundle instruction, then re-adding all of the same instructions.
This loop is over the iterator, not the instr_iterator, so it should
not be looking through the bundled instructions. Most of the other
uses of getInstSizeInBytes are also on the iterator, not the
instr_iterator, so the convention seems to be that targets need to
handle BUNDLE correctly themselves.
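A toy model of the counting convention, assuming a flat instruction list
in which a bundle header is followed by its members. `ToyInstr`,
`instSize()` and `totalSize()` are hypothetical and are not LLVM's
MachineInstr API.

```cpp
#include <cstddef>
#include <vector>

// Toy instruction record; not LLVM's MachineInstr.
struct ToyInstr {
  bool IsBundleHeader; // BUNDLE pseudo that heads a bundle
  bool InsideBundle;   // member of the preceding bundle
  unsigned Size;       // encoded size in bytes (0 for the header itself)
};

// Size as a target hook might report it: a bundle header reports the sum
// of its members, any other instruction reports its own size.
inline unsigned instSize(const std::vector<ToyInstr> &Instrs, std::size_t I) {
  if (!Instrs[I].IsBundleHeader)
    return Instrs[I].Size;
  unsigned Sum = 0;
  for (std::size_t J = I + 1; J < Instrs.size() && Instrs[J].InsideBundle; ++J)
    Sum += Instrs[J].Size;
  return Sum;
}

// Walk only top-level instructions, so bundled members are counted once,
// through their header, and never a second time.
inline unsigned totalSize(const std::vector<ToyInstr> &Instrs) {
  unsigned Total = 0;
  for (std::size_t I = 0; I < Instrs.size(); ++I)
    if (!Instrs[I].InsideBundle)
      Total += instSize(Instrs, I);
  return Total;
}
```

Adding the members again after the header has already summed them would
count the bundle twice, which is the kind of double counting described
above.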
[SystemZ][GOFF] Reference to external variable needs PR symbol (#185742)
Variables are modelled as parts in the GOFF format. Referencing a
variable defined in a different compilation unit
requires using a PR symbol instead of an EX symbol (created by the
EXTRN/WXTRN instruction). Those PR symbols need to refer to an
ED symbol, for which the ED symbol of the ADA is used.
Apparently we shouldn't touch the RTC immediately after restarting the
i8254 clock when coming out of S3 suspend either. So move the code
that checks whether the RTC alarm went off and clears it all the way to
the end of acpi_cpu_resume. This fixes a lockup seen on the x220.
Figured out by mlarkin@ who wrote the initial diff; I just tweaked it.
ok mlarkin@, deraadt@
[mlir][tosa] Disallow shape type in function argument/return types (#175754)
This commit adds an additional check to the TOSA validation pass to
disallow use of shape types in function arguments and return types. The
specification requires these types be tensor types.
[gsymutil] Fix a warning on systems with 32-bit `off_t` (#189524)
The size of `off_t` isn't specified, so it can be either 32 or 64 bits
depending on the system. In particular, on LLP64 systems like Windows
it's generally only 32 bits. This means the `if (StrtabSize >
UINT32_MAX)` check added in #181458 may warn on such systems (giving
`-Wsign-compare`).
Given that `FileWriter::tell` (and the underlying `raw_ostream::tell`)
explicitly return `uint64_t`, the simplest fix is to just use the return
type of the function instead of potentially truncating. Since the same
logic applies even where we don't happen to have a warning here, I've
applied this for all of these uses of `off_t`.
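A minimal sketch of the check once the value is kept as `uint64_t`.
`exceeds32BitOffset()` is a hypothetical helper, not a function in
gsymutil.

```cpp
#include <cstdint>

// Both operands are unsigned here, so the comparison involves no
// signed/unsigned mix and no truncation of the 64-bit value.
inline bool exceeds32BitOffset(std::uint64_t StrtabSize) {
  return StrtabSize > UINT32_MAX;
}
```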