[SystemZ][GOFF] Implement reset() for GOFFObjectWriter (#201197)
The reset() method is used to free memory before the object is
destructed or reused. This change adds this functionality to the GOFF
writer.
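As a rough illustration of the reset() contract described above, here is a minimal sketch in Python rather than the actual C++ GOFFObjectWriter. The class `ToyObjectWriter` and all of its members are hypothetical; the point is only that a reused writer must drop every piece of per-object state.

```python
class ToyObjectWriter:
    """Hypothetical stand-in for an object writer; not the LLVM API."""

    def __init__(self):
        self.sections = {}   # per-object state: section name -> bytes
        self.symbols = []    # per-object state: symbol names

    def add_section(self, name, data):
        self.sections[name] = bytes(data)

    def add_symbol(self, name):
        self.symbols.append(name)

    def write(self):
        # Serialize in a deterministic order so runs are comparable.
        out = bytearray()
        for name in sorted(self.sections):
            out += name.encode() + b":" + self.sections[name] + b";"
        out += ",".join(self.symbols).encode()
        return bytes(out)

    def reset(self):
        # Release everything accumulated for the previous object so the
        # writer can be reused without carrying stale state along.
        self.sections.clear()
        self.symbols.clear()


def emit_twice():
    """Write one object, reset, then write a second, unrelated object."""
    w = ToyObjectWriter()
    w.add_section("code", b"\x01\x02")
    w.add_symbol("main")
    first = w.write()
    w.reset()
    w.add_section("data", b"\x03")
    second = w.write()
    return first, second
```

Without the reset() call, the second output in this toy model would still contain the "code" section and the "main" symbol from the first object.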
[Flang] Fix omp_lib.h location and search path (#201104)
Before this PR, omp_lib.h is emitted to `${PREFIX}/include` or
`${PREFIX}/lib/clang/<version>/include` (install prefix) and
`${PREFIX}/runtime/src/omp_lib.h` (builddir prefix). It is never found
there because the driver only adds `${PREFIX}/include/flang/OpenMP` to
the include path.
Fix the `omp_lib.h` include by using the same mechanism as the
omp_lib.mod; that is, move it to
`${PREFIX}/lib/clang/<version>/finclude/flang/<target-triple>`. The
search path is already added by the driver via
`-fintrinsics-modules-path`. Although omp_lib.h currently
does not contain anything target-specific, it could do so in the future
and I don't think it is worth the effort to add a mechanism without the
target triple. It should also be consistent with omp_lib.mod.
The changes in detail consist of:
1. Move the omp_lib.h output in the builddir to
[16 lines not shown]
[Clang] Rework LTO mode selection to be a Toolchain property (#201155)
Summary:
Currently, the LTO mode is a property of the Driver, which makes sense
because it is used to set up phases. However, we currently have `-flto`
and `-foffload-lto`, which is a split that doesn't fully work with the
full context of a heterogeneous compilation as it is 'all-or-nothing'.
This PR seeks to be mostly NFC for now, just moving the queries to a
per-toolchain interface rather than the static driver mode setting we
have right now. The *single* use of this before ToolChains are created
is for the WebAssembly toolchain to set an include path. This is now
just a direct check on the flag, which is consistent. In the future they
could shift to fat LTO objects as well.
The main goal for the PR is to allow the GPU / Offloading toolchains to
specify their "real" LTO behavior. Right now SPIR-V and AMDGCN both
default to LTO, but rather than re-use the LTO handling we hack through
the driver phases to override it. Allowing this split would let us
[6 lines not shown]
[AMDGPU] Drop !noundef when widening sub-DWORD constant loads (#201085)
The widened i32 load reads bytes outside the original sub-DWORD load, so
the new op cannot claim !noundef.
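The reasoning can be sketched with a toy byte-level model. This is an illustration in Python, not the AMDGPU code: the `UNDEF` marker, the helper names, and the assumption that the original load sits at a DWORD-aligned offset are all simplifications introduced here.

```python
UNDEF = None  # marker for a byte whose value is undefined in this toy model


def load(memory, offset, size):
    """Return the bytes a load of `size` bytes at `offset` would read."""
    return memory[offset:offset + size]


def is_noundef(loaded):
    """A load may carry !noundef only if every byte it reads is defined."""
    return all(b is not UNDEF for b in loaded)


def widen_to_dword(offset):
    """Widen a sub-DWORD load to 4 bytes (offset assumed DWORD-aligned)."""
    assert offset % 4 == 0
    return offset, 4


# Only the first byte is known to be defined; its neighbours are not.
memory = [0x2A, UNDEF, UNDEF, UNDEF]

narrow = load(memory, 0, 1)                  # the original i8 load
wide = load(memory, *widen_to_dword(0))      # the widened i32 load
```

In this model `is_noundef(narrow)` holds but `is_noundef(wide)` does not, which is why the metadata cannot simply be copied onto the widened load.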
[HIP] Remove explicit compiler-rt from bot recipe (#201329)
The same change was done to the AnnotatedBuilder script recently. Let's
keep them in sync.
https://github.com/llvm/llvm-zorg/pull/861
[VPlan] Don't use the legacy cost model for loop conditions (#156864)
The current behaviour of using the legacy cost model for instructions
that compute the loop exit condition gets the wrong result when the loop
has been transformed to use a different exit condition, e.g. when we have
tail-folded predicated vectorization the exit condition is based on the
predicate vector.
Fix this by adding cost computation for BranchOnCount and removing the
restriction on computing the cost for scalar ICmp/FCmp, and removing the
use of the legacy cost model for loop exit conditions.
This causes quite a lot of changes to expected output in tests. Some of
these are just changes to the -debug output, others are choosing a
different VF due to previously over- or under-estimating the cost, and in
others the minimum trip count has changed as we now compute the cost for
compares in the middle block.
[LoopInterchange] Add tests for func attributes called in loops (NFC) (#201331)
LoopInterchange has special handling for call instructions. In general,
loops that contain call instructions are not legal to interchange, but
if a call satisfies certain conditions, we allow the interchange to
proceed. Currently, the legality checker only verifies whether the call
reads or writes memory. However, as pointed out in
https://github.com/llvm/llvm-project/pull/200828#issuecomment-4593914293,
we also need to ensure that the call does not diverge. Otherwise, an
illegal interchange may occur.
This patch adds test cases that demonstrate the issue, which will be
fixed in a follow-up patch.
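To see why divergence matters even when the call touches no memory, consider this small Python model. It is a sketch, not one of the added tests: `may_diverge` is a hypothetical callee, and the exception stands in for a call that never returns.

```python
class Diverge(Exception):
    """Models a call that never returns (e.g. an abort or infinite loop)."""


def may_diverge(i, j):
    # Hypothetical callee: reads and writes no memory, but does not
    # return when invoked at iteration (0, 1).
    if (i, j) == (0, 1):
        raise Diverge


def run(n, m, interchanged):
    """Execute the loop nest in original or interchanged order."""
    a = [[0] * m for _ in range(n)]
    outer, inner = (m, n) if interchanged else (n, m)
    try:
        for x in range(outer):
            for y in range(inner):
                i, j = (y, x) if interchanged else (x, y)
                may_diverge(i, j)
                a[i][j] = 1
    except Diverge:
        pass  # execution stops here; `a` holds what was stored so far
    return a
```

With `n = m = 2`, the original order stores only `a[0][0]` before the call diverges, while the interchanged order also stores `a[1][0]`. The observable memory state differs, so the interchange is not legal in this model.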
[AArch64][GlobalISel] Add handling for cls intrinsic (#200440)
The Neon intrinsic neon.cls wasn't linked to the generic node G_CTLS.
Add this link in Legalisation (LegalizeIntrinsic) so that the intrinsic can lower properly.
[clang][NFC] Introduce `LangOptions::isCompatibleWith(ClangABI)` (#201067)
This slightly improves readability and reduces the probability of
off-by-one errors.
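A minimal sketch of the idea, modelled in Python rather than C++. It assumes the helper wraps a "requested ABI level is at or below this version" comparison; the actual semantics and signature are not stated in the commit message, and the enum members and values below are illustrative only.

```python
from enum import IntEnum


class ClangABI(IntEnum):
    # Hypothetical subset of ABI-compat levels; values are illustrative.
    Ver14 = 14
    Ver15 = 15
    Latest = 1000


class LangOptions:
    """Toy model of the option holder; not the real clang class."""

    def __init__(self, clang_abi_compat=ClangABI.Latest):
        self.clang_abi_compat = clang_abi_compat

    def is_compatible_with(self, version):
        # Assumed semantics: the user asked for compatibility with
        # `version` or something older.
        return self.clang_abi_compat <= version
```

Centralizing the comparison in one named helper means call sites no longer choose between `<` and `<=` or between adjacent version constants by hand, which is where off-by-one mistakes tend to creep in.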
net-mgmt/iprange: Backport fix for 32-bit platforms
I've added it as a local patch instead of using PATCHFILES because the
upstream patch touches CMakeLists.txt, which is not present in the release
tarball for some unknown reason.
Obtained from: upstream 268d7d8794f3f8a6c2d6f08dc4351e767990e683
Sponsored by: Rubicon Communications, LLC ("Netgate")
mail/mblaze: Update to 1.4
Changelog: https://inbox.vuxu.org/mblaze/874iokb3sq.fsf@vuxu.org/
This 1.4 release mainly adds bugfixes and small improvements.
* mcom: $MBLAZE_EDITOR is preferred to configure the editor
* mless: support OpenBSD less without LESSOPEN
(needs mlesskey.example-openbsd)
* magrep: support multibyte regexps.
* Bug fixes.
* Documentation improvements.
PR: 295796
Submitted by: Nico Sonack <nsonack at herrhotzenplotz.de>
[AMDGPU] Do not scale private alloca size when using flat-scratch (#201142)
When using flat-scratch, the `scratch_load/scratch_store` instructions
scale the stack offset by the wavefront size on their own.
Scaling the alloca size by the wavefront size as well led to accesses
outside of the private-memory limit.
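The double scaling can be illustrated with a toy arithmetic model in Python. It follows only the wording of the commit message: the function name, its parameters, and the example numbers (16 bytes per lane, wave size 64) are assumptions made for illustration, not the backend's actual computation.

```python
def wave_scratch_extent(per_lane_bytes, wave_size, flat_scratch, scale_alloca):
    """Bytes of scratch one alloca ends up spanning for the whole wave."""
    # What the compiler adds to the stack pointer for this alloca.
    sp_bump = per_lane_bytes * wave_size if scale_alloca else per_lane_bytes
    # With flat-scratch the instruction applies the wave-size scaling
    # itself (per the commit description); otherwise the offset is used
    # as computed by the compiler.
    return sp_bump * wave_size if flat_scratch else sp_bump


# Example numbers chosen for illustration only.
double_scaled = wave_scratch_extent(16, 64, flat_scratch=True, scale_alloca=True)
fixed = wave_scratch_extent(16, 64, flat_scratch=True, scale_alloca=False)
legacy = wave_scratch_extent(16, 64, flat_scratch=False, scale_alloca=True)
```

In this model the fixed flat-scratch path and the non-flat-scratch path both cover 16 * 64 bytes, while scaling in both places covers 64 times more than intended.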
mfusepy: Add version 3.1.1
mfusepy is a Python module that provides a simple interface to FUSE
and macFUSE. It's just one file and is implemented using ctypes to use
libfuse.
mfusepy is a fork of fusepy (named py-fuse-bindings in pkgsrc). The
main differences are support for the FUSE 3 API and efficiency
improvements.