[mlir] added a check in the walk to prevent catching a cos in a nested region (#190064)
The walk in SincosFusion may detect a cos within a nested region of the
sin block. This triggers an assertion in `isBeforeInBlock` later on.
Added a check within the walk so it filters operations in nested
regions, which are not in the same block and should not be fused anyway.
---------
Co-authored-by: Yebin Chon <ychon at nvidia.com>
rtld: add test for dlopen("#dirfd/path")
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D56152
rtld: allow dlopen("#<number>/<path>")
When a specially formatted path is passed to dlopen(), of the form
#number/path
and the number is the valid dirfd file descriptor listed in the
LD_LIBRARY_FDS, interpret it as a relative path name against dirfd
number.
This complements the result returned from dladdr() for such objects
in dli_fname.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D56152
[X86] LowerRotate - expand vXi8 non-uniform variable rotates using uniform constant rotates (#189986)
We expand vXi8 non-uniform variable rotates as a sequence of uniform
constant rotates along with a SELECT depending on whether the original
rotate amount needs it
This patch removes premature uniform constant rotate expansion to the
OR(SHL,SRL) sequences to allow GFNI targets to use single VGF2P8AFFINEQB
calls
[Support] Support nested parallel TaskGroup via work-stealing (#189293)
Nested TaskGroups run serially to prevent deadlock, as documented by
https://reviews.llvm.org/D61115 and refined by
https://reviews.llvm.org/D148984 to use threadIndex.
Enable nested parallelism by having worker threads actively execute
tasks from the work queue while waiting (work-stealing), instead of
just blocking. Root-level TaskGroups (main thread) keep the efficient
blocking Latch::sync(), so there is no overhead for the common
non-nested case.
In lld, https://reviews.llvm.org/D131247 worked around the limitation
by passing a single root TaskGroup into OutputSection::writeTo and
spawning 4MB-chunked tasks into it. However, SyntheticSection::writeTo
calls with internal parallelism (e.g. GdbIndexSection,
MergeNoTailSection) still ran serially on worker threads. With this
change, their internal parallelFor/parallelForEach calls parallelize
automatically via helpSync work-stealing.
[3 lines not shown]
kqueue: assert that kqueue knote lists own the knotes
Reviewed by: kevans, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D56212
[C++20] [Modules] Add VisiblePromoted module ownership kind (#189903)
This patch adds a new ModuleOwnershipKind::VisiblePromoted to handle
declarations that are not visible to the current TU but are promoted to
be visible to avoid re-parsing.
Originally we set the visible visiblity directly in such cases. But
https://github.com/llvm/llvm-project/issues/188853 shows such decls may
be excluded later if we import #include and then import. So we have to
introduce a new visibility to express the intention that the visibility
of the decl is intentionally promoted.
Close https://github.com/llvm/llvm-project/issues/188853
newsraft: update to 0.36
- fix year 2038 overflow in If-Modified-Since header
- set window title with an escape sequence
- update build instructions for macOS
- cancel search input with ^C key
- discard search query text on search input canceling
- extend menu-responsiveness setting to feeds and sections
- update items menu on mark-read-all regardless of menu-responsiveness
- respect RFC 3986 when detecting links in pager
[lld] Glob-based BP compression sort groups (#185661)
Add
--bp-compression-sort-section=<glob>[=<layout_priority>[=<match_priority>]]
to let users split input sections into multiple compression groups, run
balanced partitioning independently per group, and leave out sections
that are poor candidates for BP. This replaces the old coarse
--bp-compression-sort with a more explicit, user-controlled one.
In ELF, the glob matches input section names (.text.unlikely.cold1). In
Mach-O, it matches the concatenated segment+section name (__TEXT__text).
layout_priority controls group placement in the final layout.
match_priority resolves conflicts when multiple globs match the same
section: explicit priority beats positional matching, and among
positional specs the last match wins.
A CRTP hook getCompressionSubgroupKey() allows backends to further
subdivide glob groups into independent BP instances. This allows Mach-O
[3 lines not shown]