[clang][CUDA] Avoid ambiguity in host/device template specializations (#201049)
This commit changes SemaOverload to resolve an otherwise diagnosed
ambiguity between addresses of template specializations of functions
that are overloaded for both device and host. Similar to how it works
for non-templated function overloads, these changes prioritize the
specializations that correspond to the target of the owning function,
i.e. if compiling for host, the address of the host specialization takes
precedence over the device specialization and vice versa.
Fixes https://github.com/llvm/llvm-project/issues/199299
---------
Signed-off-by: Steffen Holst Larsen <sholstla at amd.com>
[BOLT] Fix data race in multi-threaded DWP type unit processing and DWP type unit duplication (#197359)
## Summary
This PR fixes a race condition in LLVM BOLT's
DIEBuilder::buildTypeUnits() that is triggered when DWARF5 split-DWARF
(.dwo/.dwp) inputs are processed with multi-threaded CU processing.
Concurrent invocations from different worker threads share the same DWP
type-unit state, which results in duplicated DIE extraction, assertion
failures, and intermittent crashes. The fix serializes buildTypeUnits()
for DWP inputs via a function-local static std::mutex, leaving the
non-DWO fast path unchanged.
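
As a rough illustration of the serialization described above (not the
actual BOLT patch), the sketch below guards shared type-unit state with a
function-local static std::mutex that is only taken on the DWP path.
`SharedTypeUnits`, `buildTypeUnitsSketch`, and `runWorkers` are
hypothetical names invented for this example, and the integer unit IDs
stand in for real type units.

```cpp
#include <cassert>
#include <mutex>
#include <set>
#include <thread>
#include <vector>

// Hypothetical stand-in for the type-unit state shared across CUs in a DWP.
struct SharedTypeUnits {
  std::set<int> Extracted; // IDs of type units already extracted
  int ExtractionCount = 0; // how many extractions actually happened
};

// Sketch only: serialize the DWP path with a function-local static mutex.
void buildTypeUnitsSketch(SharedTypeUnits &State,
                          const std::vector<int> &Units, bool IsDWP) {
  static std::mutex DWPMutex;
  std::unique_lock<std::mutex> Lock(DWPMutex, std::defer_lock);
  if (IsDWP)
    Lock.lock(); // the non-DWP path assumes unshared state and skips the lock

  for (int Unit : Units)
    if (State.Extracted.insert(Unit).second)
      ++State.ExtractionCount; // each unit is extracted at most once
}

// Runs NumThreads workers over the same units and returns the number of
// extractions performed in total.
int runWorkers(int NumThreads) {
  assert(NumThreads > 0);
  SharedTypeUnits State;
  const std::vector<int> Units{1, 2, 3, 4};
  std::vector<std::thread> Workers;
  for (int I = 0; I < NumThreads; ++I)
    Workers.emplace_back([&State, &Units] {
      buildTypeUnitsSketch(State, Units, /*IsDWP=*/true);
    });
  for (std::thread &W : Workers)
    W.join();
  return State.ExtractionCount;
}
```

With the lock held, four workers processing the same four units perform
four extractions in total instead of racing on the shared set.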
## Problem Description
When BOLT processes DWARF debug info with --debug-thread-count=4
--cu-processing-batch-size=4 on testcase
dwarf5-df-types-dup-dwp-input.test, multiple threads concurrently call
DIEBuilder::buildTypeUnits() on shared DWP type units. Since type units
within a DWP file are shared across compilation units, multiple threads
may attempt to extract DIEs from the same type unit simultaneously,
violating the assertion.
[5 lines not shown]
[lldb] Enable MyST colon_fence and deflist extensions (NFC) (#201250)
Enable the colon_fence and deflist MyST parser extensions in the LLDB
docs configuration. This is a preparatory step for converting the
remaining reStructuredText documentation pages to Markdown, where these
two extensions are needed to translate RST admonition directives
(:::{note}) and definition lists.
Context:
https://discourse.llvm.org/t/rfc-make-myst-markdown-the-llvm-docs-format-rip-rest/
[docs][Kaleidoscope] fix function name InitializeModuleAndManagers in Kaleidoscope (#199601)
### Description
Resolves #199477
The Kaleidoscope tutorial was not fully updated with the new Pass
Manager. This PR aligns the tutorial doc with the example code.
### Changes
- Use `InitializeModuleAndManagers` instead of
`InitializeModuleAndPassManager`.
- Remove `TheModule->setDataLayout(TheJIT->getDataLayout());` in line
141, as the `setDataLayout` was introduced later.
- Use `KaleidoscopeJIT` instead of `my cool jit` as the ModuleName, to
align with the final code.
[M68k] Add to LINK_COMPONENTS to fix BUILD_SHARED_LIBS build (#201248)
Fixes: 6897c5e24ce5 ("[M68k][MC] Add MC support for PCI w/ base
displacement addressing mode (#200696)")
[NVPTX] NVVMIntrRange: Handle maxntid > UINT32_MAX. (#201245)
Previously we computed the overall maxntid and downcast it to unsigned
int. This is not correct; it can be larger than UINT32_MAX.
This would cause reads of tid.xyz and ntid.xyz to have incorrect range
information. Also if maxntid was an exact multiple of 2^32, we'd get an
ICE (because we'd incorrectly think that maxntid is 0).
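
The truncation problem can be reproduced in isolation. The following is
a minimal sketch, not the NVVMIntrRange code: `mulSat`, `overallMaxNTID`,
and `truncatedMaxNTID` are hypothetical helpers, and the saturating
multiply is only one possible way to keep the product well-defined.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: 64-bit multiply that saturates instead of wrapping.
uint64_t mulSat(uint64_t A, uint64_t B) {
  if (A != 0 && B > UINT64_MAX / A)
    return UINT64_MAX;
  return A * B;
}

// Overall maxntid computed in 64 bits from the per-dimension values.
uint64_t overallMaxNTID(uint32_t X, uint32_t Y, uint32_t Z) {
  return mulSat(mulSat(X, Y), Z);
}

// The problematic pattern: downcasting the overall value to 32 bits.
uint32_t truncatedMaxNTID(uint32_t X, uint32_t Y, uint32_t Z) {
  return static_cast<uint32_t>(overallMaxNTID(X, Y, Z));
}
```

For example, with dimensions 65536 x 65536 x 1 the overall value is
exactly 2^32, so the truncated result is 0 even though the 64-bit value
is not.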
[CIR] Implement destruction of TLS and static global references (#200227)
This implements destruction of lifetime-extended reference temporaries
used to initialize TLS or static duration reference variables.
Assisted-by: Cursor / claude-opus-4.7
[CIR] Fix insertion point tracking for switch with cleanups (#201210)
We had some problems where we would incorrectly maintain the insertion
point for switch statements that contained cleanup scopes. This resulted
in cir.scope statements without a terminator, tripping a verification
error.
This change adds a RunCleanupsScope RAII object for the switch statement
and adds a check inside popCleanup() to avoid moving the insertion point
to the point after the now-closed cleanup scope if the insertion point
had previously been somewhere other than inside the cleanup scope.
Assisted-by: Cursor / claude-opus-4.8
[CIR] Coerce Direct args and returns in CallConvLowering (#195879)
Fourth PR in the split of #192119/#192124. Implements the
Direct-with-coercion path in CallConvLowering.
Every Direct argument or return whose ABI type differs from its source
type is now coerced through a store/reload roundtrip via an entry-block
alloca, mirroring classic codegen's CreateCoercedLoad/CreateCoercedStore.
The temporary alloca uses max(srcAlign, dstAlign) from the DataLayout and
is hoisted into the entry block so it composes with HoistAllocas
regardless of pipeline order. When the coerced type is larger than the
source -- e.g. a 12-byte aggregate returned as { i64, i64 } -- the slot is
sized to the larger type and accessed through a source-typed view for the
store and a destination-typed view for the load, so neither side
over-reads.
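
The store/reload roundtrip can be pictured in plain C++. This is only an
analogy under stated assumptions, not the CIR lowering itself: `Src`,
`Dst`, `coerceToABI`, and `roundTrip` are hypothetical names, and the
slot is zero-filled here purely to make the example deterministic.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

struct Src { uint32_t A, B, C; }; // 12-byte source aggregate
struct Dst { uint64_t Lo, Hi; };  // stand-in for the { i64, i64 } ABI type

static_assert(sizeof(Src) == 12, "example assumes a 12-byte source");
static_assert(sizeof(Dst) == 16, "example assumes a 16-byte destination");

// Coerce through a temporary slot sized and aligned for the larger type.
Dst coerceToABI(const Src &S) {
  constexpr std::size_t SlotSize = std::max(sizeof(Src), sizeof(Dst));
  constexpr std::size_t SlotAlign = std::max(alignof(Src), alignof(Dst));
  alignas(SlotAlign) unsigned char Slot[SlotSize] = {};

  std::memcpy(Slot, &S, sizeof(Src)); // store: writes only sizeof(Src) bytes
  Dst D;
  std::memcpy(&D, Slot, sizeof(Dst)); // load: reads only within the slot
  return D;
}

// Copies the source-sized prefix back out to check nothing was lost.
Src roundTrip(const Src &S) {
  Dst D = coerceToABI(S);
  Src Back;
  std::memcpy(&Back, &D, sizeof(Src));
  assert(std::memcmp(&Back, &S, sizeof(Src)) == 0);
  return Back;
}
```

Because the slot is as large as the bigger of the two types, the 16-byte
load never reads past the temporary, and the 12-byte store never writes
past the source.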
CallConvLowering is split into three phases (function-definition
coercion, call-site rewriting, and Ignore cleanup) because in-place
block-argument type changes from Direct-with-coerce otherwise confused the
[3 lines not shown]
[clang-sycl-linker][test] Improve dry-run mode and tighten test coverage (#200513)
- Rework `--dry-run` in `clang-sycl-linker` so it skips all real output
(writing bitcode, executing tools, etc.).
- The `link:`, `sycl-module-split:`, and a new `sycl-bundle:` summary
line are now gated on `-v` alone.
- Tighten `sycl-bundle:` checks in `basic.ll`, `split-mode.ll`, and
`triple.ll` to pin kind, triple, and arch (instead of just kind),
and add `-NOT: {{.+}}` after fully-covered dry-run check groups.
- Replace the `clang-sycl-linker` + `llvm-objdump --offloading`
round-trip with a single `--dry-run -v` invocation.
- Add a dedicated `non-dry-run` mode test to verify code paths not exposed
in `dry-run`.
Assisted by Claude.
[X86][APX] Extend original LI to the same range as DstReg (#199182)
#189222 folds NDD+Load to a non-NDD instruction when the NDD memory
variant is not preferred. However, this changes DstReg from a regular def
to an early-clobber def, which causes a "corrupted sub-interval" error in
reMaterializeFor, because OrigLI is not updated at the same time.
Fixes: https://godbolt.org/z/7n8ozz1EG
Assisted-by: Claude Sonnet 4.6
[libc] add shrink in-place support for reallocations (#200272)
This PR adds in-place shrinking to the freelist heap. This allows the
heap to reuse the freed space when a reallocation shrinks a block by
more than a minimal block unit.
Synthesized random-action tests show that this increases the heap
utilization rate from 87% to 97%, which roughly matches the expectation
for dlmalloc.
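
The shrink decision can be sketched roughly as follows. This is an
illustration under assumptions, not the libc freelist heap code:
`kMinBlockSize`, `ShrinkResult`, and `shrinkInPlace` are hypothetical,
the threshold value is arbitrary, and alignment and block headers are
ignored.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal block unit; the real value depends on the heap.
constexpr std::size_t kMinBlockSize = 32;

struct ShrinkResult {
  bool Split;        // whether the tail was split off and returned
  std::size_t Kept;  // bytes still owned by the allocation
  std::size_t Freed; // bytes handed back to the free list
};

// Decide whether shrinking from OldSize to NewSize frees a usable tail.
ShrinkResult shrinkInPlace(std::size_t OldSize, std::size_t NewSize) {
  assert(NewSize <= OldSize);
  std::size_t Tail = OldSize - NewSize;
  if (Tail <= kMinBlockSize)
    return {false, OldSize, 0}; // tail too small to form a free block
  return {true, NewSize, Tail}; // split and give the tail back
}
```

In this sketch, shrinking a 256-byte block to 64 bytes frees 192 bytes,
while shrinking it to 240 bytes leaves the block untouched.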
Assisted-by: AI tools, manually checked.
filesystems/py-fuse-bindings: Clean up fuse bl3
There was longstanding commented-out confusion about whether this
depended on some fuse implementation or the specific standard but
non-portable approach. Decide that mk/fuse.buildlink3.mk is the right
answer and just do that, without any commented-out alternatives.
filesystems/py-fuse-bindings: Update to 1.0.9
Upstream's new tests fail, and I don't think that's a pkgsrc bug, but
a test bug.
Works with bup!
Upstream NEWS:
bug fixes and minor improvements