[docs][Kaleidoscope] fix function name InitializeModuleAndManagers in Kaleidoscope (#199601)
### Description
resloves #199477
The Kaleidoscope tutorial was not fully updated with the new Pass
Manager. This pr aligns the tutorial doc with the example code.
### Changes
- Use `InitializeModuleAndManagers` instead of
`InitializeModuleAndPassManager`.
- Remove `TheModule->setDataLayout(TheJIT->getDataLayout());` in line
141, as the `setDataLayout` was introduced later.
- Use `KaleidoscopeJIT` instead of `my cool jit` as the ModuleName, to
align with the final code.
[M68k] Add to LINK_COMPONENTS to fix BUILD_SHARED_LIBS build (#201248)
Fixes: 6897c5e24ce5 ("[M68k][MC] Add MC support for PCI w/ base
displacement addressing mode (#200696)")
[NVPTX] NVVMIntrRange: Handle maxntid > UINT32_MAX. (#201245)
Previously we computed the overall maxntid and downcast it to unsigned
int. This is not correct; it can be larger than UINT32_MAX.
This would cause reads of tid.xyz and ntid.xyz to have incorrect range
information. Also if maxntid was an exact multiple of 2^32, we'd get an
ICE (because we'd incorrectly think that maxntid is 0).
[CIR] Implement destruction of TLS and static global references (#200227)
This implements destruction of lifetime-extended reference temporaries
used to initialize TLS or static duration reference variables.
Assisted-by: Cursor / claude-opus-4.7
[CIR] Fix insertion point tracking for switch with cleanups (#201210)
We had some problems where we would incorrectly maintain the insertion
point for switch statements that contained cleanup scopes. This resulted
in cir.scope statements without a terminator, tripping a verification
error.
This change adds a RunCleanupsScope RAII object for the switch statement
and adds a check inside popCleanup() to avoid moving the insertion point
to the point after the now-closed cleanup scope if the insertion point
had previously been somewhere other than inside the cleanup scope.
Assisted-by: Cursor / claude-opus-4.8
[CIR] Coerce Direct args and returns in CallConvLowering (#195879)
Fourth PR in the split of #192119/#192124. Implements the
Direct-with-coercion path in CallConvLowering.
Every Direct argument or return whose ABI type differs from its source
type is now coerced through a store/reload roundtrip via an entry-block
alloca, mirroring classic codegen's CreateCoercedLoad/CreateCoercedStore.
The temporary alloca uses max(srcAlign, dstAlign) from the DataLayout and
is hoisted into the entry block so it composes with HoistAllocas
regardless of pipeline order. When the coerced type is larger than the
source -- e.g. a 12-byte aggregate returned as { i64, i64 } -- the slot is
sized to the larger type and accessed through a source-typed view for the
store and a destination-typed view for the load, so neither side
over-reads.
CallConvLowering is split into three phases (function-definition
coercion, call-site rewriting, and Ignore cleanup) because in-place
block-argument type changes from Direct-with-coerce otherwise confused the
[3 lines not shown]
[clang-sycl-linker][test] Improve dry-run mode and tighten test coverage (#200513)
- Rework `--dry-run` in `clang-sycl-linker` so it skips all real output
(writing bitcode, executing tools, etc.).
- The `link:`, `sycl-module-split:`, and a new `sycl-bundle:` summary
line are now gated on `-v` alone.
- Tighten `sycl-bundle:` checks in `basic.ll`, `split-mode.ll`, and
`triple.ll` to pin kind, triple, and arch (instead of just kind),
and add `-NOT: {{.+}}` after fully-covered dry-run check groups.
- replace the `clang-sycl-linker` + `llvm-objdump --offloading`
round-trip with a single `--dry-run -v` invocation.
- add dedicated `non-dry-run` mode test to verify code paths not exposed
in `dry-run`.
Assisted by Claude.
[X86][APX] Extend original LI to the same range as DstReg (#199182)
The #189222 folds NDD+Load to non-NDD when NDD memory variant not
preferred. However, this will changes DstReg from regular def to
early-clobber def, which causes "corrupted sub-interval" in
reMaterializeFor, because the OrigLI is not updated at the same time.
Fixes: https://godbolt.org/z/7n8ozz1EG
Assisted-by: Claude Sonnet 4.6
[libc] add shrink in-place support for reallocations (#200272)
This PR adds shrinking in-place for the freelist heap. This allows the
heap to reuse the place if the reallocation shrinks the size larger than
a minimal block unit.
Synthesized random action tests show that that increase heap utilization
rate from 87% to 97% percent, basically aligns with the expectation of
dlmalloc.
Assisted-by: AI tools, manually checked.
filesystems/py-fuse-bindings: Clean up fuse bl3
There was longstanding commented-out confusion about whether this
depended on some fuse implementation or the specific standard but
non-portable approach. Decide that mk/fuse.buildlink3.mk is the right
answer and just do that, without any commented-out alternatives.
filesystems/py-fuse-bindings: Update to 1.0.9
Upstream's new tests fail, and I don't think that's a pkgsrc bug, but
a test bug.
Works with bup!
Upstream NEWS:
bug fixes and minor improvements
[CIR] Implement lowering for const-emitted global compound literals (#201152)
This came up in a test suite as a NYI, it is just emitting a
constant-backing literal for an initializer. These are specific to C, as
global compound literals have static storage duration in C. This patch,
just like classic codgen, just creates a '.compoundliteral' object as
backing for these variables, and lets us create references to them.
---------
Co-authored-by: Andy Kaylor <akaylor at nvidia.com>
[lldb] Stop hard-linking libpython into the dynamic Python plugin (#200530)
Drops ${Python3_LIBRARIES} from the SHARED build of
lldbPluginScriptInterpreterPython and lets undefined Python symbols
through at link time (`-undefined dynamic_lookup` on Darwin,
`--allow-shlib-undefined` on Linux; Windows keeps its existing
delay-load + import lib).
SystemInitializerFull::Initialize resolves the Python runtime loader
via ScriptInterpreterRuntimeLoader::Get(eScriptLanguagePython) and
calls Load() before initializing any plugin, so libpython is mapped
into the process before either entry point that references it: the
static script interpreter's Initialize() (which invokes Python via
the LLDB_PLUGIN_INITIALIZE loop) and the dynamic plugin's dlopen
(whose undefined references resolve against the in-process
libpython). This covers both LLDB_ENABLE_DYNAMIC_SCRIPTINTERPRETERS
=ON and =OFF, and keeps Windows working in static builds where the
delay-load thunks live in liblldb itself. The loader is
once_flag-cached, and errors propagate out via the existing Expected
[14 lines not shown]
filesystems/py-fuse-bindings: Adapt to python function deprecations
convert to wheel.mk
Now, importing fuse in python 3.13 succeeds, instead of failing with a
missing symbol, as one would expect from the undefined name warning
during the build.
Fix uninitialized variable warning in vdev_prop_get()
Update vdev_prop_get_objid() to set objid on error as the comment
in vdev_prop_get() describes.
"objid is set to 0 when absent and the few cases that call
zap_lookup directly guard against this below."
This resolves the following possible uninitialized variable warning.
module/zfs/vdev.c: In function ‘vdev_prop_get’:
module/zfs/vdev.c:6913:12: error: ‘objid’ may be used uninitialized
in this function [-Werror=maybe-uninitialized]
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Closes #18616
sharenfs: Check for invalid characters
Check for invalid characters in sharenfs/sharesmb dataset props.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Tony Hutter <hutter2 at llnl.gov>
Closes #18613
[lldb] Add PythonRuntimeLoader for runtime libpython lookup (NFC) (#200524)
Generalizes the Windows-only Python lookup in PythonPathSetup into a
cross-platform abstraction. Adds an abstract ScriptInterpreterRuntimeLoader
with a per-language factory. The Python implementation dynamically loads Python
library into the current process.
The loader no-ops when Python is already in the process, then walks
LLDB_PYTHON_LIBRARY env override, the build-time Python
(LLDB_PYTHON_RUNTIME_LIBRARY_BUILD_PATH) and finally a platform candidate list:
- Darwin: DEVELOPER_DIR, the bundled Xcode.app, and Command Line Tools joined
against Python3.framework. Then python.org, /opt/homebrew, and /usr/local
joined against Python.framework. Then xcrun -f python3 and if that fails,
libpython3.dylib as a last resort.
- Linux: libpython3.so plus descending stable-ABI SONAMEs.
- Windows: the LLDB_PYTHON_RUNTIME_LIBRARY_FILENAME bare name (resolved via the
loader's default search list) and the exe-relative
LLDB_PYTHON_DLL_RELATIVE_PATH fallback (built off GetModuleFileNameW).
[5 lines not shown]
[clang-linker-wrapper] Drop SYCL dry-run stub-image special case (#201222)
Remove the `DryRun` branch in `bundleSYCL` that emitted a stub
`OffloadBinary`. SYCL goes through the same empty-buffer path as other
offload kinds, so the special case is no longer needed.
Update `linker-wrapper-image.c` to expect the resulting `[0 x i8]
zeroinitializer` constant and a size of `0` in the register/unregister
calls.
Assisted by Claude.
[CIR] Set ExternalWeakLinkage on weak/weak_import function declarations (#198422)
Classic CodeGen's `SetFunctionAttributes` calls `setLinkageForGV` to force `ExternalWeakLinkage` on `__attribute__((weak))` and Darwin `weak_import` declarations. CIR had no equivalent: weak function declarations were emitted with `ExternalLinkage` instead of `ExternalWeakLinkage`.
This adds `setLinkageForFunction` — the same weak/external-weak logic as `setLinkageForGV` — and calls it from `setFunctionAttributes`. The underlying crash on inline forward declarations (the original motivation) is already fixed by #195257; what remains is this linkage gap.
`inline-forward-decl.c` covers `__attribute__((weak))` on an inline forward declaration; `func-linkage-weak-import.c` covers Darwin `weak_import` (→ `extern_weak` in CIR and LLVM).
[NVPTX] Fix aggregate load/store lowering for (potentially) overlapping copies (#201177)
NVPTXLowerAggrCopies lowers load/store pairs of large values into a loop
of smaller copies.
However, it was incorrectly assuming that the load/store pairs it found
never alias.
This patch adds an alias check. If the pointers may alias, we emit a
memmov, which handles overlap correctly.
CUDA reproducer:
typedef char vec __attribute__((vector_size(256)));
__global__ void boom(char *p) {
*(vec *)(p + 8) = *(vec *)p;
}