[docs][Kaleidoscope] fix function name InitializeModuleAndManagers in Kaleidoscope (#199601)
### Description
Resolves #199477
The Kaleidoscope tutorial was not fully updated for the new Pass
Manager. This PR aligns the tutorial doc with the example code.
### Changes
- Use `InitializeModuleAndManagers` instead of
`InitializeModuleAndPassManager`.
- Remove `TheModule->setDataLayout(TheJIT->getDataLayout());` on line
141, as `setDataLayout` is introduced later.
- Use `KaleidoscopeJIT` instead of `my cool jit` as the ModuleName, to
align with the final code.
[M68k] Add to LINK_COMPONENTS to fix BUILD_SHARED_LIBS build (#201248)
Fixes: 6897c5e24ce5 ("[M68k][MC] Add MC support for PCI w/ base
displacement addressing mode (#200696)")
[NVPTX] NVVMIntrRange: Handle maxntid > UINT32_MAX. (#201245)
Previously we computed the overall maxntid and downcast it to unsigned
int. This is not correct; it can be larger than UINT32_MAX.
This would cause reads of tid.xyz and ntid.xyz to have incorrect range
information. Also if maxntid was an exact multiple of 2^32, we'd get an
ICE (because we'd incorrectly think that maxntid is 0).
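The truncation described above can be reproduced in isolation. This is a minimal sketch, not the NVVMIntrRange code: `overallMaxNtid`, `truncatedBound`, and `clampedBound` are hypothetical helpers, and clamping is only one possible way to keep the bound usable, not necessarily what the patch does.

```cpp
#include <cstdint>

// Hypothetical helper: combine the per-dimension limits in 64 bits.
// Assumes the dimensions are small enough that the product fits in 64 bits.
inline std::uint64_t overallMaxNtid(std::uint32_t x, std::uint32_t y,
                                    std::uint32_t z) {
  return static_cast<std::uint64_t>(x) * y * z;
}

// Buggy pattern: narrowing keeps only the low 32 bits, so an exact
// multiple of 2^32 becomes 0.
inline std::uint32_t truncatedBound(std::uint64_t maxNtid) {
  return static_cast<std::uint32_t>(maxNtid);
}

// One possible alternative (an assumption): saturate at UINT32_MAX
// instead of wrapping.
inline std::uint64_t clampedBound(std::uint64_t maxNtid) {
  return maxNtid > UINT32_MAX ? UINT32_MAX : maxNtid;
}
```

With limits of 65536 x 65536 x 1 the product is exactly 2^32, so the truncated bound is 0 while the clamped bound stays at UINT32_MAX.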
[CIR] Implement destruction of TLS and static global references (#200227)
This implements destruction of lifetime-extended reference temporaries
used to initialize TLS or static duration reference variables.
Assisted-by: Cursor / claude-opus-4.7
[CIR] Fix insertion point tracking for switch with cleanups (#201210)
We had some problems where we would incorrectly maintain the insertion
point for switch statements that contained cleanup scopes. This resulted
in cir.scope statements without a terminator, tripping a verification
error.
This change adds a RunCleanupsScope RAII object for the switch statement
and adds a check inside popCleanup() to avoid moving the insertion point
to the point after the now-closed cleanup scope if the insertion point
had previously been somewhere other than inside the cleanup scope.
Assisted-by: Cursor / claude-opus-4.8
[CIR] Coerce Direct args and returns in CallConvLowering (#195879)
Fourth PR in the split of #192119/#192124. Implements the
Direct-with-coercion path in CallConvLowering.
Every Direct argument or return whose ABI type differs from its source
type is now coerced through a store/reload roundtrip via an entry-block
alloca, mirroring classic codegen's CreateCoercedLoad/CreateCoercedStore.
The temporary alloca uses max(srcAlign, dstAlign) from the DataLayout and
is hoisted into the entry block so it composes with HoistAllocas
regardless of pipeline order. When the coerced type is larger than the
source -- e.g. a 12-byte aggregate returned as { i64, i64 } -- the slot is
sized to the larger type and accessed through a source-typed view for the
store and a destination-typed view for the load, so neither side
over-reads.
CallConvLowering is split into three phases (function-definition
coercion, call-site rewriting, and Ignore cleanup) because in-place
block-argument type changes from Direct-with-coerce otherwise confused the
[3 lines not shown]
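The store/reload roundtrip through a slot sized to the larger type can be illustrated outside the compiler. This is a rough analogy in plain C++, not the CallConvLowering implementation: `Src`, `Dst`, and `coerceViaSlot` are hypothetical names, `Dst` merely stands in for `{ i64, i64 }`, and zero-filling the slot is a choice made here so the trailing bytes are deterministic.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

struct Src { std::uint32_t a, b, c; };  // 12-byte source aggregate
struct Dst { std::uint64_t lo, hi; };   // 16 bytes, stands in for { i64, i64 }

// Hypothetical helper: coerce Src to Dst through a temporary slot.
inline Dst coerceViaSlot(const Src &s) {
  // Size and align the slot for the larger of the two types.
  constexpr std::size_t slotSize =
      sizeof(Src) > sizeof(Dst) ? sizeof(Src) : sizeof(Dst);
  constexpr std::size_t slotAlign =
      alignof(Src) > alignof(Dst) ? alignof(Src) : alignof(Dst);
  alignas(slotAlign) unsigned char slot[slotSize] = {};

  std::memcpy(slot, &s, sizeof(Src));  // source-typed store: 12 bytes only
  Dst d;
  std::memcpy(&d, slot, sizeof(Dst));  // destination-typed load: 16 bytes
  return d;
}
```

Because the slot is 16 bytes, the 16-byte load never reads past it, and the 12-byte store never reads past the source object.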
[clang-sycl-linker][test] Improve dry-run mode and tighten test coverage (#200513)
- Rework `--dry-run` in `clang-sycl-linker` so it skips all real output
(writing bitcode, executing tools, etc.).
- The `link:`, `sycl-module-split:`, and a new `sycl-bundle:` summary
line are now gated on `-v` alone.
- Tighten `sycl-bundle:` checks in `basic.ll`, `split-mode.ll`, and
`triple.ll` to pin kind, triple, and arch (instead of just kind),
and add `-NOT: {{.+}}` after fully-covered dry-run check groups.
- Replace the `clang-sycl-linker` + `llvm-objdump --offloading`
round-trip with a single `--dry-run -v` invocation.
- Add a dedicated `non-dry-run` mode test to verify code paths not exposed
in `dry-run`.
Assisted by Claude.
[X86][APX] Extend original LI to the same range as DstReg (#199182)
#189222 folds NDD+Load to non-NDD when the NDD memory variant is not
preferred. However, this changes DstReg from a regular def to an
early-clobber def, which causes "corrupted sub-interval" in
reMaterializeFor, because the OrigLI is not updated at the same time.
Fixes: https://godbolt.org/z/7n8ozz1EG
Assisted-by: Claude Sonnet 4.6
[libc] add shrink in-place support for reallocations (#200272)
This PR adds in-place shrinking to the freelist heap. This allows the
heap to reuse the freed space when a reallocation shrinks a block by
more than a minimal block unit.
Synthesized random-action tests show that this increases the heap
utilization rate from 87% to 97%, which is roughly in line with the
expectation for dlmalloc.
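The shrink decision can be sketched as a pure size calculation. This is an illustration only, not the libc freelist heap code: `ShrinkPlan` and `planShrink` are hypothetical, block headers and alignment are ignored, and treating a tail of exactly `minBlock` bytes as splittable is an assumption.

```cpp
#include <cstddef>

struct ShrinkPlan {
  std::size_t kept;      // bytes that stay with the allocation
  std::size_t released;  // bytes split off as a new free block (0 if none)
};

// Hypothetical helper: split off the tail only when it is big enough to
// form a block of its own; otherwise keep the whole block.
inline ShrinkPlan planShrink(std::size_t blockSize, std::size_t newSize,
                             std::size_t minBlock) {
  if (newSize <= blockSize && blockSize - newSize >= minBlock)
    return {newSize, blockSize - newSize};
  return {blockSize, 0};
}
```

Shrinking a 128-byte block to 32 bytes with a 16-byte minimum releases 96 bytes, while shrinking it to 120 bytes releases nothing because the 8-byte tail is too small to reuse.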
Assisted-by: AI tools, manually checked.
[CIR] Implement lowering for const-emitted global compound literals (#201152)
This came up in a test suite as an NYI; it just emits a
constant-backing literal for an initializer. These are specific to C, as
global compound literals have static storage duration in C. This patch,
just like classic codegen, creates a '.compoundliteral' object as
backing for these variables, and lets us create references to them.
---------
Co-authored-by: Andy Kaylor <akaylor at nvidia.com>
[lldb] Stop hard-linking libpython into the dynamic Python plugin (#200530)
Drops ${Python3_LIBRARIES} from the SHARED build of
lldbPluginScriptInterpreterPython and lets undefined Python symbols
through at link time (`-undefined dynamic_lookup` on Darwin,
`--allow-shlib-undefined` on Linux; Windows keeps its existing
delay-load + import lib).
SystemInitializerFull::Initialize resolves the Python runtime loader
via ScriptInterpreterRuntimeLoader::Get(eScriptLanguagePython) and
calls Load() before initializing any plugin, so libpython is mapped
into the process before either entry point that references it: the
static script interpreter's Initialize() (which invokes Python via
the LLDB_PLUGIN_INITIALIZE loop) and the dynamic plugin's dlopen
(whose undefined references resolve against the in-process
libpython). This covers both LLDB_ENABLE_DYNAMIC_SCRIPTINTERPRETERS
=ON and =OFF, and keeps Windows working in static builds where the
delay-load thunks live in liblldb itself. The loader is
once_flag-cached, and errors propagate out via the existing Expected
[14 lines not shown]
[lldb] Add PythonRuntimeLoader for runtime libpython lookup (NFC) (#200524)
Generalizes the Windows-only Python lookup in PythonPathSetup into a
cross-platform abstraction. Adds an abstract ScriptInterpreterRuntimeLoader
with a per-language factory. The Python implementation dynamically loads the
Python library into the current process.
The loader no-ops when Python is already in the process. Otherwise it tries the
LLDB_PYTHON_LIBRARY environment override, then the build-time Python
(LLDB_PYTHON_RUNTIME_LIBRARY_BUILD_PATH), and finally a platform candidate list:
- Darwin: DEVELOPER_DIR, the bundled Xcode.app, and Command Line Tools joined
against Python3.framework. Then python.org, /opt/homebrew, and /usr/local
joined against Python.framework. Then xcrun -f python3 and if that fails,
libpython3.dylib as a last resort.
- Linux: libpython3.so plus descending stable-ABI SONAMEs.
- Windows: the LLDB_PYTHON_RUNTIME_LIBRARY_FILENAME bare name (resolved via the
loader's default search list) and the exe-relative
LLDB_PYTHON_DLL_RELATIVE_PATH fallback (built off GetModuleFileNameW).
[5 lines not shown]
[clang-linker-wrapper] Drop SYCL dry-run stub-image special case (#201222)
Remove the `DryRun` branch in `bundleSYCL` that emitted a stub
`OffloadBinary`. SYCL goes through the same empty-buffer path as other
offload kinds, so the special case is no longer needed.
Update `linker-wrapper-image.c` to expect the resulting `[0 x i8]
zeroinitializer` constant and a size of `0` in the register/unregister
calls.
Assisted by Claude.
[CIR] Set ExternalWeakLinkage on weak/weak_import function declarations (#198422)
Classic CodeGen's `SetFunctionAttributes` calls `setLinkageForGV` to force `ExternalWeakLinkage` on `__attribute__((weak))` and Darwin `weak_import` declarations. CIR had no equivalent: weak function declarations were emitted with `ExternalLinkage` instead of `ExternalWeakLinkage`.
This adds `setLinkageForFunction` — the same weak/external-weak logic as `setLinkageForGV` — and calls it from `setFunctionAttributes`. The underlying crash on inline forward declarations (the original motivation) is already fixed by #195257; what remains is this linkage gap.
`inline-forward-decl.c` covers `__attribute__((weak))` on an inline forward declaration; `func-linkage-weak-import.c` covers Darwin `weak_import` (→ `extern_weak` in CIR and LLVM).
[NVPTX] Fix aggregate load/store lowering for (potentially) overlapping copies (#201177)
NVPTXLowerAggrCopies lowers load/store pairs of large values into a loop
of smaller copies.
However, it was incorrectly assuming that the load/store pairs it found
never alias.
This patch adds an alias check. If the pointers may alias, we emit a
memmove, which handles overlap correctly.
CUDA reproducer:

    typedef char vec __attribute__((vector_size(256)));
    __global__ void boom(char *p) {
      *(vec *)(p + 8) = *(vec *)p;
    }
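The same hazard shows up on the host with a byte-wise loop. This is a minimal sketch, not the NVPTXLowerAggrCopies code: `forwardCopy` and `shiftBy8` are hypothetical helpers that mimic the reproducer's 8-byte offset on a small buffer.

```cpp
#include <array>
#include <cstddef>
#include <cstring>

// Naive forward loop: only correct when the two ranges do not overlap.
inline void forwardCopy(char *dst, const char *src, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = src[i];
}

// Fills a 24-byte buffer with 0..23, then copies bytes [0, 16) to [8, 24).
inline std::array<char, 24> shiftBy8(bool useMemmove) {
  std::array<char, 24> buf{};
  for (std::size_t i = 0; i < buf.size(); ++i)
    buf[i] = static_cast<char>(i);
  if (useMemmove)
    std::memmove(buf.data() + 8, buf.data(), 16);  // overlap-safe
  else
    forwardCopy(buf.data() + 8, buf.data(), 16);   // clobbers its own source
  return buf;
}
```

With the forward loop, bytes 16..23 end up holding 0..7 because the source bytes 8..15 were overwritten before they were read; with `memmove` they hold 8..15, which matches the load-everything-then-store semantics of the original load/store pair.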
[lldb][debugserver] Arguments to kill(2) are reversed (#201226)
This codepath is only executed as an attempt to clean up during a failed
launch, so the reversed arguments were rarely actually used.
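For reference, POSIX declares `int kill(pid_t pid, int sig)`, with the process ID first and the signal second. Both parameters are integer types, so a swapped call compiles without complaint. The wrapper below is a hypothetical sketch for a POSIX host, not the debugserver code.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical wrapper that keeps the POSIX argument order explicit:
// process ID first, signal number second.
inline int sendSignal(pid_t pid, int sig) {
  return ::kill(pid, sig);
}
```

Passing signal 0 performs only the existence and permission checks, so `sendSignal(getpid(), 0)` is a harmless way to exercise the call.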
rdar://175507620
[docs] Migrate 22 popular LLVM docs to MyST
This was done with LLM assistance.
I opened all 22 docs in a browser and scrolled through them, catching
and fixing a few errors.
[docs] Rename 20 popular LLVM docs .rst -> .md
Update filename references so that the docs build passes premerge
checks, but leave the docs with reST syntax to ensure rename detection
works.
[VPlan] Don't expand SCEVs without uses to VPInstructions (NFC). (#201221)
If a VPExpandSCEVRecipe does not have users, there's no benefit in
expanding it to VPInstructions, which then have to be cleaned up.
This also prevents DCE from removing VPInstructions pointed to by
TripCount after expansion.
[lldb] Have TestRunLocker run both styles of launch (#200978)
While debugging flaky behavior with TestRunLocker, I noticed that it is
intended to run its test once with a stop at the entry function (and
then continue) and once where we launch to the main() loop. But we were
never exercising the stop-at-entry codepath.
This doesn't fix the flaky behavior, which only happens with
the launch-directly-into-main() codepath; I don't get failures when I
stop at the entry point and then continue.
[ORC] Make SimpleExecutorDylibManager::resolve an instance method. (#201211)
Promote the lambda inside resolveWrapper to a public method on
SimpleExecutorDylibManager. This brings SimpleExecutorDylibManager into
better alignment with the NativeDylibManager implementation in the new
ORC runtime, and is a step towards allowing NativeDylibManager to be
used as a drop-in replacement for SimpleExecutorDylibManager.
[RISCV][GISel] Add GPRPair to GPRB register bank and use getXLen() for GPRSize
Map GPRPair register classes to the GPRB register bank during GlobalISel
instruction selection. This is required because the introduction of HwMode-dependent
base pointer register classes (e.g. via PtrRegClassByHwMode) causes TableGen to
emit register bank checks for GPRPair variants in RISCVGenGlobalISel.inc.
Without this mapping, instruction selection crashes on unsupported classes.
To avoid assertion failures when GPRB's maximum size increases to 128-bit on RV64
due to the register pairs, update RISCVRegisterBankInfo::getInstrMapping to query
Subtarget.getXLen() for the scalar register width instead of relying on the bank's
getMaximumSize(). This matches AArch64's design pattern of mapping register pairs
(XSeqPairsClass) to GPR and resolving scalar register sizes dynamically.
This was fine previously but was exposed by the HwMode changes in
https://github.com/llvm/llvm-project/pull/177073.
Pull Request: https://github.com/llvm/llvm-project/pull/200510
[mlir][bytecode] Add option to elide locations during serialization (#201183)
Adds a setElideLocations option to BytecodeWriterConfig to elide
locations during bytecode serialization. When enabled, all LocationAttrs
are mapped to UnknownLoc during numbering and writing to produce
location-invariant bytecode (e.g., for stable fingerprinting).
Another way to achieve the same thing would be to apply the
strip-debuginfo pass, but that requires mutating the module, which in
turn requires cloning the module if one still requires the unstripped
original.
Assisted-by: Antigravity / Gemini
[cmake] Fix host tool path with driver build on Windows (#199152)
On Windows, the llvm-shlib dylib build uses the llvm-nm host tool to
make all symbols visible by default. The LLVM_TOOL_LLVM_DRIVER_BUILD=ON
build would fail because $<TARGET_FILE:llvm-nm> was invalid. This change
passes the name of the symlink / executable copy as a custom property so
things work out and the llvm-nm.exe host tool can be found.