[SelectionDAG] Add ISD::ABS_MIN_POISON to preserve poison semantics of llvm.abs (#183851)
SelectionDAGBuilder previously dropped the is_int_min_poison flag on
llvm.abs, lowering both variants to ISD::ABS. This is unsound for
certain targets like NVPTX whose native abs.s is poison on INT_MIN. This
PR adds a new ISD::ABS_MIN_POISON opcode, emits it for llvm.abs(x, true),
and threads it through legalization and SDAG folding. The default action
for it is Expand, with a fallback to the original ISD::ABS. DAGCombiner
adds visitABS_MIN_POISON, which mirrors visitABS, plus two new folds:
abs_min_poison(freeze(abs x)) -> freeze(abs x) and
abs_min_poison(sign_extend_inreg x) -> zext(abs(trunc x)). PromoteIntRes,
ExpandIntRes, and widenAbs now emit ABS_MIN_POISON whenever the input
provably can't be the wide INT_MIN. The NVPTX backend now matches abs.s
against abs_min_poison, and tests are updated accordingly.
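For readers unfamiliar with the two llvm.abs variants, the difference can be
modelled in plain C++. This is only a sketch of the semantics, not code from
the patch: the function names are hypothetical, and std::nullopt stands in
for poison, which has no direct C++ equivalent.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Models llvm.abs(x, /*is_int_min_poison=*/false): INT_MIN maps to itself.
// The negation is done in unsigned arithmetic so it never overflows.
inline int32_t abs_wrapping(int32_t x) {
  uint32_t ux = static_cast<uint32_t>(x);
  return static_cast<int32_t>(x < 0 ? 0u - ux : ux);
}

// Models llvm.abs(x, /*is_int_min_poison=*/true): there is no defined
// result for INT_MIN. std::nullopt is an illustrative stand-in for poison.
inline std::optional<int32_t> abs_min_poison(int32_t x) {
  if (x == std::numeric_limits<int32_t>::min())
    return std::nullopt;
  return abs_wrapping(x);
}
```

A native instruction that is poison on INT_MIN is a valid lowering only for
the second function, which is why the two variants need distinct opcodes.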
[mlir][vector] Add fold to transfer_{read,write} vector<1xT> (#196598)
The permutation maps of vector.transfer_read and vector.transfer_write
are irrelevant for vector<1xT>. This pattern unblocks lowerings to
vector.load and vector.store.
Assisted-By: Claude Opus 4.6
[clang-sycl-linker] Add per-translation-unit device code split mode (#197571)
Adds `source` split mode to `clang-sycl-linker`, driven by the
`sycl-module-id` function attribute emitted by the CFE.
`source` is the default mode and groups kernels by the value of their
`sycl-module-id` attribute, emitting one device image per translation
unit.
If the linked module contains no entry points, no splitting happens.
The `EntryPointCategorizer` in `ClangSYCLLinker.cpp` is refactored into
a class (instead of a stateful lambda) to support both per-kernel and
per-TU modes cleanly.
Also fix a potential buffer invalidation bug in sycl::writeSymbolTable
where appending symbol names could reallocate the output buffer while
pointers into it were still live.
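The class of bug described above can be avoided by recording offsets rather
than pointers while the buffer is still growing. The following is a minimal
sketch of that idea, not the actual change to sycl::writeSymbolTable;
`appendNames` and its parameters are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: appends each name, NUL-terminated, to `buffer` and
// returns the offset at which each name starts. Offsets stay valid even if
// `buffer` reallocates during a later append; raw pointers would not.
std::vector<std::size_t> appendNames(std::string &buffer,
                                     const std::vector<std::string> &names) {
  std::vector<std::size_t> offsets;
  offsets.reserve(names.size());
  for (const std::string &name : names) {
    offsets.push_back(buffer.size());
    buffer.append(name);
    buffer.push_back('\0');
  }
  return offsets;
}
```

Pointers are then formed as `buffer.data() + offsets[i]` only after the last
append, when the buffer can no longer move.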
Co-Authored-By: Claude
[2 lines not shown]
[Clang][CUDA] Introduce support for 'f' GPU variants and feature test macros (#197584)
- Clang now accepts 'f' GPU variants as the target for sm_100+ GPUs.
- `__CUDA_ARCH_SPECIFIC__` and `__CUDA_ARCH_FAMILY_SPECIFIC__` are now
defined to allow distinguishing `a`/`f`/base GPU variants.
- refactored BuiltinsNVPTX.td to handle availability quirks introduced
by the 'f' variants, and to simplify additions of new GPU/PTX variants
to just adding a number to a list.
- bulk test changes to deal with the tablegen-generated strings.
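A sketch of how the two macros might be used to tell the variants apart.
Only the macro names come from this change; the mapping assumed here (the
`a` variants define `__CUDA_ARCH_SPECIFIC__`, the `f` variants define
`__CUDA_ARCH_FAMILY_SPECIFIC__`) and the helper name are assumptions, and
the macro values are deliberately not inspected.

```cpp
#include <string_view>

// Hypothetical helper. Checks the arch-specific macro first so the result
// does not depend on whether an 'a' target also defines the family macro.
constexpr std::string_view gpu_variant_kind() {
#if defined(__CUDA_ARCH_SPECIFIC__)
  return "arch-specific";   // assumed: 'a' variant
#elif defined(__CUDA_ARCH_FAMILY_SPECIFIC__)
  return "family-specific"; // assumed: 'f' variant
#else
  return "base";            // neither macro defined, e.g. a host-only build
#endif
}
```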
[lldb] Add missing calls to SetStatus in CommandObjectPlatform (#197548)
Add missing calls to CommandReturnObject::SetStatus in
CommandObjectPlatform. I replaced some calls to AppendMessageWithFormatv
with AppendErrorWithFormatv because the latter implicitly sets the
status.
[flang] Implement -Wno-<warning> flags for driver diagnostics (#196354)
Use the clang::ProcessWarningOptions function to process -Wno-...
options.
Without additional changes, this would have the side effect of turning
driver warnings into errors under -Werror. That would be a change from
the existing behavior, so make sure that these warnings remain
unaffected.
Modify the diagnostic emitter to add the disabling option at the end of
the emitted diagnostic.
Fixes https://github.com/llvm/llvm-project/issues/195921
---------
Co-authored-by: Tarun Prabhu <tarun at lanl.gov>
[lldb] Use current breakpoint location in TestBreakpointLocationDot.py (NFC) (#197472)
Replaces `FindLocationByID(1)` with `FindLocationByID(loc_id)`. Instead
of hard-coding an ID of 1, the location ID is determined from
`GetStopReasonDataAtIndex`.
Some of these asserts were failing on a Windows CI, because the
breakpoint was resolved to **two** locations.
[lldb] Update calls to VerifyBreakpointIDs to handle dummy targets (#197088)
Follow up to #194272. In that change, `VerifyBreakpointIDs` had its
signature changed to take an `ExecutionContext` instead of a `Target`.
In updating the call-sites, `m_exe_ctx` was used. However, in some of
those places, the `target` argument reflected the use of a dummy target.
The switch broke situations where a proper target does not exist.
This update changes those calls back to using the `target` variable,
which may be the dummy target.
The `const` change to `ExecutionContext &` parameters is to support the
passing of `&target`.
[CIR] Lower bool bit-fields correctly (#197085)
`bool` bit-fields like `bool flag : 1;` trip an assertion in CIR codegen
because `cir.set_bitfield` and `cir.get_bitfield` are constrained to
take and produce `CIR_IntType` values, but CIRGen was passing
`convertType(boolType)` (= `!cir.bool`) as the op's result type. Both
load and store paths fail with:
Assertion `isa<To>(Val) && "cast<Ty>() argument of incompatible type!"'
failed
[To = mlir::detail::TypedValue<cir::IntType>, From = mlir::OpResult]
This is the second-largest libcxx-with-CIR blocker behind #197068 — ~119
of 1,494 fails in our May 11 `std/` baseline, mostly in `<format>`
parsing state where bool bit-fields are common.
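For illustration, source of the following shape exercises both the store
and the load path on a `bool` bit-field. The struct and function names are
hypothetical and are not taken from libc++ or from the tests in this change.

```cpp
// Hypothetical parsing-state struct with bool bit-fields packed next to an
// integer bit-field, similar in spirit to what the message describes.
struct ParseState {
  bool seen_sign : 1;
  bool alternate_form : 1;
  unsigned width : 6;
};

// Stores to and then loads from a bool bit-field.
bool toggle_and_read(ParseState &s) {
  s.alternate_form = !s.alternate_form; // store path
  return s.alternate_form;              // load path
}
```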
The fix mirrors classic CodeGen at the bit-field boundary: widen the
bool source to the storage integer type for the op call, and narrow the
integer result back to bool with `int_to_bool` so callers that consume
[11 lines not shown]
Restore clang-doc arena allocation (#197595)
This patch restores the commits reverted in
49f8ad172446dd54dd14c2333a7a0f638e37d05a.
It effectively reapplies:
Reapply "[clang-doc][nfc] Default initialize all StringRef members
(#191641)"
Reapply "[clang-doc] Initialize StringRef members in Info types
(#191637)"
Reapply "[clang-doc] Initialize member variable (#191570)"
Reapply "[clang-doc] Merge data into persistent memory (#190056)"
Reapply "[clang-doc] Support deep copy between arenas for merging
(#190055)"
This version has been updated to account for the new InfoNode<T>
paradigm and APIs introduced earlier. Its logic is largely unchanged: it
still performs deep copies between the transient and persistent arenas,
[3 lines not shown]
[Instrumentor] Add support for modules and globals (#197535)
We can emit callbacks when a module is loaded/unloaded and before
globals are initialized and "removed". Both happen in newly introduced
constructors and destructors.
[AMDGPU] Restore lit() to be accepted as 64-bit FP operand
It is worth noting that the immediate value arrives already truncated
at this point of validation.
[libc][hdrgen] Extend guard attribute support for types (#191663)
Closes #187404
- Add support for an optional guard attribute on types in hdrgen YAML
input.
- Parse and validate guard from YAML in yaml_to_classes.py, ensuring
guard macros have macro_header in the same YAML file.
- Introduce emit_guard, a function that extracts the common logic between
guarded types and guarded functions.
- Add integration tests for both type guarding and function guarding
---------
Co-authored-by: un-pixelated <masterhc321 at gmail.com>
[clang][deps] Consolidate types into new `DependencyConsumer.h` (#197772)
This PR pulls the `DependencyConsumer` type out of
`DependencyScanningWorker.h` into its own header. Just a cleanup, NFC.
[libc][math] Fix UBSan errors from left-shifting negative values (#197747)
Replace left-shift operations on potentially negative exponent values
with mathematically equivalent multiplication to avoid undefined
behavior. When computing exponential functions for inputs that produce
results less than 1, the exponent 'hi' can be negative (e.g., -2 for
exp(-1.0) ≈ 0.368). Left-shifting negative values is undefined behavior
in C++.
Fixed in:
- exp.h: 3 instances
- exp2.h: 3 instances
- exp10.h: 3 instances
- expm1.h: 2 instances
- exp_utils.h: 1 instance (hi + 1022 can be negative)
Exposed by 2b2a63819f9f.
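The replacement can be sketched as follows. The helper name and signature
are hypothetical and the actual expressions in exp.h and friends are not
reproduced here; the point is only that multiplying by a power of two is
defined for negative `hi`, whereas `hi << bits` is undefined before C++20.

```cpp
#include <cstdint>

// Hypothetical helper: computes hi * 2^bits without shifting a negative
// value. Only the non-negative constant 1 is shifted. Assumes bits < 63
// and that the product fits in int64_t.
constexpr int64_t shift_exponent(int64_t hi, unsigned bits) {
  return hi * (int64_t{1} << bits);
}

// hi = -2 moved into the exponent field position of a double (bit 52).
static_assert(shift_exponent(-2, 52) == -(int64_t{1} << 53));
```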
Co-Authored-By: Claude Sonnet 4.5 <noreply at anthropic.com>
[3 lines not shown]
[Clang][AArch64] Add missing lvalue-to-rvalue conversions for MTE built-ins (#197621)
This patch adds missing lvalue-to-rvalue conversions for the second
argument of `__builtin_arm_irg()` and both arguments of
`__builtin_arm_gmi()`.
[clang][deps] Expose the tracing VFS directly (#197775)
This adds a new `DependencyScanningWorker::getTracingVFS()` API that
allows direct access to the tracing VFS, if present. This replaces the
call to `vfs::FileSystem::visit()` in clang-scan-deps. This will allow
removing `DependencyScanningWorker::getVFS()` and simplifying VFS
overlay handling in a follow-up PR.
[CIR] Lower aligned operator new in CIRGen (and matching cleanup-during-new) (#197094)
Two related `errorNYI`s in CIRGen for C++17 aligned `operator new` /
`delete`:
- `emitCXXNewExpr` `errorNYI`'d on `"emitCXXNewExpr: pass alignment"`
when the allocator is the `align_val_t`-overloaded form (called whenever
you `new` an over-aligned type).
- The cleanup-during-new path (the `CallDeleteDuringNew` cleanup that
fires if a ctor throws after `operator new` returns) `errorNYI`'d on the
matching `"CallDeleteDuringNew: aligned allocation"`.
The non-cleanup delete path was already wired up; this just plumbs the
matching new side.
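As a source-level illustration (not one of the tests added here), C++17 code
of the following shape reaches both paths: `new` on an over-aligned type
selects the `align_val_t` overload of `operator new`, and a throwing
constructor triggers the matching aligned delete during cleanup. All names
are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>
#include <stdexcept>

// Over-aligned type: alignof exceeds the default new alignment on common
// targets, so `new OverAligned(...)` uses operator new(size_t, align_val_t).
struct alignas(64) OverAligned {
  explicit OverAligned(bool fail) {
    if (fail)
      throw std::runtime_error("ctor failed");
  }
  char payload[64];
};

// Returns true if the new-expression returned suitably aligned storage.
bool allocate_and_check() {
  OverAligned *p = new OverAligned(false);
  bool ok = reinterpret_cast<std::uintptr_t>(p) % alignof(OverAligned) == 0;
  delete p;
  return ok;
}

// The constructor throws after allocation succeeds, so the storage must be
// released with the aligned operator delete before the exception propagates.
bool throwing_ctor_propagates() {
  try {
    delete new OverAligned(true);
  } catch (const std::runtime_error &) {
    return true;
  }
  return false;
}
```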
Both fixes are direct ports of `clang/lib/CodeGen/CGExprCXX.cpp`. In
`emitCXXNewExpr` we look up the alignment parameter type from the
allocator prototype (or fall back to `size_t` for the variadic corner
case classic also handles), build a constant of that type, and append it
[35 lines not shown]
[Instrumentor] Add support for modules and globals
We can emit callbacks when a module is loaded/unloaded and before
globals are initialized. Both happen in newly introduced constructors
and destructors.
[Instrumentor] Add a property filter for static properties
The user can define static filters in the json to limit instrumentation
to opportunities that match the static expression, e.g., is_volatile==1.
The matcher logic is pretty basic for now. Integer comparisons, pointer
null checks, string equalities and startswith are supported.
The commit was prepared with Claude (AI) and modified/tested by me.
[AMDGPU] Implement -amdgpu-spill-cfi-saved-regs
These spills need special CFI anyway, so implementing them directly
where CFI is emitted avoids the need to invent a mechanism to track them
from ISel.
Change-Id: If4f34abb3a8e0e46b859a7c74ade21eff58c4047
Co-authored-by: Scott Linder scott.linder at amd.com
Co-authored-by: Venkata Ramanaiah Nalamothu VenkataRamanaiah.Nalamothu at amd.com