[clang][AVR] Add basic AVR builtin functions (#203214)
Adds support for AVR specific builtin functions as defined in:
https://gcc.gnu.org/onlinedocs/gcc/AVR-Built-in-Functions.html
The simpler builtins have been implemented: nop, sei, cli, sleep, wdr,
and swap. They are lowered to their llvm.avr.* intrinsics.
---------
Signed-off-by: Dakkshesh <beakthoven at gmail.com>
Reland "[clang][ssaf] Track target triple in TU and LU summaries" (#204218)
This commit introduces the following changes:
- Add `TargetTriple` field to `TUSummary`, `LUSummary`, and their encodings.
- Frontend captures the triple from `CompilerInstance::getTarget()` when extracting a TU summary.
- JSON format reads/writes a `target_triple` field at the root of each summary; reader rejects strings not in `llvm::Triple::normalize` form.
- All TU/LU JSON test inputs/outputs and unit tests updated to include the new field.
- `TargetParser` is added to `LLVM_LINK_COMPONENTS` for `clangScalableStaticAnalysisFrameworkCore`; it provides `Triple::normalize` and the `Triple(string&&)` constructor that the `JSONFormat` sources reference.
`clang-ssaf-linker` uses a hardcoded triple for the link unit; surfacing the triple through the tool will be handled in a follow-up PR.
rdar://179403011
Make sanitizer special case list slash-agnostic (#149886)
This changes the glob matcher for the sanitizer special case format so
that it treats `/` as matching both forward and back slashes.
When dealing with cross-compiles or build systems that don't normalize
slashes, it's possible to run into file paths with inconsistent
slashiness, e.g. `../..\v8/include\v8-internal.h` when [building
chromium](https://g-issues.chromium.org/issues/425364464).
We can match this with the current syntax using this ugly kludge:
`src:*{/,\\}v8{/,\\}*`. However, since the format is explicitly for
listing file paths, it makes sense to treat `/` as denoting a path
separator rather than a literal forward slash. This allows us to write
the much more natural form `src:*/v8/*` and have it work on any
platform.
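A minimal sketch of the matching rule described above, not the actual LLVM glob implementation: `slashAgnosticMatch` and `isSlash` are hypothetical names, and the sketch assumes `*` may match any run of characters, separators included.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Returns true if `c` is a path separator on any platform.
static bool isSlash(char c) { return c == '/' || c == '\\'; }

// Hypothetical matcher: '*' matches any run of characters, '/' in the
// pattern matches either '/' or '\' in the path, and every other
// character matches only itself.
bool slashAgnosticMatch(std::string_view pat, std::string_view path) {
  if (pat.empty())
    return path.empty();
  if (pat.front() == '*') {
    // Try every possible length for the run consumed by '*'.
    for (std::size_t i = 0; i <= path.size(); ++i)
      if (slashAgnosticMatch(pat.substr(1), path.substr(i)))
        return true;
    return false;
  }
  if (path.empty())
    return false;
  bool same = pat.front() == '/' ? isSlash(path.front())
                                 : pat.front() == path.front();
  return same && slashAgnosticMatch(pat.substr(1), path.substr(1));
}

// Usage: the mixed-slash path from the description matches the natural form.
void checkSlashExamples() {
  assert(slashAgnosticMatch("*/v8/*", "../..\\v8/include\\v8-internal.h"));
  assert(!slashAgnosticMatch("*/v8/*", "src/v9/file.h"));
}
```

With this rule the pattern `*/v8/*` covers the same paths as the `{/,\\}` kludge without spelling out both separators.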
This is technically a behavior change, but it seems very unlikely to
come up in practice. It will only make a difference if a user has a
[9 lines not shown]
[scudo] Use the unmap function on MemMap object. (#204001)
The current call does an unmap(MemMap), but the rest of the code is doing
MemMap.unmap(XXX), so follow that pattern.
[flang][cuda] Avoid runtime copies for scalar constant host reads (#204193)
Fix CUDA Fortran lowering for host reads from scalar module variables
with the `constant` attribute.
Host code can read and write CUDA constants, while kernels read the
device constant symbol. Flang keeps a host-visible value for scalar
constant host accesses and uses a device symbol for kernels.
After preserving the host declaration, scalar read-backs such as `x = c`
could still be lowered as device-to-host runtime copies, passing a host
pointer as the CUDA source. This change lowers those read-backs as
regular host load/store operations, while keeping the runtime update for
host-to-device assignments.
[AMDGPU] Refine i8 extractelement cost model (#203932)
Expand the cases in which i8 extract elements are free. The extract elements
should be free when they are part of a sequence that extracts multiple
consecutive elements spanning the size of a register. This change enables the
SLPVectorizer to keep extract elements over more costly shufflevectors.
This PR also undoes a previous change that made insert elements free; those
require sequences of shift/or instructions, so they shouldn't be free.
[lit] Avoid profraw filename collisions with --per-test-coverage (#203998)
Per-test-coverage derived the `LLVM_PROFILE_FILE` name from the test's
basename with its extension removed, so siblings that share a basename
but differ by directory or extension (e.g. foo.c and foo.cpp in one
directory) wrote into the same profraw file and raced on it.
This PR builds the name from the full path in the suite and adds the
`%p` and `%m` placeholders so a test that runs several instrumented
binaries gets a distinct file per process and per binary, even across
exec chains or recycled process ids.
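The collision and one possible fix can be sketched as follows. This is an illustration only: lit itself is written in Python, the function names are hypothetical, and the flattening scheme is an assumption rather than the naming the patch actually uses. `%p` and `%m` are left unexpanded because the profile runtime substitutes them.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Old scheme per the description: basename with the extension removed.
std::string profrawFromBasename(const fs::path &test) {
  return test.stem().string() + ".profraw";
}

// Hypothetical new scheme: flatten the suite-relative path, keeping the
// extension, then append the %p and %m placeholders for the runtime.
std::string profrawFromFullPath(const fs::path &testInSuite) {
  std::string name = testInSuite.generic_string();
  for (char &c : name)
    if (c == '/' || c == '.')
      c = '_';
  return name + "-%p-%m.profraw";
}

// Usage: foo.c and foo.cpp collide under the old scheme only.
void checkProfrawExamples() {
  assert(profrawFromBasename("dir/foo.c") == profrawFromBasename("dir/foo.cpp"));
  assert(profrawFromFullPath("dir/foo.c") != profrawFromFullPath("dir/foo.cpp"));
}
```

A flattening like this can still collide for unusual names (for example `a_b.c` next to `a/b.c`), so a real implementation would need a more careful encoding.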
[flang][OpenACC] Support COLLAPSE on DO CONCURRENT (#203085)
Lower a COLLAPSE clause on a DO CONCURRENT when the collapse value
equals the number of concurrent controls, matching the equivalent
nested-DO collapse form, and route the loop body into the collapsed
acc.loop. Emit specific not-yet-implemented diagnostics for the
collapse-less-than and collapse-greater-than control-count cases, and a
-Wportability warning for this non-standard extension.
Collapsing the mismatched control-count cases will require a somewhat more
invasive change, so I will submit that as a follow-up PR if that is okay. If
desired, I could fix the lowering for those two cases now.
[mlir][docs] Add page for third-party tutorials (#188080)
Add a new page to the MLIR documentation that links to the
upstream Lighthouse project as well as additional third-party tutorials.
The goal is to make it easier for newcomers to discover MLIR learning
resources beyond the Toy tutorial.
The underlying discussion/RFC can be found
[here](https://discourse.llvm.org/t/rfc-tutorial-a-beginner-friendly-end-to-end-mlir-compiler-pipeline/89788).
[SROA] Extend tree-structured merge to handle init + RMW pattern (#194441)
## Problem
When SROA rewrites an alloca used as a read-modify-write accumulator, it
emits a linear chain of `shufflevector + select` per partial store.
`InstCombine`'s `SimplifyDemandedVectorElts` walks this chain
recursively per element, scaling quadratically with chain length — in
practice tens of seconds of compile time on some matmul kernels.
## Example
Take an `<8 x float>` alloca initialized once and then updated in 4
chunks of 2 elements each:
```llvm
%alloca = alloca <8 x float>
store <8 x float> %init, ptr %alloca ; full init
[104 lines not shown]
[RISCV] Rename VPseudoTernaryMaskPolicy->VPseudoReductionMaskPolicy. NFC (#204053)
This makes it clearer why this class doesn't set UsesMaskPolicy and can
prevent accidental misuse in the future.
[PAC][clang] Fix ptrauth module flags behavior
The `Error` merge behavior only takes effect when module flag values
mismatch; it still allows the flag to be present in one module and
absent in another.
Always emit `ptrauth-elf-got` module flag for AArch64 targets and
`ptrauth-sign-personality` module flag for AArch64 Linux targets.
The value is either 0 or 1.
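A small model of why always emitting the flag matters. This is a sketch of the merge rule as described above, not LLVM's linker code; `MergeResult` and `mergeErrorFlag` are hypothetical names, and an absent flag is modeled as an empty `std::optional`.

```cpp
#include <cassert>
#include <optional>

// Outcome of merging one module flag across two modules.
struct MergeResult {
  bool error;               // true if the merge is rejected
  std::optional<int> value; // merged value when there is no error
};

// `Error`-style merge as described: a conflict is reported only when both
// modules carry the flag and the values differ.
MergeResult mergeErrorFlag(std::optional<int> a, std::optional<int> b) {
  if (a && b && *a != *b)
    return {true, std::nullopt};
  return {false, a ? a : b};
}

// Usage: an absent flag hides the mismatch; an explicit 0 exposes it.
void checkMergeExamples() {
  assert(!mergeErrorFlag(1, std::nullopt).error); // silently accepted
  assert(mergeErrorFlag(1, 0).error);             // diagnosed
}
```

Emitting an explicit 0 instead of omitting the flag turns the first case into the second, so the mismatch is diagnosed.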
[OpenMP] Introduce the ompx_name clause for kernel naming
This adds support for the ompx_name clause that allows users to specify
custom kernel names for OpenMP target offloading regions. The clause
accepts a string literal and overrides the default compiler-generated
kernel names.
Example usage:
#pragma omp target ompx_name("my_kernel")
{ ... }
Kernel names need to be unique or they are diagnosed at compile or link
time as errors.
Co-Authored-By: Claude (claude-sonnet-4.5) <noreply at anthropic.com>
[OpenMP] Use ext linkage for kernels handles and globals handles keep linkage
Host handles are now emitted with external linkage so that they clash if two
kernels with the same name are registered. This could already happen
and silently corrupt the program, but it can happen more easily once
we allow users to name their kernels.
In the same patch we make global variable handles retain the linkage of
the global variable, forcing clashes for external ones while continuing to
support weak use cases.
[XRay][Hexagon] Use PC-rel addressing for runtime globals in trampoline (#203122)
The trampolines load the runtime handler globals
(__xray::XRayPatchedFunction and friends) with absolute
constant-extended immediates, which cannot be used in a PIC/PIE link, so
linking a default-PIE executable against the xray runtime fails -- and
-fPIC on user code does not help, the bad relocations are inside the
runtime archive:
ld.lld: error: relocation R_HEX_32_6_X cannot be used against symbol
'__xray::XRayPatchedFunction'; recompile with -fPIC
[XRay][Hexagon] Fix immext encoding of high bits in sled patcher (#203129)
encodeConstantExtender() places the high 12 bits of the 26-bit extension
at the wrong offset (<<16 instead of <<2), dropping them for any
constant above ~2^20. The runtime sled patcher then encodes a corrupted
trampoline address for PIE executables (load base 0x08000000+), so the
first patched function call jumps to a bogus address and crashes.
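The effect of the shift amount can be sketched as follows. The layout here is an assumption not spelled out in the message: the low 14 extension bits sit in word bits 13:0 and the high 12 bits belong in word bits 27:16, with the high bits masked in place (at bits 25:14) before shifting. The function names are hypothetical, and the extension is taken to be the constant shifted right by 6.

```cpp
#include <cassert>
#include <cstdint>

// Assumed masks over the 26-bit extension value.
constexpr std::uint32_t LowMask = 0x3FFF;       // extension bits 13:0
constexpr std::uint32_t HighMask = 0xFFFu << 14; // extension bits 25:14

// Fixed form: move the high bits from 25:14 up to 27:16 (shift by 2).
std::uint32_t encodeFixed(std::uint32_t word, std::uint32_t ext) {
  return word | (ext & LowMask) | ((ext & HighMask) << 2);
}

// Buggy form from the description: shifting by 16 pushes the high bits
// out of the intended field, and mostly out of the 32-bit word.
std::uint32_t encodeBuggy(std::uint32_t word, std::uint32_t ext) {
  return word | (ext & LowMask) | ((ext & HighMask) << 16);
}

// Usage: the PIE load base 0x08000000 survives only in the fixed form.
void checkImmextExamples() {
  std::uint32_t ext = 0x08000000u >> 6; // 0x200000
  assert(encodeFixed(0, ext) == 0x00800000u);
  assert(encodeBuggy(0, ext) == 0u);
}
```

For small constants the high field is zero, so both forms agree, which is consistent with the bug only showing up at higher load addresses.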
[DirectX] Generate PDB file with debug info (#202762)
This change adds a DXContainerPDB pass to the DirectX pipeline.
The pass creates a PDB file containing sections with shader debug
information. The PDB files comply with the format used by existing DirectX
debugging tools.
---------
Co-authored-by: Vladislav Dzhidzhoev <vdzhidzhoev at accesssoftek.com>
[flang][Semantics] Warn on repeated do-variable in nested I/O implied DO (#198757)
Fixes #198528
Add a warning when an io-implied-do's do-variable appears as, or is
associated with, the do-variable of a containing io-implied-do. This
diagnoses violations of Fortran 2023 12.6.3p7:
>The do-variable of an io-implied-do that is in another io-implied-do
shall not appear as, nor be associated with, the do-variable of the
containing io-implied-do.
Since this is not a constraint, a warning is emitted rather than an
error. As suggested in the associated issue, the warning is on by
default and can be suppressed with `-Wno-io-implied-do-index-conflict`.
The check detects:
- Direct name reuse (same symbol in inner and outer implied DO)
- Association via EQUIVALENCE
[14 lines not shown]
[lldb] Add unit tests for the MCP server (#202752)
Add unit-test coverage for the MCP protocol types and server under
source/Protocol/MCP and the MCP plugin under
source/Plugins/Protocol/MCP.
The Server handlers run over the in-memory TestTransport, which gains
SimulateError/SimulateClosed/SetRegisterMessageHandlerShouldFail helpers
to drive the handler lifecycle without a real socket.
Code that touches the filesystem or otherwise requires mucking with the
test environment is deliberately left uncovered until those layers can
be mocked.
Assisted-by: Claude
[lldb] Strip code pointers in lldb-server test binary on arm64e (#203988)
Otherwise an unstripped pointer will be sent to debugserver. LLDB strips
pointers before sending them to debugserver, so debugserver does not
know how to handle an unstripped one.
This fixes TestGdbRemoteSingleStep.py, TestGdbRemote_qMemoryRegion.py,
and TestGdbRemote_vCont.py on arm64e.