[X86] Quote symbol names that collide with registers/keywords in Intel syntax (#186570)
When outputting Intel syntax assembly, symbol names that match register
names (e.g., `rsi`, `rax`) or keywords (`byte`, `ptr`, etc.) must be
quoted, otherwise the assembler parses them as registers/keywords
instead of symbol references.
Fix this by populating MCAsmInfo::ReservedIdentifiers with all X86
register names and Intel syntax keywords. isValidUnquotedName() checks
this set and forces quoting when a symbol name matches.
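The check described above can be sketched roughly as follows. This is a simplified illustration, not the actual MCAsmInfo code: `canPrintUnquoted`, `printSymbol`, the four-entry set, the character rule, and the case-insensitive lookup are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <unordered_set>

// Hypothetical, heavily abbreviated stand-in for the reserved set. The commit
// describes the real set as holding all X86 register names and Intel keywords.
static const std::unordered_set<std::string> ReservedIdentifiers = {
    "rax", "rsi", "byte", "ptr"};

// Assumption: the lookup ignores case, so "RAX" is treated like "rax".
static std::string toLower(std::string S) {
  std::transform(S.begin(), S.end(), S.begin(), [](unsigned char C) {
    return static_cast<char>(std::tolower(C));
  });
  return S;
}

// Returns true if Name can be emitted without quotes.
bool canPrintUnquoted(const std::string &Name) {
  if (Name.empty())
    return false;
  // Simplified character rule (assumption): letters, digits, '_', '.', '$'.
  for (unsigned char C : Name)
    if (!std::isalnum(C) && C != '_' && C != '.' && C != '$')
      return false;
  // A name that collides with a register or keyword must be quoted.
  return ReservedIdentifiers.count(toLower(Name)) == 0;
}

// Emits the symbol, adding quotes only when they are required.
std::string printSymbol(const std::string &Name) {
  return canPrintUnquoted(Name) ? Name : "\"" + Name + "\"";
}
```

With this sketch, `printSymbol("rsi")` yields `"rsi"` with quotes, while `printSymbol("foo")` is left untouched.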
```
% cat rsi.c
void rsi(void); void foo(void) { rsi(); }
// old clang
% clang -c rsi.c -save-temps -masm=intel -fno-pic -o - | llvm-objdump -dr -
...
4: ff d6 callq *%rsi
[14 lines not shown]
[clang][NFC] Prevent scope pollution from repeat type specifiers
Fixes #187664
When parsing `type-specifier {class,union,struct,enum,etc} nested-name`
ParseClassSpecifier and ParseEnumSpecifier both operated on the current
declaration scope on the assumption that they were the only type
specifier. Of course in incorrect code that assumption is false, and
as a result when parsing the name specifier they would pollute the
real scope.
This is not relevant to the semantic correctness: the error is detected
and reported. The problem is that the subsequent state is not correct,
though not in a way that impacts functional behavior of release builds.
In assertion builds however this is detected (via a somewhat obtuse path)
when we attempt to plant namespace location information from the invalid
declaration on the initial declaration.
[3 lines not shown]
[AMDGPU] Unmark wave reduce intrinsics for constant folding
The `add`, `sub`, and `xor` wave reduction intrinsics cannot
be constant folded, as `add` and `sub` need to be multiplied
by the number of active lanes, and `xor` depends on the parity
of the number of active lanes.
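A small reference model makes the dependence on the lane count concrete. It is only a sketch: `waveReduceUniform` is a hypothetical helper, and it assumes every active lane holds the same value and that `sub` starts from zero and subtracts each lane's value.

```cpp
#include <cstdint>

enum class ReduceOp { Add, Sub, Xor };

// Reference model (assumption): combine the same uniform value V from each of
// NumActiveLanes lanes, starting from 0. Sub is modeled as 0 - V - V - ...
// Arithmetic wraps modulo 2^32, as unsigned arithmetic does in C++.
uint32_t waveReduceUniform(ReduceOp Op, uint32_t V, unsigned NumActiveLanes) {
  uint32_t Acc = 0;
  for (unsigned Lane = 0; Lane < NumActiveLanes; ++Lane) {
    switch (Op) {
    case ReduceOp::Add:
      Acc += V;
      break;
    case ReduceOp::Sub:
      Acc -= V;
      break;
    case ReduceOp::Xor:
      Acc ^= V;
      break;
    }
  }
  return Acc;
}
```

For a constant input of 5, four active lanes give 20 for `add` and 0 for `xor`, while three active lanes give 15 and 5. The result is therefore not determined by the constant alone.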
[ELF] Parallelize input file loading (#191690)
During `createFiles`, `addFile()` records a `LoadJob` for each
non-script input (archive, relocatable, DSO, bitcode, binary) with a
state-machine snapshot (`inWholeArchive`, `inLib`, `asNeeded`,
`withLOption`, `groupId`) and expands them on worker threads in
`loadFiles()`. Linker scripts are still processed inline since their
`INPUT()` and `GROUP()` commands recursively call `addFile()`.
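The record-then-expand idea can be sketched as below. This is an illustration, not lld's code: `Driver`, `LoadState`, and `LoadedFile` are hypothetical types, only the field names come from the description above, and the expansion loop runs sequentially here rather than on worker threads.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical snapshot of the option state in effect when a file was seen.
struct LoadState {
  bool inWholeArchive = false;
  bool inLib = false;
  bool asNeeded = false;
  bool withLOption = false;
  unsigned groupId = 0;
};

struct LoadJob {
  std::string path;
  LoadState state; // copied, so later option changes do not affect this job
};

// Hypothetical result of expanding one job.
struct LoadedFile {
  std::string path;
  bool wholeArchive;
};

struct Driver {
  LoadState current;         // mutated while walking the command line
  std::vector<LoadJob> jobs; // deferred work, in command-line order

  // Records a job together with a snapshot of the current state.
  void addFile(const std::string &path) { jobs.push_back({path, current}); }

  // Expands all recorded jobs. Each iteration reads only jobs[i] and writes
  // only out[i], so the iterations are independent and could be handed to
  // worker threads. This sketch simply runs them in order.
  std::vector<LoadedFile> loadFiles() {
    std::vector<LoadedFile> out(jobs.size());
    for (std::size_t i = 0; i < jobs.size(); ++i)
      out[i] = {jobs[i].path, jobs[i].state.inWholeArchive};
    jobs.clear();
    return out;
  }
};
```

Writing each result into a slot indexed by job position keeps the output in command-line order regardless of which job finishes first.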
Outside `createFiles()`, `loadFiles()` is called with a single job and
drained immediately (`deferLoad` is false). Two cases:
- `addDependentLibrary()`: `.deplibs` sections trigger `addFile()`
during the serial `doParseFiles()` loop.
- `--just-symbols`: pushes files directly, bypassing
`addFile`/`LoadJob`.
Thread-safety:
- A mutex serializes `BitcodeFile` / fatLTO constructors that call
`ctx.saver` / `ctx.uniqueSaver`. Zero contention on pure ELF links.
[23 lines not shown]
Revert "[AMDGPU] Fixed verifier crash because of multiple live range components." (#193135)
Reverts llvm/llvm-project#190719
The Buildbot has detected a new failure on builder
sanitizer-aarch64-linux-bootstrap-hwasan while building llvm.
[MLIR][XeGPU] Recover temporary layout from Anchor Layout (#191947)
This PR refactors the recoverTemporaryLayout() method so that the
temporary layout is recovered from anchor layout, not from any user
specified temporary layout.
[NFC] [clangd] [C++20] [Modules] Introduce ProjectModules::getModuleNameState interface (#193133)
A hole in the current design is that we assumed no module name is
duplicated across module interface units in the same project.
Technically this is not true. ISO disallows duplicated module names in
a linked program, but a project can contain multiple programs, and that
is fine as long as they are not linked together. In practice it is also
fine if the symbols are masked and the module interface units do not
appear in the same context of a single translation unit.
I am trying to improve this. This patch adds some NFC changes to
reduce the size of later patches.
AI assisted.
[NVPTX] Add commutativity to SETP instructions to enable MachineCSE of inverted predicates
Inverted predicates can be used freely in PTX. If we can invert a
predicate and CSE the generating instruction we can save calculating
the inverse.
Teach the NVPTX commuteInstructionImpl that SETP instructions can be
inverted, allowing CSE with a previous SETP that matches the inverted
form. This also inverts the branch users of the predicate to maintain
correctness.
Currently the SETP inversion is only allowed if all users are branches.
Future work can extend this to sel and not instructions.
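The equivalence this relies on can be sketched as below. It is a model, not the NVPTX backend code: `CmpOp`, `invert`, `evaluate`, and `Branch` are hypothetical names, and only signed integer comparisons are covered, since floating-point comparisons need extra care around NaN.

```cpp
#include <cstdint>

enum class CmpOp { EQ, NE, LT, GE, GT, LE };

// Returns the comparison whose result is the logical negation of Op for
// integer operands.
CmpOp invert(CmpOp Op) {
  switch (Op) {
  case CmpOp::EQ: return CmpOp::NE;
  case CmpOp::NE: return CmpOp::EQ;
  case CmpOp::LT: return CmpOp::GE;
  case CmpOp::GE: return CmpOp::LT;
  case CmpOp::GT: return CmpOp::LE;
  case CmpOp::LE: return CmpOp::GT;
  }
  return Op; // not reached for valid enumerators
}

// Models the predicate a compare of A and B would produce.
bool evaluate(CmpOp Op, int32_t A, int32_t B) {
  switch (Op) {
  case CmpOp::EQ: return A == B;
  case CmpOp::NE: return A != B;
  case CmpOp::LT: return A < B;
  case CmpOp::GE: return A >= B;
  case CmpOp::GT: return A > B;
  case CmpOp::LE: return A <= B;
  }
  return false; // not reached for valid enumerators
}

// A conditional branch that is taken when the predicate equals TakenIfTrue.
struct Branch {
  bool TakenIfTrue;
};

bool branchTaken(const Branch &Br, bool Pred) { return Pred == Br.TakenIfTrue; }

// Rewrites a branch so that it can consume the inverted predicate instead.
Branch invertBranch(const Branch &Br) { return Branch{!Br.TakenIfTrue}; }
```

A branch on `evaluate(Op, A, B)` and the rewritten branch on `evaluate(invert(Op), A, B)` take the same direction, which is why inverting both the compare and its branch users preserves behavior.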
Made-with: Cursor
[clang-tidy][NFC] Fix list.rst and improve alias detection of `add_new_check.py` (#192228)
Follow up of https://github.com/llvm/llvm-project/pull/192224.
This commit does two things:
- Replace the original alias detection based on `:http-equiv` (we may
remove these completely in the future) with a method of directly
matching the documentation section.
- Update the list.rst
---------
Co-authored-by: Victor Chernyakin <chernyakin.victor.j at outlook.com>
[NFC] [clangd] [C++20] [Modules] Rename and move scanningProjectModules (#193128)
I am going to add more stuff to ProjectModules and the current structure
and the file name scanningProjectModules may be confusing.
This NFC patch changes that.
[AMDGPU] Fixed verifier crash because of multiple live range components. (#190719)
In the Rewrite AGPR-Copy-MFMA pass, after replacing spill instructions, the
replacement register may have multiple live range components when the
spill slot was stored to more than once. The verifier crashes with a bad
machine code error. This patch fixes the problem by splitting a live
range but assigning the same physical register in this scenario. A new
test has been added that verifies the absence of this verifier error.
Assisted-by: Claude Opus