[llvm][OpenMP] Add option to disable default max threads adjustment (#198719)
This commit adds the option
`-openmp-ir-builder-use-default-max-thread=<boolean-value>` to
enable or disable the use of a default max threads value in OpenMPIRBuilder
when no max threads constant is provided. The option is enabled by
default, so the current behavior is unchanged.
This flag is useful to avoid limiting the number of threads that an
OpenMP target region can run with when neither `thread_limit` nor
`num_threads` (in a nested parallel region) is specified. This flag may
be used when recording a kernel to allow replaying it later with a
higher number of threads (e.g., reaching the maximum thread limit
supported by the device).
[OpenMP][OMPIRBuilder] Avoid querying SmallPtrSet during removal (#198690)
openmp-cli-fuse02.mlir can intermittently leave behind a dead block
after loop fusion, causing LLVM IR verification to fail.
(https://github.com/llvm/llvm-project/pull/197637#issuecomment-4497502486)
This happened because `removeUnusedBlocksFromParent` queried
`BBsToErase` while using `SmallPtrSet::remove_if` on the same set, which
made the result depend on the set's internal mutation order during
removal.
This patch tracks candidate blocks by index with `SmallBitVector`
instead of mutating and querying the same `SmallPtrSet`. This also
preserves the original `BBs` order when collecting the final dead
blocks.
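The index-based approach can be sketched in plain C++. This is an illustration only, not the code from the patch: `BlockInfo`, `kExternalUser`, and `collectDeadBlocks` are hypothetical names, and `std::vector<bool>` stands in for `SmallBitVector`. A block stays a candidate only while all of its users are themselves candidates, and the result is read back in the original block order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical marker for a use that comes from outside the candidate list.
constexpr std::size_t kExternalUser = static_cast<std::size_t>(-1);

// Hypothetical stand-in for a basic block: Users holds the indices of the
// candidate blocks that reference this block.
struct BlockInfo {
  std::vector<std::size_t> Users;
};

// Returns the indices of dead blocks, in the original order of Blocks.
std::vector<std::size_t> collectDeadBlocks(const std::vector<BlockInfo> &Blocks) {
  // One bit per block. Querying a bit while clearing others is well defined,
  // unlike querying a pointer set from inside its own remove_if.
  std::vector<bool> Candidate(Blocks.size(), true);

  auto HasRemainingUses = [&](std::size_t I) {
    for (std::size_t U : Blocks[I].Users) {
      if (U == kExternalUser)
        return true;
      assert(U < Blocks.size() && "user index out of range");
      if (!Candidate[U])
        return true;
    }
    return false;
  };

  // Iterate to a fixed point: dropping one candidate can keep another alive.
  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (std::size_t I = 0; I < Blocks.size(); ++I) {
      if (Candidate[I] && HasRemainingUses(I)) {
        Candidate[I] = false;
        Changed = true;
      }
    }
  }

  std::vector<std::size_t> Dead;
  for (std::size_t I = 0; I < Blocks.size(); ++I)
    if (Candidate[I])
      Dead.push_back(I);
  return Dead;
}
```

Because the candidate set only shrinks and every query reads a plain bit, the final set does not depend on any container-internal ordering.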
---
The original PR was created with assistance from Copilot. The later
[2 lines not shown]
[PowerPC][NFC] Change arguments of PPCPostRAExpPseudo/PseudoXFormMemOp
The assembler string of a pseudo is almost always a # followed by
the name of the pseudo, and many of the pseudos have no
pattern. Swapping the order of the asmstr and pattern arguments
in PPCPostRAExpPseudo and PseudoXFormMemOp, and giving them default
values, reduces repetition.
[bolt][test] Use C++ frontend for building C++ test (#198856)
The C frontend executable fails if the environment supplies any C++-specific options.
[mlir][vector] Migrate drop-lead-unit-dim to shape_cast (#196206)
Post-merge discussion on #195686 led to the conclusion that we should
change the behavior of drop-lead-unit-dim to use shape_cast instead of
extracts and broadcasts, since shape_cast is now the canonical form of such
unit-dimension stripping. This commit implements that change.
The one exception is that vector contractions where the accumulator is
reduced to a scalar still use extract/broadcast. The contraction handling
also no longer emits any vector.transpose ops, since those are all
order-preserving and will fold into shape_casts.
The PR adds tests to ensure that scalable dimensions function correctly,
per previous PR comments. They should already have worked, but they
weren't tested.
AI: Codex 5.5 did most of the work on this one.
[NFC][lldb][windows] extract ConvertNtDevicePathToDosPath method (#198794)
This patch extracts the logic for converting an NT path to a DOS path.
This is a prelude to https://github.com/llvm/llvm-project/pull/198795.
[lldb][test] Make dependent-modules-nodupe-windows case-insensitive (#198807)
Running the test in containers returns `KERNEL32.DLL` instead of
`kernel32.dll`, which causes the test to fail. Both names point to the
same file. Windows path comparison is case-insensitive. The test is
asserting an OS-level behavior, not a casing convention, so it should
not be case-sensitive.
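As a rough illustration of the comparison the test now needs, the sketch below compares two module names while ignoring ASCII case. It is not the test's actual check; `equalsIgnoreCase` is a hypothetical helper.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: true if A and B are equal when ASCII case is ignored.
bool equalsIgnoreCase(const std::string &A, const std::string &B) {
  return A.size() == B.size() &&
         std::equal(A.begin(), A.end(), B.begin(),
                    [](unsigned char X, unsigned char Y) {
                      return std::tolower(X) == std::tolower(Y);
                    });
}
```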
[docker][windows] install make in Dockerfile (#198814)
The lldb-api test suite builds each test's inferior via GNU make. The
current Windows CI container (`ghcr.io/llvm/ci-windows-2022:latest`)
does not install make, so LLDB's CMake configure step ends up with
`LLDB_DEFAULT_TEST_MAKE-NOTFOUND` and every lldb-api test reports
`UNRESOLVED`.
This patch adds a step in the Windows Dockerfile to install make with
choco.
[lldb] Prioritize the MAKE_NO_DEBUG_INFO flag (#198801)
Some tests require an executable with no debug information. When
`lldb-dotest` is run with `--dwarf-version`, that flag currently takes
precedence over the debug info type specified in the test. This is
incorrect when the test explicitly expects the executable to have no
debug information.
Ensure `MAKE_NO_DEBUG_INFO` takes priority over `--dwarf-version` so
that tests requiring stripped executables behave correctly regardless of
how the test suite is invoked.
[lldb] Fix data race on shared global ProcessProperties callback (#197980)
ProcessProperties::ProcessProperties was installing a per-process
value-changed callback on ePropertyDisableLangRuntimeUnwindPlans. The
property was declared Global in TargetProperties.td, so
OptionValueProperties::CreateLocalCopy shared the underlying OptionValue
across every ProcessProperties. Every Process constructor therefore
wrote into the same std::function slot, racing with concurrent
constructors and silently clobbering any earlier Process's callback.
Found by ThreadSanitizer as part of #197792.
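The clobbering half of the problem can be reproduced in a few lines of standalone C++. This is a simplified model, not LLDB code: `SharedOption` and `ProcessProps` are hypothetical stand-ins, and the sketch is single-threaded, so it shows only the overwrite. With concurrent constructors the same unsynchronized write would also be a data race.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for an option value that owns a single callback slot.
struct SharedOption {
  std::function<void()> OnChanged;
};

// Hypothetical stand-in for a per-process properties object whose "local
// copy" still points at the one global option value.
struct ProcessProps {
  std::shared_ptr<SharedOption> Option;

  ProcessProps(std::shared_ptr<SharedOption> Global, std::string Name,
               std::string &Log)
      : Option(std::move(Global)) {
    assert(Option && "expected a shared option value");
    // Every instance writes the same slot and replaces the previous callback.
    Option->OnChanged = [Name = std::move(Name), &Log] { Log += Name; };
  }
};
```

Constructing two `ProcessProps` from the same `SharedOption` and then firing `OnChanged` runs only the second object's callback; the first one is silently lost.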
update_test_checks: fix DIFile filename relaxation (#198382)
In e78b763568e47e685926614195c3075afa35668c (#135692) the matcher for
the `directory:` field requires a non-empty directory, which isn't
guaranteed. Relax it to accept any string, including the empty string.
Change-Id: Ie6d793f7abdbafd3d2faa29379919e68e846afe7
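The difference between the two matchers comes down to `+` versus `*` in the capture group. The patterns below are hypothetical and written with `std::regex` purely for illustration; they are not the expressions used by update_test_checks.

```cpp
#include <regex>
#include <string>

// Hypothetical strict pattern: requires at least one character in the
// directory string.
bool matchesStrict(const std::string &Line) {
  static const std::regex Strict(R"re(directory: "([^"]+)")re");
  return std::regex_search(Line, Strict);
}

// Hypothetical relaxed pattern: also accepts an empty directory string.
bool matchesRelaxed(const std::string &Line) {
  static const std::regex Relaxed(R"re(directory: "([^"]*)")re");
  return std::regex_search(Line, Relaxed);
}
```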
[arm64e][cfi] .cfi_b_key_frame is irrelevant for Mach-O platforms (#198660)
We always sign with the B-Key anyway, and the unwinder behaves the same whether or not the directive is present. Better to avoid emitting it in the first place.
[AArch64] Use fp16 FNEG and FABS for bf16. (#198653)
These operations are bitwise, so they can be used for bf16 fneg and fabs
just as they are for fp16.
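The reason this works is that both 16-bit formats keep the sign in bit 15, so negation and absolute value never look at the exponent or mantissa layout. A minimal sketch on raw bit patterns (`fneg16` and `fabs16` are illustrative names, not LLVM functions):

```cpp
#include <cstdint>

// Bit 15 is the sign bit in both IEEE half (fp16) and bfloat16.
constexpr std::uint16_t kSignBit = 0x8000;

// Negate by flipping the sign bit.
constexpr std::uint16_t fneg16(std::uint16_t Bits) {
  return static_cast<std::uint16_t>(Bits ^ kSignBit);
}

// Absolute value by clearing the sign bit.
constexpr std::uint16_t fabs16(std::uint16_t Bits) {
  return static_cast<std::uint16_t>(Bits & 0x7FFF);
}
```

For example, 1.0 is `0x3F80` in bf16 and `0x3C00` in fp16; `fneg16` maps them to `0xBF80` and `0xBC00`, the encodings of -1.0 in each format.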
Revert "[Flang] [OpenMP] atomic compare (#184761)" (#198848)
This reverts commit 91467766a8afb52439619163828c5f6816ddd550.
This was causing tests to be quite flaky. See #198776.
[flang][OpenMP] Fix EQUIVALENCE variable privatization in OpenMP (#197726)
Fixes #197553
EQUIVALENCE aliases are lowered with `fir.ptr` addresses
(`castAliasToPointer` in ConvertVariable.cpp) to inform alias analysis.
However, `privatizeSymbol()` in Utils.cpp treated all `fir::PointerType`
values as true Fortran POINTERs, skipping the `unwrapRefType` that
computes the correct allocation type. For arrays, this caused the
privatizer to allocate pointer-sized storage instead of the full array,
resulting in stack buffer overflows at runtime.
The fix adds a `!semantics::IsPointer()` check so that only true Fortran
POINTERs preserve the `fir.ptr` wrapping. EQUIVALENCE aliases are
correctly unwrapped to their underlying type.
**Changes:**
- flang/lib/Lower/Support/Utils.cpp: Gate the `PointerType` guard on
`semantics::IsPointer` to distinguish true POINTERs from EQUIVALENCE
[6 lines not shown]
[AMDGPU] Fix matchPERM byte tracker for SRA past operand width (#198708)
Bytes past the operand are 0 for SRL but copies of the sign bit for SRA. The old
code treated both as 0, so v_perm_b32 picked the wrong byte for SRA.
Example:
`ashr x, 24` leaves only one byte of x, its most significant byte, in byte 0 of
the result. The upper bytes are copies of x's sign bit, not bytes of x. The
matcher used to map them back to bytes of x, producing a perm mask that ignored
the sign extension.
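The byte-level difference between the two shifts can be checked with a small standalone model. The helpers below are illustrative only and are unrelated to the matchPERM implementation; they assume a 32-bit value and a shift amount below 32.

```cpp
#include <cassert>
#include <cstdint>

// Extracts byte I (0 = least significant) of a 32-bit value.
std::uint8_t byteOf(std::uint32_t V, unsigned I) {
  assert(I < 4 && "a 32-bit value has four bytes");
  return static_cast<std::uint8_t>((V >> (8 * I)) & 0xFF);
}

// Logical shift right: vacated bits are filled with zeros.
std::uint32_t lshr(std::uint32_t X, unsigned Amt) {
  assert(Amt < 32 && "shift amount must be below the bit width");
  return X >> Amt;
}

// Arithmetic shift right: vacated bits are filled with the sign bit.
std::uint32_t ashr(std::uint32_t X, unsigned Amt) {
  assert(Amt < 32 && "shift amount must be below the bit width");
  std::uint32_t Logical = X >> Amt;
  if (Amt == 0 || (X & 0x80000000u) == 0)
    return Logical;
  return Logical | (~std::uint32_t{0} << (32 - Amt));
}
```

With `x = 0x80112233`, `lshr(x, 24)` is `0x00000080` while `ashr(x, 24)` is `0xFFFFFF80`: byte 0 is the same in both, but bytes 1 to 3 are `0x00` in one case and `0xFF` in the other, and neither is a byte of x.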
[AArch64] Use ADDP tree for v16i8 to i16 bitmask extraction (#192974)
```
Before:
ext v1.16b, v0.16b, v0.16b, #8
zip1 v0.16b, v0.16b, v1.16b
addv h0, v0.8h
fmov w0, s0
After:
addp v0.16b, v0.16b, v0.16b
addp v0.16b, v0.16b, v0.16b
addp v0.16b, v0.16b, v0.16b
umov w0, v0.h[0]
```
The existing lowering in vectorToScalarBitmask for v16i8 used an
EXT+ZIP1+ADDV sequence to pack the per-lane bits into an i16. The
horizontal ADDV is expensive on some microarchitectures and forces an
[13 lines not shown]
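To see why three ADDP steps produce the bitmask, the sketch below models the instruction on a 16-byte array. It assumes, as a labeled assumption rather than something stated above, that each input lane has already been reduced to its bit weight (1, 2, 4, ..., 128, repeated for the upper eight lanes) or to zero. `addp` and `bitmaskViaAddpTree` are illustrative names, not LLVM functions.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V16 = std::array<std::uint8_t, 16>;

// Models ADDP Vd.16B, Vn.16B, Vm.16B: sums of adjacent pairs of N fill the
// low half of the result, sums of adjacent pairs of M fill the high half.
V16 addp(const V16 &N, const V16 &M) {
  V16 R{};
  for (std::size_t I = 0; I < 8; ++I) {
    R[I] = static_cast<std::uint8_t>(N[2 * I] + N[2 * I + 1]);
    R[I + 8] = static_cast<std::uint8_t>(M[2 * I] + M[2 * I + 1]);
  }
  return R;
}

// Packs 16 boolean lanes into an i16 using three pairwise-add steps.
std::uint16_t bitmaskViaAddpTree(const std::array<bool, 16> &Lanes) {
  V16 V{};
  // Assumption: lanes are pre-masked to their per-lane bit weight.
  for (std::size_t I = 0; I < 16; ++I)
    V[I] = Lanes[I] ? static_cast<std::uint8_t>(1u << (I % 8)) : 0;
  for (int Step = 0; Step < 3; ++Step)
    V = addp(V, V);
  // After three steps, byte 0 sums lanes 0..7 and byte 1 sums lanes 8..15;
  // reading the low halfword (umov w0, v0.h[0]) joins them.
  return static_cast<std::uint16_t>(V[0] | (V[1] << 8));
}
```

Since the weights within each group of eight lanes are distinct powers of two, the byte sums cannot overflow and each set lane contributes exactly one bit.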
Use SmallBitVector for deterministic dead-block tracking (authored by slinder1)
Track candidate blocks by stable BB indices instead of mutating pointer
sets, avoiding SmallPtrSet tombstone/rehash/iteration-order issues while
preserving original BB order for deletion.
[libc] move mblen to stdlib (#198642)
Move mblen from wchar to stdlib to conform to the C standard. Also update
headers to match the new style.
Assisted-by: Automated tooling, human reviewed.
[CI] Successful build and no tests running is now a notification (#198684)
check-libc now uses llvm-lit to run tests instead of running the
unittests directly through ninja. This means there should no longer be any
in-tree case where the build succeeds but no tests are picked up
as running. The build still passes in this case, because failing it when
everything exits with code 0 would be wrong, but the user is notified
that this is unexpected.
[NFC][LifetimeSafety]: Track assignment history within a single CFGBlock (#196075)
## Summary
Tracking assignment history allows us to backtrack and provide more
informative error messages, helping users better understand the root
cause.
As discussed in
https://github.com/llvm/llvm-project/pull/188467#issuecomment-4359071778,
I am splitting the original #188467 into smaller parts. This PR submits
the core logic: performing a reverse search for assignment history
within a single CFG block.
A simple unit test has been added to verify the basic functionality of
the algorithm.
## Details
[7 lines not shown]