Revert "[BOLT][AArch64] Transform cmpbr ~> cmp + br when inversion not possible (#185731)" (#202309)
This reverts commit 6b13656fd8386f979e061cc97e32b607ee3fcdf4.
We have identified several bugs, hence the revert:
* relaxLocalBranches() should account for BB growth and adjust subsequent BB offsets in fragment
* multiple parallel workers are sharing the same allocator in DataflowInfoManager
* liveness is run lazily, potentially after the CFG has been modified
[analyzer] Trigger checkLifetimeEnd callback from CFGLifetimeEnds element
This patch adds handling of the `CFGLifetimeEnds` element to the CSA and
triggers a new `checkLifetimeEnd` callback for each occurrence
of it.
This is useful for detecting dangling pointers, as in:
```
void su_use_after_block () { int* p=0; { int x=1; p=&x; } *p = 2; }
// ^ p dangles
```
This patch does not implement the check itself. It is motivated by the
discussion in
https://discourse.llvm.org/t/what-is-the-status-of-scopeend-and-scopebegin/90861
--
[4 lines not shown]
[flang][flang-rt] Treat REAL(2)/COMPLEX(2) as C-interoperable types (#201888)
IEEE-754 binary16 ("half") maps cleanly to the C `_Float16` type
(ISO/IEC TS 18661-3), but flang previously rejected `REAL(KIND=2)` and
`COMPLEX(KIND=2)` in C-interoperable contexts. Make `REAL(KIND=2)` and
`COMPLEX(KIND=2)` into actual interoperable types.
`ISO_C_BINDING` now exports the gfortran-compatible named constants
`c_float16` and `c_float16_complex` (both value 2), the kind parameter
for the half-precision C interoperable types.
Assisted-by: AI
[AMDGPU] Support Wave Reduction for true-16 types - 1
Supporting true-16 versions of the reduction intrinsics
Supported Ops: `min`, `umin`, `max`, `umax`.
Supports only the iterative strategy; DPP is yet
to be supported.
[AMDGPU] Support Wave Reduction for i16 types - 2
Supported Ops: `add`, `sub`.
Supports only the iterative strategy; DPP is yet
to be supported.
Supports only Fake-16 versions of the lowering.
True-16 support is yet to be added.
[AMDGPU] Support Wave Reduction for i16 types - 3
Supported Ops: `and`, `or`, `xor`.
Supports only the iterative strategy; DPP is yet
to be supported.
Supports only Fake-16 versions of the lowering.
True-16 support is yet to be added.
[AMDGPU] Support Wave Reduction for true-16 types - 2
Supporting true-16 versions of the reduction intrinsics
Supported Ops: `add`, `sub`.
Supports only the iterative strategy; DPP is yet
to be supported.
[lldb][test] Remove home dir paths from core files used in tests (#201630)
Most of our core files contain paths to the source files that were used
to generate the core file. LLDB probes the existence of these source
files when it sees them in the core file, which on its own is not
problematic as that's usually quite cheap.
Unfortunately, the paths used in most core file tests are in the form of
/home/XYZ/test.c which does not exist on macOS. On some macOS machines
with network drives, these file accesses cause the kernel to perform
some kind of network request which is extremely slow.
The result is that the tests that inspect these core files run extremely
slowly on macOS. For example, the TestNetBSDCore.py and TestLinuxCore.py
tests spend 95% of their runtime just probing these network paths. In
the case of TestLinuxCore.py, this causes the test to run for about 1
minute where only 3.7s are actual test logic.
This patch removes all /home/* paths from our core files and replaces
[2 lines not shown]
[lldb][test] Speed up TestGlobalModuleCache.py (#201561)
This patch reduces the runtime of TestGlobalModuleCache from 27s to
3-4s. There are three reasons why the old test was slow:
* We did a sleep for 2s to ensure the source code file had a newer time
stamp and Make would rebuild the project. Instead, we now just age all
times on disk by 10s into the past to do the same thing. I'm not sure
how many other tests need to do this, but introducing a utility method
for forced in-place rebuild would be a good follow up.
* We additionally slept even for the first initial build, which wasn't
needed as there is nothing to rebuild.
* I removed some of the system includes in the source files which are
also not needed.
[lldb][test] Don't print LLDB version in every test (#201307)
An empty minimal API test currently runs for 330ms on my macOS system.
Of these 330ms, we spend 70ms (20%) just to print the lldb version
number at the start of each test.
This patch disables this behavior by default and instead prints the LLDB
version number once at the start of the LIT test suite. This saves about
2 minutes of CPU time in an LLDB test suite run.
[lldb] Add size checks for frequently allocated classes (#200939)
Given how frequently we allocate these classes (or classes containing
these classes) on the heap, we should only grow them intentionally.
See also bd1b3d47462acf4f854f593bdd77b3f127adea46
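A size check of this kind can be sketched with a compile-time assertion. The snippet below is only an illustration: `FrequentlyAllocated` and `kFrequentlyAllocatedBudget` are hypothetical names, not LLDB classes or constants, and the budget is chosen to match the example members.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a class that is allocated very often.
// It is NOT an LLDB type; the members exist only to give it a size.
struct FrequentlyAllocated {
  void *owner;
  std::uint32_t kind;
  std::uint32_t flags;
};

// Illustrative size budget: one pointer plus two 32-bit fields.
constexpr std::size_t kFrequentlyAllocatedBudget =
    sizeof(void *) + 2 * sizeof(std::uint32_t);

// Adding a member that pushes the class past the budget becomes a build
// error, so any growth has to be acknowledged by editing the budget.
static_assert(sizeof(FrequentlyAllocated) <= kFrequentlyAllocatedBudget,
              "FrequentlyAllocated grew; raise the budget only if intended");
```

The check costs nothing at run time, and a failure points directly at the class that grew.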
[lldb][test] Don't run simulator tests in parallel to stabilize them (#201870)
TestSimulatorPlatform is a notoriously flaky test on macOS. However, it
seems the test failures only happen when its execution overlaps with the
execution of TestAppleSimulatorOSType.py (which also interacts with the
simulator). Somehow one simulator test seems to break the other one, but
it's not exactly clear how they are interfering with each other.
This patch ensures these two tests never run in parallel to avoid this
kind of issue. It creates a parallelism_group in lit for
apple-simulator and then puts both tests into that group.
TestAppleSimulatorOSType.py had to be moved to its own directory for
this. It shared its directory with a lot of other unrelated tests and
the lit config applies to the whole directory.
[NFC][OMPT] Use `unique_id` entry point for tests (#202228)
The OMPT tests currently use an incrementing ID for the host_op_id.
However, this value is not incremented for `submit_emi` callbacks, and
uses a global integer that is incremented on callback invocation. This
can lead to race conditions when e.g., `target nowait` is used.
Hence, replace the global integer by the `unique_id` entry point,
properly yielding unique and thread-safe IDs.
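The difference between a plain global counter and a thread-safe ID source can be sketched as follows. This does not use the OMPT interface: `nextUniqueId` and `countDistinctIds` are hypothetical helpers, and `std::atomic` stands in for whatever mechanism the `unique_id` entry point uses internally.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <thread>
#include <vector>

// Hypothetical thread-safe ID source. A plain `static int counter; ++counter;`
// here would be a data race once several threads invoke callbacks at once.
std::uint64_t nextUniqueId() {
  static std::atomic<std::uint64_t> counter{1};
  return counter.fetch_add(1, std::memory_order_relaxed);
}

// Draws IDs from several threads and returns how many distinct values were
// observed. With an atomic counter this equals numThreads * idsPerThread.
std::size_t countDistinctIds(unsigned numThreads, unsigned idsPerThread) {
  assert(numThreads > 0 && "need at least one worker thread");
  std::vector<std::vector<std::uint64_t>> perThread(numThreads);
  std::vector<std::thread> workers;
  workers.reserve(numThreads);
  for (unsigned t = 0; t < numThreads; ++t) {
    // Each worker writes only to its own slot, so no locking is needed.
    workers.emplace_back([&perThread, t, idsPerThread] {
      for (unsigned i = 0; i < idsPerThread; ++i)
        perThread[t].push_back(nextUniqueId());
    });
  }
  for (auto &worker : workers)
    worker.join();

  std::set<std::uint64_t> distinct;
  for (const auto &ids : perThread)
    distinct.insert(ids.begin(), ids.end());
  return distinct.size();
}
```

Calling `countDistinctIds(4, 1000)` should report 4000 distinct values, since every `fetch_add` returns a different number regardless of thread interleaving.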
Signed-off-by: Jan André Reuter <j.reuter at fz-juelich.de>
[StringMap] Add remove_if and use it for erase-while-iterating (#202272)
Add a `remove_if` member to StringMap (and to HashKeyMap, the base of
SampleProfileMap) as a replacement for the erase-while-iterating idiom,
and convert the two in-tree users: SymbolStringPool::clearDeadEntries
and llvm-profdata's filterFunctions (a template over StringMap and
SampleProfileMap).
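For readers unfamiliar with the idiom being replaced, here is a minimal sketch on a standard container. It is not the StringMap implementation and makes no claim about the signature of the new member: `removeIf` and `dropDeadEntries` are hypothetical free functions over `std::unordered_map`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical helper: erases every entry for which shouldRemove returns true
// and reports how many entries were erased. The iterator bookkeeping that
// erase-while-iterating call sites repeat by hand lives in one place.
template <typename Map, typename Pred>
std::size_t removeIf(Map &map, Pred shouldRemove) {
  std::size_t removed = 0;
  for (auto it = map.begin(); it != map.end();) {
    if (shouldRemove(*it)) {
      it = map.erase(it); // erase returns the iterator after the erased entry
      ++removed;
    } else {
      ++it;
    }
  }
  return removed;
}

// Example call site, loosely modeled on clearing entries with a zero
// reference count. The map layout is an assumption made for illustration.
inline std::size_t
dropDeadEntries(std::unordered_map<std::string, int> &refCounts) {
  return removeIf(refCounts,
                  [](const auto &entry) { return entry.second == 0; });
}
```

Call sites then state only the predicate, which keeps working if the container later changes how mutation affects live iterators.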
Extracted from #202237, which makes StringMap mutation invalidate
iterators so that we can remove the tombstone state.
Aided by Claude Opus 4.8