[IR] Add `fpmath` to keep list of dropUBImplyingAttrsAndMetadata (#179019)
`fpmath` is precision metadata rather than UB-implying metadata. This
prevents `fpmath` from being dropped in InstCombine's FoldOpIntoSelect.
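
A toy model of the keep-list idea, as a sketch only: metadata kinds are plain strings, and both the function name and the contents of `BASE_KEEP` are assumptions made for illustration, not the LLVM implementation.

```python
# Hypothetical stand-in for the kinds that were already preserved.
BASE_KEEP = frozenset({"annotation", "range", "nonnull", "align"})


def drop_ub_implying_metadata(metadata, keep=BASE_KEEP):
    """Return a copy of `metadata` holding only the kinds listed in `keep`."""
    return {kind: value for kind, value in metadata.items() if kind in keep}


inst_md = {"fpmath": 2.5, "noundef": True, "range": (0, 10)}

# Without `fpmath` in the keep list, the precision hint is lost.
before = drop_ub_implying_metadata(inst_md)

# With `fpmath` added, it survives, while UB-implying `noundef` is still dropped.
after = drop_ub_implying_metadata(inst_md, keep=BASE_KEEP | {"fpmath"})
```

In this model `before` keeps only `range`, while `after` keeps both `range` and `fpmath`.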
Set rematerialized MIs' reg operands to sentinel reg
Also removes several const specifiers on class members that prevented
std::sort from compiling on some configs.
Re-apply "[AMDGPU][Scheduler] Scoring system for rematerializations (#175050)"
This re-applies commit f21e3593371c049380f056a539a1601a843df558 along
with the compile failure fix introduced in
8ab79377740789f6a34fc6f04ee321a39ab73724 before the initial patch was
reverted, plus fixes for the previously observed assert failure.
We were hitting the assert in the HIP Blender due to a combination of
two issues that could happen when rematerializations are being rolled
back.
1. Small changes in slot indices (while preserving instruction order)
compared to the pre-re-scheduling state mean that we have to
re-compute live ranges for all register operands of rolled back
rematerializations. This was not being done before.
2. Re-scheduling can move registers that were rematerialized at
arbitrary positions in their respective regions while their opcode
is set to DBG_VALUE, even before their read operands are defined.
This makes re-scheduling reverts mandatory before rolling back
[4 lines not shown]
[AMDGPU][Scheduler] Revert all regions when remat fails to increase occ. (#177205)
When the rematerialization stage fails to increase occupancy in all
regions, the current implementation only reverts the effect of
re-scheduling in regions in which the increased occupancy target could
not be achieved. However, given that re-scheduling with a higher
occupancy target puts more pressure on the scheduler to achieve lower
maximum RP at the cost of potentially lower ILP as well, region
schedules made with higher occupancy targets are generally less
desirable if the whole function is not able to meet that target.
Therefore, if at least one region cannot reach its target, it makes
sense to revert re-scheduling in all affected regions to go back to a
schedule that was made with a lower occupancy target.
This implements such logic for the rematerialization stage, and adds a
test to showcase that re-scheduling is indeed interrupted/reverted as
soon as a re-scheduled region that does not meet the increased target
occupancy is encountered.
[4 lines not shown]
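
The revert-all policy described above can be sketched roughly as follows. This is a simplified model, not the scheduler's code: `Region`, `reschedule_all_or_nothing`, and the `reschedule` callback are hypothetical names, and occupancy is reduced to a single integer per region.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    schedule: list
    reachable_occupancy: int  # toy stand-in for what re-scheduling can achieve


def reschedule_all_or_nothing(regions, target, reschedule):
    """Re-schedule regions in order; on the first miss, revert every region touched."""
    touched = []  # (region, saved schedule) pairs, in processing order
    for region in regions:
        saved = list(region.schedule)
        region.schedule, achieved = reschedule(region, target)
        touched.append((region, saved))
        if achieved < target:
            # One region missed the target: restore all re-scheduled regions and stop.
            for done, old in touched:
                done.schedule = old
            return False
    return True


def toy_reschedule(region, target):
    # Pretend that re-scheduling reverses the instruction order.
    return list(reversed(region.schedule)), region.reachable_occupancy


regions = [
    Region("a", ["x", "y"], 5),
    Region("b", ["p", "q"], 4),  # cannot reach the target of 5
    Region("c", ["m", "n"], 5),
]
ok = reschedule_all_or_nothing(regions, 5, toy_reschedule)
```

Here region `b` misses the target, so `a` and `b` are restored to their saved schedules and `c` is never re-scheduled.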
[clang-tidy] Speed up `modernize-use-nullptr` (#178829)
As noted in [this
comment](https://github.com/llvm/llvm-project/pull/178149#discussion_r2732896149),
it appears that registering one `anyOf(a, b, ...)` matcher is generally
slower than registering `a, b, ...` all individually. Applying that
knowledge to this check gives us an easy 3x speedup:
```txt
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
Status quo: 0.3281 ( 6.1%) 0.0469 ( 5.2%) 0.3750 ( 6.0%) 0.3491 ( 5.5%) modernize-use-nullptr
With this change: 0.0938 ( 1.8%) 0.0156 ( 1.8%) 0.1094 ( 1.8%) 0.1260 ( 2.1%) modernize-use-nullptr
```
I'm not exactly sure *why* this works, but it seems pretty consistent.
I've seen a similar result trying this with `bugprone-infinite-loop`.
have state and source limiter state cleanup assert on the right lock.
state and source limiters and the pf state links they're wired up
with are protected by the pf lock, not the pf state lock. this is
asserted correctly when setting up source and state limiters, but
i copied and pasted the wrong assert for the cleanup code.
this should fix the spurious "splassert: pf_create_state: want 1 have 0"
messages i get on my firewalls.
17856 modern virtio drivers truncate ring PA to 32 bits
Reviewed by: Jason King <jason.brian.king+illumos at gmail.com>
Reviewed by: Hans Rosenfeld <rosenfeld at grumpf.hope-2000.org>
Approved by: Gordon Ross <gordon.w.ross at gmail.com>
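
As a generic illustration of this class of bug (the commit message does not say where the truncation occurred), the sketch below assumes a 64-bit ring physical address that is handled as two 32-bit halves. The helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def split_pa(pa):
    """Split a 64-bit physical address into (low, high) 32-bit halves."""
    return pa & MASK32, (pa >> 32) & MASK32


def join_pa(lo, hi):
    """Recombine two 32-bit halves into a 64-bit physical address."""
    return (hi << 32) | lo


ring_pa = 0x1_2345_6000  # a ring placed above the 4 GiB boundary
lo, hi = split_pa(ring_pa)

truncated = join_pa(lo, 0)  # high half dropped: the address now points elsewhere
restored = join_pa(lo, hi)  # both halves kept: the address round-trips
```

Any ring allocated below 4 GiB has a zero high half, so a truncation like this only shows up on systems where the ring lands above that boundary.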
[ELF,test] Improve riscv and aarch64 relocation error tests
Adopt modern test patterns for relocation overflow and alignment error
tests:
* Use `rm -rf %t && mkdir %t && cd %t` pattern for isolation. Use simple
filenames (32.o, 64.o, out.32) instead of %t-prefixed names
* Use `--defsym` instead of external input files where possible
* Omit `-o /dev/null` for negative tests (implicit when errors occur)
* Add `--implicit-check-not=error:` to catch unexpected errors
ctwm_app_menu: Rewrite largely in awk for ~20-100x speedup.
Previously took ~2.5sec on my laptop, now 0.03sec.
Previously took ~10sec on a wiiu, now ~0.5sec.
Output is meant to be byte-for-byte identical, except possibly in
cases that could have screwed up ctwm by quoting shenanigans which
are now escaped. (I hope the escape sequences work, didn't actually
check how ctwm interprets them.) Can maybe support Exec lines with
`"' in them by deleting some code (marked XXX) but I didn't test that
it actually works that way.
PR bin/59958: ctwm: long delay during ctwm_app_menu
Update devel/objfw to 1.4.4
ObjFW 1.4.3 -> ObjFW 1.4.4, 2026-02-01
* Fixes Swift interoperability.
* Fixes building for Wii with new devkitPro.
* Fixes missing background color rounding in OFStdIOStream.
* Adds iso8859-* as an alias for iso-8859-*.
* Fixes objfw-compile not passing -f* and -m* to the linker.
* Makes ofhttp always send an Accept header to avoid being flagged as
suspicious by some websites.
* Fixes ObjFWTLS with OpenSSL when using Apple GCC 4.0.1.
* Fixes a few missing OF_RETURNS_INNER_POINTER.
* Fixes some OFMutableStringTests not being run.