[SampleProfileMatcher] Flatten profiles loaded on demand (#184255)
Fix an issue where a top-level function loaded from the profile during
CG matching is not flattened. As a result, the inlined callees of the
loaded nested profile don't get their own entries in
`FlattenedProfiles`, making them undiscoverable by subsequent CG
matching steps.
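The effect of flattening can be illustrated with a toy model. This is only a sketch, not the LLVM implementation: the `Profile` class, the `flatten` helper, and the sample counts are hypothetical, and the real `FlattenedProfiles` stores richer data than a name-to-count map.

```python
from dataclasses import dataclass, field

@dataclass
class Profile:
    # Hypothetical stand-in for a function profile with nested inlinees.
    name: str
    samples: int = 0
    inlinees: list = field(default_factory=list)

def flatten(profile, flattened):
    """Give `profile` and every nested inlinee its own top-level entry."""
    flattened[profile.name] = flattened.get(profile.name, 0) + profile.samples
    for callee in profile.inlinees:
        flatten(callee, flattened)

# A nested profile loaded on demand: `caller` with one inlined callee.
loaded = Profile("caller", 10, [Profile("inlined_callee", 4)])
flattened = {}
flatten(loaded, flattened)
```

Without the recursive step, only `caller` would be present, and a later lookup of `inlined_callee` would fail.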
[SampleProfileMatcher] Add direct basename early matching for orphan functions (#184409)
When user code changes function signatures (e.g., adding/removing
parameters), the C++ mangled name changes while the base function name
stays the same. The existing stale profile CG matching can only recover
renamed functions when they appear as callees of already-matched
callers. If the caller has no profile (e.g., fully inlined in the
profiled binary, or from a different TU not loaded), the renamed callee
is never discovered and gets zero profile data.
Add `matchFunctionsWithoutProfileByBasename()` that pairs orphan IR
functions (no profile) with unused top-level profile functions by
demangled basename, without requiring a matched caller in the call
graph.
This direct basename matching runs before CG matching and writes to
`FuncToProfileNameMap`. CG matching can later overwrite these entries
(since `SymbolMap` is not updated until `UpdateWithSalvagedProfiles`),
so a contextually better CG match is not blocked.
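As a rough illustration of pairing by basename, the sketch below uses a toy extractor that only understands simple Itanium-style names of the form `_Z<length><identifier><parameter types>`. The helper names are hypothetical, and accepting only unambiguous one-to-one pairs is an assumption made for this example, not a statement about what `matchFunctionsWithoutProfileByBasename()` does.

```python
import re
from collections import defaultdict

def toy_basename(mangled):
    """Extract the identifier from a simple `_Z<len><name>...` symbol.

    Hypothetical helper: it ignores namespaces, templates and nested names.
    """
    match = re.match(r"_Z(\d+)", mangled)
    if match is None:
        return mangled  # not a simple mangled name, use it unchanged
    start = match.end()
    return mangled[start:start + int(match.group(1))]

def match_orphans_by_basename(orphan_ir_funcs, unused_profile_funcs):
    """Pair IR functions without a profile with unused top-level profiles."""
    ir_by_base = defaultdict(list)
    for name in orphan_ir_funcs:
        ir_by_base[toy_basename(name)].append(name)
    prof_by_base = defaultdict(list)
    for name in unused_profile_funcs:
        prof_by_base[toy_basename(name)].append(name)

    func_to_profile = {}
    for base, ir_names in ir_by_base.items():
        prof_names = prof_by_base.get(base, [])
        # Assumption: only accept unambiguous one-to-one pairs.
        if len(ir_names) == 1 and len(prof_names) == 1:
            func_to_profile[ir_names[0]] = prof_names[0]
    return func_to_profile

# foo(int, int) in the IR pairs with the stale profile for foo(int).
early = match_orphans_by_basename(["_Z3fooii", "_Z3barv"],
                                  ["_Z3fooi", "_Z3bazv"])
```

Because the result is an ordinary map in this model, a later call-graph match can simply assign a different profile name to the same key.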
[5 lines not shown]
[FastISel] Lower call instruction with illegal type returned (#180322)
Fix issue https://github.com/llvm/llvm-project/issues/179100
When lowering a call instruction that returns an illegal type, FastISel
should bail out and hand the lowering over to DAG. Otherwise, the return
value is not promoted to the proper type, but DAG would assume it has been.
---------
Co-authored-by: Yuanke Luo <ykluo at birentech.com>
Handle the case where an FMV function is declared, used, then defined by
fixing getMangledNameImpl so that it no longer needs to special-case FMV declarations, because GetOrCreateLLVMFunction can already return the non-mangled name of declared FMV functions
unveil ssh-pkcs11-helper too; fixes breakage spotted by anton@
If SK/P11/askpass is overridden by environment, only unveil the requested
path and not both the requested one and the default.
feedback/ok deraadt@
[llvm][RISCV] Use zilsd for callee-saved register spill/restore on RV32 (#184794)
When the Zilsd extension is enabled on RV32, use SD_RV32/LD_RV32
instructions to spill and restore pairs of callee-saved GPRs instead of
issuing two separate 32-bit stores or loads.
Note that the stack slot must be suitably aligned.
[AMDGPU] Add structural stall heuristic to scheduling strategies
Implements a structural stall heuristic that considers both resource
hazards and latency constraints when selecting instructions. In coexec,
this changes the pending queue from a binary “not ready to issue”
distinction into part of a unified candidate comparison. Pending
instructions still identify structural stalls in the current cycle, but
they are now evaluated directly against available instructions by stall
cost, making the heuristics both more intuitive and more expressive.
- Add getStructuralStallCycles() to GCNSchedStrategy that computes the
  number of cycles an instruction must wait due to:
  - Resource conflicts on unbuffered resources (from the SchedModel)
  - Sequence-dependent hazards (from GCNHazardRecognizer)
- Add getHazardWaitStates() to GCNHazardRecognizer that returns the number
  of wait states until all hazards for an instruction are resolved,
  providing cycle-accurate hazard information for scheduling heuristics.
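
The idea of comparing pending and available instructions by stall cost can be sketched as follows. This is a toy model under stated assumptions, not the code in GCNSchedStrategy: the function names are hypothetical, and combining the two wait sources with `max` is an assumption (an instruction can issue only once both constraints have cleared).

```python
def structural_stall_cycles(resource_wait, hazard_wait_states):
    # Assumption: both constraints must clear, so the longer wait dominates.
    return max(resource_wait, hazard_wait_states)

def pick_candidate(candidates):
    """Pick the candidate with the lowest stall cost.

    `candidates` holds (name, resource_wait, hazard_wait_states) tuples
    drawn from both the available and the pending queue.
    """
    return min(candidates,
               key=lambda c: structural_stall_cycles(c[1], c[2]))

# A pending instruction with a short stall can beat an available
# instruction that would hit a longer hazard.
best = pick_candidate([("available_op", 0, 3), ("pending_op", 1, 0)])
```

In this model the pending queue no longer acts as a hard filter; its entries simply carry a non-zero cost in the comparison.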
[InferAS][NFC] Improve documentation for getAddrSpaceCastPreservedPtrMask (#185239)
Clarify the description of the preserved pointer bit mask and its
purpose in address space inference. Reformat the example for better
readability.
Co-authored-by: Yuanke Luo <ykluo at birentech.com>
[DA] Fix test case for the Weak Zero SIV tests (NFC) (#185555)
The IR does not match the pseudo code. The pseudo code is intentional,
so update the IR accordingly.
www/py-a2wsgi: New port
Convert a WSGI app to an ASGI app, or an ASGI app to a WSGI app.
Pure Python. Depends only on the standard library.
Compared with other converters, the advantage is that a2wsgi
does not accumulate the request or response content in memory,
so you don't have to worry about memory limits caused by a2wsgi.
This problem exists in the converters implemented by
uvicorn/starlette and hypercorn.
[clang] Adjust -pedantic-errors -WX/-Wno-error=X interaction (#184756)
While -Wno-long-long suppresses -pedantic-errors diagnostics in both GCC
and Clang, GCC -Wno-error=long-long emits warnings while Clang still
emits errors.
```
% echo 'long long x = 0;' | gcc -std=c89 -pedantic-errors -Wno-error=long-long -x c -fsyntax-only -
<stdin>:1:6: warning: ISO C90 does not support 'long long' [-Wlong-long]
% echo 'long long x = 0;' | clang -std=c89 -pedantic-errors -Wno-error=long-long -x c -fsyntax-only -
<stdin>:1:1: error: 'long long' is an extension when C99 mode is not enabled [-Werror,-Wlong-long]
1 | long long x = 0;
| ^
1 error generated.
```
The order of -pedantic-errors and -Wno-error=long-long does not matter.
Two fixes to how extension diagnostics interact with -pedantic-errors
[20 lines not shown]
www/py-baize: New port
Powerful and exquisite WSGI/ASGI framework/toolkit. Relies only on the
standard library.
A minimal implementation of the methods required in a web framework.
No redundant implementation means that you can freely customize functions
without worrying about conflicts with baize's own implementation.
[HLSL] Add support for groupshared args (#181886)
Add support for groupshared args to HLSL.
Some support for template errors and warnings still needs to be added in
a follow-up (tracked by #182535).
Closes #174472
[libunwind][PAC] Defang ptrauth's PC in valid CFI range abort
It turns out making the CFI check a release mode abort causes many,
if not the majority, of JITs to fail during unwinding as they do not
set up CFI sections for their generated code. As a result, any JITs
that do nominally support unwinding (and catching) through their JIT
or assembly frames trip this abort.
rdar://170862047
A bit of MMU handling cleanup:
- BI_MMU_APOLLO does, in fact, mean the Apollo MMU for 68020 used on the
DN3000 and DN4000 machines.
- If we end up with an unknown MMU value, try to reconcile with machine
type and CPU type.