[LV] Simplify and unify resume value handling for epilogue vec. (#185969)
This patch tries to drastically simplify resume value handling for the
scalar loop when vectorizing the epilogue.
It uses a simpler, uniform approach for updating all resume values in
the scalar loop:
1. Create ResumeForEpilogue recipes for all scalar resume phis in the
main loop (the epilogue plan will have exactly the same scalar resume
phis, in exactly the same order)
2. Update ::execute for ResumeForEpilogue to set the underlying value
when executing. This is not super clean, but allows easy lookup of the
generated IR value when we update the resume phis in the epilogue. Once
we connect the 2 plans together explicitly, this can be removed.
3. Use the list of ResumeForEpilogue VPInstructions from the main loop
to update the resume/bypass values from the epilogue.
This simplifies the code quite a bit, makes it more robust (should fix
[11 lines not shown]
[Inliner] Fix return attribute propagation across multiple return sites (#186076)
Fixes #185159
This patch fixes a bug in `AddReturnAttributes()` where propagated
return attributes could incorrectly leak across multiple return sites in
the callee being inlined.
`AddReturnAttributes()` walks the callee's return instructions and tries
to backward-propagate return attributes from the callsite to the
returned call when the callee directly returns a call result. However,
the propagated attribute builders were updated in-place while iterating
over return sites. As a result, attributes refined for one return site
could be reused when processing a later return site. This is incorrect
because each return site should be handled independently, starting from
the original callsite attributes.
This patch ensures that propagated return attributes are reinitialized
for each return site, so propagation is computed independently per
returned call.
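
The failure mode described above can be illustrated with a small model. This is only a sketch, not the LLVM code: it assumes the propagated attribute is a value range stored as a `(lo, hi)` tuple, and the names `intersect`, `propagate_shared`, and `propagate_per_site` are hypothetical.

```python
def intersect(a, b):
    """Intersect two closed ranges; None stands for an empty range."""
    if a is None or b is None:
        return None
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


def propagate_shared(callsite_range, site_ranges):
    """Buggy pattern: one piece of state is refined in place across sites."""
    current = callsite_range
    out = {}
    for name, existing in site_ranges.items():
        # The refinement computed for an earlier site leaks into this one.
        current = intersect(current, existing)
        out[name] = current
    return out


def propagate_per_site(callsite_range, site_ranges):
    """Fixed pattern: every site starts again from the callsite range."""
    out = {}
    for name, existing in site_ranges.items():
        current = callsite_range  # reinitialized for each return site
        out[name] = intersect(current, existing)
    return out


sites = {"ret_a": (0, 10), "ret_b": (50, 200)}
shared = propagate_shared((0, 100), sites)
per_site = propagate_per_site((0, 100), sites)
```

In this model the shared version gives `ret_b` an empty range, because the narrowing computed for `ret_a` is still in effect, while the per-site version gives `ret_b` the range `(50, 100)`.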
NAS-140297 / 26.0.0-BETA.2 / Use truenas_os_pyutils (by anodos325) (#18472)
Several functions that were originally provided by middlewared/utils
were moved to the truenas_os_pyutils module so that they can be cleanly
consumed by python modules outside of the middleware repository without
causing odd inter-dependencies. This commit finishes up the moves by
swapping out imports at call sites and removing redundant tests.
Original PR: https://github.com/truenas/middleware/pull/18458
Co-authored-by: Andrew Walker <andrew.walker at truenas.com>
[AArch64] Allocate two emergency spill slots for MTE to fix register scavenger crash (#186505)
When `-fsanitize=memtag-stack` is enabled and the compiler optimizes
contiguous ST2Gi instructions into an MTE loop (via
`TagStoreEdit::emitLoop`), it spawns two new post-RA virtual registers
simultaneously:
1. `BaseReg`
2. `SizeReg`
Under extremely high register pressure (such as in Swift async
continuation thunks, where almost all registers are kept live), the
Register Scavenger must fall back to using emergency spill slots to
assign physical registers to `BaseReg` and `SizeReg`.
Prior to this patch, `determineCalleeSaves` assumed that a maximum of
one register would ever need to be scavenged at a time. It either
allocated a single emergency spill slot, or bypassed the allocation
[20 lines not shown]
queue.h: Reorder STAILQ_INSERT_TAIL
The current implementation briefly violates the tail invariant. This
is not usually an issue, but if an insert is in flight when a panic
occurs, we may then trip the invariant while dumping core.
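
The ordering problem can be modeled outside of C. The sketch below is not the queue.h macro and does not claim to match the committed change: it assumes the tail invariant is "the link that the head's last pointer refers to is empty", and `Head`, `Node`, `tail_ok`, and both insert functions are hypothetical names. The `observe` callback stands in for a panic handler that inspects the queue in the middle of an insert.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None


class Head:
    """The head doubles as a link holder: `next` is the first element."""

    def __init__(self):
        self.next = None
        self.last = self  # object whose `next` field is the tail link


def tail_ok(head):
    # Assumed invariant: the tail link is always empty.
    return head.last.next is None


def insert_tail_old(head, elm, observe):
    elm.next = None
    head.last.next = elm      # tail link is now non-empty
    observe(tail_ok(head))    # snapshot taken mid-insert
    head.last = elm


def insert_tail_reordered(head, elm, observe):
    elm.next = None
    prev = head.last
    head.last = elm           # move the tail first; its link is empty
    observe(tail_ok(head))    # snapshot taken mid-insert
    prev.next = elm


def values(head):
    out, node = [], head.next
    while node is not None:
        out.append(node.value)
        node = node.next
    return out


old_seen, new_seen = [], []
old_q, new_q = Head(), Head()
for v in (1, 2):
    insert_tail_old(old_q, Node(v), old_seen.append)
    insert_tail_reordered(new_q, Node(v), new_seen.append)
```

Both orderings build the same list, but only the reordered one keeps the modeled invariant true at the mid-insert snapshot. The price is that the new element is briefly unreachable from the head.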
MFC after: 1 week
Sponsored by: Klara, Inc.
Sponsored by: NetApp, Inc.
Reviewed by: obiwac, olce, jhb
Differential Revision: https://reviews.freebsd.org/D55819
[HLSL] Use 0 to represent unbounded resources (#186022)
The SPIR-V backend uses 0 to represent unbounded arrays. This patch
represents unbounded resources with 0 when binding them and makes sure
the backend uses OpTypeRuntimeArray for such cases.
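
As a rough illustration of the convention, the sketch below picks a SPIR-V array type from an element count, treating 0 as "unbounded". The function name and the tuple encoding are assumptions made for this example; only the `OpTypeArray` and `OpTypeRuntimeArray` instruction names come from SPIR-V.

```python
UNBOUNDED = 0  # sentinel size for an unbounded resource array


def spirv_array_type(element_type, count):
    """Return a tuple describing the array type for `count` elements."""
    if count < 0:
        raise ValueError("element count must be non-negative")
    if count == UNBOUNDED:
        # Unbounded arrays have no length operand.
        return ("OpTypeRuntimeArray", element_type)
    return ("OpTypeArray", element_type, count)


bounded = spirv_array_type("float", 4)
unbounded = spirv_array_type("float", UNBOUNDED)
```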
Fix: https://github.com/llvm/llvm-project/issues/183367
[clang] use canonical arguments for checking function template constraints
This is a partial revert of #161671, restoring the original behaviour
where the canonical template arguments are used for function template
constraint checking in diagnostics.
This reverts the fix from #183010, which attempted to fix #182344
but caused regressions. Test cases for these regressions are now included.
The attempt in #183010 is flawed because in the general case we can't
check satisfaction for constraints which have unsubstituted template arguments,
even if they don't affect the canonical type (i.e. they are purely syntactical),
because these types can still turn out to be invalid after substitution.
This is a problem when directly evaluating a concept specialization, but
it's not a problem with other template specializations because the as-written
types are preserved, and will be later substituted, and any failures here will
cause the program to be ill-formed anyway.
[13 lines not shown]
[Clang][Docs] Discontinue documenting the GCC -I- and --include-barrier options. (#184941)
Clang has never implemented the GCC `-I-` and `--include-barrier`
options. An error is issued if they are used. GCC deprecated these
options in GCC 4. Advertising their availability in documentation and
help text is misleading.
[LLVM] [SeparateConstOffsetFromGEP] patch PR 183402 to handle negative C correctly (#186858)
A small typo in the negative C threshold calculation resulted in a
threshold that was too conservative due to overflow.
[PowerPC][NFC] Refactor Register class and operand definitions (#185647)
Created a comprehensive base class system in PPCRegisterClasses.td to
eliminate repetitive RegisterOperand definitions across PowerPC register
files, and introduced a PPCRegOperand multiclass to automatically
generate AsmOperandClass and RegisterOperand definitions, eliminating
~50 lines of boilerplate.
Assisted by AI.
[libc] Build fuzzing tests in pre-merge CI tests (#185018)
At the moment, no CI job tests whether the fuzzing tests build
correctly.
This patch adds the build of fuzzing tests to the pre-merge CI job.
Only two configurations have it enabled for now. The none-eabi
configurations seemingly do not support it because in their cmake
configs compiler-rt is not enabled, hence libFuzzer isn't built. I did
not dig too much to understand why that is, preferring to just leave it
disabled for these configurations. For the remaining ones that seem to
support it, I selected one x86 and one aarch64.
In addition, it removes one outdated comment about the build type used
and changes the action to run on all branches, not only on PRs that
target main.
If we limit it to run only on PRs to the main branch, it will not run on
stacked PRs. I believe it is also okay to run it on PRs to release
branches. Therefore it is just easier to remove the limit altogether.