[SandboxIR][Tracker] Support nested checkpoints (#191097)
This patch implements nested checkpointing, i.e., you can now save the
IR state more than once and revert more than once.
For example, after two saves, save(1) and save(2), a revert() brings
you back to the IR state of save(2); one more revert() brings you back
to the IR state of save(1).
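The save/revert nesting described above can be sketched with an undo log and a stack of checkpoints. This is a toy model only: the `Tracker` class, its `set` method and the dict standing in for the IR are hypothetical names for illustration, not the SandboxIR API.

```python
class Tracker:
    """Toy change tracker with nested checkpoints (hypothetical names)."""

    def __init__(self, state):
        self.state = state        # mutable dict standing in for the IR
        self._undo_log = []       # (key, had_key, old_value), oldest first
        self._checkpoints = []    # undo-log length recorded at each save()

    def set(self, key, value):
        # Record how to undo the change before applying it.
        self._undo_log.append((key, key in self.state, self.state.get(key)))
        self.state[key] = value

    def save(self):
        # A checkpoint is just the current position in the undo log.
        self._checkpoints.append(len(self._undo_log))

    def revert(self):
        # Undo everything recorded since the most recent save().
        mark = self._checkpoints.pop()
        while len(self._undo_log) > mark:
            key, had_key, old = self._undo_log.pop()
            if had_key:
                self.state[key] = old
            else:
                del self.state[key]


ir = Tracker({"x": 0})
ir.save()                 # save(1)
ir.set("x", 1)
ir.save()                 # save(2)
ir.set("x", 2)
ir.set("y", 3)
ir.revert()               # back to the state at save(2)
state_after_first_revert = dict(ir.state)
ir.revert()               # back to the state at save(1)
state_after_second_revert = dict(ir.state)
```

Because each checkpoint only stores an index into one shared log, the cost of a nested save in this sketch is constant, and reverting the outer checkpoint undoes the inner changes as well.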
[cmake][compiler-rt][darwin] builtin libraries don't build for armv6m in Darwin (#195372)
darwin_add_builtin_libraries tests for _Float16 and __bf16 against the
host architecture rather than the one being built. Add -arch to the test
so that armv6m correctly reports that it does not support __bf16.
cfcmp/cdcmp get "error: unsupported relocation type" on their "Branch to
target address" to c{f,d}cmple. Switch those to "Call a subroutine"
instructions on Thumb-1 (e.g. armv6m).
Assisted-by: Claude Code
rdar://167828904
[flang][OpenMP] Detect DSA conflicts in nested loop constructs (#195323)
Follow-up to https://github.com/llvm/llvm-project/pull/194961
The fix from PR194961 did not detect explicit/predefined DSA conflicts
on an iteration variable in a nested loop construct. For example, in a
testcase inspired by Fujitsu 0165_0035.f90:
```
!$omp parallel do private(i) shared(j)
do i=1,1
  do j=1,1
    !$omp parallel do default(none) shared(k)
    do k=1,1
    end do
    !$omp end parallel do
  end do
end do
```
the "shared(k)" was not flagged as incorrect.
[2 lines not shown]
Update description of spl_schedule_hrtimeout_slack_us
Clarify the effect of the non-zero value on wakeup coalescing.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Signed-off-by: Christos Longros <chris.longros at gmail.com>
Closes #18467
man: document three missing properties and tunables
Add manpage entries for parameters and properties that exist in
source but were not previously described:
- spl.4: spl_schedule_hrtimeout_slack_us
- zfsprops.7: longname
- vdevprops.7: raidz_expanding
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin at TrueNAS.com>
Signed-off-by: Christos Longros <chris.longros at gmail.com>
Closes #18467
[DWARFLinker] Emit DW_IDX_parent in the accelerator table (#195403)
.debug_names entries produced by the parallel linker were always emitted
with std::nullopt for ParentDIEOffset, resulting in a missing
DW_IDX_parent. The classic linker emits it via
DWARF5AccelTableData::getDefiningParentDieOffset on the output DIE tree.
The parallel linker can't use the same approach because the records are
saved during cloneDIE, before the output DIE has been linked into its
parent, so DIE::getParent() is nullptr at that time. Fix that by
computing the parent offset from the input-side DIE tree instead. We
look up InputDieEntry's parent via getParentIdx, skip parents marked
DW_AT_declaration, and translate the parent to its output offset through
CompileUnit::getDieOutOffset. Since no real DIE can live at offset 0, an
output offset of 0 unambiguously marks input DIEs that were not cloned
into this CU's plain DWARF (e.g. routed only into the artificial type
unit); such a parent is treated as "no parent".
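The lookup described above can be sketched as follows. This is a rough model, not the linker's code: the three lists stand in for getParentIdx, the DW_AT_declaration check and CompileUnit::getDieOutOffset, and it assumes that "skipping" a declaration parent means emitting no parent at all rather than walking further up the tree.

```python
from typing import List, Optional


def parent_out_offset(
    idx: int,
    parent_idx: List[Optional[int]],   # stand-in for getParentIdx
    is_declaration: List[bool],        # stand-in for a DW_AT_declaration check
    out_offset: List[int],             # stand-in for getDieOutOffset
) -> Optional[int]:
    """Return the offset to emit as DW_IDX_parent, or None for "no parent"."""
    parent = parent_idx[idx]
    if parent is None:
        return None                    # root DIE: nothing above it
    if is_declaration[parent]:
        return None                    # assumption: declaration parents yield no parent
    offset = out_offset[parent]
    # Offset 0 can never hold a real DIE, so it marks "not cloned into this CU".
    return offset if offset != 0 else None


# Input DIE tree: 0 = CU, 1 = namespace, 2 = struct declaration,
# 3 = member of 2, 4 = type routed only to the type unit, 5 = member of 4,
# 6 = function inside the namespace.
parent_idx = [None, 0, 1, 2, 1, 4, 1]
is_declaration = [False, False, True, False, False, False, False]
out_offset = [11, 30, 50, 60, 0, 0, 80]

function_parent = parent_out_offset(6, parent_idx, is_declaration, out_offset)
member_of_declaration = parent_out_offset(3, parent_idx, is_declaration, out_offset)
member_of_uncloned = parent_out_offset(5, parent_idx, is_declaration, out_offset)
```

Only the function inside the namespace gets a parent offset here; the other two cases fall back to "no parent".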
Only compile-unit accelerator entries are covered. Type-unit entries
(artificial type unit) still emit no DW_IDX_parent, tracked by a TODO.
Import smart revision 1.0.2
The smart command allows the user to monitor the various information
reported by Self-Monitoring, Analysis and Reporting Technology (SMART)
present on most ATA, SCSI, and NVMe storage media.
[AArch64] Fix `shufflevector` miscompilation on `aarch64_be` (#193076)
A function like
```llvm
define <4 x i16> @xtn_shuffle_even_v8i16(<8 x i16> %a) {
entry:
  %r = shufflevector <8 x i16> %a, <8 x i16> poison, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
  ret <4 x i16> %r
}
```
will use the `xtn` instruction, which for each 32-bit vector element
keeps only the lower 16 bits, so effectively this is a truncation.
However, if the vector actually has 16-bit elements, then the conversion
from a shuffle to a truncation is only valid on LE, not on BE. On BE,
`uzp1` should be used instead. So this PR moves some logic to right
after a check for LE, so that BE does not miscompile.
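Why the shuffle-to-truncation rewrite is endian-dependent can be seen in a small model. The sketch below mimics the IR-level meaning of reinterpreting eight 16-bit elements as four 32-bit elements through memory and then keeping the low 16 bits of each; the function name is made up for illustration, and it models bitcast semantics only, not the actual register layout or instruction selection.

```python
import struct


def truncate_after_reinterpret(elems, byteorder):
    """Reinterpret eight 16-bit elements as four 32-bit ones, keep the low halves."""
    prefix = "<" if byteorder == "little" else ">"
    raw = struct.pack(prefix + "8H", *elems)    # the vector as it sits in memory
    wide = struct.unpack(prefix + "4I", raw)    # same bytes viewed as 32-bit lanes
    return [w & 0xFFFF for w in wide]           # truncation keeps the low 16 bits


a = [10, 11, 12, 13, 14, 15, 16, 17]
shuffle_even = a[0::2]                          # what the shufflevector asks for

le_result = truncate_after_reinterpret(a, "little")
be_result = truncate_after_reinterpret(a, "big")
```

On little-endian the truncation picks elements 0, 2, 4, 6 and matches the shuffle. On big-endian the low half of each 32-bit lane is the odd-numbered element, so the same rewrite would select elements 1, 3, 5, 7.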
[5 lines not shown]
Prevent undefined behavior caused by combination of branch and load delay slots on MIPS1 (#185427)
Under certain conditions the LLVM `MipsDelaySlotFiller` fills a branch
delay slot with an instruction that itself requires a load delay slot.
However, the `MipsDelaySlotFiller` does not check the instruction it
places there for hazards, which leads to code like this:
```asm
beqz $1, $BB0_5
lbu $2, %lo(_RNvCs5jWYnRsDZoD_3app13CONTROLLERS_A)($2)
# --- Some other instructions
$BB0_5:
andi $1, $2, 1
```
`lbu` was moved into the branch delay slot but has a load delay slot of
its own, so when the branch to `$BB0_5` is taken the value of `$2` is
not yet ready, which leads to undefined behavior.
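The hazard can be illustrated with a toy model of the load delay. In this sketch a loaded value only becomes visible after the following instruction has executed; real MIPS I hardware leaves the result undefined, so "the next instruction sees the old value" is purely this model's assumption. The instruction tuples and the `run` helper are hypothetical.

```python
def run(trace, regs, memory):
    """Execute a straight-line trace on a toy model of the MIPS I load delay."""
    regs = dict(regs)
    pending = None  # (dest, value) of a load whose delay slot has not elapsed
    for op, dest, src in trace:
        commit, pending = pending, None
        if op == "lbu":
            pending = (dest, memory[src])      # value arrives one instruction late
        elif op == "andi1":
            regs[dest] = regs[src] & 1         # reads whatever is in the register now
        # "nop" changes nothing
        if commit is not None:
            regs[commit[0]] = commit[1]        # the earlier load lands after this op
    if pending is not None:
        regs[pending[0]] = pending[1]
    return regs


regs = {"$1": 0, "$2": 0}
memory = {"CONTROLLERS_A": 1}

# Taken branch: the lbu in the branch delay slot is followed directly by the andi.
taken_branch = [("lbu", "$2", "CONTROLLERS_A"), ("andi1", "$1", "$2")]
# Same sequence with one instruction between the load and its use.
with_gap = [("lbu", "$2", "CONTROLLERS_A"), ("nop", None, None), ("andi1", "$1", "$2")]

hazard = run(taken_branch, regs, memory)
safe = run(with_gap, regs, memory)
```

In the first trace `andi` computes its result from the old contents of `$2`; with one instruction in between it sees the loaded byte.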
This PR proposes declaring instructions with a load delay slot as
hazardous for the branch delay slot, on `MIPS1` only. This will prevent
[23 lines not shown]
[RFC][NFCI][Constants] Add `Constant::isZeroValue`
The old `isZeroValue` was removed because it was functionally identical to
`Constant::isNullValue`. Currently, a "null value" in LLVM means a zero value.
We are moving toward changing the semantics of `ConstantPointerNull` to
represent a semantic null pointer instead of a zero-valued pointer. As a result,
the meaning of "null value" will also change in the future.
This PR series is the first step toward renaming the two widely used "null
value" interfaces to "zero value". As the first PR in the series, this change
adds a "new" `isZeroValue` alongside `isNullValue`, and makes `isNullValue` call
`isZeroValue` directly. Then, all uses of `isNullValue` in LLVM are replaced
with `isZeroValue`. Uses in other projects will be updated in separate PRs.
The plan is to eventually remove `isNullValue` after all uses have been
migrated.
[VPlan] Get GEP wrap flags from VPInstructions (NFCI). (#195730)
Add a helper to retrieve GEP no-wrap flags from VPInstructions, looking
through zero-index GEPs and pointer casts, like
Value::stripPointerCasts. This removes an access to the underlying IR.
[ModuleInliner] Skip function declarations during candidate scan (#195567)
This patch skips function declarations during the candidate scan in
ModuleInlinerPass::run as declarations do not have bodies.
[InlineOrder] Fix assertion failure in CostBenefitPriority (#195564)
InlineCost::getStaticBonusApplied() triggers an assertion failure
if the CostBenefitPriority constructor calls it when
IC.isVariable() is false. This is because
getStaticBonusApplied() expects isVariable() to be true.
Unconditionally populating CostBenefit also incorrectly prioritizes
a NeverInline candidate with a cost-benefit pair over other
valid variable-cost sites.
This patch fixes the crash and the sorting issue by calling
getStaticBonusApplied() and populating CostBenefit only when
IC.isVariable() is true. For AlwaysInline and NeverInline costs,
CostBenefit is explicitly set to std::nullopt.
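A minimal sketch of the guarded construction, using Python stand-ins for the C++ classes. The field names, the string-valued `kind`, and the choice of 0 as the static bonus for non-variable costs are assumptions for illustration and do not mirror the real InlineCost layout.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class InlineCost:
    kind: str                                        # "always", "never" or "variable"
    static_bonus: int = 0
    cost_benefit: Optional[Tuple[int, int]] = None   # (cost, benefit) pair

    def is_variable(self) -> bool:
        return self.kind == "variable"

    def get_static_bonus_applied(self) -> int:
        # Mirrors the precondition described above.
        assert self.is_variable(), "only valid for variable costs"
        return self.static_bonus


class CostBenefitPriority:
    def __init__(self, ic: InlineCost):
        if ic.is_variable():
            self.static_bonus = ic.get_static_bonus_applied()
            self.cost_benefit = ic.cost_benefit
        else:
            # AlwaysInline / NeverInline: no bonus lookup, no cost-benefit pair.
            self.static_bonus = 0                    # assumption for this sketch
            self.cost_benefit = None


variable = CostBenefitPriority(InlineCost("variable", static_bonus=5, cost_benefit=(10, 40)))
never = CostBenefitPriority(InlineCost("never", cost_benefit=(1, 1000)))
```

Constructing a priority for the NeverInline cost no longer trips the assertion, and its stray cost-benefit pair cannot influence the ordering because it is dropped.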
[IPO] Fix infinite recursive inlining in ModuleInliner (#195471)
The ModuleInliner currently lacks inline history tracking. Without
it, the inliner can get stuck in an infinite loop when mutually
recursive functions are involved.
This patch enables inline history tracking in the ModuleInliner to
address this issue.
The minsize attribute in the test case lowers the threshold for the
mutually recursive functions, ensuring the bug reproduces in pass
isolation.
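How history tracking stops the loop can be shown on a toy call graph. This is a simplified model and not the data structure LLVM uses: every call site carries the tuple of functions whose inlining exposed it, and a callee that already appears in that tuple is skipped. The `inline_into` helper and the `max_inlines` safety cap are hypothetical.

```python
from collections import deque


def inline_into(root, call_graph, use_history, max_inlines=100):
    """Inline calls reachable from `root`; return (inlined callees, finished)."""
    # Each work item is (callee, history of functions inlined to expose this call).
    worklist = deque((callee, ()) for callee in call_graph[root])
    inlined = []
    while worklist:
        callee, history = worklist.popleft()
        if use_history and callee in history:
            continue                       # this call came from inlining callee itself
        if len(inlined) == max_inlines:
            return inlined, False          # gave up: would otherwise never terminate
        inlined.append(callee)
        # Inlining the body exposes the callee's own calls as new call sites.
        for exposed in call_graph[callee]:
            worklist.append((exposed, history + (callee,)))
    return inlined, True


# f and g are mutually recursive; main calls f.
call_graph = {"main": ["f"], "f": ["g"], "g": ["f"]}

with_history, finished_with = inline_into("main", call_graph, use_history=True)
without_history, finished_without = inline_into("main", call_graph, use_history=False)
```

With history tracking the worklist drains after inlining `f` and `g` once each; without it, every inlined body exposes another call and only the artificial cap ends the run.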