[clang][NFC] Remove alignment checks from test/CodeGen/c-strings.c (#196501)
and re-enable it on more targets.
I don't think this test was intended to check for alignment. Those
expectations were added as part of FileCheck-izing the test in
e29dadb6403c8b0d3658f9bbbe2f5fbde5431fdb and we've been working around
them or xfailing the test since.
[AA] No synchronization effects for never-escaping identified local (#193939)
Fences and other synchronizing operations (such as atomic accesses
stronger than monotonic) are modelled as reading and writing all memory,
in order to enforce their implied ordering constraints.
Currently, this happens even for identified function locals that do not
escape. This patch excludes those objects.
Notably, we can *not* reason based on captures-before here, because the
synchronizing operation still has an effect even if the object only
escapes *later*.
The hope here is that with this restriction in place, it may be viable
to respect potential synchronization inside non-nosync function calls.
[libc] Fix partial multi-byte write detection in File (#196402)
File::write_unlocked(const wchar_t*, size_t) checked 'write_res.value <
1' after writing a converted UTF-8 sequence. For multi-byte characters,
a short platform write (e.g. 2 of 3 bytes for a 3-byte character) passed
this check and was counted as a successful write. The output stream
would then contain an incomplete UTF-8 sequence with no error reported
to the caller.
Changed the check to 'write_res.value < char_size' and set the error
indicator on the stream when it triggers.
Added a regression test using a mock File subclass that limits
platform_write to 2 bytes per call, simulating short writes on pipes and
sockets.
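
The difference between the two checks can be sketched as follows. This is a minimal illustration, not the libc code: `ShortWriter`, `old_check_ok`, and `new_check_ok` are hypothetical names, and the writer only mimics a platform write that accepts at most a fixed number of bytes per call.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a platform write that may accept fewer bytes
// than requested, as pipes and sockets can.
struct ShortWriter {
  std::size_t max_per_call; // upper bound on bytes accepted per call
  std::string sink;         // collects whatever was actually written

  std::size_t write(const char *data, std::size_t len) {
    assert(data != nullptr);
    std::size_t n = len < max_per_call ? len : max_per_call;
    sink.append(data, n);
    return n;
  }
};

// Old check: any write of at least one byte counted as success.
inline bool old_check_ok(std::size_t written, std::size_t /*char_size*/) {
  return !(written < 1);
}

// New check: the whole encoded character must have been written.
inline bool new_check_ok(std::size_t written, std::size_t char_size) {
  return !(written < char_size);
}
```

With a 3-byte UTF-8 sequence and a writer limited to 2 bytes per call, the old check accepts the short write while the new one rejects it.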
Assisted-by: Automated tooling, human reviewed.
---------
Co-authored-by: Michael Jones <michaelrj at google.com>
[LoopFusion] Remove SCEV-based dependence analysis path (#195864)
Loop Fusion has used Dependence Analysis (DA) as the default dependence
check since the option default was flipped in #187309. The SCEV-based
strategy and the combined "all" mode were retained only for fallback and
experimentation, with a comment noting that the SCEV code would be
removed in a follow-up.
This patch removes the SCEV-based dependence path and the now-unused
selector machinery.
Fixes #194821.
Assisted by Cursor.
[DebugInfo] Pack DILocation hash inputs (#196556)
Pack DILocation fields before hashing. Now that the column is 16 bits,
Line/Column/ImplicitCode fit in one 64-bit value (32 + 16 + 1 = 49 bits),
and AtomGroup and AtomRank also fit cleanly in one 64-bit value (61 + 3
= 64 bits).
Fewer hash_combine inputs on the hot DILocation path is a small
compile-time improvement.
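
The bit budget can be sketched as below. The helper names and the exact bit positions are assumptions made for illustration; only the field widths (32, 16, 1, 61, 3) come from the description above, and the actual patch may lay the fields out differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: Line (32 bits) | Column (16 bits) | ImplicitCode (1 bit).
// Uses the low 49 bits of the result.
constexpr std::uint64_t packLineColImplicit(std::uint32_t line,
                                            std::uint16_t column,
                                            bool implicit_code) {
  return static_cast<std::uint64_t>(line) |
         (static_cast<std::uint64_t>(column) << 32) |
         (static_cast<std::uint64_t>(implicit_code) << 48);
}

// Hypothetical helper: AtomGroup (61 bits) | AtomRank (3 bits).
// Values wider than their field are masked rather than rejected.
constexpr std::uint64_t packAtom(std::uint64_t atom_group,
                                 std::uint8_t atom_rank) {
  constexpr std::uint64_t group_mask = (std::uint64_t{1} << 61) - 1;
  return (atom_group & group_mask) |
         (static_cast<std::uint64_t>(atom_rank & 0x7u) << 61);
}
```

Five separate hash inputs become two 64-bit values, and distinct field tuples still map to distinct packed values because the fields do not overlap.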
CTMark geomean:
- stage1-ReleaseLTO-g: -0.10%
- stage1-O0-g: -0.23%
- stage1-aarch64-O0-g: -0.19%
- stage2-O0-g: -0.07%
https://llvm-compile-time-tracker.com/compare.php?from=71fef6d5a306d1adf8bf7d30d2fe9e286380fecf&to=1d80b5f5aa98561d2ba09adc3f20c3eacd24cb88&stat=instructions%3Au
Assisted-by: codex
[libc][stdlib] Add setenv (#163018)
Add the POSIX setenv() function, with EnvironmentManager::set()
handling environment array management and ownership tracking.
Registered for x86_64, aarch64, and riscv architectures. Integration
tests cover overwrite/no-overwrite semantics, empty/invalid names,
empty values, and repeated replacement.
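
The overwrite/no-overwrite and invalid-name semantics that the tests cover can be sketched against the POSIX interface. This is a minimal sketch that calls whatever `setenv` the host C library provides; `SetenvDemo`, `env_or_empty`, and `run_setenv_demo` are hypothetical names, not part of the patch.

```cpp
#include <stdlib.h> // POSIX setenv, unsetenv, getenv

#include <cerrno>
#include <string>

// Returns the current value of `name`, or an empty string if it is unset.
inline std::string env_or_empty(const char *name) {
  const char *v = getenv(name);
  return v ? std::string(v) : std::string();
}

struct SetenvDemo {
  std::string after_no_overwrite; // value after setenv(..., overwrite = 0)
  std::string after_overwrite;    // value after setenv(..., overwrite = 1)
  int empty_name_result;          // return value for an empty name
  int empty_name_errno;           // errno reported for an empty name
};

inline SetenvDemo run_setenv_demo(const char *name) {
  SetenvDemo r;
  setenv(name, "first", 1);
  setenv(name, "second", 0); // overwrite == 0: existing value is kept
  r.after_no_overwrite = env_or_empty(name);
  setenv(name, "second", 1); // overwrite != 0: value is replaced
  r.after_overwrite = env_or_empty(name);
  errno = 0;
  r.empty_name_result = setenv("", "x", 1); // POSIX: fails with EINVAL
  r.empty_name_errno = errno;
  unsetenv(name); // leave the environment as we found it
  return r;
}
```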
Assisted-by: Automated tooling, human reviewed.
---------
Co-authored-by: Michael Jones <michaelrj at google.com>
[MLIR][NVVM][NFC] Restructure NVVM dialect (#195811)
Moves the declarations of the NVVM dialect and some widely used enums
(`FPRoundingModeAttr` and `SaturationModeAttr`) to separate files to make
them easier to maintain and also use in the NVGPU dialect.
[AArch64][CostModel] Model sve costs for ctpop (#192428)
Targets supporting SVE prefer SVE for ctpop on fixed-length vectors.
Update the cost model to reflect this.
[InstCombine][NFC] Change the order of checks in SliceUpIllegalIntegerPHI for faster compile time. (#183726)
SliceUpIllegalIntegerPHI searches for PHIs that have an illegal type and
are only used by trunc or trunc(lshr) operations. It bails out if it
encounters invoke or EH pad instructions.
It first checks for invoke or EH pad instructions, which is time
consuming because it inspects every instruction. Only then does it check
whether the PHI is used by trunc or trunc(lshr). The former check is
generally loose, while the latter is stricter. Switching the order of
the checks speeds up compilation.
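
The effect of the reordering can be sketched with a toy model. Nothing here is InstCombine code: `Candidate`, `Stats`, and the two counting functions are hypothetical, and each check is reduced to a precomputed boolean so that only the number of costly scans differs.

```cpp
#include <vector>

// Toy model of a candidate PHI with a precomputed answer for each check.
struct Candidate {
  bool only_trunc_users;  // strict check: rejects most candidates
  bool free_of_invoke_eh; // loose check: rarely rejects, costly to compute
};

struct Stats {
  int costly_scans = 0; // how many times the costly check actually ran
};

inline bool costly_scan(const Candidate &c, Stats &s) {
  ++s.costly_scans;
  return c.free_of_invoke_eh;
}

// Old order: the costly, loose check runs for every candidate.
inline int count_old_order(const std::vector<Candidate> &cs, Stats &s) {
  int accepted = 0;
  for (const Candidate &c : cs)
    if (costly_scan(c, s) && c.only_trunc_users)
      ++accepted;
  return accepted;
}

// New order: the strict check filters first, so the costly check runs
// only for candidates that survive it.
inline int count_new_order(const std::vector<Candidate> &cs, Stats &s) {
  int accepted = 0;
  for (const Candidate &c : cs)
    if (c.only_trunc_users && costly_scan(c, s))
      ++accepted;
  return accepted;
}
```

Both orders accept the same candidates; only the number of costly scans changes.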
Signed-off-by: XinlongZHANG-Bob <zhangxinlong.bob at bytedance.com>
[LV][NFC] Reshape pointer_iv_non_uniform_0 test to use distinct loads (#196494)
The follow-up [patch](https://github.com/llvm/llvm-project/pull/196080)
folds some idempotent binary ops. This test has a `sub x - x` operation
that is affected by the follow-up patch. This patch makes the test
immune to the fold.