[OpenMP][OMPIRBuilder] Hoist static parallel region allocas to the entry block on the CPU (#174314)
Follow-up on #171597, this PR hoists allocas in a parallel region to the
entry block of its corresponding outlined function. This PR does this
for the CPU while #171597 introduced the main mechanism to do so and did
it for the GPU.
devel/ast-grep: update to 0.40.4
feat: support --files-with-matches to list all files like ripgrep #2371
fix: use new assert_cmd command #2399
chore(deps): update dependency dprint to v0.51.1 9d00e5b
chore(deps): update dependency @ast-grep/napi to v0.40.3 80f9c2b
chore(deps): update dependency oxlint to v1.36.0 dea9153
15.0/errata: Begin listing known open regressions
List three known traps upgrading to 15.0R in the release errata.
Noting: this is 34 days late, but first time since 12.0R
Discussed with: adrian, imp, jhb, jrtc27, ngie
Reviewed by: adrian
periodic/801.trim-zfs: Fix daily-trim-zfs-flags
This variable was named incorrectly, resulting in any specified flags
being silently ignored.
PR: 292074
MFC after: 3 days
Reported by: CrazyMihey at Ya.Ru
Fixes: 493908c4b45c (Add a daily zfs trim script)
(cherry picked from commit 68d6abd9714384a41028dc0d5086b4930366bbea)
devel/cargo-nextest: update to 0.9.118
Added
Nextest now supports user configuration for personal preferences. User config is stored in ~/.config/nextest/config.toml (or %APPDATA%\nextest\config.toml on Windows) and includes the following settings:
show-progress: Controls progress display during test runs.
max-progress-running: Maximum number of running tests to show in the progress bar.
input-handler: Enable or disable keyboard input handling.
output-indent: Enable or disable output indentation for captured test output.
User config settings are lower priority than CLI arguments and environment variables. For details, see User configuration.
Fixed
Fixed an issue where nextest could hang when tests spawn interactive shells (e.g., zsh -ic) that call tcsetpgrp to become the foreground process group. Nextest now ignores SIGTTIN and SIGTTOU signals while input handling is active. (#2884)
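The idea behind this fix can be sketched in a few lines. The snippet below is not nextest's code (nextest is written in Rust); it is a minimal Python illustration, for Unix-like systems only, of ignoring SIGTTIN and SIGTTOU for a limited scope and restoring the previous dispositions afterwards. The helper name `ignore_tty_signals` is hypothetical.

```python
import contextlib
import signal


@contextlib.contextmanager
def ignore_tty_signals():
    """Ignore SIGTTIN/SIGTTOU inside the block, then restore the old handlers.

    Hypothetical helper for illustration; must be called from the main thread.
    """
    previous = {}
    for sig in (signal.SIGTTIN, signal.SIGTTOU):
        # signal.signal returns the handler that was installed before.
        previous[sig] = signal.signal(sig, signal.SIG_IGN)
    try:
        yield
    finally:
        for sig, handler in previous.items():
            # None means the old handler was not installed from Python,
            # so it cannot be reinstalled through this API; leave it alone.
            if handler is not None:
                signal.signal(sig, handler)
```

While the block is active, a background read from or write to the terminal no longer stops the process, which is the condition under which a child that has taken the foreground process group could otherwise leave the parent suspended.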
[OpenMP][MLIR] Hoist static `alloca`s emitted by private `init` regions to the allocation IP of the construct (#171597)
Having more than 1 descriptor (allocatable or array) on the same
`private` clause triggers a runtime crash on GPUs at the moment.
For SPMD kernels, the issue happens because the initialization logic
includes:
* Allocating a number of temporary structs (these are emitted by flang
when `fir` is lowered to `mlir.llvm`).
* There is a conditional branch that determines whether we will allocate
storage for the descriptor and initialize array bounds from the original
descriptor or whether we will initialize the private descriptor to null.
Because of these 2 things, temp allocations needed for descriptors
beyond the 1st one are preceded by branching, which causes the observed
runtime crash.
This PR solves this issue by hoisting these static `alloca`
instructions to the suitable alloca IP of the parent construct.
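As a rough illustration of the transformation described above, the following toy model moves constant-size allocas out of conditionally executed blocks into an entry block. It is only a sketch: the dictionary-based "IR", the field names (`op`, `size`, `name`), and the function `hoist_static_allocas` are all made up for this example and do not correspond to the OMPIRBuilder or MLIR APIs.

```python
def hoist_static_allocas(blocks, entry="entry"):
    """Return a copy of `blocks` with static allocas moved to the entry block.

    `blocks` maps a block name to a list of instruction dicts (toy IR).
    An alloca is treated as static when its size is a Python int.
    """
    hoisted = []
    result = {}
    for name, insts in blocks.items():
        if name == entry:
            result[name] = list(insts)
            continue
        kept = []
        for inst in insts:
            if inst["op"] == "alloca" and isinstance(inst["size"], int):
                hoisted.append(inst)  # static: safe to allocate up front
            else:
                kept.append(inst)  # dynamic allocas and other ops stay put
        result[name] = kept
    # Insert the hoisted allocas just before the entry block's terminator.
    entry_insts = result[entry]
    result[entry] = entry_insts[:-1] + hoisted + entry_insts[-1:]
    return result


example = {
    "entry": [{"op": "alloca", "size": 1, "name": "%a"},
              {"op": "br", "name": "to_init_or_null"}],
    "init": [{"op": "alloca", "size": 1, "name": "%tmp"},
             {"op": "store", "name": "st"},
             {"op": "br", "name": "to_exit"}],
    "null": [{"op": "alloca", "size": "%n", "name": "%dyn"},
             {"op": "br", "name": "to_exit"}],
}
hoisted_example = hoist_static_allocas(example)
```

In this model `%tmp` ends up in `entry` ahead of the branch, so it is no longer preceded by control flow, while the dynamically sized `%dyn` is left where it was.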
One lock to rule them all.
Break the "lock" part of lfs_seglock() into its own function,
lfs_prelock(). Remove the lock flag SEGM_PROT, replacing instances of
lfs_seglock(fs, SEGM_PROT) with lfs_prelock(fs, 0). Reimplement the
fragment lock and cleaner lock to use lfs_prelock().
Avoids an observed deadlock between fragment extension and segment writing.
[InstCombine] Fold redundant FP clamp selects; relax min-max-pattern bailout in visitFCmp (#173452)
visitFCmp() previously bailed out when a following select matched a
clamp pattern. This blocks simplifications when the clamp is provably
redundant.
This PR allows simplification for clamp selects of flavor SPF_FMAXNUM/
SPF_FMINNUM when one arm is a constant and the other is a sitofp/uitofp
of an integer value, and the constant equals the exact min/max of that
integer domain:
* SPF_FMAXNUM (pattern max(X,C)): redundant if C is the minimum integer
mapped exactly to FP (e.g. X = sitofp i8, C = -128.0f).
* SPF_FMINNUM (pattern min(X,C)): redundant if C is the maximum integer
mapped exactly to FP (e.g. X = uitofp i8, C = 255.0f).
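The redundancy claim in the two cases above can be checked exhaustively for 8-bit integers. The snippet below is a sanity check in Python, not the InstCombine implementation; the helper name `clamp_is_redundant` is hypothetical, and Python floats (doubles) stand in for `float`, which is safe here because every i8 value is exactly representable in both.

```python
def clamp_is_redundant(values, bound, clamp):
    """True if clamp(float(v), bound) leaves every converted value unchanged."""
    return all(clamp(float(v), bound) == float(v) for v in values)


signed_i8 = range(-128, 128)   # domain of sitofp i8
unsigned_i8 = range(0, 256)    # domain of uitofp i8

# max(X, -128.0) with X = sitofp i8: the bound is the domain minimum.
max_redundant = clamp_is_redundant(signed_i8, -128.0, max)
# min(X, 255.0) with X = uitofp i8: the bound is the domain maximum.
min_redundant = clamp_is_redundant(unsigned_i8, 255.0, min)
# A bound inside the domain is a real clamp and must not be folded away.
inner_bound_redundant = clamp_is_redundant(signed_i8, -100.0, max)
```

The first two results are true and the third is false, which matches the condition that the constant must equal the exact minimum or maximum of the integer domain.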
This fixes a regression in #173454
---------
Co-authored-by: Copilot <175728472+Copilot at users.noreply.github.com>
Co-authored-by: Yingwei Zheng <dtcxzyw at qq.com>
opencv*: update to 4.13
New Year update for OpenCV 4.x has been released.
Core module:
Modified Input/OutputArray methods to handle 'std::vector' or 'std::vector<std::vector>' in a more accurate way #28242
Made cuda::GpuMatND compatible with InputArray/OutputArray #23913
Forced output type for empty matrices where it's defined in API #27972
Added std::vector length check Input/OutputArray #27817
Added 16-bit LUT and corresponding HAL entrypoint #27890, #27911
Add cv::Mat::copyAt for ROI operation #27318
Extended JSON support in cv::FileStorage: null parsing #27579 and backslash support #27587
Fixed cv::solveCubic numerical instability via coefficient normalization #28117
Fixed tempfile race condition on Windows #28087
Restore parallel framework name on failure attempt #27802
Dropped OPENCV_FOR_OPENMP_DYNAMIC_DISABLE environment variable in favor of standard OMP_DYNAMIC #28122
Enabled fp16 conversions, but disabled NEON FP16 arithmetic on Windows for ARM #27897
Fixed dot product accumulation causing NORM test failures on Windows ARM64 #28211
[181 lines not shown]