[flang] Add setMemcpyAlignmentArgAttrs and use it for box-load memcpy (#185126)
Introduce `setMemcpyAlignmentArgAttrs` to set the LLVM alignment operand
attributes on a memcpy. The function is used when lowering `fir.load` of a box to
LLVM memcpy so the generated memcpy has correct align attributes on dst and src,
improving codegen and downstream optimizations.
[MLIR][LLVMIR] Fix inline byval alloca hoisting out of allocation scope (#185399)
This PR fixes a bug in the LLVM dialect's function inlining, which
materializes allocas for byval arguments and hoists them to the entry
block as an optimization. The hoisting must not cross allocation
scopes; otherwise it can introduce data races in parallel regions. See
the example below.
```mlir
// Optimize with mlir-opt --pass-pipeline="builtin.module(inline)" to trigger the inlining bug.
module {
// runner calls kernel(thread_idx, byval_struct, output)
// After inlining, the alloca for the byval struct gets placed in the outer
// region (shared by all threads) instead of inside the scf.forall body.
// This is a bug: each thread should have its own copy, but they all share
// the same alloca.
llvm.func hidden @runner(%arg0: !llvm.ptr {llvm.byval = !llvm.struct<(f32)>}, %arg1: !llvm.ptr) {
%c0 = arith.constant 0 : index
[25 lines not shown]
Reapply "[clang][ssaf] Add --ssaf-extract-summaries= and --ssaf-tu-summary-file= options" (#185414)
Reapplies #185391, and links `clangSema` to `clangAnalysisScalable` for
the missing `clang::SemaConsumer::anchor()` symbol from
`TUSummaryExtractorFrontendAction.cpp`.
The missing symbol was not a problem in static builds, but it breaks
shared-library builds.
AMDGPU: Annotate group size ABI loads with range metadata
We previously did the same for the grid size when annotated.
The group size is easier, so it's weird that this wasn't implemented
first.
Merge commit 81b20e110b3f from llvm git (by Roland McGrath):
[libc++] Work around new GCC 15 type_traits builtins that can't be
used as Clang's can (#137871)
GCC 15 has added builtins for various C++ type traits that Clang
already had. Since `__has_builtin(...)` now finds these, the #if
branches previously only used for Clang are now used for GCC 15.
However, GCC 15 requires that these builtins only be used in type
aliases, not in template aliases.
For now, just don't use the `__has_builtin(...)` branches under newer
GCC versions, so both 14 and 15 work during the transition. This
can be cleaned up later to use all the GCC 15 builtins available.
Fixed: #137704
Fixed: #117319
Reviewed by: dim
[4 lines not shown]
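The guard described above can be sketched roughly as follows. This is an illustration only, not the actual libc++ change: the `SKETCH_IS_GCC` macro and the `sketch_remove_pointer_t` alias are hypothetical names, and `__remove_pointer` is picked here merely as one example of a type-trait builtin.

```cpp
#include <type_traits>

// __has_builtin is itself an extension; treat it as "no builtins" if absent.
#ifndef __has_builtin
#  define __has_builtin(x) 0
#endif

// Hypothetical helper macro: true for GCC proper (Clang also defines __GNUC__).
#if defined(__GNUC__) && !defined(__clang__)
#  define SKETCH_IS_GCC 1
#else
#  define SKETCH_IS_GCC 0
#endif

// Take the builtin branch only when the builtin exists AND the compiler is
// not GCC, so GCC 14 and GCC 15 both fall through to the portable version.
#if __has_builtin(__remove_pointer) && !SKETCH_IS_GCC
template <class T>
using sketch_remove_pointer_t = __remove_pointer(T);
#else
template <class T>
using sketch_remove_pointer_t = typename std::remove_pointer<T>::type;
#endif

// Both branches are expected to agree on the results.
static_assert(std::is_same<sketch_remove_pointer_t<int*>, int>::value, "");
static_assert(std::is_same<sketch_remove_pointer_t<int>, int>::value, "");
```

Keying the check on the compiler rather than on a specific GCC version keeps the sketch short; a real header might prefer a version check so the builtins can be re-enabled selectively later.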
Merge commit 81b20e110b3f from llvm git (by Roland McGrath):
[libc++] Work around new GCC 15 type_traits builtins that can't be
used as Clang's can (#137871)
GCC 15 has added builtins for various C++ type traits that Clang
already had. Since `__has_builtin(...)` now finds these, the #if
branches previously only used for Clang are now used for GCC 15.
However, GCC 15 requires that these builtins only be used in type
aliases, not in template aliases.
For now, just don't use the `__has_builtin(...)` branches under newer
GCC versions, so both 14 and 15 work during the transition. This
can be cleaned up later to use all the GCC 15 builtins available.
Fixed: #137704
Fixed: #117319
Reviewed by: dim
[3 lines not shown]
We've never seen this panic where *_fast_ipi() fails because a cpu isn't
responding. I don't think we can see the panic -- I think we are so low
that panic code will misbehave and more likely we see a hang.
It is easier to accept this impossible failure, decrement the counter, and
carry on.
[BOLT] Fix test with -DCLANG_DEFAULT_PIE_ON_LINUX=OFF (#185047)
Use `%cxxflags` so that `-fPIE -pie` are passed and the test behaves the
same regardless of the CMake configuration. We do the same in many other
BOLT tests.
use ZFS object counts to estimate % complete
This commit switches our filesystem permissions-related API
endpoints to calculate the percentage complete for the task
based on object counters that libzfs provides. This is
somewhat imperfect, but gets us in the ballpark of a reasonable
number at a very low cost (much lower than pre-scanning).
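A minimal sketch of the estimate, assuming the task can read two counters: the number of objects processed so far and the total number of objects in the dataset. The function name and both parameters are hypothetical; the commit does not say which libzfs counters are used or how edge cases are handled.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: objects_done and objects_total stand in for whatever
// object counters the caller obtains from libzfs.
constexpr int estimate_percent_complete(std::uint64_t objects_done,
                                        std::uint64_t objects_total) {
    // Assumption: with no total available, report 0 rather than guess.
    if (objects_total == 0) return 0;
    // The dataset can change while the task runs, so clamp at 100.
    if (objects_done >= objects_total) return 100;
    // For extremely large counts, scale the denominator instead of the
    // numerator so that objects_done * 100 cannot overflow.
    if (objects_done > std::numeric_limits<std::uint64_t>::max() / 100)
        return static_cast<int>(objects_done / (objects_total / 100));
    return static_cast<int>(objects_done * 100 / objects_total);
}
```

For example, `estimate_percent_complete(50, 200)` yields 25. The result is only as accurate as the counters: objects created or destroyed during the task shift the estimate, which matches the "somewhat imperfect" caveat above.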