[MLIR][Vector] Fix crash in foldDenseElementsAttrDestInsertOp on poison index (#188508)
When a dynamic index of -1 (the kPoisonIndex sentinel) was folded into
the static position of a vector.insert op,
foldDenseElementsAttrDestInsertOp would proceed to call
calculateInsertPosition, which returned -1. The subsequent iterator
arithmetic (allValues.begin() + (-1)) was undefined behaviour, causing
an assertion in DenseElementsAttr::get.
Fix by bailing out early in foldDenseElementsAttrDestInsertOp when any
static position equals kPoisonIndex, consistent with how
InsertChainFullyInitialized already guards this case.
Fixes #188404
Assisted-by: Claude Code
[mlir][reducer] Add eraseRedundantBlocksInRegion and getSuccessorForwardOperands API to BranchOpInterface (#187864)
To simplify the output of the reduction-tree pass, this PR introduces
the eraseRedundantBlocksInRegion. For regions containing multiple
execution paths, this functionality selects the shortest 'interesting'
path. Additionally, this PR adds the getSuccessorForwardOperands API to
BranchOpInterface. This allows us to extract the ForwardOperands for a
specific path chosen from multiple alternatives, enabling the creation
of a cf.br operation for the redirected jump.
[clang][bytecode] Skip rvalue subobject adjustments for global temporaries (#189044)
We only did this for local variables but were were missing it for
globals.
[Driver][HIP] Fix bundled -S emitting bitcode instead of assembly for device (#189140)
[Driver][HIP] Fix bundled -S emitting bitcode instead of assembly for
device
PR #188262 added support for bundling HIP -S output under the new
offload driver, but the device backend still entered the
bitcode-emitting path in ConstructPhaseAction. The condition at the
Backend phase checked for the new offload driver and directed device
code to emit TY_LLVM_BC, without excluding the -S case. This caused
the device section in the bundled .s to contain LLVM bitcode instead
of textual AMDGPU assembly.
This broke the HIP UT CheckCodeObjAttr test which greps
copyKernel.s for "uniform_work_group_size" — a string that only
appears in textual assembly, not in bitcode.
Fix by excluding -S (without -emit-llvm) from the new-driver
bitcode path, so the device backend falls through to emit TY_PP_Asm
[3 lines not shown]
[XeVM] Fix the cache-control metadata string generation. (#187591)
Previously, it generated extra `single` quote marks around the outer
braces (i.e., `'{'` `6442:\220,1\22` `'}'`). SPIR-V backend does not
expect that. It expects `{6442:\220,1\22}`.
AMDGPU: Make VarIndex WeakTrackingVH in AMDGPUPromoteAlloca (#188921)
The test used to look all good, but actually not. The WeakVH just make
itself null after the pointed value being replaced. So a zero value was
used because VarIndex become null. The test checks looks all good.
Actually only the WeakTrackingVH have the ability to be updated to new
value.
Change the test slightly to make that using zero index is wrong.
[Clang] remove redundant uses of dyn_cast (NFC) (#189106)
This removes dyn_cast invocations where the argument is already of the
target type (including through subtyping). This was created by adding a
static assert in dyn_cast and letting an LLM iterate until the code base
compiled. I then went through each example and cleaned it up. This does
not commit the static assert in dyn_cast, because it would prevent a lot
of uses in templated code. To prevent backsliding we should instead add
an LLVM aware version of
https://clang.llvm.org/extra/clang-tidy/checks/readability/redundant-casting.html
(or expand the existing one).
[clang][RISC-V] fixed fp calling convention for fpcc eligible structs for risc-v (#110690)
The code generated for calls with FPCC eligible structs as arguments
doesn't consider the bitfield, which results in a store crossing the
boundary of the memory allocated using alloca, e.g.
For the code:
```
struct __attribute__((packed, aligned(1))) S {
const float f0;
unsigned f1 : 1;
};
unsigned func(struct S arg)
{
return arg.f1;
}
```
The generated IR is:
```
define dso_local signext i32 @func(
[29 lines not shown]
[Fuchsia] Set LIBCXX_ABI_UNSTABLE instead of LIBCXX_ABI_VERSION (#189123)
Use the generic switch rather than encoding the version number it
currently corresponds to.
[libc] Remove more header template files (#189066)
Get rid of several .h.def files which were used to ensure that the
macro definitions from llvm-libc-macro would be included in the public
header. Replace this logic with YAML instead - add entries to the
"macros" list that point to the correct "macro_header" to ensure it
would be included.
For C standard library headers, list several standard-define macros
to document their availability. For POSIX/Linux headers, only reference
a handful of macro, since more planning is needed to decide how to
represent platform-specific macro in YAML.