[LLVM] remove redundant uses of dyn_cast (NFC) (#189105)
This removes dyn_cast invocations where the argument is already of the
target type (including through subtyping). This was created by adding a
static assert in dyn_cast and letting an LLM iterate until the code base
compiled. I then went through each example and cleaned it up. This does
not commit the static assert in dyn_cast, because it would prevent a lot
of uses in templated code. To prevent backsliding we should instead add
an LLVM aware version of
https://clang.llvm.org/extra/clang-tidy/checks/readability/redundant-casting.html
(or expand the existing one).
[msan] Add MSan instrumentation support for Hexagon (#189122)
Add MemorySanitizer instrumentation pass support for Hexagon Linux. This
is the codegen/instrumentation side; the compiler-rt runtime changes are
in a separate patch.
The shadow memory layout uses XOR-based mapping with XorMask=0x20000000
and OriginBase=0x50000000, designed to fit within the 32-bit address
space.
VarArg handling uses VarArgGenericHelper with VAListTagSize=12, matching
the Hexagon ABI where va_list is a three-pointer struct {
current_reg_area, reg_area_end, overflow_area }.
[compiler-rt][msan] Guard shmat interceptor w SANITIZER_INTERCEPT_SHMCTL (#189198)
The shmat interceptor calls REAL(shmctl), but shmctl is not intercepted
on all targets (e.g. 32-bit Linux with musl). Guard shmat behind
SANITIZER_INTERCEPT_SHMCTL and use a MSAN_MAYBE_INTERCEPT pattern
consistent with other conditional interceptors.
[compiler-rt][msan] Add MSan support for Hexagon (Linux) (#189124)
Add the runtime infrastructure for MemorySanitizer on Hexagon Linux.
Hexagon is 32-bit, so the shadow memory layout uses a compact XOR-based
mapping that fits within the lower 3GB of address space:
0x00000000 - 0x10000000 APP-1 (256MB, program text/data/heap)
0x10000000 - 0x20000000 ALLOCATOR (256MB)
0x20000000 - 0x40000000 SHADOW-1 (512MB, covers APP-1 + ALLOCATOR)
0x40000000 - 0x50000000 APP-2 (256MB, shared libs + stack)
0x60000000 - 0x70000000 SHADOW-2 (256MB, covers APP-2)
0x70000000 - 0x90000000 ORIGIN-1 (512MB)
0xB0000000 - 0xC0000000 ORIGIN-2 (256MB)
MEM_TO_SHADOW uses XOR 0x20000000, and SHADOW_TO_ORIGIN adds 0x50000000.
The dual-APP layout accommodates QEMU user-mode, which places shared
libraries and the stack at 0x40000000.
The allocator uses SizeClassAllocator32 with a 256MB region at
0x10000000, and kMaxAllowedMallocSize is set to 1GB consistent with
other 32-bit targets.
clang: Avoid intermediate DenseSet of triples (#189263)
This was computing a DenseSet<StringRef> of triples, but the
only use was to insert all the entries into a multiset. Just
use the multiset in the first place.
clang: Store Triple in multiset
Previously this was storing StringRefs, which just happen
to be constant allocated strings. Change this into an owning
reference in the form that will actually be used. This will allow
changing the triples to something computed without maintaining
a table of every possible permutation.
clang: Simplify OpenMP triple adjustment
Previously this would find a list of offloading triples,
then later fill in the unknown components specifically for
OpenMP after the fact. Start normalizing the triples upfront,
before inserting into the set. Also stop special casing OpenMP
since there's no apparent reason to treat it differently from
other offload languages.
Also operate on the Triple rather than the string, and handle
the unset OS and environment separately.
clang: Avoid intermediate DenseSet of triples
This was computing a DenseSet<StringRef> of triples, but the
only use was to insert all the entries into a multiset. Just
use the multiset in the first place.
[LV] Fix offset handling for epilogue resume values. (NFCI) (#189259)
Instead of replacing all uses of the canonical IV with an add of the
resume value and then relying on the fold to simplify, directly create
offset versions of both the canonical IV and its increment.
The original offset computation were incorrect, but not resulted in
mis-compiles due to the corresponding fold.
Split off from approved
https://github.com/llvm/llvm-project/pull/156262.
clang: Simplify OpenMP triple adjustment
Previously this would find a list of offloading triples,
then later fill in the unknown components specifically for
OpenMP after the fact. Start normalizing the triples upfront,
before inserting into the set. Also stop special casing OpenMP
since there's no apparent reason to treat it differently from
other offload languages.
Also operate on the Triple rather than the string, and handle
the unset OS and environment separately.
clang: Avoid intermediate DenseSet of triples
This was computing a DenseSet<StringRef> of triples, but the
only use was to insert all the entries into a multiset. Just
use the multiset in the first place.
clang: Store Triple in multiset
Previously this was storing StringRefs, which just happen
to be constant allocated strings. Change this into an owning
reference in the form that will actually be used. This will allow
changing the triples to something computed without maintaining
a table of every possible permutation.
[NFC][AArch64] update tests for `is_fpclass` (#187336)
Hopefully this is better.
One wrinkle is that `@llvm.is.fpclass.bf16` is not currently implemented
for GI. That might be easy to add but I've not been able to figure out
where the issue is exactly so far.
I'm also not totally sure `-mattr=-fp-armv8` is equivalent to softfloat,
but some tests do suggest that they are equivalent (and looking at the
assembly, that seems right).
[DA] Fix overflow of calculation in weakCrossingSIVtest
This patch fixes a correctness issue where integer overflow in the
upper bound calculation of weakCrossingSIVtest caused the pass to
incorrectly prove independence.
The previous logic used `SCEV::getMulExpr` to calculate
`2 * ConstCoeff * UpperBound` and compared it to `Delta` using
`isKnownPredicate`. In the presence of overflow, this could yield
unsafe results.
This change replaces the SCEV arithmetic with `ConstantRange` and
its operation (`smul_fast`). If the calculation overflows,
`intersectWith(MLRange).isEmptySet()` would be false, ensures we
conservatively assume a dependence if the bounds cannot be proven
safe.
Signed-off-by: Ruoyu Qiu <cabbaken at outlook.com>
[MLIR][LLVMIR] Allow llvm.call and llvm.invoke to use llvm.mlir.alias as callee (#189154)
Previously, the verifier for `llvm.call` and `llvm.invoke` would reject
calls where the callee was an `llvm.mlir.alias`, reporting that the
symbol does not reference a valid LLVM function or IFunc. Similarly, the
MLIR-to-LLVM-IR translation had no handling for aliases as callees.
This patch extends both the verifier and the translation to accept
`llvm.mlir.alias` as a valid callee for `llvm.call` and `llvm.invoke`,
mirroring the existing support for `llvm.mlir.ifunc`. The function type
for alias calls is derived from the call operands and result types, and
the translation emits a call through the alias global value.
Fixes #147057
Assisted-by: Claude Code