[X86] Don't form pmaddwd for shift by 15 (#206473)
I believe the transform is not correct for shifts by 15, because the
multiplier gets sign extended, turning a multiply by 2^15 into one by
-2^15.
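
As a rough illustration of the arithmetic (a sketch only, not the actual
DAG combine): pmaddwd multiplies signed 16-bit words, so a multiplier of
2^15 does not fit and wraps to -2^15. The helper names below are
hypothetical, and the pairwise add that pmaddwd also performs is left out.

```python
def sign_extend_i16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed 16-bit integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def shl_as_signed_i16_multiply(x: int, shift: int) -> int:
    """Model 'x << shift' rewritten as a multiply by a signed i16 constant."""
    multiplier = sign_extend_i16(1 << shift)
    return x * multiplier


x = 3
shift14 = shl_as_signed_i16_multiply(x, 14)  # multiplier stays +2^14
shift15 = shl_as_signed_i16_multiply(x, 15)  # multiplier wraps to -2^15

print(shift14 == x << 14)  # the rewrite agrees with the shift
print(shift15 == x << 15)  # the rewrite disagrees with the shift
```

For a shift by 14 the multiply matches the shift; for a shift by 15 the
sign flips, which is the case this change excludes.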
[clang][bytecode] Ignore irrelevant InitializingPtrs in dynamic_cast (#206468)
If the initializing pointer is unrelated to the one we're operating on,
ignore it.
[EarlyCSE] Do not forward memset zero to intrinsic (#206452)
We can do this in principle, but it would require more precise handling.
E.g. for masked.load we'd have to respect the passthru argument for
masked out lanes. I don't think this is worthwhile, so just bail out.
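
To illustrate the passthru concern, here is a simplified model of a
masked load over plain lists (a sketch under that assumption, not the
EarlyCSE code; `masked_load` is a hypothetical helper):

```python
def masked_load(memory, mask, passthru):
    """Per lane: take the memory value if the mask bit is set, else passthru."""
    return [m if keep else p for m, keep, p in zip(memory, mask, passthru)]


memory = [0, 0, 0, 0]               # contents after a memset to zero
mask = [True, False, True, False]   # lanes 1 and 3 are masked out
passthru = [7, 8, 9, 10]

correct = masked_load(memory, mask, passthru)
naive_forward = [0] * len(memory)   # what blindly forwarding zero would give

print(correct)
print(naive_forward == correct)
```

The masked-out lanes keep their passthru values, so replacing the whole
load with zero would be wrong unless the passthru is also zero.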
[clang] use typo-corrected name qualifier for template names (#206180)
This also prevents error-recovery from forming a member specialization
which is not a class member, which leads to crashes-on-invalid.
Fixes #204561
[IR] Add non-const DbgRecord::getInstruction() overload (#206059)
DbgRecord exposes const+non-const overload pairs for getParent(),
getModule() and getContext(), but getInstruction() was const-only and
returned a const Instruction*. Code holding a non-const DbgRecord that
needs a mutable host instruction, for example to use as a DIBuilder
insert position, was therefore forced to const_cast the result.
Add a non-const getInstruction() overload returning a mutable
Instruction*, matching the existing accessor pairs and removing the need
for those casts. The const overload is unchanged.
AI tools were used to write this description.
clang/AMDGPU: Stop passing redundant -target-cpu to cc1
Now that the exact target is encoded in the triple's subarch field,
-target-cpu is redundant. This avoids polluting the resultant IR with
unwanted "target-cpu" attributes. The net result is the desired codegen
when compiling libraries for a major subarch and linking them into a
program compiled for a specific arch. For example, compiling for
"gfx9-generic" would pollute the IR with "target-cpu"="gfx9-generic",
so codegen would ultimately be performed for the generic target even
after linking into the concrete gfx9 cpu. The specialization will now
be achieved by merging the triples, without the linker or optimization
passes needing to fix up function attributes.
clang: Start using new amdgpu subarch triples
Fix up invocations using --target=amdgcn + -mcpu to introduce
the subarch in the triple.
For offload toolchains, a single toolchain is constructed for the
top level amdgpu architecture, and the effective triple is used for
target specific tool invocations.
The specifics of the resource directory layout are TBD. This does
try to find resources in the subarch-named directory, but the paths
are searched at toolchain creation time, so that does not work
when there are multiple subarches.
Fixes #154925
[CallGraph] Collect callables from global variable initializers (#206458)
CallGraph::TraverseStmt is a no-op, so declaration traversal never
descends into a variable initializer. A lambda or block defined in a
global-storage variable's initializer was therefore never added to the
graph. TU-end lifetime safety analysis walks the call graph to find
functions to analyze, so such a lambda was silently skipped and a real
lifetime bug in it went unreported, while the default per-function mode
caught it.
Walk the initializer of every global-storage variable during declaration
visitation and add any callables defined within it. The hasGlobalStorage
guard excludes parameter default arguments, which CGBuilder already
handles at call sites.
Assisted-by: Claude Opus 4.8
gotosocial: Update to 0.22.0
Changes:
- Update Go module dependencies.
Pkglint: passed.
Package built and tested on NetBSD 10.1 amd64.
WARNING:
- Configuration changes and database schema changes.
- Before starting this new release, first adapt the configuration file.
Then do not interrupt the startup, as the database migration will take
some time.
See https://codeberg.org/superseriousbusiness/gotosocial/releases/tag/v0.22.0
for upgrade instructions.
[lldb][windows] fix always falsy comparison (#206179)
This is a follow-up to https://github.com/llvm/llvm-project/pull/206107,
which introduced a comparison that is always false. `os.environ.get`
returns a string, and `('1' == 1) == False` in Python.
Compare against a string and use a string as the default value.
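
A minimal sketch of the pitfall, assuming a made-up variable name
(`EXAMPLE_FLAG` is hypothetical and is not the variable touched by this
change):

```python
import os

os.environ["EXAMPLE_FLAG"] = "1"  # hypothetical variable, set for the demo

# Environment values are strings, so comparing against the int 1 never matches.
buggy = os.environ.get("EXAMPLE_FLAG", 0) == 1

# Comparing str to str, with a str default, behaves as intended.
fixed = os.environ.get("EXAMPLE_FLAG", "0") == "1"

print(buggy, fixed)
```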
[X86] phaddsub-extract.ll - pull out optsize / pgso tests and move to vector-reduce-add-codesize.ll test file (#206492)
We already have PhaseOrdering middle-end tests for the @llvm.vector.reduce.add pattern matching.
NAS-140857 / 27.0.0-BETA.1 / Handle TNC license delivery and token states in heartbeat (#19153)
Read the TNC heartbeat response body so we
can report the system fingerprint and installed license ID, install a
license PEM that TNC delivers, and drive token rotation and the terminal
token states off the body fields instead of the old X-New-Token header.
A delivered license is deduped against the one already installed so we
don't reinstall it every beat, and a 205 that carries no license or
token is logged as a TNC fault rather than silently skipped.
Revert "[Dexter] Add rewriting for aggregate variables (#202800)" (#206495)
This reverts commit 2cf48dca3338951a7fbe83fecc9e6d35caaa9b11.
The original commit intermittently fails in pre-commit CI for Linux
builds, possibly due to some unspecified environmental dependency.
[HLSL] Implement Texture2DArray for HLSL (#203951)
Add support for the Texture2DArray type, builtin argument checking,
codegen, and associated tests.
This change also implements the parts of #194910 which could not be
tested without an HLSL texture array type.
Assisted by Cursor
Fixes #194944
---------
Co-authored-by: Tim Corringham <tcorring at amd.com>
[NFC][LLVM] Minor code cleanup in BitcodeReader (#206105)
Use structured bindings in the range-based for loop that iterates over
upgraded intrinsics. Also, the `UpdatedIntrinsicMap` type alias is used
just once, so eliminate it.