[profcheck] Propagate profile metadata to Wrapper function in optimize mode of ExpandVariadics. (#168161)
This PR fixes the issue where profile metadata (`!prof`) is dropped from
the `VariadicWrapper` when `ExpandVariadics` runs in
`--expand-variadics-override=optimize` mode.
In optimize mode, the pass splits the original variadic function into
two parts:
- A **VariadicWrapper** (retaining the original name) that handles the
`va_list` setup.
- A **FixedArityReplacement** (new function) that contains the original
core logic.
During this process, the basic blocks and associated metadata are
spliced into the `FixedArityReplacement`. Consequently, the
`VariadicWrapper`—which serves as the entry point for callers—is left
without function entry count metadata.
[4 lines not shown]
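As a source-level analogy of the split described above (a sketch only, not the IR that the pass emits), the wrapper keeps the original name and only sets up the `va_list`, while a second function holds the original body. The names `sum` and `sum_valist` are hypothetical and chosen for illustration.

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical "fixed-arity replacement": holds the original body and takes
// an explicit va_list instead of `...`.
int sum_valist(int count, va_list args) {
  assert(count >= 0 && "count must not be negative");
  int total = 0;
  for (int i = 0; i < count; ++i)
    total += va_arg(args, int);
  return total;
}

// Hypothetical "variadic wrapper": keeps the original name, only sets up the
// va_list, and forwards to the fixed-arity function.
int sum(int count, ...) {
  va_list args;
  va_start(args, count);
  int result = sum_valist(count, args);
  va_end(args);
  return result;
}
```

Callers still enter through `sum`, so a call such as `sum(3, 1, 2, 3)` reaches the original body only via the wrapper.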
[Clang][Sema] Add fortify warnings for strcat (#168965)
Continue adding the fortify warnings that Clang is still missing for
string functions, as part of #142230.
[clang-doc] `<ul>` must be nested in `<li>`
The HTML spec states that only `<li>` can be children of `<ul>`. Nested
`<ul>` tags in an unordered list must be children of `<li>`.
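The nesting rule can be sketched with a small recursive renderer. This is a standalone illustration, not clang-doc's template code; `Item` and `renderList` are hypothetical names, and the sketch does no HTML escaping.

```cpp
#include <string>
#include <vector>

// Hypothetical list entry: a label plus optional nested entries.
struct Item {
  std::string text;
  std::vector<Item> children;
};

// Renders items as an unordered list. A nested list is emitted before the
// parent's closing </li>, so every <ul> child is an <li> and every nested
// <ul> sits inside an <li>.
std::string renderList(const std::vector<Item> &items) {
  if (items.empty())
    return "";
  std::string html = "<ul>";
  for (const Item &item : items) {
    html += "<li>" + item.text;
    html += renderList(item.children);
    html += "</li>";
  }
  html += "</ul>";
  return html;
}
```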
[clang-doc] Fix `</section>` mismatch in the namespace template
A `</section>` tag wasn't inside the `{{#HasRecords}}` Mustache tag, which caused a
mismatch if there weren't any records to render.
AMDGPU: Improve getShuffleCost accuracy for 8- and 16-bit shuffles
These shuffles can always be implemented using v_perm_b32, and so this
rewrites the analysis from the perspective of "how many v_perm_b32s does
it take to assemble each register of the result?"
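The per-register counting idea can be sketched as follows. This is a simplified model under stated assumptions, not the actual `getShuffleCost` code: it handles 8-bit elements only, assumes one permute can combine bytes from at most two 32-bit source registers, and treats a register whose bytes are already in place as free. The function name `countPermsForByteShuffle` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// mask[i] is the source byte index for result byte i, or -1 for "don't care".
// Source register k holds source bytes 4*k .. 4*k+3.
int countPermsForByteShuffle(const std::vector<int> &mask) {
  assert(mask.size() % 4 == 0 && "result must fill whole 32-bit registers");
  int total = 0;
  for (std::size_t reg = 0; reg < mask.size(); reg += 4) {
    std::set<int> sources; // distinct source registers feeding this register
    bool inPlace = true;   // true if every defined byte keeps its lane
    for (int lane = 0; lane < 4; ++lane) {
      int src = mask[reg + lane];
      if (src < 0)
        continue;
      sources.insert(src / 4);
      if (src % 4 != lane)
        inPlace = false;
    }
    if (sources.empty())
      continue; // fully undefined register: assumed free
    if (sources.size() == 1 && inPlace)
      continue; // plain reuse of one source register: assumed free
    // One permute merges up to two registers; each further source register
    // needs one more permute (assumption of this model).
    total += sources.size() == 1 ? 1 : static_cast<int>(sources.size()) - 1;
  }
  return total;
}
```

Under this model, a byte reversal within one register costs one permute, while gathering one byte from each of four registers costs three.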
The test changes in Transforms/SLPVectorizer/reduction.ll are
reasonable: VI (gfx8) has native f16 math, but not packed math.
commit-id:8b76e888
VectorCombine: Improve the insert/extract fold in the narrowing case
Keeping the extracted element in a natural position in the narrowed
vector has two beneficial effects:
1. It makes the narrowing shuffles cheaper (at least on AMDGPU), which
allows the insert/extract fold to trigger.
2. It makes the narrowing shuffles in a chain of extract/insert
compatible, which allows foldLengthChangingShuffles to successfully
recognize a chain that can be folded.
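One possible reading of "natural position" is that the narrowing shuffle selects the aligned chunk of the wide vector that contains the extracted element, so the element lands at `extractIdx % narrowElts`. The sketch below illustrates that reading only; it is an assumption and not the patch's code, and `narrowingMask` and `positionInNarrow` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Builds a shuffle mask that narrows a vector of wideElts elements to
// narrowElts elements by selecting the aligned chunk containing extractIdx.
std::vector<int> narrowingMask(int wideElts, int narrowElts, int extractIdx) {
  assert(narrowElts > 0 && wideElts % narrowElts == 0);
  assert(extractIdx >= 0 && extractIdx < wideElts);
  int base = (extractIdx / narrowElts) * narrowElts; // start of aligned chunk
  std::vector<int> mask(narrowElts);
  for (int i = 0; i < narrowElts; ++i)
    mask[i] = base + i;
  return mask;
}

// Index of the extracted element inside the narrowed vector.
int positionInNarrow(int narrowElts, int extractIdx) {
  assert(narrowElts > 0 && extractIdx >= 0);
  return extractIdx % narrowElts;
}
```

With this layout, extracts of elements 5 and 6 from an 8-element vector narrowed to 4 elements share the mask `{4, 5, 6, 7}`, which is the kind of compatibility the second point describes.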
There are minor X86 test changes that look reasonable to me. The IR
change for AVX2 in llvm/test/Transforms/VectorCombine/X86/extract-insert-poison.ll
doesn't change the assembly generated by `llc -mtriple=x86_64-- -mattr=AVX2`
at all.
commit-id:c151bb04
emulators/virtualbox-ose-{,-nox11}-7{0,1,2}: Make Qt optional for building
Remove the Qt build dependency when NLS and the graphical frontend are
not required.
PR: 291023
Co-authored-by: takahiro.kurosawa at gmail.com
MFH: 2025Q4
(cherry picked from commit 4f1b651ebc7aa3fca1b19f4f45fed00e5a397d57)
[OpenACC][CIR] Handle 'declare' construct local lowering (&link clause) (#168793)
'declare' is a declaration directive, so it can appear at 3 places:
Global/NS scope, class scope, or local scope. This patch implements ONLY
the 'local' scope lowering for 'declare'.
A 'declare' is lowered as a 'declare_enter' and 'declare_exit'
operation, plus data operands like all others. Sema restricts the form
of some of these, but they are otherwise identical.
'declare' DOES require at least 1 clause for the examples to make sense,
so this ALSO implements 'link', which is the simplest one. It is ONLY
attached to the 'declare_enter', and doesn't require any additional work
besides a very small addition to how we handle clauses.
[HLSL] Add Load overload with status (#166449)
This PR adds a Load method for resources, which takes an additional
parameter by reference, status. It fills the status parameter with a 1
or 0, depending on whether or not the resource access was mapped.
CheckAccessFullyMapped is also added as an intrinsic, and called in the
production of this status bit.
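The shape of the new overload can be mocked in plain C++. This is an illustration only, not HLSL and not the generated DXIL: `MockResource`, its members, and the helper `isFullyMapped` are hypothetical, and returning `0.0f` for an unmapped access is an arbitrary choice made by this mock.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a resource whose elements may be unmapped.
struct MockResource {
  std::vector<float> values;
  std::vector<bool> mapped;

  // Mock of the mapped-access check that feeds the status bit.
  bool isFullyMapped(std::size_t index) const {
    return index < mapped.size() && mapped[index];
  }

  // Load with an extra by-reference status: 1 if the access was mapped,
  // 0 otherwise.
  float Load(std::size_t index, unsigned &status) const {
    assert(values.size() == mapped.size());
    bool ok = isFullyMapped(index);
    status = ok ? 1u : 0u;
    return ok ? values[index] : 0.0f;
  }
};
```

A caller passes a `status` variable alongside the index and inspects it after the load to decide whether the returned value can be trusted.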
This only addresses the DXIL side of the following issue:
https://github.com/llvm/llvm-project/issues/138910
It likewise only addresses the DXIL variant of the following issue:
https://github.com/llvm/llvm-project/issues/99204
[clang][Sema][OpenMP] Fix GPU exception target check (#169056)
Looks like I missed this when I added `Triple::isGPU()`.
Signed-off-by: Nick Sarnie <nick.sarnie at intel.com>
[MLIR] Drop use of REQUIRES:shell from tests (#168989)
This patch drops two instances of REQUIRES: shell from MLIR tests. This
feature does not mean much given that the internal shell is the default
for MLIR. It does prevent these tests from running on Windows, but
nothing inherent to these tests seems to prevent them from running on
Windows (except perhaps the lack of spirv-tools, which is explicitly
required anyway).
[CIR] Add NYI cases to builtin switch statement and move existing cases into functions (#168699)
This PR adds a number of cases to the switch statement in
`CIRGenBuiltin.cpp`. Some existing cases were relocated so that the order
matches the order of the switch statement in Clang's codegen.
Additionally, some existing cases were moved into functions to keep the
code a little cleaner. In the future, it will be easier to keep track of
which builtins have not been implemented, since there will always be an
NYI case for each unimplemented builtin.
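The pattern described above can be sketched in a self-contained form. The enum, the helper functions, and the returned strings are all hypothetical and stand in for the real builtin IDs and emission code.

```cpp
#include <string>

// Hypothetical builtin identifiers, for illustration only.
enum class BuiltinID { Abs, Sqrt, Memcpy, Prefetch };

// Implemented builtins are handled in their own helper functions.
static std::string emitAbs() { return "emit abs"; }
static std::string emitSqrt() { return "emit sqrt"; }

std::string emitBuiltin(BuiltinID id) {
  switch (id) {
  case BuiltinID::Abs:
    return emitAbs();
  case BuiltinID::Sqrt:
    return emitSqrt();
  // Unimplemented builtins are listed explicitly, so searching for the NYI
  // cases shows what is still missing.
  case BuiltinID::Memcpy:
  case BuiltinID::Prefetch:
    return "NYI";
  }
  return "unknown builtin";
}
```

Because there is no `default` label, a compiler with switch-coverage warnings enabled can flag a newly added enumerator that has neither an implementation nor an NYI case.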