[lldb] Restore the old behavior in lua-typemaps.swig (#169103)
Restore the original behavior (i.e. before #167764), which uses
eOpenOptionWriteOnly, not eOpenOptionReadWrite. Fixes TestLuaAPI.py.
[clang-doc] Add Mustache HTML output to namespace test
This patch adds Mustache HTML tests alongside the legacy HTML backend
for namespace output. This way, we can see exactly where the output
currently differs before replacing the legacy backend.
The same thing will be done for all other tests where the legacy HTML
backend is tested.
[lld] Add (ignored) /link flag to lld-link for compatibility with MSVC link.exe (#168364)
Various build tools may produce command lines invoking clang-cl and
lld-link which contain /link twice, e.g. `clang-cl.exe
sanitycheckcpp.cc /Fesanitycheckcpp.exe .... /link /link ...`
If link.exe is used, it ignores the extra `/link` and just issues a
warning; lld-link, however, tries to treat `/link` as a file name.
This PR adds a flag which is ignored in order to improve compatibility
with link.exe.
There's some extra context including an "in-the-wild" example and
reproducer of the problem here:
https://github.com/frankier/meson_clang_win_activation
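The behavior described above can be sketched as a small filter that drops a
stray `/link` instead of treating it as an input file. This is only an
illustration: `split_linker_args` is a hypothetical helper, lld-link parses
its options through LLVM's option tables rather than a loop like this, and
the case-insensitive match is an assumption.

```python
def split_linker_args(args):
    """Split a command line into ignored flags and everything else.

    Hypothetical helper for illustration only; it is not lld-link code.
    """
    ignored, remaining = [], []
    for arg in args:
        # Assumption: the flag is matched case-insensitively.
        if arg.lower() == "/link":
            ignored.append(arg)
        else:
            remaining.append(arg)
    return ignored, remaining


ignored, remaining = split_linker_args(["/link", "main.obj", "/out:a.exe"])
print(ignored)    # the stray flag, which would otherwise be read as a file name
print(remaining)  # the arguments that still need normal handling
```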
Co-authored-by: Frankie Robertson <frankie at robertson.name>
[SCEVExp] Remove early exit, rely on InstSimplifyFolder (NFCI).
Remove the SCEV-based check refined in
https://github.com/llvm/llvm-project/pull/156910, as InstSimplifyFolder
manages to simplify the generated code to false directly as well.
[clang-tidy][NFC] Enable misc-const-correctness rule in clang-tidy codebase (#167172)
After successful `misc-const-correctness` cleanup (last patch
https://github.com/llvm/llvm-project/pull/167131), we can enable
`misc-const-correctness` rule for the whole project.
During the cleanup, I didn't encounter any false positives, so it's
reasonable to expect few false positives in the future.
[ASTMatchers] Make isExpandedFromMacro accept llvm::StringRef (#167060)
We can use a non-owning `StringRef` for the `MacroName` parameter to avoid
an unnecessary copy, because `MacroName` is only used as an argument to
`internal::getExpansionLocOfMacro`, which already accepts a `StringRef`.
[unroll-and-jam] Document dependencies_multidims.ll and fix loop bounds (NFC) (#156578)
Add detailed comments explaining why each function should/shouldn't be
unroll-and-jammed based on memory access patterns and dependencies.
Fix loop bounds to ensure array accesses are within array bounds:
* sub_sub_less: j starts from 1 (not 0) to ensure j-1 >= 0
* sub_sub_less_3d: k starts from 1 (not 0) to ensure k-1 >= 0
* sub_sub_outer_scalar: j starts from 1 (not 0) to ensure j-1 >= 0
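The reason for the new lower bounds can be checked with a short sketch. The
helper name, the loop upper bound and the array size below are assumptions
chosen for illustration; they are not taken from the test file.

```python
def out_of_range_indices(start, stop, offset, size):
    """Return every index j + offset, for j in [start, stop), outside [0, size)."""
    return [j + offset for j in range(start, stop) if not 0 <= j + offset < size]


SIZE = 8  # hypothetical array length

# A loop starting at j = 0 that reads element j - 1 touches index -1.
print(out_of_range_indices(0, SIZE, -1, SIZE))
# Starting at j = 1 keeps every j - 1 access inside the array.
print(out_of_range_indices(1, SIZE, -1, SIZE))
```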
[clang-doc] `<ul>` must be nested in `<li>` (#168972)
The HTML spec states that only `<li>` can be children of `<ul>`. Nested
`<ul>` tags in an unordered list must be children of `<li>`.
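A minimal sketch of how the nesting rule could be checked, using Python's
standard `html.parser`. The class and function names are hypothetical and
not part of clang-doc, and the check is simplified: it ignores the
script-supporting elements that the spec also permits inside `<ul>`.

```python
from html.parser import HTMLParser


class ListNestingChecker(HTMLParser):
    """Records every element that sits directly inside <ul> but is not <li>."""

    def __init__(self):
        super().__init__()
        self.open_tags = []   # stack of currently open elements
        self.violations = []  # (parent, child) pairs that break the rule

    def handle_starttag(self, tag, attrs):
        if self.open_tags and self.open_tags[-1] == "ul" and tag != "li":
            self.violations.append(("ul", tag))
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; stray close tags are ignored.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass


def find_list_violations(html):
    checker = ListNestingChecker()
    checker.feed(html)
    checker.close()
    return checker.violations


bad = "<ul><li>a</li><ul><li>b</li></ul></ul>"
good = "<ul><li>a<ul><li>b</li></ul></li></ul>"
print(find_list_violations(bad))   # the inner <ul> is a direct child of <ul>
print(find_list_violations(good))  # the inner <ul> is wrapped in an <li>
```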
[AMDGPU] Implement CFI for CSR spills
Introduce new SPILL pseudos so that CFI is generated only for CSR
spills and the information is accurate at the ISA-instruction level.
Other targets either generate slightly incorrect information or rely on
conventions for how spills are placed within the entry block. The
approach in this change produces larger unwind tables, with the
increased size being spent on additional DW_CFA_advance_location
instructions needed to describe the unwinding accurately.
Co-authored-by: Scott Linder <scott.linder at amd.com>
Co-authored-by: Venkata Ramanaiah Nalamothu <VenkataRamanaiah.Nalamothu at amd.com>
[OpenACC][CIR] deviceptr clause lowering for local 'declare' (#169085)
This is very similar to the 'link' that was done in the last patch,
except this works on all storage, but only on pointers. This also shows
a bit more of how the enter/exit pairs work in the test.
The implementation itself is very simple: it just handles the clause
properly in the clause handler.
AMDGPU: Improve getShuffleCost accuracy for 8- and 16-bit shuffles (#168818)
These shuffles can always be implemented using v_perm_b32, and so this
rewrites the analysis from the perspective of "how many v_perm_b32s does
it take to assemble each register of the result?"
The test changes in Transforms/SLPVectorizer/reduction.ll are
reasonable: VI (gfx8) has native f16 math, but not packed math.
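The "how many permutes per result register" view can be illustrated with a
toy model for 8-bit elements. This is not LLVM's getShuffleCost. It assumes
a permute that picks four result bytes from at most two 32-bit source
registers, a mask without undefined entries, and that an unmodified copy of
a source register is free. The function name is hypothetical.

```python
def count_byte_perms(mask, bytes_per_reg=4):
    """Estimate the two-source byte permutes needed to build a shuffle result.

    mask[i] is the index of the source byte placed in result byte i, where
    source bytes are numbered across the concatenated shuffle operands.
    Simplified model: combining k distinct source registers into one result
    register takes max(1, k - 1) permutes, because each permute reads at
    most two registers.
    """
    total = 0
    for base in range(0, len(mask), bytes_per_reg):
        chunk = list(mask[base:base + bytes_per_reg])
        first = chunk[0]
        # A register-aligned run of consecutive bytes is just a copy.
        is_plain_copy = (
            first % bytes_per_reg == 0
            and chunk == list(range(first, first + len(chunk)))
        )
        if is_plain_copy:
            continue
        source_regs = {idx // bytes_per_reg for idx in chunk}
        total += max(1, len(source_regs) - 1)
    return total


print(count_byte_perms([0, 4, 1, 5]))   # interleave bytes of two registers
print(count_byte_perms([0, 4, 8, 12]))  # one byte from each of four registers
```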
[clang-doc] Fix `</section>` mismatch in the namespace template (#168966)
A `</section>` tag wasn't inside the `{{#HasRecords}}` Mustache tag, which caused a
mismatch if there weren't any records to render.
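The mismatch can be reproduced with a tiny stand-in for Mustache sections.
The renderer below only understands `{{#Name}}...{{/Name}}` with boolean
flags, and both template strings are hypothetical simplifications, not the
actual clang-doc namespace template.

```python
import re

SECTION = re.compile(r"\{\{#(\w+)\}\}(.*?)\{\{/\1\}\}", re.DOTALL)


def render_sections(template, context):
    """Keep a section body when its flag is truthy, drop it otherwise.

    Deliberately minimal: no variables, lists, or nested sections.
    """
    return SECTION.sub(
        lambda m: m.group(2) if context.get(m.group(1)) else "", template
    )


def is_balanced(html, tag="section"):
    """Crude check: the number of open tags equals the number of close tags."""
    return html.count(f"<{tag}>") == html.count(f"</{tag}>")


# Closing tag outside the Mustache section (the reported problem).
BROKEN = "{{#HasRecords}}<section><h2>Records</h2>{{/HasRecords}}</section>"
# Closing tag moved inside the Mustache section.
FIXED = "{{#HasRecords}}<section><h2>Records</h2></section>{{/HasRecords}}"

for name, template in (("broken", BROKEN), ("fixed", FIXED)):
    html = render_sections(template, {"HasRecords": False})
    print(name, repr(html), is_balanced(html))
```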
[profcheck] Propagate profile metadata to Wrapper function in optimize mode of ExpandVariadic. (#168161)
This PR fixes the issue where profile metadata (`!prof`) is dropped from
the `VariadicWrapper` when `ExpandVariadics` runs in
`--expand-variadics-override=optimize` mode.
In optimize mode, the pass splits the original variadic function into
two parts:
- A **VariadicWrapper** (retaining the original name) that handles the
`va_list` setup.
- A **FixedArityReplacement** (new function) that contains the original
core logic.
During this process, the basic blocks and associated metadata are
spliced into the `FixedArityReplacement`. Consequently, the
`VariadicWrapper`—which serves as the entry point for callers—is left
without function entry count metadata.
[4 lines not shown]
[Clang][Sema] Add fortify warnings for strcat (#168965)
Continue adding the fortify warnings that Clang is missing for string
functions, as part of #142230.
AMDGPU: Improve getShuffleCost accuracy for 8- and 16-bit shuffles
These shuffles can always be implemented using v_perm_b32, and so this
rewrites the analysis from the perspective of "how many v_perm_b32s does
it take to assemble each register of the result?"
The test changes in Transforms/SLPVectorizer/reduction.ll are
reasonable: VI (gfx8) has native f16 math, but not packed math.
commit-id:8b76e888
VectorCombine: Improve the insert/extract fold in the narrowing case
Keeping the extracted element in a natural position in the narrowed
vector has two beneficial effects:
1. It makes the narrowing shuffles cheaper (at least on AMDGPU), which
allows the insert/extract fold to trigger.
2. It makes the narrowing shuffles in a chain of extract/insert
compatible, which allows foldLengthChangingShuffles to successfully
recognize a chain that can be folded.
There are minor X86 test changes that look reasonable to me. The IR
change for AVX2 in llvm/test/Transforms/VectorCombine/X86/extract-insert-poison.ll
doesn't change the assembly generated by `llc -mtriple=x86_64-- -mattr=AVX2`
at all.
commit-id:c151bb04