[ThinLTO] Remove unused relative block frequency support (#177215)
This removes most of the handling of the relative block frequency
support added in 2018 in c73cec84c99e5a63dca961fef67998a677c53a3c, which
was disabled by default and never utilized in the thin link as expected.
Support for reading old Bitcode containing the record is maintained as
required for backwards compatibility, as is the support for
parsing old LLVM assembly containing that information. Tests ensure that
this backwards compatibility is maintained.
This came up in the context of redundant BFI/DT computations which
existed largely for the purpose of computing this information
and are being addressed in PR176646.
[NFC][LTO] Move isPreservedName out of IRSymtab into LTO's Symbol as isLibcall (#177046)
This resolves the FIXME in IRSymtab and cleans up the semantics of the
IRSymtab. The list of preserved symbols really shouldn't be seen as a
property of the IR symbol table, since it's an LTO-specific concern, and
it's very tenuous to claim that this information is actually present in
the bitcode file to be exposed through its symbol table.
Instead, this PR moves this logic into LTO's view of the symbol, which
allows consumers to determine preserved-ness themselves. This was broken
out of #164916; this prevents that PR from introducing a circular
dependency, but it still seems like an independently good idea by virtue
of the above.
[clang-doc] Add a Mustache Markdown generator
Adds a Markdown generator that uses Mustache templates. This patch adds
the templates themselves and implements changes to the JSONGenerator to
allow for the creation of specific files needed by the MD tests like
`all-files.json`.
This backend should be considered experimental. It satisfies all the
same tests that the current MD backend is tested against, but those
don't seem to provide full coverage for all functionality inside that
backend. It also doesn't output everything provided by JSON. It doesn't
use the MD unittests because the Mustache templates must currently be
written to files.
[AMDGPU] Improve codegen for uniform f16<-->i32 conversions (#176833)
This patch improves codegen by chaining scalar operations for uniform
f16<-->i32 conversions where hardware supports the specific SALU
operations.
Added patterns in SOPInstructions.td to synthesize f16<-->i32
conversions via intermediate f32 (f16-->f32-->i32 and i32-->f32-->f16).
[flang][cuda] Remove option allocationConversion from pass (#177037)
The pass option was meant to be used during migration. This is not
needed anymore.
Resolve all the typos people found (thanks everyone!)
Co-authored-by: Alan Li <me at alanli.org>
Co-authored-by: Jakub Kuderski <jakub at nod-labs.com>
Co-authored-by: Maksim Levental <maksim.levental at gmail.com>
[HLSL] Improve HLSL resource method generation (#176806)
Refactor how HLSL resource methods are constructed in
HLSLBuiltinTypeDeclBuilder to be more robust and semantically correct.
- Switch to using Sema::BuildCallExpr and Sema::BuildCStyleCastExpr for
building builtin calls, ensuring proper type checking and AST
structure. This fixes issues with non-template resources like
SamplerState where AST errors aren't automatically resolved during
instantiation.
- Treat parameter placeholders as LValues in convertPlaceholder. This is
required for builtins with 'out' parameters (e.g., GetDimensions) now
that proper type checking via BuildCallExpr is performed.
- Fix a bug in CreateFromBinding methods where the counter handle was
assigned an incorrect handle type.
- Add assertions to ensure the correct field is accessed for handles,
preventing errors when implementing methods like Texture2D.Sample.
- Update AST tests to reflect changes in expression value categories
(VK_LValue) and the introduction of CStyleCastExpr.
[CIR][NFC] Update out-of-sync OGCG checks in test CIRGen/builtin_bit (#177189)
This patch updates various out-of-sync OGCG checks in the test file
`clang/test/CIR/CIRGen/builtin_bit.cpp`.
These checks are all related to the original clang CodeGen for the
bitwise rotate builtins. The OGCG patch #160259 inserts a new `urem`
instruction before calling the `llvm.fshr.*` intrinsic, which reduces
the rotate amount modulo the input's bit width. This breaks our OGCG
checks.
I have not yet dug deep enough into the rationale behind the OGCG patch.
The LLVM intrinsic `llvm.fshr.*` should already reduce the shift amount
modulo the bit width, so the new `urem` instruction seems redundant in
terms of semantic correctness. I have therefore chosen not to update the
relevant CIRGen code to match the OGCG behavior in this patch.
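
The redundancy argument can be illustrated with a small model. This is only
a sketch, assuming the funnel-shift semantics in which the shift amount is
taken modulo the bit width; `fshr32`, `rotrDirect`, and `rotrWithUrem` are
hypothetical helper names, not code from CIRGen or OGCG.

```cpp
#include <cassert>
#include <cstdint>

// Model of a 32-bit funnel shift right: concatenate hi:lo, shift right by
// (amt mod 32), and keep the low 32 bits. Hypothetical helper for illustration.
constexpr uint32_t fshr32(uint32_t hi, uint32_t lo, uint32_t amt) {
  const uint32_t s = amt % 32u;
  if (s == 0)
    return lo; // avoid an out-of-range shift by 32
  return (lo >> s) | (hi << (32u - s));
}

// Rotate right expressed directly as a funnel shift of x with itself.
constexpr uint32_t rotrDirect(uint32_t x, uint32_t amt) {
  return fshr32(x, x, amt);
}

// Rotate right with an explicit remainder applied first, mirroring the
// extra `urem` described above.
constexpr uint32_t rotrWithUrem(uint32_t x, uint32_t amt) {
  return fshr32(x, x, amt % 32u);
}

static_assert(rotrDirect(0x80000001u, 1) == 0xC0000000u, "rotate by one");
static_assert(rotrDirect(0x12345678u, 36) == rotrWithUrem(0x12345678u, 36),
              "pre-reducing the amount does not change the result");
```

Under this model both forms agree for every shift amount, which is the sense
in which the extra remainder looks redundant.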
[AMDGPU] Latency calculation must be independent of meta insts (#177052)
Debug and other meta instructions in bundles must not affect latency
calculation.
Ensure that code compiled with and without debug instructions is
identical.
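
The requirement can be sketched with a toy model. This is not the AMDGPU
implementation: `ToyInst` and `bundleDefLatency` are hypothetical, and the
rule "each later real instruction in the bundle hides one cycle" is an
assumption made only for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for an instruction inside a bundle.
struct ToyInst {
  bool isMeta;      // debug or other meta instruction, emits no machine code
  bool definesReg;  // defines the register whose latency we care about
  unsigned latency; // latency of the definition, in cycles
};

// Remaining latency of the register definition at the end of the bundle.
// Meta instructions are skipped so they can never change the result.
unsigned bundleDefLatency(const std::vector<ToyInst> &bundle) {
  unsigned lat = 0;
  for (const ToyInst &inst : bundle) {
    if (inst.isMeta)
      continue; // must not affect the calculation
    if (inst.definesReg)
      lat = inst.latency;
    else if (lat > 0)
      --lat; // a later real instruction hides one cycle (toy assumption)
  }
  return lat;
}
```

With this shape, a bundle with a debug entry inserted between two real
instructions yields the same latency as the bundle without it.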
---------
Signed-off-by: John Lu <John.Lu at amd.com>
[ProfCheck] Exclude Transforms/InstCombine/load-addrspacecast-select.ll
This was added recently and needs to be fixed, but for now exclude it so
that the bot can return to green and we can better catch future
regressions.
[mlir][SCF] Improve `ForOp::getSuccessorRegions` (#177116)
- Loops with 0 iterations always branch back to the parent.
- Loops with 1 iteration always branch into the loop, then immediately
back to the parent.
This change improves the quality of data flow analyses (e.g., dead code
analysis). It is also in preparation for adding a generic region inlining
canonicalization pattern for `RegionBranchOpInterface` ops (#176641).
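
The two rules above can be sketched as a small decision function. This is a
standalone model, not the MLIR API: `Successors`, `fromEntry`, and `fromBody`
are hypothetical names, and treating every other trip count conservatively
(both successors possible) is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Which regions control flow may reach next (hypothetical model type).
struct Successors {
  bool parent; // may branch back to the parent op
  bool body;   // may branch into the loop body
};

// Successors when entering the loop op, given a constant trip count if known.
Successors fromEntry(std::optional<int64_t> tripCount) {
  if (tripCount && *tripCount == 0)
    return {true, false}; // zero iterations: straight back to the parent
  if (tripCount && *tripCount == 1)
    return {false, true}; // one iteration: always enter the body
  return {true, true};    // assumption: stay conservative otherwise
}

// Successors when leaving the loop body.
Successors fromBody(std::optional<int64_t> tripCount) {
  if (tripCount && *tripCount == 1)
    return {true, false}; // one iteration: immediately back to the parent
  return {true, true};    // assumption: stay conservative otherwise
}
```

A data flow analysis that consumes these answers can mark the body dead for a
zero-trip loop and skip the back edge for a single-trip loop.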
---------
Co-authored-by: Jakub Kuderski <jakub at nod-labs.com>
[HLSL] Add wave active ballot to set of wave ops that set waveops shader flag (#177043)
This PR simply adds wave active ballot to the set of wave ops that
switch on the waveops shader flag.
[NFC][DirectX] Clean-up `llvm-objcopy` to be consistent across implementation details (#177006)
This change ensures that the various `llvm-objcopy` args are
implemented with consistent patterns.
This is intended to help have a clear and consistent point of reference
for new contributors to extend `llvm-objcopy`.
These changes are largely to propagate the review comments of
https://github.com/llvm/llvm-project/pull/159999 back onto the changes
introduced before it.
[msan] Handle NEON dot product intrinsics (#176084)
Propagate shadow by reusing the existing `handleVectorPmaddIntrinsic()`
(used for analogous x86 instructions; renamed to
`handleVectorDotProductIntrinsic()`), instead of handling these
intrinsics strictly.
[TableGen] Gracefully error out in ParseTreePattern when DAG has zero operands so that llvm-tblgen doesn't crash (#161417)
Also handle the case when Pat->Child(i) is null in
CodeGenDAGPatterns::FindPatternInputsAndOutputs().
Fixes issue #157619: TableGen asserts on invalid cast
[OpenMP][NFC] Use `uinc` atomic builtins for this operation (#177207)
Summary:
We support this now. It is a 1-to-1 equivalent and simply saves us
from having to do it ourselves.
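
For context, "doing it ourselves" can look like the compare-exchange loop
below. This is a sketch in portable C++, assuming the wrapping-increment
semantics of `uinc` (store 0 when the old value is at or above the limit,
otherwise old + 1); `atomicIncWrap` is a hypothetical name and is not the
OpenMP runtime's code.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hand-rolled wrapping increment: returns the previous value and stores
// 0 if old >= limit, else old + 1. Hypothetical helper for illustration.
inline uint32_t atomicIncWrap(std::atomic<uint32_t> &value, uint32_t limit) {
  uint32_t old = value.load(std::memory_order_relaxed);
  // On failure, compare_exchange_weak reloads `old`, and the desired value
  // is recomputed on the next pass through the loop condition.
  while (!value.compare_exchange_weak(old, old >= limit ? 0u : old + 1u,
                                      std::memory_order_seq_cst,
                                      std::memory_order_relaxed)) {
  }
  return old;
}
```

A single atomic builtin with the same semantics replaces the whole loop.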
[lldb] Skip TestDAP_launch_io.py tests on asan builds (#177198)
Two of the three test classes in TestDAP_launch_io.py have been failing
on ASAN builds ever since the file was added to the repo. The ASAN failure
is not easy to debug, so skip these tests until we fix it.