Adds a simple folder for size-limited `aten` tensor operations. This is
primarily useful for folding shape computations, which unfortunately can
use `aten` operators. Add, sub, and mul are common examples of these
folders.
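As a rough illustration of the idea (not the actual torch-mlir folder,
which is written in C++), the sketch below folds add, sub, and mul over
small constant integer lists and declines to fold anything above a size
limit. The names `fold_elementwise` and `MAX_FOLD_ELEMENTS`, and the
limit of 16, are hypothetical.

```python
import operator

# Hypothetical size cap; the real threshold is not stated in this log.
MAX_FOLD_ELEMENTS = 16

_OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def fold_elementwise(op_name, lhs, rhs):
    """Return the folded constant list, or None if folding does not apply."""
    fn = _OPS.get(op_name)
    if fn is None or len(lhs) != len(rhs):
        return None  # unsupported op or mismatched operands: leave as-is
    if len(lhs) > MAX_FOLD_ELEMENTS:
        return None  # above the size limit: do not fold
    return [fn(a, b) for a, b in zip(lhs, rhs)]


# Typical shape arithmetic: adjust dimension values held in tiny tensors.
folded = fold_elementwise("add", [2, 3], [1, 1])
```

Returning `None` models "no fold", so the caller keeps the original
operation when the operands are not small constants.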
The ONNX slice lowering needlessly used arange instead of directly
constructing the constant dimension values. This makes lowerings to
linalg struggle, as multiple folders are required to recover what is a
constant index value.
Even though the reference compiler is not about performance, inlining
the generated sparse helper methods has a rather big positive impact on
performance, leaving a much better first impression. Therefore, we added
this inlining pass (which leaves all other PyTorch modules unaffected,
since they tend to be one big main() method to start with).
testing:
$ ./tools/e2e_test.sh --config linalg
Summary:
Passed: 1164
Expectedly Failed: 8
$ python -m e2e_testing.main --config=torchdynamo
Summary:
Passed: 976
Expectedly Failed: 162
As of https://github.com/pytorch/pytorch/pull/118969, `ExportedProgram`
has the long awaited fixes to correctly categorize various things
relating to parameters, buffers, mutated inputs and constants.
With this additional modeling, we are finally able to implement
(safely/soundly) the mutable semantics that were attempted on the
TorchScript path. The difference is that on that path, we had to
conservatively treat everything as mutable and run some dodgy heuristics
(which have been the cause of many bugs relating to
"MaximizeValueSemantics") to try to get back to an immutable state.
The new model supports mutability at the graph edges, allowing both user
inputs and buffers to be mutated (there is some more support than that,
but that is all I fully tracked through to implementation).
Therefore, when we receive programs like this, we now can selectively
enable mutation at the edges. This happens to be the mutability model
that IREE supports, which I expect to be a primary beneficiary. However,
there is nothing stopping anyone else from handling the `!torch.tensor`
types and the existing copy/overwrite ops that will be selectively
added.
Since this relies on API changes that will not release until 2.3, I'm
being a bit cautious about not refactoring existing facilities.
We can route the torch tests via `onnx` using the `torch.onnx.export`
tooling. We can then reimport, lower to torch, and compile to linalg to
validate the onnx path is working correctly.
The current implementation exposes some failures in the `onnx` path,
including segmentation faults, so we cannot enable the onnx test suite
yet.
This commit adds decomposition support into the core aten operators
before importing the module from torch.
Also, this commit deals with the lifted tensor constants in
torch.export.export(). We don't want to add unnecessary placeholder
nodes in the graph (extra args in the block module), and should treat
them like the constants that they are. The unnecessary clone is also
removed for max efficiency.
This version of pytorch includes a patch to enable dynamo support on
Windows, so I would like to sync on this torch version across
torch-mlir/shark-turbine for a seamless Windows import flow.
This commit adds the OnnxToTorch lowering for the cosh, acosh, asin,
asinh, and atanh ops.
This commit also adds the TorchToLinalg lowering for the acosh, asin,
asinh, and atanh ops.
Signed-Off By: Vivek Khandelwal <vivekkhandelwal1424@gmail.com>
Some operations include a backend matcher for specialized operations. We
map these back to generics so they appropriately match to the high
performance versions. This is done for the attention operation.
Fixes https://github.com/llvm/torch-mlir/issues/2866
Some backends / downstream projects expect that a "fully converted"
program has no remaining ops or attributes from the original dialect(s).
This test exposes issues that need fixing:
(1) propagate sparsity into the FX graph (over elt-wise)
(2) batched dimensions need a new "dense(batch)" format
The investigation is largely recorded in
https://github.com/llvm/torch-mlir/pull/2881, but this change allows us
to capture non-persistent buffers that were lifted as tensor constants
(after https://github.com/pytorch/pytorch/pull/118969 landed in upstream
PyTorch), and propagate them to `Torch` dialect as "frozen"
`torch.vtensor.literal`. I believe this patch should work with both
nightly and stable PyTorch, but will let CI confirm the same. Thanks
@stellaraccident for the valuable pointers and guidance.
---------
Co-authored-by: Vivek Khandelwal <vivekkhandelwal1424@gmail.com>
This commit adds the OnnxToTorch support for Mean, IsInf, IsNaN, and
PRelu ops. All high-priority ops were already taken, so I went with
these. The non-trivial ones are Mean and IsInf, which might require
extra review.
---------
Co-authored-by: MaheshRavishankar <mravisha@amd.com>
Various improvements on sparsity metadata:
(1) define single data structure for all sparsity related metadata
(2) handle batched dense dimensions, as well as dense subtensor
dimensions
(3) refine sparsity propagation for deeper networks
This patch makes the Protobuf package mandatory in addition to forcing a
config mode search. The (default) module mode search looks for the
CMake-provided FindProtobuf.cmake file, but this file does not list
Abseil as a dependency, causing linker issues like the one below:
```
ld: Undefined symbols:
absl::lts_20230802::log_internal::LogMessageFatal::LogMessageFatal(char const*, int, std::__1::basic_string_view<char, std::__1::char_traits<char>>), referenced from:
google::protobuf::RepeatedPtrField<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>::TypeHandler::Type const& google::protobuf::internal::RepeatedPtrFieldBase::Get<google::protobuf::RepeatedPtrField<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>::TypeHandler>(int) const (.cold.1) in OnnxImporter.cpp.o
```
By forcing a config mode search, CMake looks for the file that is
installed as part of the protobuf package and which does contain the
Abseil dependency. This workaround is also mentioned in a GitHub issue
for Protobuf:
https://github.com/protocolbuffers/protobuf/issues/12292#issuecomment-1529680040.
This PR introduces a sparse_jit wrapper that can run simple models with
sparse tensor inputs end-to-end. The implementation shows all required
components for modifying sparse tensor types with a 1:N relation on the
call sites. Two tests show that the JIT runs end-to-end while computing
the correct results.
More details to follow (generalizing to COO and different ranks, as well
as support for *output* sparse tensors), but the general concepts are
all here now.
**_Update: Thanks to Rob, bump to proper LLVM/MLIR hash is done!_**
_**NOTE that all parameter passing changes are nicely done "downstream"
in MLIR, so very few changes are required in torch-mlir code
proper**_
---------
Co-authored-by: Franz Haniel <77495327+frafranz@users.noreply.github.com>
Co-authored-by: Franz Haniel <franz.haniel@amd.com>
Updates the convertScalarToDtype invocation to pass the original source
and destination datatypes for the add op. Also fixes a potential problem
with the sub op.
---------
Co-authored-by: Xida Ren <xida.ren.dev@gmail.com>
The lowering decomposes AtenTraceOp into an AtenDiagonalOp followed by
AtenSumOp.
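The decomposition can be pictured with a small pure-Python sketch over
nested lists. It only illustrates the "diagonal, then sum" idea; the
helper names `diagonal` and `trace` are hypothetical stand-ins, not the
torch-mlir ops themselves.

```python
def diagonal(matrix):
    """Return the main diagonal of a 2-D nested list."""
    if not matrix:
        return []
    n = min(len(matrix), len(matrix[0]))
    return [matrix[i][i] for i in range(n)]


def trace(matrix):
    """Trace expressed as a diagonal extraction followed by a sum."""
    return sum(diagonal(matrix))


result = trace([[1, 2], [3, 4]])  # diagonal is [1, 4]
```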
The progress is tracked in
https://github.com/nod-ai/SHARK-Turbine/issues/333.
---------
Co-authored-by: Franz Haniel <franz.haniel@amd.com>
There is no lowering support for math::AbsIOp, so if the operand is an
integer type, it will fail to lower to math::AbsFOp since the op operand
#0 must be floating-point-like.
This adds a few passes that ensure linalg with sparse tensors is
properly lowered to loops and can run using the ExecutionEngine for
testing (a few details on parameter passing from PyTorch are still TBD).
Test results:
$ ./tools/e2e_test.sh --config linalg
Summary:
Passed: 1144
Expectedly Failed: 8
$ python -m e2e_testing.main --config=torchdynamo -v
Summary:
Passed: 960
Expectedly Failed: 163
Filed issue:
https://github.com/pytorch/pytorch/issues/119407
This PR contains three commits to update the validation checks in the
ONNX -> Torch conversion pass for the AveragePool, Pad, and Slice operators:
> onnx: fix preconditions for lowering AveragePool ops
>
> The `pads` attribute of the AveragePool operator specifies the value to
> pad at both the beginning as well as the end of the axis (see
> https://onnx.ai/onnx/operators/onnx__AveragePool.html#attributes), so
> the size of this attribute should be twice the rank of the input tensor.
> However, our TorchOnnxToTorch bails out early since it incorrectly
> compares the pads attribute with the rank (not twice the rank) of the
> input tensor.
>
> This patch fixes the code to match the spec and adds a lit test.
> onnx: allow optional constant value for Pad operator
>
> The `constant_value` input of the onnx.Pad operator is optional (see
> https://onnx.ai/onnx/operators/onnx__Pad.html#inputs), but the existing
> logic for lowering the operator into the Torch dialect assumes that it
> is mandatory.
>
> This patch makes the attribute optional and constructs a default value
> (a list of zeros the size of the input tensor) if the attribute was not
> specified.
> onnx: fix checks for axes and steps inputs of Slice operator
>
> The ONNX Spec for the Slice operator allows the `starts` and `ends`
> inputs to have fewer indices than the dimensions of the `data` tensor
> (see https://onnx.ai/onnx/operators/onnx__Slice.html), but our code
> expects these inputs to be as many as the `data` tensor's dimensions.
>
> More precisely, the spec requires that the `starts` and `ends` inputs
> are only as long as the `axes` input, but since the `axes` input is
> optional, the default type for the `axes` input has to match the type
> for the `starts` and `ends` inputs. Moreover, the number of indices in
> the `steps` input also has to match those in the `axes` inputs (instead
> of matching the dimensions of the `data` input).
>
> This patch fixes the checks in the TorchOnnxToTorch conversion so that
> they match the ONNX spec.
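
The Slice checks described in the last commit can be sketched in Python
roughly as follows. This is only an illustration over plain lists: the
real checks live in the C++ TorchOnnxToTorch conversion and operate on
tensor types. The function name `check_slice_inputs` is hypothetical,
and defaulting `axes` to one axis per entry of `starts` is an assumption
made for this sketch.

```python
def check_slice_inputs(data_rank, starts, ends, axes=None, steps=None):
    """Return True if the Slice inputs have mutually consistent lengths."""
    if len(starts) != len(ends):
        return False
    if axes is None:
        # Assumption for this sketch: one default axis per `starts` entry.
        axes = list(range(len(starts)))
    if len(axes) != len(starts):
        return False
    # `steps` is compared against `axes`, not against the rank of `data`.
    if steps is not None and len(steps) != len(axes):
        return False
    # Fewer indices than data dimensions is allowed; more is not.
    return len(axes) <= data_rank


ok = check_slice_inputs(3, starts=[0], ends=[2])
```

Note that a rank-3 `data` tensor with a single `starts`/`ends` entry
passes, which is the case the old check rejected.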
This commit modifies the OnnxToTorch lowering of Onnx.Reshape op by
creating the result shape list for the aten.reshape using the result
shape values inferred from the op's result shape.
Signed-Off By: Vivek Khandelwal <vivekkhandelwal1424@gmail.com>
Folds aten::index_select ops under the following conditions:
1. If the input and output are the same shape, the indexing operation is
a NOP, so just return the input.
2. If the input has shape <1x1x...xNx...x1> (all 1's except for one
dim), and the output shape is <1x1x...x1> (all 1's), then there is a
single index, so extract the single element value and return a tensor
with that value.
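
The two conditions can be summarized with a small Python sketch that
classifies a fold from static shapes alone. The function name and the
returned labels are hypothetical; the actual folder is a C++ method on
the op.

```python
def classify_index_select_fold(input_shape, output_shape):
    """Return which fold applies, or None if the op must be kept."""
    input_shape, output_shape = list(input_shape), list(output_shape)
    # Condition 1: identical shapes mean the indexing is a no-op.
    if input_shape == output_shape:
        return "return_input"
    # Condition 2: exactly one non-unit input dim and an all-ones output
    # mean a single element is selected.
    non_unit_dims = [d for d in input_shape if d != 1]
    if (
        len(input_shape) == len(output_shape)
        and len(non_unit_dims) == 1
        and all(d == 1 for d in output_shape)
    ):
        return "extract_single_element"
    return None


kind = classify_index_select_fold([1, 4, 1], [1, 1, 1])
```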
---------
Co-authored-by: Dave Liddell <dliddell@xilinx.com>
Lowering of torch.aten.all.dim to linalg.
Per PyTorch documentation:
> This function matches the behaviour of NumPy in returning output of
> dtype bool for all supported dtypes except uint8. For uint8 the dtype of
> output is uint8 itself.
Since torch-mlir currently has no support for ui8
(https://github.com/llvm/torch-mlir/pull/1384#issuecomment-1260011334),
the implementation returns failure for that case.
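A minimal Python sketch of the intended semantics, restricted to 2-D
nested lists: reduce along one dimension and return booleans, while
rejecting the uint8 case. The function name `all_dim` and the string
dtype argument are hypothetical, and the exception merely stands in for
the lowering reporting failure.

```python
def all_dim(matrix, dim, dtype="float32"):
    """Reduce a 2-D nested list with logical AND along `dim`."""
    if dtype == "uint8":
        # Stand-in for the lowering bailing out on the unsupported dtype.
        raise NotImplementedError("uint8 results are not supported")
    if dim == 0:
        return [all(column) for column in zip(*matrix)]
    if dim == 1:
        return [all(row) for row in matrix]
    raise ValueError("dim must be 0 or 1 for a 2-D input")


column_result = all_dim([[1, 0], [2, 3]], dim=0)
```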
Link to related RFC:
https://discourse.llvm.org/t/rfc-rename-torch-mlir-compile-apis-and-introduce-fx-based-analogs/76646
This commit updates the documentation, tests, CMake files, and API for
the proposed changes in the RFC. There is a new torch_mlir/fx.py for
user level APIs related to importing modules and a corresponding test
for this path can be found at test/python/fx_importer/basic_test.py.
---------
Co-authored-by: MaheshRavishankar <mravisha@amd.com>
Adds an escape hatch from creating a DenseResourceElementsAttr for
single value tensors into DenseElementsAttr.
For 0-d or single-element tensors, splats are better represented as
DenseElementsAttr, so do not use DenseResourceElementsAttr for them.
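The selection rule amounts to a check on the element count, sketched
here in Python. The function name is hypothetical and the sketch only
mirrors the rule stated above, not the importer's actual code.

```python
from math import prod


def use_dense_elements_attr(shape):
    """True for 0-d or single-element shapes, where a splat suffices."""
    # prod(()) is 1, so a 0-d (scalar) shape also takes this path.
    return prod(shape) == 1


scalar_case = use_dense_elements_attr(())
```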
If a tensor is initialized by a list with a single constant integer,
this folder turns it into a torch.vtensor.literal.
---------
Co-authored-by: Dave Liddell <dliddell@xilinx.com>
Leaning on the QDQ functionality in torch we can support the QLinearConv
operation by piggybacking through `torch.Convolution`. This includes
some changes such as allowing the `onnx` rewriter to run recursively.
Doing so allows `QLinearConv` to decompose to `onnx.Convolution` which
is then lowered to `torch`.
The existing `flatten` lowering did not define what the intermediate
shape was. This could result in failures to lower further to linalg as
the intermediate shape was unknown. Added a shape refinement section.
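
To make the shape question concrete, the sketch below computes both an
intermediate and a final shape for a flatten at a given `axis`, using
`None` for dynamic dimensions. It assumes the lowering first collapses
the trailing dimensions and then the leading ones; that ordering, and
the name `flatten_shapes`, are assumptions made for illustration only.

```python
from math import prod


def _collapse(dims):
    # A dynamic (None) dimension makes the collapsed dimension dynamic too.
    return None if any(d is None for d in dims) else prod(dims)


def flatten_shapes(shape, axis):
    """Return (intermediate_shape, final_shape) for a flatten at `axis`."""
    if axis < 0:
        axis += len(shape)
    # Step 1 (assumed): collapse dims [axis, rank) into one dimension.
    intermediate = list(shape[:axis]) + [_collapse(shape[axis:])]
    # Step 2 (assumed): collapse dims [0, axis) to reach the 2-D result.
    final = [_collapse(shape[:axis]), intermediate[-1]]
    return intermediate, final


intermediate, final = flatten_shapes([2, 3, 4, 5], axis=2)
```

With fully static inputs every entry is a known integer, which is the
information the shape refinement makes available to later lowerings.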