This commit modifies the OnnxToTorch lowering of Onnx.Reshape op by
creating the result shape list for the aten.reshape using the result
shape values inferred from the op's result shape.
Signed-Off By: Vivek Khandelwal <vivekkhandelwal1424@gmail.com>
Folds aten::index_select ops under the following conditions:
1. If the input and output are the same shape, the indexing operation is
a NOP, so just return the input.
2. If the input has shape <1x1x...xNx...x1> (all 1's except for one
dim), and the output shape is <1x1x...x1> (all 1's), then there is a
single index, so extract the single element value and return a tensor
with that value.
---------
Co-authored-by: Dave Liddell <dliddell@xilinx.com>
Link to related RFC:
https://discourse.llvm.org/t/rfc-rename-torch-mlir-compile-apis-and-introduce-fx-based-analogs/76646
This commit updates the documentation, tests, CMake files, and API for
the proposed changes in the RFC. There is a new torch_mlir/fx.py for
user level APIs related to importing modules and a corresponding test
for this path can be found at test/python/fx_importer/basic_test.py.
---------
Co-authored-by: MaheshRavishankar <mravisha@amd.com>
Adds an escape hatch from creating a DenseResourceElementsAttr for
single value tensors into DenseElementsAttr.
For 0d or 1element, splats are better as DenseElementsAttr. Don't use
DenseResourceElementsAttr for it
If a tensor is initialized by a list with a single constant integer,
this folder turns it into a torch.vtensor.literal
---------
Co-authored-by: Dave Liddell <dliddell@xilinx.com>
Leaning on the QDQ functionality in torch we can support the QLinearConv
operation by piggybacking through `torch.Convolution`. This includes
some changes such as allowing the `onnx` rewriter to run recursively.
Doing so allows `QLinearConv` to decopmose to `onnx.Convolution` which
is then lowered to `torch`.
The existing `flatten` lowering did not define what the intermediate
shape was. This could result in failures to lower further to linalg as
the intermediate shape was unknown. Added a shape refinement section.
So that the CumSum Op in OPT can get the constant that it requires to be lowered to TMTensor
---------
Co-authored-by: Rob Suderman <rob.suderman@gmail.com>
Co-authored-by: Xida Ren <xida.ren.dev@gmail.com>
`torch` requires that padding be symmetric for pooling operations. To
support non-symmetric pad we need to separately materialize out the
padding operation.
---------
Co-authored-by: James Newling <james.newling@gmail.com>
Fix for https://github.com/llvm/torch-mlir/issues/2765
The onnx docs say that you can't do shape inference using the in-memory
API for models > 2 GB. This fix replaces that API with the file-based
API. Since the new API generates an intermediate file, also added a
--keep switch to keep that file, which I delete by default.
---------
Co-authored-by: Dave Liddell <dliddell@xilinx.com>
With the recent LLVM integrate and changes from
https://github.com/llvm/llvm-project/pull/78260, we hit this build error
in Stablehlo (which is quite old).
```
external/stablehlo/stablehlo/transforms/StablehloRefineShapes.cpp:1020:14: error: no member named 'startRootUpdate' in 'mlir::PatternRewriter'
rewriter.startRootUpdate(op);
~~~~~~~~ ^
external/stablehlo/stablehlo/transforms/StablehloRefineShapes.cpp:1026:16: error: no member named 'finalizeRootUpdate' in 'mlir::PatternRewriter'
rewriter.finalizeRootUpdate(op);
~~~~~~~~ ^
external/stablehlo/stablehlo/transforms/StablehloRefineShapes.cpp:1029:16: error: no member named 'cancelRootUpdate' in 'mlir::PatternRewriter'
rewriter.cancelRootUpdate(op);
~~~~~~~~ ^
external/stablehlo/stablehlo/transforms/StablehloRefineShapes.cpp:1108:14: error: no member named 'updateRootInPlace' in 'mlir::PatternRewriter'
rewriter.updateRootInPlace(op->getParentOp(), [&]() { return; });
~~~~~~~~ ^
4 errors generated.
Target @torch-mlir//:torch-mlir-opt failed to build
```
I'm still puzzled as to how this didn't fail with the CMake merge gating
CI (do we not test Stablehlo builds/tests?). In any case, bumping our
submodule to https://github.com/openxla/stablehlo/pull/1918 fixes it.
It exposes a new failing lit test in TorchToStablehlo though, that I
have looped stablehlo developers into
([here](https://discord.com/channels/999073994483433573/999074539138990131/1201235845391331419)).
```
bazel run @torch-mlir//test/Conversion:TorchToStablehlo/scatter.mlir.test
...external/torch-mlir/test/Conversion/TorchToStablehlo/scatter.mlir
within split at <stdin>:1 offset :33:8: error: unexpected error: Expects non-empty reduction block for type inference
%0 = torch.aten.scatter.src %arg0, %int0, %arg1, %arg2 : !torch.vtensor<[?,?],si64>, !torch.int, !torch.vtensor<[?,?],si64>, !torch.vtensor<[?,?],si64> -> !torch.vtensor<[?,?],si64>
^
LLVM ERROR: Failed to infer result type(s).
```
Bazel CI:
https://github.com/sjain-stanford/torch-mlir/actions/runs/7732673480/job/21083102228
`onnx` explicitly specifies that `raw_data` is stored in `little-endian`
layout. While converting
to `torch` we need to convert from a known endian format to an internal
format of consistent
layout. This means endianness must be correct during the import of
`onnx.Constant`.
---------
Co-authored-by: Xida Ren (Cedar) <cedar.ren@gmail.com>
Note that we are waiting for actual FX traced graph support for sparse
tensors. For details see
https://github.com/pytorch/pytorch/issues/117188
Until then, however, we provide this clever importer that builds the FX
traced graph for for the dense case and then puts a sparse annotation
back on the parameters.
With import test.
Linalg has quantized specific operations. We can lower to these
operations when there is a known zeropoint and scale operations. This
allows the `convolution` to occur with lower bitwidth's, improving the
overall performance.
We were seeing some assertion failures after some checks around folders
were tightened up in LLVM:
https://github.com/llvm/llvm-project/pull/75887 . This PR essentially
moves the logic that used to be applied at the LLVM level into the
folder, which seems to be the suggested fix.
I'm not sure if the IR that caused issues for us _should_ be valid?
```
%1 = torch.aten.detach %arg0 : !torch.tensor<[1],f32> -> !torch.tensor
```
A better fix might be to create a verifier ensuring the result of
`aten.detach` has the same type as its operand.
---------
Co-authored-by: aaron-stgeorge <aaron.stgeorge@getcruise.com>
Torch does not have an equivalent matmul operation for integers. Instead
it sidechannels the information via its quantized types. For this
lowering we setup these sidechannels then invoke `torch.mm`.
This preserves sparsity at the most obvious places of lowering TORCH
tensors to MLIR RankedTensorType tensors. Other places are marked for
audit. With some initial lowering tests.
This adds an encoding field to the torch type, using the interfaces for
printing, parsing, and verification. Note that although this change
prepares adding sparsity to the torch type (as illustrated by the round
trip and invalid tests), nothing in this change depends on the actual
contents of the encoding field!
This includes custom op matching for decomposed operations and fusing
dequantization into dense operations. As a validation we compare
to the dequant+mm torch implementation.
We can plumb the linear matmul into pytorch using its quantized types
with side channel information. To handle the final int8 operation we
dequantize and requantize.
This commit adds mapping from `onnx.pad` op to `torch.pad` op. Currently
it does not support `axes` parameter of `onnx.pad` op.
Signed-off-by: Gaurav Shukla <gaurav.shukla@amd.com>
Currently transposed convolution is not handled correctly by
`TorchToTosa`. This PR allows transposed convolutions to pass through
the conversion so that they can be handled by other conversion passes
later in a pipeline.
An example input which produces a compilation error is:
```
func.func @forward(%input: !torch.vtensor<[1,64,1,100],f32>) -> !torch.vtensor<[1,64,2,200],f32> {
%true = torch.constant.bool true
%int1 = torch.constant.int 1
%int2 = torch.constant.int 2
%weight = torch.vtensor.literal(dense<0.0> : tensor<64x64x3x3xf32>) : !torch.vtensor<[64,64,3,3],f32>
%bias = torch.vtensor.literal(dense<0.0> : tensor<64xf32>) : !torch.vtensor<[64],f32>
%stride = torch.prim.ListConstruct %int2, %int2 : (!torch.int, !torch.int) -> !torch.list<int>
%int1x1 = torch.prim.ListConstruct %int1, %int1 : (!torch.int, !torch.int) -> !torch.list<int>
%output = torch.aten.convolution %input, %weight, %bias, %stride, %int1x1, %int1x1, %true, %int1x1, %int1 : !torch.vtensor<[1,64,1,100],f32>, !torch.vtensor<[64,64,3,3],f32>, !torch.vtensor<[64],f32>, !torch.list<int>, !torch.list<int>, !torch.list<int>, !torch.bool, !torch.list<int>, !torch.int -> !torch.vtensor<[1,64,2,200],f32>
return %output : !torch.vtensor<[1,64,2,200],f32>
}
```
This MLIR produces an error about a cast operation with a size mismatch
when passed through `torch-to-tosa`:
```
error: 'tensor.cast' op operand type 'tensor<1x64x1x50xf32>' and result type 'tensor<1x64x2x200xf32>' are cast incompatible
```
---------
Co-authored-by: Srinath Avadhanula <srinath.avadhanula@getcruise.com>
We can make the per-tensor version of the operation to the dequantize
operation via marking with the make quantized tensor component. This
introductions the `qint*` and `quint*` tensor type that can be lowered
to teh appropriate dequantization behavior during the torch-to-linalg
conversion.
We can map the per_tensor case to the `torch.aten.quantize_per_linear`
operation. In this case we extract the `scale` and `zeropoint` values
and directly invoke the quantization, then return the integer
representation value.
Implemented ONNX.Range. The spec says the data type for start, limit,
delta are 0-D can be double, float, int16, int32, int64, All int types
mapped to !torch.int and all float types mapped to !torch.float
---------
Co-authored-by: Kumar Deepak <kumar@xilinx.com>
Handles the multiple cases of `onnx` constant values and converts them
to `torch` literal tensors. This can include splats with a single
integer or floating point value, a set of explicit integer values, or
an elements array attr of values.
This PR updates the torch-to-tosa conversion with following changes:
- Support torch.none as min/max input argument for tosa.clamp op
- Support negative value as start index for tosa.slice op
- Add tosa.logical_or lowering support
e2e test:
python -m e2e_testing.main --config=tosa
LIT tests:
cmake --build build --target tools/torch-mlir/all
---------
Co-authored-by: Ze Zhang <ze.zhang@getcruise.com>
Changes made during upstreaming:
* Removed comments attributing some copied code back to torch-mlir
(since it is now repatriated).
* Re-organized imports.
* Inlined RefMapping/RefTracker and TypeSubclassMap from an external
utility module.
* Added FxImporter class comments.
* Updated stack trace extraction to be fail safe.
* Added an entry-point for `import_frozen_exported_program` which uses
the shiny new upstream `torch.export.export()` API (versus the
lower-level/older API that Turbine is presently using). This
necessitated a small FX rewrite to line external state management up
with current conventions.
* Adapted one of Turbine's importer tests to go with this initial
submission. Turbine unfortunately has a lot of more-integration-ey
tests, and I would like to extract those as more of unit tests of the
importer features and upstream them that way vs trying to copy directly.
For now, one overall test with the initial submission gets us moving.
I acknowledge that there are some code quality things that could be
improved in this submission: this was authored over the course of many
months (and often via some trial and error). I would like to keep it
relatively converged with the downstream for the next few steps while
getting the test suite upstreamed. And then it will be easier to take a
hygienic pass through the code.
Including co-authors for contributors in the git log of the original
repository.
Co-authored-by: Ean Garvey <87458719+monorimet@users.noreply.github.com>
Co-authored-by: Avinash Sharma <aviator1994@gmail.com>
Co-authored-by: Arham Khan <arhammkhan@gmail.com>
Co-authored-by: brucekimrokcmu <kwangkyk@alumni.cmu.edu>
Co-authored-by: saienduri <77521230+saienduri@users.noreply.github.com>
The expression for HardSigmoid in Onnx
(https://onnx.ai/onnx/operators/onnx__HardSigmoid.html): max(0, min(1,
alpha * x + beta))
is inherently different from HardSigmoid in Torch
(https://pytorch.org/docs/stable/generated/torch.nn.Hardsigmoid.html)
which is: if x < -3 -> 0
elif x > 3 -> 1
else x/6 + 1/2
That being said, it was just better to compute out the entire expression
when translating the Onnx expression to Torch mlir, which is done in
this PR. Some of the logic is shared from the files in
`DecomposeComplexOps`. Therefore, refactored some shared logic between
`DecomposeComplexOps` and `DefaultDomainGToP` and put it in a `Utils`
file.
This PR adds the `enable_ir_printing` option to `torch_mlir.compile`,
which can be used to print the IR for all intermediate passes.
When running the added test file via:
```shell
$ python test/python/compile.py 2> tiny.stderr
```
the file `tiny.stderr` is about 700 KB.
The three remaining compare operations
onnx.Greater
onnx.Less
onnx.GreaterOrEqual
Are also added with this push request.
This concludes a set of basic tensor compare functions.
Lowerings for `transpose` from ONNX to `aten`. Implementation depends on
making multiple `aten.transpose` operations swapping pairs of dimensions.
As `onnx.transpose` can swap around any dimensions it may require
constructing multiple `aten.transpose`.
This replaces the lowering of aten.cat with tensor.concat, allowing more
efficient handling of concatenations in downstream flows. The refbackend
populates concat decomposition patterns that can be used to recover the
previous lowering.
This commit adds the OnnxToTorch support for Reciprocal, Round,
ScatterElements, Sigmoid, Sin, Tanh, Sqrt, Sub, Sum, Where, Xor,
Squeeze, Unsqueeze ops.
For reviewers, the ops that weren't trivial and probably require extra
review are Sum, Squeeze, and Unsqueeze.
Lowerings for `selu` lowerings for ONNX to the corresponding torch
implementations. Torch's `selu` implementation has fewer features so
we use the a generalized `elu` with the input scale set to `1.0`.
Simple Python console script to import an ONNX protobuf to the torch
dialect for additional processing.
For installed wheels, this can be used with something like:
```
torch-mlir-import-onnx test/python/onnx_importer/LeakyReLU.onnx
```
Or from a dev setup:
```
python -m torch_mlir.tools.import_onnx ...
```
This is part 1 of 2, which will also include upstreaming the FX
importer. I started with ONNX because it forces some project layout
updates and is more self contained/easier as a first step.
Deviating somewhat from the RFCs on project layout, I made the following
decisions:
* Locating the `onnx_importer.py` into `torch_mlir.extras` as Maks
already has opened up that namespace and it seemed to fit. Better to
have fewer things at that level.
* Setup the build so that the root project only contains MLIR Python and
pure Python deps (like the importers), but this can be augmented with
the `projects/` adding more depending on which features are enabled.
* The default build continues to build everything whereas in
`TORCH_MLIR_ENABLE_ONLY_MLIR_PYTHON_BINDINGS=1` mode, it builds a
`torch-mlir-core` wheel with the pure contents only.
`onnx_importer.py` and `importer_smoke_test.py` are almost verbatim
copies from SHARK-Turbine. I made some minor local alterations to adapt
to paths and generalize the way they interact with the outer project. I
expect I can copy these back to Turbine verbatim from here. I also
updated the license boilerplate (they have the same license but slightly
different project norms for the headers) but retained the correct
copyright.
Other updates:
* Added the ONNX importer unit test (which also can generate test data)
in lit, conditioned on the availability of the Python `onnx` package. In
a followup once I know everything is stable, I'll add another env var
that the CI can set to always enable this so we know conclusively if
tests pass.
* Moved the ONNX conversion readme to `docs/`.
* Renamed CMake option `TORCH_MLIR_ENABLE_ONLY_MLIR_PYTHON_BINDINGS` ->
`TORCH_MLIR_ENABLE_PYTORCH_EXTENSIONS` and inverted the sense. Made the
JitIR importer and LTC options `cmake_dependent_options` for robustness.
This commit adds the OnnxToTorch support for BitwiseXor, BitwiseOr, Div, Equal, Cast,
Ceil, Floor, Cos, and Clip op.
This commit also adds the TorchToLinalg support for aten.clamp.Tensor and aten.clamp_min.Tensor op.
Signed-Off By: vivekkhandelwal1424@gmail.com
Despite aten.mm requiring the input and output types match, we still opt
to maintain signedness semantics in case later passes try to do any sort
of integer type narrowing.
This commit adds the OnnxToTorch support for Atan, Bitshift, BitwiseAnd,
and BitwiseNot op.
This commit also adds the TorchToLinalg support for AtenBitwiseLeftShiftTensorOp.
Signed-Off By: vivekkhandelwal@nod-labs.com
Adds a pipeline to convert custom ops and metadata represented as
`torch.operator` custom ops to corresponding `torch` ops where possible.
This is part of a multi-part approach for building ONNX import in as a
regular feature of torch-mlir. It is focused on the conversions vs the
infra. We will end up maintaining a [pure-python
importer](https://github.com/nod-ai/SHARK-Turbine/blob/main/python/shark_turbine/importers/onnx_importer.py)
to go with this in torch-mlir, and we will also maintain test case
generation utilities derived from it.
I have left substantial documentation in the README of the conversion
directory, including the recommended approach that we will take to keep
building this out.
(note that this organizes the code to coincide with the refactoring in
#2442 versus the current flat arrangement)
The logic for lowering the aten view op to linalg is fairly complex.
In this PR I have tried to follow all non-failing paths through the
lowering and add unit tests where they're missing.
There is 1 logical change to the lowering: redundant tensor.cast ops
(same source and destination type) are folded.
This lifts the core of the jit_ir_importer and ltc out of the pt1
project, making them peers to it. As a side-effect of this layering, now
the "MLIR bits" (dialects, etc) are not commingled with the various
parts of the pt1 project, allowing pt1 and ltc to overlay cleanly onto a
more fundamental "just MLIR" Python core. Prior to this, the Python
namespace was polluted to the point that this could not happen.
That "just MLIR" Python core will be introduced in a followup, which
will create the space to upstream the FX and ONNX pure Python importers.
This primary non-NFC change to the API is:
* `torch_mlir.dialects.torch.importer.jit_ir` ->
`torch_mlir.jit_ir_importer`.
The rest is source code layering so that we can make the pt1 project
optional without losing the other features.
Progress on #2546.
- adds support for an optional verifier to the generated torch op
tablegen (GeneratedTorchOps.td)
- uses the above to add a verifier for the torch permute op.
Motivation: I hit an unclear error from linalg while developing a
decomposition pass for pixel_shuffle. The error would have been clearer
if the problem had been detected earlier in the invalid aten.permute op.
Testing: new tests added. To run added tests, from the base directory
run
```
./build/bin/llvm-lit test/Dialect/Torch/invalid.mlir
```
This is a first step towards the structure we discussed here:
https://gist.github.com/stellaraccident/931b068aaf7fa56f34069426740ebf20
There are two primary goals:
1. Separate the core project (C++ dialects and conversions) from the
hard PyTorch dependencies. We move all such things into projects/pt1 as
a starting point since they are presently entangled with PT1-era APIs.
Additional work can be done to disentangle components from that
(specifically LTC is identified as likely ultimately living in a
`projects/ltc`).
2. Create space for native PyTorch2 Dynamo-based infra to be upstreamed
without needing to co-exist with the original TorchScript path.
Very little changes in this path with respect to build layering or
options. These can be updated in a followup without commingling
directory structure changes.
This also takes steps toward a couple of other layering enhancements:
* Removes the llvm-external-projects/torch-mlir-dialects sub-project,
collapsing it into the main tree.
* Audits and fixes up the core C++ build to account for issues found
while moving things. This is just an opportunistic pass through but
roughly ~halves the number of build actions for the project from the
high 4000's to the low 2000's.
It deviates from the discussed plan by having a `projects/` tree instead
of `compat/`. As I was thinking about it, this will better accommodate
the follow-on code movement.
Once things are roughly in place and the CI passing, followups will
focus on more in-situ fixes and cleanups.
NonValueSemantic Ops like Add_, div_, etc. expect result DType to be the
same as the first input. However, current implementation would result in
wrong result type for case like:
```python
a = torch.randn(3, 3).half() # float16
b = torch.randn(3, 3) # float32
a += b # i.e. torch.ops.aten.add_(a, b)
```
torch expects `a` to be float16, but dtype refinement would infer
float32 type, since it's replaced by `aten.add`.
Add aten.isclose op
Add its torch-to-tosa lowering
Update the TorchToTosa/basic.mlir tests
To test e2e tosa lowering:
`python -m e2e_testing.main -v -c=tosa`
---------
Co-authored-by: Ze Zhang <ze.zhang@getcruise.com>
Add aten.unflatten.int op
Add its torch-to-tosa lowering
Update the TorchToTosa/basic.mlir tests
To test e2e tosa lowering:
`python -m e2e_testing.main -v -c=tosa`
---------
Co-authored-by: Ze Zhang <ze.zhang@getcruise.com>
Strict symbolic shapes allow us to assume numpy-style dynamic broadcasts
never occur. This allows us to strengthen the folder for broadcasts to
cases where the rank is the same and all shapes match (including dynamic
sentinel values).
When importing dynamic shaped programs from Dynamo, via torch.compile or
torch.export, we can assume that strict symbolic shape checks have been
done prior to generating torch IR. Among other shape checking, this
eliminates the case where an unknown dimension can be dynamically '1' in
a way that signals a broadcast.
Adds a `isAssumingStrictSymbolicShapes` utility which consults a
`torch.assume_strict_symbolic_shapes` attribute on an enclosing scope
and returns true if present.
In the linalg pipeline, many runtime checks are elided when this returns
true.
Corresponding commits:
* mlir-hlo: 16886a108eff5197f816ca0f1950cc5ff1b078d9
* stablehlo: 77a59815a82b34f7b08ed2d42a711d9920682d0e
* llvm-project: 4acc3ffbb0af5631bc7916aeff3570f448899647
* Adapt to ByteCodeOpInterface changes.
* Adapt to RegionBranchPoint changes: https://reviews.llvm.org/D159116
* Adapt inferReturnTypes to get the value from properties.
* Adapt invalid.mlir to properties syntax
* [TOSA] Align with custom assembly format change.
* [TOSA] handle change of axis to int32 type
* [TOSA] Restore improper convert to i32
Landing with Windows broken (it cannot be fixed because of the way the mlir-hlo dep is inserted). Will followup with an untangling.
---------
Co-authored-by: TatWai Chong <tatwai.chong@arm.com>
Co-authored-by: Eric Kunze <eric.kunze@arm.com>
* view_as_real test case, allow dtype in testutils.randn
* abstract python upstream func implemented
* fixed upstream dtype func, implemented view_as_real backend op
* formatted AtenViewAsRealOp, removed change in e2etest/framework
* removed test suit from reshape_like.py, because it's moved to basic.py
* implemented C-API wrapper for mlirComplexF128 type
* fixed torch.complex dtype width in MLIR and Torch MLIR, deleted float16 dtype dict
* Changed IR input of aten fft_fft unit test
* code refactored
* code refactored and fixed ci test
* refactored: removed white spaces, and rolled back to having both input/output affine expr
* refactored: deleted output affine expr to reduce redundancy
* xfail ltc backend
* removed ComplexImag and ComplexReal from torchdynamo xfail set
* copied and pasted from main branch as there's no change to be made in this file
* refactored abstract_interp_lib_gen.py
* refactored: torchtypes.td, formatted, removed commented out code
* Support brevitas custom op (#2320)
* f16 change for brevitas
* Adapt the change of brevitas quant custom op name
* Add unit tests
* Make brevitas conversions isolated
* Address the comments
---------
Co-authored-by: dan <danimal197@gmail.com>
When using custom ops, sometimes PyTorch will insert namespaces to the
abstract interpretation function name in the format:
`__torch__.{namespace_1}.{namespace_2}...{op_name}`. The extra
namespaces are not part of the abstract interpretation function name,
so it needs to be removed before generating the library of MLIR
snippets of abstract interpretation functions. This commit adds
support for removing the namespace information.
* LTC->MLIR Debug Info support
* SW-95317 Propagate Lazy->Jit->MLIR scope name.
* Enhance location information based on op names
Currently, the location information attached to the ops just considers
the filename, line number and column number. Attaching operation name
would help identify the type of computation by just looking at the
profile of execution.
* Update locations logic; updated debug-info.py test
* Use {scope}/{op_name} format to track names by default
---------
Co-authored-by: Gleb Kazantaev <gleb.kazantaev@cerebras.net>
Co-authored-by: Mark Browning <mark@cerebras.net>
Co-authored-by: Vimal Patel <vimal@polymagelabs.com>
The implementation at this place was a remnent of the times the pipeline was
run only once.
Rely instead on the backend verification, after optimizations have had an
opportunity to resolve some uncertainties. (e.g. `!torch.optional`).
* RecomposeComplexOps: Remove dead slice op
* lib/Dialect/Torch/IR/TorchOps.cpp: Fold slice ops even when they are on non-value tensors
* lib/Conversion/TorchToTosa/TorchToTosa.cpp: Fix slice start/end out of range/none
* lib/Dialect/Torch/IR/TorchOps.cpp: AtenSliceTensorOp::fold: Fold slices that go from 0:int_max
* More tests for aten.split.Tensor
In PyTorch, the `NumberType` is equal to `Union[int, float,
complex]`. However, the abstract interpretation library was treating
the `NumberType` as `Union[int, float]`, resulting in type mismatches
when reifying certain dtype functions. This commit fixes the type
inconsistency by having the abstract interpretation functions take as
an input a `Union[int, float, complex]` for the ops that take
`!torch.number` inputs.
This commit adds the support for index.Tensor op when the index values
are negative. This commit wraps around the index values by checking
their values at run time.
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
check the return type of the division to figure out whether to use
the floating point implementation of a division or to use the integer.
the issue rose from the fact that the inputs are all integer but the
result was casted to floating point. The conversion then chose to
use the integer implementation of division which is not legal in tosa
when all the inputs get casted to floating point.
fix(TorchToLinalg): AtenDivScalarOp
upcast self operand as well if applicable, the self operand must also
be casted to float as it can be an integer.
When `use_tracing=True` is used to import a model into Torch-MLIR,
several casts get inserted in the IR to bridge the untyped inputs and
outputs with the typed body of the computation. These casts create
extra aliases of tensors that cause the current analysis in
`maximize-value-semantics` to fail.
In particular, the `maximize-value-semantics` analysis assumes that the
only valid alias right after an overwrite is the overwritten
alias. So, if there is a use of a casted version of the overwritten
alias after the overwrite, the analysis fails.
This commit improves the analysis by identifying all cast-like aliases
of the overwritten alias and allowing such aliases to be used after an
overwrite.
Because this issue only arises when using tracing, it cannot be
currently tested e2e, so only lit test is added.
Lowering torch operations that allow different compatible data types
in its operands to tosa end up generating invalid tosa IR with mixed
data types. In tosa spec, certain operations (generally element-wise
operations) require all operands to have the same data type.
Add wrapper functions for those element-wise tosa ops to perform op
creation with type conversion if necessary.
This commit adds dtype functions for all the torch ops that did not
previously have one and removes the pass `RefineTypes`, since the
abstract interpretation library now takes care of all the dtype
propagation.
All dtype functions added are tested except for
- `aten.embedding`
- `aten._embedding_bag`
- `aten.embedding_bag`
These functions need a change to the testing framework to allow
specifying the actual data inside the tensor used for testing. I will
fix this in a follow up patch.
Co-authored-by: Jiahao Li <liplus17@163.com>