Commit Graph

1033 Commits (main)

Author SHA1 Message Date
rohan-tan-bhowmik e86f56bc76
[Torch] [TMTensor] Added mask and is_causal support for torch.aten.scaled_dot_product_attention (#3690)
Enabled mask and is_causal parameters for torch.aten.scaled_dot_product
attention + relevant comments + tests.

The tests added highlight the new capabilities introduced in this PR,
including:

Attention with F16 mask
Attention with Boolean mask
Causal attention with same Q K V shapes
Causal attention without Q K V shapes

Made sure that one cannot input both mask and is_causal.
2024-09-09 15:51:41 -07:00
Srinath Avadhanula 0a788e0467
Decompose aten.fmod into aten.mul,sub,div etc. (#3689)
As titled, create a new decomposition for `aten.fmod.Tensor` to
`aten.div`, `aten.trunc`, `aten.mul` and `aten.sub`. Note that we only
use `aten.trunc` for floating point operations. This further gets
decomposed to `aten.where` etc. by other existing decompositions.

This decomposition now makes TOSA pass for a simple model with
`aten.fmod` while it makes `stablehlo` fail. For now, we disallow this
decomposition for `stablehlo`

---------

Co-authored-by: Srinath Avadhanula <srinath.avadhanula@getcruise.com>
2024-09-09 09:00:11 -07:00
Branko Trifkovic 70d5730c87
[LINALG] Implement lowering of torch.aten.rot90 (#3551) 2024-09-06 10:36:17 +05:30
zjgarvey 295bf418a4
Add a canonicalization pattern for `aten.unflatten.int` (#3656)
Addresses an issue in <https://github.com/llvm/torch-mlir/issues/3651>
where some unflatten ops generated from onnx models weren't propagating
static shape information. It may be necessary to add further
optimizations for the more general case when some static information is
present in the unflatten (or possibly reshape/view) op's `sizes` list,
but not reflected in the output shape. These ops will only successfully
infer shapes if the `sizes` list is gotten from a list of constant ints
(with possibly one -1). A common example where this fails is when some
of the `sizes` are determined from `aten.size.int` ops on dynamic
tensors, and other `sizes` are known statically.

This PR includes:
- a canonicalizer for `aten.unflatten.int` which converts to
`aten.unsqueeze` when it is expanding one dim to two, and one of the new
dims is statically 1.
- an improvement to the folder for `aten.__or__.bool` which does not
rely on *both* operands being static.
2024-09-03 16:38:20 -07:00
Ze Zhang b3942ff984
Add canonicalize pattern for aten.mul.int and aten.floordiv.int (#3680)
This PR add `floordiv` to the `PY_BUILTIN_TO_TORCH_OP`. For
`aten.mul.int` and `aten.floordiv.int` ops, we add new Canonicalization
Patterns as follow:

```
%1 = torch.aten.mul.int %input, %const-5
%2 = torch.aten.mul.int %1, %const-6
```

Will be replaced by

`torch.aten.mul.int %input, %const-30`


And 

```
%1 = torch.aten.mul.int %input, %const-5
%2 = torch.aten.floordiv.int %1, %const-5
```
Will directly return `%input`


This PR also relaxes the `float` type constraint in TorchToTosa for the
`AtenRsubScalarOp` conversion.



To test:

`cmake --build build --target check-torch-mlir-all`
2024-09-03 09:13:59 -07:00
Vivek Khandelwal 567ed44fd0
[MLIR][TORCH] Add E2E support for aten.polar op (#3671)
Signed-Off By: Vivek Khandelwal <vivekkhandelwal1424@gmail.com>
2024-09-03 10:51:03 +05:30
lingzhiz1998 5bc59ce1fa
[TorchToLinalg] Support lowering MaxPool3dWithIndices (#3652)
Support torch.MaxPool3dWithIndices lowering to linalg backend.
2024-08-27 14:14:25 -05:00
Xida Ren (Cedar) eb7bf78a9c
Add RestructureNonConstantAxes pass to address reduce op tests failing on non constant axes (#3600) 2024-08-26 14:06:06 -07:00
Rob Suderman f9766c89f6
[onnx] Handle `torch.aten` for inner product case (#3634)
The following case was failing to lower for einsum. This fixes up the
inner product issue.
2024-08-24 11:41:25 -07:00
Vivek Khandelwal fcc5f444cd
MLIR][TORCH] Fix GroupNorm decomposition by adding shape info (#3658)
This commit adds the shape info for the tensors created during the
decomposition of GroupNorm op.

Signed-Off By: Vivek Khandelwal <vivekkhandelwal1424@gmail.com>
2024-08-22 21:20:40 +05:30
Vivek Khandelwal 0a86deb59a
build: manually update PyTorch version (#3627)
Set PyTorch and TorchVision version to nightly release 2024-08-18.
This commit also updates the `scaled_dot_product_attention` op. 
A new attribute `enable_gqa` has been added. As of now, only the
default value for the same is supported.

Signed-Off By: Vivek Khandelwal <vivekkhandelwal1424@gmail.com>
2024-08-19 12:03:56 +05:30
pkapris-syrmia 23ec5399e5
Implement lowering of aten.atleast_2d (#3546)
This operator is needed to implement aten.vstack, which will be
submitted in a subsequent PR
2024-08-14 18:52:31 +05:30
pkapris-syrmia 10fe5d08d1
Implement lowering for torch.aten.rad2deg (#3586) 2024-08-14 16:37:28 +05:30
Rob Suderman 9ab93436c4
[torch] Support diagonal `einsum.Diagonal` (#3618)
The einsum lowering was missing the behavior for duplicate indices in
the equation. This amounts to a diagonalization along duplicate pairs of
indices in the equation.
2024-08-13 09:38:43 -07:00
Yuanqiang Liu c5b3cf299a
[Torch] emit upsample_nearest1d/2d/vec, and add shape/dtype functions (#3629) 2024-08-13 19:14:24 +08:00
Felix Schneider 0314188dbe
[torch] Basic support for per-channel quantized graphs (#3623)
This patch adds basic support for lowering graphs with per-channel
quantization. Per-channel quantized ops have to be excluded from
`FuseQuantizedOps` for now but can be used in QDQ quantized form.

Using this patch, we're able to import and execute (on the linalg
backend) graphs with per-channel quantization applied using the "new"
PyTorch 2.0 Export Quantization.
2024-08-10 15:51:09 +02:00
Rob Suderman fd98476f77
[torch] Unpacking sometimes misses shape inference (#3609)
It is possible that the unpacked tensor does not match the same inferred
shapes. This is pretty common when ingesting form the `onnx` frontend.
2024-08-08 16:17:31 -07:00
Rob Suderman 59a4c6fda4
[onnx] Fix transposition code for `onnx.OneHot` (#3606)
The post onehot transposition code was unexercised. Fixed the test and
transformation to check use.
2024-08-07 18:20:26 -07:00
Chi_Liu a51b4e014a
[Torch] Disable 1-d quantized convolution (#3601)
To fix https://github.com/nod-ai/SHARK-Turbine/issues/253#issuecomment-2271815640
Prevent fusion for 1d convolution ops and just do it as an f32 conv
since there isn't a linalg named op for quantized 1-d convolution yet.  
Get 24 onnx eca* models passed in iree-comiple.
2024-08-07 09:01:16 -07:00
Rob Suderman 7e7af67080
Avoid warnings-as-errors build failure (#3588)
Lambda needs a return value to avoid a build failure.
2024-08-02 12:27:31 -07:00
yyp0 22cd4441e7
[Torch] Add support for static uneven divisible AdaptiveAvgPool2d (#3566)
The static uneven divisible AdaptiveAvgPool2d means that although the
input size is not an integer multiple of ouput size, but the kernel and
stride size can also be fixed (not dynamic). The derivation logic of
kernel and stride size is consistent with
torch/_decomp/decomposations.py:adaptive_avg_pool2d as described in the
following:

1. Stride Size
Firstly , derive the start index in each reduce operation according to
the output size (`n`), `start_index = ([0, 1, ..., n - 1] * input_size)
// output_size`. For each index `k`, if `k * (input_size % output_size)
< output_size`, then the current and previous stride keeps the same as
`input_size // output_size`. So suppose `(n-1) * (input_size %
output_size) < output_size`, the stride in the whole AdaptiveAvgPool2d
process keeps static, as `input_size // output_size`.

2. Kernel Size
torch/_decomp/decomposations.py:adaptive_avg_pool2d calculates a static
kernel size when the input/output sizes satisfy either of the two
conditions, `input_size % output_size == 0` or `output_size %
(input_size % output_size) == 0`. Here if `input_size % output_size ==
0`, then the kernel size equals `input_size // output_size`, otherwise
`input_size // output_size + 1.`
2024-08-01 11:37:53 +08:00
Rob Suderman 7f475e174e
Add extf-trunc f32-f64-f32 ellision (#3579)
Torch has all scalars represented as i64 and f64 types which results in
extraneous trunc-extf commands. We can rework this by elliding
widen-narrow cases away.
2024-07-31 16:50:00 -07:00
yyp0 f49b9c14f1
[Torch] Add support for Aten__Or__BoolOp (#3574) 2024-07-31 17:23:53 +08:00
Ivan Butygin 8bd1b9751f
`max_unpool3d` linalg lowering (#3536)
An attempt of  `aten.max_unpool3d` to linalg lowering.
There are known issues with this implementation (see comment in code).
2024-07-30 20:59:17 +03:00
Vivek Khandelwal b6e4725259
[ONNX] Add OnnxToTorch lowering for NonMaxSuppression op (#3501)
Signed-Off By: Vivek Khandelwal <vivekkhandelwal1424@gmail.com>
2024-07-26 21:01:27 +05:30
yyp0 ea60d72489
[Torch] Add AtenMaskedFillTensorOp support (#3561) 2024-07-26 15:32:13 +08:00
Yuanqiang Liu 003b06dfa1
[Torch] enhance naryFolderHelper to support mixed dtypes (#3559)
* so that it could support like `i64 + f64 => f64`.
* also unify `aten.log`'s folder code to use `naryFolderHelper`.
2024-07-24 17:54:59 +08:00
Yuanqiang Liu aad1604046
[Torch] enhance fold of aten.squeeze.dim (#3558) 2024-07-24 14:13:48 +08:00
Ze Zhang d1e172f418
Register fake_quantize_cachemask ops and add their decompose patterns (#3556)
Test:

`cmake --build build --target check-torch-mlir-all`
2024-07-23 11:33:12 -07:00
Yuanqiang Liu 21ad890009
[Torch] enhance fold of aten.slice.Tensor (#3557)
so that it could support folding slice with any static shape.
2024-07-23 22:53:03 +08:00
Yuanqiang Liu 78846425e2
[Torch] add constriants when decompose aten.split_with_sizes (#3555) 2024-07-23 10:34:29 +08:00
Vivek Khandelwal 22c9008bb9
build: Update Roll PyTorch version (#3548)
This commit also updates the PyTorch and Torchvision nightly links since
they are now moved to a different location.

PyTorch Nightly: https://download.pytorch.org/whl/nightly/cpu/torch/
Torchvision Nightly:
https://download.pytorch.org/whl/nightly/cpu/torchvision/

Disables dtype checks for some ops, tracked by https://github.com/llvm/torch-mlir/issues/3552

Signed-Off By: Vivek Khandelwal <vivekkhandelwal1424@gmail.com>
2024-07-19 21:38:57 +05:30
bosko-syrmia 2cdf3deae3
implement lowering of torch.aten._linalg_slogdet (#3524) 2024-07-19 11:24:43 +05:30
Branko Trifkovic c7d972ed58
Implement lowering of torch.aten.tril_indices (#3517) 2024-07-18 18:38:12 +05:30
pkapris-syrmia fde286f491
Implement lowering for torch.aten.hann_window.periodic (#3502) 2024-07-17 18:21:23 +05:30
pkapris-syrmia b59efc75f3
Implement lowering of torch.aten.atleast_1d (#3498)
This operator is necessary in order to implement torch.aten.vstack.
Which will be added in a future PR.
2024-07-17 18:20:30 +05:30
Arham Khan 574143448b
[E2E][ONNX] torch.multinomial (#3404)
This PR adds a conversion in the TorchOnnxToTorch pass for the ONNX
Multinomial operation. It also adds a TorchToLinalg lowering for the
`aten.Multinomial` op and does a light refactor of some repeated code
that generates random floating point numbers in
`TorchToLinalg/Random.cpp`.
2024-07-16 23:09:39 +05:30
rohan-tan-bhowmik 0791a8860c
[Torch] Implements TorchToLinalg lowering of torch.ops.aten._weight_norm_interface (#3538)
Resolves https://github.com/nod-ai/SHARK-Turbine/issues/757.

Adds TorchToLinalg lowering for `Aten_WeightNormInterfaceOp`.

---------

Co-authored-by: Ubuntu <rbhowmik@RohanBhowmikVM.judsoscro3wupi0qm4bjlj5m3b.bx.internal.cloudapp.net>
2024-07-16 23:09:12 +05:30
Yuanqiang Liu 714270a922
[Stablehlo] legalize deprecated ops to stablehlo ops (#3543) 2024-07-17 00:05:11 +08:00
Xinyu Yang e5d1677894
[Torch] Eliminate getWithLeastStaticInformation in DecomposeAtenLinspaceOp and DecomposeAtenFakeQuantizePerTensorAffineOp (#3539)
as title
2024-07-15 10:02:36 +08:00
Yuanqiang Liu 5e4f00acb1
[Torch] add support for aten.scatter_add (#3534) 2024-07-12 09:15:42 +08:00
zjgarvey 0fb8b017d8
Adds misc fixes for some padding related issues (#3528)
This patch adds a few misc pad op related changes:

1. Addresses issue <https://github.com/llvm/torch-mlir/issues/3457>
2. Addresses issue <https://github.com/llvm/torch-mlir/issues/3442>
3. Fixes the padding order for asymmetrically padded onnx.Conv ops
4. Enables passing quantization through those onnx.Conv op pre-paddings
5. Modifies the torch-to-linalg lowering of AtenReplicationPad2d op to
enable support for input rank != 4

Unfortunately, even with all of these changes, the e2e tests for the
ReplicationPad2d still fail the onnx config, since the torch export
procedure for rearranging the pad order is complicated enough that the
padding ints end up not being able to fold back to constants.
2024-07-11 20:01:45 -05:00
Yuanqiang Liu b38585e077
[Torch Dialect] fix aten.nan_to_num's decomposition when inf=None (#3530)
also add shape infer in decomposition, see
https://github.com/llvm/torch-mlir/issues/3312
2024-07-11 08:46:40 +08:00
Ze Zhang d466d5b809
Register fake_quantize related ops (#3522)
Register `aten.fake_quantize_per_channel_affine` and
`aten.fake_quantize_per_tensor_affine.tensor_qparams` ops

---------

Co-authored-by: Ze Zhang <ze.zhang@getcruise.com>
2024-07-05 11:02:03 -07:00
Sagar Kulkarni 0fe74845da
[ONNX] Fix bug in ONNXToTorch PadOp's pads tensor rearrangement (#3485)
Fix the pad tensor rearrangement such that we change the representation
from [x1_begin, x2_begin, ..., x1_end, x2_end,...] to [xn_begin, xn_end,
...., x2_begin, x2_end, x1_begin, x1_end] where x1, x2 .. xn are the
dimensions of the pads tensor argument.

---------

Co-authored-by: zjgarvey <zjgarvey@gmail.com>
Co-authored-by: zjgarvey <47986913+zjgarvey@users.noreply.github.com>
2024-07-03 15:02:49 -05:00
Scott Todd ca0e906675
Fix `uint64_t` type. (#3519)
`u_int64_t` is nonstandard and does not exist in MSVC.
2024-07-02 16:06:20 +00:00
Yuanqiang Liu e2fbded49c
[Torch Dialect] improve argmax/argmin's decomposition to support keep… (#3514)
…dim=True when dim=None
2024-07-02 09:08:57 +08:00
Yuanqiang Liu 0e71a192d8
[Torch] support decomposition of aten.aminmax (#3513)
* unify decompisition of `aten.amax` and `aten.amin`
* support `aten.amax` with `dim=()`
2024-06-29 21:44:05 +08:00
Jiawei Wu f75cbb4df9
[torch dialect] emit aten.fmax/fmin and add decomposition patterns (#3510) 2024-06-29 00:07:55 +08:00
Phaneesh Barwaria 5a627c46b7
onnx.DFT basic support (#3463)
- adds support for DFT v20 on the FFT and IFFT path
- adds required skeleton code for IFFT ops to be recognised in TMlir
2024-06-28 20:08:43 +05:30
Christopher McGirr 7e6d76e997
[Torch] Fix torch.constant.int operation parsing (#3476)
Due to the custom operation parser, the print and parser were expecting
two different forms.

One having the dictionary before the value and the other after.
Following the format of the other constants ops, the constant.int will
follow the `value attr-dict` format. Updated the parser accordingly.
2024-06-28 16:06:52 +02:00
Aart Bik 1f73895f93
[torch-mlir] bump to llvm/llvm-project@9b78ddf3b2 (#3491)
This bump triggered an upstream assert. Includes a WAR for #3506.

Also includes several things I needed to do to repro:

* When TORCH_MLIR_TEST_CONCURRENCY=1, test runs will be printed.
* Added TORCH_MLIR_TEST_VERBOSE=1 handling to enable verbose mode
(useful on CI).

---------

Co-authored-by: Stella Laurenzo <stellaraccident@gmail.com>
2024-06-27 19:28:02 -07:00
Matthias Gehre 6678e1a256
TorchToLinalg: Try folding shape computations to keep static shapes when possible (#3475)
Before this PR, a statically shaped aten.convolution would generate
dynamically shaped linalg IR, and even `-canonicalize` would not be able
to fold it back into static shapes. This PR ensure that shape
calculations are folded on construction to directly generate statically
shaped linalg IR.

We achieve that by ensuring that `arith` ops involved in computing
shapes are created via `createOrFold`, so that later uses of
`getAsOpFoldResult` see constants instead of those ops.

For example
```
module {
  func.func @forward(%arg0: !torch.vtensor<[32,336,112,112],f32>,
                        %arg1: !torch.vtensor<[336,168,3,3],f32>, 
                        %arg2: !torch.vtensor<[336],f32>) 
                        -> !torch.vtensor<[32,336,56,56],f32> {
    %false = torch.constant.bool false
    %int2 = torch.constant.int 2
    %int1 = torch.constant.int 1
    %0 = torch.prim.ListConstruct %int1, %int1 : (!torch.int, !torch.int) -> !torch.list<int>
    %1 = torch.prim.ListConstruct %int2, %int2 : (!torch.int, !torch.int) -> !torch.list<int>
    %2 = torch.prim.ListConstruct  : () -> !torch.list<int>
    %3 = torch.aten.convolution %arg0, %arg1, %arg2, %1, %0, %0, %false, %2, %int2 
    : !torch.vtensor<[32,336,112,112],f32>, !torch.vtensor<[336,168,3,3],f32>, !torch.vtensor<[336],f32>, !torch.list<int>,
      !torch.list<int>, !torch.list<int>, !torch.bool, !torch.list<int>, !torch.int
   -> !torch.vtensor<[32,336,56,56],f32>
    return %3 : !torch.vtensor<[32,336,56,56],f32>
  }
}
```
would result in
```
[...]
  %padded = tensor.pad %2 low[%14, %15, %16, %17] high[%14, %15, %16, %17] {
    ^bb0(%arg3: index, %arg4: index, %arg5: index, %arg6: index):
      tensor.yield %cst : f32
    } : tensor<32x336x112x112xf32> to tensor<?x?x?x?xf32>
[...]
  %45 = linalg.conv_2d_ngchw_gfchw {dilations = dense<1> : vector<2xi64>, strides = dense<2> : vector<2xi64>}
    ins(%expanded, %expanded_37 : tensor<?x2x?x?x?xf32>, tensor<2x168x168x3x3xf32>)
    outs(%expanded_44 : tensor<32x2x168x?x?xf32>) -> tensor<32x2x168x?x?xf32>
[...]
```
and with this PR all shapes are static.
2024-06-27 08:43:10 +02:00
zjgarvey d2bc70f188
[TorchToLinalg][ONNX] Add Basic Determinant Support (#3481)
This adds support for a few ops:

- torch.linalg_det
- torch._linalg_det (if the LU and pivot returns are unused)
- onnx.Det

An scf loop is used, since the row reduction algorithm applied here has
some loop-carried dependencies.
The current support being added here is very basic, and only works if no
permutations are required during row reduction, and assumes the matrices
are non-singular.
2024-06-25 13:34:19 -05:00
zjgarvey 368fabf0c1
[ONNX] Basic Support for DeformConv (#3469)
This adds a torchvision op to torch-mlir and a path from onnx.DeformConv
to torchvision.deform_conv2d.

I'm not implementing the torch->linalg lowering for the torchvision op
yet, but posting this PR to get feedback on some of the choices being
made here and to flesh out the onnx frontend a bit.
2024-06-25 12:16:51 -05:00
zjgarvey e346c911f7
[ONNX] Add basic support for RoiAlign (#3493)
This adds an onnx->torch conversion for onnx.RoiAlign into
torchvision.roi_align or torchvision.roi_pool, and adds those two
torchvision ops to torch-mlir.
2024-06-25 11:02:45 -05:00
Vinayak Dev 02340408b7
[torch] Add OnnxToTorch lowering for Onnx.STFT op (#3492)
Adds OnnxToTorch lowering for `Onnx.STFT` op.
2024-06-25 19:00:45 +05:30
Branko Trifkovic 98c6971a01
Implement lowering of torch.aten.triu_indices (#3451)
Closes
[nod-ai/SHARK-Turbine/issues/709](https://github.com/nod-ai/SHARK-Turbine/issues/709)

---------

Co-authored-by: Branko Trifkovic <branko.trifkovic@syrmia.com>
2024-06-21 16:16:38 -07:00
Matthias Gehre acd57a3520
Support fake_quantize_per_tensor_affine_cachemask (#3477)
Add a new op with shape/dtypes and decompose into
`fake_quantize_per_tensor_affine` when the second result is unused.

The xfail_set change is on ONNX because torch cannot export this op to
ONNX.
2024-06-21 07:15:31 +00:00
zjgarvey 694210f429
[TorchToLinalg] Fix Quantized Convolution Accumulator Type (#3459)
1. truncates zero-points to i32
2. modifies the default accumulator type for i8 from i64 to i32. 
3. now uses the input dtype to infer accumulator dtype.
2024-06-20 13:54:20 -07:00
Xinyu Yang c7d52f63b4
[stablehlo] add aten::_int_mm lowering (#3474)
as title
2024-06-20 16:10:31 +08:00
Branko Trifkovic 676fa8cc09
Implement lowering of torch.aten.renorm (#3388)
Closes
[nod-ai/SHARK-Turbine/issues/689](https://github.com/nod-ai/SHARK-Turbine/issues/689)

---------

Co-authored-by: Branko Trifkovic <branko.trifkovic@syrmia.com>
2024-06-17 10:40:57 -07:00
ptrifunovic98 4555629246
Implement lowering of torch.aten.kthvalue (#3360)
Closes
[nod-ai/SHARK-Turbine#620](https://github.com/nod-ai/SHARK-Turbine/issues/620)
2024-06-15 11:18:39 +05:30
Arham Khan 09c988046c
[ONNX] Add OnnxToTorch lowering for Onnx.NegativeLogLikelihoodLoss Op (#3380)
This implements the Onnx.NegativeLogLikelihoodLoss op using the
signature provided
[here](https://onnx.ai/onnx/operators/onnx__NegativeLogLikelihoodLoss.html)
by replacing it with a `NLLLossForward` op.

Additionally, I included a helper function `get_loss_reduction_enum` to
convert from a string `reduction` parameter to the corresponding
intended integer value since this is an operation that will be reused
for any loss function module. This differs from `get_reduction_enum` in
`TorchUpstream.cpp` which handles the `reduce` parameter from
`scatter_reduce` type operations.
2024-06-14 22:01:11 +05:30
Xinyu Yang 6f94c7b0aa
[Torch] Add support for Meshgrid (#3462) 2024-06-14 23:59:08 +08:00
Vinayak Dev 39d882f7c9
[torch] Add OnnxToTorch lowering for the Col2Im op (#3424)
Adds OnnxToTorch lowering for the `onnx.Col2Im` op.
2024-06-13 08:42:06 +00:00
Lei Zhang 77d7f64472
Update to llvm/llvm-proect@27ac46e6be (2024-6-12) (#3454)
This would require to bump stablehlo at the same time.
2024-06-12 19:34:01 -07:00
zjgarvey de28c8540b
[ONNX] add int16 quantization support (#3446)
There is currently no int16 quantization support in torch. This patch
adds a new mlir type to correspond to the missing "torch.qint16" type,
and enables lowering of quantization-related onnx ops using int16 types.

In follow-up patches, custom quantization logic for ops like
aten.matmul/aten.mm/aten.convolution may need to be revisited to allow
support for qint16. The passes in FuseQuantizedOps.cpp may also need
slight modifications.
2024-06-12 10:37:22 +05:30
Yuanqiang Liu 689efc8917
[Torch] fix toBuiltinTensor() (#3415)
* Let `toBuiltinTensor()` reflects the original dtype of
`!torch.vtensor`.
* Backend handles dtype conversion themselves.
2024-06-08 09:36:32 +08:00
Rob Suderman 75af64fc12
[torch] Add support for f8 types for linalg conversion (#3436)
Linalg conversion requires mapping for f8 types
2024-06-07 13:59:38 -07:00
Sambhav Jain d0a818a03e
Representing Symbolic Shape Expressions in Torch Dialect (#3372)
Torch Dialect with symbolic shape expressions:
```ll
module {                                                                                                                                                                                                     
  func.func @main(%arg0: !torch.vtensor<[?,?,3],f32>, %arg1: !torch.vtensor<[?,?,3],f32>) -> !torch.vtensor<[?,?,3],f32> {                                                                                   
    %0 = torch.symbolic_int "s0" {min_val = 5, max_val = 10} : !torch.int                                                                                                                                    
    %1 = torch.symbolic_int "s1" {min_val = 0, max_val = 100} : !torch.int                                                                                                                                   
    %2 = torch.symbolic_int "s3" {min_val = 0, max_val = 50} : !torch.int                                                                                                                                    
    
    torch.bind_symbolic_shape %arg0, [%0, %1], #affine_map<()[s0, s1] -> (s0, s1, 3)> : !torch.vtensor<[?,?,3],f32>                                                                                          
    torch.bind_symbolic_shape %arg1, [%0, %2], #affine_map<()[s0, s1] -> (s0, s1, 3)> : !torch.vtensor<[?,?,3],f32>                                                                                          
    
    %3 = torch.aten.tanh %arg0 : !torch.vtensor<[?,?,3],f32> -> !torch.vtensor<[?,?,3],f32>                                                                                                                  
    torch.bind_symbolic_shape %3, [%0, %1], #affine_map<()[s0, s1] -> (s0, s1, 3)> : !torch.vtensor<[?,?,3],f32>                                                                                             
    
    %4 = torch.aten.sigmoid %arg1 : !torch.vtensor<[?,?,3],f32> -> !torch.vtensor<[?,?,3],f32>                                                                                                               
    torch.bind_symbolic_shape %4, [%0, %2], #affine_map<()[s0, s1] -> (s0, s1, 3)> : !torch.vtensor<[?,?,3],f32>                                                                                             
    
    %5 = torch.prim.ListConstruct %3, %3, %4 : (!torch.vtensor<[?,?,3],f32>, !torch.vtensor<[?,?,3],f32>, !torch.vtensor<[?,?,3],f32>) -> !torch.list<vtensor>                                               
    %int1 = torch.constant.int 1                                                                                                                                                                             
    %6 = torch.aten.cat %5, %int1 : !torch.list<vtensor>, !torch.int -> !torch.vtensor<[?,?,3],f32>                                                                                                          
    torch.bind_symbolic_shape %6, [%0, %1, %2], #affine_map<()[s0, s1, s2] -> (s0, s1 * 2 + s2, 3)> : !torch.vtensor<[?,?,3],f32>                                                                            
    
    return %6 : !torch.vtensor<[?,?,3],f32>                                                                                                                                                                  
  }                                                                                                                                                                                                          
}              
```

For reference, this is the TorchDynamo exported program with symbolic
shape expressions that the above Torch dialect program is imported from:
```py
ExportedProgram:                                                                                                                                                                                             
    class GraphModule(torch.nn.Module):                                                                                                                                                                      
        def forward(self, x: "f32[s0, s1, 3]", y: "f32[s0, s3, 3]"):                                                                                                                                         
            # File: /home/sambhav.jain/workspaces/cruise/src/3p/torch-mlir/test/python/fx_importer/symbolic_shape_expr_test.py:31 in forward, code: a = torch.tanh(x)                                        
            tanh: "f32[s0, s1, 3]" = torch.ops.aten.tanh.default(x);  x = None                                                                                                                               
                                                                                                                                                                                                             
            # File: /home/sambhav.jain/workspaces/cruise/src/3p/torch-mlir/test/python/fx_importer/symbolic_shape_expr_test.py:32 in forward, code: b = torch.sigmoid(y)                                     
            sigmoid: "f32[s0, s3, 3]" = torch.ops.aten.sigmoid.default(y);  y = None                                                                                                                         
                                                                                                                                                                                                             
            # File: /home/sambhav.jain/workspaces/cruise/src/3p/torch-mlir/test/python/fx_importer/symbolic_shape_expr_test.py:33 in forward, code: return torch.cat((a, a, b), dim=1)                       
            cat: "f32[s0, 2*s1 + s3, 3]" = torch.ops.aten.cat.default([tanh, tanh, sigmoid], 1);  tanh = sigmoid = None                                                                                      
            return (cat,)                                                                                                                                                                                    
                                                                                                                                                                                                             
Graph signature: ExportGraphSignature(input_specs=[InputSpec(kind=<InputKind.USER_INPUT: 1>, arg=TensorArgument(name='x'), target=None, persistent=None), InputSpec(kind=<InputKind.USER_INPUT: 1>, arg=TensorArgument(name='y'), target=None, persistent=None)], output_specs=[OutputSpec(kind=<OutputKind.USER_OUTPUT: 1>, arg=TensorArgument(name='cat'), target=None)])                                               
Range constraints: {s0: ValueRanges(lower=5, upper=10, is_bool=False), s1: ValueRanges(lower=0, upper=100, is_bool=False), s3: ValueRanges(lower=0, upper=50, is_bool=False)} 
```

Huge credit to @stellaraccident for the inputs that helped evaluate the
various design options and arrive at the representation of choice.


- [x] Op definitions for symbolic_int and bind_symbolic_shape ops
- [x] fx_importer updates to import range constraints + create
symbolic_int ops
- [x] fx_importer changes for AffineMapAttr building + adding
bind_symbolic_shape ops
- [x] custom printer/parser for inlined AffineMap expressions in mlir
assembly
- [x] Dialect lit test
- [x] fx_importer python lit tests
- [ ] Cleanup pass to remove these ops (can add in a follow-on)
2024-06-07 04:04:03 -07:00
Vivek Khandelwal 72837fbb3d
build: manually update PyTorch version (#3340)
Set PyTorch and TorchVision version to nightly release 2024-05-14.

Signed-Off By: Vivek Khandelwal <vivekkhandelwal1424@gmail.com>
2024-06-06 22:23:40 +05:30
Vivek Khandelwal 661be2d5b0
[MLIR][Torch] Add TorchToLinalg lowering for AtenAvgPool3dOp (#3030)
This commit also fixes the average pool op' test failing for
OnnxToLinalg lowering.

Signed-Off By: Vivek Khandelwal <vivekkhandelwal1424@gmail.com>
2024-06-04 22:12:34 +05:30
Yuanqiang Liu 50f7103098
[Stablehlo] support uint8 (#3367)
Support lowering unsigned integer type to stablehlo as discussed in
https://github.com/llvm/torch-mlir/pull/2184.

The things I do in this PR:
1. create `setupBackendTypeConversionForStablehlo()`,
`createFuncBackendTypeConversionForStablehloPass` and
`createFinalizingBackendTypeConversionForStablehloPass`.
2. remove `InferTypeOpInterface` from `torch_c.to_builtin_tensor`,
because it's different result type between linalg backend and stablehlo
backend:
```
// linalg backend
func.func @forward(%arg0: !torch.vtensor<[3],ui8>) -> tensor<3xf32> {
    %c = torch_c.to_builtin_tensor %arg0 : (!torch.vtensor<[3], ui8> -> tensor<3xi8>
    %0 = tensor.empty() : tensor<3xf32>
    %1 = linalg.generic {indexing_maps = [#map, #map], iterator_types = ["parallel"]} ins(%arg0 : tensor<3xi8>) outs(%0 : tensor<3xf32>) {
    ^bb0(%in: i8, %out: f32):
      %2 = arith.uitofp %in : i8 to f32
      linalg.yield %2 : f32
    } -> tensor<3xf32>
    return %1 : tensor<3xf32>
}
// stablehlo backend
func.func @forward(%arg0: !torch.vtensor<[3],ui8>) -> tensor<3xf32> {
    %c = torch_c.to_builtin_tensor %arg0 : (!torch.vtensor<[3], ui8> -> tensor<3xui8>
    %0 = stablehlo.convert %arg0 : (tensor<3xui8> -> tensor<3xf32>
    return %0 : tensor<3xf32>
}
```
3. fix stablehlo and linalg's conversion
2024-06-04 09:04:59 +08:00
zjgarvey 8995c90879
[TorchToLinalg] add support for quantized group conv (#3341)
This addresses 7 of the model failures I'm seeing in the test suite. See
[Shark-Turbine issue
#566](https://github.com/nod-ai/SHARK-Turbine/issues/566).

Need the op ```linalg.conv_2d_ngchw_gfchw_q``` to be added upstream
before merging this. See [llvm-project PR #92136
](https://github.com/llvm/llvm-project/pull/92136).

A small additional expansion to operand quantization is included in this
patch to address a model failure that occurs when unblocking the
quantized group convolutions in one of these onnx models.
2024-06-03 21:57:44 +05:30
Vivek Khandelwal 6382dbbcc0
[ONNX] Add OnnxToTorch lowering for SpaceToDepth op (#3393)
Signed-Off By: Vivek Khandelwal <vivekkhandelwal1424@gmail.com>
2024-06-03 20:29:39 +05:30
Xinyu Yang 285b087a5d
[Torch] Emit rrelu and decompose it (#3250)
as title
2024-06-03 19:25:52 +08:00
Xinyu Yang 267052df2a
[Torch] decompose AtenLerpTensorOp (#3251)
as title
2024-06-03 15:25:09 +08:00
Xinyu Yang 23b53050de
[Torch]Support conv_transpose1d and conv_transpose3d (#3286)
1. Support conv_transpose1d and conv_transpose3d
2. Fix bugs of convertTransposedConv func in
lib/Conversion/TorchToStablehlo/Linear.cpp
2024-06-03 15:11:12 +08:00
Rob Suderman afca88a058
[NFC] Change to *cast instead of .*cast variants (#3405)
Member casts have been deprecated. Changing over a bunch of the member
cast calls to the global templated variants to remove deprecation
warnings.
2024-05-30 23:45:13 -07:00
Yuanqiang Liu 4e05e2cd1e
[Torch] support recompose of aten.split.with_sizes and aten.tensor_sp… (#3401)
…lit.sections

* support recompose to aten.split.with_sizes and
aten.tensor_split.sections
* fix recompose of aten.chunk
2024-05-31 09:56:47 +08:00
zjgarvey 074098d20c
Modifies onnx resize lowering to fix numerical issues (#3381)
Updates:

- some unsupported modes are now going to report a match failure for
unsupported coordinate transformation modes.
- fixes a bug that was introduced in the last patch for resize (my
bad...)
- uses actual x and y coordinates for computing weights in bilinear
interpolation (rather than eps modified values)
- slightly simplifies the bilinear interpolation payload for readability
and performance
- passes coordinate transformation mode information from an onnx.Resize
op to the mode string for the aten._interpolate op. This allows us to
perform custom logic in the torch->linalg lowering to support
onnx.Resize options without losing the default behaviors of the
interpolate op.
2024-05-30 20:34:37 -04:00
penguin_wwy 1f544c37d0
[NFC] Remove unused header files (#3386) 2024-05-30 14:30:36 +08:00
Yuanqiang Liu e0a5adb1db
[Torch] fix aten.linear's decomposition (#3391)
* support aten.linear with more rank.
2024-05-27 15:49:50 +08:00
Yuanqiang Liu 5bb1a65ec9
[Stablehlo] refactor reduction lowering and support aten.amin (#3383)
* implement detailed lowering template pattern
`ConvertAtenReduceAllDimsOp` and `ConvertAtenReduceKeepDimOp`
* support `aten.amin`'s lowering.
2024-05-23 20:40:20 +08:00
Angel Zhang 2e194e13d6
[Torch] Fix bugs for `Torch::AtenOneHotOp` (#3350)
This PR fixes the bugs for `Torch::AtenOneHotOp` by:

1) Using `Torch::kUnknownSize` as the default value for `numClasses` in
   the pattern matching stage in `DecomposeAtenOneHotOp`
2) Adding `AtenIntScalarOp` to the patterns in `TorchToArith`
3) Handling both `int` and `float` types for `off` and `on` values in
`TorchOnnxToTorch` conversion

It also includes:

1) A new test in `TorchToArith/basic.mlir`, for `torch.aten.Int.Scalar`,
and
2) A new test in `decompose-complex-ops.mlir`, for `torch.aten.one_hot`

**Dependencies**

This PR is dependent on #3334.
2024-05-22 17:19:08 +00:00
Xinyu Yang 4d7cdba4bf
[Torch] eliminate "getWithLeastStaticInformation" in DecomposeAtenTriuOp (#3330)
I am trying to eliminate 'getWithLeastStaticInformation' in
DecomposeAtenTriuOp. Could you provide me with some suggestions?
@qingyunqu @zjgarvey 
See issue https://github.com/llvm/torch-mlir/issues/3312
2024-05-22 23:16:57 +08:00
Sambhav Jain 6e485574e5
[Pipeline] Use dedicated simplification pipeline for TorchDynamo frontend (#3376)
Discord Thread:
https://discord.com/channels/636084430946959380/1238330633328005243

## Context: 

[This](https://github.com/llvm/torch-mlir/blob/main/python/torch_mlir/fx.py#L61)
was updated to support e2e tests for the TorchDynamo frontend in
Torch-MLIR, where we run FX decompositions and import the FX IR to
generate Torch dialect, followed by
`torch-function-to-torch-backend-pipeline`, skipping only the shape/type
refinement for now. However, we should be able to skip many of the torch
simplification passes, as depicted in the [frontend
roadmap](https://github.com/llvm/torch-mlir/blob/main/docs/images/roadmap_frontend.png).

Based on IREE's TorchDynamo
[pipeline](https://github.com/iree-org/iree/blob/main/compiler/plugins/input/Torch/InputConversion/Passes.cpp#L29),
the only two passes we seem to require are: `ReduceOpVariantsPass` and
`DecomposeComplexOpsPass`. This is inline with our findings as well
based on initial exploration.

This PR creates a dedicated frontend simplification pipeline for
TorchDynamo / FX Importer which calls only `ReduceOpVariantsPass` and
`DecomposeComplexOpsPass`. We rely on the e2e fx_importer tests to
ensure we're not regressing by removing many of the passes that were
historically needed for TorchScript.

One notable change here is that we do not call the
`LowerToBackendContractPass` anymore, which used to call
`TorchSimplificationPipeline` iteratively until VerifyBackendContract
was clean. Some of this was required for the shape/type refinement to
converge, which seems a non-issue for Dynamo frontend. Do we anticipate
this (the iterative invocation of TorchSimplificationPipeline followed
by VerifyBackendContract) to be worth retaining in the Dynamo frontend
pipeline? If so, I can make those changes, PLMK.
2024-05-22 05:23:18 -07:00
Yuanqiang Liu 8814d0ae64
[Torch] emit aten.dot and canonicalize it to aten.matmul (#3361)
* canonicalize `aten.dot` to `aten.matmul`
2024-05-18 22:45:14 +08:00
Xinyu Yang 7faba75696
[Torch] Decompose AtenMaskedScatterOp (#3353)
Co-authored-by: Yuanqiang Liu <liuyuanqiang.yqliu@bytedance.com>
2024-05-16 15:27:25 +08:00
Xinyu Yang a9edefb3cf
[Torch] Fix AtenSliceTensorOp::fold (#3345) 2024-05-16 11:42:43 +08:00
penguin_wwy 405f884522
[stablehlo] verify stablehlo backend contract (#3338) 2024-05-16 11:03:43 +08:00
Peiming Liu ccb772cd0f
[sparse] propagate sparsity properly when decompose torch operations. (#3318) 2024-05-15 10:09:27 -07:00
Aaron St George ba32b9cee7
Don't fold `aten.clone` if result isn't same type as input (#3347)
Similar to https://github.com/llvm/torch-mlir/pull/2824, we were seeing
some assertion failures after the addition checks around folders were
tightened up in LLVM: https://github.com/llvm/llvm-project/pull/75887 .
This PR essentially moves the logic that used to be applied at the LLVM
level into the folder, which seems to be the suggested fix.
2024-05-16 00:07:45 +08:00
Xinyu Yang 6b95dd461d
[Torch] Fix PrimNumToTensorScalarOp::fold (#3339)
In constant folding progress, a new constant op will be created
according to the origin op's result type.

See the code in TorchDialect.cpp.

```cpp
Operation *TorchDialect::materializeConstant(OpBuilder &builder,
                                             Attribute value, Type type,
                                             Location loc) {
  if (auto integerType = dyn_cast<Torch::IntType>(type))
    return builder.create<Torch::ConstantIntOp>(loc, cast<IntegerAttr>(value));

  if (auto floatType = dyn_cast<Torch::FloatType>(type))
    return builder.create<Torch::ConstantFloatOp>(loc, cast<FloatAttr>(value));

  if (auto numberType = dyn_cast<Torch::NumberType>(type)) {
    if (auto floatValue = dyn_cast<mlir::FloatAttr>(value)) {
      return builder.create<Torch::ConstantNumberOp>(loc, floatValue);
    } else if (auto intValue = dyn_cast<mlir::IntegerAttr>(value)) {
      return builder.create<Torch::ConstantNumberOp>(loc, intValue);
    }
  }

  if (isa<Torch::BoolType>(type)) {
    return builder.create<Torch::ConstantBoolOp>(loc, cast<IntegerAttr>(value));
  }

  if (isa<Torch::NoneType>(type))
    return builder.create<ConstantNoneOp>(loc);

  if (auto stringAttr = dyn_cast<StringAttr>(value))
    return builder.create<ConstantStrOp>(loc, stringAttr);

  if (auto elementsAttr = dyn_cast<ElementsAttr>(value)) {
    // Only !torch.vtensor can be constant folded. !torch.tensor has
    // non-trivial aliasing semantics which prevent deduplicating it.
    assert(isa<ValueTensorType>(type) && "should be a vtensor type!");
    return builder.create<ValueTensorLiteralOp>(loc, elementsAttr);
  }

  return nullptr;
}
```
So when the op has a tensor result type, it must be "ValueTensorType"
due to the **assert** statement. However, many fold methods in
TorchOps.cpp only have a judgment of "BaseTensorType".
2024-05-15 20:54:19 +08:00
zjgarvey 911e723581
Expands Q Commuting Ops (#3332)
After running the model tests in SHARK-TestSuite, I noticed a few model
failures due to half-fusion.

Notably, RDN_pytorch_vaiq_int8 had a depth=5 convolution chain with
multiple AtenViewOp's.
2024-05-13 11:01:53 -07:00
zjgarvey 75d1d72059
Generalize Operand Quantization in FuseQuantizeOps (#3327)
This change enables more customization with operand quantization, and
generalizes the patterns QuantizeOperands and QuantizeTransposeOperands
to QuantizeOperandsPastCommutingOps.

This allows for passing quantization through operations which are
functionally unaffected by quantization, such as view-like ops. The
purpose of this change is to address a myriad of quantization issues
seen in quantized onnx models that have some reshape-like operations
sandwiched in between a dequant and something like a matmul (whose other
operand is immediately quantizable).
2024-05-12 20:49:59 -07:00
NeverRaR 1d4859699b
MaxPool1d lowering to linalg (#3295)
Co-authored-by: root <root@i32b01216.sqa.eu95>
2024-05-10 22:05:26 +05:30
penguin_wwy 64b59c7fc3
[FxImporter] Eliminate the dependency on the refinement pass (#3309) 2024-05-10 02:44:36 +08:00
aldesilv ec6d7aa5d2
OnnxToTorch lowering resize op (#3013)
https://github.com/nod-ai/SHARK-Turbine/issues/358
adds a lowering from onnx to linalg for bilinear and nearest resize with
support for using scales or sizes to get resize shape. uses coordinate
transform half pixel for bilinear mode and asymmetrical for nearest
mode. See
https://github.com/onnx/onnx/blob/main/docs/Operators.md#Resize. Added
two passes -- one for bilinear and the other for nearest.
2024-05-08 21:35:03 +00:00