Note that to enable folding of the code coming from an example
like the ConstantPad2dStaticModule e2e test, support for other
operations had to be added/improved:
- aten::neg.int
- aten::eq.float
- aten::eq.str
- prim::Uninitialized
This commit adds lowering of `aten.threshold` op
This commit adds lowering of `aten.threshold_backward` op
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
This involes the following 2 parts:
- Change refine type to propagate more static shape info.
- Get as much static shape info as possible when creating the result
tensor when converting to linalg.
This commit adds lowering of `aten.arange.start_step` op.
This commit decomposes `aten.arange` and `aten.arange.start` into
`aten.arange.start_step` op.
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
We only handle the expanding OR collapsing cases, we do not handle
expanding And collapsing happening at the same time or cases where
it's neither collapsing nor expanding like view of [2,3] for
3x2 tensor.
It's assumed that if a shape list element is got from
`aten.size(tensor, dim)` the corresponding dim is not splitted or
collapsed. This assumption makes it easier to deal with dynamic shapes.
- Added E2E support for `aten.eq.Tensor` and `aten.lt.Tensor` ops. Both
the operands are expected to be of the same type, i.e., type promotion
is not addressed as a part of this commit.
- Added E2E support for `aten.eq.Scalar` and `aten.lt.Scalar` ops.
Tensor operand type to Scalar operand type promotion has not been
handled in this commit.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
The existing implementation of `ConvertConstantTensorAllocOp<>` requires
a C++17 feature `if constexpr ()`. This commit removes the use of that
feature to support the implementation even for lower C++ versions.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
Add the required lowerings and correct test cases.
These op produce zero-d tensors and it was incorrectly mentioned in
refine types to produce 1d tensor of size 1.
- Templatize `aten.zeros` and `aten.ones` ops lowering.
- Add E2E support for `aten.empty` op.
- Add Integer type support in `aten.mul.Scalar` op lowering.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
`aten.gt.Tensor` op has been added in torch dialect and the
lowering of the op has been done to the linalg dialect.
Signed-off-by: Prashant Kumar <prashant@nod-labs.com>
This commit adds support for aten.native_layer_norm operation. Here
the previous code for aten.layer_norm is tweaked a little bit to
accomodate both mean and variance values alongwith the layer norm
value. This commit also adds decomposition of aten.layer_norm into
aten.native_layer_norm, which was previously getting lowered directly
to linalg.
Signed-Off-By: Prateek Gupta<prateek@nod-labs.com>
This commit adds lowering of `aten.squeeze.dim` op into
`linalg.TensorCollapseShape` op. Here, the dim(th) dimension of the
input tensor is not supposed to be dynamic.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
This commit adds lowering of `aten.gt.Scalar` and `aten.where.self` as a
part of element-wise ops lowering.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
The op lowering has been added as a part of `torch-lower-to-linalg`
pass. This takes care of ignore_index but the weight and reduction
operand is still to be accounted for.
Signed-off-by: Prashant Kumar <prashant@nod-labs.com>
- Supports variants with multiple dims, one dim, all dime
- Leverages legalize_common and legalize_utils code from
TensorFlow-TOSA work
Signed-off-by: Suraj Sudhir <suraj.sudhir@arm.com>
Many reduction ops take as an argument an optional output dtype that
can change the type of the input tensor before the reduction is
performed. This commit adds support for the optional dtype flag that
had been previously ignored.
Test:
/tools/torchscript_e2e_test.sh -f 'ReduceSumDtype'
/tools/torchscript_e2e_test.sh -f 'ReduceSumDImIntListDtype'
This commit adds lowering of `aten.Squeeze` op into
`linalg.TensorCollapseShape` op. The size 1 dynamic dimensions are not
handled as a part of this commit.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
The lowering of aten.fill.Scalar has been added.
The changes have been made as a part of -torch-convert-to-linalg pass.
Signed-off-by: Prashant Kumar <prashant@nod-labs.com>
This commit adds lowering of `aten.mul.Scalar` and also adds
decomposition of `aten.addmm` to `aten.mul.Scalar`, `aten.add.Tensor`
and `aten.mm` ops.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
Now, aten::linear supports rank 3 inputs. This is a fix
for upcoming bert-inference task. The correct way should be
to support broadcasting in `aten.matmul` op and decompose
`aten.linear` into right ops.
This commit adds new operation `aten.gelu_backward` in the aten
dialect and adds lowering of this operation from aten to linalg.
Signed-Off-By: Prateek Gupta <prateek@nod-labs.com>
- Remove use of conversion construction macros
- Add mul and div op conversions
- Add corresponding tests
Signed-off-by: Suraj Sudhir <suraj.sudhir@arm.com>
This is to facilitate scalar type conversion in the TorchToLinalg. As
part of adding the helper, this PR also:
- Updated `AtenAddTensorOp`, `AtenSubTensorOp` to use the helpers to
support more type variants.
- Added e2e type promotion testing.
- Added i32 memref return/arg type to support e2e testing.
The lowering of `aten.Int.Tensor` op has been added.
The changes has been made as a part of `convert-torch-to-linalg` pass.
Signed-off-by: Prashant Kumar <prashant@nod-labs.com>
- This commit adds lowering of `aten.View` to `linalg.TensorExpandShape`.
- This lowering will be successful only when one or more static
dimensions are expanded.
- It also fixes a typo in `ConvertAtenFlattenUsingIntsOp` conversion
pattern.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
The types have different levels of categories: where
complex > floating > integral > boolean (> means left hand
side has higher category).
The operands have different levels of priorities where:
dimensioned tensor > 0-dim tensor > scalar == wrapped 0-dim tensor.
This is represented by the `ResultTypeState.dimResult`,
`ResultTypeState.zeroResult` and `ResultTypeState..wrappedResult` in
the source code.
For operands of the same priorities, the result type should be the
highest categories with sufficient width to hold all operands.
By default, only the highest priority operands participate in the type
promotion logic. Lower priority operands participate if they are in
a higher category than any higher priority operands.
For example, <[],f32> (lower priority) and <[1], si64> tensor would
result in <[?],f32> tensor because floating > integeral. Another example
<[],f64> (lower priority) and <[1], f32> tensor would result in
<[?], f32> tensor because f32 and f64 are the same category.
The ScalarType enum definition, type promotion table, ResultTypeState
struct definition and some helpers are copied from
aten/src/ATen/native/TypeProperties.*
Other references:
- https://pytorch.org/docs/stable/tensor_attributes.html#type-promotion-doc
- https://github.com/pytorch/pytorch/issues/9515
Other minor changes:
1. Fix `visitExpandLikeOp` to consider cases where the given sizes list
size is larger than the input rank.
2. Add back the somehow deleted `torch.aten.softmax.int` tests in
decompose-complex-ops.mlir.
Part of #380
Also
- BoolType is not considered as Scalar
- e2e framework fixes for nan handling
- `tu.rand(..., low=, high=)` support
- delete unused variable (fix warning)
- Add IouOfModule from #380 to e2e test suite (this is a common
calculation in vision models)
Your branch is ahead of 'origin/main' by 1 commit.
Lowering of `aten.matmul` op is added from torch to linalg dialect.
The different cases correspond to
https://pytorch.org/docs/stable/generated/torch.matmul.html.
TODO: Broadcasting in case of batch-matmul is yet to be taken care of.
Signed-off-by: Prashant Kumar <prashant@nod-labs.com>
* Print more exception info on error during test execution
* Fix formatting
* Add aten::gelu lowering
Co-authored-by: Boian Petkantchin <boian@nod-labs.com>
Summary:
This commit fixes an off-by-one error in how negative dimensiosn were
being handled in the lowering of transpose. This commit also adds
tests to transpose and unsqueeze to test negative dimensions.
- Added a DecomposeComplexOps pass to decompose complex torchOps.
- Refactored `visitAtenArgmaxOp` and `visitAtenAnyDimOp` to
`visitReductionAlongDimIntOp`.
- Moved some helper functions into
torch-mlir/Dialect/Torch/Utils/Utils.h to be shared by multiple files.
- Added support for f64 tensor as argument and return types.
We lower through linalg-on-tensors and use RefBackend to run it.
This adds enough support for a "tanh" op. Adding more ops should be
fairly mechanical now that things are wired up. Run with:
```
./tools/torchscript_e2e_test.sh -c tosa
```
The backend structure is very similar to linalg-on-tensors based E2E
backends and is a nice parallel (see `tosa_backend.py`). Actually, this
forced a nice refactoring to the layering here. We removed
`torchscript-module-to-linalg-on-tensors-backend-pipeline` and instead
require separately running
```
torchscript-function-to-torch-backend-pipeline,torch-backend-to-linalg-on-tensors-backend-pipeline
```
This highlights the step that lowers to the "torch backend contract"
of cleaned up `torch` dialect ops is a critical step in the lowering.
Going forward, that is the key load-bearing contract of the torch-mlir
project, not the linalg-on-tensors backend contract.
Recommended review order:
- `TorchToTosa.cpp` / `TorchToTosa/basic.mlir`
- `python/torch_mlir_e2e_test/torchscript/configs/tosa_backend.py` and
the new `utils.py` file there.
- `python/torch_mlir_e2e_test/tosa_backends/linalg_on_tensors.py` and
`abc.py` in that directory for the TOSA backend e2e interface.
- other misc mechanical changes
This commit (with approval from all contributors) dual licenses
the torch-mlir project under both the standard LLVM license and the
standard PyTorch license. This will facilitate moving code between
torch-mlir and the two upstream projects.
The standard file comment is now:
```
// This file is licensed under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
// Also available under a BSD-style license. See LICENSE.
```
See `LICENSE` in the project root for the terms of both licenses.
Also contains the following changes:
- Remove derefineOp canonicalizer because it's not safe.
- Support for optional tensor and list tensors in reduceOpVariant. This
only works for some special detected and easy to handle cases. For list,
it covers the case list is got from a `ListConstruct`. For optional, it
covers the case optional is constructed from a `DerefineOp`.
- Remove the `inferReturnTypes` for `FromBuiltinTensorOp` because it's
not safe to deduce types from the input. For example, a built-in tensor
of i8 could be converted to si8 or ui8. It's better to let the user
specify the return type explicitly.
A few remain in examples/docs that will be naturally be updated in due
time.
This regresses the list support and the general direction of more widely
supported control flow, lists/dicts/globals that we were going for with
the TorchScript path. The idea is that we are deferring that work to
make torch-mlir a very clean standalone thing. We will reboot it,
probably using some of the tools of iree_pydm to make it simpler, and in
a more natural place (such as an iree-torch repo that depends on IREE and
torch-mlir to build a working PyTorch frontend solution for IREE -- it
was really weird that npcomp depended on IREE).
This moves the bulk of the Python code (including the Torch interop)
from `frontends/pytorch` into `torch-mlir/TorchPlugin`. This also
required reconciling a bunch of other Python-related stuff, like the
`torch` dialects.
As I did this, it was simpler to just remove all the old numpy/basicpy
stuff because we were going to delete it anyway and it was faster than
debugging an intermediate state that would only last O(days) anyway.
torch-mlir has two top-level python packages (built into the
`python_packages` directory):
- `torch_mlir_dialects`: `torch` dialect Python bindings (does not
depend on PyTorch). This also involves building the aggregate CAPI for
`torch-mlir`.
- `torch_mlir`: bindings to the part of the code that links against
PyTorch (or C++ code that transitively does).
Additionally, there remain two more Python packages in npcomp (but
outside `torch-mlir`):
- `npcomp_torch`: Contains the e2e test framework and testing configs
that plug into RefBackend and IREE.
- `npcomp_core`: Contains the low-level interfaces to RefBackend and
IREE that `npcomp_torch` uses, along with its own
`MLIR_PYTHON_PACKAGE_PREFIX=npcomp.` aggregation of the core MLIR
python bindings. (all other functionality has been stripped out)
After all the basicpy/numpy deletions, the `npcomp` C++ code is now very
tiny. It basically just contains RefBackend and the `TorchConversion`
dialect/passes (e.g. `TorchToLinalg.cpp`).
Correspondingly, there are now 4 main testing targets paralleling the
Python layering (which is reflective of the deeper underlying dependency
structure)
- `check-torch-mlir`: checks the `torch-mlir` pure MLIR C++ code.
- `check-torch-mlir-plugin`: checks the code in `TorchPlugin` (e.g.
TorchScript import)
- `check-frontends-pytorch`: Checks the little code we have in
`frontends/pytorch` -- mainly things related to the e2e framework
itself.
- `check-npcomp`: Checks the pure MLIR C++ code inside npcomp.
There is a target `check-npcomp-all` that runs all of them.
The `torch-mlir/build_standalone.sh` script does a standalone build of
`torch-mlir`.
The e2e tests (`tools/torchscript_e2e_test.sh`) are working too.
The update_torch_ods script now lives in
`torch-mlir/build_tools/update_torch_ods.sh` and expects a standalone
build.
This change also required a fix upstream related to cross-shlib Python
dependencies, so we also update llvm-project to
8dca953dd39c0cd8c80decbeb38753f58a4de580 to get
https://reviews.llvm.org/D109776 (no other fixes were needed for the
integrate, thankfully).
This completes most of the large source code changes. Next will be
bringing the CI/packaging/examples back to life.
This creates the `external/torch-mlir` directory as an
LLVM_EXTERNAL_PROJECTS-compatible project (analogous to
`iree-dialects`) and completes movement/rename of all pure MLIR C/C++
compiler code into there. The next step will be to move all the Python
code / code that links/includes PyTorch C++ code (which currently lives
in `frontends/pytorch`) into a subdirectory here.
I call this "earthmoving" because it is mostly mechanical changes and
renames. As a quick summary (we can change this down the road easily)
- C++ `mlir::NPCOMP::Torch -> mlir::torch::Torch`
- CAPI `npcompTorchListTypeGet -> torchMlirTorchListTypeGet`
- preprocessor `#ifndef NPCOMP_ -> #ifndef TORCHMLIR_`
- CMake `NPCOMPFoo -> TorchMLIRFoo`
The goal of this is to create a standalone project creating a center of
mass for entry into the MLIR ecosystem from PyTorch, suitable in scope
for eventual inclusion/ownership in PyTorch. The idea is that
`external/torch-mlir` will some day be pulled out into its own
repository, and then npcomp will simply pull it in as a submodule.
Layering-wise, what lives in `torch-mlir` lowers code from PyTorch
(currently TorchScript, but TorchFX or pytorch/xla-style tracing are
possible extensions) down to what we have been calling the "Torch
backend contract" which is cleaned up IR (inlining, simplifcation,
conversion to value tensors, ...) entirely in the `torch` dialect. This
is the branching off point for further lowering, of which npcomp takes
one opinion (outside `torch-mlir` of course!), namely the
`TorchConversion` dialect/transforms which lower to IR suitable for IREE
and other linalg-on-tensors based lower-level compilers.
Summary of changes:
- move `{include,lib,test}/Dialect/Torch` into `torch-mlir`
- move relevant parts of CAPI into `torch-mlir`.
- leave a few things related to the `torch-mlir` Python build commented
out, which should be resolved in a subsequent change.
We were not filling the `outs` with the neutral element of the
reduction, which resulted in reading uninitialized values (we were
getting lucky that sometimes the uninitialized buffers were all zero's).
Also,
- Slight tweak to error messages in the e2e framework.
This converts a basic list op (torch.prim.ListConstruct) to the IREE
dialect.
```
def forward(self, x: float):
return [x, x]
```
turns into:
```
builtin.func @forward(%arg0: !torch.float) -> !torch.list<!torch.float> {
%0 = torch.prim.ListConstruct %arg0, %arg0 : (!torch.float, !torch.float) -> !torch.list<!torch.float>
return %0 : !torch.list<!torch.float>
}
```
which turns into:
```
builtin.func @forward(%arg0: f64) -> !iree.list<f64> {
%c1 = constant 1 : index
%c0 = constant 0 : index
%c2 = constant 2 : index
%0 = iree.list.create %c2 : !iree.list<f64>
iree.list.set %0[%c0], %arg0 : !iree.list<f64>, f64
iree.list.set %0[%c1], %arg0 : !iree.list<f64>, f64
return %0 : !iree.list<f64>
}
```
As part of doing this, I realized that it was time to formalize the IR
form that we reach right before running TorchTo{Linalg,Std,...}. We now
call it the "Torch backend contract". We then lower the "Torch backend
contract" to the "npcomp backend contract", which involves the new
TorchConversion (`torch_c`) dialect, which holds ops that need to
operate on both the npcomp backend types (e.g. builtin tensors, i1, IREE
list, etc.) and the `!torch` types.
This made more sense, as I realized that if I didn't factor out
`torch_c` then the Torch dialect would have a dependency on IREE
dialect (we previously didn't notice this was an issue because we only
depended on `builtin` types), which seemed wrong to me.
Recommended review order:
- TorchToIREE.cpp / `TorchToIREE/basic.mlir`
- Look at the new structure of createTorchScriptToNpcompBackendPipeline.
It now lives in TorchConversion/Transforms/Passes.cpp and cleanly
calls into `Torch::createTorchScriptToTorchBackendPipeline` for the
frontend lowering to the Torch backend contract.
- Mechanical change extracting
`torch_c.{to,from}_{i1,i64,f64,builtin_tensor,iree_list}` into a new
TorchConversion dialect, and a few passes specific to the lowering
from the Torch backend contract to the npcomp backend contract.
- Minor fixes to TorchToLinalg.cpp to use unconverted operands (now that
we convert lists as part of operand materialization, we need to use
the original operands). Also added test for AtenMaxPool2dOp and fixed
m_TorchConstantIntList.
- TmpDeleteDeadIREELists pass. Temporary pass for deleting dead IREE lists that
are created as part of operand materialization for conv/max pool/avg pool ops
in TorchToLinalg.
- torch.aten.flatten.using_ints to linalg lowering
- torch.aten.max_pool2d to linalg lowering
- Support torch.aten.conv2d for more flexible dilation and strides values
These were legacy concepts that are now superceded by direct Torch to
linalg-on-tensors lowering. These were based on some very early thinking
related to the layering of frontends vs codegen, which is now obsolete
because:
- We expected a lot more centralization at the frontend (TCF) level. It
turns out that frontend needs really vary a lot, and there is no grand
unifying TCF dialect plausible. The additional layer isn't worth it.
- Linalg-on-tensors obsoletes the primary need for TCP. There are still
a few things not representable with linalg-on-tensors, but the support
is growing and the whole "not included in linalg-on-tensors" direction
needs to be rethought. Our TCP dialect didn't cover any of the
actually important things in this space (such as sort, FFT, top-k,
etc.).
See historical [slides](https://drive.google.com/file/d/1iljcpTQ5NPaMfGpoPDFml1XkYxjK_6A4/view) / [recording](https://drive.google.com/file/d/1jSPa8TwPKUt0WuLquGc8OgSUVYJHMvWZ/view)
for more details on the origin story here.
Their presence was confusing users too
[bug](https://github.com/llvm/mlir-npcomp/issues/248).
Also,
- Trim down npcomp-run-mlir testing. It was testing TCF to TCP
lowering for the most part. The essential stuff is retained and
rephrased with linalg-on-tensors. (we should probably rename it
"refback-run" or something, as it is just a way to invoke RefBackend)
- test/Python/Backend/RefJIT/simple_invoke_numpy.py is XFAIL'ed. Our
"anti-framework" direction seems to be the likely future path.
* Added additional *ToLLVM conversion patterns (they were disaggregated from standard).
* Misc renames.
* Spelling change on ConvNCHW op, and it now expects strides and dilations attributes.