This commit adds to the lowering of `aten.view` handling for the
following cases:
- `(..., a.size(i))` -> `(..., a.size(i), 1, ..., 1)`
- `(..., a.size(i), 1, ..., 1)` -> `(..., a.size(i))`
Fixes: https://github.com/llvm/torch-mlir/issues/2448
While trying to fix a bug in the `ConvertAtenViewOp` pattern in the
linalg backend, I realized that the pattern had become quite complex and
had accumulated some dead code, making it hard to reason about.
This commit simplifies the pattern quite a bit. The main changes are:
1. All the static helper functions in the `ConvertAtenViewOp` class have
been simplified, both in their signature and their body. Each one now
performs simple calculations on arrays, and take the least number of
arguments necessary.
2. The body of [the `while`
loop](9fce566b0c/lib/Conversion/TorchToLinalg/DataMovement.cpp (L407))
inside the main pattern has been changed to work on `MutableArrayRef`
slices, to avoid having to keep track of `start` and `end` indices for
the input and output shape arrays.
3. All the heuristics used to determine the mapping between the input
and output dimensions are now in [this relatively short `if-else`
section](9fce566b0c/lib/Conversion/TorchToLinalg/DataMovement.cpp (L428-L460)),
making it easy to see what is going on.
4. Dead code was eliminated + updates to some of the documentation
comments
This commit does not add any new functionality to the
`ConvertAtenViewOp` pattern.
* view_as_real test case, allow dtype in testutils.randn
* abstract python upstream func implemented
* fixed upstream dtype func, implemented view_as_real backend op
* formatted AtenViewAsRealOp, removed change in e2etest/framework
* removed test suit from reshape_like.py, because it's moved to basic.py
* implemented C-API wrapper for mlirComplexF128 type
* fixed torch.complex dtype width in MLIR and Torch MLIR, deleted float16 dtype dict
* Changed IR input of aten fft_fft unit test
* code refactored
* code refactored and fixed ci test
* refactored: removed white spaces, and rolled back to having both input/output affine expr
* refactored: deleted output affine expr to reduce redundancy
* xfail ltc backend
* removed ComplexImag and ComplexReal from torchdynamo xfail set
* copied and pasted from main branch as there's no change to be made in this file
* refactored abstract_interp_lib_gen.py
* refactored: torchtypes.td, formatted, removed commented out code
This commit updates the `llvm-project` and `mlir-hlo` submodules to
commits:
llvm-project: a3f2751f782f3cdc6ba4790488ec20163a40ac37
mlir-hlo: 97c7e4b4506c3a2441c923e592833f45da439009
Changes made:
- Rename `getSuccessorEntryOperands` with `getEntrySuccessorOperands`
and remove `operands` from
`getSuccessorRegions` (https://reviews.llvm.org/D157506)
- Make `TypeConverter` a `const` (https://reviews.llvm.org/D157601)
This commit adds the support for index.Tensor op when the index values
are negative. This commit wraps around the index values by checking
their values at run time.
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
check the return type of the division to figure out whether to use
the floating point implementation of a division or to use the integer.
the issue rose from the fact that the inputs are all integer but the
result was casted to floating point. The conversion then chose to
use the integer implementation of division which is not legal in tosa
when all the inputs get casted to floating point.
fix(TorchToLinalg): AtenDivScalarOp
upcast self operand as well if applicable, the self operand must also
be casted to float as it can be an integer.
* add support for mhlo
* Add Test for torch.ne
* fix torch.ne shape/add static test case
* add support for static torch.ne
---------
Co-authored-by: root <root@n31-177-039.byted.org>
When the user does not specify the `stride` value in 2d pooling ops,
`stride` is given the value of an empty list. However, the current
lowerings for pooling ops assumed that the `stride` operand would
always be a list of two ints, leading to crashes when that was not the
case. This commit fixes the crashes by setting the value of `stride`
to `kernel_size` when `stride` is the empty list, since this is the
default `stride` value specified in PyTorch docs. See:
https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html#torch.nn.MaxPool2d
The current decomposition for `aten.randn.generator` does not specify
the `dtype` argument of the empty tensors created to store the random
values. This leads to invalid IR when the output type of the `randn`
op is not the default PyTorch dtype.
-- In Python we have the concept of negative dimension indexing.
-- We would want to normalize such dimensions to be +ve and within the
expected range instead.
-- This commit takes care of a few remaining set of Ops and their
lowerings by applying `toPositiveDim` and `isValidDim` to the
extracted integer `dim` value.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
This commits adds the support for cases for index_put_op:
1.) where index is a 2-d tensor.
2.) where indices is a list of tensors and none, with exactly
2 non none tensors along the consecutive dimensions.
This commit also adds a utility to compute the broadcast shape
given the two input tensors.
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
This commit also adds the support for non-unit output padding in the
case of transposed convolution.
Signed-Off By: Vivek Khandelwal<vivek@nod-labs.com>
One of the potential values for a `torch_upstream::ScalarType` is
`Undefined`. This means that conversion of a `ScalarType` to another
type is a computation that can fail. To enforce handling of the
failure case, this commit makes the two helper functions that convert
`ScalarType`s into other types return `failure()` when the
`ScalarType` is `Undefined`.
This reverts commit eaab9be207, since it
is causing the post-merge CI tests to fail, causing subsequent PRs to be
blocked. Specifically, the tests
`ElementwiseAtenLogicalAndOpPromoteBroadcastModule_basic` and
`ElementwiseAtenLogicalXorOpPromoteBroadcastModule_basic` fail because
the oracle does not match the computed result. This patch reverts the
commit to make the post-merge builds green again.
Summary of changes:
- LLVM now includes <optional> instead of "llvm/ADT/Optional.h" in most
(although not all) places
(https://reviews.llvm.org/rG541ef3d61e9341cd38420c0dbca9250c4d0ea04c).
This patch replaces the affected instances of `llvm::Optional` with
`std::optional`.
- In the usages of llvm::Optional that remain, llvm::Optional::value()
is deprecated, so this patch replaces them with a dereference.
- Support for non-prefixed accessors has been removed. See:
https://reviews.llvm.org/D136727
- Rename `operands` to `methodOperands` in `prim.CallMethod` since the
name `operands` overlaps with a builtin method name. See:
https://reviews.llvm.org/D136727
- Add passes in refbackend to lower memref.subview. See:
https://reviews.llvm.org/D136377
- Replace `CopyToValueTensorOps` first in `RewriteViewLikeSubgraph` in
maximize-value-semantics.
The current implementation of the `RewriteViewLikeSubgraph` pass in
maximize-value-semantics creates temporarily invalid IR. In
particular, given a forward slice starting from a
`CopyToNonValueTensorOp` and ending in `CopyToValueTensorOp`s, the
pass first replaces all uses of the `CopyToNonValueTensorOp` with
its operand, which results in all the `CopyToValueTensorOp` users
having their operand have type `!torch.vtensor`, which is invalid.
The correct way to do things is to first replace all the
`CopyToValueTensorOp`s with their operand, and then replace all uses
of the `CopyToNonValueTensorOp` with its operand.
This only started failing now because the generated accessor
`getOperand` for the `CopyToValueTensorOp` now returns a
`TypedValue<NonValueTensorType>`, which has an assert checking that
the value returned is of the expected type.
This commit replaces the LCG algorithm that was being used by the
`TorchToLinalg` lowering of `AtenUniformOp` to generate random numbers
with the `squares64` algorithm, for the LCG algorithm was producing
tensors that were highly correlated with one another.
Squares64 algorithm: https://arxiv.org/abs/2004.06278
Closes https://github.com/llvm/torch-mlir/issues/1608
Summary of changes:
- Replace call to `MemoryEffectOpInterface::hasNoEffect`
with `isMemoryEffectFree`.
- Make fix for the dynamic dims, since
`kDynamicSize` value changed to
`std::numeric_limits<int64_t>::min()` from `-1` in llvm
- `makeShapeLLVMCompatible` and `makeShapeTorchCompatible`
utilities convert shapes in order to remain consistent
with the Torch and MLIR semantics.
- Update tags
llvm: 147fe9de29dc13c14835127b35280c4d95c8e8ba
mhlo: 1944b5fa6062ec4c065d726c9c5d64f1487ee8c5
Signed-Off By: Vivek Khandelwal<vivek@nod-labs.com>
-- aten.upsample_nearest2d.vec op is not present
owing to https://github.com/pytorch/pytorch/pull/85638
-- So this commit adds a lowering on aten.upsample_nearest2d.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
This commit renames the patterns used to match on lists of constant
values to `m_TorchListOfConstant{valueType}s`. This is needed to avoid
ambiguity for when `valueType` has `Optional` in it. In particular, it
makes it clear whether the values in the list are optional or the list
itself is optional.
* build: update llvm tag to 74fb770d
This commit makes the following changes needed to update bump LLVM:
+ replace usages of `tensor::createPadScalarOp`, see https://reviews.llvm.org/D136493
+ Update file checks
The parameter "supportFPInputOnly" of function createPoolingOp() is
supposed to be "supportNonFPInput", which was added to distinguish
between "MaxPool2d" and "AvgPool2d" op in #718
This commit removes almost all of the valsem ops, since the value
semantics version of the ops now exist in PyTorch. The only op missing
is `aten.bernoulli_.float`. In addition, this commit also simplifies
the implementation of `aten.fill.Scalar` by moving it to the pattern
that converts elementwise ops.
This commit makes the following changes needed to update bump LLVM:
- Replace `linalg.init_tensor` with `tensor.empty` (see:
https://reviews.llvm.org/D135129)
- Replace `NoSideEffect` with `Pure` (see
https://reviews.llvm.org/D135505)
- Replace `body` region accessor for `ReduceOp` and `ReduceWindowOp`
with `getBody`
- Fix incorrect use of `tosa::ReduceSumOp` in `AtenNativeLayerNormOp`
conversion pattern. The result type of `tosa::ReduceSumOp` must have
the same rank as the input type. (see:
https://www.mlplatform.org/tosa/tosa_spec.html#_reduce_sum)
Co-authored-by: Ashay Rane <ashay@users.noreply.github.com>
Co-authored-by: Ashay Rane <ashay@users.noreply.github.com>
This commit removes the `weight` tensor from the inputs of one of the
`linalg.generic` ops generated by the `aten.convolution` linalg
lowering, since the indexed values are not actually used by the body
of the `linalg.generic`. Moreover, in general the `weight` tensor does
not have the same shape as the output tensor of the `linalg.generic`,
so both tensors being indexed by the same indexing maps is wrong.
This commit adds lowering of `aten.div.int` and `aten.bitwise_or.Tensor`
ops. Both these ops are required in order to support bloom_560m model.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
This commit updates the linalg conversion of `AtenMaxDimOp` to use
`arith.maxf` instead of `arith.select` to calculate the maximum. This
allows better vectorization further downstream, since the operation
can be converted to a simple max reduction when the `indices` result
is not used. See: https://github.com/iree-org/iree/issues/10666.
Summary of changes:
- Updated references to the Arith dialect
(https://reviews.llvm.org/D134762)
- Switched to prefixed accessors for MemRef dialect
(https://reviews.llvm.org/D134995)
- Fixed warnings about signed/unsigned comparisons, ignored return
values, and unused variables
* Add aten.frobenius_norm.dim op and init its conversion pattern to linalg and MHLO,
* run symbolic-shape-optimization before hlo-legalize-to-linalg to fit more mhlo e2e tests.
This commit adds decomposition of `aten.linear` op. Due to limited
support at tosa backend in case of dynamic dimensions, this
decomposition is currently disabled for tosa backend.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
Summary of changes:
- Switch to C++17 (similar to https://reviews.llvm.org/D131348)
- Update MHLO to build with LLVM commit hash 061e0189
- Replace deprecated `hasValue()` and `getValue()` with `has_value()`
and `value()` respectively (https://reviews.llvm.org/D131349)
- Use `TypedAttr` (https://reviews.llvm.org/D130092)
- Use updated assembly format of `mhlo.compare` op (commit
d03ef01e70fbf9afd0fa1976fbb7ed31838929b3 in MHLO repo)
This commit fixes the shape calculation for:
1.) aten.mean.dim
2.) aten.var.dim
3.) aten.sum.dim_IntList op
Also, it fixes the lowering of `aten.mean.dim` and
`aten.sum.dim_IntList` for handling the cases of empty dim list.
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com
- Includes a canonicalizer for `aten.add.t`needed for successfully lowering the shape function
- Only offers support for statically sized index tensors when there is more than one
- Dynamic shape support remains for single indexing tensors
- Supports cases where the view op expands and collapses dims
simulataneously. This does not handle the case where it is neither
expanding nor collapsing (e.g. [2, 3] -> [3, 2])
- Additionally fixes a previous bug with adding 1-sized dims on both
sides of a tensor with aten.view
This commit adds the support for negative dim cases for `aten.cat`,
`aten.slice.Tensor` and `aten.slice_scatter` op.
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
The original conversion pattern for `AtenBatchNormOp` required that
the input rank be greater than 2; however, the only
expectation in the conversion pattern and in Pytorch is that the input
rank is greater than 1, since the second dimension of the input must
match the size of the `weight`, `bias`, `runningMean`, and
`runningVar` inputs. This commit fixes the `inputRank` check.
lowering.
This commit addresses the remaining comments on lowering of
slice_scatter and select_scatter.
Signed-Off-By: Prateek Gupta <gprateek93@gmail.com>
`aten.select_scatter` op.
This commit adds:
1. Lowering of `aten.slice_scatter` op into `tensor.insert_slice`
op.
2. Decomposes the `aten.select_scatter` op into `aten.slice_scater`
op.
Signed-Off-By: Prateek Gupta <gprateek93@gmail.com>
A previous fix to the handling of size-1 dims in
`aten.view` (https://github.com/llvm/torch-mlir/pull/962) resulted in
the wrong grouping of dimensions when size-1 dims where between two
dims of size greater than 1. This commit fixes that.
This commit lowers `aten.matmul` to `linalg.BatchMatmul` under the
following conditions:
1. The result of matrix multiplication must have batch dimensions,
i.e., rank greater than 2.
2. The resultant matrix must have at most 1 dynamic batch dimension.
It also handles broadcasting of batch dimensions when batch dimensions
of the matrices are broadcastable.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
This commit fixes the shape function for `index.Tensor`, adding
support for multiple index tensors and `None`s in the indices
list. This commit also adds support for input tensors of rank greater
than 1. The lowering for `index.Tensor` still has the the limitation
that only a single index tensor along the first dimension of the input
tensor is supported.
Prior to this patch, the torch dialect included `AtenTriuOp` for
computing the upper triangular part of the input matrix, but there was
no code for lowering the op to the linalg dialect.
This patch adds code to generate a `linalg.generic` operation that
compares indices (computed using `linalg.index`) to choose between zero
or the original value (using `arith.select`). The lowering fails if the
number of dimensions are less than two. This patch also adds a few
end-to-end tests.
The MacOS builders are having linking trouble with the extension library.
Until it's fixed, all support for op extensions is disabled. It should be
easy to restore once the issue is resolved.
The function `AffineMap::inferFromExprList` does not work if the first
vector of expressions is empty, because it uses these expressions to
obtain the context. This prevented `aten.permute` from working for
inputs of 0-rank. This commit adds support for 0-rank inputs.
PyTorch allows new operators to be registered dynamically in modules.
Torch-mlir already makes it fairly straightforward to add support for
new operators, and this commit just extends that support to allow new
PyTorch ops to come from a external module.
This does *not* allow ops to be dynamically loaded into torch-mlir.
Torch-mlir must still be compiled with support built-in.
Add a `_torch_mlir_custom_op_example` subpackage to `torch_mlir` which
registers an demonstration op. It will not be imported by default when
importing torch_mlir. It's strictly for testing and documentation.
Adds an end-to-end test for the `torch_mlir_custom_op_example::identity` op.
With all these changes, we should now be actively testing PyTorch extension
support with all future patches.
This commit adds lowering of `aten.div.Tensor_mode` op.
This commit also fixes formatting for the test file elementwise.py.
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
This commit adds the decomposition of `aten.adaptive_avg_pool2d` op into
`aten.avg_pool2d` op. The current decomposition only supports cases where
input size is equal to the output size.
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
When compiling without assertions (i.e. in `NDEBUG` mode), a handful of
statements turn to NOPs, which results in warnings such as missing
return statement or unused variables and function. This patch replaces
such statements with `llvm_unreachable()`, which informs the compiler
about program termination regardless of the `NDEBUG` mode. This also
enables torch-mlir to be compiled using the flags `-Wall`, `-Wextra`,
`-Wpedantic`, and `-Werror`.
This patch adds support for the torch.linalg.vector_norm op to the torch
dialect, including the necessary shape function. It also extends the
conversion of reduction operators to support lowering of
AtenLinalgVectorNormOp, in addition to adding a handful of end-to-end
tests to validate the lowering.
There exist several opportunities to make this lowering optimal and
robust. For instance, in its current form, the translation does not
support ord = 0, +inf, or -inf. For L1 norms, we don't need to raise
each element to the power 1.0. Similarly, L2 norms could benefit from
strength reduction. Since the canonicalization pass is not able to
apply these optimizations, we should consider applying them during the
linalg lowering itself.
This commit adds support for aten.max_pool2d, aten.max_pool2d_with_indices,
and aten.avg_pool2d op for the cases where ceil_mode = true.
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
The preserve memory specifies that `If any of the input tensors is in channels_last format,
operator output should be in channels_last format` and hence can be
added as is in aten_empty_like op.
This commit adds lowering of `aten.masked_fill.Scalar` op.
This commit also fixes the formatting of the file constant_alloc.py.
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
The updated LLVM code includes a patch to create bfloat16 array
attributes, thus enabling a different patch to torch-mlir to flesh out
support for the bfloat16 type.
* shape: add shape transfer function for aten.neg
Prior to this patch, the list of shape transfer functions did not
include `aten.neg`, which resulted in errors like below.
```
error: unsupported by backend lowering: tensor with unknown rank or dtype
note: see current operation: %0 = "torch.aten.neg"(%arg0) :
(!torch.vtensor<[256,256],f32>) -> !torch.vtensor<*,f32>
note: this is likely due to a missing shape transfer function in shape_lib_gen.py
```
This patch fixes the problem by adding a shape transfer function to
reflect the point-wise nature of this operation.
* linalg: add translation of aten.neg operation
This patch adds a translation rule to lower `aten.neg` operations on
tensors to an `arith.negf` operation wrapped inside a `linalg.generic`
operation. This patch also adds a rudimentary test.
This commit adds the following support to the op `nll_loss_backward`:
- `input` tensor can be rank-1
- `weight` parameter
- `reduction` parameter
- `target`, `grad_output`, `total_weight` can be rank-0
- Checks that input tensors are of the expected type
This commit decomposes `aten.new_empty` op into `aten.empty.memory_format` op.
This commit also made a dtype fix to the constant tensor allocation like ops.
Earlier the dtype for the result was inferred from the result type; now, it's
being evaluated as per the original definition of the op.
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
This commit adds support for the cases of view op where the rank and
the shapes of the input and result are equal.
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
In order to make sure that the TorchToLinalg conversions leave the
graph in a valid state, the final result of the conversion has to be
casted to the result type of the op. This commit adds this cast to ops
that did not have it.
This commit adds the op `ValsemVariantAtenCopyOp` that represents
`AtenCopy_Op` without the underscore. This is needed to make sure
that the `ReduceOpVariants` pass turns the in-place op into an op
that takes value tensors as inputs, otherwise the
`MaximizeValueSemantics` pass will not be able to add value
semantics correctly.
This commit also adds the lowering of `ValsemVariantAtenCopyOp`.
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
This commit fixes the 2nd and 3rd return types of the `aten.native_layer_norm`.
Previously the mean and rSTD were returned with reduction dims removed.
This commit fixes this and keeps the reduction dims of the results.
Signed-Off-By: Prateek Gupta <prateek@nord-labs.com>
The term "pseudo" is very vague and was getting confusing (I felt I had
to explain it in every comment referencing it). Instead, rework the
"pseudo" ops to instead be named:
- MLIR Syntax: `torch.valsem.*`
- C++ / ODS: `ValsemVariant*Op`
This makes it clear what the concept is, and avoids confusion with other
things that might be called "pseudo", since these are very specific and
should be 100% consistently named w.r.t. the non-valsem-variant ops that
they correspond to.
This helps keep things organized and also exposes more parallelism to
the build system. It seems though that most of the compile time is
actually spent in the headers though, so the wall time doesn't decrease
as much as I had hoped (and now that the headers are being included
multiple times, the cpu time actually increases a lot, sadly -- will try
to dig into this).