The issue was in the canonicalizer for torch.aten.ge.int -- in cases
where the operands were swapped, it would miscompile. This issue is
fixed and folding support generalized to `torch.aten.size.int < 0` as
well.
Fixes#716
This commit decomposes different variants of `aten.where.*` op into
`aten.where.Self` op. It covers `aten.where.Scalar`,
`aten.where.ScalarSelf` and `aten.where.ScalarOther` ops.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
This commit decomposes `aten.new_empty` op into `aten.empty.memory_format` op.
This commit also made a dtype fix to the constant tensor allocation like ops.
Earlier the dtype for the result was inferred from the result type; now, it's
being evaluated as per the original definition of the op.
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
A recent PyTorch commit made ConstantPad2d call a helper function with a
`Union[int, float]` type annotated. This commit adds minimal support for
representing and dealing with that.
https://github.com/pytorch/pytorch/pull/73287
Changes:
- Adding support for `!torch.union<T1, T2, T3>`/`Torch::UnionType`,
along with the importer and CAPI code.
- Add support in isValidSubtype for union types.
- Adding a canonicalizer for `torch.derefine` to help simplify some code
that derefines to a UnionType (this also fixes#664).
There is still more work to do for really supporting UnionType well,
such as canonicalizing UnionType's so that they can be compared with
pointer equality.
The reified code to compute the shape of torch.aten.constant_pad_nd
uses negative indices when setting list elements. This was not
converted to a positive offset in one place in SimplifyShapeCalculations
which prevented computation of the static shape.
The logic in the rewriting phase had a bug in case of a read-only op
coming before mutation ops. The logic would use the op itself as the
"latest literal", but that is not correct, because later on we replace
the op itself with the *final* "latest literal", assuming that all uses
of the op have been rewritten -- that was working in general, except for
any read-only ops at the beginning.
Big thanks to @ljfitz for the tiny reproducer!
Fixes#704
- This commit adds decomposition of `aten.dropout` op. It also covers the
training mode of the same op.
- It also adds lowering of `aten.sub.float` op.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
This commit adds the op `ValsemVariantAtenCopyOp` that represents
`AtenCopy_Op` without the underscore. This is needed to make sure
that the `ReduceOpVariants` pass turns the in-place op into an op
that takes value tensors as inputs, otherwise the
`MaximizeValueSemantics` pass will not be able to add value
semantics correctly.
This commit also adds the lowering of `ValsemVariantAtenCopyOp`.
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
This commit adds support for type refinement when
`torch.tensor_static_info_cast`s are involved, even when there are
users of the casted tensor that don't allow type refinements.
Originally the canonicalization pattern for
`torch.tensor_static_info_cast` would check if all the users of the
casted tensor allowed type refinements before making any changes. This
means that if at least one of the users did not allow type
refinements, the pattern would fail. This becomes an issue when doing
shape calculations because the calculations need the shape information
of each input tensor to be available before the calculation can be
simplified.
This commit adds the op `ValsemVariantAtenIndexPutImplOp` that represents
`Aten_IndexPutImpl_Op` without the underscore. This is needed to
make sure that the `ReduceOpVariants` pass turns the in-place op
into an op that takes value tensors as inputs, otherwise the
`MaximizeValueSemantics` pass will not be able to add value
semantics correctly.
This commit also adds the lowering of `ValsemVariantAtenIndexPutImplOp` op.
This commit also updates the `torch.bincount` op test cases.
The term "pseudo" is very vague and was getting confusing (I felt I had
to explain it in every comment referencing it). Instead, rework the
"pseudo" ops to instead be named:
- MLIR Syntax: `torch.valsem.*`
- C++ / ODS: `ValsemVariant*Op`
This makes it clear what the concept is, and avoids confusion with other
things that might be called "pseudo", since these are very specific and
should be 100% consistently named w.r.t. the non-valsem-variant ops that
they correspond to.
This leads to much more succinct types in many cases:
```
!torch.list<!torch.int>
!torch.list<int>
!torch.tuple<!torch.list<!torch.int>, !torch.list<!torch.int>>
!torch.tuple<list<int>, list<int>>
!torch.optional<!torch.list<!torch.int>>
!torch.optional<list<int>>
!torch.list<list<list<tensor>>>
!torch.list<!torch.list<!torch.list<!torch.tensor>>>
```
I would like to take this further and allow omitting the `!torch.`
prefix in all cases, but that's harder -- for example, we currently use
`FuncOp` for functions, and so I don't think we can customize the
printing there. It seems like it will be a longer road to getting that
level of customization.
See the documentation in `docs/shape_lib.md` and
`docs/adding_a_shape_function.md` for an overview of the system.
This completely overhauls how we represent shape functions. In
particular, RefineTypes does not infer shapes anymore (only dtypes).
Shape functions are now written in (TorchScript'able) Python.
Recommended review order:
1. Read `docs/shape_lib.md` and `docs/adding_a_shape_function.md`.
1. Code and tests for ReifyShapeCalculations, DropShapeCalculations.
1. Code and tests for SimplifyShapeCalculations.
1. shape_lib_gen.py
1. Code and tests for new RefineTypes pass.
1. Random folders/canonicalizers in TorchOps.cpp and associated test in
`canonicalize.mlir`.
1. New ReadOnly trait inferred from the registry.
1. Any miscellaneous remaining stuff.
Example `-print-ir-after-all` for ElementwiseUnaryModule:
[IR lowering dump](https://gist.github.com/silvasean/e4dc8cbc8d00aac7819602e3cbd8e212).
Example `-print-ir-after-all` for ElementwiseBinaryModule:
[IR lowering dump](https://gist.github.com/silvasean/daf6860ecced732af3568af6b1899113).
This commit replaces the two rewrite patterns of
maximize-value-semantics with a single pattern that captures the
behavior of both as well as other edge cases previously not
supported. The new pattern works by first performing alias analysis on
a subgraph to see if pattern is applicable, then rewriting all
non-value tensors to value tensors in a single go.
- This commit adds E2E support for `aten.rand_like` and
`aten.bernoulli_.Tensor` ops.
- The `aten.bernoulli(x)` was implemented as:
`aten.bernoulli(x) = rand_like(x) < 0.5`, assuming 0.5 as default
probability, whereas according to the pytorch documentation:
https://pytorch.org/docs/stable/generated/torch.bernoulli.html#torch.bernoulli
the input x in `aten.bernoulli(x)` is itself a tensor containing
probabilities to be used for drawing the binary random number.
- So this commit fixes the `aten.bernoulli(x)` implementation as:
`aten.bernoulli(x) = rand_like(x) < x`.
- It also fixes the case where the input to `aten.bernoulli_.float` is
an integer tensor. In this case the input must be casted to float type
before passing it as operand to `aten.rand_like` op.
`aten.bernoulli_.float(x, p) = rand_like(float(x)) < p`.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
This commit adds the invariant to the op `torch.overwrite.tensor.contents` that
both of its operands have the same shape and size. In order to
maintain the invariant, special handling of this op is added to the
`RefineTypes` pass.
This commit adds handling to the `maximize-value-semantics` pass for
the case where a view-like op depends on a tensor that has been
overwritten by a value tensor. The approach for removing the
dependency is to change the input to the view-like op to be a copy of
the value tensor that is being used to overwrite.
This commit also removes `AtenFill_ScalarOp` and
`AtenBernoulli_FloatOp` from the list of view-like ops, since these
ops now have a corresponding op with value semantics into which they
get converted in the `reduce-op-variants` pass.
- This commit decomposes the `aten.batch_norm` op into the
`aten.native_batch_norm` op, instead of lowering it to the
`linalg.generic` op.
- It also adds run-time asserts in the `aten.native_batch_norm` lowering
to make sure that the shape of the weight, bias, running_mean, and
running_var must match the num of features.
- Since the `aten.native_batch_norm` op is not supported at TOSA backend,
all the modules that are dependent on the `aten.native_batch_norm` op
will fail and therefore they should be removed from the TOSA `passing`
set.
- It also moves `checkNotNone` to utility.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
This commit adds the op `PseudoAtenFillScalarOp` that represents
`AtenFill_ScalarOp` without the underscore. The approach is the same
as in commit dd998fa4d4.
Adding this op allows for a simpler and more consistent version of the
`empty` and `empty_like` op e2e tests.
This commit adds the op `PseudoAtenBernoulliFloatOp` that represents
`AtenBernoulli_FloatOp` without the underscore. This is needed to make
sure that the `ReduceOpVariants` pass turns the in-place op into an op
that takes value tensors as inputs, otherwise the
`MaximizeValueSemantics` pass will not be able to add value semantics
correctly.
- This commit adds lowering of `aten.Bool.Tensor` and
`aten.Float.Tensor` op as a part of `convert-torch-to-linalg` pass.
- It also adds support for returning bool types.
- It also fixes lowering of the `aten.Int.Tensor` op for non-zero rank
input tensors.
- If a scalar number is converted to a 0-d tensor and passed on to the
`aten.Float.Tensor` op, it folds to the scalar number.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
This PR include the following pieces:
- Add torch `Generator` type. `Generator` type is converted to i64 in
refbackend type converter.
- Add seed managment support for the default global generator.
`torch_c.getNextSeed` op is used to get the seed. On refbackend, the
`torch_c.getNextSeed` is lowered to load/store from [0] of global
variable `default_generator` memref<i64> in `InsertRngGlobals` pass.
- Add `aten.uniform_` and testing as an example op for RNG ops. Add
`torch.pseudo.aten.uniform` op. It has the same operands and return as
the `aten.uniform_` from the op registry except for value semantics.