Commit Graph

4 Commits (7b7c9c5d3df0b82000d656f7e180950706fc5487)

Author SHA1 Message Date
Sean Silva 370e3270ab Introduce `!torch.tensor` / `!torch.vtensor` types.
This removes our reliance on the numpy dialect and avoids our off-label
use of the builtin tnesor type for modeling unknown dtypes.  The
`!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor.
The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic
tensor. The new types look as follows syntactically:

```
// Least-static-information, non-value-semantic tensor.
!torch.tensor
// Explicit form of least-static-information variant.
!torch.tensor<*,unk>
// Least-static-information, value-semantic tensor.
!torch.vtensor
// Explicit form of least-static-information variant.
!torch.vtensor<*,unk>
// Fixed-set of allowable element types, with first-class support for
// Torch's frontend signedness semantics.
!torch.tensor<*,si32>
// First-class support for unknown dtypes.
!torch.tensor<[?,?,?],unk>
// Standard MLIR representation of `?` for unknown dimensions.
!torch.tensor<[?,2,?,4],unk>
// Statically shaped / dtyped example.
!torch.vtensor<[1,2,3,4],f32>
```

This required fairly significant changes throughout the compiler, but
overall it is a big cleanup. We now have a much clearer layering of "the
Torch frontend lowering" vs "lowering to std + linalg + etc.".

At the C++ level, there is `ValueTensorType`, `NonValueTensorType`.
We also have a helper `BaseTensorType` (kind of like ShapedType) which
interoperates with those two.

Included changes:
- New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for
  creating torch tensor literals in the frontend.
- Consistently use signedness for the types (except i1 which I didn't
  touch -- we need to sort out the situation with !basicpy.BoolType
  there anyway so will be attending to that soon)
- Frontend can annotate whether an argument to the function has value
  semantics. We currently require this, as our backend contract does not
  currently allow us to even model the non-value-semantic case. Before,
  the value-semantic assumption was randomly injected in the middle of
  the pass pipeline.
- Move ArrayToTensor (now called MaximizeValueSemantics) and
  RefinePublicReturn passes to torch dialect.
- The TorchToStd and TorchToLinalg passes are now type conversions from
  `!torch.vtensor` to `tensor` and use the dialect conversion infra.
  The overall conversion pipeline is set up following the best practices
  of the "Type Conversions the Not-So-Hard Way" talk. This required
  introducing `torch-func-builtin-tensorize` and
  `torch-finalizing-builtin-tensorize` passes analogous to the upstream
  bufferization passes with the corresponding names (mostly just
  copypasta from there).
- Misc Torch-level canonicalizations -- we now cleanly layer the
  lowering to std later in the pipeline, so we are gradually lessening
  our reliance on random std constant folding before we get to that
  point.

Recommended review order:
- New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp
- New ops in TorchOps.td / TorchOps.cpp
- Less important / more mechanical stuff
  - Frontend changes.
  - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-06-10 10:56:48 -07:00
Sean Silva c3f1f8ebf4 [cleanup] Put the root class type for exportPath first.
This is more consistent and intuitive -- usually the object being
"indexed" or used as a "context" for a later parameter goes first.
2021-04-01 18:40:03 -07:00
Sean Silva e749074bae Basic infra for annotate shapes and dtypes on arguments.
These allow users to annotate a known "type bound" on the argument,
which can seed shape/dtype inference. We don't rewrite the function
types as part of the import process (it will happen in a
yet-to-be-written pass) because:

1. We would need to interprocedurally rewrite all calls to keep the IR
   consistent. Currently, we have a place after GlobalizeObjectGraph but
   before we convert to tensors where this is convenient to do. Ideally,
   we would do this on the object graph representation.

1. We don't necessarily know that adjusting the function type is a legal
   calling convention change. The pass will have blessed knowledge (by
   the pass pipeline author) that adjusting the argument type based on
   the type bound is safe (which it frequently is).

2. Note that in principle, a type bound could be a fairly general thing
   (such as maximum sizes of dimensions, unions of multiple concrete
   types, etc.). The pass will in principle have logic to interpret the
   type bounds and to determine a suitable "best" (and legal) argument
   type.
2021-04-01 18:40:03 -07:00
Sean Silva a375ccf9da Add ability to annotate TorchScript classes.
The first use case is to annotate certain program constructs as either
exported or private. In this commit we plumb it down to
GlobalizeObjectGraph which makes use of this information.

Recommended review order:
1. class_annotator.h/.cpp + `test/module_import/annotations/*`
    - New abstractions to communicate with Python code and annotate.
2. IR changes in TorchOps.td
    - Adding "private" attribute to various things.
3. ivalue_import.cpp changes
    - Module + ClassAnnotator = annotated IR
4. GlobalizeObjectGraph.cpp + tests
    - use new "private" attributes to create "private" IR.
    - also, tweak some of the op deleting mechanics, which was triggering
      some memory errors / assertions

With this, we can run the classifier through and inline it as follows:
```
frontends/pytorch/utils/pt_util.py --import --exported-name forward ~/tmp/classifier.pt \
| npcomp-opt -torch-globalize-object-graph -inline
```
IR: https://gist.github.com/silvasean/32dcad9f6270557f412094a77cecdd69
2021-02-25 11:28:34 -08:00