Commit Graph

3 Commits (c464cb107fff332cee973180b9b783e0ac98c382)

Author SHA1 Message Date
Sean Silva 453e29ea05 Add E2E support for tests with heavy dependencies (heavydep tests).
The tests use the same (pure-Python) test framework as the
normal torchscript_e2e_test.sh, but the tests are added in
`build_tools/torchscript_e2e_heavydep_tests` instead of
`frontends/pytorch/e2e_testing/torchscript`. Any needed dependencies can
easily be configured in generate_serialized_tests.sh.

We add an initial machine translation model with a complex set of
dependencies to seed the curriculum there. I verified that this model
gets to the point of MLIR import (it fails there with a segfault due to
not being able to import the "Any" type).

This required moving a few files from the `torch_mlir` Python module
into multiple modules to isolate the code that depends on our C++
extensions (which now live in `torch_mlir` and
`torch_mlir_torchscript_e2e_test_configs`) from the pure Python code
(which now lives in `torch_mlir_torchscript`). This is an entirely
mechanical change, and lots of imports needed to be updated.

The dependency graph is:
```
       torch_mlir_torchscript_e2e_test_configs
                  /              |
                 /               |
                /                |
               V                 V
torch_mlir_torchscript       torch_mlir
```

The `torch_mlir_torchscript_e2e_test_configs` are then dependency-injected
into the `torch_mlir_torchscript` modules to successfully assemble a
working test harness (the code was already structured this way, but this
new file organization allows the isolation from C++ code to actually
happen).  This isolation is critical to allowing the serialized programs
to be transported across PyTorch versions and for the test harness to be
used seamlessly to generate the heavydep tests.

Also:
- Extend `_Tracer` class to support nested property (submodule) accesses.

Recommended review order:
- "user-level" docs in README.md
- code in `build_tools/torchscript_e2e_heavydep_tests`.
- changes in `torch_mlir_torchscript/e2e_test/framework.py`
- misc mechanical changes.
2021-08-03 14:09:56 -07:00
Sean Silva 370e3270ab Introduce `!torch.tensor` / `!torch.vtensor` types.
This removes our reliance on the numpy dialect and avoids our off-label
use of the builtin tnesor type for modeling unknown dtypes.  The
`!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor.
The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic
tensor. The new types look as follows syntactically:

```
// Least-static-information, non-value-semantic tensor.
!torch.tensor
// Explicit form of least-static-information variant.
!torch.tensor<*,unk>
// Least-static-information, value-semantic tensor.
!torch.vtensor
// Explicit form of least-static-information variant.
!torch.vtensor<*,unk>
// Fixed-set of allowable element types, with first-class support for
// Torch's frontend signedness semantics.
!torch.tensor<*,si32>
// First-class support for unknown dtypes.
!torch.tensor<[?,?,?],unk>
// Standard MLIR representation of `?` for unknown dimensions.
!torch.tensor<[?,2,?,4],unk>
// Statically shaped / dtyped example.
!torch.vtensor<[1,2,3,4],f32>
```

This required fairly significant changes throughout the compiler, but
overall it is a big cleanup. We now have a much clearer layering of "the
Torch frontend lowering" vs "lowering to std + linalg + etc.".

At the C++ level, there is `ValueTensorType`, `NonValueTensorType`.
We also have a helper `BaseTensorType` (kind of like ShapedType) which
interoperates with those two.

Included changes:
- New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for
  creating torch tensor literals in the frontend.
- Consistently use signedness for the types (except i1 which I didn't
  touch -- we need to sort out the situation with !basicpy.BoolType
  there anyway so will be attending to that soon)
- Frontend can annotate whether an argument to the function has value
  semantics. We currently require this, as our backend contract does not
  currently allow us to even model the non-value-semantic case. Before,
  the value-semantic assumption was randomly injected in the middle of
  the pass pipeline.
- Move ArrayToTensor (now called MaximizeValueSemantics) and
  RefinePublicReturn passes to torch dialect.
- The TorchToStd and TorchToLinalg passes are now type conversions from
  `!torch.vtensor` to `tensor` and use the dialect conversion infra.
  The overall conversion pipeline is set up following the best practices
  of the "Type Conversions the Not-So-Hard Way" talk. This required
  introducing `torch-func-builtin-tensorize` and
  `torch-finalizing-builtin-tensorize` passes analogous to the upstream
  bufferization passes with the corresponding names (mostly just
  copypasta from there).
- Misc Torch-level canonicalizations -- we now cleanly layer the
  lowering to std later in the pipeline, so we are gradually lessening
  our reliance on random std constant folding before we get to that
  point.

Recommended review order:
- New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp
- New ops in TorchOps.td / TorchOps.cpp
- Less important / more mechanical stuff
  - Frontend changes.
  - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-06-10 10:56:48 -07:00
Sean Silva 179105ca3e Add basic MLP's to the e2e curriculum.
These tests pass on the reference backend.

- Add aten.linear op + shape xfer function + ATen->Linalg lowering.
 - Note: this needs to be more automated, and needs to cover more cases.
 - Current not implemented caveats:
  - size-1 broadcasting for bias vector (either static-size-1 or ? case)
  - higher-rank aten.linear ops (not produced by torch.nn.Linear though)
  - type promotion (still don't even know the exact rules here)
- Add folder for torch.derefine op. Now the inliner can clean it up as
  it inlines. (call boundaries are a main place we need to insert
  torch.derefine) This is brittle -- the other important case is control
  flow which will need to be handled via an extension to
  RefineTypes.cpp (as will more robust call handling). River has an
  in-flight patch to update it to the new dataflow framework so I didn't
  want to do anything intrusive here.
    - Also adjust torch.derefine syntax to use the keyword `to` instead of
      `->`, as most type-only, cast-like ops do.
2021-04-27 12:18:54 -07:00