Commit Graph

219 Commits (9ac90ec7b2710dda9a30450d2c0ee308b31ee4e0)

Author SHA1 Message Date
Sean Silva 92ee0fa98f Add `!torch.tuple<T1, T2>` type.
This further eliminates the need for the `basicpy` dependency.

This required adding `torch.prim.TupleConstruct` to replace
`basicpy.build_tuple`.
2021-06-15 08:15:22 -07:00
Sean Silva db282fd1b4 Introduce native `!torch.none` type.
- Add `torch.constant.none` op to construct it (naming is chosen to be
  analogous to Torch's representation of a prim::Constant with
  NoneType, rather than using the "singleton" terminology of Basicpy).
2021-06-14 13:30:58 -07:00
Sean Silva 81bcd7fb12 Move Torch type implementation code into TorchTypes.cpp 2021-06-10 16:46:47 -07:00
Yi Zhang e0ff5248fb Add TorchList type and prim::ListConstruct #218 2021-06-10 14:31:35 -07:00
Sean Silva 370e3270ab Introduce `!torch.tensor` / `!torch.vtensor` types.
This removes our reliance on the numpy dialect and avoids our off-label
use of the builtin tnesor type for modeling unknown dtypes.  The
`!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor.
The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic
tensor. The new types look as follows syntactically:

```
// Least-static-information, non-value-semantic tensor.
!torch.tensor
// Explicit form of least-static-information variant.
!torch.tensor<*,unk>
// Least-static-information, value-semantic tensor.
!torch.vtensor
// Explicit form of least-static-information variant.
!torch.vtensor<*,unk>
// Fixed-set of allowable element types, with first-class support for
// Torch's frontend signedness semantics.
!torch.tensor<*,si32>
// First-class support for unknown dtypes.
!torch.tensor<[?,?,?],unk>
// Standard MLIR representation of `?` for unknown dimensions.
!torch.tensor<[?,2,?,4],unk>
// Statically shaped / dtyped example.
!torch.vtensor<[1,2,3,4],f32>
```

This required fairly significant changes throughout the compiler, but
overall it is a big cleanup. We now have a much clearer layering of "the
Torch frontend lowering" vs "lowering to std + linalg + etc.".

At the C++ level, there is `ValueTensorType`, `NonValueTensorType`.
We also have a helper `BaseTensorType` (kind of like ShapedType) which
interoperates with those two.

Included changes:
- New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for
  creating torch tensor literals in the frontend.
- Consistently use signedness for the types (except i1 which I didn't
  touch -- we need to sort out the situation with !basicpy.BoolType
  there anyway so will be attending to that soon)
- Frontend can annotate whether an argument to the function has value
  semantics. We currently require this, as our backend contract does not
  currently allow us to even model the non-value-semantic case. Before,
  the value-semantic assumption was randomly injected in the middle of
  the pass pipeline.
- Move ArrayToTensor (now called MaximizeValueSemantics) and
  RefinePublicReturn passes to torch dialect.
- The TorchToStd and TorchToLinalg passes are now type conversions from
  `!torch.vtensor` to `tensor` and use the dialect conversion infra.
  The overall conversion pipeline is set up following the best practices
  of the "Type Conversions the Not-So-Hard Way" talk. This required
  introducing `torch-func-builtin-tensorize` and
  `torch-finalizing-builtin-tensorize` passes analogous to the upstream
  bufferization passes with the corresponding names (mostly just
  copypasta from there).
- Misc Torch-level canonicalizations -- we now cleanly layer the
  lowering to std later in the pipeline, so we are gradually lessening
  our reliance on random std constant folding before we get to that
  point.

Recommended review order:
- New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp
- New ops in TorchOps.td / TorchOps.cpp
- Less important / more mechanical stuff
  - Frontend changes.
  - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-06-10 10:56:48 -07:00
Sean Silva 2efda323ff Significantly restructure torch/aten import design.
This is a really major and invasive restructuring of the way we get
torch operators (`torch::jit::Operator` / `c10::OperatorHandle`) into
MLIR. Please forgive the challenging review, but due to the sheer
invasiveness, it wasn't really practical do do it in sane smaller
pieces.

This fully replaces everything that was already working on the
TorchScript path (actually, more -- we added tanh support to
TorchToLinalg in order to delete the older code paths). Additionally,
I've kept the lights on for the acap path too, including what little e2e
stuff was working before (for expediency I made a few tiny compromises
along the way that will be easy to undo when we give that path proper
attention).

Overview of the new design:
- The torch operator `somens::someunqualname.someoverloadname` is
  imported as `torch.somens.someunqualname.someoverloadname` (skip the
  last dotted part if the overload name is empty), OR, if we don't have
  such an op registered, it is imported as
  `torch.operator "somens.someunqualname.someoverloadname" (...) : ...`.
  - The addition of the "overload name" is a critical element here, as
    the `(ns,unqual,overload)` triple is unique, which solves a lot of
    problems we were having.
  - This involves having separate MLIR ops for the `trailing_` and
    `.out` variants and all the different overloads. This seemed
    necessary, because the set of overloads is so wild and varied and
    unstructured. The previous design was leaning into some underlying
    structure that just isn't there -- the default situation is
    the "random overload that we want to manage on the MLIR side",
    rather than that being an exception. E.g.  `aten::ne` (not-equal)
    has 21 overloads, only 4 of which are c10 dispatcher ops see
    [gist](https://gist.github.com/silvasean/190ba918c550c956260e21254e1b8aa1),
    and the "out" variant is really called `.Tensor_out` instead of
    `.out` as it frequently is for other ops.
  - Rationale for all being in `torch` namespace: the set of operators
    are so varied and unstructured that "dialect per namespace"
    doesn't result in anything resembling the typical MLIR dialect
    boundary expectations. We could maybe draw the boundary at
    dispatcher ops vs non-dispatcher ops, but that doesn't seem to
    really result in very much useful structure at this point in time.
  - Note: within the torch operator registry, we effectively have a
    mini-basicpy subdialect (already type-resolved), which is reasonably
    structured.
  - The existing Torch op interfaces are also removed -- now that we
    track the overload name, we can losslessly find the original
    operator.
- Instead of `ATenRecognizeKernelsPass`, we now have a
  `ReduceOpVariantsPass` that keys off certain traits (and perhaps
  eventually interfaces) to reduce variants of ops to a smaller set,
  ideally operating on immutable tensors and using surrounding ops to
  model the mutability/aliasing aspects.
  - Note: `torch.ns.unqual.overload` ops allow both immutable and
    mutable tensors (unlike the previous hard distinction in the common
    case). This is a premonition for a future change that will introduce a
    bona fide `!torch.tensor` type that will clean up a bunch of stuff.
- `TorchToLinalg` / `TorchToStd` supercede the existing
  "ATen->TCF->TCP->Linalg" path.
- The new `torch_ods_gen.py` supercedes `torch_signature_ods_gen.py`.
  It should look somewhat familiar, but the benefit of hindsight has
  allowed a lot of simplifications.

The overall trend seems to be to make the `torch` dialect a nice layer
independent of anything else. It feels like as a natural result of
various future changes we will be removing the reliance on basicpy+numpy
dialects and have a nice self-contained type system too that properly
models the TorchScript type system (including proper subtyping,
mutable/immutable tensors, optional dtype, etc.).

Recommended review order:
- Start at some of the new import IR, e.g. in
  `frontends/pytorch/test/node_import/prim.py`,
  `frontends/pytorch/test/acap_export/test_export_add3.py`, and other
  tests.
- `frontends/pytorch/python/torch_mlir_utils/codegen/torch_ods_gen.py`
  and associated generated files:
  - `include/npcomp/Dialect/Torch/IR/GeneratedAtenOps.td`
  - `include/npcomp/Dialect/Torch/IR/GeneratedPrimOps.td`
- Inspect `ReduceOpVariants.cpp` / `reduce-op-variants.mlir` and the new
  traits in `include/npcomp/Dialect/Torch/IR/TorchTraits.h`
- Various code changes in the import path in
  `frontends/pytorch/csrc/builder`. Probably most interesting is the new
  code in `torch_to_mlir_utils.cpp` that has the logic to create the
  `torch.operator` ops or `torch.ns.unqual.overload` ops.

This is the [new ResNet IR](https://gist.github.com/silvasean/5407aafb710d07612b7b5b92eabecebe),
just to be able to look at a substantial sample of IR in the new style.
2021-05-19 13:37:39 -07:00
Sean Silva 9257457d8a Add AllowsTypeRefinement trait and use it to improve RefineTypes
This trait lets us model the semantics of various aten/torch/numpy ops
that are insensitive to type refinements. This replaces
hardcoded/inconsistent checks for this property.

To show usage of this new trait, we fix up some old uses, and improve
RefineTypes to be smarter about rewriting with this trait.
2021-04-30 10:57:02 -07:00
Sean Silva 179105ca3e Add basic MLP's to the e2e curriculum.
These tests pass on the reference backend.

- Add aten.linear op + shape xfer function + ATen->Linalg lowering.
 - Note: this needs to be more automated, and needs to cover more cases.
 - Current not implemented caveats:
  - size-1 broadcasting for bias vector (either static-size-1 or ? case)
  - higher-rank aten.linear ops (not produced by torch.nn.Linear though)
  - type promotion (still don't even know the exact rules here)
- Add folder for torch.derefine op. Now the inliner can clean it up as
  it inlines. (call boundaries are a main place we need to insert
  torch.derefine) This is brittle -- the other important case is control
  flow which will need to be handled via an extension to
  RefineTypes.cpp (as will more robust call handling). River has an
  in-flight patch to update it to the new dataflow framework so I didn't
  want to do anything intrusive here.
    - Also adjust torch.derefine syntax to use the keyword `to` instead of
      `->`, as most type-only, cast-like ops do.
2021-04-27 12:18:54 -07:00
Sean Silva e749074bae Basic infra for annotate shapes and dtypes on arguments.
These allow users to annotate a known "type bound" on the argument,
which can seed shape/dtype inference. We don't rewrite the function
types as part of the import process (it will happen in a
yet-to-be-written pass) because:

1. We would need to interprocedurally rewrite all calls to keep the IR
   consistent. Currently, we have a place after GlobalizeObjectGraph but
   before we convert to tensors where this is convenient to do. Ideally,
   we would do this on the object graph representation.

1. We don't necessarily know that adjusting the function type is a legal
   calling convention change. The pass will have blessed knowledge (by
   the pass pipeline author) that adjusting the argument type based on
   the type bound is safe (which it frequently is).

2. Note that in principle, a type bound could be a fairly general thing
   (such as maximum sizes of dimensions, unions of multiple concrete
   types, etc.). The pass will in principle have logic to interpret the
   type bounds and to determine a suitable "best" (and legal) argument
   type.
2021-04-01 18:40:03 -07:00
Sean Silva 99178a167d Bump llvm-project to 0524a09cc7e1a0797982feacf505825231efbee7
- renames of OwningRewritePatternList -> RewritePatternSet
  - also `insert` to `add`
- RewritePatternSet holds a context now
- memref dialect split from std
2021-03-23 14:29:05 -07:00
Sean Silva 43dba03afd Properly model "derefinement".
In terms of IR structure, TorchScript allows types to vary in many
circumstances where MLIR requires pointer-identical types. In particular,
it is valid to pass any subtype in place of a type. For example, if an
`Optional[int]` is required somewhere in the IR, it is legal to pass a
value of just `int` (but not the other way around; see
`torch.prim.unchecked_cast`). In effect, every *use* can have a different
type.

We introduce a new op `torch.derefine` that models that impedance
mismatch. This op allows casting a value from one type to a type that it
is a subtype of to model this behavior.

Recommended review order:
- TorchOps.td for new torch.derefine (and updated docs for
  `torch.prim.unchecked_cast`)
- new test code in if.py, loop.py, function-derefine.py
- new code in node_importer.cpp for handling derefinement insertion
- function_importer.cpp and utils changes in torch_to_mlir_utils.cpp

Properly handling derefinement on function boundaries required
relayering the code so that graph_importer.cpp/.h is now
function_importer.cpp/.h because only the `torch::jit::Function`
(actually the `c10::FunctionSchema` it holds) knows the derefined types that are
actually needed at the boundary (see `function-derefine.py` for a test).

Annoyingly, this churns all the functions which are now prefixed with
`__torch__.` but that is more correct anyway (that is their linkage name
in the `torch::jit::CompilationUnit`; the previous `mb.import_function`
was actually buggy in the case of functions calling each other as it
would reference their unqualified name).

With this change, we can import `resnet18` from `torchvision` :)
IR: https://gist.github.com/silvasean/6426a5272d8a6c7caae533fce05ab704
2021-03-03 15:09:44 -08:00
Sean Silva 939d36906f Add support for prim::Loop op.
This is a funny one. It combines a `for` and `while` loop in one op. We
will need to write some conversions to `scf`.
2021-03-02 16:01:34 -08:00
Sean Silva 8486968925 Add trivial inliner interfaces.
With this + manually setting private visibility on everything, a simple
classifier can be reduced to this IR, which is looking pretty lean and
mean:
https://gist.github.com/silvasean/19e7e2e21a61ff197aeac0dd864d188f

Also, include a utility script for importing `.pt` models.

```
pt_util.py --import classifier.pt | npcomp-opt -torch-globalize-object-graph
```
2021-02-22 10:40:38 -08:00
Sean Silva 158c5c484d Implement GlobalizeObjectGraph transformation.
This required restructuring of how we model TorchScript on import. The
main difference is that now we split out a `torch.class_type` that holds
methods and declarations of the types of each slot. This is more
consistent with TorchScript (our previous representation was
"denormalized").

Recommended reading order:
1. check out the description of `torch.class_type` in `TorchOps.td` and
   look at `test/Dialect/Torch/ops.mlir` and
   `frontends/pytorch/test/module_import/` to familiarize with the new
   representation.
   - Just look at the new IR. The diff between the old names and new
     names is confusing.
2. check out `test/Dialect/Torch/globalize-object-graph*.mlir`
   and read along with the pass description in
   `include/npcomp/Dialect/Torch/Transforms/Passes.td`
3. Read the code in `GlobalizeObjectGraph.cpp` and miscellaneous changes
   in `ivalue_importer.cpp`, `TorchOps.cpp`, etc.
2021-02-18 18:18:47 -08:00
Sean Silva 689b40c7a6 Add initial TorchScript module importer
It turns out that this was easiest to structure as a general IValue
importer, since torch module are just one of the possible IValue's.

We import the IValue object graph in a braindead fashion into basicpy
ops and a new `torch.nn_module` op that is used to model the
attributes/methods of a torch::jit::Module IValue. See `Torch/ops.mlir`
for an example, and also check out the .py import tests in
`frontends/pytorch/test/module_import`.

As part of this change, a few housekeeping tasks:
- extract some helpers from graph_importer.cpp
- more helpers around the C API
- misc touchups
2021-01-28 11:55:17 -08:00
Stella Laurenzo 510f226df2 Expose signature metadata to ops and implement ATenRecognizeKernelsPass pass.
* Two op interfaces, one for querying instance metadata and one for getting static data needed to construct an op from a generic form.
* For torch.generic_kernel ops, metadata is splatted in during capture from Torch (it comes from the op registry, which will work for either device capture or graph import).
* Moved the 'add' out of the generated set so I can experiment on it. It implements the TorchBuildableKernelOpInterface interface which provides its metadata.
* The ATenRecognizeKernelsPass pass generically lowers from a torch.generic_kernel to recognized ops that implement the TorchBuildableKernelOpInterface, handling the various types of transformations that we allow at this stage.
2020-10-26 20:31:45 -07:00
Stella Laurenzo af4edb63ae Start reworking towards a shared library build.
* Need to have a dag of shared library deps in order to interop across python extensions (as presented in ODM).
* Introduced add_npcomp_library and friends to mirror the MLIR setup.
* Adds a libNPCOMP.so shared library.
* Redirects tools and extensions to link against libNPCOMP.so (instead of static libs).
* Moves all libraries to lib/, all binaries to bin/ and all python extensions to python/. The invariant is that the rpaths are setup to have a one level directory structure.
* Reworks the _torch_mlir extension to build like the others (still need to come up with a consolidated rule to do this instead of open coded).
* Includes an upstream version bump to pick up needed changes.

Sizes with dynamic linking (stripped, release, asserts enabled):
  libNPCOMP.so: 43M (includes much of the underlying LLVM codegen deps)
  libMLIR.so: 31M
  _npcomp.so: 1.6M (python extension)
  _torch_mlir.so: 670K (python extension)
  npcomp-capi-ir-test: 6.3K
  npcomp-opt: 351K
  npcomp-run-mlir: 461K
  mnist-playground: 530K

Still more can be done to normalize and optimize but this gets us structurally to the starting point.
2020-10-09 16:02:58 -07:00
Stella Laurenzo 3d74337be0 Add a torch.kernel_call op and associated predicates. 2020-09-29 15:10:38 -07:00
Stella Laurenzo 2c9ca79c89 Add boilerplate for Torch dialect. 2020-09-28 15:26:17 -07:00