Commit Graph

10 Commits (99178a167da8a04c01d3e35673a58386e7de6239)

Author SHA1 Message Date
Sean Silva 58c7030104 Support multiple instances of a class in GlobalizeObjectGraph.
This happens in practice with e.g. ResNet from torchvision (multiple
instances of the same BatchNorm class).

The key observation is that for this program, and the expected set of
programs, we can convert the program to the same globalized form with a
bit more static analysis and effort to suitably monomorphize the
program. Though what we are doing here is fairly annoying to implement,
it saves any nontrivial later pass from having to do similar analyses
(or worse). E.g. shape inference would need to be object-graph aware,
mutation/lifetime analyses would have to be aware, etc. Additionally, it
would make us front-load what it means to have a !torch.nn.Module type
on an ABI boundary, which we are just not ready to handle.

I'm really, really hoping that in practice we can get away with
this, otherwise it's going to be really rough designing a representation
(and implementing everything to back it) that is convenient to transform
and gracefully scales from full object graph (in the most dynamic case)
down to a fixed set of global slots like we have here (in the most
static case, which we presume a lot of practical programs fall into).

This also involved introducing a
`torch-prepare-for-globalize-object-graph` pass that does a minimal set of
lowerings to simplify the IR into a more orthogonal and analyzable form,
and a `torch-globalize-pipeline` helper.

Recommended review order:
- updated documentation in Passes.td
- new tests in `globalize-object-graph-multiple-instances*.mlir`
- implementation of GlobalizeObjectGraph.cpp
- PrepareForGlobalizeObjectGraph.cpp + prepare-for-globalize-object-graph.mlir
- misc stuff like torch-globalize-pipeline pipeline definition.

With this, we can import, globalize, and inline resnet18 from
torchvision:
https://gist.github.com/silvasean/821586afc19b67d9fb72030b2e0adeb8
2021-03-11 19:21:07 -08:00
Sean Silva c837dbb077 Properly import the entire torch::jit::CompilationUnit
This primarily unlocks proper handling of free functions (that is,
functions that are not methods of any torch.nn.Module).

Recommended review order:
- `ivalue_importer.cpp` + `ivalue_import/functions*.py`
- `GlobalizeObjectGraph.cpp` + test case
- misc other stuff

The `torch::jit::CompilationUnit` is basically a backing store or
"context" holding all the possible functions in the program. The
previous code was not explicitly accessing this data structure, since it
just imported the `torch::jit::Function`'s that it saw attached to
methods.

Subtly, any time a TorchScript module called into a free function, the
free function gets incorporated into the torch::jit::CompilationUnit,
but doesn't show up anywhere when dumping the module, except in the
curious pattern:

```
%5 : Function = prim::Constant[name="adaptive_avg_pool2d"]()
%6 : Tensor = prim::CallFunction(%5, %input.1, %4)
```

That is, calls are indirect calls, and are accessed via `prim::Constant`
materializing a function object. Even stranger, the `name` attribute here
doesn't really even tell the full story -- it doesn't correspond to
anything. It turns out that the c10::FunctionType itself actually holds
a pointer to the `torch::jit::Function` in the compilation unit
directly (so there is actually no indirection in prim::CallMethod,
because any two values of the same FunctionType call the same
function!). E.g. when converting the IR to bytecode, the "name" is
ignored [code link](1d6bd15790/torch/csrc/jit/runtime/interpreter.cpp (L937)).
We do import `prim::CallFunction` as a `std.call_indirect` though
because it's more braindead to do it that way (it gets canonicalized to
a direct call easily).
2021-03-01 12:08:01 -08:00
Sean Silva 79a3f639bf Give torch.global_slot an initializer region.
This is a much simpler representation than the ad-hoc initializer
function we had before. It is also less general, but given the rationale
in Passes.td it seems like the right tradeoff right now.

We can probably carry this representation for quite a while, and when we
can't, it likely means that TorchScript has fixed their object identity
bug and we probably need to just upgrade to a more general object graph
modeling (more general than GlobalizeObjectGraph).

In particular, we don't want to deal with defining and carrying around
this initializer function concept until we need it. For example, if we
want to constant-fold the global slots into uses, this is a much better
representation, and it plays better with symbol-dce (the initializer
function counts as a "use" of the symbol).

(the alternative would have been to write a pass that converts the
initializer function to this form when possible, but I realized that
lots of information had been lost which made that fairly annoying -- it
was all self-inflicted anyway, so best to just go to the source
(GlobalizeObjectGraph) before the information is lost)

Now symbol-dce works nicely (no more "training" bools)
```
pt_util ~/tmp/classifier.pt --import --exported-name forward \
| npcomp-opt -torch-globalize-object-graph -inline -symbol-dce
```
IR: https://gist.github.com/silvasean/8abe63d70d24e29d6db9170ccc8d512b
2021-02-26 16:24:19 -08:00
Sean Silva a375ccf9da Add ability to annotate TorchScript classes.
The first use case is to annotate certain program constructs as either
exported or private. In this commit we plumb it down to
GlobalizeObjectGraph which makes use of this information.

Recommended review order:
1. class_annotator.h/.cpp + `test/module_import/annotations/*`
    - New abstractions to communicate with Python code and annotate.
2. IR changes in TorchOps.td
    - Adding "private" attribute to various things.
3. ivalue_import.cpp changes
    - Module + ClassAnnotator = annotated IR
4. GlobalizeObjectGraph.cpp + tests
    - use new "private" attributes to create "private" IR.
    - also, tweak some of the op deleting mechanics, which was triggering
      some memory errors / assertions

With this, we can run the classifier through and inline it as follows:
```
frontends/pytorch/utils/pt_util.py --import --exported-name forward ~/tmp/classifier.pt \
| npcomp-opt -torch-globalize-object-graph -inline
```
IR: https://gist.github.com/silvasean/32dcad9f6270557f412094a77cecdd69
2021-02-25 11:28:34 -08:00
Sean Silva 1b769f7841 Extend GlobalizeObjectGraph to handle torch.prim.GetAttr returning NnModuleType
This happens in practice. With this, we can globalize slots for the
non-trivial classifier layer obtained from
https://github.com/NVIDIA/NeMo/blob/main/tutorials/nlp/Joint_Intent_and_Slot_Classification.ipynb

This also adds support for tuple return types, which were needed by that
model.
2021-02-19 10:23:25 -08:00
Sean Silva 158c5c484d Implement GlobalizeObjectGraph transformation.
This required restructuring of how we model TorchScript on import. The
main difference is that now we split out a `torch.class_type` that holds
methods and declarations of the types of each slot. This is more
consistent with TorchScript (our previous representation was
"denormalized").

Recommended reading order:
1. check out the description of `torch.class_type` in `TorchOps.td` and
   look at `test/Dialect/Torch/ops.mlir` and
   `frontends/pytorch/test/module_import/` to familiarize with the new
   representation.
   - Just look at the new IR. The diff between the old names and new
     names is confusing.
2. check out `test/Dialect/Torch/globalize-object-graph*.mlir`
   and read along with the pass description in
   `include/npcomp/Dialect/Torch/Transforms/Passes.td`
3. Read the code in `GlobalizeObjectGraph.cpp` and miscellaneous changes
   in `ivalue_importer.cpp`, `TorchOps.cpp`, etc.
2021-02-18 18:18:47 -08:00
Sean Silva 689b40c7a6 Add initial TorchScript module importer
It turns out that this was easiest to structure as a general IValue
importer, since torch module are just one of the possible IValue's.

We import the IValue object graph in a braindead fashion into basicpy
ops and a new `torch.nn_module` op that is used to model the
attributes/methods of a torch::jit::Module IValue. See `Torch/ops.mlir`
for an example, and also check out the .py import tests in
`frontends/pytorch/test/module_import`.

As part of this change, a few housekeeping tasks:
- extract some helpers from graph_importer.cpp
- more helpers around the C API
- misc touchups
2021-01-28 11:55:17 -08:00
Stella Laurenzo 510f226df2 Expose signature metadata to ops and implement ATenRecognizeKernelsPass pass.
* Two op interfaces, one for querying instance metadata and one for getting static data needed to construct an op from a generic form.
* For torch.generic_kernel ops, metadata is splatted in during capture from Torch (it comes from the op registry, which will work for either device capture or graph import).
* Moved the 'add' out of the generated set so I can experiment on it. It implements the TorchBuildableKernelOpInterface interface which provides its metadata.
* The ATenRecognizeKernelsPass pass generically lowers from a torch.generic_kernel to recognized ops that implement the TorchBuildableKernelOpInterface, handling the various types of transformations that we allow at this stage.
2020-10-26 20:31:45 -07:00
Stella Laurenzo 3d74337be0 Add a torch.kernel_call op and associated predicates. 2020-09-29 15:10:38 -07:00
Stella Laurenzo 2c9ca79c89 Add boilerplate for Torch dialect. 2020-09-28 15:26:17 -07:00