torch-mlir/include/npcomp/Dialect/Torch/IR/TorchBase.td

//===-------------------------------------------------------*- tablegen -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef TORCH_BASE
#define TORCH_BASE

include "mlir/IR/OpBase.td"
include "npcomp/Dialect/Basicpy/IR/BasicpyDialect.td"
include "npcomp/Dialect/Numpy/IR/NumpyDialect.td"

def Torch_Dialect : Dialect {
  let name = "torch";
  let cppNamespace = "::mlir::NPCOMP::Torch";
  let description = [{
    Top-level dialect for interfacing PyTorch and MLIR.

    This dialect contains types and structural ops that model enough of
    PyTorch's behavior to allow for easy import/call-out. While not aiming to
    be completely isomorphic, it is laid out to make conversion in/out
    systematic for the supported features (some of which are aspirational):
      - Transitions between mutable and immutable tensors.
      - Gradient associations and management.
      - Custom ops.
      - Types specific to PyTorch such as torch.nn.Module structures
      - Module level constructs like quantization parameters, globals, etc.

    Where possible, this dialect composes with types and ops from the `Numpy`
    and `Basicpy` dialects, and those dialects should be considered "upstream"
    for basic Python and ND-Array based programming constructs.

    As a key point, this dialect does not contain any custom operations,
    including those that people would typically associate as core (see
    the `ATen` dialect for mathematical ops like add, conv, etc), instead
    modeling the open op-system that PyTorch reasons about natively.
  }];

  let hasRegionArgAttrVerify = 1;
  let hasConstantMaterializer = 1;
}

class TorchOpTrait<string name> : OpTrait, NativeTrait<"", ""> {
  let trait = name;
  let cppNamespace = "::mlir::NPCOMP::Torch::OpTrait";
}

def HasValueSemantics : TorchOpTrait<"HasValueSemantics">;
def IsTrailingUnderscoreInplaceVariant
  : TorchOpTrait<"IsTrailingUnderscoreInplaceVariant">;
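
// Usage sketch (illustrative only; not one of this file's definitions).
// Assuming an op is declared directly against `Torch_Dialect` with MLIR's
// generic `Op` class, the traits above would be attached through the op's
// trait list. The def names `Torch_ExampleTanhOp` / `Torch_ExampleTanh_Op`,
// the `example.*` mnemonics, and the `AnyType` operand/result types are
// hypothetical placeholders chosen for this sketch.
//
//   // Functional variant: marked as having value semantics.
//   def Torch_ExampleTanhOp : Op<Torch_Dialect, "example.tanh", [
//       HasValueSemantics
//     ]> {
//     let arguments = (ins AnyType:$self);
//     let results = (outs AnyType:$result);
//   }
//
//   // Trailing-underscore variant: marked as the in-place form.
//   def Torch_ExampleTanh_Op : Op<Torch_Dialect, "example.tanh_", [
//       IsTrailingUnderscoreInplaceVariant
//     ]> {
//     let arguments = (ins AnyType:$self);
//     let results = (outs AnyType:$result);
//   }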

#endif // TORCH_BASE

Blame history for this file, oldest commit first. The blame view repeats the file contents shown above; only the commit attribution is kept here.

Add boilerplate for Torch dialect. (2020-09-29 03:02:35 +08:00)
Attributed lines: the license header, the `TORCH_BASE` include guard and closing `#endif`, `include "mlir/IR/OpBase.td"`, and the `Torch_Dialect` definition with its description, except for the lines listed under the later commits below.

Add a torch.kernel_call op and associated predicates. (2020-09-30 05:17:34 +08:00)
Attributed lines: `include "npcomp/Dialect/Basicpy/IR/BasicpyDialect.td"` and `include "npcomp/Dialect/Numpy/IR/NumpyDialect.td"`.

Add initial TorchScript module importer (2021-01-28 08:35:44 +08:00)
Attributed lines: the description bullet "Types specific to PyTorch such as torch.nn.Module structures".

It turns out that this was easiest to structure as a general IValue importer, since torch modules are just one of the possible IValues. We import the IValue object graph in a braindead fashion into basicpy ops and a new `torch.nn_module` op that is used to model the attributes/methods of a torch::jit::Module IValue. See `Torch/ops.mlir` for an example, and also check out the .py import tests in `frontends/pytorch/test/module_import`.

As part of this change, a few housekeeping tasks:
- extract some helpers from graph_importer.cpp
- more helpers around the C API
- misc touchups

Basic infra for annotating shapes and dtypes on arguments. (2021-03-31 07:11:41 +08:00)
Attributed lines: `let hasRegionArgAttrVerify = 1;`.

These allow users to annotate a known "type bound" on the argument, which can seed shape/dtype inference. We don't rewrite the function types as part of the import process (it will happen in a yet-to-be-written pass) because:

1. We would need to interprocedurally rewrite all calls to keep the IR consistent. Currently, we have a place after GlobalizeObjectGraph but before we convert to tensors where this is convenient to do. Ideally, we would do this on the object graph representation.
2. We don't necessarily know that adjusting the function type is a legal calling convention change. The pass will have blessed knowledge (by the pass pipeline author) that adjusting the argument type based on the type bound is safe (which it frequently is).
3. Note that in principle, a type bound could be a fairly general thing (such as maximum sizes of dimensions, unions of multiple concrete types, etc.). The pass will in principle have logic to interpret the type bounds and to determine a suitable "best" (and legal) argument type.

Significantly restructure torch/aten import design. (2021-05-05 05:42:50 +08:00)
Attributed lines: `let hasConstantMaterializer = 1;`, the `TorchOpTrait` class, and the `HasValueSemantics` and `IsTrailingUnderscoreInplaceVariant` definitions.

This is a really major and invasive restructuring of the way we get torch operators (`torch::jit::Operator` / `c10::OperatorHandle`) into MLIR. Please forgive the challenging review, but due to the sheer invasiveness, it wasn't really practical to do it in sane smaller pieces. This fully replaces everything that was already working on the TorchScript path (actually, more -- we added tanh support to TorchToLinalg in order to delete the older code paths). Additionally, I've kept the lights on for the acap path too, including what little e2e stuff was working before (for expediency I made a few tiny compromises along the way that will be easy to undo when we give that path proper attention).

Overview of the new design:
- The torch operator `somens::someunqualname.someoverloadname` is imported as `torch.somens.someunqualname.someoverloadname` (skip the last dotted part if the overload name is empty), OR, if we don't have such an op registered, it is imported as `torch.operator "somens.someunqualname.someoverloadname" (...) : ...`.
- The addition of the "overload name" is a critical element here, as the `(ns,unqual,overload)` triple is unique, which solves a lot of problems we were having.
- This involves having separate MLIR ops for the `trailing_` and `.out` variants and all the different overloads. This seemed necessary, because the set of overloads is so wild and varied and unstructured. The previous design was leaning into some underlying structure that just isn't there -- the default situation is the "random overload that we want to manage on the MLIR side", rather than that being an exception. E.g. `aten::ne` (not-equal) has 21 overloads, only 4 of which are c10 dispatcher ops (see [gist](https://gist.github.com/silvasean/190ba918c550c956260e21254e1b8aa1)), and the "out" variant is really called `.Tensor_out` instead of `.out` as it frequently is for other ops.
- Rationale for all being in `torch` namespace: the set of operators is so varied and unstructured that "dialect per namespace" doesn't result in anything resembling the typical MLIR dialect boundary expectations. We could maybe draw the boundary at dispatcher ops vs non-dispatcher ops, but that doesn't seem to really result in very much useful structure at this point in time.
  - Note: within the torch operator registry, we effectively have a mini-basicpy subdialect (already type-resolved), which is reasonably structured.
- The existing Torch op interfaces are also removed -- now that we track the overload name, we can losslessly find the original operator.
- Instead of `ATenRecognizeKernelsPass`, we now have a `ReduceOpVariantsPass` that keys off certain traits (and perhaps eventually interfaces) to reduce variants of ops to a smaller set, ideally operating on immutable tensors and using surrounding ops to model the mutability/aliasing aspects.
  - Note: `torch.ns.unqual.overload` ops allow both immutable and mutable tensors (unlike the previous hard distinction in the common case). This is a premonition for a future change that will introduce a bona fide `!torch.tensor` type that will clean up a bunch of stuff.
- `TorchToLinalg` / `TorchToStd` supersede the existing "ATen->TCF->TCP->Linalg" path.
- The new `torch_ods_gen.py` supersedes `torch_signature_ods_gen.py`. It should look somewhat familiar, but the benefit of hindsight has allowed a lot of simplifications.

The overall trend seems to be to make the `torch` dialect a nice layer independent of anything else. It feels like as a natural result of various future changes we will be removing the reliance on basicpy+numpy dialects and have a nice self-contained type system too that properly models the TorchScript type system (including proper subtyping, mutable/immutable tensors, optional dtype, etc.).

Recommended review order:
- Start at some of the new import IR, e.g. in `frontends/pytorch/test/node_import/prim.py`, `frontends/pytorch/test/acap_export/test_export_add3.py`, and other tests.
- `frontends/pytorch/python/torch_mlir_utils/codegen/torch_ods_gen.py` and associated generated files:
  - `include/npcomp/Dialect/Torch/IR/GeneratedAtenOps.td`
  - `include/npcomp/Dialect/Torch/IR/GeneratedPrimOps.td`
- Inspect `ReduceOpVariants.cpp` / `reduce-op-variants.mlir` and the new traits in `include/npcomp/Dialect/Torch/IR/TorchTraits.h`
- Various code changes in the import path in `frontends/pytorch/csrc/builder`. Probably most interesting is the new code in `torch_to_mlir_utils.cpp` that has the logic to create the `torch.operator` ops or `torch.ns.unqual.overload` ops.

This is the [new ResNet IR](https://gist.github.com/silvasean/5407aafb710d07612b7b5b92eabecebe), just to be able to look at a substantial sample of IR in the new style.