torch-mlir/frontends/pytorch/test/ivalue_import/functions.py

# -*- Python -*-
# This file is licensed under a pytorch-style license
# See frontends/pytorch/LICENSE for license information.

import typing

import torch
import torch_mlir

# RUN: %PYTHON %s | npcomp-opt | FileCheck %s

mb = torch_mlir.ModuleBuilder()

# CHECK-LABEL:     func private @__torch__.TestModule.forward
# CHECK-SAME:        (%[[ARG0:.*]]: !torch.nn.Module<"__torch__.TestModule">, %[[ARG1:.*]]: !torch.tensor) -> !torch.tensor {
# CHECK:             %[[VAL_2:.*]] = constant @__torch__.identity : (!torch.tensor) -> !torch.tensor
# CHECK:             %[[VAL_3:.*]] = call_indirect %[[VAL_2]](%[[ARG1]]) : (!torch.tensor) -> !torch.tensor
# CHECK:             return %[[VAL_3]] : !torch.tensor
# CHECK:           }
# CHECK-LABEL:     func private @__torch__.identity
# CHECK-SAME:        (%[[ARG:.*]]: !torch.tensor) -> !torch.tensor {
# CHECK:             return %[[ARG]] : !torch.tensor
# CHECK:           }

# CHECK-LABEL:   torch.class_type @__torch__.TestModule  {
# CHECK:           torch.method "forward", @__torch__.TestModule.forward
# CHECK:         }

def identity(x):
    return x

class TestModule(torch.nn.Module):
    def __init__(self):
        super().__init__()
    def forward(self, x):
        return identity(x)

test_module = TestModule()
recursivescriptmodule = torch.jit.script(test_module)
# TODO: Automatically handle unpacking Python class RecursiveScriptModule into the underlying ScriptModule.
mb.import_module(recursivescriptmodule._c)
mb.module.operation.print()
Properly import the entire torch::jit::CompilationUnit This primarily unlocks proper handling of free functions (that is, functions that are not methods of any torch.nn.Module). Recommended review order: - `ivalue_importer.cpp` + `ivalue_import/functions*.py` - `GlobalizeObjectGraph.cpp` + test case - misc other stuff The `torch::jit::CompilationUnit` is basically a backing store or "context" holding all the possible functions in the program. The previous code was not explicitly accessing this data structure, since it just imported the `torch::jit::Function`'s that it saw attached to methods. Subtly, any time a TorchScript module called into a free function, the free function gets incorporated into the torch::jit::CompilationUnit, but doesn't show up anywhere when dumping the module, except in the curious pattern: ``` %5 : Function = prim::Constant[name="adaptive_avg_pool2d"]() %6 : Tensor = prim::CallFunction(%5, %input.1, %4) ``` That is, calls are indirect calls, and are accessed via `prim::Constant` materializing a function object. Even stranger, the `name` attribute here doesn't really even tell the full story -- it doesn't correspond to anything. It turns out that the c10::FunctionType itself actually holds a pointer to the `torch::jit::Function` in the compilation unit directly (so there is actually no indirection in prim::CallMethod, because any two values of the same FunctionType call the same function!). E.g. when converting the IR to bytecode, the "name" is ignored [code link](https://github.com/pytorch/pytorch/blob/1d6bd157902d4b1347a5d03122d02b407658e263/torch/csrc/jit/runtime/interpreter.cpp#L937). We do import `prim::CallFunction` as a `std.call_indirect` though because it's more braindead to do it that way (it gets canonicalized to a direct call easily). 2021-02-27 08:20:35 +08:00			`# -- Python --`
			`# This file is licensed under a pytorch-style license`
			`# See frontends/pytorch/LICENSE for license information.`

			`import typing`

			`import torch`
			`import torch_mlir`

			`# RUN: %PYTHON %s \| npcomp-opt \| FileCheck %s`

			`mb = torch_mlir.ModuleBuilder()`

			`# CHECK-LABEL: func private @__torch__.TestModule.forward`
Introduce `!torch.tensor` / `!torch.vtensor` types. This removes our reliance on the numpy dialect and avoids our off-label use of the builtin tnesor type for modeling unknown dtypes. The `!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor. The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic tensor. The new types look as follows syntactically: ``` // Least-static-information, non-value-semantic tensor. !torch.tensor // Explicit form of least-static-information variant. !torch.tensor<,unk> // Least-static-information, value-semantic tensor. !torch.vtensor // Explicit form of least-static-information variant. !torch.vtensor<,unk> // Fixed-set of allowable element types, with first-class support for // Torch's frontend signedness semantics. !torch.tensor<*,si32> // First-class support for unknown dtypes. !torch.tensor<[?,?,?],unk> // Standard MLIR representation of `?` for unknown dimensions. !torch.tensor<[?,2,?,4],unk> // Statically shaped / dtyped example. !torch.vtensor<[1,2,3,4],f32> ``` This required fairly significant changes throughout the compiler, but overall it is a big cleanup. We now have a much clearer layering of "the Torch frontend lowering" vs "lowering to std + linalg + etc.". At the C++ level, there is `ValueTensorType`, `NonValueTensorType`. We also have a helper `BaseTensorType` (kind of like ShapedType) which interoperates with those two. Included changes: - New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for creating torch tensor literals in the frontend. - Consistently use signedness for the types (except i1 which I didn't touch -- we need to sort out the situation with !basicpy.BoolType there anyway so will be attending to that soon) - Frontend can annotate whether an argument to the function has value semantics. We currently require this, as our backend contract does not currently allow us to even model the non-value-semantic case. Before, the value-semantic assumption was randomly injected in the middle of the pass pipeline. - Move ArrayToTensor (now called MaximizeValueSemantics) and RefinePublicReturn passes to torch dialect. - The TorchToStd and TorchToLinalg passes are now type conversions from `!torch.vtensor` to `tensor` and use the dialect conversion infra. The overall conversion pipeline is set up following the best practices of the "Type Conversions the Not-So-Hard Way" talk. This required introducing `torch-func-builtin-tensorize` and `torch-finalizing-builtin-tensorize` passes analogous to the upstream bufferization passes with the corresponding names (mostly just copypasta from there). - Misc Torch-level canonicalizations -- we now cleanly layer the lowering to std later in the pipeline, so we are gradually lessening our reliance on random std constant folding before we get to that point. Recommended review order: - New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp - New ops in TorchOps.td / TorchOps.cpp - Less important / more mechanical stuff - Frontend changes. - Pass changes/additions in `Torch/Transforms` and `Conversion/` 2021-05-21 08:07:18 +08:00			`# CHECK-SAME: (%[[ARG0:.]]: !torch.nn.Module<"__torch__.TestModule">, %[[ARG1:.]]: !torch.tensor) -> !torch.tensor {`
			`# CHECK: %[[VAL_2:.*]] = constant @__torch__.identity : (!torch.tensor) -> !torch.tensor`
			`# CHECK: %[[VAL_3:.*]] = call_indirect %[[VAL_2]](%[[ARG1]]) : (!torch.tensor) -> !torch.tensor`
			`# CHECK: return %[[VAL_3]] : !torch.tensor`
Properly import the entire torch::jit::CompilationUnit This primarily unlocks proper handling of free functions (that is, functions that are not methods of any torch.nn.Module). Recommended review order: - `ivalue_importer.cpp` + `ivalue_import/functions*.py` - `GlobalizeObjectGraph.cpp` + test case - misc other stuff The `torch::jit::CompilationUnit` is basically a backing store or "context" holding all the possible functions in the program. The previous code was not explicitly accessing this data structure, since it just imported the `torch::jit::Function`'s that it saw attached to methods. Subtly, any time a TorchScript module called into a free function, the free function gets incorporated into the torch::jit::CompilationUnit, but doesn't show up anywhere when dumping the module, except in the curious pattern: ``` %5 : Function = prim::Constant[name="adaptive_avg_pool2d"]() %6 : Tensor = prim::CallFunction(%5, %input.1, %4) ``` That is, calls are indirect calls, and are accessed via `prim::Constant` materializing a function object. Even stranger, the `name` attribute here doesn't really even tell the full story -- it doesn't correspond to anything. It turns out that the c10::FunctionType itself actually holds a pointer to the `torch::jit::Function` in the compilation unit directly (so there is actually no indirection in prim::CallMethod, because any two values of the same FunctionType call the same function!). E.g. when converting the IR to bytecode, the "name" is ignored [code link](https://github.com/pytorch/pytorch/blob/1d6bd157902d4b1347a5d03122d02b407658e263/torch/csrc/jit/runtime/interpreter.cpp#L937). We do import `prim::CallFunction` as a `std.call_indirect` though because it's more braindead to do it that way (it gets canonicalized to a direct call easily). 2021-02-27 08:20:35 +08:00			`# CHECK: }`
			`# CHECK-LABEL: func private @__torch__.identity`
Introduce `!torch.tensor` / `!torch.vtensor` types. This removes our reliance on the numpy dialect and avoids our off-label use of the builtin tnesor type for modeling unknown dtypes. The `!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor. The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic tensor. The new types look as follows syntactically: ``` // Least-static-information, non-value-semantic tensor. !torch.tensor // Explicit form of least-static-information variant. !torch.tensor<,unk> // Least-static-information, value-semantic tensor. !torch.vtensor // Explicit form of least-static-information variant. !torch.vtensor<,unk> // Fixed-set of allowable element types, with first-class support for // Torch's frontend signedness semantics. !torch.tensor<*,si32> // First-class support for unknown dtypes. !torch.tensor<[?,?,?],unk> // Standard MLIR representation of `?` for unknown dimensions. !torch.tensor<[?,2,?,4],unk> // Statically shaped / dtyped example. !torch.vtensor<[1,2,3,4],f32> ``` This required fairly significant changes throughout the compiler, but overall it is a big cleanup. We now have a much clearer layering of "the Torch frontend lowering" vs "lowering to std + linalg + etc.". At the C++ level, there is `ValueTensorType`, `NonValueTensorType`. We also have a helper `BaseTensorType` (kind of like ShapedType) which interoperates with those two. Included changes: - New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for creating torch tensor literals in the frontend. - Consistently use signedness for the types (except i1 which I didn't touch -- we need to sort out the situation with !basicpy.BoolType there anyway so will be attending to that soon) - Frontend can annotate whether an argument to the function has value semantics. We currently require this, as our backend contract does not currently allow us to even model the non-value-semantic case. Before, the value-semantic assumption was randomly injected in the middle of the pass pipeline. - Move ArrayToTensor (now called MaximizeValueSemantics) and RefinePublicReturn passes to torch dialect. - The TorchToStd and TorchToLinalg passes are now type conversions from `!torch.vtensor` to `tensor` and use the dialect conversion infra. The overall conversion pipeline is set up following the best practices of the "Type Conversions the Not-So-Hard Way" talk. This required introducing `torch-func-builtin-tensorize` and `torch-finalizing-builtin-tensorize` passes analogous to the upstream bufferization passes with the corresponding names (mostly just copypasta from there). - Misc Torch-level canonicalizations -- we now cleanly layer the lowering to std later in the pipeline, so we are gradually lessening our reliance on random std constant folding before we get to that point. Recommended review order: - New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp - New ops in TorchOps.td / TorchOps.cpp - Less important / more mechanical stuff - Frontend changes. - Pass changes/additions in `Torch/Transforms` and `Conversion/` 2021-05-21 08:07:18 +08:00			`# CHECK-SAME: (%[[ARG:.*]]: !torch.tensor) -> !torch.tensor {`
			`# CHECK: return %[[ARG]] : !torch.tensor`
Properly import the entire torch::jit::CompilationUnit This primarily unlocks proper handling of free functions (that is, functions that are not methods of any torch.nn.Module). Recommended review order: - `ivalue_importer.cpp` + `ivalue_import/functions*.py` - `GlobalizeObjectGraph.cpp` + test case - misc other stuff The `torch::jit::CompilationUnit` is basically a backing store or "context" holding all the possible functions in the program. The previous code was not explicitly accessing this data structure, since it just imported the `torch::jit::Function`'s that it saw attached to methods. Subtly, any time a TorchScript module called into a free function, the free function gets incorporated into the torch::jit::CompilationUnit, but doesn't show up anywhere when dumping the module, except in the curious pattern: ``` %5 : Function = prim::Constant[name="adaptive_avg_pool2d"]() %6 : Tensor = prim::CallFunction(%5, %input.1, %4) ``` That is, calls are indirect calls, and are accessed via `prim::Constant` materializing a function object. Even stranger, the `name` attribute here doesn't really even tell the full story -- it doesn't correspond to anything. It turns out that the c10::FunctionType itself actually holds a pointer to the `torch::jit::Function` in the compilation unit directly (so there is actually no indirection in prim::CallMethod, because any two values of the same FunctionType call the same function!). E.g. when converting the IR to bytecode, the "name" is ignored [code link](https://github.com/pytorch/pytorch/blob/1d6bd157902d4b1347a5d03122d02b407658e263/torch/csrc/jit/runtime/interpreter.cpp#L937). We do import `prim::CallFunction` as a `std.call_indirect` though because it's more braindead to do it that way (it gets canonicalized to a direct call easily). 2021-02-27 08:20:35 +08:00			`# CHECK: }`

			`# CHECK-LABEL: torch.class_type @__torch__.TestModule {`
			`# CHECK: torch.method "forward", @__torch__.TestModule.forward`
			`# CHECK: }`

			`def identity(x):`
			`return x`

			`class TestModule(torch.nn.Module):`
			`def __init__(self):`
			`super().__init__()`
			`def forward(self, x):`
			`return identity(x)`

			`test_module = TestModule()`
			`recursivescriptmodule = torch.jit.script(test_module)`
			`# TODO: Automatically handle unpacking Python class RecursiveScriptModule into the underlying ScriptModule.`
			`mb.import_module(recursivescriptmodule._c)`
			`mb.module.operation.print()`