2021-02-18 03:28:51 +08:00
//===-- Passes.td - Pass definition file -------------------*- tablegen -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef NPCOMP_TORCH_PASSES
#define NPCOMP_TORCH_PASSES

include "mlir/Pass/PassBase.td"

def GlobalizeObjectGraph : Pass<"torch-globalize-object-graph", "ModuleOp"> {
let summary = "Converts TorchScript object graphs to a globalized form";
let constructor = "mlir::NPCOMP::Torch::createGlobalizeObjectGraphPass()";
let description = [{
This pass converts a subset of possible TorchScript modules into a
more restrictive lower-level form that strips away the need to be
concerned with instances of !torch.nn.Module<...> type. Specifically,
the object graph is flattened into a set of discrete globals
(`torch.global_slot`) that hold the program state.

The overarching goal is for a strict correspondence between the original
`torch.nn.Module` (call it `root`) that the user `torch.jit.script`'ed, and
the public interface of the resulting MLIR module. Specifically:
- The call `root.encoder.forward(...)` in Python corresponds to invoking
  the `func @encoder.forward` on the resulting MLIR module.
- The data member access `root.decoder.ids_to_strings_table` in Python
  corresponds to accessing the
  `torch.global_slot @decoder.ids_to_strings_table` on the resulting
  MLIR module.
In effect, the entire MLIR module corresponds to an instance of the `root`
object. This matches the intuitive behavior desired for deployment:
when the MLIR module (or, more likely, a compiled artifact derived from it)
is loaded in a deployed environment, it is equivalent to recreating the
original `root` object.
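
The naming correspondence described above can be sketched in Python. This is
only an illustration under stated assumptions: the `Module` class and the
`flatten` helper are hypothetical stand-ins invented for this sketch, and the
actual pass operates on `torch.nn_module` ops rather than on Python objects.

```python
# Hypothetical model of a TorchScript object graph; not an npcomp API.
class Module:
    def __init__(self, slots=None, methods=()):
        self.slots = dict(slots or {})   # attribute name -> value or Module
        self.methods = tuple(methods)    # method names defined on the class


def flatten(root):
    """Return (global_slots, functions) keyed by dotted path from the root."""
    global_slots, functions = {}, []

    def visit(module, path):
        # Each method becomes a function named by its path from the root.
        for name in module.methods:
            functions.append(".".join(path + [name]))
        # Each non-module slot becomes a global; submodules are recursed into.
        for name, value in module.slots.items():
            if isinstance(value, Module):
                visit(value, path + [name])
            else:
                global_slots[".".join(path + [name])] = value

    visit(root, [])
    return global_slots, functions


root = Module(
    slots={
        "encoder": Module(methods=["forward"]),
        "decoder": Module(slots={"ids_to_strings_table": ["a", "b"]}),
    }
)
slots, funcs = flatten(root)
print(funcs)  # ['encoder.forward']
print(slots)  # {'decoder.ids_to_strings_table': ['a', 'b']}
```

The dotted names play the role of the `@encoder.forward` and
`@decoder.ids_to_strings_table` symbols mentioned in the list above.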

This pass performs a complete change of the externally visible calling
convention of the MLIR module from a graph of objects and methods to a
fixed set of globals and functions. Additionally, method signatures are
changed such that all types of !torch.nn.Module are deleted from public
interfaces since they are guaranteed to correspond to a unique instance and
are thus redundant.

Of course, only a subset of programs can be transformed, and this pass fails
with an error if the conditions are violated.

Specifically, the restrictions are:
- There must be a unique torch.nn_module that is not the value of a slot
  of any other torch.nn_module
  - Rationale: Allows us to have a notion of a unique "root" op, which is
    used to define linkage. This also matches how TorchScript imports in
    practice (`torch.jit.script` imports a single root object).
- Multiple instances of the same class type are allowed, as long as it is
  possible to monomorphize ("template instantiate") functions so that each
  argument of !torch.nn.Module type corresponds to a unique instance.
  In practice, this limitation is either 1) (fundamental) due to truly
  dynamic use of modules, such as `m1 if cond() else m2` in Python code,
  or 2) (incidental) due to imprecision of the static analysis that this pass
  uses to determine when a single instance is relevant. In general,
  this analysis is equivalent to the halting problem, but we can aim to
  improve this pass such that practical patterns are all handled.
  - Rationale: The fundamental limitation "1)" guarantees that the
    program can be lowered to a fixed set of globals without indirection
    across globals. In the absence of this property, most compiler
    analyses/transformations are significantly curtailed (or require very
    sophisticated implementations). For the moment, this restriction
    is deemed to be sufficiently reasonable to be a pragmatic choice to
    avoid front-loading the complexity of working with a representation that
    really does a good job of representing that kind of program.
    Additionally, it avoids front-loading the handling of programs which
    have !torch.nn.Module types at external calling convention boundaries.
- All torch.nn_module's must be reachable by a unique path from the root
  - Rationale: Eliminates possibility of potentially exponential number of
    paths. Or worse, infinite number of paths when considering cyclic
    object graphs. Also as of Feb 2021, TorchScript won't import into
    this form (it has a bug related to the identity of submodules).
- Two slots cannot have initial values that alias each other.
  - Rationale: This makes the representation of initial values simpler. Also
    as of Feb 2021, TorchScript won't import into this form except
    potentially for Tensors (it has a bug related to the identity of
    objects). And for tensors, the npcomp IValue importer only supports a
    very restricted form of aliasing anyway for other reasons. We are
    waiting for signals that more general handling of object aliasing is
    important enough to devote the effort to it.
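
The unique-root and unique-path restrictions can be illustrated with a small
Python check. This is a sketch under stated assumptions, not the analysis the
pass performs: the `children` mapping (module id to the module ids stored in
its slots) and both helper functions are hypothetical.

```python
# Hypothetical sketch; `children` maps a module id to the module ids held
# in its slots. This is not the data structure used by the actual pass.
def find_unique_root(children):
    referenced = set()
    for kids in children.values():
        referenced.update(kids)
    # A root is a module that is not the value of any other module's slot.
    roots = [m for m in children if m not in referenced]
    if len(roots) != 1:
        raise ValueError(f"expected a unique root, found {len(roots)}")
    return roots[0]


def check_unique_paths(children):
    root = find_unique_root(children)
    seen = set()
    stack = [root]
    while stack:
        module = stack.pop()
        # Visiting a module twice means two distinct paths (or a cycle).
        if module in seen:
            raise ValueError(f"module {module!r} is reachable by more than one path")
        seen.add(module)
        stack.extend(children[module])
    if seen != set(children):
        raise ValueError("some modules are not reachable from the root")
    return root


ok = {"root": ["encoder", "decoder"], "encoder": [], "decoder": []}
print(check_unique_paths(ok))  # root

shared = {"root": ["a", "b"], "a": ["shared"], "b": ["shared"], "shared": []}
try:
    check_unique_paths(shared)
except ValueError as err:
    print(err)  # module 'shared' is reachable by more than one path
```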
}];
}

Support multiple instances of a class in GlobalizeObjectGraph.
This happens in practice with e.g. ResNet from torchvision (multiple
instances of the same BatchNorm class).
The key observation is that for this program, and the expected set of
programs, we can convert the program to the same globalized form with a
bit more static analysis and effort to suitably monomorphize the
program. Though what we are doing here is fairly annoying to implement,
it saves any nontrivial later pass from having to do similar analyses
(or worse). E.g. shape inference would need to be object-graph aware,
mutation/lifetime analyses would have to be aware, etc. Additionally, it
would make us front-load what it means to have a !torch.nn.Module type
on an ABI boundary, which we are just not ready to handle.
I'm really, really hoping that in practice we can get away with
this, otherwise it's going to be really rough designing a representation
(and implementing everything to back it) that is convenient to transform
and gracefully scales from full object graph (in the most dynamic case)
down to a fixed set of global slots like we have here (in the most
static case, which we presume a lot of practical programs fall into).
This also involved introducing a
`torch-prepare-for-globalize-object-graph` pass that does a minimal set of
lowerings to simplify the IR into a more orthogonal and analyzable form,
and a `torch-globalize-pipeline` helper.
Recommended review order:
- updated documentation in Passes.td
- new tests in `globalize-object-graph-multiple-instances*.mlir`
- implementation of GlobalizeObjectGraph.cpp
- PrepareForGlobalizeObjectGraph.cpp + prepare-for-globalize-object-graph.mlir
- misc stuff like torch-globalize-pipeline pipeline definition.
With this, we can import, globalize, and inline resnet18 from
torchvision:
https://gist.github.com/silvasean/821586afc19b67d9fb72030b2e0adeb8
2021-03-10 12:33:21 +08:00
def PrepareForGlobalizeObjectGraph
    : Pass<"torch-prepare-for-globalize-object-graph", "ModuleOp"> {
let summary = "Lowering in preparation for globalizing";
let constructor = "mlir::NPCOMP::Torch::createPrepareForGlobalizeObjectGraphPass()";
let description = [{
Establishes the invariants needed by the
torch-globalize-object-graph transformation. Fails if that cannot be
accomplished.

Currently, this just involves ensuring a small set of patterns have been
applied.
}];
}

def AdjustCallingConventions
    : Pass<"torch-adjust-calling-conventions", "ModuleOp"> {
let summary = "Adjust the calling conventions of functions";
let constructor = "mlir::NPCOMP::Torch::createAdjustCallingConventionsPass()";
let description = [{
Adjusts the calling conventions of functions in the module, with the aim of
preparing them for backends and further lowering passes. As this changes
the module calling convention, it should be considered a legalization
step towards reaching IR that is suitable for an appropriate backend.
All transformations are context-free and suitable for documenting
at the user level if needed to clarify the eventual calling convention
of compiled artifacts.
This is not an optimization.

The transformations performed are:
- `torch.type_bound` annotations are incorporated into the type of the
  function arguments, which should be `!numpy.ndarray<...>`'s.
- Python-isms are rewritten to MLIR-isms
  - NoneType return is rewritten to the absence of a return value.
  - (Not implemented yet) Tuple return is rewritten to multiple return
    values
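
The two implemented rewrites listed above can be sketched on a toy signature
model in Python. Everything here is an assumption made for illustration: the
`adjust_signature` helper is hypothetical, and the type strings are
placeholders rather than real MLIR type syntax.

```python
# Hypothetical string-based model of a function signature; the real pass
# rewrites MLIR function types, not Python lists.
def adjust_signature(arg_types, type_bounds, result_types):
    # A type bound, when present for an argument index, replaces the
    # declared type of that argument.
    new_args = [type_bounds.get(i, ty) for i, ty in enumerate(arg_types)]
    # A lone NoneType result becomes the absence of a return value.
    if result_types == ["NoneType"]:
        new_results = []
    else:
        new_results = list(result_types)
    return new_args, new_results


args, results = adjust_signature(
    arg_types=["ndarray<unknown>", "int"],
    type_bounds={0: "ndarray<[3,4],f32>"},
    result_types=["NoneType"],
)
print(args)     # ['ndarray<[3,4],f32>', 'int']
print(results)  # []
```

Tuple returns are left untouched in this sketch, matching the "not
implemented yet" status noted in the list.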
}];
}

def RefineTypes : Pass<"torch-refine-types", "FuncOp"> {
let summary = "Refine types";
let constructor = "mlir::NPCOMP::Torch::createRefineTypesPass()";
let description = [{
Refines types of the program. Currently, this means shapes and dtypes of
tensors/arrays.
}];
}

def InlineGlobalSlots : Pass<"torch-inline-global-slots", "ModuleOp"> {
let summary = "Inlines torch.global_slot ops.";
let constructor = "mlir::NPCOMP::Torch::createInlineGlobalSlotsPass()";
let description = [{
Inlines torch.global_slot ops when it is safe to do so.

Note: This pass inlines everything that is safe to inline. That is, it
doesn't have a cost model. This is likely to pessimize programs with
significant amounts of computation inside torch.global_slot initializer
regions (but this currently doesn't happen due to how TorchScript modules
are imported -- the contents are just constants).
}];
}

Significantly restructure torch/aten import design.
This is a really major and invasive restructuring of the way we get
torch operators (`torch::jit::Operator` / `c10::OperatorHandle`) into
MLIR. Please forgive the challenging review, but due to the sheer
invasiveness, it wasn't really practical to do it in sane smaller
pieces.
This fully replaces everything that was already working on the
TorchScript path (actually, more -- we added tanh support to
TorchToLinalg in order to delete the older code paths). Additionally,
I've kept the lights on for the acap path too, including what little e2e
stuff was working before (for expediency I made a few tiny compromises
along the way that will be easy to undo when we give that path proper
attention).
Overview of the new design:
- The torch operator `somens::someunqualname.someoverloadname` is
imported as `torch.somens.someunqualname.someoverloadname` (skip the
last dotted part if the overload name is empty), OR, if we don't have
such an op registered, it is imported as
`torch.operator "somens.someunqualname.someoverloadname" (...) : ...`.
- The addition of the "overload name" is a critical element here, as
the `(ns,unqual,overload)` triple is unique, which solves a lot of
problems we were having.
- This involves having separate MLIR ops for the `trailing_` and
`.out` variants and all the different overloads. This seemed
necessary, because the set of overloads is so wild and varied and
unstructured. The previous design was leaning into some underlying
structure that just isn't there -- the default situation is
the "random overload that we want to manage on the MLIR side",
rather than that being an exception. E.g. `aten::ne` (not-equal)
has 21 overloads, only 4 of which are c10 dispatcher ops; see
[gist](https://gist.github.com/silvasean/190ba918c550c956260e21254e1b8aa1),
and the "out" variant is really called `.Tensor_out` instead of
`.out` as it frequently is for other ops.
- Rationale for all being in `torch` namespace: the set of operators
are so varied and unstructured that "dialect per namespace"
doesn't result in anything resembling the typical MLIR dialect
boundary expectations. We could maybe draw the boundary at
dispatcher ops vs non-dispatcher ops, but that doesn't seem to
really result in very much useful structure at this point in time.
- Note: within the torch operator registry, we effectively have a
mini-basicpy subdialect (already type-resolved), which is reasonably
structured.
- The existing Torch op interfaces are also removed -- now that we
track the overload name, we can losslessly find the original
operator.
- Instead of `ATenRecognizeKernelsPass`, we now have a
`ReduceOpVariantsPass` that keys off certain traits (and perhaps
eventually interfaces) to reduce variants of ops to a smaller set,
ideally operating on immutable tensors and using surrounding ops to
model the mutability/aliasing aspects.
- Note: `torch.ns.unqual.overload` ops allow both immutable and
mutable tensors (unlike the previous hard distinction in the common
case). This is a premonition for a future change that will introduce a
bona fide `!torch.tensor` type that will clean up a bunch of stuff.
- `TorchToLinalg` / `TorchToStd` supersede the existing
"ATen->TCF->TCP->Linalg" path.
- The new `torch_ods_gen.py` supersedes `torch_signature_ods_gen.py`.
It should look somewhat familiar, but the benefit of hindsight has
allowed a lot of simplifications.
The overall trend seems to be to make the `torch` dialect a nice layer
independent of anything else. It feels like as a natural result of
various future changes we will be removing the reliance on basicpy+numpy
dialects and have a nice self-contained type system too that properly
models the TorchScript type system (including proper subtyping,
mutable/immutable tensors, optional dtype, etc.).
Recommended review order:
- Start at some of the new import IR, e.g. in
`frontends/pytorch/test/node_import/prim.py`,
`frontends/pytorch/test/acap_export/test_export_add3.py`, and other
tests.
- `frontends/pytorch/python/torch_mlir_utils/codegen/torch_ods_gen.py`
and associated generated files:
- `include/npcomp/Dialect/Torch/IR/GeneratedAtenOps.td`
- `include/npcomp/Dialect/Torch/IR/GeneratedPrimOps.td`
- Inspect `ReduceOpVariants.cpp` / `reduce-op-variants.mlir` and the new
traits in `include/npcomp/Dialect/Torch/IR/TorchTraits.h`
- Various code changes in the import path in
`frontends/pytorch/csrc/builder`. Probably most interesting is the new
code in `torch_to_mlir_utils.cpp` that has the logic to create the
`torch.operator` ops or `torch.ns.unqual.overload` ops.
This is the [new ResNet IR](https://gist.github.com/silvasean/5407aafb710d07612b7b5b92eabecebe),
just to be able to look at a substantial sample of IR in the new style.
2021-05-05 05:42:50 +08:00
def ReduceOpVariants : Pass<"torch-reduce-op-variants", "FuncOp"> {
let summary = "Reduces variants of ops to a smaller set of ops.";
let constructor = "mlir::NPCOMP::Torch::createReduceOpVariantsPass()";
let description = [{
Replaces ops with other ops to reduce the number of variants that
need to be handled elsewhere in the code.

Examples of the transformations done in this pass are:
- Convert operations with value semantics to operate on immutable tensors
- Convert operations with in-place semantics (e.g. `add_`) or inherently
  mutable semantics (e.g. `add.out`) to their value-semantic equivalent.
- Convert operations that involve a scalar promotion to the tensor
  variant plus a scalar promotion op.
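
The in-place and out-variant reductions can be illustrated by classifying op
names in Python. This is a sketch under stated assumptions: the
`reduce_variant` helper and its name-based rules are hypothetical, and the
"write back" flag is an invented stand-in for however the pass models the
mutation with surrounding ops.

```python
# Hypothetical classification by op name, for illustration only.
def reduce_variant(op_name):
    """Map an op name to (value_semantic_name, needs_write_back)."""
    base, _, overload = op_name.partition(".")
    if base.endswith("_"):
        # In-place variant such as `add_`: compute with `add`, then
        # write the result back into the mutated operand.
        name = base[:-1] + ("." + overload if overload else "")
        return name, True
    if overload == "out":
        # Out variant such as `add.out`: compute with `add`, then
        # write the result back into the `out` operand.
        return base, True
    if overload.endswith("_out"):
        # Out variant of a named overload, e.g. `X_out` reduces to `X`.
        return base + "." + overload[: -len("_out")], True
    # Already value-semantic: nothing to reduce.
    return op_name, False


print(reduce_variant("add_"))        # ('add', True)
print(reduce_variant("add.out"))     # ('add', True)
print(reduce_variant("add.Tensor"))  # ('add.Tensor', False)
```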
}];
}

Introduce `!torch.tensor` / `!torch.vtensor` types.
This removes our reliance on the numpy dialect and avoids our off-label
use of the builtin tensor type for modeling unknown dtypes. The
`!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor.
The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic
tensor. The new types look as follows syntactically:
```
// Least-static-information, non-value-semantic tensor.
!torch.tensor
// Explicit form of least-static-information variant.
!torch.tensor<*,unk>
// Least-static-information, value-semantic tensor.
!torch.vtensor
// Explicit form of least-static-information variant.
!torch.vtensor<*,unk>
// Fixed-set of allowable element types, with first-class support for
// Torch's frontend signedness semantics.
!torch.tensor<*,si32>
// First-class support for unknown dtypes.
!torch.tensor<[?,?,?],unk>
// Standard MLIR representation of `?` for unknown dimensions.
!torch.tensor<[?,2,?,4],unk>
// Statically shaped / dtyped example.
!torch.vtensor<[1,2,3,4],f32>
```
This required fairly significant changes throughout the compiler, but
overall it is a big cleanup. We now have a much clearer layering of "the
Torch frontend lowering" vs "lowering to std + linalg + etc.".
At the C++ level, there is `ValueTensorType`, `NonValueTensorType`.
We also have a helper `BaseTensorType` (kind of like ShapedType) which
interoperates with those two.
Included changes:
- New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for
creating torch tensor literals in the frontend.
- Consistently use signedness for the types (except i1 which I didn't
touch -- we need to sort out the situation with !basicpy.BoolType
there anyway so will be attending to that soon)
- Frontend can annotate whether an argument to the function has value
semantics. We currently require this, as our backend contract does not
currently allow us to even model the non-value-semantic case. Before,
the value-semantic assumption was randomly injected in the middle of
the pass pipeline.
- Move ArrayToTensor (now called MaximizeValueSemantics) and
RefinePublicReturn passes to torch dialect.
- The TorchToStd and TorchToLinalg passes are now type conversions from
`!torch.vtensor` to `tensor` and use the dialect conversion infra.
The overall conversion pipeline is set up following the best practices
of the "Type Conversions the Not-So-Hard Way" talk. This required
introducing `torch-func-builtin-tensorize` and
`torch-finalizing-builtin-tensorize` passes analogous to the upstream
bufferization passes with the corresponding names (mostly just
copypasta from there).
- Misc Torch-level canonicalizations -- we now cleanly layer the
lowering to std later in the pipeline, so we are gradually lessening
our reliance on random std constant folding before we get to that
point.
Recommended review order:
- New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp
- New ops in TorchOps.td / TorchOps.cpp
- Less important / more mechanical stuff
- Frontend changes.
- Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-05-21 08:07:18 +08:00
def MaximizeValueSemantics : Pass<"torch-maximize-value-semantics", "FuncOp"> {
let summary = "Use value-semantic tensors where possible.";
let description = [{
Use value-semantic tensors where possible to make the program more
analyzable by later passes (backends also prefer value semantics).

This pass is analogous to an SSA-formation pass in a
traditional compiler, with the added complication that arrays can alias
each other in interesting ways.

The current code doesn't implement any fancy algorithm, and is intended
to be just sufficient for a first e2e spike. An algorithm inspired by the
SSA formation literature will need to be implemented.

Also, this pass doesn't currently handle interprocedural rewriting
(of private functions), which is even more complex.
}];
let constructor = "mlir::NPCOMP::Torch::createMaximizeValueSemanticsPass()";
}

def RefinePublicReturn : Pass<"torch-refine-public-return", "ModuleOp"> {
let summary = "Refine public return";
let constructor = "mlir::NPCOMP::Torch::createRefinePublicReturnPass()";
let description = [{
Refines types of values returned from public functions based on
intraprocedural information.

This pass effectively encodes an assumption by the pass pipeline author that
the public calling convention of the module can have its types refined,
without causing ABI mismatches. This is frequently true -- for example, in
many systems, `!torch.vtensor<[?,?],f32>`, `!torch.vtensor<[3,3],f32>` and
`!torch.vtensor` are all the same data structure on calling
convention boundaries.

This pass is expected to run after shape refinement has occurred to
otherwise resolve shapes, and is currently mainly useful to convert
rank/dtype-erased function boundaries to ranked, dtyped code for
compiler backends.

This pass also changes the return to be a value tensor. This is incorrect
in general because users may rely on the aliasing properties of non-value
tensors, but for now it is deemed expedient to include this in this pass.
TODO: Avoid hardcoding the value tensor assumption. In general, much
as the type bound of an argument can be marked as having value semantics
at the frontend level based on user concerns, so too should the returns
from the function be annotated as having value semantics.
}];
}

def VerifyInvariantsBeforeBackendLowering
    : Pass<"torch-verify-invariants-before-backend-lowering", "ModuleOp"> {
let summary = "Verify invariants required by backend lowering";
let constructor =
    "mlir::NPCOMP::Torch::createVerifyInvariantsBeforeBackendLoweringPass()";
let description = [{
This pass checks any invariants needed by the process of lowering the
`torch` dialect to the npcomp backend contract.

The most important invariant is that all tensors should be ranked and have
a known dtype. It is useful to catch this early because it usually
represents a simple bug in RefineTypes, but can manifest as many different
kinds of obscure symptoms during lowering.
kinds of obscure symptoms during lowering.
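
The "ranked with a known dtype" invariant can be illustrated on the textual
form of the tensor types. This is a hand-rolled sketch, not the verifier
itself: the regular expression and the `satisfies_backend_invariant` helper
are hypothetical, and they assume the `!torch.vtensor<SIZES,DTYPE>` spelling
with `*` for unranked and `unk` for an unknown dtype.

```python
import re

# Matches `!torch.vtensor` / `!torch.tensor`, optionally followed by
# `<SIZES,DTYPE>`. Illustrative parser only, not an npcomp API.
_TENSOR_RE = re.compile(r"^!torch\.v?tensor(?:<(\*|\[[^\]]*\]),(\w+)>)?$")


def satisfies_backend_invariant(type_str):
    """True if the tensor type is ranked and has a known dtype."""
    match = _TENSOR_RE.match(type_str)
    if match is None:
        raise ValueError(f"not a torch tensor type: {type_str!r}")
    sizes, dtype = match.groups()
    if sizes is None:
        return False  # the bare form carries no static information
    return sizes != "*" and dtype != "unk"


print(satisfies_backend_invariant("!torch.vtensor<[?,?],f32>"))  # True
print(satisfies_backend_invariant("!torch.vtensor<*,unk>"))      # False
print(satisfies_backend_invariant("!torch.vtensor"))             # False
```

Unknown dimensions (`?`) are accepted here because the invariant only asks
for a known rank, not a static shape.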
}];
}

def FuncBackendTypeConversion : Pass<"torch-func-backend-type-conversion", "ModuleOp"> {
let summary = "Convert functions to operate on builtin tensors";
let constructor = "mlir::NPCOMP::Torch::createFuncBackendTypeConversionPass()";
let description = [{
|
|
|
|
Partial type conversion pass analogous in scope to the upstream
|
|
|
|
`func-bufferize` pass. See details there.
|
|
|
|
}];
|
|
|
|
}
|
|
|
|
|
2021-06-16 07:47:53 +08:00
|
|
|
def FinalizingBackendTypeConversion
|
|
|
|
: Pass<"torch-finalizing-backend-type-conversion", "FuncOp"> {
|
2021-05-21 08:07:18 +08:00
|
|
|
let summary = "Finalizes a partial conversion to builtin tensors";
|
|
|
|
let constructor =
|
2021-06-16 07:47:53 +08:00
|
|
|
"mlir::NPCOMP::Torch::createFinalizingBackendTypeConversionPass()";
|
2021-05-21 08:07:18 +08:00
|
|
|
let description = [{
|
|
|
|
Analogous in scope to the upstream `finalizing-bufferize` pass.
|
|
|
|
See details there.
|
|
|
|
}];
|
|
|
|
}
|
|
|
|
|
2021-02-18 03:28:51 +08:00
|
|
|
#endif // NPCOMP_TORCH_PASSES
|