# Torch-MLIR Architecture

## Introduction

The Torch-MLIR project provides core infrastructure for bridging the PyTorch
ecosystem and the MLIR ecosystem. For example, Torch-MLIR enables PyTorch models
to be lowered to a few different MLIR dialects. Torch-MLIR does not attempt to
provide a production end-to-end flow for PyTorch programs by itself, but is a
useful component for constructing one.

## Overview

Torch-MLIR has two parts, which we call the "frontend" and "backend". These two
halves interface at an abstraction layer that we call the "backend contract",
which is a subset of the `torch` dialect with certain properties appealing for
backends to lower from.

![Torch-MLIR Architecture](Torch-MLIR_Architecture.png)

The frontend of Torch-MLIR is concerned with interfacing to PyTorch itself, and
then normalizing the program to the "backend contract". This part involves build
system complexity and exposure to PyTorch APIs to get the program into the MLIR
`torch` dialect. When we interface with TorchScript, we additionally have a
large amount of lowering and simplification to do within MLIR on the `torch`
dialect.

The "backend" of Torch-MLIR takes IR in the "backend contract" form and lowers
it to various target dialects of interest to the MLIR ecosystem (various
"backends"). In particular, right now we support lowering to:

- Linalg-on-Tensors (+ `arith`, `tensor`, etc.)
- [TOSA](https://mlir.llvm.org/docs/Dialects/TOSA/)
- [MHLO](https://github.com/tensorflow/mlir-hlo)

The terms "frontend" and "backend" are highly overloaded in any compiler
project, but in Torch-MLIR they usually carry the meanings above. Sometimes
"frontend" means something even further up the stack, such as something in
PyTorch itself; when there is ambiguity, we say "at the PyTorch level".
Similarly, "backend" can sometimes refer to something sitting below
Linalg-on-Tensors, TOSA, or MHLO.

## The `torch` dialect

See [include/torch-mlir/Dialect/Torch/IR](https://github.com/llvm/torch-mlir/tree/main/include/torch-mlir/Dialect/Torch/IR)

The central MLIR abstraction in the Torch-MLIR project is the `torch` dialect.
This dialect supports progressive lowering from the raw imported PyTorch
programs that various PyTorch integration points provide, all the way down to
the backend contract.

The `torch` dialect must be versatile enough to support being imported by any
program capture mechanism in PyTorch -- this could be TorchDynamo, `torch.fx`,
LazyTensorCore, TorchScript, `torch.jit.trace`, etc. Thankfully, PyTorch is
factored such that we can handle this with one core import path, which is
through the PyTorch
"[JIT IR](https://github.com/pytorch/pytorch/blob/78c8a0d75220bdd4955415b5f81509e005af4232/torch/csrc/jit/OVERVIEW.md)",
and lives in
[torch-mlir/python/torch_mlir/dialects/torch/importer/jit_ir](https://github.com/llvm/torch-mlir/tree/e322f6a8784009b37aa354abfa9a40a80f30877d/python/torch_mlir/dialects/torch/importer/jit_ir).
The JIT IR is a highly principled IR that faithfully models a Python subset (+
tensors, the PyTorch op registry, and a few other things). All the other PyTorch
program representations can eventually bottom out on the JIT IR via some path
provided by PyTorch. The `torch` dialect is almost entirely in 1:1
correspondence with the JIT IR -- this allows the importer to be extremely small
(the core is
[under 500 lines of code](https://github.com/llvm/torch-mlir/blob/e322f6a8784009b37aa354abfa9a40a80f30877d/python/torch_mlir/dialects/torch/importer/jit_ir/csrc/node_importer.cpp#L1)).
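
For illustration, a registry call in the JIT IR imports essentially 1:1 into
the `torch` dialect. A rough sketch (the exact types depend on what the capture
mechanism provides):

```mlir
// JIT IR:  %1 : Tensor = aten::tanh(%0)
// imports into the `torch` dialect roughly as:
%1 = torch.aten.tanh %0 : !torch.tensor -> !torch.tensor
```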

### Ops

See [TorchOps.td](https://github.com/llvm/torch-mlir/blob/114f48e96c578ee76a6f83b3aa4aa229a8d5b76e/include/torch-mlir/Dialect/Torch/IR/TorchOps.td#L1)

The ops in the `torch` dialect are almost entirely generated from the PyTorch
JIT IR operator registry via the script
[torch_ods_gen.py](https://github.com/llvm/torch-mlir/blob/e322f6a8784009b37aa354abfa9a40a80f30877d/python/torch_mlir/dialects/torch/importer/jit_ir/build_tools/torch_ods_gen.py#L1) (invoked via [update_torch_ods.sh](https://github.com/llvm/torch-mlir/blob/main/build_tools/update_torch_ods.sh)).
This script queries the registry and generates MLIR
[ODS](https://mlir.llvm.org/docs/OpDefinitions/) in
[GeneratedTorchOps.td](https://github.com/llvm/torch-mlir/blob/e322f6a8784009b37aa354abfa9a40a80f30877d/include/torch-mlir/Dialect/Torch/IR/GeneratedTorchOps.td#L1). We have a guide for [adding a new op end-to-end](https://github.com/llvm/torch-mlir/wiki/Torch-ops-E2E-implementation).

There are also some manually implemented ops in the following categories (see
[TorchOps.td](https://github.com/llvm/torch-mlir/blob/e322f6a8784009b37aa354abfa9a40a80f30877d/include/torch-mlir/Dialect/Torch/IR/TorchOps.td#L1)):

- Ops used for modeling PyTorch IValue object graphs (e.g. `torch.nn_module`,
  `torch.class_type`).
- `torch.global_slot` and related ops, which are used to model an incremental
  lowering of the IValue object graphs.
- Ops that are supported in the JIT interpreter directly, and so don't have a
  corresponding op in the registry (e.g. `torch.prim.If`,
  `torch.prim.ListConstruct`, `torch.constant.*`).
- `torch.operator`, which is used to represent ops from the registry which
  haven't been generated by `torch_ods_gen.py` (see the sketch below).
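
Such an op carries its registry name as a string rather than having dedicated
ODS. A hypothetical example (sketch; `aten.my_unsupported_op` is a made-up
name):

```mlir
%0 = torch.operator "aten.my_unsupported_op"(%arg0, %int1) : (!torch.tensor, !torch.int) -> !torch.tensor
```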

### Types

See [TorchTypes.td](https://github.com/llvm/torch-mlir/blob/e322f6a8784009b37aa354abfa9a40a80f30877d/include/torch-mlir/Dialect/Torch/IR/TorchTypes.td#L1)

The `torch` dialect has a complete set of types modeling the PyTorch type
system, which itself is a strongly typed subset of the Python type system (+
tensors). These types are almost all 1:1 with the corresponding
[PyTorch types](https://github.com/pytorch/pytorch/blob/c54d18dbc7bb2f9fdd83c5de529702e5a02295c3/aten/src/ATen/core/jit_type.h#L1).

The one exception where a significant amount of design work has been done in
Torch-MLIR is the handling of tensors. Torch-MLIR's tensor types allow
progressive lowering from raw imported IR, which may be missing shapes, dtypes,
and value semantics, into the backend contract, which provides those. Torch-MLIR
has two tensor types, `ValueTensorType` (`!torch.vtensor`) and
`NonValueTensorType` (`!torch.tensor`), sharing most of their definition in
[TorchTypes.td](https://github.com/llvm/torch-mlir/blob/e322f6a8784009b37aa354abfa9a40a80f30877d/include/torch-mlir/Dialect/Torch/IR/TorchTypes.td#L58).
`NonValueTensorType` models a `torch.Tensor` including mutation, aliasing,
etc., while `ValueTensorType` has value semantics. That is, `ValueTensorType`
is immutable and non-aliased. These types have a common C++ base class,
[`BaseTensorType`](https://github.com/llvm/torch-mlir/blob/e322f6a8784009b37aa354abfa9a40a80f30877d/include/torch-mlir/Dialect/Torch/IR/TorchTypes.h#L40),
which permits abstracting across them. Both `ValueTensorType` and
`NonValueTensorType` have an optional list of optional sizes and an optional
dtype.
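
For intuition, the printed forms encode those optional components roughly as
follows (a sketch; see TorchTypes.td for the authoritative syntax):

```mlir
!torch.tensor                 // unknown sizes and dtype, non-value semantics
!torch.vtensor<*,f32>         // unknown sizes, known dtype, value semantics
!torch.vtensor<[?,1024],f32>  // known rank, partially known sizes, known dtype
```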

## The "backend contract"

See [satisfiesBackendContract](https://github.com/llvm/torch-mlir/blob/114f48e96c578ee76a6f83b3aa4aa229a8d5b76e/lib/Dialect/Torch/Transforms/LowerToBackendContract.cpp#L151)

The backend contract is a normalized form of the `torch` dialect with a set of
properties that make it easy to lower into various forms such as
Linalg-on-Tensors, TOSA, MHLO, or other forms that we don't provide out of the
box. The primary guarantees that we provide Torch-MLIR's backends are:

- All tensors have been converted to value semantics.
- All tensors have at least a known number of dimensions (i.e. rank), and
  ideally also a precise size for each dimension.
- All tensors have a known dtype.
- Certain ops have been decomposed to make them easier to handle (this is
  configurable).

See the extensive comments in the function `satisfiesBackendContract` (and its
callees) in the `LowerToBackendContract` pass for an extended rationale for
these decisions, and a precise definition of the backend contract.
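
For example, a function in backend contract form might look like this (a
hand-written sketch illustrating the guarantees above):

```mlir
// Value-semantic tensors with known rank and dtype throughout.
func.func @forward(%arg0: !torch.vtensor<[?,3],f32>) -> !torch.vtensor<[?,3],f32> {
  %0 = torch.aten.tanh %arg0 : !torch.vtensor<[?,3],f32> -> !torch.vtensor<[?,3],f32>
  return %0 : !torch.vtensor<[?,3],f32>
}
```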

## The Frontends

Torch-MLIR provides two main frontends:

- LazyTensorCore - a frontend based around intercepting PyTorch dispatcher
  calls and creating a graph that is lazily evaluated on demand.
- TorchScript - a frontend based around importing TorchScript functions or
  modules. Such modules or functions can be obtained via `torch.jit.script`,
  `torch.jit.trace`, or a few other methods in the PyTorch ecosystem.

Internally, these share a lot of the core import code.

### LazyTensorCore

Docs: https://github.com/llvm/torch-mlir/blob/main/docs/ltc_backend.md

LazyTensorCore (LTC) is a program capture method provided by PyTorch that does
device-level tracing. This low-level interception point sits below gradient
calculations, and is thus a good choice for training flows. The downside of LTC
is that it depends on having the whole PyTorch runtime available, so it cannot
be used for ahead-of-time compilation or capturing standalone program artifacts.

From an implementation perspective, the JIT IR produced by LazyTensorCore has
already had a number of transformations performed on it; in particular, after
importing from JIT IR to MLIR, the backend contract is trivially satisfied. So
the Torch-MLIR implementation complexity for LazyTensorCore is restricted to
build system and PyTorch integration, rather than actual MLIR compiler passes.

### TorchScript (`torch.jit.script`)

[TorchScript](https://pytorch.org/docs/stable/jit.html) is a strict Python
subset which is modeled faithfully in the JIT IR. Additionally, TorchScript can
represent a full `torch.nn.Module` object graph (hierarchy). This results in a
significant amount of work for the frontend to lower it to the backend
contract:

- The `torch.nn.Module` hierarchy must be lowered to the backend contract,
  which does not allow any program state.
- The program must be converted to value semantics (functionalized).
- Shapes and dtypes must be inferred.
- Many "Python-isms" must be simplified away, such as list appends, string
  operations, etc.

Because TorchScript does not naturally provide shapes or dtypes, we usually
require the user to annotate the expected shapes and dtypes of the arguments,
which we then propagate throughout the program.

`torch.jit.trace` produces JIT IR with shapes and dtypes already, but no value
semantics. Users also often want to erase the shapes in the trace to allow
dynamic shapes. Additionally, the Python-level data structures and APIs are
very parallel between `torch.jit.script` and `torch.jit.trace`, so we consider
both of those the same from the perspective of the compiler's responsibilities.
Both are accessed via the `torch_mlir.compile` Python API.

### Modeling the `torch.nn.Module` object (`IValue`) hierarchy for TorchScript

PyTorch consistently models a subset of Python objects with its concept of
[`IValue`](https://github.com/pytorch/pytorch/blob/1ee9eb52b612f5fb4b63bbda832e44c8902edb64/aten/src/ATen/core/ivalue.h#L171)
(interpreter value). These are used throughout PyTorch to represent Python
values. When one `torch.jit.script`'s a `torch.nn.Module`, the result is
actually an `IValue` that represents the module, with a hierarchy of children
`IValue`'s. Strictly speaking, JIT IR `torch::jit::Graph`'s are only used to
represent the bodies of methods on the modules. So in addition to importing the
JIT IR, we also need to import the `IValue`'s. This happens inside [ivalue_importer.cpp](https://github.com/llvm/torch-mlir/blob/fde390c7669e29362b18388448ef2b188713383f/python/torch_mlir/dialects/torch/importer/jit_ir/csrc/ivalue_importer.cpp#L1).

Most of the IValue modeling can reuse `torch` dialect ops that already exist
otherwise, such as `torch.constant.int` to represent an int in the object graph.
However, special IR constructs are needed for modeling the `torch.nn.Module`'s
themselves.

An example is:

```mlir
torch.class_type @c {
  torch.attr "b" : !torch.bool
  torch.attr "i" : !torch.int
  torch.attr "f" : !torch.float
  torch.attr "t" : !torch.tensor
  torch.method "get_tensor", @get_tensor
}
func.func private @get_tensor(%arg0: !torch.nn.Module<"c">) -> !torch.tensor {
  %2 = torch.prim.GetAttr %arg0["t"] : !torch.nn.Module<"c"> -> !torch.tensor
  return %2 : !torch.tensor
}

%true = torch.constant.bool true
%int3 = torch.constant.int 3
%float4.250000e01 = torch.constant.float 4.250000e+01
%0 = torch.tensor.literal(dense<1.000000e+00> : tensor<1xf32>) : !torch.tensor
%1 = torch.nn_module {
  torch.slot "b", %true : !torch.bool
  torch.slot "i", %int3 : !torch.int
  torch.slot "f", %float4.250000e01 : !torch.float
  torch.slot "t", %0 : !torch.tensor
} : !torch.nn.Module<"c">
```

See the documentation for the ops for more information on the semantics of this
form.

### Lowering TorchScript to the backend contract

The `torchscript-module-to-torch-backend-pipeline` contains the set of
simplifications used to convert TorchScript to the backend contract. At a high
level, it consists of the following transformations:

1. GlobalizeObjectGraph: This takes the `IValue` object graph and converts it
   into a flat list of globals (see `torch.global_slot` and related ops).
1. LowerToBackendContract: This pass iteratively applies a simplification
   pipeline until the backend contract is reached. The simplification pipeline
   consists of:
   - Standard canonicalization.
   - Shape refinement (sketched below). See
     [shape_lib.md](https://github.com/llvm/torch-mlir/blob/main/docs/shape_lib.md)
     for details.
   - Dtype refinement. See `RefineTypes`.
   - Decomposing ops into more primitive ops. See `DecomposeComplexOps`.
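
As a rough illustration of shape and dtype refinement (a hypothetical
before/after; the real pipeline interleaves these steps repeatedly):

```mlir
// Before refinement: the result type carries no size/dtype information.
%1 = torch.aten.tanh %0 : !torch.vtensor<[2,3],f32> -> !torch.vtensor
// After refinement: the result type is inferred from the operand type.
%1 = torch.aten.tanh %0 : !torch.vtensor<[2,3],f32> -> !torch.vtensor<[2,3],f32>
```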

### Layering of the PyTorch Dependency

One of the core principles of our Torch-MLIR <-> PyTorch interop is that
anything that links against PyTorch must interact with MLIR through
[the Torch-MLIR C API](https://github.com/llvm/torch-mlir/tree/main/include/torch-mlir-c).
This bypasses a number of very complex dependency and shared library issues.

Additionally, we maintain the invariant that the core MLIR compiler code (in
`lib/` and `include/`) never has a build dependency on PyTorch itself. This
strict isolation avoids a number of complex dependency issues and ensures that
`torch-mlir-opt` and similar debugging tools always provide the excellent
development and debugging experience that MLIR developers expect. Sometimes,
certain highly stable enums and related logic must be shared with upstream
PyTorch, and for those we copy code from PyTorch into
[TorchUpstream.h](https://github.com/llvm/torch-mlir/blob/fde390c7669e29362b18388448ef2b188713383f/include/torch-mlir/Dialect/Torch/Utils/TorchUpstream.h#L13).

## The Backends

Torch-MLIR provides three built-in backends, which take the backend contract IR
and lower it to the requirements of each backend. The three backends are:

- [`linalg`](https://mlir.llvm.org/docs/Dialects/Linalg/) on tensors (+ `arith`,
  `tensor`, etc.)
- [TOSA](https://mlir.llvm.org/docs/Dialects/TOSA/)
- [MHLO](https://github.com/tensorflow/mlir-hlo)

### The Linalg Backend (Linalg-on-Tensors)

Code: https://github.com/llvm/torch-mlir/tree/main/lib/Conversion/TorchToLinalg

The Linalg-on-Tensors backend was the first backend that we added, and it is
still the most complete. It fully supports dynamic shapes (known number of
dimensions but arbitrary dynamic dimension sizes). Since linalg was originally
designed as a dialect for transformations, it can be too low-level for certain
consumers.
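
To give a flavor of that granularity, a unary op like `torch.aten.tanh` ends up
as a `linalg.generic` whose body computes the elementwise function (a
hand-written sketch, not actual converter output):

```mlir
#map = affine_map<(d0) -> (d0)>
func.func @tanh(%in: tensor<?xf32>, %init: tensor<?xf32>) -> tensor<?xf32> {
  // One parallel loop over the elements; the scalar body applies tanh.
  %out = linalg.generic
      {indexing_maps = [#map, #map], iterator_types = ["parallel"]}
      ins(%in : tensor<?xf32>) outs(%init : tensor<?xf32>) {
  ^bb0(%a: f32, %b: f32):
    %t = math.tanh %a : f32
    linalg.yield %t : f32
  } -> tensor<?xf32>
  return %out : tensor<?xf32>
}
```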

### The TOSA Backend

Code: https://github.com/llvm/torch-mlir/tree/main/lib/Conversion/TorchToTosa

The TOSA backend was the second backend that we added. It remains preferred by
many users (especially "hardware" or "hardware-adjacent" folks). Some of its
characteristics are:

- It is tied to a [spec](https://www.mlplatform.org/tosa/tosa_spec.html) with a
  really clear "ISA-like" expository style that resonates with a lot of folks.
- The coarse-grained named-op approach is a good match for the many compilers
  that are designed that way.
- It has really good support for quantization / integer data types.
- It has clear versioning/stability guarantees on the op semantics.
- It is extremely solid with static shapes (and many of its users only care
  about static shapes, so that's fine).

### The MHLO Backend

Code: https://github.com/llvm/torch-mlir/tree/main/lib/Conversion/TorchToMhlo

The MHLO backend was the third backend that we added, and it offers a
reasonable blend of the benefits of the other two:

- It takes a coarse-grained named-op approach.
- It has a pretty clear spec for most of the ops (with a bit of mental
  translation and hoping that MHLO is the same as HLO):
  https://www.tensorflow.org/xla/operation_semantics
- It functionally supports dynamic shapes (though not as coherently and
  consistently as Linalg-on-Tensors, and the dynamic shape support falls
  outside the wonderful HLO docs above).
- It appears to be pretty tied to HLO (which is highly mature), so most of the
  op surface area doesn't change too much.
- It has a different set of principles than TOSA, which tend to make it more
  expressive at the cost of a larger abstraction gap from hardware. For
  example, TOSA limits (for highly considered reasons) the number of dimensions
  that certain operators can handle to 1D-4D, when from a purely algebraic
  perspective there isn't a good reason to not be more general. Similarly, more
  general forms of reduction and scatter also fall into MHLO nicely while
  TOSA's principles tend to bias it away from that.

### Backend Implementation

All the backends are implemented using the MLIR [Dialect Conversion
infrastructure](https://mlir.llvm.org/docs/DialectConversion/). This involves
converting the `torch` dialect types to other types, so we closely follow the
principles from the "Type Conversions the Not-So-Hard Way" talk
([slides](https://drive.google.com/file/d/1FVbzCXxZzS9LBLuvpPNLWJD-XDkt54ky/view?usp=sharing),
[recording](https://drive.google.com/file/d/1VfVajitgf8ZPnd-HRkJvaJiFLhBsluXN/view?usp=sharing)).
We follow the standard `{include,lib}/Conversion/TorchTo*` convention used in
MLIR for conversion passes.

For type conversion, we provide
[BackendTypeConversion.cpp](https://github.com/llvm/torch-mlir/blob/57681f794764a34c34e2be7f07f7dfbcafa683c1/lib/Dialect/TorchConversion/Transforms/BackendTypeConversion.cpp#L1)
and
[BackendTypeConversionPasses.cpp](https://github.com/llvm/torch-mlir/blob/57681f794764a34c34e2be7f07f7dfbcafa683c1/lib/Dialect/TorchConversion/Transforms/BackendTypeConversionPasses.cpp#L1)
which provide a default conversion from `torch` dialect types to the builtin
`tensor` type and scalar integer/float types. These are not the right choice for
all backends, but can be copied and adapted by backends. These files closely
follow the "Type Conversions the Not-So-Hard Way" talk.
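
Concretely, the default conversion works by inserting materializations between
the two type systems at conversion boundaries, along these lines (a sketch; see
the files above for the actual op names and patterns):

```mlir
// Cross from the `torch` type system into builtin types and back.
%builtin = torch_c.to_builtin_tensor %torch_value : !torch.vtensor<[2,3],f32> -> tensor<2x3xf32>
%torch_value_again = torch_c.from_builtin_tensor %builtin : tensor<2x3xf32> -> !torch.vtensor<[2,3],f32>
```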

## Testing

See
[development.md](https://github.com/llvm/torch-mlir/blob/9c8b96272057f4f8210de5842b6952228434cfa2/development.md#testing)
for more details on running tests.

Torch-MLIR has two types of tests:

1. End-to-end execution tests. These compile and run a program and check the
   result against the expected output from execution on native Torch. These use
   a homegrown testing framework (see
   [framework.py](https://github.com/llvm/torch-mlir/blob/7d4a0d0e2b65c7ce8de19993f3b10ad5344fe32b/python/torch_mlir_e2e_test/torchscript/framework.py#L6))
   and the test suite lives at `python/torch_mlir_e2e_test/test_suite`.

2. Compiler and Python API unit tests. These use LLVM's `lit` testing
   framework. For example, these might involve using `torch-mlir-opt` to run a
   pass and check the output with `FileCheck` (see the sketch below). `lit` is
   flexible enough to unit test various Python pieces, importers, and LTC this
   way as well.
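
A unit test of this kind might look roughly as follows (a hypothetical test,
assuming `torch.aten.mul.int` has a constant folder):

```mlir
// RUN: torch-mlir-opt %s -canonicalize | FileCheck %s

// CHECK-LABEL: func.func @fold_mul
// CHECK: %[[INT6:.*]] = torch.constant.int 6
// CHECK: return %[[INT6]]
func.func @fold_mul() -> !torch.int {
  %int2 = torch.constant.int 2
  %int3 = torch.constant.int 3
  %0 = torch.aten.mul.int %int2, %int3 : !torch.int, !torch.int -> !torch.int
  return %0 : !torch.int
}
```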

### Why so much end-to-end testing?

Torch-MLIR places a heavy emphasis on end-to-end testing for the following
reasons:

Reason 1: Even if a compiler pass produces the output IR that the author
expected, that output IR may not correctly implement the semantics of the op.
This is especially true for the complex, often poorly specified deep learning
operators that Torch-MLIR is mainly concerned with. It is critical to run these
against the source of truth to ensure a correct implementation.

Reason 2: There are many patterns in Torch-MLIR's backends that really just
expand one op into other ops without any real logic. When we started Torch-MLIR,
we were very religious about always having `.mlir` unit tests even for these
"macro expansion" patterns, but we found that these tests 1) never caught a bug
and 2) interfered with refactoring / caused spurious extra work (changing op
syntax, etc.). There is not much point to having a bunch of tests like this,
which are basically just rewriting the builder calls in a different syntax:

```
// MyPass.cpp
b.create<FooOp>(...)
b.create<BarOp>(...)

// test.mlir
// CHECK: foo
// CHECK: bar
```

Such a test is simply checking that the implementation of an op is the way it
is. There is no way to change the implementation while keeping the test
passing, so the test is fully redundant with the implementation.

Because of this, many Torch-MLIR patches adding support for new ops have no
`.mlir` unit tests, and only include end-to-end test(s). We generally make sure
that our end-to-end tests are as targeted as possible. As a result, when
debugging end-to-end test failures, the resulting reproducers (which our test
framework automatically produces for failures) are usually already fully
reduced test cases.

### Do's and Don'ts for unit vs. end-to-end testing

DO use an end-to-end test if you are implementing a new op or extending the
support for an existing op.

DO use a unit test if your lowering for an op has multiple cases / logic. This
also helps future maintainers of the lowering to see in one place all the
different edge cases of the op that you had to handle. (These can easily be
reduced out of the end-to-end tests you added.)

DON'T use a unit test if your lowering pattern could be described as a trivial
"macro expansion" of one op into another op or set of ops. That is, if you feel
like your unit test is just rewriting `b.create<...>(...)` into `CHECK: ...`,
then it is probably not a useful unit test.

DON'T add a unit test for trivial changes to RefineTypes.

With the exceptions above, all changes should include appropriate unit tests, as
is standard in the LLVM and MLIR community. This includes full coverage of all
canonicalizations, pretty printing, passes, errors, and diagnostics.

### The RefBackend (Reference Backend)

In order to run end-to-end tests, Torch-MLIR needs an end-to-end flow.
Thankfully, upstream MLIR has just enough pieces to precariously put one
together that is sufficient for testing.

The RefBackend consists of a few minor
[C++ passes](https://github.com/llvm/torch-mlir/blob/114f48e96c578ee76a6f83b3aa4aa229a8d5b76e/include/torch-mlir/RefBackend/Passes.td#L1)
filling in some corners missing upstream and
[Python glue logic](https://github.com/llvm/torch-mlir/blob/114f48e96c578ee76a6f83b3aa4aa229a8d5b76e/python/torch_mlir_e2e_test/linalg_on_tensors_backends/refbackend.py#L1)
to pull together upstream functionality into a working system.

The RefBackend accepts Linalg-on-Tensors as input. It mainly just bufferizes the
ops and lowers them to loops. Note that TOSA and MHLO support lowering to
Linalg-on-Tensors, so all our end-to-end testing bottoms out on the RefBackend.

The RefBackend is absolutely not suitable for any production use case. It leaks
memory, doesn't support any error handling, performs no optimizations, and
probably has a bunch of other horrible characteristics. We are patiently waiting
for the upstream MLIR community to produce a viable end-to-end flow with better
characteristics.