torch-mlir/lib/Conversion/TorchToLinalg/TorchToLinalg.cpp

461 lines
20 KiB
C++
Raw Normal View History

//===----------------------------------------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
Significantly restructure torch/aten import design. This is a really major and invasive restructuring of the way we get torch operators (`torch::jit::Operator` / `c10::OperatorHandle`) into MLIR. Please forgive the challenging review, but due to the sheer invasiveness, it wasn't really practical do do it in sane smaller pieces. This fully replaces everything that was already working on the TorchScript path (actually, more -- we added tanh support to TorchToLinalg in order to delete the older code paths). Additionally, I've kept the lights on for the acap path too, including what little e2e stuff was working before (for expediency I made a few tiny compromises along the way that will be easy to undo when we give that path proper attention). Overview of the new design: - The torch operator `somens::someunqualname.someoverloadname` is imported as `torch.somens.someunqualname.someoverloadname` (skip the last dotted part if the overload name is empty), OR, if we don't have such an op registered, it is imported as `torch.operator "somens.someunqualname.someoverloadname" (...) : ...`. - The addition of the "overload name" is a critical element here, as the `(ns,unqual,overload)` triple is unique, which solves a lot of problems we were having. - This involves having separate MLIR ops for the `trailing_` and `.out` variants and all the different overloads. This seemed necessary, because the set of overloads is so wild and varied and unstructured. The previous design was leaning into some underlying structure that just isn't there -- the default situation is the "random overload that we want to manage on the MLIR side", rather than that being an exception. E.g. `aten::ne` (not-equal) has 21 overloads, only 4 of which are c10 dispatcher ops see [gist](https://gist.github.com/silvasean/190ba918c550c956260e21254e1b8aa1), and the "out" variant is really called `.Tensor_out` instead of `.out` as it frequently is for other ops. - Rationale for all being in `torch` namespace: the set of operators are so varied and unstructured that "dialect per namespace" doesn't result in anything resembling the typical MLIR dialect boundary expectations. We could maybe draw the boundary at dispatcher ops vs non-dispatcher ops, but that doesn't seem to really result in very much useful structure at this point in time. - Note: within the torch operator registry, we effectively have a mini-basicpy subdialect (already type-resolved), which is reasonably structured. - The existing Torch op interfaces are also removed -- now that we track the overload name, we can losslessly find the original operator. - Instead of `ATenRecognizeKernelsPass`, we now have a `ReduceOpVariantsPass` that keys off certain traits (and perhaps eventually interfaces) to reduce variants of ops to a smaller set, ideally operating on immutable tensors and using surrounding ops to model the mutability/aliasing aspects. - Note: `torch.ns.unqual.overload` ops allow both immutable and mutable tensors (unlike the previous hard distinction in the common case). This is a premonition for a future change that will introduce a bona fide `!torch.tensor` type that will clean up a bunch of stuff. - `TorchToLinalg` / `TorchToStd` supercede the existing "ATen->TCF->TCP->Linalg" path. - The new `torch_ods_gen.py` supercedes `torch_signature_ods_gen.py`. It should look somewhat familiar, but the benefit of hindsight has allowed a lot of simplifications. The overall trend seems to be to make the `torch` dialect a nice layer independent of anything else. It feels like as a natural result of various future changes we will be removing the reliance on basicpy+numpy dialects and have a nice self-contained type system too that properly models the TorchScript type system (including proper subtyping, mutable/immutable tensors, optional dtype, etc.). Recommended review order: - Start at some of the new import IR, e.g. in `frontends/pytorch/test/node_import/prim.py`, `frontends/pytorch/test/acap_export/test_export_add3.py`, and other tests. - `frontends/pytorch/python/torch_mlir_utils/codegen/torch_ods_gen.py` and associated generated files: - `include/npcomp/Dialect/Torch/IR/GeneratedAtenOps.td` - `include/npcomp/Dialect/Torch/IR/GeneratedPrimOps.td` - Inspect `ReduceOpVariants.cpp` / `reduce-op-variants.mlir` and the new traits in `include/npcomp/Dialect/Torch/IR/TorchTraits.h` - Various code changes in the import path in `frontends/pytorch/csrc/builder`. Probably most interesting is the new code in `torch_to_mlir_utils.cpp` that has the logic to create the `torch.operator` ops or `torch.ns.unqual.overload` ops. This is the [new ResNet IR](https://gist.github.com/silvasean/5407aafb710d07612b7b5b92eabecebe), just to be able to look at a substantial sample of IR in the new style.
2021-05-05 05:42:50 +08:00
#include "npcomp/Conversion/TorchToLinalg/TorchToLinalg.h"
#include "../PassDetail.h"
#include "mlir/Dialect/Linalg/IR/LinalgOps.h"
Significantly restructure torch/aten import design. This is a really major and invasive restructuring of the way we get torch operators (`torch::jit::Operator` / `c10::OperatorHandle`) into MLIR. Please forgive the challenging review, but due to the sheer invasiveness, it wasn't really practical do do it in sane smaller pieces. This fully replaces everything that was already working on the TorchScript path (actually, more -- we added tanh support to TorchToLinalg in order to delete the older code paths). Additionally, I've kept the lights on for the acap path too, including what little e2e stuff was working before (for expediency I made a few tiny compromises along the way that will be easy to undo when we give that path proper attention). Overview of the new design: - The torch operator `somens::someunqualname.someoverloadname` is imported as `torch.somens.someunqualname.someoverloadname` (skip the last dotted part if the overload name is empty), OR, if we don't have such an op registered, it is imported as `torch.operator "somens.someunqualname.someoverloadname" (...) : ...`. - The addition of the "overload name" is a critical element here, as the `(ns,unqual,overload)` triple is unique, which solves a lot of problems we were having. - This involves having separate MLIR ops for the `trailing_` and `.out` variants and all the different overloads. This seemed necessary, because the set of overloads is so wild and varied and unstructured. The previous design was leaning into some underlying structure that just isn't there -- the default situation is the "random overload that we want to manage on the MLIR side", rather than that being an exception. E.g. `aten::ne` (not-equal) has 21 overloads, only 4 of which are c10 dispatcher ops see [gist](https://gist.github.com/silvasean/190ba918c550c956260e21254e1b8aa1), and the "out" variant is really called `.Tensor_out` instead of `.out` as it frequently is for other ops. - Rationale for all being in `torch` namespace: the set of operators are so varied and unstructured that "dialect per namespace" doesn't result in anything resembling the typical MLIR dialect boundary expectations. We could maybe draw the boundary at dispatcher ops vs non-dispatcher ops, but that doesn't seem to really result in very much useful structure at this point in time. - Note: within the torch operator registry, we effectively have a mini-basicpy subdialect (already type-resolved), which is reasonably structured. - The existing Torch op interfaces are also removed -- now that we track the overload name, we can losslessly find the original operator. - Instead of `ATenRecognizeKernelsPass`, we now have a `ReduceOpVariantsPass` that keys off certain traits (and perhaps eventually interfaces) to reduce variants of ops to a smaller set, ideally operating on immutable tensors and using surrounding ops to model the mutability/aliasing aspects. - Note: `torch.ns.unqual.overload` ops allow both immutable and mutable tensors (unlike the previous hard distinction in the common case). This is a premonition for a future change that will introduce a bona fide `!torch.tensor` type that will clean up a bunch of stuff. - `TorchToLinalg` / `TorchToStd` supercede the existing "ATen->TCF->TCP->Linalg" path. - The new `torch_ods_gen.py` supercedes `torch_signature_ods_gen.py`. It should look somewhat familiar, but the benefit of hindsight has allowed a lot of simplifications. The overall trend seems to be to make the `torch` dialect a nice layer independent of anything else. It feels like as a natural result of various future changes we will be removing the reliance on basicpy+numpy dialects and have a nice self-contained type system too that properly models the TorchScript type system (including proper subtyping, mutable/immutable tensors, optional dtype, etc.). Recommended review order: - Start at some of the new import IR, e.g. in `frontends/pytorch/test/node_import/prim.py`, `frontends/pytorch/test/acap_export/test_export_add3.py`, and other tests. - `frontends/pytorch/python/torch_mlir_utils/codegen/torch_ods_gen.py` and associated generated files: - `include/npcomp/Dialect/Torch/IR/GeneratedAtenOps.td` - `include/npcomp/Dialect/Torch/IR/GeneratedPrimOps.td` - Inspect `ReduceOpVariants.cpp` / `reduce-op-variants.mlir` and the new traits in `include/npcomp/Dialect/Torch/IR/TorchTraits.h` - Various code changes in the import path in `frontends/pytorch/csrc/builder`. Probably most interesting is the new code in `torch_to_mlir_utils.cpp` that has the logic to create the `torch.operator` ops or `torch.ns.unqual.overload` ops. This is the [new ResNet IR](https://gist.github.com/silvasean/5407aafb710d07612b7b5b92eabecebe), just to be able to look at a substantial sample of IR in the new style.
2021-05-05 05:42:50 +08:00
#include "mlir/Dialect/Math/IR/Math.h"
#include "mlir/Dialect/MemRef/IR/MemRef.h" // TODO: For `memref.dim`.
#include "mlir/Dialect/Traits.h"
#include "mlir/Transforms/DialectConversion.h"
Significantly restructure torch/aten import design. This is a really major and invasive restructuring of the way we get torch operators (`torch::jit::Operator` / `c10::OperatorHandle`) into MLIR. Please forgive the challenging review, but due to the sheer invasiveness, it wasn't really practical do do it in sane smaller pieces. This fully replaces everything that was already working on the TorchScript path (actually, more -- we added tanh support to TorchToLinalg in order to delete the older code paths). Additionally, I've kept the lights on for the acap path too, including what little e2e stuff was working before (for expediency I made a few tiny compromises along the way that will be easy to undo when we give that path proper attention). Overview of the new design: - The torch operator `somens::someunqualname.someoverloadname` is imported as `torch.somens.someunqualname.someoverloadname` (skip the last dotted part if the overload name is empty), OR, if we don't have such an op registered, it is imported as `torch.operator "somens.someunqualname.someoverloadname" (...) : ...`. - The addition of the "overload name" is a critical element here, as the `(ns,unqual,overload)` triple is unique, which solves a lot of problems we were having. - This involves having separate MLIR ops for the `trailing_` and `.out` variants and all the different overloads. This seemed necessary, because the set of overloads is so wild and varied and unstructured. The previous design was leaning into some underlying structure that just isn't there -- the default situation is the "random overload that we want to manage on the MLIR side", rather than that being an exception. E.g. `aten::ne` (not-equal) has 21 overloads, only 4 of which are c10 dispatcher ops see [gist](https://gist.github.com/silvasean/190ba918c550c956260e21254e1b8aa1), and the "out" variant is really called `.Tensor_out` instead of `.out` as it frequently is for other ops. - Rationale for all being in `torch` namespace: the set of operators are so varied and unstructured that "dialect per namespace" doesn't result in anything resembling the typical MLIR dialect boundary expectations. We could maybe draw the boundary at dispatcher ops vs non-dispatcher ops, but that doesn't seem to really result in very much useful structure at this point in time. - Note: within the torch operator registry, we effectively have a mini-basicpy subdialect (already type-resolved), which is reasonably structured. - The existing Torch op interfaces are also removed -- now that we track the overload name, we can losslessly find the original operator. - Instead of `ATenRecognizeKernelsPass`, we now have a `ReduceOpVariantsPass` that keys off certain traits (and perhaps eventually interfaces) to reduce variants of ops to a smaller set, ideally operating on immutable tensors and using surrounding ops to model the mutability/aliasing aspects. - Note: `torch.ns.unqual.overload` ops allow both immutable and mutable tensors (unlike the previous hard distinction in the common case). This is a premonition for a future change that will introduce a bona fide `!torch.tensor` type that will clean up a bunch of stuff. - `TorchToLinalg` / `TorchToStd` supercede the existing "ATen->TCF->TCP->Linalg" path. - The new `torch_ods_gen.py` supercedes `torch_signature_ods_gen.py`. It should look somewhat familiar, but the benefit of hindsight has allowed a lot of simplifications. The overall trend seems to be to make the `torch` dialect a nice layer independent of anything else. It feels like as a natural result of various future changes we will be removing the reliance on basicpy+numpy dialects and have a nice self-contained type system too that properly models the TorchScript type system (including proper subtyping, mutable/immutable tensors, optional dtype, etc.). Recommended review order: - Start at some of the new import IR, e.g. in `frontends/pytorch/test/node_import/prim.py`, `frontends/pytorch/test/acap_export/test_export_add3.py`, and other tests. - `frontends/pytorch/python/torch_mlir_utils/codegen/torch_ods_gen.py` and associated generated files: - `include/npcomp/Dialect/Torch/IR/GeneratedAtenOps.td` - `include/npcomp/Dialect/Torch/IR/GeneratedPrimOps.td` - Inspect `ReduceOpVariants.cpp` / `reduce-op-variants.mlir` and the new traits in `include/npcomp/Dialect/Torch/IR/TorchTraits.h` - Various code changes in the import path in `frontends/pytorch/csrc/builder`. Probably most interesting is the new code in `torch_to_mlir_utils.cpp` that has the logic to create the `torch.operator` ops or `torch.ns.unqual.overload` ops. This is the [new ResNet IR](https://gist.github.com/silvasean/5407aafb710d07612b7b5b92eabecebe), just to be able to look at a substantial sample of IR in the new style.
2021-05-05 05:42:50 +08:00
#include "npcomp/Dialect/Torch/IR/TorchOps.h"
#include "npcomp/Dialect/Torch/Transforms/BackendTypeConversion.h"
using namespace mlir;
using namespace mlir::NPCOMP;
Significantly restructure torch/aten import design. This is a really major and invasive restructuring of the way we get torch operators (`torch::jit::Operator` / `c10::OperatorHandle`) into MLIR. Please forgive the challenging review, but due to the sheer invasiveness, it wasn't really practical do do it in sane smaller pieces. This fully replaces everything that was already working on the TorchScript path (actually, more -- we added tanh support to TorchToLinalg in order to delete the older code paths). Additionally, I've kept the lights on for the acap path too, including what little e2e stuff was working before (for expediency I made a few tiny compromises along the way that will be easy to undo when we give that path proper attention). Overview of the new design: - The torch operator `somens::someunqualname.someoverloadname` is imported as `torch.somens.someunqualname.someoverloadname` (skip the last dotted part if the overload name is empty), OR, if we don't have such an op registered, it is imported as `torch.operator "somens.someunqualname.someoverloadname" (...) : ...`. - The addition of the "overload name" is a critical element here, as the `(ns,unqual,overload)` triple is unique, which solves a lot of problems we were having. - This involves having separate MLIR ops for the `trailing_` and `.out` variants and all the different overloads. This seemed necessary, because the set of overloads is so wild and varied and unstructured. The previous design was leaning into some underlying structure that just isn't there -- the default situation is the "random overload that we want to manage on the MLIR side", rather than that being an exception. E.g. `aten::ne` (not-equal) has 21 overloads, only 4 of which are c10 dispatcher ops see [gist](https://gist.github.com/silvasean/190ba918c550c956260e21254e1b8aa1), and the "out" variant is really called `.Tensor_out` instead of `.out` as it frequently is for other ops. - Rationale for all being in `torch` namespace: the set of operators are so varied and unstructured that "dialect per namespace" doesn't result in anything resembling the typical MLIR dialect boundary expectations. We could maybe draw the boundary at dispatcher ops vs non-dispatcher ops, but that doesn't seem to really result in very much useful structure at this point in time. - Note: within the torch operator registry, we effectively have a mini-basicpy subdialect (already type-resolved), which is reasonably structured. - The existing Torch op interfaces are also removed -- now that we track the overload name, we can losslessly find the original operator. - Instead of `ATenRecognizeKernelsPass`, we now have a `ReduceOpVariantsPass` that keys off certain traits (and perhaps eventually interfaces) to reduce variants of ops to a smaller set, ideally operating on immutable tensors and using surrounding ops to model the mutability/aliasing aspects. - Note: `torch.ns.unqual.overload` ops allow both immutable and mutable tensors (unlike the previous hard distinction in the common case). This is a premonition for a future change that will introduce a bona fide `!torch.tensor` type that will clean up a bunch of stuff. - `TorchToLinalg` / `TorchToStd` supercede the existing "ATen->TCF->TCP->Linalg" path. - The new `torch_ods_gen.py` supercedes `torch_signature_ods_gen.py`. It should look somewhat familiar, but the benefit of hindsight has allowed a lot of simplifications. The overall trend seems to be to make the `torch` dialect a nice layer independent of anything else. It feels like as a natural result of various future changes we will be removing the reliance on basicpy+numpy dialects and have a nice self-contained type system too that properly models the TorchScript type system (including proper subtyping, mutable/immutable tensors, optional dtype, etc.). Recommended review order: - Start at some of the new import IR, e.g. in `frontends/pytorch/test/node_import/prim.py`, `frontends/pytorch/test/acap_export/test_export_add3.py`, and other tests. - `frontends/pytorch/python/torch_mlir_utils/codegen/torch_ods_gen.py` and associated generated files: - `include/npcomp/Dialect/Torch/IR/GeneratedAtenOps.td` - `include/npcomp/Dialect/Torch/IR/GeneratedPrimOps.td` - Inspect `ReduceOpVariants.cpp` / `reduce-op-variants.mlir` and the new traits in `include/npcomp/Dialect/Torch/IR/TorchTraits.h` - Various code changes in the import path in `frontends/pytorch/csrc/builder`. Probably most interesting is the new code in `torch_to_mlir_utils.cpp` that has the logic to create the `torch.operator` ops or `torch.ns.unqual.overload` ops. This is the [new ResNet IR](https://gist.github.com/silvasean/5407aafb710d07612b7b5b92eabecebe), just to be able to look at a substantial sample of IR in the new style.
2021-05-05 05:42:50 +08:00
using namespace mlir::NPCOMP::Torch;
// -----------------------------------------------------------------------------
// Patterns (as this grows, it should be organized into multiple files)
// -----------------------------------------------------------------------------
// This is going to eventually be O(#aten ops), which is in the 100s.
//
// Most of these patterns consist of:
// 1. Checking that the operand/result types and other static properties are
// good-enough to create a valid linalg op (such as operands being of
// ranks/dtypes acceptable to the linalg op).
// 2. Creating dynamic error guards, usually checking a predicate on the
// compatibility of operand shapes.
// 3. Creating init tensors for the computation op. Usually this involves
// reifying IR for a shape transfer function based on the operand shapes.
// 4. Creating a named linalg op to replace the original op.
//
// TODO: Use linalg OpDSL to autogenerate at least 1)/2)/3) such
// that these patterns become mostly mechanical associations of
// "aten.foo -> linalg.foo".
Introduce `!torch.tensor` / `!torch.vtensor` types. This removes our reliance on the numpy dialect and avoids our off-label use of the builtin tnesor type for modeling unknown dtypes. The `!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor. The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic tensor. The new types look as follows syntactically: ``` // Least-static-information, non-value-semantic tensor. !torch.tensor // Explicit form of least-static-information variant. !torch.tensor<*,unk> // Least-static-information, value-semantic tensor. !torch.vtensor // Explicit form of least-static-information variant. !torch.vtensor<*,unk> // Fixed-set of allowable element types, with first-class support for // Torch's frontend signedness semantics. !torch.tensor<*,si32> // First-class support for unknown dtypes. !torch.tensor<[?,?,?],unk> // Standard MLIR representation of `?` for unknown dimensions. !torch.tensor<[?,2,?,4],unk> // Statically shaped / dtyped example. !torch.vtensor<[1,2,3,4],f32> ``` This required fairly significant changes throughout the compiler, but overall it is a big cleanup. We now have a much clearer layering of "the Torch frontend lowering" vs "lowering to std + linalg + etc.". At the C++ level, there is `ValueTensorType`, `NonValueTensorType`. We also have a helper `BaseTensorType` (kind of like ShapedType) which interoperates with those two. Included changes: - New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for creating torch tensor literals in the frontend. - Consistently use signedness for the types (except i1 which I didn't touch -- we need to sort out the situation with !basicpy.BoolType there anyway so will be attending to that soon) - Frontend can annotate whether an argument to the function has value semantics. We currently require this, as our backend contract does not currently allow us to even model the non-value-semantic case. Before, the value-semantic assumption was randomly injected in the middle of the pass pipeline. - Move ArrayToTensor (now called MaximizeValueSemantics) and RefinePublicReturn passes to torch dialect. - The TorchToStd and TorchToLinalg passes are now type conversions from `!torch.vtensor` to `tensor` and use the dialect conversion infra. The overall conversion pipeline is set up following the best practices of the "Type Conversions the Not-So-Hard Way" talk. This required introducing `torch-func-builtin-tensorize` and `torch-finalizing-builtin-tensorize` passes analogous to the upstream bufferization passes with the corresponding names (mostly just copypasta from there). - Misc Torch-level canonicalizations -- we now cleanly layer the lowering to std later in the pipeline, so we are gradually lessening our reliance on random std constant folding before we get to that point. Recommended review order: - New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp - New ops in TorchOps.td / TorchOps.cpp - Less important / more mechanical stuff - Frontend changes. - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-05-21 08:07:18 +08:00
static LogicalResult verifyLinalgCompatibleTypes(Operation *op,
PatternRewriter &rewriter) {
// Check the value tensor is ranked as expected by Linalg.
// TODO: Remove this check but use a separate verification pass to verify the
// invariants expected by later passes.
auto isValidLinalgType = [](Type type) {
auto tensor = type.dyn_cast<ValueTensorType>();
return !tensor ||
tensor.toBuiltinTensor().dyn_cast_or_null<RankedTensorType>();
};
bool valid = llvm::all_of(op->getOperandTypes(), isValidLinalgType) &&
llvm::all_of(op->getResultTypes(), isValidLinalgType);
if (!valid)
return rewriter.notifyMatchFailure(op, "type cannot be lowered to linalg");
return success();
}
namespace {
class ConvertAtenBatchNormOp : public OpConversionPattern<AtenBatchNormOp> {
public:
using OpConversionPattern::OpConversionPattern;
LogicalResult
matchAndRewrite(AtenBatchNormOp op, ArrayRef<Value> operands,
ConversionPatternRewriter &rewriter) const override {
AtenBatchNormOp::Adaptor adaptor(operands);
MLIRContext *context = op->getContext();
Location loc = op->getLoc();
Value input = adaptor.input();
Value weight = adaptor.weight();
Value bias = adaptor.bias();
Value runningMean = adaptor.running_mean();
Value runningVar = adaptor.running_var();
Value training = adaptor.training();
Value eps = adaptor.eps();
// TODO: Handle the None cases for the optional parameters:
// weight, bias, running_mean, running_var.
if (failed(verifyLinalgCompatibleTypes(op, rewriter)))
return failure();
auto inputType = input.getType().cast<RankedTensorType>();
auto weightType = weight.getType().cast<RankedTensorType>();
auto biasType = bias.getType().cast<RankedTensorType>();
auto runningMeanType = runningMean.getType().cast<RankedTensorType>();
auto runningVarType = runningVar.getType().cast<RankedTensorType>();
auto inputRank = inputType.getRank();
if (inputRank <= 2)
return rewriter.notifyMatchFailure(
op, "input should have rank larger than 2");
if (weightType.getRank() != 1 || biasType.getRank() != 1 ||
runningMeanType.getRank() != 1 || runningVarType.getRank() != 1) {
return rewriter.notifyMatchFailure(
op, "expect weight, bias, running_mean and running_var to be rank 1");
}
// TODO: Add support for training.
auto constFalse = rewriter.create<ConstantOp>(
loc, IntegerAttr::get(IntegerType::get(context, 1), 0));
auto trainingFalse =
rewriter.create<CmpIOp>(loc, CmpIPredicate::eq, training, constFalse);
rewriter.create<AssertOp>(
loc, trainingFalse,
rewriter.getStringAttr("training is not supported for now"));
// num_features C from an expected input of size (N,C,D,H,W ...)
Value numFeatures = rewriter.create<memref::DimOp>(loc, input, 1);
auto contractingDim0EqualsNumFeatures = [&](Value v) {
auto dim0 = rewriter.create<memref::DimOp>(loc, v, 0);
auto dim0Equal =
rewriter.create<CmpIOp>(loc, CmpIPredicate::eq, numFeatures, dim0);
rewriter.create<AssertOp>(
loc, dim0Equal,
rewriter.getStringAttr(
"expect the size of dim 0 equal to the number of features"));
};
contractingDim0EqualsNumFeatures(weight);
contractingDim0EqualsNumFeatures(bias);
contractingDim0EqualsNumFeatures(runningMean);
contractingDim0EqualsNumFeatures(runningVar);
auto indexingMap = AffineMap::get(
/*dimCount=*/inputRank,
/*symbolCount=*/0, rewriter.getAffineDimExpr(1), context);
SmallVector<AffineMap> indexingMaps = {
rewriter.getMultiDimIdentityMap(inputRank), // input
indexingMap, // weight
indexingMap, // bias
indexingMap, // runningMean
indexingMap, // runningVar
rewriter.getMultiDimIdentityMap(inputRank), // output
};
SmallVector<StringRef> iteratorTypes(inputRank, "parallel");
Value batchNorm =
rewriter
.create<linalg::GenericOp>(
loc, input.getType(),
ValueRange{input, weight, bias, runningMean, runningVar}, input,
/*indexingMaps=*/indexingMaps,
/*iteratorTypes=*/iteratorTypes,
[&](OpBuilder &b, Location loc, ValueRange args) {
Value input = args[0], weight = args[1], bias = args[2],
mean = args[3], var = args[4];
// ((input - mean) / sqrt(var + eps)) * weight + bias
Value inputSubMean = b.create<SubFOp>(loc, input, mean);
// The eps is always f64.
Value truncatedEps =
b.create<FPTruncOp>(loc, var.getType(), eps);
Value varPlusEps = b.create<AddFOp>(loc, var, truncatedEps);
Value rSTD = b.create<math::RsqrtOp>(loc, varPlusEps);
Value temp = b.create<MulFOp>(loc, inputSubMean, rSTD);
Value timesWeight = b.create<MulFOp>(loc, temp, weight);
Value plusBias = b.create<AddFOp>(loc, timesWeight, bias);
b.create<linalg::YieldOp>(loc, plusBias);
})
.getResult(0);
Type newResultType = getTypeConverter()->convertType(op.getType());
rewriter.replaceOpWithNewOp<tensor::CastOp>(op, newResultType, batchNorm);
return success();
}
};
} // namespace
Introduce `!torch.tensor` / `!torch.vtensor` types. This removes our reliance on the numpy dialect and avoids our off-label use of the builtin tnesor type for modeling unknown dtypes. The `!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor. The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic tensor. The new types look as follows syntactically: ``` // Least-static-information, non-value-semantic tensor. !torch.tensor // Explicit form of least-static-information variant. !torch.tensor<*,unk> // Least-static-information, value-semantic tensor. !torch.vtensor // Explicit form of least-static-information variant. !torch.vtensor<*,unk> // Fixed-set of allowable element types, with first-class support for // Torch's frontend signedness semantics. !torch.tensor<*,si32> // First-class support for unknown dtypes. !torch.tensor<[?,?,?],unk> // Standard MLIR representation of `?` for unknown dimensions. !torch.tensor<[?,2,?,4],unk> // Statically shaped / dtyped example. !torch.vtensor<[1,2,3,4],f32> ``` This required fairly significant changes throughout the compiler, but overall it is a big cleanup. We now have a much clearer layering of "the Torch frontend lowering" vs "lowering to std + linalg + etc.". At the C++ level, there is `ValueTensorType`, `NonValueTensorType`. We also have a helper `BaseTensorType` (kind of like ShapedType) which interoperates with those two. Included changes: - New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for creating torch tensor literals in the frontend. - Consistently use signedness for the types (except i1 which I didn't touch -- we need to sort out the situation with !basicpy.BoolType there anyway so will be attending to that soon) - Frontend can annotate whether an argument to the function has value semantics. We currently require this, as our backend contract does not currently allow us to even model the non-value-semantic case. Before, the value-semantic assumption was randomly injected in the middle of the pass pipeline. - Move ArrayToTensor (now called MaximizeValueSemantics) and RefinePublicReturn passes to torch dialect. - The TorchToStd and TorchToLinalg passes are now type conversions from `!torch.vtensor` to `tensor` and use the dialect conversion infra. The overall conversion pipeline is set up following the best practices of the "Type Conversions the Not-So-Hard Way" talk. This required introducing `torch-func-builtin-tensorize` and `torch-finalizing-builtin-tensorize` passes analogous to the upstream bufferization passes with the corresponding names (mostly just copypasta from there). - Misc Torch-level canonicalizations -- we now cleanly layer the lowering to std later in the pipeline, so we are gradually lessening our reliance on random std constant folding before we get to that point. Recommended review order: - New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp - New ops in TorchOps.td / TorchOps.cpp - Less important / more mechanical stuff - Frontend changes. - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-05-21 08:07:18 +08:00
namespace {
class ConvertAtenMmOp : public OpConversionPattern<AtenMmOp> {
public:
using OpConversionPattern::OpConversionPattern;
LogicalResult
matchAndRewrite(AtenMmOp op, ArrayRef<Value> operands,
ConversionPatternRewriter &rewriter) const override {
Location loc = op->getLoc();
Value lhs = operands[0];
Value rhs = operands[1];
Introduce `!torch.tensor` / `!torch.vtensor` types. This removes our reliance on the numpy dialect and avoids our off-label use of the builtin tnesor type for modeling unknown dtypes. The `!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor. The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic tensor. The new types look as follows syntactically: ``` // Least-static-information, non-value-semantic tensor. !torch.tensor // Explicit form of least-static-information variant. !torch.tensor<*,unk> // Least-static-information, value-semantic tensor. !torch.vtensor // Explicit form of least-static-information variant. !torch.vtensor<*,unk> // Fixed-set of allowable element types, with first-class support for // Torch's frontend signedness semantics. !torch.tensor<*,si32> // First-class support for unknown dtypes. !torch.tensor<[?,?,?],unk> // Standard MLIR representation of `?` for unknown dimensions. !torch.tensor<[?,2,?,4],unk> // Statically shaped / dtyped example. !torch.vtensor<[1,2,3,4],f32> ``` This required fairly significant changes throughout the compiler, but overall it is a big cleanup. We now have a much clearer layering of "the Torch frontend lowering" vs "lowering to std + linalg + etc.". At the C++ level, there is `ValueTensorType`, `NonValueTensorType`. We also have a helper `BaseTensorType` (kind of like ShapedType) which interoperates with those two. Included changes: - New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for creating torch tensor literals in the frontend. - Consistently use signedness for the types (except i1 which I didn't touch -- we need to sort out the situation with !basicpy.BoolType there anyway so will be attending to that soon) - Frontend can annotate whether an argument to the function has value semantics. We currently require this, as our backend contract does not currently allow us to even model the non-value-semantic case. Before, the value-semantic assumption was randomly injected in the middle of the pass pipeline. - Move ArrayToTensor (now called MaximizeValueSemantics) and RefinePublicReturn passes to torch dialect. - The TorchToStd and TorchToLinalg passes are now type conversions from `!torch.vtensor` to `tensor` and use the dialect conversion infra. The overall conversion pipeline is set up following the best practices of the "Type Conversions the Not-So-Hard Way" talk. This required introducing `torch-func-builtin-tensorize` and `torch-finalizing-builtin-tensorize` passes analogous to the upstream bufferization passes with the corresponding names (mostly just copypasta from there). - Misc Torch-level canonicalizations -- we now cleanly layer the lowering to std later in the pipeline, so we are gradually lessening our reliance on random std constant folding before we get to that point. Recommended review order: - New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp - New ops in TorchOps.td / TorchOps.cpp - Less important / more mechanical stuff - Frontend changes. - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-05-21 08:07:18 +08:00
// A user can write an errorneous program where `aten.mm` is in fact called
// with operands of invalid rank or dtype. We cannot convert to linalg in
// this case or we will get a verifier error, which corresponds to breaking
// of *internal* compiler invariants, and for a user manifests as a compiler
// crash in the worst case (such as we try to canonicalize/fold/print the
// invalid op before the verifier gets to see it -- also release builds of a
// mature copmiler usually have the verifier turned off for compile time
// reasons).
//
// The compiler cannot crash even if the user wrote an erroneous program!
if (failed(verifyLinalgCompatibleTypes(op, rewriter)))
return failure();
if (lhs.getType().cast<RankedTensorType>().getRank() != 2 ||
rhs.getType().cast<RankedTensorType>().getRank() != 2) {
return rewriter.notifyMatchFailure(
op, "expected both operands to aten.mm to be rank 2");
}
Introduce `!torch.tensor` / `!torch.vtensor` types. This removes our reliance on the numpy dialect and avoids our off-label use of the builtin tnesor type for modeling unknown dtypes. The `!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor. The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic tensor. The new types look as follows syntactically: ``` // Least-static-information, non-value-semantic tensor. !torch.tensor // Explicit form of least-static-information variant. !torch.tensor<*,unk> // Least-static-information, value-semantic tensor. !torch.vtensor // Explicit form of least-static-information variant. !torch.vtensor<*,unk> // Fixed-set of allowable element types, with first-class support for // Torch's frontend signedness semantics. !torch.tensor<*,si32> // First-class support for unknown dtypes. !torch.tensor<[?,?,?],unk> // Standard MLIR representation of `?` for unknown dimensions. !torch.tensor<[?,2,?,4],unk> // Statically shaped / dtyped example. !torch.vtensor<[1,2,3,4],f32> ``` This required fairly significant changes throughout the compiler, but overall it is a big cleanup. We now have a much clearer layering of "the Torch frontend lowering" vs "lowering to std + linalg + etc.". At the C++ level, there is `ValueTensorType`, `NonValueTensorType`. We also have a helper `BaseTensorType` (kind of like ShapedType) which interoperates with those two. Included changes: - New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for creating torch tensor literals in the frontend. - Consistently use signedness for the types (except i1 which I didn't touch -- we need to sort out the situation with !basicpy.BoolType there anyway so will be attending to that soon) - Frontend can annotate whether an argument to the function has value semantics. We currently require this, as our backend contract does not currently allow us to even model the non-value-semantic case. Before, the value-semantic assumption was randomly injected in the middle of the pass pipeline. - Move ArrayToTensor (now called MaximizeValueSemantics) and RefinePublicReturn passes to torch dialect. - The TorchToStd and TorchToLinalg passes are now type conversions from `!torch.vtensor` to `tensor` and use the dialect conversion infra. The overall conversion pipeline is set up following the best practices of the "Type Conversions the Not-So-Hard Way" talk. This required introducing `torch-func-builtin-tensorize` and `torch-finalizing-builtin-tensorize` passes analogous to the upstream bufferization passes with the corresponding names (mostly just copypasta from there). - Misc Torch-level canonicalizations -- we now cleanly layer the lowering to std later in the pipeline, so we are gradually lessening our reliance on random std constant folding before we get to that point. Recommended review order: - New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp - New ops in TorchOps.td / TorchOps.cpp - Less important / more mechanical stuff - Frontend changes. - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-05-21 08:07:18 +08:00
Value lhsDim0 = rewriter.create<memref::DimOp>(loc, lhs, 0);
Value lhsDim1 = rewriter.create<memref::DimOp>(loc, lhs, 1);
Value rhsDim0 = rewriter.create<memref::DimOp>(loc, rhs, 0);
Value rhsDim1 = rewriter.create<memref::DimOp>(loc, rhs, 1);
Value contractingDimEqual =
rewriter.create<CmpIOp>(loc, CmpIPredicate::eq, lhsDim1, rhsDim0);
rewriter.create<AssertOp>(
loc, contractingDimEqual,
rewriter.getStringAttr(
"mismatching contracting dimension for torch.aten.mm"));
Introduce `!torch.tensor` / `!torch.vtensor` types. This removes our reliance on the numpy dialect and avoids our off-label use of the builtin tnesor type for modeling unknown dtypes. The `!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor. The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic tensor. The new types look as follows syntactically: ``` // Least-static-information, non-value-semantic tensor. !torch.tensor // Explicit form of least-static-information variant. !torch.tensor<*,unk> // Least-static-information, value-semantic tensor. !torch.vtensor // Explicit form of least-static-information variant. !torch.vtensor<*,unk> // Fixed-set of allowable element types, with first-class support for // Torch's frontend signedness semantics. !torch.tensor<*,si32> // First-class support for unknown dtypes. !torch.tensor<[?,?,?],unk> // Standard MLIR representation of `?` for unknown dimensions. !torch.tensor<[?,2,?,4],unk> // Statically shaped / dtyped example. !torch.vtensor<[1,2,3,4],f32> ``` This required fairly significant changes throughout the compiler, but overall it is a big cleanup. We now have a much clearer layering of "the Torch frontend lowering" vs "lowering to std + linalg + etc.". At the C++ level, there is `ValueTensorType`, `NonValueTensorType`. We also have a helper `BaseTensorType` (kind of like ShapedType) which interoperates with those two. Included changes: - New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for creating torch tensor literals in the frontend. - Consistently use signedness for the types (except i1 which I didn't touch -- we need to sort out the situation with !basicpy.BoolType there anyway so will be attending to that soon) - Frontend can annotate whether an argument to the function has value semantics. We currently require this, as our backend contract does not currently allow us to even model the non-value-semantic case. Before, the value-semantic assumption was randomly injected in the middle of the pass pipeline. - Move ArrayToTensor (now called MaximizeValueSemantics) and RefinePublicReturn passes to torch dialect. - The TorchToStd and TorchToLinalg passes are now type conversions from `!torch.vtensor` to `tensor` and use the dialect conversion infra. The overall conversion pipeline is set up following the best practices of the "Type Conversions the Not-So-Hard Way" talk. This required introducing `torch-func-builtin-tensorize` and `torch-finalizing-builtin-tensorize` passes analogous to the upstream bufferization passes with the corresponding names (mostly just copypasta from there). - Misc Torch-level canonicalizations -- we now cleanly layer the lowering to std later in the pipeline, so we are gradually lessening our reliance on random std constant folding before we get to that point. Recommended review order: - New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp - New ops in TorchOps.td / TorchOps.cpp - Less important / more mechanical stuff - Frontend changes. - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-05-21 08:07:18 +08:00
Type newResultType = getTypeConverter()->convertType(op.getType());
Type elementType = newResultType.cast<TensorType>().getElementType();
Value initTensor = rewriter.create<linalg::InitTensorOp>(
loc, ValueRange{lhsDim0, rhsDim1}, elementType);
Value c0 =
rewriter.create<ConstantOp>(loc, FloatAttr::get(elementType, 0.0));
Value zeroFill =
rewriter.create<linalg::FillOp>(loc, c0, initTensor).getResult(0);
Introduce `!torch.tensor` / `!torch.vtensor` types. This removes our reliance on the numpy dialect and avoids our off-label use of the builtin tnesor type for modeling unknown dtypes. The `!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor. The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic tensor. The new types look as follows syntactically: ``` // Least-static-information, non-value-semantic tensor. !torch.tensor // Explicit form of least-static-information variant. !torch.tensor<*,unk> // Least-static-information, value-semantic tensor. !torch.vtensor // Explicit form of least-static-information variant. !torch.vtensor<*,unk> // Fixed-set of allowable element types, with first-class support for // Torch's frontend signedness semantics. !torch.tensor<*,si32> // First-class support for unknown dtypes. !torch.tensor<[?,?,?],unk> // Standard MLIR representation of `?` for unknown dimensions. !torch.tensor<[?,2,?,4],unk> // Statically shaped / dtyped example. !torch.vtensor<[1,2,3,4],f32> ``` This required fairly significant changes throughout the compiler, but overall it is a big cleanup. We now have a much clearer layering of "the Torch frontend lowering" vs "lowering to std + linalg + etc.". At the C++ level, there is `ValueTensorType`, `NonValueTensorType`. We also have a helper `BaseTensorType` (kind of like ShapedType) which interoperates with those two. Included changes: - New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for creating torch tensor literals in the frontend. - Consistently use signedness for the types (except i1 which I didn't touch -- we need to sort out the situation with !basicpy.BoolType there anyway so will be attending to that soon) - Frontend can annotate whether an argument to the function has value semantics. We currently require this, as our backend contract does not currently allow us to even model the non-value-semantic case. Before, the value-semantic assumption was randomly injected in the middle of the pass pipeline. - Move ArrayToTensor (now called MaximizeValueSemantics) and RefinePublicReturn passes to torch dialect. - The TorchToStd and TorchToLinalg passes are now type conversions from `!torch.vtensor` to `tensor` and use the dialect conversion infra. The overall conversion pipeline is set up following the best practices of the "Type Conversions the Not-So-Hard Way" talk. This required introducing `torch-func-builtin-tensorize` and `torch-finalizing-builtin-tensorize` passes analogous to the upstream bufferization passes with the corresponding names (mostly just copypasta from there). - Misc Torch-level canonicalizations -- we now cleanly layer the lowering to std later in the pipeline, so we are gradually lessening our reliance on random std constant folding before we get to that point. Recommended review order: - New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp - New ops in TorchOps.td / TorchOps.cpp - Less important / more mechanical stuff - Frontend changes. - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-05-21 08:07:18 +08:00
Value matmul = rewriter
.create<linalg::MatmulOp>(loc, zeroFill.getType(),
ValueRange{lhs, rhs}, zeroFill)
.getResult(0);
// When constructed with just dynamic sizes, InitTensorOp will have a result
// type which has all `?`'s for dimensions, which might not be the result
// type of `op`. The constraints on later linalg ops means that the result
// of the MatmulOp will have this type too. So cast it to the desired type
// so that in the end we have the original result type.
rewriter.replaceOpWithNewOp<tensor::CastOp>(op, newResultType, matmul);
Introduce `!torch.tensor` / `!torch.vtensor` types. This removes our reliance on the numpy dialect and avoids our off-label use of the builtin tnesor type for modeling unknown dtypes. The `!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor. The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic tensor. The new types look as follows syntactically: ``` // Least-static-information, non-value-semantic tensor. !torch.tensor // Explicit form of least-static-information variant. !torch.tensor<*,unk> // Least-static-information, value-semantic tensor. !torch.vtensor // Explicit form of least-static-information variant. !torch.vtensor<*,unk> // Fixed-set of allowable element types, with first-class support for // Torch's frontend signedness semantics. !torch.tensor<*,si32> // First-class support for unknown dtypes. !torch.tensor<[?,?,?],unk> // Standard MLIR representation of `?` for unknown dimensions. !torch.tensor<[?,2,?,4],unk> // Statically shaped / dtyped example. !torch.vtensor<[1,2,3,4],f32> ``` This required fairly significant changes throughout the compiler, but overall it is a big cleanup. We now have a much clearer layering of "the Torch frontend lowering" vs "lowering to std + linalg + etc.". At the C++ level, there is `ValueTensorType`, `NonValueTensorType`. We also have a helper `BaseTensorType` (kind of like ShapedType) which interoperates with those two. Included changes: - New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for creating torch tensor literals in the frontend. - Consistently use signedness for the types (except i1 which I didn't touch -- we need to sort out the situation with !basicpy.BoolType there anyway so will be attending to that soon) - Frontend can annotate whether an argument to the function has value semantics. We currently require this, as our backend contract does not currently allow us to even model the non-value-semantic case. Before, the value-semantic assumption was randomly injected in the middle of the pass pipeline. - Move ArrayToTensor (now called MaximizeValueSemantics) and RefinePublicReturn passes to torch dialect. - The TorchToStd and TorchToLinalg passes are now type conversions from `!torch.vtensor` to `tensor` and use the dialect conversion infra. The overall conversion pipeline is set up following the best practices of the "Type Conversions the Not-So-Hard Way" talk. This required introducing `torch-func-builtin-tensorize` and `torch-finalizing-builtin-tensorize` passes analogous to the upstream bufferization passes with the corresponding names (mostly just copypasta from there). - Misc Torch-level canonicalizations -- we now cleanly layer the lowering to std later in the pipeline, so we are gradually lessening our reliance on random std constant folding before we get to that point. Recommended review order: - New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp - New ops in TorchOps.td / TorchOps.cpp - Less important / more mechanical stuff - Frontend changes. - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-05-21 08:07:18 +08:00
return success();
}
};
} // namespace
Introduce `!torch.tensor` / `!torch.vtensor` types. This removes our reliance on the numpy dialect and avoids our off-label use of the builtin tnesor type for modeling unknown dtypes. The `!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor. The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic tensor. The new types look as follows syntactically: ``` // Least-static-information, non-value-semantic tensor. !torch.tensor // Explicit form of least-static-information variant. !torch.tensor<*,unk> // Least-static-information, value-semantic tensor. !torch.vtensor // Explicit form of least-static-information variant. !torch.vtensor<*,unk> // Fixed-set of allowable element types, with first-class support for // Torch's frontend signedness semantics. !torch.tensor<*,si32> // First-class support for unknown dtypes. !torch.tensor<[?,?,?],unk> // Standard MLIR representation of `?` for unknown dimensions. !torch.tensor<[?,2,?,4],unk> // Statically shaped / dtyped example. !torch.vtensor<[1,2,3,4],f32> ``` This required fairly significant changes throughout the compiler, but overall it is a big cleanup. We now have a much clearer layering of "the Torch frontend lowering" vs "lowering to std + linalg + etc.". At the C++ level, there is `ValueTensorType`, `NonValueTensorType`. We also have a helper `BaseTensorType` (kind of like ShapedType) which interoperates with those two. Included changes: - New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for creating torch tensor literals in the frontend. - Consistently use signedness for the types (except i1 which I didn't touch -- we need to sort out the situation with !basicpy.BoolType there anyway so will be attending to that soon) - Frontend can annotate whether an argument to the function has value semantics. We currently require this, as our backend contract does not currently allow us to even model the non-value-semantic case. Before, the value-semantic assumption was randomly injected in the middle of the pass pipeline. - Move ArrayToTensor (now called MaximizeValueSemantics) and RefinePublicReturn passes to torch dialect. - The TorchToStd and TorchToLinalg passes are now type conversions from `!torch.vtensor` to `tensor` and use the dialect conversion infra. The overall conversion pipeline is set up following the best practices of the "Type Conversions the Not-So-Hard Way" talk. This required introducing `torch-func-builtin-tensorize` and `torch-finalizing-builtin-tensorize` passes analogous to the upstream bufferization passes with the corresponding names (mostly just copypasta from there). - Misc Torch-level canonicalizations -- we now cleanly layer the lowering to std later in the pipeline, so we are gradually lessening our reliance on random std constant folding before we get to that point. Recommended review order: - New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp - New ops in TorchOps.td / TorchOps.cpp - Less important / more mechanical stuff - Frontend changes. - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-05-21 08:07:18 +08:00
namespace {
// See comments at in convertMmOp and the heading for this section for general
// considerations. This function needs to be auto-generated.
Introduce `!torch.tensor` / `!torch.vtensor` types. This removes our reliance on the numpy dialect and avoids our off-label use of the builtin tnesor type for modeling unknown dtypes. The `!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor. The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic tensor. The new types look as follows syntactically: ``` // Least-static-information, non-value-semantic tensor. !torch.tensor // Explicit form of least-static-information variant. !torch.tensor<*,unk> // Least-static-information, value-semantic tensor. !torch.vtensor // Explicit form of least-static-information variant. !torch.vtensor<*,unk> // Fixed-set of allowable element types, with first-class support for // Torch's frontend signedness semantics. !torch.tensor<*,si32> // First-class support for unknown dtypes. !torch.tensor<[?,?,?],unk> // Standard MLIR representation of `?` for unknown dimensions. !torch.tensor<[?,2,?,4],unk> // Statically shaped / dtyped example. !torch.vtensor<[1,2,3,4],f32> ``` This required fairly significant changes throughout the compiler, but overall it is a big cleanup. We now have a much clearer layering of "the Torch frontend lowering" vs "lowering to std + linalg + etc.". At the C++ level, there is `ValueTensorType`, `NonValueTensorType`. We also have a helper `BaseTensorType` (kind of like ShapedType) which interoperates with those two. Included changes: - New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for creating torch tensor literals in the frontend. - Consistently use signedness for the types (except i1 which I didn't touch -- we need to sort out the situation with !basicpy.BoolType there anyway so will be attending to that soon) - Frontend can annotate whether an argument to the function has value semantics. We currently require this, as our backend contract does not currently allow us to even model the non-value-semantic case. Before, the value-semantic assumption was randomly injected in the middle of the pass pipeline. - Move ArrayToTensor (now called MaximizeValueSemantics) and RefinePublicReturn passes to torch dialect. - The TorchToStd and TorchToLinalg passes are now type conversions from `!torch.vtensor` to `tensor` and use the dialect conversion infra. The overall conversion pipeline is set up following the best practices of the "Type Conversions the Not-So-Hard Way" talk. This required introducing `torch-func-builtin-tensorize` and `torch-finalizing-builtin-tensorize` passes analogous to the upstream bufferization passes with the corresponding names (mostly just copypasta from there). - Misc Torch-level canonicalizations -- we now cleanly layer the lowering to std later in the pipeline, so we are gradually lessening our reliance on random std constant folding before we get to that point. Recommended review order: - New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp - New ops in TorchOps.td / TorchOps.cpp - Less important / more mechanical stuff - Frontend changes. - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-05-21 08:07:18 +08:00
class ConvertAtenLinearOp : public OpConversionPattern<AtenLinearOp> {
public:
using OpConversionPattern::OpConversionPattern;
LogicalResult
matchAndRewrite(AtenLinearOp op, ArrayRef<Value> operands,
ConversionPatternRewriter &rewriter) const override {
AtenLinearOp::Adaptor adaptor(operands);
MLIRContext *context = op->getContext();
Location loc = op->getLoc();
Value input = adaptor.input();
Value weight = adaptor.weight();
Value bias = adaptor.bias();
// TODO: Handle the case of bias being None (bias is optional).
if (failed(verifyLinalgCompatibleTypes(op, rewriter)))
return failure();
auto inputType = input.getType().cast<RankedTensorType>();
auto weightType = weight.getType().cast<RankedTensorType>();
auto biasType = bias.getType().cast<RankedTensorType>();
// Only handle the case of rank 2 `input` for now.
// TODO: Insert the appropriate reshape to collapse any leading dimensions.
if (inputType.getRank() != 2 || weightType.getRank() != 2 ||
biasType.getRank() != 1) {
return rewriter.notifyMatchFailure(
op,
"expected both input and weight to be rank 2 and bias to be rank 1");
}
// TODO: Handle type promotion. What are ATen's promotion rules?
if (inputType.getElementType() != weightType.getElementType() ||
inputType.getElementType() != biasType.getElementType()) {
return rewriter.notifyMatchFailure(op, "unimplemented: type promotion");
}
Introduce `!torch.tensor` / `!torch.vtensor` types. This removes our reliance on the numpy dialect and avoids our off-label use of the builtin tnesor type for modeling unknown dtypes. The `!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor. The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic tensor. The new types look as follows syntactically: ``` // Least-static-information, non-value-semantic tensor. !torch.tensor // Explicit form of least-static-information variant. !torch.tensor<*,unk> // Least-static-information, value-semantic tensor. !torch.vtensor // Explicit form of least-static-information variant. !torch.vtensor<*,unk> // Fixed-set of allowable element types, with first-class support for // Torch's frontend signedness semantics. !torch.tensor<*,si32> // First-class support for unknown dtypes. !torch.tensor<[?,?,?],unk> // Standard MLIR representation of `?` for unknown dimensions. !torch.tensor<[?,2,?,4],unk> // Statically shaped / dtyped example. !torch.vtensor<[1,2,3,4],f32> ``` This required fairly significant changes throughout the compiler, but overall it is a big cleanup. We now have a much clearer layering of "the Torch frontend lowering" vs "lowering to std + linalg + etc.". At the C++ level, there is `ValueTensorType`, `NonValueTensorType`. We also have a helper `BaseTensorType` (kind of like ShapedType) which interoperates with those two. Included changes: - New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for creating torch tensor literals in the frontend. - Consistently use signedness for the types (except i1 which I didn't touch -- we need to sort out the situation with !basicpy.BoolType there anyway so will be attending to that soon) - Frontend can annotate whether an argument to the function has value semantics. We currently require this, as our backend contract does not currently allow us to even model the non-value-semantic case. Before, the value-semantic assumption was randomly injected in the middle of the pass pipeline. - Move ArrayToTensor (now called MaximizeValueSemantics) and RefinePublicReturn passes to torch dialect. - The TorchToStd and TorchToLinalg passes are now type conversions from `!torch.vtensor` to `tensor` and use the dialect conversion infra. The overall conversion pipeline is set up following the best practices of the "Type Conversions the Not-So-Hard Way" talk. This required introducing `torch-func-builtin-tensorize` and `torch-finalizing-builtin-tensorize` passes analogous to the upstream bufferization passes with the corresponding names (mostly just copypasta from there). - Misc Torch-level canonicalizations -- we now cleanly layer the lowering to std later in the pipeline, so we are gradually lessening our reliance on random std constant folding before we get to that point. Recommended review order: - New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp - New ops in TorchOps.td / TorchOps.cpp - Less important / more mechanical stuff - Frontend changes. - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-05-21 08:07:18 +08:00
// TODO: We can handle a static size 1 here at some complexity cost, but the
// dynamic case is not representable in linalg. We don't handle either for
// now. Biases are generally statically shaped for most models (since for
// inference they are constants, and for training they don't change shape
// typically), so this is not too constraining.
auto biasSize = bias.getType().cast<RankedTensorType>().getShape()[0];
if (biasSize == 1 || biasSize == ShapedType::kDynamicSize)
return rewriter.notifyMatchFailure(
op, "unimplemented: size-1 broadcasting for aten::LinearOp");
Introduce `!torch.tensor` / `!torch.vtensor` types. This removes our reliance on the numpy dialect and avoids our off-label use of the builtin tnesor type for modeling unknown dtypes. The `!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor. The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic tensor. The new types look as follows syntactically: ``` // Least-static-information, non-value-semantic tensor. !torch.tensor // Explicit form of least-static-information variant. !torch.tensor<*,unk> // Least-static-information, value-semantic tensor. !torch.vtensor // Explicit form of least-static-information variant. !torch.vtensor<*,unk> // Fixed-set of allowable element types, with first-class support for // Torch's frontend signedness semantics. !torch.tensor<*,si32> // First-class support for unknown dtypes. !torch.tensor<[?,?,?],unk> // Standard MLIR representation of `?` for unknown dimensions. !torch.tensor<[?,2,?,4],unk> // Statically shaped / dtyped example. !torch.vtensor<[1,2,3,4],f32> ``` This required fairly significant changes throughout the compiler, but overall it is a big cleanup. We now have a much clearer layering of "the Torch frontend lowering" vs "lowering to std + linalg + etc.". At the C++ level, there is `ValueTensorType`, `NonValueTensorType`. We also have a helper `BaseTensorType` (kind of like ShapedType) which interoperates with those two. Included changes: - New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for creating torch tensor literals in the frontend. - Consistently use signedness for the types (except i1 which I didn't touch -- we need to sort out the situation with !basicpy.BoolType there anyway so will be attending to that soon) - Frontend can annotate whether an argument to the function has value semantics. We currently require this, as our backend contract does not currently allow us to even model the non-value-semantic case. Before, the value-semantic assumption was randomly injected in the middle of the pass pipeline. - Move ArrayToTensor (now called MaximizeValueSemantics) and RefinePublicReturn passes to torch dialect. - The TorchToStd and TorchToLinalg passes are now type conversions from `!torch.vtensor` to `tensor` and use the dialect conversion infra. The overall conversion pipeline is set up following the best practices of the "Type Conversions the Not-So-Hard Way" talk. This required introducing `torch-func-builtin-tensorize` and `torch-finalizing-builtin-tensorize` passes analogous to the upstream bufferization passes with the corresponding names (mostly just copypasta from there). - Misc Torch-level canonicalizations -- we now cleanly layer the lowering to std later in the pipeline, so we are gradually lessening our reliance on random std constant folding before we get to that point. Recommended review order: - New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp - New ops in TorchOps.td / TorchOps.cpp - Less important / more mechanical stuff - Frontend changes. - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-05-21 08:07:18 +08:00
auto getDimOp = [&](Value v, int dimension) {
return rewriter.create<memref::DimOp>(loc, v, dimension);
};
Value inputDim0 = getDimOp(input, 0);
Value inputDim1 = getDimOp(input, 1);
Value weightDim0 = getDimOp(weight, 0);
Value weightDim1 = getDimOp(weight, 1);
Value biasDim0 = getDimOp(bias, 0);
Value contractingDimEqual =
rewriter.create<CmpIOp>(loc, CmpIPredicate::eq, inputDim1, weightDim1);
rewriter.create<AssertOp>(
loc, contractingDimEqual,
rewriter.getStringAttr(
"mismatching contracting dimension for aten.linear"));
// Here we take advantage of ruling out the size-1 case above.
// In the static-size-1 case, we will not emit this check at all.
Value biasSizeCorrect =
rewriter.create<CmpIOp>(loc, CmpIPredicate::eq, weightDim0, biasDim0);
rewriter.create<AssertOp>(
loc, biasSizeCorrect,
rewriter.getStringAttr("mismatching bias size for aten.linear"));
Introduce `!torch.tensor` / `!torch.vtensor` types. This removes our reliance on the numpy dialect and avoids our off-label use of the builtin tnesor type for modeling unknown dtypes. The `!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor. The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic tensor. The new types look as follows syntactically: ``` // Least-static-information, non-value-semantic tensor. !torch.tensor // Explicit form of least-static-information variant. !torch.tensor<*,unk> // Least-static-information, value-semantic tensor. !torch.vtensor // Explicit form of least-static-information variant. !torch.vtensor<*,unk> // Fixed-set of allowable element types, with first-class support for // Torch's frontend signedness semantics. !torch.tensor<*,si32> // First-class support for unknown dtypes. !torch.tensor<[?,?,?],unk> // Standard MLIR representation of `?` for unknown dimensions. !torch.tensor<[?,2,?,4],unk> // Statically shaped / dtyped example. !torch.vtensor<[1,2,3,4],f32> ``` This required fairly significant changes throughout the compiler, but overall it is a big cleanup. We now have a much clearer layering of "the Torch frontend lowering" vs "lowering to std + linalg + etc.". At the C++ level, there is `ValueTensorType`, `NonValueTensorType`. We also have a helper `BaseTensorType` (kind of like ShapedType) which interoperates with those two. Included changes: - New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for creating torch tensor literals in the frontend. - Consistently use signedness for the types (except i1 which I didn't touch -- we need to sort out the situation with !basicpy.BoolType there anyway so will be attending to that soon) - Frontend can annotate whether an argument to the function has value semantics. We currently require this, as our backend contract does not currently allow us to even model the non-value-semantic case. Before, the value-semantic assumption was randomly injected in the middle of the pass pipeline. - Move ArrayToTensor (now called MaximizeValueSemantics) and RefinePublicReturn passes to torch dialect. - The TorchToStd and TorchToLinalg passes are now type conversions from `!torch.vtensor` to `tensor` and use the dialect conversion infra. The overall conversion pipeline is set up following the best practices of the "Type Conversions the Not-So-Hard Way" talk. This required introducing `torch-func-builtin-tensorize` and `torch-finalizing-builtin-tensorize` passes analogous to the upstream bufferization passes with the corresponding names (mostly just copypasta from there). - Misc Torch-level canonicalizations -- we now cleanly layer the lowering to std later in the pipeline, so we are gradually lessening our reliance on random std constant folding before we get to that point. Recommended review order: - New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp - New ops in TorchOps.td / TorchOps.cpp - Less important / more mechanical stuff - Frontend changes. - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-05-21 08:07:18 +08:00
Value initTensor = rewriter.create<linalg::InitTensorOp>(
loc, ValueRange{inputDim0, weightDim0}, inputType.getElementType());
SmallVector<AffineMap> broadcastIndexingMaps = {
AffineMap::get(
/*dimCount=*/2, /*symbolCount=*/0, rewriter.getAffineDimExpr(1)),
rewriter.getMultiDimIdentityMap(2)};
SmallVector<StringRef> iteratorTypes(2, "parallel");
Value broadcasted =
rewriter
.create<linalg::GenericOp>(
loc, initTensor.getType(), bias, initTensor,
/*indexingMaps=*/broadcastIndexingMaps,
/*iteratorTypes=*/iteratorTypes,
[](OpBuilder &b, Location loc, ValueRange args) {
b.create<linalg::YieldOp>(loc, args[0]);
})
.getResult(0);
// We need a matmul with dimension ordering (N, K) * (M, K), so transpose
// the weights to fit into linalg::MatmulOp which is (N, K) * (K, M).
// TODO: This whole aten.linear lowering should eventually be generated from
// a single linalg ODS generator statement. Both the bias and matmul part.
SmallVector<AffineMap> transposeIndexingMaps = {
AffineMap::get(
/*dimCount=*/2, /*symbolCount=*/0,
{rewriter.getAffineDimExpr(1), rewriter.getAffineDimExpr(0)},
context),
rewriter.getMultiDimIdentityMap(2)};
Value transposedWeightInitTensor = rewriter.create<linalg::InitTensorOp>(
loc, ValueRange{weightDim1, weightDim0}, weightType.getElementType());
Value transposedWeights =
rewriter
.create<linalg::GenericOp>(
loc, transposedWeightInitTensor.getType(), weight,
transposedWeightInitTensor,
/*indexingMaps=*/transposeIndexingMaps,
/*iteratorTypes=*/iteratorTypes,
[](OpBuilder &b, Location loc, ValueRange args) {
b.create<linalg::YieldOp>(loc, args[0]);
})
.getResult(0);
Value matmul = rewriter
.create<linalg::MatmulOp>(
loc, broadcasted.getType(),
ValueRange{input, transposedWeights}, broadcasted)
.getResult(0);
Type newResultType = getTypeConverter()->convertType(op.getType());
rewriter.replaceOpWithNewOp<tensor::CastOp>(op, newResultType, matmul);
return success();
}
};
} // namespace
2021-06-16 00:00:37 +08:00
static Value createScalarRelu(OpBuilder &b, Location loc, ValueRange args) {
Type elementType = args[0].getType();
// TODO: Add support for integer types.
assert(elementType.isa<::mlir::FloatType>() &&
"Only support float case for relu");
Value constZero = b.create<ConstantOp>(loc, FloatAttr::get(elementType, 0.0));
Value pred = b.create<CmpFOp>(loc, CmpFPredicate::UGT, args[0], constZero);
return b.create<SelectOp>(loc, pred, args[0], constZero);
}
Significantly restructure torch/aten import design. This is a really major and invasive restructuring of the way we get torch operators (`torch::jit::Operator` / `c10::OperatorHandle`) into MLIR. Please forgive the challenging review, but due to the sheer invasiveness, it wasn't really practical do do it in sane smaller pieces. This fully replaces everything that was already working on the TorchScript path (actually, more -- we added tanh support to TorchToLinalg in order to delete the older code paths). Additionally, I've kept the lights on for the acap path too, including what little e2e stuff was working before (for expediency I made a few tiny compromises along the way that will be easy to undo when we give that path proper attention). Overview of the new design: - The torch operator `somens::someunqualname.someoverloadname` is imported as `torch.somens.someunqualname.someoverloadname` (skip the last dotted part if the overload name is empty), OR, if we don't have such an op registered, it is imported as `torch.operator "somens.someunqualname.someoverloadname" (...) : ...`. - The addition of the "overload name" is a critical element here, as the `(ns,unqual,overload)` triple is unique, which solves a lot of problems we were having. - This involves having separate MLIR ops for the `trailing_` and `.out` variants and all the different overloads. This seemed necessary, because the set of overloads is so wild and varied and unstructured. The previous design was leaning into some underlying structure that just isn't there -- the default situation is the "random overload that we want to manage on the MLIR side", rather than that being an exception. E.g. `aten::ne` (not-equal) has 21 overloads, only 4 of which are c10 dispatcher ops see [gist](https://gist.github.com/silvasean/190ba918c550c956260e21254e1b8aa1), and the "out" variant is really called `.Tensor_out` instead of `.out` as it frequently is for other ops. - Rationale for all being in `torch` namespace: the set of operators are so varied and unstructured that "dialect per namespace" doesn't result in anything resembling the typical MLIR dialect boundary expectations. We could maybe draw the boundary at dispatcher ops vs non-dispatcher ops, but that doesn't seem to really result in very much useful structure at this point in time. - Note: within the torch operator registry, we effectively have a mini-basicpy subdialect (already type-resolved), which is reasonably structured. - The existing Torch op interfaces are also removed -- now that we track the overload name, we can losslessly find the original operator. - Instead of `ATenRecognizeKernelsPass`, we now have a `ReduceOpVariantsPass` that keys off certain traits (and perhaps eventually interfaces) to reduce variants of ops to a smaller set, ideally operating on immutable tensors and using surrounding ops to model the mutability/aliasing aspects. - Note: `torch.ns.unqual.overload` ops allow both immutable and mutable tensors (unlike the previous hard distinction in the common case). This is a premonition for a future change that will introduce a bona fide `!torch.tensor` type that will clean up a bunch of stuff. - `TorchToLinalg` / `TorchToStd` supercede the existing "ATen->TCF->TCP->Linalg" path. - The new `torch_ods_gen.py` supercedes `torch_signature_ods_gen.py`. It should look somewhat familiar, but the benefit of hindsight has allowed a lot of simplifications. The overall trend seems to be to make the `torch` dialect a nice layer independent of anything else. It feels like as a natural result of various future changes we will be removing the reliance on basicpy+numpy dialects and have a nice self-contained type system too that properly models the TorchScript type system (including proper subtyping, mutable/immutable tensors, optional dtype, etc.). Recommended review order: - Start at some of the new import IR, e.g. in `frontends/pytorch/test/node_import/prim.py`, `frontends/pytorch/test/acap_export/test_export_add3.py`, and other tests. - `frontends/pytorch/python/torch_mlir_utils/codegen/torch_ods_gen.py` and associated generated files: - `include/npcomp/Dialect/Torch/IR/GeneratedAtenOps.td` - `include/npcomp/Dialect/Torch/IR/GeneratedPrimOps.td` - Inspect `ReduceOpVariants.cpp` / `reduce-op-variants.mlir` and the new traits in `include/npcomp/Dialect/Torch/IR/TorchTraits.h` - Various code changes in the import path in `frontends/pytorch/csrc/builder`. Probably most interesting is the new code in `torch_to_mlir_utils.cpp` that has the logic to create the `torch.operator` ops or `torch.ns.unqual.overload` ops. This is the [new ResNet IR](https://gist.github.com/silvasean/5407aafb710d07612b7b5b92eabecebe), just to be able to look at a substantial sample of IR in the new style.
2021-05-05 05:42:50 +08:00
namespace {
2021-06-16 00:00:37 +08:00
Significantly restructure torch/aten import design. This is a really major and invasive restructuring of the way we get torch operators (`torch::jit::Operator` / `c10::OperatorHandle`) into MLIR. Please forgive the challenging review, but due to the sheer invasiveness, it wasn't really practical do do it in sane smaller pieces. This fully replaces everything that was already working on the TorchScript path (actually, more -- we added tanh support to TorchToLinalg in order to delete the older code paths). Additionally, I've kept the lights on for the acap path too, including what little e2e stuff was working before (for expediency I made a few tiny compromises along the way that will be easy to undo when we give that path proper attention). Overview of the new design: - The torch operator `somens::someunqualname.someoverloadname` is imported as `torch.somens.someunqualname.someoverloadname` (skip the last dotted part if the overload name is empty), OR, if we don't have such an op registered, it is imported as `torch.operator "somens.someunqualname.someoverloadname" (...) : ...`. - The addition of the "overload name" is a critical element here, as the `(ns,unqual,overload)` triple is unique, which solves a lot of problems we were having. - This involves having separate MLIR ops for the `trailing_` and `.out` variants and all the different overloads. This seemed necessary, because the set of overloads is so wild and varied and unstructured. The previous design was leaning into some underlying structure that just isn't there -- the default situation is the "random overload that we want to manage on the MLIR side", rather than that being an exception. E.g. `aten::ne` (not-equal) has 21 overloads, only 4 of which are c10 dispatcher ops see [gist](https://gist.github.com/silvasean/190ba918c550c956260e21254e1b8aa1), and the "out" variant is really called `.Tensor_out` instead of `.out` as it frequently is for other ops. - Rationale for all being in `torch` namespace: the set of operators are so varied and unstructured that "dialect per namespace" doesn't result in anything resembling the typical MLIR dialect boundary expectations. We could maybe draw the boundary at dispatcher ops vs non-dispatcher ops, but that doesn't seem to really result in very much useful structure at this point in time. - Note: within the torch operator registry, we effectively have a mini-basicpy subdialect (already type-resolved), which is reasonably structured. - The existing Torch op interfaces are also removed -- now that we track the overload name, we can losslessly find the original operator. - Instead of `ATenRecognizeKernelsPass`, we now have a `ReduceOpVariantsPass` that keys off certain traits (and perhaps eventually interfaces) to reduce variants of ops to a smaller set, ideally operating on immutable tensors and using surrounding ops to model the mutability/aliasing aspects. - Note: `torch.ns.unqual.overload` ops allow both immutable and mutable tensors (unlike the previous hard distinction in the common case). This is a premonition for a future change that will introduce a bona fide `!torch.tensor` type that will clean up a bunch of stuff. - `TorchToLinalg` / `TorchToStd` supercede the existing "ATen->TCF->TCP->Linalg" path. - The new `torch_ods_gen.py` supercedes `torch_signature_ods_gen.py`. It should look somewhat familiar, but the benefit of hindsight has allowed a lot of simplifications. The overall trend seems to be to make the `torch` dialect a nice layer independent of anything else. It feels like as a natural result of various future changes we will be removing the reliance on basicpy+numpy dialects and have a nice self-contained type system too that properly models the TorchScript type system (including proper subtyping, mutable/immutable tensors, optional dtype, etc.). Recommended review order: - Start at some of the new import IR, e.g. in `frontends/pytorch/test/node_import/prim.py`, `frontends/pytorch/test/acap_export/test_export_add3.py`, and other tests. - `frontends/pytorch/python/torch_mlir_utils/codegen/torch_ods_gen.py` and associated generated files: - `include/npcomp/Dialect/Torch/IR/GeneratedAtenOps.td` - `include/npcomp/Dialect/Torch/IR/GeneratedPrimOps.td` - Inspect `ReduceOpVariants.cpp` / `reduce-op-variants.mlir` and the new traits in `include/npcomp/Dialect/Torch/IR/TorchTraits.h` - Various code changes in the import path in `frontends/pytorch/csrc/builder`. Probably most interesting is the new code in `torch_to_mlir_utils.cpp` that has the logic to create the `torch.operator` ops or `torch.ns.unqual.overload` ops. This is the [new ResNet IR](https://gist.github.com/silvasean/5407aafb710d07612b7b5b92eabecebe), just to be able to look at a substantial sample of IR in the new style.
2021-05-05 05:42:50 +08:00
// Converts a unary op. There is no implicit broadcasting behavior, so these can
// be trivially lowered to linalg.
// TODO: For binary ops, we will need a "linalg.generic-like" op that models
// N-ary broadcasting and allows us to do multiversioning techniques for
// lowering to linalg. We can trivially handle this as through that
// abstraction instead.
Introduce `!torch.tensor` / `!torch.vtensor` types. This removes our reliance on the numpy dialect and avoids our off-label use of the builtin tnesor type for modeling unknown dtypes. The `!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor. The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic tensor. The new types look as follows syntactically: ``` // Least-static-information, non-value-semantic tensor. !torch.tensor // Explicit form of least-static-information variant. !torch.tensor<*,unk> // Least-static-information, value-semantic tensor. !torch.vtensor // Explicit form of least-static-information variant. !torch.vtensor<*,unk> // Fixed-set of allowable element types, with first-class support for // Torch's frontend signedness semantics. !torch.tensor<*,si32> // First-class support for unknown dtypes. !torch.tensor<[?,?,?],unk> // Standard MLIR representation of `?` for unknown dimensions. !torch.tensor<[?,2,?,4],unk> // Statically shaped / dtyped example. !torch.vtensor<[1,2,3,4],f32> ``` This required fairly significant changes throughout the compiler, but overall it is a big cleanup. We now have a much clearer layering of "the Torch frontend lowering" vs "lowering to std + linalg + etc.". At the C++ level, there is `ValueTensorType`, `NonValueTensorType`. We also have a helper `BaseTensorType` (kind of like ShapedType) which interoperates with those two. Included changes: - New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for creating torch tensor literals in the frontend. - Consistently use signedness for the types (except i1 which I didn't touch -- we need to sort out the situation with !basicpy.BoolType there anyway so will be attending to that soon) - Frontend can annotate whether an argument to the function has value semantics. We currently require this, as our backend contract does not currently allow us to even model the non-value-semantic case. Before, the value-semantic assumption was randomly injected in the middle of the pass pipeline. - Move ArrayToTensor (now called MaximizeValueSemantics) and RefinePublicReturn passes to torch dialect. - The TorchToStd and TorchToLinalg passes are now type conversions from `!torch.vtensor` to `tensor` and use the dialect conversion infra. The overall conversion pipeline is set up following the best practices of the "Type Conversions the Not-So-Hard Way" talk. This required introducing `torch-func-builtin-tensorize` and `torch-finalizing-builtin-tensorize` passes analogous to the upstream bufferization passes with the corresponding names (mostly just copypasta from there). - Misc Torch-level canonicalizations -- we now cleanly layer the lowering to std later in the pipeline, so we are gradually lessening our reliance on random std constant folding before we get to that point. Recommended review order: - New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp - New ops in TorchOps.td / TorchOps.cpp - Less important / more mechanical stuff - Frontend changes. - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-05-21 08:07:18 +08:00
struct ConvertUnaryOp : ConversionPattern {
ConvertUnaryOp(TypeConverter &typeConverter, MLIRContext *context)
: ConversionPattern(typeConverter, MatchAnyOpTypeTag(), /*benefit=*/1,
context) {}
Significantly restructure torch/aten import design. This is a really major and invasive restructuring of the way we get torch operators (`torch::jit::Operator` / `c10::OperatorHandle`) into MLIR. Please forgive the challenging review, but due to the sheer invasiveness, it wasn't really practical do do it in sane smaller pieces. This fully replaces everything that was already working on the TorchScript path (actually, more -- we added tanh support to TorchToLinalg in order to delete the older code paths). Additionally, I've kept the lights on for the acap path too, including what little e2e stuff was working before (for expediency I made a few tiny compromises along the way that will be easy to undo when we give that path proper attention). Overview of the new design: - The torch operator `somens::someunqualname.someoverloadname` is imported as `torch.somens.someunqualname.someoverloadname` (skip the last dotted part if the overload name is empty), OR, if we don't have such an op registered, it is imported as `torch.operator "somens.someunqualname.someoverloadname" (...) : ...`. - The addition of the "overload name" is a critical element here, as the `(ns,unqual,overload)` triple is unique, which solves a lot of problems we were having. - This involves having separate MLIR ops for the `trailing_` and `.out` variants and all the different overloads. This seemed necessary, because the set of overloads is so wild and varied and unstructured. The previous design was leaning into some underlying structure that just isn't there -- the default situation is the "random overload that we want to manage on the MLIR side", rather than that being an exception. E.g. `aten::ne` (not-equal) has 21 overloads, only 4 of which are c10 dispatcher ops see [gist](https://gist.github.com/silvasean/190ba918c550c956260e21254e1b8aa1), and the "out" variant is really called `.Tensor_out` instead of `.out` as it frequently is for other ops. - Rationale for all being in `torch` namespace: the set of operators are so varied and unstructured that "dialect per namespace" doesn't result in anything resembling the typical MLIR dialect boundary expectations. We could maybe draw the boundary at dispatcher ops vs non-dispatcher ops, but that doesn't seem to really result in very much useful structure at this point in time. - Note: within the torch operator registry, we effectively have a mini-basicpy subdialect (already type-resolved), which is reasonably structured. - The existing Torch op interfaces are also removed -- now that we track the overload name, we can losslessly find the original operator. - Instead of `ATenRecognizeKernelsPass`, we now have a `ReduceOpVariantsPass` that keys off certain traits (and perhaps eventually interfaces) to reduce variants of ops to a smaller set, ideally operating on immutable tensors and using surrounding ops to model the mutability/aliasing aspects. - Note: `torch.ns.unqual.overload` ops allow both immutable and mutable tensors (unlike the previous hard distinction in the common case). This is a premonition for a future change that will introduce a bona fide `!torch.tensor` type that will clean up a bunch of stuff. - `TorchToLinalg` / `TorchToStd` supercede the existing "ATen->TCF->TCP->Linalg" path. - The new `torch_ods_gen.py` supercedes `torch_signature_ods_gen.py`. It should look somewhat familiar, but the benefit of hindsight has allowed a lot of simplifications. The overall trend seems to be to make the `torch` dialect a nice layer independent of anything else. It feels like as a natural result of various future changes we will be removing the reliance on basicpy+numpy dialects and have a nice self-contained type system too that properly models the TorchScript type system (including proper subtyping, mutable/immutable tensors, optional dtype, etc.). Recommended review order: - Start at some of the new import IR, e.g. in `frontends/pytorch/test/node_import/prim.py`, `frontends/pytorch/test/acap_export/test_export_add3.py`, and other tests. - `frontends/pytorch/python/torch_mlir_utils/codegen/torch_ods_gen.py` and associated generated files: - `include/npcomp/Dialect/Torch/IR/GeneratedAtenOps.td` - `include/npcomp/Dialect/Torch/IR/GeneratedPrimOps.td` - Inspect `ReduceOpVariants.cpp` / `reduce-op-variants.mlir` and the new traits in `include/npcomp/Dialect/Torch/IR/TorchTraits.h` - Various code changes in the import path in `frontends/pytorch/csrc/builder`. Probably most interesting is the new code in `torch_to_mlir_utils.cpp` that has the logic to create the `torch.operator` ops or `torch.ns.unqual.overload` ops. This is the [new ResNet IR](https://gist.github.com/silvasean/5407aafb710d07612b7b5b92eabecebe), just to be able to look at a substantial sample of IR in the new style.
2021-05-05 05:42:50 +08:00
Introduce `!torch.tensor` / `!torch.vtensor` types. This removes our reliance on the numpy dialect and avoids our off-label use of the builtin tnesor type for modeling unknown dtypes. The `!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor. The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic tensor. The new types look as follows syntactically: ``` // Least-static-information, non-value-semantic tensor. !torch.tensor // Explicit form of least-static-information variant. !torch.tensor<*,unk> // Least-static-information, value-semantic tensor. !torch.vtensor // Explicit form of least-static-information variant. !torch.vtensor<*,unk> // Fixed-set of allowable element types, with first-class support for // Torch's frontend signedness semantics. !torch.tensor<*,si32> // First-class support for unknown dtypes. !torch.tensor<[?,?,?],unk> // Standard MLIR representation of `?` for unknown dimensions. !torch.tensor<[?,2,?,4],unk> // Statically shaped / dtyped example. !torch.vtensor<[1,2,3,4],f32> ``` This required fairly significant changes throughout the compiler, but overall it is a big cleanup. We now have a much clearer layering of "the Torch frontend lowering" vs "lowering to std + linalg + etc.". At the C++ level, there is `ValueTensorType`, `NonValueTensorType`. We also have a helper `BaseTensorType` (kind of like ShapedType) which interoperates with those two. Included changes: - New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for creating torch tensor literals in the frontend. - Consistently use signedness for the types (except i1 which I didn't touch -- we need to sort out the situation with !basicpy.BoolType there anyway so will be attending to that soon) - Frontend can annotate whether an argument to the function has value semantics. We currently require this, as our backend contract does not currently allow us to even model the non-value-semantic case. Before, the value-semantic assumption was randomly injected in the middle of the pass pipeline. - Move ArrayToTensor (now called MaximizeValueSemantics) and RefinePublicReturn passes to torch dialect. - The TorchToStd and TorchToLinalg passes are now type conversions from `!torch.vtensor` to `tensor` and use the dialect conversion infra. The overall conversion pipeline is set up following the best practices of the "Type Conversions the Not-So-Hard Way" talk. This required introducing `torch-func-builtin-tensorize` and `torch-finalizing-builtin-tensorize` passes analogous to the upstream bufferization passes with the corresponding names (mostly just copypasta from there). - Misc Torch-level canonicalizations -- we now cleanly layer the lowering to std later in the pipeline, so we are gradually lessening our reliance on random std constant folding before we get to that point. Recommended review order: - New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp - New ops in TorchOps.td / TorchOps.cpp - Less important / more mechanical stuff - Frontend changes. - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-05-21 08:07:18 +08:00
LogicalResult
matchAndRewrite(Operation *op, ArrayRef<Value> operands,
ConversionPatternRewriter &rewriter) const override {
2021-06-16 00:00:37 +08:00
if (!isa<AtenTanhOp>(op) && !isa<AtenReluOp>(op))
Significantly restructure torch/aten import design. This is a really major and invasive restructuring of the way we get torch operators (`torch::jit::Operator` / `c10::OperatorHandle`) into MLIR. Please forgive the challenging review, but due to the sheer invasiveness, it wasn't really practical do do it in sane smaller pieces. This fully replaces everything that was already working on the TorchScript path (actually, more -- we added tanh support to TorchToLinalg in order to delete the older code paths). Additionally, I've kept the lights on for the acap path too, including what little e2e stuff was working before (for expediency I made a few tiny compromises along the way that will be easy to undo when we give that path proper attention). Overview of the new design: - The torch operator `somens::someunqualname.someoverloadname` is imported as `torch.somens.someunqualname.someoverloadname` (skip the last dotted part if the overload name is empty), OR, if we don't have such an op registered, it is imported as `torch.operator "somens.someunqualname.someoverloadname" (...) : ...`. - The addition of the "overload name" is a critical element here, as the `(ns,unqual,overload)` triple is unique, which solves a lot of problems we were having. - This involves having separate MLIR ops for the `trailing_` and `.out` variants and all the different overloads. This seemed necessary, because the set of overloads is so wild and varied and unstructured. The previous design was leaning into some underlying structure that just isn't there -- the default situation is the "random overload that we want to manage on the MLIR side", rather than that being an exception. E.g. `aten::ne` (not-equal) has 21 overloads, only 4 of which are c10 dispatcher ops see [gist](https://gist.github.com/silvasean/190ba918c550c956260e21254e1b8aa1), and the "out" variant is really called `.Tensor_out` instead of `.out` as it frequently is for other ops. - Rationale for all being in `torch` namespace: the set of operators are so varied and unstructured that "dialect per namespace" doesn't result in anything resembling the typical MLIR dialect boundary expectations. We could maybe draw the boundary at dispatcher ops vs non-dispatcher ops, but that doesn't seem to really result in very much useful structure at this point in time. - Note: within the torch operator registry, we effectively have a mini-basicpy subdialect (already type-resolved), which is reasonably structured. - The existing Torch op interfaces are also removed -- now that we track the overload name, we can losslessly find the original operator. - Instead of `ATenRecognizeKernelsPass`, we now have a `ReduceOpVariantsPass` that keys off certain traits (and perhaps eventually interfaces) to reduce variants of ops to a smaller set, ideally operating on immutable tensors and using surrounding ops to model the mutability/aliasing aspects. - Note: `torch.ns.unqual.overload` ops allow both immutable and mutable tensors (unlike the previous hard distinction in the common case). This is a premonition for a future change that will introduce a bona fide `!torch.tensor` type that will clean up a bunch of stuff. - `TorchToLinalg` / `TorchToStd` supercede the existing "ATen->TCF->TCP->Linalg" path. - The new `torch_ods_gen.py` supercedes `torch_signature_ods_gen.py`. It should look somewhat familiar, but the benefit of hindsight has allowed a lot of simplifications. The overall trend seems to be to make the `torch` dialect a nice layer independent of anything else. It feels like as a natural result of various future changes we will be removing the reliance on basicpy+numpy dialects and have a nice self-contained type system too that properly models the TorchScript type system (including proper subtyping, mutable/immutable tensors, optional dtype, etc.). Recommended review order: - Start at some of the new import IR, e.g. in `frontends/pytorch/test/node_import/prim.py`, `frontends/pytorch/test/acap_export/test_export_add3.py`, and other tests. - `frontends/pytorch/python/torch_mlir_utils/codegen/torch_ods_gen.py` and associated generated files: - `include/npcomp/Dialect/Torch/IR/GeneratedAtenOps.td` - `include/npcomp/Dialect/Torch/IR/GeneratedPrimOps.td` - Inspect `ReduceOpVariants.cpp` / `reduce-op-variants.mlir` and the new traits in `include/npcomp/Dialect/Torch/IR/TorchTraits.h` - Various code changes in the import path in `frontends/pytorch/csrc/builder`. Probably most interesting is the new code in `torch_to_mlir_utils.cpp` that has the logic to create the `torch.operator` ops or `torch.ns.unqual.overload` ops. This is the [new ResNet IR](https://gist.github.com/silvasean/5407aafb710d07612b7b5b92eabecebe), just to be able to look at a substantial sample of IR in the new style.
2021-05-05 05:42:50 +08:00
return rewriter.notifyMatchFailure(op, "not a unary op");
if (failed(verifyLinalgCompatibleTypes(op, rewriter)))
return failure();
Introduce `!torch.tensor` / `!torch.vtensor` types. This removes our reliance on the numpy dialect and avoids our off-label use of the builtin tnesor type for modeling unknown dtypes. The `!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor. The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic tensor. The new types look as follows syntactically: ``` // Least-static-information, non-value-semantic tensor. !torch.tensor // Explicit form of least-static-information variant. !torch.tensor<*,unk> // Least-static-information, value-semantic tensor. !torch.vtensor // Explicit form of least-static-information variant. !torch.vtensor<*,unk> // Fixed-set of allowable element types, with first-class support for // Torch's frontend signedness semantics. !torch.tensor<*,si32> // First-class support for unknown dtypes. !torch.tensor<[?,?,?],unk> // Standard MLIR representation of `?` for unknown dimensions. !torch.tensor<[?,2,?,4],unk> // Statically shaped / dtyped example. !torch.vtensor<[1,2,3,4],f32> ``` This required fairly significant changes throughout the compiler, but overall it is a big cleanup. We now have a much clearer layering of "the Torch frontend lowering" vs "lowering to std + linalg + etc.". At the C++ level, there is `ValueTensorType`, `NonValueTensorType`. We also have a helper `BaseTensorType` (kind of like ShapedType) which interoperates with those two. Included changes: - New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for creating torch tensor literals in the frontend. - Consistently use signedness for the types (except i1 which I didn't touch -- we need to sort out the situation with !basicpy.BoolType there anyway so will be attending to that soon) - Frontend can annotate whether an argument to the function has value semantics. We currently require this, as our backend contract does not currently allow us to even model the non-value-semantic case. Before, the value-semantic assumption was randomly injected in the middle of the pass pipeline. - Move ArrayToTensor (now called MaximizeValueSemantics) and RefinePublicReturn passes to torch dialect. - The TorchToStd and TorchToLinalg passes are now type conversions from `!torch.vtensor` to `tensor` and use the dialect conversion infra. The overall conversion pipeline is set up following the best practices of the "Type Conversions the Not-So-Hard Way" talk. This required introducing `torch-func-builtin-tensorize` and `torch-finalizing-builtin-tensorize` passes analogous to the upstream bufferization passes with the corresponding names (mostly just copypasta from there). - Misc Torch-level canonicalizations -- we now cleanly layer the lowering to std later in the pipeline, so we are gradually lessening our reliance on random std constant folding before we get to that point. Recommended review order: - New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp - New ops in TorchOps.td / TorchOps.cpp - Less important / more mechanical stuff - Frontend changes. - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-05-21 08:07:18 +08:00
Value operand = operands[0];
auto type = getTypeConverter()
->convertType(op->getResult(0).getType())
.cast<RankedTensorType>();
Significantly restructure torch/aten import design. This is a really major and invasive restructuring of the way we get torch operators (`torch::jit::Operator` / `c10::OperatorHandle`) into MLIR. Please forgive the challenging review, but due to the sheer invasiveness, it wasn't really practical do do it in sane smaller pieces. This fully replaces everything that was already working on the TorchScript path (actually, more -- we added tanh support to TorchToLinalg in order to delete the older code paths). Additionally, I've kept the lights on for the acap path too, including what little e2e stuff was working before (for expediency I made a few tiny compromises along the way that will be easy to undo when we give that path proper attention). Overview of the new design: - The torch operator `somens::someunqualname.someoverloadname` is imported as `torch.somens.someunqualname.someoverloadname` (skip the last dotted part if the overload name is empty), OR, if we don't have such an op registered, it is imported as `torch.operator "somens.someunqualname.someoverloadname" (...) : ...`. - The addition of the "overload name" is a critical element here, as the `(ns,unqual,overload)` triple is unique, which solves a lot of problems we were having. - This involves having separate MLIR ops for the `trailing_` and `.out` variants and all the different overloads. This seemed necessary, because the set of overloads is so wild and varied and unstructured. The previous design was leaning into some underlying structure that just isn't there -- the default situation is the "random overload that we want to manage on the MLIR side", rather than that being an exception. E.g. `aten::ne` (not-equal) has 21 overloads, only 4 of which are c10 dispatcher ops see [gist](https://gist.github.com/silvasean/190ba918c550c956260e21254e1b8aa1), and the "out" variant is really called `.Tensor_out` instead of `.out` as it frequently is for other ops. - Rationale for all being in `torch` namespace: the set of operators are so varied and unstructured that "dialect per namespace" doesn't result in anything resembling the typical MLIR dialect boundary expectations. We could maybe draw the boundary at dispatcher ops vs non-dispatcher ops, but that doesn't seem to really result in very much useful structure at this point in time. - Note: within the torch operator registry, we effectively have a mini-basicpy subdialect (already type-resolved), which is reasonably structured. - The existing Torch op interfaces are also removed -- now that we track the overload name, we can losslessly find the original operator. - Instead of `ATenRecognizeKernelsPass`, we now have a `ReduceOpVariantsPass` that keys off certain traits (and perhaps eventually interfaces) to reduce variants of ops to a smaller set, ideally operating on immutable tensors and using surrounding ops to model the mutability/aliasing aspects. - Note: `torch.ns.unqual.overload` ops allow both immutable and mutable tensors (unlike the previous hard distinction in the common case). This is a premonition for a future change that will introduce a bona fide `!torch.tensor` type that will clean up a bunch of stuff. - `TorchToLinalg` / `TorchToStd` supercede the existing "ATen->TCF->TCP->Linalg" path. - The new `torch_ods_gen.py` supercedes `torch_signature_ods_gen.py`. It should look somewhat familiar, but the benefit of hindsight has allowed a lot of simplifications. The overall trend seems to be to make the `torch` dialect a nice layer independent of anything else. It feels like as a natural result of various future changes we will be removing the reliance on basicpy+numpy dialects and have a nice self-contained type system too that properly models the TorchScript type system (including proper subtyping, mutable/immutable tensors, optional dtype, etc.). Recommended review order: - Start at some of the new import IR, e.g. in `frontends/pytorch/test/node_import/prim.py`, `frontends/pytorch/test/acap_export/test_export_add3.py`, and other tests. - `frontends/pytorch/python/torch_mlir_utils/codegen/torch_ods_gen.py` and associated generated files: - `include/npcomp/Dialect/Torch/IR/GeneratedAtenOps.td` - `include/npcomp/Dialect/Torch/IR/GeneratedPrimOps.td` - Inspect `ReduceOpVariants.cpp` / `reduce-op-variants.mlir` and the new traits in `include/npcomp/Dialect/Torch/IR/TorchTraits.h` - Various code changes in the import path in `frontends/pytorch/csrc/builder`. Probably most interesting is the new code in `torch_to_mlir_utils.cpp` that has the logic to create the `torch.operator` ops or `torch.ns.unqual.overload` ops. This is the [new ResNet IR](https://gist.github.com/silvasean/5407aafb710d07612b7b5b92eabecebe), just to be able to look at a substantial sample of IR in the new style.
2021-05-05 05:42:50 +08:00
auto rank = type.getRank();
SmallVector<StringRef> iteratorTypes(rank, "parallel");
SmallVector<AffineMap> indexingMaps = {
rewriter.getMultiDimIdentityMap(rank),
rewriter.getMultiDimIdentityMap(rank)};
rewriter.replaceOpWithNewOp<linalg::GenericOp>(
op, type, operand, operand,
/*indexingMaps=*/indexingMaps,
/*iteratorTypes=*/iteratorTypes,
[&](OpBuilder &b, Location loc, ValueRange args) {
Value result;
2021-06-16 00:00:37 +08:00
if (isa<AtenTanhOp>(op))
Significantly restructure torch/aten import design. This is a really major and invasive restructuring of the way we get torch operators (`torch::jit::Operator` / `c10::OperatorHandle`) into MLIR. Please forgive the challenging review, but due to the sheer invasiveness, it wasn't really practical do do it in sane smaller pieces. This fully replaces everything that was already working on the TorchScript path (actually, more -- we added tanh support to TorchToLinalg in order to delete the older code paths). Additionally, I've kept the lights on for the acap path too, including what little e2e stuff was working before (for expediency I made a few tiny compromises along the way that will be easy to undo when we give that path proper attention). Overview of the new design: - The torch operator `somens::someunqualname.someoverloadname` is imported as `torch.somens.someunqualname.someoverloadname` (skip the last dotted part if the overload name is empty), OR, if we don't have such an op registered, it is imported as `torch.operator "somens.someunqualname.someoverloadname" (...) : ...`. - The addition of the "overload name" is a critical element here, as the `(ns,unqual,overload)` triple is unique, which solves a lot of problems we were having. - This involves having separate MLIR ops for the `trailing_` and `.out` variants and all the different overloads. This seemed necessary, because the set of overloads is so wild and varied and unstructured. The previous design was leaning into some underlying structure that just isn't there -- the default situation is the "random overload that we want to manage on the MLIR side", rather than that being an exception. E.g. `aten::ne` (not-equal) has 21 overloads, only 4 of which are c10 dispatcher ops see [gist](https://gist.github.com/silvasean/190ba918c550c956260e21254e1b8aa1), and the "out" variant is really called `.Tensor_out` instead of `.out` as it frequently is for other ops. - Rationale for all being in `torch` namespace: the set of operators are so varied and unstructured that "dialect per namespace" doesn't result in anything resembling the typical MLIR dialect boundary expectations. We could maybe draw the boundary at dispatcher ops vs non-dispatcher ops, but that doesn't seem to really result in very much useful structure at this point in time. - Note: within the torch operator registry, we effectively have a mini-basicpy subdialect (already type-resolved), which is reasonably structured. - The existing Torch op interfaces are also removed -- now that we track the overload name, we can losslessly find the original operator. - Instead of `ATenRecognizeKernelsPass`, we now have a `ReduceOpVariantsPass` that keys off certain traits (and perhaps eventually interfaces) to reduce variants of ops to a smaller set, ideally operating on immutable tensors and using surrounding ops to model the mutability/aliasing aspects. - Note: `torch.ns.unqual.overload` ops allow both immutable and mutable tensors (unlike the previous hard distinction in the common case). This is a premonition for a future change that will introduce a bona fide `!torch.tensor` type that will clean up a bunch of stuff. - `TorchToLinalg` / `TorchToStd` supercede the existing "ATen->TCF->TCP->Linalg" path. - The new `torch_ods_gen.py` supercedes `torch_signature_ods_gen.py`. It should look somewhat familiar, but the benefit of hindsight has allowed a lot of simplifications. The overall trend seems to be to make the `torch` dialect a nice layer independent of anything else. It feels like as a natural result of various future changes we will be removing the reliance on basicpy+numpy dialects and have a nice self-contained type system too that properly models the TorchScript type system (including proper subtyping, mutable/immutable tensors, optional dtype, etc.). Recommended review order: - Start at some of the new import IR, e.g. in `frontends/pytorch/test/node_import/prim.py`, `frontends/pytorch/test/acap_export/test_export_add3.py`, and other tests. - `frontends/pytorch/python/torch_mlir_utils/codegen/torch_ods_gen.py` and associated generated files: - `include/npcomp/Dialect/Torch/IR/GeneratedAtenOps.td` - `include/npcomp/Dialect/Torch/IR/GeneratedPrimOps.td` - Inspect `ReduceOpVariants.cpp` / `reduce-op-variants.mlir` and the new traits in `include/npcomp/Dialect/Torch/IR/TorchTraits.h` - Various code changes in the import path in `frontends/pytorch/csrc/builder`. Probably most interesting is the new code in `torch_to_mlir_utils.cpp` that has the logic to create the `torch.operator` ops or `torch.ns.unqual.overload` ops. This is the [new ResNet IR](https://gist.github.com/silvasean/5407aafb710d07612b7b5b92eabecebe), just to be able to look at a substantial sample of IR in the new style.
2021-05-05 05:42:50 +08:00
result = b.create<math::TanhOp>(loc, args[0]);
2021-06-16 00:00:37 +08:00
else if (isa<AtenReluOp>(op))
result = createScalarRelu(b, loc, args);
Significantly restructure torch/aten import design. This is a really major and invasive restructuring of the way we get torch operators (`torch::jit::Operator` / `c10::OperatorHandle`) into MLIR. Please forgive the challenging review, but due to the sheer invasiveness, it wasn't really practical do do it in sane smaller pieces. This fully replaces everything that was already working on the TorchScript path (actually, more -- we added tanh support to TorchToLinalg in order to delete the older code paths). Additionally, I've kept the lights on for the acap path too, including what little e2e stuff was working before (for expediency I made a few tiny compromises along the way that will be easy to undo when we give that path proper attention). Overview of the new design: - The torch operator `somens::someunqualname.someoverloadname` is imported as `torch.somens.someunqualname.someoverloadname` (skip the last dotted part if the overload name is empty), OR, if we don't have such an op registered, it is imported as `torch.operator "somens.someunqualname.someoverloadname" (...) : ...`. - The addition of the "overload name" is a critical element here, as the `(ns,unqual,overload)` triple is unique, which solves a lot of problems we were having. - This involves having separate MLIR ops for the `trailing_` and `.out` variants and all the different overloads. This seemed necessary, because the set of overloads is so wild and varied and unstructured. The previous design was leaning into some underlying structure that just isn't there -- the default situation is the "random overload that we want to manage on the MLIR side", rather than that being an exception. E.g. `aten::ne` (not-equal) has 21 overloads, only 4 of which are c10 dispatcher ops see [gist](https://gist.github.com/silvasean/190ba918c550c956260e21254e1b8aa1), and the "out" variant is really called `.Tensor_out` instead of `.out` as it frequently is for other ops. - Rationale for all being in `torch` namespace: the set of operators are so varied and unstructured that "dialect per namespace" doesn't result in anything resembling the typical MLIR dialect boundary expectations. We could maybe draw the boundary at dispatcher ops vs non-dispatcher ops, but that doesn't seem to really result in very much useful structure at this point in time. - Note: within the torch operator registry, we effectively have a mini-basicpy subdialect (already type-resolved), which is reasonably structured. - The existing Torch op interfaces are also removed -- now that we track the overload name, we can losslessly find the original operator. - Instead of `ATenRecognizeKernelsPass`, we now have a `ReduceOpVariantsPass` that keys off certain traits (and perhaps eventually interfaces) to reduce variants of ops to a smaller set, ideally operating on immutable tensors and using surrounding ops to model the mutability/aliasing aspects. - Note: `torch.ns.unqual.overload` ops allow both immutable and mutable tensors (unlike the previous hard distinction in the common case). This is a premonition for a future change that will introduce a bona fide `!torch.tensor` type that will clean up a bunch of stuff. - `TorchToLinalg` / `TorchToStd` supercede the existing "ATen->TCF->TCP->Linalg" path. - The new `torch_ods_gen.py` supercedes `torch_signature_ods_gen.py`. It should look somewhat familiar, but the benefit of hindsight has allowed a lot of simplifications. The overall trend seems to be to make the `torch` dialect a nice layer independent of anything else. It feels like as a natural result of various future changes we will be removing the reliance on basicpy+numpy dialects and have a nice self-contained type system too that properly models the TorchScript type system (including proper subtyping, mutable/immutable tensors, optional dtype, etc.). Recommended review order: - Start at some of the new import IR, e.g. in `frontends/pytorch/test/node_import/prim.py`, `frontends/pytorch/test/acap_export/test_export_add3.py`, and other tests. - `frontends/pytorch/python/torch_mlir_utils/codegen/torch_ods_gen.py` and associated generated files: - `include/npcomp/Dialect/Torch/IR/GeneratedAtenOps.td` - `include/npcomp/Dialect/Torch/IR/GeneratedPrimOps.td` - Inspect `ReduceOpVariants.cpp` / `reduce-op-variants.mlir` and the new traits in `include/npcomp/Dialect/Torch/IR/TorchTraits.h` - Various code changes in the import path in `frontends/pytorch/csrc/builder`. Probably most interesting is the new code in `torch_to_mlir_utils.cpp` that has the logic to create the `torch.operator` ops or `torch.ns.unqual.overload` ops. This is the [new ResNet IR](https://gist.github.com/silvasean/5407aafb710d07612b7b5b92eabecebe), just to be able to look at a substantial sample of IR in the new style.
2021-05-05 05:42:50 +08:00
b.create<linalg::YieldOp>(loc, result);
});
return success();
}
};
} // namespace
// -----------------------------------------------------------------------------
// The pass
// -----------------------------------------------------------------------------
namespace {
Significantly restructure torch/aten import design. This is a really major and invasive restructuring of the way we get torch operators (`torch::jit::Operator` / `c10::OperatorHandle`) into MLIR. Please forgive the challenging review, but due to the sheer invasiveness, it wasn't really practical do do it in sane smaller pieces. This fully replaces everything that was already working on the TorchScript path (actually, more -- we added tanh support to TorchToLinalg in order to delete the older code paths). Additionally, I've kept the lights on for the acap path too, including what little e2e stuff was working before (for expediency I made a few tiny compromises along the way that will be easy to undo when we give that path proper attention). Overview of the new design: - The torch operator `somens::someunqualname.someoverloadname` is imported as `torch.somens.someunqualname.someoverloadname` (skip the last dotted part if the overload name is empty), OR, if we don't have such an op registered, it is imported as `torch.operator "somens.someunqualname.someoverloadname" (...) : ...`. - The addition of the "overload name" is a critical element here, as the `(ns,unqual,overload)` triple is unique, which solves a lot of problems we were having. - This involves having separate MLIR ops for the `trailing_` and `.out` variants and all the different overloads. This seemed necessary, because the set of overloads is so wild and varied and unstructured. The previous design was leaning into some underlying structure that just isn't there -- the default situation is the "random overload that we want to manage on the MLIR side", rather than that being an exception. E.g. `aten::ne` (not-equal) has 21 overloads, only 4 of which are c10 dispatcher ops see [gist](https://gist.github.com/silvasean/190ba918c550c956260e21254e1b8aa1), and the "out" variant is really called `.Tensor_out` instead of `.out` as it frequently is for other ops. - Rationale for all being in `torch` namespace: the set of operators are so varied and unstructured that "dialect per namespace" doesn't result in anything resembling the typical MLIR dialect boundary expectations. We could maybe draw the boundary at dispatcher ops vs non-dispatcher ops, but that doesn't seem to really result in very much useful structure at this point in time. - Note: within the torch operator registry, we effectively have a mini-basicpy subdialect (already type-resolved), which is reasonably structured. - The existing Torch op interfaces are also removed -- now that we track the overload name, we can losslessly find the original operator. - Instead of `ATenRecognizeKernelsPass`, we now have a `ReduceOpVariantsPass` that keys off certain traits (and perhaps eventually interfaces) to reduce variants of ops to a smaller set, ideally operating on immutable tensors and using surrounding ops to model the mutability/aliasing aspects. - Note: `torch.ns.unqual.overload` ops allow both immutable and mutable tensors (unlike the previous hard distinction in the common case). This is a premonition for a future change that will introduce a bona fide `!torch.tensor` type that will clean up a bunch of stuff. - `TorchToLinalg` / `TorchToStd` supercede the existing "ATen->TCF->TCP->Linalg" path. - The new `torch_ods_gen.py` supercedes `torch_signature_ods_gen.py`. It should look somewhat familiar, but the benefit of hindsight has allowed a lot of simplifications. The overall trend seems to be to make the `torch` dialect a nice layer independent of anything else. It feels like as a natural result of various future changes we will be removing the reliance on basicpy+numpy dialects and have a nice self-contained type system too that properly models the TorchScript type system (including proper subtyping, mutable/immutable tensors, optional dtype, etc.). Recommended review order: - Start at some of the new import IR, e.g. in `frontends/pytorch/test/node_import/prim.py`, `frontends/pytorch/test/acap_export/test_export_add3.py`, and other tests. - `frontends/pytorch/python/torch_mlir_utils/codegen/torch_ods_gen.py` and associated generated files: - `include/npcomp/Dialect/Torch/IR/GeneratedAtenOps.td` - `include/npcomp/Dialect/Torch/IR/GeneratedPrimOps.td` - Inspect `ReduceOpVariants.cpp` / `reduce-op-variants.mlir` and the new traits in `include/npcomp/Dialect/Torch/IR/TorchTraits.h` - Various code changes in the import path in `frontends/pytorch/csrc/builder`. Probably most interesting is the new code in `torch_to_mlir_utils.cpp` that has the logic to create the `torch.operator` ops or `torch.ns.unqual.overload` ops. This is the [new ResNet IR](https://gist.github.com/silvasean/5407aafb710d07612b7b5b92eabecebe), just to be able to look at a substantial sample of IR in the new style.
2021-05-05 05:42:50 +08:00
class ConvertTorchToLinalg
: public ConvertTorchToLinalgBase<ConvertTorchToLinalg> {
public:
void getDependentDialects(DialectRegistry &registry) const override {
registry.insert<linalg::LinalgDialect>();
registry.insert<memref::MemRefDialect>();
Significantly restructure torch/aten import design. This is a really major and invasive restructuring of the way we get torch operators (`torch::jit::Operator` / `c10::OperatorHandle`) into MLIR. Please forgive the challenging review, but due to the sheer invasiveness, it wasn't really practical do do it in sane smaller pieces. This fully replaces everything that was already working on the TorchScript path (actually, more -- we added tanh support to TorchToLinalg in order to delete the older code paths). Additionally, I've kept the lights on for the acap path too, including what little e2e stuff was working before (for expediency I made a few tiny compromises along the way that will be easy to undo when we give that path proper attention). Overview of the new design: - The torch operator `somens::someunqualname.someoverloadname` is imported as `torch.somens.someunqualname.someoverloadname` (skip the last dotted part if the overload name is empty), OR, if we don't have such an op registered, it is imported as `torch.operator "somens.someunqualname.someoverloadname" (...) : ...`. - The addition of the "overload name" is a critical element here, as the `(ns,unqual,overload)` triple is unique, which solves a lot of problems we were having. - This involves having separate MLIR ops for the `trailing_` and `.out` variants and all the different overloads. This seemed necessary, because the set of overloads is so wild and varied and unstructured. The previous design was leaning into some underlying structure that just isn't there -- the default situation is the "random overload that we want to manage on the MLIR side", rather than that being an exception. E.g. `aten::ne` (not-equal) has 21 overloads, only 4 of which are c10 dispatcher ops see [gist](https://gist.github.com/silvasean/190ba918c550c956260e21254e1b8aa1), and the "out" variant is really called `.Tensor_out` instead of `.out` as it frequently is for other ops. - Rationale for all being in `torch` namespace: the set of operators are so varied and unstructured that "dialect per namespace" doesn't result in anything resembling the typical MLIR dialect boundary expectations. We could maybe draw the boundary at dispatcher ops vs non-dispatcher ops, but that doesn't seem to really result in very much useful structure at this point in time. - Note: within the torch operator registry, we effectively have a mini-basicpy subdialect (already type-resolved), which is reasonably structured. - The existing Torch op interfaces are also removed -- now that we track the overload name, we can losslessly find the original operator. - Instead of `ATenRecognizeKernelsPass`, we now have a `ReduceOpVariantsPass` that keys off certain traits (and perhaps eventually interfaces) to reduce variants of ops to a smaller set, ideally operating on immutable tensors and using surrounding ops to model the mutability/aliasing aspects. - Note: `torch.ns.unqual.overload` ops allow both immutable and mutable tensors (unlike the previous hard distinction in the common case). This is a premonition for a future change that will introduce a bona fide `!torch.tensor` type that will clean up a bunch of stuff. - `TorchToLinalg` / `TorchToStd` supercede the existing "ATen->TCF->TCP->Linalg" path. - The new `torch_ods_gen.py` supercedes `torch_signature_ods_gen.py`. It should look somewhat familiar, but the benefit of hindsight has allowed a lot of simplifications. The overall trend seems to be to make the `torch` dialect a nice layer independent of anything else. It feels like as a natural result of various future changes we will be removing the reliance on basicpy+numpy dialects and have a nice self-contained type system too that properly models the TorchScript type system (including proper subtyping, mutable/immutable tensors, optional dtype, etc.). Recommended review order: - Start at some of the new import IR, e.g. in `frontends/pytorch/test/node_import/prim.py`, `frontends/pytorch/test/acap_export/test_export_add3.py`, and other tests. - `frontends/pytorch/python/torch_mlir_utils/codegen/torch_ods_gen.py` and associated generated files: - `include/npcomp/Dialect/Torch/IR/GeneratedAtenOps.td` - `include/npcomp/Dialect/Torch/IR/GeneratedPrimOps.td` - Inspect `ReduceOpVariants.cpp` / `reduce-op-variants.mlir` and the new traits in `include/npcomp/Dialect/Torch/IR/TorchTraits.h` - Various code changes in the import path in `frontends/pytorch/csrc/builder`. Probably most interesting is the new code in `torch_to_mlir_utils.cpp` that has the logic to create the `torch.operator` ops or `torch.ns.unqual.overload` ops. This is the [new ResNet IR](https://gist.github.com/silvasean/5407aafb710d07612b7b5b92eabecebe), just to be able to look at a substantial sample of IR in the new style.
2021-05-05 05:42:50 +08:00
registry.insert<math::MathDialect>();
Introduce `!torch.tensor` / `!torch.vtensor` types. This removes our reliance on the numpy dialect and avoids our off-label use of the builtin tnesor type for modeling unknown dtypes. The `!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor. The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic tensor. The new types look as follows syntactically: ``` // Least-static-information, non-value-semantic tensor. !torch.tensor // Explicit form of least-static-information variant. !torch.tensor<*,unk> // Least-static-information, value-semantic tensor. !torch.vtensor // Explicit form of least-static-information variant. !torch.vtensor<*,unk> // Fixed-set of allowable element types, with first-class support for // Torch's frontend signedness semantics. !torch.tensor<*,si32> // First-class support for unknown dtypes. !torch.tensor<[?,?,?],unk> // Standard MLIR representation of `?` for unknown dimensions. !torch.tensor<[?,2,?,4],unk> // Statically shaped / dtyped example. !torch.vtensor<[1,2,3,4],f32> ``` This required fairly significant changes throughout the compiler, but overall it is a big cleanup. We now have a much clearer layering of "the Torch frontend lowering" vs "lowering to std + linalg + etc.". At the C++ level, there is `ValueTensorType`, `NonValueTensorType`. We also have a helper `BaseTensorType` (kind of like ShapedType) which interoperates with those two. Included changes: - New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for creating torch tensor literals in the frontend. - Consistently use signedness for the types (except i1 which I didn't touch -- we need to sort out the situation with !basicpy.BoolType there anyway so will be attending to that soon) - Frontend can annotate whether an argument to the function has value semantics. We currently require this, as our backend contract does not currently allow us to even model the non-value-semantic case. Before, the value-semantic assumption was randomly injected in the middle of the pass pipeline. - Move ArrayToTensor (now called MaximizeValueSemantics) and RefinePublicReturn passes to torch dialect. - The TorchToStd and TorchToLinalg passes are now type conversions from `!torch.vtensor` to `tensor` and use the dialect conversion infra. The overall conversion pipeline is set up following the best practices of the "Type Conversions the Not-So-Hard Way" talk. This required introducing `torch-func-builtin-tensorize` and `torch-finalizing-builtin-tensorize` passes analogous to the upstream bufferization passes with the corresponding names (mostly just copypasta from there). - Misc Torch-level canonicalizations -- we now cleanly layer the lowering to std later in the pipeline, so we are gradually lessening our reliance on random std constant folding before we get to that point. Recommended review order: - New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp - New ops in TorchOps.td / TorchOps.cpp - Less important / more mechanical stuff - Frontend changes. - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-05-21 08:07:18 +08:00
registry.insert<StandardOpsDialect>();
registry.insert<tensor::TensorDialect>();
}
void runOnOperation() override {
MLIRContext *context = &getContext();
Introduce `!torch.tensor` / `!torch.vtensor` types. This removes our reliance on the numpy dialect and avoids our off-label use of the builtin tnesor type for modeling unknown dtypes. The `!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor. The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic tensor. The new types look as follows syntactically: ``` // Least-static-information, non-value-semantic tensor. !torch.tensor // Explicit form of least-static-information variant. !torch.tensor<*,unk> // Least-static-information, value-semantic tensor. !torch.vtensor // Explicit form of least-static-information variant. !torch.vtensor<*,unk> // Fixed-set of allowable element types, with first-class support for // Torch's frontend signedness semantics. !torch.tensor<*,si32> // First-class support for unknown dtypes. !torch.tensor<[?,?,?],unk> // Standard MLIR representation of `?` for unknown dimensions. !torch.tensor<[?,2,?,4],unk> // Statically shaped / dtyped example. !torch.vtensor<[1,2,3,4],f32> ``` This required fairly significant changes throughout the compiler, but overall it is a big cleanup. We now have a much clearer layering of "the Torch frontend lowering" vs "lowering to std + linalg + etc.". At the C++ level, there is `ValueTensorType`, `NonValueTensorType`. We also have a helper `BaseTensorType` (kind of like ShapedType) which interoperates with those two. Included changes: - New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for creating torch tensor literals in the frontend. - Consistently use signedness for the types (except i1 which I didn't touch -- we need to sort out the situation with !basicpy.BoolType there anyway so will be attending to that soon) - Frontend can annotate whether an argument to the function has value semantics. We currently require this, as our backend contract does not currently allow us to even model the non-value-semantic case. Before, the value-semantic assumption was randomly injected in the middle of the pass pipeline. - Move ArrayToTensor (now called MaximizeValueSemantics) and RefinePublicReturn passes to torch dialect. - The TorchToStd and TorchToLinalg passes are now type conversions from `!torch.vtensor` to `tensor` and use the dialect conversion infra. The overall conversion pipeline is set up following the best practices of the "Type Conversions the Not-So-Hard Way" talk. This required introducing `torch-func-builtin-tensorize` and `torch-finalizing-builtin-tensorize` passes analogous to the upstream bufferization passes with the corresponding names (mostly just copypasta from there). - Misc Torch-level canonicalizations -- we now cleanly layer the lowering to std later in the pipeline, so we are gradually lessening our reliance on random std constant folding before we get to that point. Recommended review order: - New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp - New ops in TorchOps.td / TorchOps.cpp - Less important / more mechanical stuff - Frontend changes. - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-05-21 08:07:18 +08:00
ConversionTarget target(*context);
target.addLegalDialect<linalg::LinalgDialect, StandardOpsDialect,
memref::MemRefDialect, math::MathDialect,
tensor::TensorDialect>();
TypeConverter typeConverter;
typeConverter.addConversion([](Type type) { return type; });
setupBackendTypeConversion(target, typeConverter);
Introduce `!torch.tensor` / `!torch.vtensor` types. This removes our reliance on the numpy dialect and avoids our off-label use of the builtin tnesor type for modeling unknown dtypes. The `!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor. The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic tensor. The new types look as follows syntactically: ``` // Least-static-information, non-value-semantic tensor. !torch.tensor // Explicit form of least-static-information variant. !torch.tensor<*,unk> // Least-static-information, value-semantic tensor. !torch.vtensor // Explicit form of least-static-information variant. !torch.vtensor<*,unk> // Fixed-set of allowable element types, with first-class support for // Torch's frontend signedness semantics. !torch.tensor<*,si32> // First-class support for unknown dtypes. !torch.tensor<[?,?,?],unk> // Standard MLIR representation of `?` for unknown dimensions. !torch.tensor<[?,2,?,4],unk> // Statically shaped / dtyped example. !torch.vtensor<[1,2,3,4],f32> ``` This required fairly significant changes throughout the compiler, but overall it is a big cleanup. We now have a much clearer layering of "the Torch frontend lowering" vs "lowering to std + linalg + etc.". At the C++ level, there is `ValueTensorType`, `NonValueTensorType`. We also have a helper `BaseTensorType` (kind of like ShapedType) which interoperates with those two. Included changes: - New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for creating torch tensor literals in the frontend. - Consistently use signedness for the types (except i1 which I didn't touch -- we need to sort out the situation with !basicpy.BoolType there anyway so will be attending to that soon) - Frontend can annotate whether an argument to the function has value semantics. We currently require this, as our backend contract does not currently allow us to even model the non-value-semantic case. Before, the value-semantic assumption was randomly injected in the middle of the pass pipeline. - Move ArrayToTensor (now called MaximizeValueSemantics) and RefinePublicReturn passes to torch dialect. - The TorchToStd and TorchToLinalg passes are now type conversions from `!torch.vtensor` to `tensor` and use the dialect conversion infra. The overall conversion pipeline is set up following the best practices of the "Type Conversions the Not-So-Hard Way" talk. This required introducing `torch-func-builtin-tensorize` and `torch-finalizing-builtin-tensorize` passes analogous to the upstream bufferization passes with the corresponding names (mostly just copypasta from there). - Misc Torch-level canonicalizations -- we now cleanly layer the lowering to std later in the pipeline, so we are gradually lessening our reliance on random std constant folding before we get to that point. Recommended review order: - New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp - New ops in TorchOps.td / TorchOps.cpp - Less important / more mechanical stuff - Frontend changes. - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-05-21 08:07:18 +08:00
RewritePatternSet patterns(context);
Introduce `!torch.tensor` / `!torch.vtensor` types. This removes our reliance on the numpy dialect and avoids our off-label use of the builtin tnesor type for modeling unknown dtypes. The `!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor. The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic tensor. The new types look as follows syntactically: ``` // Least-static-information, non-value-semantic tensor. !torch.tensor // Explicit form of least-static-information variant. !torch.tensor<*,unk> // Least-static-information, value-semantic tensor. !torch.vtensor // Explicit form of least-static-information variant. !torch.vtensor<*,unk> // Fixed-set of allowable element types, with first-class support for // Torch's frontend signedness semantics. !torch.tensor<*,si32> // First-class support for unknown dtypes. !torch.tensor<[?,?,?],unk> // Standard MLIR representation of `?` for unknown dimensions. !torch.tensor<[?,2,?,4],unk> // Statically shaped / dtyped example. !torch.vtensor<[1,2,3,4],f32> ``` This required fairly significant changes throughout the compiler, but overall it is a big cleanup. We now have a much clearer layering of "the Torch frontend lowering" vs "lowering to std + linalg + etc.". At the C++ level, there is `ValueTensorType`, `NonValueTensorType`. We also have a helper `BaseTensorType` (kind of like ShapedType) which interoperates with those two. Included changes: - New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for creating torch tensor literals in the frontend. - Consistently use signedness for the types (except i1 which I didn't touch -- we need to sort out the situation with !basicpy.BoolType there anyway so will be attending to that soon) - Frontend can annotate whether an argument to the function has value semantics. We currently require this, as our backend contract does not currently allow us to even model the non-value-semantic case. Before, the value-semantic assumption was randomly injected in the middle of the pass pipeline. - Move ArrayToTensor (now called MaximizeValueSemantics) and RefinePublicReturn passes to torch dialect. - The TorchToStd and TorchToLinalg passes are now type conversions from `!torch.vtensor` to `tensor` and use the dialect conversion infra. The overall conversion pipeline is set up following the best practices of the "Type Conversions the Not-So-Hard Way" talk. This required introducing `torch-func-builtin-tensorize` and `torch-finalizing-builtin-tensorize` passes analogous to the upstream bufferization passes with the corresponding names (mostly just copypasta from there). - Misc Torch-level canonicalizations -- we now cleanly layer the lowering to std later in the pipeline, so we are gradually lessening our reliance on random std constant folding before we get to that point. Recommended review order: - New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp - New ops in TorchOps.td / TorchOps.cpp - Less important / more mechanical stuff - Frontend changes. - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-05-21 08:07:18 +08:00
target.addIllegalOp<AtenMmOp>();
patterns.add<ConvertAtenMmOp>(typeConverter, context);
target.addIllegalOp<AtenLinearOp>();
patterns.add<ConvertAtenLinearOp>(typeConverter, context);
target.addIllegalOp<AtenBatchNormOp>();
patterns.add<ConvertAtenBatchNormOp>(typeConverter, context);
Introduce `!torch.tensor` / `!torch.vtensor` types. This removes our reliance on the numpy dialect and avoids our off-label use of the builtin tnesor type for modeling unknown dtypes. The `!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor. The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic tensor. The new types look as follows syntactically: ``` // Least-static-information, non-value-semantic tensor. !torch.tensor // Explicit form of least-static-information variant. !torch.tensor<*,unk> // Least-static-information, value-semantic tensor. !torch.vtensor // Explicit form of least-static-information variant. !torch.vtensor<*,unk> // Fixed-set of allowable element types, with first-class support for // Torch's frontend signedness semantics. !torch.tensor<*,si32> // First-class support for unknown dtypes. !torch.tensor<[?,?,?],unk> // Standard MLIR representation of `?` for unknown dimensions. !torch.tensor<[?,2,?,4],unk> // Statically shaped / dtyped example. !torch.vtensor<[1,2,3,4],f32> ``` This required fairly significant changes throughout the compiler, but overall it is a big cleanup. We now have a much clearer layering of "the Torch frontend lowering" vs "lowering to std + linalg + etc.". At the C++ level, there is `ValueTensorType`, `NonValueTensorType`. We also have a helper `BaseTensorType` (kind of like ShapedType) which interoperates with those two. Included changes: - New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for creating torch tensor literals in the frontend. - Consistently use signedness for the types (except i1 which I didn't touch -- we need to sort out the situation with !basicpy.BoolType there anyway so will be attending to that soon) - Frontend can annotate whether an argument to the function has value semantics. We currently require this, as our backend contract does not currently allow us to even model the non-value-semantic case. Before, the value-semantic assumption was randomly injected in the middle of the pass pipeline. - Move ArrayToTensor (now called MaximizeValueSemantics) and RefinePublicReturn passes to torch dialect. - The TorchToStd and TorchToLinalg passes are now type conversions from `!torch.vtensor` to `tensor` and use the dialect conversion infra. The overall conversion pipeline is set up following the best practices of the "Type Conversions the Not-So-Hard Way" talk. This required introducing `torch-func-builtin-tensorize` and `torch-finalizing-builtin-tensorize` passes analogous to the upstream bufferization passes with the corresponding names (mostly just copypasta from there). - Misc Torch-level canonicalizations -- we now cleanly layer the lowering to std later in the pipeline, so we are gradually lessening our reliance on random std constant folding before we get to that point. Recommended review order: - New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp - New ops in TorchOps.td / TorchOps.cpp - Less important / more mechanical stuff - Frontend changes. - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-05-21 08:07:18 +08:00
target.addIllegalOp<AtenTanhOp>();
patterns.add<ConvertUnaryOp>(typeConverter, context);
if (failed(applyPartialConversion(getOperation(), target,
std::move(patterns))))
return signalPassFailure();
}
};
} // namespace
std::unique_ptr<OperationPass<FuncOp>>
Significantly restructure torch/aten import design. This is a really major and invasive restructuring of the way we get torch operators (`torch::jit::Operator` / `c10::OperatorHandle`) into MLIR. Please forgive the challenging review, but due to the sheer invasiveness, it wasn't really practical do do it in sane smaller pieces. This fully replaces everything that was already working on the TorchScript path (actually, more -- we added tanh support to TorchToLinalg in order to delete the older code paths). Additionally, I've kept the lights on for the acap path too, including what little e2e stuff was working before (for expediency I made a few tiny compromises along the way that will be easy to undo when we give that path proper attention). Overview of the new design: - The torch operator `somens::someunqualname.someoverloadname` is imported as `torch.somens.someunqualname.someoverloadname` (skip the last dotted part if the overload name is empty), OR, if we don't have such an op registered, it is imported as `torch.operator "somens.someunqualname.someoverloadname" (...) : ...`. - The addition of the "overload name" is a critical element here, as the `(ns,unqual,overload)` triple is unique, which solves a lot of problems we were having. - This involves having separate MLIR ops for the `trailing_` and `.out` variants and all the different overloads. This seemed necessary, because the set of overloads is so wild and varied and unstructured. The previous design was leaning into some underlying structure that just isn't there -- the default situation is the "random overload that we want to manage on the MLIR side", rather than that being an exception. E.g. `aten::ne` (not-equal) has 21 overloads, only 4 of which are c10 dispatcher ops see [gist](https://gist.github.com/silvasean/190ba918c550c956260e21254e1b8aa1), and the "out" variant is really called `.Tensor_out` instead of `.out` as it frequently is for other ops. - Rationale for all being in `torch` namespace: the set of operators are so varied and unstructured that "dialect per namespace" doesn't result in anything resembling the typical MLIR dialect boundary expectations. We could maybe draw the boundary at dispatcher ops vs non-dispatcher ops, but that doesn't seem to really result in very much useful structure at this point in time. - Note: within the torch operator registry, we effectively have a mini-basicpy subdialect (already type-resolved), which is reasonably structured. - The existing Torch op interfaces are also removed -- now that we track the overload name, we can losslessly find the original operator. - Instead of `ATenRecognizeKernelsPass`, we now have a `ReduceOpVariantsPass` that keys off certain traits (and perhaps eventually interfaces) to reduce variants of ops to a smaller set, ideally operating on immutable tensors and using surrounding ops to model the mutability/aliasing aspects. - Note: `torch.ns.unqual.overload` ops allow both immutable and mutable tensors (unlike the previous hard distinction in the common case). This is a premonition for a future change that will introduce a bona fide `!torch.tensor` type that will clean up a bunch of stuff. - `TorchToLinalg` / `TorchToStd` supercede the existing "ATen->TCF->TCP->Linalg" path. - The new `torch_ods_gen.py` supercedes `torch_signature_ods_gen.py`. It should look somewhat familiar, but the benefit of hindsight has allowed a lot of simplifications. The overall trend seems to be to make the `torch` dialect a nice layer independent of anything else. It feels like as a natural result of various future changes we will be removing the reliance on basicpy+numpy dialects and have a nice self-contained type system too that properly models the TorchScript type system (including proper subtyping, mutable/immutable tensors, optional dtype, etc.). Recommended review order: - Start at some of the new import IR, e.g. in `frontends/pytorch/test/node_import/prim.py`, `frontends/pytorch/test/acap_export/test_export_add3.py`, and other tests. - `frontends/pytorch/python/torch_mlir_utils/codegen/torch_ods_gen.py` and associated generated files: - `include/npcomp/Dialect/Torch/IR/GeneratedAtenOps.td` - `include/npcomp/Dialect/Torch/IR/GeneratedPrimOps.td` - Inspect `ReduceOpVariants.cpp` / `reduce-op-variants.mlir` and the new traits in `include/npcomp/Dialect/Torch/IR/TorchTraits.h` - Various code changes in the import path in `frontends/pytorch/csrc/builder`. Probably most interesting is the new code in `torch_to_mlir_utils.cpp` that has the logic to create the `torch.operator` ops or `torch.ns.unqual.overload` ops. This is the [new ResNet IR](https://gist.github.com/silvasean/5407aafb710d07612b7b5b92eabecebe), just to be able to look at a substantial sample of IR in the new style.
2021-05-05 05:42:50 +08:00
mlir::NPCOMP::createConvertTorchToLinalgPass() {
return std::make_unique<ConvertTorchToLinalg>();
}