torch-mlir/lib/Dialect/Torch/Transforms/MaximizeValueSemantics.cpp

180 lines
7.3 KiB
C++
Raw Normal View History

Introduce `!torch.tensor` / `!torch.vtensor` types. This removes our reliance on the numpy dialect and avoids our off-label use of the builtin tnesor type for modeling unknown dtypes. The `!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor. The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic tensor. The new types look as follows syntactically: ``` // Least-static-information, non-value-semantic tensor. !torch.tensor // Explicit form of least-static-information variant. !torch.tensor<*,unk> // Least-static-information, value-semantic tensor. !torch.vtensor // Explicit form of least-static-information variant. !torch.vtensor<*,unk> // Fixed-set of allowable element types, with first-class support for // Torch's frontend signedness semantics. !torch.tensor<*,si32> // First-class support for unknown dtypes. !torch.tensor<[?,?,?],unk> // Standard MLIR representation of `?` for unknown dimensions. !torch.tensor<[?,2,?,4],unk> // Statically shaped / dtyped example. !torch.vtensor<[1,2,3,4],f32> ``` This required fairly significant changes throughout the compiler, but overall it is a big cleanup. We now have a much clearer layering of "the Torch frontend lowering" vs "lowering to std + linalg + etc.". At the C++ level, there is `ValueTensorType`, `NonValueTensorType`. We also have a helper `BaseTensorType` (kind of like ShapedType) which interoperates with those two. Included changes: - New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for creating torch tensor literals in the frontend. - Consistently use signedness for the types (except i1 which I didn't touch -- we need to sort out the situation with !basicpy.BoolType there anyway so will be attending to that soon) - Frontend can annotate whether an argument to the function has value semantics. We currently require this, as our backend contract does not currently allow us to even model the non-value-semantic case. Before, the value-semantic assumption was randomly injected in the middle of the pass pipeline. - Move ArrayToTensor (now called MaximizeValueSemantics) and RefinePublicReturn passes to torch dialect. - The TorchToStd and TorchToLinalg passes are now type conversions from `!torch.vtensor` to `tensor` and use the dialect conversion infra. The overall conversion pipeline is set up following the best practices of the "Type Conversions the Not-So-Hard Way" talk. This required introducing `torch-func-builtin-tensorize` and `torch-finalizing-builtin-tensorize` passes analogous to the upstream bufferization passes with the corresponding names (mostly just copypasta from there). - Misc Torch-level canonicalizations -- we now cleanly layer the lowering to std later in the pipeline, so we are gradually lessening our reliance on random std constant folding before we get to that point. Recommended review order: - New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp - New ops in TorchOps.td / TorchOps.cpp - Less important / more mechanical stuff - Frontend changes. - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-05-21 08:07:18 +08:00
//===- MaximizeValueSemantics.cpp --------------------------------*- C++-*-===//
//
// This file is licensed under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
// Also available under a BSD-style license. See LICENSE.
//
//===----------------------------------------------------------------------===//
#include "PassDetail.h"
#include "mlir/IR/Builders.h"
#include "mlir/IR/BuiltinOps.h"
#include "mlir/IR/PatternMatch.h"
#include "mlir/Transforms/GreedyPatternRewriteDriver.h"
[torch-mlir earthmoving (1/N)] C/C++ code movement. This creates the `external/torch-mlir` directory as an LLVM_EXTERNAL_PROJECTS-compatible project (analogous to `iree-dialects`) and completes movement/rename of all pure MLIR C/C++ compiler code into there. The next step will be to move all the Python code / code that links/includes PyTorch C++ code (which currently lives in `frontends/pytorch`) into a subdirectory here. I call this "earthmoving" because it is mostly mechanical changes and renames. As a quick summary (we can change this down the road easily) - C++ `mlir::NPCOMP::Torch -> mlir::torch::Torch` - CAPI `npcompTorchListTypeGet -> torchMlirTorchListTypeGet` - preprocessor `#ifndef NPCOMP_ -> #ifndef TORCHMLIR_` - CMake `NPCOMPFoo -> TorchMLIRFoo` The goal of this is to create a standalone project creating a center of mass for entry into the MLIR ecosystem from PyTorch, suitable in scope for eventual inclusion/ownership in PyTorch. The idea is that `external/torch-mlir` will some day be pulled out into its own repository, and then npcomp will simply pull it in as a submodule. Layering-wise, what lives in `torch-mlir` lowers code from PyTorch (currently TorchScript, but TorchFX or pytorch/xla-style tracing are possible extensions) down to what we have been calling the "Torch backend contract" which is cleaned up IR (inlining, simplifcation, conversion to value tensors, ...) entirely in the `torch` dialect. This is the branching off point for further lowering, of which npcomp takes one opinion (outside `torch-mlir` of course!), namely the `TorchConversion` dialect/transforms which lower to IR suitable for IREE and other linalg-on-tensors based lower-level compilers. Summary of changes: - move `{include,lib,test}/Dialect/Torch` into `torch-mlir` - move relevant parts of CAPI into `torch-mlir`. - leave a few things related to the `torch-mlir` Python build commented out, which should be resolved in a subsequent change.
2021-09-10 03:24:10 +08:00
#include "torch-mlir/Dialect/Torch/IR/TorchOps.h"
#include "torch-mlir/Dialect/Torch/Transforms/Passes.h"
using namespace mlir;
[torch-mlir earthmoving (1/N)] C/C++ code movement. This creates the `external/torch-mlir` directory as an LLVM_EXTERNAL_PROJECTS-compatible project (analogous to `iree-dialects`) and completes movement/rename of all pure MLIR C/C++ compiler code into there. The next step will be to move all the Python code / code that links/includes PyTorch C++ code (which currently lives in `frontends/pytorch`) into a subdirectory here. I call this "earthmoving" because it is mostly mechanical changes and renames. As a quick summary (we can change this down the road easily) - C++ `mlir::NPCOMP::Torch -> mlir::torch::Torch` - CAPI `npcompTorchListTypeGet -> torchMlirTorchListTypeGet` - preprocessor `#ifndef NPCOMP_ -> #ifndef TORCHMLIR_` - CMake `NPCOMPFoo -> TorchMLIRFoo` The goal of this is to create a standalone project creating a center of mass for entry into the MLIR ecosystem from PyTorch, suitable in scope for eventual inclusion/ownership in PyTorch. The idea is that `external/torch-mlir` will some day be pulled out into its own repository, and then npcomp will simply pull it in as a submodule. Layering-wise, what lives in `torch-mlir` lowers code from PyTorch (currently TorchScript, but TorchFX or pytorch/xla-style tracing are possible extensions) down to what we have been calling the "Torch backend contract" which is cleaned up IR (inlining, simplifcation, conversion to value tensors, ...) entirely in the `torch` dialect. This is the branching off point for further lowering, of which npcomp takes one opinion (outside `torch-mlir` of course!), namely the `TorchConversion` dialect/transforms which lower to IR suitable for IREE and other linalg-on-tensors based lower-level compilers. Summary of changes: - move `{include,lib,test}/Dialect/Torch` into `torch-mlir` - move relevant parts of CAPI into `torch-mlir`. - leave a few things related to the `torch-mlir` Python build commented out, which should be resolved in a subsequent change.
2021-09-10 03:24:10 +08:00
using namespace mlir::torch;
using namespace mlir::torch::Torch;
static bool isViewLikeOp(Operation *op) {
// AtenContiguousOp might return a view, so this is conservatively
// correct. We could potentially be more precise and identify the cases
// that it does not return a view and treat those as having value
// semantics.
return isa<AtenBroadcastToOp, AtenContiguousOp, AtenExpandOp,
AtenFlattenUsingIntsOp, AtenPermuteOp, AtenReshapeOp,
AtenSelectIntOp, AtenSliceTensorOp, AtenSqueezeDimOp,
AtenSqueezeOp, AtenTOp, AtenToDtypeOp, AtenTransposeIntOp,
AtenUnsqueezeOp, AtenViewOp, TensorStaticInfoCastOp>(op);
}
Generalize support for elementwise ops. We plumb through e2e a fair number of interesting cases: - unary, binary, ternary elementwise ops - ops like `torch.aten.add.Tensor` that also take a scalar parameter - static size-1 broadcasting We allow the static size-1 broadcasting case, but emit a runtime error in the case of dynamic size-1 broadcasting. This seems like a sweet spot subset of things that can be lowered directly to linalg, while not being overly constraining to users. This is consistent with what IREE is doing for CHLO->Linalg lowering as well ([code](https://github.com/google/iree/blob/50bf7a87e465d2048c527bc27724edde40519b7e/iree/compiler/InputConversion/MHLO/BroadcastingToLinalgPatterns.cpp#L1)). To test the static size-1 case, we added support for the `torch.aten.unsqueeze` op and lowering for it through `linalg.tensor_expand_shape`. This involved a generalization of `MaximizeValueSemantics` able to handle it (the solution there also works for `torch.aten.flatten.using_ints` which we need for ResNet anyway) Also, a few minor additional changes: - Add `VerifyInvariantsBeforeBackendLowering` pass, which catches a large class of errors before we get to backend lowering (now that we are doing dialect conversion, the errors are way nicer if we just emit them up front rather than in the guts of a random pattern). - Minor change to RefBackend to allow `linalg.tensor_expand_shape`. Recommended review order: - e2e tests in elementwise.py - `ConvertElementwiseOp` in TorchToLinalg.cpp + elementwise.mlir test - `ConvertAtenUnsqueezeOp` in TorchToLinalg.cpp + unsqueeze.mlir test - RefineTypes.cpp + tests - MaximizeValueSemantics changes + test - VerifyInvariantsBeforeBackendLowering pass + test
2021-06-26 08:25:09 +08:00
namespace {
class AbstractlyInterpretCopyToNonValueTensorOpUsersWithinABlock
: public OpRewritePattern<CopyToNonValueTensorOp> {
public:
using OpRewritePattern::OpRewritePattern;
LogicalResult matchAndRewrite(CopyToNonValueTensorOp copy,
PatternRewriter &rewriter) const override {
SmallVector<Operation *> users;
bool foundNonViewLikeOpUser = false;
// See if our limited form of analysis is even applicatble.
for (Operation *user : copy.getResult().getUsers()) {
// We can only analyze within a single basic block.
if (user->getBlock() != copy->getBlock())
return failure();
// We can only analyze these ops or view-like ops.
if (isa<CopyToValueTensorOp, OverwriteTensorContentsOp>(user))
foundNonViewLikeOpUser = true;
else if (!isViewLikeOp(user))
return failure();
users.push_back(user);
}
// If all users found are view-like ops, then there is nothing to do
// here. The `RewriteViewLikeSubgraph` will take care of turning
// these ops into ops with value semantics.
if (!foundNonViewLikeOpUser)
return failure();
// Sort by order in the block, so we can abstractly interpret the ops.
llvm::sort(users, [](Operation *lhs, Operation *rhs) {
return lhs->isBeforeInBlock(rhs);
});
// Do an abstract interpretation within the block.
// We track the current value tensor that holds the same contents as the
// non-value tensor at each program point as we walk forward.
Value currentlyHeldValueTensor = copy.getOperand();
for (Operation *user : users) {
if (auto copyToValueTensor = dyn_cast<CopyToValueTensorOp>(user)) {
rewriter.replaceOp(copyToValueTensor, {currentlyHeldValueTensor});
} else if (auto overwriteTensorContents =
dyn_cast<OverwriteTensorContentsOp>(user)) {
currentlyHeldValueTensor = overwriteTensorContents.value();
rewriter.eraseOp(overwriteTensorContents);
} else if (isViewLikeOp(user)) {
// This case currently only handles view-like ops that have one tensor
// input and one tensor output.
//
// The goal here is to transform view-like ops that depend on an
// overwritten tensor into ops that don't, so that the `overwrite` op
// can be removed.
Location loc = user->getLoc();
Type currentlyHeldValueType = currentlyHeldValueTensor.getType()
.dyn_cast<ValueTensorType>()
.getWithoutValueSemantics();
{
PatternRewriter::InsertionGuard guard(rewriter);
rewriter.setInsertionPoint(user);
Value newInput = rewriter.create<CopyToNonValueTensorOp>(
loc, currentlyHeldValueType, currentlyHeldValueTensor);
user->setOperands(/*start*/0, /*length*/1, {newInput});
}
} else {
llvm_unreachable("only those ops supported!");
}
}
rewriter.eraseOp(copy);
return success();
}
};
Generalize support for elementwise ops. We plumb through e2e a fair number of interesting cases: - unary, binary, ternary elementwise ops - ops like `torch.aten.add.Tensor` that also take a scalar parameter - static size-1 broadcasting We allow the static size-1 broadcasting case, but emit a runtime error in the case of dynamic size-1 broadcasting. This seems like a sweet spot subset of things that can be lowered directly to linalg, while not being overly constraining to users. This is consistent with what IREE is doing for CHLO->Linalg lowering as well ([code](https://github.com/google/iree/blob/50bf7a87e465d2048c527bc27724edde40519b7e/iree/compiler/InputConversion/MHLO/BroadcastingToLinalgPatterns.cpp#L1)). To test the static size-1 case, we added support for the `torch.aten.unsqueeze` op and lowering for it through `linalg.tensor_expand_shape`. This involved a generalization of `MaximizeValueSemantics` able to handle it (the solution there also works for `torch.aten.flatten.using_ints` which we need for ResNet anyway) Also, a few minor additional changes: - Add `VerifyInvariantsBeforeBackendLowering` pass, which catches a large class of errors before we get to backend lowering (now that we are doing dialect conversion, the errors are way nicer if we just emit them up front rather than in the guts of a random pattern). - Minor change to RefBackend to allow `linalg.tensor_expand_shape`. Recommended review order: - e2e tests in elementwise.py - `ConvertElementwiseOp` in TorchToLinalg.cpp + elementwise.mlir test - `ConvertAtenUnsqueezeOp` in TorchToLinalg.cpp + unsqueeze.mlir test - RefineTypes.cpp + tests - MaximizeValueSemantics changes + test - VerifyInvariantsBeforeBackendLowering pass + test
2021-06-26 08:25:09 +08:00
} // namespace
Generalize support for elementwise ops. We plumb through e2e a fair number of interesting cases: - unary, binary, ternary elementwise ops - ops like `torch.aten.add.Tensor` that also take a scalar parameter - static size-1 broadcasting We allow the static size-1 broadcasting case, but emit a runtime error in the case of dynamic size-1 broadcasting. This seems like a sweet spot subset of things that can be lowered directly to linalg, while not being overly constraining to users. This is consistent with what IREE is doing for CHLO->Linalg lowering as well ([code](https://github.com/google/iree/blob/50bf7a87e465d2048c527bc27724edde40519b7e/iree/compiler/InputConversion/MHLO/BroadcastingToLinalgPatterns.cpp#L1)). To test the static size-1 case, we added support for the `torch.aten.unsqueeze` op and lowering for it through `linalg.tensor_expand_shape`. This involved a generalization of `MaximizeValueSemantics` able to handle it (the solution there also works for `torch.aten.flatten.using_ints` which we need for ResNet anyway) Also, a few minor additional changes: - Add `VerifyInvariantsBeforeBackendLowering` pass, which catches a large class of errors before we get to backend lowering (now that we are doing dialect conversion, the errors are way nicer if we just emit them up front rather than in the guts of a random pattern). - Minor change to RefBackend to allow `linalg.tensor_expand_shape`. Recommended review order: - e2e tests in elementwise.py - `ConvertElementwiseOp` in TorchToLinalg.cpp + elementwise.mlir test - `ConvertAtenUnsqueezeOp` in TorchToLinalg.cpp + unsqueeze.mlir test - RefineTypes.cpp + tests - MaximizeValueSemantics changes + test - VerifyInvariantsBeforeBackendLowering pass + test
2021-06-26 08:25:09 +08:00
namespace {
// Calculate a forward slice starting from a CopyToNonValueTensorOp
// and ending at CopyToValueTensorOp's. If all intervening ops
// are just view-like operations (i.e. no mutation), then we can trivially
// convert them all to value semantics.
class RewriteViewLikeSubgraph
: public OpRewritePattern<CopyToNonValueTensorOp> {
public:
using OpRewritePattern::OpRewritePattern;
LogicalResult matchAndRewrite(CopyToNonValueTensorOp copy,
PatternRewriter &rewriter) const override {
Generalize support for elementwise ops. We plumb through e2e a fair number of interesting cases: - unary, binary, ternary elementwise ops - ops like `torch.aten.add.Tensor` that also take a scalar parameter - static size-1 broadcasting We allow the static size-1 broadcasting case, but emit a runtime error in the case of dynamic size-1 broadcasting. This seems like a sweet spot subset of things that can be lowered directly to linalg, while not being overly constraining to users. This is consistent with what IREE is doing for CHLO->Linalg lowering as well ([code](https://github.com/google/iree/blob/50bf7a87e465d2048c527bc27724edde40519b7e/iree/compiler/InputConversion/MHLO/BroadcastingToLinalgPatterns.cpp#L1)). To test the static size-1 case, we added support for the `torch.aten.unsqueeze` op and lowering for it through `linalg.tensor_expand_shape`. This involved a generalization of `MaximizeValueSemantics` able to handle it (the solution there also works for `torch.aten.flatten.using_ints` which we need for ResNet anyway) Also, a few minor additional changes: - Add `VerifyInvariantsBeforeBackendLowering` pass, which catches a large class of errors before we get to backend lowering (now that we are doing dialect conversion, the errors are way nicer if we just emit them up front rather than in the guts of a random pattern). - Minor change to RefBackend to allow `linalg.tensor_expand_shape`. Recommended review order: - e2e tests in elementwise.py - `ConvertElementwiseOp` in TorchToLinalg.cpp + elementwise.mlir test - `ConvertAtenUnsqueezeOp` in TorchToLinalg.cpp + unsqueeze.mlir test - RefineTypes.cpp + tests - MaximizeValueSemantics changes + test - VerifyInvariantsBeforeBackendLowering pass + test
2021-06-26 08:25:09 +08:00
// Find a subgraph starting with this CopyToNonValueTensorOp, and
// terminating at CopyToValueTensorOp's, possibly with intervening view-like
// ops.
// This also catches the special case of a CopyToNonValueTensorOp that
// trivially feeds into CopyToValueTensorOp's.
SmallVector<Operation *> viewLikeOps;
SmallVector<CopyToValueTensorOp> copyToValueTensorOps;
auto workList = llvm::to_vector<6>(copy.getResult().getUsers());
// We currently only support view-like ops with one tensor input and one
// tensor output, meaning that the tensor use-def chains form a tree.
// This will not be the case for an op like `torch.aten.view_as`, so
// we will need to add a set to prune duplicate visitation.
while (!workList.empty()) {
Operation *op = workList.pop_back_val();
if (auto copyToValueTensor = dyn_cast<CopyToValueTensorOp>(op)) {
copyToValueTensorOps.push_back(copyToValueTensor);
} else if (isViewLikeOp(op)) {
Generalize support for elementwise ops. We plumb through e2e a fair number of interesting cases: - unary, binary, ternary elementwise ops - ops like `torch.aten.add.Tensor` that also take a scalar parameter - static size-1 broadcasting We allow the static size-1 broadcasting case, but emit a runtime error in the case of dynamic size-1 broadcasting. This seems like a sweet spot subset of things that can be lowered directly to linalg, while not being overly constraining to users. This is consistent with what IREE is doing for CHLO->Linalg lowering as well ([code](https://github.com/google/iree/blob/50bf7a87e465d2048c527bc27724edde40519b7e/iree/compiler/InputConversion/MHLO/BroadcastingToLinalgPatterns.cpp#L1)). To test the static size-1 case, we added support for the `torch.aten.unsqueeze` op and lowering for it through `linalg.tensor_expand_shape`. This involved a generalization of `MaximizeValueSemantics` able to handle it (the solution there also works for `torch.aten.flatten.using_ints` which we need for ResNet anyway) Also, a few minor additional changes: - Add `VerifyInvariantsBeforeBackendLowering` pass, which catches a large class of errors before we get to backend lowering (now that we are doing dialect conversion, the errors are way nicer if we just emit them up front rather than in the guts of a random pattern). - Minor change to RefBackend to allow `linalg.tensor_expand_shape`. Recommended review order: - e2e tests in elementwise.py - `ConvertElementwiseOp` in TorchToLinalg.cpp + elementwise.mlir test - `ConvertAtenUnsqueezeOp` in TorchToLinalg.cpp + unsqueeze.mlir test - RefineTypes.cpp + tests - MaximizeValueSemantics changes + test - VerifyInvariantsBeforeBackendLowering pass + test
2021-06-26 08:25:09 +08:00
viewLikeOps.push_back(op);
llvm::append_range(workList, op->getResult(0).getUsers());
} else {
return rewriter.notifyMatchFailure(
copy, "can only handle these transitive user ops");
}
}
copy.replaceAllUsesWith(copy.getOperand());
for (CopyToValueTensorOp op : copyToValueTensorOps)
rewriter.replaceOp(op, op.getOperand());
for (Operation *op : viewLikeOps) {
rewriter.updateRootInPlace(op, [&]() {
if (auto nonValueTensorType =
op->getResult(0).getType().dyn_cast<NonValueTensorType>()) {
op->getResult(0).setType(nonValueTensorType.getWithValueSemantics());
}
});
}
return success();
}
};
Generalize support for elementwise ops. We plumb through e2e a fair number of interesting cases: - unary, binary, ternary elementwise ops - ops like `torch.aten.add.Tensor` that also take a scalar parameter - static size-1 broadcasting We allow the static size-1 broadcasting case, but emit a runtime error in the case of dynamic size-1 broadcasting. This seems like a sweet spot subset of things that can be lowered directly to linalg, while not being overly constraining to users. This is consistent with what IREE is doing for CHLO->Linalg lowering as well ([code](https://github.com/google/iree/blob/50bf7a87e465d2048c527bc27724edde40519b7e/iree/compiler/InputConversion/MHLO/BroadcastingToLinalgPatterns.cpp#L1)). To test the static size-1 case, we added support for the `torch.aten.unsqueeze` op and lowering for it through `linalg.tensor_expand_shape`. This involved a generalization of `MaximizeValueSemantics` able to handle it (the solution there also works for `torch.aten.flatten.using_ints` which we need for ResNet anyway) Also, a few minor additional changes: - Add `VerifyInvariantsBeforeBackendLowering` pass, which catches a large class of errors before we get to backend lowering (now that we are doing dialect conversion, the errors are way nicer if we just emit them up front rather than in the guts of a random pattern). - Minor change to RefBackend to allow `linalg.tensor_expand_shape`. Recommended review order: - e2e tests in elementwise.py - `ConvertElementwiseOp` in TorchToLinalg.cpp + elementwise.mlir test - `ConvertAtenUnsqueezeOp` in TorchToLinalg.cpp + unsqueeze.mlir test - RefineTypes.cpp + tests - MaximizeValueSemantics changes + test - VerifyInvariantsBeforeBackendLowering pass + test
2021-06-26 08:25:09 +08:00
} // namespace
namespace {
Introduce `!torch.tensor` / `!torch.vtensor` types. This removes our reliance on the numpy dialect and avoids our off-label use of the builtin tnesor type for modeling unknown dtypes. The `!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor. The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic tensor. The new types look as follows syntactically: ``` // Least-static-information, non-value-semantic tensor. !torch.tensor // Explicit form of least-static-information variant. !torch.tensor<*,unk> // Least-static-information, value-semantic tensor. !torch.vtensor // Explicit form of least-static-information variant. !torch.vtensor<*,unk> // Fixed-set of allowable element types, with first-class support for // Torch's frontend signedness semantics. !torch.tensor<*,si32> // First-class support for unknown dtypes. !torch.tensor<[?,?,?],unk> // Standard MLIR representation of `?` for unknown dimensions. !torch.tensor<[?,2,?,4],unk> // Statically shaped / dtyped example. !torch.vtensor<[1,2,3,4],f32> ``` This required fairly significant changes throughout the compiler, but overall it is a big cleanup. We now have a much clearer layering of "the Torch frontend lowering" vs "lowering to std + linalg + etc.". At the C++ level, there is `ValueTensorType`, `NonValueTensorType`. We also have a helper `BaseTensorType` (kind of like ShapedType) which interoperates with those two. Included changes: - New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for creating torch tensor literals in the frontend. - Consistently use signedness for the types (except i1 which I didn't touch -- we need to sort out the situation with !basicpy.BoolType there anyway so will be attending to that soon) - Frontend can annotate whether an argument to the function has value semantics. We currently require this, as our backend contract does not currently allow us to even model the non-value-semantic case. Before, the value-semantic assumption was randomly injected in the middle of the pass pipeline. - Move ArrayToTensor (now called MaximizeValueSemantics) and RefinePublicReturn passes to torch dialect. - The TorchToStd and TorchToLinalg passes are now type conversions from `!torch.vtensor` to `tensor` and use the dialect conversion infra. The overall conversion pipeline is set up following the best practices of the "Type Conversions the Not-So-Hard Way" talk. This required introducing `torch-func-builtin-tensorize` and `torch-finalizing-builtin-tensorize` passes analogous to the upstream bufferization passes with the corresponding names (mostly just copypasta from there). - Misc Torch-level canonicalizations -- we now cleanly layer the lowering to std later in the pipeline, so we are gradually lessening our reliance on random std constant folding before we get to that point. Recommended review order: - New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp - New ops in TorchOps.td / TorchOps.cpp - Less important / more mechanical stuff - Frontend changes. - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-05-21 08:07:18 +08:00
class MaximizeValueSemanticsPass
: public MaximizeValueSemanticsBase<MaximizeValueSemanticsPass> {
void runOnOperation() override {
MLIRContext *context = &getContext();
auto func = getOperation();
RewritePatternSet patterns(context);
patterns.insert<AbstractlyInterpretCopyToNonValueTensorOpUsersWithinABlock,
Generalize support for elementwise ops. We plumb through e2e a fair number of interesting cases: - unary, binary, ternary elementwise ops - ops like `torch.aten.add.Tensor` that also take a scalar parameter - static size-1 broadcasting We allow the static size-1 broadcasting case, but emit a runtime error in the case of dynamic size-1 broadcasting. This seems like a sweet spot subset of things that can be lowered directly to linalg, while not being overly constraining to users. This is consistent with what IREE is doing for CHLO->Linalg lowering as well ([code](https://github.com/google/iree/blob/50bf7a87e465d2048c527bc27724edde40519b7e/iree/compiler/InputConversion/MHLO/BroadcastingToLinalgPatterns.cpp#L1)). To test the static size-1 case, we added support for the `torch.aten.unsqueeze` op and lowering for it through `linalg.tensor_expand_shape`. This involved a generalization of `MaximizeValueSemantics` able to handle it (the solution there also works for `torch.aten.flatten.using_ints` which we need for ResNet anyway) Also, a few minor additional changes: - Add `VerifyInvariantsBeforeBackendLowering` pass, which catches a large class of errors before we get to backend lowering (now that we are doing dialect conversion, the errors are way nicer if we just emit them up front rather than in the guts of a random pattern). - Minor change to RefBackend to allow `linalg.tensor_expand_shape`. Recommended review order: - e2e tests in elementwise.py - `ConvertElementwiseOp` in TorchToLinalg.cpp + elementwise.mlir test - `ConvertAtenUnsqueezeOp` in TorchToLinalg.cpp + unsqueeze.mlir test - RefineTypes.cpp + tests - MaximizeValueSemantics changes + test - VerifyInvariantsBeforeBackendLowering pass + test
2021-06-26 08:25:09 +08:00
RewriteViewLikeSubgraph>(context);
(void)applyPatternsAndFoldGreedily(func, std::move(patterns));
}
};
} // namespace
std::unique_ptr<OperationPass<FuncOp>>
[torch-mlir earthmoving (1/N)] C/C++ code movement. This creates the `external/torch-mlir` directory as an LLVM_EXTERNAL_PROJECTS-compatible project (analogous to `iree-dialects`) and completes movement/rename of all pure MLIR C/C++ compiler code into there. The next step will be to move all the Python code / code that links/includes PyTorch C++ code (which currently lives in `frontends/pytorch`) into a subdirectory here. I call this "earthmoving" because it is mostly mechanical changes and renames. As a quick summary (we can change this down the road easily) - C++ `mlir::NPCOMP::Torch -> mlir::torch::Torch` - CAPI `npcompTorchListTypeGet -> torchMlirTorchListTypeGet` - preprocessor `#ifndef NPCOMP_ -> #ifndef TORCHMLIR_` - CMake `NPCOMPFoo -> TorchMLIRFoo` The goal of this is to create a standalone project creating a center of mass for entry into the MLIR ecosystem from PyTorch, suitable in scope for eventual inclusion/ownership in PyTorch. The idea is that `external/torch-mlir` will some day be pulled out into its own repository, and then npcomp will simply pull it in as a submodule. Layering-wise, what lives in `torch-mlir` lowers code from PyTorch (currently TorchScript, but TorchFX or pytorch/xla-style tracing are possible extensions) down to what we have been calling the "Torch backend contract" which is cleaned up IR (inlining, simplifcation, conversion to value tensors, ...) entirely in the `torch` dialect. This is the branching off point for further lowering, of which npcomp takes one opinion (outside `torch-mlir` of course!), namely the `TorchConversion` dialect/transforms which lower to IR suitable for IREE and other linalg-on-tensors based lower-level compilers. Summary of changes: - move `{include,lib,test}/Dialect/Torch` into `torch-mlir` - move relevant parts of CAPI into `torch-mlir`. - leave a few things related to the `torch-mlir` Python build commented out, which should be resolved in a subsequent change.
2021-09-10 03:24:10 +08:00
mlir::torch::Torch::createMaximizeValueSemanticsPass() {
Introduce `!torch.tensor` / `!torch.vtensor` types. This removes our reliance on the numpy dialect and avoids our off-label use of the builtin tnesor type for modeling unknown dtypes. The `!torch.vtensor` (`ValueTensorType`) type is a value-semantic tensor. The `!torch.tensor` (`NonValueTensorType`) type is a non-value-semantic tensor. The new types look as follows syntactically: ``` // Least-static-information, non-value-semantic tensor. !torch.tensor // Explicit form of least-static-information variant. !torch.tensor<*,unk> // Least-static-information, value-semantic tensor. !torch.vtensor // Explicit form of least-static-information variant. !torch.vtensor<*,unk> // Fixed-set of allowable element types, with first-class support for // Torch's frontend signedness semantics. !torch.tensor<*,si32> // First-class support for unknown dtypes. !torch.tensor<[?,?,?],unk> // Standard MLIR representation of `?` for unknown dimensions. !torch.tensor<[?,2,?,4],unk> // Statically shaped / dtyped example. !torch.vtensor<[1,2,3,4],f32> ``` This required fairly significant changes throughout the compiler, but overall it is a big cleanup. We now have a much clearer layering of "the Torch frontend lowering" vs "lowering to std + linalg + etc.". At the C++ level, there is `ValueTensorType`, `NonValueTensorType`. We also have a helper `BaseTensorType` (kind of like ShapedType) which interoperates with those two. Included changes: - New `torch.tensor(dense<0.0> : tensor<5xf32>) : !torch.tensor` op for creating torch tensor literals in the frontend. - Consistently use signedness for the types (except i1 which I didn't touch -- we need to sort out the situation with !basicpy.BoolType there anyway so will be attending to that soon) - Frontend can annotate whether an argument to the function has value semantics. We currently require this, as our backend contract does not currently allow us to even model the non-value-semantic case. Before, the value-semantic assumption was randomly injected in the middle of the pass pipeline. - Move ArrayToTensor (now called MaximizeValueSemantics) and RefinePublicReturn passes to torch dialect. - The TorchToStd and TorchToLinalg passes are now type conversions from `!torch.vtensor` to `tensor` and use the dialect conversion infra. The overall conversion pipeline is set up following the best practices of the "Type Conversions the Not-So-Hard Way" talk. This required introducing `torch-func-builtin-tensorize` and `torch-finalizing-builtin-tensorize` passes analogous to the upstream bufferization passes with the corresponding names (mostly just copypasta from there). - Misc Torch-level canonicalizations -- we now cleanly layer the lowering to std later in the pipeline, so we are gradually lessening our reliance on random std constant folding before we get to that point. Recommended review order: - New types in TorchTypes.td/TorchTypes.h/TorchDialect.cpp - New ops in TorchOps.td / TorchOps.cpp - Less important / more mechanical stuff - Frontend changes. - Pass changes/additions in `Torch/Transforms` and `Conversion/`
2021-05-21 08:07:18 +08:00
return std::make_unique<MaximizeValueSemanticsPass>();
}