torch-mlir/test/Conversion/TorchToLinalg/elementwise.mlir

103 lines
6.4 KiB
MLIR
Raw Normal View History

// RUN: torch-mlir-opt <%s -convert-torch-to-linalg -split-input-file -mlir-print-local-scope -verify-diagnostics | FileCheck %s
Generalize support for elementwise ops. We plumb through e2e a fair number of interesting cases: - unary, binary, ternary elementwise ops - ops like `torch.aten.add.Tensor` that also take a scalar parameter - static size-1 broadcasting We allow the static size-1 broadcasting case, but emit a runtime error in the case of dynamic size-1 broadcasting. This seems like a sweet spot subset of things that can be lowered directly to linalg, while not being overly constraining to users. This is consistent with what IREE is doing for CHLO->Linalg lowering as well ([code](https://github.com/google/iree/blob/50bf7a87e465d2048c527bc27724edde40519b7e/iree/compiler/InputConversion/MHLO/BroadcastingToLinalgPatterns.cpp#L1)). To test the static size-1 case, we added support for the `torch.aten.unsqueeze` op and lowering for it through `linalg.tensor_expand_shape`. This involved a generalization of `MaximizeValueSemantics` able to handle it (the solution there also works for `torch.aten.flatten.using_ints` which we need for ResNet anyway) Also, a few minor additional changes: - Add `VerifyInvariantsBeforeBackendLowering` pass, which catches a large class of errors before we get to backend lowering (now that we are doing dialect conversion, the errors are way nicer if we just emit them up front rather than in the guts of a random pattern). - Minor change to RefBackend to allow `linalg.tensor_expand_shape`. Recommended review order: - e2e tests in elementwise.py - `ConvertElementwiseOp` in TorchToLinalg.cpp + elementwise.mlir test - `ConvertAtenUnsqueezeOp` in TorchToLinalg.cpp + unsqueeze.mlir test - RefineTypes.cpp + tests - MaximizeValueSemantics changes + test - VerifyInvariantsBeforeBackendLowering pass + test
2021-06-26 08:25:09 +08:00
// CHECK-LABEL: func.func @elementwise$unary(
Generalize support for elementwise ops. We plumb through e2e a fair number of interesting cases: - unary, binary, ternary elementwise ops - ops like `torch.aten.add.Tensor` that also take a scalar parameter - static size-1 broadcasting We allow the static size-1 broadcasting case, but emit a runtime error in the case of dynamic size-1 broadcasting. This seems like a sweet spot subset of things that can be lowered directly to linalg, while not being overly constraining to users. This is consistent with what IREE is doing for CHLO->Linalg lowering as well ([code](https://github.com/google/iree/blob/50bf7a87e465d2048c527bc27724edde40519b7e/iree/compiler/InputConversion/MHLO/BroadcastingToLinalgPatterns.cpp#L1)). To test the static size-1 case, we added support for the `torch.aten.unsqueeze` op and lowering for it through `linalg.tensor_expand_shape`. This involved a generalization of `MaximizeValueSemantics` able to handle it (the solution there also works for `torch.aten.flatten.using_ints` which we need for ResNet anyway) Also, a few minor additional changes: - Add `VerifyInvariantsBeforeBackendLowering` pass, which catches a large class of errors before we get to backend lowering (now that we are doing dialect conversion, the errors are way nicer if we just emit them up front rather than in the guts of a random pattern). - Minor change to RefBackend to allow `linalg.tensor_expand_shape`. Recommended review order: - e2e tests in elementwise.py - `ConvertElementwiseOp` in TorchToLinalg.cpp + elementwise.mlir test - `ConvertAtenUnsqueezeOp` in TorchToLinalg.cpp + unsqueeze.mlir test - RefineTypes.cpp + tests - MaximizeValueSemantics changes + test - VerifyInvariantsBeforeBackendLowering pass + test
2021-06-26 08:25:09 +08:00
// CHECK-SAME: %[[ARG:.*]]: !torch.vtensor<[],f32>) -> !torch.vtensor<[],f32> {
// CHECK-DAG: %[[BUILTIN_TENSOR:.*]] = torch_c.to_builtin_tensor %[[ARG]] : !torch.vtensor<[],f32> -> tensor<f32>
[TorchToLinalg] perform rank0 elementwise computations outside linalg generic ops (#3762) This is motivated by the fact that shapes are stored as tensors in ONNX, and IREE tries to perform tensor arithmetic on the device. This causes unnecessary dispatches, and makes it harder for the compiler to reason about shapes. Here is a small snippet of torch-IR that is typical seen coming from ONNX models: ```mlir module { func.func @main_graph(%arg0: !torch.vtensor<[?,?,768],f32>, %arg1: !torch.vtensor<[?,?,768],f32>) -> !torch.vtensor<[],si64> { %int0 = torch.constant.int 0 %0 = torch.vtensor.literal(dense<0> : tensor<1xsi64>) : !torch.vtensor<[1],si64> %1 = torch.aten._shape_as_tensor %arg1 : !torch.vtensor<[?,?,768],f32> -> !torch.vtensor<[3],si64> %2 = torch.aten.index_select %1, %int0, %0 : !torch.vtensor<[3],si64>, !torch.int, !torch.vtensor<[1],si64> -> !torch.vtensor<[1],si64> %3 = torch.aten.squeeze.dim %2, %int0 : !torch.vtensor<[1],si64>, !torch.int -> !torch.vtensor<[],si64> %4 = torch.aten.item %3 : !torch.vtensor<[],si64> -> !torch.int %5 = torch.aten.eq.int %4, %int0 : !torch.int, !torch.int -> !torch.bool %6 = torch.aten.Int.bool %5 : !torch.bool -> !torch.int %7 = torch.aten.size.int %arg0, %int0 : !torch.vtensor<[?,?,768],f32>, !torch.int -> !torch.int %8 = torch.prim.NumToTensor.Scalar %6 : !torch.int -> !torch.vtensor<[],i1> %9 = torch.prim.NumToTensor.Scalar %7 : !torch.int -> !torch.vtensor<[],si64> %10 = torch.prim.NumToTensor.Scalar %4 : !torch.int -> !torch.vtensor<[],si64> %11 = torch.aten.where.self %8, %9, %10 : !torch.vtensor<[],i1>, !torch.vtensor<[],si64>, !torch.vtensor<[],si64> -> !torch.vtensor<[],si64> return %11 : !torch.vtensor<[],si64> } } ``` Without the change in this PR, the result would be: ```mlir #map = affine_map<() -> ()> module { ml_program.global private mutable @global_seed(dense<0> : tensor<i64>) : tensor<i64> func.func @main_graph(%arg0: tensor<?x?x768xf32>, %arg1: tensor<?x?x768xf32>) -> tensor<i64> { %c0_i64 = arith.constant 0 : i64 %c0 = arith.constant 0 : index %dim = tensor.dim %arg1, %c0 : tensor<?x?x768xf32> %0 = arith.index_cast %dim : index to i64 %1 = tensor.empty() : tensor<1xi64> %collapsed = tensor.collapse_shape %1 [] : tensor<1xi64> into tensor<i64> %2 = linalg.fill ins(%0 : i64) outs(%collapsed : tensor<i64>) -> tensor<i64> %extracted = tensor.extract %2[] : tensor<i64> %3 = arith.cmpi eq, %extracted, %c0_i64 : i64 %dim_0 = tensor.dim %arg0, %c0 : tensor<?x?x768xf32> %4 = arith.index_cast %dim_0 : index to i64 %5 = tensor.empty() : tensor<i1> %6 = linalg.fill ins(%3 : i1) outs(%5 : tensor<i1>) -> tensor<i1> %7 = tensor.empty() : tensor<i64> %8 = linalg.fill ins(%4 : i64) outs(%7 : tensor<i64>) -> tensor<i64> %9 = linalg.fill ins(%extracted : i64) outs(%7 : tensor<i64>) -> tensor<i64> %10 = linalg.generic {indexing_maps = [#map, #map, #map, #map], iterator_types = []} ins(%6, %8, %9 : tensor<i1>, tensor<i64>, tensor<i64>) outs(%7 : tensor<i64>) { ^bb0(%in: i1, %in_1: i64, %in_2: i64, %out: i64): %11 = arith.select %in, %in_1, %in_2 : i64 linalg.yield %11 : i64 } -> tensor<i64> return %10 : tensor<i64> } } ``` With the change in this PR, we would instead get: ```mlir module { ml_program.global private mutable @global_seed(dense<0> : tensor<i64>) : tensor<i64> func.func @main_graph(%arg0: tensor<?x?x768xf32>, %arg1: tensor<?x?x768xf32>) -> tensor<i64> { %c0_i64 = arith.constant 0 : i64 %c0 = arith.constant 0 : index %dim = tensor.dim %arg1, %c0 : tensor<?x?x768xf32> %0 = arith.index_cast %dim : index to i64 %1 = tensor.empty() : tensor<1xi64> %collapsed = tensor.collapse_shape %1 [] : tensor<1xi64> into tensor<i64> %2 = linalg.fill ins(%0 : i64) outs(%collapsed : tensor<i64>) -> tensor<i64> %extracted = tensor.extract %2[] : tensor<i64> %3 = arith.cmpi eq, %extracted, %c0_i64 : i64 %dim_0 = tensor.dim %arg0, %c0 : tensor<?x?x768xf32> %4 = arith.index_cast %dim_0 : index to i64 %5 = arith.select %3, %4, %extracted : i64 %6 = tensor.empty() : tensor<i64> %7 = linalg.fill ins(%5 : i64) outs(%6 : tensor<i64>) -> tensor<i64> return %7 : tensor<i64> } } ``` Some related issues for context: 1. <https://github.com/iree-org/iree/issues/18677> 2. <https://github.com/iree-org/iree/issues/18631>
2024-10-05 00:27:00 +08:00
// CHECK: %[[EXTRACT:.*]] = tensor.extract %[[BUILTIN_TENSOR]][] : tensor<f32>
// CHECK: %[[TANH:.*]] = math.tanh %[[EXTRACT]] : f32
// CHECK: %[[EMPTY:.*]] = tensor.empty() : tensor<f32>
// CHECK: %[[FILL:.*]] = linalg.fill ins(%[[TANH]] : f32) outs(%[[EMPTY]] : tensor<f32>) -> tensor<f32>
// CHECK: %[[CASTED:.*]] = tensor.cast %[[FILL:.*]] : tensor<f32> to tensor<f32>
Add TorchToIREE and factor out TorchConversion dialect. This converts a basic list op (torch.prim.ListConstruct) to the IREE dialect. ``` def forward(self, x: float): return [x, x] ``` turns into: ``` builtin.func @forward(%arg0: !torch.float) -> !torch.list<!torch.float> { %0 = torch.prim.ListConstruct %arg0, %arg0 : (!torch.float, !torch.float) -> !torch.list<!torch.float> return %0 : !torch.list<!torch.float> } ``` which turns into: ``` builtin.func @forward(%arg0: f64) -> !iree.list<f64> { %c1 = constant 1 : index %c0 = constant 0 : index %c2 = constant 2 : index %0 = iree.list.create %c2 : !iree.list<f64> iree.list.set %0[%c0], %arg0 : !iree.list<f64>, f64 iree.list.set %0[%c1], %arg0 : !iree.list<f64>, f64 return %0 : !iree.list<f64> } ``` As part of doing this, I realized that it was time to formalize the IR form that we reach right before running TorchTo{Linalg,Std,...}. We now call it the "Torch backend contract". We then lower the "Torch backend contract" to the "npcomp backend contract", which involves the new TorchConversion (`torch_c`) dialect, which holds ops that need to operate on both the npcomp backend types (e.g. builtin tensors, i1, IREE list, etc.) and the `!torch` types. This made more sense, as I realized that if I didn't factor out `torch_c` then the Torch dialect would have a dependency on IREE dialect (we previously didn't notice this was an issue because we only depended on `builtin` types), which seemed wrong to me. Recommended review order: - TorchToIREE.cpp / `TorchToIREE/basic.mlir` - Look at the new structure of createTorchScriptToNpcompBackendPipeline. It now lives in TorchConversion/Transforms/Passes.cpp and cleanly calls into `Torch::createTorchScriptToTorchBackendPipeline` for the frontend lowering to the Torch backend contract. - Mechanical change extracting `torch_c.{to,from}_{i1,i64,f64,builtin_tensor,iree_list}` into a new TorchConversion dialect, and a few passes specific to the lowering from the Torch backend contract to the npcomp backend contract. - Minor fixes to TorchToLinalg.cpp to use unconverted operands (now that we convert lists as part of operand materialization, we need to use the original operands). Also added test for AtenMaxPool2dOp and fixed m_TorchConstantIntList. - TmpDeleteDeadIREELists pass. Temporary pass for deleting dead IREE lists that are created as part of operand materialization for conv/max pool/avg pool ops in TorchToLinalg.
2021-08-12 05:40:08 +08:00
// CHECK: %[[RESULT:.*]] = torch_c.from_builtin_tensor %[[CASTED]] : tensor<f32> -> !torch.vtensor<[],f32>
Generalize support for elementwise ops. We plumb through e2e a fair number of interesting cases: - unary, binary, ternary elementwise ops - ops like `torch.aten.add.Tensor` that also take a scalar parameter - static size-1 broadcasting We allow the static size-1 broadcasting case, but emit a runtime error in the case of dynamic size-1 broadcasting. This seems like a sweet spot subset of things that can be lowered directly to linalg, while not being overly constraining to users. This is consistent with what IREE is doing for CHLO->Linalg lowering as well ([code](https://github.com/google/iree/blob/50bf7a87e465d2048c527bc27724edde40519b7e/iree/compiler/InputConversion/MHLO/BroadcastingToLinalgPatterns.cpp#L1)). To test the static size-1 case, we added support for the `torch.aten.unsqueeze` op and lowering for it through `linalg.tensor_expand_shape`. This involved a generalization of `MaximizeValueSemantics` able to handle it (the solution there also works for `torch.aten.flatten.using_ints` which we need for ResNet anyway) Also, a few minor additional changes: - Add `VerifyInvariantsBeforeBackendLowering` pass, which catches a large class of errors before we get to backend lowering (now that we are doing dialect conversion, the errors are way nicer if we just emit them up front rather than in the guts of a random pattern). - Minor change to RefBackend to allow `linalg.tensor_expand_shape`. Recommended review order: - e2e tests in elementwise.py - `ConvertElementwiseOp` in TorchToLinalg.cpp + elementwise.mlir test - `ConvertAtenUnsqueezeOp` in TorchToLinalg.cpp + unsqueeze.mlir test - RefineTypes.cpp + tests - MaximizeValueSemantics changes + test - VerifyInvariantsBeforeBackendLowering pass + test
2021-06-26 08:25:09 +08:00
// CHECK: return %[[RESULT]] : !torch.vtensor<[],f32>
// CHECK: }
func.func @elementwise$unary(%arg0: !torch.vtensor<[],f32>) -> !torch.vtensor<[],f32> {
Generalize support for elementwise ops. We plumb through e2e a fair number of interesting cases: - unary, binary, ternary elementwise ops - ops like `torch.aten.add.Tensor` that also take a scalar parameter - static size-1 broadcasting We allow the static size-1 broadcasting case, but emit a runtime error in the case of dynamic size-1 broadcasting. This seems like a sweet spot subset of things that can be lowered directly to linalg, while not being overly constraining to users. This is consistent with what IREE is doing for CHLO->Linalg lowering as well ([code](https://github.com/google/iree/blob/50bf7a87e465d2048c527bc27724edde40519b7e/iree/compiler/InputConversion/MHLO/BroadcastingToLinalgPatterns.cpp#L1)). To test the static size-1 case, we added support for the `torch.aten.unsqueeze` op and lowering for it through `linalg.tensor_expand_shape`. This involved a generalization of `MaximizeValueSemantics` able to handle it (the solution there also works for `torch.aten.flatten.using_ints` which we need for ResNet anyway) Also, a few minor additional changes: - Add `VerifyInvariantsBeforeBackendLowering` pass, which catches a large class of errors before we get to backend lowering (now that we are doing dialect conversion, the errors are way nicer if we just emit them up front rather than in the guts of a random pattern). - Minor change to RefBackend to allow `linalg.tensor_expand_shape`. Recommended review order: - e2e tests in elementwise.py - `ConvertElementwiseOp` in TorchToLinalg.cpp + elementwise.mlir test - `ConvertAtenUnsqueezeOp` in TorchToLinalg.cpp + unsqueeze.mlir test - RefineTypes.cpp + tests - MaximizeValueSemantics changes + test - VerifyInvariantsBeforeBackendLowering pass + test
2021-06-26 08:25:09 +08:00
%0 = torch.aten.tanh %arg0 : !torch.vtensor<[],f32> -> !torch.vtensor<[],f32>
return %0 : !torch.vtensor<[],f32>
}
// -----
// CHECK-LABEL: func.func @elementwise$binary(
Generalize support for elementwise ops. We plumb through e2e a fair number of interesting cases: - unary, binary, ternary elementwise ops - ops like `torch.aten.add.Tensor` that also take a scalar parameter - static size-1 broadcasting We allow the static size-1 broadcasting case, but emit a runtime error in the case of dynamic size-1 broadcasting. This seems like a sweet spot subset of things that can be lowered directly to linalg, while not being overly constraining to users. This is consistent with what IREE is doing for CHLO->Linalg lowering as well ([code](https://github.com/google/iree/blob/50bf7a87e465d2048c527bc27724edde40519b7e/iree/compiler/InputConversion/MHLO/BroadcastingToLinalgPatterns.cpp#L1)). To test the static size-1 case, we added support for the `torch.aten.unsqueeze` op and lowering for it through `linalg.tensor_expand_shape`. This involved a generalization of `MaximizeValueSemantics` able to handle it (the solution there also works for `torch.aten.flatten.using_ints` which we need for ResNet anyway) Also, a few minor additional changes: - Add `VerifyInvariantsBeforeBackendLowering` pass, which catches a large class of errors before we get to backend lowering (now that we are doing dialect conversion, the errors are way nicer if we just emit them up front rather than in the guts of a random pattern). - Minor change to RefBackend to allow `linalg.tensor_expand_shape`. Recommended review order: - e2e tests in elementwise.py - `ConvertElementwiseOp` in TorchToLinalg.cpp + elementwise.mlir test - `ConvertAtenUnsqueezeOp` in TorchToLinalg.cpp + unsqueeze.mlir test - RefineTypes.cpp + tests - MaximizeValueSemantics changes + test - VerifyInvariantsBeforeBackendLowering pass + test
2021-06-26 08:25:09 +08:00
// CHECK-SAME: %[[ARG0:.*]]: !torch.vtensor<[?,?],f32>,
// CHECK-SAME: %[[ARG1:.*]]: !torch.vtensor<[?],f32>) -> !torch.vtensor<[?,?],f32> {
// CHECK-DAG: %[[BUILTIN_ARG0:.*]] = torch_c.to_builtin_tensor %[[ARG0]] : !torch.vtensor<[?,?],f32> -> tensor<?x?xf32>
// CHECK-DAG: %[[BUILTIN_ARG1:.*]] = torch_c.to_builtin_tensor %[[ARG1]] : !torch.vtensor<[?],f32> -> tensor<?xf32>
// CHECK: %[[C0:.*]] = arith.constant 0 : index
// CHECK: %[[ARG0_DIM0:.*]] = tensor.dim %[[BUILTIN_ARG0]], %[[C0]] : tensor<?x?xf32>
// CHECK: %[[C1:.*]] = arith.constant 1 : index
// CHECK: %[[ARG0_DIM1:.*]] = tensor.dim %[[BUILTIN_ARG0]], %[[C1]] : tensor<?x?xf32>
// CHECK: %[[C0_2:.*]] = arith.constant 0 : index
// CHECK: %[[ARG1_DIM0:.*]] = tensor.dim %[[BUILTIN_ARG1]], %[[C0_2]] : tensor<?xf32>
// CHECK: %[[LEGAL_SIZES:.*]] = arith.cmpi eq, %[[ARG0_DIM1]], %[[ARG1_DIM0]] : index
Generalize support for elementwise ops. We plumb through e2e a fair number of interesting cases: - unary, binary, ternary elementwise ops - ops like `torch.aten.add.Tensor` that also take a scalar parameter - static size-1 broadcasting We allow the static size-1 broadcasting case, but emit a runtime error in the case of dynamic size-1 broadcasting. This seems like a sweet spot subset of things that can be lowered directly to linalg, while not being overly constraining to users. This is consistent with what IREE is doing for CHLO->Linalg lowering as well ([code](https://github.com/google/iree/blob/50bf7a87e465d2048c527bc27724edde40519b7e/iree/compiler/InputConversion/MHLO/BroadcastingToLinalgPatterns.cpp#L1)). To test the static size-1 case, we added support for the `torch.aten.unsqueeze` op and lowering for it through `linalg.tensor_expand_shape`. This involved a generalization of `MaximizeValueSemantics` able to handle it (the solution there also works for `torch.aten.flatten.using_ints` which we need for ResNet anyway) Also, a few minor additional changes: - Add `VerifyInvariantsBeforeBackendLowering` pass, which catches a large class of errors before we get to backend lowering (now that we are doing dialect conversion, the errors are way nicer if we just emit them up front rather than in the guts of a random pattern). - Minor change to RefBackend to allow `linalg.tensor_expand_shape`. Recommended review order: - e2e tests in elementwise.py - `ConvertElementwiseOp` in TorchToLinalg.cpp + elementwise.mlir test - `ConvertAtenUnsqueezeOp` in TorchToLinalg.cpp + unsqueeze.mlir test - RefineTypes.cpp + tests - MaximizeValueSemantics changes + test - VerifyInvariantsBeforeBackendLowering pass + test
2021-06-26 08:25:09 +08:00
// CHECK: assert %[[LEGAL_SIZES]], "mismatched size for broadcast"
// CHECK: %[[INIT_TENSOR:.*]] = tensor.empty(%[[ARG0_DIM0]], %[[ARG0_DIM1]]) : tensor<?x?xf32>
Generalize support for elementwise ops. We plumb through e2e a fair number of interesting cases: - unary, binary, ternary elementwise ops - ops like `torch.aten.add.Tensor` that also take a scalar parameter - static size-1 broadcasting We allow the static size-1 broadcasting case, but emit a runtime error in the case of dynamic size-1 broadcasting. This seems like a sweet spot subset of things that can be lowered directly to linalg, while not being overly constraining to users. This is consistent with what IREE is doing for CHLO->Linalg lowering as well ([code](https://github.com/google/iree/blob/50bf7a87e465d2048c527bc27724edde40519b7e/iree/compiler/InputConversion/MHLO/BroadcastingToLinalgPatterns.cpp#L1)). To test the static size-1 case, we added support for the `torch.aten.unsqueeze` op and lowering for it through `linalg.tensor_expand_shape`. This involved a generalization of `MaximizeValueSemantics` able to handle it (the solution there also works for `torch.aten.flatten.using_ints` which we need for ResNet anyway) Also, a few minor additional changes: - Add `VerifyInvariantsBeforeBackendLowering` pass, which catches a large class of errors before we get to backend lowering (now that we are doing dialect conversion, the errors are way nicer if we just emit them up front rather than in the guts of a random pattern). - Minor change to RefBackend to allow `linalg.tensor_expand_shape`. Recommended review order: - e2e tests in elementwise.py - `ConvertElementwiseOp` in TorchToLinalg.cpp + elementwise.mlir test - `ConvertAtenUnsqueezeOp` in TorchToLinalg.cpp + unsqueeze.mlir test - RefineTypes.cpp + tests - MaximizeValueSemantics changes + test - VerifyInvariantsBeforeBackendLowering pass + test
2021-06-26 08:25:09 +08:00
// CHECK: %[[GENERIC:.*]] = linalg.generic {indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>, affine_map<(d0, d1) -> (d1)>, affine_map<(d0, d1) -> (d0, d1)>], iterator_types = ["parallel", "parallel"]} ins(%[[BUILTIN_ARG0]], %[[BUILTIN_ARG1]] : tensor<?x?xf32>, tensor<?xf32>) outs(%[[INIT_TENSOR]] : tensor<?x?xf32>) {
// CHECK: ^bb0(%[[LHS:.*]]: f32, %[[RHS:.*]]: f32, %{{.*}}: f32):
// CHECK: %[[MUL:.*]] = arith.mulf %[[LHS]], %[[RHS]] : f32
Generalize support for elementwise ops. We plumb through e2e a fair number of interesting cases: - unary, binary, ternary elementwise ops - ops like `torch.aten.add.Tensor` that also take a scalar parameter - static size-1 broadcasting We allow the static size-1 broadcasting case, but emit a runtime error in the case of dynamic size-1 broadcasting. This seems like a sweet spot subset of things that can be lowered directly to linalg, while not being overly constraining to users. This is consistent with what IREE is doing for CHLO->Linalg lowering as well ([code](https://github.com/google/iree/blob/50bf7a87e465d2048c527bc27724edde40519b7e/iree/compiler/InputConversion/MHLO/BroadcastingToLinalgPatterns.cpp#L1)). To test the static size-1 case, we added support for the `torch.aten.unsqueeze` op and lowering for it through `linalg.tensor_expand_shape`. This involved a generalization of `MaximizeValueSemantics` able to handle it (the solution there also works for `torch.aten.flatten.using_ints` which we need for ResNet anyway) Also, a few minor additional changes: - Add `VerifyInvariantsBeforeBackendLowering` pass, which catches a large class of errors before we get to backend lowering (now that we are doing dialect conversion, the errors are way nicer if we just emit them up front rather than in the guts of a random pattern). - Minor change to RefBackend to allow `linalg.tensor_expand_shape`. Recommended review order: - e2e tests in elementwise.py - `ConvertElementwiseOp` in TorchToLinalg.cpp + elementwise.mlir test - `ConvertAtenUnsqueezeOp` in TorchToLinalg.cpp + unsqueeze.mlir test - RefineTypes.cpp + tests - MaximizeValueSemantics changes + test - VerifyInvariantsBeforeBackendLowering pass + test
2021-06-26 08:25:09 +08:00
// CHECK: linalg.yield %[[MUL]] : f32
// CHECK: } -> tensor<?x?xf32>
// CHECK: %[[CASTED:.*]] = tensor.cast %[[GENERIC:.*]] : tensor<?x?xf32> to tensor<?x?xf32>
Add TorchToIREE and factor out TorchConversion dialect. This converts a basic list op (torch.prim.ListConstruct) to the IREE dialect. ``` def forward(self, x: float): return [x, x] ``` turns into: ``` builtin.func @forward(%arg0: !torch.float) -> !torch.list<!torch.float> { %0 = torch.prim.ListConstruct %arg0, %arg0 : (!torch.float, !torch.float) -> !torch.list<!torch.float> return %0 : !torch.list<!torch.float> } ``` which turns into: ``` builtin.func @forward(%arg0: f64) -> !iree.list<f64> { %c1 = constant 1 : index %c0 = constant 0 : index %c2 = constant 2 : index %0 = iree.list.create %c2 : !iree.list<f64> iree.list.set %0[%c0], %arg0 : !iree.list<f64>, f64 iree.list.set %0[%c1], %arg0 : !iree.list<f64>, f64 return %0 : !iree.list<f64> } ``` As part of doing this, I realized that it was time to formalize the IR form that we reach right before running TorchTo{Linalg,Std,...}. We now call it the "Torch backend contract". We then lower the "Torch backend contract" to the "npcomp backend contract", which involves the new TorchConversion (`torch_c`) dialect, which holds ops that need to operate on both the npcomp backend types (e.g. builtin tensors, i1, IREE list, etc.) and the `!torch` types. This made more sense, as I realized that if I didn't factor out `torch_c` then the Torch dialect would have a dependency on IREE dialect (we previously didn't notice this was an issue because we only depended on `builtin` types), which seemed wrong to me. Recommended review order: - TorchToIREE.cpp / `TorchToIREE/basic.mlir` - Look at the new structure of createTorchScriptToNpcompBackendPipeline. It now lives in TorchConversion/Transforms/Passes.cpp and cleanly calls into `Torch::createTorchScriptToTorchBackendPipeline` for the frontend lowering to the Torch backend contract. - Mechanical change extracting `torch_c.{to,from}_{i1,i64,f64,builtin_tensor,iree_list}` into a new TorchConversion dialect, and a few passes specific to the lowering from the Torch backend contract to the npcomp backend contract. - Minor fixes to TorchToLinalg.cpp to use unconverted operands (now that we convert lists as part of operand materialization, we need to use the original operands). Also added test for AtenMaxPool2dOp and fixed m_TorchConstantIntList. - TmpDeleteDeadIREELists pass. Temporary pass for deleting dead IREE lists that are created as part of operand materialization for conv/max pool/avg pool ops in TorchToLinalg.
2021-08-12 05:40:08 +08:00
// CHECK: %[[RESULT:.*]] = torch_c.from_builtin_tensor %[[CASTED]] : tensor<?x?xf32> -> !torch.vtensor<[?,?],f32>
Generalize support for elementwise ops. We plumb through e2e a fair number of interesting cases: - unary, binary, ternary elementwise ops - ops like `torch.aten.add.Tensor` that also take a scalar parameter - static size-1 broadcasting We allow the static size-1 broadcasting case, but emit a runtime error in the case of dynamic size-1 broadcasting. This seems like a sweet spot subset of things that can be lowered directly to linalg, while not being overly constraining to users. This is consistent with what IREE is doing for CHLO->Linalg lowering as well ([code](https://github.com/google/iree/blob/50bf7a87e465d2048c527bc27724edde40519b7e/iree/compiler/InputConversion/MHLO/BroadcastingToLinalgPatterns.cpp#L1)). To test the static size-1 case, we added support for the `torch.aten.unsqueeze` op and lowering for it through `linalg.tensor_expand_shape`. This involved a generalization of `MaximizeValueSemantics` able to handle it (the solution there also works for `torch.aten.flatten.using_ints` which we need for ResNet anyway) Also, a few minor additional changes: - Add `VerifyInvariantsBeforeBackendLowering` pass, which catches a large class of errors before we get to backend lowering (now that we are doing dialect conversion, the errors are way nicer if we just emit them up front rather than in the guts of a random pattern). - Minor change to RefBackend to allow `linalg.tensor_expand_shape`. Recommended review order: - e2e tests in elementwise.py - `ConvertElementwiseOp` in TorchToLinalg.cpp + elementwise.mlir test - `ConvertAtenUnsqueezeOp` in TorchToLinalg.cpp + unsqueeze.mlir test - RefineTypes.cpp + tests - MaximizeValueSemantics changes + test - VerifyInvariantsBeforeBackendLowering pass + test
2021-06-26 08:25:09 +08:00
// CHECK: return %[[RESULT]] : !torch.vtensor<[?,?],f32>
func.func @elementwise$binary(%arg0: !torch.vtensor<[?,?],f32>, %arg1: !torch.vtensor<[?],f32>) -> !torch.vtensor<[?,?],f32> {
Generalize support for elementwise ops. We plumb through e2e a fair number of interesting cases: - unary, binary, ternary elementwise ops - ops like `torch.aten.add.Tensor` that also take a scalar parameter - static size-1 broadcasting We allow the static size-1 broadcasting case, but emit a runtime error in the case of dynamic size-1 broadcasting. This seems like a sweet spot subset of things that can be lowered directly to linalg, while not being overly constraining to users. This is consistent with what IREE is doing for CHLO->Linalg lowering as well ([code](https://github.com/google/iree/blob/50bf7a87e465d2048c527bc27724edde40519b7e/iree/compiler/InputConversion/MHLO/BroadcastingToLinalgPatterns.cpp#L1)). To test the static size-1 case, we added support for the `torch.aten.unsqueeze` op and lowering for it through `linalg.tensor_expand_shape`. This involved a generalization of `MaximizeValueSemantics` able to handle it (the solution there also works for `torch.aten.flatten.using_ints` which we need for ResNet anyway) Also, a few minor additional changes: - Add `VerifyInvariantsBeforeBackendLowering` pass, which catches a large class of errors before we get to backend lowering (now that we are doing dialect conversion, the errors are way nicer if we just emit them up front rather than in the guts of a random pattern). - Minor change to RefBackend to allow `linalg.tensor_expand_shape`. Recommended review order: - e2e tests in elementwise.py - `ConvertElementwiseOp` in TorchToLinalg.cpp + elementwise.mlir test - `ConvertAtenUnsqueezeOp` in TorchToLinalg.cpp + unsqueeze.mlir test - RefineTypes.cpp + tests - MaximizeValueSemantics changes + test - VerifyInvariantsBeforeBackendLowering pass + test
2021-06-26 08:25:09 +08:00
%0 = torch.aten.mul.Tensor %arg0, %arg1 : !torch.vtensor<[?,?],f32>, !torch.vtensor<[?],f32> -> !torch.vtensor<[?,?],f32>
return %0 : !torch.vtensor<[?,?],f32>
}
// -----
// CHECK-LABEL: func.func @elementwise$ternary(
Generalize support for elementwise ops. We plumb through e2e a fair number of interesting cases: - unary, binary, ternary elementwise ops - ops like `torch.aten.add.Tensor` that also take a scalar parameter - static size-1 broadcasting We allow the static size-1 broadcasting case, but emit a runtime error in the case of dynamic size-1 broadcasting. This seems like a sweet spot subset of things that can be lowered directly to linalg, while not being overly constraining to users. This is consistent with what IREE is doing for CHLO->Linalg lowering as well ([code](https://github.com/google/iree/blob/50bf7a87e465d2048c527bc27724edde40519b7e/iree/compiler/InputConversion/MHLO/BroadcastingToLinalgPatterns.cpp#L1)). To test the static size-1 case, we added support for the `torch.aten.unsqueeze` op and lowering for it through `linalg.tensor_expand_shape`. This involved a generalization of `MaximizeValueSemantics` able to handle it (the solution there also works for `torch.aten.flatten.using_ints` which we need for ResNet anyway) Also, a few minor additional changes: - Add `VerifyInvariantsBeforeBackendLowering` pass, which catches a large class of errors before we get to backend lowering (now that we are doing dialect conversion, the errors are way nicer if we just emit them up front rather than in the guts of a random pattern). - Minor change to RefBackend to allow `linalg.tensor_expand_shape`. Recommended review order: - e2e tests in elementwise.py - `ConvertElementwiseOp` in TorchToLinalg.cpp + elementwise.mlir test - `ConvertAtenUnsqueezeOp` in TorchToLinalg.cpp + unsqueeze.mlir test - RefineTypes.cpp + tests - MaximizeValueSemantics changes + test - VerifyInvariantsBeforeBackendLowering pass + test
2021-06-26 08:25:09 +08:00
// CHECK: linalg.generic {indexing_maps = [
// CHECK-SAME: affine_map<(d0, d1, d2) -> (d0, d1, d2)>,
// CHECK-SAME: affine_map<(d0, d1, d2) -> (d1, d2)>,
// CHECK-SAME: affine_map<(d0, d1, d2) -> (d2)>,
// CHECK-SAME: affine_map<(d0, d1, d2) -> (d0, d1, d2)>]
func.func @elementwise$ternary(%arg0: !torch.vtensor<[?,?,?],f32>, %arg1: !torch.vtensor<[?,?],f32>, %arg2: !torch.vtensor<[?],f32>) -> !torch.vtensor<[?,?,?],f32> {
Generalize support for elementwise ops. We plumb through e2e a fair number of interesting cases: - unary, binary, ternary elementwise ops - ops like `torch.aten.add.Tensor` that also take a scalar parameter - static size-1 broadcasting We allow the static size-1 broadcasting case, but emit a runtime error in the case of dynamic size-1 broadcasting. This seems like a sweet spot subset of things that can be lowered directly to linalg, while not being overly constraining to users. This is consistent with what IREE is doing for CHLO->Linalg lowering as well ([code](https://github.com/google/iree/blob/50bf7a87e465d2048c527bc27724edde40519b7e/iree/compiler/InputConversion/MHLO/BroadcastingToLinalgPatterns.cpp#L1)). To test the static size-1 case, we added support for the `torch.aten.unsqueeze` op and lowering for it through `linalg.tensor_expand_shape`. This involved a generalization of `MaximizeValueSemantics` able to handle it (the solution there also works for `torch.aten.flatten.using_ints` which we need for ResNet anyway) Also, a few minor additional changes: - Add `VerifyInvariantsBeforeBackendLowering` pass, which catches a large class of errors before we get to backend lowering (now that we are doing dialect conversion, the errors are way nicer if we just emit them up front rather than in the guts of a random pattern). - Minor change to RefBackend to allow `linalg.tensor_expand_shape`. Recommended review order: - e2e tests in elementwise.py - `ConvertElementwiseOp` in TorchToLinalg.cpp + elementwise.mlir test - `ConvertAtenUnsqueezeOp` in TorchToLinalg.cpp + unsqueeze.mlir test - RefineTypes.cpp + tests - MaximizeValueSemantics changes + test - VerifyInvariantsBeforeBackendLowering pass + test
2021-06-26 08:25:09 +08:00
%0 = torch.aten.lerp.Tensor %arg0, %arg1, %arg2 : !torch.vtensor<[?,?,?],f32>, !torch.vtensor<[?,?],f32>, !torch.vtensor<[?],f32> -> !torch.vtensor<[?,?,?],f32>
return %0 : !torch.vtensor<[?,?,?],f32>
}
// -----
// CHECK-LABEL: func.func @elementwise$with_scalar_capture(
Generalize support for elementwise ops. We plumb through e2e a fair number of interesting cases: - unary, binary, ternary elementwise ops - ops like `torch.aten.add.Tensor` that also take a scalar parameter - static size-1 broadcasting We allow the static size-1 broadcasting case, but emit a runtime error in the case of dynamic size-1 broadcasting. This seems like a sweet spot subset of things that can be lowered directly to linalg, while not being overly constraining to users. This is consistent with what IREE is doing for CHLO->Linalg lowering as well ([code](https://github.com/google/iree/blob/50bf7a87e465d2048c527bc27724edde40519b7e/iree/compiler/InputConversion/MHLO/BroadcastingToLinalgPatterns.cpp#L1)). To test the static size-1 case, we added support for the `torch.aten.unsqueeze` op and lowering for it through `linalg.tensor_expand_shape`. This involved a generalization of `MaximizeValueSemantics` able to handle it (the solution there also works for `torch.aten.flatten.using_ints` which we need for ResNet anyway) Also, a few minor additional changes: - Add `VerifyInvariantsBeforeBackendLowering` pass, which catches a large class of errors before we get to backend lowering (now that we are doing dialect conversion, the errors are way nicer if we just emit them up front rather than in the guts of a random pattern). - Minor change to RefBackend to allow `linalg.tensor_expand_shape`. Recommended review order: - e2e tests in elementwise.py - `ConvertElementwiseOp` in TorchToLinalg.cpp + elementwise.mlir test - `ConvertAtenUnsqueezeOp` in TorchToLinalg.cpp + unsqueeze.mlir test - RefineTypes.cpp + tests - MaximizeValueSemantics changes + test - VerifyInvariantsBeforeBackendLowering pass + test
2021-06-26 08:25:09 +08:00
// CHECK-SAME: %[[VAL_0:.*]]: !torch.vtensor<[?],f32>,
// CHECK-SAME: %[[VAL_1:.*]]: !torch.vtensor<[],f32>) -> !torch.vtensor<[?],f32> {
// CHECK: %[[C1:.*]] = torch.constant.int 1
TorchToLinalg: Try folding shape computations to keep static shapes when possible (#3475) Before this PR, a statically shaped aten.convolution would generate dynamically shaped linalg IR, and even `-canonicalize` would not be able to fold it back into static shapes. This PR ensure that shape calculations are folded on construction to directly generate statically shaped linalg IR. We achieve that by ensuring that `arith` ops involved in computing shapes are created via `createOrFold`, so that later uses of `getAsOpFoldResult` see constants instead of those ops. For example ``` module { func.func @forward(%arg0: !torch.vtensor<[32,336,112,112],f32>, %arg1: !torch.vtensor<[336,168,3,3],f32>, %arg2: !torch.vtensor<[336],f32>) -> !torch.vtensor<[32,336,56,56],f32> { %false = torch.constant.bool false %int2 = torch.constant.int 2 %int1 = torch.constant.int 1 %0 = torch.prim.ListConstruct %int1, %int1 : (!torch.int, !torch.int) -> !torch.list<int> %1 = torch.prim.ListConstruct %int2, %int2 : (!torch.int, !torch.int) -> !torch.list<int> %2 = torch.prim.ListConstruct : () -> !torch.list<int> %3 = torch.aten.convolution %arg0, %arg1, %arg2, %1, %0, %0, %false, %2, %int2 : !torch.vtensor<[32,336,112,112],f32>, !torch.vtensor<[336,168,3,3],f32>, !torch.vtensor<[336],f32>, !torch.list<int>, !torch.list<int>, !torch.list<int>, !torch.bool, !torch.list<int>, !torch.int -> !torch.vtensor<[32,336,56,56],f32> return %3 : !torch.vtensor<[32,336,56,56],f32> } } ``` would result in ``` [...] %padded = tensor.pad %2 low[%14, %15, %16, %17] high[%14, %15, %16, %17] { ^bb0(%arg3: index, %arg4: index, %arg5: index, %arg6: index): tensor.yield %cst : f32 } : tensor<32x336x112x112xf32> to tensor<?x?x?x?xf32> [...] %45 = linalg.conv_2d_ngchw_gfchw {dilations = dense<1> : vector<2xi64>, strides = dense<2> : vector<2xi64>} ins(%expanded, %expanded_37 : tensor<?x2x?x?x?xf32>, tensor<2x168x168x3x3xf32>) outs(%expanded_44 : tensor<32x2x168x?x?xf32>) -> tensor<32x2x168x?x?xf32> [...] ``` and with this PR all shapes are static.
2024-06-27 14:43:10 +08:00
// CHECK: %[[BUILTIN_C1:.*]] = arith.constant 1 : i64
Generalize support for elementwise ops. We plumb through e2e a fair number of interesting cases: - unary, binary, ternary elementwise ops - ops like `torch.aten.add.Tensor` that also take a scalar parameter - static size-1 broadcasting We allow the static size-1 broadcasting case, but emit a runtime error in the case of dynamic size-1 broadcasting. This seems like a sweet spot subset of things that can be lowered directly to linalg, while not being overly constraining to users. This is consistent with what IREE is doing for CHLO->Linalg lowering as well ([code](https://github.com/google/iree/blob/50bf7a87e465d2048c527bc27724edde40519b7e/iree/compiler/InputConversion/MHLO/BroadcastingToLinalgPatterns.cpp#L1)). To test the static size-1 case, we added support for the `torch.aten.unsqueeze` op and lowering for it through `linalg.tensor_expand_shape`. This involved a generalization of `MaximizeValueSemantics` able to handle it (the solution there also works for `torch.aten.flatten.using_ints` which we need for ResNet anyway) Also, a few minor additional changes: - Add `VerifyInvariantsBeforeBackendLowering` pass, which catches a large class of errors before we get to backend lowering (now that we are doing dialect conversion, the errors are way nicer if we just emit them up front rather than in the guts of a random pattern). - Minor change to RefBackend to allow `linalg.tensor_expand_shape`. Recommended review order: - e2e tests in elementwise.py - `ConvertElementwiseOp` in TorchToLinalg.cpp + elementwise.mlir test - `ConvertAtenUnsqueezeOp` in TorchToLinalg.cpp + unsqueeze.mlir test - RefineTypes.cpp + tests - MaximizeValueSemantics changes + test - VerifyInvariantsBeforeBackendLowering pass + test
2021-06-26 08:25:09 +08:00
// CHECK: linalg.generic {indexing_maps = [affine_map<(d0) -> (d0)>, affine_map<(d0) -> ()>, affine_map<(d0) -> (d0)>]
// CHECK: ^bb0(%[[LHS:.*]]: f32, %[[RHS:.*]]: f32, %{{.*}}: f32):
// CHECK: %[[ALPHA:.*]] = arith.sitofp %[[BUILTIN_C1]] : i64 to f32
// CHECK: %[[SCALED:.*]] = arith.mulf %[[RHS]], %[[ALPHA]] : f32
// CHECK: %[[RES:.*]] = arith.addf %[[LHS]], %[[SCALED]] : f32
Generalize support for elementwise ops. We plumb through e2e a fair number of interesting cases: - unary, binary, ternary elementwise ops - ops like `torch.aten.add.Tensor` that also take a scalar parameter - static size-1 broadcasting We allow the static size-1 broadcasting case, but emit a runtime error in the case of dynamic size-1 broadcasting. This seems like a sweet spot subset of things that can be lowered directly to linalg, while not being overly constraining to users. This is consistent with what IREE is doing for CHLO->Linalg lowering as well ([code](https://github.com/google/iree/blob/50bf7a87e465d2048c527bc27724edde40519b7e/iree/compiler/InputConversion/MHLO/BroadcastingToLinalgPatterns.cpp#L1)). To test the static size-1 case, we added support for the `torch.aten.unsqueeze` op and lowering for it through `linalg.tensor_expand_shape`. This involved a generalization of `MaximizeValueSemantics` able to handle it (the solution there also works for `torch.aten.flatten.using_ints` which we need for ResNet anyway) Also, a few minor additional changes: - Add `VerifyInvariantsBeforeBackendLowering` pass, which catches a large class of errors before we get to backend lowering (now that we are doing dialect conversion, the errors are way nicer if we just emit them up front rather than in the guts of a random pattern). - Minor change to RefBackend to allow `linalg.tensor_expand_shape`. Recommended review order: - e2e tests in elementwise.py - `ConvertElementwiseOp` in TorchToLinalg.cpp + elementwise.mlir test - `ConvertAtenUnsqueezeOp` in TorchToLinalg.cpp + unsqueeze.mlir test - RefineTypes.cpp + tests - MaximizeValueSemantics changes + test - VerifyInvariantsBeforeBackendLowering pass + test
2021-06-26 08:25:09 +08:00
// CHECK: linalg.yield %[[RES]] : f32
// CHECK: } -> tensor<?xf32>
func.func @elementwise$with_scalar_capture(%arg0: !torch.vtensor<[?],f32>, %arg1: !torch.vtensor<[],f32>) -> !torch.vtensor<[?],f32> {
Generalize support for elementwise ops. We plumb through e2e a fair number of interesting cases: - unary, binary, ternary elementwise ops - ops like `torch.aten.add.Tensor` that also take a scalar parameter - static size-1 broadcasting We allow the static size-1 broadcasting case, but emit a runtime error in the case of dynamic size-1 broadcasting. This seems like a sweet spot subset of things that can be lowered directly to linalg, while not being overly constraining to users. This is consistent with what IREE is doing for CHLO->Linalg lowering as well ([code](https://github.com/google/iree/blob/50bf7a87e465d2048c527bc27724edde40519b7e/iree/compiler/InputConversion/MHLO/BroadcastingToLinalgPatterns.cpp#L1)). To test the static size-1 case, we added support for the `torch.aten.unsqueeze` op and lowering for it through `linalg.tensor_expand_shape`. This involved a generalization of `MaximizeValueSemantics` able to handle it (the solution there also works for `torch.aten.flatten.using_ints` which we need for ResNet anyway) Also, a few minor additional changes: - Add `VerifyInvariantsBeforeBackendLowering` pass, which catches a large class of errors before we get to backend lowering (now that we are doing dialect conversion, the errors are way nicer if we just emit them up front rather than in the guts of a random pattern). - Minor change to RefBackend to allow `linalg.tensor_expand_shape`. Recommended review order: - e2e tests in elementwise.py - `ConvertElementwiseOp` in TorchToLinalg.cpp + elementwise.mlir test - `ConvertAtenUnsqueezeOp` in TorchToLinalg.cpp + unsqueeze.mlir test - RefineTypes.cpp + tests - MaximizeValueSemantics changes + test - VerifyInvariantsBeforeBackendLowering pass + test
2021-06-26 08:25:09 +08:00
%int1 = torch.constant.int 1
%0 = torch.aten.add.Tensor %arg0, %arg1, %int1 : !torch.vtensor<[?],f32>, !torch.vtensor<[],f32>, !torch.int -> !torch.vtensor<[?],f32>
return %0 : !torch.vtensor<[?],f32>
}
// -----
// CHECK-LABEL: func.func @elementwise$static_1(
Generalize support for elementwise ops. We plumb through e2e a fair number of interesting cases: - unary, binary, ternary elementwise ops - ops like `torch.aten.add.Tensor` that also take a scalar parameter - static size-1 broadcasting We allow the static size-1 broadcasting case, but emit a runtime error in the case of dynamic size-1 broadcasting. This seems like a sweet spot subset of things that can be lowered directly to linalg, while not being overly constraining to users. This is consistent with what IREE is doing for CHLO->Linalg lowering as well ([code](https://github.com/google/iree/blob/50bf7a87e465d2048c527bc27724edde40519b7e/iree/compiler/InputConversion/MHLO/BroadcastingToLinalgPatterns.cpp#L1)). To test the static size-1 case, we added support for the `torch.aten.unsqueeze` op and lowering for it through `linalg.tensor_expand_shape`. This involved a generalization of `MaximizeValueSemantics` able to handle it (the solution there also works for `torch.aten.flatten.using_ints` which we need for ResNet anyway) Also, a few minor additional changes: - Add `VerifyInvariantsBeforeBackendLowering` pass, which catches a large class of errors before we get to backend lowering (now that we are doing dialect conversion, the errors are way nicer if we just emit them up front rather than in the guts of a random pattern). - Minor change to RefBackend to allow `linalg.tensor_expand_shape`. Recommended review order: - e2e tests in elementwise.py - `ConvertElementwiseOp` in TorchToLinalg.cpp + elementwise.mlir test - `ConvertAtenUnsqueezeOp` in TorchToLinalg.cpp + unsqueeze.mlir test - RefineTypes.cpp + tests - MaximizeValueSemantics changes + test - VerifyInvariantsBeforeBackendLowering pass + test
2021-06-26 08:25:09 +08:00
// CHECK: linalg.generic {indexing_maps = [
// CHECK-SAME: affine_map<(d0) -> (d0)>,
// CHECK-SAME: affine_map<(d0) -> (0)>,
// CHECK-SAME: affine_map<(d0) -> (d0)>]
func.func @elementwise$static_1(%arg0: !torch.vtensor<[?],f32>, %arg1: !torch.vtensor<[1],f32>) -> !torch.vtensor<[?],f32> {
Generalize support for elementwise ops. We plumb through e2e a fair number of interesting cases: - unary, binary, ternary elementwise ops - ops like `torch.aten.add.Tensor` that also take a scalar parameter - static size-1 broadcasting We allow the static size-1 broadcasting case, but emit a runtime error in the case of dynamic size-1 broadcasting. This seems like a sweet spot subset of things that can be lowered directly to linalg, while not being overly constraining to users. This is consistent with what IREE is doing for CHLO->Linalg lowering as well ([code](https://github.com/google/iree/blob/50bf7a87e465d2048c527bc27724edde40519b7e/iree/compiler/InputConversion/MHLO/BroadcastingToLinalgPatterns.cpp#L1)). To test the static size-1 case, we added support for the `torch.aten.unsqueeze` op and lowering for it through `linalg.tensor_expand_shape`. This involved a generalization of `MaximizeValueSemantics` able to handle it (the solution there also works for `torch.aten.flatten.using_ints` which we need for ResNet anyway) Also, a few minor additional changes: - Add `VerifyInvariantsBeforeBackendLowering` pass, which catches a large class of errors before we get to backend lowering (now that we are doing dialect conversion, the errors are way nicer if we just emit them up front rather than in the guts of a random pattern). - Minor change to RefBackend to allow `linalg.tensor_expand_shape`. Recommended review order: - e2e tests in elementwise.py - `ConvertElementwiseOp` in TorchToLinalg.cpp + elementwise.mlir test - `ConvertAtenUnsqueezeOp` in TorchToLinalg.cpp + unsqueeze.mlir test - RefineTypes.cpp + tests - MaximizeValueSemantics changes + test - VerifyInvariantsBeforeBackendLowering pass + test
2021-06-26 08:25:09 +08:00
%1 = torch.aten.mul.Tensor %arg0, %arg1 : !torch.vtensor<[?],f32>, !torch.vtensor<[1],f32> -> !torch.vtensor<[?],f32>
return %1 : !torch.vtensor<[?],f32>
}
// -----
// CHECK-LABEL: func.func @elementwise_sinh
// CHECK: linalg.generic
// CHECK: math.sinh
func.func @elementwise_sinh(%arg0: !torch.vtensor<[3],f32>) -> !torch.vtensor<[3],f32> {
%0 = torch.aten.sinh %arg0 : !torch.vtensor<[3],f32> -> !torch.vtensor<[3],f32>
return %0 : !torch.vtensor<[3],f32>
}