torch-mlir

Commit Graph

Author	SHA1	Message	Date
Rob Suderman	34f6948533	[torch] Support `!countIncludePad` when unpadded for average pool (#2836 ) We do not support average pool when `countIncludePad` is set to false. However, if the input is unpadded then the setting of the boolean is unneeded. Extended use by checking if padding is zero before rejecting the lowering.	2024-01-31 15:09:36 -08:00
zjgarvey	c531f5495b	AtenAdaptiveMaxPool2d Conversion to Linalg (#2779 ) The logic here is very similar to the conversion for AdaptiveAvgPool1d #2661 with a few modifications: 1. buffVal = -inf instead of 0 2. the main linalg generic op accumulates a max, instead of a sum, to the first output tensor 3. avg pooling requires dividing the sum pool by the kernel width, which we stored as an auxiliary tensor (kSizeTensor). Here, the auxiliary tensor will be recording the indices. Strangely enough, the only signature available for this function is to return indices, and it appears that they must be computed whether the user desires them or not. See [pytorch/torch/nn/functional.py](https://github.com/pytorch/pytorch/blob/main/torch/nn/functional.py#L1174). Before writing other adaptive pooling conversions, the logic of this decomposition should be rolled into a helper function that will work for both max and avg pooling ops. Even the auxiliary tensor should likely be automated. This code was written in a slightly more tedious way than strictly necessary (often using loops to fill SmallVectors up to rank-2, which is only two in this case), in order to more easily facilitate the transition to a helper function.	2024-01-24 09:09:56 -08:00
James Newling	50ac3b1912	g++ build fix (#2778 ) Introduced in `704cfdaf08` of @wu-s-john g++ compiler error: Pooling.cpp:177:13: error: explicit specialization in non-namespace scope ‘class Design looks good, g++ is just freaking out for no good reason. Un-nesting the template classes fixes the error. We don't have g++ CI. This hopefully happens infrequently enough that we can just fix manually. My service to those folks who really like building with g++... :)	2024-01-19 19:12:29 -08:00
John Wu	704cfdaf08	Add aten.pool_max3d support to torch-to-linalg (#2735 ) Added verification logic to the abstract_interpreter_lib_gen.py Also made some unit tests Initially, I thought we can use `linalg::pooling_ndhwc_max` to help implement this problem. However, on a 5-dimensional matrix it does the pooling on dimensions (2, 3, 4) which is not what we want. We want pooling on dimensions (3, 4, 5). To achieve this, we would need to lower our code using the `linalg` dialect. Turns out the pooling code in `linalg` looks like this. ``` func @max_pooling_ncdhw(%I: memref<?x?x?x?x?xf32>, %K: memref<3xindex>, %O: memref<?x?x?x?x?xf32>, %strides: memref<3xindex>, %dilations: memref<3xindex>) { %c0 = arith.constant 0 : index %c1 = arith.constant 1 : index %N = memref.dim %I, %c0 : memref<?x?x?x?x?xf32> %C = memref.dim %I, %c1 : memref<?x?x?x?x?xf32> %D = memref.dim %I, 2 : memref<?x?x?x?x?xf32> %H = memref.dim %I, 3 : memref<?x?x?x?x?xf32> %W = memref.dim %I, 4 : memref<?x?x?x?x?xf32> %kernel_d = memref.load %K[%c0] : memref<3xindex> %kernel_h = memref.load %K[%c1] : memref<3xindex> %kernel_w = memref.load %K[2] : memref<3xindex> %stride_d = memref.load %strides[%c0] : memref<3xindex> %stride_h = memref.load %strides[%c1] : memref<3xindex> %stride_w = memref.load %strides[2] : memref<3xindex> %dilation_d = memref.load %dilations[%c0] : memref<3xindex> %dilation_h = memref.load %dilations[%c1] : memref<3xindex> %dilation_w = memref.load %dilations[2] : memref<3xindex> linalg.generic { indexing_maps = [ affine_map<(n, c, d, h, w, kd, kh, kw) -> (n, c, d * %stride_d + kd * %dilation_d, h * %stride_h + kh * %dilation_h, w * %stride_w + kw * %dilation_w)>, // Map for input tensor affine_map<(n, c, d, h, w, kd, kh, kw) -> (kd, kh, kw)>, // Map for kernel tensor affine_map<(n, c, d, h, w, kd, kh, kw) -> (n, c, d, h, w)> // Map for output tensor ], iterator_types = ["parallel", "parallel", "parallel", "parallel", "parallel", "reduction", "reduction", "reduction"], doc = "3D Max 
Pooling NCDHW with Strides, Dilations, and Kernel Size" } ins(%I, %K : memref<?x?x?x?x?xf32>, memref<3xindex>) outs(%O : memref<?x?x?x?x?xf32>) { ^bb0(%input_elem: f32, %kernel_elem: index, %output_elem: f32): %max_val = arith.maxf %input_elem, %output_elem : f32 linalg.yield %max_val : f32 } return } ``` This was implemented based on its source code with the adjustments mentioned above: `4ca1b5e094/mlir/include/mlir/Dialect/Linalg/IR/LinalgNamedStructuredOps.yaml (L5647)` Issues related to this can be found here https://github.com/nod-ai/SHARK-Turbine/issues/324	2024-01-19 21:09:46 +05:30
zjgarvey	07d0645f64	[RFC] general support for Adaptive Pooling Ops (#2661 ) Adaptive pooling ops can only be decomposed into their non-adaptive counterparts in trivial cases. For example, the current decomposition for AtenAdaptiveAvgPool1dOp in DecomposeComplexOps.cpp supports outSize = inSize (i.e., do literally nothing), and outSize = 1 (i.e., do a batched average). The reason adaptive pooling ops are difficult to lower to linalg is that they are not constantly strided. They are computed by taking an input tensor of shape (N, C, Hin), and an output size Hout, and computing the output tensor at position (n, c, h) in the following way: 1. compute st(h) = (h*Hin)//Hout 2. compute en(h) = 1 + ((h+1)*Hin - 1)//Hout 3. apply a computation (max or avg) to the slice: INPUT[n, c, st(h):en(h)] The provided sample implementation (for ConvertAtenAdaptiveAvgPool1dOp) uses tensor.extract to access the input tensor inside the payload of a linalg generic op. This is likely an unattractive use of linalg generic ops, which is why I am asking for some more targeted feedback on the validity of this approach before attempting to support the many other adaptive pooling ops. Specifically: - Is the performance of this implementation bad enough to warrant targeting different dialects entirely? e.g. TMtensor/linalg ext/ etc. - If the provided implementation is of acceptable performance to the community, then is it permissible to remove the Adaptive pooling decompositions from DecomposeComplexOps.cpp? Based on the current structure of the -torch-decompose-complex-ops pass, it does not seem possible to only decompose the adaptive ops in special cases (it seems to get stuck in an infinite loop on a match failure). I would be happy to instead incorporate the case logic into the conversion directly, and remove the decompositions once they are rendered completely obsolete. 
As long as this approach is acceptable, I can clean up the implementation with some helper functions, and quickly add support for each of the remaining Adaptive pooling ops.	2024-01-09 11:14:10 -08:00
Quinn Dawkins	400752ca8d	[TorchToLinalg] NFC: Move Utils.h to an externally accessible location (#2603 )	2023-12-01 19:38:21 -05:00
Ramiro Leal-Cavazos	41bafe13cc	[build] Update llvm tag to a3f2751f (#2397 ) This commit updates the `llvm-project` and `mlir-hlo` submodules to commits: llvm-project: a3f2751f782f3cdc6ba4790488ec20163a40ac37 mlir-hlo: 97c7e4b4506c3a2441c923e592833f45da439009 Changes made: - Rename `getSuccessorEntryOperands` with `getEntrySuccessorOperands` and remove `operands` from `getSuccessorRegions` (https://reviews.llvm.org/D157506) - Make `TypeConverter` a `const` (https://reviews.llvm.org/D157601)	2023-08-15 09:53:28 -07:00
JianzheXiao	31ef08b63d	[Stablehlo]Add support for AvgPool1dOp (#2268 ) * Add support for AvgPool1d * Update AbstractInterpLibrary * support avgpool1d in linalg * refactored code * fix nit problem	2023-07-25 14:09:53 +08:00
Yuanqiang Liu	ef6dae6ae2	[Linalg] fix lowering reduce max with -inf (#2097 )	2023-05-08 09:17:49 -07:00
Ramiro Leal-Cavazos	c8e062fb4e	Fix default value of `stride` in 2d pooling ops in linalg and tosa (#2065 ) When the user does not specify the `stride` value in 2d pooling ops, `stride` is given the value of an empty list. However, the current lowerings for pooling ops assumed that the `stride` operand would always be a list of two ints, leading to crashes when that was not the case. This commit fixes the crashes by setting the value of `stride` to `kernel_size` when `stride` is the empty list, since this is the default `stride` value specified in PyTorch docs. See: https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html#torch.nn.MaxPool2d	2023-04-27 08:31:36 -07:00
Eric Kunze	6a833e1922	Update to LLVM 3157f03a349cfc852cdd994675eaa9652caa2e3a (#2060 ) New requirement to explicitly cast for interfaces https://reviews.llvm.org/D148493	2023-04-25 08:52:46 -07:00
Ramiro Leal-Cavazos	dd35488da5	build: update llvm tag to 798fa4b4 (#1684 ) - Support for non-prefixed accessors has been removed. See: https://reviews.llvm.org/D136727 - Rename `operands` to `methodOperands` in `prim.CallMethod` since the name `operands` overlaps with a builtin method name. See: https://reviews.llvm.org/D136727 - Add passes in refbackend to lower memref.subview. See: https://reviews.llvm.org/D136377 - Replace `CopyToValueTensorOps` first in `RewriteViewLikeSubgraph` in maximize-value-semantics. The current implementation of the `RewriteViewLikeSubgraph` pass in maximize-value-semantics creates temporarily invalid IR. In particular, given a forward slice starting from a `CopyToNonValueTensorOp` and ending in `CopyToValueTensorOp`s, the pass first replaces all uses of the `CopyToNonValueTensorOp` with its operand, which results in all the `CopyToValueTensorOp` users having their operand have type `!torch.vtensor`, which is invalid. The correct way to do things is to first replace all the `CopyToValueTensorOp`s with their operand, and then replace all uses of the `CopyToNonValueTensorOp` with its operand. This only started failing now because the generated accessor `getOperand` for the `CopyToValueTensorOp` now returns a `TypedValue<NonValueTensorType>`, which has an assert checking that the value returned is of the expected type.	2022-12-07 12:20:41 -08:00
Gaurav Shukla	0d209998d1	llvm: update tag to e864ac6945 (#1600 ) Summary of changes: 1. Replace `string` iterator types by `IteratorType` enum. (`e6598b053d`) 2. Update `includes` wrt new directory layout of MLIR HLO codebase. (`9fd8d251a8`) 3. Update tags llvm: e864ac694540342d5e59f59c525c5082f2594fb8 MHLO: eab364ba2a66bd0613efb94f8a738c1c97aaee92 Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>	2022-11-16 14:40:36 -08:00
Ramiro Leal-Cavazos	09ca07bca0	`m_TorchConstant{Int/Bool}List` -> `m_TorchListOfConstant{Int/Bool}s` (#1601 ) This commit renames the patterns used to match on lists of constant values to `m_TorchListOfConstant{valueType}s`. This is needed to avoid ambiguity for when `valueType` has `Optional` in it. In particular, it makes it clear whether the values in the list are optional or the list itself is optional.	2022-11-16 20:33:12 +00:00
xndcn	759057cbdd	[MLIR][TORCH] Fix wrong parameter name "supportFPInputOnly" The parameter "supportFPInputOnly" of function createPoolingOp() is supposed to be "supportNonFPInput", which was added to distinguish between "MaxPool2d" and "AvgPool2d" op in #718	2022-10-30 23:18:08 +08:00
Ramiro Leal-Cavazos	82a3860e25	build: update llvm tag to 4546397e (#1502 ) This commit makes the following changes needed to bump LLVM: - Replace `linalg.init_tensor` with `tensor.empty` (see: https://reviews.llvm.org/D135129) - Replace `NoSideEffect` with `Pure` (see https://reviews.llvm.org/D135505) - Replace `body` region accessor for `ReduceOp` and `ReduceWindowOp` with `getBody` - Fix incorrect use of `tosa::ReduceSumOp` in `AtenNativeLayerNormOp` conversion pattern. The result type of `tosa::ReduceSumOp` must have the same rank as the input type. (see: https://www.mlplatform.org/tosa/tosa_spec.html#_reduce_sum) Co-authored-by: Ashay Rane <ashay@users.noreply.github.com>	2022-10-18 04:22:53 +00:00
Ashay Rane	faa9a78e38	build: update llvm tag to 6f46ff37 (#1448 ) Summary of changes: - Updated references to the Arith dialect (https://reviews.llvm.org/D134762) - Switched to prefixed accessors for MemRef dialect (https://reviews.llvm.org/D134995) - Fixed warnings about signed/unsigned comparisons, ignored return values, and unused variables	2022-10-05 08:28:06 -05:00
Vivek Khandelwal	6f548fc3ad	[MLIR][TORCH] Add decomposition of aten.adaptive_avg_pool2d op This commit adds the decomposition of `aten.adaptive_avg_pool2d` op into `aten.avg_pool2d` op. The current decomposition only supports cases where input size is equal to the output size. Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>	2022-05-27 07:56:37 +05:30
Vivek Khandelwal	f15d257aac	[MLIR][TORCH] Add support for ceil_mode = true for pooling ops This commit adds support for aten.max_pool2d, aten.max_pool2d_with_indices, and aten.avg_pool2d op for the cases where ceil_mode = true. Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>	2022-05-11 12:52:47 +05:30
Vivek Khandelwal	4b11284440	[MLIR][TORCH] Add E2E support for aten.avg_pool2d op This commit adds lowering of `aten.avg_pool2d` op. Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>	2022-05-02 12:31:44 +05:30
Prashant Kumar	5cdef0213d	[LINALG] Bug fix i64 vs i32 type comparison. Comparing index type instead of integer types solves the problem.	2022-04-22 08:09:58 +05:30
Vivek Khandelwal	769f3a8870	[MLIR][TORCH] Add E2E support for max_pool2d_with_indices op This commit adds lowering of `max_pool2d_with_indices` op. Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>	2022-04-18 21:05:19 +05:30
Sean Silva	5d9222383c	Split up TorchToLinalg.cpp This helps keep things organized and also exposes more parallelism to the build system. It seems that most of the compile time is actually spent in the headers, though, so the wall time doesn't decrease as much as I had hoped (and now that the headers are being included multiple times, the cpu time actually increases a lot, sadly -- will try to dig into this).	2022-03-14 10:19:41 -07:00
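
The window computation described in the adaptive pooling RFC (07d0645f64, #2661) can be illustrated with a small pure-Python sketch. It is not the torch-mlir implementation: the function names `adaptive_window` and `adaptive_pool1d` are hypothetical, and the sketch handles a single 1-D row rather than an (N, C, Hin) tensor. It only shows why the windows are not constantly strided.

```python
def adaptive_window(h, h_in, h_out):
    """Return (start, end) of the input slice that feeds output index h.

    Hypothetical helper following the st(h)/en(h) formulas from the RFC.
    """
    start = (h * h_in) // h_out
    end = 1 + ((h + 1) * h_in - 1) // h_out
    return start, end


def adaptive_pool1d(row, h_out, reduce_fn):
    """Apply reduce_fn (e.g. max or a mean) to each adaptive window of row."""
    h_in = len(row)
    out = []
    for h in range(h_out):
        start, end = adaptive_window(h, h_in, h_out)
        out.append(reduce_fn(row[start:end]))
    return out


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


if __name__ == "__main__":
    # Windows for Hin = 5, Hout = 3 overlap and have different widths.
    print([adaptive_window(h, 5, 3) for h in range(3)])
    print(adaptive_pool1d([1, 2, 3, 4, 5], 3, max))
    print(adaptive_pool1d([1, 2, 3, 4, 5], 3, mean))
```

With Hout = Hin each window has width one, and with Hout = 1 the single window covers the whole row, which matches the two trivial cases the RFC says the existing decomposition supports.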

23 Commits (b3a56c0711fcd49698ebaa73173fc7fcd986cf34)
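
Two of the pooling fixes listed above amount to small argument checks: c8e062fb4e (#2065) falls back to `kernel_size` when `stride` is the empty list, and 34f6948533 (#2836) accepts `countIncludePad = false` when the padding is all zeros. A minimal Python sketch of those two rules follows. The helper names `resolve_stride` and `can_lower_avg_pool` are hypothetical and do not appear in torch-mlir, whose actual lowerings are written in C++.

```python
def resolve_stride(kernel_size, stride):
    """Return the effective stride for a pooling op.

    An empty stride list means "not specified", in which case the stride
    defaults to kernel_size (hypothetical helper, assumed semantics from
    the commit message for #2065).
    """
    return list(stride) if stride else list(kernel_size)


def can_lower_avg_pool(count_include_pad, padding):
    """Return True when the average pool lowering may proceed.

    count_include_pad = False is only acceptable when there is no padding,
    since the flag then has no effect (hypothetical helper, assumed
    semantics from the commit message for #2836).
    """
    return count_include_pad or all(p == 0 for p in padding)


if __name__ == "__main__":
    print(resolve_stride([2, 2], []))
    print(resolve_stride([3, 3], [1, 2]))
    print(can_lower_avg_pool(False, [0, 0]))
    print(can_lower_avg_pool(False, [1, 0]))
```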