Commit Graph

1360 Commits (234b2f2bd4f657da300163dbd01757fcbc5d04d7)
 

Author SHA1 Message Date
武家伟 7bd173a1c4
[MHLO] Eliminate explicit dynamic output shape generating in converting AtenSliceTensorOp (#1245)
[MHLO] Eliminate explicit dynamic output shape generating in converting AtenSliceTensorOp
2022-08-19 10:14:57 +08:00
powderluv 0d1aa43764
Drop Python 3.7x from the nightly binary builds (#1246) 2022-08-18 16:34:12 -07:00
Sean Silva f601435fdf Add white background to diagram.
This makes it easier to read in dark mode browsers.
2022-08-18 15:53:43 -07:00
Ramiro Leal-Cavazos 9bc606c384
Add support for returning more than one copy of the same tensor (#1228)
One of the simplifications made by the pass `RefinePublicReturn`
currently only happens if the tensor in question only has one
user. However, the current method of checking this does not correctly
handle the case of a user having multiple uses of the same
tensor. This commit makes sure only unique users are considered.
2022-08-18 22:41:45 +00:00
Sean Silva 1a7fc3915c [docs] Add architecture doc.
This attempts to get out of my head most of the critical layering and
project structure decisions for Torch-MLIR.
2022-08-18 13:29:49 -07:00
Sean Silva 283e0f141a Add a concept of "backend legal ops".
This is a first step towards formalizing the set of ops in our backend
contract. The goal is to eventually formalize `torch` dialect ops into 3
categories:
1. Legal in backend contract
2. Illegal in backend contract
3. Conditionally legal in backend contract

The "conditionally legal" set are the ops that we can optionally
decompose for backends.

This patch adds relevant pass options for this throughout the compiler,
in preparation for a new set of traits which will formalize this
classification.
2022-08-18 11:46:50 -07:00
Ramiro Leal-Cavazos f07f7d20f9
Clean up shape functions that use `sum_mean_dim` (#1217)
I recently fixed the handling of the `dim` argument in
`sum_mean_dim` (59fccab857). Therefore,
the checks that the `dim` input is `None` or `[]` are no longer needed.
2022-08-18 08:23:43 -07:00
Sambhav Jain 7d4a0d0e2b
[Bazel] Add LowerToBackendContract.cpp to TorchMLIRTorchPasses bazel target (#1243)
Pass is introduced in [this commit](57681f7947). Including it to the bazel targets to get a green build.
2022-08-17 18:15:23 -07:00
Sambhav Jain 114f48e96c
[Bazel] Check cache directory exists before changing owners (#1241)
This fixes a seeding issue with the [previous PR](https://github.com/llvm/torch-mlir/pull/1240) where bazel build's GHA cache is not present to begin with and one of the commands (chown) fails on it. Should get the Bazel build back to green.
2022-08-17 17:04:50 -07:00
Sean Silva 57681f7947 Iteratively run the main simplification pipeline.
This introduces a new pass LowerToBackendContract (better name very
welcome) which performs the bulk of the simplifications that we do,
such as
- shape refinement
- dtype refinement
- maximizing value semantics
- inlining global slots
- decomposing complex ops

The key difference from before is that it iterates the set of
transformations, which can help to break a number of "catch-22" issues
where one simplification depends on another, the latest example being
here:
https://github.com/llvm/torch-mlir/issues/1131

This also exposed that RefineTypes was sometimes crashing/asserting for
certain inputs. This commit hardens it a bit.
2022-08-17 14:54:33 -07:00
Sambhav Jain 9c8b962720
Dockerize and Cache Bazel {Local, CI} Builds (#1240)
This PR adds:

- A minimal docker wrapper to the bazel GHA workflow to make it reproducible locally
- Bazel cache to speed up GHA workflows (down to ~5 minutes from ~40+minutes)

This is a no-op for non-bazel workflows and an incremental improvement.
2022-08-17 12:46:17 -07:00
Yan Xu 9be8997536
Revert "add native_dropout and related ops pattern (#1211)" (#1230)
This reverts commit c935795086.
2022-08-17 13:48:10 +08:00
Quinn Dawkins 85f383ce0b
Bump the shape lib to match the upstream functions currently in PyTorch (#1236)
Bumps the shape library:
 - Updates the function signature for aten.arange.start_step
 - upstream_shape_functions.mean_dim -> upstream_shape_functions.sum_mean_dim
2022-08-17 00:11:04 -04:00
武家伟 11a5b5ac52
[MHLO] Add AtenRSubScalarOp conversion pattern to MHLO (#1233)
* [MHLO] Add AtenRSubScalarOp conversion pattern
Co-authored-by: Bairen Yi <yibairen.byron@bytedance.com>
Co-authored-by: Jiawei Wu <xremold@gmail.com>
Co-authored-by: Tianyou Guo <tianyou.gty@alibaba-inc.com>
Co-authored-by: Xu Yan <yancey.yx@alibaba-inc.com>
Co-authored-by: Ziheng Jiang <ziheng.jiang@bytedance.com>
2022-08-17 09:07:36 +08:00
nithinsubbiah fde390c766 Re-enable custom op support 2022-08-16 22:49:08 +05:30
Jae Hoon (Antonio) Kim 0af55781ae
Propagate device data names (#1157)
* Propagate device data names

* Address PR comment

* Add example usage

* Add test for device data names

* Make TorchMlirComputation fields protected

* Add lazy backend device data name unit tests

* Disable lazy backend tests if LTC is disabled

* Add comments
2022-08-16 09:30:22 -04:00
Ashay Rane 84d345c650
build: update llvm tag to 2dde4ba6 (#1229)
Summary of changes:
 - Tensor dialect now sets `emitAccessorPrefix` to prefixed, thus
   requring updates to methods that retrieve arguments
   [https://reviews.llvm.org/D131361]
 - Update MHLO to build with LLVM commit hash 2dde4ba6
 - Replace `AbsOp` with `AbsFOp` [https://reviews.llvm.org/D131325]
 - Replace deprecated `getValue()` with `value()`
   [https://reviews.llvm.org/D131349]
 - Remove `AnalysisState::defaultInitialize()`
   [https://reviews.llvm.org/D131746]
 - Update MHLO MLIR tests to use the updated assembly format
 - Disabled two failing TOSA tests (Github Issue link:
   https://github.com/llvm/torch-mlir/issues/1231)
2022-08-15 23:54:45 -07:00
武家伟 3b3cb99ef8
Generalize canonicalization pattern for more aten.sub/div/mul/add op (#1209)
Generalize canonicalization pattern for more sub/div/mul/add op, but for AtenDivTensorModeOp in 'trunc' rounding mode, we try to fold it.
2022-08-16 13:24:08 +08:00
Ramiro Leal-Cavazos 9d6ee48661
Fix unused-variables warnings about EmbeddingBag ops (#1220)
According to the documentation for
`torch.embedding_bag` (https://pytorch.org/docs/stable/generated/torch.nn.functional.embedding_bag.html),
the default value for `scale_grad_by_freq` is False.
2022-08-15 09:43:55 -07:00
Yan Xu c935795086
add native_dropout and related ops pattern (#1211) 2022-08-15 09:28:47 +08:00
Sambhav Jain 41aa562fb4
s/external/externals/g (#1222)
Fix remaining instances of `external/llvm-project`.
2022-08-13 07:13:56 -07:00
Ashay Rane 606f4d2c0e
build: streamline options for enabling LTC and MHLO (#1221) 2022-08-12 23:49:28 -07:00
Sambhav Jain 34478ab1c7
[Build] Add concurrency groups to address long queue times (#1219)
We're seeing large CI queue times ([example](https://discord.com/channels/636084430946959380/742573221882364009/1007631811184164944)) especially with MacOS VMs on GHA. Part of the problem is follow-on commits to the same branch which trigger new runs while the previous runs are still in-progress, hogging on the scarce VMs.

This PR adds concurrency groups to the GHA workflow which ensures that only a single job or workflow using the same concurrency group will run at a time. This would cancel any in-progress jobs in the same github workflow and github ref (e.g. `refs/heads/main` or `refs/pull/<pr_number>/merge`).

As discussed on discord [thread](https://discord.com/channels/636084430946959380/1007787336848912386/1007787338895740928), once this lands we may have to closely monitor the workflows to see this didn't introduce unintended consequences. If so, we could either revert, or decide to selectively cancel particular runs (e.g. macos only which is the main bottleneck right now) instead of entire workflow.

This will also require some expectation management. As in, if you see an  on the main branch, it may not necessarily mean things broke, it could mean the run was killed by a more recent run. Making it a bit harder to traceback a failure to a commit in a sequence of commits (requiring to run those builds again).

Thanks @powderluv for the proposal and pointer to this! It should help with the scarce VMs on GHA and save on queue time. 

References:
* https://docs.github.com/en/actions/using-jobs/using-concurrency#example-only-cancel-in-progress-jobs-or-runs-for-the-current-workflow
* https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#example-only-cancel-in-progress-jobs-or-runs-for-the-current-workflow
2022-08-12 17:38:48 -07:00
Ashay Rane 1581d6a84c
build: fix typo in path (#1218)
When we renamed the directory containing submodules from `external` to
`externals`, we accidentally left the original name in the Github
workflow.  This patch fixes the problem.
2022-08-12 15:38:25 -07:00
Sambhav Jain aed0ec3a2c
Merge matrix runs to fail fast globally (#1216)
My earlier[ PR](https://github.com/llvm/torch-mlir/pull/1213) had (among other things) decoupled ubuntu and macos builds into separate matrix runs. This is not working well due to limited number of MacOS GHA VMs causing long queue times and backlog. There are two reasons causing this backlog: 

1. macos arm64 builds with pytorch source are getting erratically cancelled due to resource / network constraints. This is addressed with this: https://github.com/llvm/torch-mlir/pull/1215

> "macos-arm64 (in-tree, OFF) The hosted runner: GitHub Actions 3 lost communication with the server. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error."

2. macos runs don't fail-fast when ubuntu runs fail due to being in separate matrix setups. This PR couples them again.
2022-08-12 11:30:09 -07:00
Sambhav Jain b8bd0a46cc
use pytorch binary for macos-arm64 builds (#1215) 2022-08-12 06:33:57 -07:00
Sambhav Jain f00ca91db0
Simplify matrix configuration for CI workflows (#1213)
Addresses https://github.com/llvm/torch-mlir/issues/1207. 

#### Provisioned jobs:
```
# ubuntu - x86_64 - llvm in-tree     - pytorch binary - build+test    # most used dev flow and fastest signal
# ubuntu - x86_64 - llvm out-of-tree - pytorch source - build+test    # most elaborate build
# macos  - arm64  - llvm in-tree     - pytorch source - build only    # cross compile, can't test arm64
```

#### Main changes
- [x] Spawn macos builds from a separate matrix (in the same workflow). It made sense to do this as they are fairly different from ubuntu (cross compile, use a different cmake configuration). This simplifies the matrix configuration and exclusions quite a bit, and makes the workflow a bit more tractable and maintenance friendly.
- [x] Remove the submodule md5sum step for ccache config. This was [broken](https://github.com/llvm/torch-mlir/runs/7779288734?check_suite_focus=true#step:3:145) for a while now.
- [x] Removes unused matrix options - `os`, `targetarch`, `python-version`, `llvmtype`.
- [x] Address ZSTD [comment](https://github.com/llvm/torch-mlir/pull/1204#discussion_r942349282) on @powderluv's cross compile [PR](https://github.com/llvm/torch-mlir/pull/1204). 

#### Further improvements (to be addressed in follow-on):
* ubuntu-x86_64 out-of-tree integration tests fail ([error](https://github.com/sjain-stanford/torch-mlir/runs/7781264029?check_suite_focus=true)); only run unit tests for now (tests are excluded in current CI too)

#### Passing workflow:
https://github.com/sjain-stanford/torch-mlir/actions/runs/2840676309
![image](https://user-images.githubusercontent.com/19234106/184194535-f3807991-401a-4cb9-b030-0ee8c334eba3.png)
2022-08-11 16:35:15 -07:00
Renato Golin 51bfe25c89
Add PyYaml to requirements.txt (#1174)
Building on a fresh environment + virtualenv + in-tree build errors out
becayse PyYaml isn't installed. Adding to requirements.txt fixes that.

Fixes #1173
2022-08-11 17:59:39 +01:00
Prashant Kumar b1a506624c Add decomposition of `aten.masked.tensor` op.
`aten.masked.tensor` op has been decomposed to `aten.masked.scalar` op.
2022-08-11 07:48:04 +05:30
Yan Xu d96ec64be1
remove torch dialect from legal list (#1192) 2022-08-11 09:22:41 +08:00
Vidush Singhal dd2da5a038
E2E support for AtenRemainderScalarOp (#1200) 2022-08-10 20:02:06 -04:00
gpetters94 79b9cf9468
Add lowering for aten.to.device (#1107) 2022-08-10 19:24:02 -04:00
Ramana Radhakrishnan b8d51a74d9
Update TorchToStd to TorchtoArith in bazel files too. (#1210)
The CI didn't catch the missing rename of TorchToArith until the
merge had happened. This is following up from #1163
2022-08-10 14:51:13 -07:00
Renato Golin 71c240a6fa
Add note about MLIR compiled outputs in dev docs (#1195)
Adding an example on how to extract MLIR output from the compilation
process in various different formats to the development documentation.

This should help developers trying to either debug torch_mlir or use it
for the purpose of extracting MLIR outputs for other testing.

Fixes #1175
2022-08-10 20:14:41 +01:00
Ramana Radhakrishnan 738f4fe96a
Rename TorchToStd pass as TorchToArith (#1163)
All the converters in this pass appear to create ops from the
arith dialect. Hence the full rename.

Fix GH Issue #409.
2022-08-10 20:12:51 +01:00
powderluv 2342456356
mac m1 cross compile (#1204)
* mac m1 cross compile

Add support for M1 cross compile

* Remove redundant ExecutionEngine

It is registered as part of RegisterEverything

* nuke non-universal zstd

disable LTC
2022-08-10 08:48:39 -07:00
武家伟 87562773f8
[MHLO] Add AtenCatOp conversion pattern to MHLO (#1208)
Co-authored-by: Bairen Yi <yibairen.byron@bytedance.com>
Co-authored-by: Jiawei Wu <xremold@gmail.com>
Co-authored-by: Tianyou Guo <tianyou.gty@alibaba-inc.com>
Co-authored-by: Xu Yan <yancey.yx@alibaba-inc.com>
Co-authored-by: Ziheng Jiang <ziheng.jiang@bytedance.com>
Co-authored-by: Vremold <xremold@gamil.com>
2022-08-09 22:12:34 -07:00
powderluv 9cf0b6e8ff
Disable out-of-tree and PyTorch binary (#1206) 2022-08-09 18:18:12 -07:00
Sambhav Jain 072c2b5aaf
[Bazel] Add EraseModuleInitializer to TorchMLIRTorchPasses library (#1202)
The torch-mlir bazel build is [failing](https://github.com/llvm/torch-mlir/runs/7737425906?check_suite_focus=true) since [this commit](504de5e701) due to a linker failure (undefined symbol: `mlir::torch::Torch::createEraseModuleInitializerPass()`).

```
ERROR: /home/runner/.cache/bazel/_bazel_runner/db599744cd37f8c161e5034d9b9cd520/external/torch-mlir/BUILD.bazel:845:10: Linking external/torch-mlir/torch-mlir-opt failed: (Exit 1): clang failed: error executing command /usr/lib/llvm-11/bin/clang @bazel-out/k8-fastbuild/bin/external/torch-mlir/torch-mlir-opt-2.params

Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
ld.lld: error: undefined symbol: mlir::torch::Torch::createEraseModuleInitializerPass()
>>> referenced by Passes.cpp
>>>               bazel-out/k8-fastbuild/bin/external/torch-mlir/_objs/TorchMLIRTorchPasses/Passes.pic.o:(mlir::torch::Torch::createTorchFunctionToTorchBackendPipeline(mlir::OpPassManager&, mlir::torch::Torch::TorchLoweringPipelineOptions const&))
>>> referenced by Passes.cpp
>>>               bazel-out/k8-fastbuild/bin/external/torch-mlir/_objs/TorchMLIRTorchPasses/Passes.pic.o:((anonymous namespace)::registerEraseModuleInitializerPass()::'lambda'()::operator()() const)
clang: error: linker command failed with exit code 1 (use -v to see invocation)
```

This PR adds `lib/Dialect/Torch/Transforms/EraseModuleInitializer.cpp` to `TorchMLIRTorchPasses` library.
2022-08-09 13:34:59 -07:00
Sambhav Jain d41c7becf5
[Bazel] Allow workflow_dispatch manual trigger on bazel workflow (#1203)
At the moment we don't gate torch-mlir PRs with bazel builds. This means bazel builds don't get run on open PRs, and so there's no good way to validate a fix PR which is meant to fix a broken bazel build. This option allows a bazel build to be manually triggered as needed on open PRs.
2022-08-09 13:28:21 -07:00
Sambhav Jain b696362b7d
Enable OOT builds in CI (#1188) 2022-08-09 12:13:16 -07:00
Jacques Pienaar e75c7c5292
Flip to C++17 (#1198)
LLVM now uses C++17.
2022-08-09 08:38:30 -07:00
Marius Brehler 747356186f
Don't set MLIR_TABLEGEN_EXE (#1197)
With llvm/llvm-project@112499f landed, `MLIR_TABLEGEN_EXE` is given as a
cache variable in the MLIR core project. Other external projects, such
as TORCH-MLIR, should not set the variable as this breaks
cross-compilation.
2022-08-09 16:06:12 +02:00
Marius Brehler 202076c6e3
Add CMake dep to Func dialect (#1196)
The Torch dialect has an include to `mlir/Dialect/Func/IR/FuncOps.h` and
should therefore have a CMake dependency to the MLIRFuncDialect.
Otherwise, the build can fail since it may occur that
`mlir/Dialect/Func/IR/FuncOps.h.inc` isn't generated yet.
2022-08-09 06:54:30 -07:00
Yan Xu f83a905856
[MHLO]fix lowering failed on reduction op with i32 shape (#1185)
fixed lowering failed on torch::max.dim while shape type is i32
2022-08-09 17:02:50 +08:00
powderluv e55fc4deb5
Revert "E2E support for AtenRemainderScalarOp (#1119)" (#1190)
This reverts commit 34e207eeb5.
2022-08-08 22:59:57 -07:00
Ashay Rane bb47c166a0
llvm: update tag to 061e0189 (#1180)
Summary of changes:
 - Switch to C++17 (similar to https://reviews.llvm.org/D131348)
 - Update MHLO to build with LLVM commit hash 061e0189
 - Replace deprecated `hasValue()` and `getValue()` with `has_value()`
   and `value()` respectively (https://reviews.llvm.org/D131349)
 - Use `TypedAttr` (https://reviews.llvm.org/D130092)
 - Use updated assembly format of `mhlo.compare` op (commit
   d03ef01e70fbf9afd0fa1976fbb7ed31838929b3 in MHLO repo)
2022-08-08 20:17:35 -07:00
Henry Tu 3e97a33c80
Revert "Reenable LTC in out-of-tree build (#1177)" (#1183)
This reverts commit f85ae9c685.
2022-08-08 18:58:35 -07:00
武家伟 351f15424e
[MHLO] Add transposed convolution conversion pattern (#1171)
Co-authored-by: Bairen Yi <yibairen.byron@bytedance.com>
Co-authored-by: Jiawei Wu <xremold@gmail.com>
Co-authored-by: Tianyou Guo <tianyou.gty@alibaba-inc.com>
Co-authored-by: Xu Yan <yancey.yx@alibaba-inc.com>
Co-authored-by: Ziheng Jiang <ziheng.jiang@bytedance.com>
2022-08-09 09:50:07 +08:00
Sean Silva 504de5e701 Rework how global slot initializers work.
Rather than a per-global-slot initializer region, we now have one for
the whole module. For example, it might look like this:

```
torch.global_slot "private" @tensor : !torch.tensor
torch.global_slot "private" @list : !torch.list<tensor>
torch.global_slot.module_initializer {
  %0 = torch.tensor.literal(dense<0.0> : tensor<f32>) : !torch.tensor
  %1 = torch.prim.ListConstruct %0 : (!torch.tensor) -> !torch.list<tensor>
  torch.initialize.global_slots [
    @tensor(%0 : !torch.tensor)
    @list(%1 : !torch.list<tensor>)
  ]
}
```

This new structure allows GlobalizeObjectGraph to create the initializer in a
much simpler way, avoiding the need to reason about whether different slots
alias each other. Reasoning about whether slots alias each other now is the
responsibility of InlineGlobalSlots, which has to do a much more complicated
analysis, implemented using MLIR's dataflow analysis framework.

Recommended review order:
- Check out the new IR constructs in the .mlir files of various passes
- Op definitions (*.td)
- Changes to GlobalizeObjectGraph pass.
- InlineGlobalSlots pass (~total rewrite)
- Misc changes:
  - Moving torchMlirAdjustStaticInformation for sharing with C++ code.
  - EraseModuleInitializer pass

To make this a bit nicer, it would be good to have a `torch.module` op
with an initializer region attached. That would be more invasive though.

This change has highlighted certain aspects of our project layering
which are worth calling out. None of our backends can handle global
slots, so we enforce that there are no global slots before backend
lowering. At an earlier stage in the project, we had aspirations of
transparently handling mutable global state and such, but for reasons
described below, that is no longer a goal. So really global slots should
be seen as a progressive lowering step as part of inlining all the
IValue's in the original program (GlobalizeObjectGraph is also one such
step).

Over time, with insights from work like IREE-JAX, it has become clear
that there isn't a reliable programming model we can compile for users
where we just transparently handle mutable global state (and some other
things, like lists and dictionaries). There is a need for an "outer
program" that orchestrates more restricted subroutines of the kind we
can handle in our compile flow here. The benefit of that is that it
decouples considerations like shapes, dtypes, etc. from the program
constructs used in the outer program. As long as the outer program can
efficiently invoke (pipelining/async/etc.) high-performance
data-parallel numerical subroutines of the kind we compile in our flow
here, then there is a complete programming model. This is also
consistent with the direction of upstream PyTorch which is becoming more
tracing-based (which inherently loses a lot of program structure, which
then has to be applied back with an "outer program" orchestrating the
traced subroutines).
2022-08-08 18:12:06 -07:00