3004 Commits (b38585e0773c78e05567e96afc6315733466016e)
 

Author SHA1 Message Date
Vivek Khandelwal a1d3afdba9 [MLIR][TORCH] Add E2E support for aten.randint.low op
Signed-Off By: Vivek Khandelwal<vivek@nod-labs.com>
2022-11-16 09:54:18 +05:30
AmosLewis 22a5067242 [TOSA] Add more tosa::cast type support 2022-11-16 09:53:10 +05:30
Sambhav Jain 4aa1e90b34
Fix cache bug with Bazel builds in CI (#1593)
Some time ago, caching sped up the bazel builds in CI well. Over time, however, the cache went stale because `actions/cache@v3` apparently doesn't update a cache when it "hits" unless it is specifically configured to do so. The fix is to use a `key` that is unique per commit (to force a cache update after each successful run) together with a relaxed `restore-keys` that is not unique per commit, so newer commits can restore from the nearest hit.

Test GHA run 1 (no cache hit): [1h 1m 52s](https://github.com/sjain-stanford/torch-mlir/actions/runs/3474770334/usage)
Test GHA run 2 (cache hit, same commit): [5m 14s](https://github.com/sjain-stanford/torch-mlir/actions/runs/3475132135/usage)
Test GHA run 3 (cache hit, different commit): [6m 6s](https://github.com/sjain-stanford/torch-mlir/actions/runs/3475161009/usage)
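
The key lookup described above can be illustrated with a small Python model. This is only a sketch of the documented matching behavior (exact `key` match first, then prefix matches on `restore-keys`, newest entry wins), not the action's actual implementation. The key names such as `bazel-cache-<sha>` and the helper names `save` and `restore` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class CacheEntry:
    key: str
    created: int  # stand-in for the creation timestamp


def restore(entries: Sequence[CacheEntry], key: str,
            restore_keys: Sequence[str]) -> Optional[CacheEntry]:
    """Return the entry a run would restore, or None on a cache miss."""
    # 1. An exact match on the primary key wins.
    for entry in entries:
        if entry.key == key:
            return entry
    # 2. Otherwise try each restore key in order as a prefix,
    #    picking the most recently created match.
    for prefix in restore_keys:
        matches = [e for e in entries if e.key.startswith(prefix)]
        if matches:
            return max(matches, key=lambda e: e.created)
    return None


def save(entries: Sequence[CacheEntry], key: str, created: int) -> List[CacheEntry]:
    """Model the post-run save: nothing is written if the key already exists."""
    if any(e.key == key for e in entries):
        return list(entries)  # exact hit, so the old cache is kept as-is
    return [*entries, CacheEntry(key, created)]


# A fixed key is saved once and never refreshed, so it goes stale.
stale = save(save([], "bazel-cache", 1), "bazel-cache", 2)

# A per-commit key adds a fresh entry on every run; the relaxed prefix
# lets a new commit fall back to the newest earlier entry.
fresh = save(save([], "bazel-cache-aaa", 1), "bazel-cache-bbb", 2)
nearest = restore(fresh, "bazel-cache-ccc", ["bazel-cache-"])
```

In this model, `stale` still holds only the first entry, while `nearest` is the entry saved by the most recent earlier commit.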
2022-11-15 18:48:31 -08:00
Sambhav Jain fc4c8d4ed9
Enable torch-mlir LIT tests in Bazel (#1585)
Adds support to run `.mlir` LIT tests in bazel. 

```
bazel test @torch-mlir//test/...
```

Follow-on PR will contain these updates:
- Add tests to GHA CI workflow
- Add `.py` LIT tests to bazel
2022-11-15 14:02:19 -08:00
Sambhav Jain 4032eeca64
Add Bazel buildifier to torch-mlir (#1586)
Formats bazel BUILD and .bzl files with a standard convention. 

Invoke using
```
bazel run @torch-mlir//:buildifier
```
2022-11-15 12:34:27 -08:00
Sambhav Jain 99ec6039f6
Fix bazel CI (#1591)
I accidentally broke bazel CI by forgetting to update the GHA workflow in my [previous PR](https://github.com/llvm/torch-mlir/pull/1587). This should get it back to green, my apologies.

Qualifying CI run: https://github.com/sjain-stanford/torch-mlir/actions/runs/3472523982
2022-11-15 09:51:52 -08:00
Sambhav Jain b320f7fb77
Simplify Bazel build workflow (#1587)
Remove `run_bazel_build.sh`, simplify docker's entrypoint to start container at `utils/bazel` directory, update docs.
2022-11-15 08:34:43 -08:00
George Petterson 92f385bd9f [MLIR][TORCH] Add E2E support for aten.convolution_backward op
This commit adds the decomposition for the `aten.convolution_backward`
and `aten.convolution_backward_overrideable` ops.
2022-11-15 07:38:26 +05:30
Roll PyTorch Action f40cbd6a71 update PyTorch version to 1.14.0.dev20221114 2022-11-15 01:44:30 +00:00
Ashay Rane f1ef5681cc
build: pin torchvision to latest nightly (#1584)
We currently pin the `torch` package to the latest nightly version, but
since `torchvision` depends on the `torch` package, the pip resolver
then has to run through an extensive list of `torchvision` packages that
can be installed with the pinned `torch` package.  This search fails in
the RollPyTorch action, causing pip to settle on an old version of
`torchvision` that does not work with our tests.  In reality, we are
only interested in a specific version of the `torchvision` package.

To make the dependency explicit and to prevent test failures because of
incorrect package installations, this patch makes two key changes:

1. `torchvision` is now pinned to the latest nightly release in
   pytorch-requirements.txt along with the version of `torch` that is
   necessary to install the requested `torchvision` package

2. The RollPyTorch action now looks for the latest `torchvision` package
   instead of the latest `torch` package before writing the version
   numbers for pinning in pytorch-requirements.txt
2022-11-14 15:56:02 -06:00
Chi_Liu dfe7513a45
[MLIR][TORCH] Fix aten.unsqueeze op (#1578)
The valid range of the unsqueeze dim is [-input.dim() - 1, input.dim() + 1); the buggy check forgot to add 1.
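
As an illustration of that range, here is a plain-Python sketch of the bounds check and the normalization of negative dims. It is not the torch-mlir code touched by this patch; the function name `normalize_unsqueeze_dim` is hypothetical.

```python
def normalize_unsqueeze_dim(dim: int, rank: int) -> int:
    """Validate `dim` for unsqueeze on a rank-`rank` tensor and make it non-negative.

    Unsqueeze inserts a new axis, so there are rank + 1 valid positions,
    which is why the bounds are one wider than for ordinary dim arguments.
    """
    lower, upper = -rank - 1, rank + 1  # valid dims satisfy lower <= dim < upper
    if not lower <= dim < upper:
        raise IndexError(
            f"dim {dim} out of range [{lower}, {upper}) for rank {rank}"
        )
    # Negative dims count from the end of the *output* shape (rank + 1 axes).
    return dim + rank + 1 if dim < 0 else dim
```

For a rank-2 input, the accepted dims are -3 through 2; a check that omits the `+ 1` would wrongly reject -3 and 2.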
2022-11-14 09:09:15 -08:00
Gleb Kazantaev 6909eaf7fc
Update TorchMlirBackendImpl Methods (#1580)
* Fix LTC build

* Remove passing test from xfail set
2022-11-14 00:37:49 -05:00
Ashay Rane eec9a7e022
ci: make pip skip cached packages while installing dependencies (#1570)
We want each build to be reproducible regardless of prior builds and
prior package installations, but pip, by default, uses cached packages
from previous invocations of `pip install`.  As a result, the incorrect
dependencies downloaded in the RollPyTorch workflow in the main
repository cannot be reproduced in private forks of the repository.  To
resolve this problem, this patch adds a `--no-cache-dir` flag to pip, so
that it fetches and inspects each requested package independent of prior
installations.
2022-11-11 20:31:38 -06:00
Vivek Khandelwal a558034c1a [MLIR][TORCH] Fix aten.upsample_nearest2d_backward op
Signed-Off By: Vivek Khandelwal<vivek@nod-labs.com>
2022-11-12 00:05:36 +05:30
Vivek Khandelwal d571d050fd [torch_mlir.compile] Fix the issue reported in https://github.com/llvm/torch-mlir/issues/1557
Signed-Off By: Vivek Khandelwal<vivek@nod-labs.com>
2022-11-11 18:05:15 +05:30
Sambhav Jain dcff5a7150
[Bazel] Update to Ubuntu-22.04 and clang-16 for Bazel build docker (#1523)
* Update Ubuntu and clang in the docker container
* Specifically build just `@torch-mlir//:torch-mlir-opt`


Triggered GHA run:
https://github.com/sjain-stanford/torch-mlir/actions/runs/3317006870/jobs/5479411204
2022-11-10 13:11:06 -08:00
Ashay Rane 6c31b06922
build: revert PyTorch update (#1571)
The PyTorch update broke the build.  I'm about to add more tests so that
it doesn't happen in the future.
2022-11-10 12:37:25 -06:00
Roll PyTorch Action 9df748d7ef update PyTorch version to 1.14.0.dev20221110 2022-11-10 17:52:06 +00:00
Sean Silva cc468d2d16 [cleanup] Be consistent about apostrophe 2022-11-10 07:42:15 -08:00
Daniel Ellis a7ac0def45
Move single-tensor-tuple-return test to mlir unit test.
Also, add multiple return test.
2022-11-10 09:23:53 -05:00
Xiafei Qiu 4f173c6e0f
update llvm tag to a2620e00. (#1567)
- also update MHLO to 57ba12a2 (branch greencommit/2022-11-07-a2620e00)
- change -pass-pipeline format to make tests pass.
2022-11-10 18:39:28 +08:00
Sean Silva 64914603fa [torch_mlir.compile] Add support for multiple exported methods
For AoT deployments, models often have multiple exported methods.
This patch enables something like this:

```
import torch
import torch_mlir

class TwoMethodsModule(torch.nn.Module):
    def sin(self, x):
        return torch.ops.aten.sin(x)

    def cos(self, x):
        return torch.ops.aten.cos(x)

# Register one set of example inputs per exported method.
example_args = torch_mlir.ExampleArgs()
example_args.add_method("sin", torch.ones(2, 3))
example_args.add_method("cos", torch.ones(2, 4))
print(torch_mlir.compile(TwoMethodsModule(), example_args))
```

In the
[long-term](https://github.com/llvm/torch-mlir/blob/main/docs/long_term_roadmap.md#tools-for-advanced-aot-deployments)
we will need to reconcile this with our story for stateful models and the
backend contract being purely functional. For now, this provides some basic
infra that seems harmless. Arguably, we could tighten up the backend contract
even more to only allow a single compiled function which would prohibit this or
require building out a layer above.

Fixes #1557
2022-11-10 02:10:22 -08:00
Yuanqiang Liu 2793a2bd41
fix TorchToMhlo Conversion cmake dependency (#1549) 2022-11-09 18:34:53 -06:00
Sean Silva ec4e01c321
Add Suraj to TorchToTOSA owners (#1566) 2022-11-09 14:55:13 -08:00
Jae Hoon (Antonio) Kim 2ec4b06bbb
Remove MakeView from IR Builder (#1552)
* Remove MakeView from IR Builder

* Update PyTorch requirements
2022-11-09 13:46:34 -05:00
Ashay Rane 9a73b9e6c7
build: un-pin the ninja pip package version (#1562)
Now that the ninja pip package issue has been resolved, this patch
removes the pinned version from requirements.txt so that we can go back
to using the most recent version of ninja.
2022-11-06 14:12:28 -06:00
Ashay Rane e16ccce373
ci: re-add powershell script for windows release builds (#1561)
This file was removed as part of the PR that added build caching for
Windows.
2022-11-06 12:48:38 -06:00
Roll PyTorch Action e78e9cd782 update PyTorch version to 1.14.0.dev20221105 2022-11-06 14:04:59 +00:00
Ashay Rane 27d8d47022
build: pin ninja pip version temporarily to resolve build failure (#1558)
Going from ninja v1.10.2 to v1.11.1, there is a change that breaks the
CI builds with the following error:

```
CMake Error at CMakeLists.txt:47 (project):
  Running
   '/main_checkout/torch-mlir/docker_venv/bin/ninja' '--version'
  failed with:
CMake Error: CMAKE_ASM_COMPILER not set, after EnableLanguage
```

Ostensibly, the error about the ASM compiler arises because
llvm-project/llvm/CMakeLists.txt includes ASM among the list of
languages used in the LLVM project. However, adding
`-DCMAKE_ASM_COMPILER=clang` does not resolve the error.

Until we figure out why the new version of ninja causes the build
failures, this patch pins ninja to the version that worked.
2022-11-05 12:20:56 -05:00
Roll PyTorch Action 5ee20e70a1 update PyTorch version to 1.14.0.dev20221104 2022-11-04 22:01:57 +00:00
Ashay Rane d99b2ddb1b
importer: fix usage after PyTorch update (#1555)
Unless requested otherwise, PyTorch no longer installs most of the
header files under the caffe2 directory (see
https://github.com/pytorch/pytorch/pull/87986).  This breaks our
importer code since we need to use the `MakeGuard()` function to execute
statements in the event of exceptions.

To fix this issue, this patch implements a rudimentary version of
PyTorch's ScopeGuard, where once the class variable goes out of scope,
it executes a predefined method.
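
The patch itself is C++, but the scope-guard idea can be sketched in Python for illustration: a context manager whose cleanup callback runs when the block is left, whether normally or because of an exception. This is an analogy only, not the class added by the patch; the `ScopeGuard` and `dismiss` names below are hypothetical.

```python
from typing import Callable


class ScopeGuard:
    """Run `callback` when the `with` block exits, even on exceptions."""

    def __init__(self, callback: Callable[[], None]) -> None:
        self._callback = callback
        self._active = True

    def dismiss(self) -> None:
        """Cancel the guard so the callback is not run on exit."""
        self._active = False

    def __enter__(self) -> "ScopeGuard":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        if self._active:
            self._callback()
        return False  # never swallow the exception


events = []
try:
    with ScopeGuard(lambda: events.append("cleanup")):
        events.append("work")
        raise RuntimeError("simulated failure")
except RuntimeError:
    events.append("handled")
```

Here the cleanup callback runs before the exception reaches the `except` clause, which mirrors how a C++ guard's destructor fires during stack unwinding.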
2022-11-04 15:02:23 -05:00
Vivek Khandelwal fedf8c0640 [MLIR][TORCH] Add E2E support for aten.upsample_nearest2d_backward.vec op
Signed-Off By: Vivek Khandelwal<vivek@nod-labs.com>
2022-11-04 22:10:07 +05:30
Ashay Rane db5a496eb4
build: enable update scripts to work with out-of-tree builds (#1553)
Before this patch, the update_shape_lib.sh and update_torch_ods.sh
scripts only worked on in-tree builds, which implied that the
RollPyTorch action was forced to run the longer-running in-tree build.
As a result of this patch, we should be able to run through the basic
checks in the RollPyTorch action faster, while running the full suite of
tests off the critical path.

The key change in this patch is that the update scripts now pick
whichever of the in-tree and out-of-tree build directories was modified
most recently.  The change also correctly handles the case when only
one of the two directories exists.
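
The selection logic can be sketched in Python as follows. The real update scripts are shell scripts, so this is an assumption-laden illustration rather than their implementation; the function name `most_recent_build_dir` is hypothetical, as are any directory paths a caller passes in.

```python
import os
from typing import Optional, Sequence


def most_recent_build_dir(candidates: Sequence[str]) -> Optional[str]:
    """Return the existing candidate directory with the newest mtime.

    Candidates that do not exist are ignored, so the function also works
    when only one of the build directories is present. Returns None if
    none of the candidates exists.
    """
    existing = [path for path in candidates if os.path.isdir(path)]
    if not existing:
        return None
    return max(existing, key=os.path.getmtime)
```

A caller would pass the in-tree and out-of-tree build directory paths and use whichever one comes back.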
2022-11-04 08:13:02 -05:00
Sean Silva de4bcbfe9b [docs] Centralize all images in docs/images/ 2022-11-04 03:12:17 -07:00
Ashay Rane 2846776897
ci: enable ccache on Windows (#1548)
This patch makes a few small, but key, changes to enable ccache on
Windows.  First, it replaces the hendrikmuhs/ccache-action action with
command line invocations to the ccache binary, since the action has two
bugs, one of which causes CI to refer to different ccache artifacts
before versus after the build on Windows whereas the other bug can
sometimes cause the action to incorrectly infer that the cache is empty.

Second, this patch slightly alters the cache key, so that our old cache
artifacts, which have grown too big, are eventually discarded in favor
of the new, smaller cache artifacts.  Along the way, this patch also
keeps the RollPyTorch action's cache artifact separate from the regular build's
cache artifact so as to keep these artifacts small, and also because the
RollPyTorch action is off the critical path for most contributors.

Finally, this patch makes small changes to the CMake file so that on
Windows, the ccache binary is added as a prefix, as recommended on the
[ccache Wiki](https://github.com/ccache/ccache/wiki/MS-Visual-Studio).
2022-11-03 12:17:22 -05:00
Ashay Rane f847642495
CI script improvements (#1547)
* ci: update versions of external actions

Node.js 12 actions are deprecated and will eventually go away, so this
patch bumps the old actions to their latest versions that use Node.js
16.

* ci: replace deprecated action with bash commands

The llvm/actions/install-ninja action uses Node.js 12, which is
deprecated.  Since that action is not updated to work with Node.js 16,
this patch replaces that action with equivalent bash commands to install
Ninja.

* ci: use smaller ccache artifacts to reduce evictions

Over time, our ccache sizes have grown quite large (some as large as
1.3 GB), which results in us routinely exceeding GitHub's limits, thus
triggering frequent cache evictions.  As a result, cache downloads and
uploads take unnecessarily long, and fewer cache entries are
available.

Based on experiments on a clean cache state, it appears that we need
less than 300 MB of (compressed) ccache artifacts for each build type.
Anything larger than that will accrue changes from the past that aren't
needed.

To alleviate the cache burden, this patch sets the maximum ccache size
to be 300 MB.  This change should not affect the success or failure of
our builds.  I will monitor the build times to check whether this change
causes any performance degradation.

* ci: use consistent platform identifiers

Prior to this patch, some of our builds ran on `ubuntu-latest`, while
some others ran on `ubuntu-20.04` and others ran on `ubuntu-22.04`, with
similar situations for macOS and windows.  This patch instead sets all
Linux builds to run on `ubuntu-latest`, all macOS builds to run on
`macos-latest`, and all Windows builds to run on `windows-latest`, to
make debugging future CI failures a little easier.
2022-11-02 21:37:01 -05:00
Sean Silva 2162253401 [docs] Add long-term roadmap
Add a roadmap covering expected project evolution over the next 1-2
years.
2022-11-02 03:25:52 -07:00
Ashay Rane 031d127940
ci: introduce read-only and read-write PyTorch build caches (#1546)
Until recently, we had to either risk feature branches creating PyTorch
build caches (which were unusable by the main branch or other parallel
feature branches because of GitHub's rules around sharing caches among
branches) or we had to limit the PyTorch build caches to only the main
branch, causing CI runs on feature branches to be terribly slow because
they had to rebuild PyTorch each time.

This patch enables the best of both worlds, by using a fork
(github.com/ashay/cache) of GitHub's cache action, where the fork
adds an option (called `save`) which, when set, uploads a new cache
entry.  We thus set this `save` flag only when we're building PyTorch
from source in Torch-MLIR's main branch, whereas all other builds set
this `save` flag to `false`.

The ability to conditionally update the cache has been an oft-requested
feature on the original (github.com/actions/cache) repository and
multiple unmerged PRs exist to allow conditional cache updates, so it is
likely that using the fork is only a temporary solution.
2022-11-01 23:26:17 -07:00
Ashay Rane 79871040c9
Revert "ci: build PyTorch before building Torch-MLIR (#1542)" (#1545)
This reverts commit 805d728194.
2022-11-01 20:40:09 -05:00
Ashay Rane 805d728194
ci: build PyTorch before building Torch-MLIR (#1542)
This patch updates the build_linux_packages.sh script so that when
PyTorch needs to be built from source, it is built _before_ building
LLVM and before building Torch-MLIR.  The rationale behind this change
is that previously, when the PyTorch build was triggered through the
Torch-MLIR build, the PyTorch compilation added more entries to the
ccache artifacts.  However, since we cache the PyTorch _binary_ (i.e.
the WHL file), there is no need to add the PyTorch compilation to the
ccache artifacts.  By removing the PyTorch compilation files, we keep
the ccache artifact size small, thus reducing the number of evictions
when we exceed GitHub's allowed limit.
2022-11-01 17:03:58 -05:00
Ashay Rane 0409595ccc
mlir: add missing dependency on TableGen targets (#1537)
lib/Dialect/Torch/Utils/Utils.cpp includes TorchOps.h, which, by way of
included header files, refers to both TorchOps.h.inc as well as
TorchTypes.h.inc.  However, the build rules do not specify the
dependency of the `TorchMLIRTorchUtils` target on the TableGen generated
header files, causing spurious build errors.

This patch fixes the problem by adding `MLIRTorchOpsIncGen` and
`MLIRTorchTypesIncGen` to the list of dependencies of
`TorchMLIRTorchUtils`.
2022-11-01 14:59:11 -05:00
powderluv 1a33577860
remove spurious ref in publish pages (#1536)
We don't need to pass in optional tag information.
2022-11-01 09:42:21 -07:00
Tanyo Kwok 17bc7c89cc
build: update llvm tag to 74fb770d (#1539)
* build: update llvm tag to 74fb770d

This commit makes the following changes needed to bump LLVM:

+ replace usages of `tensor::createPadScalarOp`, see https://reviews.llvm.org/D136493
+ Update file checks
2022-11-01 15:27:09 +08:00
Ashay Rane a8970101dc
pytorch: rename pytorch-version.txt to pytorch-hash.txt (#1541)
This patch is part of a larger set of improvements to the CI/build
system.  In the code, we refer to the version as the string that
contains the release identifier such as 1.14.0.dev20221028, so calling
the file that contains the commit hash pytorch-version.txt creates
confusion.  For the sake of simplicity, this patch renames that file to
be pytorch-hash.txt.
2022-10-31 22:03:05 -05:00
Ashay Rane 2cf1092d4d
ci: restrict PyTorch cache to just the main branch (#1540)
If PyTorch build caches are created on a branch other than the main
branch, then GitHub does not share those caches with the main branch,
making the CI run for every PR slow.  This patch resolves the
problem by letting only the main branch create and use PyTorch build
caches.
2022-10-31 15:14:53 -05:00
Jae Hoon (Antonio) Kim 0701464c47
Remove view ops from IR builder (#1534)
* Remove view ops from IR builder

* Update PyTorch requirements
2022-10-30 21:42:44 -04:00
xndcn 759057cbdd [MLIR][TORCH] Fix wrong parameter name "supportFPInputOnly"
The parameter "supportFPInputOnly" of function createPoolingOp() is
supposed to be "supportNonFPInput", which was added to distinguish
between "MaxPool2d" and "AvgPool2d" op in #718
2022-10-30 23:18:08 +08:00
Vivek Khandelwal c86177730d [MLIR][TORCH] Add E2E support for aten.fill.Tensor op
This commit adds the decomposition for `aten.fill.Tensor` op.

Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
2022-10-30 18:40:47 +05:30
powderluv 87ab714ed6
Update buildRelease.yml (#1535) 2022-10-30 00:14:54 -07:00
Ramiro Leal-Cavazos b723186983
Remove all but one of valsem ops + move fill.Scalar to elementwise (#1531)
This commit removes almost all of the valsem ops, since the value
semantics versions of the ops now exist in PyTorch. The only op missing
is `aten.bernoulli_.float`. In addition, this commit also simplifies
the implementation of `aten.fill.Scalar` by moving it to the pattern
that converts elementwise ops.
2022-10-28 15:06:11 +00:00