Commit Graph

108 Commits (revert-2869-AtenSlice-folder)

Author SHA1 Message Date
Stella Laurenzo 77c14ab22b
[ci] Upgrade to new runners and disable unsupported jobs. (#2818)
Per the RFC and numerous conversations on Discord, this rebuilds the
torch-mlir CI and discontinues the infra and coupling to the binary
releases
(https://discourse.llvm.org/t/rfc-discontinuing-pytorch-1-binary-releases/76371).

I iterated on this to get latency back to about what it was with the old
(much larger and non-ephemeral) runners: About 4m - 4.5m for an
incremental change.

Behind the scenes changes:

* Uses a new runner pool operated by AMD. It is currently set to manual
scaling and has two runners (32-core, 64GiB RAM) while we get some
traction. We can either fiddle with some auto-scaling or use a schedule
to give it an increase during certain high traffic hours.
* Builds are now completely isolated and cannot have run-to-run
interference like we were getting before (i.e. lock file/permissions
stuff).
* The GHA runner is installed directly into a manylinux 2.28 container
with upgraded dev tools. This eliminates the need to do sub-invocations
of docker on Linux in order to run on the same OS that is used to build
wheels.
* While not using it now, this setup was cloned from another project
that posts the built artifacts to the job and fans out testing. Might be
useful here later.
* Uses a special git cache that lets us have ephemeral runners and still
check out the repo and deps (incl. llvm) in ~13s.
* Running in an Azure VM Scale Set.

In-repo changes:

* Disables (but does not yet delete):
  * Old buildAndTest.yml jobs
  * releaseSnapshotPackage.yml
* Adds a new `ci.yml` pipeline and scripts the steps in `build_tools/ci`
(by decomposing the existing `build_linux_packages.sh` for in-tree
builds and modularizing it a bit better).
* Test framework changes:
* Adds a `TORCH_MLIR_TEST_CONCURRENCY` env var that can be used to bound
the multiprocess concurrency. Ended up not using this in the final
version but is useful to have as a knob.
* Changes the default concurrency to `nproc * 0.8 + 1` vs `nproc * 1.1`.
We're running on systems with significantly less virtual memory and I
did a bit of fiddling to find a good tradeoff.
* Changed multiprocess mode to spawn instead of fork. Otherwise, I was
getting instability (as discussed on discord).
* Added MLIR configuration to disable multithreaded contexts globally
for the project. Constantly spawning `nproc * nproc` threads (more than
that actually) was OOM'ing.
* Added a test timeout of 5 minutes. If a multiprocess worker crashes,
the framework can get wedged indefinitely (and then will just be reaped
after multiple hours). We should fix this, but this at least keeps the
CI pool from wedging with stuck jobs.

Functional changes needing followup:

* No matter what I did, I couldn't get the LTC tests to work, and I'm
not 100% sure they were being run in the old setup as the scripts were a
bit twisty. I disabled them and left a comment.
* Dropped out-of-tree build variants. These were not providing much
signal and increase CI needs by 50%.
* Dropped MacOS and Windows builds. Now that we are "just a library" and
not building releases, there is less pressure to test these commit by
commit. Further, since we bump torch-mlir to known good commits on these
platforms, it has been a long time since either of these jobs have
provided much signal (and they take ~an hour+ to run). We can add them
back later post-submit if ever needed.
2024-01-27 18:35:45 -08:00
Stella Laurenzo 4a4d80a6ad
[ci] Add lint job and enable yaml linting of GH files. (#2819) 2024-01-27 15:48:06 -08:00
Stella Laurenzo 53fc995639 Run CI on all main/postsubmit commits.
Prior to this, the concurrency rules for presubmits (which cancel eagerly) were being applied to main. The result was that landing a second patch would cancel the CI on the one prior.
2023-11-22 18:05:18 -08:00
Stella Laurenzo 5eae0adff1
Breakup python pytorch deps (#2582)
This lifts the core of the jit_ir_importer and ltc out of the pt1
project, making them peers to it. As a side-effect of this layering, now
the "MLIR bits" (dialects, etc) are not commingled with the various
parts of the pt1 project, allowing pt1 and ltc to overlay cleanly onto a
more fundamental "just MLIR" Python core. Prior to this, the Python
namespace was polluted to the point that this could not happen.

That "just MLIR" Python core will be introduced in a followup, which
will create the space to upstream the FX and ONNX pure Python importers.

This primary non-NFC change to the API is:

* `torch_mlir.dialects.torch.importer.jit_ir` ->
`torch_mlir.jit_ir_importer`.

The rest is source code layering so that we can make the pt1 project
optional without losing the other features.

Progress on #2546.
2023-11-19 12:10:19 -08:00
Stella Laurenzo 6961f0a247
Re-organize project structure to separate PyTorch dependencies from core project. (#2542)
This is a first step towards the structure we discussed here:
https://gist.github.com/stellaraccident/931b068aaf7fa56f34069426740ebf20

There are two primary goals:

1. Separate the core project (C++ dialects and conversions) from the
hard PyTorch dependencies. We move all such things into projects/pt1 as
a starting point since they are presently entangled with PT1-era APIs.
Additional work can be done to disentangle components from that
(specifically LTC is identified as likely ultimately living in a
`projects/ltc`).
2. Create space for native PyTorch2 Dynamo-based infra to be upstreamed
without needing to co-exist with the original TorchScript path.

Very little changes in this path with respect to build layering or
options. These can be updated in a followup without commingling
directory structure changes.

This also takes steps toward a couple of other layering enhancements:

* Removes the llvm-external-projects/torch-mlir-dialects sub-project,
collapsing it into the main tree.
* Audits and fixes up the core C++ build to account for issues found
while moving things. This is just an opportunistic pass through but
roughly ~halves the number of build actions for the project from the
high 4000's to the low 2000's.

It deviates from the discussed plan by having a `projects/` tree instead
of `compat/`. As I was thinking about it, this will better accommodate
the follow-on code movement.

Once things are roughly in place and the CI passing, followups will
focus on more in-situ fixes and cleanups.
2023-11-02 19:45:55 -07:00
Ashay Rane 8f28d933e1
CI: disable LTC e2e tests in stable PyTorch builds (#2414)
This way, we can keep CI green without being forced to ignore _all_
errors that arise in stable PyTorch builds
2023-08-23 11:11:17 -05:00
Ramiro Leal-Cavazos c0d8248ec7
Prevent failed stable CI job from cancelling nightly jobs (#2373)
The CI jobs that use stable PyTorch are currently not required to pass
in order for a patch to get merged in `main`. This commit makes sure
that if a CI job for stable PyTorch fails, it does not cancel the
other required jobs.
2023-08-03 14:58:57 -07:00
Ashay Rane 755d0c46da
CI: Spot fixes related to nightly and stable PyTorch builds (#2190)
* CI: Skip (redundant) libtorch build when using stable PyTorch version

When we use PyTorch stable builds, there is no need to build libtorch
from source, making the stable-pytorch-with-torch-binary-OFF
configuration redundant with stable-pytorch-with-torch-binary-ON.  This
patch drops the redundant configuration from CI.

* CI: Simplify guard conditions for creating and using libtorch cache

Whether libtorch is enabled or not is predicated on a host of conditions
such as the platform, in-tree versus out-of-tree build, and stable
versus nightly PyTorch builds.  Instead of repeating these conditions to
guard whether to create or use the libtorch cache artifacts (and getting
them almost incorrect), this patch predicates the relevant pipeline
steps to whether libtorch is enabled, thus making the conditions far
simpler.
2023-06-01 22:58:25 -07:00
maxbartel db3f2e3fde
Add Stable PyTorch CI Pipeline (#2038)
* feat: split pytorch requirements into stable and nightly

* fix: add true to tests to see full output

* refactor: add comments to explain true statement

* feat: move some tests to experimental mode

* refactor: refactor pipeline into more fine grained difference

* feat: add version differentiation for some tests

* feat: activate more configs

* refactor: change implementation to use less requirement files

* refactor: remove contraints used for testing

* fix: revert some requirement file names

* refactor: remove unnecessary ninja install

* fix: fix version parsing

* refactor: remove dependency on torchvision in main requirements file

* refactor: remove index url

* style: remove unnecesary line switch

* fix: readd index url
2023-05-30 12:16:24 -07:00
Ashay Rane 19a08d51f3
CI: [nfc] Use actions/cache instead of modified fork (#2124)
We previously used a fork of the action/cache repository for the PyTorch
cache since the actions/cache repo did not support read-only caches.
Now that actions/cache supports separate read and write steps, this
patch switches back to the actions/cache repo.
2023-05-12 23:25:17 -05:00
Ashay Rane 28bb866260
CI: prepare CI for ccache updates for MSVC/Windows (#2120)
This patch, by itself, doesn't fix caching on Windows, but once a new
release of ccache is available, caching for Windows builds should start
working again (validated by building ccache from source and using it
with LLVM builds).

Ccache rejects caching when either the `/Zi` or `/ZI` flags are used
during compilation on Windows, since these flags tell the compiler to
embed debug information in a PDB file (separate from the object file
produced by the compiler).  In particular, our CI builds add the `/Zi`
flag, making ccache mark these compiler invocations as uncacheable.

But what caused our CI to add debug flags, especially when we specified
`-DCMAKE_BUILD_TYPE=Release`?  On Windows, unless we specify the
`--config Release` flag during the CMake build step, CMake assumes a
debug build.  So all this while, we had been producing debug builds of
torch-mlir for every PR!  No doubt it took so long to build the Windows
binaries.

The reason for having to specify the configuration during the _build_
step (as opposed to the _configure_ step) of CMake on Windows is that
CMake's Visual Studio generators will produce _both_ Release and Debug
profiles during the CMake configure step (thus requiring a build-time
value that tells CMake whether to build in Release or Debug mode).
Luckily, on Linux and macOS, the `--config` flag seems to be simply
ignored, instead of causing build errors.

Strangely, based on cursory tests, it seems like on Windows we need to
specify the Relase configuration as both `-DCMAKE_BUILD_TYPE=Release` as
well as `--config Release`.  Dropping either made my build switch to a
Debug configuration.

Additionally, there is a bug in ccache v4.8 (although this is addressed
in trunk) that causes ccache to reject caching if the compiler
invocation includes any flag that starts with `/Z`, including /`Zc`,
which is added by LLVM's HandleLLVMOptions.cmake and which isn't related
to debug info or PDB files.  The next release of ccache should include
the fix, which is to reject caching only for `/Zi` and `/ZI` flags and
not all flags that start with `/Z`.

As a side note, debugging this problem was possible because of ccache's
log file, which is enabled by: `ccache --set-config="log_file=log.txt"`.
2023-05-12 12:45:01 -05:00
powderluv 0a3ab07c8f
Set fetch-depth 0 for CI builds too (#2034) 2023-04-14 11:36:41 -07:00
powderluv 0497f0b08d
Revert "CI: drop deletion of workspace and limit submodule fetch concurrency (#1921)" (#2007)
This reverts commit 07f5f042c7.
2023-04-06 10:36:30 -07:00
Ashay Rane 07f5f042c7
CI: drop deletion of workspace and limit submodule fetch concurrency (#1921)
Despite using sudo to delete the workspace directory, we still
occasionally run into checkout errors.  This patch thus drops the
deletion of the workspace prior to checkout.  It also restricts the
number of parallel jobs in the submodule fetch step to just one, to try
and resolve the checkout issue ("index.lock: File exists.").
2023-04-04 12:58:52 -05:00
Ashay Rane 987d5ab335
CI: use `sudo` to remove Docker-created files (#1905) 2023-02-27 17:44:50 -06:00
Ashay Rane ea00371d85
CI: clear workspace directory before checkout (#1900)
We have recently started seeing errors like:

```
  Synchronizing submodule url for 'externals/llvm-project'
  Synchronizing submodule url for 'externals/mlir-hlo'
  /usr/bin/git -c protocol.version=2 submodule update --init --force --depth=1
  Error: fatal: Unable to create '/home/anush/actions-runner/_work/torch-mlir/torch-mlir/.git/modules/externals/llvm-project/index.lock': File exists.
```

As a workaround, this patch removes the workspace directory before the
checkout step.
2023-02-24 14:44:35 -06:00
powderluv 5710871f4f
Update buildAndTest.yml (#1881)
* Update buildAndTest.yml

* Update oneshotSnapshotPackage.yml

* Update buildRelease.yml

* Update RollPyTorch.yml

* Update oneshotSnapshotPackage.yml

* Update buildAndTest.yml
2023-02-15 09:17:12 -08:00
Ashay Rane 711646d095
mhlo: migrate conversion to stablehlo (#1840)
This patch replaces all MHLO operations with their StableHLO
counterparts and adds a validation pass to ensure that no MHLO operations
remain before translating all Stablehlo operations to the MHLO dialect
for further lowering to the Linalg dialect.

This patch also updates all lit tests so that they refer to the
`convert-torch-to-stablehlo` pass and so that they check for StableHLO
operations.
2023-02-02 07:29:47 -06:00
powderluv cd90c0aaf5
Update buildAndTest.yml (#1723) 2022-12-15 05:42:01 -08:00
Ashay Rane 64f9a0e978
ci: print ccache statistics and configuration at end of CI run (#1719)
There appear to be two problems with the caching layer in our CI runs:
(a) the sizes of some of the caches have grown to multiples of the
300 MB limit and (b) caching on Windows seems to be provide little to no
benefit.

To help understand the reasons for these problems, this patch adds a
line item to the list of steps run in CI to dump the ccache
configuration and statistics just prior to uploading the cache artifact.
2022-12-14 09:50:43 -06:00
Ashay Rane 2846776897
ci: enable ccache on Windows (#1548)
This patch makes a few small, but key, changes to enable ccache on
Windows.  First, it replaces the hendrikmuhs/ccache-action action with
command line invocations to the ccache binary, since the action has two
bugs, one of which causes CI to refer to different ccache artifacts
before versus after the build on Windows whereas the other bug can
sometimes cause the action to incorrectly infer that the cache is empty.

Second, this patch slightly alters the cache key, so that our old cache
artifacts, which have grown too big, are eventually discarded in favor
of the new, smaller cache artifacts.  Along the way, this patch also
keeps the RollPyTorch's cache artifact separate from the regular build's
cache artifact so as to keep these artifacts small, and also because the
RollPyTorch action is off the critical path for most contributors.

Finally, this patch makes small changes to the CMake file so that on
Windows, the ccache binary is added as a prefix, as recommended on the
[ccache Wiki](https://github.com/ccache/ccache/wiki/MS-Visual-Studio).
2022-11-03 12:17:22 -05:00
Ashay Rane f847642495
CI script improvements (#1547)
* ci: update versions of external actions

Node.js 12 actions are deprecated and will eventually go away, so this
patch bumps the old actions to their latest versions that use Node.js
16.

* ci: replace deprecated action with bash commands

The llvm/actions/install-ninja action uses Node.js 12, which is
deprecated.  Since that action is not updated to work with Node.js 16,
this patch replaces that action with equivalent bash commands to install
Ninja.

* ci: use smaller ccache artifacts to reduce evictions

Over time, our ccache sizes have grown quite large (some as large as
1.3 GB), which results in us routinely exceeding GitHub's limits, thus
triggering frequent cache evictions.  As a result, cache downloads and
uploads take unnecessary long, in addition to fewer cache entries being
available.

Based on experiments on a clean cache state, it appears that we need
less than 300 MB of (compressed) ccache artifacts for each build type.
Anything larger than that will accrue changes from the past that aren't
needed.

To alleviate the cache burden, this patch sets the maximum ccache size
to be 300 MB.  This change should not affect the success or failure of
our builds.  I will monitor the build times to check whether this change
causes any performance degradation.

* ci: use consistent platform identifiers

Prior to this patch, some of our builds ran on `ubuntu-latest`, while
some others ran on `ubuntu-20.04` and others ran on `ubuntu-22.04`, with
similar situations for macOS and windows.  This patch instead sets all
Linux builds to run on `ubuntu-latest`, all macOS builds to run on
`macos-latest`, and all Windows builds to run on `windows-latest`, to
make debugging future CI failures a little easier.
2022-11-02 21:37:01 -05:00
Ashay Rane 031d127940
ci: introduce read-only and read-write PyTorch build caches (#1546)
Until recently, we had to either risk feature branches creating PyTorch
build caches (which were unusable by the main branch or other parallel
feature branches because of GitHub's rules around sharing caches among
branches) or we had to limit the PyTorch build caches to only the main
branch, causing CI runs on feature branches to be terribly slow because
they had to rebuild PyTorch each time.

This patch enables the best of both worlds, by using a fork
(github.com/ashay/cache) of the GitHub's cache action, where the fork
adds an option (called `save`) which, when set, uploads a new cache
entry.  We thus set this `save` flag only when we're building PyTorch
from source in Torch-MLIR's main branch, whereas all other builds set
this `save` flag to `false`.

The ability to conditionally update the cache has been an oft-requested
feature on the original (github.com/actions/cache) repository and
multiple unmerged PRs exist to allow conditional cache updates, so it is
likely that using the fork is only a temporary solution.
2022-11-01 23:26:17 -07:00
Ashay Rane a8970101dc
pytorch: rename pytorch-version.txt to pytorch-hash.txt (#1541)
This patch is part of a larger set of improvements to the CI/build
system.  In the code, we refer to the version as the string that
contains the release identifier such as 1.14.0.dev20221028, so calling
the file that contains the commit hash as pytorch-version.txt creates
confusion.  For the sake of simplicity, this patch renames that file to
be pytorch-hash.txt.
2022-10-31 22:03:05 -05:00
Ashay Rane 2cf1092d4d
ci: restrict PyTorch cache to just the main branch (#1540)
If PyTorch build caches are created on a branch other than the main
branch, then GitHub does not share those caches with the main branch,
making every CI run that runs for each PR slow.  This patch resolves the
problem by letting only the main branch create and use PyTorch build
caches.
2022-10-31 15:14:53 -05:00
powderluv bbde4e163f
Add Windows Builder (#1521)
Add a powershell script to build windows .whl packages
Disable LTC as it doesn't build on Windows.
Add GHA hooks
Use Python 3.10.8
2022-10-25 16:13:31 -07:00
Ashay Rane a9942f343a
Cache PyTorch source builds to reduce CI time (#1500)
* ci: cache PyTorch source builds

This patch reduces the time spent in regular CI builds by caching
PyTorch source builds.  Specifically, this patch:

1. Makes CI lookup the cache entry for the PyTorch commit hash in
   pytorch-version.txt
2. If lookup was successful, CI fetches the previously-generated WHL
   file into the build_tools/python/wheelhouse directory
3. CI sets the `TM_PYTORCH_INSTALL_WITHOUT_REBUILD` variable to `true`
4. The build_libtorch.sh script then uses the downloaded WHL file
   instead of rebuilding PyTorch

* ci: warm up PyTorch source cache during daily RollPyTorch action

This patch makes the RollPyTorch action write the updated WHL file to
the cache, so that it can be later retrieved by CI that runs for each
PR.  We deliberately add the caching step to the end of the action since
the RollPyTorch action never needs to read from the cache, although
executing this step earlier in the process should not cause problems
either.
2022-10-18 00:42:42 -05:00
Ashay Rane 8a8e779529
Disable auto-update of PyTorch version until CI script stabilizes (#1456)
Instead of letting the auto-update script either fail because of script
errors or letting it commit bad versions, this patch makes the update
process manual, for now.  Once the script stabilizes, I will its
re-enable periodic execution.
2022-10-04 03:02:44 -05:00
powderluv e6528f701a
Move CIs to use docker builds (#1316)
* Move CIs to use docker builds

Now that #1234 has landed and anyone can run CI / Release builds locally move GHA to use the same flow.

* update names

* Update comments
2022-09-02 18:35:40 -07:00
Sean Silva e16b43e20b Remove "torchscript" association from the e2e framework.
We use it for more than TorchScript testing now. This is a purely
mechanical change to adjust some file paths to remove "torchscript".

The most perceptible change here is that now e2e tests are run with

```
./tools/e2e_test.sh
instead of:
./tools/torchscript_e2e_test.sh
```
2022-08-29 14:10:03 -07:00
Sean Silva bcccf41d96 Add CI for generated files.
This ensures that they are always up to date.

This also updates the shape lib to make the new CI actually pass :)
2022-08-29 12:07:16 -07:00
powderluv c0630da678
Disable LTC by default until upstream revert relands (#1303)
* Disable LTC by default until upstream revert relands

Tracked with the WIP https://github.com/llvm/torch-mlir/pull/1292

* Disable LTC e2e tests temporarily

* Update setup.py

Disable LTC in setup.py temporarily until upstream is fixed.
2022-08-28 19:11:40 -07:00
Tanyo Kwok 2374098d71
[MHLO] Init end to end unit tests (#1223) 2022-08-23 16:47:21 +08:00
Henry Tu ba17a4d6c0
Reenable LTC in out-of-tree build (for real this time) (#1205)
* Fix OOT LTC CI build failure

* Disable LTC during macOS package gen

* Add more details about static TorchMLIRJITIRImporter library
2022-08-19 15:25:00 -04:00
Ashay Rane 606f4d2c0e
build: streamline options for enabling LTC and MHLO (#1221) 2022-08-12 23:49:28 -07:00
Sambhav Jain 34478ab1c7
[Build] Add concurrency groups to address long queue times (#1219)
We're seeing large CI queue times ([example](https://discord.com/channels/636084430946959380/742573221882364009/1007631811184164944)) especially with MacOS VMs on GHA. Part of the problem is follow-on commits to the same branch which trigger new runs while the previous runs are still in-progress, hogging on the scarce VMs.

This PR adds concurrency groups to the GHA workflow which ensures that only a single job or workflow using the same concurrency group will run at a time. This would cancel any in-progress jobs in the same github workflow and github ref (e.g. `refs/heads/main` or `refs/pull/<pr_number>/merge`).

As discussed on discord [thread](https://discord.com/channels/636084430946959380/1007787336848912386/1007787338895740928), once this lands we may have to closely monitor the workflows to see this didn't introduce unintended consequences. If so, we could either revert, or decide to selectively cancel particular runs (e.g. macos only which is the main bottleneck right now) instead of entire workflow.

This will also require some expectation management. As in, if you see an  on the main branch, it may not necessarily mean things broke, it could mean the run was killed by a more recent run. Making it a bit harder to traceback a failure to a commit in a sequence of commits (requiring to run those builds again).

Thanks @powderluv for the proposal and pointer to this! It should help with the scarce VMs on GHA and save on queue time. 

References:
* https://docs.github.com/en/actions/using-jobs/using-concurrency#example-only-cancel-in-progress-jobs-or-runs-for-the-current-workflow
* https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#example-only-cancel-in-progress-jobs-or-runs-for-the-current-workflow
2022-08-12 17:38:48 -07:00
Ashay Rane 1581d6a84c
build: fix typo in path (#1218)
When we renamed the directory containing submodules from `external` to
`externals`, we accidentally left the original name in the Github
workflow.  This patch fixes the problem.
2022-08-12 15:38:25 -07:00
Sambhav Jain aed0ec3a2c
Merge matrix runs to fail fast globally (#1216)
My earlier[ PR](https://github.com/llvm/torch-mlir/pull/1213) had (among other things) decoupled ubuntu and macos builds into separate matrix runs. This is not working well due to limited number of MacOS GHA VMs causing long queue times and backlog. There are two reasons causing this backlog: 

1. macos arm64 builds with pytorch source are getting erratically cancelled due to resource / network constraints. This is addressed with this: https://github.com/llvm/torch-mlir/pull/1215

> "macos-arm64 (in-tree, OFF) The hosted runner: GitHub Actions 3 lost communication with the server. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error."

2. macos runs don't fail-fast when ubuntu runs fail due to being in separate matrix setups. This PR couples them again.
2022-08-12 11:30:09 -07:00
Sambhav Jain b8bd0a46cc
use pytorch binary for macos-arm64 builds (#1215) 2022-08-12 06:33:57 -07:00
Sambhav Jain f00ca91db0
Simplify matrix configuration for CI workflows (#1213)
Addresses https://github.com/llvm/torch-mlir/issues/1207. 

#### Provisioned jobs:
```
# ubuntu - x86_64 - llvm in-tree     - pytorch binary - build+test    # most used dev flow and fastest signal
# ubuntu - x86_64 - llvm out-of-tree - pytorch source - build+test    # most elaborate build
# macos  - arm64  - llvm in-tree     - pytorch source - build only    # cross compile, can't test arm64
```

#### Main changes
- [x] Spawn macos builds from a separate matrix (in the same workflow). It made sense to do this as they are fairly different from ubuntu (cross compile, use a different cmake configuration). This simplifies the matrix configuration and exclusions quite a bit, and makes the workflow a bit more tractable and maintenance friendly.
- [x] Remove the submodule md5sum step for ccache config. This was [broken](https://github.com/llvm/torch-mlir/runs/7779288734?check_suite_focus=true#step:3:145) for a while now.
- [x] Removes unused matrix options - `os`, `targetarch`, `python-version`, `llvmtype`.
- [x] Address ZSTD [comment](https://github.com/llvm/torch-mlir/pull/1204#discussion_r942349282) on @powderluv's cross compile [PR](https://github.com/llvm/torch-mlir/pull/1204). 

#### Further improvements (to be addressed in follow-on):
* ubuntu-x86_64 out-of-tree integration tests fail ([error](https://github.com/sjain-stanford/torch-mlir/runs/7781264029?check_suite_focus=true)); only run unit tests for now (tests are excluded in current CI too)

#### Passing workflow:
https://github.com/sjain-stanford/torch-mlir/actions/runs/2840676309
![image](https://user-images.githubusercontent.com/19234106/184194535-f3807991-401a-4cb9-b030-0ee8c334eba3.png)
2022-08-11 16:35:15 -07:00
powderluv 2342456356
mac m1 cross compile (#1204)
* mac m1 cross compile

Add support for M1 cross compile

* Remove redundant ExecutionEngine

It is registered as part of RegisterEverything

* nuke non-universal zstd

disable LTC
2022-08-10 08:48:39 -07:00
powderluv 9cf0b6e8ff
Disable out-of-tree and PyTorch binary (#1206) 2022-08-09 18:18:12 -07:00
Sambhav Jain b696362b7d
Enable OOT builds in CI (#1188) 2022-08-09 12:13:16 -07:00
Henry Tu 3e97a33c80
Revert "Reenable LTC in out-of-tree build (#1177)" (#1183)
This reverts commit f85ae9c685.
2022-08-08 18:58:35 -07:00
Henry Tu f85ae9c685
Reenable LTC in out-of-tree build (#1177) 2022-08-08 17:35:22 -04:00
Henry Tu e322f6a878
Update LTC CMake hack documentation (#1155)
* Update CMakeLists.txt

* Update CMakeLists.txt

* Update CMakeLists.txt

* Update CMakeLists.txt

* Update buildAndTest.yml

* Update setup.py

* Address review comments
2022-08-05 14:12:20 -04:00
powderluv 37a229cffc
Update buildAndTest.yml (#1145) 2022-08-03 12:50:54 -07:00
powderluv 0d25b6f10e
Fix cache-suffix name bug (#1138)
This should enabling better caching of builds.
2022-08-03 07:53:01 -07:00
Henry Tu 2c3b3606d0 Resolve remaining LTC CI failures (#1110)
* Replace CHECK_EQ with TORCH_CHECK_EQ

* Check value of TORCH_MLIR_USE_INSTALLED_PYTORCH during LTC build

* Update LTC XFAIL with NewZerosModule ops

* Explicitly blacklist _like ops

* Automatically blacklist new_/_like ops

* Prune away unused Python dependencies from LTC

* Add flag to disable LTC

* Autogen dummy _REFERENCE_LAZY_BACKEND library when LTC is disabled

* Implement compute_shape_var

* Removed Var tests from XFAIL Set

* XFAIL tests using _local_scalar_dense or index.Tensor

* Add StdDim tests to XFAIL set

* Autogen aten::cat
2022-07-30 09:40:02 -04:00
Antonio Kim de6c135dc3 Fix LTC autogen for CI with nightly PyTorch
- Update llvm-project pin to match main
2022-07-30 09:40:02 -04:00