torch-mlir

Commit Graph

Author	SHA1	Message	Date
Ashay Rane	a897c49803	CI: miscellaneous fixes for Release builds (#1781 ) - Use v3 of actions/checkout, since the version we use (v2) uses Node.js 12, which is deprecated by GitHub. - Source the PowerShell venv script (instead of the bash script) since the calling script is a PowerShell script. Without this, the build doesn't use venv at all. - Make the build dependencies in whl-requirements.txt (used by setup.py) match those in requirements.txt. To that end, this patch creates a build-requirements.txt that is referenced by requirements.txt and whl-requirements.txt.	2023-01-06 20:41:43 -06:00
Ashay Rane	f6b6069a34	ci: post comment on RollPyTorch tracker issue upon build failure (#1730 ) Now that the RollPyTorch tracker issue exists, we can automate the job of notifying folks of failures instead of having to do it manually. This patch adds a step to the workflow to post such a message.	2022-12-18 13:45:30 -06:00
powderluv	cd90c0aaf5	Update buildAndTest.yml (#1723 )	2022-12-15 05:42:01 -08:00
Ashay Rane	64f9a0e978	ci: print ccache statistics and configuration at end of CI run (#1719 ) There appear to be two problems with the caching layer in our CI runs: (a) the sizes of some of the caches have grown to multiples of the 300 MB limit and (b) caching on Windows seems to provide little to no benefit. To help understand the reasons for these problems, this patch adds a line item to the list of steps run in CI to dump the ccache configuration and statistics just prior to uploading the cache artifact.	2022-12-14 09:50:43 -06:00
Ashay Rane	731c313231	ci: run `git pull` before committing pytorch version updates (#1716 ) The RollPyTorch action often takes more than 1.5 hours to finish. During this time, if another PR is merged, then the RollPyTorch action needs to first pull the merged changes before committing the updates to the PyTorch commit hash and version files. This patch adds the required `git pull` statement, without which, the subsequent `git push` statement fails, causing the RollPyTorch action to fail as well.	2022-12-13 13:41:41 -06:00
Daniel Ellis	07a65961dd	Disable pypi publishing. See https://github.com/llvm/torch-mlir/issues/1709	2022-12-13 11:45:41 -05:00
Ramiro Leal-Cavazos	a710237437	[custom op] Generalize shape library logic to work with dtypes (#1594 ) * [custom op] Generalize shape library logic to work with dtypes This commit generalizes the shape library logic, so that dtype rules for ops can also be expressed using the same mechanism. In other words, each op can now have a shape function and a dtype function specified in Python that is imported during lowering to calculate the shapes and dtypes throughout a program. For more information about how to specify a dtype function, see the updated `docs/adding_a_shape_and_dtype_function.md`. For those not familiar with how the shape library works, the file `docs/calculations_lib.md` provides an overview.	2022-12-13 08:25:41 -08:00
Sambhav Jain	109c91ae9b	[CI] Verify bazel buildifier is run and changes committed (#1700 ) Ensures the buildifier (linter for bazel build files) is run and changes are pushed.	2022-12-08 15:56:57 -08:00
Daniel Ellis	98d80a642a	Publish releases to PyPI after build	2022-12-07 10:01:55 -05:00
Ashay Rane	b43965d8d3	build: fetch PyTorch version using downloaded WHL file (#1632 ) Until recently, the metadata file in the torchvision package included the nightly version of the torch package, but since that is no longer the case, our RollPyTorch workflow is broken. As a workaround, this patch uses the `pip download` command's ability to fetch the dependent torch package for the specified version of torchvision, before peeking into the WHL file for the torch package to determine the release version and the commit hash.	2022-11-23 13:54:54 -06:00
Ashay Rane	4eead74232	ci: delay RollPyTorch action by 1 hour to use latest torchvision package (#1603 ) The upload timestamp of the nightly torchvision package has drifted beyond the scheduled time of the RollPyTorch action because of the time change due to daylight saving. As a result, the RollPyTorch action now picks the torchvision package from a day earlier instead of the most recent package. This patch schedules the RollPyTorch action to start one hour later than before so that it continues to pick the most recent nightly package.	2022-11-23 11:31:02 -06:00
Sambhav Jain	ba5b90ee27	Enable bazel LIT tests in CI (#1596 ) Bazel LIT test support was added in https://github.com/llvm/torch-mlir/pull/1585. This PR enables the tests in CI. ``` INFO: Build completed successfully, 254 total actions @torch-mlir//test/Conversion:TorchToArith/basic.mlir.test PASSED in 0.3s @torch-mlir//test/Conversion:TorchToLinalg/basic.mlir.test PASSED in 0.5s @torch-mlir//test/Conversion:TorchToLinalg/elementwise.mlir.test PASSED in 0.3s @torch-mlir//test/Conversion:TorchToLinalg/flatten.mlir.test PASSED in 0.3s @torch-mlir//test/Conversion:TorchToLinalg/pooling.mlir.test PASSED in 0.3s @torch-mlir//test/Conversion:TorchToLinalg/unsqueeze.mlir.test PASSED in 0.2s @torch-mlir//test/Conversion:TorchToLinalg/view.mlir.test PASSED in 0.3s @torch-mlir//test/Conversion:TorchToMhlo/basic.mlir.test PASSED in 0.5s @torch-mlir//test/Conversion:TorchToMhlo/elementwise.mlir.test PASSED in 0.9s @torch-mlir//test/Conversion:TorchToMhlo/gather.mlir.test PASSED in 0.3s @torch-mlir//test/Conversion:TorchToMhlo/linear.mlir.test PASSED in 0.6s @torch-mlir//test/Conversion:TorchToMhlo/pooling.mlir.test PASSED in 0.3s @torch-mlir//test/Conversion:TorchToMhlo/reduction.mlir.test PASSED in 0.4s @torch-mlir//test/Conversion:TorchToMhlo/view_like.mlir.test PASSED in 0.6s @torch-mlir//test/Conversion:TorchToSCF/basic.mlir.test PASSED in 0.2s @torch-mlir//test/Conversion:TorchToTosa/basic.mlir.test PASSED in 1.1s @torch-mlir//test/Dialect:Torch/GlobalizeObjectGraph/basic.mlir.test PASSED in 0.2s @torch-mlir//test/Dialect:Torch/GlobalizeObjectGraph/error.mlir.test PASSED in 0.2s @torch-mlir//test/Dialect:Torch/GlobalizeObjectGraph/free-functions.mlir.test PASSED in 0.2s @torch-mlir//test/Dialect:Torch/GlobalizeObjectGraph/initializers.mlir.test PASSED in 0.2s @torch-mlir//test/Dialect:Torch/GlobalizeObjectGraph/methods.mlir.test PASSED in 0.2s @torch-mlir//test/Dialect:Torch/GlobalizeObjectGraph/module-uses-error.mlir.test PASSED in 0.2s 
@torch-mlir//test/Dialect:Torch/GlobalizeObjectGraph/module-uses.mlir.test PASSED in 0.2s @torch-mlir//test/Dialect:Torch/GlobalizeObjectGraph/multiple-instances-error.mlir.test PASSED in 0.2s @torch-mlir//test/Dialect:Torch/GlobalizeObjectGraph/multiple-instances-multiple-module-args.mlir.test PASSED in 0.2s @torch-mlir//test/Dialect:Torch/GlobalizeObjectGraph/multiple-instances.mlir.test PASSED in 0.2s @torch-mlir//test/Dialect:Torch/GlobalizeObjectGraph/submodules.mlir.test PASSED in 0.3s @torch-mlir//test/Dialect:Torch/GlobalizeObjectGraph/visibility.mlir.test PASSED in 0.2s @torch-mlir//test/Dialect:Torch/adjust-calling-conventions.mlir.test PASSED in 0.2s @torch-mlir//test/Dialect:Torch/canonicalize.mlir.test PASSED in 0.4s @torch-mlir//test/Dialect:Torch/decompose-complex-ops-legal.mlir.test PASSED in 0.2s @torch-mlir//test/Dialect:Torch/decompose-complex-ops.mlir.test PASSED in 0.9s @torch-mlir//test/Dialect:Torch/drop-shape-calculations.mlir.test PASSED in 0.4s @torch-mlir//test/Dialect:Torch/erase-module-initializer.mlir.test PASSED in 0.2s @torch-mlir//test/Dialect:Torch/inline-global-slots-analysis.mlir.test PASSED in 0.3s @torch-mlir//test/Dialect:Torch/inline-global-slots-transform.mlir.test PASSED in 0.2s @torch-mlir//test/Dialect:Torch/invalid.mlir.test PASSED in 0.4s @torch-mlir//test/Dialect:Torch/lower-to-backend-contract-error.mlir.test PASSED in 17.3s @torch-mlir//test/Dialect:Torch/maximize-value-semantics.mlir.test PASSED in 0.3s @torch-mlir//test/Dialect:Torch/ops.mlir.test PASSED in 0.3s @torch-mlir//test/Dialect:Torch/prepare-for-globalize-object-graph.mlir.test PASSED in 0.2s @torch-mlir//test/Dialect:Torch/promote-types.mlir.test PASSED in 0.3s @torch-mlir//test/Dialect:Torch/reduce-op-variants-error.mlir.test PASSED in 0.2s @torch-mlir//test/Dialect:Torch/reduce-op-variants.mlir.test PASSED in 0.2s @torch-mlir//test/Dialect:Torch/refine-public-return.mlir.test PASSED in 0.2s @torch-mlir//test/Dialect:Torch/refine-types-branch.mlir.test 
PASSED in 0.3s @torch-mlir//test/Dialect:Torch/refine-types-ops.mlir.test PASSED in 0.6s @torch-mlir//test/Dialect:Torch/refine-types.mlir.test PASSED in 0.4s @torch-mlir//test/Dialect:Torch/reify-shape-calculations.mlir.test PASSED in 2.9s @torch-mlir//test/Dialect:Torch/simplify-shape-calculations.mlir.test PASSED in 0.3s @torch-mlir//test/Dialect:Torch/torch-function-to-torch-backend-pipeline.mlir.test PASSED in 0.6s @torch-mlir//test/Dialect:TorchConversion/canonicalize.mlir.test PASSED in 0.2s @torch-mlir//test/Dialect:TorchConversion/finalizing-backend-type-conversion.mlir.test PASSED in 0.3s @torch-mlir//test/Dialect:TorchConversion/func-backend-type-conversion.mlir.test PASSED in 0.2s @torch-mlir//test/Dialect:TorchConversion/ops.mlir.test PASSED in 0.3s @torch-mlir//test/Dialect:TorchConversion/verify-linalg-on-tensors-backend-contract.mlir.test PASSED in 0.3s @torch-mlir//test/Dialect:TorchConversion/verify-tosa-backend-contract.mlir.test PASSED in 0.2s @torch-mlir//test/RefBackend:insert-rng-globals.mlir.test PASSED in 0.2s INFO: Build completed successfully, 254 total actions @torch-mlir//test/RefBackend:munge-calling-conventions.mlir.test PASSED in 0.2s Executed 59 out of 59 tests: 59 tests pass. ``` GHA workflow: https://github.com/sjain-stanford/torch-mlir/actions/runs/3476816449/jobs/5812368489	2022-11-16 11:59:33 -08:00
Sambhav Jain	4aa1e90b34	Fix cache bug with Bazel builds in CI (#1593 ) Some time ago, bazel builds in CI were being sped up fine with caching. However, over time the cache got stale because `actions/cache@v3` apparently doesn't update caches when it "hits" unless it is configured to do so specifically. This requires using a uniqued per-commit `key` (to force it to update cache after each successful run) and a relaxed `restore-keys` which is not unique per-commit so newer commits can restore from the nearest hit. Test GHA run 1 (no cache hit): [1h 1m 52s](https://github.com/sjain-stanford/torch-mlir/actions/runs/3474770334/usage) Test GHA run 2 (cache hit, same commit): [5m 14s](https://github.com/sjain-stanford/torch-mlir/actions/runs/3475132135/usage) Test GHA run 3 (cache hit, different commit): [6m 6s](https://github.com/sjain-stanford/torch-mlir/actions/runs/3475161009/usage)	2022-11-15 18:48:31 -08:00
Sambhav Jain	99ec6039f6	Fix bazel CI (#1591 ) I accidentally broke bazel CI by forgetting to update the GHA workflow in my [previous PR](https://github.com/llvm/torch-mlir/pull/1587). This should get it back to green, my apologies. Qualifying CI run: https://github.com/sjain-stanford/torch-mlir/actions/runs/3472523982	2022-11-15 09:51:52 -08:00
Ashay Rane	f1ef5681cc	build: pin torchvision to latest nightly (#1584 ) We currently pin the `torch` package to the latest nightly version, but since `torchvision` depends on the `torch` package, the pip resolver then has to run through an extensive list of `torchvision` packages that can be installed with the pinned `torch` package. This search fails in the RollPyTorch action, causing pip to settle on an old version of `torchvision` that does not work with our tests. In reality, we are only interested in a specific version of the `torchvision` package. To make the dependency explicit and to prevent test failures because of incorrect package installations, this patch makes two key changes: 1. `torchvision` is now pinned to the latest nightly release in pytorch-requirements.txt along with the version of `torch` that is necessary to install the requested `torchvision` package 2. The RollPyTorch action now looks for the latest `torchvision` package instead of the latest `torch` package before writing the version numbers for pinning in pytorch-requirements.txt	2022-11-14 15:56:02 -06:00
Ashay Rane	2846776897	ci: enable ccache on Windows (#1548 ) This patch makes a few small, but key, changes to enable ccache on Windows. First, it replaces the hendrikmuhs/ccache-action action with command line invocations to the ccache binary, since the action has two bugs, one of which causes CI to refer to different ccache artifacts before versus after the build on Windows whereas the other bug can sometimes cause the action to incorrectly infer that the cache is empty. Second, this patch slightly alters the cache key, so that our old cache artifacts, which have grown too big, are eventually discarded in favor of the new, smaller cache artifacts. Along the way, this patch also keeps the RollPyTorch's cache artifact separate from the regular build's cache artifact so as to keep these artifacts small, and also because the RollPyTorch action is off the critical path for most contributors. Finally, this patch makes small changes to the CMake file so that on Windows, the ccache binary is added as a prefix, as recommended on the [ccache Wiki](https://github.com/ccache/ccache/wiki/MS-Visual-Studio).	2022-11-03 12:17:22 -05:00
Ashay Rane	f847642495	CI script improvements (#1547 ) * ci: update versions of external actions Node.js 12 actions are deprecated and will eventually go away, so this patch bumps the old actions to their latest versions that use Node.js 16. * ci: replace deprecated action with bash commands The llvm/actions/install-ninja action uses Node.js 12, which is deprecated. Since that action is not updated to work with Node.js 16, this patch replaces that action with equivalent bash commands to install Ninja. * ci: use smaller ccache artifacts to reduce evictions Over time, our ccache sizes have grown quite large (some as large as 1.3 GB), which results in us routinely exceeding GitHub's limits, thus triggering frequent cache evictions. As a result, cache downloads and uploads take unnecessarily long, in addition to fewer cache entries being available. Based on experiments on a clean cache state, it appears that we need less than 300 MB of (compressed) ccache artifacts for each build type. Anything larger than that will accrue changes from the past that aren't needed. To alleviate the cache burden, this patch sets the maximum ccache size to be 300 MB. This change should not affect the success or failure of our builds. I will monitor the build times to check whether this change causes any performance degradation. * ci: use consistent platform identifiers Prior to this patch, some of our builds ran on `ubuntu-latest`, while some others ran on `ubuntu-20.04` and others ran on `ubuntu-22.04`, with similar situations for macOS and Windows. This patch instead sets all Linux builds to run on `ubuntu-latest`, all macOS builds to run on `macos-latest`, and all Windows builds to run on `windows-latest`, to make debugging future CI failures a little easier.	2022-11-02 21:37:01 -05:00
Ashay Rane	031d127940	ci: introduce read-only and read-write PyTorch build caches (#1546 ) Until recently, we had to either risk feature branches creating PyTorch build caches (which were unusable by the main branch or other parallel feature branches because of GitHub's rules around sharing caches among branches) or we had to limit the PyTorch build caches to only the main branch, causing CI runs on feature branches to be terribly slow because they had to rebuild PyTorch each time. This patch enables the best of both worlds, by using a fork (github.com/ashay/cache) of the GitHub's cache action, where the fork adds an option (called `save`) which, when set, uploads a new cache entry. We thus set this `save` flag only when we're building PyTorch from source in Torch-MLIR's main branch, whereas all other builds set this `save` flag to `false`. The ability to conditionally update the cache has been an oft-requested feature on the original (github.com/actions/cache) repository and multiple unmerged PRs exist to allow conditional cache updates, so it is likely that using the fork is only a temporary solution.	2022-11-01 23:26:17 -07:00
powderluv	1a33577860	remove spurious ref in publish pages (#1536 ) We don't need to pass in optional tag information.	2022-11-01 09:42:21 -07:00
Ashay Rane	a8970101dc	pytorch: rename pytorch-version.txt to pytorch-hash.txt (#1541 ) This patch is part of a larger set of improvements to the CI/build system. In the code, we refer to the version as the string that contains the release identifier such as 1.14.0.dev20221028, so calling the file that contains the commit hash as pytorch-version.txt creates confusion. For the sake of simplicity, this patch renames that file to be pytorch-hash.txt.	2022-10-31 22:03:05 -05:00
Ashay Rane	2cf1092d4d	ci: restrict PyTorch cache to just the main branch (#1540 ) If PyTorch build caches are created on a branch other than the main branch, then GitHub does not share those caches with the main branch, making every CI run that runs for each PR slow. This patch resolves the problem by letting only the main branch create and use PyTorch build caches.	2022-10-31 15:14:53 -05:00
powderluv	87ab714ed6	Update buildRelease.yml (#1535 )	2022-10-30 00:14:54 -07:00
powderluv	bbde4e163f	Add Windows Builder (#1521 ) Add a powershell script to build windows .whl packages Disable LTC as it doesn't build on Windows. Add GHA hooks Use Python 3.10.8	2022-10-25 16:13:31 -07:00
Ashay Rane	801452b2f4	ci: make RollPyTorch run only on the Torch-MLIR repo (#1516 )	2022-10-25 17:56:59 -05:00
Ashay Rane	a9942f343a	Cache PyTorch source builds to reduce CI time (#1500 ) * ci: cache PyTorch source builds This patch reduces the time spent in regular CI builds by caching PyTorch source builds. Specifically, this patch: 1. Makes CI lookup the cache entry for the PyTorch commit hash in pytorch-version.txt 2. If lookup was successful, CI fetches the previously-generated WHL file into the build_tools/python/wheelhouse directory 3. CI sets the `TM_PYTORCH_INSTALL_WITHOUT_REBUILD` variable to `true` 4. The build_libtorch.sh script then uses the downloaded WHL file instead of rebuilding PyTorch * ci: warm up PyTorch source cache during daily RollPyTorch action This patch makes the RollPyTorch action write the updated WHL file to the cache, so that it can be later retrieved by CI that runs for each PR. We deliberately add the caching step to the end of the action since the RollPyTorch action never needs to read from the cache, although executing this step earlier in the process should not cause problems either.	2022-10-18 00:42:42 -05:00
Ashay Rane	0374a6da4e	ci: re-enable auto execution of the RollPyTorch action (#1488 ) Now that the RollPyTorch action seems to have become stable, this patch enables that action to be run at around 4am Pacific Time every day.	2022-10-12 19:18:54 -05:00
Daniel Ellis	67a0fb14ef	Fix build release workflow.	2022-10-11 15:19:53 -04:00
Daniel Ellis	2e0d806bf7	Publish releases page after both mac and linux builds. Mac was finishing first, causing linux releases to be lagged a day behind.	2022-10-10 10:37:02 -04:00
Ashay Rane	8a8e779529	Disable auto-update of PyTorch version until CI script stabilizes (#1456 ) Instead of letting the auto-update script either fail because of script errors or letting it commit bad versions, this patch makes the update process manual, for now. Once the script stabilizes, I will re-enable its periodic execution.	2022-10-04 03:02:44 -05:00
Ashay Rane	da02390188	build: update ODS and shape library when updating PyTorch (#1450 ) Updating the PyTorch version may break the Torch-MLIR build, as it did recently, since the PyTorch update caused the shape library to change, but the shape library was not updated in the commit for updating PyTorch. This patch introduces a new default-off environment variable to the build_linux_packages.sh script called `TM_UPDATE_ODS_AND_SHAPE_LIB` which instructs the script to run the update_torch_ods.sh and update_shape_lib.sh scripts. However, running these scripts requires an in-tree build and the tests that run for an in-tree build of Torch-MLIR are more comprehensive than those that run for an out-of-tree build, so this patch also swaps out the out-of-tree build for an in-tree build.	2022-10-02 18:02:34 -05:00
Ashay Rane	005d40f4d7	build: check exit code without causing script to fail (#1447 ) A bug in the CI script caused the entire script to fail if the exit code of the command for comparing with the existing hash returned a non-zero exit status. The non-zero exit status for this comparison does not imply failed execution, since it only indicates that the hash has changed.	2022-10-02 16:04:26 -05:00
Ashay Rane	b3345e69e2	build: miscellaneous performance improvements (#1443 )	2022-09-30 12:47:43 -05:00
Ashay Rane	cf41a2582e	Final changes necessary to auto-update PyTorch version (#1438 ) * build: push directly from CI to main branch This avoids the need to create, approve, and merge a separate PR, in addition to avoiding unnecessary CI runs for the PyTorch version update. * build: schedule cronjob to run RollPyTorch action This patch schedules the RollPyTorch action to be run at noon UTC, which roughly corresponds to 4am Pacific Time. We pick this time since the commit for PyTorch nightly releases is picked just after midnight Pacific Time and the nightly release artifacts are produced in about 2 to 3 hours after the commit is picked.	2022-09-29 17:15:32 -05:00
powderluv	da584fbb73	Update releases pages after release builds (#1432 ) * Update buildRelease.yml Update Releases right after a Release build. * Move gh-page update after release builds This removes the periodic update and updates after a release build.	2022-09-29 12:49:41 -07:00
Ashay Rane	8f608c048d	build: use Github Actions for creating PR (#1433 )	2022-09-29 07:09:16 -05:00
Ashay Rane	53e76b8ab6	build: create RollPyTorch to update PyTorch version in Torch-MLIR (#1419 ) This patch fetches the most recent nightly (binary) build of PyTorch, before pinning it in pytorch-requirements.txt, which is referenced in the top-level requirements.txt file. This way, end users will continue to be able to run `pip -r requirements.txt` without worrying whether doing so will break their Torch-MLIR build. This patch also fetches the git commit hash that corresponds to the nightly release, and this hash is passed to the out-of-tree build so that it can build PyTorch from source. If we were to sort the torch versions as numbers (in the usual descending order), then 1.9 appears before 1.13. To fix this problem, we use the `--version-sort` flag (along with `--reverse` for specifying a descending order). We also filter out lines that don't contain version numbers by only considering lines that start with a digit. As a matter of slight clarity, this patch renames the variable `torch_from_src` to `torch_from_bin`, since that variable is initialized to `TM_USE_PYTORCH_BINARY`. Co-authored-by: powderluv <powderluv@users.noreply.github.com>	2022-09-28 15:38:30 -05:00
Daniel Ellis	1dfe5efe9e	Create github action for creating pip-compatible releases index	2022-09-23 15:26:19 -04:00
powderluv	e6528f701a	Move CIs to use docker builds (#1316 ) * Move CIs to use docker builds Now that #1234 has landed and anyone can run CI / Release builds locally move GHA to use the same flow. * update names * Update comments	2022-09-02 18:35:40 -07:00
powderluv	7769eb88f8	Set ccache logging to verbose temporarily (#1326 ) This is to debug exactly what is causing the ccache lookup failures etc.	2022-08-31 16:09:46 -07:00
Sean Silva	e16b43e20b	Remove "torchscript" association from the e2e framework. We use it for more than TorchScript testing now. This is a purely mechanical change to adjust some file paths to remove "torchscript". The most perceptible change here is that now e2e tests are run with ``` ./tools/e2e_test.sh instead of: ./tools/torchscript_e2e_test.sh ```	2022-08-29 14:10:03 -07:00
Sean Silva	bcccf41d96	Add CI for generated files. This ensures that they are always up to date. This also updates the shape lib to make the new CI actually pass :)	2022-08-29 12:07:16 -07:00
powderluv	c0630da678	Disable LTC by default until upstream revert relands (#1303 ) * Disable LTC by default until upstream revert relands Tracked with the WIP https://github.com/llvm/torch-mlir/pull/1292 * Disable LTC e2e tests temporarily * Update setup.py Disable LTC in setup.py temporarily until upstream is fixed.	2022-08-28 19:11:40 -07:00
Tanyo Kwok	2374098d71	[MHLO] Init end to end unit tests (#1223 )	2022-08-23 16:47:21 +08:00
Henry Tu	ba17a4d6c0	Reenable LTC in out-of-tree build (for real this time) (#1205 ) * Fix OOT LTC CI build failure * Disable LTC during macOS package gen * Add more details about static TorchMLIRJITIRImporter library	2022-08-19 15:25:00 -04:00
Sambhav Jain	114f48e96c	[Bazel] Check cache directory exists before changing owners (#1241 ) This fixes a seeding issue with the [previous PR](https://github.com/llvm/torch-mlir/pull/1240) where bazel build's GHA cache is not present to begin with and one of the commands (chown) fails on it. Should get the Bazel build back to green.	2022-08-17 17:04:50 -07:00
Sambhav Jain	9c8b962720	Dockerize and Cache Bazel {Local, CI} Builds (#1240 ) This PR adds: - A minimal docker wrapper to the bazel GHA workflow to make it reproducible locally - Bazel cache to speed up GHA workflows (down to ~5 minutes from ~40+minutes) This is a no-op for non-bazel workflows and an incremental improvement.	2022-08-17 12:46:17 -07:00
Ashay Rane	606f4d2c0e	build: streamline options for enabling LTC and MHLO (#1221 )	2022-08-12 23:49:28 -07:00
Sambhav Jain	34478ab1c7	[Build] Add concurrency groups to address long queue times (#1219 ) We're seeing large CI queue times ([example](https://discord.com/channels/636084430946959380/742573221882364009/1007631811184164944)) especially with MacOS VMs on GHA. Part of the problem is follow-on commits to the same branch which trigger new runs while the previous runs are still in-progress, hogging on the scarce VMs. This PR adds concurrency groups to the GHA workflow which ensures that only a single job or workflow using the same concurrency group will run at a time. This would cancel any in-progress jobs in the same github workflow and github ref (e.g. `refs/heads/main` or `refs/pull/<pr_number>/merge`). As discussed on discord [thread](https://discord.com/channels/636084430946959380/1007787336848912386/1007787338895740928), once this lands we may have to closely monitor the workflows to see this didn't introduce unintended consequences. If so, we could either revert, or decide to selectively cancel particular runs (e.g. macos only which is the main bottleneck right now) instead of entire workflow. This will also require some expectation management. As in, if you see an ❌ on the main branch, it may not necessarily mean things broke, it could mean the run was killed by a more recent run. Making it a bit harder to traceback a failure to a commit in a sequence of commits (requiring to run those builds again). Thanks @powderluv for the proposal and pointer to this! It should help with the scarce VMs on GHA and save on queue time. References: * https://docs.github.com/en/actions/using-jobs/using-concurrency#example-only-cancel-in-progress-jobs-or-runs-for-the-current-workflow * https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#example-only-cancel-in-progress-jobs-or-runs-for-the-current-workflow	2022-08-12 17:38:48 -07:00
Ashay Rane	1581d6a84c	build: fix typo in path (#1218 ) When we renamed the directory containing submodules from `external` to `externals`, we accidentally left the original name in the Github workflow. This patch fixes the problem.	2022-08-12 15:38:25 -07:00
Sambhav Jain	aed0ec3a2c	Merge matrix runs to fail fast globally (#1216 ) My earlier [PR](https://github.com/llvm/torch-mlir/pull/1213) had (among other things) decoupled ubuntu and macos builds into separate matrix runs. This is not working well due to limited number of MacOS GHA VMs causing long queue times and backlog. There are two reasons causing this backlog: 1. macos arm64 builds with pytorch source are getting erratically cancelled due to resource / network constraints. This is addressed with this: https://github.com/llvm/torch-mlir/pull/1215 > "macos-arm64 (in-tree, OFF) The hosted runner: GitHub Actions 3 lost communication with the server. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error." 2. macos runs don't fail-fast when ubuntu runs fail due to being in separate matrix setups. This PR couples them again.	2022-08-12 11:30:09 -07:00
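
Note on commit 53e76b8ab6 (#1419): the message explains that a naive descending sort of torch versions places 1.9 ahead of 1.13, and that the patch addresses this with a version-aware sort plus a filter that keeps only lines starting with a digit. The Python sketch below illustrates the same idea; it is not the repository's script, which per the message relies on the `--version-sort` and `--reverse` flags of `sort`. The names `version_key` and `latest_version` and the sample inputs are hypothetical, chosen only for this illustration.

```python
import re


def version_key(version: str) -> tuple:
    """Turn a version string into a tuple of integers for comparison.

    Assumption for this sketch: only the numeric runs matter, so
    '1.13.0.dev20221028' becomes (1, 13, 0, 20221028).
    """
    return tuple(int(part) for part in re.findall(r"\d+", version))


def latest_version(lines):
    """Return the newest version among lines that start with a digit."""
    # Drop lines that do not begin with a digit (headers, blank lines, names).
    candidates = [line.strip() for line in lines if line.strip()[:1].isdigit()]
    # Compare numerically per component, newest first.
    return sorted(candidates, key=version_key, reverse=True)[0]


if __name__ == "__main__":
    sample = ["torch", "1.9.0", "1.13.0.dev20221028", ""]
    # A plain string sort compares character by character, so '9' beats '1'.
    print(sorted(["1.9.0", "1.13.0.dev20221028"], reverse=True)[0])
    # The component-wise key compares 13 against 9 as integers instead.
    print(latest_version(sample))
```

The sketch raises an `IndexError` when no line starts with a digit; a real script would need to decide how to handle that case.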


139 Commits (6c07704837eb5ed760ae8b5eb9ef9f82d21df3bb)