torch-mlir/build_tools
Ashay Rane 4a776be156
build: make PyTorch caching more robust (#1510)
Whether or not the PyTorch build is cached should not affect the success
of the Torch-MLIR build, but based on the existing code, a build may
fail if the `TM_PYTORCH_INSTALL_WITHOUT_REBUILD` variable was set but
the build cache doesn't exist.

Although that variable is set by CI upon a cache hit, nuances of
GitHub's caching behavior can create situations where the coupling
between `TM_PYTORCH_INSTALL_WITHOUT_REBUILD` and the cache lookup fails.

Specifically, a branch other than our default branch (`main`) may create
the cache entry, but because GitHub doesn't share this cache entry with
builds running on the `main` branch, the `main` branch build tries to
create its own cache entry.  However, since cache identifiers are
unique and because caches are immutable, the caching step running in the
`main` branch appears to create an invalid cache entry (of 233 bytes,
instead of the expected ~60 MB).

Consequently, subsequent builds observe a cache "hit", since caches
created by the `main` branch are shared with all other branches, but
because this cache entry is invalid (since it doesn't actually contain
the ~60 MB PyTorch WHL file), the builds fail.

One workaround would be to let only the `main` branch create caches, but
in doing so, we would also prevent other branches from _reading_ the
cache, making the builds in those branches terribly slow.

So this patch uses a different workaround, which is to check whether the
PyTorch WHL file exists, even if the build observed a cache hit.  If the
file doesn't exist, even if it was a purported cache hit, the code
builds PyTorch from source, which is probably intuitive.
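
The check described above can be sketched roughly as follows. This is an
illustrative sketch, not the code from the patch: the function names, the
`torch-*.whl` filename pattern, and the assumption that
`TM_PYTORCH_INSTALL_WITHOUT_REBUILD` is set to the string `true` on a cache
hit are all hypothetical.

```shell
# Hypothetical helper: succeeds if at least one PyTorch wheel file
# is present in the directory passed as $1.
pytorch_wheel_exists() {
  for wheel in "$1"/torch-*.whl; do
    # If the glob matches nothing it stays literal, and -f fails.
    [ -f "$wheel" ] && return 0
  done
  return 1
}

# Hypothetical helper: prints "install" only when CI reported a cache
# hit AND the wheel is really there; otherwise prints "build".
pytorch_action() {
  if [ "${TM_PYTORCH_INSTALL_WITHOUT_REBUILD:-false}" = "true" ] \
      && pytorch_wheel_exists "$1"; then
    echo "install"
  else
    echo "build"
  fi
}
```

With this shape, a purported cache hit that restored no wheel file falls
through to the source build instead of failing the Torch-MLIR build.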

A longer term fix will follow, after a discussion with the wider team.
2022-10-20 08:50:18 -05:00
docker Dockerize CI + Release builds (#1234) 2022-08-30 11:07:25 -07:00
python_deploy ci: use the LLVM linker instead of GNU ld (#1501) 2022-10-18 00:43:04 -05:00
autogen_ltc_backend.py Fix symint ops and blacklist `lift_fresh_copy` (#1373) 2022-09-20 10:16:04 -04:00
autogen_ltc_backend.yaml New ops support & enhancements (#1494) 2022-10-14 10:28:21 -04:00
build_libtorch.sh build: make PyTorch caching more robust (#1510) 2022-10-20 08:50:18 -05:00
build_python_wheels.sh build: improve robustness of cmake and shell scripts (#1018) 2022-07-06 14:39:30 -07:00
build_standalone.sh build: improve robustness of cmake and shell scripts (#1018) 2022-07-06 14:39:30 -07:00
scrape_releases.py Create github action for creating pip-compatible releases index 2022-09-23 15:26:19 -04:00
update_shape_lib.sh Slightly tweak generated file checks 2022-08-31 20:03:25 -07:00
update_torch_ods.sh Slightly tweak generated file checks 2022-08-31 20:03:25 -07:00
write_env_file.sh build: improve robustness of cmake and shell scripts (#1018) 2022-07-06 14:39:30 -07:00