mirror of https://github.com/llvm/torch-mlir
4a776be156
Whether or not the PyTorch build is cached should not affect the success of the Torch-MLIR build, but with the existing code, a build may fail if the `TM_PYTORCH_INSTALL_WITHOUT_REBUILD` variable is set but the build cache doesn't exist. Although CI sets that variable upon a cache hit, nuances of GitHub's caching behavior can break the coupling between `TM_PYTORCH_INSTALL_WITHOUT_REBUILD` and the cache lookup. Specifically, a branch other than our default branch (`main`) may create the cache entry, but because GitHub doesn't share this cache entry with builds running on the `main` branch, the `main` branch build tries to create its own cache entry. However, since cache identifiers are unique and caches are immutable, the caching step running on the `main` branch appears to create an invalid cache entry (of 233 bytes, instead of the expected ~60 MB). Consequently, subsequent builds observe a cache "hit", since caches created by the `main` branch are shared with all other branches, but because this cache entry is invalid (it doesn't actually contain the ~60 MB PyTorch WHL file), those builds fail.

One workaround would be to let only the `main` branch create caches, but that would also prevent other branches from _reading_ the cache, making builds on those branches terribly slow. This patch therefore uses a different workaround: check whether the PyTorch WHL file actually exists, even when the build observed a cache hit. If the file doesn't exist, despite the purported cache hit, the code builds PyTorch from source, which is the intuitive fallback. A longer term fix will follow, after a discussion with the wider team.
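As a rough illustration of the check described above, here is a minimal shell sketch. The `TORCH_WHEEL_DIR` path, the `true` sentinel value, and the fallback to `build_libtorch.sh` are assumptions for illustration, not necessarily what the repository's build scripts actually use:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical location of the cached PyTorch wheel; the real build
# scripts may use a different directory and variable name.
TORCH_WHEEL_DIR="${TM_PYTORCH_CACHE_DIR:-wheelhouse}"

# Even if CI reported a cache hit (and set TM_PYTORCH_INSTALL_WITHOUT_REBUILD),
# verify that the wheel file is actually present before skipping the rebuild.
if [[ "${TM_PYTORCH_INSTALL_WITHOUT_REBUILD:-}" == "true" ]] \
    && compgen -G "${TORCH_WHEEL_DIR}/torch-*.whl" > /dev/null; then
  echo "Found cached PyTorch wheel; skipping rebuild."
else
  echo "No usable PyTorch wheel in cache; building from source."
  # Assumed entry point for the source build; the actual script
  # invoked by the patch may differ.
  ./build_tools/build_libtorch.sh
fi
```

The key point is that the existence check, not the cache-hit signal alone, decides whether the rebuild is skipped, so an invalid (truncated) cache entry degrades to a slow build rather than a failed one.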
docker/
python_deploy/
autogen_ltc_backend.py
autogen_ltc_backend.yaml
build_libtorch.sh
build_python_wheels.sh
build_standalone.sh
scrape_releases.py
update_shape_lib.sh
update_torch_ods.sh
write_env_file.sh