From 4a776be1560ec3127aae681e36325b547c270314 Mon Sep 17 00:00:00 2001
From: Ashay Rane
Date: Thu, 20 Oct 2022 08:50:18 -0500
Subject: [PATCH] build: make PyTorch caching more robust (#1510)

Whether or not the PyTorch build is cached should not affect the success
of the Torch-MLIR build, but based on the existing code, a build may
fail if the `TM_PYTORCH_INSTALL_WITHOUT_REBUILD` variable was set but
the build cache doesn't exist. Although that variable is set by CI upon
a cache hit, nuances of GitHub's caching behavior can create situations
where the coupling between `TM_PYTORCH_INSTALL_WITHOUT_REBUILD` and the
cache lookup fails.

Specifically, a branch other than our default branch (`main`) may create
the cache entry, but because GitHub doesn't share this cache entry with
builds running on the `main` branch, the `main` branch build tries to
create its own cache entry. However, since cache identifiers are unique
and because caches are immutable, the caching step running in the `main`
branch appears to create an invalid cache entry (of 233 bytes, instead
of the expected ~60 MB). Consequently, subsequent builds observe a cache
"hit", since caches created by the `main` branch are shared with all
other branches, but because this cache entry is invalid (since it
doesn't actually contain the ~60 MB PyTorch WHL file), the builds fail.

One workaround would be to let only the `main` branch create caches, but
in doing so, we would also prevent other branches from _reading_ the
cache, making the builds in those branches terribly slow. So this patch
uses a different workaround, which is to check whether the PyTorch WHL
file exists, even if the build observed a cache hit. If the file doesn't
exist, even if it was a purported cache hit, the code builds PyTorch
from source, which is probably the intuitive behavior.

A longer-term fix will follow, after a discussion with the wider team.
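For illustration, the decision described above can be sketched as a
small standalone Bash function. This is only a sketch, not the code in
`build_libtorch.sh`: the helper name `needs_rebuild`, its two
positional arguments, and the demo wheel file name are hypothetical
and exist only to make the logic easy to try in isolation.

```bash
#!/usr/bin/env bash
# Sketch only: needs_rebuild is a hypothetical helper, not part of
# build_libtorch.sh. It returns 0 (true) when PyTorch should be built
# from source and 1 (false) when a cached wheel can be reused.
needs_rebuild() {
  local install_without_rebuild="$1"  # value of TM_PYTORCH_INSTALL_WITHOUT_REBUILD
  local wheelhouse="$2"               # directory expected to hold the cached wheel

  # compgen -G succeeds only if the glob matches at least one existing path.
  if [[ "$install_without_rebuild" == "true" ]] \
      && compgen -G "$wheelhouse/*.whl" > /dev/null; then
    return 1  # cache hit and the wheel really exists: skip the rebuild
  fi
  return 0    # no cache hit, or a "hit" without a wheel: build from source
}

# Demo in a throwaway directory (the wheel file name is made up).
demo_dir="$(mktemp -d)"
needs_rebuild "true" "$demo_dir" && echo "rebuild"  # hit reported, no wheel
touch "$demo_dir/torch-demo.whl"
needs_rebuild "true" "$demo_dir" || echo "reuse"    # hit reported, wheel present
needs_rebuild "" "$demo_dir" && echo "rebuild"      # no hit reported
rm -rf "$demo_dir"
```

Checking for the file itself, rather than trusting the cache-hit flag
alone, means a hollow cache entry falls back to the slow path instead
of failing the build.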
---
 build_tools/build_libtorch.sh | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/build_tools/build_libtorch.sh b/build_tools/build_libtorch.sh
index d8114cbe8..ea126d5b0 100755
--- a/build_tools/build_libtorch.sh
+++ b/build_tools/build_libtorch.sh
@@ -152,7 +152,9 @@ unpack_pytorch() {
 
 #main
 echo "Building libtorch from source"
-if [[ $TM_PYTORCH_INSTALL_WITHOUT_REBUILD != "true" ]]; then
+wheel_exists=true
+compgen -G "$WHEELHOUSE/*.whl" > /dev/null || wheel_exists=false
+if [[ $TM_PYTORCH_INSTALL_WITHOUT_REBUILD != "true" || ${wheel_exists} == "false" ]]; then
   checkout_pytorch
   install_requirements
   build_pytorch