* ci: cache PyTorch source builds
This patch reduces the time spent in regular CI builds by caching
PyTorch source builds. Specifically, this patch:
1. Makes CI look up the cache entry for the PyTorch commit hash in
pytorch-version.txt
2. If the lookup succeeds, CI fetches the previously generated WHL
file into the build_tools/python/wheelhouse directory
3. CI sets the `TM_PYTORCH_INSTALL_WITHOUT_REBUILD` variable to `true`
4. The build_libtorch.sh script then uses the downloaded WHL file
instead of rebuilding PyTorch
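The decision in steps 3 and 4 can be sketched roughly as follows. This is
not the actual build_libtorch.sh code: the function name
`choose_pytorch_install`, the `WHEELHOUSE` override, and the `torch-*.whl`
file name pattern are assumptions made for illustration.

```shell
# Hypothetical helper: report whether a cached wheel can be reused.
choose_pytorch_install() {
  # WHEELHOUSE is an illustrative override; the default is the directory
  # named in this commit message.
  wheelhouse="${WHEELHOUSE:-build_tools/python/wheelhouse}"
  # Expand the glob into the function's positional parameters. If nothing
  # matches, "$1" stays the literal pattern and the -e test below fails.
  set -- "$wheelhouse"/torch-*.whl
  if [ "${TM_PYTORCH_INSTALL_WITHOUT_REBUILD:-false}" = "true" ] && [ -e "$1" ]; then
    echo "install-cached-wheel"
  else
    echo "rebuild-from-source"
  fi
}
```

Falling back to a source build when the flag is set but no wheel is present
keeps a cache miss from breaking the build.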
* ci: warm up PyTorch source cache during daily RollPyTorch action
This patch makes the RollPyTorch action write the updated WHL file to
the cache, so that it can be later retrieved by CI that runs for each
PR. We deliberately add the caching step to the end of the action since
the RollPyTorch action never needs to read from the cache, although
executing this step earlier in the process should not cause problems
either.
Instead of letting the auto-update script either fail because of script
errors or commit bad versions, this patch makes the update process
manual for now. Once the script stabilizes, I will re-enable its
periodic execution.
Updating the PyTorch version may break the Torch-MLIR build. This
happened recently when a PyTorch update changed the shape library, but
the commit that updated PyTorch did not include the updated shape
library.
This patch introduces a new default-off environment variable to the
build_linux_packages.sh script called `TM_UPDATE_ODS_AND_SHAPE_LIB`
which instructs the script to run the update_torch_ods.sh and
update_shape_lib.sh scripts.
However, running these scripts requires an in-tree build and the tests
that run for an in-tree build of Torch-MLIR are more comprehensive than
those that run for an out-of-tree build, so this patch also swaps out
the out-of-tree build for an in-tree build.
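A default-off switch like `TM_UPDATE_ODS_AND_SHAPE_LIB` might be wired up
along these lines. The `ON`/`OFF` values and the function name
`maybe_update_generated_files` are assumptions, and the sketch only prints
what it would do instead of invoking the real scripts.

```shell
# Hypothetical sketch of a default-off environment switch.
maybe_update_generated_files() {
  # Unset, empty, or any value other than "ON" leaves the update disabled.
  if [ "${TM_UPDATE_ODS_AND_SHAPE_LIB:-OFF}" = "ON" ]; then
    echo "would run: update_torch_ods.sh update_shape_lib.sh"
  else
    echo "skipping ODS and shape library update"
  fi
}
```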
A bug in the CI script caused the entire script to fail whenever the
command that compares the new hash with the existing one returned a
non-zero exit status. A non-zero exit status from this comparison does
not imply failed execution; it only indicates that the hash has
changed.
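The failure mode is typical of scripts that run under `set -e`. A minimal
sketch of one way to avoid it is shown below, assuming the comparison is
done with `cmp` on two files; the actual command in the CI script may
differ, and `hash_changed` is a hypothetical name.

```shell
set -e

# Hypothetical helper: compare two hash files without tripping `set -e`.
hash_changed() {
  status=0
  # `cmd || status=$?` records the exit status instead of aborting.
  # cmp -s exits 0 when the files match, 1 when they differ, >1 on error.
  cmp -s "$1" "$2" || status=$?
  case "$status" in
    0) echo "unchanged" ;;
    1) echo "changed" ;;
    *) echo "comparison failed" >&2; return "$status" ;;
  esac
}
```

Only an exit status greater than 1 is treated as a real error, so a changed
hash no longer stops the script.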
* build: push directly from CI to main branch
This avoids the need to create, approve, and merge a separate PR, in
addition to avoiding unnecessary CI runs for the PyTorch version update.
* build: schedule cronjob to run RollPyTorch action
This patch schedules the RollPyTorch action to run at noon UTC, which
roughly corresponds to 4am Pacific Time. We pick this time because the
commit for each PyTorch nightly release is picked just after midnight
Pacific Time, and the nightly release artifacts are produced about 2
to 3 hours after the commit is picked.
This patch fetches the most recent nightly (binary) build of PyTorch,
before pinning it in pytorch-requirements.txt, which is referenced in
the top-level requirements.txt file. This way, end users will continue
to be able to run `pip install -r requirements.txt` without worrying
whether doing so will break their Torch-MLIR build.
This patch also fetches the git commit hash that corresponds to the
nightly release, and this hash is passed to the out-of-tree build so
that it can build PyTorch from source.
If we were to sort the torch versions as plain numbers (in the usual
descending order), then 1.9 would appear before 1.13. To fix this
problem, we use the `--version-sort` flag (along with `--reverse` to
specify descending order). We also filter out lines that don't contain
version numbers by only considering lines that start with a digit.
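The effect of the two flags can be reproduced with a small sketch. It
assumes GNU `sort`, and the function name `latest_torch_version` and the
sample input are made up for illustration.

```shell
# Hypothetical helper: pick the newest version from a list on stdin.
latest_torch_version() {
  # Keep only lines that start with a digit, sort them as version
  # strings in descending order, and take the first one.
  grep -E '^[0-9]' | sort --version-sort --reverse | head -n 1
}

# Sample input with a non-version header line; prints 1.13.0
printf '%s\n' 'Name' '1.9.0' '1.13.0' '1.12.1' | latest_torch_version
```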
For clarity, this patch also renames the variable `torch_from_src` to
`torch_from_bin`, since that variable is initialized to
`TM_USE_PYTORCH_BINARY`.
Co-authored-by: powderluv <powderluv@users.noreply.github.com>