We currently pin the `torch` package to the latest nightly version, but
since `torchvision` depends on the `torch` package, the pip resolver
then has to run through an extensive list of `torchvision` packages that
can be installed with the pinned `torch` package. This search fails in
the RollPyTorch action, causing pip to settle on an old version of
`torchvision` that does not work with our tests. In reality, we are
only interested in a specific version of the `torchvision` package.
To make the dependency explicit and to prevent test failures because of
incorrect package installations, this patch makes two key changes:
1. `torchvision` is now pinned to the latest nightly release in
pytorch-requirements.txt along with the version of `torch` that is
necessary to install the requested `torchvision` package
2. The RollPyTorch action now looks for the latest `torchvision` package
instead of the latest `torch` package before writing the version
numbers for pinning in pytorch-requirements.txt
Now that the ninja pip package issue has been resolved, this patch
removes the pinned version from requirements.txt so that we can go back
to using the most recent version of ninja.
Going from ninja v1.10.2 to v1.11.1, there is a change that breaks the
CI builds with the following error:
```
CMake Error at CMakeLists.txt:47 (project):
Running
'/main_checkout/torch-mlir/docker_venv/bin/ninja' '--version'
failed with:
CMake Error: CMAKE_ASM_COMPILER not set, after EnableLanguage
```
Ostensibly, the reason for the error about the ASM compiler is because
llvm-project/llvm/CMakeLists.txt includes ASM among the list of
languages used in the LLVM project. Adding `-DCMAKE_ASM_COMPILER=clang`
does not resolve the error.
Until we figure out why the new version of ninja causes the build
failures, this patch pins the ninja to the one that worked.
* test: allow spaces in path to Python executable
On Windows, the path to the Python binary may contain spaces, so this
patch adds quotes around the path to the python executable.
Thanks to @sstamenova for suggesting the fix!
* python: remove header file that causes Windows build failures
Similar to https://reviews.llvm.org/D125284, we can safely remove this
header file without affecting the build on either Linux. It is
necessary to remove this header file on Windows builds since otherwise
it causes build errors.
* python: drop `TORCH_API` from function defined in Torch-MLIR
`TORCH_API` should apply to functions that are either exported by
libtorch.so or ones that are imported from libtorch.so by its downstream
consumers (like Torch-MLIR). Neither case applies to the
`importJitFunctionAsFuncOp()` function, since it is defined in
Torch-MLIR (and thus outside libtorch.so). This patch fixes the problem
by dropping `TORCH_API` from that function's declaration.
* python: make output of class anotations deterministic
The `class-annotator-repr.py` test checks for class annotations in a
specific order, but prior to this patch, the order was
non-deterministic, since the code iterated on an _unordered_ map.
This patch makes the iteration order deterministic through two changes:
1. using a sorted map
2. using the class qualified name instead of the address of the class in
memory
* test: use Python3_EXECUTABLE as interpreter path for consistency
This ensures that tests use the Python3 version that was detected using
CMake, instead of whichever python version that happens to be in the
PATH variable when invoking the test.
* test: fix RUN string
The parenthesis syntax does not run on Windows (the shell interprets the
`(` character as part of the path). Moreover, the ODR violation in the
comment no longer seems to apply.
* python: port parallel test framework to Windows
Since Windows does not support `fork` natively, Python's
`multiprocessing` module needs to use `spawn` on Windows. However, to
use `spawn`, the multiprocessing module serializes (or pickles) the
worker function and its arguments. Sadly, the multiprocessing module
(both the default one in Python and the one that is extended in PyTorch)
is unable to serialize lambda functions (see
https://stackoverflow.com/a/19985580) for detals.
Unfortunately, given how our tests are structured, we require that the
function under test is passed as an argument to another function, so we
cannot sidestep our use of lambda functions.
To resolve this problem, this patch makes use of the `multiprocess` and
`dill` Python modules, which together offers a multiprocessing mechanism
that can serialize lambda functions. The multiprocess module also
offers a process pool, which simplifies the code for our parallel
testing framework.
This patch fetches the most recent nightly (binary) build of PyTorch,
before pinning it in pytorch-requirements.txt, which is referenced in
the top-level requirements.txt file. This way, end users will continue
to be able to run `pip -r requirements.txt` without worrying whether
doing so will break their Torch-MLIR build.
This patch also fetches the git commit hash that corresponds to the
nightly release, and this hash is passed to the out-of-tree build so
that it can build PyTorch from source.
If we were to sort the torch versions as numbers (in the usual
descending order), then 1.9 appears before 1.13. To fix this problem,
we use the `--version-sort` flag (along with `--reverse` for specifying
a descending order). We also filter out lines that don't contain
version numbers by only considering lines that start with a digit.
As a matter of slight clarity, this patch renames the variable
`torch_from_src` to `torch_from_bin`, since that variable is initialized
to `TM_USE_PYTORCH_BINARY`.
Co-authored-by: powderluv <powderluv@users.noreply.github.com>
Building on a fresh environment + virtualenv + in-tree build errors out
becayse PyYaml isn't installed. Adding to requirements.txt fixes that.
Fixes#1173
* Add oneshot release snapshot for test/ondemand
Add some build scripts to test new release flow based on IREE.
Wont affect current builds, once this works well we can plumb it
in.
Build with manylinux docker
* Fixes a few issues found when debugging powderluv's setup.
* It is optional to link against Python3_LIBRARIES. Check that and don't do it if they don't exist for this config.
* Clean and auditwheel need to operate on sanitized package names. So "torch_mlir" vs "torch-mlir".
* Adds a pyproject.toml file that pins the build dependencies needed to detect both Torch and Python (the MLIR Python build was failing to detect because Numpy wasn't in the pip venv).
* Commented out auditwheel: These wheels are not PyPi compliant since they weak link to libtorch at runtime. However, they should be fine to deploy to users.
* Adds the --extra-index-url to the pip wheel command, allowing PyTorch to be found.
* Hack setup.py to remove the _mlir_libs dir before building. This keeps back-to-back versions from accumulating in the wheels for subsequent versions. IREE has a more principled way of doing this, but what I have here should work.
Co-authored-by: Stella Laurenzo <stellaraccident@gmail.com>
A recent PyTorch commit made ConstantPad2d call a helper function with a
`Union[int, float]` type annotated. This commit adds minimal support for
representing and dealing with that.
https://github.com/pytorch/pytorch/pull/73287
Changes:
- Adding support for `!torch.union<T1, T2, T3>`/`Torch::UnionType`,
along with the importer and CAPI code.
- Add support in isValidSubtype for union types.
- Adding a canonicalizer for `torch.derefine` to help simplify some code
that derefines to a UnionType (this also fixes#664).
There is still more work to do for really supporting UnionType well,
such as canonicalizing UnionType's so that they can be compared with
pointer equality.
I am investigating the breakage.
Also, fix "externals" rename in setup.py and some cases where we weren't
using `requirements.txt` consistently.
Also, fix a case where the packaging script would get confused due to
".." in the path name.
* Also adds a requirements.txt and updates docs to reference it versus stringy pip install.
* Adds doc with instructions on creating a wheel.
Fixes#370