The updated LLVM code includes a patch to create bfloat16 array
attributes, thus enabling a different patch to torch-mlir to flesh out
support for the bfloat16 type.
NB: `shouldnt_normalize2` and `shouldnt_normalize3` currently XPASS i.e., args *will* successfully normalize despite being incorrect due to an [upstream bug](https://github.com/pytorch/pytorch/issues/75342).
Effectively, this mode works by compiling op by op as the NN is eagerly executed by PyTorch. Entailed in that compilation is building a representation of the op that can be `torch.jit.script`ed, importing using `ModuleBuilder`, and then executing (e.g., with `RefBackendLinalgOnTensorsBackend`). This mode includes a fallback to conventional PyTorch if anything in the torch-mlir compilation process fails (e.g., unsupported op).
Currently, all e2e tests pass execpt for two that involve an upstream PyTorch bug (https://github.com/pytorch/pytorch/issues/74400).
High priority next steps:
1. A compile cache in order to speed up reruns of the same NN.
2. Integration with IREE (though not in this repo).
3. Integration with `torch.distributed`.