- renames of OwningRewritePatternList -> RewritePatternSet
- also `insert` to `add`
- RewritePatternSet holds a context now
- memref dialect split from std
* Adds f32 scalar argument support across the ABI boundary.
* Adds support for passing input type / shape information
across the ABI boundary
* Adds support for parsing / creating input FloatAttr's in
`npcomp-run-mlir`
Two changes:
- no more "verifyPasses" constructor arg for PassManager
- OpPassManager defaults to requiring explicit "nest" calls when created
via the C++ API. The behavior upstream for mlir-opt still obeys the
"implicit" mode, so I just slapped that onto all our pass managers.
I pinged https://reviews.llvm.org/D90671 to get a signal for whether we
are expected to migrate to explicit mode. If so, I'll do that too later.
* Need to have a dag of shared library deps in order to interop across python extensions (as presented in ODM).
* Introduced add_npcomp_library and friends to mirror the MLIR setup.
* Adds a libNPCOMP.so shared library.
* Redirects tools and extensions to link against libNPCOMP.so (instead of static libs).
* Moves all libraries to lib/, all binaries to bin/ and all python extensions to python/. The invariant is that the rpaths are setup to have a one level directory structure.
* Reworks the _torch_mlir extension to build like the others (still need to come up with a consolidated rule to do this instead of open coded).
* Includes an upstream version bump to pick up needed changes.
Sizes with dynamic linking (stripped, release, asserts enabled):
libNPCOMP.so: 43M (includes much of the underlying LLVM codegen deps)
libMLIR.so: 31M
_npcomp.so: 1.6M (python extension)
_torch_mlir.so: 670K (python extension)
npcomp-capi-ir-test: 6.3K
npcomp-opt: 351K
npcomp-run-mlir: 461K
mnist-playground: 530K
Still more can be done to normalize and optimize but this gets us structurally to the starting point.
Other than the dialect definitions (which will live in standard Dialect/
subdirectory), the goal here is to keep RefBackend-related code nested
in {include/npcomp,lib,test}/RefBackend.
* llvm-project: b5924a8e27536d19dd5c4d302db29fb6163d5faa
* mhlo: 848ca244d20f045b7921da55a98a04d95ef94f0e
* Multiple breakages that need to be fixed.
Fixes:
* Refactor dialect registration
* Remove all kindof methods (Casting functionality has been added upstream and is implicitly
available, see https://llvm.discourse.group/t/removing-kinds-from-attributes-and-types/1547.)
* Update dialect registration to comply with https://reviews.llvm.org/D85495.
* Remove type kinds and update some changed dialect signatures.
* Upgrade ATen dialect to match upstream needs.
* Move dialect registration to tablegen.
* Register the ListType in tablegen.
* Change dialect initialization signature.
* Use TypeSwitch in MlirIr location printer.
* Remove global registry depends from npcomp-opt.
* Change LowerToLLVM to pass an MLIRContext vs an LLVMDialect for type creation.
* Remove dep on MLIREDSCInterface that is removed upstream.
* Thread through the DialectRegistry for opt and python-like tools.
* Modernize pass registration (This was forced because the GEN_PASS_REGISTRATION code now generates inline functions vs literal pass registration statements)
Co-authored-by: Marius Brehler <marius.brehler@iml.fraunhofer.de>
* Since the manylinux images do not hard-link against python libs (resolving them at runtime), the module must be built without resolving undefined references.
* For some reason, builds on this platform are stricter, exposing dependency ordering issues.
* The CMake bits to build the extension are still somewhat of a mess. I have better versions both upstream and in IREE and will be reconciling. For now, don't look too closely.
My main interest in this is that tweaking the default of this flag is a
quick way to check for miscompiling canonicalizations / op definitions
not annotated properly (e.g. marked NoSideEffect when in fact it is not
safe to do so).
This ~totally reworks the existing "runtime" stuff to be more
principled and usable, such as from Python. It's still not fully
production-quality, mainly in the department of memory management (e.g.
it currently leaks memory; we need to figure out "who frees memrefs" +
the analysis and transformation needed to do that (maybe use upstream
buffer allocation pass?)).
The user API is in include/npcomp/runtime/UserAPI.h, though
include/npcomp/JITRuntime/JITModule.h is a friendlier wrapper.
The stuff under {include,lib}/runtime is totally firewalled from the
compiler and tiny (<6kB, though no attention has gone into optimizing
that size). For example, we don't link in libSupport into the runtime,
instead having our own bare bones replacements for basics like ArrayRef
(the JITRuntime helps with bridging that gap, since it *can* depend on
all common LLVM utilities).
The overall features of npcomprt is that it exposes a module that
with multiple function entry points. Each function has arguments and
results that are tensor-valued, and npcomprt::Tensor is the runtime type
that is used to interact with that (and a npcomprt::Ref<T>
reference-counting wrapper is provided to wrap npcomprt::Tensor in the
common case).
From an implementation perspective, an npcomprt module at the
LLVM/object/binary level exposes a single module descriptor struct that
has pointers to other metadata (currently just a list of function
metadata descriptors). All interactions with the npcomp runtime are
keyed off of that module descriptor, including function lookups and
dispatching. This is done to dodge platform ABI issues and also allow
enough reflection to e.g. verify provided arguments.
Most of the compiler-side work here was in LowerToNpcomprtABI and
LowerToLLVM.
Also,
- Rename npcomp_rt/NpcompRt to npcomprt/Npcomprt; it was getting
annoying to type the underscores/caps.
- misc improvements to bash_helpers.sh
The code isn't super clean, but is a useful incremental step
establishing most of the boilerplate for future enhancements.
We can't print or return tensors yet so correctness TBD, but I've
stepped into the running code in the debugger so I know it definitely is
running.
This is the first step to building out an npcomp mini-runtime. The
mini-runtime doesn't have to be fancy or complex, but it should at least
be layered nicely (which this code and the current compiler interaction
with the "runtime" code is not). Now that we have boilerplate for e2e
execution in some form, we can build that out.