Rework e2e flow to use new "npcomprt"
This ~totally reworks the existing "runtime" stuff to be more
principled and usable, such as from Python. It's still not fully
production-quality, mainly in the department of memory management (e.g.
it currently leaks memory; we need to figure out "who frees memrefs" +
the analysis and transformation needed to do that (maybe use upstream
buffer allocation pass?)).
The user API is in include/npcomp/runtime/UserAPI.h, though
include/npcomp/JITRuntime/JITModule.h is a friendlier wrapper.
The stuff under {include,lib}/runtime is totally firewalled from the
compiler and tiny (<6kB, though no attention has gone into optimizing
that size). For example, we don't link libSupport into the runtime,
instead having our own bare bones replacements for basics like ArrayRef
(the JITRuntime helps with bridging that gap, since it *can* depend on
all common LLVM utilities).
The overall feature of npcomprt is that it exposes a module
with multiple function entry points. Each function has arguments and
results that are tensor-valued, and npcomprt::Tensor is the runtime type
that is used to interact with that (and a npcomprt::Ref<T>
reference-counting wrapper is provided to wrap npcomprt::Tensor in the
common case).
From an implementation perspective, an npcomprt module at the
LLVM/object/binary level exposes a single module descriptor struct that
has pointers to other metadata (currently just a list of function
metadata descriptors). All interactions with the npcomp runtime are
keyed off of that module descriptor, including function lookups and
dispatching. This is done to dodge platform ABI issues and also allow
enough reflection to e.g. verify provided arguments.
Most of the compiler-side work here was in LowerToNpcomprtABI and
LowerToLLVM.
Also,
- Rename npcomp_rt/NpcompRt to npcomprt/Npcomprt; it was getting
annoying to type the underscores/caps.
- misc improvements to bash_helpers.sh
2020-07-09 08:15:40 +08:00
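The module-descriptor scheme described in the commit message above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual npcomprt layout: `FuncDescriptor`, `ModuleDescriptor`, `lookupFunction`, `invokeChecked`, and all field names are hypothetical, and the real descriptors are emitted by the compiler rather than written by hand.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical per-function metadata (names and fields are assumptions).
struct FuncDescriptor {
  const char *name;
  void (*entry)(void **inputs, void **outputs);
  std::int32_t numInputs;
  std::int32_t numOutputs;
};

// Hypothetical module descriptor: the single symbol a module exposes.
struct ModuleDescriptor {
  std::int32_t numFuncDescriptors;
  const FuncDescriptor *functionTable;
};

// Linear lookup by name; returns nullptr when the function is absent.
inline const FuncDescriptor *lookupFunction(const ModuleDescriptor &desc,
                                            const char *name) {
  assert(desc.functionTable != nullptr || desc.numFuncDescriptors == 0);
  for (std::int32_t i = 0; i < desc.numFuncDescriptors; ++i)
    if (std::strcmp(desc.functionTable[i].name, name) == 0)
      return &desc.functionTable[i];
  return nullptr;
}

// Uses the metadata to verify argument counts before dispatching.
inline bool invokeChecked(const ModuleDescriptor &desc, const char *name,
                          void **inputs, std::int32_t numInputs,
                          void **outputs, std::int32_t numOutputs) {
  const FuncDescriptor *func = lookupFunction(desc, name);
  if (!func || func->numInputs != numInputs || func->numOutputs != numOutputs)
    return false;
  func->entry(inputs, outputs);
  return true;
}

// Toy entry point: copies one float from the input to the output.
inline void identityEntry(void **inputs, void **outputs) {
  *static_cast<float *>(outputs[0]) = *static_cast<float *>(inputs[0]);
}

// Builds a one-function module and exercises lookup plus verification.
inline bool demoModuleDispatch() {
  static const FuncDescriptor funcs[] = {{"identity", identityEntry, 1, 1}};
  const ModuleDescriptor desc = {1, funcs};
  float in = 3.0f, out = 0.0f;
  void *inputs[] = {&in};
  void *outputs[] = {&out};
  bool accepted = invokeChecked(desc, "identity", inputs, 1, outputs, 1);
  bool rejected = !invokeChecked(desc, "identity", inputs, 2, outputs, 1);
  bool missing = lookupFunction(desc, "nope") == nullptr;
  return accepted && rejected && missing && out == 3.0f;
}
```

Routing every call through one descriptor means the caller never needs to know the platform calling convention of individual entry points, and the stored counts give just enough reflection to reject a mismatched call.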
//===----------------------------------------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This is the public-facing interface for interacting with the npcomp
// runtime.
//
// This functionality is totally firewalled from the compiler codebase, so
// even if things superficially look similar, remember that there are no
// LLVM utilities here, memory allocation should be kept to a minimum, etc.
//
// npcomp/RefBackend/Runtime/Support.h provides some minimal LLVM-like support
// code to keep the API familiar.
//
//===----------------------------------------------------------------------===//

#ifndef NPCOMP_RUNTIME_USERAPI_H
#define NPCOMP_RUNTIME_USERAPI_H

#include "npcomp/RefBackend/Runtime/Support.h"
#include <array>
#include <atomic>
#include <cstdlib>

namespace refbackrt {

struct RtValue;

// Base class for any RefCounted object type
class RefTarget {
protected:
  template <typename T> friend class Ref;
  mutable std::atomic<size_t> refCount;

  constexpr RefTarget() noexcept : refCount(0) {}
};

// Reference-counted handle to a type with a `refCount` member.
template <typename T> class Ref {
public:
  Ref() { ptr = nullptr; }
  // Creates a Ref and increments the refcount by 1.
  // rawPtr must be allocated with std::malloc.
  Ref(T *rawPtr) {
    assert(rawPtr->refCount >= 0 && "expected non-negative refcount to start!");
    ptr = rawPtr;
    incref(ptr);
  }
  Ref(const Ref &other) {
    ptr = other.ptr;
    incref(ptr);
  }
  Ref(Ref &&other) { ptr = other.takePtr(); }
  Ref &operator=(const Ref &other) {
    if (&other == this)
      return *this;
    decref(ptr);
    ptr = other.ptr;
    incref(ptr);
    return *this;
  }
  Ref &operator=(Ref &&other) {
    if (&other == this)
      return *this;
    decref(ptr);
    ptr = other.takePtr();
    return *this;
  }
  ~Ref() { decref(ptr); }

  T &operator*() const { return *ptr; }
  T *operator->() const { return ptr; }
  T *get() const { return ptr; }

  T *takePtr() {
    auto *ret = ptr;
    ptr = nullptr;
    return ret;
  }

[RefBackend] Fix leaks related to ABI boundaries.
Best as I can tell (e.g. from LeakSanitizer), this fixes all the leaks
except for those due to buffers created internally to the codegenned
code itself (up next I'll add the buffer deallocation pass to fix
those).
The main change is that instead of attempting to pass `refbackrt::Tensor`
to the codegenned function directly, we make all the ABI types be
UnrankedMemRef which gets passed awkwardly (but workably) as a
`{size_t rank, void *ptrToDescriptor}` on the ABI. The reason why
refbackrt::Tensor wasn't workable is that MLIR doesn't really
have a way to deal with the lifetime of unranked memref descriptors that
happen inside the function, which is inevitably what would happen in the
old code that would emit runtime calls to
`refbackrt.to_memref/refbackrt.from_memref` to convert back and forth to
`refbackrt::Tensor` inside the codegenned code.
So, instead of the `refbackrt.to_memref/refbackrt.from_memref` with no
real sound basis for valid lifetime management, we now have a lovely
piece of code in `refbackrt::invoke` in `Runtime.cpp` that just barely
seems to be sound. We rely on the codegenned code having these
properties, which it seems to have:
- it won't free memref descriptors or their backing buffer for arguments
of UnrankedMemRef type.
- it will allocate a separate memref descriptor for each result
UnrankedMemRef (which is ensured by having a separate memref_cast for
each)
- we can sniff the `allocatedPtr`'s (i.e. the backing buffer pointers)
to avoid double-freeing in the case of aliasing of the backing buffer
(including backing buffers for arguments feeding into results)
- to catch the case of statically allocated data (which we need to avoid
passing to `free`), check if the `allocatedPtr` is (no joke) equal to
`0xDEADBEEF`, because there is otherwise no way to distinguish
statically allocated from malloc'ed data... (std.global_memref lowering
to LLVM by happenstance sets the allocatedPtr equal to `0xDEADBEEF`,
presumably mainly as a debugging thing)
Even with all this, we *still* need to (internally to refbackrt::invoke)
make copies of all inputs/outputs! And the details of how the LLVM-level
ABI gets laid out for e.g. function arguments/returns is still super
tricky.
This really highlights how deficient memref is as the general runtime
type for our use case. It's stewing in my mind how best to improve the
situation. My general gut feeling is that IREE's abstractions for this
are "right", but I need to think more how to distill those aspects of
IREE's design in a "reference" way for RefBackend.
Some implementation notes:
- In terms of how this is implemented, this did catch a bug in our ABI
wrapper functions in LowerToLLVM.cpp, which I had to fix (it happened to
work before through some combination of npcomprt::Tensor being passed as
a single pointer + probably me infinite-monkey-ing it until it worked)
- This actually removes 2 out of the 3 compiler runtime functions (the
only one left is "abort_if"). Most of the memref descriptor code moved
from CompilerRuntime.cpp to Runtime.cpp.
- this also means deleting `refbackrt.from_memref` and
`refbackrt.to_memref`
2020-11-25 09:18:57 +08:00
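The pointer-sniffing rules listed in the commit message above (skip the `0xDEADBEEF` sentinel, do not free argument buffers, free an aliased backing buffer only once) can be sketched as below. This is a simplified illustration, not the code in `refbackrt::invoke`: `BufferHandle`, `kStaticSentinel`, and `collectBuffersToFree` are hypothetical names, only the allocated pointer of each descriptor is modeled, and `std::vector` is used for brevity even though the real runtime keeps allocation to a minimum.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for the `allocatedPtr` field of a memref descriptor.
struct BufferHandle {
  void *allocatedPtr;
};

// Sentinel that, per the commit message, marks statically allocated data.
static void *const kStaticSentinel =
    reinterpret_cast<void *>(static_cast<std::uintptr_t>(0xDEADBEEF));

// Returns the distinct result buffers that should be passed to `free`.
// Assumption: argument buffers stay owned by the caller, so a result that
// aliases an argument is skipped here.
inline std::vector<void *>
collectBuffersToFree(const std::vector<BufferHandle> &args,
                     const std::vector<BufferHandle> &results) {
  std::vector<void *> toFree;
  for (const BufferHandle &result : results) {
    void *p = result.allocatedPtr;
    // Never free null or statically allocated data.
    if (p == nullptr || p == kStaticSentinel)
      continue;
    // Skip buffers that feed through from an argument.
    bool aliasesArg =
        std::any_of(args.begin(), args.end(), [p](const BufferHandle &arg) {
          return arg.allocatedPtr == p;
        });
    if (aliasesArg)
      continue;
    // Record each backing buffer once, even if several results share it.
    if (std::find(toFree.begin(), toFree.end(), p) == toFree.end())
      toFree.push_back(p);
  }
  return toFree;
}
```

The quadratic scans are acceptable for the handful of arguments and results a typical entry point has; a larger fan-out would call for a set.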
  int debugGetRefCount() { return ptr->refCount; }

private:
  friend struct RtValue;
  static void incref(T *ptr) {
    if (!ptr)
      return;
    ptr->refCount += 1;
  }

  friend struct RtValue;
  static void decref(T *ptr) {
    if (!ptr)
      return;
    if (ptr->refCount.fetch_sub(1) == 1) {
      ptr->~T();
      std::free(static_cast<void *>(ptr));
    }
  }
  T *ptr;
};
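To see how an intrusive handle like `Ref` behaves, here is a condensed, self-contained sketch of the same pattern. It does not use the real refbackrt types: `Counted`, `Handle`, `makeCounted`, and `demoRefCounting` are hypothetical names chosen so the block compiles on its own, and assignment operators are omitted for brevity. The object is allocated with `std::malloc` and constructed with placement new, matching the requirement that the last handle releases it with `std::free`.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <utility>

// Hypothetical ref-counted payload (stands in for a RefTarget subclass).
class Counted {
public:
  mutable std::atomic<std::size_t> refCount{0};
  int payload;
  explicit Counted(int p) : payload(p) {}
};

// Condensed handle following the same incref/decref discipline as Ref.
template <typename T> class Handle {
public:
  Handle() : ptr(nullptr) {}
  explicit Handle(T *rawPtr) : ptr(rawPtr) { incref(ptr); }
  Handle(const Handle &other) : ptr(other.ptr) { incref(ptr); }
  Handle(Handle &&other) noexcept : ptr(other.ptr) { other.ptr = nullptr; }
  Handle &operator=(const Handle &) = delete;
  ~Handle() { decref(ptr); }

  T *get() const { return ptr; }
  std::size_t count() const { return ptr ? ptr->refCount.load() : 0; }

private:
  static void incref(T *p) {
    if (p)
      p->refCount += 1;
  }
  static void decref(T *p) {
    // fetch_sub returns the previous value, so 1 means this was the last ref.
    if (p && p->refCount.fetch_sub(1) == 1) {
      p->~T();
      std::free(static_cast<void *>(p));
    }
  }
  T *ptr;
};

// Allocates with malloc and constructs in place, as the handle expects.
inline Handle<Counted> makeCounted(int payload) {
  void *storage = std::malloc(sizeof(Counted));
  assert(storage && "allocation failed");
  return Handle<Counted>(new (storage) Counted(payload));
}

// Walks through copy, scope exit, and move, checking the count at each step.
inline bool demoRefCounting() {
  Handle<Counted> a = makeCounted(7);
  bool ok = a.count() == 1;
  {
    Handle<Counted> b = a; // copy bumps the count
    ok = ok && a.count() == 2;
  } // b's destructor drops it again
  ok = ok && a.count() == 1;
  Handle<Counted> c = std::move(a); // move transfers ownership, count unchanged
  ok = ok && a.get() == nullptr && c.count() == 1 && c.get()->payload == 7;
  return ok;
}
```

Keeping the count inside the object avoids a separate control block, which fits a runtime that wants to stay small and allocate as little as possible.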

// The available data types.
enum class ElementType : std::int32_t {
  NONE,
  F32,
};
std::int32_t getElementTypeByteSize(ElementType type);
StringRef getElementTypeAsStringRef(ElementType type);
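The header only declares `getElementTypeByteSize`; a plausible definition and one way it could be used to size a tensor buffer are sketched below. The enum is repeated so the block compiles on its own, `elementTypeByteSize` and `bufferByteSize` are hypothetical names, and the choice of 0 bytes for `NONE` is an assumption rather than something this header states.

```cpp
#include <cassert>
#include <cstdint>

// Repeated from the header so this sketch is self-contained.
enum class ElementType : std::int32_t {
  NONE,
  F32,
};

// Assumed mapping: F32 is a 4-byte float; NONE is treated as having no size.
inline std::int32_t elementTypeByteSize(ElementType type) {
  switch (type) {
  case ElementType::NONE:
    return 0;
  case ElementType::F32:
    return 4;
  }
  return 0; // Unreachable for valid enumerators.
}

// Hypothetical helper: bytes needed for a dense buffer with these extents.
// A rank-0 tensor (no extents) holds exactly one element.
inline std::int64_t bufferByteSize(const std::int32_t *extents,
                                   std::int32_t rank, ElementType type) {
  assert(rank >= 0 && "rank must be non-negative");
  std::int64_t numElements = 1;
  for (std::int32_t i = 0; i < rank; ++i)
    numElements *= extents[i];
  return numElements * elementTypeByteSize(type);
}
```

For example, a 2x3 `F32` tensor needs 24 bytes under these assumptions.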

// Representation of a tensor.
class Tensor : public RefTarget {
|
|
|
public:
  // Due to tail-allocated objects, this struct should never be directly
  // constructed.
  Tensor() = delete;

  // Create a Tensor with the given extents and element type, with a buffer
  // holding a copy of `data`.
  static Ref<Tensor> create(ArrayRef<std::int32_t> extents,
                            ElementType elementType, void *data);
  // Same as `create`, but returns a raw pointer.
  static Tensor *createRaw(ArrayRef<std::int32_t> extents,
                           ElementType elementType, void *data);

  static Ref<Tensor> create(ArrayRef<std::int64_t> extents,
                            ElementType elementType, void *data);
  // Same as `create`, but returns a raw pointer.
  static Tensor *createRaw(ArrayRef<std::int64_t> extents,
                           ElementType elementType, void *data);

  ElementType getElementType() const { return elementType; }
  std::int32_t getRank() const { return rank; }
  void *getData() const { return data; }
  template <typename T> T *getData() const { return static_cast<T *>(data); }
  std::int32_t getExtent(int dimension) const {
    return getExtents()[dimension];
  }
  ArrayRef<std::int32_t> getExtents() const {
    auto extents = const_cast<Tensor *>(this)->getMutableExtents();
    return ArrayRef<std::int32_t>(extents.data(), extents.size());
  }
  // Returns the number of bytes occupied by the data representing this tensor.
  // The total allocated amount might be higher to allow e.g. for alignment
  // nudging.
  std::int32_t getDataByteSize() const;
  ~Tensor() { std::free(allocatedPtr); }

private:
  MutableArrayRef<std::int32_t> getMutableExtents() {
    auto *tail = reinterpret_cast<std::int32_t *>(this + 1);
    return MutableArrayRef<std::int32_t>(tail, rank);
  }

  ElementType elementType;
  // The number of dimensions of this Tensor.
  // There are `rank` tail-allocated std::int32_t values representing the
  // tensor extents.
  std::int32_t rank;
  // The buffer base.
  void *data;
  // The raw pointer returned by the allocator (currently assumed to be
  // malloc), suitable for freeing the buffer.
  void *allocatedPtr;

  // Sizes are tail-allocated.
};
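
// Illustrative sketch (not part of this header): the tail-allocation scheme
// used by Tensor, where a fixed-size header and its `rank` extents share one
// malloc'd block and the extents start at `this + 1`. The names `TailTensor`
// and `numElements` are hypothetical and exist only for this example; the
// real Tensor::create additionally copies the data buffer and sets the
// element type.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for Tensor: a header followed, in the same
// allocation, by `rank` std::int32_t extents.
struct TailTensor {
  std::int32_t rank;

  // The extents live immediately after the header object.
  std::int32_t *extents() {
    return reinterpret_cast<std::int32_t *>(this + 1);
  }

  // Allocates header + tail in one block and copies the extents in.
  static TailTensor *create(const std::int32_t *extents, std::int32_t rank) {
    void *mem = std::malloc(sizeof(TailTensor) + rank * sizeof(std::int32_t));
    if (!mem)
      return nullptr;
    auto *tensor = new (mem) TailTensor{rank};
    for (std::int32_t i = 0; i < rank; ++i)
      tensor->extents()[i] = extents[i];
    return tensor;
  }

  // Releases the single block that holds both header and extents.
  static void destroy(TailTensor *tensor) {
    tensor->~TailTensor();
    std::free(tensor);
  }
};

// Product of all extents, i.e. the number of elements described.
inline std::int64_t numElements(TailTensor *tensor) {
  std::int64_t count = 1;
  for (std::int32_t i = 0; i < tensor->rank; ++i)
    count *= tensor->extents()[i];
  return count;
}
```

// Because the tail size depends on `rank`, such an object can only be built
// through a factory like `create`, which is why the default constructor of
// Tensor is deleted.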

// RtValue is a generic tagged union used to hold all value types.
// The tag determines the type, and the payload represents the stored
// contents of an object. If an object is not trivially destructible,
// then it must be refcounted and must have a refCount.
#define NPCOMP_FORALL_PRIM_TAGS(_) \
  _(None) \
  _(Bool) \
  _(Int) \
  _(Float) \
  _(Double)

#define NPCOMP_FORALL_REF_TAGS(_) _(Tensor)

#define NPCOMP_FORALL_TAGS(_) \
  NPCOMP_FORALL_PRIM_TAGS(_) \
  NPCOMP_FORALL_REF_TAGS(_)

struct RtValue final {

  RtValue() : payload{0}, tag(Tag::None) {}

  // Bool
  RtValue(bool b) : tag(Tag::Bool) { payload.asBool = b; }
  bool isBool() const { return Tag::Bool == tag; }
  bool toBool() const {
    assert(isBool());
    return payload.asBool;
  }

  // Int
  RtValue(std::int64_t i) : tag(Tag::Int) { payload.asInt = i; }
  RtValue(std::int32_t i) : RtValue(static_cast<int64_t>(i)) {}
  bool isInt() const { return Tag::Int == tag; }
  int64_t toInt() const {
    assert(isInt());
    return payload.asInt;
  }

  // Float
  RtValue(float f) : tag(Tag::Float) { payload.asFloat = f; }
  bool isFloat() const { return Tag::Float == tag; }
  float toFloat() const {
    assert(isFloat());
    return payload.asFloat;
  }

  // Double
  RtValue(double d) : tag(Tag::Double) { payload.asDouble = d; }
  bool isDouble() const { return Tag::Double == tag; }
  double toDouble() const {
    assert(isDouble());
    return payload.asDouble;
  }

  // Tensor
  RtValue(Ref<Tensor> tensor) : tag(Tag::Tensor) {
    payload.asVoidPtr = reinterpret_cast<void *>(tensor.takePtr());
  }
  bool isTensor() const { return Tag::Tensor == tag; }
  Ref<Tensor> toTensor() const {
    assert(isTensor());
    return Ref<Tensor>(reinterpret_cast<Tensor *>(payload.asVoidPtr));
  }

  // Ref
  bool isRef() const {
#define DEFINE_IS_REF(x) \
  if (is##x()) { \
    return true; \
  }
    NPCOMP_FORALL_REF_TAGS(DEFINE_IS_REF)
#undef DEFINE_IS_REF
    return false;
  }

  // Scalar
  bool isScalar() const {
    return isBool() || isInt() || isFloat() || isDouble();
  }

  // RtValue (downcast)
  const RtValue &toRtValue() const { return *this; }
  RtValue &toRtValue() { return *this; }

  // Stringify tag for debugging.
  StringRef tagKind() const {
    switch (tag) {
#define DEFINE_CASE(x) \
  case Tag::x: \
    return #x;
      NPCOMP_FORALL_TAGS(DEFINE_CASE)
#undef DEFINE_CASE
    }
    // TODO(brycearden): Print tag here
    return "InvalidTag!";
  }

  RtValue(const RtValue &rhs) : RtValue(rhs.payload, rhs.tag) {
    if (isRef()) {
#define DEFINE_INCREF(x) \
  if (is##x()) { \
    Ref<x>::incref(static_cast<x *>(payload.asVoidPtr)); \
    return; \
  }
      NPCOMP_FORALL_REF_TAGS(DEFINE_INCREF)
#undef DEFINE_INCREF
      assert(false && "Unsupported RtValue type");
    }
  }
  RtValue(RtValue &&rhs) noexcept : RtValue() { swap(rhs); }

  RtValue &operator=(RtValue &&rhs) & noexcept {
    RtValue(std::move(rhs)).swap(*this); // this also sets rhs to None
    return *this;
  }
  RtValue &operator=(RtValue const &rhs) & {
    RtValue(rhs).swap(*this);
    return *this;
  }

  ~RtValue() {
    if (isRef()) {
#define DEFINE_DECREF(x) \
  if (is##x()) { \
    Ref<x>::decref(static_cast<x *>(payload.asVoidPtr)); \
    return; \
  }
      NPCOMP_FORALL_REF_TAGS(DEFINE_DECREF)
#undef DEFINE_DECREF
      assert(false && "Unsupported RtValue type");
    }
  }

private:
  void swap(RtValue &rhs) {
    std::swap(payload, rhs.payload);
    std::swap(tag, rhs.tag);
  }

  // NOTE: Runtime tags are intentionally private.
  // Please use the helper functions above to query information about the type
  // of a RtValue.
  enum class Tag : std::uint32_t {
#define DEFINE_TAG(x) x,
    NPCOMP_FORALL_TAGS(DEFINE_TAG)
#undef DEFINE_TAG
  };

  union Payload {
    bool asBool;
    int64_t asInt;
    float asFloat;
    double asDouble;
    void *asVoidPtr;
  };

  RtValue(Payload pl, Tag tag) : payload(pl), tag(tag) {}

  Payload payload;
  Tag tag;
};
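
// Illustrative sketch (not part of this header): the X-macro pattern that
// RtValue uses to keep its tag enum, type predicates, and tag names in sync
// from one list. `DEMO_FORALL_TAGS` and `DemoValue` are hypothetical names
// for this example, and the sketch covers only trivially destructible
// payloads; it does no reference counting and returns std::string where the
// header returns StringRef.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Single source of truth for the tag list.
#define DEMO_FORALL_TAGS(_) _(None) _(Bool) _(Int)

class DemoValue {
public:
  DemoValue() : tag(Tag::None) { payload.asInt = 0; }
  explicit DemoValue(bool b) : tag(Tag::Bool) { payload.asBool = b; }
  explicit DemoValue(std::int64_t i) : tag(Tag::Int) { payload.asInt = i; }

  bool isBool() const { return tag == Tag::Bool; }
  bool isInt() const { return tag == Tag::Int; }

  bool toBool() const {
    assert(isBool());
    return payload.asBool;
  }
  std::int64_t toInt() const {
    assert(isInt());
    return payload.asInt;
  }

  // Expands to one `case` per tag, stringifying the tag name.
  std::string tagKind() const {
    switch (tag) {
#define DEMO_CASE(x) case Tag::x: return #x;
      DEMO_FORALL_TAGS(DEMO_CASE)
#undef DEMO_CASE
    }
    return "InvalidTag";
  }

private:
  // Expands to `None, Bool, Int,`.
  enum class Tag : std::uint32_t {
#define DEMO_TAG(x) x,
    DEMO_FORALL_TAGS(DEMO_TAG)
#undef DEMO_TAG
  };

  union {
    bool asBool;
    std::int64_t asInt;
  } payload;
  Tag tag;
};
```

// Adding a tag to the list extends the enum and the `tagKind` switch
// together, so the two cannot drift apart; only the constructor and
// accessors for the new type need to be written by hand.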

//===----------------------------------------------------------------------===//
// Module loading.
// This is the main entry point that users interact with.
//===----------------------------------------------------------------------===//

enum class ArgType : std::uint32_t {
  kNone = 0,
  kTensor,
  kF32,
  kF64,
};
StringRef getArgTypeAsStringRef(ArgType type);

// Maximum rank supported across the ABI boundary
constexpr static int kMaxRank = 6;

struct InputArgInfo {
  // What type of argument this is
  ArgType argType;
  // Certain arg types also have an element type
  ElementType elementType;
  std::int32_t rank;
  std::array<std::int32_t, kMaxRank> extents;
};

struct OutputArgInfo {
  // What type of argument this is
  ArgType argType;
  // Certain arg types also have an element type
  ElementType elementType;
  std::int32_t rank;
  std::array<std::int32_t, kMaxRank> extents;
  // TODO(brycearden): Add checks for whether output buffers alias to input
  // buffers and populate field(s) here indicating that case
};

// Maximum input or output arity.
constexpr static int kMaxArity = 20;

// Metadata for a particular function.
struct FunctionMetadata {
  std::int32_t numInputs;
  std::int32_t numOutputs;

  std::array<InputArgInfo, kMaxArity> inputArgInfos;
  std::array<OutputArgInfo, kMaxArity> outputArgInfos;
};

// Opaque forward declaration of module descriptor type. This is the type
// created by the compiler in the module binary.
struct ModuleDescriptor;

// Verifies that the arg types of the RtValue inputs the user provides match
// the types we expect from the descriptors emitted by the compiler.
//
// Returns failure if the input type(s) are not valid
LogicalResult checkRtValueArgTypes(const RtValue &value,
                                   const InputArgInfo &info);

// Verifies that the shapes of the RtValue inputs the user provides match
// the shapes we expect from the descriptors emitted by the compiler.
//
// Returns failure if the input shape(s) are not valid
LogicalResult checkRtValueShapes(const RtValue &value,
                                 const InputArgInfo &info);

// Creates an RtValue of the right type from the output metadata
// provided by the compiled module
RtValue createRtValueFromOutputArgInfo(const OutputArgInfo &info);

// Low-level invocation API. The number of inputs and outputs should be correct
// and match the results of getMetadata.
void invoke(ModuleDescriptor *moduleDescriptor, StringRef functionName,
            ArrayRef<RtValue> inputs, MutableArrayRef<RtValue> outputs);

// Metadata for function `functionName`.
//
// Returns failure if functionName wasn't found.
LogicalResult getMetadata(ModuleDescriptor *moduleDescriptor,
                          StringRef functionName,
                          FunctionMetadata &outMetadata);
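
// Illustrative sketch (not part of this header): one way a name-keyed lookup
// over a module descriptor could work. ModuleDescriptor is opaque here, so
// the layout below is an assumption made only for this example, and
// `DemoFuncDescriptor`, `DemoModuleDescriptor`, and `demoLookup` are
// hypothetical names rather than the runtime's actual types or functions.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumed per-function record: a name plus input and output counts.
struct DemoFuncDescriptor {
  const char *name;
  std::int32_t numInputs;
  std::int32_t numOutputs;
};

// Assumed module record: a counted list of function descriptors.
struct DemoModuleDescriptor {
  std::int32_t numFuncDescriptors;
  const DemoFuncDescriptor *funcDescriptors;
};

// Linear search by name; returns nullptr when the function is not found,
// which a caller would map to a failure result.
inline const DemoFuncDescriptor *
demoLookup(const DemoModuleDescriptor &module, const char *functionName) {
  for (std::int32_t i = 0; i < module.numFuncDescriptors; ++i) {
    if (std::strcmp(module.funcDescriptors[i].name, functionName) == 0)
      return &module.funcDescriptors[i];
  }
  return nullptr;
}
```

// Keying every call off one descriptor keeps the binary interface down to a
// single exported symbol, and the recorded counts let a caller validate
// arguments before dispatching.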

} // namespace refbackrt
#endif // NPCOMP_RUNTIME_USERAPI_H