Specifically, we use unranked memrefs which get passed as a fixed-size
set of arguments/returns. One big caveat about this is that returning
results isn't going to work. See TODO in LowerTensorLoadOp.
This is far from enough runtime-wise, but it starts to demarcate a
plausible layering. Notice for example how this removes the
runtime-dependence from LowerRankedShapes.
Eventually, we want to have an `npcomp_rt` or `npcomp_hal` dialect with
its own set of runtime types that will supercede this.
See comments in LowerTensorLoadOp for more direction about where this is
going to evolve.