* This elides the very common code the compiler adds for chaining otherwise tensor-related numpy ops together.
* More aggressive canonicalizations would require more advanced analysis.
* Preserving shape across the copy ops makes more thing shaped by default.
* Inference of ndarray types will now preserve the shape when specializing the dtype.
* Correctly infers the unknown dtypes that emit as part of compilation for the simple ufunc case.
* Significant more testing needs to be done on the details now that the pass is minimally functional.
* The actual pass itself is still too hacky/not general, but the underlying analysis is further along.