For the op `index_put_`, if accumulate == false, the behavior is
undefined if the indicies aren't unique
(https://pytorch.org/docs/stable/generated/torch.Tensor.index_put_.html).
So, when converting `AtenIndexPutHackedTwinOp` to a TMTensor scatter op,
mark the indices as unique if when `accumulate == false`.
This should have no functional effect (unless users are relying on UB)
and assuming unique indices has the benefit of unlocking better
optimizations in further compiler stages.
Signed-off-by: Ian Wood <ianwood2024@u.northwestern.edu>
Attention often broadcasts a mask across the batch dimension as masking
is usually performed the same across attention heads. Added this
materialization to the mask dimensions optionally.
Current version does not work for a mixture of dynamic and static shaped
batch dimensions. Rework to grab the correct dynamic shapes.
---------
Co-authored-by: dan <danimal197@gmail.com>
Enabled mask and is_causal parameters for torch.aten.scaled_dot_product
attention + relevant comments + tests.
The tests added highlight the new capabilities introduced in this PR,
including:
Attention with F16 mask
Attention with Boolean mask
Causal attention with same Q K V shapes
Causal attention without Q K V shapes
Made sure that one cannot input both mask and is_causal.
Set PyTorch and TorchVision version to nightly release 2024-08-18.
This commit also updates the `scaled_dot_product_attention` op.
A new attribute `enable_gqa` has been added. As of now, only the
default value for the same is supported.
Signed-Off By: Vivek Khandelwal <vivekkhandelwal1424@gmail.com>
The `index_put` operation, `input[indices] = values`, allows for the
values to be any shape that is broadcastable to the slice
`input[indices]`. This commit adds broadcasting support to the Linalg
lowering of `IndexPutHackedTwinOp`.
Fixes: #3465
If there is only a single value scattered there can be an implicit batch
dimension. This includes a check for the implicit batch dimension when
reshaping the update tensor. It includes an e2e test to verify
correctness.
We collapsed and broadcasted scatter indices to a single element
version. We should instead upport `tm_tensor.scatter`s support for
multiple indices and the implicitly broadcasted behavior. This avoids
the serialization and materializing a needlessly large indices tensor.
Some operations include a backend matcher for specialized operations. We
map these back to generics so they appropriately match to the high
performance versions. This is done for the attention operation.
This commit updates the `llvm-project` and `mlir-hlo` submodules to
commits:
llvm-project: a3f2751f782f3cdc6ba4790488ec20163a40ac37
mlir-hlo: 97c7e4b4506c3a2441c923e592833f45da439009
Changes made:
- Rename `getSuccessorEntryOperands` with `getEntrySuccessorOperands`
and remove `operands` from
`getSuccessorRegions` (https://reviews.llvm.org/D157506)
- Make `TypeConverter` a `const` (https://reviews.llvm.org/D157601)
-- This commit adds e2e support for atend.sort op.
-- 1. Adds aten.sort op in torch dialect.
-- 2. Adds tm_tensor.sort op in TMTensor dialect.
-- 3. Adds lowering of aten.sort -> tm_tensor.sort.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
This commits adds the support for cases for index_put_op:
1.) where index is a 2-d tensor.
2.) where indices is a list of tensors and none, with exactly
2 non none tensors along the consecutive dimensions.
This commit also adds a utility to compute the broadcast shape
given the two input tensors.
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
Summary of changes:
- LLVM now includes <optional> instead of "llvm/ADT/Optional.h" in most
(although not all) places
(https://reviews.llvm.org/rG541ef3d61e9341cd38420c0dbca9250c4d0ea04c).
This patch replaces the affected instances of `llvm::Optional` with
`std::optional`.
- In the usages of llvm::Optional that remain, llvm::Optional::value()
is deprecated, so this patch replaces them with a dereference.
Currently `getTensorRank` returns -1 if it was unable to get the rank
of the tensor. However, not every use in the codebase was checking the
return value, and in some cases, the return value was casted to
unsigned leading to some infinte loops when an unranked tensor reached
a decomposition.
This commit changes the return of `getTensorRank` to
`Optional<unsigned>` to make it clear to the user that the function
can fail.
This commit also changes a couple of for loops that iterate a vector
in reverse order that can potentially become infinite loops into
range-based for loops.
- Support for non-prefixed accessors has been removed. See:
https://reviews.llvm.org/D136727
- Rename `operands` to `methodOperands` in `prim.CallMethod` since the
name `operands` overlaps with a builtin method name. See:
https://reviews.llvm.org/D136727
- Add passes in refbackend to lower memref.subview. See:
https://reviews.llvm.org/D136377
- Replace `CopyToValueTensorOps` first in `RewriteViewLikeSubgraph` in
maximize-value-semantics.
The current implementation of the `RewriteViewLikeSubgraph` pass in
maximize-value-semantics creates temporarily invalid IR. In
particular, given a forward slice starting from a
`CopyToNonValueTensorOp` and ending in `CopyToValueTensorOp`s, the
pass first replaces all uses of the `CopyToNonValueTensorOp` with
its operand, which results in all the `CopyToValueTensorOp` users
having their operand have type `!torch.vtensor`, which is invalid.
The correct way to do things is to first replace all the
`CopyToValueTensorOp`s with their operand, and then replace all uses
of the `CopyToNonValueTensorOp` with its operand.
This only started failing now because the generated accessor
`getOperand` for the `CopyToValueTensorOp` now returns a
`TypedValue<NonValueTensorType>`, which has an assert checking that
the value returned is of the expected type.
Summary of changes:
- Replace call to `MemoryEffectOpInterface::hasNoEffect`
with `isMemoryEffectFree`.
- Make fix for the dynamic dims, since
`kDynamicSize` value changed to
`std::numeric_limits<int64_t>::min()` from `-1` in llvm
- `makeShapeLLVMCompatible` and `makeShapeTorchCompatible`
utilities convert shapes in order to remain consistent
with the Torch and MLIR semantics.
- Update tags
llvm: 147fe9de29dc13c14835127b35280c4d95c8e8ba
mhlo: 1944b5fa6062ec4c065d726c9c5d64f1487ee8c5
Signed-Off By: Vivek Khandelwal<vivek@nod-labs.com>
This commit removes almost all of the valsem ops, since the value
semantics version of the ops now exist in PyTorch. The only op missing
is `aten.bernoulli_.float`. In addition, this commit also simplifies
the implementation of `aten.fill.Scalar` by moving it to the pattern
that converts elementwise ops.
This commit makes the following changes needed to update bump LLVM:
- Replace `linalg.init_tensor` with `tensor.empty` (see:
https://reviews.llvm.org/D135129)
- Replace `NoSideEffect` with `Pure` (see
https://reviews.llvm.org/D135505)
- Replace `body` region accessor for `ReduceOp` and `ReduceWindowOp`
with `getBody`
- Fix incorrect use of `tosa::ReduceSumOp` in `AtenNativeLayerNormOp`
conversion pattern. The result type of `tosa::ReduceSumOp` must have
the same rank as the input type. (see:
https://www.mlplatform.org/tosa/tosa_spec.html#_reduce_sum)
Co-authored-by: Ashay Rane <ashay@users.noreply.github.com>
Co-authored-by: Ashay Rane <ashay@users.noreply.github.com>
Summary of changes:
- Updated references to the Arith dialect
(https://reviews.llvm.org/D134762)
- Switched to prefixed accessors for MemRef dialect
(https://reviews.llvm.org/D134995)
- Fixed warnings about signed/unsigned comparisons, ignored return
values, and unused variables
The updated LLVM code includes a patch to create bfloat16 array
attributes, thus enabling a different patch to torch-mlir to flesh out
support for the bfloat16 type.
This commit adds lowering of `aten::max_pool2d_with_indices_backward` op.
This commit also fixes formatting issues in basic.py.
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
This commit adds support for multi-dimensional tensors as input to the
`_index_put_impl_` op. The support was to some degree already there,
since `ScatterOp` already supports multi-dimensional tensors. This
commit also adds a bit more error checking to `index_put` and
refactors the code for creating `ScatterOp`s to mimic the way one
would make a `Linalg::GenericOp`.
This commit adds the op `ValsemVariantAtenIndexPutImplOp` that represents
`Aten_IndexPutImpl_Op` without the underscore. This is needed to
make sure that the `ReduceOpVariants` pass turns the in-place op
into an op that takes value tensors as inputs, otherwise the
`MaximizeValueSemantics` pass will not be able to add value
semantics correctly.
This commit also adds the lowering of `ValsemVariantAtenIndexPutImplOp` op.
This commit also updates the `torch.bincount` op test cases.
This pass is added to lower ops, which can not be lowered
via the TorchToLinalg pass, such as `torch.bincount` op.
This pass also uses torch-mlir's TMTensor Dialect to lower the
complex ops.
Also add torch.bincount op lowering with the help of TMTensor dialect
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>