torch-mlir/lib
Rob Suderman 25738b8c19
[linalg] Broadcast batch for mask on sdpa lowering (#3824)
Attention often broadcasts a mask across the batch dimension, since
masking is usually applied identically across attention heads. The
lowering now optionally materializes this broadcast on the mask dimensions.
2024-10-31 17:59:24 -07:00
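As a rough illustration of the broadcast this commit message describes, the sketch below replicates a mask whose leading (batch) dimension has size 1 so that it matches the batch size of the attention scores, then applies it additively. It is plain Python on nested lists, not the torch-mlir lowering itself; the helper names `broadcast_mask_batch` and `add_mask`, the `[batch, L, S]` layout, and the additive `-inf` mask convention are all assumptions made for the example.

```python
import copy


def broadcast_mask_batch(mask, batch):
    """Materialize a mask of shape [1, L, S] as [batch, L, S].

    Hypothetical helper for illustration; a mask that already has
    `batch` entries is returned unchanged (the broadcast is optional).
    """
    if len(mask) == batch:
        return mask
    if len(mask) != 1:
        raise ValueError("mask batch dimension must be 1 or equal to batch")
    # Replicate the single mask slice once per batch entry.
    return [copy.deepcopy(mask[0]) for _ in range(batch)]


def add_mask(scores, mask):
    """Add an additive mask to attention scores of shape [batch, L, S]."""
    mask = broadcast_mask_batch(mask, len(scores))
    return [
        [[s + m for s, m in zip(s_row, m_row)] for s_row, m_row in zip(s_b, m_b)]
        for s_b, m_b in zip(scores, mask)
    ]


NEG_INF = float("-inf")

# Assumed example data: one causal mask shared by a batch of two score matrices.
causal = [[[0.0, NEG_INF], [0.0, 0.0]]]            # shape [1, 2, 2]
scores = [[[1.0, 2.0], [3.0, 4.0]],
          [[5.0, 6.0], [7.0, 8.0]]]                # shape [2, 2, 2]

masked = add_mask(scores, causal)
```

Here the same causal mask ends up applied to both batch entries, so the upper-right score of each matrix becomes `-inf` while the remaining scores are unchanged.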
CAPI [ONNX] add int16 quantization support (#3446) 2024-06-12 10:37:22 +05:30
Conversion [linalg] Broadcast batch for mask on sdpa lowering (#3824) 2024-10-31 17:59:24 -07:00
Dialect support `aten._trilinear` and improve `einsum` decomposition (#3784) 2024-10-31 14:30:40 -05:00
RefBackend Add missing dependency to TorchMLIRRefBackend target (#3107) 2024-08-14 23:41:51 +08:00
CMakeLists.txt Link necessary op interface implementations (#3364) 2024-06-03 19:43:28 -05:00
InitAll.cpp [Stablehlo] legalize deprecated ops to stablehlo ops (#3543) 2024-07-17 00:05:11 +08:00