mirror of https://github.com/llvm/torch-mlir
Attention often broadcasts a mask across the batch dimension, since masking is usually performed the same way across attention heads. This change adds optional materialization of that broadcast along the mask dimensions.
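A minimal sketch of what such an optional materialization can look like at the PyTorch level, not the torch-mlir implementation itself; the helper name `materialize_mask` and the `materialize` flag are hypothetical:

```python
import torch

def materialize_mask(mask: torch.Tensor,
                     batch: int,
                     num_heads: int,
                     materialize: bool = False) -> torch.Tensor:
    """Broadcast a [1, 1, seq_q, seq_k] mask to [batch, num_heads, seq_q, seq_k].

    With materialize=False the result keeps stride-0 broadcast dimensions
    (a view, no copy); with materialize=True the broadcast is written out
    to real memory, which some lowerings require.
    """
    seq_q, seq_k = mask.shape[-2], mask.shape[-1]
    expanded = mask.expand(batch, num_heads, seq_q, seq_k)  # view, no copy
    return expanded.contiguous() if materialize else expanded

# Usage: a causal mask shared across the batch and all attention heads.
causal = torch.tril(torch.ones(1, 1, 4, 4, dtype=torch.bool))
m = materialize_mask(causal, batch=2, num_heads=8, materialize=True)
print(m.shape)  # torch.Size([2, 8, 4, 4])
```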
Top-level contents:

- e2e_testing
- examples
- python
- test
- tools
- CMakeLists.txt