Metadata-Version: 2.1
Name: platformers
Version: 0.1.0
Summary: 
Author: fecet
Author-email: xiezej@gmail.com
Requires-Python: >=3.10,<4.0
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Description-Content-Type: text/markdown

# Optimus: HuggingFace-Aligned 3D-parallel backend

- FlashAttention 2 support for training
- FlashAttention 2 support for left-padded generation with a KV cache
- Fused multi-head attention (FMHA) for GQA and MQA
- Support for multiple model topologies through the mpu context
- More model types for experiments (PPL, RM, ...)
- GQA and MQA generation (left-padded)
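
As a rough illustration of two ideas named in the list above, the sketch below shows how query heads are commonly mapped onto shared KV heads under GQA/MQA, and how position ids can be derived for a left-padded sequence. It is not taken from Optimus: the function names `kv_head_for_query_head` and `left_padded_position_ids` are hypothetical, and assigning position 0 to padding tokens is an assumed convention.

```python
from itertools import accumulate


def kv_head_for_query_head(q_head: int, num_q_heads: int, num_kv_heads: int) -> int:
    """Return the KV head shared by a given query head (hypothetical helper).

    MQA is the special case num_kv_heads == 1; standard multi-head
    attention is the case num_kv_heads == num_q_heads.
    """
    if num_q_heads % num_kv_heads != 0:
        raise ValueError("num_q_heads must be divisible by num_kv_heads")
    group_size = num_q_heads // num_kv_heads
    return q_head // group_size


def left_padded_position_ids(attention_mask: list[int]) -> list[int]:
    """Position ids for one left-padded sequence (hypothetical helper).

    Real tokens are numbered from 0 starting at the first non-pad token.
    Pad positions get 0 here, which is an assumption; they are masked
    out of attention anyway.
    """
    running_count = accumulate(attention_mask)
    return [max(count - 1, 0) for count in running_count]


# 8 query heads sharing 2 KV heads: heads 0-3 use KV head 0, heads 4-7 use KV head 1.
mapping = [kv_head_for_query_head(h, 8, 2) for h in range(8)]

# Two pad tokens on the left, followed by three real tokens.
positions = left_padded_position_ids([0, 0, 1, 1, 1])
```

With left padding, the most recent token of every sequence in a batch sits in the last column, which is why generation code typically pads on the left rather than on the right.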

TODO:
- Fewer model control options
- Generator based on non-batched FlashAttention and a custom fused CUDA kernel
- Fixed pipeline model
- KV cache management via pre-allocation and reuse (pre-calculated)
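
The last TODO item can be pictured with a minimal sketch, assuming a fixed number of sequence slots whose key/value buffers are allocated once and then recycled. Everything here is hypothetical: the class `PreallocatedKVCache`, its methods, and the use of plain Python lists in place of GPU tensors are illustrative choices, not the planned Optimus design.

```python
class PreallocatedKVCache:
    """Fixed-capacity KV buffers allocated up front and reused (hypothetical)."""

    def __init__(self, num_slots: int, max_seq_len: int, head_dim: int):
        self.max_seq_len = max_seq_len
        self.head_dim = head_dim
        # All storage is created here; append() only writes into it.
        self.keys = [[[0.0] * head_dim for _ in range(max_seq_len)]
                     for _ in range(num_slots)]
        self.values = [[[0.0] * head_dim for _ in range(max_seq_len)]
                       for _ in range(num_slots)]
        self.lengths = [0] * num_slots
        self.free_slots = list(range(num_slots))

    def acquire(self) -> int:
        """Reserve a slot for a new sequence and reset its length."""
        if not self.free_slots:
            raise RuntimeError("no free KV cache slots")
        slot = self.free_slots.pop()
        self.lengths[slot] = 0
        return slot

    def append(self, slot: int, key: list[float], value: list[float]) -> None:
        """Copy one token's key and value into the slot's existing buffers."""
        pos = self.lengths[slot]
        if pos >= self.max_seq_len:
            raise ValueError("sequence exceeds pre-allocated length")
        if len(key) != self.head_dim or len(value) != self.head_dim:
            raise ValueError("key/value must have head_dim elements")
        self.keys[slot][pos][:] = key
        self.values[slot][pos][:] = value
        self.lengths[slot] = pos + 1

    def release(self, slot: int) -> None:
        """Return a slot to the pool; its buffers are kept for reuse."""
        self.free_slots.append(slot)


cache = PreallocatedKVCache(num_slots=2, max_seq_len=4, head_dim=3)
first = cache.acquire()
first_buffer = cache.keys[first]
cache.append(first, [1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
cache.release(first)
second = cache.acquire()  # reuses the slot that was just released
```

Because the buffers are sized from `max_seq_len` ahead of time, decoding never allocates memory; the trade-off is that capacity must be computed in advance.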

