improved

Updated Small, Medium, and Large Generation Models

The updated small, medium, and large models are more stable and more resilient to abnormal inputs, thanks to an FP16 quantization fix. We also fixed a bug in the presence and frequency penalties applied during generation, so both penalties are now more effective.
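For readers unfamiliar with these penalties, the sketch below illustrates one commonly used formulation: a token that has already been generated has its logit reduced by a flat presence penalty plus a frequency penalty scaled by how often it has appeared. This is an assumption for illustration only, not the models' actual implementation; the function name `apply_penalties` and its parameters are hypothetical.

```python
from collections import Counter


def apply_penalties(logits, generated_ids, presence_penalty=0.0, frequency_penalty=0.0):
    """Return a copy of `logits` with repetition penalties applied.

    Hypothetical helper, assuming the common additive formulation:
    logit[t] -= presence_penalty + frequency_penalty * count[t]
    for every token t that has already been generated.
    """
    counts = Counter(generated_ids)  # how many times each token id has appeared
    adjusted = list(logits)          # copy so the caller's list is not mutated
    for token_id, count in counts.items():
        adjusted[token_id] -= presence_penalty + frequency_penalty * count
    return adjusted


# Toy vocabulary of four tokens; token 0 was generated twice, token 2 once.
logits = [2.0, 1.0, 0.5, 0.0]
generated = [0, 0, 2]
adjusted = apply_penalties(logits, generated, presence_penalty=0.5, frequency_penalty=0.25)
print(adjusted)
```

Under this formulation, the presence penalty discourages any reuse of a token, while the frequency penalty grows with each repetition, so heavily repeated tokens are suppressed more strongly.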