Request to reduce size of LLM checkpoints hosted by MLCommons

The current checkpoints hosted by MLCommons are large, and using them takes more storage space than necessary. As a rule of thumb, a checkpoint requires (parameter count) × (bytes per parameter) of storage, so an 8B model with bf16 weights (2 bytes per parameter) requires about 16 GB, a 70B model about 140 GB, and so on.
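
The rule of thumb above can be sketched in a few lines of Python. This is only an illustrative estimate: the function name `checkpoint_size_gb` and the `BYTES_PER_PARAM` table are hypothetical helpers, not part of any MLCommons tooling, and the result counts weights only (no optimizer state, tokenizer files, or file-format overhead) using decimal gigabytes and nominal, rounded parameter counts.

```python
# Bytes per parameter for a few common weight dtypes (illustrative table).
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp16": 2, "int8": 1}


def checkpoint_size_gb(num_params: float, dtype: str = "bf16") -> float:
    """Estimate weight-only checkpoint size in decimal GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Nominal parameter counts; real models differ slightly from these round numbers.
for name, params in [("8B", 8e9), ("70B", 70e9)]:
    print(f"{name} bf16: ~{checkpoint_size_gb(params):.0f} GB")
```

The same estimate shows how much a checkpoint grows when it is stored at higher precision than needed: under these assumptions, an 8B model kept in fp32 would take roughly 32 GB instead of 16 GB.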