Commit bd7ebb1

Michael Gschwind authored and malfet committed
reorg repo
1 parent 3c4bbe4 commit bd7ebb1

8 files changed: +287 −0

Diff for: CODE_OF_CONDUCT.md

+76
@@ -0,0 +1,76 @@
# Code of Conduct

## Our Pledge

In the interest of fostering an open and welcoming environment, we as
contributors and maintainers pledge to make participation in our project and
our community a harassment-free experience for everyone, regardless of age, body
size, disability, ethnicity, sex characteristics, gender identity and expression,
level of experience, education, socio-economic status, nationality, personal
appearance, race, religion, or sexual identity and orientation.

## Our Standards

Examples of behavior that contributes to creating a positive environment
include:

* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Showing empathy towards other community members

Examples of unacceptable behavior by participants include:

* The use of sexualized language or imagery and unwelcome sexual attention or
advances
* Trolling, insulting/derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or electronic
address, without explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting

## Our Responsibilities

Project maintainers are responsible for clarifying the standards of acceptable
behavior and are expected to take appropriate and fair corrective action in
response to any instances of unacceptable behavior.

Project maintainers have the right and responsibility to remove, edit, or
reject comments, commits, code, wiki edits, issues, and other contributions
that are not aligned to this Code of Conduct, or to ban temporarily or
permanently any contributor for other behaviors that they deem inappropriate,
threatening, offensive, or harmful.

## Scope

This Code of Conduct applies within all project spaces, and it also applies when
an individual is representing the project or its community in public spaces.
Examples of representing a project or community include using an official
project e-mail address, posting via an official social media account, or acting
as an appointed representative at an online or offline event. Representation of
a project may be further defined and clarified by project maintainers.

## Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported by contacting the project team at <[email protected]>. All
complaints will be reviewed and investigated and will result in a response that
is deemed necessary and appropriate to the circumstances. The project team is
obligated to maintain confidentiality with regard to the reporter of an incident.
Further details of specific enforcement policies may be posted separately.

Project maintainers who do not follow or enforce the Code of Conduct in good
faith may face temporary or permanent repercussions as determined by other
members of the project's leadership.

## Attribution

This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html

[homepage]: https://www.contributor-covenant.org

For answers to common questions about this code of conduct, see
https://www.contributor-covenant.org/faq

Diff for: CONTRIBUTING.md

+32
@@ -0,0 +1,32 @@
# Contributing to gpt-fast
We want to make contributing to this project as easy and transparent as
possible.


## Pull Requests
We actively welcome your pull requests.

1. Fork the repo and create your branch from `main`.
2. If you've added code that should be tested, add tests.
3. If you've changed APIs, update the documentation.
4. Ensure the test suite passes.
5. Make sure your code lints.
6. If you haven't already, complete the Contributor License Agreement ("CLA").

## Contributor License Agreement ("CLA")
In order to accept your pull request, we need you to submit a CLA. You only need
to do this once to work on any of Meta's open source projects.

Complete your CLA here: <https://code.facebook.com/cla>

## Issues
We use GitHub issues to track public bugs. Please ensure your description is
clear and has sufficient instructions to be able to reproduce the issue.

Meta has a [bounty program](https://www.facebook.com/whitehat/) for the safe
disclosure of security bugs. In those cases, please go through the process
outlined on that page and do not file a public issue.

## License
By contributing to `gpt-fast`, you agree that your contributions will be licensed
under the LICENSE file in the root directory of this source tree.

Diff for: README.md

+32
@@ -26,6 +26,38 @@ Please copy-paste and fork as you desire.

# Supported Models
The model definition (and much more!) is adapted from gpt-fast, so we support the same models.

## Installation
[Download PyTorch nightly](https://pytorch.org/get-started/locally/), then install sentencepiece and huggingface_hub:
```bash
pip install sentencepiece huggingface_hub
```

To download Llama models, go to https://huggingface.co/meta-llama/Llama-2-7b and go through the steps to obtain access.
Then log in with `huggingface-cli login`.

## Downloading Weights
Models tested/supported:
```text
tinyllamas/stories{15,42,110}
openlm-research/open_llama_7b
meta-llama/Llama-2-7b-chat-hf
meta-llama/Llama-2-13b-chat-hf
meta-llama/Llama-2-70b-chat-hf
codellama/CodeLlama-7b-Python-hf
codellama/CodeLlama-34b-Python-hf
mistralai/Mistral-7B-v0.1
mistralai/Mistral-7B-Instruct-v0.1
mistralai/Mistral-7B-Instruct-v0.2
```

For example, to convert Llama-2-7b-chat-hf:
```bash
export MODEL_REPO=meta-llama/Llama-2-7b-chat-hf
./scripts/prepare.sh $MODEL_REPO
```

See [`gpt-fast` Supported Models](https://github.com/pytorch-labs/gpt-fast?tab=readme-ov-file#supported-models) for a full list.

# Installation
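
To sanity-check the conversion, the saved checkpoint can be loaded directly; a minimal sketch, assuming the default output path written by `scripts/convert_hf_checkpoint.py` below:

```python
import torch

# Path produced by `./scripts/prepare.sh meta-llama/Llama-2-7b-chat-hf`.
ckpt = "checkpoints/meta-llama/Llama-2-7b-chat-hf/model.pth"

# The converter saves a plain state dict, so it loads without any model code.
state_dict = torch.load(ckpt, map_location="cpu", mmap=True)
print(f"{len(state_dict)} tensors")
print(sorted(state_dict)[:3])  # e.g. the fused layers.0.attention.wqkv.weight
```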

Diff for: requirements.txt

+2
@@ -0,0 +1,2 @@
torch
sentencepiece

Diff for: scripts/convert_hf_checkpoint.py

+108
@@ -0,0 +1,108 @@
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.

# This source code is licensed under the license found in the
# LICENSE file in the root directory of this source tree.
import json
import re
import sys
from pathlib import Path
from typing import Optional

import torch

# support running without installing as a package
wd = Path(__file__).parent.parent.resolve()
sys.path.append(str(wd))

from model import ModelArgs


@torch.inference_mode()
def convert_hf_checkpoint(
    *,
    checkpoint_dir: Path = Path("checkpoints/meta-Transformer/Transformer-2-7b-chat-hf"),
    model_name: Optional[str] = None,
) -> None:
    if model_name is None:
        model_name = checkpoint_dir.name

    config = ModelArgs.from_name(model_name)
    print(f"Model config {config.__dict__}")

    # Load the json file containing weight mapping
    model_map_json = checkpoint_dir / "pytorch_model.bin.index.json"

    assert model_map_json.is_file()

    with open(model_map_json) as json_map:
        bin_index = json.load(json_map)

    weight_map = {
        "model.embed_tokens.weight": "tok_embeddings.weight",
        "model.layers.{}.self_attn.q_proj.weight": "layers.{}.attention.wq.weight",
        "model.layers.{}.self_attn.k_proj.weight": "layers.{}.attention.wk.weight",
        "model.layers.{}.self_attn.v_proj.weight": "layers.{}.attention.wv.weight",
        "model.layers.{}.self_attn.o_proj.weight": "layers.{}.attention.wo.weight",
        "model.layers.{}.self_attn.rotary_emb.inv_freq": None,
        "model.layers.{}.mlp.gate_proj.weight": "layers.{}.feed_forward.w1.weight",
        "model.layers.{}.mlp.up_proj.weight": "layers.{}.feed_forward.w3.weight",
        "model.layers.{}.mlp.down_proj.weight": "layers.{}.feed_forward.w2.weight",
        "model.layers.{}.input_layernorm.weight": "layers.{}.attention_norm.weight",
        "model.layers.{}.post_attention_layernorm.weight": "layers.{}.ffn_norm.weight",
        "model.norm.weight": "norm.weight",
        "lm_head.weight": "output.weight",
    }
    bin_files = {checkpoint_dir / bin for bin in bin_index["weight_map"].values()}

    def permute(w, n_head):
        dim = config.dim
        return (
            w.view(n_head, 2, config.head_dim // 2, dim)
            .transpose(1, 2)
            .reshape(config.head_dim * n_head, dim)
        )

    merged_result = {}
    for file in sorted(bin_files):
        state_dict = torch.load(str(file), map_location="cpu", mmap=True, weights_only=True)
        merged_result.update(state_dict)
    final_result = {}
    for key, value in merged_result.items():
        if "layers" in key:
            abstract_key = re.sub(r'(\d+)', '{}', key)
            layer_num = re.search(r'\d+', key).group(0)
            new_key = weight_map[abstract_key]
            if new_key is None:
                continue
            new_key = new_key.format(layer_num)
        else:
            new_key = weight_map[key]

        final_result[new_key] = value

    for key in tuple(final_result.keys()):
        if "wq" in key:
            q = final_result[key]
            k = final_result[key.replace("wq", "wk")]
            v = final_result[key.replace("wq", "wv")]
            q = permute(q, config.n_head)
            k = permute(k, config.n_local_heads)
            final_result[key.replace("wq", "wqkv")] = torch.cat([q, k, v])
            del final_result[key]
            del final_result[key.replace("wq", "wk")]
            del final_result[key.replace("wq", "wv")]
    print(f"Saving checkpoint to {checkpoint_dir / 'model.pth'}")
    torch.save(final_result, checkpoint_dir / "model.pth")

if __name__ == '__main__':
    import argparse
    parser = argparse.ArgumentParser(description='Convert HuggingFace checkpoint.')
    parser.add_argument('--checkpoint_dir', type=Path, default=Path("checkpoints/meta-llama/llama-2-7b-chat-hf"))
    parser.add_argument('--model_name', type=str, default=None)

    args = parser.parse_args()
    convert_hf_checkpoint(
        checkpoint_dir=args.checkpoint_dir,
        model_name=args.model_name,
    )
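
The least obvious step above is `permute`: Hugging Face checkpoints store each head's q/k rows in a half-split rotary layout, and this reshuffle reorders them into the interleaved pairing the model code expects before the fused `torch.cat`. A self-contained sketch of the same row reshuffle, with toy dimensions invented for illustration:

```python
import torch

# Toy sizes (hypothetical): 2 heads, head_dim 4, model dim 8.
n_head, head_dim, dim = 2, 4, 8

def permute(w: torch.Tensor, n_head: int) -> torch.Tensor:
    # Same operation as in convert_hf_checkpoint: view each head's rows as
    # (2 halves x head_dim/2), swap those two axes, and flatten back.
    return (
        w.view(n_head, 2, head_dim // 2, dim)
        .transpose(1, 2)
        .reshape(head_dim * n_head, dim)
    )

w = torch.arange(n_head * head_dim * dim, dtype=torch.float32).view(n_head * head_dim, dim)
# Within each head, rows [0, 1, 2, 3] come out as [0, 2, 1, 3].
print(torch.equal(permute(w, n_head)[1], w[2]))  # True
```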

Diff for: scripts/download.py

+30
@@ -0,0 +1,30 @@
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.

# This source code is licensed under the license found in the
# LICENSE file in the root directory of this source tree.
import os
from typing import Optional

from requests.exceptions import HTTPError


def hf_download(repo_id: Optional[str] = None, hf_token: Optional[str] = None) -> None:
    from huggingface_hub import snapshot_download
    os.makedirs(f"checkpoints/{repo_id}", exist_ok=True)
    try:
        snapshot_download(repo_id, local_dir=f"checkpoints/{repo_id}", local_dir_use_symlinks=False, token=hf_token)
    except HTTPError as e:
        if e.response.status_code == 401:
            print("You need to pass a valid `--hf_token=...` to download private checkpoints.")
        else:
            raise e

if __name__ == '__main__':
    import argparse
    parser = argparse.ArgumentParser(description='Download data from HuggingFace Hub.')
    parser.add_argument('--repo_id', type=str, default="meta-llama/llama-2-7b-chat-hf", help='Repository ID to download from.')
    parser.add_argument('--hf_token', type=str, default=None, help='HuggingFace API token.')

    args = parser.parse_args()
    hf_download(args.repo_id, args.hf_token)
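
`hf_download` can also be driven from Python rather than the CLI; a minimal sketch, assuming it runs from the repo root with `scripts` importable (the repo id below is just one of the supported models, and a token is only needed for gated repos):

```python
from scripts.download import hf_download

# Equivalent to: python scripts/download.py --repo_id openlm-research/open_llama_7b
# Snapshots the repo into checkpoints/openlm-research/open_llama_7b.
hf_download("openlm-research/open_llama_7b", hf_token=None)
```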

Diff for: scripts/prepare.sh

+1
@@ -0,0 +1 @@
python scripts/download.py --repo_id $1 && python scripts/convert_hf_checkpoint.py --checkpoint_dir checkpoints/$1

Diff for: scripts/test_flow.sh

+6
@@ -0,0 +1,6 @@
export MODEL_REPO=meta-llama/Llama-2-7b-chat-hf
rm -r checkpoints/$MODEL_REPO
python scripts/download.py --repo_id $MODEL_REPO
python scripts/convert_hf_checkpoint.py --checkpoint_dir checkpoints/$MODEL_REPO
python quantize.py --checkpoint_path checkpoints/$MODEL_REPO/model.pth
python generate.py --compile --checkpoint_path checkpoints/$MODEL_REPO/model_int8.pth --max_new_tokens 100
