Training with PiSSA completes without errors, but both model merging and evaluation fail with the error below. Could you help me understand what is going wrong? Thanks!
Traceback (most recent call last):
File "", line 198, in _run_module_as_main
File "", line 88, in run_code
File "C:\Users\pc\anaconda3\envs\python311\Scripts\llamafactory-cli.exe_main.py", line 7, in
File "C:\LLaMA-Factory-main\src\llamafactory\cli.py", line 112, in main
run_exp()
File "C:\LLaMA-Factory-main\src\llamafactory\train\tuner.py", line 59, in run_exp
run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
File "C:\LLaMA-Factory-main\src\llamafactory\train\sft\workflow.py", line 52, in run_sft
model = load_model(tokenizer, model_args, finetuning_args, training_args.do_train)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\LLaMA-Factory-main\src\llamafactory\model\loader.py", line 169, in load_model
model = init_adapter(config, model, model_args, finetuning_args, is_trainable)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\LLaMA-Factory-main\src\llamafactory\model\adapter.py", line 299, in init_adapter
model = _setup_lora_tuning(
^^^^^^^^^^^^^^^^^^^
File "C:\LLaMA-Factory-main\src\llamafactory\model\adapter.py", line 191, in _setup_lora_tuning
model = PeftModel.from_pretrained(model, adapter_to_resume, is_trainable=is_trainable, **init_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\pc\anaconda3\envs\python311\Lib\site-packages\peft\peft_model.py", line 545, in from_pretrained
model.load_adapter(
File "C:\Users\pc\anaconda3\envs\python311\Lib\site-packages\peft\peft_model.py", line 1117, in load_adapter
load_result = set_peft_model_state_dict(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\pc\anaconda3\envs\python311\Lib\site-packages\peft\utils\save_and_load.py", line 395, in set_peft_model_state_dict
load_result = model.load_state_dict(peft_model_state_dict, strict=False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\pc\anaconda3\envs\python311\Lib\site-packages\torch\nn\modules\module.py", line 2584, in load_state_dict
raise RuntimeError(
RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
size mismatch for base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.0.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.0.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.0.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.0.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.0.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.0.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.0.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.0.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.0.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.0.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.0.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.0.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.0.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
... (identical size mismatches are reported for every lora_A/lora_B weight in layers 1 through 19: checkpoint rank 16 vs. current model rank 32; the log is truncated here) ...
size mismatch for base_model.model.model.layers.19.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.19.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.19.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.19.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.20.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.20.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.20.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.20.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.20.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.20.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.20.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.20.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.20.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.20.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.20.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.20.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.20.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.20.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.21.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.21.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.21.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.21.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.21.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.21.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.21.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.21.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.21.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.21.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.21.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.21.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.21.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.21.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.22.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.22.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.22.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.22.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.22.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.22.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.22.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.22.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.22.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.22.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.22.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.22.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.22.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.22.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.23.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.23.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.23.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.23.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.23.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.23.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.23.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.23.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.23.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.23.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.23.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.23.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.23.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.23.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.24.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.24.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.24.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.24.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.24.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.24.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.24.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.24.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.24.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.24.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.24.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.24.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.24.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.24.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.25.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.25.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.25.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.25.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.25.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.25.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.25.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.25.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.25.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.25.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.25.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.25.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.25.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.25.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.26.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.26.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.26.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.26.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.26.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.26.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.26.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.26.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.26.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.26.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.26.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.26.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.26.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.26.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.27.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.27.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.27.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.27.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.27.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.27.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.27.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.27.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.27.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.27.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.27.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.27.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.27.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.27.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.28.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.28.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.28.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.28.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.28.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.28.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.28.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.28.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.28.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.28.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.28.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.28.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.28.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.28.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.29.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.29.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.29.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.29.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.29.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.29.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.29.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.29.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.29.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.29.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.29.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.29.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.29.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.29.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.30.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.30.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.30.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.30.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.30.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.30.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.30.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.30.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.30.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.30.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.30.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.30.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.30.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.30.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.31.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.31.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.31.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.31.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.31.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.31.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.31.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.31.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.31.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.31.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.31.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.31.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.31.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.31.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
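The shapes in the log themselves narrow the problem down: PEFT stores lora_A as [r, in_features] and lora_B as [out_features, r], so the checkpoint being loaded was saved with rank r=16 while the model being reconstructed expects r=32. A minimal diagnostic sketch (not part of LLaMA-Factory; `adapter_path` is a placeholder for your saved adapter directory) to check which side disagrees:

```python
# Hedged diagnostic sketch: compare the rank recorded in adapter_config.json
# with the actual rank of the saved LoRA tensors in the same directory.
import json
import os

from safetensors.torch import load_file

adapter_path = "saves/your-adapter"  # placeholder, adjust to your output dir

with open(os.path.join(adapter_path, "adapter_config.json")) as f:
    cfg = json.load(f)
print("r in adapter_config.json:", cfg.get("r"))

weights = load_file(os.path.join(adapter_path, "adapter_model.safetensors"))
for name, tensor in weights.items():
    if "lora_A" in name:
        # lora_A is stored as [r, in_features], so dim 0 is the trained rank
        print(name, "-> rank in checkpoint:", tensor.shape[0])
        break
```

If the two numbers disagree, the config and the weights were written by different steps. One known source of exactly this 16-versus-32 pattern is PiSSA's conversion back to a plain LoRA adapter, which doubles the effective rank (the converted delta is stored as a rank-2r adapter), so a rank-16 PiSSA run yields a rank-32 converted adapter; mixing a converted adapter_config.json with unconverted weights (or vice versa) then fails at load time with this size mismatch.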
size mismatch for base_model.model.model.layers.6.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.6.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.6.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.6.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.6.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.6.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.6.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.6.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.6.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.6.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.6.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.7.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.7.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.7.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.7.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.7.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.7.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.7.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.7.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.7.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.7.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.7.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.7.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.7.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.7.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.8.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.8.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.8.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.8.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.8.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.8.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.8.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.8.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.8.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.8.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.8.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.8.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.8.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.8.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.9.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.9.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.9.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.9.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.9.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.9.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.9.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.9.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.9.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.9.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.9.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.9.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.9.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.9.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.10.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.10.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.10.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.10.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.10.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.10.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.10.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.10.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.10.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.10.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.10.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.10.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.10.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.10.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.11.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.11.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.11.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.11.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.11.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.11.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.11.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.11.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.11.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.11.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.11.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.11.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.11.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.11.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.12.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.12.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.12.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.12.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.12.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.12.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.12.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.12.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.12.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.12.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.12.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.12.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.12.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.12.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.13.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.13.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.13.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.13.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.13.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.13.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.13.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.13.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.13.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.13.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.13.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.13.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.13.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.13.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.14.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.14.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.14.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.14.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.14.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.14.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.14.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.14.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.14.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.14.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.14.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.14.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.14.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.14.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.15.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.15.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.15.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.15.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.15.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.15.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.15.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.15.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.15.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.15.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.15.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.15.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.15.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.15.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.16.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.16.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.16.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.16.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.16.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.16.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.16.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.16.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.16.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.16.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.16.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.16.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.16.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.16.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.17.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.17.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.17.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.17.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.17.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.17.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.17.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.17.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.17.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.17.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.17.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.17.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.17.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.17.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.18.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.18.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.18.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.18.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.18.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.18.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.18.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.18.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.18.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.18.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.18.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.18.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.18.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.18.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.19.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.19.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.19.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.19.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.19.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.19.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.19.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.19.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.19.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.19.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.19.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.19.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.19.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.19.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.20.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.20.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.20.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.20.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.20.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.20.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.20.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.20.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.20.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.20.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.20.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.20.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.20.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.20.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.21.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.21.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.21.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.21.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.21.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.21.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.21.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.21.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.21.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.21.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.21.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.21.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.21.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.21.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.22.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.22.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.22.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.22.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.22.self_attn.v_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.22.self_attn.v_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
size mismatch for base_model.model.model.layers.22.self_attn.o_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.22.self_attn.o_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.22.mlp.gate_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.22.mlp.gate_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.22.mlp.up_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.22.mlp.up_proj.lora_B.default.weight: copying a param with shape torch.Size([14336, 16]) from checkpoint, the shape in current model is torch.Size([14336, 32]).
size mismatch for base_model.model.model.layers.22.mlp.down_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 14336]) from checkpoint, the shape in current model is torch.Size([32, 14336]).
size mismatch for base_model.model.model.layers.22.mlp.down_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.23.self_attn.q_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.23.self_attn.q_proj.lora_B.default.weight: copying a param with shape torch.Size([4096, 16]) from checkpoint, the shape in current model is torch.Size([4096, 32]).
size mismatch for base_model.model.model.layers.23.self_attn.k_proj.lora_A.default.weight: copying a param with shape torch.Size([16, 4096]) from checkpoint, the shape in current model is torch.Size([32, 4096]).
size mismatch for base_model.model.model.layers.23.self_attn.k_proj.lora_B.default.weight: copying a param with shape torch.Size([1024, 16]) from checkpoint, the shape in current model is torch.Size([1024, 32]).
... (the identical size-mismatch error repeats for every lora_A / lora_B weight of q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj and down_proj in layers 23 through 31: in every case the checkpoint tensor has rank 16 where the current model expects rank 32) ...
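From the repeated pattern above, every saved `lora_A` weight has first dimension 16 while the freshly built PEFT model expects 32, i.e. the rank recorded in the adapter's `adapter_config.json` does not match the rank of the tensors actually stored in the checkpoint (PiSSA conversion can change the effective rank written into the config). As a quick check, the two ranks can be compared directly; this is a minimal diagnostic sketch, assuming the adapter directory contains the usual PEFT files `adapter_config.json` and `adapter_model.safetensors` (the `ADAPTER_DIR` path below is a placeholder, not from the original report):

```python
import json
from safetensors.torch import load_file

ADAPTER_DIR = "path/to/pissa_adapter"  # placeholder: the adapter checkpoint directory

# Rank that PeftModel.from_pretrained will build the model with.
with open(f"{ADAPTER_DIR}/adapter_config.json") as f:
    cfg = json.load(f)
print("adapter_config r =", cfg.get("r"), "| lora_alpha =", cfg.get("lora_alpha"))

# Rank actually stored in the saved tensors (first dim of any lora_A weight).
weights = load_file(f"{ADAPTER_DIR}/adapter_model.safetensors")
for name, tensor in weights.items():
    if "lora_A" in name:
        print("saved lora_A rank =", tensor.shape[0], "e.g.", name)
        break
```

If the two numbers disagree (here: 32 in the config vs. 16 in the tensors), loading fails exactly as in the traceback; the fix would presumably be to make them consistent, e.g. by re-exporting the adapter or by using the same `lora_rank` and PiSSA settings for merge/evaluation as were used for training.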