Fix Unknown Architecture Error #485

Open
wants to merge 1 commit into
base: main

Conversation

twanchen

Description of the change

Initialize model architecture to gemma_config.Architecture.GEMMA_1

Motivation

When run natively, the code in the notebook throws this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-8-e00c04154560> in <cell line: 11>()
      9 torch.set_default_dtype(model_config.get_dtype())
     10 device = torch.device(MACHINE_TYPE)
---> 11 model = GemmaForCausalLM(model_config)
     12 model.load_weights(ckpt_path)
     13 model = model.to(device).eval()


1 frames
/content/gemma_pytorch/gemma/model.py in __init__(self, config)
    479                 self.layers.append(Gemma2DecoderLayer(config, attn_type))
    480             else:
--> 481                 raise ValueError(f'Unknown architecture: {config.architecture}')
    482         self.norm = RMSNorm(config.hidden_size, eps=config.rms_norm_eps)
    483 

ValueError: Unknown architecture: Architecture.GEMMA_1

This error occurs because the code in model.py does not recognize Architecture.GEMMA_1 as gemma_config.Architecture.GEMMA_1.
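One plausible explanation (an assumption, not confirmed in this PR) is that the Architecture enum ends up defined twice, for example when config.py is loaded under two different module paths. The sketch below uses a hypothetical helper, make_architecture_enum, to mimic that situation. It shows that two enum classes with identical names and members print the same but do not compare equal, which would match the confusing "Unknown architecture: Architecture.GEMMA_1" message.

```python
import enum


def make_architecture_enum():
    """Hypothetical helper: builds a fresh enum class on every call,
    mimicking a module body that gets executed more than once."""

    class Architecture(enum.Enum):
        GEMMA_1 = 1
        GEMMA_2 = 2

    return Architecture


# Stand-ins (assumed names) for two separately loaded copies of the enum.
ArchFromConfig = make_architecture_enum()
ArchFromModel = make_architecture_enum()

# Members of the same class compare equal.
same_class_equal = ArchFromConfig.GEMMA_1 == ArchFromConfig.GEMMA_1

# Members of distinct classes never compare equal, even with identical
# names and values, because Enum equality is identity-based.
cross_class_equal = ArchFromConfig.GEMMA_1 == ArchFromModel.GEMMA_1

# Both members still render identically, which hides the mismatch.
strings_match = str(ArchFromConfig.GEMMA_1) == str(ArchFromModel.GEMMA_1)
```

If this is what happens in the notebook, an if/elif chain that compares config.architecture against the enum as model.py sees it would fall through to the else branch and raise.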

config.py defines Architecture.GEMMA_1 and makes it the default value of GemmaConfig.architecture:

class Architecture(enum.Enum):
    GEMMA_1 = 1
    GEMMA_2 = 2


@dataclasses.dataclass
class GemmaConfig:
    # The architecture of the model.
    architecture: Architecture = Architecture.GEMMA_1
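The change described in this PR can be sketched as follows. This is a minimal, self-contained illustration rather than the actual notebook code: Architecture and GemmaConfig are trimmed copies of the config.py excerpt above, and pick_layer_label is a hypothetical stand-in for the branch in model.py that raises the error. Its return labels are placeholders, not real class names.

```python
import dataclasses
import enum


class Architecture(enum.Enum):
    GEMMA_1 = 1
    GEMMA_2 = 2


@dataclasses.dataclass
class GemmaConfig:
    # The architecture of the model.
    architecture: Architecture = Architecture.GEMMA_1


def pick_layer_label(config: GemmaConfig) -> str:
    """Hypothetical stand-in for the architecture check in model.py."""
    if config.architecture == Architecture.GEMMA_1:
        return "gemma-1 decoder layer"  # placeholder label
    elif config.architecture == Architecture.GEMMA_2:
        return "gemma-2 decoder layer"  # placeholder label
    else:
        raise ValueError(f'Unknown architecture: {config.architecture}')


model_config = GemmaConfig()
# The fix: set the architecture explicitly from the same enum class that
# the model code compares against, before the model is constructed.
model_config.architecture = Architecture.GEMMA_1

layer_label = pick_layer_label(model_config)
```

In the notebook, the equivalent assignment would use gemma_config.Architecture.GEMMA_1 and come before the GemmaForCausalLM(model_config) call shown in the traceback.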

Type of change

Bug fix

Checklist

  • I have performed a self-review of my code.
  • I have added detailed comments to my code where applicable.
  • I have verified that my change does not break existing code.
  • My PR is based on the latest changes of the main branch (if unsure, please run git pull --rebase upstream main).
  • I am familiar with the Google Style Guide for the language I have coded in.
  • I have read through the Contributing Guide and signed the Contributor License Agreement.

@twanchen twanchen requested a review from a team as a code owner July 10, 2024 05:20
@github-actions github-actions bot added status:awaiting review PR awaiting review from a maintainer component:documentation Update docs labels Jul 10, 2024
@ErikUustalu

I have the same problem. Do I have to wait for a fix from the devs, or can I fix it myself?

@markmcd
Member

markmcd commented Aug 8, 2024

@twanchen can you rebase this change so that it's not conflicting with @jethac's Gemma 2 changes?
