Add support for nomic-ai/nomic-embed-text-v1.5 model #1874

bhavika · 2024-05-24T06:36:19Z

What does this PR do?

Adds support for nomic-embed-text-v1.5 which is a variation of BERT.

I've tested this PR using the following script:

import numpy as np
from transformers import AutoTokenizer, AutoModel
from pathlib import Path
from optimum.exporters import TasksManager
from optimum.exporters.onnx import export
import onnx
from optimum.exporters.onnx import validate_model_outputs
from optimum.onnxruntime import ORTModelForFeatureExtraction
import torch

if __name__ == "__main__":
    MODEL_NAME = "nomic-ai/nomic-embed-text-v1.5"
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    inputs = tokenizer("My name is Philipp and I live in Germany.", return_tensors="pt")

    hf_model = AutoModel.from_pretrained(
        MODEL_NAME,
        trust_remote_code=True,
        safe_serialization=True,
        revision="91d2d6bfdddf0b0da840f901b533e99bae30d757",
    )

    onnx_path = Path("tests/nomic_bert.onnx")
    onnx_config_constructor = TasksManager.get_exporter_config_constructor(
        "onnx", hf_model, task="feature-extraction", library_name="transformers"
    )
    onnx_config = onnx_config_constructor(hf_model.config)
    print(onnx_config)
    onnx_inputs, onnx_outputs = export(hf_model, onnx_config, onnx_path, onnx_config.DEFAULT_ONNX_OPSET)

    print("After export")
    print(onnx_inputs)
    print(onnx_outputs)

    onnx_model = onnx.load(onnx_path)
    onnx.checker.check_model(onnx_model)

    validate_model_outputs(onnx_config, hf_model, onnx_path, onnx_outputs, onnx_config.ATOL_FOR_VALIDATION)

    model = ORTModelForFeatureExtraction.from_pretrained(
        MODEL_NAME,
        file_name="onnx/model.onnx",
        trust_remote_code=True,
        safe_serialization=True,
        revision="91d2d6bfdddf0b0da840f901b533e99bae30d757",
    )

    optimum_model_outputs = model(**inputs)
    print(optimum_model_outputs.last_hidden_state)

    hf_model_outputs = hf_model(**inputs)
    print(hf_model_outputs.last_hidden_state)

    print("Are inputs from both models close?", np.allclose(optimum_model_outputs.last_hidden_state.cpu().detach().numpy(), hf_model_outputs.last_hidden_state.cpu().detach().numpy(), rtol=1e-3, atol=1e-3))

which yields:

❯ python tests/nomic_bert.py
/Users/bhavika/src/github.com/huggingface/optimum/.venv/lib/python3.9/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
<All keys matched successfully>
<optimum.exporters.onnx.model_configs.NomicBertOnnxConfig object at 0x131bc5c10>
Using framework PyTorch: 2.3.0
Overriding 1 configuration item(s)
        - use_cache -> False
/Users/bhavika/.cache/huggingface/modules/transformers_modules/nomic-ai/nomic-bert-2048/7b260c5676ce4ba4a117f15bc24ac13ab4b81695/modeling_hf_nomic_bert.py:621: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if seqlen > self._seq_len_cached:
/Users/bhavika/.cache/huggingface/modules/transformers_modules/nomic-ai/nomic-bert-2048/7b260c5676ce4ba4a117f15bc24ac13ab4b81695/modeling_hf_nomic_bert.py:573: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if (
/Users/bhavika/.cache/huggingface/modules/transformers_modules/nomic-ai/nomic-bert-2048/7b260c5676ce4ba4a117f15bc24ac13ab4b81695/modeling_hf_nomic_bert.py:507: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert ro_dim <= x.shape[-1]
After export
['input_ids', 'attention_mask', 'token_type_ids']
['last_hidden_state']
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
        - Avoid using `tokenizers` before the fork if possible
        - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
        - Avoid using `tokenizers` before the fork if possible
        - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
        - Avoid using `tokenizers` before the fork if possible
        - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)

Validating ONNX model tests/nomic_bert.onnx...
        -[✓] ONNX model output names match reference model (last_hidden_state)
        - Validating ONNX Model output "last_hidden_state":
                -[✓] (2, 16, 768) matches (2, 16, 768)
                -[✓] all values close (atol: 0.0001)
The argument `trust_remote_code` is to be used along with export=True. It will be ignored.
The ONNX file onnx/model.onnx is not a regular name used in optimum.onnxruntime, the ORTModel might not behave as expected.
tensor([[[ 1.1261, -0.4374, -3.6442,  ..., -0.6255, -0.1797, -0.4537],
         [ 1.0806, -0.7503, -2.3863,  ..., -0.2730, -0.1225, -0.2811],
         [ 0.8267, -0.6160, -1.9121,  ..., -0.5175, -0.0665,  0.4226],
         ...,
         [ 1.5216, -0.2706, -3.2862,  ..., -0.3274, -0.7411,  0.0825],
         [ 1.2570, -0.2871, -4.0211,  ..., -0.1678, -0.7790,  0.0134],
         [ 1.2286, -0.2899, -4.0887,  ..., -0.1727, -0.7650, -0.0908]]])
tensor([[[ 1.1261, -0.4374, -3.6442,  ..., -0.6255, -0.1797, -0.4537],
         [ 1.0806, -0.7503, -2.3863,  ..., -0.2730, -0.1225, -0.2811],
         [ 0.8267, -0.6160, -1.9121,  ..., -0.5175, -0.0665,  0.4226],
         ...,
         [ 1.5216, -0.2706, -3.2862,  ..., -0.3274, -0.7411,  0.0825],
         [ 1.2570, -0.2871, -4.0211,  ..., -0.1678, -0.7790,  0.0134],
         [ 1.2286, -0.2899, -4.0887,  ..., -0.1727, -0.7650, -0.0908]]],
       grad_fn=<NativeLayerNormBackward0>)
Are inputs from both models close? True

CLI exporter

optimum-cli export onnx -m nomic-ai/nomic-embed-text-v1.5 nomic_onnx_optimum --trust-remote-code

gives me very different results, and these vary on every error so the diff is sometimes very large:

❯ optimum-cli export onnx -m nomic-ai/nomic-embed-text-v1.5 nomic_onnx_optimum --trust-remote-code
Framework not specified. Using pt to export the model.
/Users/bhavika/src/github.com/huggingface/optimum/.venv/lib/python3.9/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
<All keys matched successfully>
Automatic task detection to feature-extraction (possible synonyms are: default, image-feature-extraction, mask-generation, sentence-similarity).
Using the export variant default. Available variants are:
    - default: The default ONNX variant.

***** Exporting submodel 1/1: SentenceTransformer *****
Using framework PyTorch: 2.3.0
Overriding 1 configuration item(s)
        - use_cache -> False
/Users/bhavika/.cache/huggingface/modules/transformers_modules/nomic-ai/nomic-bert-2048/7b260c5676ce4ba4a117f15bc24ac13ab4b81695/modeling_hf_nomic_bert.py:621: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if seqlen > self._seq_len_cached:
/Users/bhavika/.cache/huggingface/modules/transformers_modules/nomic-ai/nomic-bert-2048/7b260c5676ce4ba4a117f15bc24ac13ab4b81695/modeling_hf_nomic_bert.py:573: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if (
/Users/bhavika/.cache/huggingface/modules/transformers_modules/nomic-ai/nomic-bert-2048/7b260c5676ce4ba4a117f15bc24ac13ab4b81695/modeling_hf_nomic_bert.py:507: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert ro_dim <= x.shape[-1]
Post-processing the exported models...
Deduplicating shared (tied) weights...

Validating ONNX model nomic_onnx_optimum/model.onnx...
        -[✓] ONNX model output names match reference model (token_embeddings, sentence_embedding)
        - Validating ONNX Model output "token_embeddings":
                -[✓] (2, 16, 768) matches (2, 16, 768)
                -[x] values not close enough, max diff: 7.82012939453125e-05 (atol: 1e-05)
        - Validating ONNX Model output "sentence_embedding":
                -[✓] (2, 768) matches (2, 768)
                -[x] values not close enough, max diff: 4.673004150390625e-05 (atol: 1e-05)
The ONNX export succeeded with the warning: The maximum absolute difference between the output of the reference model and the ONNX exported model is not within the set tolerance 1e-05:
- token_embeddings: max diff = 7.82012939453125e-05
- sentence_embedding: max diff = 4.673004150390625e-05.
 The exported model was saved at: nomic_onnx_optimum

@xenova any thoughts on what I should check here? I'll test for sequence lengths too but this difference has me worried.

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

Who can review?

xenova · 2024-05-24T11:44:26Z

Thanks for this! 🤗 Can you confirm that the exported model produces the same results as the python version for 2048 < context length <= 8192? That would be very helpful!

bhavika · 2024-05-24T12:14:09Z

Hi! Yes I plan to do that. This PR is still a work in progress. Thanks for taking a look!

…

On Fri, May 24, 2024 at 7:44 AM Joshua Lochner ***@***.***> wrote: Thanks for this! 🤗 Can you confirm that the exported model produces the same results as the python version for 2048 < context length <= 8192? That would be very helpful! — Reply to this email directly, view it on GitHub <#1874 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ABFZX33USECLSKIZQMJXSV3ZD4R25AVCNFSM6AAAAABIG65RD6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCMRZGMZDKNJZHA> . You are receiving this because you authored the thread.Message ID: ***@***.***>

bhavika · 2024-05-29T06:44:31Z

Optimum environment information for debugging:

I was wondering if it has anything to do with CUDA being available, but I see the same thing happen on a g2-standard-4 as well.

- `optimum` version: 1.20.0.dev0
- `transformers` version: 4.40.2
- Platform: Linux-5.10.0-26-cloud-amd64-x86_64-with-glibc2.31
- Python version: 3.10.13
- Huggingface_hub version: 0.23.2
- PyTorch version (GPU?): 2.3.0+cu121 (cuda available: True)
- Tensorflow version (GPU?): not installed (cuda available: NA)

The ONNX export succeeded with the warning: The maximum absolute difference between the output of the reference model and the ONNX exported model is not within the set tolerance 1e-05:
- token_embeddings: max diff = 3.6067795008420944e-05
- sentence_embedding: max diff = 2.1696090698242188e-05.
 The exported model was saved at: nomic_onnx_optimum
(optimum) (base) gcpuser@bt-f609-head-e6t0afg4-compute:~/optimum$ optimum-cli export onnx -m nomic-ai/nomic-embed-text-v1.5 nomic_onnx_optimum --trust-remote-code
Framework not specified. Using pt to export the model.
/home/gcpuser/optimum/.venv/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
<All keys matched successfully>
Automatic task detection to feature-extraction (possible synonyms are: default, image-feature-extraction, mask-generation, sentence-similarity).
Using the export variant default. Available variants are:
    - default: The default ONNX variant.

***** Exporting submodel 1/1: SentenceTransformer *****
Using framework PyTorch: 2.3.0+cu121
Overriding 1 configuration item(s)
        - use_cache -> False
/home/gcpuser/.cache/huggingface/modules/transformers_modules/nomic-ai/nomic-bert-2048/7b260c5676ce4ba4a117f15bc24ac13ab4b81695/modeling_hf_nomic_bert.py:621: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if seqlen > self._seq_len_cached:
/home/gcpuser/.cache/huggingface/modules/transformers_modules/nomic-ai/nomic-bert-2048/7b260c5676ce4ba4a117f15bc24ac13ab4b81695/modeling_hf_nomic_bert.py:574: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  seqlen > self._seq_len_cached
/home/gcpuser/.cache/huggingface/modules/transformers_modules/nomic-ai/nomic-bert-2048/7b260c5676ce4ba4a117f15bc24ac13ab4b81695/modeling_hf_nomic_bert.py:507: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert ro_dim <= x.shape[-1]
Post-processing the exported models...
Deduplicating shared (tied) weights...

Validating ONNX model nomic_onnx_optimum/model.onnx...
        -[✓] ONNX model output names match reference model (token_embeddings, sentence_embedding)
        - Validating ONNX Model output "token_embeddings":
                -[✓] (2, 16, 768) matches (2, 16, 768)
                -[x] values not close enough, max diff: 7.593631744384766e-05 (atol: 1e-05)
        - Validating ONNX Model output "sentence_embedding":
                -[✓] (2, 768) matches (2, 768)
                -[✓] all values close (atol: 1e-05)
The ONNX export succeeded with the warning: The maximum absolute difference between the output of the reference model and the ONNX exported model is not within the set tolerance 1e-05:
- token_embeddings: max diff = 7.593631744384766e-05.
 The exported model was saved at: nomic_onnx_optimum

bhavika · 2024-05-29T16:22:51Z

Testing with longer texts:

import numpy as np
from transformers import AutoTokenizer, AutoModel
from pathlib import Path
from optimum.exporters import TasksManager
from optimum.exporters.onnx import export
import onnx
from optimum.exporters.onnx import validate_model_outputs
from optimum.onnxruntime import ORTModelForFeatureExtraction
import torch

# text generated by GPT-4o
long_input = """
The journey of video games is a fascinating narrative that has seen tremendous evolution since its inception in the mid-20th century. From rudimentary beginnings to today's intricate, immersive experiences, the video game industry's evolution has been marked by technological advancements, cultural shifts, and expanding audiences. The history of video games can be traced back to the 1950s, with the development of early electronic games. One of the earliest examples of a video game is "Tennis for Two," created by physicist William Higinbotham in 1958. This rudimentary tennis simulation displayed on an oscilloscope marked the beginning of interactive electronic entertainment, showcasing the potential of electronic systems to engage users in playful activities.

In the 1960s, the development of more sophisticated games on mainframe computers began to take shape. MIT's "Spacewar!" (1962) is often credited as the first influential digital game. Created by Steve Russell, Martin Graetz, and Wayne Wiitanen, "Spacewar!" featured two spaceships engaged in a dogfight, complete with gravity wells and hyperspace jumps. Although it was not commercially released, it was distributed among computer research labs, influencing future game developers and laying the groundwork for what video games could become.

The 1970s saw a significant turning point with the creation of "Pong" by Atari in 1972. This simple yet addictive tennis game transformed the gaming landscape by introducing arcade machines to the public, establishing the foundation of the commercial video game industry. These early arcade games captivated audiences and laid essential groundwork for future developments. The late 1970s and 1980s are often referred to as the Golden Age of video games, marked by rapid advancements in computing power and display technology. Color graphics and improved sound capabilities became standard, significantly enhancing the user experience. During this era, iconic arcade games like "Space Invaders" (1978), "Pac-Man" (1980), and "Donkey Kong" (1981) became cultural phenomena. These titles were not only entertaining but also demonstrated the potential for video games to be cultural touchstones.

Simultaneously, the home console market began to develop. The release of the Atari 2600 in 1977 was a pivotal moment, allowing players to bring the arcade experience home. The Atari 2600's success spurred other companies to enter the market, leading to a proliferation of consoles and game titles. However, the industry experienced rapid growth but was not without challenges. The video game crash of 1983 was a significant downturn caused by market saturation, poor-quality games, and competition from home computers. Many companies went bankrupt, and the future of video games appeared uncertain.

The industry's revival came with the release of the Nintendo Entertainment System (NES) in 1985. Nintendo's strict quality control and the introduction of iconic franchises like "Super Mario Bros." and "The Legend of Zelda" restored consumer confidence. The NES became a global success and set new standards for home gaming. Nintendo's success encouraged competition, notably from Sega, which released the Sega Genesis in 1988. The Genesis touted superior graphics and processing power, pushing both companies to innovate and leading to the creation of memorable characters such as Sonic the Hedgehog.

The late 1980s also saw the emergence of handheld gaming with the release of the Nintendo Game Boy in 1989. The Game Boy's portability and a library of popular games, including "Tetris," made it an instant hit and established handheld gaming as a significant market segment. The early 1990s ushered in the era of 16-bit consoles, with the Super Nintendo Entertainment System (SNES) and the Sega Genesis leading the charge, offering enhanced graphics, more complex gameplay, and larger game libraries. Memorable titles such as "The Legend of Zelda: A Link to the Past" and "Sonic the Hedgehog 2" became classics of the era.

The mid-1990s brought CD-ROM technology, allowing for larger game files and more intricate multimedia experiences. The Sega CD and the Sony PlayStation were among the pioneers in utilizing this technology, paving the way for more expansive and cinematic games. One of the most significant technological leaps in the history of video games was the transition from 2D to 3D graphics in the mid-1990s. Games like "Super Mario 64" (1996) and "The Legend of Zelda: Ocarina of Time" (1998) revolutionized gameplay by introducing fully three-dimensional environments. These titles not only showcased the capabilities of the Nintendo 64 but also set new standards for game design.

Sony's entry into the gaming market with the PlayStation in 1994 marked a significant shift in the industry. The PlayStation appealed to a broader audience, emphasizing mature content and third-party developer support. Games like "Final Fantasy VII" and "Metal Gear Solid" pushed the boundaries of storytelling and graphical fidelity. Microsoft entered the gaming fray with the release of the Xbox in 2001, introducing online gaming with Xbox Live and setting the stage for the future of multiplayer gaming. Iconic franchises like "Halo" emerged, cementing the Xbox as a formidable competitor.

The 2000s saw the rapid expansion of online gaming, with PC titles like "World of Warcraft" (2004) and console games like "Halo 2" (2004) demonstrating the potential of online multiplayer experiences. Online gaming fostered social communities and competitive play, transforming how players interacted with each other. Massively Multiplayer Online Role-Playing Games (MMORPGs) became a dominant genre. "World of Warcraft" became a cultural phenomenon, attracting millions of subscribers and maintaining a dedicated player base for over a decade. The success of MMORPGs highlighted the potential for games to be ever-evolving, persistent worlds.

The mid-2000s saw the rise of casual gaming, driven by platforms like Facebook and mobile devices. Games like "FarmVille" and "Angry Birds" reached broader audiences, introducing gaming to people who had never considered themselves gamers. The accessibility and simplicity of these games contributed to their widespread appeal. The release of the Nintendo Wii in 2006 brought another wave of innovation. The Wii's motion controls and intuitive interface appealed to a wide demographic, including families and elderly players, showcasing that gaming could be accessible and physically engaging.

The 2010s introduced the next generation of consoles, including the PlayStation 4, Xbox One, and Nintendo Switch. These consoles boasted impressive hardware capabilities, enabling lifelike graphics, expansive open worlds, and seamless online experiences. Esports emerged as a major industry, with games like "League of Legends," "Dota 2," and "Overwatch" drawing massive audiences and offering substantial prize pools. Esports athletes gained recognition akin to traditional sports stars, further legitimizing gaming as a competitive pursuit.

Independent game developers gained prominence, thanks to digital distribution platforms like Steam and the accessibility of game development tools. Indie titles like "Minecraft," "Undertale," and "Celeste" found success by offering unique gameplay experiences and innovative narratives. Advancements in Virtual Reality (VR) and Augmented Reality (AR) technology opened new dimensions for gaming. Devices like the Oculus Rift, PlayStation VR, and the AR game "Pokémon GO" offered immersive and interactive experiences that pushed the boundaries of traditional gaming.

The rise of streaming platforms like Twitch and YouTube Gaming transformed how games were consumed. Gamers became influencers, showcasing their skills, entertaining viewers, and building dedicated communities. This shift highlighted the social and performative aspects of gaming. From humble beginnings to a global cultural phenomenon, the evolution of video games is a testament to technological innovation, creative storytelling, and societal impact. The industry has continually adapted and grown, embracing new technologies and reaching broader audiences.

As we look to the future, the potential for video games to further evolve and shape our world remains boundless. Whether through advancements in AI, increased realism, or new forms of interactivity, the journey of video games is a dynamic and ever-changing narrative that will continue to captivate and inspire generations to come. The cultural significance of video games cannot be overstated. What began as simple amusements has now become an integral part of modern entertainment culture. Today, gaming influences not only the entertainment industry but also education, art, and social interactions. Video games offer a medium for storytelling and artistic expression, where developers can create rich, interactive narratives and worlds that engage players on multiple levels.

Traditional gameplay mechanics have become platforms for complex storytelling. Titles like "The Last of Us" and "Red Dead Redemption" serve as exemplary cases where the narrative drive is as compelling as the gameplay. The characters are well-developed, the plots intricate, and the worlds immersive, offering experiences akin to interactive movies.

Furthermore, video games have also contributed to social change and awareness. Games like "Hellblade: Senua's Sacrifice" tackle mental health issues, offering players insight into the experiences of individuals living with conditions such as psychosis. This ability to simulate different perspectives and experiences allows players to develop empathy and understanding of complex issues.

The educational potential of video games is increasingly being recognized. Gamification of learning provides an engaging way to understand complex subjects. Games like "Minecraft: Education Edition" have been used in classrooms to teach subjects ranging from history to coding. Virtual reality (VR) simulations allow medical students to practice surgeries or pilots to simulate flying before ever stepping into a real cockpit.

Augmented Reality (AR) games such as "Pokémon GO" have further blurred the lines between virtual and real worlds. These games bring digital components into our physical world, encouraging outdoor activity and social interaction. Imagine future applications where AR can assist in navigation, provide real-time data overlays, and even transform educational field trips into interactive learning experiences.

Technological advancements continue to stretch the limits of what is possible in gaming. Artificial Intelligence (AI) is improving non-player characters (NPCs) to act more human, creating more challenging and realistic interactions. Procedural generation algorithms are constructing vast, explorable worlds, offering something new with every game session. Ray tracing technology is making lighting in games more realistic than ever, and haptic feedback is adding a new layer of sensory immersion.

Mobile gaming, too, has grown exponentially. The accessibility of smartphones has brought gaming to almost everyone. Mobile games now encompass a wide variety of genres and complexities, from simple time-killers to complex strategy games and even MMORPGs.

The monetization strategies within the gaming industry have also evolved, with free-to-play models becoming increasingly common. These games generate revenue through advertisements and microtransactions. While this model has been met with some criticism, it has also allowed for the creation of expansive, continually updated games that remain free to play.

Cloud gaming appears to be the next frontier. Services like Google Stadia, NVIDIA GeForce Now, and Microsoft’s Xbox Cloud Gaming are experimenting with creating platforms where games are streamed over the internet, eliminating the need for powerful local hardware. If successful, this could revolutionize access to high-quality gaming experiences, making them accessible to a broader audience.

As the gaming industry expands, it also faces significant challenges. Issues like crunch culture—where developers are required to work excessively long hours to meet deadlines—have come under increasing scrutiny. The industry is beginning to recognize the importance of sustainable work practices and the value of mental health, though more progress is needed.

Inclusivity is another critical issue. Historically, video games have been male-dominated both in terms of the workforce and the target audience. However, this is changing as the industry strives for greater representation and inclusivity both in the games being produced and within the workforce.

Finally, the role of regulation is becoming more critical as video games become ubiquitous. Issues such as data privacy, online harassment, and gaming addiction are subjects of increasing concern. Governments and industry bodies must work together to create frameworks that protect players and address these challenges without stifling innovation.

In conclusion, the evolution of video games is a remarkable story of technological innovation, creative storytelling, and cultural impact. From simple beginnings to today's complex, immersive experiences, video games have become an integral part of modern life. They entertain, educate, and even inspire social change. As technology continues to advance, the potential for video games to further shape our world is boundless. The journey of video games is dynamic and ever-changing, promising to captivate and inspire generations to come.

Whether through advancements in AI, increased realism, or new forms of interactivity, the industry is poised to continue its growth and influence. As video games evolve, their impact on culture, art, education, and society will undoubtedly deepen, solidifying their place as one of the most important forms of modern entertainment and expression. The future of video games is a bright and exciting frontier, and we can only look forward to the innovations and transformations that lie ahead.
"""


if __name__ == "__main__":
    MODEL_NAME = "nomic-ai/nomic-embed-text-v1.5"
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    inputs = tokenizer("My name is Philipp and I live in Germany.", return_tensors="pt", padding=True, truncation=True)
    long_inputs = tokenizer(long_input, return_tensors="pt", padding=True, truncation=True)

    hf_model = AutoModel.from_pretrained(
        MODEL_NAME,
        trust_remote_code=True,
        safe_serialization=True,
        revision="91d2d6bfdddf0b0da840f901b533e99bae30d757",
    )

    onnx_path = Path("tests/nomic_bert.onnx")
    onnx_config_constructor = TasksManager.get_exporter_config_constructor(
        "onnx", hf_model, task="feature-extraction", library_name="transformers"
    )
    onnx_config = onnx_config_constructor(hf_model.config)
    print(onnx_config)
    onnx_inputs, onnx_outputs = export(hf_model, onnx_config, onnx_path, onnx_config.DEFAULT_ONNX_OPSET)

    print("After export")
    print(onnx_inputs)
    print(onnx_outputs)

    onnx_model = onnx.load(onnx_path)
    onnx.checker.check_model(onnx_model)

    validate_model_outputs(onnx_config, hf_model, onnx_path, onnx_outputs, onnx_config.ATOL_FOR_VALIDATION)

    model = ORTModelForFeatureExtraction.from_pretrained(
        MODEL_NAME,
        file_name="onnx/model.onnx",
        trust_remote_code=True,
        safe_serialization=True,
        revision="91d2d6bfdddf0b0da840f901b533e99bae30d757",
    )

    optimum_model_outputs = model(**inputs)
    print(optimum_model_outputs.last_hidden_state)

    hf_model_outputs = hf_model(**inputs)
    print(hf_model_outputs.last_hidden_state)

    print("Are outputs from both models close?", np.allclose(optimum_model_outputs.last_hidden_state.cpu().detach().numpy(), hf_model_outputs.last_hidden_state.cpu().detach().numpy(), rtol=1e-3, atol=1e-3))

    print("Testing for longer sequence length for Nomic-AI model")

    optimum_model_outputs = model(**long_inputs)
    print(optimum_model_outputs.last_hidden_state)

    hf_model_outputs = hf_model(**long_inputs)
    print(hf_model_outputs.last_hidden_state)

    print("Are outputs from both models close when using longer texts?", np.allclose(optimum_model_outputs.last_hidden_state.cpu().detach().numpy(), hf_model_outputs.last_hidden_state.cpu().detach().numpy(), rtol=1e-3, atol=1e-3))

Relevant output:

Testing for longer sequence length for Nomic-AI model
tensor([[[ 8.7254e-01,  2.1854e+00, -4.2498e+00,  ..., -2.7854e-01,
          -4.8159e-02,  2.2158e+00],
         [ 1.1689e+00,  2.6456e+00, -3.5509e+00,  ..., -1.7534e-01,
          -1.0793e-01,  1.8791e+00],
         [ 5.7188e-01,  2.2278e+00, -3.5168e+00,  ...,  2.5150e-03,
           1.7937e-01,  2.1163e+00],
         ...,
         [ 1.3092e+00,  2.5185e+00, -3.9133e+00,  ..., -4.1854e-01,
           3.9970e-01,  2.5439e-01],
         [ 1.1691e+00,  1.9978e+00, -3.3293e+00,  ...,  3.6745e-02,
          -3.5299e-01, -1.9861e-01],
         [ 7.1556e-01,  1.4866e+00, -4.1898e+00,  ..., -3.4444e-01,
          -2.6155e-01,  9.4057e-01]]])
tensor([[[ 8.7254e-01,  2.1854e+00, -4.2498e+00,  ..., -2.7855e-01,
          -4.8159e-02,  2.2158e+00],
         [ 1.1689e+00,  2.6456e+00, -3.5509e+00,  ..., -1.7534e-01,
          -1.0793e-01,  1.8791e+00],
         [ 5.7188e-01,  2.2278e+00, -3.5168e+00,  ...,  2.5153e-03,
           1.7937e-01,  2.1163e+00],
         ...,
         [ 1.3092e+00,  2.5185e+00, -3.9133e+00,  ..., -4.1854e-01,
           3.9970e-01,  2.5439e-01],
         [ 1.1691e+00,  1.9978e+00, -3.3293e+00,  ...,  3.6745e-02,
          -3.5299e-01, -1.9861e-01],
         [ 7.1556e-01,  1.4866e+00, -4.1898e+00,  ..., -3.4443e-01,
          -2.6154e-01,  9.4057e-01]]], grad_fn=<NativeLayerNormBackward0>)
Are outputs from both models close when using longer texts? True

xenova · 2024-05-30T11:16:30Z

Nice! As a last check, can you test with 8192 tokens?

bhavika · 2024-05-30T13:27:41Z

Nice! As a last check, can you test with 8192 tokens?

Sure! @xenova here it is, redacting the very long text I threw in there:

Note that I am logging how many tokens we end up with in the output:

import numpy as np
from transformers import AutoTokenizer, AutoModel
from pathlib import Path
from optimum.exporters import TasksManager
from optimum.exporters.onnx import export
import onnx
from optimum.exporters.onnx import validate_model_outputs
from optimum.onnxruntime import ORTModelForFeatureExtraction
import torch

long_input = """<redacted> 
"""


if __name__ == "__main__":
    MODEL_NAME = "nomic-ai/nomic-embed-text-v1.5"
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    long_inputs = tokenizer(long_input, return_tensors="pt", padding=True)
    num_tokens = long_inputs['input_ids'].shape

    print(f"Number of tokens in the input: {num_tokens}")

    hf_model = AutoModel.from_pretrained(
        MODEL_NAME,
        trust_remote_code=True,
        safe_serialization=True,
        revision="91d2d6bfdddf0b0da840f901b533e99bae30d757",
    )

    onnx_path = Path("tests/nomic_bert.onnx")
    onnx_config_constructor = TasksManager.get_exporter_config_constructor(
        "onnx", hf_model, task="feature-extraction", library_name="transformers"
    )
    onnx_config = onnx_config_constructor(hf_model.config)
    onnx_inputs, onnx_outputs = export(hf_model, onnx_config, onnx_path, onnx_config.DEFAULT_ONNX_OPSET)

    print("After export")
    print(onnx_inputs)
    print(onnx_outputs)

    onnx_model = onnx.load(onnx_path)
    onnx.checker.check_model(onnx_model)

    validate_model_outputs(onnx_config, hf_model, onnx_path, onnx_outputs, onnx_config.ATOL_FOR_VALIDATION)

    model = ORTModelForFeatureExtraction.from_pretrained(
        MODEL_NAME,
        file_name="onnx/model.onnx",
        trust_remote_code=True,
        safe_serialization=True,
        revision="91d2d6bfdddf0b0da840f901b533e99bae30d757",
    )

    print("Testing for longer sequence length for Nomic-AI model")
    optimum_model_outputs = model(**long_inputs)
    hf_model_outputs = hf_model(**long_inputs)
    print("Are outputs from both models close when using longer texts?", np.allclose(optimum_model_outputs.last_hidden_state.cpu().detach().numpy(), hf_model_outputs.last_hidden_state.cpu().detach().numpy(), rtol=1e-3, atol=1e-3))

Output:

❯ python tests/nomic_bert.py
/Users/bhavika/src/github.com/huggingface/optimum/.venv/lib/python3.9/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Token indices sequence length is longer than the specified maximum sequence length for this model (8195 > 512). Running this sequence through the model will result in indexing errors
**Number of tokens in the input: torch.Size([1, 8195])**
<All keys matched successfully>
<optimum.exporters.onnx.model_configs.NomicBertOnnxConfig object at 0x16fdbc730>
Using framework PyTorch: 2.3.0
Overriding 1 configuration item(s)
        - use_cache -> False
/Users/bhavika/.cache/huggingface/modules/transformers_modules/nomic-ai/nomic-bert-2048/7b260c5676ce4ba4a117f15bc24ac13ab4b81695/modeling_hf_nomic_bert.py:621: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if seqlen > self._seq_len_cached:
/Users/bhavika/.cache/huggingface/modules/transformers_modules/nomic-ai/nomic-bert-2048/7b260c5676ce4ba4a117f15bc24ac13ab4b81695/modeling_hf_nomic_bert.py:573: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if (
/Users/bhavika/.cache/huggingface/modules/transformers_modules/nomic-ai/nomic-bert-2048/7b260c5676ce4ba4a117f15bc24ac13ab4b81695/modeling_hf_nomic_bert.py:507: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert ro_dim <= x.shape[-1]
After export
['input_ids', 'attention_mask', 'token_type_ids']
['last_hidden_state']
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
        - Avoid using `tokenizers` before the fork if possible
        - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
        - Avoid using `tokenizers` before the fork if possible
        - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
        - Avoid using `tokenizers` before the fork if possible
        - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)

Validating ONNX model tests/nomic_bert.onnx...
        -[✓] ONNX model output names match reference model (last_hidden_state)
        - Validating ONNX Model output "last_hidden_state":
                -[✓] (2, 16, 768) matches (2, 16, 768)
                -[x] values not close enough, max diff: 0.00017261505126953125 (atol: 0.0001)
Traceback (most recent call last):
  File "/Users/bhavika/src/github.com/huggingface/optimum/tests/nomic_bert.py", line 146, in <module>
    validate_model_outputs(onnx_config, hf_model, onnx_path, onnx_outputs, onnx_config.ATOL_FOR_VALIDATION)
  File "/Users/bhavika/src/github.com/huggingface/optimum/optimum/exporters/onnx/convert.py", line 233, in validate_model_outputs
    raise error
optimum.exporters.error_utils.AtolError: The maximum absolute difference between the output of the reference model and the ONNX exported model is not within the set tolerance 0.0001:
- last_hidden_state: max diff = 0.00017261505126953125

The last check about the inputs being close together never finishes because this process seems to consume too much memory (I have 64 GB on an M3 macbook). 💀

bhavika · 2024-06-06T12:27:01Z

Update

Tried the test script here with a GPU instance - https://gist.github.com/bhavika/8827463b68a327dfe334a2a7fcc723de

❯ python tests/test_nomicbert.py
/Users/gcpuser/src/github.com/huggingface/optimum/.venv/lib/python3.9/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Token indices sequence length is longer than the specified maximum sequence length for this model (8195 > 512). Running this sequence through the model will result in indexing errors
**Number of tokens in the input: torch.Size([1, 8195])**
<All keys matched successfully>
<optimum.exporters.onnx.model_configs.NomicBertOnnxConfig object at 0x16fdbc730>
Using framework PyTorch: 2.3.0
Overriding 1 configuration item(s)
        - use_cache -> False
/Users/gcpuser/.cache/huggingface/modules/transformers_modules/nomic-ai/nomic-bert-2048/7b260c5676ce4ba4a117f15bc24ac13ab4b81695/modeling_hf_nomic_bert.py:621: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if seqlen > self._seq_len_cached:
/Users/gcpuser/.cache/huggingface/modules/transformers_modules/nomic-ai/nomic-bert-2048/7b260c5676ce4ba4a117f15bc24ac13ab4b81695/modeling_hf_nomic_bert.py:573: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if (
/Users/gcpuser/.cache/huggingface/modules/transformers_modules/nomic-ai/nomic-bert-2048/7b260c5676ce4ba4a117f15bc24ac13ab4b81695/modeling_hf_nomic_bert.py:507: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert ro_dim <= x.shape[-1]
After export
['input_ids', 'attention_mask', 'token_type_ids']
['last_hidden_state']
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
        - Avoid using `tokenizers` before the fork if possible
        - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
        - Avoid using `tokenizers` before the fork if possible
        - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
        - Avoid using `tokenizers` before the fork if possible
        - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)

Validating ONNX model tests/nomic_bert.onnx...
        -[✓] ONNX model output names match reference model (last_hidden_state)
        - Validating ONNX Model output "last_hidden_state":
                -[✓] (2, 16, 768) matches (2, 16, 768)
                -[✓] all values close (atol: 0.0001) 
The argument `trust_remote_code` is to be used along with export=True. It will be ignored.
The ONNX file onnx/model.onnx is not a regular name used in optimum.onnxruntime, the ORTModel might not behave as expected.
Testing for longer sequence length for Nomic-AI model
Are outputs from both models close when using longer texts? True

@xenova could I get a review here?

bhavika · 2024-06-20T15:32:51Z

@fxmarty @mht-sharma Hi! 👋🏽 just wondering if there's any interest in accepting this PR? Anything else I can do to land it now?

fxmarty

Added a comment about the test that needs to be moved, but otherwise it looks good to me! Thanks a lot!

Could you also add a test in tests/exporters/onnx/test_exporters_onnx_cli.py with this model? As it uses a custom modeling, you'll need to use trust_remote_code=True. Please decorate it with @slow as well (as there is no tiny model for nomic on the Hub).

tests/test_nomicbert.py

bhavika · 2024-06-27T15:35:50Z

@fxmarty thanks for the feedback! I can update the tests for sure. Any tips on how to run the test suite/checks for this PR?

…icbert

fxmarty · 2024-06-28T09:53:03Z

Thank you! You could for example run: RUN_SLOW=1 pytest tests/exporters/onnx -k "test_custom_model" -s -vvvvv. As the test is decorated with @slow, it is not run in the normal CI (as requires a large model to run).

HuggingFaceDocBuilderDev · 2024-06-28T10:21:23Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

bhavika · 2024-07-13T19:58:04Z

@fxmarty could you trigger/approve the workflow runs for this PR please?

fxmarty · 2024-07-16T15:38:23Z

@bhavika To solve

FAILED exporters/onnx/test_onnx_export.py::OnnxExportTestCase::test_all_models_tested - AssertionError: Not testing all models. Missing models: {'nomic-bert'}

can you add a PYTORCH_REMOTE_CODE_MODELS here

optimum/tests/exporters/exporters_utils.py

Line 303 in d9bd7c3

and add it here

optimum/tests/exporters/onnx/test_onnx_export.py

Lines 321 to 326 in d9bd7c3

    
           missing_models_set = ( 
        
               TasksManager._SUPPORTED_CLI_MODEL_TYPE 
        
               - set(PYTORCH_EXPORT_MODELS_TINY.keys()) 
        
               - set(PYTORCH_TIMM_MODEL.keys()) 
        
               - set(PYTORCH_SENTENCE_TRANSFORMERS_MODEL.keys()) 
        
           )

? And adapt test_custom_model accordingly to use PYTORCH_REMOTE_CODE_MODELS.

There is also

FAILED onnxruntime/test_modeling.py::ORTModelForFeatureExtractionIntegrationTest::test_pipeline_ort_model_8_nomic_bert - KeyError: 'nomic-bert'
FAILED onnxruntime/test_modeling.py::ORTModelForFeatureExtractionIntegrationTest::test_compare_to_transformers_8_nomic_bert - KeyError: 'nomic-bert'

not sure why.

tests/exporters/onnx/test_onnx_export.py

bhavika · 2024-08-05T14:05:46Z

@fxmarty could you approve the workflow runs for this PR? I made some changes to the tests as you suggested and want to see if they work now.

bhavika · 2024-10-21T13:17:28Z

@fxmarty just bumping this PR! Could we re-run the workflows?

Add nomic-bert config

65ad74a

bhavika marked this pull request as draft May 24, 2024 18:05

bhavika added 2 commits May 28, 2024 22:39

Add config

34f8121

Change opset

d55166c

bhavika marked this pull request as ready for review May 29, 2024 05:48

Fix typos

eb0f281

bhavika and others added 2 commits May 29, 2024 15:10

Merge branch 'huggingface:main' into nomicbert

e93e359

Merge branch 'nomicbert'

7b384c8

bhavika added 2 commits June 6, 2024 08:29

Add test

2570a3c

Add to config

150739f

fxmarty approved these changes Jun 24, 2024

View reviewed changes

tests/test_nomicbert.py Outdated Show resolved Hide resolved

tests/test_nomicbert.py Outdated Show resolved Hide resolved

fxmarty mentioned this pull request Jun 24, 2024

Support for phi3-v Vision Model #1915

Open

bhavika and others added 8 commits June 25, 2024 12:01

Add nomic-bert to supported architectures

985703b

Add test for exporter

82a788f

Merge branch 'huggingface:main' into main

06cf77e

remove test file

1cadbd3

Add nomic-bert as a supported arch in test

977fdbf

Add nomic-bert to large models tests

0b7d459

Merge branch 'huggingface:main' into main

bf23abf

Merge branch 'huggingface:main' into nomicbert

9a6cd59

Stray space

e10b3e1

Merge remote-tracking branch 'refs/remotes/origin/nomicbert' into nom…

773c567

…icbert

bhavika changed the title ~~Add nomic-bert config~~ Add support for nomic-ai/nomic-embed-text-v1.5 model Jun 27, 2024

Merge remote-tracking branch 'origin/main' into nomicbert

4004ec6

bhavika added 3 commits July 13, 2024 14:20

grammar

506735b

Add nomic-bert repo

0dea6cd

minor grammar fix

f906442

bhavika added 3 commits July 15, 2024 10:38

Remove nomic-bert from tests because we dont have a small version

5ca630e

Add nomic-bert to unsuppored list

801787b

Remove from unsupported

b25bf32

bhavika added 2 commits July 16, 2024 21:29

Update tests to exclude nomic in tests

fe38265

Modify test-custom-model

7f6a562

fxmarty reviewed Jul 17, 2024

View reviewed changes

tests/exporters/onnx/test_onnx_export.py Outdated Show resolved Hide resolved

bhavika and others added 6 commits July 17, 2024 08:25

Remove debugging statement

3fe0e46

Formatting

6252618

Merge branch 'main' into nomicbert

ac533ca

Formatting with black

dc948c6

Ruff formatting

1b26f3f

Remove nomic-bert as a supported arch

35f5ea2

bhavika and others added 4 commits September 22, 2024 17:35

Fix formatting

0b6aa14

Merge branch 'huggingface:main' into nomicbert

cdbf5f2

Run black

974e625

Remove test file

955fb15

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support for nomic-ai/nomic-embed-text-v1.5 model #1874

Add support for nomic-ai/nomic-embed-text-v1.5 model #1874

bhavika commented May 24, 2024 •

edited

Loading

xenova commented May 24, 2024

bhavika commented May 24, 2024 via email •

edited

Loading

bhavika commented May 29, 2024 •

edited

Loading

bhavika commented May 29, 2024 •

edited

Loading

xenova commented May 30, 2024

bhavika commented May 30, 2024 •

edited

Loading

bhavika commented Jun 6, 2024 •

edited

Loading

bhavika commented Jun 20, 2024

fxmarty left a comment

bhavika commented Jun 27, 2024

fxmarty commented Jun 28, 2024 •

edited

Loading

HuggingFaceDocBuilderDev commented Jun 28, 2024

bhavika commented Jul 13, 2024

fxmarty commented Jul 16, 2024

bhavika commented Aug 5, 2024

bhavika commented Oct 21, 2024

Add support for nomic-ai/nomic-embed-text-v1.5 model #1874

Are you sure you want to change the base?

Add support for nomic-ai/nomic-embed-text-v1.5 model #1874

Conversation

bhavika commented May 24, 2024 • edited Loading

What does this PR do?

CLI exporter

Before submitting

Who can review?

xenova commented May 24, 2024

bhavika commented May 24, 2024 via email • edited Loading

bhavika commented May 29, 2024 • edited Loading

Optimum environment information for debugging:

bhavika commented May 29, 2024 • edited Loading

Testing with longer texts:

xenova commented May 30, 2024

bhavika commented May 30, 2024 • edited Loading

bhavika commented Jun 6, 2024 • edited Loading

Update

bhavika commented Jun 20, 2024

fxmarty left a comment

Choose a reason for hiding this comment

bhavika commented Jun 27, 2024

fxmarty commented Jun 28, 2024 • edited Loading

HuggingFaceDocBuilderDev commented Jun 28, 2024

bhavika commented Jul 13, 2024

fxmarty commented Jul 16, 2024

bhavika commented Aug 5, 2024

bhavika commented Oct 21, 2024

bhavika commented May 24, 2024 •

edited

Loading

bhavika commented May 24, 2024 via email •

edited

Loading

bhavika commented May 29, 2024 •

edited

Loading

bhavika commented May 29, 2024 •

edited

Loading

bhavika commented May 30, 2024 •

edited

Loading

bhavika commented Jun 6, 2024 •

edited

Loading

fxmarty commented Jun 28, 2024 •

edited

Loading