Commit

Merge pull request #259 from stanford-crfm/update-01-2025
2024 updates for index
rishibommasani authored Jan 17, 2025
2 parents 2180237 + ffa7c88 commit 303d4cd
Showing 14 changed files with 653 additions and 5 deletions.
73 changes: 73 additions & 0 deletions assets/amazon.yaml
@@ -86,3 +86,76 @@
prohibited_uses: ''
monitoring: ''
feedback: https://github.com/amazon-science/chronos-forecasting/discussions
- type: model
name: Amazon Nova (Understanding)
organization: Amazon Web Services (AWS)
description: A new generation of state-of-the-art foundation models (FMs) that
deliver frontier intelligence and industry leading price performance, available
exclusively in Amazon Bedrock. Amazon Nova understanding models excel in Retrieval-Augmented
Generation (RAG), function calling, and agentic applications.
created_date: 2024-12-03
url: https://aws.amazon.com/blogs/aws/introducing-amazon-nova-frontier-intelligence-and-industry-leading-price-performance/
model_card: unknown
modality:
explanation: Amazon Nova understanding models accept text, image, or video inputs
to generate text output.
value: text, image, video; text
analysis: Amazon Nova Pro is capable of processing up to 300K input tokens and
sets new standards in multimodal intelligence and agentic workflows that require
calling APIs and tools to complete complex workflows. It achieves state-of-the-art
performance on key benchmarks including visual question answering (TextVQA)
and video understanding (VATEX).
size: unknown
dependencies: []
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: All Amazon Nova models include built-in safety controls and creative
content generation models include watermarking capabilities to promote responsible
AI use.
access:
explanation: available exclusively in Amazon Bedrock
value: limited
license: unknown
intended_uses: You can build on Amazon Nova to analyze complex documents and videos,
understand charts and diagrams, generate engaging video content, and build sophisticated
AI agents, from across a range of intelligence classes optimized for enterprise
workloads.
prohibited_uses: unknown
monitoring: unknown
feedback: unknown
- type: model
name: Amazon Nova (Creative Content Generation)
organization: Amazon Web Services (AWS)
description: A new generation of state-of-the-art foundation models (FMs) that
deliver frontier intelligence and industry leading price performance, available
exclusively in Amazon Bedrock.
created_date: 2024-12-03
url: https://aws.amazon.com/blogs/aws/introducing-amazon-nova-frontier-intelligence-and-industry-leading-price-performance/
model_card: unknown
modality:
explanation: Amazon creative content generation models accept text and image
inputs to generate image or video output.
value: text, image;image, video
analysis: Amazon Nova Canvas excels on human evaluations and key benchmarks such
as text-to-image faithfulness evaluation with question answering (TIFA) and
ImageReward.
size: unknown
dependencies: []
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: All Amazon Nova models include built-in safety controls and creative
content generation models include watermarking capabilities to promote responsible
AI use.
access:
explanation: available exclusively in Amazon Bedrock
value: limited
license: unknown
intended_uses: You can build on Amazon Nova to analyze complex documents and videos,
understand charts and diagrams, generate engaging video content, and build sophisticated
AI agents, from across a range of intelligence classes optimized for enterprise
workloads.
prohibited_uses: unknown
monitoring: unknown
feedback: unknown
47 changes: 42 additions & 5 deletions assets/anthropic.yaml
@@ -608,15 +608,17 @@
speed of its predecessor, Claude 3 Opus, and is designed to tackle tasks like
context-sensitive customer support, orchestrating multi-step workflows, interpreting
charts and graphs, and transcribing text from images.
- created_date: 2024-06-21
- url: https://www.anthropic.com/news/claude-3-5-sonnet
+ created_date:
+ explanation: Claude 3.5 Sonnet updated on Oct. 22, initially released on June
+ 20 of the same year.
+ value: 2024-10-22
+ url: https://www.anthropic.com/news/3-5-models-and-computer-use
model_card: unknown
modality: text; image, text
analysis: The model has been evaluated on a range of tests including graduate-level
reasoning (GPQA), undergraduate-level knowledge (MMLU), coding proficiency (HumanEval),
- and standard vision benchmarks. In an internal agentic coding evaluation, Claude
- 3.5 Sonnet solved 64% of problems, outperforming the previous version, Claude
- 3 Opus, which solved 38%.
+ and standard vision benchmarks. Claude 3.5 Sonnet demonstrates state-of-the-art
+ performance on most benchmarks.
size: Unknown
dependencies: []
training_emissions: Unknown
@@ -637,3 +639,38 @@
integrated to ensure robustness of evaluations.
feedback: Feedback on Claude 3.5 Sonnet can be submitted directly in-product to
inform the development roadmap and improve user experience.
- type: model
name: Claude 3.5 Haiku
organization: Anthropic
description: Claude 3.5 Haiku is Anthropic's fastest model, delivering advanced
coding, tool use, and reasoning capability, surpassing the previous Claude 3
Opus in intelligence benchmarks. It is designed for critical use cases where
low latency is essential, such as user-facing chatbots and code completions.
created_date: 2024-10-22
url: https://www.anthropic.com/claude/haiku
model_card: unknown
modality:
explanation: Claude 3.5 Haiku is available...initially as a text-only model
and with image input to follow.
value: text; unknown
analysis: Claude 3.5 Haiku offers strong performance and speed across a variety
of coding, tool use, and reasoning tasks. Also, it has been tested in extensive
safety evaluations and exceeded expectations in reasoning and code generation
tasks.
size: unknown
dependencies: []
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: During Claude 3.5 Haiku’s development, we conducted extensive
safety evaluations spanning multiple languages and policy domains.
access:
explanation: Claude 3.5 Haiku is available across Claude.ai, our first-party
API, Amazon Bedrock, and Google Cloud’s Vertex AI.
value: open
license: unknown
intended_uses: Critical use cases where low latency matters, like user-facing
chatbots and code completions.
prohibited_uses: unknown
monitoring: unknown
feedback: unknown
22 changes: 22 additions & 0 deletions assets/cohere.yaml
@@ -592,3 +592,25 @@
prohibited_uses: unknown
monitoring: unknown
feedback: https://huggingface.co/CohereForAI/aya-23-35B/discussions
- type: model
name: Command R+
organization: Cohere
description: Command R+ is a state-of-the-art RAG-optimized model designed to
tackle enterprise-grade workloads, and is available first on Microsoft Azure.
created_date: 2024-04-04
url: https://cohere.com/blog/command-r-plus-microsoft-azure
model_card: unknown
modality: unknown
analysis: unknown
size: unknown
dependencies: []
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: unknown
access: ''
license: unknown
intended_uses: unknown
prohibited_uses: unknown
monitoring: unknown
feedback: unknown
43 changes: 43 additions & 0 deletions assets/genmo.yaml
@@ -0,0 +1,43 @@
---
- type: model
name: Mochi 1
organization: Genmo
description: Mochi 1 is an open-source video generation model designed to produce
high-fidelity motion and strong prompt adherence in generated videos, setting
a new standard for open video generation systems.
created_date: 2025-01-14
url: https://www.genmo.ai/blog
model_card: unknown
modality:
explanation: Mochi 1 generates smooth videos... Measures how accurately generated
videos follow the provided textual instructions
value: text; video
analysis: Mochi 1 sets a new best-in-class standard for open-source video generation.
It also performs very competitively with the leading closed models... We benchmark
prompt adherence with an automated metric using a vision language model as a
judge following the protocol in OpenAI DALL-E 3. We evaluate generated videos
using Gemini-1.5-Pro-002.
size:
explanation: featuring a 10 billion parameter diffusion model
value: 10B parameters
dependencies: [DDPM, DreamFusion, Emu Video, T5-XXL]
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: robust safety moderation protocols in the playground to ensure
that all video generations remain safe and aligned with ethical guidelines.
access:
explanation: open state-of-the-art video generation model... The weights and
architecture for Mochi 1 are open
value: open
license:
explanation: We're releasing the model under a permissive Apache 2.0 license.
value: Apache 2.0
intended_uses: Advance the field of video generation and explore new methodologies.
Build innovative applications in entertainment, advertising, education, and
more. Empower artists and creators to bring their visions to life with AI-generated
videos. Generate synthetic data for training AI models in robotics, autonomous
vehicles and virtual environments.
prohibited_uses: unknown
monitoring: unknown
feedback: unknown
66 changes: 66 additions & 0 deletions assets/google.yaml
@@ -1908,3 +1908,69 @@
monitoring: unknown
feedback: Encourages developer feedback to inform model improvements and future
updates.
- type: model
name: Veo 2
organization: Google DeepMind
description: Veo 2 is a state-of-the-art video generation model that creates videos
with realistic motion and high-quality output, up to 4K, with extensive camera
controls. It simulates real-world physics and offers advanced motion capabilities
with enhanced realism and fidelity.
created_date: 2024-12-16
url: https://deepmind.google/technologies/veo/veo-2/
model_card: unknown
modality:
explanation: Our state-of-the-art video generation model ... text-to-image model
Veo 2
value: text; video
analysis: Veo 2 outperforms other leading video generation models, based on human
evaluations of its performance.
size: unknown
dependencies: []
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: Veo 2 includes features that enhance realism, fidelity, detail,
and artifact reduction to ensure high-quality output.
access: limited
license: unknown
intended_uses: Creating high-quality videos with realistic motion, different styles,
camera controls, shot styles, angles, and movements.
prohibited_uses: unknown
monitoring: unknown
feedback: unknown

- type: model
name: Gemini 2.0
organization: Google DeepMind
description: Google DeepMind introduces Gemini 2.0, a new AI model designed for
the 'agentic era.'
created_date: 2024-12-11
url: https://blog.google/technology/google-deepmind/google-gemini-ai-update-december-2024/#ceo-message
model_card: unknown
modality:
explanation: The first model built to be natively multimodal, Gemini 1.0 and
1.5 drove big advances with multimodality and long context to understand information
across text, video, images, audio and code...
value: text, video, image, audio; image, text
analysis: unknown
size: unknown
dependencies: []
training_emissions: unknown
training_time: unknown
training_hardware:
explanation: It’s built on custom hardware like Trillium, our sixth-generation
TPUs.
value: custom hardware like Trillium, our sixth-generation TPUs
quality_control: Google is committed to building AI responsibly, with safety and
security as key priorities.
access:
explanation: Gemini 2.0 Flash is available to developers and trusted testers,
with wider availability planned for early next year.
value: limited
license: unknown
intended_uses: Develop more agentic models, meaning they can understand more about
the world around you, think multiple steps ahead, and take action on your behalf,
with your supervision.
prohibited_uses: unknown
monitoring: unknown
feedback: unknown
42 changes: 42 additions & 0 deletions assets/ibm.yaml
@@ -75,3 +75,45 @@
prohibited_uses: ''
monitoring: ''
feedback: ''
- type: model
name: IBM Granite 3.0
organization: IBM
description: IBM Granite 3.0 models deliver state-of-the-art performance relative
to model size while maximizing safety, speed and cost-efficiency for enterprise
use cases.
created_date: 2024-10-21
url: https://www.ibm.com/new/ibm-granite-3-0-open-state-of-the-art-enterprise-models
model_card: unknown
modality:
explanation: IBM Granite 3.0 8B Instruct model for classic natural language
use cases including text generation, classification, summarization, entity
extraction and customer service chatbots
value: text; text
analysis: Granite 3.0 8B Instruct matches leading similarly-sized open models
on academic benchmarks while outperforming those peers on benchmarks for enterprise
tasks and safety.
size:
explanation: 'Dense, general purpose LLMs: Granite-3.0-8B-Instruct'
value: 8B parameters
dependencies: [Hugging Face’s OpenLLM Leaderboard v2]
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: The entire Granite family of models are trained on carefully
curated enterprise datasets, filtered for objectionable content with critical
concerns like governance, risk, privacy and bias mitigation in mind
access:
explanation: In keeping with IBM’s strong historical commitment to open source,
all Granite models are released under the permissive Apache 2.0 license
value: open
license:
explanation: In keeping with IBM’s strong historical commitment to open source,
all Granite models are released under the permissive Apache 2.0 license
value: Apache 2.0
intended_uses: classic natural language use cases including text generation, classification,
summarization, entity extraction and customer service chatbots, programming
language use cases such as code generation, code explanation and code editing,
and for agentic use cases requiring tool calling
prohibited_uses: unknown
monitoring: ''
feedback: unknown
26 changes: 26 additions & 0 deletions assets/inflection.yaml
@@ -93,3 +93,29 @@
prohibited_uses: ''
monitoring: ''
feedback: none
- type: model
name: Inflection 3.0
organization: Inflection AI
description: Inflection for Enterprise, powered by our industry-first, enterprise-grade
AI system, Inflection 3.0.
created_date: 2024-10-07
url: https://inflection.ai/blog/enterprise
model_card: unknown
modality: unknown
analysis: unknown
size: unknown
dependencies: []
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: unknown
access:
explanation: Developers can now access Inflection AI’s Large Language Model
through its new commercial API.
value: open
license: unknown
intended_uses: unknown
prohibited_uses: unknown
monitoring: unknown
feedback: So please drop us a line. We want to keep hearing from enterprises about
how we can help solve their challenges and make AI a reality for their business.