Merge pull request #234 from stanford-crfm/jonathanxu81205/llava-critic-1

@jonathanxu81205 - add llava-critic-1
jxue16 authored Jan 17, 2025
2 parents 93ef5cc + 32b4fdb commit 4c2150b
Showing 1 changed file with 22 additions and 0 deletions.
22 changes: 22 additions & 0 deletions assets/bytedance.yaml
@@ -49,3 +49,25 @@
  prohibited_uses: unknown
  monitoring: unknown
  feedback: https://huggingface.co/ByteDance/SDXL-Lightning/discussions
- type: model
  name: LLaVA-Critic
  organization: ByteDance and University of Maryland, College Park
  description: LLaVA-Critic is an open-source large multimodal model (LMM) designed as a generalist evaluator. It assesses performance across a variety of multimodal tasks by following a high-quality critic instruction dataset, incorporating diverse evaluation criteria. The model is effective in areas like LMM-as-a-Judge, providing reliable evaluation scores comparable to GPT models, and Preference Learning, offering reward signals for preference learning to enhance model alignment capabilities.
  created_date: 2024-10-06
  url: https://arxiv.org/pdf/2410.02712
  model_card: unknown
  modality: image, text; text
  analysis: LLaVA-Critic was tested in scenarios such as LMM-as-a-Judge and Preference Learning, showing a high correlation with commercial GPT models in evaluation scores. It served as an alternative to expensive human feedback in resource-constrained settings and demonstrated better performance in providing AI-generated feedback for model alignment compared to human-reliant reward models.
  size: unknown
  dependencies: []
  training_emissions: unknown
  training_time: unknown
  training_hardware: unknown
  quality_control: The model ensures quality by utilizing a high-quality dataset for critic instructions, providing both quantitative judgments and reasoning, with transparency in assessments.
  access: open
  license: Apache 2.0
  intended_uses: The model can be used for evaluating multimodal tasks, generating reward signals for preference learning, and serving as a reliable alternative judge for model assessments.
  prohibited_uses: The model should not be used in scenarios requiring authorization from proprietary models, nor relied upon for critical applications without human oversight due to potential biases in the dataset.
  monitoring: unknown
  feedback: unknown
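
A reviewer may want to sanity-check an entry like the one added here before merging. The following is a minimal Python sketch, not part of this commit or of any repository tooling: the field list is simply the set of keys visible in this diff (an assumption, not an official schema), the helper names `missing_fields` and `unknown_fields` are hypothetical, and the long text values are abbreviated to `"..."` for brevity.

```python
# Hypothetical review helper. EXPECTED_MODEL_FIELDS mirrors the keys that
# appear in the LLaVA-Critic entry in this diff; it is an assumption, not
# an official schema for the assets files.
EXPECTED_MODEL_FIELDS = [
    "type", "name", "organization", "description", "created_date", "url",
    "model_card", "modality", "analysis", "size", "dependencies",
    "training_emissions", "training_time", "training_hardware",
    "quality_control", "access", "license", "intended_uses",
    "prohibited_uses", "monitoring", "feedback",
]


def missing_fields(entry):
    """Return expected keys that are absent from the entry, in list order."""
    return [key for key in EXPECTED_MODEL_FIELDS if key not in entry]


def unknown_fields(entry):
    """Return keys whose value is the placeholder string 'unknown'."""
    return [key for key, value in entry.items() if value == "unknown"]


# The entry from this diff as a plain dict; long prose values are
# abbreviated to "..." and are not the real field contents.
entry = {
    "type": "model",
    "name": "LLaVA-Critic",
    "organization": "ByteDance and University of Maryland, College Park",
    "description": "...",
    "created_date": "2024-10-06",
    "url": "https://arxiv.org/pdf/2410.02712",
    "model_card": "unknown",
    "modality": "image, text; text",
    "analysis": "...",
    "size": "unknown",
    "dependencies": [],
    "training_emissions": "unknown",
    "training_time": "unknown",
    "training_hardware": "unknown",
    "quality_control": "...",
    "access": "open",
    "license": "Apache 2.0",
    "intended_uses": "...",
    "prohibited_uses": "...",
    "monitoring": "unknown",
    "feedback": "unknown",
}

print("missing:", missing_fields(entry))
print("still unknown:", unknown_fields(entry))
```

Listing the `unknown` placeholders separately makes it easy to see which fields (here `model_card`, `size`, the training details, `monitoring`, and `feedback`) could be filled in by a follow-up change.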
