Automated report
deep-diver committed Dec 30, 2024
1 parent 6029362 commit f10d795
Showing 11 changed files with 99 additions and 0 deletions.
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
date: "2024-12-29"
author: Jiawei Lin
title: 'From Elements to Design: A Layered Approach for Automatic Graphic Design Composition'
thumbnail: ""
link: https://huggingface.co/papers/2412.19712
summary: This paper proposes a new method called LaDeCo to automatically compose graphic designs by dividing the elements into different layers and predicting their attributes. The method is designed to make the generation process smoother and clearer, and it can be used for various tasks such as resolution adjustment and element filling....
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-12-29"
author: Junying Chen
title: HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs
thumbnail: ""
link: https://huggingface.co/papers/2412.18925
summary: 'HuatuoGPT-o1 is a medical LLM that can perform complex reasoning and outperforms general and medical-specific baselines using only 40K verifiable problems. It uses a two-stage approach: a medical verifier to guide the search for a complex reasoning trajectory and reinforcement learning with verifier-based rewards to enhance complex reasoning. The authors hope this approach will inspire advancements in reasoning across medical and other specialized domains....'
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-12-29"
author: Zehan Wang
title: 'Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models'
thumbnail: ""
link: https://huggingface.co/papers/2412.18605
summary: This paper introduces a new model called Orient Anything that can accurately estimate the orientation of objects in a single image by learning from 3D models and synthetic-to-real transfer strategies. It achieves state-of-the-art accuracy and improves various applications such as spatial concept comprehension and 3D object pose adjustment. ...
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-12-29"
author: Ziang Yan
title: 'Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment'
thumbnail: ""
link: https://huggingface.co/papers/2412.19326
summary: The paper proposes a new method called Task Preference Optimization (TPO) to improve multimodal large language models (MLLMs) by incorporating task-specific heads and rich visual labels during training. TPO significantly enhances the MLLM's multimodal capabilities and task-specific performance, and demonstrates robust zero-shot capabilities across various tasks....
opinion: placeholder
tags:
- ML
9 changes: 9 additions & 0 deletions current/2024-12-30 1.58-bit FLUX.yaml
@@ -0,0 +1,9 @@
date: "2024-12-30"
author: Chenglin Yang
title: 1.58-bit FLUX
thumbnail: ""
link: https://huggingface.co/papers/2412.18653
summary: We introduce 1.58-bit FLUX, a method to reduce the size of a text-to-image model while maintaining its performance. This is done without using any image data and by developing a custom kernel for the model. The result is a model that uses less storage, memory, and time for inference, while still generating good images....
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-12-30"
author: Yanlin Feng
title: 'CypherBench: Towards Precise Retrieval over Full-scale Modern Knowledge Graphs in the LLM Era'
thumbnail: ""
link: https://huggingface.co/papers/2412.18702
summary: To improve retrieval from graph data for large language models (LLMs), the paper proposes property graph views on top of RDF knowledge graphs. It introduces CypherBench, a benchmark with property graphs for efficient LLM querying. The paper also addresses challenges in converting RDF to property graphs and generating tasks for Cypher....
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-12-30"
author: Liang Chen
title: 'Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey'
thumbnail: ""
link: https://huggingface.co/papers/2412.18619
summary: 'This paper proposes a new taxonomy for multimodal learning that unifies both understanding and generation tasks within the Next Token Prediction (NTP) framework. The taxonomy covers five key aspects: multimodal tokenization, model architectures, task representation, datasets & evaluation, and open challenges. An associated GitHub repository is available at https://github.com/LMM101/Awesome-Multimodal-Next-Token-Prediction....'
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-12-30"
author: Risa Shinoda
title: 'SBS Figures: Pre-training Figure QA from Stage-by-Stage Synthesized Images'
thumbnail: ""
link: https://huggingface.co/papers/2412.17606
summary: This paper introduces SBS Figures, a dataset for pre-training figure QA that uses a stage-by-stage pipeline to create chart figures with complete annotations and diverse topics, making it possible to achieve efficient training with a limited amount of real-world chart data....
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-12-30"
author: Hua Farn
title: Safeguard Fine-Tuned LLMs Through Pre- and Post-Tuning Model Merging
thumbnail: ""
link: https://huggingface.co/papers/2412.19512
summary: The paper introduces a method to improve downstream task performance in safety-aligned LLMs without relying on additional safety data. This method involves merging the weights of pre- and post-fine-tuned safety-aligned models, which helps maintain the safety of LLMs while enhancing their performance. The approach is effective in mitigating safety degradation and offers a practical solution for adapting safety-aligned LLMs....
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-12-30"
author: Marta Skreta
title: The Superposition of Diffusion Models Using the Itô Density Estimator
thumbnail: ""
link: https://huggingface.co/papers/2412.17762
summary: The Cambrian explosion of easily accessible pre-trained diffusion models suggests a demand for methods that combine multiple different pre-trained diffusion models without incurring the significant computational burden of re-training a larger combined model. In this paper, we cast the problem of combining multiple pre-trained diffusion models at the generation stage under a novel proposed framework termed superposition. Theoretically, we derive superposition from rigorous first principles stemmi...
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-12-30"
author: Tao Wu
title: 'VideoMaker: Zero-shot Customized Video Generation with the Inherent Force of Video Diffusion Models'
thumbnail: ""
link: https://huggingface.co/papers/2412.19645
summary: This paper presents a new approach to create customized videos without needing additional models. It relies on the video diffusion model (VDM) itself to extract and inject subject features directly from reference images, which improves the consistency of subject appearance in the generated videos....
opinion: placeholder
tags:
- ML