generated from codingpot/newsletter_awesome_articles
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
1 parent fb8ea8a, commit 612d771
Showing 10 changed files with 90 additions and 0 deletions.
@@ -0,0 +1,9 @@
date: "2024-12-24"
author: Zhiheng Liu
title: 'DepthLab: From Partial to Complete'
thumbnail: ""
link: https://huggingface.co/papers/2412.18153
summary: DepthLab is a foundation depth inpainting model that can complete missing depth data and preserve scale consistency. It can be used in various downstream tasks and outperforms current solutions in both performance and visual quality....
opinion: placeholder
tags:
- ML
9 changes: 9 additions & 0 deletions
...sition Embedding: Enhancing Attention's Periodic Extension for Length Generalization.yaml
@@ -0,0 +1,9 @@
date: "2024-12-24"
author: Ermo Hua
title: 'Fourier Position Embedding: Enhancing Attention''s Periodic Extension for Length Generalization'
thumbnail: ""
link: https://huggingface.co/papers/2412.17739
summary: Fourier Position Embedding (FoPE) is introduced as an enhancement to Rotary Position Embedding (RoPE) in Language Models (LMs). FoPE improves the periodic extension and length generalization of RoPE-based attention by addressing the adverse effects of linear layers and activation functions outside of attention, as well as insufficiently trained frequency components caused by time-domain truncation. FoPE constructs Fourier Series and zero-outs destructive frequency components, increasing model ro...
opinion: placeholder
tags:
- ML
9 changes: 9 additions & 0 deletions
current/2024-12-24 ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing.yaml
@@ -0,0 +1,9 @@
date: "2024-12-24"
author: Ziteng Wang
title: 'ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing'
thumbnail: ""
link: https://huggingface.co/papers/2412.14711
summary: ReMoE is a fully differentiable Mixture-of-Experts architecture that uses ReLU as the router instead of TopK+Softmax routing. It offers efficient dynamic allocation of computation across tokens and layers, and exhibits domain specialization. ReMoE consistently outperforms vanilla TopK-routed MoE across various model sizes, expert counts, and levels of granularity, and exhibits superior scalability with respect to the number of experts....
opinion: placeholder
tags:
- ML
9 changes: 9 additions & 0 deletions
...2-24 SKETCH: Structured Knowledge Enhanced Text Comprehension for Holistic Retrieval.yaml
@@ -0,0 +1,9 @@
date: "2024-12-24"
author: Aakash Mahalingam
title: 'SKETCH: Structured Knowledge Enhanced Text Comprehension for Holistic Retrieval'
thumbnail: ""
link: https://huggingface.co/papers/2412.15443
summary: SKETCH is a new method that improves the process of finding information from large datasets by combining text retrieval and knowledge graphs. It helps to create more accurate and relevant responses compared to traditional methods, as shown in four different datasets....
opinion: placeholder
tags:
- ML
9 changes: 9 additions & 0 deletions
...hLLM: Combining Semantic Graphs and Large Language Models for 3D Scene Understanding.yaml
@@ -0,0 +1,9 @@
date: "2024-12-25"
author: Tatiana Zemskova
title: '3DGraphLLM: Combining Semantic Graphs and Large Language Models for 3D Scene Understanding'
thumbnail: ""
link: https://huggingface.co/papers/2412.18450
summary: A 3D scene graph represents a compact scene model, storing information about the objects and the semantic relationships between them, making its use promising for robotic tasks. When interacting with a user, an embodied intelligent agent should be capable of responding to various queries about the scene formulated in natural language. Large Language Models (LLMs) are beneficial solutions for user-robot interaction due to their natural language understanding and reasoning abilities. Recent method...
opinion: placeholder
tags:
- ML
9 changes: 9 additions & 0 deletions
...lti-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation.yaml
@@ -0,0 +1,9 @@
date: "2024-12-25"
author: Minghong Cai
title: 'DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation'
thumbnail: ""
link: https://huggingface.co/papers/2412.18597
summary: We propose DiTCtrl, a method for generating videos with multiple sequential prompts using a Multi-Modal Diffusion Transformer (MM-DiT) architecture. Our method analyzes the attention mechanism of MM-DiT and utilizes mask-guided precise semantic control across different prompts with attention sharing to achieve smooth transitions and consistent object motion. We also introduce MPVBench, a new benchmark for evaluating multi-prompt video generation performance....
opinion: placeholder
tags:
- ML
9 changes: 9 additions & 0 deletions
... Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning.yaml
@@ -0,0 +1,9 @@
date: "2024-12-25"
author: Sungjin Park
title: Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning
thumbnail: ""
link: https://huggingface.co/papers/2412.15797
summary: We introduce a new way to use many language models together to solve complex problems, called LE-MCTS. This method helps open-source models perform better on challenging reasoning tasks by choosing the best answer from different models based on a reward system. Our approach outperforms other methods and improves performance by up to 4.3% on certain math reasoning datasets....
opinion: placeholder
tags:
- ML
9 changes: 9 additions & 0 deletions
current/2024-12-25 In Case You Missed It: ARC 'Challenge' Is Not That Challenging.yaml
@@ -0,0 +1,9 @@
date: "2024-12-25"
author: Łukasz Borchmann
title: 'In Case You Missed It: ARC ''Challenge'' Is Not That Challenging'
thumbnail: ""
link: https://huggingface.co/papers/2412.17758
summary: The paper discusses how the evaluation setup of the ARC Challenge makes it seem more difficult than it actually is for modern LLMs. The paper also highlights how similar evaluation practices can lead to false assumptions about reasoning deficits in other benchmarks and offers guidelines to ensure that multiple-choice evaluations accurately reflect actual model capabilities....
opinion: placeholder
tags:
- ML
9 changes: 9 additions & 0 deletions
current/2024-12-25 MotiF: Making Text Count in Image Animation with Motion Focal Loss.yaml
@@ -0,0 +1,9 @@
date: "2024-12-25"
author: Shijie Wang
title: 'MotiF: Making Text Count in Image Animation with Motion Focal Loss'
thumbnail: ""
link: https://huggingface.co/papers/2412.16153
summary: The paper proposes MotiF, an approach to improve text alignment and motion generation in text-guided image animation by focusing on regions with more motion. They also introduce TI2V Bench, a dataset for evaluating text-guided image animation, and conduct a human evaluation protocol. MotiF outperforms nine open-sourced models on TI2V Bench, achieving an average preference of 72%....
opinion: placeholder
tags:
- ML
9 changes: 9 additions & 0 deletions
...artGen: Part-level 3D Generation and Reconstruction with Multi-View Diffusion Models.yaml
@@ -0,0 +1,9 @@
date: "2024-12-25"
author: Minghao Chen
title: 'PartGen: Part-level 3D Generation and Reconstruction with Multi-View Diffusion Models'
thumbnail: ""
link: https://huggingface.co/papers/2412.18608
summary: PartGen is a method that separates 3D objects into meaningful parts and reconstructs them using multi-view diffusion models. This method can generate 3D objects from text or images and can complete or hallucinate missing parts....
opinion: placeholder
tags:
- ML