Commit

Automated report
deep-diver committed Dec 25, 2024
1 parent fb8ea8a commit 612d771
Showing 10 changed files with 90 additions and 0 deletions.
9 changes: 9 additions & 0 deletions current/2024-12-24 DepthLab: From Partial to Complete.yaml
@@ -0,0 +1,9 @@
date: "2024-12-24"
author: Zhiheng Liu
title: 'DepthLab: From Partial to Complete'
thumbnail: ""
link: https://huggingface.co/papers/2412.18153
summary: DepthLab is a foundation depth inpainting model that can complete missing depth data and preserve scale consistency. It can be used in various downstream tasks and outperforms current solutions in both performance and visual quality....
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-12-24"
author: Ermo Hua
title: 'Fourier Position Embedding: Enhancing Attention''s Periodic Extension for Length Generalization'
thumbnail: ""
link: https://huggingface.co/papers/2412.17739
summary: Fourier Position Embedding (FoPE) is introduced as an enhancement to Rotary Position Embedding (RoPE) in Language Models (LMs). FoPE improves the periodic extension and length generalization of RoPE-based attention by addressing the adverse effects of linear layers and activation functions outside of attention, as well as insufficiently trained frequency components caused by time-domain truncation. FoPE constructs a Fourier series and zeroes out destructive frequency components, increasing model ro...
opinion: placeholder
tags:
- ML
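The FoPE entry above turns on two concrete ideas: encoding each dimension as a short Fourier series rather than RoPE's single frequency, and zeroing out destructive, under-trained frequency components. Below is a minimal NumPy sketch of that flavour of position embedding; the extra frequencies, the mixing weights, and the `freq_floor` threshold are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def fourier_position_embedding(seq_len: int, head_dim: int,
                               base: float = 10000.0,
                               num_terms: int = 3,
                               freq_floor: float = 1e-3,
                               seed: int = 0) -> np.ndarray:
    """Toy Fourier-series position embedding (FoPE-flavoured sketch).

    Each dimension pair is a weighted sum of complex exponentials at a few
    frequencies instead of RoPE's single exponential, and components below
    `freq_floor` (assumed to be the ones that cannot be trained adequately
    within the training context) are clipped to zero frequency.
    """
    rng = np.random.default_rng(seed)
    half = head_dim // 2
    pos = np.arange(seq_len)[:, None, None]                      # (seq, 1, 1)
    base_freqs = base ** (-np.arange(half) / half)               # (half,)
    # A few frequencies per dimension pair: the base one plus higher octaves.
    freqs = base_freqs[:, None] * (2.0 ** np.arange(num_terms))  # (half, terms)
    freqs = np.where(freqs < freq_floor, 0.0, freqs)             # clip under-trained components
    weights = rng.dirichlet(np.ones(num_terms), size=half)       # (half, terms), rows sum to 1
    # Fourier series per (position, dimension pair): sum_k w_k * exp(i * f_k * p)
    return (weights[None] * np.exp(1j * pos * freqs[None])).sum(-1)  # (seq, half) complex rotations

# Example: rotations for a 64-dim head over 4096 positions.
rot = fourier_position_embedding(seq_len=4096, head_dim=64)
print(rot.shape)  # (4096, 32)
```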
@@ -0,0 +1,9 @@
date: "2024-12-24"
author: Ziteng Wang
title: 'ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing'
thumbnail: ""
link: https://huggingface.co/papers/2412.14711
summary: ReMoE is a fully differentiable Mixture-of-Experts architecture that uses ReLU as the router instead of TopK+Softmax routing. It offers efficient dynamic allocation of computation across tokens and layers, and exhibits domain specialization. ReMoE consistently outperforms vanilla TopK-routed MoE across various model sizes, expert counts, and levels of granularity, and exhibits superior scalability with respect to the number of experts....
opinion: placeholder
tags:
- ML
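The ReMoE entry hinges on one change: the TopK+Softmax router is replaced with a ReLU router, so gating stays differentiable and the number of active experts can vary per token. Here is a minimal PyTorch sketch of that routing idea; the shapes are made up and the sparsity/load-balancing regularization the real method relies on is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReLURouterMoE(nn.Module):
    """Toy Mixture-of-Experts layer with a ReLU router (ReMoE-flavoured sketch).

    Instead of TopK+Softmax gating, each token's gate is relu(W_r x): experts
    with a zero gate are simply inactive, so the number of active experts can
    differ across tokens while the gate stays differentiable wherever it is
    nonzero. The real method additionally controls sparsity and load balance.
    """
    def __init__(self, d_model: int, d_ff: int, num_experts: int):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model)
        gates = F.relu(self.router(x))                      # (batch, seq, experts), mostly zero
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            g = gates[..., e:e + 1]                         # (batch, seq, 1)
            mask = g.squeeze(-1) > 0                        # tokens routed to this expert
            if mask.any():
                out[mask] += g[mask] * expert(x[mask])      # compute only where the gate is active
        return out

# Example usage with dummy activations.
layer = ReLURouterMoE(d_model=256, d_ff=512, num_experts=8)
y = layer(torch.randn(2, 16, 256))
print(y.shape)  # torch.Size([2, 16, 256])
```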
@@ -0,0 +1,9 @@
date: "2024-12-24"
author: Aakash Mahalingam
title: 'SKETCH: Structured Knowledge Enhanced Text Comprehension for Holistic Retrieval'
thumbnail: ""
link: https://huggingface.co/papers/2412.15443
summary: SKETCH is a new method that improves retrieval from large datasets by combining text retrieval with knowledge graphs. It produces more accurate and relevant responses than traditional methods, as shown on four different datasets....
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-12-25"
author: Tatiana Zemskova
title: '3DGraphLLM: Combining Semantic Graphs and Large Language Models for 3D Scene Understanding'
thumbnail: ""
link: https://huggingface.co/papers/2412.18450
summary: A 3D scene graph represents a compact scene model, storing information about the objects and the semantic relationships between them, making its use promising for robotic tasks. When interacting with a user, an embodied intelligent agent should be capable of responding to various queries about the scene formulated in natural language. Large Language Models (LLMs) are beneficial solutions for user-robot interaction due to their natural language understanding and reasoning abilities. Recent method...
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-12-25"
author: Minghong Cai
title: 'DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation'
thumbnail: ""
link: https://huggingface.co/papers/2412.18597
summary: We propose DiTCtrl, a method for generating videos with multiple sequential prompts using a Multi-Modal Diffusion Transformer (MM-DiT) architecture. Our method analyzes the attention mechanism of MM-DiT and utilizes mask-guided precise semantic control across different prompts with attention sharing to achieve smooth transitions and consistent object motion. We also introduce MPVBench, a new benchmark for evaluating multi-prompt video generation performance....
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-12-25"
author: Sungjin Park
title: Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning
thumbnail: ""
link: https://huggingface.co/papers/2412.15797
summary: We introduce LE-MCTS, a new way to use multiple language models together to solve complex problems. The method helps open-source models perform better on challenging reasoning tasks by choosing the best answer from different models based on a reward system. Our approach outperforms other methods and improves performance by up to 4.3% on certain math reasoning datasets....
opinion: placeholder
tags:
- ML
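The LE-MCTS entry describes ensembling language models under a process reward model. The sketch below shows the general flavour in a deliberately simplified, greedy form (keep the best-scoring next step proposed by any model); the paper's actual algorithm is a Monte Carlo tree search, and `models` and `reward_model` are hypothetical callables you would supply.

```python
from typing import Callable, Sequence

def greedy_process_reward_ensemble(
    question: str,
    models: Sequence[Callable[[str], str]],       # each proposes the next reasoning step
    reward_model: Callable[[str, str], float],    # scores a partial solution (process reward)
    max_steps: int = 8,
    stop_token: str = "ANSWER:",
) -> str:
    """Build a solution one step at a time, always keeping the candidate step
    (from any model in the ensemble) that the process reward model scores highest.
    This is a greedy simplification of the tree-search idea, not LE-MCTS itself."""
    solution = ""
    for _ in range(max_steps):
        # Every model proposes a continuation of the current partial solution.
        candidates = [m(question + "\n" + solution) for m in models]
        # Score each candidate continuation with the process reward model.
        scored = [(reward_model(question, solution + step), step) for step in candidates]
        best_score, best_step = max(scored, key=lambda t: t[0])
        solution += best_step + "\n"
        if stop_token in best_step:
            break
    return solution
```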
@@ -0,0 +1,9 @@
date: "2024-12-25"
author: Łukasz Borchmann
title: 'In Case You Missed It: ARC ''Challenge'' Is Not That Challenging'
thumbnail: ""
link: https://huggingface.co/papers/2412.17758
summary: The paper discusses how the evaluation setup of the ARC Challenge makes it seem more difficult than it actually is for modern LLMs. The paper also highlights how similar evaluation practices can lead to false assumptions about reasoning deficits in other benchmarks and offers guidelines to ensure that multiple-choice evaluations accurately reflect actual model capabilities....
opinion: placeholder
tags:
- ML
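The ARC entry argues that the evaluation setup, not the questions, creates much of the apparent difficulty. A likely reading (stated here as an assumption) is the contrast between scoring each answer option in isolation and presenting all options together; the toy sketch below shows both setups, with `lm_score` and `lm_generate` as hypothetical stand-ins for a model API.

```python
# Hedged sketch of two multiple-choice evaluation setups. The exact contrast
# is an assumption based on the summary; the question is a made-up example.

QUESTION = "Which gas do plants absorb from the atmosphere?"
OPTIONS = {"A": "Oxygen", "B": "Carbon dioxide", "C": "Nitrogen", "D": "Helium"}

def eval_separate(lm_score) -> str:
    """Setup 1: score each option on its own; the model never sees the alternatives."""
    scores = {label: lm_score(f"{QUESTION}\nAnswer: {text}")
              for label, text in OPTIONS.items()}
    return max(scores, key=scores.get)

def eval_joint(lm_generate) -> str:
    """Setup 2: present all options together and ask for a letter,
    the way a human test-taker would see the question."""
    prompt = (QUESTION + "\n"
              + "\n".join(f"{label}. {text}" for label, text in OPTIONS.items())
              + "\nAnswer with the letter only:")
    return lm_generate(prompt).strip()[:1]
```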
@@ -0,0 +1,9 @@
date: "2024-12-25"
author: Shijie Wang
title: 'MotiF: Making Text Count in Image Animation with Motion Focal Loss'
thumbnail: ""
link: https://huggingface.co/papers/2412.16153
summary: The paper proposes MotiF, an approach that improves text alignment and motion generation in text-guided image animation by focusing training on regions with more motion. The authors also introduce TI2V Bench, a dataset for evaluating text-guided image animation, together with a human evaluation protocol. MotiF outperforms nine open-sourced models on TI2V Bench, achieving an average preference of 72%....
opinion: placeholder
tags:
- ML
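The MotiF entry says the training objective is focused on regions with more motion. A hedged sketch of a motion-weighted diffusion loss follows; the frame-difference motion proxy, the blend factor, and the tensor shapes are illustrative assumptions rather than the paper's motion focal loss.

```python
import torch
import torch.nn.functional as F

def motion_focal_loss(pred_noise: torch.Tensor,
                      true_noise: torch.Tensor,
                      frames: torch.Tensor,
                      alpha: float = 0.8) -> torch.Tensor:
    """Hedged sketch of a motion-weighted diffusion loss.

    Approximates "focus on regions with more motion" by weighting the usual
    per-pixel noise-prediction loss with a normalized frame-difference map.

    pred_noise, true_noise: (B, T, C, H, W) diffusion targets
    frames:                 (B, T, C, H, W) clean video frames
    """
    # Per-pixel motion proxy: magnitude of the temporal difference between frames.
    diff = (frames[:, 1:] - frames[:, :-1]).abs().mean(dim=2, keepdim=True)   # (B, T-1, 1, H, W)
    diff = F.pad(diff, (0, 0, 0, 0, 0, 0, 0, 1))        # pad time so weights cover every frame
    motion = diff / (diff.amax(dim=(-1, -2), keepdim=True) + 1e-6)            # normalize to [0, 1]
    weights = (1 - alpha) + alpha * motion              # keep a floor so static regions still train
    per_pixel = (pred_noise - true_noise) ** 2          # standard epsilon-prediction MSE
    return (weights * per_pixel).mean()
```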
@@ -0,0 +1,9 @@
date: "2024-12-25"
author: Minghao Chen
title: 'PartGen: Part-level 3D Generation and Reconstruction with Multi-View Diffusion Models'
thumbnail: ""
link: https://huggingface.co/papers/2412.18608
summary: PartGen is a method that separates 3D objects into meaningful parts and reconstructs them using multi-view diffusion models. This method can generate 3D objects from text or images and can complete or hallucinate missing parts....
opinion: placeholder
tags:
- ML
