Automated report
deep-diver committed Nov 22, 2024
1 parent bae5dc6 commit c7143a8
Showing 14 changed files with 126 additions and 0 deletions.
@@ -0,0 +1,9 @@
date: "2024-11-22"
author: Yuanhao Cai
title: Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation
thumbnail: ""
link: https://huggingface.co/papers/2411.14384
summary: This paper introduces DiffusionGS, a method that bakes Gaussian splatting into a diffusion denoiser to generate 3D content from a single 2D image in one stage. It is faster and produces better results than other methods, and it handles different view directions and object-centric inputs. The authors also developed a training strategy to improve the model's ability to generalize to different scenes and objects....
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-11-22"
author: Javier Ferrando
title: Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models
thumbnail: ""
link: https://huggingface.co/papers/2411.14257
summary: We use sparse autoencoders to understand why large language models hallucinate and find that they have internal representations about their own capabilities in recognizing entities. These representations can steer the model to refuse to answer questions about known entities or to hallucinate attributes of unknown entities. They also have a causal effect on the chat model's refusal behavior, suggesting that chat finetuning has repurposed this existing mechanism. We explore the mechanistic role of...
opinion: placeholder
tags:
- ML
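The entry above rests on sparse autoencoders (SAEs) trained over model activations to surface interpretable directions such as entity-recognition features. Below is a minimal sketch of a standard SAE of the kind used in this line of interpretability work, not the paper's specific setup; the dimensions, L1 coefficient, and random stand-in activations are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseAutoencoder(nn.Module):
    """Minimal sparse autoencoder over residual-stream activations."""

    def __init__(self, d_model: int, d_dict: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_dict)   # activation -> sparse features
        self.decoder = nn.Linear(d_dict, d_model)   # sparse features -> reconstruction

    def forward(self, x: torch.Tensor):
        f = F.relu(self.encoder(x))                 # non-negative, mostly-zero feature activations
        x_hat = self.decoder(f)
        return x_hat, f

def sae_loss(x, x_hat, f, l1_coeff: float = 1e-3):
    # reconstruction error plus an L1 penalty that pushes features toward sparsity
    recon = F.mse_loss(x_hat, x)
    sparsity = f.abs().sum(dim=-1).mean()
    return recon + l1_coeff * sparsity

# toy usage on random "activations"
sae = SparseAutoencoder(d_model=768, d_dict=768 * 8)
acts = torch.randn(32, 768)
x_hat, f = sae(acts)
loss = sae_loss(acts, x_hat, f)
loss.backward()
```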
@@ -0,0 +1,9 @@
date: "2024-11-22"
author: Weiyun Wang
title: Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization
thumbnail: ""
link: https://huggingface.co/papers/2411.10442
summary: Researchers have developed a new method called Mixed Preference Optimization (MPO) to improve the reasoning abilities of multimodal large language models (MLLMs). They created a large dataset for multimodal reasoning and used it along with MPO to enhance the performance of MLLMs, particularly in chain-of-thought tasks. The new model, InternVL2-8B-MPO, outperforms previous models and shows comparable performance to larger models....
opinion: placeholder
tags:
- ML
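The MPO entry above hinges on mixing a preference objective with other training signals. As a rough sketch of one plausible composition (not the paper's exact recipe), the snippet below combines a DPO-style preference loss with a supervised generation loss on the preferred response; the loss weights and tensor shapes are assumptions.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta: float = 0.1):
    """Standard DPO preference loss over summed log-probs of whole responses."""
    logits = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    return -F.logsigmoid(logits).mean()

def mixed_preference_loss(policy_chosen_logp, policy_rejected_logp,
                          ref_chosen_logp, ref_rejected_logp,
                          chosen_token_logps, w_pref: float = 1.0, w_sft: float = 0.1):
    # preference term: push the policy to prefer the chosen response over the rejected one
    pref = dpo_loss(policy_chosen_logp, policy_rejected_logp,
                    ref_chosen_logp, ref_rejected_logp)
    # generation (SFT) term: keep the likelihood of the chosen response high
    sft = -chosen_token_logps.mean()
    return w_pref * pref + w_sft * sft

# toy tensors standing in for per-response and per-token log-probabilities
policy_c, policy_r = torch.randn(4), torch.randn(4)
ref_c, ref_r = torch.randn(4), torch.randn(4)
token_logps = torch.randn(4, 128)
loss = mixed_preference_loss(policy_c, policy_r, ref_c, ref_r, token_logps)
```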
@@ -0,0 +1,9 @@
date: "2024-11-22"
author: Xin Dong
title: 'Hymba: A Hybrid-head Architecture for Small Language Models'
thumbnail: ""
link: https://huggingface.co/papers/2411.13676
summary: Hymba is a family of small language models built on a hybrid-head architecture that runs attention heads and state-space-model heads in parallel for greater efficiency. Learnable meta tokens prepended to the input store important information and ease the load on attention. This makes the models faster and lighter on memory than comparable models while also improving language understanding....
opinion: placeholder
tags:
- ML
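To make the hybrid-head idea concrete, here is a toy block that runs standard multi-head attention in parallel with a drastically simplified gated linear recurrence standing in for a state-space head, with learnable meta tokens prepended to the sequence. This is an illustrative sketch under those assumptions, not the Hymba architecture; all sizes and the fusion-by-averaging choice are invented for the example.

```python
import torch
import torch.nn as nn

class ToyHybridBlock(nn.Module):
    """Parallel attention + simplified recurrent head, fused by averaging."""

    def __init__(self, d_model: int, n_heads: int, n_meta: int = 4):
        super().__init__()
        self.meta = nn.Parameter(torch.randn(1, n_meta, d_model) * 0.02)  # learnable meta tokens
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.decay = nn.Linear(d_model, d_model)   # per-channel gate for the recurrent head
        self.inp = nn.Linear(d_model, d_model)
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape
        # prepend meta tokens so attention can park "global" information in them
        x_meta = torch.cat([self.meta.expand(b, -1, -1), x], dim=1)
        attn_out, _ = self.attn(x_meta, x_meta, x_meta)
        attn_out = attn_out[:, -t:]                 # drop the meta positions from the output

        # toy recurrent head: h_t = a_t * h_{t-1} + (1 - a_t) * W x_t
        a = torch.sigmoid(self.decay(x))
        u = self.inp(x)
        h = torch.zeros(b, d, device=x.device)
        rec_out = []
        for step in range(t):
            h = a[:, step] * h + (1 - a[:, step]) * u[:, step]
            rec_out.append(h)
        rec_out = torch.stack(rec_out, dim=1)

        # fuse the two heads and project back
        return self.out(0.5 * (attn_out + rec_out))

block = ToyHybridBlock(d_model=256, n_heads=4)
y = block(torch.randn(2, 16, 256))                  # (batch, seq, d_model)
```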
@@ -0,0 +1,9 @@
date: "2024-11-22"
author: Yuhao Dong
title: 'Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models'
thumbnail: ""
link: https://huggingface.co/papers/2411.14432
summary: Insight-V is a new method that uses a multi-agent system to improve the reasoning abilities of large language models in vision-language tasks. It creates long and diverse reasoning paths and uses a summary agent to judge and summarize the results. This leads to better performance on multi-modal benchmarks that require visual reasoning....
opinion: placeholder
tags:
- ML
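The Insight-V entry describes a two-role pipeline: a reasoning agent that samples long, diverse reasoning paths and a summary agent that judges them and produces the final answer. The sketch below shows that decomposition only in outline; `call_model` is a hypothetical stand-in for querying a multimodal LLM, and the prompts are invented for illustration.

```python
from dataclasses import dataclass

def call_model(prompt: str) -> str:
    # hypothetical stand-in for querying a multimodal LLM
    return "..."

@dataclass
class Candidate:
    reasoning: str
    answer: str

def reasoning_agent(question: str, image_desc: str, n_paths: int = 4):
    """Sample several long, diverse reasoning paths for the same query."""
    paths = []
    for i in range(n_paths):
        reasoning = call_model(
            f"Image: {image_desc}\nQuestion: {question}\n"
            f"Think step by step (attempt {i + 1}):"
        )
        answer = call_model(f"{reasoning}\nFinal answer:")
        paths.append(Candidate(reasoning, answer))
    return paths

def summary_agent(question: str, candidates):
    """Judge the candidate reasoning paths and produce one final answer."""
    listing = "\n\n".join(
        f"Candidate {i + 1}:\n{c.reasoning}\nAnswer: {c.answer}"
        for i, c in enumerate(candidates)
    )
    return call_model(
        f"Question: {question}\n{listing}\n"
        "Select the best-supported reasoning and give the final answer:"
    )

question = "What is shown in the image?"
final = summary_agent(question, reasoning_agent(question, "a street scene"))
```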
@@ -0,0 +1,9 @@
date: "2024-11-22"
author: Ruiyuan Gao
title: 'MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control'
thumbnail: ""
link: https://huggingface.co/papers/2411.13807
summary: MagicDriveDiT is a new method for generating long, high-resolution videos for autonomous driving. It builds on the DiT (Diffusion Transformer) architecture, improves its training strategy, and adds adaptive controls over the generated content. The resulting videos are better than those of prior methods, and the approach supports many different autonomous-driving tasks....
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-11-22"
author: Yu Zhao
title: 'Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions'
thumbnail: ""
link: https://huggingface.co/papers/2411.14405
summary: Marco-o1 is a reasoning model that combines chain-of-thought fine-tuning, Monte Carlo Tree Search (MCTS), reflection mechanisms, and innovative reasoning strategies to solve complex real-world problems. It focuses on finding open-ended resolutions in domains where clear standards are absent and rewards are difficult to quantify....
opinion: placeholder
tags:
- ML
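Since the Marco-o1 entry cites Monte Carlo Tree Search over reasoning steps, here is a generic MCTS loop with UCB selection to show the mechanism. The `propose_steps` and `rollout_reward` functions are hypothetical stand-ins for sampling candidate reasoning steps from an LLM and scoring a finished path; nothing here reflects Marco-o1's actual search configuration.

```python
import math
import random

class Node:
    def __init__(self, state, parent=None):
        self.state = state          # here: the partial reasoning text so far
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0

def ucb(node, c: float = 1.4):
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(math.log(node.parent.visits) / node.visits)

def propose_steps(state):
    # hypothetical stand-in for sampling candidate next reasoning steps from an LLM
    return [state + f" -> step{i}" for i in range(3)]

def rollout_reward(state):
    # hypothetical stand-in for a confidence/quality score of a completed reasoning path
    return random.random()

def mcts(root_state, iterations: int = 100):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # selection: descend by UCB until reaching a leaf
        while node.children:
            node = max(node.children, key=ucb)
        # expansion: add candidate next reasoning steps
        for s in propose_steps(node.state):
            node.children.append(Node(s, parent=node))
        leaf = random.choice(node.children)
        # simulation + backpropagation
        reward = rollout_reward(leaf.state)
        while leaf is not None:
            leaf.visits += 1
            leaf.value += reward
            leaf = leaf.parent
    return max(root.children, key=lambda n: n.visits).state

print(mcts("Problem: ..."))
```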
@@ -0,0 +1,9 @@
date: "2024-11-22"
author: Enrico Fini
title: Multimodal Autoregressive Pre-training of Large Vision Encoders
thumbnail: ""
link: https://huggingface.co/papers/2411.14402
summary: This paper presents AIMV2, a family of generalist vision encoders that excel in various downstream tasks, including multimodal evaluations and vision benchmarks. The encoders are characterized by a straightforward pre-training process, scalability, and remarkable performance. The AIMV2-3B encoder achieves 89.5% accuracy on ImageNet-1k with a frozen trunk and outperforms state-of-the-art contrastive models in multimodal image understanding across diverse settings....
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-11-22"
author: Xidong Feng
title: Natural Language Reinforcement Learning
thumbnail: ""
link: https://huggingface.co/papers/2411.14251
summary: This paper proposes a new approach called Natural Language Reinforcement Learning (NLRL) that uses natural language to represent decision-making problems and solve them using large language models (LLMs). The authors demonstrate the effectiveness and efficiency of their approach through experiments on various games....
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-11-22"
author: Akari Asai
title: 'OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs'
thumbnail: ""
link: https://huggingface.co/papers/2411.14199
summary: OpenScholar is a specialized retrieval-augmented LM that assists scientists in synthesizing scientific literature by identifying relevant passages from 45 million open-access papers and providing citation-backed responses. It outperforms GPT-4o and PaperQA2 in correctness and citation accuracy, and experts prefer its responses over expert-written ones in human evaluations....
opinion: placeholder
tags:
- ML
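OpenScholar, per the entry above, retrieves relevant passages and generates citation-backed answers. The following is a bare-bones retrieval-augmented generation sketch: embed, rank by cosine similarity, and prompt with numbered sources. The `embed` and `generate` functions are hypothetical placeholders, and the toy corpus stands in for the paper's 45-million-paper datastore.

```python
import numpy as np

def embed(texts):
    # hypothetical stand-in for a dense retriever / embedding model
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(texts), 384))

def generate(prompt: str) -> str:
    # hypothetical stand-in for the answer-generating LM
    return "Answer citing [1] and [2]."

def retrieve(query, passages, k: int = 3):
    q = embed([query])[0]
    p = embed(passages)
    # cosine similarity between the query and every passage
    scores = p @ q / (np.linalg.norm(p, axis=1) * np.linalg.norm(q) + 1e-9)
    top = np.argsort(-scores)[:k]
    return [(int(i), passages[i]) for i in top]

def answer_with_citations(query, passages):
    hits = retrieve(query, passages)
    context = "\n".join(f"[{rank + 1}] {text}" for rank, (_, text) in enumerate(hits))
    prompt = f"Cite sources as [n].\n\nPassages:\n{context}\n\nQuestion: {query}\nAnswer:"
    return generate(prompt)

corpus = ["Passage about transformers.", "Passage about retrieval.", "Passage about citations."]
print(answer_with_citations("How do retrieval-augmented LMs cite sources?", corpus))
```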
@@ -0,0 +1,9 @@
date: "2024-11-22"
author: Yijiong Yu
title: Patience Is The Key to Large Language Model Reasoning
thumbnail: ""
link: https://huggingface.co/papers/2411.13082
summary: Recent advancements in the field of large language models, particularly through the Chain of Thought (CoT) approach, have demonstrated significant improvements in solving complex problems. However, existing models either tend to sacrifice detailed reasoning for brevity due to user preferences, or require extensive and expensive training data to learn complicated reasoning ability, limiting their potential in solving complex tasks. To bridge this gap, following the concept of scaling test-time, w...
opinion: placeholder
tags:
- ML
@@ -0,0 +1,9 @@
date: "2024-11-22"
author: Omri Avrahami
title: 'Stable Flow: Vital Layers for Training-Free Image Editing'
thumbnail: ""
link: https://huggingface.co/papers/2411.14430
summary: The paper proposes a method to identify 'vital layers' within Diffusion Transformer (DiT) models, crucial for image formation, to perform consistent image edits via selective injection of attention features. The authors also introduce an improved image inversion method for flow models and evaluate their approach through qualitative and quantitative comparisons, along with a user study, demonstrating its effectiveness across multiple applications....
opinion: placeholder
tags:
- ML
9 changes: 9 additions & 0 deletions current/2024-11-22 Ultra-Sparse Memory Network.yaml
@@ -0,0 +1,9 @@
date: "2024-11-22"
author: Zihao Huang
title: Ultra-Sparse Memory Network
thumbnail: ""
link: https://huggingface.co/papers/2411.12364
summary: This paper proposes UltraMem, a new architecture that incorporates a large-scale, ultra-sparse memory layer to address the limitations of existing models. It significantly reduces inference latency while maintaining model performance and demonstrates favorable scaling properties. Experiments show that it achieves state-of-the-art inference speed and model performance within a given computational budget....
opinion: placeholder
tags:
- ML
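To illustrate the kind of mechanism the UltraMem entry refers to, here is a simplified memory layer that keeps a large key/value table but reads only the top-k slots per token. This is a generic sparse-memory sketch, not the UltraMem design; a real implementation would also avoid scoring every key densely (for example via factorized keys), and all sizes here are arbitrary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToySparseMemory(nn.Module):
    """Large key/value table queried with a top-k sparse lookup."""

    def __init__(self, d_model: int, n_slots: int, topk: int = 8):
        super().__init__()
        self.query = nn.Linear(d_model, d_model)
        self.keys = nn.Parameter(torch.randn(n_slots, d_model) * 0.02)
        self.values = nn.Embedding(n_slots, d_model)   # huge table, only top-k rows touched
        self.topk = topk

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q = self.query(x)                               # (batch, seq, d_model)
        scores = q @ self.keys.T                        # similarity to every memory slot
        top_scores, top_idx = scores.topk(self.topk, dim=-1)
        weights = F.softmax(top_scores, dim=-1)         # normalize only over the selected slots
        selected = self.values(top_idx)                 # (batch, seq, topk, d_model)
        return (weights.unsqueeze(-1) * selected).sum(dim=-2)

mem = ToySparseMemory(d_model=256, n_slots=65536)
y = mem(torch.randn(2, 16, 256))                        # output matches the input shape
```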
@@ -0,0 +1,9 @@
date: "2024-11-22"
author: Bethel Melesse Tessema
title: 'UnifiedCrawl: Aggregated Common Crawl for Affordable Adaptation of LLMs on Low-Resource Languages'
thumbnail: ""
link: https://huggingface.co/papers/2411.14343
summary: We developed a method to collect text data for low-resource languages from the Common Crawl corpus efficiently, resulting in larger datasets than before. Our approach, UnifiedCrawl, filters and extracts text from Common Crawl using minimal compute resources. We fine-tuned multilingual LLMs on this data with efficient adapter methods (QLoRA), which significantly boosted performance on low-resource languages while minimizing VRAM usage. Our experiments showed improvements in language modeling perplexity a...
opinion: placeholder
tags:
- ML
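The UnifiedCrawl entry mentions adapting multilingual LLMs with QLoRA on the collected corpus. Below is a minimal QLoRA setup using the transformers/peft/bitsandbytes stack; the base model name, target modules, and hyperparameters are illustrative assumptions rather than the paper's configuration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_name = "facebook/xglm-564M"   # hypothetical small multilingual base model

# 4-bit quantization so the frozen base model fits in limited VRAM
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=bnb_config)
model = prepare_model_for_kbit_training(model)

# small trainable low-rank adapters on top of the quantized base model
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # typical attention projections; assumed for this base model
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# ...then train with a standard causal-LM loop or Trainer on the collected monolingual corpus
```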
