🔮 AI Picks

High-value content automatically selected by AI

ArXiv CS.AI
PhyDrawGen: Physically Grounded Diagram Generation from Natural Language

arXiv:2605.30512v1 Announce Type: new Abstract: Generating physics diagrams from text requires strict adherence to physical laws. While current generative models produce visually plausible outputs, they systematically hallucinate force vectors, ignore conservation laws, and violate geometric constr…

ArXiv CS.AI
Physically Viable World Models: A Case for Query-Conditioned Embodied AI

arXiv:2605.30542v1 Announce Type: new Abstract: World models for embodied AI must be physically viable: constructed to answer intervention queries by representing the physical structure governing action outcomes, rather than merely predicting future observations.

ArXiv CS.AI
Structure-Induced Information for Rerooting Levin Tree Search

arXiv:2605.30664v1 Announce Type: new Abstract: Subgoal-based policy tree search, which uses a policy to guide search, is effective for complex single-agent deterministic problems but often relies on explicit subgoal generation that can incur substantial overhead and hinders scalability.

ArXiv CS.AI
Learning Agent-Compatible Context Management for Long-Horizon Tasks

arXiv:2605.30785v1 Announce Type: new Abstract: LLM agents increasingly face long-horizon tasks such as web search and deep research in real-world applications, where accumulated context can cause long-context degradation and reasoning failures.

ArXiv CS.AI
COMPASS: Cognitive MCTS-Guided Process Alignment for Safe Search Agents

arXiv:2605.30838v1 Announce Type: new Abstract: LLM-powered search agents enable multi-step reasoning and tool use. However, these capabilities introduce retrieval-induced safety degradation, as harmful intents may decompose into seemingly innocuous sub-queries that lead to unsafe outcomes.

ArXiv CS.AI
Distilling LLM Feedback for Lean Theorem Proving

arXiv:2605.30861v1 Announce Type: new Abstract: Post-training for reasoning models typically combines supervised fine-tuning with reinforcement learning from verifiable rewards, most commonly with GRPO. However, this algorithm suffers from sparse rewards, limited exploration, and mode collapse.

ArXiv CS.AI
A Persona-Based Evaluation Framework for Pluralistic Alignment in Generative AI

arXiv:2605.31021v1 Announce Type: new Abstract: Current alignment paradigms for generative artificial intelligence rely predominantly on monolithic benchmarking frameworks that reduce the plurality of human judgment to aggregated statistical baselines, thereby obscuring cultural, demographic, and c…

ArXiv CS.AI
GraphARC: A Comprehensive Benchmark for Graph-Based Abstract Reasoning

arXiv:2605.31031v1 Announce Type: new Abstract: Relational reasoning lies at the heart of intelligence, but existing benchmarks are typically confined to formats such as grids or text. We introduce GraphARC, a benchmark for abstract reasoning on graph-structured data.

ArXiv CS.AI
Vector Linking via Cross-Model Local Isometric Consistency

arXiv:2605.31100v1 Announce Type: new Abstract: We study Vector Linking: given two embedding clouds produced by different black-box encoders over partially overlapping datasets, recover cross-model object correspondences using only vectors.

ArXiv CS.AI
Formalizing and falsifying causal pathways of rare events

arXiv:2605.31254v1 Announce Type: new Abstract: Building on recent formalizations of root cause analysis for rare events ("outliers") in structural equation models, we propose a formal definition of a causal pathway and discuss its testable implications.

ArXiv CS.AI
Answer-Set-Programming-based Abstractions for Reinforcement Learning

arXiv:2605.31444v1 Announce Type: new Abstract: Reinforcement Learning (RL) enables autonomous agents to learn policies from experience, but realistic problems often involve enormous state spaces, making learning and generalisation challenging.

ArXiv CS.AI
Updating the standard neuron model in artificial neural networks

arXiv:2605.30370v1 Announce Type: cross Abstract: From their inception in the 1950s, artificial neural networks (ANNs) started using the so-called point neuron model then prevalent in neuroscience, hoping that this analogy would allow for a better emulation of brain function.

ArXiv CS.AI
Evolutionary Algorithm for Reservoir Learning and Yielding

arXiv:2605.30372v1 Announce Type: cross Abstract: Reservoir computing, a type of recurrent neural network, is a promising approach for temporal learning as it separates dynamic processing from the trained readout layer.

ArXiv CS.AI
AI Loss of Control Incident Management: Response & Resilience

arXiv:2605.30406v1 Announce Type: cross Abstract: Recent research demonstrating AI systems exhibiting deception and shutdown resistance suggests that AI loss of control (LOC) is an urgent policy concern, yet current literature focuses almost exclusively on alignment and prevention.

ArXiv CS.AI
SANA-Streaming: Real-time Streaming Video Editing with Hybrid Diffusion Transformer

arXiv:2605.30409v1 Announce Type: cross Abstract: Real-time streaming video-to-video editing (V2V) is critical for interactive applications such as live broadcasting and gaming, yet it remains a formidable challenge due to the stringent requirements for temporal consistency and inference throughput…

ArXiv CS.AI
LongDS-Bench: On the Failure of Long-Horizon Agentic Data Analysis

arXiv:2605.30434v1 Announce Type: cross Abstract: Real-world data analysis is inherently iterative, yet existing benchmarks mostly evaluate isolated or short interactive tasks, leaving agents' ability to track evolving analytical context over long horizons untested.

ArXiv CS.AI
The Surface You Test Is Not the Surface That Breaks

arXiv:2605.30454v1 Announce Type: cross Abstract: Tool-augmented LLM agents are vulnerable to prompt injection: a third party who controls part of the agent's context can plant instructions that the agent then executes as if they came from the user.

ArXiv CS.AI
idSCD: Identifying Training Datasets through Semantic Correlation Descriptors

arXiv:2605.30462v1 Announce Type: cross Abstract: Can a dataset be recognized from the spurious correlations it induces during training? We argue that datasets leave dataset-specific traces in a model's learned semantic correlation structure: incidental regularities that are predictive within a dat…

ArXiv CS.AI
Improved Distribution Estimation in $\ell_\infty$

arXiv:2605.30509v1 Announce Type: cross Abstract: We present improved bounds for estimating discrete probability distributions under the $\ell_\infty$ norm. These include minimax bounds in expectation and high-probability tail bounds.

ArXiv CS.AI
VLM3: Vision Language Models Are Native 3D Learners

arXiv:2605.30561v1 Announce Type: cross Abstract: Vision Language Models (VLMs) enable a unified model to solve various vision tasks through prompting. They have shown promising performance in semantic understanding.

ArXiv CS.AI
Active Timepoint Selection for Learning Measure-Valued Trajectories

arXiv:2605.30625v1 Announce Type: cross Abstract: Inferring continuous probability paths from sparse snapshots is a fundamental challenge in domains like single-cell biology, where high-fidelity data acquisition is often destructive and constrained by prohibitive sequencing costs.

ArXiv CS.AI
PInVerify: An Offline Embodied Benchmark for Active Instance Verification

arXiv:2605.30639v1 Announce Type: cross Abstract: Embodied agents have made strong progress in navigating to target objects, but reaching the goal vicinity does not guarantee that the agent has found the correct instance: subtle attribute differences (e.g., "white floral" vs.

ArXiv CS.AI
EUDAIMONIA: Evaluating Undesirable Dynamics in AI

arXiv:2605.30654v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly used as conversational partners for companionship, emotional disclosure, and interpersonal advice, but the social dynamics of these interactions can create harms that are not captured by capability-orien…

ArXiv CS.AI
Automatically Attacking Software Reverse Engineering AI Agents

arXiv:2605.30667v1 Announce Type: cross Abstract: Software tools for reverse engineering executable binary files, such as Ghidra, enable malware analysts to safely conduct robust static analysis without having access to original source code.

ArXiv CS.AI
CobSeg: Coherence Boundary Modeling for Dialogue Topic Segmentation

arXiv:2605.30668v1 Announce Type: cross Abstract: Dialogue topic segmentation is critical in many human-AI collaborative applications and requires identifying heterogeneous boundary cues, including lexical transitions near utterance edges and semantic discontinuities across utterances.

ArXiv CS.AI
Seeing Before Agreeing: Aligning Multi-Agent Consensus with Visual Evidence

arXiv:2605.30698v1 Announce Type: cross Abstract: Vision-language models (VLMs) have achieved strong performance on visual question answering (VQA). To mitigate individual hallucinations and blind spots, aggregating diverse perspectives via multi-agent collaboration has emerged as a promising parad…

ArXiv CS.AI
SAGE: A Novelty Gate for Efficient Memory Evolution in Agentic LLMs

arXiv:2605.30711v1 Announce Type: cross Abstract: Agentic LLMs must continuously decide whether newly extracted facts should be added, merged with existing memories, or ignored, yet prior work has focused more on retrieval and storage than on principled write-side control.
