- VERA: Explainable Video Anomaly Detection via Verbalized Learning of Vision-Language Models
- Integrating Physics-Informed Neural Networks and GRU for SciML-based Surface Temperature Prediction Li-ion Battery
- Optimized Cloud Performance Through Secure Data Detachment and Reproduction
- ChannelGuard: A DIRS-based Location Privacy-Protecting Mechanism for Integrated Sensing and Communication Systems
- EventSplat: 3D Gaussian Splatting from Moving Event Cameras for Real-time Rendering
- Scalable Runtime Architecture for Data-driven, Hybrid HPC and ML Workflow Applications
- Diffusion Bridge: Leveraging Diffusion Model to Reduce the Modality Gap Between Text and Vision for Zero-Shot Image Captioning
- A Three-Phase Synchronous Reference Frame Controller-Based DC Link Voltage Balancing Technique for CHB-Based Modular SST
- DarkGAN-Enhanced Low-Light Detection and Localization of Low-Voltage Electricity Meters
- Recognizing Abnormalities in Fundus Images Using Vision Transformer for Ocular Diseases
- Enhancing Creative Generation on Stable Diffusion-based Models
- AdaptCMVC: Robust Adaption to Incremental Views in Continual Multi-view Clustering
- DrivingSphere: Building a High-fidelity 4D World for Closed-loop Simulation
- Binarized Mamba-Transformer for Lightweight Quad Bayer HybridEVS Demosaicing
- Large Language Model for Verilog Generation with Code-Structure-Guided Reinforcement Learning
- Epidemic-Behavior Coevolutionary Vaccination Game Dynamics Under Prospect Theory
- Mr. DETR: Instructive Multi-Route Training for Detection Transformers
- DPC: Dual-Prompt Collaboration for Tuning Vision-Language Models
- Towards In-the-wild 3D Plane Reconstruction from a Single Image
- On the Zero-shot Adversarial Robustness of Vision-Language Models: A Truly Zero-shot and Training-free Approach
- FedChip: Federated LLM for Artificial Intelligence Accelerator Chip Design
- Let Humanoids Hike! Integrative Skill Development on Complex Trails
- Blurred LiDAR for Sharper 3D: Robust Handheld 3D Scanning with Diffuse LiDAR and RGB
- MFogHub: Bridging Multi-Regional and Multi-Satellite Data for Global Marine Fog Detection and Forecasting
- GIF: Generative Inspiration for Face Recognition at Scale
- LERFSNet: A Lightweight SAR Ship Detection Model with Enhanced Receptive Field and Shared Decoupling Head
- Light Weight Apple Leaf Disease Detection Method Based on Improved YOLOv8
- VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models
- TIMotion: Temporal and Interactive Framework for Efficient Human-Human Motion Generation
- Design and Analysis of 39GHz 5G Array Antennas for WBAN Applications
- FedBiP: Heterogeneous One-Shot Federated Learning with Personalized Latent Diffusion Models
- QCLAB: A Matlab Toolbox for Quantum Computing
- ASIC-Agent: An Autonomous Multi-Agent System for ASIC Design with Benchmark Evaluation
- Fusing CNN and LSTM Networks for Residential Electricity Load Forecasting
- Coarse Correspondences Boost Spatial-Temporal Reasoning in Multimodal Language Model
- IAAO: Interactive Affordance Learning for Articulated Objects in 3D Environments
- AniMo: Species-Aware Model for Text-Driven Animal Motion Generation
- A Hybrid Chip System Design for Visual Prosthetics
- Mv-Math: Evaluating Multimodal Math Reasoning in Multi-Visual Contexts
- A Focused Human Body Model for Accurate Anthropometric Measurements Extraction
- Align-KD: Distilling Cross-Modal Alignment Knowledge for Mobile Vision-Language Large Model Enhancement
- InteractVLM: 3D Interaction Reasoning from 2D Foundational Models
- OSDFace: One-Step Diffusion Model for Face Restoration
- Determining the Efficacy of SENet Integrated YOLO Models For Animal Detection
- Sample-Efficient Reinforcement Learning from Human Feedback via Information-Directed Sampling
- Trade-Offs in Resource-Constrained Dimensionality Reduction Algorithms
- Flashover Simulation in a 500 kV Transmission Line Considering Propagation Time Between a Ground Wire and a Phase Conductor
- Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding
- Low-Cost and Low-Frequency Interface for Soil Moisture Monitoring
- LatentHOI: On the Generalizable Hand Object Motion Generation with Latent Hand Diffusion
- ECBench: Can Multi-modal Foundation Models Understand the Egocentric World? A Holistic Embodied Cognition Benchmark
- Vision-Language Model IP Protection via Prompt-based Learning
- Radio Frequency Ray Tracing with Neural Object Representation for Enhanced RF Modeling
- Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks
- Research on the Construction of an E-Commerce Discipline Knowledge Graph Based on Multi-Source Heterogeneous Data Fusion
- MODA: Motion-Drift Augmentation for Inertial Human Motion Analysis
- Few-Shot Recognition via Stage-Wise Retrieval-Augmented Finetuning
- AesthetiQ: Enhancing Graphic Layout Design via Aesthetic-Aware Preference Alignment of Multi-modal Large Language Models
- MoManipVLA: Transferring Vision-language-action Models for General Mobile Manipulation
- MonoPlace3D: Learning 3D-Aware Object Placement for 3D Monocular Detection
- A Reliability Index for Position Estimation in Trustworthy Location-Based Services
- A High Gain Non-Isolated Single-Switch DC-DC Boost Converter: Design and Analysis
- Monocular 3D Vehicle Detection Based on Video Point Tracking: A Novel Approach Without Prior Information
- On the Singularity of SYCL
- UniAP: Unifying Inter- and Intra-Layer Automatic Parallelism by Mixed Integer Quadratic Programming
- FirePlace: Geometric Refinements of LLM Common Sense Reasoning for 3D Object Placement
- Efficient Net Load Forecasting in Large-scale Power Distribution Systems via Dual-branch Experts Fusion Memory Network
- Once-Tuning-Multiple-Variants: Tuning Once and Expanded as Multiple Vision-Language Model Variants
- A Dual-Network Architecture with Uncertainty-Aware Multimodal Fusion for Deepfake Detection
- Line Graph Neural Network for Drug-Disease Association Prediction
- An improved fault ride-through control of hybrid distribution transformer with considering reclosing scheme
- Physical Plausibility-aware Trajectory Prediction via Locomotion Embodiment
- Dynamic Generation Technology of Network Scene Based on Proximal Policy Optimization
- Improving Wireless Federated Learning via Joint Downlink-Uplink Beamforming over Analog Transmission
- OpenSDI: Spotting Diffusion-Generated Images in the Open World
- Control of Grid-Connected Dual-VSI DFIG-dc System With Series Connection at DC-link
- Hearing Anywhere in Any Environment
- PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models
- Self-Evolving Visual Concept Library using Vision-Language Critics
- AI Personalized Language Learning Application
- Pioneering Cloud-enhanced Real-time Control of Modular-Multilevel Reconfigurable Battery Packs for Automotive Applications
- Real-IAD D³: A Real-World 2D/Pseudo-3D/3D Dataset for Industrial Anomaly Detection
- Towards Precise Embodied Dialogue Localization via Causality Guided Diffusion
- CountLLM: Towards Generalizable Repetitive Action Counting via Large Language Model
- Cooperative Algorithms for Multi-Agent Multi-Armed Bandits: Integrating $\varepsilon$-Greedy Optimization
- Integrated State of Charge and Thermal Active Balancing in Lithium-Ion Batteries: A Finite Set Model Predictive Control Approach
- Prolog-RAG: A Symbolic Reasoning Approach to Retrieval-Augmented Generation
- ConMo: Controllable Motion Disentanglement and Recomposition for Zero-Shot Motion Transfer
- Modeling and Analysis of the Effect of Grounding and Bonding of Floating Roof Tanks on Their Performance Against the Direct Lightning Strikes
- Bridging the Vision-Brain Gap with an Uncertainty-Aware Blur Prior
- Exploring Simple Open-Vocabulary Semantic Segmentation
- Mitigation of Voltage Imbalance and Improving the Reliability of a Bipolar DC Microgrid using a Multiport Compensator
- A Continuous Path Planning Method for Multi-AGV Tasks Based on an Improved ECBS Algorithm
- Fault-Tolerant Synchronization Control of Switched Complex Networks by a Proportional-Integral Intermediate Observer Approach
- CBW_CRNet: Chebyshev Beluga Whale Optimization based Convolutional Recurrent Network for Vehicle Positioning and Tracking for 6G Networks
- Integration of On-board and Wireless Charging for Electric Vehicles with Single Stage Resonant Converter
- Flexible and Enzyme-Free Graphene-PVDF-Au Electrode for Glucose Detection in Sweat
- MCDDT: Mirror center loss based dual-scale dual-softmax transformer for multi-source subjects transfer learning in motor imagery recognition
- Securing End-to-End Reinforcement Learning-Driven Autonomous Driving: A Control Command Utility-based Intrusion Response System
- LotusFilter: Fast Diverse Nearest Neighbor Search via a Learned Cutoff Table