- Model Predictive Control of Interleaved DC-DC Boost Converter
- Vid2Avatar-Pro: Authentic Avatar from Videos in the Wild via Universal Prior
- DehazeMist: Research on Image Dehazing System Based on Improved Dark Channel Prior
- WildAvatar: Learning In-the-wild 3D Avatars from the Web
- FedCS: Coreset Selection for Federated Learning
- Joint Trajectory and Power Optimization for UAV-SAR based ISAC System
- Detect-and-Guide: Self-regulation of Diffusion Models for Safe Text-to-Image Generation via Guideline Token Optimization
- Deterministic Image-to-Image Translation via Denoising Brownian Bridge Models with Dual Approximators
- Towards Understanding and Quantifying Uncertainty for Text-to-Image Generation
- Enhanced Multi-Class Driver Behavior Detection in IoMT Environments Using Hybrid LSTM-GRU Model
- Research on Engine Lubrication Oil Temperature Prediction Based on WOA-LSTM Algorithm
- Operating with Variable DC-link Voltage Under Dynamic Operating Conditions for HEV Traction Drives
- GraphGPT-o: Synergistic Multimodal Comprehension and Generation on Graphs
- EquiPose: Exploiting Permutation Equivariance for Relative Camera Pose Estimation
- ChatHuman: Chatting about 3D Humans with Tools
- A Polarization-Aided Transformer for Image Deblurring via Motion Vector Decomposition
- ShiftwiseConv: Small Convolutional Kernel with Large Kernel Effect
- Implementation of a Voltage-Dependent Transmission Line Model with Corona Effect Consideration
- InsightEdit: Towards Better Instruction Following for Image Editing
- A Methodology for Analyzing and Diagnosing Renewable Grid Tie Inverter Designs
- Free Lunch Enhancements for Multi-modal Crowd Counting
- HiPART: Hierarchical Pose AutoRegressive Transformer for Occluded 3D Human Pose Estimation
- Scene Map-based Prompt Tuning for Navigation Instruction Generation
- An Efficient Cross-Domain Trusted Authentication Scheme for Microgrids
- ViStream: Improving Computation Efficiency of Visual Streaming Perception via Law-of-Charge-Conservation Inspired Spiking Neural Network
- Generative Zero-Shot Composed Image Retrieval
- VideoGuide: Improving Video Diffusion Models without Training Through a Teacher’s Guide
- Dynamic Stereotype Theory Induced Micro-expression Recognition with Oriented Deformation
- Adaptive Part Learning for Fine-Grained Generalized Category Discovery: A Plug-and-Play Enhancement
- Accelerating Triangle Counting with Real Processing-in-Memory Systems
- Towards Orchestrating Agentic Applications as FaaS Workflows
- Leveraging 3D Geometric Priors in 2D Rotation Symmetry Detection
- Where’s the liability in the Generative Era? Recovery-based Black-Box Detection of AI-Generated Content
- Revisiting Generative Replay for Class Incremental Object Detection
- MotionPro: A Precise Motion Controller for Image-to-Video Generation
- Direct Predictive Harmonic Current Control of Power Converters With Fixed Switching Frequency
- PICD: Versatile Perceptual Image Compression with Diffusion Rendering
- VeriDebug: A Unified LLM for Verilog Debugging via Contrastive Embedding and Guided Correction
- Learning Physics From Video: Unsupervised Physical Parameter Estimation for Continuous Dynamical Systems
- Synchronization and Pinning Control on Circulating Directed Hypergraphs
- Thalassa: Transforming Symbolic PDEs into Tensor-Based Solvers Running on ML Accelerators
- ClearSight: Visual Signal Enhancement for Object Hallucination Mitigation in Multimodal Large Language Models
- Stability Assessment of a Weak Island System Connected to Two HVDC Links
- Optimization of Fiber Attenuation Prediction Based on GA-CNN-BiLSTM-Attention
- CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos
- Open Set Label Shift with Test Time Out-of-Distribution Reference
- Unconditionally Stable Leapfrog Complying Divergence Implicit FDTD Method with Lumped Elements
- DPCT: Efficient High-Resolution Depth Prediction via Cross-Covariance Attention Transformers
- Context-Aware Multimodal Pretraining
- SpectroMotion: Dynamic 3D Reconstruction of Specular Scenes
- MaSS13K: A Matting-level Semantic Segmentation Benchmark
- Adaptive Protein Design Protocols and Middleware
- Sharp-It: A Multi-view to Multi-view Diffusion Model for 3D Synthesis and Manipulation
- Efficient GPU Memory Resource Scheduling Algorithm for Vehicle Detection Tasks in High Concurrent Scenarios
- A Learning Algorithm Based on Similarity Identification and Knowledge Transfer for Dynamic Multi-Objective Optimization
- Frequency-Domain Analysis of Contaminant Effects on Leakage Current and Harmonic Distortion for Transmission Line Diagnostics
- SketchVideo: Sketch-based Video Generation and Editing
- Frequency-Biased Synergistic Design for Image Compression and Compensation
- HybridMQA: Exploring Geometry-Texture Interactions for Colored Mesh Quality Assessment
- Impedance Analysis and Compensation Method for IPT System Applying Inverse Coupled Current Doubler Rectifier
- A Tale of Two Classes: Adapting Supervised Contrastive Learning to Binary Imbalanced Datasets
- MotiF: Making Text Count in Image Animation with Motion Focal Loss
- Classifier-guided CLIP Distillation for Unsupervised Multi-label Classification
- Weakly Supervised Temporal Action Localization via Dual-Prior Collaborative Learning Guided by Multimodal Large Language Models
- Robust Message Embedding via Attention Flow-Based Steganography
- Cross Scale Attention Transformer for Single Image Super-Resolution
- Toward Efficient Asynchronous Single-Source Shortest Path
- Coupling Study in a 2D Gimbal-less Quasi-static Piezoelectrically-Actuated MEMS Mirror
- DA-Distill: Dual-Alignment Distillation for Multimodal Knowledge Transfer on Edge Devices
- Linear Attention Modeling for Learned Image Compression
- Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level
- Concept Lancet: Image Editing with Compositional Representation Transplant
- OmniGen: Unified Image Generation
- Learning Physics-Based Full-Body Human Reaching and Grasping from Brief Walking References
- RoadSocial: A Diverse VideoQA Dataset and Benchmark for Road Event Understanding from Social Video Narratives
- Dynamic Motion Blending for Versatile Motion Editing
- Precise, Fast, and Low-cost Concept Erasure in Value Space: Orthogonal Complement Matters
- No Pains, More Gains: Recycling Sub-Salient Patches for Efficient High-Resolution Image Recognition
- The Datasets Crawling Based on Search Engine in Minor Fields AI Application
- BLADE: Single-View Body Mesh Estimation through Accurate Depth Estimation
- Secure Access: A Multimodal Authentication and Threat Detection System for Person in Residence Halls
- MetaCast: Generalizing HPC Application Runtime Prediction
- FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model
- SF2T: Self-supervised Fragment Finetuning of Video-LLMs for Fine-Grained Understanding
- A Sustainable Learning Framework: UAV-Based Oat Chlorophyll Monitoring Using Radiative Transfer Models and Deep Learning Techniques
- A Stitch in Time Saves Nine: Small VLM is a Precise Guidance for Accelerating Large VLMs
- Diffusion-based Realistic Listening Head Generation via Hybrid Motion Modeling
- Coeff-Tuning: A Graph Filter Subspace View for Tuning Attention-Based Large Models
- OpenHumanVid: A Large-Scale High-Quality Dataset for Enhancing Human-Centric Video Generation
- Encapsulated Composition of Text-to-Image and Text-to-Video Models for High-Quality Video Synthesis
- Video Language Model Pretraining with Spatio-temporal Masking
- Gate Efficient Composition of Hamiltonian Simulation and Block-Encoding with its Application on HUBO, Chemistry and Finite Difference Method
- Optimal Split Capacitor DC-Link Design for Partial Load Multi-Level Inverters
- Enhancing SRAM Efficiency and Stability with Self Pull Up Mechanism and Bitline Charge Sharing
- Triple-Band Efficiency Improvement of Smartwatch Antennas by Sharing a Zeroth-Order Resonance Patch
- Strain-Regulated Polarity Switching in Flexible MoTe₂ Transistors
- Conversion Formulas between the WLP Spectrum and the Frequency Spectrum for WLP-FDTD Analysis
- Breaking Down LLM Inference: A preliminary performance analysis of sparsified transformers
- Research Overview on Moving Object Detection Methods Based on Video Image Analysis
- CorrMADA: Improving Robustness of ML-Coupled Intrusion Detection Systems