- DreamTrack: Dreaming the Future for Multimodal Visual Object Tracking
- A Hierarchical Patch Feature Distribution Network for Industrial Multiscale Defect Detection
- Analysis of Lightning Grounding Values Using Variations in Frequency, Distance Ratio, and Measurement Methods
- BFANet: Revisiting 3D Semantic Segmentation with Boundary Feature Analysis
- Erase Diffusion: Empowering Object Removal Through Calibrating Diffusion Pathways
- MICAS: Multi-grained In-Context Adaptive Sampling for 3D Point Cloud Processing
- Seeing What Matters: Empowering CLIP with Patch Generation-to-Selection
- Input Series Output Parallel Connection based Fault Tolerant LV Power Supply in Automotive Applications
- DiffCAM: Data-Driven Saliency Maps by Capturing Feature Differences
- Harnessing Frozen Unimodal Encoders for Flexible Multimodal Alignment
- Kernel Instruction Optimization Based on the Triton Compiler
- Research on Active Braking Control Strategy of EHB Based on Finite-Time Adaptive Control for Intelligent Vehicle
- HomoGen: Enhanced Video Inpainting via Homography Propagation and Diffusion
- ML Enabled Parallel R-C Sensor for Level and Electrical Conductivity Measurement
- Study and Validation of a Novel dq-axes Equivalent Circuit Model for PMSM Considering the Iron Loss
- Reconciling Stochastic and Deterministic Strategies for Zero-shot Image Restoration using Diffusion Model in Dual
- Data Distributional Properties As Inductive Bias for Systematic Generalization
- SCFlow2: Plug-and-Play Object Pose Refiner with Shape-Constraint Scene Flow
- A Cost-effective Occupancy Estimation System for Energy-efficient Buildings in Africa
- Integrating Deep Learning with Blockchain Technology to Strengthen Decentralized AI Application's Data Privacy, Security, and Transparency
- Basis Expansion Extrapolation based Long-Term Channel Prediction for Massive MIMO OTFS Systems
- Best Price Prediction for Pre-Owned Cars using Machine Learning Techniques
- Analysis of Mixing of Pseudo-Random Number Generators with Statistical Tests
- Structured 3D Latents for Scalable and Versatile 3D Generation
- Goku: Flow Based Video Generative Foundation Models
- Optimized Gas Sensor Array with AI for Distinguishing and Classifying Similar Odorants
- MVSAnywhere: Zero-Shot Multi-View Stereo
- Boost the Inference with Co-Training: A Depth-Guided Mutual Learning Framework for Semi-Supervised Medical Polyp Segmentation
- Improved Video VAE for Latent Video Diffusion Model
- Physical Constraints into Deep Learning for Enhanced Snow Depth Retrieval over the Third Pole
- Learning Affine Correspondences by Integrating Geometric Constraints
- Fiber-tip magnetically driven microgripper for micromanipulation
- TuberSense: Industrial application of gas sensing as a tool for monitoring crop spoilage
- Model Poisoning Attacks to Federated Learning via Multi-Round Consistency
- DocVLM: Make Your VLM an Efficient Reader
- Modified Voltage Controller to Reduce Harmonics in Three-Level Boost PFC Converters
- Double-Frequency Control of Multi-Active Bridge Converters for Soft-Switching Range Extension
- Real-time High-fidelity Gaussian Human Avatars with Position-based Interpolation of Spatially Distributed MLPs
- Profile Least Squares Estimation in Networks with Covariates
- Sub-THz Power Amplifiers: Measurements, Behavioral Modeling and Predistortion Algorithms
- Parallel Scan on Ascend AI Accelerators
- Dynamic Integration of Task-Specific Adapters for Class Incremental Learning
- Lossy Parallel Visualization of Large-Scale Volume Data with Error-Bounded Image Compositing
- Non-Iterative Coordination of Interconnected Power Grids via Dimension-Decomposition-Based Flexibility Aggregation
- AdaCM²: On Understanding Extremely Long-Term Video with Adaptive Cross-Modality Memory Reduction
- Comprehensive Analysis and Wide Range Operation of ZVS and Quasi-ZPA in Wireless Power Transfer System
- Joint Out-of-Distribution Filtering and Data Discovery Active Learning
- Ground-V: Teaching VLMs to Ground Complex Instructions in Pixels
- MetaScenes: Towards Automated Replica Creation for Real-world 3D Scans
- Video Motion Transfer with Diffusion Transformers
- Reconstructing Close Human Interaction with Appearance and Proxemics Reasoning
- Benchmarking Sustainability Assessment Tools for SMEs: An AHP–TOPSIS Framework and a Vision–Execution Quadrant Approach
- Educational Institution Financial Management System Based on Enhanced Smart Contracts
- Vid2Avatar-Pro: Authentic Avatar from Videos in the Wild via Universal Prior
- LoRA Subtraction for Drift-Resistant Space in Exemplar-Free Continual Learning
- Towards Understanding and Quantifying Uncertainty for Text-to-Image Generation
- Convex Combination Star Shape Prior for Data-driven Image Semantic Segmentation
- GazeGene: Large-scale Synthetic Gaze Dataset with 3D Eyeball Annotations
- Free Lunch Enhancements for Multi-modal Crowd Counting
- ResCLIP: Residual Attention for Training-free Dense Vision-language Inference
- DexHandDiff: Interaction-aware Diffusion Planning for Adaptive Dexterous Manipulation
- BEVDiffuser: Plug-and-Play Diffusion Model for BEV Denoising with Ground-Truth Guidance
- Iterative Sound Source Recognition Method Based on Randomness Removal
- Research on Sentiment Analysis Method Based on the Collaborative Modeling of Attention Mechanism and Sentiment Lexicon
- A System Identification Approach to Modelling Chlorophyll Fluorescence Response to Dynamic Light Conditions
- Exo-Glove Pinch: A Soft, Hand-Wearable Robot Designed Through Constrained Tendon Routing Analysis
- Single Event Transient Characterization of a Low-Dropout Regulator in a sub-20 nm CMOS Technology
- Generative Zero-Shot Composed Image Retrieval
- IRISX: A Dynamic Trade-off System for Harnessing Heterogeneity for Performance Portability
- Adaptive Part Learning for Fine-Grained Generalized Category Discovery: A Plug-and-Play Enhancement
- Design and Performance Analysis of Planar Antenna for Ground Penetrating Radar Applications
- InterMimic: Towards Universal Whole-Body Control for Physics-Based Human-Object Interactions
- Accelerating Multimodal Large Language Models by Searching Optimal Vision Token Reduction
- Generalized Gaussian Entropy Model for Point Cloud Attribute Compression with Dynamic Likelihood Intervals
- Joint Optimization of Underwater Acoustic ISUDC Waveform Design and Sparse Channel Estimation Algorithms
- DiSciPLE: Learning Interpretable Programs for Scientific Visual Discovery
- Enhancing Time-Domain Shielding Effectiveness of Cables Using Metal-Coated Aramid-Fiber Composites
- Shadow Generation Using Diffusion Model with Geometry Prior
- Where’s the liability in the Generative Era? Recovery-based Black-Box Detection of AI-Generated Content
- Research on Dual Objectives Optimization of Quality and Cost Based on MCMC-ACO and Dynamic Bayesian Integration in Multi-Process Electronics Manufacturing
- MotionPro: A Precise Motion Controller for Image-to-Video Generation
- Research on Trajectory Tracking Control Algorithm for Medical Delivery Robot
- AMSnet 2.0: A Large AMS Database with AI Segmentation for Net Detection
- Analysis of higher-order Lotka-Volterra models: Application of S-tensors and the polynomial complementarity problem
- Towards Consistent Multi-Task Learning: Unlocking the Potential of Task-Specific Parameters
- Three Cars Approaching within 100m! Enhancing Distant Geometry by Tri-Axis Voxel Scanning for Camera-based Semantic Scene Completion
- VidHalluc: Evaluating Temporal Hallucinations in Multimodal Large Language Models for Video Understanding
- Spectral State Space Model for Rotation-Invariant Visual Representation Learning
- Generative Photomontage
- Research and Implementation of an Automatic Secondary Security Verification Method for System Operation Permissions
- Design and simulation analysis of ship automatic anti-interference heading controller
- Compositional Caching for Training-free Open-vocabulary Attribute Detection
- VidSeg: Training-free Video Semantic Segmentation based on Diffusion Models
- VIRES: Video Instance Repainting via Sketch and Text Guided Generation
- PICD: Versatile Perceptual Image Compression with Diffusion Rendering
- ALIEN: Implicit Neural Representations for Human Motion Prediction under Arbitrary Latency
- COAP: Memory-Efficient Training with Correlation-Aware Gradient Projection
- Zero-Shot Monocular Scene Flow Estimation in the Wild
- Unveiling collective value creation behavior in public projects: The stakeholder value network approach
- Multitwine: Multi-Object Compositing with Text and Layout Control