- IRISX: A Dynamic Trade-off System for Harnessing Heterogeneity for Performance Portability
- Adaptive Part Learning for Fine-Grained Generalized Category Discovery: A Plug-and-Play Enhancement
- Design and Performance Analysis of Planar Antenna for Ground Penetrating Radar Applications
- InterMimic: Towards Universal Whole-Body Control for Physics-Based Human-Object Interactions
- Accelerating Multimodal Large Language Models by Searching Optimal Vision Token Reduction
- Generalized Gaussian Entropy Model for Point Cloud Attribute Compression with Dynamic Likelihood Intervals
- Joint Optimization of Underwater Acoustic ISUDC Waveform Design and Sparse Channel Estimation Algorithms
- DiSciPLE: Learning Interpretable Programs for Scientific Visual Discovery
- Enhancing Time-Domain Shielding Effectiveness of Cables Using Metal-Coated Aramid-Fiber Composites
- Shadow Generation Using Diffusion Model with Geometry Prior
- Where’s the liability in the Generative Era? Recovery-based Black-Box Detection of AI-Generated Content
- Research on Dual Objectives Optimization of Quality and Cost Based on MCMC-ACO and Dynamic Bayesian Integration in Multi-Process Electronics Manufacturing
- MotionPro: A Precise Motion Controller for Image-to-Video Generation
- Research on Trajectory Tracking Control Algorithm for Medical Delivery Robot
- AMSnet 2.0: A Large AMS Database with AI Segmentation for Net Detection
- Analysis of higher-order Lotka-Volterra models: Application of S-tensors and the polynomial complementarity problem
- Towards Consistent Multi-Task Learning: Unlocking the Potential of Task-Specific Parameters
- Three Cars Approaching within 100m! Enhancing Distant Geometry by Tri-Axis Voxel Scanning for Camera-based Semantic Scene Completion
- VidHalluc: Evaluating Temporal Hallucinations in Multimodal Large Language Models for Video Understanding
- Spectral State Space Model for Rotation-Invariant Visual Representation Learning
- Generative Photomontage
- Research and Implementation of an Automatic Secondary Security Verification Method for System Operation Permissions
- Design and simulation analysis of ship automatic anti-interference heading controller
- Compositional Caching for Training-free Open-vocabulary Attribute Detection
- VidSeg: Training-free Video Semantic Segmentation based on Diffusion Models
- VIRES: Video Instance Repainting via Sketch and Text Guided Generation
- PICD: Versatile Perceptual Image Compression with Diffusion Rendering
- ALIEN: Implicit Neural Representations for Human Motion Prediction under Arbitrary Latency
- COAP: Memory-Efficient Training with Correlation-Aware Gradient Projection
- Zero-Shot Monocular Scene Flow Estimation in the Wild
- Unveiling collective value creation behavior in public projects: The stakeholder value network approach
- Multitwine: Multi-Object Compositing with Text and Layout Control
- Modeling Thousands of Human Annotators for Generalizable Text-to-Image Person Re-identification
- OmniGen: Unified Image Generation
- From Multimodal LLMs to Generalist Embodied Agents: Methods and Lessons
- A Regularization-Guided Equivariant Approach for Image Restoration
- YOLO-Poppy: Opium Poppy Detection Algorithm for Complex Aerial Scenes
- SoftShadow: Leveraging Soft Masks for Penumbra-Aware Shadow Removal
- Performance Modeling of Non-Uniform Heterogeneous Platforms
- Embedding Generative AI into Products – 10 Design Principles for Building Intelligent Systems
- An Example of Autism Co-Design: Physiological Sensor-driven Ecological Momentary Assessment Application
- Investigating CNN Models Efficacy in Spotting Lung Conditions using X-Ray Images
- Show and Segment: Universal Medical Image Segmentation via In-Context Learning
- MambaVision: A Hybrid Mamba-Transformer Vision Backbone
- MATCHA: Towards Matching Anything
- Optimus-2: Multimodal Minecraft Agent with Goal-Observation-Action Conditioned Policy
- Construction of Automated Machine Learning (AutoML) Framework Based on Large Language Models
- Efficient Dynamic Scene Editing via 4D Gaussian-based Static-Dynamic Separation
- Common3D: Self-Supervised Learning of 3D Morphable Models for Common Objects in Neural Feature Space
- MAGiC-SLAM: Multi-Agent Gaussian Globally Consistent SLAM
- CRT-AS: A Chinese Remainder Theorem-Based Authentication and Key Agreement Scheme for SDVN
- VGGT: Visual Geometry Grounded Transformer
- POMP: Physics-constrainable Motion Generative Model through Phase Manifolds
- Implementation of Cryptographic Architecture for Secure Data Transmission using Reversible Logic Gates
- Template-Adaptive Content Organization: AI-Driven Personalization for E-Commerce Email Marketing
- A Novel Three Port Multi-Input Single Inductor DC-DC Bidirectional Boost Converter
- Decoder Gradient Shield: Provable and High-Fidelity Prevention of Gradient-Based Box-Free Watermark Removal
- Comprehensive Comparative Study On Peak Particle Velocity Prediction For Surface Mines Via A Machine Learning Approach
- Bayesian Test-Time Adaptation for Vision-Language Models
- Assessing the Impact of Industrial Energy Reductions on Electric Truck Adoption
- Model-Based Failure Propagation Analysis and Automated Fault Tree Generation: Methodology and Application in Complex Avionics System
- A Modulation Scheme for Enhanced Performance of Hybrid Source Inverters in Electric Vehicles Application
- Highly Integrated Communication System for Commercial SAR Satellites Based on On-Board Computers
- CARE Transformer: Mobile-Friendly Linear Visual Transformer via Decoupled Dual Interaction
- Simple Derivation of Approximate Crosstalk Expressions for Multicore Fibers With Core-Dependent Loss
- Ultra-Efficient Three-Phase Integrated-Active-Filter Isolated Rectifier for AI Data Center Applications
- Mechanical Parameters Identification of Servo Drive Using Periodic Velocity Profile
- Research on the Design of a Multi-User Interactive System for Chu Music Based on Multimodal Collaboration
- SUM Parts: Benchmarking Part-Level Semantic Segmentation of Urban Meshes
- One-for-More: Continual Diffusion Model for Anomaly Detection
- Learning Visual Generative Priors without Text
- DPU: Dynamic Prototype Updating for Multimodal Out-of-Distribution Detection
- Task Offloading Using Policy Based Deep Reinforcement Learning
- The Standardization Framework of Product Traceability and Process Performance Monitoring in Interoperable Agroindustry Systems
- Cross-modal Information Flow in Multimodal Large Language Models
- MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research
- Online Estimation of Region Inertia Based on Dynamic Division with Spatial-Temporal Perception
- Empirical Design of a Robotic Arm Control System based on Flex Sensors with Artificial Intelligence (AI) Association
- Deep Learning-Based Text Recommendation Algorithm
- HIIF: Hierarchical Encoding based Implicit Image Function for Continuous Super-resolution
- Rotational Magnetic Field Regulation Featuring High Spatial Freedom for Wireless Power Transfer System
- Dual Semantic Guidance for Open Vocabulary Semantic Segmentation
- Generative Inbetweening through Frame-wise Conditions-Driven Video Generation
- PhysGen3D: Crafting a Miniature Interactive World from a Single Image
- UniPose: A Unified Multimodal Framework for Human Pose Comprehension, Generation and Editing
- PatchVSR: Breaking Video Diffusion Resolution Limits with Patch-wise Video Super-Resolution
- Coherent 3D Portrait Video Reconstruction via Triplane Fusion
- Task-driven Image Fusion with Learnable Fusion Loss
- Multi-Agent Reinforcement Learning for Optimal Resource Allocation in Space-Air-Ground Integrated Networks
- Sim-to-Real Causal Transfer: A Metric Learning Approach to Causally-Aware Interaction Representations
- Lung infection classification using HOG based SVM classifier on X-ray images
- Convex Relaxation for Robust Vanishing Point Estimation in Manhattan World
- A Novel Passive Anti-islanding Detection Method for Synchronverter Controlled Grid Forming Inverter
- Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding
- The Scene Language: Representing Scenes with Programs, Words, and Embeddings
- Multimodal Meta-Learning for Early Rumor Detection Based on Few-Shot Learning
- SPACE: Speaker Adaptation for Acoustic Eavesdropping using mmWave Radio Signals
- Detecting Backdoor Attacks in Federated Learning via Direction Alignment Inspection
- Multifunctional Onboard Charger for Electric Vehicles with Single and Three phase Grid Compatibility
- Real-Time Adaptive Resolution System Using Electronically Foveated Dynamic Vision Sensor for Optimized Visual Processing