- DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation
- The Impact Label Noise and Choice of Threshold has on Cross-Entropy and Soft-Dice in Image Segmentation
- Z-Magic: Zero-shot Multiple Attributes Guided Image Creator
- Adversarial Diffusion Compression for Real-World Image Super-Resolution
- Educational Institution Financial Management System Based on Enhanced Smart Contracts
- LLM+RAG+Agent: Create an Efficient and Accurate Text Data Labeling System
- Visual Evolutionary Optimization on Graph-Structured Combinatorial Problems With MLLMs: A Case Study of Influence Maximization
- Decoupled Motion Expression Video Segmentation
- Learning Audio-guided Video Representation with Gated Attention for Video-Text Retrieval
- Reconfigurable Intelligent Surfaces for ISAC: CRB Analysis and Optimization for Joint Angle and Radial Velocity Estimation
- Application Research of Lightning Warning Device for Transmission Lines in the Prediction of Severe Convection Thunderstorm Activities
- MixerMDM: Learnable Composition of Human Motion Diffusion Models
- A RISC-V Coprocessor for Seamless Integration of Stream-Based Accelerators
- Research on the Application of Artificial Intelligence Technology in Beam Resource Allocation for Multi-Beam Satellite Communication Systems
- Relation3D: Enhancing Relation Modeling for Point Cloud Instance Segmentation
- Can Large Vision-Language Models Correct Semantic Grounding Errors By Themselves?
- SmartCLIP: Modular Vision-language Alignment with Identification Guarantees
- Video-XL: Extra-Long Vision Language Model for Hour-Scale Video Understanding
- DexHandDiff: Interaction-aware Diffusion Planning for Adaptive Dexterous Manipulation
- 3D Occupancy Prediction with Low-Resolution Queries via Prototype-aware View Transformation
- Dual Consolidation for Pre-Trained Model-Based Domain-Incremental Learning
- Detecting Backdoor Attacks in Federated Learning via Direction Alignment Inspection
- Maximizing Grid Forming Capabilities of Solar Inverters with Energy Storage Under Partial Shading Conditions
- VIRES: Video Instance Repainting via Sketch and Text Guided Generation
- ALIEN: Implicit Neural Representations for Human Motion Prediction under Arbitrary Latency
- Abnormal Flow Monitoring Method of Power Grid Equipment Based on Isolation Forest Technology
- MegaSaM: Accurate, Fast, and Robust Structure and Motion from Casual Dynamic Videos
- Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation
- Chapter-Llama: Efficient Chaptering in Hour-Long Videos with LLMs
- Deep Learning-Driven Vulnerability Detection Models for Software Security
- Advanced Charger Placement Strategies in Sensor Networks Using Graph Theory and Evolutionary Algorithms
- Chebyshev Attention Depth Permutation Texture Network with Latent Texture Attribute Loss
- Vortex Retarder Empowered Full-Stokes Parameter Measurement via Single-Shot RGB Color Imaging
- CustomKD: Customizing Large Vision Foundation for Edge Model Improvement via Knowledge Distillation
- Augmented Deep Contexts for Spatially Embedded Video Coding
- OPTICAL: Leveraging Optimal Transport for Contribution Allocation in Dataset Distillation
- VasTSD: Learning 3D Vascular Tree-state Space Diffusion Model for Angiography Synthesis
- Design and Feasibility Study of Transverse-flux Double-sided Linear Induction Motor
- Mono2Stereo: A Benchmark and Empirical Study for Stereo Conversion
- Improving Ethereum Mixing Address Linking with Tensor Computation, Neighbor Data Utilization and Asymmetric Information Modeling
- Fault Joint Detection and Adaptive Fault-Tolerant Control of Legged Robots Under Joint Partial Failures
- A Linked Stochastic Kriging for Multi-Layer Systems with Noisy Response
- Inverter Based Measurements of Common Mode Power to monitor Arching in Bearings of Wound Rotor Machine
- HUSH: Holistic Panoramic 3D Scene Understanding using Spherical Harmonics
- BFANet: Revisiting 3D Semantic Segmentation with Boundary Feature Analysis
- Erase Diffusion: Empowering Object Removal Through Calibrating Diffusion Pathways
- Mechanical Parameters Identification of Servo Drive Using Periodic Velocity Profile
- Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images
- SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device
- Preconditioners for the Stochastic Training of Neural Fields
- Input Series Output Parallel Connection based Fault Tolerant LV Power Supply in Automotive Applications
- A Self-healing Electrical Impedance Tomography Sensor for the Selective Localization of Compression and Damage Based on a Diels-Alder Conductive Composite
- Harnessing Frozen Unimodal Encoders for Flexible Multimodal Alignment
- Research on Active Braking Control Strategy of EHB Based on Finite-Time Adaptive Control for Intelligent Vehicle
- Cross-modal Information Flow in Multimodal Large Language Models
- MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research
- Online Estimation of Region Inertia Based on Dynamic Division with Spatial-Temporal Perception
- OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?
- HIIF: Hierarchical Encoding based Implicit Image Function for Continuous Super-resolution
- Mdct-Dpanet: Dual-Path Attention Network for Multi-Channel Speech Separation
- Automatic Expansion and Contraction of Trapping Resources Based on Electric Power Environment
- Efficient Transfer Learning for Video-language Foundation Models
- Structured 3D Latents for Scalable and Versatile 3D Generation
- PatchVSR: Breaking Video Diffusion Resolution Limits with Patch-wise Video Super-Resolution
- BiM-VFI: Bidirectional Motion Field-Guided Frame Interpolation for Video with Non-uniform Motions
- Coherent 3D Portrait Video Reconstruction via Triplane Fusion
- CNN-Transformer Feature Aggregation for Underwater Self-Supervised Multi-Frame Monocular Depth Estimation
- GPAvatar: High-fidelity Head Avatars by Learning Efficient Gaussian Projections
- Nullu: Mitigating Object Hallucinations in Large Vision-Language Models via HalluSpace Projection
- Geometric Knowledge-Guided Localized Global Distribution Alignment for Federated Learning
- Liquid Metal Elastomer Foam-based Sensor Array using Transmission Line for Continuous Spatial Stress Measurement
- Non-Iterative Coordination of Interconnected Power Grids via Dimension-Decomposition-Based Flexibility Aggregation
- AdaCM²: On Understanding Extremely Long-Term Video with Adaptive Cross-Modality Memory Reduction
- SPACE: Speaker Adaptation for Acoustic Eavesdropping using mmWave Radio Signals
- ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation
- Joint Out-of-Distribution Filtering and Data Discovery Active Learning
- Reconstructing Close Human Interaction with Appearance and Proxemics Reasoning
- Can Reasoning Models Reason about Hardware? An Agentic HLS Perspective
- MetaWriter: Personalized Handwritten Text Recognition Using Meta-Learned Prompt Tuning
- DiN: Diffusion Model for Robust Medical VQA with Semantic Noisy Labels
- LoRA Subtraction for Drift-Resistant Space in Exemplar-Free Continual Learning
- Deep Integration Analysis of MEC Computing Nodes and CDN PoP Nodes
- Anchor-Aware Similarity Cohesion in Target Frames Enables Predicting Temporal Moment Boundaries in 2D
- Hybrid Machine Learning Approaches for Enhanced Grid Stability Prediction in Modern Energy Systems
- Deep RL-based Resource Allocation for User Fairness in STAR-RIS–assisted NOMA-enabled B5G Networks
- A Modular Hybrid Switched-Capacitor Cross-Switched Asymmetrical Multilevel Inverter With Low-Stress Voltage and High-Power Quality Utilizing the Nearest Level Control Switching Technique
- AI-Driven Optimization of Wave-Controlled Reconfigurable Intelligent Surfaces
- FoundationStereo: Zero-Shot Stereo Matching
- A Comprehensive Performance Study of Authentication Protocols for VANETs
- A Variable Gain Nonlinear Current Controller for Torque Ripple Minimization in Switched Reluctance Motor Drives
- PartGen: Part-level 3D Generation and Reconstruction with Multi-View Diffusion Models
- Spatial-Spectral Texture-Preserved Total Variation: A Novel Regularization for Hyperspectral Image Denoising
- GazeGene: Large-scale Synthetic Gaze Dataset with 3D Eyeball Annotations
- ResCLIP: Residual Attention for Training-free Dense Vision-language Inference
- Exo-Glove Pinch: A Soft, Hand-Wearable Robot Designed Through Constrained Tendon Routing Analysis
- Real-Time Adaptive Resolution System Using Electronically Foveated Dynamic Vision Sensor for Optimized Visual Processing
- Illumination Spectrum Estimation for Multispectral Images via Surface Reflectance Modeling and Spatial-Spectral Feature Generation
- Person De-reidentification: A Variation-guided Identity Shift Modeling
- Re-HOLD: Video Hand Object Interaction Reenactment via adaptive Layout-instructed Diffusion Model
- Closed Loop Power Hardware in the Loop Back EMF and Current Emulation of a PMDC Machine