- NVILA: Efficient Frontier Visual Language Models
- iMOV – An Active Surge Protection Device for Inverter-Dominated Systems
- DAMM-Diffusion: Learning Divergence-Aware Multi-Modal Diffusion Model for Nanoparticles Distribution Prediction
- Research on the Synchronization Technology of Global Catalog Views with Multiple Nodes in Different Locations
- Pursuing Temporal-Consistent Video Virtual Try-On via Dynamic Pose Interaction
- Research on Information Fusion Algorithm for Wireless Sensor Homogeneous Networks
- One-Step Event-Driven High-Speed Autofocus
- Comparative Analysis of Lightning-Generated Electromagnetic Field Propagation in Mountainous Terrain: A Study Using 2D Axisymmetric and 3D Models
- Three-level Boost integrated Five-level Active Neutral Point Clamped Inverter for improved DC-link utilisation
- Research on the Application of Multi-Factor Authentication Technology for 5G Terminals in the Coal Industry
- BWFormer: Building Wireframe Reconstruction from Airborne LiDAR Point Cloud with Transformer
- FruitNinja: 3D Object Interior Texture Generation with Gaussian Splatting
- STDD: Spatio-Temporal Dual Diffusion for Video Generation
- EgoPressure: A Dataset for Hand Pressure and Pose Estimation in Egocentric Vision
- Investigation into the Reverse Recovery Dynamics of High-Voltage Fast Recovery Diodes
- A Novel Field-Optimization of DDQ Transmitter for Highly Uniform Rotating Magnetic Field for Wireless UAV Charging
- A Novel Soft-Switching Inverter using Auxiliary Power Supply for Capacitive Load High Voltage and High Frequency Applications
- FreeTimeGS: Free Gaussian Primitives at Anytime Anywhere for Dynamic Scene Reconstruction
- Hierarchical Feature Fusion Graph Neural Network Model for Link Prediction
- Efficient Attention in Partially Relevant Video Retrieval: A Benchmarking Study on Accuracy-Efficiency Trade-Offs
- Collaborative Service Provisioning in IIoT Systems via Service Urgency and Situation-Adaptive Goal Modeling: A Dynamic Service–Energy Trade-off
- Finite Element Analysis of a Split-Phase Induction Machine with Open-Circuit Fault and its Mitigation
- Transfer Your Perspective: Controllable 3D Generation from Any Viewpoint in a Driving Scene
- LiMoE: Mixture of LiDAR Representation Learners from Automotive Scenes
- Research Paper Finder: A Web-based Application for Efficient Academic Literature Retrieval using Web Scraping
- PillarHist: A Quantization-aware Pillar Feature Encoder based on Height-aware Histogram
- Decouple Distortion from Perception: Region Adaptive Diffusion for Extreme-low Bitrate Perception Image Compression
- Remote System for Monitoring Failures in the Cubicle Type High Voltage Receiving Equipment
- ShowMak3r: Compositional TV Show Reconstruction
- Client Selection in Federated Learning for Industry 5.0: A Heuristic-Guided Pointer Network Reinforcement Learning Approach
- Design, Analysis and Fabrication of a High Speed Inner-Hollow Outer Rotor Brushless DC Motor for Yarn Feeding Textile Machinery
- DucDiff: Dual-consistent Diffusion for Uncertainty-aware Information Diffusion Prediction
- SemiDAViL: Semi-supervised Domain Adaptation with Vision-Language Guidance for Semantic Segmentation
- UniAlign: Scaling Multimodal Alignment within One Unified Model
- VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation
- UBiGTLoc: A Unified BiLSTM-Graph Transformer Localization Framework for IoT Sensor Networks
- RL-RC-DoT: A Block-level RL agent for Task-Aware Video Compression
- FineCaption: Compositional Image Captioning Focusing on Wherever You Want at Any Granularity
- AutoSSVH: Exploring Automated Frame Sampling for Efficient Self-Supervised Video Hashing
- EgoTextVQA: Towards Egocentric Scene-Text Aware Video Question Answering
- DiGIT: Multi-Dilated Gated Encoder and Central-Adjacent Region Integrated Decoder for Temporal Action Detection Transformer
- Efficient ANN-Guided Distillation: Aligning Rate-based Features of Spiking Neural Networks through Hybrid Block-wise Replacement
- SPR fiber optic biosensor based on AI-assisted design for immunoglobulin G biomarker detection
- Integration of Textual and Structural Information for Knowledge Graph Completion
- OmniDrive: A Holistic Vision-Language Dataset for Autonomous Driving with Counterfactual Reasoning
- A Hubness Perspective on Representation Learning for Graph-Based Multi-View Clustering
- AI-Based Social Media Privacy Issues Handler
- QanDe: The Power of Binary Emulation for Obfuscation Analysis
- Long Video Diffusion Generation with Segmented Cross-Attention and Content-Rich Video Data Curation
- Induced Overvoltage at the Terminations of a 10 kV/220 V Co-Pole Overhead Distribution Lines Caused by Rocket-Triggered Lightning Strikes 10 Metres Away
- Towards Better Alignment: Training Diffusion Models with Reinforcement Learning Against Sparse Rewards
- No Pains, More Gains: Recycling Sub-Salient Patches for Efficient High-Resolution Image Recognition
- BLADE: Single-View Body Mesh Estimation through Accurate Depth Estimation
- RoadSocial: A Diverse VideoQA Dataset and Benchmark for Road Event Understanding from Social Video Narratives
- Dynamic Motion Blending for Versatile Motion Editing
- Audio-Visual Semantic Graph Network for Audio-Visual Event Localization
- UVGS: Reimagining Unstructured 3D Gaussian Splatting using UV Mapping
- Virtual Wind Tower Siting Method Based on Multi-Objective Mayfly Optimization Algorithm
- VISTA3D: A Unified Segmentation Foundation Model For 3D Medical Imaging
- Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation
- A Sustainable Learning Framework: UAV-Based Oat Chlorophyll Monitoring Using Radiative Transfer Models and Deep Learning Techniques
- Vortex Retarder Empowered Full-Stokes Parameter Measurement via Single-Shot RGB Color Imaging
- Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator
- Augmented Deep Contexts for Spatially Embedded Video Coding
- Simultaneous Enhancement of Electrochemical Migration Lifetime and Reliability of Sintered Silver
- Trajectory Mamba: Efficient Attention-Mamba Forecasting Model Based on Selective SSM
- Human-centered Interactive Learning via MLLMs for Text-to-Image Person Re-identification
- Security of Dynamically Reconfigurable RISC-V Systems: I/O Attack Focus
- Classic Video Denoising in a Machine Learning World: Robust, Fast, and Controllable
- Overcoming Shortcut Problem in VLM for Robust Out-of-Distribution Detection
- VasTSD: Learning 3D Vascular Tree-state Space Diffusion Model for Angiography Synthesis
- Mono2Stereo: A Benchmark and Empirical Study for Stereo Conversion
- INFP: Audio-Driven Interactive Head Generation in Dyadic Conversations
- ADU: Adaptive Detection of Unknown Categories in Black-Box Domain Adaptation
- Video Language Model Pretraining with Spatio-temporal Masking
- Domain Adaptive Diabetic Retinopathy Grading with Model Absence and Flowing Data
- Energy Efficient Scheduling of AI/ML Workloads on Multi-Instance GPUs with Dynamic Repartitioning
- Video Depth Anything: Consistent Depth Estimation for Super-Long Videos
- Low Latency Depth of Field Fusion System and Method Employing FPGA for Autonomous Driving
- Decoder Gradient Shield: Provable and High-Fidelity Prevention of Gradient-Based Box-Free Watermark Removal
- AI-Enhanced Detection of Dynamic Structural Changes in Inflammatory Protein Interfaces: A Case Study of CD11b/Mac-1 Interactions
- VisionArena: 230K Real World User-VLM Conversations with Preference Labels
- SCSGuardian: A Practical Hardware Defense against Speculative Cache Side-Channel Attacks
- Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models
- A Method for Evaluating a Series Hybrid System Using a DC-Input Direct Electric-Power Converter (D-EPC) in Mode Driving with a Virtual Vehicle Model
- GUI-Xplore: Empowering Generalizable GUI Agents with One Exploration
- Input Series Output Parallel Connection based Fault Tolerant LV Power Supply in Automotive Applications
- Harnessing Frozen Unimodal Encoders for Flexible Multimodal Alignment
- DiffCAM: Data-Driven Saliency Maps by Capturing Feature Differences
- HomoGen: Enhanced Video Inpainting via Homography Propagation and Diffusion
- ML Enabled Parallel R-C Sensor for Level and Electrical Conductivity Measurement
- MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research
- FedMIA: An Effective Membership Inference Attack Exploiting "All for One" Principle in Federated Learning
- MDCT-DPANet: Dual-Path Attention Network for Multi-Channel Speech Separation
- Foggy Target Detection Algorithm Based on CBAM-FE and SPD-Conv
- Relationship Between GDT Follow-Current Phenomenon and Active Gases
- BiM-VFI: Bidirectional Motion Field-Guided Frame Interpolation for Video with Non-uniform Motions
- Goku: Flow Based Video Generative Foundation Models
- Control of Utility Interfaced PEM Fuel Cell, Solar Energy Conversion System and Battery Storage
- CADDreamer: CAD Object Generation from Single-view Images