- A Two-Stage Intent Recognition Framework in Multiagent Environments
- SEI-DSR: Specific Emitter Identification via Dynamic Sparse Attention and Multidimensional Feature Reconstruction
- A Selective Re-Learning Mechanism for Hyperspectral Fusion Imaging
- Spatial Transport Optimization by Repositioning Attention Map for Training-Free Text-to-Image Synthesis
- Tripartite Weight-Space Ensemble for Few-Shot Class-Incremental Learning
- Methodology for Business Value Analysis of Innovative IT in a Business Sector. The Case of the Material Supply Chain
- The Application Progress of Power Batteries in New Energy Ships
- Control Strategy of Permanent Magnet Synchronous Motor Based on Improved Sliding Mode Controller
- LATENT: LLM-Augmented Trojan Insertion and Evaluation Framework for Analog Netlist Topologies
- Quantization without Tears
- Efficient Personalization of Quantized Diffusion Model without Backpropagation
- Learning Class Prototypes for Unified Sparse-Supervised 3D Object Detection
- Power Equipment Health Monitoring: Portable Inspection Technology Based on Big Data and AI
- Analysing Loss Mechanisms in PSFB Current Doublers for Telecom Tower Applications: Impact of Frequency and Power Level
- DeformCL: Learning Deformable Centerline Representation for Vessel Extraction in 3D Medical Image
- A Practical Approach to Video Capture and Control Using FPGA and Cortex-A7
- Learning weakly monotone operators for convergent Plug-and-Play PET reconstruction
- Design and Research of a Tea Production Process Interactive Educational Device Based on Arduino
- XLRS-Bench: Could Your Multimodal LLMs Understand Extremely Large Ultra-High-Resolution Remote Sensing Imagery?
- Research on Position Sensorless Torque Ripple Suppression Method for Six-Phase PMSM
- Comprehensive Analysis for Rheumatoid Arthritis Detection using Deep Learning Models
- CorrBEV: Multi-View 3D Object Detection by Correlation Learning with Multi-modal Prototypes
- Open-World Amodal Appearance Completion
- A novel hybrid distribution transformer with integrated flexible voltage and current compensation capability
- Beyond Capabilities: How Indian R&D Subsidiaries Use Issue Selling to Shape Power and Innovation Mandates in a MNC
- Filter Images First, Generate Instructions Later: Pre-Instruction Data Selection for Visual Instruction Tuning
- BERT-Based Joint Task Approach for Named Entity Recognition
- GIFStream: 4D Gaussian-Based Immersive Video with Feature Stream
- Distraction is All You Need for Multimodal Large Language Model Jailbreaking
- Image Captioning with Multi-Scale Dilated Attention Mechanism
- Distilling Spectral Graph for Object-Context Aware Open-Vocabulary Semantic Segmentation
- Chain of Attack: On the Robustness of Vision-Language Models Against Transfer-Based Adversarial Attacks
- Design of Transformer Turn-ratio for Maximizing the ZVS Region of Dual Active Bridge Converter
- UnCommon Objects in 3D
- Question-Aware Gaussian Experts for Audio-Visual Question Answering
- UrbanCAD: Towards Highly Controllable and Photorealistic 3D Vehicles for Urban Scene Simulation
- Application of Quantum Machine Learning in Genomic Data Analysis Using Quantum Support Vector Machines (QSVM)
- Vid2Sim: Realistic and Interactive Simulation from Video for Urban Navigation
- Frequency investigation of bio-polymer based motion sensors
- Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
- Parallel Fractal Decomposition Optimization Algorithms on Heterogeneous Architectures
- SoundVista: Novel-View Ambient Sound Synthesis via Visual-Acoustic Binding
- Neuron: Learning Context-Aware Evolving Representations for Zero-Shot Skeleton Action Recognition
- Acc3D: Accelerating Single Image to 3D Diffusion Models via Edge Consistency Guided Score Distillation
- Low-cost IoT for Seismic Monitoring: Experimental Evaluation of On-board Sensor Fusion for Noise Reduction
- Token Cropr: Faster ViTs for Quite a Few Tasks
- ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI Systems
- MAGE: Single Image to Material-Aware 3D via the Multi-View G-Buffer Estimation Model
- Object-Shot Enhanced Grounding Network for Egocentric Video
- Observation and Analysis of a Multiple Lightning Strike Based on Dynamic Vision
- Face Forgery Video Detection via Temporal Forgery Cue Unraveling
- Scattering Center Modeling Of Complex Targets Under Cross-polarization
- Obstacle Avoidance Distributed Tracking of Networked UAVs with Online Path Planning
- Event-Equalized Dense Video Captioning
- Enhancing Generalization in Video Anomaly Detection through Multimodal Data Mixing
- Continuous Space-Time Video Resampling with Invertible Motion Steganography
- Statistical Model Limitations of Ground Flash Density for Lightning Risk Assessment
- Random Conditioning with Distillation for Data-Efficient Diffusion Model Compression
- Adversarial Robust Salient Object Detection in Optical Remote Sensing Images with Implicit Feature Enhancement
- Galaxy Walker: Geometry-aware VLMs For Galaxy-scale Understanding
- Scheduling Strategies for Partially-Replicable Task Chains on Two Types of Resources
- Luminance-GS: Adapting 3D Gaussian Splatting to Challenging Lighting Conditions with View-Adaptive Curve Adjustment
- Determining Jaundice Severity in Neonates using Optical Spectroscopy
- Automatic Joint Structured Pruning and Quantization for Efficient Neural Network Training and Compression
- Capacitor Ripple and CM Noise Mitigation Oriented Design of 800 V Dual Traction Inverters
- Adapting to Observation Length of Trajectory Prediction via Contrastive Learning
- Unlocking the Potential of Unlabeled Data in Semi-Supervised Domain Generalization
- A Machine Anomalous Sound Detection Method Based on Deep Residual Generative Adversarial Network
- Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis
- Robust Methodology Design to Predict Opioid Overdose System based on AI Assisted Deep Learning Principles
- DeSplat: Decomposed Gaussian Splatting for Distractor-Free Rendering
- T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation
- Multi-Modal Aerial-Ground Cross-View Place Recognition with Neural ODEs
- One Model for ALL: Low-Level Task Interaction Is a Key to Task-Agnostic Image Fusion
- Development of a Wireless Charging System Using A Reconfigurable Dynamic Inductive Power Transfer Technology with Constant Voltage and Constant Current Charging
- Hyperbolic Safety-Aware Vision-Language Models
- MultiGO: Towards Multi-level Geometry Learning for Monocular 3D Textured Human Reconstruction
- AlignMamba: Enhancing Multimodal Mamba with Local and Global Cross-Modal Alignment
- Overview of Research on Low-Resource Language Machine Translation Based on Artificial Intelligence
- Multi-modal Medical Diagnosis via Large-small Model Collaboration
- TSP-Mamba: The Travelling Salesman Problem Meets Mamba for Image Super-resolution and Beyond
- Calibrated Uncertainty Estimation for Trustworthy Deep IoT Attack Detection
- Unveil Inversion and Invariance in Flow Transformer for Versatile Image Editing
- High-Voltage Pulse Modulator for S-Band Klystron in IR-FEL RF System
- Three-view Focal Length Recovery From Homographies
- T-CIL: Temperature Scaling using Adversarial Perturbation for Calibration in Class-Incremental Learning
- MaDCoW: Marginal Distortion Correction for Wide-Angle Photography with Arbitrary Objects
- SOAP: Vision-Centric 3D Semantic Scene Completion with Scene-Adaptive Decoder and Occluded Region-Aware View Projection
- An Intelligent Prediction Method for Safety Margins of Flexible Thermal Power Units Based on Pipeline Creep Life Damage
- Enhancing Scene Coordinate Regression with Efficient Keypoint Detection and Sequential Information
- Self-Sustaining Multi-Sensor LoRa-Based Activity Monitoring for Community Workout Parks
- Real-Time Fire Detection Algorithm Based on Improved RT-DETR
- Modeling and Analysis of a Multipole Permanent Magnet Assisted Synchronous Reluctance Machine for Electric Vehicles
- SVG-IR: Spatially-Varying Gaussian Splatting for Inverse Rendering
- Scaling Down Text Encoders of Text-to-Image Diffusion Models
- From Faces to Voices: Learning Hierarchical Representations for High-quality Video-to-Speech
- Real-Time Sampling-Based Safe Motion Planning for Robotic Manipulators in Dynamic Environments
- Design and Analysis of a PSFB Current Doubler for VRFB: Impact of Magnetic Components and Snubber Circuit Requirements
- Low Complexity Detectors For Spectral and Energy Efficient GQSM-OTFS System
- DeRS: Towards Extremely Efficient Upcycled Mixture-of-Experts Models