- LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding
- PIDSR: Complementary Polarized Image Demosaicing and Super-Resolution
- EventPSR: Surface Normal and Reflectance Estimation from Photometric Stereo Using an Event Camera
- CPath-Omni: A Unified Multimodal Foundation Model for Patch and Whole Slide Image Analysis in Computational Pathology
- MirrorVerse: Pushing Diffusion Models to Realistically Reflect the World
- Instance-wise Supervision-level Optimization in Active Learning
- Landing Point Prediction of Aircraft Based on Transfer Learning Method
- Performance Comparison of Feature Selection Methods for Machine Learning Models on DDoS Attack Dataset
- Novel Virtual Resistance Method To Maximize Current Utilization by Grid Forming Inverter During Asymmetrical Fault Ride Through
- One is Plenty: A Polymorphic Feature Interpreter for Immutable Heterogeneous Collaborative Perception
- Empowering LLMs to Understand and Generate Complex Vector Graphics
- A Novel 36kV, 24A DC Power Supply with 24-Pulsed Input and Ripple Free Output Capability
- Elucidating the Reactivity of Reflectance-Based Core-Body Photoplethysmogram to Posture and Respiratory Changes
- OW-OVD: Unified Open World and Open Vocabulary Object Detection
- SapiensID: Foundation for Human Recognition
- ANNEXE: Unified Analyzing, Answering, and Pixel Grounding for Egocentric Interaction
- The Illusion of Unlearning: The Unstable Nature of Machine Unlearning in Text-to-Image Diffusion Models
- NN-Former: Rethinking Graph Structure in Neural Architecture Representation
- Joint Optimal Allocation of Radio and Computational Resources Aiming at Minimizing Global Average Task Offloading Age for Long-Term Multi-Cell MEC Systems
- SAFE: Semantic Adaptive Feature Extraction with Rate Control for 6G Wireless Communications
- FedCDC: Efficient Similarity Identification in Clustered Federated Learning via Community Detection on Non-IID Data
- Alignment, Mining and Fusion: Representation Alignment with Hard Negative Mining and Selective Knowledge Fusion for Medical Visual Question Answering
- Degradation-Aware Feature Perturbation for All-in-One Image Restoration
- Apollo: An Exploration of Video Understanding in Large Multimodal Models
- Heart Diseases Prediction using Machine Learning Algorithm
- Morpheus: Text-Driven 3D Gaussian Splat Shape and Color Stylization
- A Yolov8-Based Object Detection Framework with Moiré Pattern Removal
- IoT-Enabled Smart Robot for Efficient Banana Harvesting and Quality Assessment
- MoST: Efficient Monarch Sparse Tuning for 3D Representation Learning
- The Art of Deception: Color Visual Illusions and Diffusion Models
- GS-DiT: Advancing Video Generation with Dynamic 3D Gaussian Fields through Efficient Dense 3D Point Tracking
- HOT: Hadamard-based Optimized Training
- Model Predictive Networked Control for Autonomous Underwater Vehicles Tracking Control with Sequence-Based Compensation
- ComRoPE: Scalable and Robust Rotary Position Embedding Parameterized by Trainable Commuting Angle Matrices
- GigaHands: A Massive Annotated Dataset of Bimanual Hand Activities
- Multi-User Privacy-Preserving and Verifiable Spatial-Feature Data Query for IoT Clouds
- Research on Face Recognition Early Warning in Dangerous Areas Based on Raspberry Pi
- See Further When Clear: Curriculum Consistency Model
- Forensics-Bench: A Comprehensive Forgery Detection Benchmark Suite for Large Vision Language Models
- Semantic-guided Cross-Modal Prompt Learning for Skeleton-based Zero-shot Action Recognition
- Rumor Detection Based on Supervised Multiprototype Contrastive Learning
- Rethinking Training for De-biasing Text-to-Image Generation: Unlocking the Potential of Stable Diffusion
- BimArt: A Unified Approach for the Synthesis of 3D Bimanual Interaction with Articulated Objects
- Vision-Language Models Do Not Understand Negation
- Pairbot: Enhancing Computational Capabilities by Pairing of Autonomous Mobile Robots
- Detection of IoT Botnet Attacks using Hybrid Deep Learning Models
- DepthCues: Evaluating Monocular Depth Perception in Large Vision Models
- Towards Efficient Allocation of Tasks in Dag-Based Workflows Across Federated Fog Systems
- Joint User Priority and Power Scheduling for QoS-Aware WMMSE Precoding: A Constrained-Actor Attentive-Critic Approach
- SemGeoMo: Dynamic Contextual Human Motion Generation with Semantic and Geometric Guidance
- Research on Multi-Class Component Detection of Transmission Lines Based on Improved YOLO11
- Simulation of Overvoltage in Photovoltaic Energy Storage System Caused by Lightning Strike
- Research on Heat Transfer Coefficient of Rotor Oil Jet Impingement Cooling of Permanent Magnet Synchronous Motor in Electric Vehicles
- Hyperbolic Uncertainty-Aware Few-Shot Incremental Point Cloud Segmentation
- M³-VOS: Multi-Phase, Multi-Transition, and Multi-Scenery Video Object Segmentation
- Improving LLM-Powered EDA Assistants with RAFT
- Characterizing the Influence of Circuit Parasitics and Operating Conditions on a Passive Regenerative Snubber for Phase-Shifted Full-Bridge Converter
- Hydrogen Fuel Cell-Hybrid Energy Storage Ship Integrated Power System Design
- PBR-NeRF: Inverse Rendering with Physics-Based Neural Fields
- Pathways on the Image Manifold: Image Editing via Video Generation
- Timescape Museum in Virtual Reality With Blender and Unity
- Virtual Reality Capabilities for Robot Programming by Demonstration
- A Study on HMI Design of Intelligent Networked Vehicles Based on KJ-AHP
- Realistic Test-Time Adaptation of Vision-Language Models
- Intelligent lighting system control scheme assessment using analytic hierarchy process (AHP): A case study of crew restaurant on the ship
- Perceptually Accurate 3D Talking Head Generation: New Definitions, Speech-Mesh Representation, and Evaluation Metrics
- Optimization of Chinese Diplomatic Rhetoric Generation Based on LoRA Efficient Fine-Tuning
- Green Edge Computing Based IoV Dynamic Task Collaborative Strategy
- Incremental Survivability Enhancement in Mobile Crowdsensing Systems
- Research on E-Commerce Sales Forecasting Based on ARIMA
- MIRE: Matched Implicit Neural Representations
- A New Power Flow Controller for HVDC Grids and its Protection against Ground Faults
- Diabetic Foot Ulcer using Machine Learning
- A Compliant Mandrel-Based Fiber Optic Hydrophone for Underwater Acoustic Sensing
- OmniGuard: Hybrid Manipulation Localization via Augmented Versatile Deep Image Watermarking
- Attend to Not Attended: Structure-then-Detail Token Merging for Post-training DiT Acceleration
- Online Text Risk Assessment Based on the BERT-CNN-BiLSTM Model
- A Multiphysics Reservoir Computing System with Mass-Spring Metamaterials and Spintronic Readout for Vibration Analysis
- Self-Supervised Cross-View Correspondence with Predictive Cycle Consistency
- R-TPT: Improving Adversarial Robustness of Vision-Language Models through Test-Time Prompt Tuning
- Optimal Planning of Battery Swapping Stations for e-Taxis in a Coupled Transportation-Distribution Network
- A self-healing smart grid for Railway Signalling Network
- Non-Natural Image Understanding with Advancing Frequency-based Vision Encoders
- A Novel Charge Pump Cell Based Modified Quadratic Boost Converter
- Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training
- LL-Localizer: A Life-Long Localization System based on Dynamic i-Octree
- A Multi-mode Adaptive Switching Vibration Control Strategy For Marine Turbines
- Stop learning it all to mitigate visual hallucination, Focus on the hallucination target
- Synthetic Visual Genome
- Volumetric Surfaces: Representing Fuzzy Geometries with Layered Meshes
- LaTexBlend: Scaling Multi-concept Customized Generation with Latent Textual Blending
- Task-Agnostic Guided Feature Expansion for Class-Incremental Learning
- Spectral Characterization of an LC-Based Optical Shutter as Modulator for Secure VLC-based Wireless Sensing in V2X Scenarios
- An Indoor Mobility Assistance System for Perception Enhancement of the Visually Impaired
- AᵀA: Adaptive Transformation Agent for Text-Guided Subject-Position Variable Background Inpainting
- Experimental Evaluation of Efficiency and Power Distribution Control by 3-Level Inverter Drive for DC-inputs Direct Electric Power Converter (D-EPC)
- Deep Reinforcement Learning-Based Adaptive Bandpass Filter With Reconfigurable Frequency and Bandwidth
- Multi-Label Black-Box Attacks via Evolutionary Structured Many-Objective Adversarial Perturbations
- Single-Stage Solar PV Wireless Charging System for Electric Vehicles Ultra-Wide Voltage Applications
- Towards a More Efficient Traffic Management System: Implementing a Density-Sensitive Canny Edge Detection Algorithm for Real-Time Congestion Mitigation