- Gray Scale Image Colorization using Convolutional Neural Network and PyTorch
- ZoomLDM: Latent Diffusion Model for multi-scale image generation
- Unlabeled Samples Improve Few-Shot Underwater Acoustic Target Recognition
- A System for DNS Over HTTPS Deployment and Security Measurement
- TKG-DM: Training-free Chroma Key Content Generation Diffusion Model
- Interpretable Image Classification via Non-parametric Part Prototype Learning
- Design, Selection and Implementation of Conditioning Circuits for Digital Control Applications
- HiVeGen – Hierarchical LLM-based Verilog Generation for Scalable Chip Design
- EVPGS: Enhanced View Prior Guidance for Splatting-based Extrapolated View Synthesis
- LSKA-YOLO: Improved YOLO Framework Tailored for Cervical Cell Detection
- IoT-Enabled Smart Robot for Efficient Banana Harvesting and Quality Assessment
- HERA: Hybrid Explicit Representation for Ultra-Realistic Head Avatars
- Adaptive Markup Language Generation for Contextually-Grounded Visual Document Understanding
- GPVK-VL: Geometry-Preserving Virtual Keyframes for Visual Localization under Large Viewpoint Changes
- Model Predictive Networked Control for Autonomous Underwater Vehicles Tracking Control with Sequence-Based Compensation
- RUBIK: A Structured Benchmark for Image Matching across Geometric Challenges
- Research on Face Recognition Early Warning in Dangerous Areas Based on Raspberry Pi
- Split Adaptation for Pre-trained Vision Transformers
- Authentication of Bergamot Essential Oil by IR Analysis and Pattern Recognition
- Dynamic Pseudo Labeling via Gradient Cutting for High-Low Entropy Exploration
- Road Target Detection Method Based on Improved YOLOv7-Tiny
- Cross-View Completion Models are Zero-shot Correspondence Estimators
- Hyperdimensional Uncertainty Quantification for Multimodal Uncertainty Fusion in Autonomous Vehicles Perception
- Trajectory Planning and Antiswing for Rotary Crane Considering Physical Constraints and Obstacles Avoidance
- Advanced Animal Detection System with Cascaded YOLOv8 and Dynamic Feature Extraction
- Deep Learning-Based Player Behavior Modeling and Game Interaction System Optimization Research
- Stabilizing and Accelerating Autofocus with Expert Trajectory Regularized Deep Reinforcement Learning
- Detection of IoT Botnet Attacks using Hybrid Deep Learning Models
- SemGeoMo: Dynamic Contextual Human Motion Generation with Semantic and Geometric Guidance
- Simulation of Overvoltage in Photovoltaic Energy Storage System Caused by Lightning Strike
- Research on Heat Transfer Coefficient of Rotor Oil Jet Impingement Cooling of Permanent Magnet Synchronous Motor in Electric Vehicles
- EntitySAM: Segment Everything in Video
- NexusGS: Sparse View Synthesis with Epipolar Depth Priors in 3D Gaussian Splatting
- High Voltage Design Strategies for Gallium Oxide Power Devices
- UniGoal: Towards Universal Zero-shot Goal-oriented Navigation
- Large Language Models Can Achieve Explainable and Training-Free One-Shot HRRP ATR
- Research on the Solution Method of Mixed Integer Nonlinear Planning Model Based on Influence Factor
- Towards Explainable and Unprecedented Accuracy in Matching Challenging Finger Crease Patterns
- Research on Automotive Motors Suitable for LCA, Including Motors with Aluminum Windings
- UNIC-Adapter: Unified Image-Instruction Adapter with Multi-Modal Transformer for Image Generation
- InteractionMap: Improving Online Vectorized HDMap Construction with Interaction
- Advancing Generalizable Tumor Segmentation with Anomaly-Aware Open-Vocabulary Attention Maps and Frozen Foundation Diffusion Models
- Delay Optimizing Based Passivity Enhancement of Converter-Side Current Controlled LCL-Type Grid Converters
- OmniGuard: Hybrid Manipulation Localization via Augmented Versatile Deep Image Watermarking
- Multi-Resolution Pathology-Language Pre-training Model with Text-Guided Visual Representation
- High Availability Design for a Container Cloud Platform Monitoring and Management Module
- HeMoRa: Unsupervised Heuristic Consensus Sampling for Robust Point Cloud Registration
- Bridging Past and Future: End-to-End Autonomous Driving with Historical Prediction and Planning
- Leveraging Large Language Models for Cultural Heritage Digitization: A Textual Analysis of Historical Buildings
- Attend to Not Attended: Structure-then-Detail Token Merging for Post-training DiT Acceleration
- Artificial Intelligence Algorithm Decision and Optimization Methods for Complex Production Systems
- A Self-Healing Smart Grid for Railway Signalling Network
- EmoEdit: Evoking Emotions through Image Manipulation
- Joint Optimization of Multi-UAV Assisted Computation Offloading and Topological Task Routing for Consumer IoT Emerging Businesses
- InteractAnything: Zero-Shot Human Object Interaction Synthesis via LLM Feedback and Object Affordance Parsing
- A Novel Charge Pump Cell Based Modified Quadratic Boost Converter
- EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
- Dual-view X-ray Detection: Can AI Detect Prohibited Items from Dual-view X-ray Images like Humans?
- Brain Tumor Classification with ResNeXt and its Comprehensive Evaluation
- Output Power Control of Three-Phase Secondary-Resonant Single-Active-Bridge DC-DC Converter
- OmniStereo: Real-Time Omnidirectional Depth Estimation with Multiview Fisheye Cameras
- Parameter-efficient Fine-tuning in Hyperspherical Space for Open-vocabulary Semantic Segmentation
- ACL: Activating Capability of Linear Attention for Image Restoration
- Uni4D: Unifying Visual Foundation Models for 4D Modeling from a Single Video
- Stop Walking in Circles! Bailing Out Early in Projected Gradient Descent
- Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient
- Is There a Path Backward If the Cloud is Compromised?
- Mitigating Lightning Hazards in Open Mining Areas: A Mobile Lightning Protection System Approach
- Flexible and Enzyme-Free Graphene-PVDF-Au Electrode for Glucose Detection in Sweat
- VideoGEM: Training-Free Action Grounding in Videos
- Image Over Text: Transforming Formula Recognition Evaluation with Character Detection Matching
- A Focused Human Body Model for Accurate Anthropometric Measurements Extraction
- Divide, Conquer, and Match: A Distributed and Asynchronous Approach for Subgraph Isomorphism
- ReasonGrounder: LVLM-Guided Hierarchical Feature Splatting for Open-Vocabulary 3D Visual Grounding and Reasoning
- Early Streamer Emission Lightning Protection Systems: Global Views Revisited
- ChatGarment: Garment Estimation, Generation and Editing via Large Language Models
- Make It Count: Text-to-Image Generation with an Accurate Number of Objects
- GUARD: A GNN-Based Tool for Automated Unit Test Case Generation and Code Defect Prediction
- Ocularone-Bench: Benchmarking DNN Models on GPUs to Assist the Visually Impaired
- Unlearning through Knowledge Overwriting: Reversible Federated Unlearning via Selective Sparse Adapter
- Data Lab for Climate Change Adaptation: A Literature and Practice Review
- Minding Fuzzy Regions: A Data-driven Alternating Learning Paradigm for Stable Lesion Segmentation
- Understanding and Mitigating Lightning-Related Animal Fatalities: Case Studies, Injury Pathways, and Protection Measures
- ActiveGAMER: Active GAussian Mapping through Efficient Rendering
- Semiconductor Loss Balancing of a 9-Level Cascaded H-Bridge Multilevel Inverter through Novel Carrier-Reassignment PWM Scheme
- Using Powerful Prior Knowledge of Diffusion Model in Deep Unfolding Networks for Image Compressive Sensing
- StableAnimator: High-Quality Identity-Preserving Human Image Animation
- Towards a Model-Based Framework for Automated Traceable Systems and Probabilistic Model Checking
- UNEM: UNrolled Generalized EM for Transductive Few-Shot Learning
- MODA: Motion-Drift Augmentation for Inertial Human Motion Analysis
- 4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models
- Optimized Cloud Performance Through Secure Data Detachment and Reproduction
- A Three-Party Batch Authentication Based on Twisted Edwards-Curve in Mobile Edge Computing
- Few-Shot Recognition via Stage-Wise Retrieval-Augmented Finetuning
- Forensics Adapter: Adapting CLIP for Generalizable Face Forgery Detection
- VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM
- MoManipVLA: Transferring Vision-language-action Models for General Mobile Manipulation
- DreamCraft3D++: Efficient Hierarchical 3D Generation with Multi-Plane Reconstruction Model
- Gromov–Wasserstein Problem with Cyclic Symmetry
- ViUniT: Visual Unit Tests for More Robust Visual Programming