- Towards Training-free Anomaly Detection with Vision and Language Foundation Models
- CAD-Llama: Leveraging Large Language Models for Computer-Aided Design Parametric 3D Model Generation
- PIT: A Plug-and-Play Image Translator for Making Off-the-Shelf Models Adapt to Corruptions
- SGCR: Spherical Gaussians for Efficient 3D Curve Reconstruction
- High-Speed Signal Simulation and Optimization Based on DDR5
- Integrated Driver Monitoring and Real-Time Accident Detection System using IoT and Computer Vision
- Towards a Unified Charging Infrastructure: Integrating Conductive and Wireless Charging Methods
- Improving Visual and Downstream Performance of Low-Light Enhancer with Vision Foundation Models Collaboration
- Online Fault Detection in Traction PMSM Using a MEMS Accelerometer: A Deep Learning Approach
- Taste More, Taste Better: Diverse Data and Strong Model Boost Semi-Supervised Crowd Counting
- Immersive Ecological Virtual Environment for Inducing Balance Disturbances
- FRESA: Feedforward Reconstruction of Personalized Skinned Avatars from Few Images
- Human Motion Instruction Tuning
- Improving Sound Source Localization with Joint Slot Attention on Image and Audio
- Dynamic Power Tracking for Grid-Connected Microinverter PV Systems
- Seeing A 3D World in A Grain of Sand
- FlexGS: Train Once, Deploy Everywhere with Many-in-One Flexible 3D Gaussian Splatting
- Mind the Gaps: Toward a Unified Model of Multi-Cloud Firewall Configurations
- PTDiffusion: Free Lunch for Generating Optical Illusion Hidden Pictures with Phase-Transferred Diffusion Model
- Targeted Forgetting of Image Subgroups in CLIP Models
- Deep Learning-Driven IoT Framework for Smart Home Appliance Load Monitoring and Energy Optimization
- Frequency Dynamic Convolution for Dense Image Prediction
- Multi-Scale Image Deblurring Using Wavelet Transform and Attention-Based Feature Fusion
- Quantized Graph-Based Personalized DRL for Dependency-Aware Task Offloading in Heterogeneous Edge Networks
- EchoTraffic: Enhancing Traffic Anomaly Understanding with Audio-Visual Insights
- ASAP: Advancing Semantic Alignment Promotes Multi-Modal Manipulation Detecting and Grounding
- An FPGA-Accelerated Framework for Optimizing Decision Tree Ensembles in Supervised Learning
- On the Out-Of-Distribution Generalization of Large Multimodal Models
- Satellite-based Crop Monitoring and Yield Prediction: An Integrated Approach using Sentinel-2 and Soil Health Data for Precision Agriculture
- Multi-View Subspace-Enhanced Clustering Via Tensor-Nuclear Norm and Discriminative Graph
- Multi-AGV Dynamic Path Planning Method for Warehouse Logistics
- An Information-Theoretic Framework for Out-of-Distribution Generalization with Applications to Stochastic Gradient Langevin Dynamics
- A Novel Deep Learning Approach for Automatic Indian Classical Dance Style Classification
- D2SP: Dynamic Dual-Stage Purification Framework for Dual Noise Mitigation in Vision-Based Affective Recognition
- CraftsMan3D: High-fidelity Mesh Generation with 3D Native Diffusion and Interactive Geometry Refiner
- Order-One Rolling Shutter Cameras
- Development of a Hybrid Experimental Environment using PHIL for Multi-Unit Power Converter Networks
- HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation
- Leveraging RAG for Enhanced Business Intelligence with Local LLMs
- Jailbreaking the Non-Transferable Barrier via Test-Time Data Disguising
- BHViT: Binarized Hybrid Vision Transformer
- MM-OR: A Large Multimodal Operating Room Dataset for Semantic Understanding of High-Intensity Surgical Environments
- Effect of Field Contaminants on rGO-coated Flexible Leaf Wetness Sensors for In-Situ Agriculture Applications
- Heuristic Methods for Checking the Normality of Measurement Data with Graphical and Numerical Tests
- Data Analysis for Structural Health Monitoring of a Steel Jacket Offshore Platform
- Low-Rank Adaptation in Multilinear Operator Networks for Security-Preserving Incremental Learning
- Protecting Your Video Content: Disrupting Automated Video-based LLM Annotations
- GCC: Generative Color Constancy via Diffusing a Color Checker
- AI in Public Procurement: Potential and Adoption in the Competitive Tendering Process
- Unseen Visual Anomaly Generation
- Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion
- SOH Estimation of Lithium-ion Batteries using LSTM Model with Deconvoluted EIS Parameters
- SplatFlow: Multi-View Rectified Flow Model for 3D Gaussian Splatting Synthesis
- A Robust Cascade Controller Based Phase Shifted Full Bridge Converter for Electric Vehicle Applications
- An AST-guided LLM Approach for SVRF Code Synthesis
- Conformal Prediction for Zero-Shot Models
- Moving Towards Measuring Spatial Hearing Using Consumer-grade Headband EEG
- Investigating Efficient Edge Offloading Architectures for Serverless Systems
- Resilient Sensor Fusion under Adverse Sensor Failures via Multi-Modal Expert Fusion
- Reconfigurable Coding Design for Programmable Metasurface-Based DOA Estimation via Riemannian Manifold Optimization
- Sensify: A Learning-Based Budget-Aware Task Assignment in Mobile Crowdsensing
- Interpreting Object-level Foundation Models via Visual Precision Search
- TAPT: Test-Time Adversarial Prompt Tuning for Robust Inference in Vision-Language Models
- DiskVPS: Vanishing Point Detector via Hough Transform in a Disk Region
- Open-Canopy: Towards Very High Resolution Forest Monitoring
- 4Deform: Neural Surface Deformation for Robust Shape Interpolation
- From Elements to Design: A Layered Approach for Automatic Graphic Design Composition
- Efficient Dynamic mmWave Beam Selection Using Multimodal Attention-Based Approach
- Fault Detection for Train-Controlled On-Board Equipment Using a Hybrid CNN-LSTM Model
- LIM: Large Interpolator Model for Dynamic Reconstruction
- SAM-REF: Introducing Image-Prompt Synergy during Interaction for Detail Enhancement in the Segment Anything Model
- Crash Course on Quantum Computing for Engineering Students
- STF-GCN: A Multi-Domain Graph Convolution Network Method for Automatic Modulation Recognition via Adaptive Correlation
- A Systematic Approach for Continuous Monitoring and Validation of Product Properties in the Product Engineering Process
- A Multi-Time Selection Framework for Machine Translation Based on Large Language Models
- Perceptual Video Compression with Neural Wrapping
- High Dynamic Range Video Compression: A Large-Scale Benchmark Dataset and A Learned Bit-depth Scalable Compression Algorithm
- Repurposing Pre-trained Video Diffusion Models for Event-based Video Interpolation
- Point2RBox-v2: Rethinking Point-supervised Oriented Object Detection with Spatial Layout Among Instances
- SceneCrafter: Controllable Multi-View Driving Scene Editing
- ObjectMover: Generative Object Movement with Video Prior
- Partial Discharge Fault Detection Method of CNN-LSTM Based on Fusion Attention Mechanism
- FedCALM: Conflict-aware Layer-wise Mitigation for Selective Aggregation in Deeper Personalized Federated Learning
- DarkIR: Robust Low-Light Image Restoration
- Analysis of Students Stress Level using Machine Learning Algorithms
- Testing of a Concept for In-situ Detection of Humidity-Driven Degradation of IGBT Modules under Accelerated Aging
- Wavelet and Prototype Augmented Query-based Transformer for Pixel-level Surface Defect Detection
- GlyphMastero: A Glyph Encoder for High-Fidelity Scene Text Editing
- Latent Space Imaging
- Scaling Inference Time Compute for Diffusion Models
- FlashGS: Efficient 3D Gaussian Splatting for Large-scale and High-resolution Rendering
- Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields
- FoundHand: Large-Scale Domain-Specific Learning for Controllable Hand Image Generation
- Video-ColBERT: Contextualized Late Interaction for Text-to-Video Retrieval
- Flash-Split: 2D Reflection Removal with Flash Cues and Latent Diffusion Separation
- When the Future Becomes the Past: Taming Temporal Correspondence for Self-supervised Video Representation Learning
- Prototype-Based Image Prompting for Weakly Supervised Histopathological Image Segmentation
- Multi-View Multi-Scale Network for 3D Object Recognition and Retrieval
- Homogeneous Dynamics Space for Heterogeneous Humans
- Enhanced DOA Estimation for Lightning Sources Using MUSIC and Coherent Signal Subspace Method