- Analyzing 16,193 LLM Papers for Fun and Profits
- Revisiting Audio-Visual Segmentation with Vision-Centric Transformer
- FRAME: Floor-aligned Representation for Avatar Motion from Egocentric Video
- Rethinking End-to-End 2D to 3D Scene Segmentation in Gaussian Splatting
- Taming Teacher Forcing for Masked Autoregressive Video Generation
- CASP: Compression of Large Multimodal Models Based on Attention Sparsity
- Paint by Inpaint: Learning to Add Image Objects by Removing Them First
- Exploring CLIP’s Dense Knowledge for Weakly Supervised Semantic Segmentation
- Air Quality Predictive Analysis using Empirical Mode Decomposition with Adaptive Noise-Bee Colony Optimization
- High Throughput Low Latency Network Intrusion Detection on FPGAs: A Raw Packet Approach
- Energy Flexibility Optimization in Industry: A Hybrid Approach with Synthetic Data Evaluation
- Novel PWM Operation Scheme for Enhancing Hot Carrier Injection Reliability in n-LDMOS Devices
- Generic serial communication implementation in Texas Instruments’ MCU to support edge AI applications
- Neuro-Symbolic Evaluation of Text-to-Video Models using Formal Verification
- A Blockchain-Assisted Model for Data Security and Supervision in Unmanned Coal Measurement: Towards Sustainable Industry
- Holmes-VAU: Towards Long-term Video Anomaly Understanding at Any Granularity
- Low-Altitude UAV Trajectory Optimization for Complex 3D Terrains Based on Energy Consumption
- Real Time Capacity Estimation For Lithium-Ion Battery Using Deep Transfer Learning
- Tracking and Synchronization Control of the 4WISBW System Considering Uncertain Network Communication Delay
- Do We Always Need the Simplicity Bias? Looking for Optimal Inductive Biases in the Wild
- Enhancing Network Resilience Through Automated Rule-Based Monitoring, Alerting, and Proactive Mitigation
- Separation of powers: On segregating knowledge from observation in LLM-enabled knowledge-based visual question answering
- Contrastive Learning-Based Agent Modeling for Deep Reinforcement Learning
- A Novel Formation Control Strategy for USVs With Improved DDPG: Simulation and Field Test
- RePerformer: Immersive Human-centric Volumetric Videos from Playback to Photoreal Reperformance
- AnimateAnything: Consistent and Controllable Animation for Video Generation
- Decoupling of Mixed Reliability Degradation Mechanisms in LDMOS Using Ultrafast Measurement and Neural Network
- HOTFormerLoc: Hierarchical Octree Transformer for Versatile Lidar Place Recognition Across Ground and Aerial Views
- STDD: Spatio-Temporal Dual Diffusion for Video Generation
- Differentiable Predictive Control for Power Electronic Systems
- Articulated Kinematics Distillation from Video Diffusion Models
- SpecTRe-GS: Modeling Highly Specular Surfaces with Reflected Nearby Objects by Tracing Rays in 3D Gaussian Splatting
- DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation
- EntityErasure: Erasing Entity Cleanly via Amodal Entity Segmentation and Completion
- Current Measurement of GaN HEMTs Without Insertion Impedance and Unaffected by Magnetic Field Noise Using Two Optical Probe Electric Current Sensors
- HiLoTs: High-Low Temporal Sensitive Representation Learning for Semi-Supervised LiDAR Segmentation in Autonomous Driving
- Is this Generated Person Existed in Real-world? Fine-grained Detecting and Calibrating Abnormal Human-body
- TaoAvatar: Real-Time Lifelike Full-Body Talking Avatars for Augmented Reality via 3D Gaussian Splatting
- Predicting ocular diseases using SqueezeNet as feature maps with Convolutional Neural Networks
- Fully Distributed Design for Synchronization of Discrete-Time Multi-Agent Systems using State Feedback Protocols
- Data-Driven Propulsion System Fault Diagnosis for Deep-Sea Submersible
- Circumventing shortcuts in audio-visual deepfake detection datasets with unsupervised learning
- Time of the Flight of the Gaussians: Optimizing Depth Indirectly in Dynamic Radiance Fields
- Less Attention is More: Prompt Transformer for Generalized Category Discovery
- Self-Commissioning Single-Inductor Dual-Output (SIDO) DC-DC Bi-Polar Converter
- DViN: Dynamic Visual Routing Network for Weakly Supervised Referring Expression Comprehension
- Evaluating Expansion Memory for Optimizer State Offloading for Large Transformer Models
- Allocating Battery Energy Storage System in Droop Controlled Islanded Microgrid Considering Uncertainties
- AvatarArtist: Open-Domain 4D Avatarization
- A New Strategy to Detect and Localize Interturn Short-Circuit Fault in Medium-frequency-Transformer of Dual-Active-Bridge Converter
- Reinforcement Learning Reward Function Evaluator for USV Straight-Path Following
- Electronic Healthcare Data Sharing Application Based on Hyperledger Consortium Blockchain Network
- A Time-Domain Integration Comparison Scheme With Noise Immunity for Wake-Up Receivers
- FLAIR: VLM with Fine-grained Language-informed Image Representations
- PLeaS — Merging Models with Permutations and Least Squares
- CASP: Consistency-aware Audio-induced Saliency Prediction Model for Omnidirectional Video
- Cross-Granularity Relation-Aware Network for Visual Intention Understanding
- Nested Diffusion Models Using Hierarchical Latent Priors
- LATTE-MV: Learning to Anticipate Table Tennis Hits from Monocular Videos
- A Modified Carrier-Based PWM with High DC Voltage Utilization for Three-Level Inverters with Unbalanced Neutral-Point Voltage
- Physics-Informed Neural Network for Parameter Identification: a Buck Converter Case Study
- Adaptive Neural Optimal Backstepping Control for Heterogeneous Multi-Agent Systems With Non-Cooperative Target via Identifier-Critic-Actor Algorithm
- PCM: Picard Consistency Model for Fast Parallel Sampling of Diffusion Models
- Efficient Fine-Tuning and Concept Suppression for Pruned Diffusion Models
- V2X-R: Cooperative LiDAR-4D Radar Fusion with Denoising Diffusion for 3D Object Detection
- IoT-Enabled ExoLimb: A Cost-Effective Exoskeleton for Enhanced Mobility and Medical Rehabilitation
- Gate-Tunable Photoresponse in $\text{SnSe}_{2}$ Field Effect Transistors
- Dynamic Updates for Language Adaptation in Visual-Language Tracking
- Scene-Centric Unsupervised Panoptic Segmentation
- Hazy Low-Quality Satellite Video Restoration Via Learning Optimal Joint Degradation Patterns and Continuous-Scale Super-Resolution Reconstruction
- Methodology for Developing Inclusive Social Robots: A User-Centered Design Perspective
- StdGEN: Semantic-Decomposed 3D Character Generation from Single Images
- A 950 MHz SIMT Soft Processor
- ReconDreamer: Crafting World Models for Driving Scene Reconstruction via Online Restoration
- SAT-HMR: Real-Time Multi-Person 3D Mesh Estimation via Scale-Adaptive Tokens
- Analytical Modelling of Turn off characteristics in a SiC MOSFET based half bridge configuration
- Ferret: An Efficient Online Continual Learning Framework under Varying Memory Constraints
- FLARE: Feed-Forward Geometry, Appearance and Camera Estimation from Uncalibrated Sparse Views
- MBQ: Modality-Balanced Quantization for Large Vision-Language Models
- Treetap9: a forestry tool for measuring standing tree stiffness
- An Integrative Learning-Based Biclustering Algorithm for Cancer Multi-Omics Data
- Joint LDPC Code and Spreading Optimization for Multi-user Communications
- Evolving High-Quality Rendering and Reconstruction in a Unified Framework with Contribution-Adaptive Regularization
- Enhancing Agricultural Decision Making with Machine Learning
- Learning Flow Fields in Attention for Controllable Person Image Generation
- STAR-Edge: Structure-aware Local Spherical Curve Representation for Thin-walled Edge Extraction from Unstructured Point Clouds
- Towards More General Video-based Deepfake Detection through Facial Component Guided Adaptation for Foundation Model
- Block Epsilon-Circulant Preconditioning with GPU-Accelerated Spatial Solvers for Linear Time-Dependent PDEs
- From Zero to Detail: Deconstructing Ultra-High-Definition Image Restoration from Progressive Spectral Perspective
- Be More Specific: Evaluating Object-centric Realism in Synthetic Images
- VideoComp: Advancing Fine-Grained Compositional and Temporal Alignment in Video-Text Models
- Fusion-Based Additive Manufacturing of Hastelloy C-Series: A Comparative Study on Microstructure, Mechanical Properties, and Residual Stress
- TSUE: A Two-Stage Data Update Method for an Erasure Coded Cluster File System
- HumanMM: Global Human Motion Recovery from Multi-shot Videos
- Benchmarking Floating Point Performance of Massively Parallel Dataflow Overlays on AMD Versal Compute Primitives
- DQN-Based Vertical Handover Strategy for Satellite and Terrestrial Integrated Vehicular Networks
- Material Anything: Generating Materials for Any 3D Object via Diffusion
- RoomPainter: View-Integrated Diffusion for Consistent Indoor Scene Texturing
- MaskGWM: A Generalizable Driving World Model with Video Mask Reconstruction
- Revisiting Source-Free Domain Adaptation: Insights into Representativeness, Generalization, and Variety