ViT-NeBLa: A Hybrid Vision Transformer and Neural Beer–Lambert Framework for Single-View 3D Reconstruction of Oral Anatomy from Panoramic Radiographs
Author(s) -
Bikram Keshari Parida,
Anusree P. Sunilkumar,
Abhijit Sen,
Wonsang You
Publication year - 2025
Publication title -
IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3613789
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
Panoramic radiography (PX) is widely used in dentistry but provides only a flattened 2D view; Cone Beam Computed Tomography (CBCT) recovers 3D anatomy at higher dose and cost. We tackle full 3D reconstruction from a single real-world PX across different patients. We propose ViT–NeBLa (Vision-Transformer Neural Beer–Lambert), a physics-guided framework that estimates a continuous 3D dentoalveolar density field directly from a single PX. The design mirrors panoramic acquisition while simplifying the inverse problem: we parameterize projection-ray directions by tangency to a patient-adaptive elliptical path and independently restrict sampling to a jaw-focused horseshoe (focal trough) in which the projection rays do not intersect. This removes the intermediate density-aggregation step used by overlapping-ray methods and reduces per-ray samples by about 52%, lowering memory and compute. A hybrid ViT–CNN backbone extracts global anatomical context and local texture from the PX, and a learnable multi-resolution hash positional encoding maps 3D sample coordinates to expressive features that preserve fine dental and osseous detail. Per-point densities are predicted by a compact MLP and accumulated into a coarse 3D grid, which a lightweight 3D U-Net refines into the final volume. Training is end-to-end, using synthetic PX rendered from CBCT via the Beer–Lambert law together with voxelwise, projection-consistency, and perceptual losses; at inference, a single PX is processed directly, without CBCT flattening or dental-arch priors. Experiments show that ViT–NeBLa outperforms contemporary PX-to-3D baselines both quantitatively and qualitatively, yielding sharper cortical boundaries, clearer trabecular patterns, and fewer artifacts. In sum, ViT–NeBLa provides a radiation-efficient route to clinically informative 3D visualization from routine panoramic radiographs while simplifying geometry, reducing sampling, and preserving high-frequency structure.
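
The abstract states that synthetic PX are rendered from CBCT via the Beer–Lambert law but does not give the discretization. As a rough illustration only, the law can be sketched as a sum of attenuation samples along each ray, scaled by the sample spacing and passed through a negative exponential. The function name, the uniform step size, and the toy values below are assumptions made for this sketch and are not taken from the paper.

```python
import numpy as np

def beer_lambert_projection(mu, step, i0=1.0):
    """Transmitted intensity for rays sampled along the last axis.

    mu   : array of shape (..., n_samples), attenuation per sample (1/length)
    step : spacing between consecutive samples on a ray (assumed uniform)
    i0   : incident intensity (assumed identical for all rays)
    """
    mu = np.asarray(mu, dtype=float)
    # Riemann-sum approximation of the line integral of mu along each ray.
    line_integral = mu.sum(axis=-1) * step
    # Beer-Lambert law: I = I0 * exp(-integral of mu dl).
    return i0 * np.exp(-line_integral)

# Toy example (hypothetical values): one ray through empty space,
# one ray through uniformly attenuating material.
rays = np.array([
    [0.0, 0.0, 0.0, 0.0],
    [0.5, 0.5, 0.5, 0.5],
])
intensity = beer_lambert_projection(rays, step=0.5)
print(intensity)
```

In this form each ray is integrated independently of the others, which is the situation the abstract describes for rays restricted to a region where they do not intersect.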
