Hello, my name is Haotian Bai (白皓天). I am currently pursuing a Ph.D. in computer vision in the AI Thrust at HKUST’s Guangzhou campus, supervised by Prof. Hui Xiong (熊辉).

My primary research interests are in 3D reconstruction, with a focus on Neural Radiance Fields (NeRF) and their applications using multi-modal data. I am also interested in robotics, specifically simultaneous localization and mapping (SLAM). My papers have been accepted at prestigious AI conferences such as CVPR, ICCV, ECCV, and NeurIPS. I also serve as a reviewer for well-known journals including TPAMI, TOMM, and TIP. Please feel free to comment on GitHub for further discussion of my work.

In my spare time, I enjoy cooking, playing basketball, listening to music, and learning about starting a business.

Recently,

  • I am investigating 3D/4D scene reconstruction and generation.

🔥 News

  • 2024.11: I joined the research team led by Prof. Hui Xiong (熊辉).
  • 2023.09: I joined AI Thrust, Info Hub at HKUST(GZ) as a Ph.D. student.
  • 2023.07: 🎉🎉 One paper was accepted to ICCV 2023.
  • 2023.03: 🎉🎉 One paper was accepted to CVPR 2023 as a highlight (top 2.5%).
  • 2022.07: 🎉🎉 My first paper was accepted to ECCV 2022.
  • 2022.05: I joined AI Thrust, Info Hub at HKUST(GZ) as a research assistant supervised by Addison Lin Wang (王林).
  • 2021.07: I joined the School of Data Science, CUHKSZ as a research assistant supervised by Ruimao Zhang (张瑞茂).
  • 2021.06: I graduated from Shanghai University with a bachelor’s degree in computer science and engineering.
  • 2019.09: I was selected as one of the Chinese undergraduate student representatives (1/300) to attend the Global Grand Challenges Summit in London.
  • 2019.06: I was selected as one of the undergraduate student representatives (1/30) to join the leadership program at the Wharton School, UPenn.

📝 Publications

arXiv

High-Fidelity Mask-free Neural Surface Reconstruction for Virtual Reality

Haotian Bai, Yize Chen, Lin Wang

Project | Video

  • A novel rendering-based framework for neural implicit surface reconstruction, aiming to recover compact and precise surfaces without multi-view object masks.
  • Since overlaps in images implicitly identify the surface that a user intends to capture, Hi-NeuS uses multi-view rendering weights to guide the signed distance functions of neural surfaces in a self-supervised manner.

arXiv

CompoNeRF: Text-guided Multi-object Compositional NeRF with Editable 3D Scene Layout

Haotian Bai, Yuanhuiyi Lyu, Lutao Jiang, Sijia Li, Haonan Lu, Xiaodong Lin, Lin Wang

Project | Video

  • A novel framework that synthesizes coherent multi-object scenes by integrating textual descriptions with box-based spatial arrangements.
  • CompoNeRF is designed for precision and adaptability: individual NeRFs, each denoted by a unique prompt color, can be composed, decomposed, and recomposed with ease, which streamlines the construction of complex scenes from cached models after decomposition.

ICCV 2023

Dynamic PlenOctree for Adaptive Sampling Refinement in Explicit NeRF

Haotian Bai, Yiqi Lin, Yize Chen, Lin Wang

Project | Video

  • A more compact and expressive PlenOctree (POT) NeRF representation.
  • Inspiration: POT’s fixed structure for direct optimization is sub-optimal because scene complexity evolves continuously with updates to cached color and density, so the sampling distribution must be refined to capture signal complexity accordingly.
  • Competitive: DOT outperforms POT by enhancing visual quality, reducing parameters by over 55.15%/68.84%, and providing 1.7/1.9 times the FPS on NeRF-synthetic and Tanks and Temples, respectively.

CVPR 2023

Patch-Mix Transformer for Unsupervised Domain Adaptation: A Game Perspective

Jinjing Zhu*, Haotian Bai*, Lin Wang

Project | Video

  • Selected as a CVPR highlight paper (top 2.5%).
  • Large Domain Gap: PMTrans bridges source and target domains with an intermediate domain in a relatively smooth way.
  • Game Theory: PMTrans interprets UDA as a min-max cross-entropy (CE) game among three players (the feature extractor, the classifier, and PatchMix) to find the Nash equilibria.
  • Competitive: PMTrans surpasses ViT-based and CNN-based SoTA methods by +3.6% on Office-Home, +1.4% on Office-31, and +17.7% on DomainNet.

ECCV 2022

Weakly Supervised Object Localization via Transformer with Implicit Spatial Calibration

Haotian Bai, Ruimao Zhang, Jiong Wang, Xiang Wan

Project | Video

  • SCM is an external Transformer-based solution for Weakly Supervised Object Localization.
  • Lightweight: SCM is an external Transformer model that introduces no additional parameters.
  • Competitive: SCM outperforms most competitive frameworks (CNN & Transformer) using only about 20%~30% of their parameters.

🎖 Honors and Awards

  • 2024.08 Served as a reviewer for Transactions on Multimedia Computing, Communications, and Applications (TOMM).
  • 2024.06 Served as a reviewer for IEEE Transactions on Image Processing (TIP).
  • 2024.04 Served as a reviewer for Transactions on Multimedia Computing, Communications, and Applications (TOMM).
  • 2024.03 Served as a reviewer for IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI).
  • 2019.07 The third municipal prize in the national math modeling competition.
  • 2018.10 The third municipal prize in the 2019 National College Students Extracurricular Academic Practice Competition (Alzheimer).
  • 2018.06 The third national prize in the China University program development competition.

📖 Education

  • 2023.09 - (now), Ph.D., Hong Kong University of Science and Technology.
  • 2017.09 - 2021.06, B.E., Shanghai University.

💬 Invited Talks

  • 2022.12, The global Ph.D. talk for sharing papers accepted at ECCV 2022, AI TIME, International Science and Technology Information Center, Tsinghua University. | [video]

💻 Internships

  • 2022.05 - 2023.08, Research assistant at AI Thrust, Info Hub, HKUST(GZ).
  • 2021.07 - 2022.04, Research assistant at School of Data Science, CUHKSZ.