Juexiao Zhang
Hi, I am currently a second-year PhD student in computer science at the Courant Institute at New York University, advised by Professor Chen Feng.
My first name in Chinese is 觉晓. It comes from a famous poem from the Tang Dynasty (618-907 AD) that depicts a quiet spring morning.
I also go by my English nickname Jeremy.
Previously, I obtained my Bachelor's degree in EE from Tsinghua University and my Master's degree in CS from NYU.
During my Master's, I was fortunate to work with Prof. Chen Feng's AI4CE Lab on scene representation for robotics and with Dr. Yubei Chen on unsupervised representation learning.
I am interested in learning scene representations that are useful for robots to understand the world and interact with it.
Email / LinkedIn / Google Scholar / GitHub
Multiview Scene Graph
Juexiao Zhang, Gao Zhu, Sihang Li, Xinhao Liu, Haorui Song, Xinran Tang, Chen Feng
NeurIPS 2024
arXiv / project page / code
Build a multiview place+object scene graph from a set of unposed RGB images.
VLM See, Robot Do: Human Demo Video to Robot Action Plan via Vision Language Model
Beichen Wang*, Juexiao Zhang*, Shuwen Dong†, Irving Fang†, Chen Feng
(* and † each denote equal contribution)
Under review.
arXiv / project page / code
Let the robot follow a human's actions by just watching one video.
Tell Me Where You Are: Multimodal LLMs Meet Place Recognition
Zonglin Lyu, Juexiao Zhang, Mingxuan Lu, Yiming Li, Chen Feng
Under review.
arXiv / project page / code
Can multimodal LLMs help visual place recognition?
LUWA Dataset: Learning Lithic Use-Wear Analysis on Microscopic Images
Jing Zhang*, Irving Fang*, Hao Wu, Akshat Kaushik, Alice Rodriguez, Hanwen Zhao, Juexiao Zhang, Zhuo Zheng, Radu Iovita, Chen Feng
(* for equal contribution)
CVPR, 2024. Highlight
project page / arXiv / code
Paleoanthropology meets cutting-edge computer vision! We create the first Lithic Use-Wear Analysis (LUWA) dataset and use it to challenge large vision models and large language-and-vision models.
Actformer: Scalable Collaborative Perception via Active Queries
Suozhi Huang*, Juexiao Zhang*, Yiming Li, Chen Feng
(* for equal contribution)
ICRA, 2024
arXiv / project page / code
A collaborative BEV Transformer for 3D object detection where each BEV query can actively select relevant cameras for information aggregation based on their pose information, instead of interacting with all cameras indiscriminately.
URLOST: Unsupervised Representation Learning without Stationarity or Topology
Zeyu Yun, Juexiao Zhang, Yann LeCun, Yubei Chen
Under review.
arXiv / Code coming soon.
An unsupervised representation learning method for high-dimensional data without explicit stationarity or topology.
Multi-Robot Scene Completion: Towards Task-Agnostic Collaborative Perception
Yiming Li*, Juexiao Zhang*, Dekun Ma, Yue Wang, Chen Feng
(* for equal contribution)
CoRL, 2022
paper / project page / code
A task-agnostic framework that allows asynchronous training for collaborative perception, with an autoencoder that amortizes communication in the spatial-temporal domain.
Word Embedding Visualization via Dictionary Learning
Juexiao Zhang*, Yubei Chen*, Brian Cheung, Bruno Olshausen
(* for equal contribution)
arXiv preprint arXiv:1910.03833
paper / code
Decompose word embeddings via dictionary learning and spectral clustering to discover elementary semantic factors.
Service
Reviewer for ICRA 2024, IROS 2024, NeurIPS 2024.
Awards/Scholarships
National Scholarship at Tsinghua University, 2017
Scholarship of Academic Excellence at Tsinghua University, 2017
Scholarship of Outstanding Voluntary Work at Tsinghua University, 2017
Misc
In my spare time, I enjoy playing soccer, making coffee, reading, photography, and traveling.