Research
My research interests lie in the field of Artificial Intelligence. Most of my current work focuses on
AI for Science, Large Language Models, and Multi-modal Models.
|
Inertial Confinement Fusion Forecasting via LLMs
Mingkai Chen,
Taowen Wang,
James Chenhao Liang,
Chuan Liu,
Chunshu Wu,
Qifan Wang,
Ying Nian Wu,
Michael Huang,
Chuang Ren,
Ang Li,
Tong Geng,
Dongfang Liu
arXiv preprint; under review at NeurIPS 2024
We developed Fusion-LLM, a novel integration of Large Language Models (LLMs) with reservoir computing paradigms, tailored for Inertial Confinement Fusion (ICF). The approach includes an LLM-anchored Reservoir for accurate forecasting of hot electron dynamics, Signal-Digesting Channels for detailed temporal and spatial analysis of laser intensity, and a Confidence Scanner to quantify prediction confidence. The method demonstrates superior performance in predicting Hard X-ray (HXR) energies, achieving state-of-the-art results. We also introduced Fusion4AI, the first ICF benchmark based on physical experiments, to foster advances in AI-driven plasma physics research.
|
CoCoT: Contrastive Chain-of-Thought Prompting for Large Multimodal Models with Multiple Image Inputs
Daoan Zhang*,
Junming Yang*,
Hanjia Lyu*,
Zijian Jin,
Yuan Yao,
Mingkai Chen,
Jiebo Luo
ICPR 2024
We investigated Large Multimodal Models' (LMMs) ability to process multiple image inputs, focusing on fine-grained perception and information blending. Our research involved image-to-image matching and multi-image-to-text matching assessments, using models like GPT-4V and Gemini. We developed a Contrastive Chain-of-Thought (CoCoT) prompting method to improve LMMs' multi-image understanding, significantly enhancing model performance in our evaluations.
|
Aggregation of Disentanglement: Reconsidering Domain Variations in Domain Generalization
Daoan Zhang*,
Mingkai Chen*,
Chenming Li,
Lingyun Huang,
Jianguo Zhang
arXiv preprint; under review at IJCV
We proposed a new perspective on exploiting class-aware domain-variant features during training; at
inference time, our model effectively maps target domains into the latent space where the known
domains lie. We also designed a contrastive-learning-based paradigm to compute the weights for
unseen domains.
|
* equal contribution.
|
Student Assistant, Department of Computer Science, Stony Brook University