Tools and Technology Seminar 11/07/2024 - Yufeng Zhang
Tools and Technology Seminar Series, Gilbert S. Omenn Department of Computational Medicine and Bioinformatics, University of ...
Co-Founder, President, Chief Executive Officer, Chief Scientific Officer & Director, Amphastar Pharmaceuticals
Yongfeng Zhang, co-founder, president, CEO, chief scientific officer, and director at Amphastar Pharmaceuticals, presented at the Tools and Technology Seminar on November 7, 2024, part of the seminar series of the University of Michigan's Department of Computational Medicine and Bioinformatics. During the seminar, Zhang discussed a research project called "Necklace," which focuses on necrotizing enterocolitis (NE), a serious gastrointestinal disease in premature infants. Zhang stated that the disease progresses rapidly, with a mortality rate of 20% to 30%, and emphasized that timely diagnosis is crucial.

Zhang noted that the project uses open-source large language models (LLMs) rather than closed-source models like GPT, citing the need to protect patient privacy by running models locally. Zhang described using quantization and low-rank adaptation methods to train and deploy LLMs efficiently on a single Nvidia A40 GPU with 16GB of memory. Zhang reported that larger LLMs consistently outperformed smaller ones, and that increasing the fine-tuning dataset size significantly boosted performance, particularly for smaller models and for classes with limited examples. Zhang also mentioned using model distillation, in which a teacher model annotates a dataset that is then used to train a smaller student model; this improved the student's performance.

Future plans include collaborating with hospitals to obtain more data and so reduce the bias of single-facility retrospective data, improving model interpretability by adding clinical explanations, and using federated learning to preserve privacy when working with multicenter medical data.
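To make the memory argument behind quantization and low-rank adaptation concrete, here is a rough back-of-the-envelope sketch. The 7-billion-parameter model size, the 4096 x 4096 weight matrix, and the rank of 8 are illustrative assumptions, not figures from the seminar, and the estimate covers weights only (it ignores activations, optimizer state, and quantization overhead).

```python
def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in a low-rank update B @ A, with B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Hypothetical 7B-parameter model (an assumption for illustration only).
NUM_PARAMS = 7_000_000_000

fp16_gb = weight_memory_gb(NUM_PARAMS, 16)  # weights stored in 16-bit precision
int4_gb = weight_memory_gb(NUM_PARAMS, 4)   # weights quantized to 4 bits

# Hypothetical 4096 x 4096 projection matrix adapted with rank 8.
full = 4096 * 4096
lora = lora_trainable_params(4096, 4096, 8)

print(f"16-bit weights: {fp16_gb:.1f} GB, 4-bit weights: {int4_gb:.1f} GB")
print(f"Trainable parameters per matrix: {lora} of {full} ({lora / full:.2%})")
```

Under these assumptions, quantization shrinks the stored weights by a factor of four, and the low-rank update trains well under 1% of the parameters in each adapted matrix, which is the general reason the two methods are combined on memory-limited hardware.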
“The disease can progress really fast, with mortality ranging from 20% to 30%, so timely diagnosis is very crucial to avoid postponing necessary treatment and worsening the patient's condition.”
“We cannot use GPT models because they are closed source and require sending sensitive patient data to OpenAI, which would compromise patient privacy; therefore, we use open-source models that can be run locally to preserve data privacy.”
“Using large language models as teacher models can substantially improve the performance of smaller BERT models, making them comparable to large language models themselves.”
“The large language models consistently perform better than smaller ones, and increasing the size of the fine-tuning dataset significantly boosts model performance, particularly for smaller models and classes with limited examples.”
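The teacher-student distillation described in the quotes above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the project's pipeline: `teacher_annotate` is a hypothetical keyword rule standing in for a large language model, `StudentClassifier` is a bag-of-words counter standing in for a BERT model, and the example texts and labels are invented rather than drawn from patient data.

```python
from collections import Counter, defaultdict


def teacher_annotate(text: str) -> str:
    """Hypothetical stand-in for a large teacher model (a simple keyword rule)."""
    return "positive" if "pneumatosis" in text.lower() else "negative"


class StudentClassifier:
    """Tiny bag-of-words student: counts how often each word appears per label."""

    def __init__(self) -> None:
        self.word_counts = defaultdict(Counter)

    def fit(self, texts, labels) -> None:
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())

    def predict(self, text: str) -> str:
        words = text.lower().split()
        # Score each label by how often it has seen the words in this text.
        scores = {
            label: sum(counts[w] for w in words)
            for label, counts in self.word_counts.items()
        }
        return max(scores, key=scores.get)


# Invented, unlabeled example texts (not real patient data).
unlabeled = [
    "pneumatosis seen in the right lower quadrant",
    "normal bowel gas pattern",
    "findings consistent with pneumatosis intestinalis",
    "no abnormal findings, normal study",
]

# Step 1: the teacher annotates the unlabeled dataset.
pseudo_labels = [teacher_annotate(t) for t in unlabeled]

# Step 2: the student is trained on the teacher's annotations.
student = StudentClassifier()
student.fit(unlabeled, pseudo_labels)

prediction = student.predict("pneumatosis noted")
print(pseudo_labels, prediction)
```

The point of the pattern is that the student never needs human labels: it learns from whatever the teacher produces, so its quality is bounded by the teacher's annotations.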