I create advanced AI solutions using LLMs and GNNs to transform vast, complex data into actionable insights, driving decision-making and recommendations across healthcare, finance, and social networks.
Education
Aug 2022 - Present
Ph.D.
Aug 2022 - Apr 2024
M.Sc.
Sep 2017 - Aug 2021
B.Sc.
Leadership
Rasht School of AI Leader
University of Guilan
Dec 2020 - Aug 2022
Brain and Cognition Association AI Head
University of Guilan
Oct 2020 - Oct 2021
CE Scientific Association Head of Research Affairs
University of Guilan
Oct 2020 - Oct 2021
Awards
Selected Speaker
3rd Henry Ford + MSU Cancer Research Symposium (2023)
Poster Award Winner at "Cancer Control & Prevention"
Selected Tutorial
4th ACM International Conference on AI in Finance (2023)
Large Language Models for NLP in Finance
Teaching Assistantship
Certifications
Language
English
Proficient
German
Intermediate
Persian
Native
Experience
Aug 2022 - Present
East Lansing, USA
AI Research Assistant
Human Augmentation and Artificial Intelligence Laboratory
Proven track record in data science and business consulting, delivering impactful insights and results across industries.
Sep 2018 - Jul 2022
Rasht, Iran
NLP Research Assistant
Guilan NLP Group
Demonstrated expertise in business advisory and data analysis, providing impactful conclusions and outcomes across various industries.
Tech Stack
Presentations
Publications
Bridging Scientific Research, Innovation, and Finance: A Temporal Heterogeneous Graph Dataset for Financial Investment Prediction
Khanmohammadi, R., Singh, K., Maheshwari, P., Panda, V., Kaur, S., Brugere, I., Smiley, C. H., Nourbakhsh, A., Alhanai, T., & Ghassemi, M. M. (2024).
Under review - https://arxiv.org/abs/2110.06131
Built a 70M+ node graph dataset linking papers, patents, and financial data (2001–2022).
Developed ML models and an advanced TGNN model, Durendal++, for investment predictions.
Durendal++ achieved top performance, with micro-F1 scores up to 89% in 2022.
Showcased the benefits of diverse data integration in financial predictions.
In collaboration with:
JPMorgan Chase
NYU-AD
Hybrid student-teacher large language model refinement for cancer toxicity symptom extraction
Khanmohammadi, R., Ghanem, A. I., Verdecchia, K., Hall, R., Elshaikh, M., Movsas, B., Bagher-Ebadian, H., Luo, B., Chetty, I. J., Alhanai, T., Thind, K., & Ghassemi, M. M. (2024).
Under review - https://arxiv.org/abs/2408.04775
Applied a student-teacher framework to improve compact LLMs for symptom extraction.
Used GPT-4o (teacher) to guide compact LLMs through prompt refinement, RAG, and fine-tuning.
Achieved F1 improvements of 26% for Phi3 and 13% for Zephyr.
Reduced costs: Phi3 was 48x and Zephyr 30x cheaper than GPT-4o.
Demonstrated an efficient, cost-effective approach for using LLMs in clinical settings.
In collaboration with:
Henry Ford Health
Cedars-Sinai
NYU-AD
Investigating the Temporal Association of Biomedical Research on Small Business Funding: A Bibliometric and Data Analytic Approach
Khanmohammadi, R., Kaur, S., Smiley, C. H., Alhanai, T., Brugere, I., Nourbakhsh, A., & Ghassemi, M. M. (2024).
Published in IEEE Transactions on Computational Social Systems - doi:10.1109/TCSS.2024.3466010
Analyzed 10,873 biomedical topics to link scientific innovation with small business funding.
Combined bibliometric analysis with SBIR data to assess science’s industrial impact.
Measured time-lagged effects of scientific advances on industry funding (2010-2021).
Found that impactful scientific topics predict future funding (p < 0.05).
Revealed strong contextual overlap between scientific papers and industry projects.
In collaboration with:
JPMorgan Chase
NYU-AD
Iterative Prompt Refinement for Radiation Oncology Symptom Extraction Using Teacher-Student Large Language Models
Khanmohammadi, R., Ghanem, A. I., Verdecchia, K., Hall, R., Elshaikh, M., Movsas, B., Bagher-Ebadian, H., Chetty, I., Ghassemi, M. M., & Thind, K. (2024).
Published in International Conference on the use of Computers in Radiation therapy (ICCR) - HAL ID: hal-04720234
Automated prompt optimization through a teacher-student model setup.
Improved model performance using zero-shot learning, avoiding additional training.
Ensured local data processing to protect sensitive clinical information.
Improved domain-specific concept extraction accuracy through iterative refinement.
In collaboration with:
Henry Ford Health
Cedars-Sinai
A Novel Localized Student-Teacher LLM for Enhanced Toxicity Extraction in Radiation Oncology
Khanmohammadi, R., Ghanem, A. I., Verdecchia, K., Hall, R., Elshaikh, M. A., Movsas, B., Bagher-Ebadian, H., Chetty, I. J., Ghassemi, M. M., & Thind, K. (2024).
Published in International Journal of Radiation Oncology, Biology, Physics - doi:10.1016/j.ijrobp.2024.07.1392
Developed a student-teacher LLM system to improve toxicity extraction in radiation oncology.
Tested on prostate cancer notes, focusing on key symptoms and treatments from 177 patients.
Achieved statistically significant improvements in accuracy, precision, recall, and F1 score on single- and multi-symptom notes as well as single- and multi-treatment notes (p < 0.05).
Demonstrated potential for local, privacy-preserving NLP in clinical environments.
In collaboration with:
Henry Ford Health
Cedars-Sinai
A Novel Learning Approach for Training Transformer-Based Natural Language Processing Pipeline to Extract Toxicity Symptoms in Prostate Cancer Patients after Definitive Radiotherapy
Khanmohammadi, R., Ghassemi, M. M., Ghanem, A. I., Elshaikh, M. A., Verdecchia, K., Chetty, I. J., & Thind, K. (2023)
Published in the American Association of Physicists in Medicine (AAPM)
In collaboration with:
Henry Ford Health
An Introduction to Natural Language Processing Techniques and Framework for Clinical Implementation in Radiation Oncology
Khanmohammadi, R., Ghassemi, M. M., Verdecchia, K., Ghanem, A. I., Bing, L., Chetty, I. J., Bagher-Ebadian, H., Siddiqui, F., Elshaikh, M., Movsas, B., & Thind, K. (2023).
Under review - https://arxiv.org/abs/2311.02205
Introduced NLP's role in converting clinical text to structured data for radiation oncology.
Reviewed major advancements in NLP, focusing on applications in radiation oncology.
Proposed a comprehensive evaluation framework for assessing NLP models' readiness for clinical use, focusing on purpose, technical performance, bias, ethics, and quality assurance.
Identified current challenges with LLMs, including hallucinations, bias, and issues in clinical deployment.
Outlined a checklist for clinical implementation, providing practical guidance for researchers and clinicians to evaluate NLP models for safe and effective use.
In collaboration with:
Henry Ford Health
MambaNet: A Hybrid Neural Network for Predicting the NBA Playoffs
Khanmohammadi, R., Saba-Sadiya, S., Esfandiarpour, S., Alhanai, T., & Ghassemi, M. M. (2024)
Published in SN Computer Science - doi:10.1007/s42979-024-02977-0
Introduced MambaNet for NBA playoff prediction with advanced neural layers.
Leveraged Feature Imitating Networks (FINs) for improved statistical feature representation.
Outperformed baseline models, achieving AUC up to 0.82.
Demonstrated model generalizability with NBA and Iranian Super League data.
In collaboration with:
Hudl Instat
NYU-AD
The Broad Impact of Feature Imitation: Neural Enhancements Across Financial, Speech, and Physiological Domains
Khanmohammadi, R., Alhanai, T., & Ghassemi, M. M. (2023)
Under review - https://arxiv.org/abs/2309.12279
FINs with Tsallis entropy boosted performance in finance, speech, and physiology tasks.
FIN-ENN improved Bitcoin prediction accuracy by reducing RMSE and MAPE.
Enhanced speech emotion recognition by 2.65% with FIN.
Improved chronic neck pain detection accuracy to 62.5%, outperforming traditional models.
In collaboration with:
NYU-AD
Fetal Biological Sex Identification using Machine and Deep Learning Algorithms on Phonocardiogram Signals
Khanmohammadi, R., Mirshafiee, M. S., Alhanai, T., & Ghassemi, M. M. (2022)
Under review - https://arxiv.org/abs/2110.06131
Developed a method to identify fetal biological sex from fetal phonocardiogram (FPCG) signals.
Achieved 91% accuracy, surpassing previous baselines by 10%.
Analyzed a dataset of 1,000 FPCG samples, balanced across male and female fetuses.
Combined statistical and sound features to improve classification over individual models.
In collaboration with:
NYU-AD
COPER: a Query-Adaptable Semantics-based Search Engine for Persian COVID-19 Articles
Khanmohammadi, R., Mirshafiee, M. S., & Allahyari, M. (2021).
Published in International Conference on Web Research (ICWR) - doi:10.1109/ICWR51868.2021.9443151
Built COPER, a search engine with 3,500 Persian COVID-19 articles.
Used BM25, TF-IDF, and BERT/SBERT for query-adaptive re-ranking.
Developed PerSICK, the first Persian semantic textual similarity dataset with 3,000 pairs.
Fine-tuned SBERT, achieving 97% STS accuracy.
Prose2Poem: The Blessing of Transformers in Translating Prose to Persian Poetry
Khanmohammadi, R., Mirshafiee, M. S., Rezaee, Y., & Mirroshandel, S. A. (2023).
Published in ACM Transactions on Asian and Low-Resource Language Information Processing - doi:10.1145/359279
Created the first Persian prose-to-poem translation system using a new low-resource NMT method.
Released a unique prose-poem and synonym-antonym dataset in Persian.
PGST: A Persian gender style transfer method
Khanmohammadi, R., & Mirroshandel, S. A. (2023).
Published in Natural Language Engineering - doi:10.1017/S1351324923000426
PGST is the first Persian text style transfer method for gender-based language differences.
A benchmark compares PGST with models using word and character embeddings.
PGST is extended to English and evaluated against top models with various metrics.