Building Predictive Models for Biological Data
Where AI meets life sciences, discover data-driven research, publish your findings, and shape the future of bioinformatics.
Fall 2025
School of Systems Biology
Schar School of Policy and Government
George Mason University
Program Duration:
October 11, 2025 - January 3, 2026
Program Format:
Virtual (synchronous & asynchronous)
Seats:
Limited to 12 (first-come, first-served)
The internship program consists of:
- Alternating synchronous and asynchronous virtual meetings, 12 – 2 p.m. EST (9 – 11 a.m. PST). Synchronous lectures will be given biweekly, each followed by an asynchronous self-paced lab.
- Synchronous meetings (lectures)
- Saturday October 11
- Saturday October 25
- Saturday November 8
- Saturday November 29
- Asynchronous meetings (labs)
- Saturday October 18
- Saturday November 1
- Saturday November 15
- Saturday December 6
- Optional: eight synchronous office hours/group discussions with a teaching assistant on Tuesday or Thursday evenings. Virtual meetings via Zoom, 7 – 8 p.m. EST (4 – 5 p.m. PST).
- Four-week Research Paper Finalizing Period, December 13, 2025 – January 3, 2026:
- A dedicated four-week period to refine and finalize research papers.
- This additional time allows students to strengthen their work, ensuring a well-developed, high-quality final paper.
- Final Research Paper Due: January 3, 2026
Registration Fee
We can only accept 12 students this fall. Available seats are on a first-come, first-served basis.
- Please email execed@gmu.edu for application and program fees.
- Need-based scholarships are available.
Certification
Students who successfully finish the program will receive a Young Scholars Research Program Certificate of Completion, Machine Learning & Bioinformatics: Building Predictive Models for Biological Data.
Program Summary
Machine Learning & Bioinformatics: Building Predictive Models for Biological Data
This innovative Young Scholars Program introduces students to the powerful intersection of artificial intelligence, machine learning, and the life sciences. Through hands-on projects, students will learn how machine learning techniques can be applied to biological data, unlocking insights into genomics, health, and bioinformatics research.
Using graphical machine learning software, students will explore real-world datasets without requiring advanced coding skills. They will gain practical experience in:
- Building and interpreting machine learning models
- Analyzing biological and biomedical data
- Understanding how AI is transforming scientific discovery
- Presenting results in professional research formats
By the end of the program, students will not only strengthen their skills in data analysis, research design, and scientific communication but also prepare for the future of STEM, where AI-driven discovery is rapidly shaping careers in science, technology, healthcare, and beyond.
This program is ideal for motivated high school students eager to explore the next frontier of research and innovation in a supportive, collaborative environment.
Research Projects Highlights and Features
- Students will learn about supervised and unsupervised machine learning algorithms and their application to biological and health data. Students will participate in lectures and labs designed to provide a rich background in machine learning.
- Students will use the Orange Data Mining platform to build machine learning pipelines, enabling them to explore machine learning without prior programming experience.
- Students will complete a machine learning project to gain hands-on research experience. Projects include predicting therapeutic peptides, disease outcomes from gene expression profiles, and more.
Analyze Data and Draw Evidence-Based Conclusions
- Learn to analyze data using statistical tools.
- Understand how to interpret graphs, tables, and figures in research articles.
- Critically assess whether conclusions are supported by scientific evidence.
Develop Critical Thinking and Scientific Inquiry Skills
- Foster analytical thinking through active participation in real-world research projects.
- Develop the ability to ask precise, researchable questions based on gaps in current scientific knowledge.
- Learn how to interpret scientific data, conduct research, and evaluate literature.
- Learn hypothesis development and experimental design to test ideas effectively.
Gain Experience in the Application of Machine Learning to the Biological Sciences
- Learn about various supervised and unsupervised machine learning algorithms and their uses.
- Learn how to practically employ machine learning pipelines using graphical tools such as Orange.
- Apply machine learning to a biological dataset to gain real-world insight into its predictive utility.
Enhancing Scientific Writing and Communication
- Learn how to summarize key findings from journal articles.
- Practice writing research abstracts, literature reviews, and lab reports.
- Cultivate curiosity and passion for scientific discovery among high school students.
- Encourage students to consider careers in biomedical research, public health, and related fields.
Program Key Benefits
Engage High School Students in Advanced Biomedical Research
- Introduce students to cutting-edge research on machine learning and its application to biological and health data.
- Students actively participate in developing machine learning pipelines and explore the nuances of their use in biology.
Encourage Collaboration with Leading Scientists
- Provide mentorship from experienced researchers and faculty from the School of Systems Biology.
- Offer networking opportunities to inspire future careers in science, medicine, and biotechnology.
Exploring Future Careers in Biomedical Science and Healthcare
- Gain exposure to research fields such as biotechnology, pharmacology, genetics, and bioengineering.
- Gain insights into potential careers in medicine, biomedical engineering, pharmaceuticals, and public health.
- Cultivate curiosity and passion for scientific discovery, and inspire the next generation of biomedical innovators.
Scholarly Research Paper Publication
The final scholarly research paper will be published on the Schar School Young Scholars Journals Webpage as well as the George Mason University (GMU) Library MARS Repository.
Keynote Speaker
Amarda Shehu's research advances foundational investigations in Artificial Intelligence (AI) and Machine Learning (ML). Her team is driven by a passion to push the barriers of their understanding of the physical and biological world. She says, "It is real-world, complex, wicked problems that prompt us to design novel AI and ML frameworks and algorithms. This is nowadays abbreviated as AI4Science."
Shehu is an accomplished administrator, teacher, and scholar. She currently serves as George Mason’s inaugural Vice President and Chief AI Officer, in which capacity she also continues to provide leadership for the Institute of Digital InnovAtion (IDIA), for which she served as Associate Vice President for Research during 2022 and 2024. Shehu also serves as an Associate Dean for Research in the College of Engineering and Computing (CEC), where she is also a tenured Professor in the Department of Computer Science.
She is a fellow of the American Institute for Medical and Biological Engineering (AIMBE) and a member of the Virginia Academy of Science, Engineering, and Medicine (VASEM). Shehu has received numerous awards, including the 2022 Outstanding Faculty Award from the State Council of Higher Education for Virginia, the 2021 Beck Family Presidential Medal for Faculty Excellence in Research and Scholarship, the 2018 Mason University Teaching Excellence Award, the 2014 Mason Emerging Researcher/Scholar/Creator Award, the 2013 Mason OSCAR Undergraduate Mentor Excellence Award, and the 2012 National Science Foundation (NSF) CAREER Award.
Her research is regularly supported by various NSF programs, the Department of Defense, as well as state and private research awards.
About the Course Directors
Dr. Christopher Lockhart is a Research Assistant Professor at the School of Systems Biology. He uses computational methods to study biomolecules of medical importance. He frequently employs all-atom molecular dynamics simulations with enhanced sampling algorithms to exhaustively explore the conformational ensembles of target systems and compute their thermodynamic properties. He complements these simulation techniques with artificial intelligence and machine learning for structure prediction and analysis. Dr. Lockhart received his PhD from George Mason University, where he focused on uncovering the structural diversity of Alzheimer’s disease Aβ peptides to improve understanding of their cytotoxic mechanisms and the roles of anti-aggregation agents and biomarkers. Currently, Dr. Lockhart’s lab leverages simulations to investigate in atomistic detail the binding of antimicrobial peptides such as PGLa and indolicidin. Ultimately, Dr. Lockhart seeks to understand how sequence variations in antimicrobial peptides affect their three-dimensional structures and their biological functions. More broadly, Dr. Lockhart is interested in developing in silico pipelines for designing peptide-based therapeutics.