North Carolina A&T State University and Elon University
Joint REU in Mathematical Biology

REU Mentors for Summer 2021 Program

The following faculty will be serving as project mentors for the 2021 Joint REU program. The general area of each mentor’s project is listed, but specific project topics have not yet been finalized.

  • Dr. Nicholas Luke, North Carolina A&T State University: Epidemiological Modeling or Pharmacokinetics
  • Dr. Karen Yokley, Elon University: Epidemiological Modeling or Pharmacokinetics
  • Dr. Choongseok Park, North Carolina A&T State University: Mathematical Neuroscience
  • Dr. Nicholas Bussberg, Elon University: Statistical Modeling involving Environmental Science
  • Dr. Seongtae Kim, North Carolina A&T State University: Biostatistics

Projects from Summer 2020 REU Program

Predicting Phosphorylation Sites Using Gradient Boosting: Phosphorylation is a post-translational modification of proteins in which specialized enzymes called kinases add a phosphate group to serine, threonine, or tyrosine, the residues with the highest abundance of phosphorylation sites in the phosphoproteome. Due to its flexibility and reversibility, phosphorylation is involved in numerous cellular processes, such as signal transduction, cell proliferation, apoptosis, gene expression, cell cycle progression, and cytoskeletal regulation. The prediction of phosphorylation sites can also be utilized in cancer therapy to treat B-Cell Lymphoma II. Using Phospho.ELM for training and testing data, we compared six models: support vector machines with linear and radial kernels, random forest, and gradient boosting with interaction depths of 1, 2, and 5. There were 514 protein sequence features defined for a window size of nine. The protein sequence features implemented in our gradient boosting model were: Shannon Entropy (SE), Relative Entropy (RE), Information Gain (IG), Average Cumulative Hydrophobicity (ACH), Composition, Transition and Distribution (CTD), Sequence Order Coupling Numbers (SOCN), Quasi Sequence Order (QSO), Sequence Features (SF), Overlapping Properties (OP), Pseudo-Amino Acid Composition (PseAAC), and Amphiphilic Pseudo-Amino Acid Composition (APseAAC). The six predictive models were assessed based on accuracy, sensitivity, specificity, Matthews correlation coefficient (MCC), and F1 score. Although the random forest model was found to have higher accuracy and specificity, gradient boosting with an interaction depth of 5 was found to have higher sensitivity, MCC, and F1. A balanced dataset was also considered, and the measures were evaluated for each model.
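
The five assessment measures named in this abstract can all be computed from a binary confusion matrix. The following is a minimal sketch using the standard definitions, not the project's own code; the function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration.

```python
import math


def _ratio(numerator, denominator):
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity, specificity, MCC, and F1.

    tp, fp, tn, fn are the counts of true positives, false positives,
    true negatives, and false negatives (hypothetical inputs here).
    """
    accuracy = _ratio(tp + tn, tp + fp + tn + fn)
    sensitivity = _ratio(tp, tp + fn)   # true positive rate (recall)
    specificity = _ratio(tn, tn + fp)   # true negative rate
    f1 = _ratio(2 * tp, 2 * tp + fp + fn)
    # Matthews correlation coefficient: uses all four cells of the matrix.
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = _ratio(tp * tn - fp * fn, mcc_denominator)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
        "f1": f1,
    }


if __name__ == "__main__":
    # Illustrative counts only; these are not results from the project.
    for name, value in binary_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.3f}")
```

Because MCC and F1 weigh false positives and false negatives differently from accuracy, a model can lead on accuracy and specificity while another leads on sensitivity, MCC, and F1, which is the pattern the abstract reports.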

Incorporation of Body Composition Variations into a Physiologically-Based Pharmacokinetic Model of Xylene: Physiologically based pharmacokinetic (PBPK) models are ordinary differential equation models that have been used to estimate the internal dosages of toxicants in the body. One such toxicant is xylene, a hydrocarbon that comes in the form of a liquid or vapor and has a wide variety of uses, most commonly in paint and paint thinner in the industrial setting. While any long exposure to xylene is harmful (inhalation, ingestion, dermal contact, etc.), inhalation of the substance is one of the most dangerous. Inhalation of xylene may cause dizziness, headache, nausea, and vomiting. The current study uses a pre-existing PBPK model to investigate how xylene is distributed to different compartments of the body. While the PBPK model was originally parameterized for the average male body, we account for the variety of human bodies, examining how different body compositions may affect the concentration of xylene in the different compartments. To better investigate this, body volume parameters are altered and the addition of body height parameters is considered. Fat percentages are altered as well to describe different body types.

A Quantitative Investigation of Preventative Measures for COVID-19: Coronaviruses are RNA viruses that cause respiratory infections. In late 2019 and early 2020, a novel coronavirus, SARS-CoV-2, the virus that causes COVID-19, was discovered and caused an outbreak in Wuhan, China. Since then, cases of COVID-19 have spread worldwide, causing a global pandemic. Therefore, it is important to understand, predict, and prevent the spread of COVID-19. A mathematical model is used to divide an arbitrary population into susceptible, exposed, infected, quarantined, and recovered populations. The model shows the progression of the virus among these populations and can be used to determine the total infections caused by COVID-19, the rapidity of the outbreak, and the reproductive number. This model can be further used to investigate the efficacy of measures taken to prevent the spread of infection, such as vaccination, social distancing, and masks. To investigate such measures, the model is modified. For vaccination, a vaccinated subset of the population is added. For social distancing and masks, the model adds susceptible, exposed, and infected populations who are social distancing or wearing masks. Investigation of these preventative measures also includes the addition of new parameters or changes to existing parameter values within the model. For vaccination, parameters representing the vaccination rate and the efficacy of the vaccine are added. For social distancing and mask usage, the contact rate and transmission rate, respectively, are adjusted. By examining the effect of these parameters on the total infections, rapidity of the outbreak, and reproductive number, the efficacy of each preventative measure can be determined. Then, a suggestion can be made to help mitigate the real-world destruction of COVID-19.
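
To illustrate the kind of compartmental model this abstract describes, here is a minimal sketch of a susceptible, exposed, infected, quarantined, recovered system integrated with forward Euler steps. The transition structure, every parameter value, and the name `simulate_seiqr` are assumptions made for illustration and are not taken from the project. In particular, the sketch assumes quarantined individuals do not transmit, and it represents social distancing as a lower contact rate and mask usage as a lower per-contact transmission probability.

```python
def simulate_seiqr(contact_rate, transmission_prob, days=300, dt=0.05,
                   sigma=1 / 5, delta=1 / 7, gamma=1 / 10, gamma_q=1 / 14,
                   population=1_000_000, initial_infected=10):
    """Integrate a hypothetical SEIQR model with forward Euler steps.

    contact_rate      contacts per person per day (social distancing lowers it)
    transmission_prob infection probability per contact (masks lower it)
    sigma             rate of leaving the exposed class (E -> I)
    delta             rate at which infected people are quarantined (I -> Q)
    gamma, gamma_q    recovery rates from I and from Q
    All default values are illustrative assumptions, not fitted parameters.
    """
    s = float(population - initial_infected)
    e, i, q, r = 0.0, float(initial_infected), 0.0, 0.0
    total_infections = float(initial_infected)
    peak_infected, peak_day = i, 0.0
    beta = contact_rate * transmission_prob

    for step in range(int(days / dt)):
        new_exposed = beta * s * i / population   # S -> E
        new_infected = sigma * e                  # E -> I
        ds = -new_exposed
        de = new_exposed - new_infected
        di = new_infected - (delta + gamma) * i
        dq = delta * i - gamma_q * q
        dr = gamma * i + gamma_q * q
        s, e, i, q, r = (s + dt * ds, e + dt * de, i + dt * di,
                         q + dt * dq, r + dt * dr)
        total_infections += dt * new_infected
        if i > peak_infected:
            peak_infected, peak_day = i, (step + 1) * dt

    return {
        "r0": beta / (delta + gamma),   # basic reproductive number of this sketch
        "total_infections": total_infections,
        "peak_day": peak_day,           # one simple measure of outbreak rapidity
        "final_state": (s, e, i, q, r),
    }


if __name__ == "__main__":
    baseline = simulate_seiqr(contact_rate=10, transmission_prob=0.05)
    distanced = simulate_seiqr(contact_rate=5, transmission_prob=0.05)
    print("baseline total infections:", round(baseline["total_infections"]))
    print("distanced total infections:", round(distanced["total_infections"]))
```

Comparing runs that differ in a single parameter, as in the two calls above, mirrors the abstract's approach of judging each measure by its effect on total infections, outbreak timing, and the reproductive number.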

Physiologically Based Pharmacokinetic (PBPK) Modeling of Per- and Polyfluoroalkyl Substances (PFASs): Per- and polyfluoroalkyl substances (PFASs) are a group of persistent manufacturing byproducts. Studies by several state environmental agencies have found PFASs in water sources, raising significant concern for human safety. This is a result of the chemicals’ ability to stay in the body for long periods of time and their resistance to chemical and thermal breakdown. In lab animals, PFASs have been shown to cause tumors as well as have reproductive, developmental, and immunological effects. PFAS research with regard to humans is incomplete, yet existing findings are notable. In this project, we use physiologically based pharmacokinetic (PBPK) modeling to represent the flow of PFASs through the body. PBPK modeling shows the concentration of a toxicant in different tissues and organs in the body, called compartments. Compartments include the fat, brain, lungs, gut, and liver. Equation parameters include the rate of blood flow to and from each compartment; partition coefficients, which quantify the difficulty of passing from blood to each compartment; and metabolism coefficients. The final model is a system of differential equations, each representing the rate of change of the toxicant within one compartment. Each compartment uses a modified version of a general equation to model the toxicant, based initially on ‘flow in’ minus ‘flow out’. We first replicate existing published data using MATLAB to test the general structure and coding of PFASs in PBPK modeling. Existing parameter values also come primarily from published data. We then compare our simulated data to experimental observations, resulting in a qualitative assessment of leading models.
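
The ‘flow in’ minus ‘flow out’ idea can be written out for a single compartment. The equations below are the common textbook form for a flow-limited tissue compartment and are offered only as a sketch; the project's actual equations are not given in the abstract. All symbols are assumptions introduced here: V_i is the compartment volume, C_i the toxicant concentration in compartment i, Q_i the blood flow to the compartment, C_a the arterial blood concentration, P_i the tissue-to-blood partition coefficient, and M_i a hypothetical loss term standing in for metabolism.

```latex
% Generic flow-limited compartment i (assumed textbook form)
\[
V_i \frac{dC_i}{dt}
  = \underbrace{Q_i C_a}_{\text{flow in}}
  - \underbrace{Q_i \frac{C_i}{P_i}}_{\text{flow out}}
\]

% Modified version for a compartment with a loss term such as metabolism
% (M_i is a hypothetical placeholder, not the project's rate law)
\[
V_i \frac{dC_i}{dt}
  = Q_i \left( C_a - \frac{C_i}{P_i} \right) - M_i(C_i)
\]
```

Writing one such equation per compartment, each with its own Q_i, P_i, and V_i, gives the coupled system of differential equations that the abstract describes.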