###### May 2015

DIMACS will welcome students participating in the 2015 Research Experiences for Undergraduates (REU) program on June 1, and it looks like the students from the 2014 program left them big shoes to fill. During the past year, the 2014 REU students have co-authored 13 papers, given 13 conference talks, and presented many more posters describing their research. In anticipation of the 2015 program, we highlight a few of the results from the 2014 group.

The REU was well represented at the 2015 Joint Math Meetings (JMM), which included talks by five students and posters presented by five more. Among the speakers was Elizabeth Yang (Princeton University) who worked with postdoctoral researcher Brian Nakamura on a project exploring the relationship between competition graphs and permutation patterns. Competition graphs trace their roots to ecology, where they were applied to the study of food webs. A competition graph has a node corresponding to each species in an ecosystem and a link between them if they share a common prey. Thus, two species are connected if they compete for food. More generally, competition graphs can be created for other directed graphs (not just those associated with food webs). They are constructed over the same set of nodes, with two nodes being connected by a link in the competition graph whenever they have a common successor node in the graph. Studying competition graphs that arise from other families of directed graphs is an area of active research. Nakamura and Yang’s work extends previous work from the literature on competition graphs arising from doubly partial orders on points in the plane. They illustrate the connection between permutations and directed graphs and then show that any doubly partial order corresponding to a finite set of points in the plane can be thought of as a doubly partial order on a permutation in terms of its induced directed graph and competition graph. The connection with permutations allows them to make observations on how the structure within a permutation leads to structure within the competition graph by showing that each edge in the competition graph corresponds to an instance of a 123 or a 132 pattern within the permutation. Additional related results are described in Nakamura and Yang’s complete work on the arXiv.

Two student papers were also included in the 2015 IEEE Symposium on Technologies for Homeland Security (IEEE HST). Roshan Thapaliya (Howard University) worked with postdoctoral researcher Brian Ricks in developing a simulation to assist homeland security professionals in planning for crowd control during emergencies. They developed their simulation based on a real incident at the Bradford City soccer stadium in 1985 that resulted in 56 deaths and more than 200 injuries. In what became the worst fire disaster in English soccer history, a small blaze turned into a panic-inducing inferno in less than three minutes. The simulation that Thapaliya and Ricks developed is envisioned as a tool to help planners understand the dynamics of a crowd during a panic situation. It enables them to explore the effect of factors such as: the location of fire and how fast it spreads; the location of obstacles and how quickly they can be surmounted; and the speed of pedestrian movement. A screen shot from their simulation is shown at left. On the left side, an animation shows the movement of people represented by red, blue, and black dots. The red dots represent people who always make good decisions; the blue are people who sometimes make mistakes; and the black dots represent people who have perished in the fire represented by the yellow disc. The gray bar represents a barrier that was present at Bradford and prominent in film and interviews about the fire. Simulations revealed that although the position of the obstacle had an impact on survival rate, its affect was not nearly as large as expected.

IEEE HST also included the work of fellow 2014 REU students Leo Behe (Lehigh University) and Zachary Wheeler (Grove City College) and 2013 DIMACS REU student Brian Knopp (Montana Tech). The three students worked with DIMACS faculty member William Pottenger and postdoctoral researcher Christie Nelson on exploring the relative performance of two machine learning methods for text classification. The methods they tested were the well-known naïve Bayes (NB) method, which assumes data are independent and identically distributed (IID), and a variant called higher-order naïve Bayes (HONB) which captures some of the dependencies within real-world data for better performance, but at a higher computational cost. In an effort to better predict when it is advantageous to invest the additional computational effort in using HONB, the students and their mentors tested whether there is a correlation between the degree to which the data follow a Zipf distribution and the performance of HONB. In Zipf-distributed data, the frequency of any value is inversely proportional to its rank in the frequency table. It has been observed that natural language data can often approximated with a Zipf distribution. Behe, Wheeler, and other members of their research team looked at seven different real-world data sets – both text and microtext – whose features were character substrings of length n (called n-grams). In tests with multiple values of n and multiple sizes of feature sets for each data set, the team’s initial results suggest that in instances where the Zipfian-ness is increasing but has not yet reached 100%, there does seem to be a correlation with the performance of HONB. Such cases generally involve smaller values of n and smaller feature sets.

Priscilla Lo (Rutgers University) worked with chemistry professor Wilma Olsen and postdoctoral researcher Nicolas Clauvelin in both the 2013 and 2014 DIMACS REU programs. In a forthcoming article in the Journal of Physics: Condensed Matter, they introduce a model and computational framework that make it possible to incorporate detailed structural features of DNA and histones in simulations of short chromatin constructs. Their results reveal a remarkable effect of nucleosome spacing on chromatin flexibility, with small changes in DNA linker length significantly altering the interactions of nucleosomes and the dimensions of the fiber as a whole. In addition, they found that changes in nucleosome positioning influence the statistical properties of long chromatin constructs. They observed that simulated chromatin fibers with the same number of nucleosomes exhibit polymeric behaviors ranging from Gaussian to worm-like, depending upon nucleosome spacing. These findings suggest that the physical and mechanical properties of chromatin can span a wide range of behaviors, depending on nucleosome positioning, and that care must be taken in the choice of models used to interpret the experimental properties of long chromatin fibers.

Printable version of this story: [PDF]

References:

L. Behe, Z. Wheeler, C. Nelson, B. Knopp, and W. Pottenger, “To Be or Not to Be IID: Can Zipf’s Law Help?” Proceedings of the 2015 IEEE Symposium on Technologies for Homeland Security.

N. Clauvelin, P. Lo, O. I. Kulaeva, E. V. Nizovtseva, J. Diaz-Montes, J. Zola, M. Parashar, V. M. Studitsky, and W. K. Olson, “Nucleosome Positioning and Composition Modulate in Silico Chromatin Flexibility,” to appear in Journal of Physics: Condensed Matter 27, (2015).

B. Nakamura and E. Yang, “Competition Graphs Induced by Permutations,” http://arxiv.org/abs/1503.05617.

R. Thapaliya and B. Ricks, “Using Crowd Simulation to Suggest Efficient Evacuation in Emergency Situation,” Proceedings of the 2015 IEEE Symposium on Technologies for Homeland Security.