Proposed projects for the 2011 DIMACS and DIMACS/DIMATIA REU Programs


Please revisit this page frequently, additional projects will be posted through January.


Project #: DC2011-01

Mentor: James Abello, abello at dimacs dot rutgers dot edu, DIMACS

Project Title: Anomaly Detection

This project addresses the problem of finding persistent patterns in evolving networks. A central question is the characterization of patterns that can be used as the basis to detect anomalous activities in time-evolving networks.

Requirements: Background in Statistics and SQL is desired. Participants are expected to develop scripts to statistically analyze a variety of data sets.

References:
http://www.cs.cmu.edu/~christos/TALKS/SIGMOD-07-tutorial/ J. Abello, T.Eliassi Rad, N. Devanur, "Detecting Novel Discrepancies in Communication Networks", 10th IEEE International Conference on Data Mining, Australia, Dec 2010.

* You must be a enrolled at a U.S. university to be eligible for this project.


Project #: DC2011-02

Mentor: James Abello, abello at dimacs dot rutgers dot edu, DIMACS

Project Title: DIMACS Graph Mining In this project we will build a variety of multi-graphs that can be extracted from this data set. We will then use our current set of graph analysis tools to parse, navigate, visualize and synthesize the findings. One central challenge is to devise methods that are privacy preserving.

Requirements: Some Knowledge of SQL, fundamental graph algorithms and C/C++ programming is a plus. However, there are several analysis tasks that can be performed using our plug in graph visualization system and that require only minimal amount of programming.

References:
[AvHK06] J. Abello, F. van Ham, and N. Krishnan, "Ask-GraphView-: A Large Scale Graph Visualization System", IEEE Transactions in Visualization and Computer Graphics, 12(5): 669-676 (2006).
J. Abello, F. V. Ham, N. Krishnan, "AskGraphView: A Large Graph Visualization System", IEEE Transactions in Visualization and Computer Graphics, 12(5): 669-676 (2006)

* You must be a enrolled at a U.S. university to be eligible for this project.


Project #: DC2011-03

Mentor: James Abello, abello at dimacs dot rutgers dot edu, DIMACS

Project Title: Name That Cluster

Given a user query, search engines generally return a very sizeable collection of possible answers. Clustering has been proposed as a tool to partition the possible answer set into more manageable subsets of related results. There is no current agreement on the preferred mode of presentation of these clusters. Currently, most search engines display the set of results on almost a pure textual form. However, relatively recently we have witnessed some timid attempts to use some graphical representations. This study is a first step to elucidate when and why text appears to outperform graphics for certain fundamental clustering related tasks. To this end we designed three interfaces to display flat clusters of user queries. The interfaces are enhanced with mechanisms by which users provide feedback about the relevance of a cluster for a pre-specified input query. Subsequently, users are asked to provide a name for a given cluster that best describes the cluster contents. In this project, we will analyze the results obtained from a web user study that started on May 2008.

Requirements: Basic Statistics, reading clustering related papers and preparing summary of findings.

Reference::
[ASGT07] J. Abello, J, Schulz, H, Gaudin, B, and Tominski, C (2007). Name That Cluster - Text vs. Graphics, IEEE InfoVis Conference, Sacramento, November 2007.

* You must be a enrolled at a U.S. university to be eligible for this project.


Project #: DC2011-04

Mentor: Ming Xie, mxie at stat.rutgers.edu, Department of Statistics and E.A. Elsayed, elsayed at rci.rutgers.edu, Department of Electrical Engineering

Project Title: Optimum Strategies of Container Inspection Scheduling and Sequencing at Port-of-Entry

Containers arriving at ports may contain undesirable contents such as drugs, radioactive material or chemical and biochemical agents. Inspection of a fraction of these containers (randomly or based on some risk factor) might lead to an improved security system. Clearly, there is a probability that a container with undesirable contents might enter into the system without detection or a "clean" container might be subjected to a thorough inspection.... The objective of this project is to determine the optimum inspection strategy and container inspection sequence that minimize the occurrence of the above probabilities as well as the delay of releasing the container at the port. Approaches such as the development of adaptive threshold levels of the inspection sensors based on the assigned risk factor might prove to an efficient inspection strategy. The research investigates new and efficient inspection strategies.

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.


Project #: DC2011-05

Mentor: Paul Kantor, kantor at scils.rutgers.edu, School of Communication, Information and Library Studies

Project Title: Equivalent Quadratic Pseudo-Boolean Functions

Unconstrained quadratic binary optimization is a widely used technique for numerous practical problems, and is also known to be a hard optimization problem. We know that every real-valued set-function on an n-set has a unique multilinear polynomial representation (called a pseudo-Boolean function). Quadratic pseudo-Boolean functions also have many other representations, one of which is a nonnegative combination of 2-variable parity functions. This form can be equivalently viewed as a weighted signed graph. Every spanning tree in this signed graph gives rise to a different pseudo-Boolean function, all of which, however, share the same value-set (same range). This implies that they have the same minimum (maximum) values, and we can use any one of these functions to determine the minimum of the originally given input function. Let us call two pseudo-Boolean functions equivalent if they have the same value-set. The project will explore the equivalent family of functions corresponding to the different spanning trees under the weighted signed graph representation. We plan to study several questions, in particular how the graph structure determines the availability of alternative quadratic multilinear polynomials, and how different equivalent representations change the set of local minima. Of course, the global minimum value remains the same, as noted above, but the different functions may have different local minima.

Prerequisites: An introductory course in O.R. or linear programming would be helpful, but not necessary.

Reference:
E. Boros, K. Elbassioni, V. Gurvich, L. Khachiyan, and K. Makino, On the Complexity of Some Enumeration Problems for Matroids," SIAM J. on Discrete Mathematics, 19 (2005), 966-984.
E. Boros, V. Gurvich, Perfect Graphs Are Kernel Solvable," Discrete Mathematics, 159 (1996), 35-55.
E. Boros, V. Gurvich, L. Khachiyan, and K. Makino, On Maximal Frequent and Minimal Infrequent Sets in Binary Matrices," Annals of Mathematics and Artificial Intelligence, 39 (2003) 211-221.
E. Boros, V. Gurvich, and R. Meshulam, Difference Graphs," Discrete Mathematics, 276 (2004), 59-64.
E. Boros, I. Lari, and B. Simeone, Block Linear Majorants in Quadratic 0-1 Optimization," Discrete Applied Mathematics, 145 (2004), 52-71.
E. Boros, and V. Menkov, Exact and Approximate Discrete Optimization Algorithms for Finding Useful Disjunctions of Categorical Predicates in Data Analysis," Discrete Applied Mathematics, 144 (2004), 43-58.

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.


Project #: DC2011-06

Mentor: Graham Cormode, graham at research.att.com, AT&T Labs

Project Title: Learning Attacks on Anonymized Data

There have been many recent attempts to create "anonymized" versions of data sets, by masking the true mapping from individuals to their sensitive data but still preserving some correlations. The security analysis of such attempts depends on a relatively naïve attacker model. However, it is often possible to learn more about the "true" data by applying a more sophisticated attack based on building an accurate classifier to use on the anonymized data. This project is to study the power of such attacks. Can we apply this approach to existing anonymization methods? How severe are they? To what extent can the effects be mitigated?

Prerequisites: A solid background in algorithmic thinking is needed to be able to apply the attacks. Courses in Machine Learning or Statistical Inference are a definite plus. Knowledge of Security/Privacy issues is also helpful.


Project #: DC2011-07

Mentor: Ji Young Choi, jychoi at ship.edu, DIMACS visitor

Project Title: The Boundaries of the Minimum Pk-Total Weights for Simple Connected Graphs

The sum of the weights on every path of length k in a graph with edges of weight 1 or -1 is called the Pk-total weight of the ± 1-assigned graph. The minimum of the absolute value of the Pk-total weights of a graph considering every possible ± 1 assignment is called the minimum Pk-total weight. This project is to study the boundaries of the minimum Pk-total weights for simple connected graphs.

* You must be a enrolled at a U.S. university to be eligible for this project.


Project #: DC2011-08

Mentor: Eugene Fiorini, gfiorini at dimacs.rutgers.edu, DIMACS

Project Title: Partitioning Graphs into Class-0 Pebbling Subraphs

Graph pebbling consists of placing "pebbles" or markers on vertices of a simple graph and moving the pebbles according to the following rule: There must be at least two pebbles on a given vertex in order to make a pebbling move. A pebbling move from one vertex to an adjacent vertex consists of removing one pebble from the initial vertex and moving a second pebble from the initial vertex to the terminal vertex. The pebbling number of a graph, p(G), is the minimum number of of pebbles needed to, after a series of pebbling moves, have a pebble at any given vertex, given any initial configuration of the p(G) pebbles. A class-0 graph G on n vertices is one in which p(G) = n. We define a k class-0 graph G as one in which a simple graph G can be partitioned into k class-0 subgraphs on n vertices. Previous results have shown that complete graphs on n vertices, Kn, are k class-0 for a given n > N for some integer N (Fiorini, Verbeck, de Silva). This project will explore additional questions related to pebbling and k class-0 graphs.

* You must be a enrolled at a U.S. university to be eligible for this project.


Project #: DC2011-09

Mentor: Paul Kantor, kantor at scils.rutgers.edu, School of Communication, Information and Library Studies

Project Title: Game Theoretic Aspects of Homeland Security

The effort to protect our country form the threat of smuggled weapons is a non-zero sum game, in which the opponent's estimates of the value of a success are different from our own estimates of the cost of failure. The problem is known under the name "Inspector Game". The summer project will examine this problem in the context of an ongoing research program on the problem of optimal detection of nuclear threats at borders, and within the country.

* You must be enrolled at a U.S. university to be eligible for this project.


Project #: DC2011-10

Mentor: Mentor: Wilma Olson, olson at rutchem dot rutgers dot edu, Chemistry, Rutgers University

Project Title: Protein-Induced DNA Looping

Many genetic processes are mediated by proteins that bind at separate, often widely spaced, sites on the double helix, tethering the intervening DNA into a loop [1, 2]. Examples of these processes include gene expression and its control, DNA replication, genetic rearrangements, multi-site cutting by restriction enzymes, and DNA packaging. A DNA loop stabilized by such long-range protein-protein contacts constitutes a discrete topological unit. As long as the ends of the DNA stay in place and the duplex remains unbroken, the linking number, Lk, or number of times the two strands of the double helix wrap around one another, is conserved. This constraint in Lk underlies the well known supercoiling of DNA: the stress induced by positioning the ends of the polymer in locations other than the natural (relaxed) state perturbs the overall coiling of the chain axis and/or the twisting of successive base-pairs in the intervening parts of the chain [3]. As a first step in understanding the effects of specific proteins and drugs on DNA looping, we propose to study the imposed bending and twisting of neighboring base pairs [4] in known complexes of proteins and drugs with double helical DNA stored in the Nucleic Acid Database [5]. By subjecting a DNA segment of the same chain length as that found in a given complex to the observed orientation, displacement, and superhelical stress and setting the elastic moduli to sufficiently high values, we can use existing programs, e.g., [6], to simulate the presence of a rigidly bound molecule at arbitrary positions on circular DNA molecules or to model specific systems in which DNA looping plays an important role, e.g., the lac repressor-operator assembly in EscherichiaA0coli [7]. One student could devote a summer project to the collection of geometric information. Other students could apply this information to simulations of DNA loops and folds in subsequent years. Prerequisites: Students should have an interest (but not necessarily formal training) in molecular biology, familiarity with linear algebra in order to understand the parameters used to describe DNA structure, and knowledge of basic chemistry and physics to understand the nature of the DNA molecules and the elastic treatment of the double helix.

[1] Halford, S. E., Gowers, D. M., and Sessions, R. B., ``Two are Better than One,'' Nature Struct. Biol., 7, (2000), 705-707.
[2] Schleif, R., ``DNA Looping,'' Annu. Rev. Biochem., 61, (1992), 199-223.
[3] Bauer, W. R. ``Structure and Reactions of Closed Duplex DNA,'' Annu. Rev. Biophys. Bioeng., 7, (1978), 287-313.
[4] Olson, W. K., Gorin, A. A., Lu, X.-J., Hock, L. M., and Zhurkin, V. B., ``DNA Sequence-dependent Deformability Deduced from Protein-DNA Crystal Complexes,'' Proc. Natl. Acad. Sci. USA, 95, (1998), 11163-11168.
[5] Berman, H. M., Olson, W. K., Beveridge, D. L., Westbrook, J., Gelbin, A., Demeny, T., Hsieh, S.-H., Srinivasan, A. R., and Schneider, B., ``The Nucleic Acid Database: A Comprehensive Relational Database of Three-dimensional Structures of Nucleic Acids,'' Biophys. J., 63, (1992), 751-759.
[6] Westcott, T. P., Tobias, I., and Olson, W. K., ``Modeling Self-contact Forces in the Elastic Theory of DNA Supercoiling,'' J. Chem. Phys., 107, (1997), 3967-3980.
[7] Muller-Hill, B., The Lac Operon, Walter de Gruyter, Berlin, 1996.

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.


Project #: DC2011-11

Mentor: Midge Cozzens, midge6930 at Comcast dot net, DIMACS

Project Title: A Game Theory Approach to Cascading Behavior in Networks

This project will study the rapid flow of information in large social networks. This is important in "viral marketing" and comes out of early work done on the spread of epidemics and social network theory. A graph is built with nodes representing individuals in a population and an edge between two nodes if they have some form of communication. Each node has a choice of two behaviors, an old behavior and a new behavior, and an incentive (payoff) for matching behaviors. Each node plays the coordinated game with each neighbor and the payoff for a node is the sum of the individual game payoffs. A sample problem is to determine the k most influential people in the network, an NP-hard problem. We seek variations on the model, such as weighting the edges with other payoffs (e.g., with marketing strategies of price reductions) to better determine the most influential players (often those to start marketing to).

References:
E. Adar, L. Zhang, A. Adamic, and R. M. Lukose, "Implicit structure and the dynamics of blogspace," Workshop on Weblogging Ecosystem, 2004.
D. Kempe, J. Kleinberg, and E. Tardos, "Maximizing the spread of influence in a social network". Proc. of the 9th International Conference on Knowledge Discovery and Data Mining, pp. 137-146. 2003.
J. Leskovec, L. Adamic, and B. Huberman, "The Dynamics of Viral Marketing," Proc. 7th ACM Converence on Electronic Commerce, 2006.
N. Nisan, T. Roughgarden, E. Tardos, and V. Vazirani, (eds.) "Cascading Behavior in Networks: Algorithmic and Economic Issues", Algorithmic Game Theory, Cambridge: Cambridge University Press, 2007 pp. 613-632.

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.


Project #: DC2011-12

Mentor: Alexander Schleip, schliep at cs.rutgers.edu, BioMaps Institute for Quantitative Biology and Computer Science

Project Title: Algorithm animation for bioinformatics algorithms

Gato (http://gato.sf.net), an open source software system written in Python, animates graph algorithms to show their dynamic behavior. A wide range of bioinformatics algorithms - such as aligning two sequences to assess their similarity, assembling complete genomes from fragments produced by sequencing, answering questions about the origin of species by constructing phylogenetic trees - can either be phrased as graph algorithms or can operate on graphs. Developing animations for selected algorithms will introduce participants to algorithmic aspects of bioinformatics and provide deep insights into the workings of selected algorithms.

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.


Project #: DC2011-13

Mentor: Eric Allender, allender at cs.rutgers.edu, Computer Science

Project Title: Small Complexity Classes

The P vs NP question is without doubt the most important open question in computational complexity theory. However, there is also great interest in understanding various subclasses of P. In this project, we will study a recent approach initiated by Cook and McKenzie, for showing that L is not equal to P (where L is the class of problems that can be solved in logarithmic space). One advantage of this approach is that there are a number of bite-sized problems that can be broken off and studied separately. * You must be a U.S. Citizen or Permanent Resident to be eligible for this project.


Project #: DC2011-14

Mentor: William Pottenger, billp at dimacs.rutgers.edu, DIMACS and Computer Science

Project Title: Higher Order Learning

In our work we are developing a framework termed Higher Order Learning that builds models by leveraging associations in the form of latent relationships in an entity-relation graph. A general multi-attribute entity-relation graph has nodes that are people, places and things, etc. Higher Order Learning violates the underlying assumption made in traditional machine learning algorithms that records are independent and identically distributed (Taskar et al., 2002). Although this assumption simplifies the underlying mathematics of statistical models and of the corresponding parameter estimation procedures, it actually does not hold for many real world applications (Getoor & Diehl, 2005). Machine learning methods based on Higher Order Learning leverage relations between objects and features in an entity-relation graph and in doing so, operate on a much richer data representation compared to the traditional feature vector form.

In this research, Higher Order Learning techniques have been investigated and applied to a number of interesting datasets including Border Gateway Protocol (cybersecurity), E-Commerce (web mining), and Nuclear Detection (homeland security). The approach has been shown to outperform the popular Support Vector Machine algorithm on benchmark textual data sets as well as real-world streaming data (Ganiz, Lytkin & Pottenger, 2009). REU summer interns will have an unparalleled opportunity to explore Higher Order Learning algorithms and applications in the context of law enforcement and other domains.

Requirements: Applicants should have a solid Java programming background. Familiarity with XML is a plus, as is experience in text processing with Perl or other scripting languages

You must be a enrolled at a U.S. university to be eligible for this project.


Project #: DC2011-15

Mentor: William Pottenger, billp at dimacs.rutgers.edu, DIMACS and Computer Science

Project Title: Entity Resolution

Emergencies invariably require crisis managers to field numerous calls for service such as 911, radio reports by police and fire teams, medical reporting, infrastructure support teams, etc. Technologies for Entity Resolution can be applied to handle the confusion by identifying geospatial, temporal, and content similarities in the messages, enabling first responders to resolve and identify the location and scope of multiple events. Entity Resolution also plays an important role in counter-terrorism, for example, in determining when two terrorist incidents are the same. This project will involve the research and development of Entity Resolution techniques to group incidents from the Internet Crime Complaint Center database (www.ic3.gov). REU summer interns will have an unparalleled opportunity to explore Entity Resolution algorithms and applications in the context of law enforcement and other domains.

Requirements: Applicants should have a solid Java programming background. Familiarity with XML is a plus, as is experience in text processing with Perl or other scripting languages

You must be a enrolled at a U.S. university to be eligible for this project.


Project #: DC2011-16

Mentor: Linda Ness, Telcordia Applied Research and Tamra Carpenter, tcar at dimacs.rutgers.edu, DIMACS

Project Title: Mathematical Models for Cognitive Systems

There is an increasing need for real-time networked computing systems that fuse streaming data from multiple sensors, perceive events, act on them and communicate effectively to other people and systems. This project will a) conceive and document examples of such systems, b) attempt to develop mathematical models for these systems by identifying suites of relevant mathematical models and algorithms, c) identify the strengths and limitations and d) glimpse the research forefront in this area.

Reference:
D. Bassu and C. Behrens, "Distributed LSI: Scalable Concept-based Information Retrieval with High Semantic Resolution," Proceedings of the 3rd SIAM International Conference on Data Mining, 2003.
R. R. Coifman, S. Lafon, A. B. Lee, M. Maggioni, B. Nadler, F. Warner, and S. W. Zucker, "Geometric Diffusions as a Tool for Harmonic Analysis and Structure Definition of Data: Multiscale Methods", PNAS 102:21 (2005), pp. 7432-7437.
P. W. Jones, M. Maggioni, and R. Schul, "Manifold Parametrizations by Eigenfunctions of the Laplacian and Heat Kernels," PNAS 105:6, pp. 1803-1808.
E. Liberty, F. Woolfe, P. G. Martinsson, V. Rokhlin, and M. Tygert, "A Fast Randomized Algorithm for the Approximation of Matrices", Applied and Computational Harmonic Analysis 25, pp.335-366, 2008.

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.


Project #: DC2011-17

Mentor: Nina Fefferman, Ecology, Evolution, and Natural Resources and DIMACS

Project Title: Developing and Applying Algorithms Inspired by Nature to Applications in Ecology and Epidemiology

Dr. Fefferman is a leading researcher with both the Department of Ecology, Evolution, and Natural Resources and DIMACS who works largely in the areas of epidemiology, evolutionary & behavioral ecology, and conservation biology. Dr. Fefferman will work with a qualified REU student to identify a new problem from her area of research that will be of interest to both the student and the mentor. This project will appeal to those students who are interested in developing and applying mathematical and computational models to biological systems.

* You must be a enrolled at a U.S. university to be eligible for this project.


Project #: DC2011-18

Mentor: Nina Fefferman, Ecology, Evolution, and Natural Resources and DIMACS

Project Title: Self-Organizing Biosurveillance Systems

Self-organizing systems for anomaly detection offer intriguing possibilities for enhancing practical health surveillance systems: they easily integrate information from multiple sources, at different locations, at different times, with different rates of collection and reporting, and are able to make accurate distributed decisions under shifting external conditions, relieving the need for centralized analysis. This project will involve helping create, implement, and test novel algorithms for biosurveillance using techniques borrowed from self-organizing systems for distributed decision making and anomaly detection found in nature. Prerequisite: Familiarity with computer programming.

* You must be a enrolled at a U.S. university to be eligible for this project.


Project #: DC2011-19

Mentor: Nina Fefferman, Ecology, Evolution, and Natural Resources and DIMACS

Project Title: Bio-inspired Anomaly Detection under Uncertainty - Evolutionary Implications

Many "algorithms" found in nature have been shown to be near-optimal for the environmental conditions inder which they evolved. However, very little work has investigated the robustness of these algorithms to applications outside their evolutionary context. This project focuses on two different types of altered environments: stochasticity in the environment itself and uncertainty in our assessment ability. Work will involve implementation and computational exploration of simulations to determine the efficiency of two specific bio-inspired algorithms for anomaly detection in these altered environments. Further, we will be interested in the evolutionary environmental implications of the success of these algorithms in this broader context, integrating basic research in theoretical biology with applied research in data analytics. Prerequisite: Familiarity with computer programming.

* You must be a enrolled at a U.S. university to be eligible for this project.


Project #: DC2011-20

Mentor: Bahman Kalantari, kalantar at cs.rutgers.edu, Computer Science

Project Title: Exploring Polynomiography and Its Algorithmic and Mathematical Applications

Dr. Bahman Kalantari is a leading researcher with the Department of Computer Science who was instrumental in developing the field of polynomiography. Polynomiography is a mathematically-inspired computer medium based on algorithmic visualizations of one of the most basic and fundamental tasks in science and mathematics: solving a polynomial equation. Polynomiography serves as a powerful medium for creativity, discovery, and learning, with numerous applications in education, math, science, art and design. Dr. Kalantari will work with a qualified REU student to identify a new problem from polynomiography that will be of interest to both the student and the mentor. In particular, some specific problems will be described from computational geometry, numerical analysis, discrete math, education, algorithmic mathematical art, fine art, and games. www.polynomiography.com

Reference:

B. Kalantari, A new medium for visual art: Polynomiography, Computer Graphics Quarterly, 38, 2004, 22-24.

B. Kalantari, Polynomiography and Applications in Art, Education, and Science, Computers & Graphic 28:3, (2004) pp. 417-430.

B. Kalantari, Polynomiography: From the Fundamental Theorem of Algebra to Art, LEONARDO, Vol. 38, No. 3, 233-238, 2005.

B. Kalantari, Polynomial Root-Finding and Polynomiography, World Scientific, New Jersey, 2008.

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.


Project #: DC2011-21

Mentor: Aaron Jaggard, adj at dimacs.rutgers.edu, Colgate University

Project Title: Characterizing the trade-off between network monitoring and network overload

Operators of networks are interested in performance anomalies that may occur, e.g., timing problems or traffic loss. To detect such anomalies, probing devices can be placed throughout the network; these devices can send test packets to each other, measure properties of the transmission (e.g., delay), and alert the operator when something abnormal is detected. However, there is a trade-off among the number of probe devices, amount of test traffic, and overhead that negatively affects regular network traffic.

Some work [1] has been done on algorithms to choose locations for probing devices such that the paths between the devices adequately cover the network). However, this work is only one of many possible ways to approach the trade-off between adequate probing and network overhead. The purpose of this project is to work towards characterizing this trade-off by (1) formalizing metrics by which to evaluate probing strategies and (2) identifying example strategies and studying their effects on such metrics.

Experience with graph algorithms is helpful but not necessary for this project.

[1] "Network Performance Anomaly Detection and Localization." P. Barford, N. Duffield, A. Ron, J. Sommers. INFOCOM 2008.

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.


Project #: PSG2011-22

Mentor: Matthew Stone, matthew.stone at Rutgers dot edu, Computer Science

Project Title: Modeling interactions between humans and dialogue systems

Start from an existing data set of interactions between human subjects and a dialogue system, including text, context and interpretation. Apply existing machine learning tools such as maximum entropy methods to build models of this data, such as: what dialogue moves speakers tend to make, how speakers tend to express those moves in words, or what outcomes different dialogue moves tend to achieve. Focus on the creation of features, guided by results on conversation from cognitive science, that capture the natural coherence of dialogue. Evaluate the new models offline using techniques such as cross-validation that assess how well they characterize the existing data.

Required skills: undergraduate coursework in AI, data mining or computational statistics; experience with cognitive psychology helpful but not essential.

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.


Project #: PSG2011-23

Mentor: Matthew Stone, matthew.stone at Rutgers dot edu, Computer Science

Project Title: Building computational dialogue systems

Carry out human-subjects experiments to contrast the performance of two different versions of a computational dialogue system that identifies objects with human users. One version will be a baseline system; the other will be an improved version (perhaps built in collaboration with a linguistics, psychology or AI student) with additional expressive capabilities or optimized decision-making algorithms. Develop hypotheses about qualitative and quantitative performance differences between the systems, come up with conditions and metrics to measure the differences and assess whether the hypotheses are confirmed using appropriate statistical tests.

Required skills: undergraduate coursework in user interface design or experimental methods (and ideally both).

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.


Project #: PSG2011-24

Mentor: Andy Nealen, andy at nealen dot net, Computer Science

Project Title: Shape Modeling and Animation

Computer software for creating, modifying and animating digital 3D shapes is a key component in computer aided design, engineering, computer animation and digital games. The student will be involved in research into dynamic hierarchical part hierarchies, which combine the interrelated topics of modeling and animating digital shapes into a coherent whole, as well as and simple and effective sketch-based computer tools, with the ultimate goal of making these tasks more accessible to amateurs and professionals alike. The research will build upon previous work, but put an emphasis on developing new sketch-based interfaces, algorithms and shape representations that allow us to combine the previously mostly disjoint, yet intrinsically connected tasks of shape modeling and animation.

Requirements: basic programming (in C/C++) and computer graphics (CS 428 equivalent)would be very beneficial.

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.


Project #: PSG2011-25

Mentor: Andy Nealen, andy at nealen dot net, Computer Science

Project Title: Shape, Color, Motion and Input in User-Centric Video Game Design

Video games have established themselves as an everyday cultural artifact. As a result, they have become a prominent, successful and expressive form. This research will explore how humans perceive a video game environment, and how this environment influences common control tasks. While early day designers of computer games were constrained by technical limitations, we believe that abstraction as a game design paradigm is an important information visualization tool, and that artificial constraints on video game rendering serve the goal of making the game more accessible. The research objectives are (a) to provide algorithms for the creation of an entire spectrum of pixel-, vector- and line-based visual abstractions from photographs and 3D models, and (b) to prove, by user study, that non-representational video games, combined with just the right kind (and complexity) of user control, are preferred by the majority of human users.

Requirements: basic programming (in C/C++) and computer graphics (CS 428 equivalent)would be very beneficial.

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.


Project #: PSG2011-26

Mentor: Eileen Kowler, kowler at rci dot Rutgers dot edu, Psychology

Project Title: Visual search

Visual search is an operation that is performed by humans and by machines (e.g., robots navigating through cluttered environments). This project will require students to develop a simple but realistic visual search task (e.g., finding an object hidden under another on the screen) where the likelihood of finding the target in a particular location is traded-off against the cost (search time or effort) of accessing the most likely locations, and compare the path humans prefer to the optimal strategy.

Requirements: Course work in statistics; Matlab programming skills.

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.


Project #: PSG2011-27

Mentors: Jacob Feldman, Jacob at ruccs dot rutgers dot edu, Psychology, Manish Singh, manish at ruccs dot rutgers dot edu, Psychology, Doug DeCarlo, decarlo at cs dot rutgers dot edu, Computer Science

Project Title: Part-based studies of shape.

Shapes are often represented according to their part structure: examples include a "skeletal" model developed by Singh and Feldman, and a hierarchical attachment model due to Mi and DeCarlo. Such models can be used to recognize objects and create abstract imagery by simplifying and removing parts, and computation of relevant geometric features such as local symmetries. Projects can refine these models by representing new kinds of part organization (e.g., operations that create texture ) or encoding constraints on part structure for specific classes of shape (biological forms or man-made artifacts or moving shapes).

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.


Project #: PSG2011-28

Mentor: Michael Littman, mlittman at cs dot rutgers dot edu, Computer Science

Project Title: Integrating perception and action via learning.

Models of animal and machine learning usually assume that the perceptual world is parsed into symbols or features before learning methods can be applied. This project will look at a perceptually simple domain---an 80s era video game---and learn perceptual features necessary for completing the task. The construction of the perceptual representation will be guided by the task. That is, instead of presuming the existence of a generic scene analyzer, we will experiment with methods for extracting features in a task-dependent manner ? good features are those that let the system better predict the future of the game and plan its actions accordingly . The resulting system will combine elements of machine vision and reinforcement learning

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.


Project #: PSG2011-29

Mentor: Doug DeCarlo, decarlo at cs dot rutgers dot edu, Computer Science

Project Title: Constructing and perceiving artistic imagery.

Line drawings can be clearer and simpler than realistic pictures. DeCarlo's models for creating line drawings from 3D objects computed in terms of surface geometry provides a starting point for projects using combinations of lines to create specific artistic styles (using a heuristic approach or one learned from empirical data from what lines artists draw), visualization experiments overlaying scientific data over line drawings, and perceptual studies to understand which renderings convey shape most effectively.

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.


Project #: PSG2011-30

Mentor: Elizabeth Torres, ebtorres at rci dot rutgers dot edu, Psychology

Project Title: Models of human movement

Build an interface that captures positions and orientations of real time hand movements (using available sensor data) and outputs "fake" feedback in real time. The underlying algorithm should fool a human performer into believing they are moving in a certain way, which in reality is different from the actual way in which they are moving. Trial to trial repeats should be stochastic enough that reveals no possible pattern, yet in the long run some pattern should emerge to drive the motor and perceptual systems to adopt new performance patterns.

Requirements: Knowledge of Matlab and C++ is recommended.

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.


Project #: PSG2011-31

Mentor: Thomas Papathomas, papathom at rci dot rutgers dot edu, Biomedical Engineering and Elizabeth Torres, ebtorres at rci dot rutgers dot edu, Psychology

Project Title: Vision for action versus vision for perception.

Research on the characteristics of the ventral neural system (also referred as the "what" system, or the "vision for perception" system) and the dorsal neural system (also referred as the "where" system, or the "vision for action" system) is one of the long-standing debates in vision research. This project combines two main tools to test the hypothesis that action is governed by the dorsal system, even in the presence of conflicting input from the ventral system. One tool is three-dimensional bistable stimuli that provide conflicting inputs; the other is an elaborate computer-monitored system that can detect and measure motion by body parts in real time.


Project #: DC2011-32

Mentor: Andrew Baxter, baxter at math dot rutgers dot edu, Mathematics

Project Title: Billiards in Polygons

Choose a polygon P and imagine a billiard ball bouncing about inside it, following the usual rules of reflection. What can one say about the paths in P generated in this way? In particular, how many paths repeat themselves after n bounces? What if one were given a sequence of sides of the polygon: is it possible to construct a path such that the nth bounce is on the nth side of the polygon?

References:

Baxter, Andrew M.; Umble, Ronald. Periodic orbits for billiards on an equilateral triangle. Amer. Math. Monthly 115 (2008), no. 6, 479?491.
(ArXiv version: http://arxiv.org/PS_cache/math/pdf/0509/0509292v7.pdf)

Gutkin, E. Billiard dynamics: a survey with the emphasis on open problems. Regul. Chaotic Dyn. 8 (2003), no. 1, 1?13.

Gutkin, E. Billiards in polygons. Phys. D 19 (1986), no. 3, 311?333

Tabachnikov, S. Billiards, Panoramas Et Synthµeses. Societe Mathematique de France, Paris, 1995.

Vorobets, Ya. B.; Gal'perin, G. A.; Stepin, A. M. Periodic billiard trajectories in polygons: generating mechanisms, Russian Math. Surveys 47 (1992) 5-80.

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.


Project #: DC2011-33

Mentor: Urmi Ghosh-Dastidar, ughosh-dastidar at citytech dot cuny dot edu, DIMACS

Project Title: A Location based Graph Theory Approach for Pattern Identification in Public Health

Graph theory is a division of mathematics that involves studying 'graphs.' Graphs identify various objects and classify the bonds between them. A graph is a collection of points (referred as nodes or vertices) joined by lines (referred as edges). Within the context of this study, nodes represent the locations and edges are determined by the common boundaries, income distribution, and population demography. The objective of this study is to use a graph theory approach for pattern recognition to identify potential public health problem.

Requirements: An introductory knowledge of Linear Algebra is a must and an introductory course in Differential Equations is a plus. Some knowledge in programming will be helpful.

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.


Project #: DC2011-34

Mentor: Endre Boros, boros at rutcor dot rutgers dot edu, RUTCOR

Project Title: Equivalent Quadratic Pseudo-Boolean Functions

Unconstrained quadratic binary optimization is a widely used technique for numerous practical problems, and is also known to be a hard optimization problem. We know that every real-valued set-function on an n-set has a unique multilinear polynomial representation (called a pseudo-Boolean function). Quadratic pseudo-Boolean functions also have many other representations, one of which is a nonnegative combination of 2-variable parity functions. This form can be equivalently viewed as a weighted signed graph. Every spanning tree in this signed graph gives rise to a different pseudo-Boolean function, all of which, however, share the same value-set (same range). This implies that they have the same minimum (maximum) values, and we can use any one of these functions to determine the minimum of the originally given input function. Let us call two pseudo-Boolean functions equivalent if they have the same value-set. The project will explore the equivalent family of functions corresponding to the different spanning trees under the weighted signed graph representation. We plan to study several questions, in particular how the graph structure determines the availability of alternative quadratic multilinear polynomials, and how different equivalent representations change the set of local minima. Of course, the global minimum value remains the same, as noted above, but the different functions may have different local minima.

Prerequisites: An introductory course in O.R. or linear programming would be helpful, but not necessary.

Reference:

E. Boros, K. Elbassioni, V. Gurvich, L. Khachiyan, and K. Makino, On the Complexity of Some Enumeration Problems for Matroids," SIAM J. on Discrete Mathematics, 19 (2005), 966-984.

E. Boros, V. Gurvich, Perfect Graphs Are Kernel Solvable," Discrete Mathematics, 159 (1996), 35-55.

E. Boros, V. Gurvich, L. Khachiyan, and K. Makino, On Maximal Frequent and Minimal Infrequent Sets in Binary Matrices," Annals of Mathematics and Artificial Inteligence, 39 (2003) 211-221.

E. Boros, V. Gurvich, and R. Meshulam, Difference Graphs," Discrete Mathematics, 276 (2004), 59-64.

E. Boros, I. Lari, and B. Simeone, Block Linear Majorants in Quadratic 0-1 Optimization," Discrete Applied Mathematics, 145 (2004), 52-71.

E. Boros, and V. Menkov, Exact and Approximate Discrete Optimization Algorithms for Finding Useful Disjunctions of Categorical Predicates in Data Analysis," Discrete Applied Mathematics, 144 (2004), 43-58.

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.


Project #: DC2011-35

Mentor: Cliff Behrens, cliff at research dot telcordia dot com, Telcordia Applied Research

Project Title: Agent Based Modeling of Crowd Behavior

The focus of this research is agent based modeling (ABM) of crowd behavior. We will use NetLogo to model crowding and the behavior that emerges from the interaction of individual human agents in crowded conditions. The emphasis will be on representing in models realistic assumptions about human cognition and emotion in situations such as emergency evacuations from built spaces, and comparison of the relative quality of knowledge obtained through crowd-sourcing and expert opinion sampling.

* You must be a enrolled at a U.S. university to be eligible for this project.


Please revisit this page frequently, additional projects will be posted
Index of REU program
DIMACS Home Page
Contacting the Center
Document last modified on February 15, 2011.