Research Computing - MGHPCC Feed

November Publications

Below is a selection of papers that appeared in November 2019 reporting the results of research using the Massachusetts Green High Performance Computing Center (MGHPCC), or acknowledging the use of Harvard’s Odyssey Cluster, Northeastern’s Discovery Cluster, the Boston University Shared Computing Cluster, and MIT’s Engaging Cluster, all of which are housed at the MGHPCC.

Mengxi Chen, Lin Hu, Ashwin Ramasubramaniam, and Dimitrios Maroudas (2019), Effects of pore morphology and pore edge termination on the mechanical behavior of graphene nanomeshes, Journal of Applied Physics, doi: 10.1063/1.5125107

Xueyan Feng, Christopher J. Burke, Mujin Zhuo, Hua Guo, Kaiqi Yang, Abhiram Reddy, Ishan Prasad, Rong-Ming Ho, Apostolos Avgeropoulos, Gregory M. Grason and Edwin L. Thomas (2019), Seeing mesoatomic distortions in soft-matter crystals of a double-gyroid block copolymer, Nature, doi: 10.1038/s41586-019-1706-1

Brandt Gaches (2019), The Impact of Protostellar Feedback on Astrochemistry, Doctoral Dissertation – University of Massachusetts Amherst, https://scholarworks.umass.edu/dissertations_2/1721

Aaron T. Lee, Stella S.R. Offner, Kaitlin M. Kratter, Rachel A. Smullen, and Pak Shing Li (2019), The Formation and Evolution of Wide-orbit Stellar Multiples in Magnetized Clouds, arXiv: 1911.07863 [astro-ph.GA]

D. Kirk Lewis and Sahar Sharifzadeh (2019), Defect-induced exciton localization in bulk gallium nitride from many-body perturbation theory, Physical Review Materials, doi: 10.1103/PhysRevMaterials.3.114601

Xiaorong Liu and Jianhan Chen (2020), Modulation of p53 Transactivation Domain Conformations by Ligand Binding and Cancer-Associated Mutations, Conference Proc. Pacific Symposium on Biocomputing 25:195, https://psb.stanford.edu/psb-online/proceedings/psb20/Liu_X.pdf

Clara Maurel, Patrick Michel, J. Michael Owen, Richard P. Binzel, Megan Bruck-Syal, G. Libourel (2019), Simulations of high-velocity impacts on metal in preparation for the Psyche mission, Icarus, doi: 10.1016/j.icarus.2019.113505

Lindsay J. Underhill, Chad W. Milando, Jonathan I. Levy, W. Stuart Dols, Sharon K. Lee, M. Patricia Fabian (2019), Simulation of indoor and outdoor air quality and health impacts following installation of energy-efficient retrofits in a multifamily housing unit, Building and Environment, doi: 10.1016/j.buildenv.2019.106507

Aristoula Selevou, George Papamokos, Tolga Yildirim, Hatice Duran, Martin Steinhart and George Floudas (2019), Eutectic liquid crystal mixture E7 in nanoporous alumina. Effects of confinement on the thermal and concentration fluctuations, RSC Adv., doi: 10.1039/C9RA08806G

Austin Edward Soplata (2019), A Thalamocortical Theory of Propofol Phase-amplitude Coupling, Doctoral Dissertation – Boston University School of Medicine, https://search.proquest.com/openview/2e231ecca119628be0bdf376b72bc920/1

Do you have news about research using computing resources at the MGHPCC? If you have an interesting project that you want to tell people about or a paper you would like listed, contact hlh@mit.edu

Links

MGHPCC Publications

UMASS Researcher Receives NSF Grant for GPU-Enabled HPC Cluster at MGHPCC

GPU facilities will be made available to researchers through Internet2 links and regional computing partnerships at MGHPCC. Read this story at umass.edu

Computational biophysicist Jianhan Chen, professor of chemistry and of biochemistry and molecular biology, together with colleagues, was recently awarded a two-year, $415,000 grant from the National Science Foundation (NSF) to support a broadly shared Graphics Processing Unit (GPU)-enabled high-performance computing cluster for the Institute for Applied Life Sciences (IALS). The award will fill what Chen calls “a critical need” for enabling computation-intensive research activities on campus.

Although the UMass system has a traditional shared cluster housed at the Massachusetts Green High Performance Computing Center (MGHPCC) in Holyoke, Chen points out, the current cluster has “minimal GPU capacity” and the campus and IALS need dedicated GPU computing hardware to support their research communities. His co-principal investigators on the project are Erin Conlon, mathematics and statistics, Peng Bai, chemical engineering, Chungwen Liang, IALS director of computational modeling, and Matthew Moore, food science.

“When we put in the grant we solicited comments and surveyed the need from IALS and identified 30 labs that could use it,” Chen explains. “They testified to the need and committed to the cost-share with NSF, which will come from IALS, the College of Natural Sciences, College of Engineering, central IT and the Vice Chancellor for Research and Engagement. This is going to be a really unique entity on campus, and it will have a far-reaching impact,” he predicts. “It will be busy from the get-go.”

“I think NSF saw how much need and support we have. I want to particularly highlight the important contributions of Chris Misra and John Griffin of IT,” he adds. “They have taken the leadership in providing technical support that’s absolutely critical to me and other principal investigators on campus. Without them and their excellent help, this will not work, period.”

The new cluster, once built up by Griffin, Chen, and his co-investigators, will be managed by the IALS Computational and Modeling Core to provide long-term stability for operation and management, serving 250 IALS-affiliated research labs across 27 departments and seven colleges. “The GPU facility offers high-speed single- and double-precision operations as well as extreme parallelism to enhance current activities that contribute to the open science movement,” project leaders state.

It will also contribute to efforts to integrate regional education, outreach, diversity, and economic activities, as the GPU facilities will be made available to researchers through Internet2 links and regional computing partnerships at MGHPCC. The researchers predict that the new cluster “will most likely lead to new developments and discoveries including novel GPU-enabled modeling and simulation technologies that may elucidate molecular mechanism of drug delivery, computational design catalysts for renewable energy and chemical synthesis, advanced computational analysis tools for next-generation informatics and big data, and improved understanding of risk and resistance to breast cancer.”

 

Story image: Helen Hill

Scaling HPC Education

With the explosion in artificial intelligence and machine learning, modeling, simulation, and data analytics, High Performance Computing (HPC) has grown to become an essential tool across academic disciplines. However, HPC expertise remains in short supply: people who know how to make HPC systems work, and how to use them, are hard to find. At September’s IEEE HPEC 2019 conference, a session chaired by Dr Julie Mullen (MIT LLSC) and Lauren Milechin (MIT EAPS), who are involved with the MGHPCC-hosted MIT SuperCloud System, provided a platform for members of local area research computing teams to share how they are scaling up HPC education in response.

In her presentation, Julie Ma (Project Lead, Northeast Cyberteam Initiative, MGHPCC) presented “Northeast Cyberteam: A Workforce Development Strategy for Research Computing,” describing activity within the Northeast Cyberteam Initiative, an NSF-funded effort, now in its third and final year, to increase the effective use of cyberinfrastructure by researchers and educators at small and mid-sized institutions in Northern New England by making it easier to obtain support from expert Research Computing Facilitators outside of their immediate academic networks.

“Our Northeast Cyberteam Research Computing Facilitators combine technical knowledge and strong interpersonal skills with a service mindset and use their connections with cyberinfrastructure providers to ensure that researchers and educators have access to the best available resources,” Ma explains. “It is widely recognized that such facilitators are critical to successful utilization of cyberinfrastructure, but in very short supply. The Northeast Cyberteam aims to build a pool of Research Computing Facilitators in the region and a process to share them across institutional boundaries. At the same time, we are providing experiential learning opportunities for students interested in becoming Research Computing Facilitators, as well as developing a self-service learning toolkit to provide timely access to information when it is needed.”

Mullen and Milechin, who are both intimately involved with the day-to-day running of the MIT SuperCloud System, used their presentation to describe the development and ongoing progress of a MOOC for teaching how to write scalable code through the use of standard workflows, and a SPOC (Small, Private, Online Course) for training on the specifics of using and running on the MIT SuperCloud System.

“Most HPC centers recognize the need to provide their users with HPC training; however, the limited time and resources available make this training and education difficult to scale to a growing and broadening audience. MOOCs (Massive Open Online Courses) can provide more accessible and scalable learning paths toward HPC expertise. In our talk, we presented MOOCs and their related technologies and teaching approaches, outlining how MOOC courses differ from face-to-face training, video-capturing of live events, webinars, and other established teaching methods with respect to pedagogical design, development issues, and deployment concerns,” says Milechin.

Robert Freeman directs Research Technology Operations at Harvard Business School (HBS). His talk “Humans in Scaling Research Computing Facilitation and Education” again focused on the challenge of growing the specialized workforce needed to respond to accelerating growth in campus research computing.

“Scaling people-efforts in HPC facilitation and education is an important problem as science and research programs are no longer isolated, work-in-silos efforts; and increasing complexity on all fronts drives an increased need for a better-trained workforce for both research and support staff,” says Freeman. “A number of communities, both local and national, are working on these efforts using multiple approaches. In my talk I discussed specific themes, highlighting the institutions and organizations (both historical and ongoing) that play a part, that have met success and encourage participation, and all of which are growing opportunities to democratize and evangelize these ever-changing advanced cyberinfrastructure resources: creating communities in education, bringing HPC/HTC (high throughput computing) to all disciplines, bringing facilitation approaches to everyone, and building communities for enabling research.”

In particular, Freeman drew attention to the Campus Research Computing Consortium (CaRCC), an organization seeking to develop, advocate for, and advance campus research computing, data, and the associated professions in response to the accelerating rate of change in the area, encouraging his audience to contribute to and help enrich the consortium’s efforts.

Brian Gregor is a member of the Research Computing Services (RCS) group at Boston University. He used his talk “Developing HPC Skills Across the University Community” to share the experience of BU RCS’s eight-member Applications Support team, who work with the more than 2,000 researchers across the university using BU’s Shared Computing Cluster.

“We teach tutorials at the start of each semester on a variety of programming topics from introductions to Linux and cluster programming to advanced programming in R and Python,” said Gregor. “In 2018 our tutorials had approximately 1200 attendees. Tutorial attendance continues to grow in 2019 with demand especially high for our set of Python tutorials.”

“Over the past four years,” he continued, “the team has become increasingly involved in teaching specialized topics for academic classes including cluster usage, HPC software, big data tools such as Spark, and other programming languages. In the academic arena, we assist in areas that include deep learning, computational biomedicine, and biostatistics, as well as a graduate data science program in the department of mathematics. As with our tutorials, the interest in our teaching at the academic level only continues to grow with each passing semester.”

To accommodate the increased demand for HPC skills education, the BU team plans to roll out OnDemand, giving academic classes and the research community easier access to the cluster.

Gregor said his team was also developing video versions of its introductory tutorials, both to meet growing demand and to free up time to introduce tutorials on more advanced topics, and is starting an internship program for graduate students interested in improving their HPC skills and in learning about research facilitation.

Related

HPEC ’19 MGHPCC News

Lincoln Laboratory’s new artificial intelligence supercomputer is the most powerful at a university

TX-GAIA is tailor-made for crunching through deep neural network operations.

Read this story at MIT News

The new TX-GAIA (Green AI Accelerator) computing system at the Lincoln Laboratory Supercomputing Center (LLSC) has been ranked as the most powerful artificial intelligence supercomputer at any university in the world. The ranking comes from TOP500, which publishes a list of the top supercomputers in various categories biannually. The system, which was built by Hewlett Packard Enterprise, combines traditional high-performance computing hardware — nearly 900 Intel processors — with hardware optimized for AI applications — 900 Nvidia graphics processing unit (GPU) accelerators.

“We are thrilled by the opportunity to enable researchers across Lincoln and MIT to achieve incredible scientific and engineering breakthroughs,” says Jeremy Kepner, a Lincoln Laboratory fellow who heads the LLSC. “TX-GAIA will play a large role in supporting AI, physical simulation, and data analysis across all laboratory missions.”

TOP500 rankings are based on the LINPACK benchmark, which is a measure of a system’s floating-point computing power, or how fast a computer solves a dense system of linear equations. TX-GAIA’s TOP500 benchmark performance is 3.9 quadrillion floating-point operations per second, or petaflops (though since the ranking was announced in June 2019, Hewlett Packard Enterprise has updated the system’s benchmark to 4.725 petaflops). The June TOP500 benchmark performance places the system No. 1 in the Northeast, No. 20 in the United States, and No. 51 in the world for supercomputing power. The system’s peak performance is more than 6 petaflops.
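
For context on how such a number is produced: LINPACK-style performance comes from timing the solution of a dense n-by-n linear system, which takes roughly (2/3)n³ floating-point operations. Below is a minimal Python sketch of that calculation; it is illustrative only, not the official HPL benchmark code.

```python
import time
import numpy as np

def linpack_style_estimate(n=4096, seed=0):
    """Estimate floating-point throughput from one dense solve.

    Solving Ax = b by LU factorization costs ~(2/3)*n**3 flops,
    the operation count the LINPACK benchmark is based on.
    """
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, n))
    b = rng.standard_normal(n)

    start = time.perf_counter()
    np.linalg.solve(A, b)              # dense LU factorization and solve
    elapsed = time.perf_counter() - start

    flops = (2.0 / 3.0) * n**3         # leading-order operation count
    return flops / elapsed             # flops per second

if __name__ == "__main__":
    print(f"~{linpack_style_estimate() / 1e9:.1f} gigaflops on this machine")
```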

But more notably, TX-GAIA has a peak performance of 100 AI petaflops, which makes it No. 1 for AI flops at any university in the world. An AI flop is a measure of how fast a computer can perform deep neural network (DNN) operations. DNNs are a class of AI algorithms that learn to recognize patterns in huge amounts of data. This ability has given rise to “AI miracles,” as Kepner puts it, in speech recognition and computer vision; the technology is what allows Amazon’s Alexa to understand questions and self-driving cars to recognize objects in their surroundings. The more complex these DNNs grow, the longer it takes for them to process the massive datasets they learn from. TX-GAIA’s Nvidia GPU accelerators are specially designed for performing these DNN operations quickly.

TX-GAIA is housed in a new modular data center, called an EcoPOD, at the LLSC’s green, hydroelectrically powered site in Holyoke, Massachusetts. It joins the ranks of other powerful systems at the LLSC, such as the TX-E1, which supports collaborations with the MIT campus and other institutions, and TX-Green, which is currently ranked 490th on the TOP500 list.

Kepner says that the system’s integration into the LLSC will be completely transparent to users when it comes online this fall. “The only thing users should see is that many of their computations will be dramatically faster,” he says.

Among its AI applications, TX-GAIA will be tapped for training machine learning algorithms, including those that use DNNs. It will more quickly crunch through terabytes of data — for example, hundreds of thousands of images or years’ worth of speech samples — to teach these algorithms to figure out solutions on their own. The system’s compute power will also expedite simulations and data analysis. These capabilities will support projects across the laboratory’s R&D areas, such as improving weather forecasting, accelerating medical data analysis, building autonomous systems, designing synthetic DNA, and developing new materials and devices.

TX-GAIA, which is also ranked the No. 1 system in the U.S. Department of Defense, will also support the recently announced MIT-Air Force AI Accelerator. The partnership will combine the expertise and resources of MIT, including those at the LLSC, and the U.S. Air Force to conduct fundamental research directed at enabling rapid prototyping, scaling, and application of AI algorithms and systems.

Story image: TX-GAIA is housed inside of a new EcoPOD, manufactured by Hewlett Packard Enterprise, at the site of the Lincoln Laboratory Supercomputing Center in Holyoke, Massachusetts. Photo: Glen Cooper

 

 

Supercomputer analyzes web traffic across entire internet

Researchers at the Lincoln Laboratory Supercomputing Center use the MIT SuperCloud to model web traffic potentially aiding cybersecurity, computing infrastructure design, Internet policy, and more.

Read this story at MIT News

Using a supercomputing system, MIT researchers have developed a model that captures what web traffic looks like around the world on a given day, which can be used as a measurement tool for internet research and many other applications.

Understanding web traffic patterns at such a large scale, the researchers say, is useful for informing internet policy, identifying and preventing outages, defending against cyberattacks, and designing more efficient computing infrastructure. A paper describing the approach was presented at the recent IEEE High Performance Extreme Computing Conference (HPEC 2019).

For their work, the researchers gathered the largest publicly available internet traffic dataset, comprising 50 billion data packets exchanged in different locations across the globe over a period of several years.

They ran the data through a novel “neural network” pipeline operating across 10,000 processors of the MIT SuperCloud, a system that combines computing resources from the MIT Lincoln Laboratory and across the Institute. That pipeline automatically trained a model that captures the relationship for all links in the dataset — from common pings to giants like Google and Facebook, to rare links that only briefly connect yet seem to have some impact on web traffic.

The model can take any massive network dataset and generate some statistical measurements about how all connections in the network affect each other. That can be used to reveal insights about peer-to-peer filesharing, nefarious IP addresses and spamming behavior, the distribution of attacks in critical sectors, and traffic bottlenecks to better allocate computing resources and keep data flowing.

In concept, the work is similar to measuring the cosmic microwave background of space, the near-uniform radio waves traveling around our universe that have been an important source of information to study phenomena in outer space. “We built an accurate model for measuring the background of the virtual universe of the Internet,” says Jeremy Kepner, a researcher at the MIT Lincoln Laboratory Supercomputing Center and an astronomer by training. “If you want to detect any variance or anomalies, you have to have a good model of the background.”

Joining Kepner on the paper are: Kenjiro Cho of the Internet Initiative Japan; KC Claffy of the Center for Applied Internet Data Analysis at the University of California at San Diego; Vijay Gadepally and Peter Michaleas of Lincoln Laboratory’s Supercomputing Center; and Lauren Milechin, a researcher in MIT’s Department of Earth, Atmospheric and Planetary Sciences.

Breaking up data

In internet research, experts study anomalies in web traffic that may indicate, for instance, cyber threats. To do so, it helps to first understand what normal traffic looks like. But capturing that has remained challenging. Traditional “traffic-analysis” models can only analyze small samples of data packets exchanged between sources and destinations limited by location. That reduces the model’s accuracy.

The researchers weren’t specifically looking to tackle this traffic-analysis issue. But they had been developing new techniques that could be used on the MIT SuperCloud to process massive network matrices. Internet traffic was the perfect test case.

Networks are usually studied in the form of graphs, with actors represented by nodes, and links representing connections between the nodes. With internet traffic, the nodes vary in sizes and location. Large supernodes are popular hubs, such as Google or Facebook. Leaf nodes spread out from that supernode and have multiple connections to each other and the supernode. Located outside that “core” of supernodes and leaf nodes are isolated nodes and links, which connect to each other only rarely.

Capturing the full extent of those graphs is infeasible for traditional models. “You can’t touch that data without access to a supercomputer,” Kepner says.

In partnership with the Widely Integrated Distributed Environment (WIDE) project, founded by several Japanese universities, and the Center for Applied Internet Data Analysis (CAIDA), in California, the MIT researchers captured the world’s largest packet-capture dataset for internet traffic. The anonymized dataset contains nearly 50 billion unique source and destination data points between consumers and various apps and services during random days across various locations over Japan and the U.S., dating back to 2015.

Before they could train any model on that data, they needed to do some extensive preprocessing. To do so, they utilized software they created previously, called the Dynamic Distributed Dimensional Data Model (D4M), which uses some averaging techniques to efficiently compute and sort “hypersparse data” that contains far more empty space than data points. The researchers broke the data into units of about 100,000 packets across 10,000 MIT SuperCloud processors. This generated more compact matrices of billions of rows and columns of interactions between sources and destinations.
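
D4M itself is an associative-array library developed at MIT, and its API is not reproduced here. The sketch below only illustrates, in plain SciPy, the kind of source-by-destination packet-count matrix the pipeline builds; the 100,000-packet window size comes from the text, while the function and toy data are invented for illustration.

```python
import numpy as np
from scipy import sparse

def traffic_matrix(src_ids, dst_ids, n_nodes):
    """Build a hypersparse source x destination packet-count matrix.

    Each (src, dst) pair contributes one packet; duplicate pairs are
    summed on conversion, so entry (i, j) counts packets from i to j.
    """
    counts = np.ones(len(src_ids), dtype=np.int64)
    return sparse.coo_matrix(
        (counts, (src_ids, dst_ids)), shape=(n_nodes, n_nodes)
    ).tocsr()

# Toy window: 8 packets among a million possible endpoints; the real
# pipeline did this for ~100,000-packet windows on 10,000 processors.
src = np.array([3, 3, 7, 7, 7, 42, 99, 3])
dst = np.array([7, 7, 3, 8, 8, 7, 100, 9])
M = traffic_matrix(src, dst, n_nodes=1_000_000)
print(M.nnz, "nonzero links in a", M.shape, "matrix")
```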

Capturing outliers

But the vast majority of cells in this hypersparse dataset were still empty. To process the matrices, the team ran a neural network on the same 10,000 cores. Behind the scenes, a trial-and-error technique started fitting models to the entirety of the data, creating a probability distribution of potentially accurate models.

Then, it used a modified error-correction technique to further refine the parameters of each model to capture as much data as possible. Traditionally, error-correcting techniques in machine learning will try to reduce the significance of any outlying data in order to make the model fit a normal probability distribution, which makes it more accurate overall. But the researchers used some math tricks to ensure the model still saw all outlying data — such as isolated links — as significant to the overall measurements.

In the end, the neural network essentially generates a simple model, with only two parameters, that describes the internet traffic dataset, “from really popular nodes to isolated nodes, and the complete spectrum of everything in between,” Kepner says.
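
The article does not give the model’s functional form. One standard two-parameter choice for heavy-tailed network degree data of this kind is a modified Zipf–Mandelbrot curve, p(d) ∝ 1/(d + δ)^α; the sketch below fits that assumed form to illustrative data with SciPy.

```python
import numpy as np
from scipy.optimize import curve_fit

def zipf_mandelbrot(d, alpha, delta):
    """Two-parameter heavy-tailed model: p(d) ~ 1 / (d + delta)**alpha."""
    return 1.0 / (d + delta) ** alpha

# Illustrative degree histogram: fraction of nodes with each degree d.
rng = np.random.default_rng(1)
degrees = np.arange(1.0, 200.0)
observed = zipf_mandelbrot(degrees, 1.8, 4.0)
observed *= 1 + 0.05 * rng.standard_normal(degrees.size)  # 5% noise

params, _ = curve_fit(
    zipf_mandelbrot, degrees, observed,
    p0=(1.0, 1.0), bounds=([0.1, 0.0], [10.0, 100.0]),
)
print(f"fitted alpha={params[0]:.2f}, delta={params[1]:.2f}")
```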

Using supercomputing resources to efficiently process a “firehose stream of traffic” to identify meaningful patterns and web activity is “groundbreaking” work, says David Bader, a distinguished professor of computer science and director of the Institute for Data Science at the New Jersey Institute of Technology. “A grand challenge in cybersecurity is to understand the global-scale trends in Internet traffic for purposes, such as detecting nefarious sources, identifying significant flow aggregation, and vaccinating against computer viruses. [This research group has] successfully tackled this problem and presented deep analysis of global network traffic,” he says.

The researchers are now reaching out to the scientific community to find their next application for the model. Experts, for instance, could examine the significance of the isolated links the researchers found in their experiments that are rare but seem to impact web traffic in the core nodes.

Beyond the internet, the neural network pipeline can be used to analyze any hypersparse network, such as biological and social networks. “We’ve now given the scientific community a fantastic tool for people who want to build more robust networks or detect anomalies of networks,” Kepner says. “Those anomalies can be just normal behaviors of what users do, or it could be people doing things you don’t want.”

 

Story image: Using a supercomputing system, MIT researchers developed a model that captures what global web traffic could look like on a given day, including previously unseen isolated links (left) that rarely connect but seem to impact core web traffic (right). Image courtesy of the researchers, edited by MIT News.

Related

HPEC’19 MGHPCC News

The MGHPCC Supercloud

October Publications

Below is a selection of papers that appeared in October 2019 reporting the results of research using the Massachusetts Green High Performance Computing Center (MGHPCC), or acknowledging the use of Harvard’s Odyssey Cluster, Northeastern’s Discovery Cluster, the Boston University Shared Computing Cluster, and MIT’s Engaging Cluster, all of which are housed at the MGHPCC.

Connor Bottrell, Maan H. Hani, Hossen Teimoorinia, Sara L. Ellison, Jorge Moreno, Paul Torrey, Christopher C. Hayward, Mallory Thorp, Luc Simard and Lars Hernquist (2019), Deep learning predictions of galaxy merger stage and the importance of observational realism, arXiv: 1910.07031 [astro-ph.GA]

Xiang Chen, Juan Ruiz Ruiz, Nathan Howard, Walter Guttenfelder, Jeff Candy, Jerry Hughes, Robert Granetz, Anne White (2019), Prediction of high-k electron temperature fluctuation in an NSTX H-mode plasma, abstract for 61st Annual Meeting of the APS Division of Plasma Physics, http://meetings.aps.org/Meeting/DPP19/Session/GP10.158

Cedric Flamant, Grigory Kolesov, Efstratios Manousakis, Efthimios Kaxiras (2019), Imaginary-Time Time-Dependent Density Functional Theory and Its Application for Robust Convergence of Electronic States, J. Chem. Theory Comput., doi: 10.1021/acs.jctc.9b00617

Peng Liang and Juan Pablo Trelles (2019), 3D numerical investigation of a free-burning argon arc with metal electrodes using a novel sheath coupling procedure, Plasma Sources Science and Technology, doi: 10.1088/1361-6595/ab4bb6

Peilong Li, Chen Xu, Hao Jin, Chunyang Hu, Yan Luo, Yu Cao, Jomol Mathew, Yunsheng Ma (2019), ChainSDI: A Software-Defined Infrastructure for Regulation-Compliant Home-Based Healthcare Services Secured by Blockchains, IEEE Systems Journal, doi: 10.1109/JSYST.2019.2937930

Philip L Pagano, Qi Guo, Chethya Ranasinghe, Evan Schroeder, Kevin Robben, Florian Häse, Hepeng Ye, Kyle Wickersham, Alán Aspuru-Guzik, Dan T. Major, Lokesh Gakhar, Amnon Kohen, Christopher M. Cheatum (2019), Oscillatory Active-site Motions Correlate with Kinetic Isotope Effects in Formate Dehydrogenase, ACS Catal., doi: 10.1021/acscatal.9b03345

Pranay Patil and Anders W. Sandvik (2019), Hilbert Space Fragmentation and Ashkin-Teller Criticality in Fluctuation Coupled Ising Models, arXiv: 1910.03714 [cond-mat.str-el]

Juan Ruiz Ruiz (2019), Validation of gyrokinetic simulations in NSTX including comparisons with a synthetic diagnostic for high-k scattering, abstract for 61st Annual Meeting of the APS Division of Plasma Physics, http://meetings.aps.org/Meeting/DPP19/Session/TI2.1

Debjani Sihi, Eric A. Davidson, Kathleen E. Savage, Dong Liang (2019), Simultaneous numerical representation of soil microsite production and consumption of carbon dioxide, methane, and nitrous oxide using probability distribution functions, Global Change Biology, doi: 10.1111/gcb.14855

Nia S. Walker, Rosa Fernández, Jennifer M. Sneed, Valerie J. Paul, Gonzalo Giribet, David Combosch (2019), Differential Gene Expression during Substrate Probing in Larvae of the Caribbean Coral, Molecular Ecology, doi: 10.1111/mec.15265

Cheng-Chiang Wu, Fay-Wei Li, Elena M. Kramer (2019), Large-scale phylogenomic analysis suggests three ancient superclades of the WUSCHEL-RELATED HOMEOBOX transcription factor family in plants, PLoS ONE, doi: 10.1371/journal.pone.0223521

Do you have news about research using computing resources at the MGHPCC? If you have an interesting project that you want to tell people about or a paper you would like listed, contact hlh@mit.edu

Links

MGHPCC Publications

The Computer Will See You Now

Vijaya B. Kolachalama, Ph.D., is an Assistant Professor at the Boston University School of Medicine. His area of expertise is in computational biomedicine and in particular machine learning and computer vision.

Activity in the Kolachalama Lab falls into two broad categories: machine learning and computer vision for precision medicine, and research into device-artery interactions, interfacial mechanics and drug delivery. In his machine learning work, Kolachalama makes extensive use of BU’s Shared Computing Cluster housed at the MGHPCC.

“Artificial intelligence is poised to help deliver precision medicine, yet achieving this goal is nontrivial,” says Kolachalama. “Machine learning and image processing techniques along with developments in software and hardware technologies allow us to consider questions across a range of scales,” he continues. “In the radiology and digital pathology work in my lab, we leverage these tools for pattern recognition and understanding pathophysiological mechanisms, paving the way for the development of new and, we hope, more effective and accessible, diagnostic and prognostic biomedical technologies geared to a range of diseases.”


A 2018 paper about his use of deep neural networks to help in the assessment of chronic kidney disease (Kolachalama, 2018) exemplifies his lab’s approach and methodologies, applying advanced machine learning techniques to systematize digital pathology.

“Chronic kidney damage is routinely assessed semiquantitatively by scoring the amount of disease seen in a renal biopsy sample,” explains Kolachalama. “Although image digitization and morphometric techniques have made quantifying the extent of damage easier, the advanced machine learning tools we are developing provide a more systematic way to stratify kidney disease severity.”

Speaking to BU News at the time, Kolachalama said that “while the trained eyes of expert pathologists are able to gauge the severity of disease and detect nuances of kidney damage with remarkable accuracy, such expertise is not available in all locations, especially at a global level.” Recognizing the potential of his team’s model to act as a surrogate nephropathologist, especially in resource-limited settings, Kolachalama noted that “if healthcare providers around the world can have the ability to classify kidney biopsy images with the accuracy of a nephropathologist right at the point-of-care, then this can significantly impact practice.”

More recently, Kolachalama has applied his machine learning techniques similarly in other areas, including Alzheimer’s disease and osteoarthritis.

“It remains difficult to characterize the source of pain in knee joints either using radiographs or magnetic resonance imaging (MRI),” he explains. “In work with Gary Chang (Chang et al, 2019), a Postdoctoral Associate in my lab, we were interested to see if using deep neural networks could distinguish knees with pain from those without it as well as to perhaps identify the structural features that are associated with knee pain.”

In that study, the team constructed a convolutional Siamese network to associate MRI scans obtained on subjects from the NIH’s Osteoarthritis Initiative with frequent unilateral knee pain, comparing the knee with frequent pain to the contralateral knee without pain in order to map model-predicted regions of high pain association. An expert radiologist then compared the MRI scans with the derived maps to identify the presence of abnormalities.

The radiologist’s review revealed that about 86% of the cases that were predicted correctly had effusion-synovitis within the regions that were most associated with pain, suggesting deep learning can be applied to assess knee pain from MRI scans.
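
The study’s exact architecture is not reproduced here; the PyTorch sketch below shows only the general shape of a convolutional Siamese comparison, a single shared encoder applied to both knees with the distance between their embeddings serving as the score. All layer sizes are placeholders.

```python
import torch
import torch.nn as nn

class SiameseKneeNet(nn.Module):
    """Shared CNN encoder applied to two MRI inputs; the distance
    between their embeddings serves as a pain-association score."""

    def __init__(self, embed_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, embed_dim),
        )

    def forward(self, painful_knee, pain_free_knee):
        za = self.encoder(painful_knee)       # same weights for both inputs
        zb = self.encoder(pain_free_knee)
        return torch.norm(za - zb, dim=1)     # Euclidean distance score

# One batch of four subjects, each with two single-channel 224x224 slices.
net = SiameseKneeNet()
score = net(torch.randn(4, 1, 224, 224), torch.randn(4, 1, 224, 224))
print(score.shape)  # torch.Size([4])
```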

In the context of Alzheimer’s disease, in a study this time working with Shangran Qiu, a graduate student in the Physics Department at BU, Kolachalama and co-authors applied their machine learning tools to explore whether combining MRI data with results from the Mini–Mental State Examination (MMSE) and logical memory tests could enhance the accuracy of diagnosing mild cognitive impairment (Qiu et al., 2018).

“We combined deep learning models trained on MRI slices to generate a fused MRI model using different voting techniques to predict normal cognition versus mild impairment. We then combined the fused MRI model with a second class of deep learning models trained on data obtained from NIH’s National Alzheimer Coordinating Center database containing individuals with normal cognition and mild cognitive impairment,” Kolachalama explains. “Our fused model did better than the individual models alone, with an overall accuracy of over 90%.”
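
As a minimal illustration of the voting step, assume each constituent model emits a probability of impairment; the two generic fusion rules sketched below (hard majority vote and mean probability) are standard choices, not necessarily the study’s exact scheme.

```python
import numpy as np

def fuse_by_majority(model_probs, threshold=0.5):
    """Hard voting: each model casts a binary vote, majority wins.

    model_probs: array of shape (n_models, n_subjects) holding each
    model's predicted probability of mild cognitive impairment.
    """
    votes = (model_probs >= threshold).astype(int)   # per-model decision
    return votes.mean(axis=0) > 0.5                  # majority vote

def fuse_by_mean(model_probs, threshold=0.5):
    """Soft voting: average the probabilities, then threshold once."""
    return model_probs.mean(axis=0) >= threshold

probs = np.array([[0.9, 0.2, 0.60],
                  [0.8, 0.4, 0.40],
                  [0.7, 0.1, 0.55]])
print(fuse_by_majority(probs))  # [ True False  True]
print(fuse_by_mean(probs))      # [ True False  True]
```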

Finally, in a collaboration between researchers in the Kolachalama Lab and researchers at Visterra Inc., a clinical-stage biotechnology company committed to developing innovative antibody-based therapies for the treatment of patients with kidney diseases and other hard-to-treat diseases, Kolachalama was recently involved in a study published in the journal Protein Engineering, Design & Selection (Wollacott et al., 2019), applying his machine-learning tools to quantify the “nativeness” of antibody sequences.

“Antibodies can be useful in treating, in particular, cancer and autoimmune diseases, and it has been shown that synthetic antibodies that more closely resemble their natural counterparts demonstrate improved rates of expression and stability,” explains Kolachalama. “Antibodies often undergo substantial engineering en route to the generation of a therapeutic candidate with good developability properties. Characterization of antibody libraries has shown that retaining native-like sequence improves the overall quality of the library. Using a bi-directional long short-term memory (LSTM) network model to score sequences for their similarity to naturally occurring antibodies, we were able to demonstrate our model was able to outperform other approaches at distinguishing human antibodies from those of other species.”
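
Below is a sketch of the general technique named in the quote: a bidirectional LSTM over amino-acid tokens, pooled into a per-sequence score. The vocabulary handling, layer sizes, and scoring head are placeholders, not the study’s trained model.

```python
import torch
import torch.nn as nn

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
TOKEN = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

class NativenessLSTM(nn.Module):
    """Bidirectional LSTM mapping an antibody sequence to a score in (0, 1)."""

    def __init__(self, embed_dim=16, hidden=32):
        super().__init__()
        self.embed = nn.Embedding(len(AMINO_ACIDS), embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden,
                            bidirectional=True, batch_first=True)
        self.head = nn.Linear(2 * hidden, 1)

    def forward(self, tokens):
        out, _ = self.lstm(self.embed(tokens))    # (batch, length, 2*hidden)
        pooled = out.mean(dim=1)                  # average over positions
        return torch.sigmoid(self.head(pooled))   # higher = more native-like

# Untrained example run on the start of a human heavy-chain framework.
seq = "EVQLVESGGGLVQPGGSLRLSCAAS"
tokens = torch.tensor([[TOKEN[aa] for aa in seq]])
print(NativenessLSTM()(tokens).item())
```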

“None of this work would be possible without access to BU’s Shared Computing Cluster and by extension the Massachusetts Green High Performance Computing Center in Holyoke where it is housed,” says Kolachalama. “Our access to them is indispensable in advancing our work towards developing clinically useful digital pathology tools.”

Story image: Trichrome-stained images from renal biopsy samples at different magnifications – image courtesy V. Kolachalama

Related Publications:

Kolachalama V.B., Singh P., Lin C.Q., Mun D., Belghasem M.E., Henderson J.M., Francis J.M., Salant D.J., Chitalia V.C.(2018), Association of Pathological Fibrosis with Renal Survival Using Deep Neural Networks, Kidney Int. Rep., doi: 10.1016/j.ekir.2017.11.002

Chang G.H., Felson D.T., Qiu S., Capellini T.D., Kolachalama V.B. (2019), Assessment of bilateral knee pain from MR imaging using deep neural networks, bioRxiv, doi: 10.1101/463497

Qiu, S., Chang G.H., Panagia M., Gopal D.M., Au R., Kolachalama V.B. (2018), Fusion of deep learning models of MRI scans, Mini–Mental State Examination, and logical memory test enhances diagnosis of mild cognitive impairment, Alzheimer’s & Dementia: Diagnosis, Assessment & Disease Monitoring, doi: 10.1016/j.dadm.2018.08.013

Andrew M Wollacott, Chonghua Xue, Qiuyuan Qin, June Hua, Tanggis Bohnuud, Karthik Viswanathan, Vijaya B Kolachalama (2019), Quantifying the nativeness of antibody sequences using long short-term memory networks, Protein Engineering, Design and Selection, doi: 10.1093/protein/gzz031

Links

Kolachalama Lab

Boston University Shared Computing Cluster

New AI Technology Significantly Improves Human Kidney Analysis BU News

Collaboration Awarded an NSF Grant of $5M to Create New Cloud Computing Testbed

BU ECE Professors Orran Krieger and Martin Herbordt Awarded $1.4M to Develop New Cloud Computing Platforms

Read this story via BU News

BOSTON, Mass. – Boston University Professors Orran Krieger and Martin Herbordt are among a team of researchers that will develop a testbed for research and development of new cloud computing platforms thanks to a grant from the National Science Foundation. The collaborative project includes UMass Amherst and Northeastern University and could reach a total of $5 million if fully funded after a review by the NSF in three years. The funding for Boston University is expected to total $2,050,000 over five years.

Cloud computing, the delivery of services over the internet, plays an important role in supporting most applications we currently use. Testbeds such as the one being constructed by the research team are critical for enabling new cloud technologies and making the services they provide more efficient and accessible to a wide range of scientists focusing on research in the area of computer systems.

Krieger and Herbordt are both professors of electrical and computer engineering (ECE) at Boston University’s College of Engineering. The project’s leadership team also includes Michael Zink, associate professor of electrical and computer engineering (ECE) at UMass Amherst, Peter Desnoyers, associate professor at Northeastern University Khoury College of Computer Sciences, and Miriam Leeser, professor at Northeastern University, College of Engineering. Zink says, “This testbed will accelerate innovation in cloud technologies, technologies affecting almost all of computing today.”

By providing capabilities that currently are only available to researchers within a few large commercial cloud providers, the new testbed will allow diverse communities to exploit these technologies, thus “democratizing” cloud-computing research and allowing increased collaboration between the research and open-source communities.

This project will construct and support a testbed for research and experimentation into new cloud platforms – the underlying software which provides cloud services to applications. Testbeds such as this are critical for enabling research into new cloud technologies. This is research that requires experiments that potentially can change the operation of the cloud itself.

The testbed will integrate capabilities developed in the CloudLab testbed with the Mass Open Cloud (MOC), an academic cloud hosted by Boston University’s Hariri Institute for Computing and developed through a partnership of academia (Boston University, Harvard University, Northeastern University, Massachusetts Institute of Technology, and the University of Massachusetts), government (Mass Tech Collaborative, USAF), and industry (Red Hat, Intel, Two Sigma, NetApp, Cisco). Over the past six years, the MOC has grown into a community of thousands of users and provides the ideal environment for this purpose. The testbed and the MOC are possible because of the Massachusetts Green High Performance Computing Center, a 90,000-square-foot, 15-megawatt facility located in Holyoke, MA, and established as a joint venture between Boston University, Harvard University, the Massachusetts Institute of Technology, Northeastern University, and the University of Massachusetts. “An important part of the MOC has always been to enable cloud computing research by the academic community,” says Krieger. “This project dramatically expands our ability to support researchers both by providing much richer capabilities and by expanding from a regional to a national community of researchers.”

The new testbed will combine proven software technologies with the MOC, enhanced with new technologies including programmable hardware called Field Programmable Gate Arrays (FPGAs). FPGAs provide capabilities not present in other facilities available to researchers today, enabling investigation into hardware acceleration techniques. “Field Programmable Gate Arrays (FPGAs) provide a new level of parallelization and acceleration in the cloud,” says Leeser. “This new infrastructure will be on the cutting edge and will allow many research areas, such as security and privacy, machine learning, and bioinformatics, to provide solutions faster and process even greater amounts of data.”

The combination of a testbed and production cloud allows for work on a larger scale compared to isolated testbeds, reproducible experimentation based on realistic user behavior and applications, as well as a model for transitioning successful research results to practice. All of these features are currently not offered by commercial cloud providers to computer systems researchers. The community outreach portion of the project aims to identify, attract, and retain interested researchers, and to educate them in the use of the facility. Tutorials, workshops, and webinars will offer training in the use of the testbed. The project will support educating the next generation of researchers in this field, and existing relationships with industrial partners of the affiliated production cloud will accelerate technology transfer from academic research to practical use. The testbed also offers a unique sustainability model by allowing additional computing resources to be dynamically moved from institutional uses into the testbed and back again, providing a path to growth beyond the initial testbed.

Additional information on the project is available here.

Links:

Mass Open Cloud

NEREN Seminar: “Bridging the Gap: AI and Machine Learning”

The Northeast Research and Education Network (NEREN) held its Fall 2019 seminar on October 4, 2019, at the Gateway City Arts Center, Holyoke, MA.

In collaboration with UMass Amherst and the Massachusetts Green High Performance Computing Center (MGHPCC), NEREN presented the seventh in a series of day-long seminars devoted to proposing and advancing ideas for regional collaboration in research computing and networking, this time on the theme of AI and Machine Learning.

The seminar began with a presentation by Erik G. Learned-Miller, Professor in the College of Information and Computer Sciences, the University of Massachusetts Amherst, who spoke about what can be learned from FDA processes for regulating the drug and medical device industries when seeking to build a roadmap for developing regulatory structures, processes, definitions, and rules to manage the complex implications of facial recognition technology.

The second speaker was Adrian Del Maestro, Associate Professor of Physics, University of Vermont, who spoke about the deployment of a new GPU-based supercomputer at the University of Vermont and the associated experience of engaging with a new class of users who are interested in employing machine learning in their research, but who may have minimal previous experience with high performance computing.

The third presentation was given by MIT Lincoln Laboratory Fellow Jeremy Kepner and concerned his team’s research performing hypersparse neural network analysis on the massive internet traffic dataset his team collects and curates using the MIT SuperCloud.

Finally, after lunch, Jim Hendler, Tetherless World Chair of Computer, Web and Cognitive Sciences, and Director RPI/IBM AI Research Collaboration, Rensselaer Polytechnic Institute (RPI) reflected on “Knowledge Representation in the Era of Deep Learning, Watson and the Semantic Web” arguing that while these technologies have been developing fast and are individually extremely powerful, they remain a long way from human intelligence.

Christopher Misra, Vice-Chancellor and CIO, University of Massachusetts Amherst, shared closing remarks.

To access a recording of the meeting or to be added to the NEREN mailing list contact NEREN Program Administrator, laurie@neren.org.

Story image credit: L. Robinson

Related

NEREN Seminar: “Bridging the Gap: Sharing computing resources across campuses” MGHPCC News

NEREN Seminar: “Bridging the Gap: Advancing regional collaboration and research IT collaboration” MGHPCC News

Links

NEREN website

Scaling High Performance Computing Education

A special event at the recent IEEE HPEC 2019 conference was a session focused on Scaling HPC Education, organized by Dr Julie Mullen (MIT LLSC) and Lauren Milechin (MIT EAPS), who are involved with the MGHPCC-hosted MIT SuperCloud System, and including a talk about the MGHPCC-sponsored Northeast Cyberteam Project.

In her presentation, Julie Ma (Project Leader, Northeast Cyberteam Initiative, MGHPCC) presented “Northeast Cyberteam: A Workforce Development Strategy for Research Computing,” describing activity within the Northeast Cyberteam Initiative, an NSF-funded effort, now in its third and final year, to increase the effective use of cyberinfrastructure by researchers and educators at small and mid-sized institutions in Northern New England by making it easier to obtain support from expert Research Computing Facilitators outside of their immediate academic networks.

“Our Northeast Cyberteam Research Computing Facilitators combine technical knowledge and strong interpersonal skills with a service mindset, and use their connections with cyberinfrastructure providers to ensure that researchers and educators have access to the best available resources,” Ma explains. “It is widely recognized that such facilitators are critical to successful utilization of cyberinfrastructure, but in very short supply. The Northeast Cyberteam aims to build a pool of Research Computing Facilitators in the region and a process to share them across institutional boundaries. At the same time, we are providing experiential learning opportunities for students interested in becoming Research Computing Facilitators, as well as developing a self-service learning toolkit to provide timely access to information when it is needed.”

In their presentation, Mullen and Milechin explored the applicability of Massive Open Online Courses (MOOCs) for scaling High Performance Computing (HPC) training and education.

“Most HPC centers recognize the need to provide their users with HPC training; however, the limited time and resources available make this training and education difficult to scale to a growing and broadening audience. MOOCs (Massive Open Online Courses) can provide more accessible and scalable learning paths toward HPC expertise. In our talk, we presented MOOCs and their related technologies and teaching approaches, outlining how MOOC courses differ from face-to-face training, video-capturing of live events, webinars, and other established teaching methods with respect to pedagogical design, development issues, and deployment concerns,” says Milechin.

Mullen and Milechin, who are both intimately involved with the day-to-day running of the MIT SuperCloud System, used their presentation to describe the development and ongoing progress of a MOOC for teaching how to write scalable code through the use of standard workflows, and a SPOC (Small, Private, Online Course) for training on the specifics of using and running on the MIT SuperCloud System.

Among the other invited talks were “Humans in Scaling Research Computing Facilitation and Education” given by Robert Freeman (Harvard Business School), and “Developing HPC Skills across the University Community” given by Brian Gregor (Boston University).

Related

HPEC ’19 MGHPCC News

Links

Northeast Cyberteam Initiative

MIT SuperCloud System

HPEC ’19

Organized by MIT Lincoln Laboratory, with sponsorship this year from Dell, Hewlett Packard, Intel Corp, NVIDIA, and MITRE, IEEE HPEC 2019 was held September 24th to 26th, 2019, in Waltham, MA.

This year’s speakers included Marc Hamilton (Nvidia VP of Solutions Architecture and Engineering) “GPU Accelerated Machine Learning”, Dr. Michael Rosenfield (IBM VP Data Centric Solutions) “Future Computing Systems”, Stan Reiss (Matrix Partners) “Brilliant Technologists Building Cool Stuff”, Prof. Julian Shun (MIT CSAIL) “Large-Scale Graph Processing”, Mark Hamilton (Microsoft) “Microsoft MLSpark: Unifying Machine Learning Ecosystems at Massive Scales”, Jaya Shankar (Mathworks) “Deploying High-Performance Deep Learning Applications”, Prof. Yunsi Fei (Northeastern University) “Evaluating Fault Resilience of Compressed Deep Neural Networks”, Dr. Robert Freeman (Director, Research Technology Operations, Harvard Business School) “Humans in Scaling Research Computing Facilitation and Education”, and Julie Ma (Project Leader, Northeast Cyberteam Initiative, MGHPCC) “Northeast Cyberteam: A Workforce Development Strategy for Research Computing.”

The meeting’s technical program included tutorials from industry and academic experts, demonstrations, and poster sessions, among them: the Julia Programming Language; Remote Sensing for Humanitarian Assistance & Disaster Relief (organized by Dr. John Aldridge, Dan Dumanis, and Andrew Weinert (MIT LL)); HPSEC: High Performance Secure Extreme Computing (organized by Dr. Michael Vai (MIT LL)); BRAIDS: Boosting Resilience through Artificial Intelligence and Decision Support (organized by Dr. Alexia Schultz (MIT LL), Dr. Pierre Trepagnier (MIT LL), Dr. Igor Linkov (Corps of Engineers), and Matthew Bates (Corps of Engineers)); Bridging Quantum and High Performance Computing (organized by Prof. Patrick Dreher (NC State Univ)); and Scaling HPC Education (organized by Dr. Julie Mullen (MIT LLSC) and Lauren Milechin (MIT EAPS)).

Special events included the MIT/Amazon/IEEE Graph Challenge, a GraphBLAS forum to define standard building blocks for graph algorithms (organized by Dr Timothy Mattson (Intel), Dr Scott McMillan (CMU SEI), and Dr Marcin Zalewski (PNNL)), and the IEEE Innovation in Societal Infrastructure Award.

The 2020 IEEE High Performance Extreme Computing Conference will take place from the 22nd to the 24th of September 2020 in Waltham, MA. The submission deadline for papers is May 18, 2020. Submission dates for GraphChallenge will be posted at graphchallenge.mit.edu. For more information visit ieee-hpec.org.

Related

Scaling High Performance Computing Education MGHPCC News

Links

HPEC’18 MGHPCC News

HPEC’17 MGHPCC News

 

FASRC Cluster Refresh 2019

With the decommissioning of the Odyssey Cluster earlier this month, Harvard’s FASRC (Faculty of Arts and Sciences Research Computing) recently announced details of its replacement.

The new cluster, which went online September 24, was provided by Lenovo and utilizes their SD650 NeXtScale servers with direct-to-node water-cooling for increased performance, density, ease of expansion, and controlled cooling.

The refreshed cluster, named Cannon in honor of astronomer Annie Jump Cannon, comprises 670 compute nodes plus 16 new GPU nodes. The cluster has 30,000 cores of Intel 8268 “Cascade Lake” processors; each node has 48 cores and 192 GB of RAM. The interconnect is HDR 100 Gbps InfiniBand (IB) connected in a single fat tree with a 200 Gbps IB core. The entire system is water-cooled, which allows the processors to run at a much higher clock rate of ~3.4 GHz. In addition to the general-purpose compute resources, FASRC is also installing 16 SR670 servers, each with four Nvidia V100 GPUs and 384 GB of RAM, all connected by HDR IB.

Links

Harvard University Research Computing

Related

Harvard Deploys Cannon, New Lenovo Water-Cooled HPC Cluster HPC Wire

Harvard Names New Lenovo HPC Cluster after Astronomer Annie Jump Cannon InsideHPC

September Publications

Below is a selection of papers that appeared in September 2019 reporting the results of research using the Massachusetts Green High Performance Computing Center (MGHPCC), or acknowledging the use of Harvard’s Odyssey Cluster, Northeastern’s Discovery Cluster, the Boston University Shared Computing Cluster, and MIT’s Engaging Cluster, all of which are housed at the MGHPCC.

Amin Aboubrahim and Pran Nath (2019), LHC phenomenology with hidden sector dark matter: a long-lived stau and heavy Higgs in an observable range, arXiv: 1909.08684 [hep-ph]

Gabriel Birzu, Sakib Matin, Oskar Hallatschek, Kirill S. Korolev (2019), Genetic drift in range expansions is very sensitive to density dependence in dispersal and growth, Ecology Letters, doi: 10.1111/ele.13364

William J. Cunningham, and Sumati Surya (2019), Dimensionally Restricted Causal Set Quantum Gravity: Examples in Two and Three Dimensions, arXiv: 1908.11647 [gr-qc]

Ewan S. Douglas, John Debes, Kian Milani, Yinzi Xin, Kerri L. Cahoy, Nikole K. Lewis, Bruce Macintosh (2019), Proceedings Volume 11117, Techniques and Instrumentation for Detection of Exoplanets, doi: 10.1117/12.2529488

Sam Hadden (2019), An Integrable Model for the Dynamics of Planetary Mean Motion Resonances, arXiv: 1909.05264 [astro-ph.EP]

Dahlia R. Klein, David MacNeill, Qian Song, Daniel T. Larson, Shiang Fang, Mingyu Xu, R. A. Ribeiro, P. C. Canfield, Efthimios Kaxiras, Riccardo Comin & Pablo Jarillo-Herrero (2019), Enhancement of interlayer exchange in an ultrathin two-dimensional magnet, Nature Physics, doi: 10.1038/s41567

Lachlan Lancaster, Cara Giovanetti, Philip Mocz, Yonatan Kahn, Mariangela Lisanti, David N. Spergel (2019), Dynamical Friction in a Fuzzy Dark Matter Universe, arXiv: 1909.06381 [astro-ph.CO]

Yuyu Li, Pablo Ferreyra, Anna K. Swan, Roberto Paiella (2019), Current-Driven Terahertz Light Emission from Graphene Plasmonic Oscillations, ACS Photonics, doi: 10.1021/acsphotonics.9b01037

Chungwen Liang, Sergey N. Savinov, Jasna Fejzo, Stephen J. Eyles, Jianhan Chen (2019), Modulation of Amyloid-β42 Conformation by Small Molecules Through Nonspecific Binding, J. Chem. Theory Comput., doi: 10.1021/acs.jctc.9b00599

Loïc M. Roch, Semion K. Saikin, Florian Häse, Pascal Friederich, Randall H. Goldsmith, Salvador León, and Alán Aspuru-Guzik (2019), From absorption spectra to charge transfer in PEDOT nanoaggregates with machine learning, arXiv: 1909.10768 [physics.chem-ph]

Jayson R. Vavrek, Brian S. Henderson, Areg Danagoulian (2019), Validation of Geant4’s G4NRF module against nuclear resonance fluorescence data from 238U and 27Al, Nuclear Instruments and Methods in Physics Research Section B: Beam Interactions with Materials and Atoms, doi: 10.1016/j.nimb.2019.08.034

Huan Yang, Prasad Bandarkar, Ransom Horne, Vitor B. P. Leite, Jorge Chahine, and Paul C. Whitford (2019), Diffusion of tRNA inside the ribosome is position-dependent, J. Chem. Phys., doi: 10.1063/1.5113814

Do you have news about research using computing resources at the MGHPCC? If you have an interesting project that you want to tell people about or a paper you would like listed, contact hlh@mit.edu

Links

MGHPCC Publications

Computing the Toll of Trapped Diamondback Terrapins

Reporting by Helen Hill for MGHPCC

Benjamin Levy is an Assistant Professor of Mathematics at Fitchburg State University, Massachusetts. His research is in biological modeling with an emphasis on population and infectious disease dynamics. Working with students Ben Burnett (UMass Dartmouth) and Abigail Waters (Suffolk University), Levy is leading a project assessing threats to the Diamondback Terrapin population in North Inlet-Winyah Bay, a site on the South Carolina coast, using a model he has been developing which, thanks to support from the Northeast Cyberteam, he is able to run on high-performance computing resources at the MGHPCC.

Diamondback terrapins live in estuarine habitats such as salt marshes, creeks, and tidal flats along the Atlantic and Gulf coasts of the United States. Crab traps pose a significant threat to the population as large numbers of individuals can become stuck and drown. Additionally, predators and humans regularly destroy the eggs that exist in nests along the shore. Levy has formulated an agent-based model (ABM) which uses data from a mark-recapture study to assess the impact of crab traps and nest disturbances on the longevity of a localized population. In particular, since individuals perish in crab traps relative to their size and sex, the model has been tailored to illuminate and quantify how the presence of traps can skew the sex ratio of the population.

Levy first developed two traditional matrix population models that encode key population parameters in order to determine which of them have the strongest impact on the future success of the population. However, since these relatively simple models do not allow for analysis of more complex factors that contribute to terrapin decline, Levy also developed a spatial, stochastic agent-based model of the terrapin population. The novelty of Levy's approach in this study is to use an agent-based model to validate and extend the results of standard matrix population models.
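
For readers unfamiliar with matrix population models, the sketch below shows the core mechanic: a vector of stage abundances is multiplied each year by a matrix of fecundity and survival rates, and the matrix's dominant eigenvalue gives the long-run growth rate. This is a minimal sketch in Python; the stages and rates are invented placeholders, not values estimated in Levy's study.

import numpy as np

# A minimal stage-structured (Leslie-type) projection. All stages,
# fecundities, and survival rates below are illustrative placeholders.
A = np.array([
    [0.0, 0.0, 4.0],   # fecundity: hatchlings produced per adult per year
    [0.1, 0.6, 0.0],   # survival into and within the juvenile stage
    [0.0, 0.2, 0.85],  # survival into and within the adult stage
])

n = np.array([100.0, 50.0, 25.0])  # hatchlings, juveniles, adults
for year in range(20):
    n = A @ n  # project the population one year forward

# The dominant eigenvalue of A is the asymptotic growth rate:
# values below 1 imply long-run decline.
growth_rate = np.abs(np.linalg.eigvals(A)).max()
print(f"Stage abundances after 20 years: {n.round(1)}")
print(f"Asymptotic growth rate: {growth_rate:.3f}")

Perturbing individual entries of the matrix and recomputing the growth rate is the standard way to identify which parameters matter most to the population's future.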

“The purpose of our agent-based formulation is to spatially model diamondback terrapins in the heterogeneous environment that is North Inlet-Winyah Bay (NIWB), with a focus on how crab traps and nest disturbances impact the longevity and sex-ratio of the population,” Levy says. “The form of our ABM reaches beyond the capabilities of the matrix models by allowing us to explicitly model crab traps in the actual geographic locations they are deployed in the real world. As a result, we are able to consider how the spatial locations of crab traps influence long-term dynamics and examine the resulting non-linear decrease in survival rates,” Levy explains. “The stochastic nature of the ABM also allows us to account for variations in key components by taking a probabilistic approach to the timing of nesting and brumation, the likelihood of nest disturbances, and the implementation of survival and movement rates.”
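
To make the trap mechanism concrete, here is a deliberately stripped-down sketch of the kind of size-dependent daily mortality rule described above. Every attribute and probability is a hypothetical stand-in; the real ABM adds spatial movement, nesting, brumation, and rates calibrated to the mark-recapture data.

import random

# Hypothetical agent and daily trap-mortality rule; all values invented.
class Terrapin:
    def __init__(self, sex, width_cm):
        self.sex = sex            # "M" or "F"
        self.width_cm = width_cm  # carapace width
        self.alive = True

P_ENCOUNTER = 0.001   # assumed daily chance of encountering a crab trap
OPENING_CM = 5.0      # assumed trap opening: only smaller animals enter

def make_terrapin(rng):
    sex = rng.choice("MF")
    # Sexual size dimorphism: females grow much larger than males.
    width = rng.uniform(3.0, 6.0) if sex == "M" else rng.uniform(5.0, 9.0)
    return Terrapin(sex, width)

def daily_step(agent, rng):
    # Size-dependent trap mortality: individuals small enough to fit
    # through the trap opening can enter and drown.
    if agent.alive and rng.random() < P_ENCOUNTER and agent.width_cm < OPENING_CM:
        agent.alive = False

rng = random.Random(42)
population = [make_terrapin(rng) for _ in range(1000)]
for _ in range(365 * 20):          # twenty years of daily time steps
    for agent in population:
        daily_step(agent, rng)

survivors = [a for a in population if a.alive]
males = sum(a.sex == "M" for a in survivors)
print(f"{len(survivors)} survivors, {males} of them male")

Because females grow larger than males, a fixed trap opening removes males at a higher rate, so the skewed sex ratio emerges from the rule rather than being imposed.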

However, stochastic models of this kind typically involve integrating forward the state of thousands of independent simulations, stepped at intervals of a single day over periods of twenty years and more, calling for bigger, faster computers to run on.
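
Computationally, this is an independent-replicate ensemble, a pattern that parallelizes naturally across the cores of a cluster node. A minimal sketch, with trivial placeholder dynamics standing in for the full ABM:

from multiprocessing import Pool
import random
import statistics

DAYS = 365 * 20  # twenty years of daily time steps

def run_replicate(seed):
    # Placeholder dynamics standing in for one full ABM simulation.
    rng = random.Random(seed)
    population = 1000.0
    for _ in range(DAYS):
        population *= 1.0 + rng.gauss(0.0, 0.0005)
    return population

if __name__ == "__main__":
    with Pool() as pool:  # one worker process per available core
        finals = pool.map(run_replicate, range(1000))  # 1,000 replicates
    print(f"mean final population: {statistics.mean(finals):.0f}")
    print(f"spread across replicates: {statistics.stdev(finals):.0f}")

Since replicates share nothing, the same pattern scales from a laptop to MGHPCC-hosted clusters simply by distributing seeds across more cores or nodes.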

“Access to high-performance computing resources provided by the Northeast Cyberteam has been invaluable in allowing us to perform large numbers of complex simulations. Without these resources, we would be unable to obtain meaningful results for our agent-based model for diamondback terrapins. We are grateful to have received these resources as results from this project help us better understand how crab traps and nest disturbances impact the longevity and sex-ratio of the terrapin population and assist stakeholders in making informed decisions,” Levy says.

Story image:  Diamondback Terrapin – Image credit: Jeffrey Shultz

To find out more about this work, contact Ben

About the Researcher: Ben Levy, Assistant Professor of Mathematics

Benjamin Levy is an assistant professor in the Department of Mathematics at Fitchburg State University in Fitchburg, MA. His research is in biological modeling with an emphasis on population modeling and infectious disease modeling.

Links

Benjamin Levy

The Northeast Cyberteam is a 3-year NSF-funded initiative to make advanced computing more accessible to researchers at small and mid-sized institutions in New England that do not have the critical mass to support these resources on campus. Our two-pronged approach: 1) build a regional pool of research computing facilitators covering a broad spectrum of domain knowledge, along with a process to share them across institutional boundaries, and 2) develop web-based tools and information to enable timely self-service learning about the wide range of subject matter relevant to advanced computing.

August Publications

Below is a selection of papers that appeared in July 2019 reporting the results of research using the Massachusetts Green High Performance Computing Center (MGHPCC), or acknowledging the use of Harvard’s Odyssey Cluster, Northeastern’s Discovery Cluster, the Boston University Shared Computing Cluster and MIT’s Engaging Cluster all of which are housed at the MGHPCC.

Ronen Bar-Ziv, Priyadarshi Ranjan, Anna Lavie, Akash Jain, Somenath Garai, Avraham Bar Hen, Ronit Popovitz-Biro, Reshef Tenne, Raul Arenal, Ashwin Ramasubramaniam, Luc Lajaunie, Maya Bar-Sadan (2019), Au-MoS2 Hybrids as Hydrogen Evolution Electrocatalysts, ACS Appl. Energy Mater, doi: 10.1021/acsaem.9b01147

Sarah C. Conner, Sara Lodi, Kathryn L. Lunetta, Juan P. Casas, Steven A. Lubitz, Patrick T. Ellinor, Christopher D. Anderson, Qiuxi Huang, Justin Coleman, Wendy B. White, Emelia J. Benjamin, and Ludovic Trinquart (2019), Refining the Association Between Body Mass Index and Atrial Fibrillation: G‐Formula and Restricted Mean Survival Times, Journal of the American Heart Association, doi: 10.1161/JAHA.119.013011

Shiang Fang, Stephen Carr, Ziyan Zhu, Daniel Massatt, Efthimios Kaxiras (2019), Angle-Dependent Ab initio Low-Energy Hamiltonians for a Relaxed Twisted Bilayer Graphene Heterostructure, arXiv: 1908.00058 [cond-mat.mes-hall]

Brandt A. L. Gaches, Stella S. R. Offner, and Thomas G. Bisbas (2019), The Astrochemical Impact of Cosmic Rays in Protoclusters II: CI-to-H2 and CO-to-H2 Conversion Factors, arXiv:1908.06999 [astro-ph.GA]

Kyle S. Honegger, Matthew A.-Y. Smith, Matthew A. Churgin, Glenn C. Turner, and Benjamin L. de Bivort (2019), Idiosyncratic neural coding and neuromodulation of olfactory individuality in Drosophila, PNAS, doi: 10.1073/pnas.1901623116

Jilei Liu et al (2019), Integrated Catalysis-Surface Science-Theory Approach to Understand Selectivity in the Hydrogenation of 1-Hexyne to 1-Hexene on PdAu Single-Atom Alloy Catalysts, ACS Catal, doi: 10.1021/acscatal.9b00491

Xiya Lu, Juan Duchimaza-Heredia, Qiang Cui (2019), Analysis of Density Functional Tight Binding with Natural Bonding Orbitals, J. Phys. Chem. A, doi: 10.1021/acs.jpca.9b05072

Atoallah Mesgarnejad and Alain Karma (2019), Vulnerable Window of Yield Strength for Swelling-Driven Fracture of Phase-Transforming Battery Materials, arXiv: 1908.02175 [physics.app-ph]

César Omar Ramírez Quiroz et al (2019), Interface Molecular Engineering for Laminated Monolithic Perovskite/Silicon Tandem Solar Cells with 80.4% Fill Factor, Advanced Functional Materials, doi: 10.1002/adfm.201901476

George Papamokos, Theodoros Dimitriadi, Dimitrios N. Bikiaris, George Z. Papageorgiou, George Floudas (2019), Chain Conformation, Molecular Dynamics, and Thermal Properties of Poly(n-methylene 2,5-furanoates) as a Function of Methylene Unit Sequence Length, Macromolecules, doi: 10.1021/acs.macromol.9b01320

V.V. Volkov, R. Chelli, R. Righini, C.C. Perry (2019), Indigo chromophores and pigments: Structure and dynamics, Dyes and Pigments, doi: 10.1016/j.dyepig.2019.107761

Yi Wen, Colin Ophus, Christopher S. Allen, Shiang Fang, Jun Chen, Efthimios Kaxiras, Angus I. Kirkland, Jamie H. Warner (2019), Simultaneous Identification of Low and High Atomic Number Atoms in Monolayer 2D Materials Using 4D Scanning Transmission Electron Microscopy, Nano Lett., doi: 10.1021/acs.nanolett.9b02717

Yaqin Xia, Qiuming Chen, Eugene I. Shakhnovich, Wenli Zhang, Wanmeng Mu (2019), Simulation-guided enzyme discovery: A new microbial source of cellobiose 2-epimerase, International Journal of Biological Macromolecules, doi: 10.1016/j.ijbiomac.2019.08.075

Huan Yang, Prasad Bandarkar, Ransom Horne, Vitor B. P. Leite, Jorge Chahine, and Paul C. Whitford (2019), Diffusion of tRNA inside the ribosome is position-dependent, The Journal of Chemical Physics, doi: 10.1063/1.5113814

Do you have news about research using computing resources at the MGHPCC? If you have an interesting project that you want to tell people about or a paper you would like listed, contact hlh@mit.edu

Links

MGHPCC Publications

IBM gives artificial intelligence computing at MIT a lift

Nearly $12 million machine, to be housed at the MGHPCC, will let MIT researchers run more ambitious AI models.

Read this story at MIT News

IBM designed Summit, the fastest supercomputer on Earth, to run the calculation-intensive models that power modern artificial intelligence (AI). Now MIT is about to get a slice.

IBM pledged earlier this year to donate an $11.6 million computer cluster to MIT modeled after the architecture of Summit, the supercomputer it built at Oak Ridge National Laboratory for the U.S. Department of Energy. The donated cluster is expected to come online this fall when the MIT Stephen A. Schwarzman College of Computing opens its doors, allowing researchers to run more elaborate AI models to tackle a range of problems, from developing a better hearing aid to designing a longer-lived lithium-ion battery.

“We’re excited to see a range of AI projects at MIT get a computing boost, and we can’t wait to see what magic awaits,” says John E. Kelly III, executive vice president of IBM, who announced the gift in February at MIT’s launch celebration of the MIT Schwarzman College of Computing.

IBM has named the cluster Satori, a Zen Buddhism term for “sudden enlightenment.” Physically the size of a shipping container, Satori is intellectually closer to a Ferrari, capable of zipping through 2 quadrillion calculations per second. That’s the equivalent of each person on Earth performing more than 10 million multiplication problems each second for an entire year, making Satori nimble enough to join the middle ranks of the world’s 500 fastest computers.

Rapid progress in AI has fueled a relentless demand for computing power to train more elaborate models on ever-larger datasets. At the same time, federal funding for academic computing facilities has been on a three-decade decline. Christopher Hill, director of MIT’s Research Computing Project, puts the current demand at MIT at five times what the Institute can offer.

“IBM’s gift couldn’t come at a better time,” says Maria Zuber, a geophysics professor and MIT’s vice president of research. “The opening of the new college will only increase demand for computing power. Satori will go a long way in helping to ease the crunch.”

The computing gap was immediately apparent to John Cohn, chief scientist at the MIT-IBM Watson AI Lab, when the lab opened last year. “The cloud alone wasn’t giving us all that we needed for challenging AI training tasks,” he says. “The expense and long run times made us ask, could we bring more compute power here, to MIT?”

It’s a mission Satori was built to fill, with IBM Power9 processors, a fast internal network, a large memory, and 256 graphics processing units (GPUs). Designed to rapidly process video-game images, graphics processors have become the workhorse for modern AI applications. Satori, like Summit, has been configured to wring as much power from each GPU as possible.

IBM’s gift follows a history of collaborations with MIT that have paved the way for computing breakthroughs. In 1956, IBM helped launch the MIT Computation Center with the donation of an IBM 704, the first mass-produced computer to handle complex math. Nearly three decades later, IBM helped fund Project Athena, an initiative that brought networked computing to campus. Together, these initiatives spawned time-share operating systems, foundational programming languages, instant messaging, and the network-security protocol, Kerberos, among other technologies.

More recently, IBM agreed to invest $240 million over 10 years to establish the MIT-IBM Watson AI Lab, a founding sponsor of MIT’s Quest for Intelligence. In addition to filling the computing gap at MIT, Satori will be configured to allow researchers to exchange data with all major commercial cloud providers, as well as prepare their code to run on IBM’s Summit supercomputer.

Josh McDermott, an associate professor at MIT’s Department of Brain and Cognitive Sciences, is currently using Summit to develop a better hearing aid, but before he and his students could run their models, they spent countless hours getting the code ready. In the future, Satori will expedite the process, he says, and in the longer term, make more ambitious projects possible.

“We’re currently building computer systems to model one sensory system but we’d like to be able to build models that can see, hear and touch,” he says. “That requires a much bigger scale.”

Richard Braatz, the Edwin R. Gilliland Professor at MIT’s Department of Chemical Engineering, is using AI to improve lithium-ion battery technologies. He and his colleagues recently developed a machine learning algorithm to predict a battery’s lifespan from past charging cycles, and now they’re developing multiscale simulations to test new materials and designs for extending battery life. With a boost from a computer like Satori, the simulations could capture key physical and chemical processes that accelerate discovery. “With better predictions, we can bring new ideas to market faster,” he says.
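
As a rough illustration of that workflow, the sketch below fits a regularized linear model to features summarizing a cell's early charging cycles in order to predict its cycle life. The features, synthetic data, and choice of model are assumptions made for the example, not the published method.

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: three hypothetical early-cycle features per
# cell (e.g., capacity-fade slope, discharge-curve variance, resistance).
rng = np.random.default_rng(0)
n_cells = 120
X = rng.normal(size=(n_cells, 3))
true_coef = np.array([-300.0, -150.0, -80.0])
y = 800 + X @ true_coef + rng.normal(scale=40.0, size=n_cells)  # cycle life

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Fit on most cells, then check predictive skill on held-out cells.
model = Ridge(alpha=1.0).fit(X_train, y_train)
print(f"held-out R^2: {model.score(X_test, y_test):.2f}")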

Satori will be housed at a former silk mill turned data center, the Massachusetts Green High Performance Computing Center (MGHPCC) in Holyoke, Massachusetts, and will connect to MIT via dedicated, high-speed fiber-optic cables. At 150 kilowatts, Satori will consume as much energy as a mid-sized building at MIT, but its carbon footprint will be nearly fully offset by the use of hydro and nuclear power at the Holyoke facility. Equipped with energy-efficient cooling, lighting, and power distribution, the MGHPCC was the first academic data center to receive LEED Platinum status, the highest green-building award, in 2011.

“Siting Satori at Holyoke minimizes its carbon emissions and environmental impact without compromising its scientific impact,” says John Goodhue, executive director of the MGHPCC.

Visit the Satori website for more information.

Story image credit: Helen Hill

July Publications

Below is a selection of papers that appeared in July 2019 reporting the results of research using the Massachusetts Green High Performance Computing Center (MGHPCC), or acknowledging the use of Harvard’s Odyssey Cluster, Northeastern’s Discovery Cluster, the Boston University Shared Computing Cluster and MIT’s Engaging Cluster all of which are housed at the MGHPCC.

Anders Andreassen, Ilya Feige, Christopher Frye, and Matthew D. Schwartz (2019), Binary JUNIPR: an interpretable probabilistic model for discrimination, arXiv: 1906.10137 [hep-ph]

T.T.G. Beucler (2019), Interaction between water vapor, radiation and convection in the tropics, MIT Doctoral Dissertation

Blakesley Burkhart and Philip Mocz (2019), The Self-gravitating Gas Fraction and the Critical Density for Star Formation, The Astrophysical Journal, doi: 10.3847/1538-4357/ab25ed

Phillip A. Cargile, Charlie Conroy, Benjamin D. Johnson, Yuan-sen Ting, Ana Bonaca, Aaron Dotter (2019), MINESweeper: Spectrophotometric Modeling of Stars in the Gaia Era, arXiv: 1907.07690 [astro-ph.SR]

Stephen Carr, Chenyuan Li, Ziyan Zhu, Efthimios Kaxiras, Subir Sachdev, and Alex Kruchkov (2019), Coexistence of ultraheavy and ultrarelativistic Dirac quasiparticles in sandwiched trilayer graphene, arXiv:1907.00952 [cond-mat.str-el]

Cheng-Chieh Chuang, Hsun-Chen Chu, Sheng-Bor Huang, Wei-Shun Chang, Hsing-Yu Tuan (2019), Laser-induced plasmonic heating in copper nanowire fabric as a photothermal catalytic reactor, Chemical Engineering Journal, doi: 10.1016/j.cej.2019.122285

Charlie Conroy, Ana Bonaca, Phillip Cargile, Benjamin D. Johnson, Nelson Caldwell, Rohan P. Naidu, Dennis Zaritsky, Daniel Fabricant, Sean Moran, Jaehyon Rhee, Andrew Szentgyorgyi (2019), Mapping the Stellar Halo with the H3 Spectroscopic Survey, arXiv: 1907.07684 [astro-ph.GA]

Qian Di, Heresh Amini, Liuhua Shi, Itai Kloog, Rachel Silvern, James Kelly, M. Benjamin Sabath, Christine Choirat, Petros Koutrakis, Alexei Lyapustin, Yujie Wang, Loretta J. Mickley, Joel Schwartz (2019), An ensemble-based model of PM2.5 concentration across the contiguous United States with high spatiotemporal resolution, Environment International, doi: 10.1016/j.envint.2019.104909

Richard M. Feder, Stephen K. N. Portillo, Tansu Daylan, and Douglas Finkbeiner (2019), Multiband Probabilistic Cataloging: A Joint Fitting Approach to Point Source Detection and Deblending, arXiv: 1907.04929 [astro-ph.IM]

William Fitzhugh, Fan Wu, Luhan Ye, Haoqing Su, Xin Li (2019), Strain‐Stabilized Ceramic‐Sulfide Electrolytes, Small, doi: 10.1002/smll.201901470

Joel Leja, Sandro Tacchella, and Charlie Conroy (2019), Beyond UVJ: More efficient selection of quiescent galaxies with UV / mid-IR fluxes, arXiv: 1907.02970 [astro-ph.GA]

Xiaorong Liu, Jianhan Chen (2019), Residual Structures and Transient Long-Range Interactions of p53 Transactivation Domain: Assessment of Explicit Solvent Protein Force Fields, Journal of Chemical Theory and Computation, doi: 10.1021/acs.jctc.9b00397

Andreea Panaitescu, Meng Xin, Benny Davidovitch, Julien Chopin, and Arshad Kudrolli (2019), Birth and decay of tensional wrinkles in hyperelastic sheets, arXiv: 1906.10054 [cond-mat.soft]

Joonha Park, Yves Atchadé (2019), Markov chain Monte Carlo algorithms with sequential proposals, arXiv: 1907.06544 [stat.CO]

Qinfang Sun (2019), Simulating Hydrogen Bonded Clusters and Zeolite Clusters for Renewable Energy Applications, Doctoral Dissertation – University of Massachusetts Amherst

Zachary Sherer (2019), A Comparison of Two Architectures for Breadth-First Search on FPGA SoC Platforms, Master’s Dissertation – University of Massachusetts Lowell

Luis Eladio Tirado (2019), On-the-move Detection of Security Threats Using 3D MM-Wave Radar Imaging, Doctoral Dissertation – Northeastern University

Do you have news about research using computing resources at the MGHPCC? If you have an interesting project that you want to tell people about or a paper you would like listed, contact hlh@mit.edu

Links

MGHPCC Publications

HolyokeCodes: Soccer Robots

Students at a HolyokeCodes Soccer Robotics program learn to play soccer with robots using JavaScript.

RoboCup is an international scientific initiative with the goal of advancing the state of the art of intelligent robots. When it was established in 1997, its original mission was to field a team of robots capable of winning against the human soccer World Cup champions by 2050.

Of the multiple RoboCup Soccer leagues that exist, the “Small Size” league is one of the oldest. Arjun Guha and Joydeep Biswas, assistant professors in the College of Information and Computer Sciences at UMass Amherst, and their students adapted the UMass Minute Bots RoboCup team robots (which they use to compete in Small Size league competition) to a JavaScript interface designed for high school students. Centrally controlled via radio, with perception based on a central overhead camera, the robots can travel up to 5 m/s and kick, chip-kick, and dribble a golf ball.

Over the course of a week in early July, local area students in grades 9 through 12 learned basic commands to control the robots, developed simple planning algorithms, and programmed behaviors for offensive and defensive roles, with the week culminating in a series of 2 v 2 matches. Students came away from the activity with a new appreciation for the problems of intelligent multi-robot/agent cooperation and control in the highly dynamic environment of even a toy soccer pitch, along with plenty of hands-on coding and robotics experience.

Links

Holyoke Codes

Container Gardening

On June 27, 2019, Holyoke Community College celebrated the graduation of 12 apprentices from its Freight Farms workforce training program, a new apprenticeship-based opportunity to learn the art and practice of cutting-edge hydroponics, based in a facility close to the MGHPCC.

Training takes place in a pair of refurbished shipping containers, or “Freight Farms,” located on Race Street in the heart of Holyoke. The re-purposed containers are used to grow leafy greens and herbs without the use of soil. Each of the container farms can hold 256 grow towers with a capacity of 10-12 plants each and can grow as much produce in a year as an acre of farmland.

The soil-free facilities use water, mineral nutrients and LED lights for growing leafy greens like lettuce – image courtesy Holyoke Community College.

From learning how to seed, transplant, harvest, and package crops to maintaining a safe, clean, and organized work area, apprentices graduate from the program with a full working knowledge of how to work in a hydroponic facility and follow food safety standards.

The Holyoke Freight Farm is a collaboration between Holyoke Community College, Nuestras Raices and the City of Holyoke, with support from MassDevelopment’s Transformative Development Initiative. MGHPCC is happy to be a neighbor and to assist, as the fiscal sponsor for a TDI Fellows Cohort from MassDevelopment.
