Service Models

High performance computing (HPC) changes how research is conducted and expands the reach of scientific inquiry. A large, growing and efficiently managed HPC cluster is instrumental to attracting talented researchers to the University of Massachusetts and to improving research grant application success. Few individual research grants, laboratories or programs can fund a cluster powerful enough to fulfill their informatics needs. Pooling investments in a massively scaled, equitably shared HPC resource maximizes the value to the research community and the university.

The Massachusetts Green High Performance Computing Center (MGHPCC) is a unique initiative by the University of Massachusetts, Boston University, Harvard University, MIT, Northeastern University and the Commonwealth of Massachusetts to deliver a world-class computational infrastructure to the research and academic community, indispensable in the increasingly data-rich environment of modern science. As computation becomes ever more integral to basic and applied research, the MGHPCC represents a critical piece of infrastructure that will fuel the innovation economy of the Commonwealth.

The five campuses of the University of Massachusetts form a "consortium within the consortium" at the MGHPCC, deploying their own shared high performance computing cluster at the MGHPCC facility in Holyoke. The rationale for a UMass HPC cluster is simple: it is a cost-effective and powerful resource for the UMass research community. The UMass HPC cluster is governed by a research advisory council and a user group.

Three service models are available for UMass high performance computing at the MGHPCC: UMass Shared Service, Condo and Co-Lo (co-location).

UMass Shared Service

This model allows all UMass research faculty to access the MGHPCC environment as users of large-scale cluster resources. Users manage and submit jobs through the general cluster-based job scheduler (LSF), and access follows Fair Share policies. This model maximizes efficient use of the shared resource. A fair share queuing model, governed by Research & IS, allows each lab and university to leverage HPC resources far beyond their individual means. By adjusting the priority of waiting jobs according to each job's resource needs, expected run time and the submitter's resource usage history, the UMass HPC cluster will dramatically increase compute access across all campuses. All management for these services will be supplied by MGHPCC central resources. Compute resources are seeded initially by a shared budget. The shared cluster can grow through campus- or PI-based contributions, with the understanding that all compute resources in the shared cluster, current and new, are available under fair share and will remain in the shared cluster.
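As a rough illustration of how such a policy behaves, the sketch below computes a dynamic priority that falls as a lab's recent usage and a job's requested resources grow. It is a conceptual example only, not LSF's actual fair share algorithm and not the policy configured for the UMass cluster; the lab names, weights and usage figures are invented for illustration.

    # Conceptual fair-share sketch (illustrative assumptions throughout).
    from dataclasses import dataclass

    @dataclass
    class Job:
        lab: str        # submitting lab or account
        cores: int      # requested cores
        hours: float    # expected wall-clock run time in hours

    # Hypothetical recent usage per lab, in core-hours.
    usage_history = {"lab_a": 120_000.0, "lab_b": 3_000.0}

    def dynamic_priority(job: Job, base: float = 100.0) -> float:
        """Heavier recent usage and larger requests lower a waiting job's priority."""
        usage_penalty = usage_history.get(job.lab, 0.0) / 10_000.0
        size_penalty = (job.cores * job.hours) / 100.0
        return base - usage_penalty - size_penalty

    pending = [Job("lab_a", 64, 12.0), Job("lab_b", 16, 4.0)]
    # Dispatch order: highest dynamic priority first, so the lightly used lab runs sooner.
    for job in sorted(pending, key=dynamic_priority, reverse=True):
        print(f"{job.lab}: priority {dynamic_priority(job):.1f}")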

Condo Service

Condo-based services enable researchers to supply compute and storage that comply with MGHPCC standards and use the common LSF scheduling software. All management for these services will be provided by MGHPCC central resources. Fair share amongst condo tenants is required. Priority queues with pre-emption (with member-determined policy parameters for priority) are supported. Storage architecture is identical to that used for the shared service and is part of a larger university-wide data center optimization strategy.

Co-lo (Private) Service

Co-lo service is also commonly described as "ping and power" service. It enables researchers to deploy a net-new HPC environment that they manage and support themselves. The customer supplies all hardware and software, including networking equipment. Management of all equipment (except for the WAN link) is the responsibility of the researcher.

Principal Investigators may select the service model that best meets their research group's needs. Resources are not shared between different service models. The table below summarizes the three models.

 

                          Shared            Condo                        Co-Lo
Hardware Standards        Yes¹              Yes¹                         Enterprise Class
Hardware Acquisition      Campus/Dept/PI    Dept/PI                      Dept/PI
LSF/OS licensing          Yes               Yes                          NA
Queue                     Fair Share        Priority with Pre-emption;   NA
                                            Fair Share within Condo
Shared Services Support   Yes               Yes                          No
Network Management        Yes               Yes                          Yes (WAN only)²
Storage                   Yes³              Yes³                         No
IdM Integration           Yes               Yes                          No
Operating Expenses        Campus            Campus                       Dept/PI

¹ Coordinated by UMassMed; standard vendors enable shared support and minimize costs

² May require additional investments for network infrastructure

³ May require additional investments based on capacity requirements

Fair Share Queue: All resources are available in accordance with agreed policies, and job queue priorities are adjusted algorithmically based on usage to maximize efficient use of resources.

Priority Queue with Pre-emption: Similar to fair share, except that jobs submitted by a "tenant" are assigned a higher priority and will pre-empt running jobs on the tenant's contributed resources.
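To make the distinction concrete, the sketch below shows the pre-emption decision in a condo: a guest job running on a tenant's contributed nodes is displaced when that tenant submits work, while jobs elsewhere continue to compete under fair share. The data structures, names and policy details are assumptions for illustration, not the behavior of LSF or the condo policy actually configured at the MGHPCC.

    # Conceptual condo pre-emption sketch (illustrative assumptions throughout).
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class RunningJob:
        owner: str       # lab that submitted the job
        node_owner: str  # lab that contributed the node the job occupies

    def preemption_victim(running: list, tenant: str) -> Optional[RunningJob]:
        """A condo tenant's new job may displace a guest job on the tenant's own nodes."""
        for job in running:
            if job.node_owner == tenant and job.owner != tenant:
                return job   # guest job on tenant hardware: eligible for pre-emption
        return None          # nothing to pre-empt; the tenant job waits under fair share

    running_jobs = [
        RunningJob(owner="lab_b", node_owner="lab_a"),  # guest filling lab_a's condo nodes
        RunningJob(owner="lab_a", node_owner="lab_a"),
    ]
    print(preemption_victim(running_jobs, tenant="lab_a"))
    # -> the lab_b guest job, which would be suspended or requeued so lab_a's job can start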