Internet2

Exploring Clouds for Acceleration of Science

Overview

Internet2 leads the "Exploring Clouds for Acceleration of Science (E-CAS)" project in partnership with representative commercial cloud providers to accelerate scientific discoveries. The effort demonstrates the effectiveness of commercial cloud platforms and services in supporting applications critical to growing academic and research computing and computational science communities, and will illustrate the viability of these services as an option for leading-edge research across a broad scope of science. The project helps researchers understand the potential benefit of larger-scale commercial platforms for simulation and application workflows such as those currently using NSF's High-Performance Computing (HPC), and explores how scientific workflows can innovatively leverage advancements in real-time analytics, artificial intelligence, machine learning, accelerated processing hardware, automation in deployment and scaling, and management of serverless applications in order to provide digital research platforms to a wider range of science. The project aims to accelerate scientific discovery through integration and optimization of commercial cloud service advancements with NSF's cyberinfrastructure resources; identify gaps between cloud provider capabilities and their potential for enhancing academic research; and provide initial steps in documenting emerging tools and leading deployment practices to share with the community. 

Cloud computing has revolutionized enterprise computing over the past decade, and it has the potential to have a similar impact on campus-based scientific workloads. The E-CAS project explores this potential by providing two phases of funded campus-based projects addressing acceleration of science. Each phase is followed by a community-led workshop to assess lessons learned and to define leading practices. Projects are selected from two categories: time-to-science (to achieve the best time-to-solution for scientific application/workflows that may be time or situation sensitive) and innovation (to explore innovative use of heterogeneous hardware resources, serverless applications and/or machine learning to support and extend application workflows). The project is guided by an external advisory board including leading academic experts in computational science and other fields, commercial cloud representatives, NSF program officers, and others. It leverages prior and concurrent NSF investments while creating a new model of scalable cloud service partnerships to enhance science in a broad spectrum of disciplines.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria. Read the NSF announcement.

The proposal submission process for the first phase closed on Friday, February 1, 2019, at 5:00 p.m. (submitter's local time zone).

For questions or comments, please send an email to ecas@internet2.edu.


News about the E-CAS Project

First-Phase Proposals

Six projects were chosen to participate in the first phase of the Exploring Clouds for Acceleration of Science (E-CAS) project based on their need for on-demand, scalable infrastructure and their innovative use of newer technologies such as hardware accelerators and machine learning platforms. Read the official announcement.

The successful proposals for the year-long first phase of the E-CAS project are: 

Accelerating Science by Integrating Commercial Cloud Resources in the CIPRES Science Gateway
Mark Miller, San Diego Supercomputer Center (UCSD)

CIPRES is a web portal that allows scientists around the world to analyze DNA and protein sequence data to determine the natural history of a group or groups of living things. For example, one can ask where mammals originated, how the Ebola virus spreads, whether a given plant is really a new species or an unwelcome imported species, or how a given species interacts with other species and its environment over long periods of time. CIPRES helps answer these kinds of questions by providing access to parallel phylogenetics codes run on large HPC clusters provided by the NSF XSEDE program. CIPRES currently runs analyses for about 12,000 scientists per year, and that number is growing each year. CIPRES accelerates research by increasing each researcher’s throughput. Job runs go faster using parallel codes, and users can run many jobs simultaneously on large clusters. For example, CIPRES provides access to P100 GPUs that can speed up some jobs by 100-fold relative to a single-core run. But GPUs are in short supply in the XSEDE portfolio, so usage must be strictly limited. This project will develop the infrastructure needed to cloudburst CIPRES jobs to newer, faster V100 GPUs at AWS. As a result, individual jobs will run up to 1.5-fold faster, and users will have access to twice as many GPU nodes as they did in the previous year. The infrastructure created will also open the door for scalable access to AWS cloud resources through CIPRES for all users.

Investigating Heterogeneous Computing at the Large Hadron Collider
Philip Harris, Massachusetts Institute of Technology (MIT)

At 40 million collisions per second, data rates at the Large Hadron Collider are some of the largest in the world. To contend with these large data rates, a tiered system is used to filter out and reconstruct the most interesting collisions. Unfortunately, this system has limitations. At each tier, some events that contain important physics processes are not selected; these events include Higgs bosons and potentially dark matter. With expected increases in data rates, these limitations will get worse. To overcome them, we propose to redesign the algorithms using modern machine learning techniques and then to incorporate these algorithms into heterogeneous computing systems. Dramatic improvements in processing time can be obtained by exploiting the high level of parallelization of machine learning algorithms used in conjunction with specialized processors, such as Field Programmable Gate Arrays. By migrating to this paradigm, more data can be processed at the Large Hadron Collider, leading to larger physics output and potentially foundational discoveries in the field. While we focus on the Large Hadron Collider, the lessons are far-reaching and can impact many fields where large data flow is present.

IceCube Computing in the Cloud
Benedikt Riedel, University of Wisconsin

The IceCube Neutrino Observatory, located at the South Pole, supports science from a number of disciplines including astrophysics, particle physics, and geographical sciences. It operates continuously and is simultaneously sensitive to the whole sky. Astrophysical neutrinos yield understanding of the most energetic events in the universe and could show the origin of cosmic rays. Being able to burst into the cloud supports follow-up computations of observed events and alerts to and from the community, such as other telescopes and LIGO. This project plans to use custom spot instances and FPGA-based filters in AWS and GPU/TensorFlow machine learning in GCP.

Building Clouds: Worldwide building typology modelling from images
Daniel Aliaga, Purdue University

This Exploring Clouds for Acceleration of Science (E-CAS) project will exploit the computational power and network connectivity to provide a world-scalable solution for generating building-level information for urban canopy parameters as well as for improving the information for estimating local climate zones, both of which are critical to high-resolution urban meteorological/environmental models. The challenge is that current computational models have a bottleneck, not just in terms of the physics and processes within the land surface and boundary layer schemes, but even more critically in the need for a robust means of generating parameter values that define the urban landscape. This is where the proposed E-CAS inverse modeling approach comes into play. By utilizing images and world-wide input about building properties, we can infer a sampling of 3D building models at world scale containing more than just the geometrical shape information and enable world-scale urban weather modeling.

Deciphering the Brain's Neural Code Through Large-Scale Detailed Simulation of Motor Cortex Circuits
William Lytton, State University of New York (SUNY Downstate MC)

This project will investigate how the brain encodes and processes information through very large-scale and detailed simulations of the brain's cortical circuits. The brain cortex is the outermost layer of the brain and is responsible for most high-level functions like vision, language or reasoning. We have developed the most detailed computational model of the motor cortex circuits using experimental data from over 30 studies. It includes details at multiple scales, from molecular effects inside the neuron to long-range connections from other brain regions. This means we now have our own in silico brain cortex that we can experiment with precisely and repeatedly to try to decipher the neural code. We will use NetPyNE, our own software tool for brain modeling, to run thousands of parallelized simulations exploring different conditions and inputs to the system. Google Cloud and SLURM will enable us to run thousands of these simulations at the same time by employing up to 50k cores concurrently. These cloud computing resources will therefore vastly accelerate our research and help decipher the brain's neural coding mechanisms. This knowledge has far-reaching applications, including developing treatments for brain disorders (which affect 1 out of 4 people), advancing brain-machine interfaces for people with paralysis, and developing novel artificial intelligence algorithms.

Development of BioCompute Objects for Integration into Galaxy in a Cloud Computing Environment
Raja Mazumder, George Washington University

BioCompute Objects allow researchers to describe bioinformatic analyses composed of any number of algorithmic steps and variables to make computational experimental results clearly understandable and easier to repeat. Galaxy is a widely used bioinformatics platform that aims to make computational biology accessible to research scientists who do not have programming experience. The project will create a library of BioCompute Objects that describe bioinformatic workflows on Amazon Web Services, which can be accessed and contributed to by Galaxy users from all over the world. This project also plans to utilize AWS Direct Connect over Internet2 to connect the library of BioCompute Objects to the campus HPC environment at George Washington University.
 

FAQ

What is E-CAS?

Exploring Clouds for Acceleration of Science, or E-CAS, is a new project funded by the National Science Foundation (NSF), being administered and coordinated by Internet2 in collaboration with commercial cloud service providers. The project will invite proposals from researchers from multiple disciplines interested in performing cutting-edge scientific and computing studies by leveraging capabilities in cloud computing platforms.

Who is involved?

The project will include a partnership between NSF and Internet2 as well as commercial cloud providers Google and Amazon Web Services (AWS), who have agreed to commit significant resources to support the project. Additional cloud provider collaborators who are able to accommodate agreed-upon criteria for project support within 30 days following the project's public announcement may also be able to participate.

What are the objectives of this project?

E-CAS is intended to accelerate scientific discoveries by leveraging advancements and novel technologies in commercial cloud platforms to demonstrate their effectiveness in supporting a range of applications critical to growing academic and research computing and computational science communities, and to illustrate the viability of commercial cloud services as an option for leading-edge research across a broad spectrum of scientific disciplines.

How will this help the research and science community?

The project's broader impact lies in its ability to serve as a model for other large-scale research initiatives and in the likelihood that it will enhance curriculum and training advancements in the use of cloud services within academic research environments. More specifically, the broader research community can benefit from this project by understanding whether there are simulation and application workflows that currently use High-Performance Computing (HPC) resources that can benefit from commercial cloud platforms.

What background and experience does Internet2 bring to this project?

For more than 20 years now, Internet2 has existed to facilitate collaborative efforts of U.S. higher education institutions to advance aspects of their academic, service, and research missions. Internet2 brings many years of experience working with cloud platforms, technologies and the service provider community. Therefore, Internet2 will serve as a natural coordinator, facilitator, and administrator of this project for the further benefit of its members and the broader research and education community. For more information about Internet2, visit www.internet2.edu.

What is Internet2's role in this project?

Working with NSF, Internet2 has established a team to implement and manage the project. Specific responsibilities will include:
  • Establishing, convening and managing an external Advisory Board
  • Working with cloud providers to provide access, documentation, and support for cloud resources
  • Working with regional network providers and cloud service providers to establish appropriate connectivity to enable data pipelining between data sources and compute facilities
  • Managing the Phase I and II proposal submission, review, and selection processes; and
  • Managing and implementing the Phase I and II awards.

How is the project governed?

The project will be governed by an advisory board made up of academic researchers and cloud service representatives.

What is the Advisory Board and what is its role?

The Advisory Board will provide support to Internet2 for the execution of the project, manage logistics including resource selection, account setup, and interactions with cloud providers, and provide support to the selected Phase I and II proposals. It will also provide input to the external reviewers for proposal reviews.

Who is on the advisory board?

The Advisory Board consists of expert academic researchers and representatives from the high-performance computing community and commercial cloud providers. The Advisory Board uses external reviewers in the selection of Phase I and Phase II awards.
  • Dr. Amy Apon, Ph.D. (Chair)
    Co-Director, Complex Systems, Analytics, and Visualization Institute
    Professor and Chair, Division of Computer Science, School of Computing
    Clemson University
  • Professor Thomas E. Cheatham, III
    Department of Medicinal Chemistry, College of Pharmacy
    Director, Research Computing and CHPC, UIT
    University of Utah
  • Dr. Valerie Taylor, Ph.D.
    Director, Mathematics and Computer Science Division
    Argonne National Laboratory
  • Dan Stanzione, Ph.D.
    Executive Director, Texas Advanced Computing Center
    Associate Vice President for Research
    The University of Texas at Austin
  • Marla Meehl
    Section Head: Network Engineering and Telecommunications Section (NETS)
    Manager: Front Range GigaPoP (FRGP)
    President: Westnet Education and Research Consortium (WERC)
  • Sanjay Padhi, Ph.D.
    AWS Research Initiatives
    Worldwide Public Sector, Amazon Web Services
  • Karan Bhatia, Ph.D.
    Google Cloud, High Performance Computing
  • Jenny Tsai-Smith
    Vice President, Oracle Cloud Innovation Accelerator - Higher Education & Research
  • Principal Investigator (PI)
    Howard Pfeffer
    President and CEO, Internet2
  • Co-PIs
    Jim Bottum, Ana Hunsinger
    Internet2

Why were Google and AWS selected for this project and what role will each play?

Both Google and Amazon Web Services (AWS) were responsive to initial discussions and agreed to commit significant resources to support the project. These initial discussions framed the format of the project and without their commitment the project would not be possible. Other cloud providers may be welcome to join the project if they can match the commitments of Google and AWS within the first 30 days of the project.

Will other cloud providers have the opportunity to participate in the project?

The project may welcome up to two additional cloud service providers who can agree to meet some minimum project coordination, project support and cloud credit requirements. Any additional providers will be determined within 30 days of the project's formal public announcement.

What are the specific focus areas of the project?

The project will include two areas of focus throughout its two phases of execution. The first is Acceleration of Science: The goal of the Acceleration of Science studies is to achieve the best time-to-solution for scientific application/workflows using the cloud. The measures of acceleration may include end-to-end performance (e.g., wall-clock time and data movement), or other relevant measures such as the number of concurrent simulations or workflows, or the ability to process near real-time streaming data. The second focus area is Innovation: The goal of the Innovation studies is to explore the innovative use of heterogeneous hardware resources such as CPUs, GPUs, and FPGAs to support and extend application workflows.
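
As a rough illustration of the acceleration measures named above, the following Python sketch compares end-to-end time-to-solution (data movement plus wall-clock compute time) between two runs and estimates throughput for concurrent workflows. The class, function names, and all numbers are hypothetical and are not taken from any E-CAS project; the sketch also assumes that data movement and compute do not overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunTiming:
    """Timings for one end-to-end workflow run, in seconds (hypothetical fields)."""
    data_movement_s: float  # staging inputs in and results out
    wall_clock_s: float     # compute wall-clock time

    @property
    def time_to_solution_s(self) -> float:
        # End-to-end time counts data movement as well as compute.
        return self.data_movement_s + self.wall_clock_s


def speedup(baseline: RunTiming, candidate: RunTiming) -> float:
    """Ratio of baseline to candidate time-to-solution (>1 means faster)."""
    return baseline.time_to_solution_s / candidate.time_to_solution_s


def throughput_per_day(run: RunTiming, concurrent_runs: int) -> float:
    """Workflows completed per day when `concurrent_runs` run side by side."""
    return concurrent_runs * 86_400 / run.time_to_solution_s


# Made-up numbers for illustration only; they are not E-CAS results.
campus = RunTiming(data_movement_s=600, wall_clock_s=6_600)
cloud = RunTiming(data_movement_s=1_200, wall_clock_s=2_400)

print(f"speedup: {speedup(campus, cloud):.2f}x")
print(f"cloud throughput: {throughput_per_day(cloud, 50):.0f} workflows/day")
```

In this made-up case the cloud run spends longer moving data but still finishes sooner end to end, which is why the measure includes data movement rather than compute time alone.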

Can you describe the specifics of the project?

The project will consist of two phases.

Phase I: The project’s first phase will include the submission, review, selection, and funding of an estimated six Phase I projects. Each recipient will have one year to perform a six-month operations study and corresponding development work. At the end of Phase I, Internet2 will host a project workshop to assess lessons learned.

Phase II: The project’s second phase will include submission, review, and selection of two Phase II awards. The Phase II awards will be selected from the six Phase I projects and will ideally include one from each area of focus: one from Acceleration of Science and the other from Innovation. The Phase II awardees will have one year to complete their project. At the end of Phase II, Internet2 will host a final workshop to help define and document best practices, lessons learned, and recommendations for sustained, scalable commercial cloud service adoption in research environments.

What are the criteria for proposal submissions?

A Program Team and Advisory Board will define evaluation criteria for both phases of the project. The intent is to span a range of different classes of workflows, and the Program Team and Advisory Board will reach out to exemplar workflows running on NSF's current HPC resources. Proposals will be considered for both operations projects (to support efforts with a track record of prior research involving cloud resources) and development projects (to support efforts from applicants with less experience with research involving cloud support). Applicants will be encouraged to explore hybrid co-compute tools for analytical and scientific workflows. For studies associated with modern hardware accelerators, proposals may use application programming interfaces (APIs) from the selected cloud provider(s).

How will the project submission process work?

For Phase I Proposals, potential applicants will be able to submit proposals of up to 10 pages addressing general project goals, previous work, methodology and justification of requested resources. Project proposals will also require a 2-page, NSF-format bio sketch for the Project PI as well as NSF-format current and pending funding lists for the Project PI. Full requirements can be found at www.internet2.edu/ecas.

Phase II Proposals will be selected from the 6 awarded Phase I proposals. Phase II proposal submissions will include a narrative of up to 15 pages addressing general project goals, previous work within the applicant’s Phase I project demonstrating value, necessity, and potential impact, methodology, and justification of requested resources. Project proposals will also require a 2-page, NSF-format bio sketch for the Project PI. Full requirements can be found at www.internet2.edu/ecas.

How will proposals get reviewed?

At the end of the submission period for each phase, proposals will be sent to external reviewers for evaluation. Once complete, the individual reviews will be evaluated by the academic members of the Advisory Board, who will make the final decisions.

Who can be an external reviewer?

The Advisory Board is seeking computer and computational scientists from a wide range of fields and with deep experience to serve as reviewers, including members of the HPC provisioning community rooted in national centers and campuses with a wide range of expertise and scale, including cloud provisioning, performance measurement, and efficiency. We welcome recommendations for reviewers and encourage nominations to be sent to ecas@internet2.edu.

What is the funding available and how will that aspect of the project work?

Accepted Phase I proposals will be granted cloud credits up to a maximum value of $100,000 per award, based on justification provided by scientific workload needs. Cloud provider preferences and stated cloud resource needs will be considered in the selection of the Phase I awardees. The Phase I project awards will also include funding for up to one year of partial salary support and fringe benefits for a staff member, postdoctoral fellow, or graduate student to help with development and initial cloud deployment. This funding, including indirect costs on these direct costs, will total up to $81,000.

Each of the Phase II accepted proposals will be allocated as a one-year sub-award from Internet2 to the awardee including direct costs for partial salary support for a staff member, postdoctoral fellow, or graduate student (including fringe benefits). The sub-award will also include up to $500,000 in cloud services from an assigned cloud provider (to be transmitted by the awardee directly to the provider from funds received under their Internet2 sub-award under a separate agreement including appropriate award, agency, and OMB compliance requirements). Phase II sub-awards will include indirect cost recovery on salary, fringe benefits, and cloud services.
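
To make the funding ceilings above easier to compare, the short Python sketch below adds up the stated maximums. It assumes six Phase I awards (the FAQ calls this an estimate) and that every award reaches its cap; actual awards may be lower, and Phase II salary support is omitted because no figure is given for it.

```python
# Ceilings quoted in the FAQ, in US dollars.
PHASE1_CLOUD_CREDIT_CAP = 100_000    # cloud credits per Phase I award
PHASE1_STAFF_SUPPORT_CAP = 81_000    # salary, fringe, and indirect costs
PHASE2_CLOUD_SERVICES_CAP = 500_000  # cloud services per Phase II sub-award

# Assumed award counts: six is an estimate in the FAQ, two is stated.
PHASE1_AWARDS = 6
PHASE2_AWARDS = 2

phase1_per_award_max = PHASE1_CLOUD_CREDIT_CAP + PHASE1_STAFF_SUPPORT_CAP
phase1_total_max = PHASE1_AWARDS * phase1_per_award_max
phase2_cloud_total_max = PHASE2_AWARDS * PHASE2_CLOUD_SERVICES_CAP

print(f"Phase I maximum per award:        ${phase1_per_award_max:,}")
print(f"Phase I maximum across awards:    ${phase1_total_max:,}")
print(f"Phase II maximum cloud services:  ${phase2_cloud_total_max:,}")
```

These are upper bounds derived only from the caps quoted above, not a budget for the project.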

How will cloud credits be allocated from the cloud service providers?

Each project will receive an allocation of credits directly from an assigned commercial cloud provider to develop appropriate cloud architectures, test software, fine-tune performance, and then run the proposed workloads at scale within the execution period. These agreements will be arranged directly between the Phase I award campuses and cloud providers.

How do I apply for this project?

The anticipated submission deadline for Phase I proposals is 01 February 2019. Those projects selected for Phase I will be invited to submit Phase II proposals in April/May 2020. Proposal submission will be electronic with brief applications submitted via Internet2 as a portal for E-CAS panel review. Please visit the E-CAS Submission Process tab to submit your proposal.

Are the submission deadlines flexible?

No. Proposals will not be accepted after 5:00 PM submitter's local time on the deadline date.

Who is eligible to submit proposals for this project?

Proposals may be submitted by academic institutions, including universities and two- and four-year colleges (including community colleges) accredited in and having a campus located in the United States. Proposals will also be accepted from non-profit, non-academic organizations, including national research laboratories, regional networks, and similar organizations within the United States associated with educational or research activities.

When will the awards be announced?

It is anticipated that Phase I awards will be announced in March 2019 and Phase II awards will be announced in mid-2020.

If I submit a proposal, would my home institution need to be connected to the Internet2 network?

No, there is no requirement for your home institution to be an Internet2 member or to be connected to the Internet2 network. There is, however, significant benefit in being connected to the Internet2 network, which enables high-performance connectivity to cloud providers through its Cloud Exchange and Cloud Connect services.

See https://www.internet2.edu/products-services/advanced-networking/cloud-access/

What is the relationship between the E-CAS project and the recently announced solicitation (NSF 19-510) on "Enabling access to cloud computing resources for CISE research and education"?

NSF 19-510 is a separate program from this project. Parties interested in that solicitation should contact NSF directly.

What is the difference between this project and the Internet2 NET+ program and its specific agreements with Google and AWS?

There is no direct connection between the two. This project is a fixed-term project in partnership with the NSF to support research innovation and acceleration of science for a small number of selected projects. The NET+ program is an ongoing engagement between Internet2 and cloud providers to make cloud services available through an agreed set of commercial terms and conditions.

How do I find out if my home institution is already using a NET+ program agreement for these cloud providers?

To determine this information, please send an email to netplus@internet2.edu.

There will be six projects in Phase I and two projects in Phase II. Can I apply for Phase II if I missed out on Phase I?

No, the Phase II projects will be selected from the projects in Phase I. Think of it as a funded "proof of concept" for six projects in Phase I, followed by an award for two of those projects to further develop and repeatedly run the science workloads at scale for an additional year.

Who do I contact for more information about this project?

For further information, please send all questions to ecas@internet2.edu or visit www.internet2.edu/ecas.