2017 International Workshop on Software Engineering for High Performance Computing in Computational and Data-Enabled Science and Engineering (SE-CODESE17) held in conjunction with SC17

Sunday, November 12, 2017, 2:00pm - 5:30pm, Room 501

Website for this workshop is https://se4science.github.io/SE-CODESE17/

https://sc17.supercomputing.org/session/?sess=sess419

Document for collaborative note taking

Please leave feedback about this workshop using this survey

Agenda

Goals of the workshop

This workshop is concerned with identifying and applying appropriate software engineering (SE) tools and practices (e.g., code generators, static analyzers, verification and validation (V&V) practices, testing, design approaches, and maintenance practices) to support and ease the development of reproducible Computational and Data-enabled Science & Engineering (CoDeSE) software for High Performance Computing (HPC). Specifically:

  • CoDeSE applications that include large parallel models/simulations of the physical world running on HPC systems.
  • CoDeSE applications that utilize HPC systems (e.g., GPU computing, compute clusters, or supercomputers) to manage and/or manipulate large amounts of data.

Despite the increasing demand for utilizing HPC for CoDeSE applications, software development for HPC has historically attracted little attention from the SE community. Conversely, the HPC CoDeSE community has been increasingly adopting SE techniques and tools. One reason for this gap is that the development of CoDeSE software for HPC differs significantly from the development of more traditional business information systems, from which many SE best practices and tools have been drawn. These differences appear at various phases of the software lifecycle, as described below:

  • Requirements
    • Risks due to the exploration of relatively unknown scientific/engineering phenomena;
    • Supporting reproducible science, particularly on non-deterministic systems;
    • Constant change as new information is gathered;
  • Design
    • Data dependencies within the software;
    • The need to identify the most appropriate parallelization strategy for CoDeSE algorithms;
    • The presence of complex communication among HPC nodes that could degrade performance;
    • Challenges in designing unit and system tests at appropriate scales;
    • The need for fault tolerance and task migration mechanisms to mitigate the need to restart time-consuming computations due to software or hardware errors;
  • V&V
    • Results are often unknown when exploring novel science or engineering areas, algorithms, and datasets;
    • Challenges in applying unit and system tests at appropriate scales (see the testing sketch after this list);
    • Challenges in retrospectively designing and implementing tests for legacy code;
    • Popular tools often do not work on the latest HPC architectures and must be tuned to handle many threads executing concurrently;
  • Deployment
    • Failure of components within running systems is expected due to system size;
    • Continuous integration on platforms with high availability and infrequent downtimes;
    • Long system lifespans necessitate porting across multiple platforms.
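
The V&V items above ultimately hinge on what "the same answer" means for floating-point results, which connects to the bit-by-bit vs. scientific validation debate from earlier editions of this workshop. The sketch below is a minimal Python example (the function name, seed, and tolerance are illustrative assumptions, not taken from any workshop paper) showing why bit-by-bit comparison is too strict for parallel numerical code and how a tolerance-based check states the validation criterion that actually matters:

```python
# Minimal, self-contained sketch; all names, seeds, and tolerances here
# are illustrative assumptions, not from the workshop materials.
import numpy as np

def test_reduction_agrees_within_tolerance():
    rng = np.random.default_rng(seed=42)
    values = rng.normal(size=1_000_000)

    serial_sum = np.sum(values)
    # Emulate a different parallel decomposition: sum 8 chunks, then
    # combine. Reordering floating-point additions can legitimately
    # change the low-order bits of the result.
    chunked_sum = sum(np.sum(chunk) for chunk in np.array_split(values, 8))

    # Bit-by-bit equality (serial_sum == chunked_sum) may fail even when
    # both results are scientifically equivalent; a relative tolerance
    # expresses the validation criterion we actually care about.
    assert np.isclose(serial_sum, chunked_sum, rtol=1e-9)

if __name__ == "__main__":
    test_reduction_agrees_within_tolerance()
    print("serial and chunked reductions agree within tolerance")
```

Choosing the tolerance is itself part of the research problem: too loose and real defects slip through, too tight and legitimate nondeterminism raises false alarms.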

Therefore, in order to identify and develop appropriate tools and practices to support HPC CoDeSE software, members of the SE, CoDeSE, and HPC communities must interact with each other. This workshop aims to provide a platform for that interaction by encouraging paper submissions and workshop participation from people in all three communities. In addition to the presentation and discussion of the accepted papers, significant time during the workshop will be devoted to large- and small-group discussions among the participants to identify important research questions at the intersection of SE and HPC CoDeSE that need further study.

Previous editions of this workshop have focused discussion around a number of interesting topics, including: bit-by-bit vs. scientific validation, reproducibility, unique characteristics of CoDeSE software that affect software development choices, major software quality goals for CoDeSE software, crossing the communication chasm between SE and CoDeSE, measuring the impact of SE on scientific productivity, SE tools and methods needed by the CoDeSE community, and how to effectively test CoDeSE software.

Motivated by the discussions during the 2015 and 2016 workshops, this edition expands on the previous workshops by continuing and extending two special focus areas and by emphasizing data-enabled science and engineering as a partner of computational science and engineering, turning CSE into CoDeSE. First, we place special emphasis on experience reports (positive, negative, and neutral) describing the application of software engineering practices to the development of HPC scientific software; documenting these successes and failures is important for the community. Second, because quality assurance is a challenge in the scientific HPC domain, as specifically discussed in 2016, we also recruit papers describing quality assurance techniques for HPC science and their use in practice, focusing specifically on the challenges of unit testing, system testing, and continuous integration for HPC codes, and addressing both legacy code and testing at scale on different architectures and platforms.
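
As one concrete direction for the quality-assurance focus above, tests for HPC codes often need to be aware of the parallel runtime itself. The following is a rough sketch (it assumes the mpi4py package is available; the function name, seeds, and tolerance are hypothetical): launched under mpirun with varying rank counts, it checks that a distributed reduction produces a statistically consistent answer regardless of the decomposition.

```python
# Hypothetical scale-aware test; assumes the mpi4py package and an MPI
# launcher, e.g.: mpirun -n 4 python test_parallel_mean.py
# All names, seeds, and tolerances are illustrative.
from mpi4py import MPI
import numpy as np

def parallel_mean(local_values, comm):
    """Global mean of values distributed across MPI ranks."""
    global_sum = comm.allreduce(np.sum(local_values), op=MPI.SUM)
    global_count = comm.allreduce(local_values.size, op=MPI.SUM)
    return global_sum / global_count

def test_parallel_mean():
    comm = MPI.COMM_WORLD
    # Each rank draws its own slice; the global mean of N(1, 1) samples
    # should stay near 1.0 regardless of how many ranks participate.
    rng = np.random.default_rng(seed=comm.Get_rank())
    local = rng.normal(loc=1.0, size=10_000)
    result = parallel_mean(local, comm)
    assert abs(result - 1.0) < 0.05, f"global mean {result} outside tolerance"

if __name__ == "__main__":
    test_parallel_mean()
    if MPI.COMM_WORLD.Get_rank() == 0:
        print("parallel mean within tolerance")
```

In a continuous-integration setting, the same test could be run at several rank counts to catch decomposition-dependent bugs early.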

We will split into small groups to discuss the topic of software lifecycle models for scientific software.

Each group will start by discussing one of the following questions:

  1. Are there any stages of the scientific software lifecycle that are fundamentally different from, or novel relative to, the lifecycle for other software? Notes from Group 1a, Notes from Group 1b

  2. Do any commonly identified software lifecycles from industry / open source work well for particular types of scientific software projects? If so, how can these projects be characterized? Notes from Group 2a, Notes from Group 2b

  3. Are there any metrics that help us understand which software development model we should choose for a particular type/size of scientific software project? Notes from Group 3a, Notes from Group 3b

  4. What aspects of the software engineering lifecycle process are difficult for your projects and why? Notes from Group 4a, Notes from Group 4b

Each group should nominate a facilitator who will take notes in the Google Docs linked above (anyone can add to these notes using the links) and keep the discussion flowing.

Your aim as a group is to:

  • (Briefly) share the knowledge that you have on this topic with the rest of your group
  • Record any good examples that inform the discussion
  • Record any places where you think further research or experiences are required to understand the topic
  • Summarize what you think is important to understand about this topic

In the wrap-up session, each group will be asked to summarize their discussions and report them to the rest of the workshop. If you finish discussing your question, feel free to move on to one of the other questions.

Related Sessions

You might be interested in the following related sessions at SC17:

Code Review Survey

Jeffrey Carver and Nasir Eisty of the University of Alabama are conducting a research study titled “Code Review Process in Computational Science and Engineering Software”. They wish to understand the practices, impacts, and barriers of the code review process in Computational Science and Engineering (CSE) software development.

We encourage workshop participants to complete a web survey, which takes about 15 minutes. The survey asks about your previous experience with the code review process.

Complete the survey at: http://bit.ly/CodeReview-SC17

SC17 Feedback Survey

Please provide feedback on this workshop using the survey at: https://submissions.supercomputing.org/?page=SessionEval&new_year=sc17&id=sess419&eval_stype=stype171

Committees

Organizing Committee

Program Committee

  • David E. Bernholdt - Oak Ridge National Laboratory
  • Jeff Daily - Pacific Northwest National Laboratory
  • Ali Jannesari - University of California, Berkeley
  • Hilmar Lapp - Duke University
  • Lois Curfman McInnes - Argonne National Laboratory
  • Sarah Mount - King's College London
  • Aleksandra Pawlik - New Zealand eScience Infrastructure
  • Tracy Teal - Data Carpentry
  • Stefan Wagner - University of Stuttgart
  • Ethan White - University of Florida

Code of Conduct

All participants are reminded that their involvement in this session is covered by the SC17 Code of Conduct.
