April 4, 2019
One goal of the conceptualization phase of URSSI is to gather as much input from the community as possible about the different facets and pain points of sustainability of research software: from career paths of software developers in academia to citations of software to gaps in existing training and education programs for software engineering. The awareness of the importance of this topic is evident in diverse initiatives and projects around research software sustainability such as WSSSPE (Working towards Sustainable Software for Science: Practice and Experiences) and BSSw (Better Scientific Software), and funding programs like the past NSF SI2, which also funds the conceptualization of URSSI, and its successor, CSSI. Our first community-driven URSSI workshop in April 2018 was the model for our second community-driven URSSI workshop in Chicago on October 23-24, 2018.
This report describes the discussions and outcomes of the Chicago workshop, which featured lightning talks and interactive sessions with topics suggested and selected at the workshop. The workshop attracted 52 participants from 47 diverse affiliations including universities, national laboratories, industry, and funding agencies.
Setting the Stage for Discussions
The workshop began with presentations from the PIs of URSSI to provide context for the conceptualization and to introduce the workshop format. Karthik Ram and Daniel S. Katz gave an overview of URSSI, its motivation and goals, the workshop format, and high-level outcomes of the Berkeley workshop. Jeffrey Carver provided a high-level summary of the URSSI survey and encouraged attendees to fill it out. Nic Weber presented preliminary findings from ethnographic case studies of three software projects. These projects include heavily used software packages in astronomy and biophysics, as well as a short-term software package developed in the field of hydrology. Sandra Gesing summarized various efforts to engage the broader research community.
We had 14 lightning talks presenting ongoing related efforts, projects, and initiatives. Topics included initiatives such as XSEDE campus champions, ACI-REF (Advanced Cyberinfrastructure Research & Education Facilitator) Virtual Residency, and sustainable open source research software organizations, as well as projects such as the ECP IDEAS project and the Computational Infrastructure for Geodynamics. These talks elucidated the importance of community building and of training in technical skills as well as soft skills and leadership skills. Some talks went into detail on technical solutions such as Sciunit providing reproducible containers and software ontologies. The lightning talks were selected to add ideas for the breakout sessions and to provide updates on ongoing related efforts. The slides for the lightning talks are available on GitHub.
The primary purpose of this workshop was to spark discussion among the attendees around topics that emerged during the Berkeley workshop as well as before this workshop. We asked the participants and the wider community to post challenges that URSSI should tackle. All of these ideas, including those from the Berkeley workshop, were collected as issues in our workshop’s GitHub repository. After filtering and grouping related ideas, we presented all of them to the workshop attendees so they could form small groups and brainstorm ideas and possible solutions.
Over the one-and-a-half-day workshop we ran 4 breakout sessions, including 2 on topics determined in advance (Mission and Vision for URSSI and URSSI Organization Strawhorse). One topic (URSSI summer school for research software engineering) was quite popular and was discussed in several breakout sessions to build up a better understanding and more substantial ideas.
The key themes of the 2 breakout sessions whose topics were determined during the workshop were (i) community building, (ii) career path and institutional support for Research Software Engineers (RSEs), (iii) training and workforce development and (iv) sustainability in relation to reproducibility, usability, and discoverability.
Participants pointed out the importance of community building in several discussions. Mentoring and training as well as workforce development are part of this, but they are also an area in their own right. The discussion concluded that there is a need to help the related groups form one or more larger communities and to provide ongoing community engagement to keep the communities going.
Career path and institutional support for RSEs
To support RSEs and their career paths, a work environment that motivates and rewards RSEs is essential. Clear performance evaluations and possibilities for skill enhancement are part of this. The lack of incentives and credit for software development, as well as the lack of job descriptions and role descriptions, hampers a well-defined career path at the moment. Software contributions are generally not a factor in career advancement in academia; typical evaluation criteria do not include software but instead include publications and citations, successful proposals and funding, and advised and graduated students. Research software positions at universities are still poorly defined. Further, compensation packages and opportunities for professional development still lag far behind similar positions in industry. Successful case studies at different universities that built core software facilities or worked with outsourcing would be a good starting point for defining roles and a career path. Institutional support is an essential part of improving career paths for RSEs.
Training and workforce development
Training and workforce development was a major concern in many discussions, even when the main topic started from a different angle, such as reproducibility related to sustainability. Training in best practices, including tests and documentation, was mentioned several times, as were developer training and curricula. One discussion concluded that there might be a lack of resources to make software engineering a core discipline in graduate or undergraduate studies, but that a collaboration with The Carpentries would be beneficial, and that perhaps some online classes or training should be developed and provided.
Sustainability in relation to reproducibility, usability, and discoverability
The discussions led to several outcomes regarding the role of sustainability in relation to reproducibility, usability, and discoverability. Sustainability contributes to reproducibility and can even be seen as an essential requirement for achieving it. Tools for usability, such as those for creating documentation, often overlap substantially with tools for sustainability. Discoverability makes sense almost only if software is sustainable or developed in a sustainable manner. If software is not sustainable, there may be little need to find it in order to apply it to further domains or use cases.
Mission and Vision for URSSI
The URSSI project leads are in the process of defining mission and vision statements with a small group of stakeholders. At this workshop we presented draft versions of both statements for feedback and discussion. We then asked the participants to contribute their own versions. All of these ideas are now with the mission-vision team, who will refine them further and present them at the final meeting of PIs, advisory board, and senior personnel, which is scheduled for April.
Core Mission Statements
Participants offered a variety of perspectives on what should be the mission of a research software sustainability institute. Broad themes included the ability to lead, grow, support, and advocate on behalf of a research software community that includes not just researchers, but also developers (such as Research Software Engineers), students (at various levels of training), and a broader set of public stakeholders.
Participants felt that the core mission should include general advocacy, such as “Increasing the understanding of the importance and pervasiveness of software in modern research,” as well as specific advocacy, such as diversity outreach. As one group put it succinctly, “Diversity is key for software sustainability - diversity in job types represented, diversity in people’s gender/ethnicities/backgrounds, diversity in scientific fields represented, and diversity in types of software being supported.” Another similarly wrote that URSSI should “Foster a culture that values diverse, equitable, and inclusive community as a requirement for developing and maintaining robust research software.”
Attendees also felt that an important value in URSSI’s mission should be building trust:
- Adoption of best practices strengthens trust both within and outside the research community.
- Software is an inherent part of the scientific process, and it needs to be trustworthy and give reproducible results in order to produce the best science.
Another theme was around providing direction and examples of software sustainability:
- Provide state-of-the-art test cases for software sustainability.
- URSSI should be the leader when it comes to providing direction in achieving software sustainability.
Core Vision Statements
The second exercise asked small groups to discuss and then write future-oriented declarations about the vision of a research software institute. The vision statements were framed as providing both the purpose and aspirations of an institute. The following vision statements were produced by attendees of the Chicago workshop:
- Software is developed following best available practices with long-term use in mind. (engineering excellence in research software)
- We aspire to create a software ecosystem where software is adaptable to emerging technologies and robust, modular software is readily reusable across domains.
- “To improve the quality, usefulness, and sustainability of research software around the world by improving practice.”
- Redefining how the community engages with research software.
- Creating a safe haven for research software engineers.
Based on our first workshop at Berkeley and the strawhorse plan developed there, we used a session of the Chicago workshop to rerun the plan with a new set of participants and to gather more inputs. One change is that we realized we didn’t include a category of Management, Sustainability, and Governance in the plan, so we divided into six groups to discuss this and the other five putative categories of URSSI activities: Development and Support, Incubator, Training, Policy, and Community. Each group discussed budget (based mostly on the previously decided allocation of $5m/year), staff, and specific activities. Note that because each group worked independently, there are a number of similar topics that came up in different areas; we show these topics as each group discussed them, with the full knowledge that we will need to resolve such overlaps and duplication as we go forward.
Development and Support
The PIs want URSSI to impact Development and Support, but how to do this at scale and in a useful and affordable manner is not completely clear. One idea is to work indirectly as much as possible. This group might have a manager and a set of staff members, who would work on training, outreach, education, preservation, reusability, best practices, and professional development, with an overall goal of helping to make these ideas core to other organizations, though these activities also fit some other URSSI strawhorse areas. The group could also work more directly, doing some consulting with specific projects, similar to XSEDE’s ECSS or SGCI’s Extended Developer Support, but not overlapping or competing with these other activities, just filling gaps that they do not cover. Potentially URSSI could also coordinate these activities across projects and institutes, including XSEDE, SGCI, MolSSI, IRIS-HEP, etc.
The intent of the Incubator area is to be a place and a catalyst where project ideas can germinate into small projects and small projects grow into viable sustainable projects, by providing technology, business planning, governance/licensing, and usability advice. A software idea coming into the incubator area should be able to make a case that it will be useful to at least some audience as part of an incubation proposal, which would also discuss how open and sustainable the current idea/project is and is planned to be. The process could be something like a “Shark Tank” for sustainable research software, perhaps also including a bootcamp (similar to or based on SGCI’s), mentorship, and training. The incubator could include transferring knowledge and practices on how to build a project, how to release and package it, how to transfer it to another group, etc.
The Training group initially talked about developing quick start guides (if you are using Stack X, here are some good practices to use), or training modules that discuss high-level practices. Given the existence and success of The Carpentries, the group also discussed building a follow-on course of several weeks to a semester that multiple organizations could deliver, as well as a summer school. This latter idea was further developed by a separate breakout group. Given that training geared towards commercial software optimizes for problems that typically don’t concern researchers, we need to address challenges that are unique to research software development. URSSI might hire someone to write lessons and curate recommendations, and/or support someone to develop a curriculum for a summer school. It could also support full-time mentors who listen to the community to determine needs, and then curate or write curriculum. Or URSSI could hire a few trainers to develop content and also curate existing content. Another idea is to focus on developing different lessons each year and, once one is complete and mature, move to the next, though some effort would also be needed to keep old lessons up to date. Training also needs to include outreach about the materials, and evaluation to show that they are useful and are used. Finally, the group felt the train-the-trainer model from The Carpentries is one of the only ways to ensure scalability.
The URSSI Policy area is aimed mostly at how to effect change in broader funding and research institutions, outside the research software community itself. Some initial areas that URSSI could address include software (management/sustainability) plans, licensing, and careers (hiring, tenure & promotion, titles, RSE, etc.). The idea of dividing policy into research & analysis vs advocacy, from the Berkeley workshop, still seems correct. Research & analysis might include a postdoc, a fellowship for a student working in copyright & IP, and some workshops. Advocacy requires a staff member (or a portion of one), with substantial travel funding. This area could also include an annual conference/workshop (akin to an Aspen Institute event), or this could be tied to an annual URSSI event. An expanded policy area might include a summer program for undergraduate or graduate students focusing on software and science policy. In all cases, URSSI policy would need to work and coordinate with others in this area, e.g., Creative Commons. While there are a number of groups interested in software policy, there are few interested specifically in research software policy; this would be URSSI’s specialization.
The Community area could include a senior-level person with experience in strategy and research software, and half of two people: one, shared with training/policy, working on communications, website, logistics, and coordinating working groups, and the other working on analytics and metrics to bring a data-driven approach to this community area and perhaps other areas of the institute, including evaluation. The Community area would also include fellowships, with the goal of creating and supporting advocates for sustainable software and for URSSI itself. The fellows would work in one of a few different areas, possibly advocacy, community building, and training. This area would also support an URSSI conference each year, and some number of small, focused workshops, as well as more general communications, such as a website and a regular newsletter.
Management, Sustainability, and Governance
The Management, Sustainability, and Governance group talked about how URSSI will organize itself and be guided by the community. The group proposed this would require three full-time staff members (an executive director, a deputy director, and an administrator) in addition to at least 25% of the PI’s time and some time for a co-PI for each area (perhaps via course buyout, though this depends on whether they are faculty). Either the executive director or the deputy should be charged with pursuing additional funding (for sustainability), depending on the specific people hired. The administrator would be responsible for organizing events, budgeting and contracting, and report processing.
Management here means internal management of the URSSI staff and organization, as well as interaction with other software institutes, reporting to NSF and internally, and integrating diversity and inclusion into URSSI. Governance and sustainability are more focused externally, and include public reporting, managing community needs and NSF goals and resolving conflicts between them when necessary, and bringing in volunteers and funding. When it doesn’t have funding to offer, URSSI should use titles to motivate staff and volunteers, and it should try to have a start-up mentality, where people are willing to work outside their defined positions to get things done.
Among many discussions at the Chicago URSSI workshop in October, one popular thread was the idea that URSSI could run or coordinate a summer school for research software. Researchers often develop software without any formal training in software engineering best practices. This has led to the creation of software of inconsistent quality and reliability, often without any plan for sustainability. While raising awareness of the problem, URSSI could also train junior researchers in best practices during their graduate programs. Some ideas that emerged from the discussion include:
- URSSI could run a 1-2 week summer school primarily targeted at first- and second-year PhD students whose research may involve writing software. Some participants suggested that the audience be limited to students who have already advanced to candidacy and therefore have clearly laid out software aspirations. The summer school would consist of a mix of lectures and hands-on exercises, with students working on their own research for the second half of the course while still having access to professors and teaching assistants.
- Participants suggested that URSSI subsidize the cost of attendance (or waive it entirely) for the first few years while making the true costs very clear. Over time, participants would bear an increasing portion of the costs until the program becomes entirely self-funded. PIs would cover the costs of their students from research grants.
- To continue growing the community of RSEs, past participants would be required (or offered incentives) to return as teaching assistants and project mentors in the future. This mentorship requirement would help build connections among students and also help keep instructor costs low.
- It was also suggested that professors rotate through the course in half- or one-week intervals. This would help students gain exposure to more diverse ideas and also make it easier for professors to take time off to teach without giving up their entire summer. Some overlap between consecutive cohorts of instructors would be necessary to ensure that courses run smoothly.
- URSSI’s primary role here could be coordinating the course, which includes activities such as student selection, setting the curriculum and coordinating instructors, TAs, and student projects.
At this and other URSSI workshops, we listened to invaluable input and feedback from the research community and are using these ideas to formulate the plan for an institute. It was evident from the enthusiasm and feedback that the community would see the benefits of an implementation of URSSI collaborating closely with existing efforts like IDEAS and ACI-REF VR. The community-driven workshop in Berkeley in April 2018 was structured similarly to the Chicago workshop, with open topics in the large area of research software sustainability. Two other workshops were organized around specific topics - the software citation workshop in January and the incubator workshop in February - with invited experts discussing pain points and objectives in the community. We plan to synthesize outputs from all of our workshops at the final meeting of the project leaders as we draft the plan for a research software institute.