Career Progression for Software Developers in UK Academia?

Apr 20, 2012 by

In this guest post, Ilian Todorov from the Science and Technologies Facilities Council gives us his perspective on the SSI Collaborations Workshop.

This article represents my personal point of view. It is related, and complementary, to Dirk Gorissen’s blog post ‘The researcher programmer, a new species?’ and to the discussion on the LinkedIn group page ‘Scientific Software Development and Management’ that followed SSI’s WP12 meeting. The common thread is why the software engineer in academia needs recognition (definition and self-awareness in the ecosystem of UK academia, as discussed in a parallel session at WP12, which I chaired).

This article explores the relationship between two gaps: the lack of a funding policy for academic software within UK academia and the Research Councils, and the lack of a career progression path, and thus of recognition and appreciation, for research software developers within UK academia. It remains a personal view, despite some citations borrowed from the discussion on LinkedIn.

Software has become the technique of choice for many scientists. It is often considered free, though often this only means free of cost to academia. Setting aside commercial and non-free software, software available at no cost to academics is often treated as a free lunch. However, someone somewhere down the line paid for it. Even if it is truly open and free, and leaving aside any issues of intellectual property and licensing, someone has put their labour into writing code to implement a scientific methodology of some kind. In most cases this was the labour of a postgraduate (PG) student and/or post-doctoral (PD) researcher attempting to automate and simplify the workflow of their own research in the immediate future.

Well, this is how it all started in academia, but evolution happens to everything, including software. Some codes written by dedicated academics, as well as those grown within research groups, have aggregated into packages and suites. Although many have failed to emerge into the open and become widely used, many are still in use. Some die out after a researcher retires or a research group ceases to exist. However, some pieces of scientific software are reborn like the Phoenix, being rewritten again and again in new ways, with new or more modern versions of computer languages. This is the very ground where software development skills are discovered and trained, and where scientific software developers are born.

In the fight to increase the quality and quantity of research, and to avoid reinventing the wheel every time a new PG/PD researcher joins a research group, some groups and even whole scientific communities have made an effort to create project-based ‘community codes’. Clearly, the first generation of community code leaders had a taste for writing scientific software. By and large they were research scientists themselves, who produced a lot of research but also wrote and managed software.

Of course, times have changed enormously in the last 20 years. However, for the researcher in academia one thing has remained constant: their career progression is based on research performance, measured as the impact of their research papers in peer-reviewed journals. More papers in higher-impact journals lead to more success and recognition, and to better chances when applying for funding or academic jobs. In contrast, software development has diversified and become a well-defined profession with many sub-fields and computer languages. This is hardly surprising for an industry that governs our lives at home (PCs, electrical appliances, games and toys, mobile and smart devices, apps), at work and anywhere we go (databases, financial transactions in trade, GPS, traffic and goods flow control, industry, etc.). It has also become a discipline in its own right in higher education, as informatics or computer science.

Despite the strong engagement of academic research with scientific software production, the software developer’s position in academia has often been likened to that of a support scientist or lab technician, and scientific software engineering is often not considered a discipline in its own right. Of course, when one’s job or calling is to create code that facilitates scientific research, one does not have the time to do all the research that one’s creations make possible, or to write papers in the domain of science that the work is associated with. Creating code (deployment, testing, maintenance, continuous integration, etc.) is not the only task they do. Other tasks include designing software to implement a specific methodology, handling large amounts of data, using high performance computers (HPC) effectively, setting up websites and databases, writing database apps, writing GUIs, writing documentation, and providing user support and training. Like scientists, these software engineers test their routines (hypotheses and solutions) for correctness by subjecting them to well-defined test cases. Like researchers, they keep abreast of the cutting edge of scientific methodologies in their research area in order to develop them and prove them worthwhile and working. Like lecturers, they develop pedagogic skills to teach and train PG/PD researchers. Like academics, they write papers and grant proposals, as well as technical reports and manuals. However, unlike most academic researchers, they also keep abreast of relevant IT trends (numerical algorithms, computer languages, analysis tools and hardware) because all of these change too quickly to be ignored. Such changes may cause paradigms and algorithms that worked well before to perform poorly (bottlenecks), or may bring novel language constructs that solve in a few lines what took a long time to engineer with older versions of the same language.

In contrast to Dirk Gorissen’s LinkedIn discussion, I would argue that every scientific software developer is an academic researcher at heart, but not every academic researcher is a scientific software developer. The fact that what a scientific software developer delivers is often ‘mainly code’ (i) does not make them a non-researcher, and (ii) is usually the outcome of having too many requirements to satisfy and too many demands to answer (which they have to learn to fend off).

Doing research and developing scientific software are neither mutually exclusive nor fully overlapping activities. They involve different skills, as do carrying out research and writing grant proposals. They are thus complementary activities, but acquiring an extra skill takes extra time and effort. Scientific software engineering skills and training are not considered as valuable as, for example, a degree in computer science, and so they are often not regarded as proper qualifications and are ignored by commercial software houses.

The problem is not the naming of ‘the new species’ that Dirk Gorissen defines, but the lack of recognition and appreciation of the work, role and skill of the scientific software engineer in both academia and industry. On the one hand, there is no progression path in academia for scientists who devote themselves to research software development. On the other hand, commercial software houses do not recognise the skills gained in academia. The lack of career security in academia often makes talented software developers move to industry, no matter the inconvenience and cost. However, such a move has to be well timed. As with obtaining a PhD, being trained to think at a more complex level may cause scientists to fail to obtain a commercial job if they have stayed too long in post-doctoral positions. Being trained as both a scientist and a software engineer may indeed lead to a dead-end career. Still, life does not stop and someone has to pay the bills, so any job that pays better than that of a software developer in academia is better to have. Yet the loss of a scientific software developer from a team is usually much greater than the loss of a researcher, because it is much harder to cultivate that unique combination of skills and knowledge in one person.

Until recently, software development in UK academia has simply been viewed as an uninteresting means of achieving interesting research. Research councils have had no funding policy for software development and sustainability, which has led to such development work being disguised as plain research in grant proposals. As a consequence, software development work has been carried out in a cash-starved environment, and software engineers have had to make a living by migrating from one project to another and from one department to another. This has not helped academic institutions to create career progression and development paths for them, and has thus left scientific software engineers looked upon as out-of-place technicians who facilitate researchers. Often, when a research project stops, so does the work of maintaining the software. In many cases, development priorities focus on meeting the minimum requirements of a specific scientific research case. Investing in making software solutions more generic, so that other research projects might reuse them, is usually considered a low priority. Without other research projects as future stakeholders in a software solution, principal investigators (PIs) can only extend the life of their project software by extending the life of the project itself through grant renewals. It remains a challenge for research groups to produce reusable software assets that last beyond the scope of any one project. The root cause of this is probably the way IT projects are funded in research environments. As a consequence some software projects die, e.g. GAMESS-UK, some freeze in time, e.g. SHELL, and others move overseas, e.g. GULP, leading to what I would consider the loss of irreplaceable assets: scientific software engineers and their expertise.

Of course, my view is slightly exaggerated in order to expose the root causes of the current state of software development and sustainability in academia, and the limited awareness and even more limited action of the universities and research councils in the UK. The reluctance to act comes from the overly high price to pay for proper software development and support (let alone for sustainability), the lack of clarity over IP and ownership of software solutions, and last but not least the limited understanding of how to benchmark software developers’ skills, performance and progress. Solutions and examples may need to be sought in industry (software houses such as NAG Ltd., Wolfram Research Ltd., Scenomics Ltd., DEShow Ltd., etc.). I am convinced that, for similar-sized software development projects on each side, the funding level in academia would be at least an order of magnitude lower than in industry. It is worth pointing out the obvious fact that the poor post-doc salary is 25 to 100% less than that of the ‘professional software developer’, and that the post-doc software developer has to publish scientific papers in order to claim progression on the academic ladder.

It is more than fair to say that so far I have only criticised, and have not shown that there are changes in the UK academic and research council environments which, although they do not solve the problems outlined in this article, at least address them. Indeed, some software development and service has been supported and provided by UK communities such as the CCPs and e-Minerals, and by programmes such as e-Science, ultimately funded by UK research councils such as EPSRC, NERC and JISC. Limited funding has been provided by EPSRC (i) to application support via the CCPs; (ii) to software development support via HEA (Daresbury Laboratory); (iii) to software HPC optimisation via distributed computational and software engineering (dCSE) projects serviced by NAG Ltd. Over the last (HPC) decade EPSRC has issued just one software engineering call, in 2010, and it was very weakly defined. And last but not least, this year (2012) EPSRC announced software development fellowships. On the university side there has also been some activity: computational science and engineering has been established as a course by EPCC and recognised as a different subject from computer science. HPC and parallel programming training courses are now widespread and regular events for PhD students in the natural sciences.

Despite the ‘progress’ claimed above, the situation of scientific software engineering practitioners in academia remains pretty dire. In my opinion it has been made worse by the cycles of hardware (HPC platforms, including heterogeneous ones) and of novel computer languages (HPC ones and extensions such as CUDA and OpenCL), which are much quicker than the cycle of software development. The demand for software development in academia has risen, but the availability of scientific software engineers has not, owing to poor long-term planning and succession policies for them in academia. Millions of pounds are spent annually on commodity clusters in UK academia, and still it is the computers that are considered to be the commodity rather than the software and the developers…