Research Software Engineering (RSE) Project Manager.

The role holder will oversee and coordinate RSE efforts within the Stephen Hawking Centre and other Faculty of Mathematics research groups, assist with the design and enhancement of a Faculty Computing Development Platform, and train and support PhD students, especially those associated with the Centre for Doctoral Training (CDT) for Data Intensive Science.

You will provide senior-level expertise to oversee the employment and coordination of RSEs based within the Stephen Hawking Centre and participating research groups. The role facilitates RSE teamwork in testing, profiling and optimising parallel code in preparation for production runs on external High Performance Computing (HPC) facilities, notably improving parallel scaling. The role also advises Faculty members on grant applications for RSE support and hardware procurement, and offers support through RSE-based assistance.

The role holder will be responsible for the operation of services for HPC application development, delivery and training. You will assist with second-tier support for the Faculty of Mathematics HPC Development Platform, a modelling and data analytics facility that includes a UKRI ExCALIBUR Intel GPU/CPU Max testbed. Note that a Faculty HPC Manager (SysAdmin) is in post, so this system role is a shared and strategic responsibility.

You will take technical responsibility for HPC training resources for PhD students, especially those in the CDT. Training courses for this purpose are available through the new MPhil degree in Data Intensive Science, so you would be expected to advise on course design and teaching, especially the support and training of CDT PhD students.

Minimum Qualifications:

Experience of programming in C, C++, Fortran 90, and Python, of scripting in languages such as Bash, and of parallel programming using OpenMP and MPI. Desirable: GPU programming experience using SYCL and/or CUDA.

Familiarity with administering Linux operating systems in a research environment. Desirable: Knowledge of configuring and managing Linux HPC clusters, notably queuing systems such as Slurm.

Experience with source control systems and with advanced compilation, optimisation, and installation methods for scientific HPC applications, as well as AI/ML toolkits. Desirable: Knowledge of storage sub-systems, co-processors and accelerators (GPU), and familiarity with visualisation on HPC systems.

Fixed-term: Funds for this post are available for 2 years in the first instance.

Apply here:
https://www.jobs.cam.ac.uk/job/48314/