First EuroHPC19 Workshop to Seed and Foster Collaborations Across Europe

19-20 September 2022 at Campus Puerta de Toledo, UC3M, in Madrid. Find workshop materials here.

Program, Monday 19th September

12:30 – 13:00: Welcome by Jesus, Hans-Christian and Peter
13:00 – 13:30: Collaboration on IO – traces suggested by Maike Gilliot, Philippe Deniel and André Brinkmann – including IO-SEA, Admire, MAELSTROM
13:30 … Read more

HIPEAC 2023. EuroHPC JU Projects Shaping Europe’s HPC Landscape.

Ten research & innovation projects were started by the EuroHPC Joint Undertaking to address the challenge at the hardware, system architecture, system software and software development tool levels. The results achieved will have a lasting impact on the European HPC ecosystem. This workshop presents the progress they have made in the last 20 months, putting … Read more

SC23 BoF: Enabling I/O and Computation Malleability in High-Performance Computing

The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC22), Nov 13–18, 2022, Dallas, Texas. BoF Session 127. Schedule: Wednesday, November 16th, 5:15pm-6:45pm. See the BoF presentations here.

Traditional interest in increasing parallelism for individual jobs in HPC systems is being conditioned by the variety and dynamicity of resource demands of jobs at … Read more

Towards I/O monitoring at scale

Designing a self-tuning I/O environment in HPC

I/O Challenges in HPC

In High-Performance Computing (HPC), data movement is one of the biggest challenges: large computations necessarily produce large datasets. Current HPC workflows favor a feed-forward way of launching programs, loading their dataset, and then storing the result in persistent … Read more

Closing the loop: from Observation to Action

Performance monitoring and observation are a requirement in the complex IT systems we are building nowadays. Exascale systems are digital factories operating with millions of cores and discrete components. Like any factory, these systems are instrumented and monitored. Performance observation faces three main challenges:

Operating at scale

The ADMIRE monitoring infrastructure uses Prometheus as … Read more