Do you want to work faster and more efficiently with large datasets? This training will introduce you to the world of High-Performance Data Analytics (HPDA) through intensive, hands-on sessions in the Jupyter Notebook environment. During the workshops, you will learn how to leverage the capabilities of Python and libraries such as Pandas and Dask to efficiently analyze data – from single files to datasets that exceed your computer’s memory. The training combines theory with practice, allowing you to immediately apply newly acquired skills to real-world data analysis scenarios.
What can you expect?
- learning how to work with large datasets in Python
- a practical introduction to data analysis using Pandas
- getting familiar with tools for scaling computations (Dask)
- understanding how HPC (High-Performance Computing) and HTC (High-Throughput Computing) work
- working with real datasets, including scientific data
- hands-on exercises demonstrating the differences between local and distributed approaches
- learning methods for optimizing performance and accelerating data processing
Who is it for?
This training is aimed at individuals who want to develop their data analysis skills:
- researchers and academic staff
- engineers
- data scientists and analysts
- anyone working with larger datasets who wants to do it more efficiently
Requirements
To benefit fully from the training, participants should have:
- a basic understanding of Python
- familiarity with Jupyter Notebook
- willingness to work with data and learn new tools
No prior training courses or account registration are required.
The training will be held online (Zoom), and the meeting link will be sent to registered participants.
Language: Polish or English (depending on the group)
Duration: 2 days (8 hours total)
Registration (and the waiting list) closes automatically on July 3, 2026. The number of places is limited, and registration may close earlier if the limit is reached.
The training will be delivered by Leszek Grzanka, an experienced specialist in high-performance computing and data analytics from the ACC Cyfronet AGH team.