Costa Rica Big Data School 2025

Costa Rica National High Technology Center

August 18-22, 2025

The Costa Rica National Research and Education Network (RedCONARE) and the Costa Rica National High Technology Center (CeNAT) are proud to host the Costa Rica Big Data School 2025.

This school is hands-on: live coding sessions are combined with lectures on the art of scientific programming, algorithm design, and data science. Instructors will guide attendees through the computational research cycle. Students will take part in a coding challenge in which they develop a computational model in Python, generate data to verify and validate information, apply machine learning techniques to the data set to find interesting outcomes, and revalidate the data.

We hope you enjoy what we have prepared, and take full advantage of this exciting event.

M.Sc. Carlos Gamboa Venegas
RedCONARE Scientific Coordinator
Costa Rica Big Data School Chair

Instructors

Susan Lindsey

Technical Information Coordinator

Texas Advanced Computing Center

Biography: Susan is TACC’s Technical Information Coordinator. As such, she is responsible for gathering and presenting timely, consistent, and accurate technical information to the supercomputing user community. She will also be evaluating and implementing emerging technologies related to the dissemination and presentation of technical content.

Susan comes to TACC after eight years at the San Diego Supercomputer Center where she researched and programmed on a variety of computational biology projects.

Charlie Dey, B.A.

Director, Training and Professional Development

User Services, Texas Advanced Computing Center

Biography: Charlie is the Director of Training and Professional Development with the User Services group at TACC, with a background in web development and scientific computing. Charlie’s responsibilities at TACC include organizing, developing content for, and building curricula for TACC’s academic courses, taught in conjunction with several departments at the University of Texas at Austin, as well as for TACC’s professional development and educational training. Prior to joining TACC, he worked as a Senior Application Developer for the Carle Foundation and as a computer science instructor at Parkland College in Champaign, IL. He was also a member of a specialized application development team at the University of Illinois and has been a contracted research consultant for NASA Ames Research Center, studying computational immunology and bioinformatics. Charlie holds a Bachelor’s Degree concentrating in Computer Science and Biology from Eastern Illinois University, and certifications in 3D programming and visualization.

Program

This 4–5 day hands-on AI Workshop is designed to introduce participants to foundational AI techniques using Python and guide them through the full data analysis pipeline—from exploration to forecasting. The workshop begins with a shared dataset and step-by-step demonstrations of how to apply AI methods such as classification, clustering, regression, and time series forecasting to gain insights. Participants will then use these techniques to model and project future data trends. In the final day and a half, attendees will work in teams to select their own datasets, analyze them using the techniques they’ve learned, and present their findings, including a discussion of the AI methods used and the insights gained.

Each day has three sessions (08:30 – 10:30, 11:00 – 13:00, and 14:00 – 16:00), with a break from 10:30 – 11:00, lunch from 13:00 – 14:00, and end of day from 16:00 – 16:20.

Monday 18: Foundations and Setup

  • 08:30 – 10:30: Introduction to the workshop goals and structure. Setting up the Python environment.
  • 11:00 – 13:00: Introduction to the shared dataset. Data cleaning and exploratory data analysis (EDA) techniques.
  • 14:00 – 16:00: Basic visualizations and feature engineering.

Tuesday 19: AI Techniques – Classification and Clustering

  • 08:30 – 10:30: Introduction to supervised vs. unsupervised learning.
  • 11:00 – 13:00: Applying classification algorithms (e.g., Decision Trees, KNN, Logistic Regression).
  • 14:00 – 16:00: Hands-on with clustering (e.g., K-Means, DBSCAN). Discussion on model evaluation and performance metrics.

Wednesday 20: AI Techniques – Regression and Forecasting

  • 08:30 – 10:30: Introduction to linear and non-linear regression models.
  • 11:00 – 13:00: Time series analysis and forecasting (e.g., ARIMA, LSTM basics).
  • 14:00 – 16:00: Applying models to project future trends in the dataset. Error analysis and model tuning.

Thursday 21: Team Projects – Dataset Exploration and Analysis

  • 08:30 – 10:30: Form teams and identify datasets of interest (sources: Kaggle, UCI, etc.).
  • 11:00 – 13:00: Perform data cleaning and EDA. Select and apply appropriate AI techniques.
  • 14:00 – 16:00: Begin working on analysis and presentation prep.

Friday 22: Team Projects – Final Presentation

  • 08:30 – 10:30: Finalize analysis and visualizations.
  • 11:00 – 13:00: Teams present findings, methodology, and insights.
  • 14:00 – 16:00: Group discussion on techniques used and key takeaways. Wrap-up and feedback session.

Registration

Tuition fee

Participation is free. There are no tuition costs for participants affiliated with CONARE institutions.

Maximum quota

The maximum quota is 50 participants.

Application

The following form must be completed in full before July 31st.

Important dates:

  • Application deadline for the School: July 31st.
  • Notification of acceptance or rejection: August 6th.

Requirements

Be a student, teacher, or researcher at a public university (UCR, TEC, UNA, UNED, UTN), at CONARE, or at any of its affiliated programs: CeNAT, PEN, and SINAES.

Staff of the ministries and public entities of the Government of Costa Rica are also admitted (limited spaces).

Have intermediate English proficiency (reading and listening). All presentations and exercises will be in English.

Have basic programming skills in Python and basic familiarity with Linux.

Organizers

RedCONARE is the Costa Rica National Research and Education Network (NREN). It provides technical infrastructure and communication services such as eduroam, Mconf, LA Referencia, and the Colaboratorio, among others. NRENs, or advanced networks, are shared spaces that university research communities around the world use to enhance their knowledge and contributions to humanity. In Costa Rica, RedCONARE has been positioning itself as a space for research and joint collaboration among its members.

The Advanced Computing Laboratory (CNCA) at the Costa Rica National High Technology Center (CeNAT) is a multidisciplinary space where scientific discovery is accelerated through an advanced computing infrastructure. This infrastructure includes not only specialized, up-to-date hardware but also a set of efficient applications and well-trained staff who can take full advantage of the technology. This allows CNCA to work across its main areas: research, project development, training, and service provision.

TACC inspires and educates the next generation of computational scientists and technologists and increases the public’s understanding of the roles computing and science play in shaping our society. To educate the next generation of researchers and computational professionals, TACC developed a unique scientific computing curriculum for The University of Texas at Austin.

Location