Ferran Espuña: Deep Learning Research Engineer, Mathematician and Computer Scientist
About Me
I hold a double degree in mathematics and computer science and currently work in artificial intelligence research. I thoroughly enjoy solving problems of any kind, and I am curious, hard-working, and proactive given the right motivation. I work well independently, as a leader, and as part of a team.
Contact Info
- Email: ferranespuna@gmail.com
- Location: Barcelona, Spain
- Phone: +34 600 24 69 87
Professional Experience
Barcelona Supercomputing Center | Research Engineer
2023 - Present
As a Deep Learning Research Engineer for Language Technologies, I focus mainly on Large Language Models. So far, my work has involved:
- Building and automating CURATE, a text processing pipeline designed to work in High Performance Computing environments and used to create CATalog, the largest pretraining dataset in Catalan.
- Actively participating in design decisions and writing pretraining scripts for the Salamandra models, a collection of highly multilingual language models pretrained from scratch on the MareNostrum 5 cluster.
- Researching ways to better evaluate the performance of our language models in open-ended generation settings (rather than multiple choice).
- Opening new research lines, including alternatives to Transformers (in particular, SSMs) and mechanistic interpretability techniques (in particular, Sparse Autoencoders).
Computer Vision Center | Research Intern
2022
Awarded a fully funded research internship at CVC, which also served as an opportunity to develop my Bachelor’s Thesis.
- Research Topic: Application of Topological Data Analysis methods to study the generalization capabilities of neural networks.
- Supervisors: Professor Sergio Escalera (CVC), Professor Carles Casacuberta, and Rubén Ballester.
ChipScope Research Group at UB | Image Processing
2022
Contributed to ChipScope, a European research project aimed at creating a high-resolution microscope the size of a computer chip. My work involved:
- Controlling the illumination and camera of the microscope to take image samples from effectively different points of view.
- Aligning and blending the images through computer vision techniques to produce a global view of the samples.
- Implementing wave backpropagation algorithms to remove interference artifacts caused by the small scale of the setup.
- Designing a user interface to make the process user-friendly.
Education
Master’s degree in Advanced Mathematics and Mathematical Engineering | UPC
Expected Graduation: June 2025
- Courses so far: Commutative Algebra, Number Theory, Coding Theory, Cryptography, Combinatorics and Graph Theory.
- Currently working on my master’s thesis on Algorithmic Extremal Graph and Hypergraph Theory.
Double Degree in Mathematics and Computer Science | UB
February 2023
- GPA: 9.0/10
- Achieved honors in courses like Computer Vision, Computer Graphics, Distributed Software, Databases, Advanced Algorithms, Numerical Methods, and Differential Equations, among many others.
Publications
A CURATEd CATalog: Rethinking the Extraction of Pretraining Corpora for Mid-Resourced Languages
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), European Language Resources Association and the International Committee on Computational Linguistics.
Personal Projects
- Developed a neural network from scratch to deepen my understanding of calculus, linear algebra, and gradient descent.
- Created visualizations for various 2D fractals and physical simulations.
- Designed a framework for 3D visualization of polynomial zeros through raytracing, applying my knowledge of Computer Graphics and Differential Geometry.
Skills
- Programming Languages and Technologies:
  - Python (NumPy, Pandas, TensorFlow, PyTorch, OpenCV)
  - NeMo Framework
  - Slurm
  - C/C++
  - Bash
  - Java
  - MySQL
- Languages:
  - Spanish (native)
  - Catalan (native)
  - English (C2 / Proficiency)
Certifications
- Stanford University: Machine Learning Specialization
- IELTS Academic: 8.5/9 overall band score
- Driver’s License: B permit in Spain