About

Research Engineer at LAMSADE, bridging bioinformatics, machine learning, and full-stack engineering. Trained across Kenya, Algeria, and France, I build tools that make complex biological and computational workflows accessible and reproducible.

Currently Working On

Active Project · LAMSADE / PEPR Santé Numérique
ShareFAIR — Queryable Provenance for Nextflow Workflows
Developing a system that structures the raw provenance data of Nextflow bioinformatics workflows and makes it queryable. The system builds a PROV-compliant knowledge graph in Neo4j that integrates prospective provenance (the workflow specification) with retrospective provenance (execution traces captured via nf-prov as RO-Crate JSON-LD), so that scientists can trace data lineage and reproduce analyses without navigating scattered execution directories.

Experience

Research Engineer
2025 – present
LAMSADE – Université Paris-Dauphine · Paris, France
  • Structuring raw provenance data from Nextflow bioinformatics workflows and making it queryable.
  • Developing a granular tracing system at the record and attribute level.
  • Evaluating the performance and precision of the developed tools empirically on concrete use cases.
  • Contributing to the writing of research articles to disseminate results.
R&D Intern — Workflow Visualisation
04/2024 – 09/2025
LISN – Laboratoire Interdisciplinaire des Sciences du Numérique · Paris
  • Developed a user-friendly visualisation system for the graph outputs of scientific workflows.
  • Translated complex workflow outputs into interactive visualisations for enhanced reusability and analysis.
R&D Intern — IFB-Core Madbot Team
04/2024 – 08/2024
CNRS – Institut Français de Bioinformatique (IFB) · Paris
  • Built a connector to enhance multi-omics data integration within the IFB-Core platform.
  • Improved data processing efficiency and cross-team collaboration.
Research Intern — Computational Neuroscience
04/2022 – 06/2022
CNRS – NeuroPSI · Gif-sur-Yvette, France
  • Simulated neuron firing patterns using the AdEx model to study claustrum neurons.
  • Analysed neuron dynamics with Python and Brian2, focusing on interspike intervals and thalamo-cortical interactions.
  • Classified neurons based on electrical properties and peptide expression.

Publications

Education

Université Paris-Saclay · Orsay, France
2023 – 2025 · Specialisation: AMI2B
Advanced machine learning (CART, Random Forests, LASSO) · Statistical data analysis · AI applications for biological data · Computational algorithms · Object-oriented programming in C & Java · Bioinformatics workflow design and analysis.
Moringa School · Nairobi, Kenya
2022 – 2023
Python · JavaScript · RESTful API design · Object-oriented programming · Full-stack web development · Industry-standard SDLC practices · Project-based learning with real-world applications.
Université Paris-Saclay · Orsay, France
2021 – 2023
Molecular biology · Cell biology · Biochemistry · Genomics · Genetics · Complementary coursework in programming and data analysis.
Université Ferhat Abbas · Sétif, Algeria
2018 – 2021
Microbiology · Molecular genetics · Biochemistry · Biotechnology algorithms · Data-driven approaches to biological systems.
Catholic University of Eastern Africa · Nairobi, Kenya
2017 – 2018
Programming fundamentals · Data structures · Algorithmic problem-solving · Object-oriented principles.

Projects

GenAnnotator
Django · Next.js · Docker · Redis · Biopython
Open-source full-stack platform for genomic annotation. Gene, peptide, and protein analysis with BLAST integration, JWT auth, and async processing via Huey.
Variable Stars Classification
Python · Scikit-learn · RAMP · CI
ML pipeline for automated classification of variable stars from light curve data using feature extraction, ensemble learning, and cross-validation.
ReproHackathon
Nextflow · Bash · R · Singularity
Group project replicating a 2019 bioinformatics paper to validate reproducibility of computational workflows using modern containerisation tools.
Workflow Visualiser
JavaScript · D3.js · Graph APIs
Interactive visualisation system for the graph outputs of scientific workflows (LISN internship), making complex pipeline results accessible for analysis and reuse.
Jaffar Gura — CV Updated March 2026