The above is my 3-monitor setup at the Sanger Institute, snapped by a colleague as I was doing statistical analysis, or rather dealing with some hostile aliens in Mass Effect: Andromeda. As you can see, my desktop is on the messy side. The Coke bottle is not mine, but the flags are.
The journey so far
Along with a group of 5 other students, I acted as a consultant for a healthcare charity. We developed models for the health economics and social return on investment of Obsessive-Compulsive Disorder (OCD), and produced a report and recommendations.
April-June 2017
Helped organise and run the 1st Volos Summer School of Human Genetics
The Summer School is aimed at training Greek students of all scientific backgrounds in the basics of human genetic research. The School runs over 2 days of theoretical lectures in genetics and statistics, combined with practical workshops that cover the technical aspects of genetic data analysis. I developed the website for the event, ran several lectures and workshops, and managed the GitHub repository where the materials are stored.
April-June 2017
📰Article (Human Molecular Genetics)
Very low-depth sequencing in a founder population identifies a cardioprotective APOC3 signal missed by genome-wide imputation.
May 2016
Started working as Principal Bioinformatician
The Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
Following on from the low-depth sequencing work, I kept working on the HELIC study, this time focusing on high-depth, population-wide sequencing. The unprecedented scale of the datasets allows the exploration of a previously untapped section of the genetic landscape: rare mutations with comparatively low effects on human traits and diseases. To address the growing need for analysis, I started supervising a Senior Bioinformatician, while continuing to train visiting workers and students.
Skills used: Python (pandas, plotly, bokeh, seaborn), R, bash (sed/awk), REST/JSON
June 2016
Started a PhD
Department of Public Health and Primary Care, University of Cambridge
April 2016
UNEP-WCMC, Cambridge, UK
Nov 2014
📰Article (Nature Communications)
Genetic characterization of Greek population isolates reveals strong genetic drift at missense and trait-associated variants
Nov 2014
Cambridge Big Biology Day
Hills Road Sixth Form College, Cambridge, UK
The Big Biology Days are an initiative by the Society of Biology to bring together researchers from the life sciences and young students, as well as the broader public. The goal is to introduce children to the basic concepts of biology through play activities.
I ran the DNA sequencing, Sorting Algorithm and DNA alphabet activities.
October 2014
📰Article (Briefings in Functional Genomics)
Using population isolates in genetic association studies.
Nov 2014
Started working as Senior Bioinformatician / Statistical Geneticist
The Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
Low-depth sequencing makes it possible to assess the genetic landscape of individuals at a lower cost than other methods, but also with lower accuracy. I am in charge of developing an analysis pipeline that accurately predicts millions of mutations in small (thousands) to large (hundreds of thousands) human populations, and uses them to model traits such as disease status, height or cholesterol levels.
This involves large multidimensional analyses to characterise and choose the best types of population to sequence, building and testing disease models, and data quality control, among other things.
Skills used: Bash, Perl, C++ (boost libraries), Python (scipy libraries), R, Tableau
Statistics used: Logistic regression, PCA/MDS, Linear Mixed Models, hypothesis testing, clustering (k-means), methods for sparse data (imputation)
August 2013
Crossed the Channel to Cambridge, UK 🏡☔
August 2013
Article (BMC Bioinformatics)
TE-Tracker: Systematic identification of transposition events through whole-genome resequencing
March 2013
Transposons are small DNA sequences that have the ability to copy-paste or cut-and-paste themselves around in the genome. Some of them are thought to be virus sequences that were absorbed in the genomes of modern species throughout evolution. Due to their potential for genetic disruption, they are usually silenced (inactivated) in most living organisms. They are thought to play a role in certain genetic disorders. I co-wrote TE-Tracker, a Perl tool that allows the detection of transposition events via whole-genome sequencing.
Ran statistics courses at CEA/Genoscope
Topics covered: Introduction to statistics, estimation theory, hypothesis testing, regression models, (M)AN(C)OVA, model building.
Fortnightly 2-hour sessions and practicals using R.
Autumn/Winter 2012
Started working as Research Engineer
CEA/Genoscope, Evry, Paris, France
Genome sequencing reads the DNA from living cells by duplicating it many times and breaking it into small fragments. The fragments can be read individually, but the whole sequence has to be reconstructed algorithmically, which is done using a reference sequence onto which the fragments are aligned. However, when structural variations (such as copy and paste of long sequences) affect the genome, alignment will fail.
I designed a program, TE-Tracker, that clusters sequencing reads that do not align properly and tries to model the structural mutation that produced them.
Skills used: Bash, Perl, C++ (boost libraries)
Statistics used: clustering (single-linkage), supervised classification
January 2012
Started working as R&D Engineer (short-term contract)
Misys Sophis, Paris, France
When predicting the value and risk of an investment portfolio, each financial asset needs to be priced accurately by incorporating market data. Due to the complexity of exotic financial products, no explicit formula can be derived to value them using probability theory. It is, however, possible to estimate their price using simulation.
A client requested this functionality in a call for tender, so I was sent to the Paris R&D department to develop it. I used a Monte-Carlo pricing algorithm that models the behavior of the derivative under different volatility scenarios using a volatility surface based on historical measurements.
Skills used: C#
Statistics used: Monte-Carlo methods, Stochastic processes, derivatives valuation models
August 2011
MSc in Applied Mathematics 🎓
Department of Mathematical Modelling, Image and Simulation, Grenoble INP-ENSIMAG, Grenoble, France
August 2011
Started working as Junior Front Office Consultant (intern)
Misys Sophis, Hong Kong SAR, P.R. of China 🇭🇰
Misys Sophis is a major solutions provider to the banking and fund management industry. Its main products at the time, Risque and Value, were designed to manage large investment portfolios down to the single trade. My role was to provide technical support to institutional clients and to respond to specific client demands and calls for tender.
Skills used: C#, Excel, SQL
January 2011