Hi, I'm Roman Schulte-Sasse.

Data Scientist, Machine Learning Engineer, Software Developer, Bioinformatician

About

Let me introduce myself.

I am a passionate data scientist and machine learning engineer with experience in deep learning, robotics and computational biology. I care about applying advanced ML models to complex problems to gain new insights, and about the interpretability of predictive systems.

Profile

After a bachelor's degree in computer science, I joined our university's robot soccer team for several years during my master's degree. I worked on computer vision problems and robot localization, applying ML algorithms. For my master's thesis, I switched to computational biology, where I stayed for my doctorate. In my PhD, I predicted cancer genes using graph deep learning (see publications).

General Info

  • Full Name: Roman Schulte-Sasse
  • Location: Berlin (but open to remote work)
  • Email: schultesasse(atsymbol)posteo.de

Languages

  • German: Mother Tongue
  • English: Fluent
  • French: Fluent

Skills

I have extensive experience in data analysis, data cleaning and machine learning. I enjoy adapting algorithms to specific domains, especially computational biology. I work mostly in Python (and occasionally R) with TensorFlow, pandas and NumPy, but I have also completed projects in C++, MATLAB and Java.

  • Machine Learning: 95%
  • Deep Learning: 90%
  • Data Science: 90%
  • Bioinformatics: 80%
  • Computational Cancer Biology: 80%
  • Network Analysis: 80%

Resume

My professional work and education in detail.

Education

PhD Graduate Student

January 2017 - January 2021

Max Planck Institute for Molecular Genetics

During my PhD, I developed and adapted a deep learning model to learn about the molecular mechanisms that lead to disease. I used graph deep learning to integrate different types of experimental molecular data from patients to predict cancer-related genes. Using feature interpretation methods for neural networks, we were able to identify the factors driving the predictions for individual genes.

Master's Degree

October 2014 - October 2016

Freie Universität Berlin

I took focused courses on machine learning during this time and deepened my knowledge of linear algebra and computer vision. I continued working in robotics but eventually switched to computational biology. My master's thesis was about detecting patterns in DNA sequences using convolutional restricted Boltzmann machines.

Bachelor's Degree

October 2009 - June 2014

Freie Universität Berlin

I studied computer science at the Freie Universität Berlin (Free University of Berlin). I started working with the FUmanoids, the university's soccer-playing robot team, and wrote my bachelor's thesis on robot localization based on landmarks identified using computer vision.

Work Experience

Machine Learning Researcher

May 2021 - Present

AIgnostics

I design architectures, train models and conduct analyses on pathology images to improve the treatment and understanding of cancer.

Student Research Assistant

January 2016 - November 2016

Max Planck Institute for Molecular Genetics

I worked at the MPI for Molecular Genetics while writing my master's thesis. There, I worked on the publication of our method for identifying transcription factor binding sites with convolutional restricted Boltzmann machines and extended the framework to multiple layers (a deep belief network), producing an unsupervised deep learning architecture.

Student Research Assistant

August 2015 - December 2015

Humboldt-Universität zu Berlin

I worked on the automatic tracking of tendons in ultrasound videos for research at the sports faculty. In this role, I continued developing a MATLAB framework for semi-automated tendon tracking. Ideally, the researcher selects only the first point, which is then recognized in all subsequent frames.

Student Developer

January 2013 - September 2015

FUmanoids

During my time with the FUmanoids, the humanoid soccer-playing robots of the Freie Universität Berlin, I worked on modeling and computer vision in C++. For my bachelor's thesis, I developed a localization framework based on particle filters (see publications) and afterwards developed SVM-based ball recognition. I also implemented a strategy model.

Intern

May 2009 - September 2009

IVU Traffic Technologies

Before starting university, I did an internship during which I first came into contact with professional software engineering. I assisted with the build process of a complex software system using Maven, Subversion and Make.

Publications

A list of my academic publications.

Integration of Multi-Omics Data with Graph Convolutional Networks Identifies New Cancer Genes and their Associated Molecular Mechanisms.

We predicted cancer genes by integrating several molecular data types such as mutation rates (single nucleotide variants and copy number changes), DNA methylation, gene expression and protein-protein interaction data. To integrate these data types successfully, we used graph convolutional networks and applied gradient-based a posteriori feature interpretation methods to disentangle the molecular alterations underlying our classifications.

Cancer, Data Integration, Graph Convolutional Networks, Interpretable Machine Learning

Learning Representations of Sequences using convolutional restricted Boltzmann Machines.

In my master's thesis, I used unsupervised convolutional architectures to learn transcription factor binding site motifs. Transcription factors (TFs) often bind preferentially to a certain stretch of DNA, determined by its nucleotide sequence. We used convolutional restricted Boltzmann machines to find TF binding sites by searching for over-represented patterns of nucleotides.

Convolutions, Restricted Boltzmann Machines, Bioinformatics, Machine Learning

TriPepSVM: de novo prediction of RNA-binding proteins based on short amino acid motifs.

In this work, we used support vector machines with string kernels to predict RNA-binding proteins (RBPs) from their amino acid sequence in human and bacteria. We find that tri-peptides (sequences of three amino acids) distinguish well between RNA binders and non-binders, and that several RBP-enriched tri-peptides occur more often in structurally disordered regions of RBPs.

RNA-binding proteins, Support Vector Machines, Bioinformatics, String Kernels
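
The idea of comparing proteins by their tri-peptide content can be illustrated with a minimal sketch. It assumes a plain spectrum kernel (the dot product of k-mer count vectors), which is one common string kernel and not necessarily the one used in TriPepSVM. The function names and the toy sequences are hypothetical and chosen only for illustration.

```python
from collections import Counter


def tripeptide_counts(sequence, k=3):
    """Count overlapping k-mers (k=3 gives tri-peptides) in a protein sequence."""
    seq = sequence.upper()
    # A sequence shorter than k yields an empty Counter.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(seq_a, seq_b, k=3):
    """Spectrum kernel: dot product of the two k-mer count vectors.

    Assumption: this is a generic string kernel for illustration,
    not the exact kernel or parameters of the published method.
    """
    counts_a = tripeptide_counts(seq_a, k)
    counts_b = tripeptide_counts(seq_b, k)
    # Counter returns 0 for k-mers that are missing from counts_b.
    return sum(n * counts_b[kmer] for kmer, n in counts_a.items())


if __name__ == "__main__":
    # Toy sequences (hypothetical, not real proteins).
    print(tripeptide_counts("MKKKR"))
    print(spectrum_kernel("MKKKR", "KKKR"))
```

A kernel value computed this way for every pair of sequences forms a matrix that a support vector machine accepting precomputed kernels could be trained on.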

Unsupervised learning of DNA sequence features using a convolutional restricted Boltzmann machine.

This work is the published version of my master's thesis, in which we used unsupervised convolutional architectures to identify transcription factor binding sites. Transcription factors bind preferentially to specific genomic regions, thereby regulating the expression of other genes. Our method identifies such TF binding sites from experimental data using a convolutional restricted Boltzmann machine.

Convolutions, Restricted Boltzmann Machines, Bioinformatics, Machine Learning