Open to alternative roles

Dustin M. Hanke

Bioinformatician, Data Scientist & Biologist

Bridging evolutionary biology, machine learning, and computational science to extract meaning from complex data — from plasmid genomes to computational tool development.

Who I Am

Biology meets Data Science

I'm a bioinformatician, data scientist and biologist based in Kiel, Germany, with a PhD in computational evolutionary biology focused on molecular evolution in plasmid genomes. My work lives at the intersection of biology, data analysis, bioinformatics, and high-performance computing.

Since 2025 I have worked as an IT forensics analyst at the State Criminal Office, applying investigative data analysis to digital forensics cases.

I have deep expertise in deep learning, large-scale genomics pipelines, and scientific software development. I've authored peer-reviewed papers, taught university bioinformatics courses, and built ML pipelines ranging from transformer architectures to CNNs — all trained on HPC clusters.

6+ Peer-Reviewed Publications
PhD in Evolutionary Biology & Bioinformatics
6+ Conference Presentations
10+ Data Science-Related Certificates
Career

Professional Experience

From marine research diving to machine learning — shaped by curiosity and analytical rigor.

2025 — Present
IT Forensics Analyst
State Criminal Office
Digital forensics investigations, data analysis, and evidence processing using computational tools and methodologies.
2021 — 2025
Research Scientist (PhD)
Christian-Albrechts-Universität zu Kiel (CAU)
Doctoral research on molecular evolution in plasmid genomes. Developed SegMantX for DNA duplication detection. Taught bioinformatics courses: Python, R, Computational & Comparative Genomics.
2020 — 2021
Bioinformatician / Research Assistant (HiWi)
Genomic Microbiology Group, CAU Kiel
Bioinformatics analysis supporting research on plasmid evolution and bacterial genome dynamics.
2018 — 2019
Research Diver & Training Assistant
Research Diving Centre, CAU Kiel
Scientific diving for marine research and diver training support at the university's diving research centre.
2018
Research Diver
GEOMAR Helmholtz Centre for Ocean Research Kiel
Scientific diving contributions to marine research expeditions at one of Germany's leading oceanographic institutes.
Academic Background

Education & Training

2021 — 2025
PhD — Evolutionary Biology (Bioinformatics & Data Science)
Christian-Albrechts-Universität zu Kiel
Research on molecular evolution in plasmid genomes. Developed SegMantX, analysed pseudogene landscapes, applied transformer and LSTM models for transposable element detection on HPC clusters.
2021 — 2023
Machine Learning Degree
opencampus.sh, Kiel
Comprehensive ML curriculum: deep learning, NLP, computer vision, transformers, and ML project management. Multiple specialised certificates earned.
2019 — 2021
M.Sc. Biology — Bioinformatics Focus
Christian-Albrechts-Universität zu Kiel
Master's specialisation in bioinformatics, computational genomics, and molecular biology.
2015 — 2019
B.Sc. Biology
Christian-Albrechts-Universität zu Kiel
Bachelor's degree covering genetics, molecular biology, ecology, and marine biology.
Tech Stack

Skills & Expertise

From genomics pipelines to neural networks — a versatile computational toolkit.

🐍
Programming Languages
Python, R, Bash / Shell, SQL, C++
🧠
Machine Learning
TensorFlow, Keras, PyTorch, Hugging Face, Scikit-learn, Transformers, LSTM, CNN
📊
Data Analysis
Pandas, NumPy, MySQL, dplyr, tidyverse, Statistics
📈
Data Visualisation
Matplotlib, Seaborn, Plotly, ggplot2, TensorBoard
HPC & Infrastructure
Linux, HPC / SLURM, Snakemake, Git, Server Administration (supporting role)
🔬
Bioinformatics
Biopython, BLAST, MMseqs2, Prokka, SPAdes, IQ-TREE, Genome Assembly
🎓
Professional Skills
University Teaching, Project Management, Scientific Writing, English (fluent), MS Office
Bioinformatics & Machine Learning

Featured Projects

Open-source tools and deep learning applied to genomics, sequence analysis, and ecological classification.

Project 01 — Transformers
Transposon Detection via Transformer Models
Predict composite transposable elements in bacterial genomes using NLP-inspired deep learning.
Python, Hugging Face, Transformers, NER, MLM, Pandas, HPC V100
  • Addressed challenging detection of composite transposable elements whose boundaries are hard to determine due to missing terminal features, genetic diversity, and variable gene frequencies.
  • Represented genomes as sequences of protein-family tokens, trained a Masked Language Model (MLM) to learn bacterial genome grammar, and used Named Entity Recognition (NER) for element boundary detection.
  • Trained on Google Colab and on the NEC HPC system (CAU), using multi-GPU training on 4× NVIDIA Tesla V100 GPUs for scalability.
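As a rough illustration of the NER framing described above, the sketch below turns annotated element spans into one BIO label per gene over a sequence of protein-family tokens. The token names, the `TE` tag, and the `bio_labels` helper are hypothetical and not taken from the project code; the labelling scheme actually used may differ.

```python
from typing import List, Tuple


def bio_labels(n_genes: int, spans: List[Tuple[int, int]], tag: str = "TE") -> List[str]:
    """Return one BIO label per gene for half-open spans [start, end)."""
    labels = ["O"] * n_genes
    for start, end in spans:
        if not 0 <= start < end <= n_genes:
            raise ValueError(f"span {(start, end)} outside genome of {n_genes} genes")
        labels[start] = f"B-{tag}"          # first gene of the element
        for i in range(start + 1, end):
            labels[i] = f"I-{tag}"          # remaining genes inside the element
    return labels


# Hypothetical genome: each gene is represented by its protein-family token.
genome = ["fam_12", "fam_7", "fam_301", "fam_7", "fam_44"]
# Hypothetical annotation: genes 1 to 3 form one composite element.
labels = bio_labels(len(genome), [(1, 4)])
for token, label in zip(genome, labels):
    print(token, label)
```

Token/label pairs of this shape are the usual input for a token-classification head, which is one common way to set up boundary detection as NER.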
Project 02 — LSTM
Transposon Detection via LSTM Networks
Supervised LSTM model detecting composite transposable elements through gene frequency and positional analysis.
Python, TensorFlow, LSTM, Scikit-learn, Seaborn, HPC V100
  • Tackled composite transposable elements — flanked by inverted repeats and moving as single units — whose detection is challenged by missing terminal features and compositional diversity.
  • Developed a supervised LSTM neural network predicting transposable element regions from gene frequencies, protein family clusters, and positional data simultaneously.
  • Trained on the NEC HPC-System (University of Kiel) with 4× NVIDIA Tesla V100 GPUs on full bacterial genome datasets.
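To make the three input signals named above more concrete, here is a minimal sketch of how per-gene features could be assembled before being fed to a sequence model. The feature definitions (fraction of genomes containing a family, an integer family index, and relative position along the genome) and all names are assumptions for illustration, not the project's actual preprocessing.

```python
from collections import Counter
from typing import Dict, List, Tuple


def family_frequencies(genomes: List[List[str]]) -> Dict[str, float]:
    """Fraction of genomes in which each protein family occurs at least once."""
    counts = Counter(family for genome in genomes for family in set(genome))
    return {family: n / len(genomes) for family, n in counts.items()}


def gene_features(genome: List[str], freqs: Dict[str, float],
                  vocab: Dict[str, int]) -> List[Tuple[float, int, float]]:
    """One (frequency, family index, relative position) triple per gene."""
    last = max(len(genome) - 1, 1)  # avoid division by zero for 1-gene genomes
    return [(freqs.get(fam, 0.0), vocab.get(fam, 0), i / last)
            for i, fam in enumerate(genome)]


# Hypothetical toy data set of three genomes.
genomes = [["fam_a", "fam_b", "fam_c"], ["fam_a", "fam_c"], ["fam_a", "fam_d"]]
freqs = family_frequencies(genomes)
# Index 0 is reserved for families not seen when building the vocabulary.
vocab = {fam: i for i, fam in enumerate(sorted(freqs), start=1)}
features = gene_features(genomes[0], freqs, vocab)
print(features)
```

Each genome then becomes a sequence of fixed-width feature triples, which is the shape a recurrent layer such as an LSTM expects.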
Project 03 — CNN
Mushroom Classification with CNNs
Image classification CNN distinguishing edible from non-edible mushrooms — deep learning meets ecology.
Python, CNN, Kaggle, TensorBoard, ImageDataGenerator, Matplotlib
  • Developed a CNN to classify mushrooms by taxonomic group, focusing on beginner-friendly taxa (the genus Russula and the order Boletales) to aid in identifying edible versus toxic species.
  • Applied data augmentation via ImageDataGenerator, transfer learning, and TensorBoard for training monitoring and performance visualisation.
  • A practical demonstration of computer vision applied to real-world ecological and safety challenges.
Project 04 — Bioinformatics
SegmentationR — Genome Duplication Analysis in R
R-native segmentation approach for detecting duplicated regions in genome sequences — the conceptual precursor to SegMantX.
R, Bioconductor, ggplot2, dplyr, Sequence Analysis, Linux
  • Implements a segmentation algorithm in R to identify and characterize duplicated regions within genome sequences, leveraging R's strengths in statistical computing and data visualisation.
  • Served as the methodological foundation and prototyping environment for the later Python/HPC-optimised SegMantX tool.
  • Produces ggplot2-based visualisations of duplication landscapes across genomic coordinates for exploratory analysis and publication-ready figures.
Project 05 — Data Tools
RMySQL — SQL Database Control via R
Lightweight R tool for executing MySQL commands by communicating directly with the Linux terminal — bridging R and relational databases.
R, MySQL, Linux / Bash, DBI
  • Provides a simple R interface that passes SQL commands to a MySQL server via the Linux terminal, enabling R users to query and manage databases without switching environments.
  • Designed for bioinformatics workflows where large genomic or metadata tables are stored in relational databases and need to be pulled directly into R for downstream analysis.
  • Demonstrates practical systems integration: connecting R's statistical ecosystem to Linux-native database infrastructure through shell communication.
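The tool itself is written in R. As a hedged Python analogue of the same idea, the sketch below builds an argument list for the `mysql` command-line client so that a SQL statement can be handed to the shell. The helper name, database, table, and user are hypothetical; only the documented client options `--batch`, `--user=` and `--execute=` are used.

```python
import shlex
from typing import List, Optional


def build_mysql_command(sql: str, database: str, user: Optional[str] = None) -> List[str]:
    """Build an argument list for the `mysql` command-line client."""
    cmd = ["mysql", "--batch"]            # --batch: tab-separated, script-friendly output
    if user is not None:
        cmd.append(f"--user={user}")
    cmd.append(f"--execute={sql}")        # run one statement, then exit
    cmd.append(database)
    return cmd


# Hypothetical database, table, and user names.
cmd = build_mysql_command("SELECT COUNT(*) FROM plasmids;", "genomes_db", user="analyst")
print(shlex.join(cmd))
# To run it: subprocess.run(cmd, capture_output=True, text=True, check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems when the SQL contains spaces or quotes.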
Research Output

Scientific Publications

Peer-reviewed research in evolutionary biology, genomics, and bioinformatics.

Science 2026 Co-Author
Repurposing of a DNA segregation machinery into a cytoskeletal system controlling cell shape
Springstein BL et al.
doi: 10.1126/science.aea6343
Mol. Biol. Evol. 2025 First Author
SegMantX: A novel tool for detecting DNA duplications uncovers prevalent duplications in plasmids
Hanke DM, Dagan T.
doi: 10.1093/molbev/msaf242
PhD Thesis 2025 First Author
The prevalence of gene duplication and non-functionalization in plasmid evolution
Hanke DM. Christian-Albrechts-Universität zu Kiel.
→ Dissertation (open access)
Nucleic Acids Res. 2024 First Author
Pseudogenes in plasmid genomes reveal past transitions in plasmid mobility
Hanke DM, Wang Y, Dagan T.
doi: 10.1093/nar/gkae430
BIOspektrum 2024 First Author
2-Methylhopan als Marker für Cyanobakterien in der Erdgeschichte
Hanke DM, Dagan T.
doi: 10.1007/s12268-024-2096-y
Env. Microbiol. Reports 2023
Role of natural transformation in the evolution of small cryptic plasmids in Synechocystis sp. PCC 6803
Nies F, Wein T, Hanke DM, Springstein BL, Alcorta J, Taubenheim C, Dagan T.
doi: 10.1111/1758-2229.13203
mSphere 2022
Natural Competence in the Filamentous, Heterocystous Cyanobacterium Chlorogloeopsis fritschii PCC 6912
Nies F, Springstein BL, Hanke DM, Dagan T.
doi: 10.1128/msphere.00997-21
Scientific Community

Conferences & Talks

Conference Talks
EvolSea 2024, SMBE 2024
Poster Presentations
EvolSea 2022, ISPB 2022, EvolSea 2023, SMBE 2023
Credentials

Certificates & Training

🎓
Machine Learning Degree
opencampus.sh — 2023
🧠
Deep Learning Specialisation
opencampus.sh — 2022/2023
Machine Learning with TensorFlow
opencampus.sh — 2021/2022
🤗
Transformers for NLP and Beyond
opencampus.sh — 2022
🏆
ML Project Manager
Coding.Waterkant Hackathon — 2022
📚
Neural Networks & Deep Learning
Coursera / DeepLearning.AI
🛠️
DeepLearning.AI TensorFlow Developer
Coursera (4-course specialisation)
📋
Project Management
opencampus.sh
🌐
Web Development (HTML, CSS, JS, Bootstrap)
opencampus.sh
📖
University Didactics
CAU Kiel — Academic Teaching
Let's Work Together

Get In Touch

Open to freelance projects, consulting, and collaboration in data science, bioinformatics, and machine learning.

Email
dr.dustin@martin-hanke.de