Scientific Computing

Tiago Antao

Bioinformatics with Python, the book

If you are interested in using Python for Bioinformatics, you might want to check out my new book, which provides a comprehensive set of practical recipes in Python covering topics such as Next-Generation Sequencing, Genomics, Population Genetics, Phylogenetics and Proteomics, plus advanced Python tips related to high-performance computing and interacting with R from Python.

The book has a free GitHub repository.

Virtual Core - Turn-key solution for data science

Virtual Core is a turn-key solution to deploy a complete set of data-science core services that can be used as a base infrastructure for big data analysis. The solution is based on Docker and it currently includes:
  • An LDAP container, along with a web interface (phpldapadmin)
  • Zabbix-based monitoring
  • PostgreSQL database server
  • An extensible software container, currently including Python and R tools for data-science (based on Anaconda)
  • A user container, where users can log in and run all data-science applications
  • A file server, capable of routing data from other file servers (e.g. communicating with a Samba server and exposing an NFS interface)
  • Cluster software (currently only SLURM)
  • A SLURM-compute container that can be deployed across a cluster (via Docker Swarm)
  • An exploratory analysis server, which includes JupyterHub
  • Plain web server
Most communications are SSL secured, and the system can work as an ad hoc SSL certification authority. The containers can be split across a cluster with Docker Swarm. A wizard is included to get the system up and running as fast as possible.

Software

Detection of genes under selection

There are two Jython-based applications available for selection detection:

Epidemiology of drug resistant malaria

OgaraK allows the simulation of the spread of drug resistant malaria. It was developed in Groovy.

Pygenomics

I am currently developing a population genetics and genomics library called pygenomics (no prize for name originality).

Domain Specific Language for Pharmacology

A very old project is a DSL for pharmacokinetics of drug treatments.

Biopython

I collaborate with the Biopython project. I am mostly responsible for the Population Genetics module. I was the release manager for a couple of versions, was involved in the Python 2 to 3 conversion, and installed the integration server (Buildbot). I did other bits, like a BioSQL version using the MySQL pure-Python driver. I am currently interested in developing teaching material with IPython Notebooks; to that effect, I am converting parts of the Biopython tutorial to Notebook format.

Automated code translation

I have been involved in several projects related to automated code translation, for example converting legacy 3270 CA Gen applications to modern web interface ones.
On the scientific computing front, I did a semi-automated translation of Fortran to C code in the OpenMalaria simulator.
I have also been involved in the conversion of Python 2 to 3 code (arguably an easier task than the ones above), for example with Biopython and Abjad.

Tutorials

Some tutorials using Jupyter Notebooks

I am also starting to develop a Data Science Course. This is a more ambitious project including text, notebooks and, in the future, videos.

Technologies

From time to time I also use

Web and VR

Bioinformatics

Programming languages I like

© Tiago Antao (2016)