My open source contributions are listed below.

Open Source Package

I am consolidating my Hive Plot and Circos Plot implementations into a single visualization package. Design goals include:

  • A rational naming system for each plot (where one does not already exist):
    • Hive Plots
    • Circos Plots
    • Panel Plots (with Horizontal/Vertical orientations)
    • Arc Plots
    • ...and more!
  • A declarative API for each type of plot, which lets the user declare:
    • Node positioning
    • Node size (area, radius)
    • Node colour
    • Edge colour
    • Edge linewidth

Contributors include Jon, Leo, and Nelson; I am very happy to be working with them!

Resource: GitHub Repository

Code

polcart is a small utility for converting between cartesian (x, y) and polar (r, θ) coordinates. It is generally useful across plotting interfaces.
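The conversion itself can be sketched in a few lines. This is a minimal illustration using only the standard library; the function names `to_polar` and `to_cartesian` are hypothetical and are not necessarily polcart's actual API. Angles are assumed to be in radians.

```python
import math


def to_polar(x, y):
    """Convert cartesian (x, y) to polar (r, theta), with theta in radians."""
    r = math.hypot(x, y)      # distance from the origin
    theta = math.atan2(y, x)  # angle in (-pi, pi], correct in every quadrant
    return r, theta


def to_cartesian(r, theta):
    """Convert polar (r, theta) back to cartesian (x, y)."""
    return r * math.cos(theta), r * math.sin(theta)


# Hypothetical usage: a round trip should recover the starting point
# up to floating-point error.
r, theta = to_polar(3.0, 4.0)
x, y = to_cartesian(r, theta)
```

Using `atan2` rather than `atan(y / x)` avoids a division by zero on the y-axis and keeps the angle in the correct quadrant.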

Resource: GitHub Repository

Code

pyflatten is a package for flattening nested data structures. The code was originally developed by David Duvenaud, Matt Johnson and Dougal MacLaurin; I packaged it into an independent utility.
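To illustrate what flattening a nested data structure means, here is a minimal sketch. The `flatten` function below is a hypothetical stand-in written for this page, not pyflatten's actual API, and it assumes the nesting consists only of lists, tuples and dicts.

```python
def flatten(nested):
    """Yield the leaf values of nested lists, tuples and dicts, depth-first."""
    if isinstance(nested, dict):
        # Dict values are visited in insertion order.
        for key in nested:
            yield from flatten(nested[key])
    elif isinstance(nested, (list, tuple)):
        for item in nested:
            yield from flatten(item)
    else:
        # Anything else (numbers, strings, ...) is treated as a leaf.
        yield nested


# Hypothetical example data: a dict holding a nested list and a tuple.
example = {"weights": [[1, 2], [3, 4]], "bias": (5, 6)}
flat = list(flatten(example))
```

Here `flat` holds the six leaf values in the order they appear in `example`.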

Resource: GitHub Repository

Code

I contributed an implementation of horizontal (HBar) and vertical (VBar) bar glyphs to Bokeh, with the help of Bryan Van de Ven and Sarah Bird (lead developers on Bokeh).

Resource: Pull Request

Code

Circos plots are an aesthetically pleasing way of visualizing networks, in which nodes are ordered around the circumference of a circle, and edges are drawn using Bezier curves within the circle. Because of the lack of a Python implementation, I wrote one with Justin Zabilansky and Jon Charest.
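The geometry described above can be sketched without any plotting library. This is an illustration only, not the package's implementation: the function names are hypothetical, nodes are assumed to be spaced evenly, and each edge is assumed to be a quadratic Bezier curve whose control point is the centre of the circle.

```python
import math


def circos_node_positions(nodes, radius=1.0):
    """Place nodes evenly around a circle; return {node: (x, y)}."""
    n = len(nodes)
    positions = {}
    for i, node in enumerate(nodes):
        theta = 2 * math.pi * i / n
        positions[node] = (radius * math.cos(theta), radius * math.sin(theta))
    return positions


def quadratic_bezier(p0, p1, p2, t):
    """Evaluate a quadratic Bezier curve at parameter t in [0, 1]."""
    u = 1 - t
    x = u * u * p0[0] + 2 * u * t * p1[0] + t * t * p2[0]
    y = u * u * p0[1] + 2 * u * t * p1[1] + t * t * p2[1]
    return x, y


def edge_curve(positions, source, target, steps=20):
    """Sample an edge as a Bezier curve bending towards the circle's centre."""
    start, end = positions[source], positions[target]
    centre = (0.0, 0.0)  # assumed control point
    return [quadratic_bezier(start, centre, end, i / steps)
            for i in range(steps + 1)]


# Hypothetical usage: four nodes and one edge between "a" and "c".
positions = circos_node_positions(["a", "b", "c", "d"])
curve = edge_curve(positions, "a", "c")
```

The sampled points in `curve` could then be passed to any line-drawing routine. Because the control point sits at the centre, every edge stays inside the circle.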

Resource: Source Code

Code

Hive plots are a rational way of visualizing networks. Out of frustration at the lack of a Python implementation, I wrote my own version, built on top of matplotlib, with an API designed to be compatible with networkx. This project also happens to be my first independent open source contribution, and interest in it on GitHub is growing.

Resource: Source Code

Code

I contributed to Matplotlib Enhancement Proposal 12 (MEP12), which proposed reorganizing the examples gallery to make it easier for matplotlib users to find relevant examples.

Part of the problem was that many examples still used the old pylab API, while the newer pyplot API is now preferred (pylab was really present mostly to ease the transition for MATLAB users). I helped fix the pylab examples by changing

from pylab import *

statements to explicit

import matplotlib.pyplot as plt
import numpy as np

Resource: Pull Requests

Workshop

network-analysis

I taught myself graph theory in graduate school, as a paradigm/tool for analyzing influenza evolutionary trajectories (please see my Research page for more info). Borrowing the theme of Allen Downey's "X Made Simple" series, I have started my own Network Analysis Made Simple series of Jupyter notebooks, to share this knowledge freely with everybody.

Resource: Notebooks

Workshop

I had noticed a growing interest at the Broad Institute in using machine learning (ML) to answer tough biological questions. During graduate school, I had taught myself the practical aspects of ML through the scikit-learn API; I found it to be a great introductory path into machine learning. In collaboration with Andres Colubri, David Dao and Jane Hung (Broad Institute), I put together a workshop for members of the Broad community, with the materials freely available.

Resource: Notebooks

Workshops

After two PyCons and one SciPy conference, I became convinced of the need to apply software engineering principles to data analytics. One theme that I identified was the need for (semi-)automated data checks. After meeting Renee Chu at PyCon 2016, I began collaborating with her on tutorial material that teaches how to write data tests and unit tests.
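As a rough idea of what an automated data check can look like, here is a minimal sketch. It is not taken from the tutorial material: the function names, field names and sample records are all hypothetical, and the data is assumed to be a list of dicts.

```python
def check_no_missing(rows, required_fields):
    """Return (row_index, field) pairs where a required value is absent or None."""
    problems = []
    for i, row in enumerate(rows):
        for field in required_fields:
            if row.get(field) is None:
                problems.append((i, field))
    return problems


def check_in_range(rows, field, low, high):
    """Return indices of rows whose value is present but outside [low, high]."""
    return [
        i for i, row in enumerate(rows)
        if row.get(field) is not None and not (low <= row[field] <= high)
    ]


# Hypothetical sample records with two deliberate problems.
records = [
    {"sample": "A", "temperature": 36.6},
    {"sample": "B", "temperature": None},   # missing value
    {"sample": "C", "temperature": 120.0},  # implausible value
]

missing = check_no_missing(records, ["sample", "temperature"])
out_of_range = check_in_range(records, "temperature", 30.0, 45.0)
```

Checks like these can be wrapped in ordinary unit tests (for example, asserting that both lists are empty) so that they run automatically whenever the data changes.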

Resource: Notebooks