Post-PyCon 2017 Thoughts

written by Eric J. Ma on 2017-05-21

PyCon 2017 is over! Well, for me at least, as I head back to Boston, a place I've had to call home for the past 6 years.

I've noticed my Portland PyCons have felt different from my Montreal PyCons.

In Montreal, I felt more like a taker, a newcomer, a beginner. In Portland, I felt more like someone who could finally give back to the community. If anything, I hope I've been able to encourage others to also give back to the community.

In Montreal, with respect to the community, I felt like I had to slowly navigate a new landscape of networks of people. There, I met the people who first became my PyCon community mentors: Stuart Williams and Reuben Orduz, whose years of experience in the community and in life are way beyond mine, became long-distance friends whom I would look forward to meeting again at the next PyCon. Carol Willing, a fellow MIT alum whom I met at a SciPy conference, likewise became a community mentor for me. They didn't have to do much: offering words of encouragement, urging us to contribute back while themselves leading the charge, and connecting people together.

These two years in Portland, I've instead started to get involved with the internal organization of PyCon, volunteering a bit of my time on the Financial Aid committee. That's where I got to meet even more people in the community, and in person too! There's LVH and Ewa, a husband-and-wife team who have made many community contributions. Karan Goel, a software engineer at Google, led FinAid this year; I shadowed him in preparation for taking on next year's FinAid chair role (I think we'll just share the duties again like this year). Kurt, the PSF's treasurer, has been doing this for decades, still loves programming even at his age, and loves black decaf coffee. Brandon Rhodes, a Python community celebrity known for his eccentricity and entertaining talks, gave me many words of encouragement as I rehearsed my PyCon talk. And Ned Jackson Lovely is best described by no words other than "positive energy radiating through everything he does".

I think the PyCon community has done the "community building" portion of coding really well, and I'm thankful to be part of this community of people. Good code is about bringing a benefit to people. So at the end of the day, while programming is an act of making routine things efficient, it's ultimately still about people, not code in and of itself. Thank you, PyCon community; it's been really fun being a part of the community thus far, and I'm looking forward to many more years too!


Thesis Defence Video!

written by Eric J. Ma on 2017-05-15

About two weeks after being done, my thesis defence video is up on YouTube! It can be found here: https://youtu.be/ePqhQusK-3Q?t=1m23s.

My favourite parts are recollecting the thought of being scooped by someone else 4 years ago, saying that some people like doing sampling, and stating how the lessons from my first committee meeting have been passed on. Ahh, so many good memories!


Why I Teach Coding Tutorials

written by Eric J. Ma on 2017-05-13

I'm very excited to be at PyCon! It's a bit of a personal challenge this year, as I'll be leading two tutorials, one on Network Analysis and one on Data Testing.

With a bit of time on hand, I've done a bit of introspection as to why I love doing these tutorials. I think I can boil it down to a few broad themes.

Reason 1: Learning. When it comes to learning material, nothing beats having to teach it to someone else. This means I have to master the material in order to teach it responsibly to someone else.

Reason 2: Reputation. With mastery of the material as the foundation, getting out there helps me build a reputation for both technical mastery and the ability to communicate the material.

Reason 3: Networking. Going to conferences where my tutorials are accepted is a great way to meet people and learn about the latest and greatest out there.

My hope is that wherever I end up working, I can continue this craftsmanship!


PyCon 2017: Tutorials and Talk Preview!

written by Eric J. Ma on 2017-05-04

This year, I'll be at PyCon 2017 presenting two tutorials and one talk! I'm very excited to be attending!

The first tutorial I will deliver is on network analysis. The GitHub repository is online, and is the most mature of the three. This will be my 3rd year teaching the tutorial; I first developed the material in 2015, and have been refining it ever since. This year, I have great help from Mridul Seth, a student from India who has also been doing network analysis.

The second tutorial I will be leading is on testing practices for data science. The GitHub repository is online; the tutorial will cover the use of automated tests for checking code and data integrity, as well as the use of visualization methods in EDA to sanity-check the data. The material is still in development right now, and I'm hoping to get good feedback from the Boston Python community when I dry-run it locally.

My talk will be on Bayesian statistical analysis using PyMC3. As usual, the materials are available online on GitHub. In it, I will cover the two most common types of statistical analysis problems (parameter estimation and comparison of treatment with controls), and demonstrate the process of reasoning through model building, implementing it in PyMC3, and interpreting the data.

Really excited to be making three contributions back to the Python community. I've benefited much from the use of Python tools, and every PyCon I learn something new, so this is my little way of giving back!


Managing conda environments

written by Eric J. Ma on 2017-05-03

I recently got around to hacking a system for managing my conda environments better. Previously, my coding projects mostly relied on one master environment (with exceptions, e.g. bokeh development, or my Network Analysis Made Simple tutorial), but conflicts started cropping up. Thus, I decided to separate out my environments. However, keeping track of which environments go with which projects began getting tedious.

I thus decided to automate some of the steps involved in maintaining environments, and keep everything centrally managed so my brain doesn't overload. It involves a bit of GitHub and a bit of bash scripting, but altogether gives a ton of flexibility and control over keeping my environments updated.

I start by keeping a central repository of conda environment YAML specifications. Mine is kept here. Each YAML specification includes just the minimum set of packages that I need; conda manages the dependencies.

For example, my environment specification for Bayesian statistical analyses looks like this:

name: bayesian  # for Bayesian analysis
channels: !!python/tuple
- conda-forge
- defaults
- ericmjl
dependencies:
- python=3.6
- matplotlib
- numpy
- pandas
- scipy
- seaborn
- pymc3
- jupyter
- jupyterlab

Now, I've not pinned specific versions here, because I like to keep up with the latest stable releases. However, if version pinning is desired, it's totally possible to pin specific packages to particular versions, using the same syntax as I did for python=3.6.
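As a small illustration of what pinning could look like, the sketch below writes a pinned variant of a spec from the shell. The file name, the environment name, and every version number other than python=3.6 are hypothetical placeholders chosen for the example, not versions taken from the central repository.

```shell
# Sketch: write a pinned variant of an environment spec.
# The file name, env name, and version numbers are hypothetical placeholders.
cat > bayesian-pinned.yml <<'EOF'
name: bayesian-pinned
channels:
- conda-forge
- defaults
dependencies:
- python=3.6
- numpy=1.12    # single "=" is a fuzzy match: any 1.12.x release
- pandas=0.20   # same idea for pandas
- pymc3         # unpinned: conda resolves the latest compatible release
EOF
```

Pinning to a minor version like this still lets bug-fix releases through, which may be a reasonable middle ground between fully pinned and fully floating.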

In each project repository, I have an update_env.sh script that looks something like this:

wget https://raw.githubusercontent.com/ericmjl/conda-envs/master/lektor.yml -O environment.yml
conda env update -f environment.yml

The key idea here is that I download only the relevant YAML file, save it as a generic environment.yml file, and then run the conda env update command on it to keep the environment up-to-date.
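The same idea can be written so that the spec name is a parameter rather than hard-coded. This is only a sketch: the function names spec_url and update_env and the temporary file name are made up for illustration, and only the base URL comes from the script above.

```shell
#!/usr/bin/env bash
# Sketch of a parameterized update_env.sh. Function names are hypothetical.

# Base URL of the central spec repository (taken from the script above).
SPEC_BASE_URL="https://raw.githubusercontent.com/ericmjl/conda-envs/master"

# Build the raw-file URL for a spec name, e.g. "lektor" -> .../lektor.yml
spec_url() {
    printf '%s/%s.yml\n' "$SPEC_BASE_URL" "$1"
}

# Download the spec to a temporary file, move it into place as
# environment.yml, then update the environment from it.
# The && chain stops at the first step that fails, so a failed download
# does not overwrite an existing environment.yml.
update_env() {
    wget -q "$(spec_url "$1")" -O environment.yml.tmp &&
        mv environment.yml.tmp environment.yml &&
        conda env update -f environment.yml
}

# Usage (not run here, since it needs network access and conda):
#   update_env lektor
```

Downloading to a temporary file first is a design choice rather than a requirement; it just avoids leaving a truncated environment.yml behind if the network drops.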

Now, here's the magic. I hacked Christine Doig's conda-auto-env script to execute update_env.sh, and then auto-activate the environment.
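To make the mechanism concrete, here is a minimal sketch of what such a hook might look like. It is not the actual conda-auto-env code or the modified copy described above; the function names are hypothetical, and it assumes the environment name sits on a "name:" line of environment.yml, as in the spec shown earlier.

```shell
# Sketch of an auto-env hook. Function names are hypothetical.

# Read the environment name from the "name:" line of a spec file.
# awk splits on whitespace, so a trailing "# comment" is ignored.
env_name_from_yaml() {
    awk '/^name:/ { print $2; exit }' "$1"
}

# Refresh and activate the environment for the current directory, if any.
auto_env() {
    # If the project provides update_env.sh, run it first so that
    # environment.yml and the environment itself are up to date.
    if [ -f update_env.sh ]; then
        bash update_env.sh
    fi
    # Nothing to activate if there is no spec in this directory.
    [ -f environment.yml ] || return 0
    # Assumption: "source activate" is the activation command in use;
    # newer conda releases use "conda activate" instead.
    source activate "$(env_name_from_yaml environment.yml)"
}

# One possible way to hook it in (assumption): wrap cd so the check
# runs each time the directory changes.
#   cd() { builtin cd "$@" && auto_env; }
```

Running the update on every directory change can be slow, so a real hook would probably want to skip the update when the environment is already active.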

If my environment needs change, I can always update the environment YAML spec file (e.g. lektor.yml, or bayesian.yml) in the central repository, and use that to automatically update individual project environments.