This content is from the fall 2016 version of this course. Please go here for the most recent version.

Git and Python

The short answer is that Git works with Python just as well as it does with R. In fact, you can use RStudio as an editor to write and run Python (.py) scripts. That said, plenty of IDEs built specifically for Python have Git integration. Otherwise, the fallback option is to use Git from the shell or from whatever Git client you chose to install.

Git and Jupyter Notebooks

Jupyter Notebooks present a trickier challenge for tracking and syncing with Git. They can be committed and synced with a GitHub repo like any other file, and GitHub will even render your notebook directly on your GitHub repo page.

The main issue is how Jupyter Notebooks store their content. What you see in the Notebook Viewer is a rendered copy of the notebook. The actual notebook file looks something like this:

{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Hello world\n"
     ]
    }
   ],
   "source": [
    "print('Hello world')"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python [Root]",
   "language": "python",
   "name": "Python [Root]"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}

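Jupyter Notebooks don’t store their content as a plain script, but as a structured JSON document like this one. The file is still text, so Git can track it, but each cell's code sits alongside metadata and output. This is fine when it comes to storing and tracking code, but because the file also includes all output, re-running a cell can change the file even when the code itself is unchanged. Sometimes you want to track changes to your code without storing all the output as well.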
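To make the structure concrete, here is a minimal sketch that parses a trimmed copy of the notebook above using only Python's standard json module and pulls out each code cell's source next to its printed output. The function name code_and_output and the inline NOTEBOOK_JSON string are illustrative assumptions, not part of Jupyter itself.

```python
import json

# A trimmed copy of the notebook shown above, kept inline so the sketch runs on its own.
NOTEBOOK_JSON = """
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {"name": "stdout", "output_type": "stream", "text": ["Hello world\\n"]}
   ],
   "source": ["print('Hello world')"]
  }
 ],
 "metadata": {},
 "nbformat": 4,
 "nbformat_minor": 0
}
"""


def code_and_output(notebook):
    """Return (source, printed text) pairs for each code cell in a parsed notebook."""
    pairs = []
    for cell in notebook["cells"]:
        if cell["cell_type"] != "code":
            continue
        # "source" is stored as a list of lines (or a single string); join handles both.
        source = "".join(cell["source"])
        # Only "stream" outputs carry a "text" field; other output types are skipped here.
        printed = "".join(
            "".join(out["text"])
            for out in cell.get("outputs", [])
            if out.get("output_type") == "stream"
        )
        pairs.append((source, printed))
    return pairs


notebook = json.loads(NOTEBOOK_JSON)
for source, printed in code_and_output(notebook):
    print("code:  ", source)
    print("output:", printed, end="")
```

Both the code and its result live in the same cell entry, which is why a commit of the notebook file captures output along with code.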

If that is the case, there are methods for stripping the output from a Jupyter Notebook and committing only the changes to the code. These methods are beyond the requirements of this course, though you might want to explore them in the future. For this course, I recommend you simply track all changes, including both code and output. Because Git is not integrated into the Jupyter Notebook server, you will need to use your Git client or the shell to stage, commit, and sync changes.
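For those who do want to explore output stripping later, the idea can be sketched in a few lines of standard-library Python. This is one possible approach under stated assumptions, not the method required for this course: the function name strip_outputs and the in-memory example notebook are hypothetical, and dedicated tools handle more edge cases than this does.

```python
import copy
import json


def strip_outputs(notebook):
    """Return a copy of a parsed notebook with code-cell outputs and counters cleared."""
    stripped = copy.deepcopy(notebook)  # leave the caller's notebook untouched
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None  # serialized as null in the JSON file
    return stripped


# A tiny in-memory notebook (hypothetical example data) so the sketch runs on its own.
example = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "outputs": [
                {"name": "stdout", "output_type": "stream", "text": ["Hello world\n"]}
            ],
            "source": ["print('Hello world')"],
        }
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 0,
}

cleaned = strip_outputs(example)
print(json.dumps(cleaned, indent=1))
```

To apply this to a real file, you would load the notebook with json.load, pass it through strip_outputs, and write the result back with json.dump before staging the file. The code cells keep their source, so the Git history then reflects only changes to the code.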

This work is licensed under the CC BY-NC 4.0 Creative Commons License.