

Aaron Gullickson will teach GitHub for Data Analysis on September 22-24, 2022. Learn to use GitHub and integrate it into your workflow.

Increasingly, academic scholars, data scientists, and quantitative researchers are turning to GitHub for collaboration and to share data, code, and results. GitHub allows people to host public and private “repositories” that allow for the easy communication of research procedures and results. Underlying the GitHub architecture is the version control system, git, which provides further benefits to researchers. Version control systems like git have long been used by programmers to sanely collaborate and organize software projects, and they can serve the same purpose for quantitative researchers who spend much of their time coding. If you have never used version control systems, then the process can seem arcane and the benefits unclear. Once you start using a version control system, however, it becomes difficult to see how you ever got by before. What are some of the benefits of this approach?

First and probably foremost, git will allow you to collaborate sanely. Users make changes to code (“commits”) and then “push” them to a centrally shared repository (located at GitHub, for example). Other users can clone this repository and then easily “push” their own changes as well as “pull” in other changes, allowing everyone to easily stay in-sync. Complex coding projects can also be “branched” and then later merged back into the main code base to avoid conflicts between users.

Second, by using version control systems, you effortlessly create your own research log. All changes to your code base are committed to the repository with a brief description of your changes. Collectively these commits serve as a research log of your work. You no longer have to remember when exactly you added that one variable to the model. The changes are right there in your history log.

Have you been in the situation where you know your code kind of works, but maybe it’s not the best? People often do one of two things in this situation. You might just leave it alone, because why mess with a piece of code that seems to work? Are you trying to anger the coding gods? Alternatively, you might decide to fix it, but fearful that you might make it worse, you duplicate the file (with a name like “analysis_just_trying_a_thing_v2.R”). If you use version control, you can make radical changes to your code without fear. Did you mess it up? No worries, just revert back to the last “commit” and try again.
