How to share nicely!
Using Git, GitHub and DOIs to make your work reproducible
Ann Gledson, Research Software Engineer
The University of Manchester
Overview
- Git / GitHub
- Gitflow workflow
- Structuring your repository to be shareable
- Versioning / release process
- DOI: Link GitHub repo release with Zenodo
Git vs GitHub
- Git
- Distributed version control system
- Installed locally
- High quality version control system
- Excellent branching model
- Tags (snapshots of code for version release)
- GitHub (or GitLab, Bitbucket)
- Repository hosting service for Git
- Cloud-based
- Share code with others
- Resolve conflicts
- Releases (based on Git tags)
Basic use (using git and remote)
- Create a "repository" (project) with a git hosting tool (GitHub/Bitbucket)
- Copy (or clone) the repository to your local machine
- Add a file to your local repo and "commit" (save) the changes
- "Push" your changes to your 'main' branch
- Make a (remote) change to your file with GitHub/Bitbucket and commit
- "Pull" the changes to your local machine
- Create a "branch" (version), make a change, commit the change
- Open a "pull request" (propose changes to the 'main' branch)
- "Merge" your branch to the 'main' branch
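The local steps above can be sketched as a sequence of git commands. This is a minimal illustration, not a prescribed recipe: the function name `basic_workflow`, the repository URL, the file name, the branch name and the commit messages are all hypothetical placeholders. The remote edit, the pull request and the merge are done in the hosting tool's web interface, so they appear only as comments.

```python
import shlex


def basic_workflow(repo_url, filename, branch):
    """Return the git commands for the local steps of the basic workflow.

    All arguments are placeholders chosen for illustration. Commands after
    'git clone' are meant to be run from inside the cloned directory.
    """
    return [
        ["git", "clone", repo_url],                  # copy the repository locally
        ["git", "add", filename],                    # stage the new file
        ["git", "commit", "-m", f"Add {filename}"],  # "commit" (save) the change
        ["git", "push", "origin", "main"],           # "push" to the 'main' branch
        # (edit the file on GitHub/Bitbucket and commit there)
        ["git", "pull", "origin", "main"],           # "pull" the remote change
        ["git", "switch", "-c", branch],             # create and switch to a branch
        ["git", "commit", "-am", "Describe the change"],
        ["git", "push", "-u", "origin", branch],     # publish the branch
        # (open a pull request and merge it in the hosting tool)
    ]


if __name__ == "__main__":
    for cmd in basic_workflow("https://example.org/me/project.git",
                              "notes.txt", "my-change"):
        print(shlex.join(cmd))
```

The function only builds and prints the commands, so it is safe to run anywhere; typing the printed lines into a terminal performs the actual steps.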
Basic use: Local and Remote
Git - from local perspective
Git / GitHub
- Great tutorial: http://gcapes.github.io/git-course/
Gitflow Workflow
Gitflow branch types
- 'Main' (aka 'master') branch is public release code. Each version on main is tagged with a semantic version number (https://semver.org/). Code on main works.
- Release branch (if present) is temporary and is used to prepare a version release. (Alternatively, prepare the release on develop.)
- Commits on develop are stable and should work, but some unexpected behaviour may be present.
- Feature branches are temporary*. Commits on these may be broken/unstable.
Making your code shareable
- Tailor-made workflow works well if:
- Contributors are working in the same context, same goals
- Communicate frequently
- Or each individual is working on their own branch/version
- Falls down if:
- Code shared with external users
- Change in context / purpose
- Package users expect a single, main branch
- Extensibility logic is in the code, not branches
- Versions: whole code, working snapshots
- Data analysis: build in parameterisation
- Why create releases?
- code updating / improving
- external dependencies
- planned and tested snapshots / versions
- clearly ordered labels: orientate users
- Data analysis: The code that created a particular set of results.
- GitHub facilitates this...
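To illustrate "extensibility logic is in the code, not branches" and "build in parameterisation", here is a minimal sketch of one analysis function whose variations are selected by arguments rather than by keeping a separate branch per variant. The function name `summarise` and its parameters `method` and `drop_below` are invented for this example.

```python
from statistics import mean, median


def summarise(values, method="mean", drop_below=None):
    """Summarise a list of numbers.

    Variations of the analysis are chosen with parameters, so every variant
    lives in the same code on the same branch.
    """
    if drop_below is not None:
        # Optional filtering step, switched on by a parameter.
        values = [v for v in values if v >= drop_below]
    if method == "mean":
        return mean(values)
    if method == "median":
        return median(values)
    raise ValueError(f"unknown method: {method!r}")


if __name__ == "__main__":
    data = [1, 2, 3, 10]
    print(summarise(data))
    print(summarise(data, method="median"))
    print(summarise(data, method="median", drop_below=2))
```

Recording the parameter values alongside the release version is then enough to identify the code that created a particular set of results.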
Gitflow releases
GitHub releases
Gitflow release process
- Test run on develop branch – before making a release, make sure that all tests are passing
- (Can be run as a continuous integration (CI) job; see: GitHub 'Actions')
- (optional) Create an issue that describes what changes have been made since last release version. Make sure everyone agrees that behaviour is as expected. Link to passing tests.
- Merge develop into master and tag with the version number.
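The final merge-and-tag step might look like the following sketch. It assumes the release branch is called 'main' (use 'master' if that is your branch name), that tags are written with a "v" prefix, and that the remote is named 'origin'; none of these are fixed by the process above, and the function name `release_commands` is hypothetical. Running the tests and agreeing the changes are assumed to have happened already.

```python
import re
import shlex

# MAJOR.MINOR.PATCH with numeric parts only (no pre-release suffixes).
VERSION_PATTERN = re.compile(r"^\d+\.\d+\.\d+$")


def release_commands(version, release_branch="main"):
    """Return git commands that merge develop and tag the release.

    The 'v' tag prefix, the 'origin' remote and the branch names are
    assumptions made for this illustration.
    """
    if not VERSION_PATTERN.match(version):
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {version!r}")
    tag = f"v{version}"
    return [
        ["git", "checkout", release_branch],
        ["git", "merge", "--no-ff", "develop"],           # keep an explicit merge commit
        ["git", "tag", "-a", tag, "-m", f"Release {tag}"],  # annotated tag
        ["git", "push", "origin", release_branch],
        ["git", "push", "origin", tag],
    ]


if __name__ == "__main__":
    for cmd in release_commands("1.2.0"):
        print(shlex.join(cmd))
```

Once the tag is pushed, a GitHub release can be created from it in the web interface.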
Version numbering
- Version number format: MAJOR.MINOR.PATCH
- Increment:
- MAJOR version: when you make incompatible API changes
- MINOR version: when you add functionality in a backwards compatible manner
- PATCH version: when you make backwards compatible bug fixes
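As a small worked example of these rules, the sketch below increments one part of a MAJOR.MINOR.PATCH string. Under semantic versioning the lower parts reset to zero when a higher part is incremented. The function name `bump_version` is hypothetical, and pre-release or build suffixes are not handled.

```python
def bump_version(version, part):
    """Return the next version after incrementing 'major', 'minor' or 'patch'."""
    major, minor, patch = (int(n) for n in version.split("."))
    if part == "major":
        # Incompatible API change: reset MINOR and PATCH.
        return f"{major + 1}.0.0"
    if part == "minor":
        # Backwards compatible new functionality: reset PATCH.
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        # Backwards compatible bug fix.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


if __name__ == "__main__":
    print(bump_version("1.4.2", "patch"))  # 1.4.3
    print(bump_version("1.4.2", "minor"))  # 1.5.0
    print(bump_version("1.4.2", "major"))  # 2.0.0
```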
Sharing your research: DOI
Linking GitHub version to DOI
- https://guides.github.com/activities/citable-code/
Links
- Git/GitHub Tutorial: http://gcapes.github.io/git-course/
- Gitflow Workflow: https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow
- Semantic version numbers: https://semver.org/
- Zenodo (DOIs): https://zenodo.org/
- Linking GitHub to DOI: https://guides.github.com/activities/citable-code/
- GitHub Actions: https://docs.github.com/en/actions