There are endless resources for someone who wants to learn to train a deep learning model, but running a successful deep learning project requires managing many additional moving parts that are discussed far less often. This talk aims to help fill that gap.
Thanks to the Chicago ML Meetup for hosting.
Note: The presentation refers to the “Creevey” library. That library has been renamed “Wildebeest.” It also mentions “Tonks”, which has been renamed “Octopod.” Our team previously had a tradition of naming projects with terms or characters from the Harry Potter series, but we renamed them in response to J.K. Rowling’s persistent transphobic comments.
Video
Slides
Abstract
Deep learning projects require managing large datasets, heavy-duty dependencies, complex experiments, and large amounts of code. This talk presents best practices for accomplishing these tasks efficiently and reproducibly. Tools covered include:
- The Wildebeest library for processing large collections of files
- pip-tools and nvidia-docker for managing dependencies
- MLflow Tracking for tracking experiments
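To give a flavor of the first item, here is a minimal sketch of the general problem: applying a per-file step to a large collection of files in parallel while recording failures instead of halting. It uses only the Python standard library and does not use or reproduce Wildebeest's API; the names `process_file` and `process_files` and the checksum step are hypothetical, chosen purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
import hashlib


def process_file(path: Path) -> dict:
    """Hypothetical per-file step: record size and checksum, or the error."""
    try:
        data = path.read_bytes()
    except OSError as exc:
        # Record the failure so one bad file does not stop the whole run.
        return {"path": str(path), "bytes": None, "sha256": None, "error": repr(exc)}
    return {
        "path": str(path),
        "bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "error": None,
    }


def process_files(paths, max_workers=8):
    """Apply process_file to every path using a thread pool.

    Results are returned in the same order as the input paths.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_file, paths))


if __name__ == "__main__":
    # Example usage: summarize every file under the current directory.
    files = [p for p in Path(".").rglob("*") if p.is_file()]
    results = process_files(files)
    failed = [r for r in results if r["error"] is not None]
    print(f"processed {len(results)} files, {len(failed)} failures")
```

Threads suit this sketch because the work is dominated by file I/O; a CPU-heavy step such as image resizing might call for a process pool instead. Keeping a per-file record of errors makes it possible to inspect or retry failures after a long run.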
Additional Resources
Autofocus is a deep learning project that labels animals in images taken by motion-activated “camera traps.” It illustrates many of the ideas discussed in the talk.