Setting up an AI/ML Engineering Dev Environment
As a Senior Director, I don't always have an abundance of time for coding and model work. However, being on parental leave has provided me with additional time for coding and independent learning in between baby naps (and cuddles). But before I could jump headfirst into fine-tuning the latest version of Llama 2 or running the latest diffusion models on a cloud VM, I first had to update and modernize my local development environment. In what follows, I walk through my local development setup, including terminal and shell tools, IDEs, notebook environments, cloud compute, prompt engineering tooling, and the other tools and libraries I find critical for AI/ML engineering development.
First, a few quick notes:
Terminal / Shell / Command Line
Ah, the terminal. What better way to feel like a mid-'90s cyberpunk hacker than chaining together barely decipherable commands in a dark terminal window? The terminal is where I spend the large majority of my time when coding/hacking, so it's only natural to configure it with quality-of-life hacks and goodies. In fact, a pro tip I have for any new AI/ML devs is to push as much of your workflow to the command line as possible. The command line is an incredibly powerful tool, and if you find yourself frequently clicking around and opening up various windows on your desktop, there are likely Unix commands that can replace those steps and save you a lot of time in the long run. Furthermore, you can write your own shell scripts to automate your common workflows.
iTerm2
If you're developing on a Mac like me, you can use the Terminal app that comes pre-installed on macOS; however, I highly recommend iTerm2, a freely available terminal replacement with an extensive community built around it. iTerm2 has a number of quality-of-life features that will improve your time spent in the terminal, such as split panes, search, autocomplete, and other configurability improvements.
Oh My Zsh
Starting with macOS Catalina in 2019, zsh became the default shell (replacing bash), so I don't need to tell you to switch your shell environment. However, I highly recommend checking out Oh My Zsh, a companion framework from Robby Russell that greatly extends the vanilla zsh environment in so many ways. Oh My Zsh provides access to a community of color themes and plugins that supercharge your shell. I prefer the dst theme, and here are a few of my favorite plugins that I recommend adding.
tmux
If you're going to be training AI/ML models, you'll likely be running long-running commands in the background or remotely in the cloud. As such, you don't want those commands to be terminated just because an SSH connection drops. tmux (short for terminal multiplexer) solves this problem by creating persistent terminal sessions that you can attach to and detach from. A common workflow when training a model on a remote machine is to create a new tmux session, attach to it, and start your training job. Then you can walk away, grab a coffee, come back to your laptop, and re-attach to that same tmux session (even days later) to check on the progress of your job. In short, always run your model training or other long-running jobs inside a tmux session.
htop
With the sizes of modern LLMs continuously increasing, along with their memory and compute requirements, you will often find yourself paying close attention to your memory and CPU/GPU resources. htop is an interactive system monitor that greatly expands on the vanilla top monitor pre-installed on most systems. A common routine when loading a model on my local machine is to open htop in a terminal window, sort the list of processes by memory usage by typing Shift+M, and then monitor memory usage to determine whether I'm pushing the limits of my hardware, which leads to disk swapping, poor performance, etc.
IDEs / Coding
So far, we've only discussed the terminal / command line. However, at some point you're going to want to write some actual code. I generally find myself writing code in one of three places, as discussed below.
Vim
My go-to editor for all simple edits, whether dotfiles, config files, or one-liner code changes, is Vim. Vim is incredibly powerful once you discover all of its shortcuts and build up the muscle memory for using them (which I've gradually done over the last 20+ years of using it). There is, of course, a long-running debate over whether Vim or Emacs is the superior terminal editor, with people emotionally attached to their editor of choice, so you may want to check out both and make your own decision. If you use Vim, I also recommend checking out the Ultimate vimrc from Amir Salihefendic, which will supercharge your vimrc with syntax highlighting and other configurations (and available customizations) that make working with Vim even better.
Jupyter Notebook
If I'm exploring a new model on Hugging Face or playing around with small-scale experiments, you'll typically find me hacking in a Jupyter notebook. There's no easier way to get up and running with Python code than in a notebook. Notebooks are easy to set up, are visual, display images and plots better than a terminal window, support annotations with Markdown and LaTeX, and can be easily shared with others, which also makes them a go-to medium for AI/ML courses and tutorials. If you've made it this far into the blog post, then you likely already know all of this. However, I would also caution devs against an over-reliance on notebooks. Notebooks have an abundance of hidden state that beginner coders aren't going to be aware of, are much harder to put under version control, and discourage good software engineering practices like unit and integration tests. I'll defer to Joel Grus's presentation on the topic for all of the reasons to avoid notebooks, but my recommendation is to use them while avoiding an over-reliance on them.
VS Code
In the past, I tended to use IntelliJ for all of my core dev work, including Java, Scala, and even Python. More recently, however, I was turned on to Microsoft's VS Code, which has gained a lot of popularity within the Python community over the past 3-5 years, and it has now become my go-to IDE for Python hacking. If I'm doing any serious development on a library or GitHub repo, whether personal or for work, I now turn to VS Code. You'll want to install the Python extension to take advantage of Python support, but once you do, you'll be presented with a simple but convenient Python IDE that has all of the primary tools you would want from a more comprehensive IDE like IntelliJ, without all of the overhead. A small pro tip: enable Auto Save (File → Auto Save) to ensure your changes are never lost.
Inference / Compute
Depending on your RAM and CPU, many models, like distilbert-base-uncased, can be run directly on your laptop. In fact, if you're using Hugging Face, you can use their helpful model memory calculator to quickly gauge the memory footprint of a model. However, at some point you are going to be training a larger model, perhaps one that requires GPU support, and you're going to want access to compute in the cloud. I typically rely on two primary compute sources for my personal projects.
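If you just want a rough sense of whether a model will fit before reaching for the calculator, a back-of-the-envelope estimate works too: parameter count times bytes per parameter, plus headroom for activations. A minimal sketch (the parameter counts below are approximations for illustration):

```python
# Rough inference-memory estimate: weights dominate, so params * bytes-per-param
# gives a decent lower bound (activations, KV caches, etc. add overhead on top).
def estimate_model_memory_gb(num_params: float, bytes_per_param: int) -> float:
    return num_params * bytes_per_param / 1024**3

# distilbert-base-uncased has roughly 66M parameters
print(f"distilbert fp32: ~{estimate_model_memory_gb(66e6, 4):.2f} GB")
print(f"distilbert fp16: ~{estimate_model_memory_gb(66e6, 2):.2f} GB")

# a 7B-parameter LLM, for comparison
print(f"7B LLM fp16:     ~{estimate_model_memory_gb(7e9, 2):.1f} GB")
```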
AWS SageMaker
AWS SageMaker and its offerings have grown considerably over the last 5-6 years. What began as simple connective tissue between notebooks, models, and inference endpoints has evolved into a rich platform and ecosystem of building blocks across the ML development lifecycle. Production use cases may take advantage of a fuller sampling of these tools, such as Pipelines, Feature Store, and the MLOps tooling, but for my personal projects I use SageMaker notebooks and endpoints almost exclusively. I've heard that Google Colab is also pretty great for cloud notebook and GPU access, but given my familiarity with, and coupling to, the AWS ecosystem, I've stuck with SageMaker for my needs.
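For reference, deploying a Hugging Face model to a SageMaker endpoint looks roughly like the sketch below. Treat it as a hedged example rather than a recipe: the framework versions and instance type are illustrative, and you should pick them from the currently supported SageMaker Hugging Face version matrix.

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()  # works inside a SageMaker notebook

# Point the endpoint at a model hosted on the Hugging Face hub
model = HuggingFaceModel(
    env={
        "HF_MODEL_ID": "distilbert-base-uncased-finetuned-sst-2-english",
        "HF_TASK": "text-classification",
    },
    role=role,
    transformers_version="4.26",  # pick versions from the supported matrix
    pytorch_version="1.13",
    py_version="py39",
)

predictor = model.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")
print(predictor.predict({"inputs": "Parental leave side projects are the best."}))

predictor.delete_endpoint()  # don't forget this; endpoints bill while they're up
```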
Modal
More recently, an exciting cloud compute offering that I've been using and enjoying is Modal, from Erik Bernhardsson and team. I'll mention that Erik is a friend and, many years ago, was my boss at Spotify, so I had a reason to keep an eye on this project; but after using it, I can confirm that Modal is one of the quickest and simplest ways to get your AI/ML models running in the cloud. By building their own container system (adios Docker/Kubernetes!), they were able to produce a serverless compute offering that can scale to hundreds of GPUs and back down to zero in seconds. This tightened feedback loop of being able to quickly make a code change, scale it up to run/train via Modal, get results, and iterate can be a significant improvement to your AI/ML development cycle. I'm looking forward to continuing to use and experiment with Modal.
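To give a flavor of that feedback loop, here's a minimal sketch of a Modal function running a Hugging Face pipeline on a remote GPU. Modal's API has evolved quickly (older SDK versions use modal.Stub rather than modal.App), so consider this illustrative and check their docs for the current incantation:

```python
import modal

app = modal.App("hello-modal-gpu")  # older SDK versions call this modal.Stub

# The container image is defined in code, no Dockerfile required
image = modal.Image.debian_slim().pip_install("torch", "transformers")

@app.function(gpu="any", image=image)
def generate(prompt: str) -> str:
    from transformers import pipeline  # imported inside the remote container
    pipe = pipeline("text-generation", model="gpt2")
    return pipe(prompt, max_new_tokens=40)[0]["generated_text"]

@app.local_entrypoint()
def main():
    # `modal run this_file.py` builds the image, grabs a GPU, runs the function
    # remotely, and scales back down to zero when it finishes.
    print(generate.remote("Setting up an AI/ML dev environment is"))
```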
AI/ML Libraries and Packages
OK, so we've now covered the terminal, IDEs, and compute, but when it comes time to actually put together models, where does one start? Luckily, there are some amazing tools and frameworks available today that make it easier than ever to jump in and start training and fine-tuning your own models.
Hugging Face
If you've made it this far and you have any familiarity with AI/ML development, then you're likely already very familiar with Hugging Face, but I obviously still needed to include it for completeness. Hugging Face provides a hub of SOTA open source AI/ML models, along with libraries and tools for easily working with them. When working with transformer models and LLMs, you almost never want to train a model from scratch; your go-to should be to pull down a pre-trained model from the Hugging Face model hub and fine-tune it for your domain, taking advantage of the transfer learning capabilities of these foundational pre-trained models. I also highly recommend their NLP Course for anyone who is new to the space of transformers and LLMs (or just wants to get started working with Hugging Face models).
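The hub is also scriptable via the huggingface_hub package, which is handy for browsing models or pre-fetching weights into your local cache. A small sketch (attribute names like id have shifted slightly across huggingface_hub versions, so verify against the version you have installed):

```python
from huggingface_hub import list_models, snapshot_download

# Browse the hub programmatically, e.g. the most-downloaded text-classification models
for m in list_models(filter="text-classification", sort="downloads", limit=5):
    print(m.id)

# Pull a model's files into the local cache (useful before going offline)
local_path = snapshot_download("distilbert-base-uncased")
print(local_path)
```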
Transformers
As mentioned above, Hugging Face doesn't just provide a model hub; it also provides tools and packages for working with models. One of the most valuable of these is the Transformers library. Using this library, you can easily pull down any model from the model hub, extend it using your deep learning framework of choice (PyTorch, TensorFlow, and JAX are all supported), and easily encode data and run inference using abstractions like Pipeline and Tokenizer.
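A quick taste of both levels of abstraction, using a small sentiment model as a stand-in (any text-classification checkpoint from the hub would work the same way):

```python
from transformers import AutoTokenizer, pipeline

# High level: a pipeline bundles tokenization, the forward pass, and post-processing
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(classifier("tmux just saved my three-day training run."))

# Lower level: tokenizers expose the encoding step directly
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
print(tokenizer("Hello, world!")["input_ids"])
```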
Diffusers
Relatedly, if you are interested in working with diffusion models, particularly for generative images or audio, then the Diffusers library is a must-have. Diffusers works in tandem with the Transformers library, so you'll want to install that first, but afterwards you'll have access to DiffusionPipelines and other classes that make working with diffusion models much easier.
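A minimal text-to-image sketch using one popular Stable Diffusion checkpoint (swap in whichever diffusion model you're exploring; this assumes a CUDA GPU, and on CPU you'd drop the fp16 dtype and wait a while):

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,  # halves memory; assumes a GPU
)
pipe = pipe.to("cuda")

image = pipe("a cyberpunk hacker's terminal, neon glow, mid-90s aesthetic").images[0]
image.save("terminal.png")
```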
Axolotl and PEFT
As mentioned earlier, the general workflow for modern deep learning and transformer models should be to pull down a SOTA pre-trained model and fine-tune it on domain-specific data. For the actual fine-tuning, the two most common libraries are Axolotl from the OpenAccess AI Collective and PEFT from Hugging Face. Both libraries provide easy-to-use abstractions for performing LoRA and other parameter-efficient fine-tuning methods, integrations with Hugging Face models, and useful abstractions for training on GPUs.
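On the PEFT side, wrapping a pre-trained model with a LoRA adapter only takes a few lines. A sketch with a small causal LM and illustrative hyperparameters (r, alpha, and dropout are values you'd tune for your own task):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,               # rank of the low-rank update matrices
    lora_alpha=16,     # scaling applied to the LoRA updates
    lora_dropout=0.05,
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the small adapter weights are trainable

# ...then fine-tune as usual, e.g. with transformers' Trainer or your own loop
```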
Gradio
At some point, you're going to want to showcase a model you've built and wrap a simple UX around it. In some cases, a Jupyter notebook or SageMaker notebook will suffice, but if you want any UX beyond what a notebook supports, then you'll want to check out Gradio. Gradio provides a super simple framework for turning your Python ML models into user-friendly demos. Furthermore, when you're ready to host your Gradio app, it provides simple integration with Hugging Face Spaces.
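As an illustration, a tiny Gradio demo wrapping a sentiment pipeline looks something like this (the model and labels are placeholders for whatever you've actually built):

```python
import gradio as gr
from transformers import pipeline

classifier = pipeline("sentiment-analysis")

def predict(text: str) -> str:
    result = classifier(text)[0]
    return f"{result['label']} ({result['score']:.3f})"

demo = gr.Interface(
    fn=predict,
    inputs="text",
    outputs="text",
    title="Sentiment demo",
)
demo.launch()  # pass share=True for a temporary public link
```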
Conclusion
That's a wrap for now. I'll likely post more as I work through some personal projects, but in the meantime I'd love to hear from others about the tools and frameworks most critical to your coding and development environment. Leave a comment!