A New Year's Resolution: Giving Back to Open Source with Faster CTGAN on Apple Silicon
Yaw Joseph Etse
Head of Privacy, Open Banking Engineering @ Capital One | Angel Fellow, Investor
This holiday season, I got a chance to reinvest in something I love: contributing to open source. I added Apple Metal Performance Shaders (MPS) support to CTGAN, a powerful tool for generating synthetic data.
Why Synthetic Data and CTGAN Matter
In an age of increasing data security and privacy concerns, synthetic data is a game-changer (more on this topic later). CTGAN (Conditional Tabular Generative Adversarial Network) is a leading method for creating realistic synthetic datasets from real-world tabular data. This has applications in:
However, training CTGAN models can be computationally intensive, especially with large datasets. This is where Apple's MPS comes in.
Harnessing the Power of Apple's GPUs with MPS
Apple's Metal Performance Shaders (MPS) framework provides a way to tap into the power of Apple GPUs. By offloading compute-intensive tasks like machine learning model training to the GPU, MPS can significantly accelerate performance. This is crucial for tools like CTGAN, where training time can be a bottleneck.
Key benefits of MPS include:
Speeding Up CTGAN with MPS: My Contribution
My contribution focused on enabling CTGAN to utilize MPS for training on macOS devices with Apple silicon. This involved modifying the CTGAN codebase to leverage MPS functions for key mathematical operations.
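To make the idea concrete, here is a minimal sketch of how a training script might pick a compute device in PyTorch, the library CTGAN is built on. The helper name select_device is hypothetical and is not taken from the CTGAN codebase; the sketch assumes a PyTorch build that ships the MPS backend, and it falls back to the CPU when no GPU backend is found.

```python
import importlib.util


def select_device(prefer_gpu: bool = True) -> str:
    """Return "mps", "cuda", or "cpu" depending on what is available.

    Hypothetical helper for illustration; not part of CTGAN.
    """
    # Fall back to the CPU if the caller opts out or PyTorch is missing.
    if not prefer_gpu or importlib.util.find_spec("torch") is None:
        return "cpu"

    import torch

    # Older PyTorch builds have no torch.backends.mps, so look it up safely.
    mps_backend = getattr(torch.backends, "mps", None)
    if mps_backend is not None and mps_backend.is_available():
        return "mps"  # Apple silicon GPU via Metal Performance Shaders
    if torch.cuda.is_available():
        return "cuda"  # NVIDIA GPU
    return "cpu"


if __name__ == "__main__":
    print(f"Training would run on: {select_device()}")
```

The returned string can be passed to torch.device(...) and then to .to(device) on models and tensors. How the device is handed to CTGAN itself depends on the CTGAN version, so that part is left out here.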
The results were impressive: in my initial tests, I observed 2x to 5x speedups in CTGAN training times compared to CPU-only training.
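If you want to check the speedup on your own machine, a small timing harness like the one below is one way to do it. This is a rough sketch, not the benchmark I ran: time_call and speedup are hypothetical helper names, and the workload in the usage example is a stand-in for a real CPU or MPS training run.

```python
import time
from typing import Callable


def time_call(fn: Callable[[], object], repeats: int = 3) -> float:
    """Return the best wall-clock time in seconds over `repeats` runs of fn."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def speedup(cpu_seconds: float, gpu_seconds: float) -> float:
    """Return how many times faster the GPU run was than the CPU run."""
    if gpu_seconds <= 0:
        raise ValueError("gpu_seconds must be positive")
    return cpu_seconds / gpu_seconds


if __name__ == "__main__":
    # Stand-in workload; replace with a function that trains on each device.
    elapsed = time_call(lambda: sum(i * i for i in range(100_000)))
    print(f"Best of 3 runs: {elapsed:.4f} s")
```

One caveat when timing GPU work: operations may be queued asynchronously, so the function being timed should wait for the device to finish before it returns, otherwise the measured time can look better than it really is.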
I'm looking forward to seeing even greater performance gains and wider adoption of synthetic data generation techniques.
Further Reading: