Intern-shipping: My Statsig Internship
When I accepted my Statsig internship offer and hopped on a plane to Seattle, I did not envision the next four months playing out like this.
I’m James Cahyadi, a fourth-year Computer Science student at the University of Waterloo, and I spent this Fall completing my sixth internship—this time at Statsig.
This was my first time interning at a startup, and that made this internship feel significantly different from the rest. I’ve learned something new at each of my internships, but at Statsig I learned my favorite thing of all: how to be scrappy.
The team
During my time at Statsig, I worked on the data infrastructure team alongside my mentor, Pablo Beltran, and my manager, Eric Lui.
The data infrastructure team primarily works on improving and optimizing our data pipeline to make the lives of data engineers and data scientists easier.
I was fortunate to be surrounded by such smart engineers on my team who were always willing to answer all my questions. We also had multiple team events during my internship, such as Whirlyball, painting ceramics, and countless team lunches.
Shipping at Statsig
The culture at Statsig was very different from that of the companies I’ve worked at in the past.
A common phrase you hear about startups is that they move fast. For Statsig, I believe this is an understatement because everyone moves at light speed, which allows for a lot of work to be done in a limited amount of time. It's astounding to hear the work that everyone ships during the team and company-wide standups.
Our fast-paced work culture speaks volumes about how passionate and hard-working everyone at Statsig truly is. Thanks to that pace, I was able to work on three major projects during my internship.
My Statsig projects
Dagster asset sensors
The first project I worked on was expanding functionality for our custom Dagster asset sensors, which were built recently.
These sensors run approximately every thirty seconds and give us more control over asset materializations. More specifically, we wanted assets to (re)materialize once their parent assets (re)materialize.
I extended these sensors with freshness-policy logic so that downstream assets would only rematerialize once they were X minutes stale compared to their parent assets.
This problem came with many edge cases, such as ensuring a child asset does not rematerialize just because its parent asset is materializing more often than the freshness policy allows, and accounting for the start and end times of asset materializations. I also optimized the sensors to process asset partitions concurrently whenever possible, which sped them up significantly.
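To make the freshness rule concrete, here is a minimal sketch in plain Python. It is not the actual Statsig sensor code and does not use the Dagster API; the Materialization class, the should_rematerialize function, and the choice to measure staleness from the child's start time to the parent's end time are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Materialization:
    """Start and end time of one completed materialization (hypothetical type)."""
    start: datetime
    end: datetime


def should_rematerialize(
    parent: Optional[Materialization],
    child: Optional[Materialization],
    max_staleness: timedelta,
    child_in_progress: bool = False,
) -> bool:
    """Decide whether a child asset should be rematerialized.

    Assumption: the child reflects parent data as of the moment the child
    run started, so staleness is parent.end - child.start.
    """
    # Nothing to build from, or a run is already under way: do nothing.
    if parent is None or child_in_progress:
        return False
    # The child has never been materialized, so build it once.
    if child is None:
        return True
    # A parent that refreshes faster than the policy only triggers the child
    # once the accumulated lag reaches the threshold.
    return parent.end - child.start >= max_staleness
```

With a ten-minute policy, a parent that finishes three minutes after the child started does not trigger a run, while one that finishes twelve minutes after does.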
BigQuery Writer
The main project that I worked on during my internship was creating a new service called BigQuery Writer. BigQuery Writer’s purpose was to replace the final steps of our data streaming pipeline, where we write processed event data generated from our SDKs into BigQuery tables.
Previously, we used Spark to stream data from Eventhub into a temporary BigQuery table (called bronze). Two scheduled queries then wrote data from bronze into two BigQuery tables (called silver), one for exposure events and one for custom events, which were the final landing spot for the data.
Cons to this previous approach were:
BigQuery Writer is a service that reads events from PubSub (previously Eventhub) and uses the BigQuery Storage Write API to write the events directly into the appropriate silver table. BigQuery Writer allowed us to remove the Spark job and bronze table entirely while significantly reducing costs and drastically reducing the end-to-end time for an SDK event to land in silver.
BigQuery Writer was an intimidating service to write because I had to rearchitect a critical part of the data pipeline responsible for writing millions of events per second into BigQuery. The transition from our old streaming pipeline to BigQuery Writer also had to be seamless so that customers saw no disruptions in their data.
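The routing step described above, choosing a silver table for each event, could look roughly like the sketch below. This is only an illustration in plain Python: the table names, the is_exposure field, and both function names are hypothetical, and the actual PubSub reads and Storage Write API calls are left out.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Hypothetical table names; the post only says there is one silver table
# for exposure events and one for custom events.
EXPOSURE_TABLE = "silver_exposure_events"
CUSTOM_TABLE = "silver_custom_events"


def target_table(event: dict) -> str:
    """Pick the silver table for an event (the is_exposure field is assumed)."""
    return EXPOSURE_TABLE if event.get("is_exposure") else CUSTOM_TABLE


def group_by_table(events: Iterable[dict]) -> Dict[str, List[dict]]:
    """Group a batch of events by destination table, preserving order."""
    batches: Dict[str, List[dict]] = defaultdict(list)
    for event in events:
        batches[target_table(event)].append(event)
    return dict(batches)
```

Grouping events per table before writing means each batch can be appended to its destination in one call, with no intermediate bronze table in between.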
It was definitely a huge responsibility that my team entrusted me with and was undoubtedly the most impactful project I’ve worked on across my six internships.
Statshot
As part of Statsig’s hackathon, I worked on building an internal screenshotting tool called Statshot.
Every quarter, Statsig runs a company-wide two-day hackathon as part of a tradition adopted from Facebook. The theme of this quarter’s hackathon was to build something to improve an internal tool/process at Statsig.
Statshot aimed to solve two internal problems:
Statshot is the Chrome extension I developed with Chong Xie to tackle both of these problems. With Statshot, users can take a screenshot of their entire tab via a keyboard shortcut, which simultaneously generates a shortened URL linking to the current tab.
Users can also crop the image or annotate it to their preference. And lastly, pasting the short link into any Workplace Chat group will automatically include the screenshot tied to that link.
Although Statshot did not take the Hackathon grand prize, it is now being used more internally and has helped the infra team with their image-sharing struggles.
In the end, I was amazed by all the creative hackathon ideas that people came up with. It was unbelievable to see how much people could accomplish over two days, and that is just a testament to how talented everyone at Statsig is.
Final Thoughts
As my internship comes to an end, it is amazing to see how much I’ve grown as an engineer and person throughout a short span of four months. I’m grateful to my entire team for helping me to gain countless invaluable career skills.
I would recommend working at Statsig to anyone, especially those who love ping pong tournaments, office dogs, boba Fridays, and working on interesting projects. This has arguably been my favorite internship, and I will always remember this time at Statsig.