My CodeDay Labs Internship: An Open Source Journey with Zitadel
1. About the Project
This summer I participated in an eight-week software engineering internship with CodeDay Labs in collaboration with CTI Accelerate. During this internship, I had the opportunity to contribute to an open-source project called ZITADEL. ZITADEL is an identity and access management (IAM) solution designed to meet the needs of modern applications, whether they run in public, private, or hybrid cloud environments.
ZITADEL helps developers and organizations manage who can access their applications and services securely. In today’s digital landscape, where data breaches and unauthorized access are major concerns, having a reliable IAM solution like ZITADEL is crucial. It offers features like Single Sign-On (SSO), which allows users to access multiple applications with one set of login credentials, and multi-factor authentication, adding an extra layer of security by requiring more than just a password to log in.
The typical users of ZITADEL are developers and IT teams who need to manage user identities and control access to their applications. For example, imagine a company that offers a suite of online tools to its clients. With ZITADEL, the company can ensure that each client’s employees can securely access the tools they need, without having to manage separate logins for each tool. If one employee leaves the company, their access can be revoked across all tools instantly, helping to keep the company’s data secure.
2. The Issue
We addressed two issues within the ZITADEL project that enhanced its usability and functionality.
The first issue was Issue #8129 (Add tooltip to indicator of the inherit button on "Feature Settings"). Within the "Feature Settings" of a ZITADEL instance, most features are set to inherit their values from programmatic defaults. However, it was unclear what these default values were, leading to confusion. Additionally, there was a red bubble next to the "Inherit" button, but users were uncertain whether this indicated that the default value was set to false or if it was just decorative.
To resolve this, we added a tooltip to the "Inherit" button, making the feature more intuitive and helping users understand its function without needing to consult documentation. This improvement enhances the overall user experience, making the platform easier to navigate. You can view our solution in PR #8238.
Next, we tackled Issue #7966 ([cli/mirror] Allow file as destination and source). Previously, the mirror command was limited to database-to-database migrations, which restricted its flexibility.
By extending the command to support file-based migrations, we significantly increased the versatility of the ZITADEL CLI. This enhancement allows users to move data between files and databases seamlessly, accommodating a wider range of use cases and making the CLI more adaptable to various environments. The solution for this issue is detailed in PR #8431.
3. Codebase Overview
Tech Stacks:
System Diagram:
The diagram above illustrates the overall architecture of the ZITADEL system. It includes key components such as the GUI, HTTP server, various APIs, and the ZITADEL core, which contains the command and query handlers, event store, and projection spooler. These components interact within a CockroachDB cluster to manage identity and access management tasks.
Zitadel Mirror Command:
Our project involves enhancing the mirror command to handle data migration between databases and files. The diagram below shows the process of the mirror command. First, we define the source and destination databases. The command then systematically copies tables from one database to another, re-computing projections as needed. A verification step ensures that the migration is successful by comparing the number of entries in both databases.
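To make the verification step concrete, here is a minimal sketch of such a count comparison in Go, assuming plain database/sql with the lib/pq driver. The connection strings are placeholders and the table names are borrowed from the export examples later in this post, so treat it as an illustration rather than ZITADEL's actual verification code.

```go
// verify_sketch.go: a minimal sketch of the verification idea -- compare per-table
// row counts between the source and destination databases. Driver, connection
// strings, and table names are placeholders, not ZITADEL's actual verify code.
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/lib/pq" // assumed Postgres driver for this sketch
)

// countRows returns the number of rows in the given table.
func countRows(db *sql.DB, table string) (int64, error) {
	var n int64
	err := db.QueryRow("SELECT COUNT(*) FROM " + table).Scan(&n)
	return n, err
}

// verifyMigration returns an error for the first table whose row counts differ.
func verifyMigration(src, dest *sql.DB, tables []string) error {
	for _, table := range tables {
		srcCount, err := countRows(src, table)
		if err != nil {
			return err
		}
		destCount, err := countRows(dest, table)
		if err != nil {
			return err
		}
		if srcCount != destCount {
			return fmt.Errorf("%s: source has %d rows, destination has %d", table, srcCount, destCount)
		}
	}
	return nil
}

func main() {
	// Placeholder connection strings.
	src, err := sql.Open("postgres", "postgres://localhost:5432/source?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	dest, err := sql.Open("postgres", "postgres://localhost:5432/destination?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer src.Close()
	defer dest.Close()

	if err := verifyMigration(src, dest, []string{"system.assets", "system.encryption_keys"}); err != nil {
		log.Fatal("verification failed: ", err)
	}
	fmt.Println("row counts match")
}
```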
Workflow: Handling Mirror Command with Files
Let’s walk through the workflow of the mirror command in ZITADEL with the new file mirroring feature:
4. Challenges
One of the technical challenges we encountered was effectively adapting the mirror command to support file and database migrations without introducing unnecessary complexity. Initially, our proposed solution involved adding three new flags: --to-files, --from-files, and --path-to-dir. These flags would manage whether the mirroring process should operate on files or databases. However, we were uncertain if this was the best approach.
First attempt:
Our first attempt involved adding the --to-files, --from-files, and --path-to-dir flags. The idea was to let users explicitly specify whether they wanted to mirror data to or from files. This solution seemed straightforward, but we were concerned that it might overcomplicate the command-line interface and make the codebase harder to maintain.
Second attempt:
We reached out to the project maintainer for guidance on whether our flag-based approach was appropriate. The maintainer advised against adding new flags and pointed out that the system could inherently detect whether the source or destination was a file based on its type. This feedback was pivotal, as it helped us pivot away from a potentially cumbersome solution.
The picture below shows the destFile.yaml configuration, where the destination is specified as a file. This configuration is a part of our refined approach, where we no longer need additional flags.
Third attempt:
Based on the maintainer's feedback, we revised our approach. We introduced global variables (isSrcFile, isDestFile, and filePath) to determine if the source or destination was a file. This approach simplified the command structure and reduced potential errors by eliminating the need for additional flags. It also streamlined the code by enabling direct checks within the mirroring functions.
Here is a simplified, self-contained sketch of the revised dispatch; every type and helper function in it is an illustrative stand-in based on the description below, while the exact implementation can be found in PR #8431:
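```go
// dispatch_sketch.go: a self-contained sketch of the dispatch logic described
// below. Every type and helper here is an illustrative stand-in, not ZITADEL's
// actual code (see PR #8431 for that).
package main

import "fmt"

// Flags derived from the mirror configuration: whether the source or destination
// is a set of files is inferred from the config, not from extra CLI flags.
var (
	isSrcFile  bool
	isDestFile bool
	filePath   string
)

// database is a stand-in for a real database connection.
type database struct{ name string }

func connectDatabase(name string) *database { return &database{name: name} }
func (d *database) close()                  { fmt.Println("closing", d.name) }

// Stub copy helpers standing in for the real ones.
func copyUniqueConstraintsFromFile(dest *database, path string) { fmt.Println("constraints:", path, "->", dest.name) }
func copyEventsFromFile(dest *database, path string)            { fmt.Println("events:", path, "->", dest.name) }
func copyEventsToFile(src *database, path string)               { fmt.Println("events:", src.name, "->", path) }
func copyUniqueConstraintsToFile(src *database, path string)    { fmt.Println("constraints:", src.name, "->", path) }
func copyEventsDB(src, dest *database)                          { fmt.Println("events:", src.name, "->", dest.name) }
func copyUniqueConstraintsDB(src, dest *database)               { fmt.Println("constraints:", src.name, "->", dest.name) }

// copyEventstore picks the right copy path based on whether the source or the
// destination is a set of files.
func copyEventstore() {
	switch {
	case isSrcFile:
		// Source is a set of files: only the destination database is needed.
		dest := connectDatabase("destination")
		defer dest.close()
		copyUniqueConstraintsFromFile(dest, filePath)
		copyEventsFromFile(dest, filePath)
	case isDestFile:
		// Destination is a set of files: only the source database is needed.
		src := connectDatabase("source")
		defer src.close()
		copyEventsToFile(src, filePath)
		copyUniqueConstraintsToFile(src, filePath)
	default:
		// Traditional database-to-database migration.
		src := connectDatabase("source")
		dest := connectDatabase("destination")
		defer src.close()
		defer dest.close()
		copyEventsDB(src, dest)
		copyUniqueConstraintsDB(src, dest)
	}
}

func main() {
	isDestFile, filePath = true, "./export"
	copyEventstore()
}
```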
In the first branch, we check whether the source is a file (isSrcFile). If it is, we only connect to the destination database, since we are reading from a file, and then call copyUniqueConstraintsFromFile and copyEventsFromFile to copy the data from the files into the database.
In the second branch, we handle the scenario where the destination is a file (isDestFile). We connect to the source database, read the data, and write it to the appropriate files by calling copyEventsToFile and copyUniqueConstraintsToFile.
The default case handles the traditional database-to-database migration: we connect to both the source and destination databases and call copyEventsDB and copyUniqueConstraintsDB to manage the data transfer between the two.
The terminal screenshot below shows the mirror command importing data from CSV files back into the PostgreSQL database, with logs confirming that the data is copied back into the tables.
Overall, this structure allows the copyEventstore function to flexibly handle different types of sources and destinations (databases or files) without requiring complex flag management.
5. Solution
As I have shown above, our final solution was to enable the mirror command to handle migrations between databases and files. We updated multiple files, including mirror.go, auth.go, config.go, event_store.go, system.go, and verify.go, to implement this functionality.
We defined the configuration files (destFile.yaml and srcFile.yaml) to specify whether the source or destination was a database or a set of files. For example, in destFile.yaml, the destination is configured as a local directory to store the database content as CSV files. The command now supports exporting data from tables like system.assets, system.encryption_keys, and others to CSV files, as well as re-importing this data back into a database.
The chart below illustrates the Reader/Writer Pipe mechanism, showing the process flow for copying data from files to databases (CopyFromFile) and from databases to files (CopyToFile). This mechanism ensures efficient data transfer while maintaining flexibility in the source and destination types.
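For readers who want to see the pattern in code, below is a minimal Go sketch of such a reader/writer pipe, with the database ends stubbed out: an in-memory row slice on one side and a local CSV file on the other. The function names loosely mirror CopyToFile and CopyFromFile from the chart, but everything in the sketch is an assumption for illustration, not the actual ZITADEL implementation.

```go
// pipe_sketch.go: a sketch of the reader/writer pipe idea -- one goroutine
// streams CSV rows into an io.Pipe while the other side consumes them, so the
// whole table never has to sit in memory. The "database" ends are stubbed with
// in-memory data and a local file; the real code streams to and from Postgres.
package main

import (
	"encoding/csv"
	"io"
	"log"
	"os"
)

// copyToFile streams rows (standing in for a database cursor) through a pipe
// into a CSV file at path.
func copyToFile(rows [][]string, path string) error {
	pr, pw := io.Pipe()

	// Producer: encode rows as CSV into the write end of the pipe.
	go func() {
		w := csv.NewWriter(pw)
		for _, row := range rows {
			if err := w.Write(row); err != nil {
				pw.CloseWithError(err)
				return
			}
		}
		w.Flush()
		pw.CloseWithError(w.Error())
	}()

	// Consumer: stream the pipe's read end into the destination file.
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = io.Copy(f, pr)
	return err
}

// copyFromFile reads the CSV back, standing in for feeding rows to the database.
func copyFromFile(path string) ([][]string, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	return csv.NewReader(f).ReadAll()
}

func main() {
	rows := [][]string{{"id", "event_type"}, {"1", "user.added"}, {"2", "user.removed"}}
	if err := copyToFile(rows, "events.csv"); err != nil {
		log.Fatal(err)
	}
	back, err := copyFromFile("events.csv")
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("round-tripped %d rows", len(back))
}
```

The key point is that the producer goroutine and the consumer are connected through io.Pipe, so neither side ever needs to hold the full table in memory.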
Below is an example of the content in one of the CSV files generated by the export. This confirms that the data has been correctly migrated from the database to the file.
We also ensured that the reverse process—importing data back into the database from CSV files—was seamless. The terminal output confirms that data was successfully migrated from the files back into the database.
This makes the mirror command more flexible and user-friendly: users simply specify their source and destination, and the system determines on its own whether it needs to read from or write to a database or a file, so there is no need to manage this manually.
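As a rough illustration of that detection, here is a small sketch where the file-versus-database decision falls out of which part of the configuration is populated. The config types are hypothetical and exist only to show the idea; the real structure is the one merged in PR #8431.

```go
// detect_sketch.go: a sketch of inferring the file/database decision from the
// mirror configuration instead of extra CLI flags. The config shape is assumed
// for illustration only.
package main

import "fmt"

// fileConfig and databaseConfig are illustrative stand-ins for the real config types.
type fileConfig struct{ Path string }
type databaseConfig struct{ ConnString string }

// endpoint describes either a file directory or a database, whichever is set.
type endpoint struct {
	File     *fileConfig
	Database *databaseConfig
}

// isFile reports whether the endpoint points at files rather than a database.
func (e endpoint) isFile() bool { return e.File != nil }

func main() {
	source := endpoint{Database: &databaseConfig{ConnString: "postgres://localhost:5432/zitadel"}}
	destination := endpoint{File: &fileConfig{Path: "./export"}}

	// The same decision the mirror command makes up front: set the flags once,
	// then let the copy functions branch on them.
	isSrcFile, isDestFile := source.isFile(), destination.isFile()
	fmt.Println("source is file:", isSrcFile, "- destination is file:", isDestFile)
}
```

Deriving the decision from the configuration keeps the CLI surface unchanged, which is why this approach was preferred over adding new flags.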
Testing:
To ensure that our solution worked as intended, we conducted comprehensive testing across multiple scenarios:
In the screenshot above, we see the successful execution of the mirror command, which confirms that the data, including unique constraints, has been successfully copied into the PostgreSQL database. The log outputs provide details about the number of records processed and the time taken for each operation.
Below is a code snippet from the updated mirror.go that illustrates how we determine whether the source and destination are files or databases and execute the appropriate actions:
This snippet shows the conditional logic used to determine if the data should be copied to or from files based on the configuration provided. Depending on the setup, it executes the appropriate functions for file-to-database, database-to-file, or database-to-database migrations.
6. Conclusion
After implementing and rigorously testing our solution, our first pull request was successfully approved and merged into the main project. We are now awaiting the approval of our second pull request. This summer has been a fantastic learning experience, giving me valuable insights and chances to grow while working on real-world challenges in software development.
I’m incredibly thankful for the opportunity to work with CodeDay Labs this summer. Huge thanks to Tyler Menezes, Utsab Saha, and the Computing Talent Initiative for giving me this awesome opportunity. Big appreciation to Lalla Sankara for all the guidance and support. And a special shoutout to my amazing teammates, Andy Vo and Xiaoxuan Wang—collaborating with you both was a blast! I’m excited about what the future holds and look forward to continuing my contributions to Zitadel and other open-source projects.
Thanks for reading!
Related Links:
Issue 1:
PR 1:
Issue 2:
PR 2:
Presentation Slide:
(Overview of our project and solutions)