Removing blocked users from a self-hosted GitLab CE. The story of one mistake.
When you administer a self-hosted GitLab, sooner or later you need to clean up blocked users. For example, if many people use the system and authentication goes through LDAP, employees leave over time and their inactive accounts remain.
It was important for us to remove these users from projects, because the member lists had become difficult to navigate, while preserving the history and digital footprint each employee had left in the system.
First try.
GitLab lets you list all blocked users, so we decided to use this and wrote a simple loop that removes every blocked user from the system.
# Look through the first 60 pages
for ((i=1;i<=60;i++)); do
    # Get the blocked users on each page
    for id in $(curl -H "Private-Token: <private_token>" https://gitlab.example.com/api/v4/users\?blocked\=true\&per_page=1000\&page=$i | jq -r '.[].id'); do
        # Print the user ID and delete that user
        echo $id && curl -X DELETE -H "Private-Token: <private_token>" https://gitlab.example.com/api/v4/users/$id?hard_delete=false
    done
done
However, we made a mistake: we did not verify that the users returned by the API request were actually the blocked ones. When there are more than 1000 blocked users, GitLab stops displaying the exact count, so comparing totals seemed impossible, and we only spot-checked a few users.
The script did not work as planned, and 90% of users were deleted from the system. The situation was terrible but not critical, since we could recover from the backup. After spending some time taking a safety backup and restoring to a version from before the script ran, we started GitLab and were surprised to find that users continued to be deleted.
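In hindsight, the missing check could have looked something like the sketch below. It is not the script we ran; it is a minimal illustration that assumes each record returned by the users API carries a `state` field, and the function names and sample data are hypothetical. The idea is to refuse to delete anything if even one returned user is not in a blocked state.

```python
# States treated as "blocked" (the same two states used later in this article).
BLOCKED_STATES = {"blocked", "ldap_blocked"}


def split_by_state(users):
    """Split API user records into (safe_to_delete, unexpected).

    `users` is a list of dicts shaped like entries returned by
    GET /api/v4/users; only the "id" and "state" keys are used.
    """
    safe, unexpected = [], []
    for user in users:
        if user.get("state") in BLOCKED_STATES:
            safe.append(user)
        else:
            unexpected.append(user)
    return safe, unexpected


def ids_to_delete(users):
    """Return the IDs to delete, or raise if any returned user is not blocked."""
    safe, unexpected = split_by_state(users)
    if unexpected:
        # Abort the whole run: the filter did not do what we assumed.
        raise RuntimeError(
            f"{len(unexpected)} of {len(users)} returned users are not blocked; aborting"
        )
    return [user["id"] for user in safe]


if __name__ == "__main__":
    # Hypothetical sample data standing in for one page of API results.
    page = [
        {"id": 7, "username": "former.employee", "state": "blocked"},
        {"id": 8, "username": "ldap.leaver", "state": "ldap_blocked"},
    ]
    print(ids_to_delete(page))
```

Failing loudly on the first unexpected record is a deliberate choice here: a partial deletion run is harder to reason about than one that never starts.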
System recovery.
It was nighttime, and we had enough time to figure out the cause before employees started work. Our first thought was that the tasks were still in the database or queue. We looked into the logs and saw the message:
"queue":"delete_user"
As it turned out, Sidekiq stores its queues in Redis. We had suspected as much and had even considered running FLUSHALL in Redis if we couldn’t find another solution. Digging into GitLab’s Rails console, we…
$ sudo gitlab-rails console
…found the queue we needed:
irb(main):023:0> Sidekiq::Queue.new("delete_user").size
=> 959
There were 959 jobs in the queue at that time. Our first attempts to remove the queue failed with the error message “Running jobs cannot be killed”, but after a series of experiments we found that we could clear the queue:
irb(main):024:0> Sidekiq::Queue.new("delete_user").clear
=> [1, true]
irb(main):025:0> Sidekiq::Queue.new("delete_user").size
=> 0
After making sure that users were no longer being deleted, we restored the system to the state it was in before we ran the deletion script. When we started the system again, users were no longer being deleted.
By the way, the same actions could have been performed using the API.
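As a rough illustration of the API route, the sketch below builds (but does not send) a request against GitLab's Sidekiq queues administration endpoint. We did not use this ourselves, so treat the details as assumptions to check against the documentation for your GitLab version: the endpoint path, the requirement to pass at least one metadata filter such as `user`, and the need for an administrator token. The function name, host, and username are hypothetical.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_queue_purge_request(base_url, queue_name, token, **metadata):
    """Build a DELETE request that asks GitLab to drop matching queued jobs.

    Assumption: the endpoint is
    DELETE /api/v4/admin/sidekiq/queues/:queue_name
    and it expects at least one metadata filter (for example user=...).
    """
    if not metadata:
        raise ValueError("at least one metadata filter is required")
    url = (
        f"{base_url.rstrip('/')}/api/v4/admin/sidekiq/queues/"
        f"{quote(queue_name, safe='')}?{urlencode(metadata)}"
    )
    return Request(url, method="DELETE", headers={"Private-Token": token})


if __name__ == "__main__":
    # Hypothetical host and username; the request is only printed, not sent.
    req = build_queue_purge_request(
        "https://gitlab.example.com",
        "delete_user",
        "<private_token>",
        user="admin-username",
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

Unlike clearing the whole queue from the Rails console, a metadata filter limits the purge to jobs started by one user, which is safer if the queue also holds legitimate work.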
Why almost all users, rather than only the blocked ones, started to be deleted remained an open question for us, and we decided to move on.
Second try.
Based on this experience, we decided not to delete users from the system at all, but only to exclude them from projects and groups. Besides avoiding a repeat of the previous night’s adventure, this approach was guaranteed to preserve each employee’s digital trail. We decided to implement it in Python with the python-gitlab library, adding a dry-run option to make sure everything goes according to plan.
# gitlab_auth() and colored_print() are helpers from our script (not shown here).
# The function name and signature below are illustrative; the original wrapper is not shown.
def remove_blocked_project_members(token, project_name, dry_run=True):
    gl = gitlab_auth(token)
    project = gl.projects.get(project_name)
    members_list = project.members.list(all=True)
    # Keep only members whose account is blocked, either manually or via LDAP
    blocked_project_members = [member for member in members_list if member.state in ("blocked", "ldap_blocked")]
    if dry_run:
        colored_print("Dry run, relax ^_^", fg="green")
    deleted_members = []
    for member in blocked_project_members:
        colored_print(f"Removing {member.web_url} from project {project.web_url} with project id {project.id}", fg="yellow")
        if dry_run:
            colored_print(f"Dry run removal attempt for {member.name}", fg="green")
            continue
        try:
            project.members.delete(member.id)
        except Exception as e:
            # Report the failure and do not count this member as removed
            colored_print(f"Error when trying to remove {member.web_url} from project {project.web_url}: {e}", fg="red")
            continue
        deleted_members.append(member.web_url)
    return deleted_members
With this approach, we managed to exclude users without deleting anything unnecessary.
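The same idea carries over to groups. The sketch below is an illustration rather than our production code: it assumes that python-gitlab project and group objects both expose `members.list(all=True)` and `members.delete(id)`, and that member objects carry `state`, `id` and `username`. The `FakeMembers` class is a hypothetical stand-in so the example runs without a GitLab server.

```python
from types import SimpleNamespace

BLOCKED_STATES = ("blocked", "ldap_blocked")


def remove_blocked_members(container, dry_run=True):
    """Remove blocked members from a project-like or group-like object.

    Returns the usernames that were removed, or that would be removed
    when dry_run is True.
    """
    affected = []
    for member in container.members.list(all=True):
        if member.state not in BLOCKED_STATES:
            continue
        if not dry_run:
            container.members.delete(member.id)
        affected.append(member.username)
    return affected


class FakeMembers:
    """Offline stand-in for a python-gitlab members manager (hypothetical)."""

    def __init__(self, members):
        self._members = members[:]

    def list(self, **kwargs):
        # Return a copy so callers can delete while iterating over the result
        return self._members[:]

    def delete(self, member_id):
        self._members = [m for m in self._members if m.id != member_id]


if __name__ == "__main__":
    group = SimpleNamespace(members=FakeMembers([
        SimpleNamespace(id=1, username="active.dev", state="active"),
        SimpleNamespace(id=2, username="former.dev", state="ldap_blocked"),
    ]))
    print("Would remove:", remove_blocked_members(group, dry_run=True))
    print("Removed:", remove_blocked_members(group, dry_run=False))
```

With a real connection, the `container` would be the result of `gl.projects.get(...)` or `gl.groups.get(...)`. Running with `dry_run=True` first and reviewing the returned list gives the check we skipped on the first try.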
Conclusions:
Thank you for reading, have a great day!