Chapter 2: UPDATE Without WHERE
I decided to share the most interesting, tragic, nonsensical, and funny stories from my 15 years as a DBA. Names, dates, and locations are omitted to avoid direct connections with the real facts.
In the early 2000s I started working at a small, startup-style bank. It was a wonderful time where I made friends and learned a lot. Informality was everywhere, from the dress code to the IT processes. We joked among ourselves that the company's project management methodology was "Go Horse".
It was also a period when I had the freedom to innovate, and I contributed to interesting things like database alerts via SMS, a distributed DW, and an Agile PM methodology (at least for my team). I learned data mining and took my first data science steps. And even being the DBA, I won the "Best idea for business development" award.
One day, probably in 2006, a young developer asked for production database access, which I denied, as always. I was never much of a fan of "Go Horse" and always pushed back against certain practices. But the IT manager gave him access anyway, and that was the root cause of the problem. He decided not only to check what he was supposed to, but also to fix his new home phone number in the central contacts database, which held information on employees, vendors, clients, and so on. He was a client of the bank too, and his mistake was updating the table without the WHERE clause: he typed it, but didn't use it. As a result, every contact in the bank's database got his phone number.
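For anyone who has never seen this class of mistake, here is a minimal sketch of what it looks like. The table name, column names, and ID are hypothetical; the story does not say what the real schema was.

```sql
-- What he meant to run: update only his own contact record.
UPDATE contacts
SET    home_phone = '555-0100'
WHERE  contact_id = 48213;       -- the clause that never made it in

-- What he actually ran: without WHERE, every row in the table is updated.
UPDATE contacts
SET    home_phone = '555-0100';

-- A common safety net (generic sketch; syntax and defaults vary by RDBMS):
-- wrap risky DML in a transaction and check the affected row count before committing.
BEGIN;
UPDATE contacts
SET    home_phone = '555-0100'
WHERE  contact_id = 48213;
-- expected: 1 row affected; if not, ROLLBACK instead of COMMIT
COMMIT;
```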
He didn't notice the problem for a while. That only changed when his mother called his cell phone, saying the home phone was ringing nonstop. The calls were as diverse as possible, from late-payment collection to credit offers and other services. Obviously, the bank's call center hadn't noticed either, nor had the billing area, the legal area, and so on.
The only way to revert the damage was to restore a backup somewhere else; I could not simply overwrite the other 100+ tables. Also, contacts created after the incident were free of the problem, so we would need a balance-line process to merge the restored data with the current data. It was a big database, and a restore in the middle of the day would hurt the performance of every other system. For the balance line, the restored database had to be joined with the affected one, meaning it had to sit on one of the production servers, a hypothesis already ruled out for the same reason: it was the middle of the day and the impact on other systems would be too great. The development servers had no free space, completing the impossible situation.
Imagine how happy I was: he had gone against my guidance, acted behind my back, and caused a huge problem that was difficult to solve. All my priorities were put aside to deal with it. But I needed to help him.
I fixed the problem by restoring the database on the DW server and exporting the affected data to a temporary table on the production server. Once the data was loaded, we could fix only the updated table. The process took hours, and in the meantime the phone at the boy's house had to stay off. At the same time, all the bank's service areas took the afternoon off.
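Conceptually, the final step looks something like the sketch below. This is not the original script: the table and column names are invented, and the UPDATE ... FROM form shown is PostgreSQL-style syntax, which varies by RDBMS.

```sql
-- 1. Staging table on the production server, loaded with the pre-incident
--    data exported from the copy restored on the DW server.
CREATE TABLE contacts_backup (
    contact_id INT PRIMARY KEY,
    home_phone VARCHAR(20)
);

-- 2. Put the original numbers back. Only rows that exist in the backup are
--    touched; contacts created after the incident have no match there and
--    keep their already-correct data -- the "balance line" part of the job.
UPDATE contacts AS c
SET    home_phone = b.home_phone
FROM   contacts_backup AS b
WHERE  c.contact_id = b.contact_id;
```

The key design point is restoring into a separate server and moving only the affected table across, so the other 100+ tables in production are never overwritten.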
You can check all other chapters here: https://www.dhirubhai.net/in/rodrigossz/detail/recent-activity/posts/
Tks!
Head of Data Engineering | Data Manager | MDM | CDP | AWS Community Builder | Mentoring
Hahaha, reading this post I remembered an identical problem I went through in a production environment. We had to restore the backup on another production server, export the affected table from the backup, and import it into the database that had received the wrong update. My boss went crazy on the phone when I told him about the problem, the dev team scrambled to put up a temporary page, etc. Good crazy times. After that, we ended up acquiring a tool that could export a dump of a given table directly from backup images; if we had had it at the time of the problem, it would have been solved in a tenth of the time it took us.