Life comes at you fast – especially in the software world. People think being a software developer means a well-paid job where you get to work from home, all easy-peasy, but that’s only half the truth.
Production issues, last-minute changes to requirements, late-breaking discoveries, security challenges, budget overruns, and clients changing their minds make it so there is never a dull moment. As a developer, I’ve had to react to some pretty crazy situations.
What’s a worst-case scenario on a software project, and what do you do if it comes to pass?
This is one of the craziest situations I’ve had to deal with: a cyberattack while everyone was on vacation, and an immovable launch date.
I joined the Acme Hospital project about 9 months before the cyberattack took place. They’re a large enterprise with thousands of employees and an older legacy application that they wanted upgraded. However, their codebase was in much worse shape than we had thought. Our team had started the rewrite a year earlier and launched Phase 1 of the work (about 25% of the legacy code), and it was – not a slam dunk. A first-time launch for a client rarely is: we discovered all kinds of bugs while integrating with the existing codebase, dealing with the client’s IT department was a major pain, and on top of everything they had strict security requirements, since we were dealing with HIPAA-sensitive data.
So 3 months before the incident in question, we were in planning meetings with Sharyn (the client manager) and my Team Lead Shaloo, and we laid out the situation:
- The longer we maintained the old codebase, the higher the risk,
- The launch date was already scheduled and immovable, so we needed to make decisions right away in order to plan, and
- If we could immediately convert all the code to our new codebase, we could control the project.
But the client chose not to adopt this plan. We didn’t have great credibility with them because of the bugs from Phase 1, and they didn’t want to increase the budget for a full rewrite. So we ended up scoping only about 50% of the rewrite.
So the decision was made. Shaloo drew up the plans for the rewrite, we finished the software design, estimated the scope, and got approval to move forward. Three months in, with 3 weeks to go until launch, we were looking pretty good: my development tasks were done, and QA was testing everything. We should’ve been able to reclaim some credibility with a smooth launch, and we were feeling good. Right? Wrong.
The week before Christmas we got a call: the old codebase had been hacked.
Not just that, but a bunch of data had been deleted from the database. And of course, that wasn’t all; the client’s entire team was on vacation and we were left with just a skeleton staff. The launch date was only 3 weeks out. As they say in the movies – it was the perfect time to panic.
Escalations were made, leaders were called, and vacation plans were impacted. I won’t pretend we all had perfect responses, but after a few minutes of frantic questions we gathered what had happened: one or more bad actors had used a SQL injection attack to drop tables and mess with the data. As bad news goes, that wasn’t terrible: they didn’t have unrestricted access to the server, and we could block the attack by taking the offending feature in the legacy codebase down. But what would users do if we took parts of the legacy application down? We didn’t have this complex feature in scope for the rewrite.
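For anyone who hasn’t seen a SQL injection up close, here’s a minimal sketch of how this class of attack works and how parameterized queries shut it down. To be clear, this is not the Acme code: the `patients` table, the function names, and the payload are all made up for illustration, and I’m using Python’s built-in `sqlite3` with `executescript` as a stand-in for a database driver that allows several statements in one call.

```python
import sqlite3


def make_db():
    """Build a throwaway in-memory database with a hypothetical patients table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE patients (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO patients (name) VALUES ('Alice'), ('Bob')")
    conn.commit()
    return conn


def table_exists(conn, table):
    """Return True if a table with this name is still in the database."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    return row is not None


def rename_unsafe(conn, patient_id, new_name):
    # VULNERABLE: user input is pasted straight into the SQL text, so the
    # input can close the string literal and smuggle in its own statement.
    conn.executescript(
        f"UPDATE patients SET name = '{new_name}' WHERE id = {patient_id};"
    )


def rename_safe(conn, patient_id, new_name):
    # Parameterized query: the driver passes the values separately from the
    # SQL text, so the input is only ever treated as data.
    conn.execute(
        "UPDATE patients SET name = ? WHERE id = ?", (new_name, patient_id)
    )
    conn.commit()


# Hypothetical malicious "name" submitted through a form field.
payload = "x'; DROP TABLE patients; --"
```

Calling `rename_unsafe(conn, 1, payload)` on a fresh database drops the `patients` table, while `rename_safe(conn, 1, payload)` just stores that odd-looking string as the patient’s name and leaves the table alone.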
Time for our internal team to huddle. Rajiv (our CEO), Shaloo, and I ran through the scenarios. This looked really bad for everyone – especially the managers who were our internal stakeholders. We had made the recommendation to upgrade and they had refused. But this was an important project with high visibility, and the blowback would affect us too. On the flip side, if we took on more scope, we might not be able to deliver quality.
Damned if we do and damned if we don’t.
Rajiv turned to me, “Jesus, do you think you can rewrite that feature in 2 weeks?”
It was my turn to panic: “Wait, we just spent two and a half months writing 25% of the code, and you want me to do an equivalent amount of work in 2 weeks?”
“I know it’s a big ask, and we don’t need to rebuild everything exactly as it was done in the old app – but we need to do the minimum to make sure the feature is functional enough at launch. If we don’t do this, we can’t launch, and if we can’t launch, all the senior executives at Acme are going to get involved. People who trust us may be impacted as well. So I ask again – do you think it’s possible?”
It was time to put up or shut up: “I think – it’s possible.” Cue the superhero music.
From there it was non-stop action. Shaloo and I dove deep to get detailed requirements and design this feature out in a day or two. We got approval from Sharyn to go full tilt. Rajiv moved all other priorities off my plate to make room for this. QA was ready to jump on things. With a do-or-die spirit, the team really pulled together.
So even though we had a lot to do, the mood was good because the target was clear. And I kid you not – I worked 12-14 hours a day for 3 straight weeks. So how did it turn out?
It was probably one of the biggest successes I’ve been a part of. The launch went better than Phase 1 because we better understood the old codebase and environment. We turned our stakeholders into fans, and I earned serious bragging rights. Not bad, right?
So what’s the takeaway? Be a team player.
Can you be the person on the team that people turn to when the chips are down?
If you can do it – step up. 9 times out of 10, we find reasons not to put ourselves under pressure because we don’t want to be the fall guy if the risk doesn’t pay off. There were many reasons not to attempt a Hail Mary here, but I knew Shaloo and Rajiv had my back even if I failed. The risk was high, and we had good arguments for saying no, but the reward was high too – dare I say we saved a career or two and made friends for life.
Story by Jesus Fernandez, Senior Software Engineer @Informulate