Thoughts on software engineering
Yehan(Davis) Zhang
A huge fan of data intensive App! Voyage starts from OLAP @ Redshift -> Datalake @ Onehouse.
Some thoughts on the question "What's the end-to-end procedure of driving a project?"
Design phase
1. Motivation and customer benefits. Start from a narrative that reveals the pain points through qualitative descriptions. It justifies starting the work. PMs and SDMs will also use it to prioritize this project against many others, and it greatly affects how well the project gets funded and staffed.
Then restate the same thing from a technical/product-behavior perspective rather than from the customer/user perspective: functional requirements, performance requirements, and rejected requirements. Capture the key invariants and input-output behavior.
Do not commit to any timeline at this stage.
External resource: Input from SDM, PM, and any other stakeholders.
2. Detailed design phase.
Background on the components we are going to touch. Give a simplified view of the existing architecture, highlighting its invariants and if-then behavioral patterns. Who else depends on each component, and what do they expect of how it works? Ideally this is distilled into a state machine.
Workflow that breaks things down into interactions between components.
Proof of concept. Very important. A prototype shows the viability of the idea, ideally with preliminary numbers. For big projects a full POC might not be possible, so build POCs for smaller components.
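As an illustration of distilling a component's if-then behavior into a state machine, here is a minimal sketch. The states, events, and function names are hypothetical and chosen only for this example; a real component would have its own.

```python
from enum import Enum, auto


class State(Enum):
    # Hypothetical states for an imaginary job-like component.
    IDLE = auto()
    RUNNING = auto()
    FAILED = auto()
    DONE = auto()


# The "if-then" behavior written down as data: (current state, event) -> next state.
TRANSITIONS = {
    (State.IDLE, "start"): State.RUNNING,
    (State.RUNNING, "succeed"): State.DONE,
    (State.RUNNING, "fail"): State.FAILED,
    (State.FAILED, "retry"): State.RUNNING,
}


def step(state: State, event: str) -> State:
    """Return the next state, or raise if the event is not allowed in this state."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"illegal transition: {event!r} in state {state.name}"
        ) from None


def run(events, state: State = State.IDLE) -> State:
    """Apply a sequence of events and return the final state."""
    for event in events:
        state = step(state, event)
    return state
```

Writing the transitions as a table makes the invariants reviewable in the design doc: anything not listed is, by definition, a behavior the dependents should not rely on.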
Misc:
External resources:
At this stage, keep the SDM aware of any delay in closing the design. You should get timely support with the external resources you need, and the SDM is the best person to ensure this.
About timeline negotiation
Sometimes you need to give a timeline without fully closing the design. Whether to do so depends on how much risk and potential wasted effort you are willing to accept. If it goes beyond your safety threshold, push back.
Regarding the timeline: in our team, typically 1~2 people take on 95% of the implementation, while testing can be split evenly. Because of organizational commitments, something has to ship within a year or a quarter, and timely delivery becomes impossible if those backbone people are the bottleneck. Hence the manager and engineers have to work together on the following:
POC, implementation, and testing should take time in a ratio of roughly 2:1:3. In my view, cutting testing by half is not really an option. We only pretend it is until the cut-off date gets close and we can no longer pretend, at which point we are in the worse position of changing things at the very last minute.
Add a margin of an extra 3~4 weeks, based on past observations, for supporting other projects (design discussions, code reviews, ops, customer issues) and for uncertainty about the design. If this work was not planned for and it puts the committed timeline at risk, tell them you cannot do it without a delay.
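To make the arithmetic concrete, here is a small sketch that turns an implementation estimate into a full plan using the 2:1:3 ratio and a fixed margin. The `Plan` class and `plan_from_implementation` function are hypothetical names for this illustration, and the 4-week default margin is just one value from the 3~4 week range above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    poc_weeks: float
    implementation_weeks: float
    testing_weeks: float
    margin_weeks: float

    @property
    def total_weeks(self) -> float:
        return (
            self.poc_weeks
            + self.implementation_weeks
            + self.testing_weeks
            + self.margin_weeks
        )


def plan_from_implementation(
    implementation_weeks: float,
    margin_weeks: float = 4.0,   # assumed value within the 3~4 week range
    ratio=(2, 1, 3),             # POC : implementation : testing
) -> Plan:
    """Scale the ratio so its implementation share matches the estimate."""
    poc, impl, test = ratio
    unit = implementation_weeks / impl
    return Plan(poc * unit, impl * unit, test * unit, margin_weeks)


plan = plan_from_implementation(4)
print(plan, plan.total_weeks)
```

With a 4-week implementation estimate this gives 8 weeks of POC, 12 weeks of testing, and 28 weeks in total including the margin. Halving testing would only bring the total down to 22 weeks, which shows how little calendar time that "option" buys relative to the risk it adds.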
3.1 Coding
While working on a super large code base like what we have at Redshift, it is never surprising that we have to touch code that is 3~5 years old without explicit owners maintaining it.
The first rule is always to leave the code in a cleaner state after you have worked on it.
It is common, and vital, that the first task is not coding the new feature but refactoring the existing code to lay an extensible foundation. It can take 1~2 weeks to clean up, add validations, and add missing test coverage.
The trade-off among extensibility, agility, performance
A project is backed by its business value, not by how neatly the code is written. Software engineering is more a problem of how we balance code extensibility, agile feature delivery, and code performance. It is reasonable to first build something quick and dirty to get things working, then come back later during code freeze to untangle the if-else twists, unnecessary copies, code duplication, and ugly workarounds. Sometimes we have to make a choice that is suboptimal from a pure coding perspective because of a shortage of manpower, tight timelines, and calculated risks. The standards are elastic, within boundaries. It's not about perfect coding, but about keeping things under control. That means tracking the technical debt and day-1 code bugs, and keeping them in the backlog with priorities. We don't fix every bug we spot, but we keep things under control.
Having said that, I still clean up the existing code and write tests for it before working on it. Most of the time agile delivery does not conflict with maintaining the code. Doing so not only gives you confidence in the product quality but also speeds things up, since we end up working on well-structured code.
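One way to write tests for existing code before touching it is to pin its current behavior first, then run the same checks against the cleaned-up version. The sketch below is an assumption-laden toy: `legacy_parse_size`, `parse_size`, and the pinned cases are invented for illustration and are not from any real code base.

```python
def legacy_parse_size(text):
    """Stand-in for old, unowned code: parses '10KB', '3MB', or plain bytes."""
    text = text.strip().upper()
    if text.endswith("KB"):
        return int(text[:-2]) * 1024
    else:
        if text.endswith("MB"):
            return int(text[:-2]) * 1024 * 1024
        else:
            return int(text)


# Refactored version: the nested if-else becomes a lookup table.
_UNITS = {"KB": 1024, "MB": 1024 * 1024}


def parse_size(text):
    text = text.strip().upper()
    for suffix, factor in _UNITS.items():
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * factor
    return int(text)


# Behavior recorded from the legacy code BEFORE refactoring.
PINNED_CASES = {"10KB": 10240, " 3mb ": 3145728, "512": 512}


def check(fn):
    """Assert that fn reproduces every pinned input/output pair."""
    for raw, expected in PINNED_CASES.items():
        assert fn(raw) == expected, (raw, fn(raw), expected)
    return True


check(legacy_parse_size)  # pins the current behavior
check(parse_size)         # the refactor must not change it
```

The pinned cases cost a few minutes to write and are what make the later cleanup safe to do quickly.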
3.2 Anything beyond coding
Weekly follow-up. For a project starved of resources, count the days lost and keep the SDM up to date.
For explicit deprioritization of a project, call it out 3 weeks before we are going to lose it. It is paramount to review weekly with the SDMs.
Last but not least, document everything with a tracking Quip and email threads of meeting summaries. No one will remember what was decided yesterday.
4. Testing
More to come about mocking and dependency injection.
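As a placeholder until that follow-up, here is a minimal sketch of what mocking through dependency injection can look like, using Python's standard `unittest.mock`. The `ReportService` class, its `store` and `clock` dependencies, and the `fetch_rows` method are all hypothetical names made up for this example.

```python
from unittest.mock import Mock


class ReportService:
    """Receives its dependencies instead of constructing them itself."""

    def __init__(self, store, clock):
        self._store = store   # anything with a fetch_rows(day) method
        self._clock = clock   # zero-argument callable returning the day

    def daily_summary(self):
        day = self._clock()
        rows = self._store.fetch_rows(day)
        return {"day": day, "count": len(rows)}


def test_daily_summary():
    # The real store (a database, a service) is replaced by a mock.
    store = Mock()
    store.fetch_rows.return_value = ["r1", "r2", "r3"]

    # The clock is injected too, so the test is deterministic.
    service = ReportService(store, clock=lambda: "2024-01-01")

    assert service.daily_summary() == {"day": "2024-01-01", "count": 3}
    store.fetch_rows.assert_called_once_with("2024-01-01")


test_daily_summary()
```

Because the store and the clock are passed in, the test needs neither a live backend nor a fixed calendar date, which is the main reason to design components this way.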
Back to the Component design.
5. Misc
From an engineer's perspective, I feel there is growing value in knowing what lies beyond your own scope. Project delivery is by no means just: we are given a problem -> work on a design -> code -> deploy -> monitor. Projects can get deprioritized, trashed, abandoned, or badly resourced when something exciting in the first 2 months loses its business value afterwards. To get insight into these uncertainties, and to change the entire problem formulation when needed, our thinking has to be deeply rooted in non-technical factors: customer value, how the management team envisions goals at the organization level, and how other ongoing projects are changing the common infrastructure, among many other things. Working smart might mean not only being diligent on well-defined problems but also extending the scope of our concerns to see how the technical questions are strategically generated.
It's just a vague idea as of now, but I'm sure in the future I will get a better understanding.
To be continued..
I'm sure there are no new ideas here. This is just a summary for me to reflect on my past work.