Ch. 1: The technical basics of 10X

Hello,

I hope you have been enjoying this series so far and that it is helping you in some way. I missed the last post because of a personal tragedy this week - I lost my companion of 9 years, my beautiful dog Kali. It's not just me but a whole community missing her. If a dog has ever loved you and you were the centre of their life, you can imagine what I am going through. But death is as much a part of us as life, and separation as much a part of us as union.

Peace to you Kali! <3

Now, coming back to our ikigai as engineers - I am back today to share some technical basics that we need to get right on this journey.

Touching the Basics

In this series we will go through a lot of things. But if you are at an early stage of your learning journey, I recommend you get your hands dirty and your concepts right, at least to a reasonable proficiency, with the following fundamentals before you dive into other things. Hence this chapter is aptly called - the basics.

When beginning the journey, you should get a decent foothold on the following topics:

  • Software engineering career prospects
  • Repeating patterns in programming - learning multiple languages (covered via personal stories)
  • What is boilerplate? And why teams need to avoid it.
  • Abstracting patterns and writing or adopting DSLs or frameworks (covered via personal stories)
  • Four types of data: image, audio, video & text
  • Datastructures and algorithms
  • Basic Unix commands
  • On bugs and importance of QA

Software Engineering Career Prospects

What is this all about? What is software? What is the expanse and depth of this domain from an eagle eye point of view?

Types of Careers and Roles

What are the different roles in the software industry? For example: web developer, software engineer, DevOps engineer, QA engineer, security engineer, observability engineer, site reliability engineer, technical lead, technical architect, solution architect, technical writer, developer relations manager, developer advocate, engineering manager, scrum master, and chief of information, technology and security.

And which one may fit as your ikigai? It's a process of discovery.

All languages share a subset of universal patterns, with tweaks

Be comfortable writing code in any language of your choice.

For starters

If you like making games and are completely new to programming, Scratch would be a great starting point to get a taste of basic programming. It's for kids! :D If you are looking to play with software development straight away, Python or JS should be great starting points - especially JS and TS, being full stack languages. Yet, the sky is the limit.

Keep in mind: Grasp and understand the repeating patterns

What you need to focus on are the typical repeating concepts that every language tries to address in its own unique way. Just focus on getting those concepts: condition checks, looping, assignment, data transformations, error handling, functions, recursion, importing, procedural programming, object oriented programming, functional programming, async I/O etc. After that, every new language is just about memorising some syntax and libraries. And now search engines, IDEs, online content and AI make this job even easier for us.
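To make this concrete, here is a small sketch of several of those universal patterns in one place, written in JavaScript (the language used elsewhere in this post). The data and function names are illustrative, not from any real codebase - the point is that every language you learn will have its own syntax for exactly these ideas.

```javascript
// Assignment and a basic data structure (array)
const prices = [100, 250, 40];

// Looping + condition check: sum only the prices above 50
let total = 0;
for (const p of prices) {
  if (p > 50) total += p;
}

// Functions and recursion: the classic factorial
function factorial(n) {
  return n <= 1 ? 1 : n * factorial(n - 1);
}

// Error handling: fail loudly on invalid input
function safeDivide(a, b) {
  if (b === 0) throw new Error('division by zero');
  return a / b;
}

// Async I/O (promise-based): simulate a slow operation
async function delayedDouble(x) {
  await new Promise((resolve) => setTimeout(resolve, 10));
  return x * 2;
}
```

Rewrite this same sketch in Python or Java and you will see the identical concepts wearing different clothes - that is the repeating pattern to internalise.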

My journey with programming and building DSLs

Let me share my journey of not only using programming languages but also building parsers, compilers and DSLs. I hope it inspires you to not only understand language design but also build your own DSLs for play!

In 2002, I touched a computer for the first time only 3 months prior to joining IIT Kanpur - by opening an email account at an internet cafe. :-) I did not even know how to type properly on a keyboard. And the irony - I was going to study Computer Science at one of the best places in the world! (I had originally wanted to take Aeronautics - but that is another story.)

An aerial view of the Computer Science and Engineering department at IIT Kanpur

Imagine being a novice surrounded by some of the most intelligent minds around you, who already knew programming or even Unix commands! You now have to study with them and build projects with them. The bar was very high. We were in a 10X environment.

In our days at IIT Kanpur, in 2006, our batch had a course called "Introduction to Java". In those days we used BlueJ for practice. I had to start finding my place in this new world of programming that had opened up to me - and from there began the journey of programming, classes, inheritance, functions, POJOs.

IIT Kanpur was a very geeky environment. Not only were you surrounded by those hungry to learn and excel, but also by really talented professors and students who set a high benchmark for you in student life. We were lucky to have the late Professor Shri Amitaba Mukerjee teach us Java in a way which was fun! He once told me - "Ayush, it is perfectly healthy and actually better to have some As, Bs and a fair sprinkling of some Cs, Ds and maybe even an F! The best earning from this place is not your grades but those experiences and lessons that will serve you for your lifetime. Stay curious. Have fun!" I scored a B in Java that semester. I failed to score an F at IITK, but I sure managed a fair sprinkling of the rest of the grades in a journey which was truly expansive and rewarding :-)

If there is one thing about colleges like IIT, it's that if you are looking for inspiration, you just need to turn your head left or right. Such places are imbued with the 10X spirit. I wish every team in the world was like that! It starts with you. You can be like that! 10X.

Students and young engineers should know multiple languages

Not only were we lucky to have a 100 Mbps campus LAN and DC++, the environment was charged up with curiosity, healthy competition and aspiration. By the fourth year we kids had not only done some complex assignments and projects but also some freelancing gigs. We worked on a wide variety of problems. For example, in the third year Nihit Purwar, Rachna Agarwal and I made a C++ compiler (in C++!) as part of the "introduction to compilers" course project. We had played with, or at least seen, a plethora of languages and scripts like PHP, Python, SQL, Javascript, Perl, Lisp, C, C++, Objective C, C# (we even saw Fortran, Cobol and punched cards), and whatever else we needed to pick up and adopt when building our projects or just spending an afternoon on a friend's computer. During my early professional career with Oracle, iXigo and my first startup product, Metataste - a taste based movie recommendation engine - between 2011-14, I worked mostly in Objective C, Java, R, Javascript and a little bit of Haskell to understand functional programming.

In my entire experience, after seeing the first 3 languages of different kinds, I knew I could work in any new language.

Authoring own frameworks and DSLs

In 2014-15, I decided to put an end to Metataste (a yet-to-be-matched product that deserves to be alive till date - another story). It was a vector data based personalization engine using Mongodb and Lucene working together in dual write mode. In order to build this in a configuration-over-code manner, so that it was always flexible and adaptable, we developed a Java framework over Tomcat, using annotations. (As you can guess by now - I love to build frameworks.)

By that time I had worked mostly on Java based projects for around 8 years. I found Java a bit too verbose and definitely not a silver bullet for everything! Then came Nodejs in 2014, on top of which was laid the first stone (line of code) of Elasticgraph, starting 2015. (We will release this sometime this year on Github.) Given my proclivity to abstract repeated patterns of code, between 2017 and 2020 I developed the larger part of Elasticgraph to be used for the digital archive of His Holiness the 14th Dalai Lama - a project with 30+ tables, Socket based CRUD, joins, denormalisation, data dependencies, migration and some complex cases. Migrating data from so many tables to a new schema was a pain. And I have always detested boilerplate in code.

Developing an English-like DSL

Hence I developed one of the features of EG - an English-like scripting language for working with Elasticgraph in Nodejs, using the lovely Pegjs parser. Authoring DSLs is real fun and also empowering. The code I am about to show below does the following things (along with configurations, of course) -

It scrolls over an ES index, fetches some more documents, creates documents when lookups fail, joins data across relationships, avoids the N+1 query problem (via in-mem caching), assigns variables (with immutability and concurrent access behind the scenes), transforms the data, indexes computed documents using bulk requests, and manages 'foreign key' connections, data dependencies and also materialized views! This is perhaps 70 lines of code along with some configuration files where the relationship, join and materialized view models are defined. If you were to achieve the same in any language from scratch, how many lines of boilerplate code would you need to author?

Tell me how easy it is to understand the following code (do notice the lack of boilerplate):

const debug = console.log; // any logger will do
let n = 0; // iteration counter used in the final callback

const fillSpeakersTranslatorsAndLinkWithEvent = [
    'iterate over old-contents where {$exist: event_id} as old-content. Get 25 at a time. Flush every 5 cycles. Wait for 100 millis',
    [
        'get event *old-content.event_id',
        'if *event is empty, display "empty event", *old-content.event_id',
        'if *event is empty, stop here',
        'search old-content-to-audio-channel where {content_id: *old-content._id} as cac',
        'async each *cac.hits.hits as old-content-to-audio-channel',
        [
            'get old-audio-channel *old-content-to-audio-channel.audiochannel_id as old-audio-channel', //Fetch by id. No need to mention _source or fields. Both places, including the top level object, will be checked for existence of the audiochannel_id field
            'search first person where {_id: *old-audio-channel.speaker_id} as person. Create if not exists.', //Creates person in the index if not found there, and assigns it to the variable person
            //Handle event.speakers/translators. This guy is either a speaker or a translator. Set the relevant linking
            //Initializer
            'roleType is speaker if *old-audio-channel.translation_type is empty. Else is translator',
            'roleFeatures are {person._id: *old-audio-channel.speaker_id, primaryLanguages._id: *old-audio-channel.language_id}',
            //Can include pure JS functions for complex logic
            (ctx) => { //ctx has variables, the es client and an in-mem cache
                if (ctx.get('roleType') === 'translator') {
                    const translationType = ctx.get('old-audio-channel')._source.translation_type
                    ctx.get('roleFeatures').translationType = translationType
                }
                return ctx
            },
            'search first *roleType where *roleFeatures as speakerOrTranslator. Create if not exists.',
            'if *speakerOrTranslator is empty, display "empty speaker", *roleFeatures, *roleType',
            'if *speakerOrTranslator is empty, stop here',

            //'display *roleType, *speakerOrTranslator._id, *roleFeatures',
            'link *speakerOrTranslator with *event as events',
        ],
    ],
    (ctx) => debug('Done ' + n++ + ' iterations')
];

//Now run the script, where eg is an initialised Elasticgraph instance
eg.dsl.execute(fillSpeakersTranslatorsAndLinkWithEvent);

Elasticgraph was written in Vim, in Nodejs using Javascript, in the days when async/await was not yet a standard JS feature. We used Q and async-q, the promise libraries prominently in use at that point (a technical debt we would like to clear by modernising the codebase). We also made a UI admin panel which was auto-generated, again using configurations! Zero coding was required for a user to set up an entity model domain CRUD API (both REST + Socket) using Nodejs and a ReactJS UI. You can find a video demonstration of the Elasticgraph featureset here.

Great programming is not about the code you write. It is about the code you do not have to write. If the same logic were implemented in pure third generation language code using the Elasticsearch Nodejs client library, it would run to at least a thousand lines of quite complex boilerplate, making it difficult to author, understand & maintain. It would also take a reasonably smart engineer a decent chunk of time to write or to understand it.

The coming of Godspeed and its 4th gen abstractions

From Elasticgraph, which is great for Elasticsearch + Nodejs based use cases, the buck of innovation and scope of work moved forward to solving the larger problem for modern microservices, serverless and event driven/sourced systems, which can be reasonably complex in nature. We adopted 'types' in the JS world using Typescript in 2022, when we started building V1 of the Godspeed Meta-Framework - which provides fourth generation abstractions for developing modern API and event driven systems over 3rd gen frameworks. It currently supports Nodejs. We want to bring the same abstractions to Java, Python and other language ecosystems as a community initiative, and I welcome your contributions.

Typical use cases in modern software development

Here are some use cases which typical software development covers when building modern applications at scale and complexity. The Godspeed framework was first made for our own team, to implement these with the least boilerplate.

You should know these patterns

  • Event driven systems (sync/async)
  • Services or serverless
  • Data collection and pipelines
  • Data syncing or dual writes
  • Primary and secondary datastores
  • Write through or read through cache
  • Data federation
  • Backend for frontend
  • Event sourced systems
  • Distributed transactions
  • Policy framework
  • Workflow Orchestrators
  • Dynamic workflow engines
  • Authentication
  • Authorization
  • Graph search and analytics
  • Full stack application

The vision is to standardise and democratise the development of modern applications for students and global teams - both startups and enterprises.

Now, if you give me any other language to pick up, or ask me to author one, I should be able to deliver with relative ease. Why? Because I have by now seen most concepts in programming languages. The patterns just keep repeating, with minor to major tweaks. But the fundamentals of logic, datastructures, algorithms and computation remain the same. And expanding beyond languages to building complex systems, there are again repeated patterns which can be understood, learned and abstracted - something we will cover in this 10X series.

The concept of logic will always remain the same, no matter how many generations of languages come and go. So if you have visualised and understood these concepts now, you can switch easily between tools.

Business Logic and Boilerplate

The logic which serves a customer's or user's need is business logic. The logic and integration code that enables the business logic to run without hiccups is boilerplate. Boilerplate does not directly serve the user, the business or the world. It serves the basic functioning of the software. For example, multiplexing your queries into a bulk request, then de-multiplexing the responses for further handling of the respective API calls.

The actual business logic serves the user. The boilerplate does not directly serve the user. So why should it exist?
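The bulk-request example above is worth seeing in code, because it shows how much machinery exists purely to serve the plumbing. Below is a hedged sketch: `bulkSearch` is a hypothetical stand-in for a datastore's bulk endpoint, and `makeBatcher` is illustrative, not from any real library.

```javascript
// Stand-in for a datastore's bulk endpoint (hypothetical):
// answers each query in the batch independently.
async function bulkSearch(queries) {
  return queries.map((q) => ({ query: q, hits: [`result-for-${q}`] }));
}

function makeBatcher() {
  const pending = []; // queued { query, resolve } pairs
  return {
    // Each caller just asks for its own single query...
    search(query) {
      return new Promise((resolve) => pending.push({ query, resolve }));
    },
    // ...while the batcher multiplexes all pending queries into one
    // bulk request, then de-multiplexes responses back to each caller.
    async flush() {
      const batch = pending.splice(0);
      const responses = await bulkSearch(batch.map((p) => p.query));
      batch.forEach((p, i) => p.resolve(responses[i]));
    },
  };
}
```

Notice that not a single line of this touches business logic - it is all wiring. That is exactly the kind of code a framework should write for you.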

In my tryst with programming and professional software development, from our first semester course "Programming in Java" in 2002 to 2024 so far, I have seen that in oceans of lines of code lie hidden the real pearls - the actual business logic. And hidden like a needle in a haystack - the bug!
Teams should ideally focus on the tip of the iceberg, with the integration layer sorted once and for all.

Most, if not almost all, engineering teams to date spend the majority of their time and effort writing, managing and debugging code which is not serving the user, but enabling them to serve the user. For example, initialising an http server, or setting up and querying a database or API with authentication and token refresh. This job, even if done by AI, increases the entropy, effort and chance of mistakes by 10X. Why? Because the scope of work has gone beyond just writing business logic to wiring up the entire system. Hence more chances of errors, and a longer time to find and fix errors, add new features, replace an integration or do optimisations.

I remember a Drone Pilot Marketplace project I did as part of a team of 3 backend engineers, within a larger team of frontend engineers, testing engineers, a scrum master and a product analyst. All the backend team did for three months was write CRUD APIs with validations, try/catch/retry and, finally, the pearl in between all of this - the actual functionality of the app - which was not even 5% of the project code. The logic layer was so thin! Absolutely non-existent! We were simply exposing CRUD over a database via REST APIs and validating incoming API call inputs as Express middleware. And as is the case with typical such implementations - there was not just extra effort, but a questionable outcome. There were three sources of truth floating around the team - the db schema, the coded API validations and the Postman collection going around in emails.

With something like Godspeed's meta-framework this job could have been finished many times faster and better, with a single source of truth, using the Schema Driven Development guardrail. It would have saved three months of effort not just for the backend team, but also for the frontend team, the QA, the scrum master and, most importantly, the customer.

I remember our scrum master would often call us desperately, even at night, saying: hey guys, this week is the release. Can we push harder? And what did we push harder? CRUD APIs with three sources of truth! Not his decision, not his fault, but the approach caused inefficiency and struggle for all.

Data & datastructures

There are four types of data: image, audio, video and text.

  • Programming is but reading, transforming, computing and sending data.
  • Data is organised and represented in structures (data-structures).
  • Multiple structures can come together to form more complex structures, which people call types - a Class in Java, a struct in C, a 'document' in a Nosql database. For example: stone, tree, animal, human.
  • Data and structures are everywhere - in a file, a database or memory of a software.
  • Whether it's the file system, datastore, API, system or process memory - everything is a source or store of data, a place where you send or retrieve data from. It's called a datasource!

In the initial stages of learning, ample effort and attention need to be invested in understanding and playing with data, structures and, later, types. It opens the mind. It starts with the very basic ones - a boolean, string, number, array, hash etc. Playing with map, reduce and filter helps in learning, as well as improving the skill of visualisation.
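Here is a small playground for exactly that, in JavaScript. The data is made up for illustration; the exercise is to visualise the shape of the input, the shape of the output, and the transformation in between.

```javascript
// Basic structures: strings, numbers, booleans, arrays, hashes (objects)
const talks = [
  { title: 'Compassion', minutes: 45, recorded: true },
  { title: 'Emptiness', minutes: 90, recorded: false },
  { title: 'Karma', minutes: 30, recorded: true },
];

// filter: keep only the recorded talks (same shape, fewer items)
const recorded = talks.filter((t) => t.recorded);

// map: transform each structure into a simpler one (new shape, same count)
const titles = recorded.map((t) => t.title);

// reduce: fold many values into one (array of structures -> single number)
const totalMinutes = recorded.reduce((sum, t) => sum + t.minutes, 0);
```

If you can predict `titles` and `totalMinutes` before running this, you are already practising the visualisation skill described above.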

A good programmer can visualise data in its different forms (structures) and connections (relationships). They start by visualising well the input picture and the output picture, and then they visualise the path to realise the outcome. They document it, whether as a // TODO comment in code or in a Jira or Github spec. Then, finally, they proceed to implementation.

Algorithms

Next up: algorithms. We sometimes need to do more complex work than just transforming data from one format to another, like map-reduce in memory. This is the beginning.

An inspired programmer travels far and wide with the travelling salesman problem, Kruskal's minimum spanning tree algorithm, binary search, heap sort and the like. They admire the sheer human ability to visualise and solve puzzles. And so can you! Working on puzzles, datastructures and algorithms, trying to solve a problem with the least computational complexity (Big O notation) - you should not do these only because you have to prepare for an interview. They have to be part of your initial journey to open your mind! To exercise those muscles which will make you a 10X engineer.

Basic Unix Commands

IMO the best software in the world is written on Unix based systems. One should know the fundamentals of operating systems and basic Unix commands and tools, because they allow you to be more efficient and empowered in your development cycle. Period. A very basic knowledge of shell programming and a few commands goes a long way in helping us do our day to day work much more efficiently.

My personal list of useful commands to play with: cd, ls, cp, mv, ln, tar, curl, ps, netstat, kill, pkill, xargs, grep, sed, find, awk, top, htop, sudo, cat, more, less.

They can be combined and used together in tandem. My favourite one finds services by a process name match and stops all of them together (the extra grep -v grep filters out the grep process itself, and note that kill -9 force-kills without cleanup - use it with care):

ps -ef | grep service_name | grep -v grep | awk '{print $2}' | xargs kill -9        

About Errors, Debugging and Importance of Testing

An error, or bug, is unexpected behaviour of the software.

The earlier the discovery of bugs, the better

When a bug bites on a local machine, it hurts less than when it bites in production. Hence it is imperative to find and fix bugs before the code moves to UAT or production. The cost of a bug in production is 10 times its cost on the developer's local machine or dev environment.

Bad UX hurts more than good UX delights

The quality of user experience makes or breaks the game of any software's adoption and growth. We humans have a negativity bias. One bad experience is, most of the time, worse than 99 good ones. In order to ensure sound quality, testing is of paramount importance.

So teams follow a proper process called the SDLC, within which there is another sub-process called CI/CD automation. Teams write different kinds of unit, integration and functional tests which are run as part of CI/CD automation, along with scans like static code scans, vulnerability scans, network scans etc. Developers often do not write test cases. But they need to be written and automated by someone - and ideally by the one who wrote the code in the first place!

// Code shipped without test automation carries a huge risk of breaking something that worked earlier!        

Conclusion

Learning the essentials of software development is critical. Without them, one is like an untrained engineer without a vision.

There is so much to talk about when we start off on this journey of programming and software development. I will keep updating these entries as I maintain this series with the help of friends and a community like you. I stay open for the community to share suggestions or write a guest post and help improve this content for the benefit of all.

Feel free to reach out to me with your thoughts. Stay tuned for the next blogs in the series on advanced programming topics.

The original blog is at https://godspeed.systems/blog/10x-engineering-the-fundamental-topics
