Is software reuse leading to dependency hell?
Top 100 NPM packages and their dependencies till level 4

What is a dependency?

A dependency is code that your code relies on but that someone else wrote, usually someone outside your team or organization. A direct dependency is code you reuse yourself; a transitive dependency is a dependency of the code you depend on.

In the early days of software development, reusing external software was a challenge. Usually, the only code readily available was what shipped with a language or framework, e.g., the JDK for Java or the standard library for C.

You could download and use external frameworks and libraries from established sources, e.g., the Apache Software Foundation, SpringSource, etc. There was no easy way to get code for reuse from informal sources.

How was the software reuse problem solved?

This problem has been solved over the last 25 years by way of various tools:

Discoverability – Powerful search engines and language-specific websites, e.g., pypi.org for finding Python packages, have made finding reusable code easy.

Dependency or Package Managers – Package managers are now popular for nearly every programming language, e.g., Maven for Java, pip for Python, or npm for JavaScript, and they have made it very easy to access the packages you depend on. They also resolve transitive dependencies when building software.

However, with great power comes great responsibility. In the last few years, the ease of adding a dependency to software has resulted in major challenges, sometimes colloquially known as dependency hell.

So, what could go wrong?

Unreliable source – A few decades ago, reusable software was difficult to obtain, but it usually came from a reliable source with a governance process, e.g., the Apache Software Foundation, the Java Community Process, etc. Software from unknown sources carries no such assurance of reliability, exposing your software to flaws and failures in the dependency.

Granularity – Earlier, dependencies were coarse-grained: each usually contained a piece of software fulfilling a major function. The scaling up of package managers has resulted in developers publishing a single function as a package. An example is the npm package left-pad, which just left-pads a string. This has resulted in a very high number of dependencies that are difficult to manage.

Complex transitive dependencies – A single dependency might seem appropriate. Still, it may bring in a large number of transitive dependencies, and the entire tree becomes part of what your software depends on, which again makes it hard to manage.

Security vulnerabilities – A developer is not just responsible for the software they write; they become liable for all of its dependencies as well. A security vulnerability in any dependency will have to be addressed.
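
To make the point about transitive dependencies concrete, here is a minimal Python sketch that walks a dependency graph and collects everything a package pulls in. The graph and every package name in it are hypothetical and exist only for illustration; a real project would read this information from its package manager.

```python
from collections import deque


def transitive_dependencies(graph, package):
    """Return every package reachable from `package`, direct or transitive."""
    seen = set()
    queue = deque(graph.get(package, []))
    while queue:
        dependency = queue.popleft()
        if dependency in seen or dependency == package:
            continue  # already counted, or a cycle back to the start
        seen.add(dependency)
        queue.extend(graph.get(dependency, []))
    return seen


# Hypothetical graph: each key lists the packages it depends on directly.
GRAPH = {
    "my-app": ["web-framework"],
    "web-framework": ["router", "template-engine"],
    "router": ["path-parser"],
    "template-engine": ["string-utils", "path-parser"],
}

all_dependencies = transitive_dependencies(GRAPH, "my-app")
print(len(GRAPH["my-app"]), "direct,", len(all_dependencies), "in total")
```

In this made-up graph, one direct dependency turns into five packages that have to be trusted, watched, and patched.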

Guidance on reusing software and including dependencies

Avoid the dependency – The best dependency is the one that does not exist. For simple functionality that can be written in a single function, it is best to avoid reusing someone else's software. An example from my experience: I was writing Python code to calculate a loan repayment schedule. I needed the PMT function, which was available in NumPy, a popular library. NumPy is very well supported; however, I found that the actual code for the PMT function is just a few lines and is given in the NumPy documentation itself. So, avoid the dependency wherever you can write the function yourself.
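
As a rough illustration of how small such a function can be, here is a sketch of a fixed-payment calculation using the standard annuity formula. This is not the code from the NumPy documentation: the name `loan_payment` and its parameters are my own, it assumes payments at the end of each period, and it returns a positive amount rather than following a cash-flow sign convention.

```python
def loan_payment(principal: float, rate_per_period: float, periods: int) -> float:
    """Fixed payment per period that fully repays `principal` over `periods`."""
    if periods <= 0:
        raise ValueError("periods must be positive")
    if rate_per_period == 0:
        # No interest: the principal is simply split evenly.
        return principal / periods
    growth = (1 + rate_per_period) ** periods
    return principal * rate_per_period * growth / (growth - 1)


# A 15-year loan of 200,000 at 7.5% annual interest, repaid monthly.
monthly = loan_payment(200_000, 0.075 / 12, 12 * 15)
print(round(monthly, 2))  # prints 1854.02
```

A function like this is easy to test and carries no transitive dependencies, which is the trade-off the paragraph above argues for.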

Check the dependency – If you must reuse software, look at the following characteristics.

  • Design – Is the software documentation clear? Does it have a good API design?
  • Code Quality – Is the code well written? Most dependencies are open source, and a quick read would give you a sense of how clean the code is.
  • Testing – Does the dependency have unit tests, and can you run them and see them pass?
  • Maintenance – Check that the package is updated frequently and that the issue tracker does not have too many open issues.
  • Security – Check for open vulnerabilities. Also, check for past vulnerabilities and how quickly they were closed.
  • Dependencies – Check how many transitive dependencies the package has. Does including it bring in a large number of other dependencies? The best dependency is the one with zero transitive dependencies.
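
For the last check in the list above, one possible starting point in Python is the standard library's `importlib.metadata`, which can report what an already installed package declares as its requirements. The helper name `declared_requirements` is my own, and the sketch only looks one level deep; it assumes Python 3.9 or later and skips dependencies that are only needed for optional extras.

```python
import re
from importlib import metadata


def declared_requirements(package: str) -> list[str]:
    """Names of the direct dependencies an installed package declares."""
    try:
        requirements = metadata.requires(package) or []
    except metadata.PackageNotFoundError:
        return []  # not installed in this environment
    names = []
    for requirement in requirements:
        # Skip dependencies that are only pulled in by an optional "extra".
        if "extra ==" in requirement:
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement)
        if match and match.group(0) not in names:
            names.append(match.group(0))
    return names


# Inspect whichever installed package you are evaluating, e.g. "pip".
print(declared_requirements("pip"))
```

Calling the helper again on each returned name would give the next level of the tree.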

Abstract the dependency – It is best not to tie the dependency directly into your code. Instead, abstract it by writing a generic wrapper around it and calling only the wrapper. This helps in the future: if you have to change the package, the impact will be in one place rather than everywhere you have used the dependency.
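
A minimal sketch of this wrapper idea in Python follows. The standard library's `json` module stands in for the external package so that the example runs anywhere; the names `Serializer`, `JsonSerializer`, `save_settings`, and `load_settings` are hypothetical and chosen only for illustration.

```python
import json
from typing import Any, Protocol


class Serializer(Protocol):
    """The generic interface the rest of the application codes against."""

    def dumps(self, data: Any) -> str: ...

    def loads(self, text: str) -> Any: ...


class JsonSerializer:
    """The only place that imports and calls the underlying library."""

    def dumps(self, data: Any) -> str:
        return json.dumps(data, sort_keys=True)

    def loads(self, text: str) -> Any:
        return json.loads(text)


def save_settings(settings: dict, serializer: Serializer) -> str:
    # Application code knows about the wrapper, not about the library.
    return serializer.dumps(settings)


def load_settings(text: str, serializer: Serializer) -> dict:
    return serializer.loads(text)


serializer = JsonSerializer()
stored = save_settings({"theme": "dark", "retries": 3}, serializer)
print(load_settings(stored, serializer))
```

If the underlying package ever has to be replaced, only a new class with the same two methods is needed; `save_settings` and `load_settings` stay unchanged.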

Watch & upgrade the dependencies – Your job is not over once you have included the dependency in your software. You must constantly watch the dependency for issues and new releases, and find an appropriate time to upgrade so that it stays current.

Conclusion

The large software development community and modern package managers have given you tremendous power to reuse software and accelerate your development. However, software reuse and dependency management come with their own perils and challenges. A cautious approach toward dependency management will ensure you make the best use of open-source software without getting into dependency hell.

Sagar Mudlage

Principal Engineering Manager | Technical Program Manager | Digital Products and Platform Engineering

2 years ago

Very well written..

Balaji Govindarajan

Principal - Enterprise Architecture | Digital Transformation, Platform Architecture

2 years ago

Good points. Establishing internal repository and SCA is also good practice to manage dependencies

Chandrashekhar Waghmare

Cloud and Data Leader@Motilal Oswal. AWS | Redshift | Databricks | Snowflake | Kubernetes

2 years ago

Too much of everything is worse

Atul Mangla

Senior Technical Leader (Engineering | Delivery) | Ex-Digit, Amadeus, BT, IBM | NIT Surat

2 years ago

Well written... esp the abstraction of dependencies is quite insightful.. thxs for that
