What You Should Know About Open-Source Software (at the Very Least)
This site is not affiliated with or endorsed by the Open Source Initiative. Visit https://opensource.org/ for more information.

This article gives an overview of what I believe is the minimal knowledge you should have if you are using or considering using open-source software (OSS). It is mainly based on knowledge that I have acquired over the past three years as a tech founder and open-source contributor.

Some background information

I’ve been an avid OSS user for more than 10 years, mainly using it for mathematical and statistical analysis in both private and business settings. In recent years, I have contributed to OSS and therefore become more familiar with aspects related to licensing, security and quality.

Open-source benefits

OSS enables us to achieve tremendous results without paying for software licenses. In fact, I believe that many fundamental technologies like operating systems, programming languages, database management systems and container orchestration have better open-source implementations than proprietary solutions.

Open-source caveats

OSS usually comes without warranty or support. Hence, it is entirely up to the user to assess the quality of the software and independently develop a sufficient understanding of how to use it. For widely used programs, this is usually not a big issue because community support and examples are readily available online. However, the opposite is true for lesser-known programs, and this introduces several significant risks.

Licensing

A commonly held misconception is that if some code is freely available online, everyone is allowed to use it exactly how they wish. However, this is very far from the truth. OSS requires licenses in the same way that proprietary software does. For example, there are cases where large companies have gotten into trouble for copying and pasting code from Stack Overflow, where user contributions are licensed under Creative Commons Attribution-ShareAlike terms rather than being free of conditions.

If someone makes their code publicly available, it is under exclusive copyright by default in most parts of the world. Hence, you are not allowed to simply copy code from the internet unless the author gives you permission to do so. By uploading the code to, for example, GitHub, the copyright holder accepts some terms of service. However, these do not include a license for you to use or distribute the code, but merely the right to view and fork their public code repository.

There are several nuances to OSS licenses, such as copyleft, that are out of scope for this article. These nuances are mainly relevant for software distributors. From the perspective of an OSS user, the main point is that you should ensure that there is a license and understand the limitations it imposes. Most OSS licenses simply prevent you from claiming that you wrote the code and, if you create derivative works, require you to credit the author properly by including their copyright and permission notices.
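As a first check that a license exists at all, the metadata of an installed Python package can be inspected with the standard library. The following is a minimal sketch, not a legal review: the function names license_summary and has_declared_license are my own, and a filled-in metadata field only tells you where to start reading.

```python
from importlib import metadata


def license_summary(dist_name: str) -> dict:
    """Collect the license fields declared in an installed package's metadata."""
    meta = metadata.metadata(dist_name)  # raises PackageNotFoundError if absent
    classifiers = [
        c for c in (meta.get_all("Classifier") or [])
        if c.startswith("License ::")
    ]
    return {
        "name": dist_name,
        "license_expression": meta.get("License-Expression"),
        "license": meta.get("License"),
        "classifiers": classifiers,
    }


def has_declared_license(summary: dict) -> bool:
    """Return True if any license field is filled in with a real value."""
    text_fields = (summary["license_expression"], summary["license"])
    # Older packaging tools wrote the placeholder "UNKNOWN" for missing fields.
    has_text = any(f and f.strip().upper() != "UNKNOWN" for f in text_fields)
    return has_text or bool(summary["classifiers"])
```

Calling has_declared_license(license_summary("some-package")) for each of your dependencies gives a quick list of packages that declare nothing. Those are the ones to investigate first, and even for the others you still need to read the license text itself.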

Security

I used to think that if a package was downloaded and installed using built-in package managers, it must be of high quality and have proper security. My reasoning was as simple as it was naive. I did not know how to package code and make it available through official repositories, so I assumed that people who did had to be highly skilled programmers and IT professionals. I also believed that since the code was open source, many people would be reading it and quickly spotting as well as fixing any security vulnerabilities.

After carefully examining and contributing to OSS, I quickly realized that my initial assumptions were incorrect. I now fully understand why IT departments rightfully have reservations about allowing people to install packages from, for example, the Python Package Index (PyPI). Python is currently the world’s most popular programming language according to rankings such as the TIOBE index, and PyPI has more than 500,000 packages. Anyone can make code instantly available on PyPI, making it practically impossible for PyPI maintainers to spot malicious code before it can be installed somewhere.

But how would someone find a package containing malicious code and actually install it? Attackers go to incredible lengths to fool someone into believing that a package is legitimate or to introduce malicious code into trusted packages. The interested reader can search the web to find examples. An easy-to-understand example is typosquatting: publishing a package whose name differs from a popular one by a common typo or by an added or removed letter (request vs requests).
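To make the typosquatting example concrete, here is a small sketch that flags a package name when it is suspiciously close to, but not the same as, a name you already trust. The KNOWN_PACKAGES list and the 0.85 similarity cutoff are assumptions chosen for illustration, and the function name is my own. A real check would need a much larger list and would still miss many tricks.

```python
import difflib

# Hypothetical list of names you trust; replace it with your own.
KNOWN_PACKAGES = ["requests", "numpy", "pandas", "scipy"]


def possible_typosquat(name, known=KNOWN_PACKAGES, cutoff=0.85):
    """Return the trusted name that `name` closely resembles, or None.

    An exact match with a trusted name is not suspicious, so it returns None.
    """
    normalized = name.lower().replace("_", "-")
    if normalized in known:
        return None
    # get_close_matches ranks candidates by difflib's similarity ratio.
    matches = difflib.get_close_matches(normalized, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(possible_typosquat("request"))   # requests
print(possible_typosquat("requests"))  # None
```

A hit does not prove that a package is malicious; it only tells you to double-check the spelling before installing.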

Although the most widely used OSS is likely to be secure, there are very real and significant risks to having full access to public repositories, simply because these contain so many packages and everyone can contribute to them. Hence, industries are forming around OSS security, for example, private repository solutions where IT departments can scan and approve packages before they can be installed on organizational devices. Unfortunately, these solutions are neither free nor self-managing; they require significant resources to operate and maintain. Another suggestion is to run OSS on servers that are completely isolated from organizational devices and only work with non-sensitive data.
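The idea of approving packages before installation can be illustrated in a few lines. This sketch assumes a hypothetical allowlist called APPROVED that maps package names to approved versions, and it only understands exact pins of the form name==version. Commercial private repository solutions do far more than this, such as scanning the package contents.

```python
# Hypothetical allowlist maintained by an IT department.
APPROVED = {
    "requests": {"2.31.0", "2.32.3"},
    "numpy": {"1.26.4"},
}


def unapproved_pins(requirement_lines, approved=APPROVED):
    """Return the requirement lines that are not exact, approved pins."""
    problems = []
    for raw in requirement_lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        name, sep, version = line.partition("==")
        name = name.strip().lower()
        version = version.strip()
        # Unpinned lines and unknown names or versions are all rejected.
        if not sep or version not in approved.get(name, set()):
            problems.append(line)
    return problems
```

Running this over the lines of a requirements file before installation returns everything that still needs approval. An empty list means every dependency is pinned to an approved version.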

Quality

I believe that a lot of OSS is of very high quality, especially the core technologies that I mention above. However, there is so much OSS, and publishing code is so easy, that you cannot assume a package is of good quality just because it is available for installation through built-in package managers or because it has many GitHub stars. I have seen repositories with thousands of stars that had no unit tests and quite clearly erroneous implementations. Remember that the number of stars can be manipulated and, beyond that, most people tend to assume that someone else has looked into the code and verified its quality. However, this is rarely the case, so you will most likely have to look and assess it yourself.

As a rule of thumb, you can assume that the further a package is from core computing technologies that are widely used in many applications across different fields, the more you are required to look into the code and assess the quality yourself. Hence, packages related to highly specialized areas far from core software engineering will most likely need quite careful attention, because unfortunately unit testing is not a skill that people are born with, and it seems that it is still insufficiently taught at universities.

Conclusion

Although OSS allows us to be much more productive, it is not a free lunch. It is unlikely to be able to solve all your problems, and it comes with its own limitations and risks. The security issues inherent to OSS need to be taken seriously, especially in the times that we currently live in. However, avoiding OSS entirely will most likely result in a large productivity loss. Hence, the best solution is to use OSS carefully and have realistic expectations about what it can and cannot do. For core technologies, you probably will be able to find a good solution. For highly specialized applications, you have to be lucky to find a good solution.
