
Leveraging third-party libraries and frameworks is essential in most modern software projects, and the projects we build at Art+Logic are no exception. The pressure on developers to rapidly deliver features is high, and there are so many commonalities in the details of each project (particularly in Web development) that a lot of development time can be saved by using well-designed libraries that handle the details.
Taking on an external project as a dependency also carries risks, however, and they range from minor to downright catastrophic. Third-party projects can:
- lose maintainers without warning, and consequently become insecure or nonfunctional through lack of updates,
- become compromised by malicious code which in turn compromises deployed environments or developer machines,
- lead to software bloat that negatively impacts application performance,
- cause maintenance problems by adding more API surface for developers to learn, and
- create painful upgrade/migration scenarios which consume developer resources.
In other words, while there's a lot to be gained, there's also a lot that can occasionally go wrong. A fate suffered by many a software project goes something like this:
- At the outset of the project, a few core dependencies are selected. They're vetted to be well-maintained projects that will remain so for a long time.
- As time goes by, further dependencies are added one by one, each solving a particular problem. Whether it's justified to add them or not, each adds a little more to build times, to application size, and sometimes to application latency. The dependencies are rarely updated, because doing so is behind-the-scenes maintenance work whose value is difficult to appreciate; it incurs development and testing costs without improving the end user experience.
- One day, a high-priority dependency-related issue arises. Perhaps the old version of a library stops working and breaks the application until it is updated, or perhaps a critical feature cannot be implemented without the features provided by a newer version of a dependency. The team works to upgrade that dependency.
- In order to upgrade that one dependency, other related dependencies need to be upgraded as well. Soon, many dependencies have been upgraded, and many code changes have been made in order to migrate to their new APIs. The process takes many hours, and introduces many regressions because it was done as one sweeping change rather than as individual upgrades that could have been tested more granularly.
- What looked like a simple bug or feature request turns into tens of hours of maintenance work, and stakeholders are unhappy.
There is a sweet spot somewhere in between "reinventing the wheel" and "dependency hell", and while it differs from project to project, it sits closer to the "reinventing the wheel" side than one might intuitively expect. That's because the positive effects of adding a dependency are usually very immediate ("I had a problem, now this package solves my problem"), whereas the negative effects are delayed (since they mostly relate to long-term maintenance). Consequently, over-use of third-party dependencies is a form of tech debt, quickly solving problems in the short term but causing larger ones in the long term.
In this post, we'll outline some rules to follow in order to minimize the risks posed by third-party dependencies, with the aim of ensuring the long-term stability of a project.
Rule #1: Minimize the Number of Dependencies
It may be obvious, but the fewer dependencies a project has, the lower the risk they pose. Here are some strategies to achieve this:
- Consolidate libraries: If one library can do a reasonable job at filling the role of two or more others, then prefer the single multipurpose library over the multiple specialized ones.
  - Sometimes the multipurpose option has limitations, or it is not as nice to use. However, it also means fewer library maintainers to rely on, fewer sources of documentation for developers to locate, and fewer package updates to manage.
- Borrow code: If the code being leveraged is small and self-contained, if the library's license allows for copying the code, and if that code has little reason to change in the future, then copy it into the project rather than linking to it as a dependency. We gain the functionality without needing to maintain another dependency.
  - This is obviously not appropriate when it's illegal. Check the library's license first.
  - This is not appropriate when the code has reason to change (in particular when its use has security implications). In that case, it's valuable to be able to continue accessing future updates from the upstream maintainer.
  - Ensure that any borrowed code is tagged with code comments indicating its origin, in case it needs to be understood in the future.
- Do it yourself: If the functionality is easy to implement yourself, then do it, rather than rely on a dependency.
  - Look out for icebergs; many things look easy from the surface but are full of complexity underneath and are best left to domain experts.
- Keep it simple: Take on dependencies when you must in order to deliver a maintainable product, not when they make only marginal improvements.
- Separate your runtime and development dependencies: Many package managers have features or workflows for listing runtime dependencies separately from development dependencies. Since development dependencies aren't deployed, they pose fewer risks, so we can be somewhat more liberal with them than with runtime dependencies.
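As a minimal sketch of the "borrow code" strategy, here is how a small borrowed helper might be tagged with its origin. The library name, URL, version, and license in the comment are hypothetical placeholders rather than a real project; the point is only the shape of the provenance comment.

```python
# --- Borrowed code ---------------------------------------------------------
# Origin:   "example-utils" (HYPOTHETICAL library name, for illustration only)
# Source:   https://example.com/example-utils (placeholder URL)
# Version:  copied from release 1.2.3 (placeholder)
# License:  MIT (placeholder; confirm that the real license permits copying)
# Changes:  renamed arguments; no behavioural changes
# Why:      small, self-contained, no security implications, unlikely to change
# ---------------------------------------------------------------------------
def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


print(chunked([1, 2, 3, 4, 5], 2))  # [[1, 2], [3, 4], [5]]
```

Recording the version and license alongside the origin makes it easier to check later whether upstream has since fixed something that the copy is missing.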
Rule #2: Audit Dependencies
All projects ought to be audited before being added as a dependency: this means reviewing the project to ensure that it is secure, well-implemented, and well-maintained.
For npm and PyPI packages, Snyk Advisor is a useful starting point for assessing a package's health. It algorithmically scores a package based on things like reported security vulnerabilities, frequency of maintenance, number of contributors, and number of downloads.
Here is a checklist for auditing a candidate (or existing) dependency:
- Security
  - ☐ Has no history of security vulnerabilities, OR past security vulnerabilities have been taken seriously and patched promptly.
  - ☐ Has a documented policy for reporting and responding to security issues.
- Popularity
  - ☐ Has a healthy popularity (e.g. downloads per month).
    - This is a proxy for how much demand there is to continue maintaining the project, and how likely it is that it would be forked by the community if the current maintainers were to disappear.
- Maintenance
  - ☐ Has a steady pace of development up to present-day.
    - Check for projects which may look active due to historic popularity, but which are no longer being updated.
  - ☐ Has a steady pace of releases.
    - Note that this does not follow automatically from a steady development pace (e.g. devs plugging away for years at a shiny new rewrite).
  - ☐ Has a history of prompt handling of bug reports and feature requests.
  - ☐ Respects semantic versioning policy.
    - It's a major issue if minor/patch releases can't be trusted not to have breaking changes.
  - ☐ Provides thorough migration guides for major version upgrades.
    - It's a major issue if you can't update a major version number with confidence.
  - ☐ The source code looks like you could maintain or extend it if you needed to.
    - You might need to, if the library gets abandoned someday.
- Community
  - ☐ Has a healthy number of contributors.
    - Use a tool like GitHub's "Insights" tab to view contributor statistics.
    - The definition of "healthy" will depend on the project, but generally you want to see that the project owner accepts contributions from others, and that those contributions make up a meaningful portion of the codebase.
  - ☐ Has good developer documentation to assist new contributors.
    - This shows interest in having many contributors rather than a one-man show, and also helps you if you ever have to pick up an abandoned project.
  - ☐ Has funding (e.g. corporate sponsorships or donations).
    - A project with funding is more likely to stick around than one that's maintained by hobbyists.
If more than one of those boxes is unchecked, then the package is at higher risk of posing future security or maintenance problems, and consequently it should be avoided if a better alternative exists, or else the risks should be communicated to project management and/or stakeholders as appropriate.
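To make the rule of thumb above concrete, here is a minimal sketch of how an audit could be recorded and evaluated. The `AuditResult` class, the checklist labels, and the package name are all invented for illustration; a real audit still depends on human judgment for each box.

```python
from dataclasses import dataclass, field


@dataclass
class AuditResult:
    """Outcome of auditing one candidate dependency against the checklist."""
    package: str
    # Maps a checklist item (free-form label) to True if the box is checked.
    checks: dict = field(default_factory=dict)

    def unchecked(self):
        """Return the labels of all unchecked boxes."""
        return [label for label, ok in self.checks.items() if not ok]

    def is_higher_risk(self):
        """Rule of thumb: more than one unchecked box means higher risk."""
        return len(self.unchecked()) > 1


# "left-padder" is a made-up package name used only for this example.
audit = AuditResult(
    package="left-padder",
    checks={
        "security: vulnerabilities patched promptly": True,
        "security: documented reporting policy": False,
        "maintenance: steady pace of releases": False,
        "maintenance: respects semantic versioning": True,
        "community: healthy number of contributors": True,
    },
)

if audit.is_higher_risk():
    print(f"{audit.package}: higher risk; unchecked: {audit.unchecked()}")
```

Keeping the list of unchecked items, rather than only a pass/fail result, gives the team something specific to communicate to project management or stakeholders.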
Rule #3: Keep Dependencies Up-to-date
Typically, a project's dependencies should be updated on a regular cadence. This is not just to receive security patches and other bugfixes, nor is it so that developers can have the newest and shiniest features; it's to prevent the situation described earlier in this post where out-of-date dependencies need to be suddenly and immediately made current, a long and error-prone process that can grind a project to a halt.
Like other forms of technical debt reduction, however, it can be difficult to convince stakeholders of the importance of budgeting some time away from immediate end product improvements and towards keeping dependencies up-to-date. At Art+Logic, we approach this topic with our clients by:
- Building the client's trust in us, by establishing a track record of rapidly delivering them high-quality results. When a client trusts that we're not out to talk them into things they don't need, they're more likely to be interested in hearing the team's maintenance priorities.
- Communicating the trade-offs of the chosen development pattern. It's not necessarily a bad decision for a project to prioritize feature development over reducing technical debt or risk; what's important, however, is that the client understands the trade-off they're making.
- When possible, agreeing upon a regular maintenance budget that the team can use for, among other things, keeping dependencies up-to-date. This relieves both the team and the client from having to routinely negotiate maintenance efforts, and it serves as an explicit agreement between the team and the client of what the balance should be between short-term features and long-term stability.
As far as the procedure goes for applying updates, the focus should be on doing it as atomically as possible; that is, to do it in small steps and return to a functioning application in between each step. If everything is updated all at once, then in the likely scenario that this breaks the application until some fixes are made, it is much harder to identify the root causes of each bug. A little time spent recompiling and smoke-testing the application in between each upgrade saves a lot of time spent struggling to get a broken application working again.
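The procedure described above can be sketched as a simple loop. This is only an illustration under stated assumptions: `apply_upgrade`, `smoke_test`, and `revert` are hypothetical hooks that a team would implement with its own package manager and test commands, and the package names in the simulated run are invented.

```python
def upgrade_one_at_a_time(packages, apply_upgrade, smoke_test, revert):
    """Upgrade packages individually, smoke-testing after each step.

    All three callables are supplied by the caller (hypothetical hooks):
      apply_upgrade(name): upgrade a single package
      smoke_test():        return True if the application still works
      revert(name):        undo the upgrade of a single package
    Returns (upgraded, needs_attention) lists of package names.
    """
    upgraded, needs_attention = [], []
    for name in packages:
        apply_upgrade(name)
        if smoke_test():
            upgraded.append(name)
        else:
            # The failure is attributable to exactly one upgrade, so the
            # root cause is easy to find. Restore a working state and move on.
            revert(name)
            needs_attention.append(name)
    return upgraded, needs_attention


# Simulated run: we pretend that upgrading "lib-b" breaks the application.
installed = set()
result = upgrade_one_at_a_time(
    ["lib-a", "lib-b", "lib-c"],
    apply_upgrade=installed.add,
    smoke_test=lambda: "lib-b" not in installed,
    revert=installed.discard,
)
print(result)  # (['lib-a', 'lib-c'], ['lib-b'])
```

Because the application returns to a working state after every step, a failed upgrade can be set aside and investigated on its own instead of blocking the rest.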
Final Thoughts, and Language-specific Advice
Often, the biggest challenge with managing the dependencies of a software project is just finding the time or the budget to do it. Since the consequences of letting dependencies gather dust for too long can be so severe, however, it's worth finding a way. It's also important for teams to always be asking themselves the question: "do we really need this library?" Keeping the dependency list small is the easiest way to avoid complications further down the road.
Each language and ecosystem has its own tools which can help with these efforts. We'll close out here with a few recommendations for programming languages we frequently work with:
JavaScript
- Use `npm-check-updates` to conveniently check for package updates. It has a `--upgrade` option which writes all version updates to `package.json` automatically; it's often worth giving that a shot to see if you're lucky enough to have no issues, and then if there are breakages simply revert it and apply the updates one by one.
- Get in the habit of using `npm ci` as your default alternative to `npm install`. The `npm install` command is commonly used in order to make locally installed packages match what's defined in the repository's `package*.json` files, but `npm install` can rewrite the `package-lock.json` file as a side effect (for example, when the lockfile is out of sync with `package.json`). This is typically not something you want to happen implicitly! `npm ci` will only install exactly what's in `package-lock.json`, and exits with an error if the two files disagree, leaving upgrades to happen as a separate and explicit action.
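When applying updates one by one, it can help to decide on an order first. The sketch below is one possible heuristic and not a feature of any tool mentioned here: it assumes the packages respect semantic versioning, classifies each pending update as a patch, minor, or major bump, and schedules the lowest-risk ones first. The package names and version numbers are invented stand-ins for a real update report.

```python
def parse_version(text):
    """Parse a plain 'MAJOR.MINOR.PATCH' string into a tuple of ints.

    Pre-release and build suffixes (e.g. '1.0.0-beta.1') are not handled.
    """
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def classify_bump(current, latest):
    """Return 'major', 'minor', 'patch', or 'none' for a version change."""
    cur, new = parse_version(current), parse_version(latest)
    if new <= cur:
        return "none"
    if new[0] != cur[0]:
        return "major"
    if new[1] != cur[1]:
        return "minor"
    return "patch"


# Invented data: package name -> (installed version, latest version).
pending = {
    "pkg-three": ("1.8.4", "3.0.0"),
    "pkg-one": ("4.17.20", "4.17.21"),
    "pkg-two": ("2.3.0", "2.9.1"),
}

bumps = {name: classify_bump(cur, new) for name, (cur, new) in pending.items()}

# Apply the lowest-risk upgrades first, and leave major versions for last.
order = ["patch", "minor", "major"]
plan = sorted(
    (name for name, bump in bumps.items() if bump != "none"),
    key=lambda name: order.index(bumps[name]),
)
print(plan)  # ['pkg-one', 'pkg-two', 'pkg-three']
```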
Python
- Consider using Poetry as a package manager, rather than a `pip freeze > requirements.txt` approach. `pip freeze` lumps transitive dependencies in with direct dependencies, making it impractical to go back and determine which packages are still in use. Poetry, on the other hand, uses a "lockfile" approach, where there are two files: the manifest defining the desired packages (`pyproject.toml`), and a separate lockfile that specifies the exact version to install for each package (`poetry.lock`). The lockfile provides the reproducibility needed for deployments, while the manifest file preserves your intentions, making it easy to review dependencies or to automatically upgrade them.
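The difference between the two approaches can be illustrated with plain Python data. This sketch does not read real `pyproject.toml` or `poetry.lock` files; the dictionaries below are simplified stand-ins, and the version numbers are illustrative only.

```python
# Direct dependencies, as they might be declared in a manifest (illustrative).
manifest = {"requests": "^2.31"}

# Everything pinned in a lockfile: direct AND transitive (versions illustrative).
lockfile = {
    "requests": "2.31.0",
    "certifi": "2023.7.22",
    "charset-normalizer": "3.2.0",
    "idna": "3.4",
    "urllib3": "2.0.4",
}

# A `pip freeze`-style listing contains only the flattened, pinned set, so
# nothing in it says which packages were asked for on purpose.
frozen = [f"{name}=={version}" for name, version in sorted(lockfile.items())]

# Keeping the manifest lets us separate intent from resolution.
direct = sorted(name for name in lockfile if name in manifest)
transitive = sorted(name for name in lockfile if name not in manifest)

print("frozen:    ", frozen)
print("direct:    ", direct)
print("transitive:", transitive)
```

With only the frozen list, removing `requests` from the project would leave four orphaned pins behind with no record of why they were there; with a manifest, they disappear the next time the lockfile is regenerated.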