Upgrading Node Stack
Table of Contents
Job
Dealing with a dependency hell is a dirty job.
It probably got its name directly from how fun that activity is. Years ago I heard description that something is like a chewing on a broken glass, but slightly less entertaining. It would fit.
Some time ago I was tasked with upgrading libraries for a medium-sized project. It wasn't legacy by any means, as dozens of developers divided between the teams were working on it daily. I wouldn't describe structure as trivial. It was neatly split into parts which kept dependencies between them and had adequate volume of source for the role it was fulfilling.
Task's definition of a goal was simple - reach React 18. Without getting into the details - request made sense. It provided some features that'd be beneficial for project but also could resolve some long standing issues.
One of the worst of the tasks to get.
It's not challenging intelectually - one has to upgrade, do some corrections and work is, seemingly, done. But there are few reasons which make it somewhat sub-fun to work on:
- One-shot delivery
- Uncertainty, so it cannot be estimated
- Hard to report progress
- Almost impossible to parallelize
- Depending on project velocity and change scope - can conflict with current development efforts
- Might require arbitrary architectural decisions
- Changes are tedious and repetitive
- Problem waterfalling (i.e. only after fixing issue A issue B becomes visible)
- Roulette of an outcome, especially with long test suites
- NO-OP for business
Sensible approach
There are 2 methods I've used in the past with reasonable success.
First one - lucky shot. Upgrade everything, fingers crossed, timebox on and hope for the best. With reasonably small library "lag" (up to two years) or within some upgrade friendly technologies (like Ruby on Rails) it was working well. It reminds me somewhat of a shaking cables technique. You pick tangled cables and shake them. Often you can untangle many.
Second one - educated shot. Similar, but either tries to pin-point versions of the foundational libraries or upgrade one at a time. Few shots might have to be made. With similar comparison as above - it's like trying to figure out some of the tangles and hope that by removing them from the problem space other will resolve by themselves.
Surprise
Neither did work.
During "library herding" I discovered that oldest pieces were more than 5 years old. Not all, as some were almost up to date. They were those, which hadn't had big web of dependencies with either other library or other part of the project.
It wasn't secret that some libraries were old and recon did bring some information about prior tries.
As one can imagine: 5 years is a lot in JavaScript world. During my research I found out that most very popular libraries introduce major version every 2 years. Those libraries often interact with each other in one way or another (e.g. set of Emotion, React, React-Router and Babel).
And so with given "lag" I had, on average, two big leaps to made. Migration were well documented and usually required replacing import with another one or one construct in the other. Something that I could and did automatize to some degree with help of my trusty Emacs. Still a lot of legwork.
I begun my attempts. Whatever angle I took I ended up with hundreds of files in changelist and half-working project. Not good.
As I dove deeper it got worse: some libraries were incompatible or sunset entirely. They wouldn't work with updated stack, had no alternatives and required appropriate refactoring. Some required half the file to be changed in order to be fixed.
I still pushed through. When I saw, what I thought, light in the end of the tunnel I was long behind development branch with half the code turned upside down.
Un-roll
I strongly believe that when things go sideways there is only one thing reasonable engineer can do: talk to direct manager.
Not only it's common courtesy, letting them know there's a problem. It's way easier for manager to shift resources when deadline is 3 months away than when it's in 3 days. Managers (especially experienced ones) have a lot of tricks in their sleeves and while they might not add to engineering side of things they might add resources or point to right direction.
We spoke about problems I was experiencing: uncertainty, conflicts with current development, lack of confidence in change. Sure, I could've push through given enough time, but in the end no one would want to merge it in. I hadn't had any ideas except for the one I loathed: full feature rewrite into upgraded stack.
We discussed options, like adding more people (work wouldn't be divisible), getting resources to speed up the loop (wasn't critical issue), rotational shifts and hoping for luck etc. At some point he asked: cannot we simply split work into the milestones?
Obviously - we should do that. All the evidence was clear. How one can split work when picking up one piece pulls behind the another, though? If that'd be done periodically then the leap wouldn't be as big and hard to make. So I started day-dreaming of steps that could've been taken if upgrades were done bi-yearly.
Idea
And then it struck me. I told manager that I have an idea and jumped off the call.
In any given moment in time stack was compatible with itself, I thought. No one would release version that'd be simply unusable. So, instead of aiming toward the version, I should aim toward the point in time.
I verified quickly and confirmed that NPM Registry provides exact date of release specific version. I could - with some work, have stack "locked in time" for specified date. Like traveling in a time machine.
I called my manager and communicated solution:
- Upgrade would move in time steps, bigger or smaller depending on upgrade difficulty
- Work estimate would still be hard to figure out, but more plausible depending on where we are
- Progress would be fully known between the steps (we'd have precise date where we are in time)
- Some work could've be duplicated
- Total effort would be greater than one-shoting
- …but we'd gain higher stability and confidence as well as less development conflicts
- Nice side effect was that I wouldn't have to spend N-months working on it, as it could've been done in batches
The manager consulted the idea and it was accepted. Soon I began with initial step, amused by how I was going to be a time traveler.
Implementation
Idea was simple:
- Take current
package.json
file - Using data from NPM Registry lock version to either:
- Latest stable (no RC, alphas etc.) version at given
DATE
package.json
version if newer than stable
- Latest stable (no RC, alphas etc.) version at given
- Install, verify, fix if needed, commit and release (also known as: rest of the owl)
Doing this by hand wasn't an option so I wrote script to do that for me. It took some edge conditions to fix (package didn't exist at the date?) but worked fine enough to be usable.
There were couple steps. I didn't constrain myself in neither direction, but if I saw that 6 month step touched only minor versions I increased the length. Some steps required minimal amount of work (less then a day and mostly cosmetic sed
-powered changes), and some required more work but it'd be mostly "how to fix that" rather than "modify 500 files".
Then there were irreplaceable libraries. Nothing is irreplaceable: due to isolating the change, some determination and help of open source I pushed through.
Still, a lot of legwork, but at least manageable. In the end the goal was reached on deadline.
Afterthought
Even though today I think that with these 3 mentioned techniques any project can be updated I believe it's a thankless, difficult job.
Some tasks are hard, because they require obscure knowledge or specific capabilities. Some tasks are hard, because projects suffer from many complex issues and software debt is bringing everyone down.
With upgrade above tricks are helpful but in the end it's all about menial work and determination. And also:
-
Hadn't I had complex macros for (interactive) code replacement I would drown in — soul-sucking, boring, tedious and repetitive — changes.
-
Hadn't the project had extensive unit and E2E tests I'd made buggy release. It wasn't rare that my branch was declined due to some issue that wasn't caught by unit test suite.
But I did it — I left the dependency hell. And as with all tough things done in the past: it was fun in there.
Post Scriptum
Script mentioned above had its issue. I had an idea to do something in Rust and decided to recreate it: https://github.com/exlee/npm_time_machine