What if we had more finished libraries?

Mark Jason Dominus has a piece from a while ago in which he says the following:

I released the Text::Template module several years ago, and it was immediately very successful. It’s small, simple, fast, and it does a lot of things well. At the time, there were not yet 29 templating systems available on CPAN.

Anyway, the module quickly stabilized. I would get bug reports, and they would turn out to be bugs in the module’s users, not in the module; I would get feature requests, and usually it turned out that what the requester wanted was possible, or even easy, without any changes to the module. Since the module was perfect, there was no need to upload new versions of it to CPAN.

But then I started to get disturbing emails. “Hi, I notice you have not updated Text::Template for nine months. Are you still maintaining it?” “Hi, I notice you seem to have stopped work on Text::Template. Have you decided to abandon this approach?” “Hi, I was thinking of using Text::Template, but I saw it wasn’t being maintained any more, so I decided to use Junk::CrappyTemplate, because I wanted to be sure of getting support for it.”

I started wondering if maybe the thing to do was to release a new version of Text::Template every month, with no changes, but with an incremented version number. Of course, that’s ridiculous. But it seems that people assume that if you don’t update the module every month, it must be dead. People seem to think that all software requires new features or frequent bug fixes. Apparently, the idea of software that doesn’t get updated because it’s finished is inconceivable.

I blame Microsoft.

In my post about really boring programming language fantasies, I expressed a desire for more things in the standard library. A lot of people objected that the standard library is where software goes to die.

And… this is I guess somewhat true, but really the standard library is where software goes when it’s stable, and programmers seem to have lost their taste for stable software. After all, you’re not really living unless your dependencies are constantly shifting under you in backwards incompatible ways, right?

But… I share the fear. Because basically I don’t believe you can finish software. I mean sure it’s possible that MJD wrote bug free software. For such a constrained problem I’m definitely not ruling it out (but, on the other hand, text is terrible, and I’ve certainly written incredibly buggy solutions to highly constrained problems). But also he was writing in Perl.

And I don’t mean that pejoratively. As Avdi put it in another piece, we’re still catching up to Perl. Perl is amazing for backwards compatibility of language releases. In comparison, Python is terrible. For example, Hypothesis doesn’t currently run on Python 3.6, because a function I depended on was removed from the standard library: it had been deprecated in favour of a function that doesn’t exist on Python 2.7. Thanks. (I haven’t fixed this yet, not because I’m not going to, but because Python 3.5 isn’t even out yet, so this is not high on my priority list.)

And from the outside it’s often very hard to tell the difference between a finished and an abandoned library, but the balance of probability tends to be towards the latter. There are a lot of abandoned libraries and very few that haven’t had any bugs in them that needed fixing in the last four years.

Part of what motivates this is that I’ve run into a bunch of libraries that from an API point of view “should” be finished: they have a very small surface area, are widely used, and haven’t been updated in a while. Unfortunately they’re instead either abandoned or struggling to find enough time to actually put out a new release. For example, bitarray is a very nice library that sadly you should never use, because it’s abandoned and has some known and serious bugs that have not been fixed and probably never will be. Naturally it’s also widely deployed.

This is frustrating. The end of life for software should be stability, not bitrot, and I’d like a way to fix that. Here’s the way I’m currently speculating about:

It sure would be nice if there were an organisation which existed to take on maintainership of finished libraries. A library maintained by that organisation gives you the guarantee that bugs will be fixed, it will be ported to new versions of the language, the documentation will be improved, etc., but no new features will be added and the existing API will remain backwards compatible for basically all time. Any incompatible changes to the library are purely functional: they create a new library rather than mutating the old one.
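To make the "no new features, no breaking changes" guarantee a bit more concrete, here is a minimal sketch of how two releases could be compared mechanically. It assumes the public API is just the module-level names that don't start with an underscore (or whatever `__all__` lists), and that a changed call signature counts as a change even when it would be backwards compatible. The helper names `public_api` and `api_changes` and the two in-memory example modules are made up for illustration; this is not an existing tool.

```python
import inspect
import types


def public_api(module):
    """Map each public name in a module to its signature (or its type name)."""
    names = getattr(module, "__all__", None)
    if names is None:
        names = [n for n in vars(module) if not n.startswith("_")]
    api = {}
    for name in names:
        obj = getattr(module, name)
        if callable(obj):
            try:
                api[name] = str(inspect.signature(obj))
            except (TypeError, ValueError):
                # Some builtins and C functions expose no signature.
                api[name] = "<callable>"
        else:
            api[name] = type(obj).__name__
    return api


def api_changes(old, new):
    """Compare two API snapshots: which names were removed, changed or added."""
    removed = sorted(set(old) - set(new))
    added = sorted(set(new) - set(old))
    changed = sorted(n for n in set(old) & set(new) if old[n] != new[n])
    return {"removed": removed, "changed": changed, "added": added}


# Two stand-in "releases" of a hypothetical library, built in memory.
v1 = types.ModuleType("example_v1")
exec("def render(template, values): return template", v1.__dict__)

v2 = types.ModuleType("example_v2")
exec(
    "def render(template, values, strict=False): return template\n"
    "def render_file(path): return path\n",
    v2.__dict__,
)

report = api_changes(public_api(v1), public_api(v2))
print(report)
```

For a library that is finished in the sense above, all three lists would be empty from one release to the next. Here the second "release" both changes `render` and adds `render_file`, so it would have to become a new library instead.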

Sketch rules for how it could work (I’m imagining this being for Python, but it’s pretty portable to other languages):

  1. There is a corresponding GitHub organisation containing all libraries the organisation maintains.
  2. There is a shared email address which has ownership of the libraries on PyPI (or whatever your package repository is).
  3. Anyone can and should feel free to work on any given project. Members of the organisation are actively encouraged to do so. A pull request requires sign-off from at least one, ideally two, members of the organisation.
  4. Any library maintainer can ask for the organisation to take ownership of their library. The organisation puts it to an internal vote, and if the vote passes the project is transferred to the organisation. This should usually come with an invite to add the maintainer to the organisation too.
  5. People who want to join the organisation can ask to do so and it is again put to a vote.

Minimum criteria for libraries to be included:

  1. Must be under an OSI approved license.
  2. Must be at version 1.0.0 or later.
  3. Must have some threshold level of usage (a library nobody is using is probably not finished because you’re almost certainly wrong about the requirements).
  4. Must run on all of some required set of versions of the language (I’d pick CPython 2.7 and 3.4 and PyPy for Python).
  5. Must not have any dependencies on libraries that are not maintained by the organisation (testing dependencies are totally fine).
  6. The latest release must not have introduced any new features or breaking changes.
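As a rough sketch, the criteria above could be written down as a checklist that the organisation runs against a candidate library. Everything here is an assumption made for illustration: the `Candidate` fields, the download threshold of 1000, the runtime labels and the example library names are all invented, and a real version would need to pull this data from the package index and CI rather than take it on trust.

```python
from dataclasses import dataclass

# Assumed values; the post only says "some threshold" and "some required set".
REQUIRED_RUNTIMES = frozenset({"cpython2.7", "cpython3.4", "pypy"})
MIN_MONTHLY_DOWNLOADS = 1000


@dataclass
class Candidate:
    name: str
    osi_approved_license: bool
    version: tuple  # e.g. (1, 4, 2)
    monthly_downloads: int
    runtimes: frozenset  # runtimes the test suite passes on
    runtime_dependencies: frozenset  # testing dependencies are excluded
    latest_release_added_features: bool
    latest_release_broke_api: bool


def failed_criteria(lib, maintained_by_org):
    """Return the list of inclusion criteria that a candidate fails."""
    failures = []
    if not lib.osi_approved_license:
        failures.append("license is not OSI approved")
    if lib.version < (1, 0, 0):
        failures.append("version is below 1.0.0")
    if lib.monthly_downloads < MIN_MONTHLY_DOWNLOADS:
        failures.append("usage is below the threshold")
    missing = REQUIRED_RUNTIMES - lib.runtimes
    if missing:
        failures.append("does not run on: " + ", ".join(sorted(missing)))
    outside = lib.runtime_dependencies - maintained_by_org
    if outside:
        failures.append(
            "depends on libraries outside the organisation: "
            + ", ".join(sorted(outside))
        )
    if lib.latest_release_added_features or lib.latest_release_broke_api:
        failures.append("latest release added features or broke the API")
    return failures


# Hypothetical candidates: one that qualifies and one that is still moving.
finished = Candidate(
    "tinytemplate", True, (1, 4, 2), 5000,
    frozenset({"cpython2.7", "cpython3.4", "pypy"}), frozenset(), False, False,
)
moving = Candidate(
    "shinything", True, (0, 9, 0), 5000,
    frozenset({"cpython3.4"}), frozenset({"somelib"}), True, False,
)

print(failed_criteria(finished, maintained_by_org=set()))
print(failed_criteria(moving, maintained_by_org=set()))
```

The first call prints an empty list; the second reports four failed criteria (version, runtimes, outside dependency, and a feature-adding release).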

This is not really a fantasy idea. I think it’s quite practical, and I would be prepared to be a founding member of this organisation if other people would be interested in joining it. However, I don’t currently have any finished libraries to contribute (Hypothesis might be finished at some point, but it probably won’t be, e.g. I’ll likely be adding new and novel strategy combinators forever). Do you?


13 thoughts on “What if we had more finished libraries?”

  1. Zeth

    Interesting post. Things will always change but there are only
    so many times that one can face the “dependencies are constantly
    shifting under you in backwards incompatible ways” issue before
    one becomes a minimalist.

    I basically seem to have devolved into a working set of software
    which is whatever libraries are on the default Debian install
    plus the Python Standard library plus nginx. Any dependency
    outside the set has to really bring something good for me to
    consider using it. It has to be a central part of the product
    basically. Before you bring on a dependency, you should ask
    yourself, “if the proverbial stuff hits the fan, could I maintain
    this myself?”.

    I think there is certainly scope for outer rings of stability
    around the standard library, and in a way you have that already.
    However, a lot of Python libraries are very dependent on some
    other piece of software – i.e. they are bindings for it, so they
    can never really be fixed.

    Getting away from my web development mindset, as the free
    software OS gets tidied up and made more consistent, e.g. with
    systemd etc., there is certainly scope at some point down the
    line for there to be a comprehensive Python system library or
    something (for the Linux platform at least).

  2. Paul Moore

    This seems like a great idea. I would happily be part of such an organisation (with the proviso that I have little time – but it strikes me that this is an ideal project for people with little available time…)

    Some thoughts:

    1. The organisation should offer some common set of “supported platforms”, which would *replace* the list of supported platforms for the libraries it adopts (being adopted may imply loss of support for Python 2.6, for example). I would suggest the Python versions and platforms supported by Python/pypy itself as a basis.
    2. The biggest chunk of work would likely be porting to newer Python versions. And hopefully that wouldn’t be an onerous task.
    3. For projects with extension modules, the organisation should have a policy on whether they provide binary wheels. I would say they should, as otherwise they would need to offer build support. That means a need for a certain level of Windows knowledge within the organisation (to build those wheels).


    1. david Post author

      Yeah, I agree that this would actually be quite a good project for people who want to contribute to open source but don’t have a massive amount of time. It spreads the workload around quite well.

      I totally agree about the set of supported platforms. I’d probably like some sort of fine grained structuring here, e.g. with RFC style must/should/could/should not.

      The way I would personally split this:

      MUST SUPPORT: Windows, Linux, OSX. CPython 2.7, CPython 3.4+. pypy.
      SHOULD SUPPORT: Other *nix operating systems
      COULD SUPPORT: pypy3 (this is currently annoying because pypy3 is on 3.2. I am hopeful that this situation will improve), CPython 2.6, Jython, IronPython.
      SHOULD NOT SUPPORT: CPython < 2.6.

      Note that the intended meaning of "should not support" here is "no official support". It's not required that the project actively refuses to run on those versions, just that it is not officially supported and work to support them is very unlikely.

      RE wheels: I think mandating binary wheels is a great idea. I have too little Windows or packaging related knowledge to comment much on this, though.

      In general I think it would be useful to standardize as much of the testing and build process across the projects maintained by the organisation as possible. One slight complication here is the question of how to build C extensions into such a thing at all because pypy. This suggests that having a blessed set of allowed external dependencies such as CFFI is important.

  3. Franklin Chen

    One heuristic that would help people determine whether a library was good and stable vs. abandoned and suspect would be a community-supported regular build (on many versions of whatever compilers are relevant) along with a reasonable test suite (written by the author or contributed by the community).

    1. david Post author

      Yeah, although travis and similar more or less achieve that, which makes this pretty much a subset of my standard heuristic of “If the last commit isn’t recent, how many reasonable looking unanswered pull requests and issues are there on the github page?” :-)

    1. david Post author

      Yeah, there are a lot of similarities to the apache foundation. I think the scope is different (and smaller, thus more manageable by a handful of individuals) though: I really intended this concept to be for smaller libraries which have reached a point of full maturity. Apache projects don’t seem to have quite such strong stability requirements – they’re still growing, living, things, they’re just maintained by the foundation and considered fairly mature.

  4. Pingback: David R. MacIver on finished libraries | Virtuous Code

  5. Rafael

    If there is an issue tracker with plenty of users and no important open issues, then it’s just stable. If it has plenty of open issues, it is abandoned. If there are few users and no issues, then you are on your own. Maybe open an issue? :)

  6. @ndy

    So like a “stable distributor”: like the Debian project but specifically for library code rather than end user code?

    That’s definitely interesting, especially from the point of view you tell it.

    …but I think that there’s another side to the coin: very few people are actually good at picking dependencies, let alone acknowledge it as a specific task that needs to be done as opposed to something that “just happens”.

    > And from the outside it’s often very hard to tell the difference between a finished and an abandoned library, but the balance of probability tends to be towards the latter.

    I think the difference is in the eye of the beholder. If it does what is needed and either fulfills your roadmap entirely or is simple enough to maintain then it’s “finished” or at least an appropriate choice for your project. However, it takes a couple of hours and quite a lot of experience and insight to actually sit and assess a dependency with that kind of qualifier. Putting aside security vulnerabilities for now, I prefer to put that effort in up front and know that I’ll never *have* to change that library rather than quickly picking something and then always having to adapt to the latest and greatest that is thrust upon me. I don’t think most people are like that and I don’t think most people understand that it’s an “investment” in technical debt or how to manage such investments.

    So we need a “dependency distributor” but we also need to talk more about dependencies and how to pick them.

  7. Pingback: A survey of Quickchecks | David R. MacIver

  8. Pingback: Finished Libraries - Push cx
