Making Hypothesis sustainable

You might have noticed Hypothesis is atypically good for an open source library of its age and size. There’s a very simple reason for this, and it’s not that I’m brilliant, it’s that at this point I’ve sunk about 6 months of full time work into it.

The reasons for doing so are complicated and not very relevant, but the main point is that I’m making very little money off doing so, to a degree where this is obviously not something I can continue doing indefinitely and really should have stopped months ago.

But this is ridiculous. I’ve written high quality software which at this point I’m pretty sure is collectively saving people significantly more money than paying me a full time salary would involve. If I can’t figure out a way to make money out of that, something is clearly broken (well, many things are clearly broken, but it would be nice if this weren’t one of them).

The original plan was to sell training and consulting, but either I’m not talking to the right people or this is a very hard sell (or both). So far this is looking like it will be in the category of “Provides a nice top up but isn’t even close to enough to sustain development”.

So I would like you to do two things:

Thanks in advance.

This entry was posted in Hypothesis, Python, Uncategorized on by .

Heuristics are for learning, not for teaching

Often complex bodies of knowledge and issues where the answer is “it depends” are boiled down to simple rules.

  • Look both ways before crossing.
  • Wear a helmet when cycling.
  • Don’t eat too much sugar (or is it fat we’re not supposed to eat now? Or processed foods? Are vegetables bad for you now?)
  • Correlation is not causation.
  • Don’t compare floating point numbers for equality.

You can probably think of many others.

These rules are useful. They provide you with handy heuristic shortcuts that you can remember and follow without having to load the whole complex set of circumstances back into your head. They’re things to bear in mind and follow when the cost of thinking about it is greater than the cost of just following the rule.

But that’s all they are. They are useful heuristics. They are there to help you in your daily activities, not to be preached as gospel.

And what this means is that if you think you see someone who is not following the heuristic you do, that’s OK. You should at most say “hey so here’s a heuristic I use in these circumstances. Maybe you would find this useful?”.

If you instead say “You must always follow this heuristic and you are a bad person and should feel bad for not doing so” then you have forgotten the purpose of your heuristic and you are a bad person and you should feel bad.

There may be circumstances in which you are in a position of consensually teaching someone about something and you knowingly say “Here is a heuristic that is not always true, but it is useful and teaching you the full complex set of actual rules would take way more time than it’s worth to you right now”. That’s OK. That is not the circumstance you find yourself in with strangers.

I would like “assume competency until proven otherwise” to be a basic rule of courtesy. I know many people aren’t competent, but treating someone like they’re incompetent is a fast track to an extremely unpleasant interaction, and saying “Have you considered that maybe you should unerringly follow this simple reductive rule instead of the complex picture of the situation you thought you had?” is one of the easiest ways to achieve that. Don’t do it.

PS. Attempting to boil this post down to simple reductive rules in order to annoy me is extremely predictable and not actually funny. Don’t do that either.

This entry was posted in Uncategorized on by .

A wish list for programming languages

Once upon a time I was really into programming languages. I cared a lot about Scala and Haskell, I was interested in all sorts of weird languages (Shout out to anyone else who has used Nice or Clay).

These days, eh. I mostly write Python. Some C. I could do C++ if I had to but I generally don’t have to. I’ve considered checking out Julia but, well.

It’s not that I’m no longer interested, and it’s certainly not that I’m against exotic languages, or no longer care about type systems. I’m very glad that there are other people who still actively pursue these things, as the current state of programming languages is pretty piss-poor and I would like it to be better, but I find that these days I lack the energy to care and my priorities in a language have shifted to things that are more… pedestrian.

And yet somehow still really hard to satisfy.

So here’s a laundry list of stuff that would feature in my dream language. Advance warning that I am a grumpy old man and this is a super boring list that contains almost no cool features. Also it’s not in any particular order of priority – it’s mostly the order I thought of things in – and it’s definitely not complete – it’s just the stuff I thought of before I got bored of writing this post.

Also, most of these are things that you can’t without get large amounts of time, effort and money. They’re boring, not easy, and if anything them being boring makes them harder because you can’t really get people excited about working on them.

Community

Community is so important.

Here is what I want out of a programming language community:

  1. Large. Small communities are nice but I want a community who I can share the work load with, and a small community isn’t it.
  2. Friendly. Elitism is toxic, and a community that isn’t helpful to beginners or goes on and on about how they’re super smart for using this language that other people don’t get is the worst.
  3. Diverse, and committed to it. Codes of Conduct for everyone all round.
  4. Committed to quality. Documentation matters. Testing matters. We like having high quality libraries and we’re prepared to put the work (and, where possible, money) in to get them.

Packaging Infrastructure

Good packaging infrastructure is vital. And so hard to do. Basically nobody does it well. Packages should be:

  1. Easy to create new packages. If a problem could be solved by creating a new package it should be easier to solve by creating a new package than by not. You should never find yourself going “oh god but I have to write all that XML”.
  2. Versioned, with version constraints between dependencies automatically resolved
  3. Local to a project
    • without a lengthy compile each time you install into a new project
    • no pollution in the global install namespace
  4. Easy to mirror
  5. But with a good standard central repository
  6. Clearly marked for stability
  7. Try hard at maintaining compatibility between package versions
  8. Easy to write in a way that is portable to other versions of the language (e.g. don’t be Scala where a package compiled for one version of the language doesn’t work with any others, even between point releases)
  9. Easy to write in a way that is compatible with multiple operating systems.

It would be great if you could install multiple versions of a library in the same binary, but having this work correctly is sufficiently rare that I’m worried this might be a grass is greener on the other side issue. I’d be more comfortable with this is in a statically typed languages where you can wall off different versions from eachother by having them be distinct types.

Most languages eventually get something which is approximately this. Cabal with sandboxes, pip with virtualenv, ruby with bundler, all manage most of this (mirroring is typically not handled well. I think maybe it is in Cabal but it’s not in python or ruby).

Testing tools

There should be a standard test runner that works sufficiently well that nobody bothers writing their own unless they’re someone with an rspec fetish, and the community should laugh at the people with rspec fetishes and tell them to go play elsewhere politely suggest that maybe this isn’t adding very much to the testing workflow.

There should be good code coverage tools. They should work reliably with minimal overhead (if it takes twice as long to run under coverage then this is very sad and people will use it less). It should be able to do branch coverage. It would be great if it could do predicate coverage. More features – e.g. stats on paths and traces – would be amazing.

It would be fantastic to steal the CPAN feature that tests run on install and report back pass/fail information to somewhere sensible, which means you want testing integrated with the packaging system. Given the aforementioned versioning constraints and per project installs you probably only want to do this one per distinct set of versions of dependent libraries.

Obviously all languages should have a Quickcheck like testing tool (if I didn’t think this I probably wouldn’t have sunk more than six months of free full time labour into making Hypothesis).

Good tools for working with source code

I’m mostly very indifferent to Go, but there’s one feature that I think it gets so very right and wish everyone else would steal right now.

Your standard library should include a precise parser which for any valid (or, ideally, nearly valid) source code you can parse to an AST then print the AST as a bytewise identical file to the original.

It should also include a pretty printer that outputs code in a “standard” and correct format.

It should also be easy to make tools that use the AST representation to make changes to your source code.

Basically: I want good refactoring and reformatting tools, and in order to get that I want standard ways of building them.

I also want good static analysis tools. A well designed language obviates the need for a lot of these, but there’s always room for improvement.

Foreign Function Interface

There should be a standard, good, foreign function interface which makes it easy to bind to C libraries which does not expose internals.

In reality almost every language has too many ways to do it, none of them good.

Relatedly, please run valgrind clean by default. I know it’s a pain, and I know it hurts some microbenchmarks, but it makes debugging code integrated with C so much easier.

Text handling

Text is:

  1. Efficiently represented.
  2. Always immutable (Note: Having a separate editable representation for text is perfectly reasonable, but it’s not your default).
  3. Always unicode.
  4. Always understood to be a variable length encoding that you cannot index into by an offset.
  5. Easy to read and write to a variety of encodings.

Anything else is wrong and you are a sinner for contemplating it.

Equality works correctly

  1. There is a single operator you use for equality. You do not use different ones for different types.
  2. Differently typed values are never equal. Yes I know this violates the Liskov Substitution Principle. I don’t even slightly care. Ideally comparing different types for equality should be an error.
  3. Equality is reflexive. That is, x == x, always. I don’t know what to do about NaN here. So far my most practical solution involves a time machine. My second most practical answer is “Ignore IEEE and deal with the resulting confusion”.
  4. Equality is symmetric.
  5. Equality is transitive.

Good numeric hierarchy

Your language should be able to represent:

  1. Signed and unsigned fixed size integers of various machine sizes
  2. Arbitrary precision integers
  3. Double and single precision floats
  4. Arbitrary precision rational numbers
  5. Fixed width decimal arithmetic

These should all be easy to convert to eachother (but not compare equal if they are of different types!), and they should certainly all consistently use standard operators.

Most of this should be implementable as libraries rather than needing to be baked in to the language.

Packed data

At some point you are going to need to deal with arrays of “primitive” types – bytes, doubles, machine words, etc. If you cannot represent this in an efficient way when you come to do this, you will be sad. Ideally you want to do this in a way that makes it easy to interact with the aforementioned foreign function interface.

Ideally this would also support arrays of structs of some sort. I don’t really care about representing structs as individual values efficiently, but for large arrays of data it matters.

Namespacing and scoping

Everything should have a clear, lexical, scope. It should be obvious where a variable is introduced. The answer should never be “into the global namespace”. It should be hard to make typos and not notice.

As far as I can tell, basically the only languages which get this right are statically typed or a lisp. (ETA: Apparently perl with use strict also gets this right).

Higher order functions

Languages should have first class functions, and higher order functions like map or filter.

This one… is doing pretty well actually. This debate is over and we won. The last time I checked, Java was the only mainstream language that didn’t do this. Since Java 8 last year there are no mainstream languages that don’t do this.

A REPL

Not much to say here except that a REPL is so invaluable to how I work that it’s really painful using languages without one. I can do it of course, but it tends to involve writing lots of tiny little throwaway programs that act as a poorer version of a REPL.

Take typing seriously

I’m fine with dynamically typed languages. I’m also fine with languages with fairly serious static type systems (Haskell, OCaml, F#. Even C++ and C# are pretty OK). But if you’re going to have a type system don’t half-arse it. Good type systems are good, but bad type systems are worse than no type systems.

Note: Type system wars in the comments will not make it through moderation.

Solid, high performance, implementation

Why are we all using slow and unreliable implementations? It makes me really sad.

I mean, I do know the answer, it’s because writing a concurrent garbage collector and a high performance compiler is hard and reusing language-specific VMs mostly works but has its own set of problems.

Basically I want garbage collections and threading to just work, and I want to be able to write code that looks as if it should be reasonably low level and have it not produce something that’s hundreds of times slower than the equivalent C. If you can compile high level abstractions down to low level code, that’s great too.

Yes I know that low level concurrency is passé and we’re all doing message passing now. A good message passing API on top of the concurrency primitives would be great, but I want the primitives too.

Rich standard library

It has major problems, but the size and (mostly) quality of the Java standard library is one of the few things I miss about it.

The standard library should have all of the normal really boring things we need to get things done.

  • File system access
  • Sockets – client and server
  • A solid HTTP client (I’m ambivalent as to whether there should also be a server. Experience of how little ones from the standard libraries of existing languages are used suggests no)
  • Parsing for standard formats – XML, JSON, etc.
  • Good concurrency primitives
  • Pseudo random number generators
  • Invoking and running external programs
  • Probably many others I’m forgetting

There are plenty of things that shouldn’t be in the standard library because you want a faster release cycle or because there are multiple good ways to do them, but in general there are things that we’ve basically got figured out and are commonly needed and those should be standardized.

Collections Library

I really want a good collections library. With standard interfaces for things. We seem to have settled on “Eh, you’ve got hash tables and dynamically sized arrays, what more do you want?”. I’ll tell you what more I want:

  • Uniform interfaces. There are many things I dislike about Python but high up on the list is that if I write add when I meant append or append when I meant add one more time I’m going to scream
  • Immutable collections (not just frozenset. I want efficiently updateable immutable collections)
  • Sorted collections
  • Heaps
  • Priority queues

Java collections library I miss you. Please come back?

Database access

There should be a standard API for talking to a relational database. It doesn’t need to (and shouldn’t) bundle everything into it, but it would be nice if the API were standard and the standard library came with e.g. a sqlite3 adapter.

Summary

I think this can mostly be summarized by saying that I want is completeness and quality. The domain of programming is large, messy, and broken, and it would be nice if the language that I use to interact with it were a bastion of things mostly working and being easy rather than fighting against me every step of the way. There’s enough stuff that is common and known how to do well that it would be great to just do it well and then stop having to worry about it.

This will of course never happen, but there are enough standard sources of annoyance and things that languages get wrong that it sure would be nice if we could do without, and every one we manage to fix is one less thing to worry about.

Edit to add: You are welcome to suggest languages in the comments if you really feel the need to, but I am unlikely to dignify them with a response. Chances are extremely good that I am aware of the language you are suggesting and do not feel it lives up to this list.

This entry was posted in Hypothesis, Uncategorized on by .

Hypothesis 1.7.2 is out

I’ve just released Hypothesis 1.7.2. It’s quite a small diff against 1.7.1, but there were some significant bugs in there that I thought it was worth pushing out a fix sooner rather than later. Most of these have been in there for a while, so you might want to upgrade, but if you’re not currently seeing any weird behaviour you’re probably not being affected by them.

  • When using floats() with stale data in the database you could sometimes get values in your tests that did not respect min_value or max_value.
  • When getting a Flaky error from an unreliable test it would have incorrectly displayed the example that caused it.
  • 2.6 dependency on backports was incorrectly specified. This would only have caused you problems if you were building a universal wheel from Hypothesis, which is not how Hypothesis ships, so unless you’re explicitly building wheels for your dependencies and support Python 2.6 plus a later version of Python this probably would never have affected you.
  • If you use flatmap in a way that the strategy on the right hand side depends sensitively on the left hand side you may have occasionally seen Flaky errors caused by producing unreliable examples when minimizing a bug. This use case may still be somewhat fraught to be honest. This code is due a major rearchitecture for 1.8, but in the meantime this release fixes the only source of this error that I’m aware of.
This entry was posted in Hypothesis, Python on by .

Using z3 to try to solve Feedback Arc Set for Tournaments

I’ve been meaning to check out Z3 for a while – it’s an open source theorem prover from Microsoft with great looking Python bindings. I’m partly interested just out of pure curiousity, partly because there are some interesting potential applications for Hypothesis (it’s currently unclear to me how possible those are – the architecture of Hypothesis is not well set up for concolic testing, nor are the semantics of Python, but still).

Anyway, I thought I’d turn it loose on my favourite evil problem, feedback arc set for tournaments. It’s at that weird boundary where what you want is sortof a SAT solver and sortof an integer linear programming solver. There’s some suggestion that using hybrid approaches would be better than either, but I haven’t really found any easily usable hybrid solvers (there’s a fork of minisat that might work. It’s on my TODO list)

Anyway, basically what we do with z3 is that we create the initial problem, ask it for a feasible solution, then successively ask it to improve on that feasible solution until it no longer can. Here’s the code:

The good: z3 was extremely easy to use. The API was pleasant, reasonably intuitive, and basically did what I expected.

The bad: This solution is strictly worse than a solution which just generates a massive LP file and hands it off to lp_solve. So although it was very easy to solve this problem with z3, z3 appears not to be well suited to it. Z3 struggled at n = 15 and seemed to get stuck at n = 20.

This shouldn’t necessarily be considered indicative that Z3 is “bad”. This is a problem with a lot of constraints (O(n^3)), so it’s probably not representative of the sort of things Z3 is good at, and it is representative of the sort of things that lp_solve is good at. So I’m not really deriving any lessons from the fact that Z3 wasn’t very good at this problem . I’m more interested in the fact that it was super easy to use, so I really should look at using it more in future.

This entry was posted in Hypothesis, Python on by .