Radii in function spaces

Let \(V\) be a metric space and \(A, W \subseteq V\). Define

\[ r(A, W) = \inf \{ R : \exists w \in W, A \subseteq \overline{B}(w, R) \} \]

Where \(\overline{B}(w, R)\) is the closed ball of radius \(R\) centered on \(w\).

i.e. we’re looking at the radius of \(A\) when the center is restricted to lie in \(W\).

When \(W = V\) we will simply write \(r(A)\)

A \(W\)-center for \(A\) is a point \(w \in W\) such that \(A \subseteq \overline{B}(w, r(A, W))\). Note that it is not guaranteed that such a point exists (example: Let \(V = W = [-1, 1] \setminus \{0\}\). \(r(V) = 1\) but V has no center).

The case we’re interested in for this post is where we have some topological space \(X\) and set \(V = l^\infty(X)\), the space of bounded functions \(X \to \mathbb{R}\), and \(W = C(X)\), the subspace consisting of all continuous functions.

Why are we interested in this? Good question. I have absolutely no practical reason for studying this question and am not aware of any applications. There were three reasons I found it interesting:

  1. It can be regarded as the question “Given a bunch of possibly discontinuous functions, how well can you simultaneously approximate them by a continuous function” (this is the problem I started with which lead to the whole paper, although with only one discontinuous function)
  2. Studying radius is a fairly natural geometric problem and \(C(X)\) is a natural example of a Banach space
  3. There’s a lot of surprisingly neat theory involved in the question

In this post we’ll establish three results:

  1. A very nice characterization of \(r(\mathcal{F}, C(X))\)
  2. If \(X\) is normal then every bounded \(\mathcal{F} \subseteq l^\infty(X)\) has a \(C(X)\)-center
  3. A theorem about how the C(X) radius of \(\mathcal{F} \subseteq l^\infty(X)\) can be determined from “sufficiently large” subsets (see later)

A characterization of \(r(\mathcal{F}, C(X))\) and the existence of centers

Let \(\mathcal{F} \subseteq l^\infty(X)\) be bounded. We define the following functions:
$$\mathcal{F}^*(x) = \inf\limits_{U \ni x} \sup \bigcup\limits_{f \in \mathcal{F}} f(U) $$ $$\mathcal{F}_*(x) = \sup\limits_{U \ni x} \inf \bigcup\limits_{f \in \mathcal{F}} f(U) $$

where U is restricted to ranging over open sets.

Clearly for \(f \in \mathcal{F}\) we have \(\mathcal{F}_*(x) \leq f(x) \leq \mathcal{F}^*(x) \). The idea is that these functions provide better behaved tight bounds on \(\mathcal{F}\). Specifically:

Theorem: \(\mathcal{F}^*(x)\) is upper semicontinous. \(\mathcal{F}_*(x)\) is lower semicontinuous.
Proof:

We’ll just show that \(\mathcal{F}^*(x)\) is upper semicontinous. The other case will follow similarly.

So we must show that the set \(A = \{ x : \mathcal{F}^*(x) < t \}\) is open. Let \(x \in A\). Then there exists \(U \ni x\) such that \(\forall f \in F, y \in U, f(y) < t\) (by definition of \(\mathcal{F}^*\)).

But then for any \(y \in U\) we must have \(\mathcal{F}^*(y) < t\). Therefore \(A\) contains an open neighbourhood of x. x was arbitrary, hence A must be open.
QED

Believe it or not, this is almost all we need to construct centers:

Theorem: \(\frac{1}{2} \sup\limits_x (\mathcal{F}^*(x) – \mathcal{F}_*(x)) \leq r(\mathcal{F}, C(X))\)

If \(X\) is normal then this inequality is an equality and \(\mathcal{F}\) has a C(X) center.

Proof:

It may seem that this theorem is two unconnected results, but in fact our proof of the latter will simply fall out of the proof of the former.

Let \(R = r(\mathcal{F}, C(X))\) and \(S = \frac{1}{2} (\sup\limits_x \mathcal{F}^*(x) – \mathcal{F}_*(x))\).

We will first show \(S \leq R\).

Let \( \epsilon > 0 \). Let \(g \in C(X)\) be such that \(\mathcal{F} \subseteq \overline{B}(g, R + \epsilon)\).

Fix \(x \in X\) and find \(U \ni x\) open such that for \(y \in U, |f(x) – f(y)| < \epsilon\).

Then for \(y \in U, f \in \mathcal{F}\) we have \(|f(y) - g(x)| \leq |f(y) - g(y)| + |g(y) - g(x)| \leq R + 2\epsilon\).

This we must have \(\mathcal{F}^*(x) \leq g(x) + R + 2 \epsilon\) and \(\mathcal{F}_*(x) \geq g(x) - R - 2 \epsilon\).

Hence we have \(\mathcal{F}^*(x) - \mathcal{F}_*(x) \leq 2R + 4\epsilon\).

But \(\epsilon\) was arbitrary, so we must have $$\mathcal{F}^*(x) - \mathcal{F}_*(x) \leq 2R$$ and hence \(S \leq R\).

Now assume \(X\) is normal.

From the definition, we have $$\mathcal{F}^*(x) - S \leq \mathcal{F}_*(x) + S$$

The left hand side is upper semicontinuous, the right hand side is lower semi continuous, therefore by the Katětov–Tong insertion theorem there exists a continuous function \(g\) with $$\mathcal{F}^*(x) – S \leq g \leq \mathcal{F}_*(x) + S$$

Now let \(f \in \mathcal{F}\)

We have $$f(x) – S \leq \mathcal{F}^*(x) – S \leq g \leq \mathcal{F}_*(x) + S \leq f(x) + R$$

Or, rearranging, $$g(x) – S \leq f(x) \leq g(x) + S$$

Hence \(||f – g|| \leq R\), and \(\mathcal{F} \subseteq \overline{B}(g, S)\)

Thus \(R \leq S\) and hence \(R = S\)

But additionally we have shown that actually \(\mathcal{F} \subseteq \overline{B}(g, R)\), and thus \(g\) is a C(X)-center for \(\mathcal{F}\).

QED

Finding radii from radii of subsets

The diameter of a set is very nicely behaved in terms of the diameter of its subsets: It is the supremum of the diameter of all two-element subsets simply by matter of its basic definition. The radius is not so well behaved.

Consider the Banach space \(c_0\), the space of all sequences converging to 0. Consider the set \(E = \{ e_n \}\) where \((e_n)_k\) is 1 if \(k \leq n\) and 0 elsewhere. Every finite subset of this has radius half (it is contained in the ball of radius half around \(\frac{1}{2} e_N\) for some sufficiently large N), but the overall set has radius \(1\) because any center for it with smaller radius than that would not converge to 0.

It’s fairly easy to turn this into an example of a \(C(X)\) with similar behaviour.

Let \(X = \{ \pm \frac{1}{n} : n \in \mathbb{N} \} \cup \{ 0 \} \}\) and let \(\mathcal{F} = \{f_n\}\) where \(f_n(x) = \mathrm{sign}(x)\) if $|x| \geq \frac{1}{n}\), else \(f_n(x) = 0\).

Then once again any finite subset is contained within \(\overline{B}(\frac{1}{2} f_N, \frac{1}{2})\) for some sufficiently large N, but \(\mathcal{F}^*(0) = 1\) and \(\mathcal{F}_*(0) = -1\) so \(r(\mathcal{F}) = 1\).

(The idea of this example is that this space looks like two copies of the space of convergent sequences glued together by requiring that they converge to the same thing. We then force that value to be 0 by considering sequences where the desired center converges to opposite things).

So radius with centers in \(C(X)\) isn’t determined by finite subsets in general (incidentally, it is for centers in \(l^\infty(X)\)). How large do the subsets need to be in order to determine it?

There turns out to be a nice theorem in this regard.

First we’ll need a definition:

Let \(X\) be a topological space. The character of a point \(x \in X\) written \(\chi(x)\) is the smallest cardinality of a neighbourhood base of \(x\). The character of the space X, written \(\chi(X)\), is \( \sup\limits_{x \in X} \chi(x)\).

Lemma: Let \(x\) be a point with \(\chi(x) \leq \kappa\) and let \(\mathcal{F} \subseteq l^\infty(X)\). There exists \(\mathcal{G} \subseteq \mathcal{F}\) with \(|\mathcal{G}| \leq \kappa\), \(\mathcal{G}^*(x) = \mathcal{F}^*(x)\) and \(\mathcal{G}_*(x) = \mathcal{F}_*(x)\).

Proof:

We’ll construct \(\mathcal{G}_n\) of size \(\leq \kappa\) with \(\mathcal{G}^*(x) \geq \mathcal{F}^*(x) – \frac{1}{n}\) and \(\mathcal{G}_*(x) \leq \mathcal{F}_*(x) + \frac{1}{n}\). Then \(\mathcal{G} = \bigcup\limits_n \mathcal{G}_n\) will produce the desired set.

In fact it will suffice to construct a set \(\mathcal{H}\) with \(\mathcal{H}^*(x) \geq \mathcal{F}^*(x) – \frac{1}{n}\). We can then repeat an identical construction to get the other condition and take the union of the two sets.

So, let \(\{U_\alpha : \alpha < \kappa \}\) be a neighbourhood base for \(x\). By definition of \(\mathcal{F}^*(x)\) for each \(\alpha\) we can find \(f_\alpha \in \mathcal{F}\) with \(\sup f_\alpha(U_\alpha) \geq \mathcal{F}^*(x) - \frac{1}{n}\).

Let \(\mathcal{H} = \{ f_\alpha \}\). Then if \(U \ni x\) is open it contains some \(U_\alpha\) and thus \(\sup f_\alpha(U) \geq \mathcal{F}^*(x) - \frac{1}{n}\). Hence \(\mathcal{H}^*(x) \geq \mathcal{F}^*(x) - \frac{1}{n}\) as desired.
QED

This lemma contains the bulk of the work we'll need to prove the following theorem:

Theorem: Let \(X\) be a topological space with \(\chi(X) \leq \kappa\). Let \(\mathcal{F} \subseteq l^\infty(X)\). There exists \(\mathcal{G} \subseteq \mathcal{F}\) with \(|\mathcal{G}| \leq \kappa\) and \(r(\mathcal{G}, C(X)) = r(\mathcal{F}, C(X))\).

Proof:
Let \(R = r(\mathcal{G}, C(X))\).

We’ll use our previous characterization of the radius. So we can find \(x_n\) with \(\mathcal{F}^*(x_n) – \mathcal{F}_*(x_n) \geq R – \frac{1}{n}\). As per the lemma we can find a subset \(\mathcal{G}_n\) with \(|\mathcal{G}_n| \leq \kappa\), \(\mathcal{G}_n^*(x_n) = \mathcal{F}^*(x_n)\) and \(\mathcal{G}_{n*}(x_n) = \mathcal{F}_*(x_n)\). Then \(r(\mathcal{G}_n) \geq R – \frac{1}{n}\) hence if \(\mathcal{G} = \bigcup \mathcal{G}_n\) then \(r(\mathcal{G}) = r(\mathcal{F})\) as desired.
QED.

In particular if \(X\) is first countable then the radius is determined by countable subsets.

It’s worth noting that this is not an equivalence. Easy counter examples happen because there are first countable spaces with non first countable compactifications.

It is however tight in that for a regular uncountable cardinal \(\kappa\) if we take the order topology on two copies \(\{ a_\alpha : \alpha \leq \kappa \}\) and \(\{ b_\alpha : \alpha \leq \kappa \}\) with the quotient that \(a_\kappa = b_\kappa\). We can then construct an example exactly like our modification of the \(c_0\) example (this relies on the fact that every function in \(C(\kappa)\) is eventually constant).

References

  • The basic result about centers existing isn’t at all new, but I can’t track down the exact reference right now (various forms of it were proved at various times)
  • The inspiration for the theorem about characters comes from a mathoverflow answer by Sergei Ivanov. I have no idea if it’s new or not. I expect not though. It’d be surprising if no one else had proven it or something like it before.

Advance warning of advanced mathematics

A long, long time ago (about 6 years now) my friend John Bytheway and I were working on a joint paper. It was an odd little thing – a combination of Banach Space geometry and point-set topology with just enough analysis in between to glue the two together. There was probably something publishable in it, but we never really got it to a point where we were happy enough with the results to try. Still, I really quite liked the material in it, and I seem to come back to it every few years, polish bits of it, have a tinker and wonder if it’s worth trying to get it published. Part of this was we discovered later on that less of it was new than we’d previously believed.

I’ve come to the conclusion that I really don’t care enough about the stamp of approval for being published in a journal but that I’d quite like to get the results out in public, for some closure if nothing else but also in case anyone finds them interesting/useful. So I’m going to dig up some of the individual material and try to package them up so they stand alone as blog posts and just put them up here.

A nicer way to complete metric spaces

It’s a standard result that you learn almost as soon as you learn about metric spaces that every metric space has a completion.

The normal way to prove this is to construct a pseudometric on the space of Cauchy sequences on the metric space and then quotient out pairs with zero distance to get a metric space. I think this is a bit of a waste of time – the details are fiddly and annoying and it’s not terribly enlightening. Here’s a nicer way.

Let \(X\) be a set. It’s a standard result that \(l^\infty(X)\), the set of bounded functions \(X \to \mathbb{R}\) together with the uniform metric, is a complete metric space.

Now let \(X\) be a metric space and fix arbitrary \(c \in X\).

Claim: The function \(\iota : x \to f_x\) where \(f_x(y) = d(x, y) – d(c, y)\) is an isometric embedding of \(X\) into \(l^\infty(X)\).

If this claim is true we’re done. We’ve got an isometric embedding of \(X\) into a complete metric space, so the closure of the image is its completion.

In order to prove this we’ll need the following elementary result:

Lemma: Let \(x, y, z \in X\). Then \(|d(x, z) – d(y, z)| \leq d(x, y)\).
Proof:

Simple application of the triangle inequality. \(d(x, z) \leq d(x, y) + d(y, z)\) so \(d(x, z) – d(y, z) \leq d(x, y) \). Similarly \(d(x, z) – d(y, z) \geq -d(x, y) \). QED

Now to prove that \(\iota\) is an isometric embedding.

First we must show that \(f_x \in l^\infty(X)\). This is a simple consequence of the above lemma: $$|f_x(y)| = |d(x, c) – d(y, c)| \leq d(x, c)$$

Hence \(f_x\) is bounded

Now for isometry:

Consider \(x, w \in X\). Then $$|f_x(y) – f_w(y)| = |d(x, y) – d(y, c) + d(y, c) – d(w, w)| = |d(x, y) – d(w, y)| \leq d(x, y)|$$

So, taking the supremum over \(y\), we have \(||f_x – f_w|| \leq d(x, w)\). We now need only show equality.

For this, consider \(|f_x(w) – f_w(w)| = |d(x, w) – d(x, x)| = d(x, w)\). Hence we must have \(||f_x – f_w|| \geq d(x, w)\) and the result is proved.

Notes

  • I’m afraid I don’t remember where I first saw this so I can’t provide a reference, but it’s definitely not original to me. I probably encountered it in some textbook long ago
  • The thing I like about this is that it’s quite explicit, and at each step along the way you do the only thing that could possibly work once you’ve had the initial idea of trying to embed it into \(l^\infty(X)\): The only function you know about on it is the metric, so you have to use that. It’s not bounded, so what’s a non-artificial way to make it so, etc. The details are also much less fiddly than the normal proof because we did all the hard work of showing completeness when we proved that \(l^\infty(X)\) was complete.

The power of fixed point theorems in Set Theory

For reasons that aren’t relevant here, I was reminded of the power of fixed point theorems on partially ordered sets to prove some fundamental results in set theory earlier. This is an expository post about that. If you don’t care about set theory you probably won’t care about this.

I’m going to mention two fixed point theorems and use each of them to give a simple proof of a well known result in set theory.

Preliminaries

A poset is a set \(X\) with a relation \(\leq\) on it that is

  1. Reflexive: \( \forall x. x \leq x \)
  2. Antisymmetric: \(\forall x, y. x \leq y\) and \(y \leq x \implies x = y\)
  3. Transitive: \(\forall x, y, z. x \leq y\) and \(y \leq z \implies x \leq z\)

Additional reminders:

  • A poset is said to be totally ordered if additionally it is Total: \( \forall x, y. x \leq y\) or \(y \leq x\).
  • A chain is a subset of a poset which is totally ordered under the restriction of the relationship to it.
  • \(x\) is an upper bound for a set \(A\) if \(\forall a \in A. a \leq x \)
  • \(x\) is a least upper bound for a set \(A\) if it is an upper bound and \(x \leq y\) for all upper bounds of A \(y\). When such an upper bound exists it is unique (by anti-symmetry) and we write it as \(\sup A\)
  • A poset is said to be complete (resp. chain-complete) if every subset (resp chain) has a least upper bound
  • Every chain complete poset has a least element (the least upper bound of the empty set).

Examples of posets are any of the normal sets of numbers – e.g. \(\mathbb{N}, \mathbb{Q}, \mathbb{R}\). All of these are total orders, none of them are complete. \([0, 1] \subseteq \mathbb{R}\) is a complete total order.

For any \(X\), the power set \(\mathcal{P}(X)\) is a complete poset with the order being set inclusion.
Our fixed point theorems will concern functions

Our fixed point theorems are both about particular types of mappings of posets to themselves.

Knaster–Tarski fixed point theorem

(This is actually a slightly weakened form of the theorem because it’s slightly easier to prove and we don’t need the full thing)

Theorem: Let \(X\) be a complete poset and \(f : X \to X\) be order presering. That is \(x \leq y \implies f(x) \leq f(y)\). Then \(f\) has a fixed point.

Proof:

Let \(A = \{ x : x \leq f(x) \} \).

If \(x \in A\) then \(f(x) \in A\) as if \(x \leq f(x)\) then \(f(x) \leq f(f(x))\) due to \(f\) being order preserving.

Suppose now that \(y = \sup A\). Then if \(x \in A\) we have \(x \leq y\) and so \(f(x) \leq f(y)\). But \(x \leq f(x)\) by hypothesis of \(x \in A\), so we have \(x \leq f(x) \leq f(y)\). Therefore \(f(y)\) is also an upper bound of \(A\) and thus, because \(y\) is the least upper bound, \(y \leq f(y)\) and thus \(y \in A\).

But then as per above we must also have \(f(y) \in A\), and thus \(f(y) \leq y\). Therefore \(f(y) = y\).

QED.

This then lets us provide a very short proof of one of the basic results of set theory.

Cantor–Bernstein–Schroeder theorem

Let \(A, B\) be sets, and let \(f : A \to B\) and \(g : B \to A\) be injections. There exists a bijection \(h : A \to B\)

Proof:

The idea is that we want to find some subset \(H \subseteq A\) for which we can define our function h as \(h|_H = f\) and \(h|_{H^c} = g^{-1}\).

Claim: If \(g(f(H)^c) = H^c\) then the function h is well defined and is a bijection.

Well defined is easy: By the assumption, every element of \(H^c\) is in the image of \(g\), so \(g^{-1}\) is well defined on it.

Injective: It’s injective when restricted to \(H\) and to \(H^c\), the only question is whether we can have \(h(x) = h(y)\) with \(x \in H\) and \(y \in H^c\). But by assumption and injectivity of \(g\) we have \(g^{-1}(H^c) \subseteq f(H)^c\), so this is impossible.

Surjective: By assumption, \(g^{-1}(H^c) = f(H)\), so the function is surjective.

So now all that remains is to find such an H. Which we will now do by using Knaster-Tarski to pull it out of a hat.

Rewrite the requirement as \(H = g(f(H)^c)^c\)

i.e. we have the function \(S(A) = g(f(A)^c)^c\) and are looking for \(S(H) = H\).

Suppose \(A \subseteq B\). Then \(f(A) \subseteq f(B)\), so \(f(B)^c \subseteq f(A)^c\), so \(g(f(B)^c) \subseteq g(f(A)^c)\), so \( g(f(A)^c)^c \subseteq g(f(B)^c)^c\). i.e. \(S(A) \subseteq S(B)\).

So \(S\) is an order preserving map on the complete poset \((\mathcal{P}, \subseteq)\). Thus by the Knaster–Tarski fixed point theorem it gives us \(H\) we want.
QED

I find Knaster-Tarski a particularly “magic” fixed point theorem usage here because there’s no obvious connection between \(A\) and \(S(A)\) – we’ve just defined \(S(A)\) so that if it had a fixed point it would be the thing we wanted, and thus a solution appears.

Our next fixed point theorem is one I find much more intuitive, which makes it perhaps slightly odd that the proof we’ll use it in is intimately connected with the Axiom of Choice (which is intuitive itself but produces unintuitive results) and Zorn’s lemma (which is too technical to really have much intuition about).

Bourbaki–Witt fixed point theorem

Let \(X\) be a chain-complete poset and \(f : X \to X\) such that \(f(x) \geq x\). For every \(x \ni X\), \(f\) has a fixed point \(y \geq x\)

Proof:

The proof is by transfinite induction. That is, we’ll take some ordinal \(\kappa\) with no injection into \(X\) (such an ordinal exists by Hartog’s Lemma) and define \(g : \kappa \to X\) as follows:

  1. \(g(0) = x\)
  2. \(g(\alpha + 1) = f(g(\alpha))\)
  3. \(g(\beta) = \sup \{ g(\alpha) : \alpha < \beta\}\) when \(\beta\) is a limit ordinal (this is well defined because g produces a chain by construction and \(X\) is chain complete)

\(g\) is, by construction and that \(x \leq f(x)\), a monotone increasing function. i.e. \(\alpha \leq \beta \implies g(\alpha) \leq g(\beta)\). It’s not injective, therefore we can find some \(\alpha < \beta\) with \(g(\alpha) = g(\beta)\). Then \(\alpha + 1 \leq \beta\), so \(g(\alpha) \leq g(\alpha + 1) \leq g(\beta) = g(\alpha)\).

Therefore \(g(\alpha) = g(\alpha + 1) = f(g(\alpha))\) and \(g(\alpha)\) is our desired fixed point (and is \(\geq x\) because \(g(0) = x\).

QED

Now, how do we use this?

Well, it provides a very nice proof of Zorn's lemma by way of a closely related result.

Hausdorff maximality theorem

Let \(X\) be a poset. Every chain \(C \subseteq X\) is contained in a maximal chain (that is one not properly contained in any other chain).

This is a theorem very closely related to Zorn’s lemma – either can very easily be proved from the other. But Zorn’s lemma is a simple corollary of Hausdorff’s maximality theorem (whileas the other direction requires at least a bit of machinery), and this way round we get to apply a nice fixed point theorem to give a very simple proof.

Proof:

Let \(\mathcal{C}\) be the set of chains of X ordered by inclusion.

Claim: This is a chain complete poset.
Proof: Let \(\mathcal{A} \subseteq \mathcal{C}\) be a chain. Let \(A = \bigcup \mathcal{A}\). If \(x, y \in A\) then \(x, y \in B\) for some \(B \in \mathcal{A}\) (because it’s a chain, so find one containing each and take the larger of the two). Therefore \(x \leq y\) or \(y \leq x\). Hence \(A \in \mathcal{C}\). Clealy it’s a least upper bound for \(\mathcal{A}\) as it is one in the poset \(\mathcal{P}(X)\).

So, we’ll now apply the Bourbaki Witt fixed point theorem. Define \(f : \mathcal{C} \to \mathcal{C}\) as follows:

  1. If A is maximal, \(f(A) = A \)
  2. If A is not maximal, use the axiom of choice to arbitrarily pick some chain strictly containing \(A\) as \(f(A)\)

By construction \(f(A) \supseteq A\), by the above \(\mathcal{C}\) is chain complete, therefore by Bourbaki Witt \(f\) has a fixed point above every element of \(\mathcal{C}\). The elements of \(\mathcal{C}\) are the chains of \(X\) and the fixed points of \(f\) must be maximal. Hence the result is proved.

QED

As promised, Zorn’s lemma is just a simple corollary:

Zorn’s Lemma

Let \(X\) be a poset such that every chain has an upper bound (not necessarily a least one). \(X\) has a maximal element.

Proof:

Pick a maximal chain, \(C\). Let \(x\) be an upper bound for \(C\). If there were some element \(y \) with \(x < y\) then \(C \cup \{y\}\) would be a chain strictly containing \(C\). Therefore no such \(y\) exists and \(x\) is maximal.

Conclusion

A lot of the proofs presented for both of these results are pretty inscrutable. They’re not hard to understand, but there are quite a lot of details to get right.

I’ve got to admit, this mostly exhausts my list of cute things in set theory you can prove by way of fixed point theorems. I’m sure there are more, and I’d be interested to hear about them, but it’s been a while since I’ve looked at this and these are the only two I remember.

Update: In talking to Imre Leader, who taught me the course in which I learned Bourbaki Witt, he points out that as far as set theory is concerned Bourbaki Witt is almost entirely useless. It can essentially be regarded as “the choice free part of Zorn’s lemma”, and as a result no one actually uses it because the hypotheses of Zorn’s lemma are so much easier to verify.

“Agda is now mainstream”

Various people on the internets have been pointing out that the results for this is a mainstream language on hammer principle are a bit weird. In particular Agda was showing up as number 10 for it which if you know of Agda you’ll realise how ridiculous it is and if you don’t, well exactly.

It’s not there any more, as I tweaked the algorithm slightly to prune statement/item pairs with very little data, but the basic problem of the confusingness of this particular statement remains. The problem is alluded to on the text of the page, but I expect no one reads that text anyway. Oh well. So, quoted here:

For items where we’ve got a lot of responses the numbers should be pretty good. For the rest you can probably just expect them to hang around the middle of the rankings wondering what they’re doing here.

This is because the way the ranking algorithm works is that it sorts by an initial score and then looks for evidence to move items from that point. If there’s little data that score will be about in the middle and there’ll be little evidence to move it.

This is true on all the pages, but the problem with this particular one is that in this case lack of data is correlated with the correct position of the language in the ranking. This means that particularly non mainstream languages will lack data and thus will be sitting there in the middle, while as only mildly non mainstream languages will be parked down at the bottom because there was a lot of evidence that they weren’t mainstream.

The main problem here is not really the ranking algorithm (although it could be improved. I have an improvement in the works, but it’ll take a while to plumb in), it’s that the site doesn’t make it clear when an item’s position in a list is really more of a rough guess than a reality. I don’t currently have any good ideas as to how to indicate that, so the status quo will probably remain for now.

It’s also worth noting that the aggregate rankings are really more an amusement than they are a reliable source of decision making (especially now that the comparison pages provide you with more hard data). They should inspire, and they should lead your line of inquiry, but certainly don’t take them as rock solid fact. Even if they captured and presented the data perfectly you’re still just looking at a bunch of answers to subjective question from a bunch of randoms on the internet. Fun and interesting, but not necessarily representative of The Truth.