dissoc

Infinite rest

Some functions types in Clojure seamlessly handle infinite arguments, while others misuse them and freeze our programs.

Clojure 1.11.1
user=> (apply (fn [& args]) (range))
nil
user=> (apply (fn []) (range))
^C ;; loops forever

Let’s level the playing field.

In the first case, the infinite arguments are preserved as a lazy sequence which is never realized by the function body, while the second case pours them into a Java array before the function body is called, looping “forever” (until the JVM runs out of memory or the size limitations of arrays are hit).

The compiler decides the behavior of fn in this respect based on the presence of a rest-argument. The second case of using a Java array is more fundamental in Clojure, so special support is needed to do something other than loop forever with infinite arguments.

So which other function types accept infinite arguments and why?

Vars simply apply their mapped functions, and thus inherit their ability to handle infinite arguments from the functions they hold. So, since fn’s with rest arguments can handle infinite arguments, then so can vars containing them:

user=> (defn yes-inf [& args] [:yes-inf (take 10 args)])
user=> (apply yes-inf (range))
[:yes-inf (0 1 2 3 4 5 6 7 8 9)]
user=> (apply #'yes-inf (range))
[:yes-inf (0 1 2 3 4 5 6 7 8 9)]

On the other hand, fn’s with fixed arguments (and vars containing them) cannot handle infinite arguments:

user=> (defn no-inf [])
user=> (apply no-inf (range))
^C
user=> (apply #'no-inf (range))
^C

Intuitively, we can do better: since fixed-arity functions in Clojure only support up to 20 parameters, we could decide to throw an error instead of walking past the 21st argument.

Collections are functions that usually accept 1 or 2 arguments, but they also loop forever when passed infinite arguments.

user=> (apply {} (range))
^C
user=> (apply [] (range))
^C

Again, intuition says we don’t need to walk past the 3rd argument, so this behavior can be improved. We can also add atomic values like symbols and keywords to this category of function.

Multimethods, like vars, are another kind of function that delegates to other functions. Unfortunately, they only support finite arguments.

Clojure 1.11.1
user=> (defmulti disp-first (fn [target & _] (class target)))
user=> (defmethod disp-first Long [target & _] (inc target))
user=> (apply disp-first 1 (range))
^C

Intuitively, if the dispatch function and the dispatch method both support infinite arguments, then so should the multimethod. This can be improved.

Let’s follow these breadcrumbs of intuition to enhance all Clojure functions with better handling of infinite arguments.

First, we noticed that vars inherit their capability for infinite args, and yet multimethods could potentially do the same. To bring them up to speed, we can use the same trick as Var, but twice: instead of the multimethod calling invoke on itself, first apply the dispatch function then apply the chosen method.

Now multimethods with rest parameters support infinite arguments!

user=> (defmulti disp-first (fn [target & _] (class target)))
user=> (defmethod disp-first Long [target & _] (inc target))
user=> (apply disp-first 1 (range))
2

But this does not handle fixed-arity multimethods, since they depend on the functions they contain for infinite argument handling:

user=> (defmulti mm (fn []))
user=> (apply mm (range))
^C

This brings us back to AFn, where fn’s applyTo implementation lives. The basic algorithm is to realize 20 arguments to determine whether we can call invoke without building the final array, otherwise pour the 21st argument and beyond into an array to call invoke. We can’t get around calling invoke, but let’s add features to AFn so then it can throw an exception rather than diverging in cases where it’s pointless to traverse possibly-infinite args.

One such case is a fn without rest arguments, which can only support up to 20 arguments. The compiler always extends AFunction in this case, so let’s hardcode the following case in AFn’s applyTo: if the function being applied is an AFunction, then throw an arity exception for 21 or more arguments. The risk: if anyone implements an AFunction that supports rest arguments, it won’t work for 21+ args. Let’s assume this never happens.

With this tweak, fixed-arg fn’s, multimethods, vars, and even vars containing multimethods now error instead of diverge:

user=> (apply identity (range))
Wrong number of args (21+) passed to: clojure.core/identity
user=> (defmulti mm identity)
user=> (apply mm (range))
Wrong number of args (21+) passed to: clojure.core/identity
user=> (apply #'mm (range))
Wrong number of args (21+) passed to: clojure.core/identity

That works for all 0-20 arg fn’s; now for everything else that extends AFn.

I settled on a new AFn method that returns this.class if the current class does not have a rest argument. If the runtime type of the collection is identical to that class, then we can short-circuit rest-argument support. If it’s not equal, then it must be a extending class that we don’t control, so continue as normal.

After implementing this method on all functional collections in Clojure, we now have the desirable semantics: error over divergence.

user=> (apply {} (range))
Wrong number of args (21+) passed to: clojure.lang.PersistentArrayMap

Libraries providing their own functional collections need to update their implementations to opt-in.

user=> (apply (proxy [clojure.lang.MapEntry] [0 1])
              (range))
^C

These ideas have been implemented in this pull request.

To summarize, we’ve made every function in Clojure (except those with rest-arguments) lazier either directly (fns, collections, keywords) or indirectly (multimethods, vars). This enables infinite arguments to error instead of diverge, unless the function itself walks its infinite arguments. Given that most functions only support fixed arguments, this enhancement has broad applicability. In the case of multimethods, we’ve also added support for legitimate uses of infinite arguments.

Interestingly, this post is essentially the opposite of my previous one: there for collections, we fixed arity exceptions to reflect the actual number of arguments passed; here we cut off the processing of arguments before all arguments can even be counted.


Thanks to Santiago Gepigon III for pairing on this and the post title.

09 Jul 2024