We are, as we say, “creatures of habit.” We do something, and then we do it again the same way, and the next time, and the time after that. Every time is a little different; some corners are cut, unnecessary stuff gets left out, but this is what we do. We learn how to do something and then we do it that way. Sometimes habits get passed down through generations – we still put our food on dishes made of baked mud. We don't eat off the same dishes, but we make ours in the same way as ever; the method persists.
I am going to describe an algorithm that is so intuitive and simple that it is almost embarrassing to write about it. The modeled process could be called an “exercise in futility.” It is not a useful algorithm (though we use it all the time); it only summarizes or instantiates a particular phenomenon that we do not seem to have a regular name for, or a way to recognize when it happens in everyday life. So this description will hopefully serve as a reference point. When you see this happening in your workplace, in your family and social life, in your community, in the news or in science or the arts, you can point to it and say “This is an example of recursive resampling.”
First the algorithm. It is customary and useful to present these things in pseudocode. So here it is:
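In outline (a minimal sketch, reconstructed from the description that follows):

```
X ← an array of L items
repeat:
    for i from 1 to L:
        NewX[i] ← a randomly chosen element of X    (sampling with replacement)
    X ← NewX
```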
Sampling with replacement means the same item can be picked multiple times. Fill NewX with values sampled from X, assigning them to arbitrary positions in the new array. So if X was [car, apple, box, air, radio], then NewX could be, for example, [air, box, air, car, apple]. It doesn’t matter what the items are; no operations are performed on them beyond selection. Once the array NewX has been filled, it becomes X for the next iteration. In this example, note that “air” has been selected twice, and “radio” was not selected at all. That means, probabilistically, that “air” is twice as likely as each of the other survivors to be selected in the next iteration, and “radio” will not appear in any future samples-of-samples.
As a programmer, I would have the random number generator produce L random integers R_i from 1 to L, where L is the number of items in the array – its length – and then fill NewX with the R_i-th elements of X. If the first random number is 4, I fill the first slot of NewX with whatever is in slot 4 of X. If the second random number is 3, I fill the second slot with the third element of X, and so on, until the new array is full. This is not the only way to program it.
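That scheme can be sketched in Python (the demonstration described later in this piece used SuperCollider; this version is only an illustrative re-implementation, with a fixed seed added for reproducibility):

```python
import random

random.seed(1)  # fixed seed so the run is reproducible; not part of the algorithm

def resample(x):
    """One iteration: draw len(x) items from x, with replacement."""
    return [random.choice(x) for _ in range(len(x))]

x = ["car", "apple", "box", "air", "radio"]
for _ in range(200):
    x = resample(x)

print(x)  # almost certainly five copies of a single surviving value
```

Two hundred iterations on a five-element population is far more than enough for convergence; in practice a handful of iterations usually suffices at this size.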
Here is the algorithm in non-computerish language: we take a population of things that exist (which can be methods), sample items from it, and generate a new population of the same size from the sampled values. Then we do the same thing again, sampling from the new population, iteratively.
I’ll spoil it for you: diversity will be lost. Most of the time, the entire array comes to be filled with a single value. The example above could end up as [air, air, air, air, air]. Values that appeared more often in the initial population are more likely to survive, but that’s not a certainty. Rare values in the starting data set will likely disappear quickly, though again this is not certain. There is a nonzero probability that when the population finally converges it will be filled with an initially-rare value. But a betting person would come out ahead overall by choosing a popular value from the initial population.
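The betting claim can be checked numerically. In the following sketch (the populations and counts here are hypothetical, not from the demonstration described later), a value holding 3 of 4 initial slots should end up dominating roughly 3/4 of the time:

```python
import random

def run_to_fixation(x, max_iters=10_000):
    """Resample with replacement until one value fills the array, then return it."""
    for _ in range(max_iters):
        if len(set(x)) == 1:
            break
        x = [random.choice(x) for _ in range(len(x))]
    return x[0]

random.seed(0)  # fixed seed for reproducibility
trials = 2000
# "a" starts with 3 of the 4 slots, so it should win about 75% of the time
wins = sum(run_to_fixation(["a", "a", "a", "b"]) == "a" for _ in range(trials))
print(wins / trials)  # close to 0.75
```

This is the familiar result from population genetics: under neutral drift, a value’s chance of taking over equals its initial share of the population.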
This is the template for habit or following a routine. In many things, we do what we did the last time, learning from our own behavior and the observed behaviors of those around us. My original inspiration for this was the vision of chatbots sampling language from an Internet that is largely filled with the output of chatbots. But this happens everywhere. Bands in my town play the songs that they hear other bands playing, and new bands play what they hear those bands playing. When my office had to produce a new report we did it the same way we did the last report, which was based on the report before that, and so on – office assistants printed paper copies of the draft report and carried them around to every manager in the organization like they have done since the 1930s. Habits are a type of self-mimicking behavior, in which we imitate our own earlier activities, sampling from a previous event to create the current one. We do this as individuals and as groups.
Of course real-life examples are not as crisp and pure as a computer program. Innovation does flow into a real system from outside, or emerges intrinsically, usually as a result of mistakes and miscommunication, and so this process does not inevitably proceed directly to uniformity. But you will recognize the algorithm everywhere.
As a demonstration, I wrote a program in the SuperCollider language, which has a good library of random number generators and sampling techniques. I used an initial population of integers, zero to nine, where each integer occurred one time more than its value. That is, 2 occurred three times, 6 appeared seven times. (The extra one avoids the ineffectiveness of having zero appear zero times.) The population had 55 elements in it – 1 + 2 + 3 + … + 10 = 55. Part of the question was how initial frequencies affected the final dominant value, and this seemed like the easiest way to model a diverse situation without requiring a Stroop-test-like transformation to decipher results. Bigger values have higher initial frequencies. Easy.
Here is a table of the results of running the algorithm fifty times. For each trial, a data set was filled with integers as described above, each value i occurring i+1 times in the array, and the resampling algorithm ran for 200 iterations. The table shows totals, over the 50 trials, of how many copies of each value survived in the final population.
Total frequencies of occurrence of values after 50 trials of 200 iterations.
| Value | Init. freq. | Final freq. |
|-------|-------------|-------------|
| 0 | 1 | 0 |
| 1 | 2 | 13 |
| 2 | 3 | 55 |
| 3 | 4 | 96 |
| 4 | 5 | 292 |
| 5 | 6 | 220 |
| 6 | 7 | 562 |
| 7 | 8 | 399 |
| 8 | 9 | 275 |
| 9 | 10 | 838 |
Four of the trials did not converge to a single dominant value within the number of iterations allotted. These were split between pairs, namely: 3/7, 6/9, 1/6, and 4/9.
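For comparison, the experiment can be re-run in Python (the original was written in SuperCollider; the population construction, 50 trials, and 200 iterations follow the text, but the random draws – and hence the exact totals – will differ from the table above):

```python
import random
from collections import Counter

random.seed(7)  # fixed seed for reproducibility

def make_population():
    # value i occurs i + 1 times: 0 once, 1 twice, ..., 9 ten times (55 items)
    return [i for i in range(10) for _ in range(i + 1)]

totals = Counter()
for _ in range(50):           # 50 trials
    x = make_population()
    for _ in range(200):      # 200 resampling iterations per trial
        x = [random.choice(x) for _ in range(len(x))]
    totals.update(x)          # tally the surviving values

print(dict(totals))           # final frequencies; they sum to 50 * 55 = 2750
```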
As seen, initial proportion corresponded strongly with the probability of ultimate dominance, though with plenty of run-to-run noise (4 outscored 5, and 6 outscored both 7 and 8).
This means that an attitude or approach that is initially rare is likely to die out, and an attitude that is common may become more prevalent. The “rich get richer” phenomenon turns up a lot in complex systems, and is not surprising here. In fact, there is nothing surprising here, or new. This is an obvious situation. To destroy innovation and drive a system to exhaustion, just keep doing things the way we did them last time.
As expected, the values that appeared most frequently in the initial population tended to replace all the “minority” values by the end of a trial. The “split dominance” outcomes occurred when a lower-probability value had not yet been extinguished; 1, 3, 4, or 6 might still have increased their counts had the non-converged trials run for more iterations. The value selected by the recursive resampling process is stochastically predictable but not strictly known ahead of time; the loss of diversity is strictly known.
Habits offer the advantage that we don’t have to think about something every time we do it. We don’t have to “re-invent the wheel,” because we already have the habit of using the wheel and now we can just take a wheel as a known, frozen concept and use it in our current project. This is efficient and frees our creative imagination for other things, or for nothing, even if the wheel was not the best solution to our problem. That’s just the way we do it: we use the wheel.
Habits offer the disadvantage that they reject innovative problem solutions from the get-go. “That’s just the way we do it” is the workplace alternative to visionary problem-solving and to developing a process that specifically fits the current situation. Everybody knows this; the present algorithm simply shows how it happens. Work that used to be written out by hand was done exactly the same way after typewriters were invented, except typed, and exactly the same way as that when computers came along and substituted for typewriters. Meanwhile, the unimaginably powerful universal Turing machine that can simulate any conceivable system sits on our desktops, simulating a typewriter that simulated a pencil.
When we reuse a process we do not replicate every detail of it, and inevitably some initial features will be omitted from the repeated instance, to be forgotten and lost. The implementation of the process narrows (“becomes more efficient”), and procedures have to be trained and enforced once they no longer make sense. Deviations have to be punished. “That’s just the way we do it.” Habit, though it reduces conflict and cognitive effort, is the biggest barrier to innovation. When we resample recursively from past samples, the outcome is eventually going to be rigid, empty formality, meaninglessness, and collapse. You can tell your boss that.