100 Bushels of Corn, Revisited

About the authors

John Mount is a data scientist based in San Francisco, with 20+ years of experience in machine learning, statistics, and analytics. He is the co-founder of the data science consulting firm Win-Vector LLC, and (with Nina Zumel) the co-author of Practical Data Science with R, now in its second edition.

Nina Zumel is a data scientist based in San Francisco, with 20+ years of experience in machine learning, statistics, and analytics. She is the co-founder of the data science consulting firm Win-Vector LLC, and (with John Mount) the co-author of Practical Data Science with R, now in its second edition.

Introduction

Nina Zumel presented the “100 Bushels of Corn” puzzle here as a fun example of using R as a calculator. What if we want R to solve the puzzle for us, instead of merely being a calculator?

Let’s give that a go.

Setting Up the Problem

100 bushes of corn are distributed to 100 people such that every man receives 3 bushels, every woman 2 bushels, and every child 1/2 a bushel. How many men, women, and children are there?

We can write the 100 Bushels of Corn problem as finding integer vectors x that satisfy a %*% x = b for the following a, b. The first row specifies the constraint on the total number of men, women and children; the second row specifies the constraint on how many bushels each person gets. Notice that we doubled the values of the second equation to keep everything integral.

a <- matrix(c(1, 1, 1, 6, 4, 1),
            nrow = 2,
            ncol = 3,
            byrow = TRUE)

a

     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    6    4    1

b <- as.matrix(c(100, 200))

b

     [,1]
[1,]  100
[2,]  200

There are at least two main ways to solve this:

Brute force. This is kind of the point of computers and programming languages.
Linear algebra, in particular linear algebra over the ring of integers. This approach shows some of the richness of the R package environment curated at CRAN.

Let’s take a quick look at these two solution styles.

Brute Force Solution

A brute force solution is as follows. First, we get a simple upper bound on each variable. We can do this by checking one row of a %*% x = b.

upper_bounds <- floor(b[2] / a[2, ])
upper_bounds

[1]  33  50 200

Now we try all plausible solutions.

for (x1 in 0:upper_bounds[[1]]) {
  for (x2 in 0:upper_bounds[[2]]) {
    # use a row of a to solve for x3
    x3 <- (b[1] - (a[1, 1] * x1 + a[1, 2] * x2)) / a[1, 3]
    # check constraints
    if ((x3 >= 0) && (abs(x3 %% 1) < 1e-8)) {
      x = as.matrix(c(x1, x2, x3))
      if (all(a %*% x == b)) {
        print(c(x1, x2, x3))
      }
    }
  }
}

[1]  2 30 68
[1]  5 25 70
[1]  8 20 72
[1] 11 15 74
[1] 14 10 76
[1] 17  5 78
[1] 20  0 80

And this gives us exactly the 7 solutions Nina found. This is some of the magic of having a programmable computer: one can cheaply try a lot of potential solutions without needing a lot of theory.

Ring Theory Solution

It turns out there is a systematic way to find all of the integral solutions to a linear system quickly, at least for low dimensional solution spaces. To do this we will use a ring theory or linear algebra matrix factorization called the Hermite normal form. Fortunately, R has a package for this, called numbers. We attach this package as follows.

library(numbers)

This package will find for us a lower diagonal integer matrix h and a square unimodular matrix u such that h = a %*% u. Unimodular matrices map the space of integer vectors Zⁿ to the same space of integer vectors in a 1 to 1, onto, and invertible manner. This means finding integer solution vectors to a %*% x = b is equivalent to the problem of finding integer solutions to h %*% y = b, where x = u %*% y. This second problem is easier, as h is lower diagonal- so solving this system is just a matter of “back filling”.

Let’s first find h, u.

hnf <- hermiteNF(a)
h <- hnf$H
u <- hnf$U
stopifnot(all(h == a %*% u))

# lower triangular transform of a
h

     [,1] [,2] [,3]
[1,]    1    0    0
[2,]    0    1    0

# unimodular transform
u

     [,1] [,2] [,3]
[1,]    7   -1   -3
[2,]  -12    2    5
[3,]    6   -1   -2

We now have a %*% u = h. a %*% x = b implies h %*% y = b where x = u %*% y. Let’s solve for a specific solution xs by back substitution.

# back substitute to solve h %*% y = b
# this uses the fact that h is lower-triangular
h_rank <- sum(diag(h) != 0)
stopifnot(h_rank > 1)
y <- numeric(ncol(a))
y[1] <- b[1] / h[1, 1]
for (i in 2:h_rank) {
  y[i] <- (b[i] - sum(h[i, 1:(i - 1)] * y[1:(i - 1)])) / h[i, i]
}
stopifnot(all(b == h %*% y))
xs <- u %*% y
stopifnot(all(a %*% xs == b))
xs

     [,1]
[1,]  500
[2,] -800
[3,]  400

It is a standard result of linear algebra that all solutions of a %*% x = b are of the form x = xs + z where a %*% z = 0 (that is, z is in the null space of h and a). In our case, the null space is spanned by the last column of u.

Let’s show this column.

null_basis <- u[, (h_rank + 1):ncol(u), drop = FALSE]

null_basis

     [,1]
[1,]   -3
[2,]    5
[3,]   -2

So in our case: all integer solutions of a %*% x = b are of the form [500, -800, 400] + k * [-3, 5, -2] for integer k.

Now we just need to pick k to make everything non-negative (an implicit puzzle condition!). The sign changes of entries of [500, -800, 400] + k * [-3, 5, -2] happen for k where one of the coordinates is equal to zero. These are:

500 - 3 * k == 0
-800 + 5 * k == 0, and
400 - 2 * k == 0.

So the k of interest are in the range ceiling(max(0, 800/5)) <= k <= floor(min(500/3, 400/2)), or 160 <= k <= 166. This gives us:

for (k in 160:166) {
  soln <- xs + k * null_basis
  print(c(soln[1, 1], soln[2, 1], soln[3, 1]))
}

[1] 20  0 80
[1] 17  5 78
[1] 14 10 76
[1] 11 15 74
[1]  8 20 72
[1]  5 25 70
[1]  2 30 68

And these are again exactly the solutions Nina Zumel gave in the first 100 Bushels post.

Conclusion

R gives the ability to exploit any combination of innate ability and knowledge, borrowed ability, or brute force in solving problems.