Food for Thought III -- Solving

Now that we have a mechanism for building menus, testing whether they’re incompliance, and scoring them, we’ll get to the actual solving of the menus.

Getting a Solution

We’ve got a fixed set of constraints and an objective function: to minimize cost.

Given these conditions, it makes sense to use a simple linear programming algorithm. The implementation we use for solving is the GNU linear program solver which has an R interface via the Rglpk package.

You’ll need to install the GLPK outside of an R session. I’ve found the easiest way is a simple brew install glpk but you can also wget http://ftp.gnu.org/gnu/glpk/glpk-4.54.tar.gz, unzip the tarball, and configure and make it from there.

The Rglpk_solve_LP() function is going to do the work for us. What solve_it() below will do is grab the elements of a menu that we need for this function, pass them to the solver in the format it needs, and return a solution that is a list of a few things we’re intersted in: the cost of our final menu, the original menu, and the multiplier on each food’s portion size.

Kind of a lot going on in solve_it(), which I’ll walk through below. If you’re only interested in what we get out of it, feel free to skip this section 😁.

Into the bowels of solve_it()

What we first have to do is get the raw values of every nutrient, if our nutrients are in per 100g form. (If they’re already in raw form, we’re all set.) We know they’re in raw form already if the df_is_per_100g flag is FALSE. Whichever form we get our menu data in, we’ll transform it to the other form in order to return that in our list at the end.

Rglpk_solve_LP() needs something to optimize for, which for us will be df[["cost"]]. We’ll tell it we want to minimize that.

Next we need to set up a series of constraint inequalities. On the left hand side of each inequality will be the raw values of each nutrient we’ve got in our menu. That will be followed by a directionality, either ">" if the value of that nutrient is a positive or a "<" if it is a must restrict. Last we’ll supply the upper or lower bound for that particular nutrient, which we supply in bounds. If we’re thinking about Riboflavin in our menu and we’ve got n items in our menu each with some amount of Riboflavin, that would look like:

\(\sum_{i=1}^{n} OurRawRiboflavin_{i} > MinRequiredDailyRiboflavin\)

Now to construct the constraint matrix which I’m cleverly calling constraint_matrix for all the nutritional constraints that need to be met. We’ll make this by essentially transposing our menu’s nutrient columns; whereas in a typical menu dataframe we have one row per food and one column per nutrient, we’ll turn this into a matrix with one row per constraint and one column per food. (In practice we do this by taking our vector of nutrient constraint values, and, in the matrix call of construct_matrix(), creating byrow = TRUE matrix from them.) We can print out the constraint matrix by turning v_v_verbose on.

Cool, so now we can read a given row in this matrix pertaining to a certain nutrient left to right as adding up the value of that nutrient contained in all of our menu foods. That gives us the sum total of that nutrient in the menu.

What we’d need next if we keep reading from left to right is the directionality which the solver accepts as a vector of ">"s and "<"s. Farthest to the right, we’ll need the min or max value of that nutrient, which we’ll supply in the vector rhs for “right hand side” of the equation. We get rhs from the user-supplied nut_df, or dataframe of nutrients and their daily upper or lower bounds.

We’ll specify a minimum serving size per food by creating bounds from min_food_amount and max_food_amount. This acts as the other half of the constraint on the solver; not only do we need a certain amount of each nutrient, we also need a certain amount of each food.

Finally, we can specify that we want only full serving sizes by setting only_full_servings to TRUE. If we do that, we’ll tell the solver that the types must be integer, "I" rather than continuous, "C".

If we turn verbose on we’ll know whether a solution could be found or not, and what the cost of the solved menu is.

Return

The native return value of the call to Rglpk_solve_LP() is a list including: vector of solution amounts, the cost, and the status (0 for solved, 1 for not solvable). solve_it() will take that list and append a few more things to it, so we can check that it’s working correctly and pipe it into other things to distill menus and nutritional information out . We’ll append the nutritional constraints we supplied in nut_df, the constraint matrix we constructed, and the nutrient values in our original menu in both raw and per 100g form.

solve_it <- function(df, nut_df = nutrient_df, df_is_per_100g = TRUE, only_full_servings = FALSE, 
                     min_food_amount = 1, max_food_amount = 100, 
                     verbose = TRUE, v_v_verbose = FALSE, maximize = FALSE) {
  
  # If our nutrient values are per 100g (i.e., straight from menu_builder)
  if (df_is_per_100g == TRUE) {
    df_per_100g <- df        # Save our original df in df_per_100g
    df <- get_raw_vals(df)   # Get the raw values
  } else {
    df_per_100g <- get_per_g_vals(df)
    df <- df
  }
  
  n_foods <- length(df$shorter_desc)
  nut_quo <- quo(nut_df$nutrient)
  
  dir_mr <- rep("<", nut_df %>% filter(is_must_restrict == TRUE) %>% ungroup() %>% count() %>% as_vector())       # And less than on all the must_restricts
  dir_pos <- rep(">", nut_df %>% filter(is_must_restrict == FALSE) %>% ungroup() %>% count() %>% as_vector())     # Final menu must be greater than on all the positives
  
  dir <- c(dir_mr, dir_pos)
  rhs <- nut_df[["value"]]      # The right-hand side of the equation is all of the min or max nutrient values
  obj_fn <- df[["cost"]]             # Objective function will be to minimize total cost
  
  bounds <- list(lower = list(ind = seq(n_foods), 
                              val = rep(min_food_amount, n_foods)),
                 upper = list(ind = seq(n_foods), 
                              val = rep(max_food_amount, n_foods)))
  
  construct_matrix <- function(df, nut_df) {       # Set up matrix constraints
    mat_base <- df %>% select(!!nut_quo) %>% as_vector()    # Get a vector of all our nutrients
    mat <- matrix(mat_base, nrow = nrow(nut_df), byrow = TRUE)       # One row per constraint, one column per food (variable)
    return(mat)
  }
  
  const_mat_names <- str_c(df$shorter_desc,  # Use combo of shorter_desc and NDB_No
        df$NDB_No, sep = ", ")  # so that names are interpretable but also unique
  
  mat <- construct_matrix(df, nut_df)
  constraint_matrix <- mat %>% dplyr::as_data_frame() 
  names(constraint_matrix) <- const_mat_names
  
  constraint_matrix <- constraint_matrix %>% 
    mutate(
      dir = dir,
      rhs = rhs
    ) %>% left_join(nut_df, by = c("rhs" = "value")) %>% 
    select(nutrient, everything())
  
  if(only_full_servings == TRUE) {    # Integer values of coefficients if only full servings
    types <- rep("I", n_foods)
  } else {
    types <- rep("C", n_foods)
  }
  
  if(v_v_verbose == TRUE) {
    v_v_verbose <- TRUE
    message("Constraint matrix below:")
    print(constraint_matrix)
  } else {
    v_v_verbose <- FALSE
  }
  
  out <- Rglpk_solve_LP(obj_fn, mat, dir, rhs,                    # Do the solving; we get a list back
                        bounds = bounds, types = types, 
                        max = maximize, verbose = v_v_verbose)   
  
  out <- append(append(append(                                           # Append the dataframe of all min/max nutrient values
    out, list(necessary_nutrients = nut_df)),
    list(constraint_matrix = constraint_matrix)),                        # our constraint matrix
    list(original_menu_raw = df))                                            # and our original menu
  
  if (!is.null(df_per_100g)) {
    out <- append(out, list(original_menu_per_g = df_per_100g))
  }
  
  if (verbose == TRUE) {
    message(paste0("Cost is $", round(out$optimum, digits = 2), ".")) 
    if (out$status == 0) {
      message("Optimal solution found :)")
    } else {
      message("No optimal solution found :'(")
    }
  }
  
  return(out)
}

Let’s build a random menu,

set.seed(314159)

our_random_menu <- build_menu()

and solve it.

(our_menu_solution <- our_random_menu %>% solve_it())
## Cost is $125.24.
## No optimal solution found :'(
## $optimum
## [1] 125.2432
## 
## $solution
##  [1]  1.00000  1.00000  1.00000  1.00000  1.00000 10.58151  1.00000
##  [8]  1.00000  1.00000  1.00000  1.00000  1.00000  1.00000  1.00000
## [15]  1.00000  1.00000  1.00000  1.00000  1.00000  1.00000
## 
## $status
## [1] 1
## 
## $solution_dual
##  [1]  8.1307503  4.0165434  3.9712254  3.6987557  2.7325283  0.0000000
##  [7]  2.1841592  5.2410793  5.9448688  7.1626185  8.9845478  2.9833969
## [13]  6.4718038  0.3418326  3.8686881 -0.9211807  5.3783274  7.0898251
## [19]  5.1757662  8.2020831
## 
## $auxiliary
## $auxiliary$primal
##  [1]  103.781315 2400.000000  415.251500   34.149306  170.137486
##  [6] 1244.599769   18.382433  869.344103 1990.594500 4306.013885
## [11]   22.277383    1.817886    2.133465  120.428909   79.115000
## [16]    2.505162    5.845449  310.238894   49.908853   62.940138
## [21] 3436.806179
## 
## $auxiliary$dual
##  [1] 0.00000000 0.01441065 0.00000000 0.00000000 0.00000000 0.00000000
##  [7] 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
## [13] 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
## [19] 0.00000000 0.00000000 0.00000000
## 
## 
## $necessary_nutrients
## # A tibble: 21 x 3
##    nutrient      value is_must_restrict
##    <chr>         <dbl> <lgl>           
##  1 Lipid_Tot_g     65. TRUE            
##  2 Sodium_mg     2400. TRUE            
##  3 Cholestrl_mg   300. TRUE            
##  4 FA_Sat_g        20. TRUE            
##  5 Protein_g       56. FALSE           
##  6 Calcium_mg    1000. FALSE           
##  7 Iron_mg         18. FALSE           
##  8 Magnesium_mg   400. FALSE           
##  9 Phosphorus_mg 1000. FALSE           
## 10 Potassium_mg  3500. FALSE           
## # ... with 11 more rows
## 
## $constraint_matrix
## # A tibble: 45 x 24
##    nutrient      `MILK & CRL BAR, 08510` `LAMB, 17063` `OLIVES, 09194`
##    <chr>                           <dbl>         <dbl>           <dbl>
##  1 Lipid_Tot_g                      2.74         18.9            1.03 
##  2 Sodium_mg                       79.8          39.1          110.   
##  3 Cholestrl_mg                     1.50         92.6            0.   
##  4 FA_Sat_g                         2.28          9.39           0.136
##  5 Niacin_mg                        2.28          9.39           0.136
##  6 Protein_g                        1.62         20.8            0.146
##  7 Calcium_mg                     102.           14.4           14.1  
##  8 Phosphorus_mg                  102.           14.4           14.1  
##  9 Iron_mg                          1.50          1.78           0.498
## 10 Magnesium_mg                     5.25         16.2            0.600
## # ... with 35 more rows, and 20 more variables: `POTATOES, 11841` <dbl>,
## #   `BEEF, 23412` <dbl>, `BEVERAGES, 14154` <dbl>, `SOYMILK, 16139` <dbl>,
## #   `PORK, 10180` <dbl>, `ONIONS, 11284` <dbl>, `BEEF, 13811` <dbl>, `MALT
## #   BEV, 14305` <dbl>, `SQUASH, 11467` <dbl>, `BF, 23050` <dbl>,
## #   `PUDDINGS, 19234` <dbl>, `AVOCADOS, 09037` <dbl>, `PUDDINGS,
## #   19323` <dbl>, `TURMERIC, 02043` <dbl>, `BABYFD, 03994` <dbl>, `MILK,
## #   01059` <dbl>, `VEAL, 17425` <dbl>, dir <chr>, rhs <dbl>,
## #   is_must_restrict <lgl>
## 
## $original_menu_raw
## # A tibble: 20 x 30
##    shorter_desc   GmWt_1 serving_gmwt  cost Lipid_Tot_g Sodium_mg
##    <chr>           <dbl>        <dbl> <dbl>       <dbl>     <dbl>
##  1 MILK & CRL BAR  25.0         25.0   9.28      2.74      79.8  
##  2 LAMB            85.0         85.0   4.58     18.9       39.1  
##  3 OLIVES          15.0         15.0   5.56      1.03     110.   
##  4 POTATOES        74.0         74.0   4.04      3.86      23.7  
##  5 BEEF            85.0         85.0   3.10      8.76      25.5  
##  6 BEVERAGES      258.         258.    1.45      0.       101.   
##  7 SOYMILK        243.         243.    3.83      3.57     114.   
##  8 PORK            85.0         85.0   6.27     13.4       71.4  
##  9 ONIONS           5.00         5.00  5.96      0.0230     1.05 
## 10 BEEF            28.4         28.4   7.42      5.26      17.9  
## 11 MALT BEV        29.6         29.6   9.04      0.0355     3.85 
## 12 SQUASH         127.         127.    3.02      0.343      2.54 
## 13 BF              85.0         85.0   7.17      7.06      48.4  
## 14 PUDDINGS       112.         112.    3.36      0.392    209.   
## 15 AVOCADOS       150.         150.    4.02     22.0       10.5  
## 16 PUDDINGS       147.         147.    4.29      5.14     362.   
## 17 TURMERIC         3.00         3.00  5.39      0.0975     0.810
## 18 BABYFD         140.         140.    7.11      0.518      1.40 
## 19 MILK           244.         244.    7.18      8.44     139.   
## 20 VEAL            85.0         85.0   9.28      2.24      74.8  
## # ... with 24 more variables: Cholestrl_mg <dbl>, FA_Sat_g <dbl>,
## #   Protein_g <dbl>, Calcium_mg <dbl>, Iron_mg <dbl>, Magnesium_mg <dbl>,
## #   Phosphorus_mg <dbl>, Potassium_mg <dbl>, Zinc_mg <dbl>,
## #   Copper_mg <dbl>, Manganese_mg <dbl>, Selenium_µg <dbl>,
## #   Vit_C_mg <dbl>, Thiamin_mg <dbl>, Riboflavin_mg <dbl>,
## #   Niacin_mg <dbl>, Panto_Acid_mg <dbl>, Vit_B6_mg <dbl>,
## #   Energ_Kcal <dbl>, solution_amounts <dbl>, Shrt_Desc <chr>,
## #   NDB_No <chr>, score <dbl>, scaled_score <dbl>
## 
## $original_menu_per_g
## # A tibble: 20 x 30
##    shorter_desc   solution_amounts GmWt_1 serving_gmwt  cost Lipid_Tot_g
##    <chr>                     <dbl>  <dbl>        <dbl> <dbl>       <dbl>
##  1 MILK & CRL BAR               1.  25.0         25.0   9.28      11.0  
##  2 LAMB                         1.  85.0         85.0   4.58      22.3  
##  3 OLIVES                       1.  15.0         15.0   5.56       6.87 
##  4 POTATOES                     1.  74.0         74.0   4.04       5.22 
##  5 BEEF                         1.  85.0         85.0   3.10      10.3  
##  6 BEVERAGES                    1. 258.         258.    1.45       0.   
##  7 SOYMILK                      1. 243.         243.    3.83       1.47 
##  8 PORK                         1.  85.0         85.0   6.27      15.7  
##  9 ONIONS                       1.   5.00         5.00  5.96       0.460
## 10 BEEF                         1.  28.4         28.4   7.42      18.6  
## 11 MALT BEV                     1.  29.6         29.6   9.04       0.120
## 12 SQUASH                       1. 127.         127.    3.02       0.270
## 13 BF                           1.  85.0         85.0   7.17       8.30 
## 14 PUDDINGS                     1. 112.         112.    3.36       0.350
## 15 AVOCADOS                     1. 150.         150.    4.02      14.7  
## 16 PUDDINGS                     1. 147.         147.    4.29       3.50 
## 17 TURMERIC                     1.   3.00         3.00  5.39       3.25 
## 18 BABYFD                       1. 140.         140.    7.11       0.370
## 19 MILK                         1. 244.         244.    7.18       3.46 
## 20 VEAL                         1.  85.0         85.0   9.28       2.63 
## # ... with 24 more variables: Sodium_mg <dbl>, Cholestrl_mg <dbl>,
## #   FA_Sat_g <dbl>, Protein_g <dbl>, Calcium_mg <dbl>, Iron_mg <dbl>,
## #   Magnesium_mg <dbl>, Phosphorus_mg <dbl>, Potassium_mg <dbl>,
## #   Zinc_mg <dbl>, Copper_mg <dbl>, Manganese_mg <dbl>, Selenium_µg <dbl>,
## #   Vit_C_mg <dbl>, Thiamin_mg <dbl>, Riboflavin_mg <dbl>,
## #   Niacin_mg <dbl>, Panto_Acid_mg <dbl>, Vit_B6_mg <dbl>,
## #   Energ_Kcal <dbl>, Shrt_Desc <chr>, NDB_No <chr>, score <dbl>,
## #   scaled_score <dbl>

How long did that take?

system.time(our_random_menu %>% solve_it(verbose = FALSE))
##    user  system elapsed 
##   0.014   0.000   0.014

Not long. Thanks for being written in C, GLPK!

Solve menu

Okay so our output of solve_it() is an informative but long list. It has all the building blocks we need to create a solved menu; now we just need to extract those parts and glue them together in the right ways. Here’s where solve_menu() comes in.

solve_menu() takes one main argument: the result of a call to solve_it(). Since we’ve written the return value of solve_it() to contain the original menu and a vector of solution amounts – that is, the amount we’re multiplying each portion size by in order to arrive at our solution – we can combine these to get our solved menu.

We also return a message, if verbose is TRUE, telling us which food we’ve got the most servings of, as this might be something we’d want to decrease. (Now that I’m thining about it, maybe a more helpful message would take a threshold portion size and only alert us if we’ve exceeded that threshold.)

solve_menu <- function(sol, verbose = TRUE) {
  
  solved_col <-  tibble(solution_amounts = sol$solution)    # Grab the vector of solution amounts
  
  if (! "solution_amounts" %in% names(sol$original_menu_per_g)) {   # If we don't yet have a solution amounts column add it
    df_solved <- sol$original_menu_per_g %>% 
      bind_cols(solved_col)            # cbind that to the original menu
  } else {
    df_solved <- sol$original_menu_per_g %>% 
      mutate(
        solution_amounts = solved_col %>% as_vector()    # If we've already got a solution amounts column, replace the old one with the new
      ) 
  }
  
  df_solved <- df_solved %>% 
    mutate(
      GmWt_1 = GmWt_1 * solution_amounts,
      cost = cost * solution_amounts
    ) %>% 
    select(shorter_desc, solution_amounts, GmWt_1, serving_gmwt, everything()) 
  
  max_food <- df_solved %>%                                   # Find what the most of any one food we've got is
    filter(solution_amounts == max(df_solved$solution_amounts)) %>% 
    slice(1:1)                                           # If we've got multiple maxes, take only the first
  
  if (verbose == TRUE) {
    message(paste0("We've got a lot of ", max_food$shorter_desc %>% as_vector()), ". ",
            max_food$solution_amounts %>% round(digits = 2), " servings of ",
            max_food$shorter_desc %>% as_vector() %>% is_plural(return_bool = FALSE), ".")  
  }
  
  return(df_solved)
}

Let’s see what our tidied output looks like.

our_solved_menu <- our_menu_solution %>% solve_menu()
## We've got a lot of YOGURT. 2.04 servings of it.
our_solved_menu %>% kable(format = "html")
shorter_desc solution_amounts GmWt_1 serving_gmwt cost Lipid_Tot_g Sodium_mg Cholestrl_mg FA_Sat_g Protein_g Calcium_mg Iron_mg Magnesium_mg Phosphorus_mg Potassium_mg Zinc_mg Copper_mg Manganese_mg Selenium_µg Vit_C_mg Thiamin_mg Riboflavin_mg Niacin_mg Panto_Acid_mg Vit_B6_mg Energ_Kcal Shrt_Desc NDB_No score scaled_score
PUMPKIN&SQUASH SD KRNLS 1.000000 118.0000 118.00 3.630000 49.05 256 0 8.544 29.84 52 8.07 550 1174 788 7.64 1.275 4.490 9.4 1.8 0.070 0.150 4.430 0.570 0.100 574 PUMPKIN&SQUASH SD KRNLS,RSTD,W/SALT 12516 -1276.0938 3.6447775
USDA COMMODITY 1.000000 28.0000 28.00 2.180000 18.56 507 73 4.968 14.95 9 0.99 17 162 231 2.39 0.047 0.018 21.7 0.0 0.688 0.223 2.619 0.618 0.234 231 USDA COMMODITY,PORK SAUSAGE,BULK/LINKS/PATTIES,FRZ,RAW 07907 -3413.2143 -0.4408219
YOGURT 2.042026 347.1444 170.00 8.392727 1.25 66 5 0.752 4.93 171 0.07 16 135 219 0.83 0.013 0.004 4.9 0.8 0.042 0.201 0.107 0.552 0.045 86 YOGURT,VANILLA FLAVOR,LOWFAT MILK,SWTND W/ LO CAL SWTNR 01231 -2557.1636 1.1957167
BAGEL 1.661217 164.4604 99.00 6.428908 1.32 422 0 0.360 10.56 219 3.57 29 99 107 0.83 0.128 0.539 21.5 0.0 0.568 0.344 4.515 0.407 0.070 264 BAGEL,PLN,TSTD, ENR W/CA PROP(INCLUDE ONION,POPPY,SESAME) 18001 -3301.3825 -0.2270297
EGGNOG 1.000000 254.0000 254.00 8.030000 4.19 54 59 2.591 4.55 130 0.20 19 109 165 0.46 0.013 0.005 4.2 1.5 0.034 0.190 0.105 0.417 0.050 88 EGGNOG 01057 -2574.0448 1.1634444
PIE 1.000000 28.3500 28.35 4.990000 9.30 225 42 3.171 4.50 94 1.27 19 98 186 0.46 0.066 0.198 7.1 1.7 0.092 0.201 0.782 0.445 0.047 204 PIE,PUMPKIN,PREP FROM RECIPE 18327 -3335.9004 -0.2930186
ALMONDS 1.000000 157.0000 157.00 4.850000 55.17 1 0 4.208 21.23 291 3.68 274 466 699 3.07 0.955 2.460 4.1 0.0 0.092 0.781 3.665 0.229 0.118 607 ALMONDS,OIL RSTD,WO/SALT 12065 -721.3391 4.7053194

Solve nutrients

We’ll want to do something with nutrients that’s analagous to what we’re doing in solve_menu(). This function will let us find what the raw nutrient amounts in our solved menu are, and let us know which nutrient we’ve overshot the lower bound on the most. Like solve_menu(), a result from solve_it() can be piped nicely in here.

One part of the solution returned by the solver is a vector of the values of the constraints – that is, our nutrients – at solution. That lives in $auxiliary$primal and becomes our solved_nutrient_value in the function below.

Recall also that we took nut_df, the dataframe of nutritional requirements handed to us by the user, and appended it to the solution so that it’s also returned as a result of our call to solve_it(). This means the outcome of solve_it() will let us compare the required_value for each nutrient to its solved_nutrient_value. We calculate the ratio of these two for every nutrient, and if verbose is TRUE, let the user know which nutrient they’ve overshot the daily minimum on the most.

solve_nutrients <- function(sol, verbose = TRUE) {
  
  solved_nutrient_value <- list(solution_nutrient_value =       # Grab the vector of nutrient values in the solution
                              sol$auxiliary$primal) %>% as_tibble()
  
  nut_df_small_solved <- sol$necessary_nutrients %>%       # cbind it to the nutrient requirements
    bind_cols(solved_nutrient_value)  %>% 
    rename(
      required_value = value
    ) %>% 
    select(nutrient, is_must_restrict, required_value, solution_nutrient_value)
  
  ratios <- nut_df_small_solved %>%                # Find the solution:required ratios for each nutrient
    mutate(
      ratio = solution_nutrient_value/required_value
    )
  
  max_pos_overshot <- ratios %>%             # Find where we've overshot our positives the most
    filter(is_must_restrict == FALSE) %>% 
    filter(ratio == max(.$ratio))
  
  if (verbose == TRUE) {
    message(paste0("We've overshot the most on ", max_pos_overshot$nutrient %>% as_vector()), 
            ". It's ", 
        max_pos_overshot$ratio %>% round(digits = 2), " times what is needed.")
  }
  
  return(nut_df_small_solved)
}

Remember that we saved the result of solve_it() in our_menu_solution. Let’s see what those ratios look like in our solution.

our_menu_solution %>% solve_nutrients() %>% kable(format = "html")
## We've overshot the most on Manganese_mg. It's 5.07 times what is needed.
nutrient is_must_restrict required_value solution_nutrient_value
Lipid_Tot_g TRUE 65 169.4820333
Sodium_mg TRUE 2400 1569.6958522
Cholestrl_mg TRUE 300 199.5642224
FA_Sat_g TRUE 20 28.7622223
Protein_g FALSE 56 120.0422927
Calcium_mg FALSE 1000 1831.3843476
Iron_mg FALSE 18 22.5596835
Magnesium_mg FALSE 400 1240.8231364
Phosphorus_mg FALSE 1000 3098.4038299
Potassium_mg FALSE 3500 3500.0000000
Zinc_mg FALSE 15 20.0494305
Copper_mg FALSE 2 3.3243791
Manganese_mg FALSE 2 10.1346005
Selenium_µg FALSE 70 88.6549205
Vit_C_mg FALSE 60 9.1931056
Thiamin_mg FALSE 2 1.6120579
Riboflavin_mg FALSE 2 3.2686977
Niacin_mg FALSE 20 20.0000000
Panto_Acid_mg FALSE 10 4.9760988
Vit_B6_mg FALSE 2 0.7804418
Energ_Kcal FALSE 2300 2709.0637613

Swapping

Our menus often aren’t solvable. That is, at the minimum portion size we set, there’s no way we can change portion sizes in such a way that we stay under all the maximum values for each must restrict and under the minimums for all positive nutrients as well as have enough calories.

In these cases, we’ll need to change up our lineup.

Single Swap

I only use single swapping for the cases where we’re above the max threshold on must restricts, but you could imagine implementing the same funcitons to deal with positives.

The idea with a single swap is to see which must restricts are not satisfied, find the food that is the max_offender on that must restrict (i.e., contributes the most in absolute terms to the value of the must restrict) and then swap it out. We try to replace_food_w_better(), that is, swap it out for a food from a pool of better foods on that dimension. We define better as foods that score above a user-specified z-score cutoff on that must_restrict. If there are no foods that satisfy that cutoff, we choose a food at random from the pool of all possible foods.

smart_swap_single <- function(menu, max_offender, cutoff = 0.5, df = abbrev, verbose = FALSE) {
  
  swap_count <- 0

    for (m in seq_along(mr_df$must_restrict)) {   
      nut_to_restrict <- mr_df$must_restrict[m]    # grab the name of the nutrient we're restricting
      message(paste0("------- The nutrient we're restricting is ", nut_to_restrict, ". It has to be below ", mr_df$value[m])) 
      to_restrict <- (sum(menu[[nut_to_restrict]] * menu$GmWt_1, na.rm = TRUE))/100   # get the amount of that must restrict nutrient in our original menu
      message(paste0("The original total value of that nutrient in our menu is ", to_restrict)) 
      
      if (to_restrict > mr_df$value[m]) {     # if the amount of the must restrict in our current menu is above the max value it should be according to mr_df
        swap_count <- swap_count + 1
        
        max_offender <- which(menu[[nut_to_restrict]] == max(menu[[nut_to_restrict]]))   # Find the food that's the worst offender in this respect
        
        message(paste0("The worst offender in this respect is ", menu[max_offender, ]$Shrt_Desc))
        
        menu[max_offender, ] <- replace_food_w_better(menu, max_offender, 
                                                           nutrient_to_restrict = nut_to_restrict, cutoff = cutoff)
        
        to_restrict <- (sum(menu[[nut_to_restrict]] * menu$GmWt_1, na.rm = TRUE))/100   # recalculate the must restrict nutrient content
        message(paste0("Our new value of this must restrict is ", to_restrict)) 
      } else {
        message("We're all good on this nutrient.") 
      }
    }
  
  if (verbose == TRUE) {
    print(paste0(swap_count, " swaps were completed."))
  }
  
  return(menu)
}

do_single_swap <- function(menu, solve_if_unsolved = TRUE, verbose = FALSE,
                        new_solution_amount = 1){  # What should the solution amount of the newly swapped in foods be?
  
  if (verbose == FALSE) {
    out <- suppressWarnings(suppressMessages(menu %>% 
      smart_swap_single(menu))) 
  } else {
    out <- menu
      smart_swap_single(menu) 
  }

  return(out)
}

In practice, that looks like:

our_random_menu %>% do_single_swap() %>% kable(format = "html")
shorter_desc solution_amounts GmWt_1 serving_gmwt cost Lipid_Tot_g Sodium_mg Cholestrl_mg FA_Sat_g Protein_g Calcium_mg Iron_mg Magnesium_mg Phosphorus_mg Potassium_mg Zinc_mg Copper_mg Manganese_mg Selenium_µg Vit_C_mg Thiamin_mg Riboflavin_mg Niacin_mg Panto_Acid_mg Vit_B6_mg Energ_Kcal Shrt_Desc NDB_No score scaled_score
PEPPERS 1 11.60 11.60 2.86 0.20 238 0 0.029 0.92 9 0.46 10 18 166 0.12 0.065 0.115 0.3 74.4 0.059 0.030 0.477 0.079 0.233 26 PEPPERS,SWT,GRN,CKD,BLD,DRND,W/SALT 11822 -3369.125 -0.3565344
USDA COMMODITY 1 28.00 28.00 2.18 18.56 507 73 4.968 14.95 9 0.99 17 162 231 2.39 0.047 0.018 21.7 0.0 0.688 0.223 2.619 0.618 0.234 231 USDA COMMODITY,PORK SAUSAGE,BULK/LINKS/PATTIES,FRZ,RAW 07907 -3413.214 -0.4408219
YOGURT 1 170.00 170.00 4.11 1.25 66 5 0.752 4.93 171 0.07 16 135 219 0.83 0.013 0.004 4.9 0.8 0.042 0.201 0.107 0.552 0.045 86 YOGURT,VANILLA FLAVOR,LOWFAT MILK,SWTND W/ LO CAL SWTNR 01231 -2557.164 1.1957167
BAGEL 1 99.00 99.00 3.87 1.32 422 0 0.360 10.56 219 3.57 29 99 107 0.83 0.128 0.539 21.5 0.0 0.568 0.344 4.515 0.407 0.070 264 BAGEL,PLN,TSTD, ENR W/CA PROP(INCLUDE ONION,POPPY,SESAME) 18001 -3301.383 -0.2270297
EGGNOG 1 254.00 254.00 8.03 4.19 54 59 2.591 4.55 130 0.20 19 109 165 0.46 0.013 0.005 4.2 1.5 0.034 0.190 0.105 0.417 0.050 88 EGGNOG 01057 -2574.045 1.1634444
PIE 1 28.35 28.35 4.99 9.30 225 42 3.171 4.50 94 1.27 19 98 186 0.46 0.066 0.198 7.1 1.7 0.092 0.201 0.782 0.445 0.047 204 PIE,PUMPKIN,PREP FROM RECIPE 18327 -3335.900 -0.2930186
SOUP 1 253.00 253.00 8.43 0.45 194 0 0.074 1.10 12 0.33 13 23 217 0.20 0.121 0.125 2.0 0.4 0.055 0.046 0.770 0.173 0.084 33 SOUP,VEG SOUP,COND,LO NA,PREP W/ EQ VOLUME H2O 06967 -3182.024 0.0011525


Wholesale Swap

The wholesale swap takes a different approach. It uses knowledge we’ve gained from solving to keep the foods that the solver wanted more of and offer the rest up for swapping. The intuition here is that foods that the solver increased the portion sizes of are more valuable to the menu as a whole. We sample a percent_to_swap of the foods that the solver assigned the lowest portion size to, shamefully dubbed the worst_foods.

wholesale_swap <- function(menu, df = abbrev, percent_to_swap = 0.5) {
  
  # Get foods with the lowest solution amounts
  min_solution_amount <- min(menu$solution_amounts)
  worst_foods <- menu %>% 
    filter(solution_amounts == min_solution_amount)
  
  if (nrow(worst_foods) >= 2) {
    to_swap_out <- worst_foods %>% sample_frac(percent_to_swap)
    message(paste0("Swapping out a random ", percent_to_swap*100, "% of foods: ", 
                   str_c(to_swap_out$shorter_desc, collapse = ", ")))
    
  } else if (nrow(worst_foods) == 1)  {
    message("Only one worst food. Swapping this guy out.")
    to_swap_out <- worst_foods
    
  } else {
    message("No worst foods")
  }
  
  get_swap_candidates <- function(df, to_swap_out) {
    candidate <- df %>% 
      filter(! (NDB_No %in% menu)) %>%    # We can't swap in a food that already exists in our menu
      sample_n(., size = nrow(to_swap_out)) %>% 
      mutate(solution_amounts = 1)    # Give us one serving of each of these new foods
    return(candidate)
  }
  swap_candidate <- get_swap_candidates(df = df, to_swap_out = to_swap_out)
  
  if (score_menu(swap_candidate) < score_menu(to_swap_out)) {
    message("Swap candidate not good enough; reswapping.")
    swap_candidate <- get_swap_candidates(df = df, to_swap_out = to_swap_out)
    
  } else {
      message("Swap candidate is good enough. Doing the wholesale swap.")
      return(swap_candidate)
  }
  
  newly_swapped_in <- get_swap_candidates(df, to_swap_out)
  
  message(paste0("Replacing with: ", 
                 str_c(newly_swapped_in$shorter_desc, collapse = ", ")))
  
  out <- menu %>% 
    filter(!NDB_No %in% worst_foods) %>% 
    bind_rows(newly_swapped_in)
  
  return(out)
}

Let’s do a wholesale swap.

our_random_menu %>% wholesale_swap() %>% kable(format = "html")
## Swapping out a random 50% of foods: USDA COMMODITY, EGGNOG, YOGURT, PUMPKIN&SQUASH SD KRNLS
## Swap candidate not good enough; reswapping.
## Replacing with: POPCORN, YOGURT, PORK, GOJI BERRIES
shorter_desc solution_amounts GmWt_1 serving_gmwt cost Lipid_Tot_g Sodium_mg Cholestrl_mg FA_Sat_g Protein_g Calcium_mg Iron_mg Magnesium_mg Phosphorus_mg Potassium_mg Zinc_mg Copper_mg Manganese_mg Selenium_µg Vit_C_mg Thiamin_mg Riboflavin_mg Niacin_mg Panto_Acid_mg Vit_B6_mg Energ_Kcal Shrt_Desc NDB_No score scaled_score
PUMPKIN&SQUASH SD KRNLS 1 118.00 118.00 3.63 49.05 256 0 8.544 29.84 52 8.07 550 1174 788 7.64 1.275 4.490 9.4 1.8 0.070 0.150 4.430 0.570 0.100 574 PUMPKIN&SQUASH SD KRNLS,RSTD,W/SALT 12516 -1276.0938 3.6447775
USDA COMMODITY 1 28.00 28.00 2.18 18.56 507 73 4.968 14.95 9 0.99 17 162 231 2.39 0.047 0.018 21.7 0.0 0.688 0.223 2.619 0.618 0.234 231 USDA COMMODITY,PORK SAUSAGE,BULK/LINKS/PATTIES,FRZ,RAW 07907 -3413.2143 -0.4408219
YOGURT 1 170.00 170.00 4.11 1.25 66 5 0.752 4.93 171 0.07 16 135 219 0.83 0.013 0.004 4.9 0.8 0.042 0.201 0.107 0.552 0.045 86 YOGURT,VANILLA FLAVOR,LOWFAT MILK,SWTND W/ LO CAL SWTNR 01231 -2557.1636 1.1957167
BAGEL 1 99.00 99.00 3.87 1.32 422 0 0.360 10.56 219 3.57 29 99 107 0.83 0.128 0.539 21.5 0.0 0.568 0.344 4.515 0.407 0.070 264 BAGEL,PLN,TSTD, ENR W/CA PROP(INCLUDE ONION,POPPY,SESAME) 18001 -3301.3825 -0.2270297
EGGNOG 1 254.00 254.00 8.03 4.19 54 59 2.591 4.55 130 0.20 19 109 165 0.46 0.013 0.005 4.2 1.5 0.034 0.190 0.105 0.417 0.050 88 EGGNOG 01057 -2574.0448 1.1634444
PIE 1 28.35 28.35 4.99 9.30 225 42 3.171 4.50 94 1.27 19 98 186 0.46 0.066 0.198 7.1 1.7 0.092 0.201 0.782 0.445 0.047 204 PIE,PUMPKIN,PREP FROM RECIPE 18327 -3335.9004 -0.2930186
ALMONDS 1 157.00 157.00 4.85 55.17 1 0 4.208 21.23 291 3.68 274 466 699 3.07 0.955 2.460 4.1 0.0 0.092 0.781 3.665 0.229 0.118 607 ALMONDS,OIL RSTD,WO/SALT 12065 -721.3391 4.7053194
POPCORN 1 28.35 28.35 6.30 9.50 540 0 1.415 12.60 11 2.28 151 264 241 3.83 0.545 NA 8.6 0.0 0.350 0.110 2.070 NA 0.170 424 POPCORN,OIL-POPPED,LOFAT 25001 -3332.4276 -0.2863794
YOGURT 1 150.00 150.00 2.80 2.92 34 13 1.832 8.25 86 0.07 10 110 131 0.39 0.018 0.034 9.5 NA 0.036 0.227 0.210 0.390 0.050 106 YOGURT,GREEK,STRAWBERRY,DANNON OIKOS 01276 -2917.3655 0.5071076
PORK 1 85.00 85.00 4.48 5.84 73 59 1.974 21.22 30 0.71 18 226 313 2.74 0.081 0.009 31.6 0.0 0.540 0.338 7.113 1.115 0.466 143 PORK,FRSH,LOIN,BLADE (CHOPS OR ROASTS),BONE-IN,LN,RAW 10032 -2937.8497 0.4679473
GOJI BERRIES 1 28.00 28.00 6.53 0.39 298 0 0.000 14.26 190 6.80 NA NA NA NA NA NA NA 48.4 NA NA NA NA NA 349 GOJI BERRIES,DRIED 09110 -3384.9004 -0.3866934

Full Solving

set.seed(10)
fully_solved <- build_menu() %>% solve_full(verbose = FALSE)
fully_solved %>% kable(format = "html")
shorter_desc solution_amounts GmWt_1 serving_gmwt cost Lipid_Tot_g Sodium_mg Cholestrl_mg FA_Sat_g Protein_g Calcium_mg Iron_mg Magnesium_mg Phosphorus_mg Potassium_mg Zinc_mg Copper_mg Manganese_mg Selenium_µg Vit_C_mg Thiamin_mg Riboflavin_mg Niacin_mg Panto_Acid_mg Vit_B6_mg Energ_Kcal Shrt_Desc NDB_No score scaled_score
BEETS 1.000000 136.0000 136 6.040000 0.17 78 0 0.027 1.61 16 0.80 23 40 325 0.35 0.075 0.329 0.7 4.9 0.031 0.040 0.334 0.155 0.067 43 BEETS,RAW 11080 -2918.1362 0.5056343
PORK 1.000000 85.0000 85 6.800000 2.56 43 84 0.791 31.11 5 1.06 25 237 363 2.86 0.068 0.009 37.7 0.0 0.603 0.510 7.685 0.802 0.491 156 PORK,LEG SIRLOIN TIP RST,BNLESS,LN & FAT,CKD,BRSD 10962 -2878.8351 0.5807674
PICKLE RELISH 1.000000 15.0000 15 9.990000 0.46 1091 0 0.044 1.50 5 1.25 19 40 78 0.21 0.082 0.015 0.0 1.0 0.040 0.040 0.500 0.007 0.015 91 PICKLE RELISH,HOT DOG 11944 -3515.7267 -0.6367982
LAMB 1.000000 85.0000 85 7.090000 7.63 72 97 3.540 26.52 7 1.94 22 168 230 2.33 0.094 0.004 12.4 0.0 0.140 0.380 7.810 0.650 0.535 175 LAMB,AUS,FRSH,RACK,RST,FRNCHED,DNUDED,BONE-IN,LN,0",CKD,RSTD 17447 -3119.3119 0.1210403
POTATOES 1.000000 75.0000 75 5.640000 0.08 5 0 0.026 2.14 13 0.86 23 55 417 0.29 0.103 0.157 0.4 5.7 0.082 0.033 1.035 0.301 0.345 79 POTATOES,RUSSET,FLESH & SKN,RAW 11353 -2988.2450 0.3716050
FRUIT COCKTAIL 1.000000 248.0000 248 4.940000 0.07 6 0 0.010 0.39 6 0.29 5 11 88 0.08 0.069 0.144 0.5 1.9 0.018 0.019 0.374 0.060 0.050 73 FRUIT COCKTAIL,CND,HVY SYRUP,SOL&LIQUIDS 09100 -3106.6213 0.1453014
TOMATOES 1.000000 54.0000 54 7.130000 2.97 107 0 0.426 14.11 110 9.09 194 356 3427 1.99 1.423 1.846 5.5 39.2 0.528 0.489 9.050 2.087 0.332 258 TOMATOES,SUN-DRIED 11955 -1180.3855 3.8277460
JUTE 1.000000 28.0000 28 1.610000 0.25 8 0 0.038 4.65 208 4.76 64 83 559 0.79 0.255 0.123 0.9 37.0 0.133 0.546 1.260 0.072 0.600 34 JUTE,POTHERB,RAW 11231 -3106.0957 0.1463062
SOY FLOUR 1.192203 259.9865 88 21.389799 8.90 9 0 1.290 49.81 285 8.20 285 675 2090 4.10 1.600 3.150 58.9 0.0 1.088 0.280 2.950 1.550 1.050 372 SOY FLOUR,LOW-FAT 16118 -340.1026 5.4341411
HEARTS OF PALM 1.000000 146.0000 146 8.740000 0.62 426 0 0.130 2.52 58 3.13 38 65 177 1.15 0.133 1.394 0.7 7.9 0.011 0.057 0.437 0.126 0.022 28 HEARTS OF PALM,CANNED 11961 -3477.9434 -0.5645667
BABYFOOD 1.000000 150.2587 99 6.875476 0.11 40 0 0.000 1.01 17 0.31 13 30 112 0.24 0.036 0.144 0.3 3.3 0.014 0.042 0.566 0.348 0.114 34 BABYFOOD,DINNER,MXD VEG,JR 03279 -3237.0691 -0.1040798
SCUP 1.000000 85.0000 85 8.630000 2.73 42 52 0.640 18.88 40 0.53 23 185 287 0.48 0.051 0.035 36.5 0.0 0.110 0.100 4.100 0.750 0.300 105 SCUP,RAW 15090 -2949.4539 0.4457632
TURKEY 1.479702 105.0588 71 6.510688 9.82 77 85 3.060 27.87 32 2.30 23 199 280 4.27 0.154 0.023 37.8 0.0 0.060 0.241 3.561 1.205 0.330 208 TURKEY,ALL CLASSES,LEG,MEAT&SKN,CKD,RSTD 05194 -3063.7769 0.2272084
BEANS 1.000000 253.0000 253 3.490000 5.15 422 5 1.948 5.54 61 1.99 43 109 358 0.73 0.159 0.255 5.7 1.1 0.136 0.049 0.408 0.155 0.090 155 BEANS,BAKED,HOME PREPARED 16005 -2986.3686 0.3751922

Okay! In part four we’ll get a handle on what proportion of menus are actually solvable at all by running some simulations.