You need to go to the monthly farmer’s market in town. Everything that you sell needs to be loaded on your trusty donkey who will be accompanying you on the trip. As you will be staying in town a few nights you need to pack an overnight bag as well.
Deciding on what to pack is difficult. Other than personal items, your stall needs to have as much variety as possible to draw in the crowds, but you also need to pack as many of your most profitable items. You have to have some items and would like to have more of others.
What is the knapsack problem
This is a category of combination problems known as the Knapsack problem. Having to make an optimum collection of items that will maximize or minimize some properties of the collection. Either minimize waste, maximize profitability or both.
How this problem came about
The problem I am trying to solve can be classed as a knapsack problem but has some major differences. It came about while trying to solve a math problem I was helping my daughter with. It was one of those nice ones with multiple answers and you could make as complex as you wanted.
There were a list of 16 numbers and we had to see how many ways you could make a target number. In this case 5000.
numbers = [100,1628,40,3273,3564,20,200,1525,250,1727,125,50,3372,25,3475,1436]
An easy answer would be, just add some numbers together. The authors of this question put in numbers that neatly added up to 5000. There were also some quick wins (
50*100) and of course you could get creative by repeating numbers (
I really wanted to see if I could get a definite number of how many possible combinations there were for a given formula length. It was an excellent application for coding.
How this is different to the original knapsack problem
With the original knapsack problem, you can keep adding to the collection and stop at the limit. It means after a point, you don’t have to consider some combinations.
This problem was different. However long the formula is, you could make the result smaller by subtracting or dividing the result.
This means you cant discard combinations while growing the formula length like you can do with the original knapsack problem.
The first unsuccessful try
First I tried to brute force the solution by iterating through all possible combinations.
It didn’t go so well. With 16 numbers and 4 possible operations on them (+-*/), each new number was 64 times a combination of all the previous numbers that came before it. It becomes computationally intractable to solve for more than 5 numbers.
Each subsequent number made the task exponentially more complex than the last. In the graph, the jump from 4 to 5 and the jump from 5 to 6 make the previous numbers jumps insignificant.
However this approach was useful because it revealed some properties of the task and the solution.
The algorithm liked repeating numbers so you could take the previous number and add and subtract or divide and multiply by the same number. The algorithm was modified to only include unique numbers in the solution.
Also, there were no possible combinations of 3 numbers that gave the target. But there were 15 2 number combinations and 755 of 4 number combinations.
Second try using random numbers
For the second try, the formula was created purely randomly. It tried different random number combinations and if successful it would be added to the list of successes.
Running the random number algorithm for a different number of seconds and looking at the results revealed a property of statistics. That when a sample gets larger it tends towards the same properties of the population from which it is drawn.
The longer the algorithm ran, the more it tended towards the final pattern. There were also actually quite a few combinations that contained all 16 numbers.
20/3372-1525/125+1628-100-40/200+250+1727+3475/25+50+3273/3564*1436 3475/100+40/20+3564+1525-125-25/3372/1628*250*1727/200*1436*50/3273 3475*1525/3372-50/1628-40+3273*200/1727+250*1436/100+125/3564-20*25
And it didn’t take the algorithm very long to find these.
These types of problems where you build statistical models to reveal insights about real world processes has been dealt with in other posts in this blog.
If you have made to the end, thanks for reading. Please put any opinions and thoughts in the comments section as they always lead to interesting dialog.