Hi I'm looking for a way to group a set of numbers in excel using a solver so that I can minimize the total number of groups while minimizing the difference between each number and the "group number".
For example, lets say I have: a = 2.3 b= 3.5 c= 3.1 d=2.6 e=4.1 f=3.2
Now I want to create Group1 and Group2 so that each number a-f is in either G1 or G2, and a-Gx(or y) + b-Gx(or y) +c-Gx(or y) + ... + f-Gx, where Gx(or y) is the number that G1(or G2) stand for.
Lets say G1 is 3.2 and G2 is 4.1. Then G1 has a,c,d,f; G2 has b and e. The sums are 3.2-a = 3.2-2.3= .9, 3.2-c = 3.2-3.1=.1, ... 4.1-e=4.1-4.1=0, and the total sum will be 2.2.
I have a list of about 25 numbers, and I want to minimize the total number of groups first, and then the total sum second. My goal is to get them down to between 3 and 7 total groups. Anyone know a way to do this in excel?
> Hi > I'm looking for a way to group a set of numbers in excel using a > solver so that I can minimize the total number of groups while > minimizing the difference between each number and the "group number".
> For example, lets say I have: > a = 2.3 > b= 3.5 > c= 3.1 > d=2.6 > e=4.1 > f=3.2
> Now I want to create Group1 and Group2 so that each number a-f is in > either G1 or G2, and a-Gx(or y) + b-Gx(or y) +c-Gx(or y) + ... + f-Gx, > where Gx(or y) is the number that G1(or G2) stand for.
> Lets say G1 is 3.2 and G2 is 4.1. Then G1 has a,c,d,f; G2 has b and > e. > The sums are 3.2-a = 3.2-2.3= .9, > 3.2-c = 3.2-3.1=.1, > ... > 4.1-e=4.1-4.1=0, and the total sum will be 2.2.
> I have a list of about 25 numbers, and I want to minimize the total > number of groups first, and then the total sum second. My goal is to > get them down to between 3 and 7 total groups. > Anyone know a way to do this in excel?
You've run into the problem of formulating LP's in Excel with Frontline. You have to explicitly generate every constraint because Frontline does not manage variables using indexes. You either have to carefully enumerate every constraint on the Worksheet or generate the constraints with Visual Basic into order populate sheet ranges with the constraints. And then pass the required ranges to the Solver using Frontline Solver functions.
I'd do it for you, but I have a rum and something to drink.
> Hi > I'm looking for a way to group a set of numbers in excel using a > solver so that I can minimize the total number of groups while > minimizing the difference between each number and the "group number".
> For example, lets say I have: > a = 2.3 > b= 3.5 > c= 3.1 > d=2.6 > e=4.1 > f=3.2
> Now I want to create Group1 and Group2 so that each number a-f is in > either G1 or G2, and a-Gx(or y) + b-Gx(or y) +c-Gx(or y) + ... + f-Gx, > where Gx(or y) is the number that G1(or G2) stand for.
> Lets say G1 is 3.2 and G2 is 4.1. Then G1 has a,c,d,f; G2 has b and > e. > The sums are 3.2-a = 3.2-2.3= .9, > 3.2-c = 3.2-3.1=.1, > ... > 4.1-e=4.1-4.1=0, and the total sum will be 2.2.
> I have a list of about 25 numbers, and I want to minimize the total > number of groups first, and then the total sum second.
I don't understand. Isn't the minimum number of groups equal to 1, just by taking all 25 numbers in a single group? It thus seems that you need some type of "tradeoff" between the number of groups and the "sum" objective. Perhaps you can seek to minimize c*G + d*S, where G = number of groups, S = sum associated with your groups, and c, d are weighting constants or tradeoff coefficients. Another question: do you want all differences in a group to be >= 0 (by taking the largest element in the group to be that group's "representative", or are you willing to use, instead, the *absolute difference*. For example, in your G1 = {a,c,d,f} = {2.3, 3.1, 2.6, 3.2}, you would minimize sum |x - m| (summed over the group) by choosing m = median of G1 = any number between 2.6 and 3.1; you do not even need to have m in G1, although it does no harm to choose m in G1 if you want (eg., by choosing either m = 2.6 or m = 3.1). For example, if we take m = 3.1 we get the sum |3.1 - 2.3| + 0 + |3.1 - 2.6| + |3.1 - 3.2| = 0.8 + 0.5 + 0.1 = 1.4; this is smaller than the "all positive" sum (3.2 - 2.3) + (3.2 - 3.1) + (3.2 - 2.6) + 0 = 0.9 + 0.1 + 0.6 = 1.6 obtained by using "3.2" to represent G1. So, you should clarify for us whether you are allowed to use the median, or not, and whether you can use the sum of absolute deviations, or not.
One way to solve your problem would be to solve separately the cases of 1 group, 2 groups, 3 groups, etc., then take the best. Assuming you are allowed to use the group median as the group representative and then to use absolute deviations, your problem for k groups is: partition the set into k subsets in such a way that the overall sum objective is minimized. This means that you need to decide which elements go into each group; then you need to take the median of each group and then compute the necessary sum. If this is an accurate statement of your problem, you are just dealing with an instance of the *1-dimensional p-median problem*. The general p-median problem is NP-hard, but has a 0,1 programming formulation that could be solved in an "industrial strength" version of the EXCEL Solver; in your case it is too large to fit into the default version that comes with typical MS Office installations. However, because you are dealing with the ONE DIMENSIONAL version, there is a polynomial time algorithm available to solve the problem. It can be found in Jackson, Rouskas and Stallmann, "The directed p-median problem: Definitions, complexity and algorithms", Euro. J. of Operational Research, Vol. 179, # 3 (2007), pp. 1097--1108. If you do not have access to this journal, a freely downloadable description of the work is available in the thesis of Jackson, at http://www.lib.ncsu.edu/theses/available/etd-11052003-111956/unrestri... . I have not yet read this paper in detail, so I don't know whether the algorithm is EXCEL-friendly.
> > Hi > > I'm looking for a way to group a set of numbers in excel using a > > solver so that I can minimize the total number of groups while > > minimizing the difference between each number and the "group number".
> > For example, lets say I have: > > a = 2.3 > > b= 3.5 > > c= 3.1 > > d=2.6 > > e=4.1 > > f=3.2
> > Now I want to create Group1 and Group2 so that each number a-f is in > > either G1 or G2, and a-Gx(or y) + b-Gx(or y) +c-Gx(or y) + ... + f-Gx, > > where Gx(or y) is the number that G1(or G2) stand for.
> > Lets say G1 is 3.2 and G2 is 4.1. Then G1 has a,c,d,f; G2 has b and > > e. > > The sums are 3.2-a = 3.2-2.3= .9, > > 3.2-c = 3.2-3.1=.1, > > ... > > 4.1-e=4.1-4.1=0, and the total sum will be 2.2.
> > I have a list of about 25 numbers, and I want to minimize the total > > number of groups first, and then the total sum second.
> I don't understand. Isn't the minimum number of groups equal to 1, > just by taking all 25 numbers in a single group? It thus seems that > you need some type of "tradeoff" between the number of groups and the > "sum" objective. Perhaps you can seek to minimize c*G + d*S, where G = > number of groups, S = sum associated with your groups, and c, d are > weighting constants or tradeoff coefficients. Another question: do you > want all differences in a group to be >= 0 (by taking the largest > element in the group to be that group's "representative", or are you > willing to use, instead, the *absolute difference*. For example, in > your G1 = {a,c,d,f} = {2.3, 3.1, 2.6, 3.2}, you would minimize sum |x > - m| (summed over the group) by choosing m = median of G1 = any number > between 2.6 and 3.1; you do not even need to have m in G1, although it > does no harm to choose m in G1 if you want (eg., by choosing either m > = 2.6 or m = 3.1). For example, if we take m = 3.1 we get the sum |3.1 > - 2.3| + 0 + |3.1 - 2.6| + |3.1 - 3.2| = 0.8 + 0.5 + 0.1 = 1.4; this > is smaller than the "all positive" sum (3.2 - 2.3) + (3.2 - 3.1) + > (3.2 - 2.6) + 0 = 0.9 + 0.1 + 0.6 = 1.6 obtained by using "3.2" to > represent G1. So, you should clarify for us whether you are allowed to > use the median, or not, and whether you can use the sum of absolute > deviations, or not.
> One way to solve your problem would be to solve separately the cases > of 1 group, 2 groups, 3 groups, etc., then take the best. Assuming you > are allowed to use the group median as the group representative and > then to use absolute deviations, your problem for k groups is: > partition the set into k subsets in such a way that the overall sum > objective is minimized. This means that you need to decide which > elements go into each group; then you need to take the median of each > group and then compute the necessary sum. If this is an accurate > statement of your problem, you are just dealing with an instance of > the *1-dimensional p-median problem*. The general p-median problem is > NP-hard, but has a 0,1 programming formulation that could be solved in > an "industrial strength" version of the EXCEL Solver; in your case it > is too large to fit into the default version that comes with typical > MS Office installations. However, because you are dealing with the ONE > DIMENSIONAL version, there is a polynomial time algorithm available to > solve the problem. It can be found in Jackson, Rouskas and Stallmann, > "The directed p-median problem: Definitions, complexity and > algorithms", Euro. J. of Operational Research, Vol. 179, # 3 (2007), > pp. 1097--1108. If you do not have access to this journal, a freely > downloadable description of the work is available in the thesis of > Jackson, athttp://www.lib.ncsu.edu/theses/available/etd-11052003-111956/unrestri... > . I have not yet read this paper in detail, so I don't know whether > the algorithm is EXCEL-friendly.
> Good luck.
> R.G. Vickson
> > My goal is to > > get them down to between 3 and 7 total groups. > > Anyone know a way to do this in excel?- Hide quoted text -
> - Show quoted text -- Hide quoted text -
> - Show quoted text -
You are right, there is a tradeoff. I should have said that more explicitly. Like I mentioned, I'm trying to get it down between 3 and 7 groups. I could just create however many groups that I want to try, then solve the problem (though I'm not sure how to decide which elements go in each group without just best-guessing), but I wanted to see if there was a way to have the solver be able to easily adjust between solving for 3 groups or 4 groups or ...
I guess I will have to look into optimization with more than one objective function. I've seen a couple of presentations for such things, but have never learned it myself ( I just got my BS in math this last spring semester, so I'm still inexperienced).
Additionally, yes, all members of each group have to be >= that group's representative number. I'm trying to charge a fee based on a certain criteria, and have to make sure the fees will result in a profit for each element in each group. So, no medians, no absolute deviations.
Thank you for your help, I'll try looking into optimization of more than one objective.
> On Aug 4, 7:46 pm, Ray Vickson <RGVick...@shaw.ca> wrote:
> > On Aug 4, 2:20 pm, guggd2...@hotmail.com wrote:
> > > Hi > > > I'm looking for a way to group a set of numbers in excel using a > > > solver so that I can minimize the total number of groups while > > > minimizing the difference between each number and the "group number".
> > > For example, lets say I have: > > > a = 2.3 > > > b= 3.5 > > > c= 3.1 > > > d=2.6 > > > e=4.1 > > > f=3.2
> > > Now I want to create Group1 and Group2 so that each number a-f is in > > > either G1 or G2, and a-Gx(or y) + b-Gx(or y) +c-Gx(or y) + ... + f-Gx, > > > where Gx(or y) is the number that G1(or G2) stand for.
> > > Lets say G1 is 3.2 and G2 is 4.1. Then G1 has a,c,d,f; G2 has b and > > > e. > > > The sums are 3.2-a = 3.2-2.3= .9, > > > 3.2-c = 3.2-3.1=.1, > > > ... > > > 4.1-e=4.1-4.1=0, and the total sum will be 2.2.
> > > I have a list of about 25 numbers, and I want to minimize the total > > > number of groups first, and then the total sum second.
> > I don't understand. Isn't the minimum number of groups equal to 1, > > just by taking all 25 numbers in a single group? It thus seems that > > you need some type of "tradeoff" between the number of groups and the > > "sum" objective. Perhaps you can seek to minimize c*G + d*S, where G = > > number of groups, S = sum associated with your groups, and c, d are > > weighting constants or tradeoff coefficients. Another question: do you > > want all differences in a group to be >= 0 (by taking the largest > > element in the group to be that group's "representative", or are you > > willing to use, instead, the *absolute difference*. For example, in > > your G1 = {a,c,d,f} = {2.3, 3.1, 2.6, 3.2}, you would minimize sum |x > > - m| (summed over the group) by choosing m = median of G1 = any number > > between 2.6 and 3.1; you do not even need to have m in G1, although it > > does no harm to choose m in G1 if you want (eg., by choosing either m > > = 2.6 or m = 3.1). For example, if we take m = 3.1 we get the sum |3.1 > > - 2.3| + 0 + |3.1 - 2.6| + |3.1 - 3.2| = 0.8 + 0.5 + 0.1 = 1.4; this > > is smaller than the "all positive" sum (3.2 - 2.3) + (3.2 - 3.1) + > > (3.2 - 2.6) + 0 = 0.9 + 0.1 + 0.6 = 1.6 obtained by using "3.2" to > > represent G1. So, you should clarify for us whether you are allowed to > > use the median, or not, and whether you can use the sum of absolute > > deviations, or not.
> > One way to solve your problem would be to solve separately the cases > > of 1 group, 2 groups, 3 groups, etc., then take the best. Assuming you > > are allowed to use the group median as the group representative and > > then to use absolute deviations, your problem for k groups is: > > partition the set into k subsets in such a way that the overall sum > > objective is minimized. This means that you need to decide which > > elements go into each group; then you need to take the median of each > > group and then compute the necessary sum. If this is an accurate > > statement of your problem, you are just dealing with an instance of > > the *1-dimensional p-median problem*. The general p-median problem is > > NP-hard, but has a 0,1 programming formulation that could be solved in > > an "industrial strength" version of the EXCEL Solver; in your case it > > is too large to fit into the default version that comes with typical > > MS Office installations. However, because you are dealing with the ONE > > DIMENSIONAL version, there is a polynomial time algorithm available to > > solve the problem. It can be found in Jackson, Rouskas and Stallmann, > > "The directed p-median problem: Definitions, complexity and > > algorithms", Euro. J. of Operational Research, Vol. 179, # 3 (2007), > > pp. 1097--1108. If you do not have access to this journal, a freely > > downloadable description of the work is available in the thesis of > > Jackson, athttp://www.lib.ncsu.edu/theses/available/etd-11052003-111956/unrestri... > > . I have not yet read this paper in detail, so I don't know whether > > the algorithm is EXCEL-friendly.
> > Good luck.
> > R.G. Vickson
> > > My goal is to > > > get them down to between 3 and 7 total groups. > > > Anyone know a way to do this in excel?- Hide quoted text -
> > - Show quoted text -- Hide quoted text -
> > - Show quoted text -
> You are right, there is a tradeoff. I should have said that more > explicitly. Like I mentioned, I'm trying to get it down between 3 and > 7 groups. I could just create however many groups that I want to try, > then solve the problem (though I'm not sure how to decide which > elements go in each group without just best-guessing), but I wanted to > see if there was a way to have the solver be able to easily adjust > between solving for 3 groups or 4 groups or ...
> I guess I will have to look into optimization with more than one > objective function. I've seen a couple of presentations for such > things, but have never learned it myself ( I just got my BS in math > this last spring semester, so I'm still inexperienced).
> Additionally, yes, all members of each group have to be >= that > group's representative number.
In your original post's example, you had the representative number of a group = largest element of the group; now you say you want it to be the smallest. OK, I think the two problems may be closely related, perhaps even equivalent (I'm not sure). If a group's representative is the *smallest* number in the group, and if you want to minimize the groups' sum, I think it follows that the groups must be non- overlapping. That is, if you first sort all the numbers x(j), j=1,2,...,n [n = 25 in your case] in ascending order, then the smallest element of G2 is >= largest element of G1, etc. We can see why, by using an interchange argument. Say we have a partition in which x(a) and x(a+2) are in G1, but x(a+1) = smallest element of G2; suppose also the largest element of G2 > largest element of G1. So we have: G1 ...G1 G2 G1 ... G1 ... G2..., where the middle 'G2' is x(a +1). Now go to a new grouping in which x(a+1) is put into G1 while x(a +2) is put into G2. All terms of the G2-sum are reduced by the amount d = x(a+2)-x(a+1); all elements of the G1-sum before (a+1) or after (a +2) are unchanged; the (a+2) term of the G1 sum is reduced by d. Therefore, the G1 + G2 sum is reduced. Now we have a new grouping in which the smallest element of G2 is closer to the end of G1. If it is still < some elements of G1, continue the procedure, moving it still closer to the end of G1. In this way, we either get a better grouping in which G1 and G2 no longer overlap, or else we had to stop because the smallest element of G2 came up against some other elements of G2 that are < largest element of G1; in other words, we might get the following: G1...G1 G2 G2 ..G2 G1 ...G1 G2 .... G2.... . Now we cannot perform the previous type of interchange, but we can do another type of swap: suppose the largest element of G2 that is < largest element of G1 is x(c). Thus, x(c) is in G2, x(c+1),...,x(e) are in G1 and x(e +1),... are in G2. The new grouping in which x(c+1) is put into G2 and x(c) into G1 is as good as the original, because while the x(c)-term of the G2 sum increases (by x(c+1)-x(c)) the x(c+1)-term of the G1-sum decreases by the same amount, leaving the total unchanged (all the other terms of both sums are unaffected). In this way, we move one of the inter-mingled G2 terms closer to the end of G1 without worsening the total. Continuing in this way, we can eventually eliminate one of the G2-terms from the string inside G1. Then doing it again, we eliminate another one, etc. Eventually, we are back to the situation in which the smallest element of G2 is either >= largest element of G1, or else it is < largest element of G1 but no other elements of G2 are. At that point, we can go back to perform the first kind of interchange again. Continuing like this, we obtain a grouping as good as, or better than the original, but in which G1 and G2 no longer overlap. Note that other groups G3, G4, ... did not enter the argument and are unaffected by the interchanges, so we can apply the argument first to G1, G2, then to G2, G3, etc. (Actually, I see now that while the G1-G2 interchanges do not affect G3,..., these other groups could affect the G1-G2 interchanges. For example, if we have x(a) = smallest no. in G2, with x(a+1) in G3 and x(a+2) in G1, for example, then we would need to swap x(a) and x(a+2) instead of x(a+1) and x(a+2). However, the swap is still seen to be beneficial, and it still moves the smallest element of G2 nearer to the end of G1. Similar considerations apply if there are several elements of G3,...,Gk between x(a) [in G2] and the next element x(a+r) in G1.)
So, I think we can say that the groups should be non-overlapping, like this: G1....G1 G2....G2 G3...G3 .... Gp...Gp. That simplifies the search procedure a lot. Complete enumeration of the group-boundaries can be done in O(n^p) computations, so for small p (2 or 3 groups) and moderate n this is just about practical. For larger p it is starting to get out of hand, so better methods should be sought.
I think that one might still be able to apply some interchange arguments to come up with practical search procedures leading to optimal results, but I need to think some more about it.
> Hi > I'm looking for a way to group a set of numbers in excel using a > solver so that I can minimize the total number of groups while > minimizing the difference between each number and the "group number".
> For example, lets say I have: > a = 2.3 > b= 3.5 > c= 3.1 > d=2.6 > e=4.1 > f=3.2
> Now I want to create Group1 and Group2 so that each number a-f is in > either G1 or G2, and a-Gx(or y) + b-Gx(or y) +c-Gx(or y) + ... + f-Gx, > where Gx(or y) is the number that G1(or G2) stand for.
> Lets say G1 is 3.2 and G2 is 4.1. Then G1 has a,c,d,f; G2 has b and > e. > The sums are 3.2-a = 3.2-2.3= .9, > 3.2-c = 3.2-3.1=.1, > ... > 4.1-e=4.1-4.1=0, and the total sum will be 2.2.
> I have a list of about 25 numbers, and I want to minimize the total > number of groups first, and then the total sum second. My goal is to > get them down to between 3 and 7 total groups. > Anyone know a way to do this in excel?
Chapter 3 of Jackson's thesis (link provided in a previous message) presents a polynomial-time DP algorithm for your problem, assuming that a group's "representative" is the *maximum* element in the group. Whether or not the algorithm is easy to do in EXCEL is another matter, but I guess you could write a Visual Basic module to do it. If your problem is (as you stated later) that the "representative" is the *minimum* group member, you could probably transform that problem into the other. Alternatively, you could modify Jackson's algorithm to handle the new problem.