Here are the notes about an open problem that I took from today's class for posting on the web site. Assume there is a space to store data. There are k different data items. There is no index so data reads are done by randomly accessing the data space to see if the data is there. Pr[item i is requested] = Pi; SUM(P1 + P2 ... Pk) = 1 C represents the total capacity of the storage medium. Ci = the amount of times data i is located in C. Sum(C1 + C2... Ck) = C. The problem then reduces to how do we minimize sum(Pi * C / Ci) This can be solved by a theorem by Canchy-Schwartz sum(ai^2)sum(bi^2) >= (sum (ai*bi))^2 and is equal with ai/bi = aj/bj This theorem reduces sum (Pi * C / Ci) to be minimized when Ci / (Pi)^(1/2) is constant. This means that a given data item should be stored in relation to the square root of it's popularity. To apply to a direct application, Assume we have a distributed network with k different nodes each storing different data. If data A is four times as popular as data B is, the network should store twice as many copies of the data to equalize data access. This is very interesting result. Matt Hammond