Large-scale Optimization

Large-scale optimization research is conducted by The Logistics Institute of the School of Industrial and Systems Engineering under the direction of Ellis Johnson.

Airline Crew Scheduling

Airline Crew Scheduling is a computational intense problem that can benefit from the parallelism offered by the cluster. The problem involves assigning crews to flights such that the airline incurs least cost. A crew itinerary or a pairing is a sequence of flights such that the arrival station of a flight in the pairing is the same as the departure station of the next flight in the pairing. The problem is to assign pairings in such a way that the pairings partition all the legs. Relevant papers include:

Pairing Generation

We addressed the question of parallelizing pairing generation because it is one of the most computationally intense parts of airline crew scheduling methodologies. We approached the problem through two different implementations. The first implementation used strictly message passing communication, MPI specifically. The second implementation exploited the underlying structure of the 16 node quad Pentium Pro 200 MHz cluster, specifically by employing threads for concurrent computation on a node and MPI for communication across nodes.

As can be seen from Figure 1, the mixed MPI/threads implementation substantially outperforms the non-threaded code, demonstrating the effectiveness of threads in reducing communication overhead.

diego_MPI-threads

Figure 1

Figures 2 and 3 depict performance results using the mixed model under two heterogeneous cluster scenarios. These runs were performed using three data sets of sizes small, medium, and large. Given that our environment consists of 200 MHz quad and 300 MHz dual shared memory multiprocessors, it is actually not clear whether one should use more strongly connected, slower (i.e., quad Pentium Pro) nodes or more weakly connected, faster (i.e., dual Pentium II) nodes for program runs. The results depicted in the second graph show speedup when employing a mixed environment comprised of an equal number of 300 and 200 MHz nodes. The last graph depicts speedup when maximizing the number of 300 MHz nodes. It is apparent from these measurements that this application's computation versus communication ratio is such that faster processors should be employed whenever possible, as both speedup curves are close to linear, with diminishing returns for larger numbers of processors due to the limited problem sizes used in these experiments. However, while speedup is better with the well-connected machine, actual execution times are superior with the faster processors.

diego_best1

Figure 2

diego_best2

Figure 3

D. Klabjan and K. Schwan, Airline Crew Pairing Generation in Parallel, Technical Report TLI/LEC-99-02, 1999, pdf.

A Parallel Primal-Dual Simplex Algorithm

Recently, the primal-dual simplex method has been used to solve linear programs with a large number of columns. In this work we present a parallel primal-dual simplex algorithm that is capable of solving linear programs with at least an order of magnitude more columns than the previous work. The algorithm repeatedly solves several linear programs in parallel and combines the dual solutions to obtain a new dual feasible solution. The primal part of the algorithm involves a new randomized pricing strategy. We tested the algorithm on instances with thousands of rows and tens of millions of columns. For example, an instance with 1700 rows and 45 million columns was solved in about 2 hours on 12 processors.

Table 1 summarizes the most significant computational times. Every column in a linear programming instance has on average 10 nonzeros. The algorithm achieves good speedups on a moderate number of processors. Bigger speedups are not attained mostly because of the non-scalability of the algorithm.

 

Number of  rows

Number of columns

Number of processor

Speedup

Execution time (secs)

449

42 million

1

1.00

3,000

 

 

8

3.33

900

 

 

12

4.16

720

 

 

16

4.19

715

1742

46 million

1

1.00

24,000

 

 

12

3.22

7,440

 

 

16

4.00

6,000

 

 

20

3.41

7,020

3143

46 million

1

1.00

62,460

 

 

12

2.30

27,120

 

 

16

2.50

24,960

 

 

20

2.51

24,840

Table 1

 

Larger speedups can be obtained on problems with more rows. Figure 4 shows the speedup for a problem with 10,000 rows and 20,000 million columns.

Figure 4

To summarize, this research shows how to effectively use a distributed memory cluster consisting of 300 MHz Intel Dual Pentium II processors to solve large-scale linear programs. The problems we are able to solve are substantially larger than the previously largest solved linear program and are, to our knowledge, the largest linear programs solved so far. R. Bixby et al., Very Large-scale Linear Programming: A Case Study in Combining Interior Point and Simplex Methods, Operations Research 40, pp. 885-897, were able to solve a linear program with 800 rows and 8 million columns which is by an order of magnitude smaller than our linear programs. Recently, F. Barahona and R. Anbil, On some Difficult Linear Programs Coming from Set Partitioning, Technical Report RC-21410, IBM T.J. Watson Research Center, developed a new linear programming algorithm and they solve linear programs with up to 5,000 rows and half a million variables.

D. Klabjan, Ellis L. Johnson, George L. Nemhauser, A Parallel Primal-dual Simplex Algorithm, Technical Report TLI/LEC-99-10, pdf.

Solving Large Airline Crew Scheduling Problems

The parallel pairing generation and the parallel primal-dual algorithm are combined into a new methodology to airline crew scheduling. Repeatedly, a large number of pairings are generated and then the linear programming relaxation is solved with the parallel primal-dual solver. After the linear programming relaxation is solved, a commercial branch-and-bound solver CPLEX is used to obtain an integer solution. The solver is enhanced with a strong branching rule that is carried out in parallel. 

 

Number of

rows

FTC

So far

best FTC

344

2.86%

3.43%

449

0.31%

0.93%

654

6.60%

10.0%

Table 2

 

Table 2 summarizes the computational results. FTC (flight-time-credit) is a traditional measure of quality for airline crew scheduling solutions. We improve the current solution substantially in all three instances. On average for each instance 20 processors are used and the overall execution time was 12 hours.

D. Klabjan, Ellis L. Johnson, George L. Nemhauser, Solving Large Airline Crew Scheduling Problems: Random Pairing Generation and Strong Branching, Technical Report TLI/LEC-99-11, pdf.

to IHPCL main page