The vast majority of the time spent in the processing of Shor's
algorithm is in the discrete Fourier transform step. In the discrete
Fourier transform we iterate from 0 to *q*, and for each possible
value in that range we iterate over the entire register and perform
some mathematical operations. It is trivial to divide this work among
multiple process elements. One can simply iterate on each process
element from 0 to *q*, and for each value in the range iterate over
some prescribed subrange of the register.

In general Shor's algorithm simulation seems a good candidate for parallelization. The simulation can roughly be divided into three phases: prepossessing, simulation of the quantum register, and post processing. During the simulation of the quantum register, all the work is done in the form of applying the same operation to an entire array, where each array location represents one of the base states of the quantum register. This agrees with our conception of how a quantum register would function, as in a quantum computer, we are not free to perform an operation on only certain portions of the superposed state of the register, we must perform the operation on all portions.