This came to me indirectly via one of my TAs.


        I have a question regarding pipelining and the branch predictor
        Why not have 2 pipelines and when a branch is encountered, have
        the other pipeline fetch the instructions as if the branch were
        to be taken and the current pipeline fetch the instruction as if
        it isn't to be taken.  Each pipeline would have its own PC but
        share the registers, condition codes, and stack pointer.


Indeed you could.  But be careful.  What if both paths need to update the
stack?  There are other problems with the sharing as well, but this is
enough for openers.


        When the branch instruction is calculated, the control logic
        could generate a signal selecting which pipeline to activate.


Yup.  In fact, a guy named Dollas (first name started with A, Illinois
PhD, if I remember correctly), who spent some time at Duke University
in the late 1980s as I recall before he returned to Greece, suggested
exactly that.  Also, IBM produced a machine for its 360/370 series that
did that.  I do not remember whether they ever released it as a product
or not.


        It seems then that you would never need a branch predictor and
        even though it slows the machine down if it doesn't take the branch
        (assuming we can only fetch for 1 pipeline at a time),


Actually, fetch bandwidth is probably not the problem.  If you are fetching
one instruction per cycle, instead fetch two from each path every other
cycle.  Certainly it is more complicated than this, but to a first
approximation, this is not the problem.
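To make the arithmetic concrete, here is a toy sketch (my own illustration, not from the email) of time-multiplexing one fetch port between the two paths.  The path names and cycle count are assumptions for the example:

```python
# Two pipelines share one fetch port: each path gets the port every other
# cycle and fetches two instructions, so each path still averages one
# instruction per cycle -- the same fetch bandwidth as a single pipeline.

fetched = {"taken": 0, "not_taken": 0}
CYCLES = 10

for cycle in range(CYCLES):
    # Alternate which path owns the fetch port this cycle.
    path = "taken" if cycle % 2 == 0 else "not_taken"
    fetched[path] += 2  # fetch two instructions from that path's PC

print(fetched)  # {'taken': 10, 'not_taken': 10}
```

After ten cycles each path has fetched ten instructions, i.e. one per cycle on average, which is why fetch bandwidth is not the limiting factor here.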


        it would speed it up if the branch is taken.
        This would mean that the branch instruction would not affect the
        performance of the machine either way.  It was just a thought,
        I guess, and I was wondering if not having a branch predictor (maybe
        something like I just explained) had any advantages or was even
        plausible with pipelining.


OK, got it.  Actually, it would work ...sort of.  Is it better than a
branch predictor?  Well, we have twice as much logic to do the job.  Is
that better than having one pipeline, predicting accurately 80% of the
time, and using all that logic for something else?  Science of tradeoffs,
remember?  If branches were predictable 50-50, it would be a different
matter.  On current machines, we are well over 90%.  In our research
group, we are closer to 98% prediction accuracy.  So, the first question
is, is it worth the resources?  You are doubling the cost of the logic.
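To show how little logic a decent predictor actually needs, here is a sketch of a classic 2-bit saturating-counter predictor.  This scheme is not described in the email; the table size, the indexing by PC bits, and the loop-branch trace below are all illustrative assumptions:

```python
# A 2-bit saturating counter per table entry: counts 0..3, predict taken
# when the counter is 2 or 3.  Two wrong outcomes in a row are needed to
# flip the prediction, which tolerates the single fall-through of a loop.

class TwoBitPredictor:
    def __init__(self, entries=1024):
        self.entries = entries
        self.counters = [2] * entries  # start at "weakly taken"

    def predict(self, pc):
        return self.counters[pc % self.entries] >= 2

    def update(self, pc, taken):
        i = pc % self.entries
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)

# A loop branch: taken 9 times, falls through once, over and over.
predictor = TwoBitPredictor()
outcomes = ([True] * 9 + [False]) * 100
correct = 0
for taken in outcomes:
    if predictor.predict(0x400) == taken:
        correct += 1
    predictor.update(0x400, taken)

print(correct / len(outcomes))  # 0.9
```

One small counter table gets 90% on this trace; real predictors with history bits do much better, which is the point of the tradeoff above.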

Now, then, let's take it one step further: Suppose another branch comes
along before the first branch is resolved.  Now we need 4 pipelines.  What
if a third branch comes along before the first is resolved -- then we need
8 pipelines.

Branches show up on the order of every five instructions.  Modern machines
fetch multiple instructions each cycle.  Say, 5 -- conservative by
tomorrow's standards (when you graduate).  That means roughly one branch
per fetch cycle, so every fetch cycle we need to double the number of
pipelines.  Say, ten cycles before the first branch is resolved.  That
means 1024 pipelines (in general), of which only one is doing useful
work -- that is, about 0.1%.
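The back-of-the-envelope arithmetic above can be checked in a few lines (the ten-cycle resolve latency is the email's assumption, not a measured figure):

```python
# With roughly one branch reaching fetch per cycle, the number of live
# pipelines doubles every cycle until the oldest branch resolves.

resolve_latency = 10                 # cycles before the first branch resolves
pipelines = 2 ** resolve_latency     # one doubling per cycle
useful_fraction = 1 / pipelines      # only one path turns out to be correct

print(pipelines)                     # 1024
print(f"{useful_fraction:.1%}")      # 0.1%
```

A thousand pipelines for one useful path is why nobody builds the fully general version of this scheme.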

Could we spend the logic in other more effective ways?

Hope this helps.  Thank you for the question.  Guess I do not have to cover
this in class  ...unless someone asks!

Good luck with the rest of the course.

Yale Patt