Y'all are talking about two different workflows. In the first, the voice is the story and everything revolves around it. In the second, the voice is placed on the timeline according to the visual story.
When I do a :30 spot, I do the voice and it won't change. It is the foundation of the story, so I attach the visuals to it to match the voice's timing.
When I do an educational narrative for a TV show, I record the voice as attached, blade it up into the necessary segments, then move them to where they need to be to match the visual process, which is the foundation of the story.
Also, there's no set-in-stone way to do either.