In response to 'test driven development', I propose 'stress driven architecture'.
I propose this partly in jest. But there is wisdom in capacity planning and in prototyping systems before full-scale development, as well as in periodic tests to see whether recently implemented design decisions have hurt system performance.
Some may say, and actually have, that this violates the 'premature optimization is the root of all evil' principle. But it should be noted that this principle is based on the 80/20 (90/10) rule. If 80% of your execution time is spent in only 20% of your code, it doesn't make much sense to spend time optimizing code that rarely gets executed. So, the principle goes, you should wait until much of the application has been developed, then profile the code to find the bottlenecks and remove them.
But as you step back from the code in an application, and look at the system architecture, the 'code' becomes communication paths (channels and/or connections), and applications. So maybe it could be said that 80% of your time is spent communicating across 20% of your paths.
If this is true, it makes perfect sense to profile the system and to try to remove bottlenecks as early as possible, since architectural decisions are made early and changes after development and/or deployment may not be feasible without significant cost. So you can say major architectural decisions are effectively made before coding begins.
Not making architectural decisions early is impossible. Usually the first decisions a team makes are around whether to use Java or Perl, J2EE or PHP, Squid or local cache, JSP+POJO+O/R or JSP+JDBC, etc.
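As a rough illustration of profiling paths rather than code, here is a minimal sketch that takes timing samples per communication path and reports the smallest set of paths that accounts for a given share of the total time. The function name `hot_paths`, the path labels, and the sample format are all made up for the example; real numbers would come from whatever tracing or logging the system already has.

```python
from collections import defaultdict


def hot_paths(samples, threshold=0.8):
    """Return the busiest paths, in descending order, whose combined
    time first reaches `threshold` (a fraction) of the total time.

    `samples` is an iterable of (path_name, time_spent) pairs; the
    format is an assumption made for this sketch.
    """
    totals = defaultdict(float)
    for path, spent in samples:
        totals[path] += spent

    grand_total = sum(totals.values())
    if grand_total == 0:
        return []

    hot, running = [], 0.0
    # Walk the paths from most to least expensive.
    for path, spent in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        hot.append(path)
        running += spent
        if running / grand_total >= threshold:
            break
    return hot


# Hypothetical measurements, in milliseconds.
samples = [
    ("web->db", 7000),
    ("web->cache", 500),
    ("web->db", 2000),
    ("web->auth", 300),
    ("web->mail", 200),
]
```

With these invented numbers, `hot_paths(samples)` singles out the one path to the database, which is where an early architectural fix would pay off most.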
So you need to validate your architecture early, if it's not one already validated by previous projects or by the marketplace.
And to validate it, a scaled prototype, or an early alpha of the system, should be stress tested as early as possible. You may very well find that OS issues or network topology issues are the key constraints, not the code or applications themselves.
For example, your only solution may be to scale laterally by adding more machines, instead of adding more CPUs or memory. And this change may force your first tier to be stateless as well as require you to order additional equipment.
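An early stress test of a prototype does not need much machinery. The sketch below is one possible shape, not a recommendation of a particular tool: it fires a fixed number of requests at a given concurrency level and reports errors, throughput and 95th-percentile latency. `fake_request` is a stand-in that just sleeps; in practice it would be replaced by a real call into the prototype. All names here are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(request_fn):
    """Run one request; return (succeeded, latency_in_seconds)."""
    start = time.perf_counter()
    try:
        request_fn()
        ok = True
    except Exception:
        ok = False
    return ok, time.perf_counter() - start


def stress(request_fn, concurrency, total_requests):
    """Issue `total_requests` calls using `concurrency` worker threads."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(lambda _: timed_call(request_fn),
                                range(total_requests)))
    elapsed = time.perf_counter() - started

    latencies = sorted(latency for _, latency in results)
    # Simple index-based approximation of the 95th percentile.
    p95 = latencies[min(len(latencies) - 1, int(0.95 * len(latencies)))]
    return {
        "concurrency": concurrency,
        "errors": sum(1 for ok, _ in results if not ok),
        "throughput": total_requests / elapsed,  # requests per second
        "p95_latency": p95,
    }


def fake_request():
    """Placeholder for a real call into the prototype (assumption)."""
    time.sleep(0.001)


if __name__ == "__main__":
    # Ramp the load and watch where throughput stops improving.
    for level in (1, 4, 16):
        print(stress(fake_request, level, 64))
```

Running the same ramp against the prototype on the intended hardware and network is what exposes whether the limit is in the code or in the OS, the network, or the machine count.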
Here is the full quote:
"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." - Donald Knuth (often attributed to C.A.R. Hoare)
And one on efficiency:
"More computing sins are committed in the name of efficiency (without necessarily achieving it) than for any other single reason - including blind stupidity." - W.A. Wulf
"Belated pessimization is the leaf of no good."
--Len Lattanzi
The reason premature optimization is a bad thing is not that it is a waste of effort. It has nothing to do with the 80/20 rule. It is that optimization typically complicates the structure, and so compromises clarity, quality and adaptability. I think this is the main concern behind the quote.
And local optimization may not even have the desired effect on performance. Removing a bottleneck can simply create problems elsewhere. As your quote from Wulf indicates, people do all sorts of things in the name of efficiency without always achieving it.
This is precisely why you need to think of performance AND adaptability AND other whole-system properties from an architectural point of view. But that's not premature optimization - it's simply timely architecture.