This question has intrigued me for many years.
I have had the opportunity to talk to both Box and Cox about their transformation (Box and Cox, 1964). I conversed with the late George Box (deceased last March at age 93) when I was a visitor in Madison, Wisconsin, back in 1993-94.
A few years later I talked to David Cox at a conference on reliability in Bordeaux (MMR’2000).
I asked them both the same question and received the same response.
The question was: What was the theory that led to the derivation of the Box-Cox transformation?
The answer was: “No theory. This was a purely empirical observation”.
The question therefore remains: Why is the Box-Cox transformation so effective, in particular when applied to a response variable in the framework of linear regression analysis?
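For readers who want the transformation itself in view, here is a minimal sketch of its standard one-parameter form: (y^λ − 1)/λ for λ ≠ 0, and log y in the limiting case λ = 0. The function name `box_cox` and the sample values are illustrative choices of mine, not taken from the 1964 paper, and the sketch only applies the transformation for a given λ; it does not estimate λ from data.

```python
import math

def box_cox(y, lam):
    """Box-Cox transform of a strictly positive value y for power parameter lam.

    Illustrative helper (hypothetical name): applies the standard one-parameter
    form and does not estimate lam.
    """
    if y <= 0:
        raise ValueError("Box-Cox requires a strictly positive response")
    if lam == 0:
        return math.log(y)  # limiting case as lam -> 0
    return (y ** lam - 1.0) / lam

# Arbitrary positive responses, transformed for a few common choices of lam.
values = [0.5, 1.0, 2.0, 4.0]
for lam in (1.0, 0.5, 0.0, -1.0):
    print(lam, [round(box_cox(v, lam), 4) for v in values])
```

With λ = 1 the response is merely shifted by one, λ = 0.5 corresponds to a square-root scale, λ = 0 to the logarithm, and λ = −1 to a reciprocal scale, which is why a single parameter covers so many of the transformations commonly tried on a regression response.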
In a new article, posted in my personal library at the American Statistical Association (ASA) site, I discuss this issue at some length. The article is now generally available for download here (Article #1 below).