August 9, 2012
Hi Dan,

Thank you for lying about my looks over time, Dan. I only wish it were true.

And I think the Ball and Brown presentations along with the other plenary sessions at the 2012 American Accounting Association Annual Meetings will eventually be available on the AAA Commons. Even discussants of plenary speakers had to sign video permission forms, so our presentations may also be available on the Commons. I was a discussant of the Deirdre McCloskey plenary presentation on Monday, August 6. If you eventually view me on this video you can judge how badly Dan Stone lies.

My fellow discussants were impressive, including Rob Bloomfield from Cornell, Bill Kinney from the University of Texas (one of my former doctoral students), and Stanimir Markov from UT Dallas. Our moderator was Sudipta Basu from Temple University.

The highlight of the AAA meetings for me was having an intimate breakfast with Deirdre McCloskey. She and I had a really fine chat before four others joined us for this breakfast hosted by the AAA prior to her plenary presentation. What a dedicated scholar she is across decades of writing huge and detailed history books --- http://en.wikipedia.org/wiki/Deirdre_McCloskey 

In my view she's the finest living economic historian in the world. Sadly, she may also be one of the worst speakers in front of a large audience. Much of this is no fault of her own, and I admire her greatly for having the courage to speak in large convention halls. She can't be blamed for having a rather crackling voice and a very distracting stammer. Sometimes she just cannot get a particular word out.

My second criticism is that when making a technical presentation, as opposed to something like a political speech, it really does help to have a few PowerPoint slides that highlight some of the main bullet points. The AAA sets up these plenary sessions with two very large screens and a number of other large-screen television sets that can show both the speaker's talking head and the speaker's PowerPoint slides.

In the case of Deirdre's presentation, as with most other technical presentations, it really helped to have studied her material beforehand. For this presentation I had carefully studied her book quoted at

The Cult of Statistical Significance: How Standard Error Costs Us Jobs, Justice, and Lives, by Stephen T. Ziliak and Deirdre N. McCloskey (Ann Arbor: University of Michigan Press, ISBN-13: 978-0-472-05007-9, 2007)
http://www.cs.trinity.edu/~rjensen/temp/DeirdreMcCloskey/StatisticalSignificance01.htm 

pp. 250-251
The textbooks are wrong. The teaching is wrong. The seminar you just attended is wrong. The most prestigious journal in your scientific field is wrong.

You are searching, we know, for ways to avoid being wrong. Science, as Jeffreys said, is mainly a series of approximations to discovering the sources of error. Science is a systematic way of reducing wrongs or can be. Perhaps you feel frustrated by the random epistemology of the mainstream and don't know what to do. Perhaps you've been sedated by significance and lulled into silence. Perhaps you sense that the power of a Rothamsted test against a plausible Dublin alternative is statistically speaking low but you feel oppressed by the instrumental variable one should dare not to wield. Perhaps you feel frazzled by what Morris Altman (2004) called the "social psychology rhetoric of fear," the deeply embedded path dependency that keeps the abuse of significance in circulation. You want to come out of it. But perhaps you are cowed by the prestige of Fisherian dogma. Or, worse thought, perhaps you are cynically willing to be corrupted if it will keep a nice job.

She is now writing a sequel to that book, and I cannot wait.

A second highlight for me at these 2012 AAA annual meetings was a single sentence in the Tuesday morning plenary presentation of Gregory S. Berns, the Director of the (Brain) Center for Neuropolicy at Emory University. In that presentation, Dr. Berns described how the brain is divided into over 10,000 sectors that are then studied in terms of blood flow (say from reward or punishment) in a brain scan. The actual model used is the ever-popular General Linear Model (GLM) regression equation.

The sentence in question probably went over the heads of almost everybody in the audience but me. He discussed how sample sizes are so large in these brain studies that efforts are made to avoid being misled by statistically significant GLM coefficients (due to large sample sizes) that are not substantively significant. BINGO! Isn't this exactly what Deirdre McCloskey was warning about in her plenary session a day earlier?

This is an illustration of a real scientist knowing what statistical inference dangers lurk in large samples --- dangers that so many of our accountics science researchers seemingly overlook as they add those Pearson asterisks of statistical significance to findings of questionable substance in their research.

And Dr. Berns did not mention this because he was reminded of the danger in Deirdre's presentation the day before; he was not at the meetings the day before and did not hear Deirdre's presentation. Great scientists have learned to be especially knowledgeable about the limitations of statistical significance testing --- which is really intended more for small samples than for the very large samples used in capital markets studies by accountics scientists.
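
Dr. Berns's point is easy to reproduce with a toy regression. Here is a minimal sketch in Python (all of the numbers are hypothetical, and this is a plain OLS fit rather than whatever voxel-level GLM software his lab actually uses): with half a million observations, a slope that explains only about 0.01 percent of the variance still comes back with a vanishingly small p-value.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500_000                            # hypothetical very large sample
x = rng.normal(size=n)                 # hypothetical regressor (e.g., a stimulus signal)
y = 0.01 * x + rng.normal(size=n)      # true slope is tiny; R^2 is about 0.0001

fit = sm.OLS(y, sm.add_constant(x)).fit()
print(f"slope = {fit.params[1]:.4f}, p = {fit.pvalues[1]:.1e}, R^2 = {fit.rsquared:.5f}")
# Statistically "significant" beyond any doubt, yet substantively trivial.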


2012 AAA Meeting Plenary Speakers and Response Panel Videos ---
http://commons.aaahq.org/hives/20a292d7e9/summary
I think you have to be an AAA member and log into the AAA Commons to view these videos.
Bob Jensen is an obscure speaker following the handsome Rob Bloomfield
in the 1.02 Deirdre McCloskey Follow-up Panel—Video ---
http://commons.aaahq.org/posts/a0be33f7fc

My threads on Deirdre McCloskey and my own talk are at
http://www.cs.trinity.edu/~rjensen/temp/DeirdreMcCloskey/StatisticalSignificance01.htm

September 13, 2012 reply from Jagdish Gangolly

Bob,

Thank you so much for posting this.

What a wonderful speaker Deirdre McCloskey is! She reminded me of J.R. Hicks, who also was a stammerer. For an economist, her deep and remarkable understanding of statistics amazed me.

It was nice to hear about Gosset, perhaps the only human being who got along well with both Karl Pearson and R.A. Fisher; getting along with the latter was itself a Herculean feat.

Although Gosset was helped in the mathematical derivation of small-sample theory by Karl Pearson, Pearson did not appreciate its importance; that was left to Pearson's nemesis, R.A. Fisher. It is remarkable that Gosset could work with these two giants who couldn't stand each other.

In later life Fisher and Gosset parted ways in that Fisher was a proponent of randomization of experiments while Gosset was a proponent of systematic planning of experiments and in fact proved decisively that balanced designs are more precise, powerful and efficient compared with Fisher's randomized experiments (see http://sites.roosevelt.edu/sziliak/files/2012/02/William-S-Gosset-and-Experimental-Statistics-Ziliak-JWE-2011.pdf )

I remember my father (who designed experiments in horticulture for a living) telling me the virtues of balanced designs at the same time my professors in school were extolling the virtues of randomisation.

In Gosset's writings we also find seeds of Bayesian thinking.

While I have always had a great regard for Fisher (visit to the tree he planted at the Indian Statistical Institute in Calcutta was for me more of a pilgrimage), I think his influence on the development of statistics was less than ideal.

Regards,

Jagdish

Jagdish S. Gangolly
Department of Informatics College of Computing & Information
State University of New York at Albany
Harriman Campus, Building 7A, Suite 220
Albany, NY 12222 Phone: 518-956-8251, Fax: 518-956-8247

Hi Jagdish,

You're one of the few people who can really appreciate Deirdre's scholarship in history, economics, and statistics. When she stumbled for what seemed like forever trying to get a word out, it actually helped me remember that word afterwards.


Interestingly, two Nobel economists slugged it out over the very essence of theory some years back. Herb Simon insisted that the purpose of theory was to explain. Milton Friedman went off on the F-Twist tangent, saying that it was enough if a theory merely predicted. I lost some (certainly not all) respect for Friedman over this. Deirdre, who knew Milton, claims that deep in his heart Milton did not ultimately believe this to the degree that it is attributed to him. Of course Deirdre herself is not a great admirer of Neyman, Savage, or Fisher.

Friedman's essay "The Methodology of Positive Economics" (1953) provided the epistemological pattern for his own subsequent research and to a degree that of the Chicago School. There he argued that economics as science should be free of value judgments for it to be objective. Moreover, a useful economic theory should be judged not by its descriptive realism but by its simplicity and fruitfulness as an engine of prediction. That is, students should measure the accuracy of its predictions, rather than the 'soundness of its assumptions'. His argument was part of an ongoing debate among such statisticians as Jerzy Neyman, Leonard Savage, and Ronald Fisher.

Many of us on the AECM are not great admirers of positive economics ---
http://www.trinity.edu/rjensen/theory02.htm#PostPositiveThinking

Everyone is entitled to their own opinion, but not their own facts.
Senator Daniel Patrick Moynihan --- FactCheck.org ---
http://www.factcheck.org/

Then again, maybe we're all entitled to our own facts!

"The Power of Postpositive Thinking," Scott McLemee, Inside Higher Ed, August 2, 2006 --- http://www.insidehighered.com/views/2006/08/02/mclemee

In particular, a dominant trend in critical theory was the rejection of the concept of objectivity as something that rests on a more or less naive epistemology: a simple belief that “facts” exist in some pristine state untouched by “theory.” To avoid being naive, the dutiful student learned to insist that, after all, all facts come to us embedded in various assumptions about the world. Hence (ta da!) “objectivity” exists only within an agreed-upon framework. It is relative to that framework. So it isn’t really objective....

What Mohanty found in his readings of the philosophy of science were much less naïve, and more robust, conceptions of objectivity than the straw men being thrashed by young Foucauldians at the time. We are not all prisoners of our paradigms. Some theoretical frameworks permit the discovery of new facts and the testing of interpretations or hypotheses. Others do not. In short, objectivity is a possibility and a goal — not just in the natural sciences, but for social inquiry and humanistic research as well.

Mohanty’s major theoretical statement on PPR arrived in 1997 with Literary Theory and the Claims of History: Postmodernism, Objectivity, Multicultural Politics (Cornell University Press). Because poststructurally inspired notions of cultural relativism are usually understood to be left wing in intention, there is often a tendency to assume that hard-edged notions of objectivity must have conservative implications. But Mohanty’s work went very much against the current.

“Since the lowest common principle of evaluation is all that I can invoke,” wrote Mohanty, complaining about certain strains of multicultural relativism, “I cannot — and consequently need not — think about how your space impinges on mine or how my history is defined together with yours. If that is the case, I may have started by declaring a pious political wish, but I end up denying that I need to take you seriously.”

PPR did not require throwing out the multicultural baby with the relativist bathwater, however. It meant developing ways to think about cultural identity and its discontents. A number of Mohanty’s students and scholarly colleagues have pursued the implications of postpositive identity politics. I’ve written elsewhere about Moya, an associate professor of English at Stanford University who has played an important role in developing PPR ideas about identity. And one academic critic has written an interesting review essay on early postpositive scholarship — highly recommended for anyone with a hankering for more cultural theory right about now.

Not everybody with a sophisticated epistemological critique manages to turn it into a functioning think tank — which is what started to happen when people in the postpositive circle started organizing the first Future of Minority Studies meetings at Cornell and Stanford in 2000. Others followed at the University of Michigan and at the University of Wisconsin in Madison. Two years ago FMS applied for a grant from Mellon Foundation, receiving $350,000 to create a series of programs for graduate students and junior faculty from minority backgrounds.

The FMS Summer Institute, first held in 2005, is a two-week seminar with about a dozen participants — most of them ABD or just starting their first tenure-track jobs. The institute is followed by a much larger colloquium (the part I got to attend last week). As schools of thought in the humanities go, the postpositivists are remarkably light on the in-group jargon. Someone emerging from the Institute does not, it seems, need a translator to be understood by the uninitiated. Nor was there a dominant theme at the various panels I heard.

Rather, the distinctive quality of FMS discourse seems to derive from a certain very clear, but largely unstated, assumption: It can be useful for scholars concerned with issues particular to one group to listen to the research being done on problems pertaining to other groups.

That sounds pretty simple. But there is rather more behind it than the belief that we should all just try to get along. Diversity (of background, of experience, of disciplinary formation) is not something that exists alongside or in addition to whatever happens in the “real world.” It is an inescapable and enabling condition of life in a more or less democratic society. And anyone who wants it to become more democratic, rather than less, has an interest in learning to understand both its inequities and how other people are affected by them.

A case in point might be the findings discussed by Claude Steele, a professor of psychology at Stanford, in a panel on Friday. His paper reviewed some of the research on “identity contingencies,” meaning “things you have to deal with because of your social identity.” One such contingency is what he called “stereotype threat” — a situation in which an individual becomes aware of the risk that what you are doing will confirm some established negative quality associated with your group. And in keeping with the threat, there is a tendency to become vigilant and defensive.

Steele did not just have a string of concepts to put up on PowerPoint. He had research findings on how stereotype threat can affect education. The most striking involved results from a puzzle-solving test given to groups of white and black students. When the test was described as a game, the scores for the black students were excellent — conspicuously higher, in fact, than the scores of white students. But in experiments where the very same puzzle was described as an intelligence test, the results were reversed. The black kids' scores dropped by about half, while the graph for their white peers spiked.

The only variable? How the puzzle was framed — with distracting thoughts about African-American performance on IQ tests creating “stereotype threat” in a way that game-playing did not.

Steele also cited an experiment in which white engineering students were given a mathematics test. Just beforehand, some groups were told that Asian students usually did really well on this particular test. Others were simply handed the test without comment. Students who heard about their Asian competitors tended to get much lower scores than the control group.

Extrapolate from the social psychologist’s experiments with the effect of a few innocent-sounding remarks — and imagine the cumulative effect of more overt forms of domination. The picture is one of a culture that is profoundly wasteful, even destructive, of the best abilities of many of its members.

“It’s not easy for minority folks to discuss these things,” Satya Mohanty told me on the final day of the colloquium. “But I don’t think we can afford to wait until it becomes comfortable to start thinking about them. Our future depends on it. By ‘our’ I mean everyone’s future. How we enrich and deepen our democratic society and institutions depends on the answers we come up with now.”

Earlier this year, Oxford University Press published a major new work on postpositivist theory, Visible Identities: Race, Gender, and the Self, by Linda Martin Alcoff, a professor of philosophy at Syracuse University. Several essays from the book are available at the author's Web site.


Steve Kachelmeier wrote the following on May 7, 2012

I like to pose this question to first-year doctoral students: Two researchers test a null hypothesis using a classical statistical approach. The first researcher tests a sample of 20 and the second tests a sample of 20,000. Both find that they can reject the null hypothesis at the same exact "p-value" of 0.05. Which researcher can say with greater confidence that s/he has found a meaningful departure from the null?

The vast majority of doctoral students respond that the researcher who tested 20,000 can state the more meaningful conclusion. I then need to explain for about 30 minutes how statistics already dearly penalizes the small-sample-size researcher for the small sample size, such that a much bigger "effect size" is needed to generate the same p-value. Thus, I argue that the researcher with n=20 has likely found the more meaningful difference. The students give me a puzzled look, but I hope they (eventually) get it.

The moral? As I see it, the problem is not so much whether we use classical or Bayesian statistical testing. Rather, the problem is that we grossly misinterpret the word "significance" as meaning "big," "meaningful," or "consequential," when in a statistical sense it only means "something other than zero."
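
Kachelmeier's classroom point can be put in numbers. Here is a minimal sketch, assuming a two-sided one-sample t-test (his example does not name a particular test), of the smallest standardized effect size that can just reach p = 0.05 at each sample size:

import math
from scipy import stats

def min_detectable_d(n, alpha=0.05):
    # Smallest Cohen's d (mean shift in standard-deviation units) that a
    # two-sided one-sample t-test can just reject at the given alpha.
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)
    return t_crit / math.sqrt(n)

for n in (20, 20_000):
    print(f"n = {n:>6}: effect size needed for p = .05 is d = {min_detectable_d(n):.3f}")
# Roughly d = 0.47 at n = 20 versus d = 0.014 at n = 20,000, which is why the
# n = 20 researcher has likely found the more meaningful difference.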

In Accountics Science R2 = 0.0004 = (-.02)(-.02) Can Be Deemed a Statistically Significant Linear Relationship
"Disclosures of Insider Purchases and the Valuation Implications of Past Earnings Signals," by David Veenman, The Accounting Review, January 2012 ---
http://aaajournals.org/doi/full/10.2308/accr-10162

. . .

Table 2 presents descriptive statistics for the sample of 12,834 purchase filing observations. While not all market responses to purchase filings are positive (the Q1 value of CAR% equals −1.78 percent), 25 percent of filings are associated with a market reaction of at least 5.32 percent. Among the main variables, AQ and AQI have mean (median) values of 0.062 (0.044) and 0.063 (0.056), respectively. By construction, the average of AQD is approximately zero. ΔQEARN and ΔFUTURE are also centered around zero.

Jensen Comment
Note that correlations shown in boldface type are deemed statistically significant at a .05 level. I wonder what it tells me when a -0.02 correlation is statistically significant at a .05 level and a -0.01 correlation is not. I have similar doubts about the distinctions between "statistical significance" in the subsequent tables that compare .10, .05, and .01 levels of significance.

Especially note that if David Veenman sufficiently increased the sample size, even -.00002 and -.00001 correlations might be made statistically significant.
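
A minimal sketch of that sample-size arithmetic (the only numbers taken from the paper are r = -0.02 and n = 12,834; the other sample sizes are hypothetical) shows how the same tiny correlation crosses the .05 line once the sample is large enough:

import math
from scipy import stats

def corr_p_value(r, n):
    # Two-sided p-value for a Pearson correlation r computed from n observations.
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    return 2 * stats.t.sf(abs(t), df=n - 2)

for n in (100, 1_000, 12_834, 1_000_000):
    print(f"r = -0.02, n = {n:>9,}: p = {corr_p_value(-0.02, n):.4f}")
# r = -0.02 is "insignificant" at n = 1,000 but "significant" at n = 12,834,
# even though R^2 = 0.0004 either way.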


Just so David Veenman does not think I only singled him out for illustrative purposes
In Accountics Science R2 = 0.000784 = (-.028)(-.028) Can Be Deemed a Statistically Significant Linear Relationship
"Cover Me: Managers' Responses to Changes in Analyst Coverage in the Post-Regulation FD Period," by Divya Anantharaman and Yuan Zhang, The Accounting Review, November 2011 ---
http://aaajournals.org/doi/full/10.2308/accr-10126


 

I might have written a commentary about this and submitted it to The Accounting Review (TAR), but 574 referees at TAR will not publish critical commentaries of papers previously published in TAR ---
http://www.trinity.edu/rjensen/TheoryTAR.htm

How Accountics Scientists Should Change: 
"Frankly, Scarlett, after I get a hit for my resume in The Accounting Review I just don't give a damn"
http://www.cs.trinity.edu/~rjensen/temp/AccounticsDamn.htm
One more mission in what's left of my life will be to try to change this
http://www.cs.trinity.edu/~rjensen/temp/AccounticsDamn.htm 




The Cult of Statistical Significance:  How Standard Error Costs Us Jobs, Justice, and Lives, by Stephen T. Ziliak and Deirdre N. McCloskey (Ann Arbor:  University of Michigan Press, ISBN-13: 978-0-472-05007-9, 2007)

Page 206
Like scientists today in medical and economic and other sizeless sciences, Pearson mistook a large sample size for the definite, substantive significance---evidence, as Hayek put it, of "wholes." But it was, as Hayek said, "just an illusion." Pearson's columns of sparkling asterisks, though quantitative in appearance and as appealing as the simple truth of the sky, signified nothing.

 pp. xv-xvi
The implied reader of our book is a significance tester, the keeper of numerical things. We want to persuade you of one claim:  that William Sealy Gosset (1876-1937) --- aka "Student" of Student's t-test --- was right and that his difficult friend, Ronald A. Fisher, though a genius, was wrong. Fit is not the same thing as importance. Statistical significance is not the same thing as scientific finding. R2, t-statistic, p-value, F-test, and all the more sophisticated versions of them in time series and the most advanced statistics are misleading at best.

No working scientist today knows much about Gosset, a brewer of Guinness stout and the inventor of a good deal of modern statistics. The scruffy little Gosset, with his tall leather boots and a rucksack on his back, is the heroic underdog of our story. Gosset, we claim, was a great scientist. He took an economic approach to the logic of uncertainty. For over two decades he quietly tried to educate Fisher. But Fisher, our flawed villain, erased from Gosset's inventions the consciously economic element. We want to bring it back.

. . .

Can so many scientists have been wrong for the eighty years since 1925? Unhappily yes. The mainstream in science, as any scientist will tell you, is often wrong. Otherwise, come to think of it, science would be complete. Few scientists would make that claim, or would want to. Statistical significance is surely not the only error of modern science, although it has been, as we will show, an exceptionally damaging one. Scientists are often tardy in fixing basic flaws in their sciences despite the presence of better alternatives. ...

Continued in the Preface


Page 3
A brewer of beer, William Sealy Gosset (1876-1937), proved it (statistical significance) in small samples. He worked at the Guinness Brewery in Dublin, where for most of his working life he was head experimental brewer. He saw in 1905 the need for a small-sample test because he was testing varieties of hops and barley in field samples with N as small as four. Gosset, who is hardly remembered nowadays, quietly invented many tools of modern applied statistics, including Monte Carlo analysis, the balanced design of experiments, and, especially, Student's t, which is the foundation of small-sample theory and the most commonly used test of statistical significance in the sciences. ... But the value Gosset intended with his test, he said without deviation from 1905 until his death in 1937, was its ability to sharpen statements of substantive or economic significance. ... (he) wrote to his elderly friend, the great Karl Pearson:  "My own war work is obviously to brew Guinness stout in such a way as to waste as little labor and material as possible, and I am hoping to help to do something fairly creditable in that way." It seems he did.

Page 10
Sizelessness is not what most Fisherians (disciples of Ronald Fisher) believe they are getting. The sizeless scientists have adopted a method of deciding which numbers are significant that has little to do with the humanly significant numbers. The scientists are counting, to be sure:  "3.14159***," they proudly report, or simply "****." But, as the probabilist Bruno de Finetti said, the scientists are acting as though "addition requires different operations if concerned with pure number or amounts of money" (De Finetti 1971, 486, quoted in Savage 1971a).

Substituting "significance" for the scientific how much would imply that the value of a lottery ticket is the chance itself, the chance of 1 in 38,000, say, or 1 in 1,000,000,000. It supposes that the only source of value in the lottery is sampling variability. It sets aside as irrelevant---simply ignores---the value of the expected prize, the millions that success in the lottery could in fact yield. Setting aside both old and new criticisms of expected utility theory, a prize of $3.56 is very different, other things equal, from a prize of $356,000,000. No matter. Statistical significance, startlingly, ignores the difference.

Continued on Page 10

Page 15
The doctor who cannot distinguish statistical significance from substantive significance, an F-statistic from a heart attack, is like an economist who ignores opportunity cost---what statistical theorists call the loss function. The doctors of "significance" in medicine and economics are merely "deciding what to say rather than what to do" (Savage 1954, 159). In the 1950s Ronald Fisher published an article and a book intended to rid decision from the vocabulary of working statisticians (1955, 1956). He was annoyed by the rising authority in highbrow circles of those he called "the Neymanites."

Continued on Page 15


pp. 28-31
An example is provided regarding how Merck manipulated statistical inference to keep its deadly painkiller Vioxx from being pulled from the market.

Page 31
Another story. The Japanese government in June 2005 increased the limit on the number of whales that may be killed annually in the Antarctic---from around 440 annually to over 1,000 annually. Deputy Commissioner Akira Nakamae explained why:  "We will implement JARPA-2 [the plan for the higher killing] according to the schedule, because the sample size is determined in order to get statistically significant results" (Black 2005). The Japanese hunt the whales, they claim, in order to collect scientific data on them. That and whale steaks. The commissioner is right:  increasing sample size, other things equal, does increase the statistical significance of the result. It is, after all, a mathematical fact that statistical significance increases, other things equal, as sample size increases. Thus the theoretical standard error of JARPA-2, s/SQRT(440+560) [given for example the simple mean formula], yields more sampling precision than the standard error of JARPA-1, s/SQRT(440). In fact it raises the significance level to Fisher's 5 percent cutoff. So the Japanese government has found a formula for killing more whales, annually some 560 additional victims, under the cover of getting the conventional level of Fisherian statistical significance for their "scientific" studies.
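
The arithmetic in that whale passage is worth making explicit. Here is a minimal sketch (the 0.08 "effect" and unit standard deviation are made up; only the sample sizes come from the quote) of how the standard error s/SQRT(N) shrinks as N grows and pushes a fixed difference across Fisher's 5 percent line:

import math

effect, s = 0.08, 1.0                  # hypothetical observed difference and std. deviation
for n in (440, 1_000):
    se = s / math.sqrt(n)              # standard error shrinks with the square root of N
    print(f"N = {n:5d}: standard error = {se:.4f}, z = {effect / se:.2f}")
# z is about 1.68 at N = 440 (p > .05, two-sided) but about 2.53 at N = 1,000 (p < .05):
# the data did not change, only the sample size did.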


pp. 250-251
The textbooks are wrong. The teaching is wrong. The seminar you just attended is wrong. The most prestigious journal in your scientific field is wrong.

You are searching, we know, for ways to avoid being wrong. Science, as Jeffreys said, is mainly a series of approximations to discovering the sources of error. Science is a systematic way of reducing wrongs or can be. Perhaps you feel frustrated by the random epistemology of the mainstream and don't know what to do. Perhaps you've been sedated by significance and lulled into silence. Perhaps you sense that the power of a Rothamsted test against a plausible Dublin alternative is statistically speaking low but you feel oppressed by the instrumental variable one should dare not to wield. Perhaps you feel frazzled by what Morris Altman (2004) called the "social psychology rhetoric of fear," the deeply embedded path dependency that keeps the abuse of significance in circulation. You want to come out of it. But perhaps you are cowed by the prestige of Fisherian dogma. Or, worse thought, perhaps you are cynically willing to be corrupted if it will keep a nice job.


 

See the review at
http://economiclogic.blogspot.com/2012/03/about-cult-of-statistical-significance.html

Costs and Benefits of Significance Testing ---
http://www.cato.org/pubs/journal/cj28n2/cj28n2-16.pdf

Jensen Comment
I'm only part way into the book and reserve judgment at this point. It seems to me in these early stages that they overstate their case (in a very scholarly but divisive  way). However, I truly am impressed by the historical citations in this book and the huge number of footnotes and references. The book has a great index.

For most of my scholastic life I've argued that there's a huge difference between significance testing versus substantive testing. The first thing I look for when asked to review an accountics science study is the size of the samples. But this issue is only a part of this fascinating book.

Deirdre McCloskey will kick off the American Accounting Association Annual Meetings in Washington DC with a plenary session first thing in the morning on August 6, 2012. However, she's not a student of accounting. She's the Distinguished Professor of Economics, History, English, and Communication at the University of Illinois at Chicago and to date has received four honorary degrees ---
http://www.deirdremccloskey.com/
Also see http://en.wikipedia.org/wiki/Deirdre_McCloskey

I've been honored to be on a panel that follows her presentation to debate her remarks. Her presentation will also focus on Bourgeois Dignity: Why Economics Can't Explain the Modern World.

Stephen T. Ziliak is a former professor of economics at Carnegie who is now a Professor of Economics specializing in poverty research at Roosevelt University ---
http://en.wikipedia.org/wiki/Stephen_T._Ziliak




May 11, 2012 reply from Jagdish Gangolly

Hopefully this is my last post on this thread. I just could not resist posting this appeal to editors, chairs, directors, reviewers,... by Professor John Kruschke, Professor of Psychological and Brain Sciences and Statistics at Indiana University.

His book on " Doing Bayesian Data Analysis: A Tutorial with R and BUGS " is the best introductory textbook on statistics I have read.

Regards to all,

Jagdish

Here is the open letter: ___________________________________________________

An open letter to Editors of journals, Chairs of departments, Directors of funding programs, Directors of graduate training, Reviewers of grants and manuscripts, Researchers, Teachers, and Students:

Statistical methods have been evolving rapidly, and many people think it’s time to adopt modern Bayesian data analysis as standard procedure in our scientific practice and in our educational curriculum. Three reasons:

1. Scientific disciplines from astronomy to zoology are moving to Bayesian data analysis. We should be leaders of the move, not followers.

2. Modern Bayesian methods provide richer information, with greater flexibility and broader applicability than 20th century methods. Bayesian methods are intellectually coherent and intuitive. Bayesian analyses are readily computed with modern software and hardware.

3. Null-hypothesis significance testing (NHST), with its reliance on p values, has many problems. There is little reason to persist with NHST now that Bayesian methods are accessible to everyone.

My conclusion from those points is that we should do whatever we can to encourage the move to Bayesian data analysis. Journal editors could accept Bayesian data analyses, and encourage submissions with Bayesian data analyses. Department chairpersons could encourage their faculty to be leaders of the move to modern Bayesian methods. Funding agency directors could encourage applications using Bayesian data analysis. Reviewers could recommend Bayesian data analyses. Directors of training or curriculum could get courses in Bayesian data analysis incorporated into the standard curriculum. Teachers can teach Bayesian. Researchers can use Bayesian methods to analyze data and submit the analyses for publication. Students can get an advantage by learning and using Bayesian data analysis.

The goal is encouragement of Bayesian methods, not prohibition of NHST or other methods. Researchers will embrace Bayesian analysis once they learn about it and see its many practical and intellectual advantages. Nevertheless, change requires vision, courage, incentive, effort, and encouragement!

Now to expand on the three reasons stated above.

1. Scientific disciplines from astronomy to zoology are moving to Bayesian data analysis. We should be leaders of the move, not followers.

Bayesian methods are revolutionizing science. Notice the titles of these articles:

Bayesian computation: a statistical revolution. Brooks, S.P. Philosophical Transactions of the Royal Society of London. Series A: Mathematical, Physical and Engineering Sciences, 361(1813), 2681, 2003.

The Bayesian revolution in genetics. Beaumont, M.A. and Rannala, B. Nature Reviews Genetics, 5(4), 251-261, 2004.

A Bayesian revolution in spectral analysis. Gregory, PC. AIP Conference Proceedings, 557-568, 2001.

The hierarchical Bayesian revolution: how Bayesian methods have changed the face of marketing research. Allenby, G.M. and Bakken, D.G. and Rossi, P.E. Marketing Research, 16, 20-25, 2004

The future of statistics: A Bayesian 21st century. Lindley, DV. Advances in Applied Probability, 7, 106-115, 1975.

There are many other articles that make analogous points in other fields, but with less pithy titles. If nothing else, the titles above suggest that the phrase “Bayesian revolution” is not an overstatement.

The Bayesian revolution spans many fields of science. Notice the titles of these articles:

Bayesian analysis of hierarchical models and its application in AGRICULTURE. Nazir, N., Khan, A.A., Shafi, S., Rashid, A. InterStat, 1, 2009.

The Bayesian approach to the interpretation of ARCHAEOLOGICAL DATA. Litton, CD & Buck, CE. Archaeometry, 37(1), 1-24, 1995.

The promise of Bayesian inference for ASTROPHYSICS. Loredo TJ. In: Feigelson ED, Babu GJ, eds. Statistical Challenges in Modern Astronomy. New York: Springer-Verlag; 1992, 275–297.

Bayesian methods in the ATMOSPHERIC SCIENCES. Berliner LM, Royle JA, Wikle CK, Milliff RF. In: Bernardo JM, Berger JO, Dawid AP, Smith AFM, eds. Bayesian Statistics 6: Proceedings of the sixth Valencia international meeting, June 6–10, 1998. Oxford, UK: Oxford University Press; 1999, 83–100.

An introduction to Bayesian methods for analyzing CHEMISTRY data: Part II: A review of applications of Bayesian methods in CHEMISTRY. Hibbert, DB and Armstrong, N. Chemometrics and Intelligent Laboratory Systems, 97(2), 211-220, 2009.

Bayesian methods in CONSERVATION BIOLOGY. Wade PR. Conservation Biology, 2000, 1308–1316.

Bayesian inference in ECOLOGY. Ellison AM. Ecol Biol 2004, 7:509–520.

The Bayesian approach to research in ECONOMIC EDUCATION. Kennedy, P. Journal of Economic Education, 17, 9-24, 1986.

The growth of Bayesian methods in statistics and ECONOMICS since 1970. Poirier, D.J. Bayesian Analysis, 1(4), 969-980, 2006.

Commentary: Practical advantages of Bayesian analysis of EPIDEMIOLOGIC DATA. Dunson DB. Am J Epidemiol 2001, 153:1222–1226.

Bayesian inference of phylogeny and its impact on EVOLUTIONARY BIOLOGY. Huelsenbeck JP, Ronquist F, Nielsen R, Bollback JP. Science 2001, 294:2310–2314.

Geoadditive Bayesian models for FORESTRY defoliation data: a case study. Musio, M. and Augustin, N.H. and von Wilpert, K. Environmetrics. 19(6), 630—642, 2008.

Bayesian statistics in GENETICS: a guide for the uninitiated. Shoemaker, J.S. and Painter, I.S. and Weir, B.S. Trends in Genetics, 15(9), 354-358, 1999.

Bayesian statistics in ONCOLOGY. Adamina, M. and Tomlinson, G. and Guller, U. Cancer, 115(23), 5371-5381, 2009.

Bayesian analysis in PLANT PATHOLOGY. Mila, AL and Carriquiry, AL. Phytopathology, 94(9), 1027-1030, 2004.

Bayesian analysis for POLITICAL RESEARCH. Jackman S. Annual Review of Political Science, 2004, 7:483–505.

The list above could go on and on. The point is simple: Bayesian methods are being adopted across the disciplines of science. We should not be laggards in utilizing Bayesian methods in our science, or in teaching Bayesian methods in our classrooms.

Why are Bayesian methods being adopted across science? Answer:

2. Bayesian methods provide richer information, with greater flexibility and broader applicability than 20th century methods. Bayesian methods are intellectually coherent and intuitive. Bayesian analyses are readily computed with modern software and hardware.

To explain this point adequately would take an entire textbook, but here are a few highlights.

* In NHST, the data collector must pretend to plan the sample size in advance and pretend not to let preliminary looks at the data influence the final sample size. Bayesian design, on the contrary, has no such pretenses because inference is not based on p values.

* In NHST, analysis of variance (ANOVA) has elaborate corrections for multiple comparisons based on the intentions of the analyst. Hierarchical Bayesian ANOVA uses no such corrections, instead rationally mitigating false alarms based on the data.

* Bayesian computational practice allows easy modification of models to properly accommodate the measurement scales and distributional needs of observed data.

* In many NHST analyses, missing data or otherwise unbalanced designs can produce computational problems. Bayesian models seamlessly handle unbalanced and small-sample designs.

* In many NHST analyses, individual differences are challenging to incorporate into the analysis. In hierarchical Bayesian approaches, individual differences can be flexibly and easily modeled, with hierarchical priors that provide rational “shrinkage” of individual estimates.

* In contingency table analysis, the traditional chi-square test suffers if expected values of cell frequencies are less than 5. There is no such issue in Bayesian analysis, which handles small or large frequencies seamlessly.

* In multiple regression analysis, traditional analyses break down when the predictors are perfectly (or very strongly) correlated, but Bayesian analysis proceeds as usual and reveals that the estimated regression coefficients are (anti-)correlated.

* In NHST, the power of an experiment, i.e., the probability of rejecting the null hypothesis, is based on a single alternative hypothesis. And the probability of replicating a significant outcome is “virtually unknowable” according to recent research. But in Bayesian analysis, both power and replication probability can be computed in a straightforward manner, with the uncertainty of the hypothesis directly represented.

* Bayesian computational practice allows easy specification of domain-specific psychometric models in addition to generic models such as ANOVA and regression.

Some people may have the mistaken impression that the advantages of Bayesian methods are negated by the need to specify a prior distribution. In fact, the use of a prior is both appropriate for rational inference and advantageous in practical applications.

* It is inappropriate not to use a prior. Consider the well known example of random disease screening. A person is selected at random to be tested for a rare disease. The test result is positive. What is the probability that the person actually has the disease? It turns out, even if the test is highly accurate, the posterior probability of actually having the disease is surprisingly small. Why? Because the prior probability of the disease was so small. Thus, incorporating the prior is crucial for coming to the right conclusion.

* Priors are explicitly specified and must be agreeable to a skeptical scientific audience. Priors are not capricious and cannot be covertly manipulated to predetermine a conclusion. If skeptics disagree with the specification of the prior, then the robustness of the conclusion can be explicitly examined by considering other reasonable priors. In most applications, with moderately large data sets and reasonably informed priors, the conclusions are quite robust.

* Priors are useful for cumulative scientific knowledge and for leveraging inference from small-sample research. As an empirical domain matures, more and more data accumulate regarding particular procedures and outcomes. The accumulated results can inform the priors of subsequent research, yielding greater precision and firmer conclusions.

* When different groups of scientists have differing priors, stemming from differing theories and empirical emphases, then Bayesian methods provide rational means for comparing the conclusions from the different priors.

To summarize, priors are not a problematic nuisance to be avoided. Instead, priors should be embraced as appropriate in rational inference and advantageous in real research.

If those advantages of Bayesian methods are not enough to attract change, there is also a major reason to be repelled from the dominant method of the 20th century:

3. 20th century null-hypothesis significance testing (NHST), with its reliance on p values, has many severe problems. There is little reason to persist with NHST now that Bayesian methods are accessible to everyone.

Although there are many difficulties in using p values, the fundamental fatal flaw of p values is that they are ill defined, because any set of data has many different p values.

Consider the simple case of assessing whether an electorate prefers candidate A over candidate B. A quick random poll reveals that 8 people prefer candidate A out of 23 respondents. What is the p value of that outcome if the population were equally divided? There is no single answer! If the pollster intended to stop when N=23, then the p value is based on repeating an experiment in which N is fixed at 23. If the pollster intended to stop after the 8th respondent who preferred candidate A, then the p value is based on repeating an experiment in which N can be anything from 8 to infinity. If the pollster intended to poll for one hour, then the p value is based on repeating an experiment in which N can be anything from zero to infinity. There is a different p value for every possible intention of the pollster, even though the observed data are fixed, and even though the outcomes of the queries are carefully insulated from the intentions of the pollster.
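
A minimal sketch of Kruschke's polling example (the p-values below are my own computation under the two stopping intentions, not numbers taken from his letter):

from scipy import stats

k, n = 8, 23                           # 8 of 23 respondents prefer candidate A

# Intention 1: the pollster planned to stop at exactly N = 23 respondents.
p_fixed_n = stats.binom.cdf(k, n, 0.5)

# Intention 2: the pollster planned to stop after the 8th respondent preferring A,
# so the p-value is the chance of needing 23 or more respondents, i.e., of seeing
# at most 7 A-preferences in the first 22 draws.
p_fixed_k = stats.binom.cdf(k - 1, n - 1, 0.5)

print(f"fixed-N intention: p = {p_fixed_n:.3f}")   # about 0.105
print(f"fixed-k intention: p = {p_fixed_k:.3f}")   # about 0.067
# Same data, two different p-values, purely because of the pollster's intention.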

The problem of ill-defined p values is magnified for realistic situations. In particular, consider the well-known issue of multiple comparisons in analysis of variance (ANOVA). When there are several groups, we usually are interested in a variety of comparisons among them: Is group A significantly different from group B? Is group C different from group D? Is the average of groups A and B different from the average of groups C and D? Every comparison presents another opportunity for a false alarm, i.e., rejecting the null hypothesis when it is true. Therefore the NHST literature is replete with recommendations for how to mitigate the “experimentwise” false alarm rate, using corrections such as Bonferroni, Tukey, Scheffe, etc. The bizarre part of this practice is that the p value for the single comparison of groups A and B depends on what other groups you intend to compare them with. The data in groups A and B are fixed, but merely intending to compare them with other groups enlarges the p value of the A vs B comparison. The p value grows because there is a different space of possible experimental outcomes when the intended experiment comprises more groups. Therefore it is trivial to make any comparison have a large p value and be nonsignificant; all you have to do is intend to compare the data with other groups in the future.
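
The same dependence on intentions shows up in a one-line sketch of the multiple-comparisons correction (here the Bonferroni adjustment, one of the corrections named above, applied to a hypothetical raw p-value): the A-versus-B data never change, yet the reported p grows with the number of comparisons the analyst merely intends to make.

raw_p = 0.03                           # hypothetical p-value for the single A vs. B test
for m in (1, 3, 6, 10):                # number of comparisons the analyst intends to make
    print(f"{m:2d} intended comparisons: Bonferroni-adjusted p = {min(1.0, raw_p * m):.2f}")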

The literature is full of articles pointing out the many conceptual misunderstandings held by practitioners of NHST. For example, many people mistake the p value for the probability that the null hypothesis is true. Even if those misunderstandings could be eradicated, such that everyone clearly understood what p values really are, the p values would still be ill defined. Every fixed set of data would still have many different p values.

To recapitulate: Science is moving to Bayesian methods because of their many advantages, both practical and intellectual, over 20th century NHST. It is time that we convert our research and educational practices to Bayesian data analysis. I hope you will encourage the change. It’s the right thing to do.

John K. Kruschke, Revised 14 November 2010, http://www.indiana.edu/~kruschke/


Mean and Median Applet --- http://mathdl.maa.org/mathDL/47/?pa=content&sa=viewDocument&nodeId=3204
Thank you for sharing, Professor Kady Schneiter of Utah State University.

This applet consists of two windows. In the first (the investigate window), the user fills in a grid to create a distribution of numbers and to investigate the mean and median of the distribution. The second window (the identify window) enables users to test their knowledge about the mean and the median. In this window, the applet displays a hypothetical distribution and an unspecified marker. The user determines whether the marker indicates the position of the mean of the distribution, the median, both, or neither. Two activities intended to facilitate using the applet to learn about the mean and median are provided.


Above all, Mr. Silver urges forecasters to become Bayesians. The English mathematician Thomas Bayes used a mathematical rule to adjust a base probability number in light of new evidence

Book Review of The Signal and the Noise
by Nate Silver
Price:  $16.44 at Barnes and Noble
http://www.barnesandnoble.com/w/the-signal-and-the-noise-nate-silver/1111307421?ean=9781594204111

 

"Telling Lies From Statistics:  Forecasters must avoid overconfidence—and recognize the degree of uncertainty that attends even the most careful predictions," by Burton G. Malkiel, The Wall Street Journal, September 24, 2012 --- 
http://professional.wsj.com/article/SB10000872396390444554704577644031670158646.html?mod=djemEditorialPage_t&mg=reno64-wsj

It is almost a parlor game, especially as elections approach—not only the little matter of who will win but also: by how much? For Nate Silver, however, prediction is more than a game. It is a science, or something like a science anyway. Mr. Silver is a well-known forecaster and the founder of the New York Times political blog FiveThirtyEight.com, which accurately predicted the outcome of the last presidential election. Before he was a Times blogger, he was known as a careful analyst of (often widely unreliable) public-opinion polls and, not least, as the man who hit upon an innovative system for forecasting the performance of Major League Baseball players. In "The Signal and the Noise," he takes the reader on a whirlwind tour of the success and failure of predictions in a wide variety of fields and offers advice about how we might all improve our forecasting skill.

Mr. Silver reminds us that we live in an era of "Big Data," with "2.5 quintillion bytes" generated each day. But he strongly disagrees with the view that the sheer volume of data will make predicting easier. "Numbers don't speak for themselves," he notes. In fact, we imbue numbers with meaning, depending on our approach. We often find patterns that are simply random noise, and many of our predictions fail: "Unless we become aware of the biases we introduce, the returns to additional information may be minimal—or diminishing." The trick is to extract the correct signal from the noisy data. "The signal is the truth," Mr. Silver writes. "The noise is the distraction."

The first half of Mr. Silver's analysis looks closely at the success and failure of predictions in clusters of fields ranging from baseball to politics, poker to chess, epidemiology to stock markets, and hurricanes to earthquakes. We do well, for example, with weather forecasts and political predictions but very badly with earthquakes. Part of the problem is that earthquakes, unlike hurricanes, often occur without warning. Half of major earthquakes are preceded by no discernible foreshocks, and periods of increased seismic activity often never result in a major tremor—a classic example of "noise." Mr. Silver observes that we can make helpful forecasts of future performance of baseball's position players—relying principally on "on-base percentage" and "wins above replacement player"—but we completely missed the 2008 financial crisis. And we have made egregious errors in predicting the spread of infectious diseases such as the flu.

In the second half of his analysis, Mr. Silver suggests a number of methods by which we can improve our ability. The key, for him, is less a particular mathematical model than a temperament or "framing" idea. First, he says, it is important to avoid overconfidence, to recognize the degree of uncertainty that attends even the most careful forecasts. The best forecasts don't contain specific numerical expectations but define the future in terms of ranges (the hurricane should pass somewhere between Tampa and 350 miles west) and probabilities (there is a 70% chance of rain this evening).

Above all, Mr. Silver urges forecasters to become Bayesians. The English mathematician Thomas Bayes used a mathematical rule to adjust a base probability number in light of new evidence. To take a canonical medical example, 1% of 40-year-old women have breast cancer: Bayes's rule tells us how to factor in new information, such as a breast-cancer screening test. Studies of such tests reveal that 80% of women with breast cancer will get positive mammograms, and 9.6% of women without breast cancer will also get positive mammograms (so-called false positives). What is the probability that a woman who gets a positive mammogram will in fact have breast cancer? Most people, including many doctors, greatly overestimate the probability that the test will give an accurate diagnosis. The right answer is less than 8%. The result seems counterintuitive unless you realize that a large number of (40-year-old) women without breast cancer will get a positive reading. Ignoring the false positives that always exist with any noisy data set will lead to an inaccurate estimate of the true probability.

This example and many others are neatly presented in "The Signal and the Noise." Mr. Silver's breezy style makes even the most difficult statistical material accessible. What is more, his arguments and examples are painstakingly researched—the book has 56 pages of densely printed footnotes. That is not to say that one must always agree with Mr. Silver's conclusions, however.

Continued in article
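
The mammogram numbers in Malkiel's review are a one-line application of Bayes' rule, sketched below with the probabilities quoted in the review:

prior = 0.01             # P(cancer) among 40-year-old women
sensitivity = 0.80       # P(positive mammogram | cancer)
false_positive = 0.096   # P(positive mammogram | no cancer)

p_positive = prior * sensitivity + (1 - prior) * false_positive
posterior = prior * sensitivity / p_positive
print(f"P(cancer | positive mammogram) = {posterior:.3f}")   # about 0.078, i.e., under 8%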

Bayesian Probability --- http://en.wikipedia.org/wiki/Bayesian_probability

Bayesian Inference --- http://en.wikipedia.org/wiki/Bayesian_inference

Bob Jensen's threads on free online mathematics and statistics tutorials are at
http://www.trinity.edu/rjensen/Bookbob2.htm#050421Mathematics