New Computer Study Purports to Detect Literary Influences on the Book of Mormon

The Internet’s been abuzz with hype this week about a new computer study by Duane and Chris Johnson which purports to identify four literary influences on the Book of Mormon (besides the King James Bible):

What’s really impressive about the Johnson and Johnson study is its scope and methodological sophistication. The study compares the Book of Mormon with a sample of more than 130,000 scanned texts published between 1500 and 1830. This sample is decidedly imperfect, containing as it does a large number of OCR errors and long s’s (which the algorithm can’t recognize), but its size is impressive, and the authors seem determined to continue working on their algorithm to compensate for its shortcomings. The methodology the authors have designed to analyze this sample isn’t perfect either, but it’s more exacting than some online detractors have given it credit for. According to the authors, the method has been refined over the course of thirty-six experiments and is still being improved.

Basically, the methodology works like this. First, the algorithm measures the number of four-word phrases that each text shares with the Book of Mormon. (The authors call these four-word phrases n-grams.) A high number of shared n-grams (relative to total word-count) indicates that a text may have some kind of significant literary relationship to the Book of Mormon. (N-grams found in the King James Bible are excluded from the analysis to prevent mutual Bible-borrowing from influencing the results.) An additional analysis, called “Iterative Source Separation” (ISS), then measures the distinctiveness of the shared n-grams. The more distinctive n-grams a text shares with the Book of Mormon, the more literary dependence is indicated. The four texts listed above are the ones that seem to share a statistically significant number of distinctive phrases with the Book of Mormon.
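To make the first step concrete, here is a minimal Python sketch of counting shared four-word phrases relative to word count, with Bible phrases excluded. It is only an illustration of the idea as I've described it, not the Johnsons' actual code: the function names, the crude tokenizer, and the toy sentences are all my own assumptions, and the ISS step is not shown.

```python
import re

def words_of(text):
    """Lowercase a text and split it into words (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())

def four_grams(text):
    """Return the set of four-word phrases (4-grams) found in a text."""
    words = words_of(text)
    return {tuple(words[i:i + 4]) for i in range(len(words) - 3)}

def shared_rate(candidate, target, excluded):
    """Shared 4-grams per word of the candidate, ignoring phrases in `excluded`."""
    word_count = len(words_of(candidate))
    shared = (four_grams(candidate) & four_grams(target)) - four_grams(excluded)
    return len(shared) / word_count if word_count else 0.0

# Toy stand-ins (invented for this example) for the Book of Mormon,
# a candidate source, and the King James Bible.
target = "and it came to pass that the people went up to battle"
candidate = "it came to pass that the people fled"
excluded = "and it came to pass"

rate = shared_rate(candidate, target, excluded)
print(rate)  # 3 surviving shared phrases / 8 candidate words = 0.375
```

In this toy case four phrases are shared, but "it came to pass" also occurs in the excluded text, so only three count toward the score.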

I do see a couple of problems with this methodology. First, it seems to me that if ISS really works as advertised, then it would not be necessary to exclude King James Bible phrases from the analysis. In fact, putting the KJV back into the analysis could be a useful way to check the effectiveness of ISS. (Update 10/29: A note on the Johnsons’ website indicates that the KJV has been put back into the analysis, so this issue has been resolved.)

First page of The First Book of Napoleon, one of four books that Duane and Chris Johnson say influenced the Book of Mormon.

Second and more importantly, I’m bothered by the fact that both The Late War and The First Book of Napoleon show up as statistically significant influencers of the Book of Mormon. Johnson and Johnson speculate that The First Book of Napoleon influenced The Late War, and that the former’s influence may have been mediated to the Book of Mormon via the latter. One would think, however, that ISS would prevent such mediated influence from showing up as statistically significant in the results. If the ISS analysis works as advertised, then it appears that Late War and Napoleon each share distinctive content with the Book of Mormon that they do not share with each other. I think it unlikely that Joseph Smith had read and been influenced by both of these highly unusual texts (which are histories of modern events written in pseudo-biblical prose). It seems more likely that the distinctive nature of what these authors were doing—writing pseudo-biblical narratives—led them to independently invent some of the same distinctive phrases and/or to independently mutate biblical phrases in some of the same distinctive ways. Since the Koran translation and David Willson’s The Rights of Christ also simulate King James prose, these texts may share distinctive material with the Book of Mormon for the same reason.
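The overlap question can be framed with simple set arithmetic. The Python sketch below is my own illustration, not part of ISS: the function name is hypothetical and the phrase labels are invented placeholders rather than real data from any of these books.

```python
def split_shared(target, source_a, source_b):
    """Partition the phrases two sources share with a target text.

    Each argument is a set of phrases. 'only_a' holds phrases that source A
    shares with the target and that do not occur in source B at all, and
    likewise for 'only_b'.
    """
    shared_a = source_a & target
    shared_b = source_b & target
    return {
        "both": shared_a & shared_b,
        "only_a": shared_a - source_b,
        "only_b": shared_b - source_a,
    }

# Placeholder phrase labels, invented for illustration only.
target = {"p1", "p2", "p3", "p4"}
source_a = {"p1", "p2", "x"}
source_b = {"p2", "p3", "y"}

result = split_shared(target, source_a, source_b)
print(result)
```

If one source's influence were purely mediated through the other, its "only" set would be expected to come out empty. Two non-empty "only" sets correspond to the situation described above, where each text shares material with the target that it does not share with the other.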

One finding of the study I do think is very interesting is that Solomon Spalding’s Manuscript Found and Mercy Otis Warren’s History of the American Revolution (two books often touted as sources for the Book of Mormon) are really no more similar to the Book of Mormon than the average book. Ethan Smith’s View of the Hebrews is a better match, but still within the normal statistical distribution for books of the period. Of course, that doesn’t necessarily mean Joseph Smith wasn’t influenced by these books. It may simply mean that shared four-word phrases aren’t good indicators of influence. It seems noteworthy in at least the case of Mercy Otis Warren’s book, though, because the case for that book having influenced the Book of Mormon has usually rested almost entirely upon lists of shared phrases collated by zealous researchers. Johnson and Johnson’s study reveals that Warren’s History doesn’t actually share more phrases with the Book of Mormon than other books of the period. Researchers, then, must have simply noticed more shared phrases in Warren’s book because it just happens to be one to which they’ve devoted a lot of energy.

[Edit to add: Actually, it occurs to me that the Johnson and Johnson method may be systematically biased toward obscure writings, whereas “mainstream” writings like Warren’s or Ethan Smith’s may be disadvantaged because they were popular. Mainstream writings are often quoted and imitated, making less of their text appear distinctive to the algorithm. There are therefore relatively fewer opportunities for a distinctive phrase match with a mainstream work than with an obscure one. To correct for this, you’d need to count shared n-grams relative to the total number of distinctive n-grams in a work rather than relative to its total word-count.]
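Here is a rough Python sketch of the correction I have in mind. It rests on an assumption of mine that may not match how ISS defines distinctiveness: a phrase counts as "distinctive" if it appears in no more than `max_docs` texts of the corpus. The function names and the single-letter "phrases" are invented for illustration.

```python
from collections import Counter

def document_frequency(corpus):
    """Count how many texts in the corpus contain each phrase."""
    counts = Counter()
    for phrases in corpus:
        counts.update(phrases)  # each text is a set, so it counts once per phrase
    return counts

def distinctive(phrases, doc_freq, max_docs=1):
    """Phrases of one text that occur in at most `max_docs` corpus texts."""
    return {p for p in phrases if doc_freq[p] <= max_docs}

def normalized_match(candidate, target, doc_freq, max_docs=1):
    """Shared distinctive phrases divided by the candidate's distinctive phrases."""
    own = distinctive(candidate, doc_freq, max_docs)
    return len(own & target) / len(own) if own else 0.0

# A popular work whose phrases were copied by an imitator, and an obscure work.
popular = {"a", "b", "c", "d"}
imitator = {"a", "b", "c", "x"}
obscure = {"p", "q", "r", "s"}
target = {"d", "p"}

doc_freq = document_frequency([popular, imitator, obscure])
print(normalized_match(popular, target, doc_freq))  # 1 of 1 distinctive phrases
print(normalized_match(obscure, target, doc_freq))  # 1 of 4 distinctive phrases
```

Both toy works share exactly one distinctive phrase with the target and have the same length, so a per-word score would rate them equally. Dividing by each work's own stock of distinctive phrases shows that the popular work has matched on the only distinctive phrase it had left.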

Overall, my assessment of the Johnson and Johnson study is similar to my assessments of most other computer studies of Book of Mormon authorship: it’s intriguing and important, but also somewhat limited and not entirely persuasive. As I’ll suggest in a follow-up post, the study’s findings may be more useful for purposes that were not in its authors’ purview than for answering the questions they were asking.


New Computer Study Purports to Detect Literary Influences on the Book of Mormon — 14 Comments

  1. My problem with these studies is how openly unobjective they are. I am willing to take your word for it on the technical scope and sophistication of the tool they’ve created.

    But at the end of the day, how are we supposed to set up a control group for this? Would this same methodology also show links between books separated by time and geography that can be demonstrated to have had little or no contact with each other?

    I suspect that it could, and I don’t imagine we’re going to see much effort by the authors to settle that issue. The whole thing smacks of a highly elaborate correlation=causation fallacy in action.

    Without a credible control group to demonstrate that this same algorithm couldn’t be used to obtain absurd results, this project remains – as far as I’m concerned – literary hobbyism, not literary scholarship.

  2. Chris,

    Thank you for the very well done post. I like how you present both the strengths and weaknesses of the Johnson word study.

    Rick Grunder first wrote about Late War in his Mormon List 32 (March, 1989), entry 17. In that entry Grunder pointed out similarities between the two books and created a side-by-side comparison of a few paragraphs. This early entry developed into a chapter in Grunder’s Mormon Parallels.

    Luckily for readers, Grunder has made his entry from Mormon Parallels about Gilbert Hunt’s book available in PDF form on his website. It can be found here:

    It does not really matter if “The Late War” supplied the Book of Mormon with anything. The whole point of the parallels is simply to prove that there is nothing very remarkable about Joseph Smith’s writings or language. If a semi-educated textbook writer like Gilbert Hunt could churn out these ideas and these hebraisms and chiasmus, merely from living in the culture of early nineteenth-century America, then Smith could do the same thing – and he did!

  3. Do you think that merely showing matches in the literary style of the day is enough to render the book “unremarkable,” Joe?

    I think it doesn’t come even remotely close to that aim.

  4. Thanks for the review Chris,

    I was very inspired by your research, which I mentioned in the presentation. I found your Urantia article fascinating. I just updated the site with our latest iterations from the ISS algorithm. The First Book of Napoleon is currently fighting for #1 spot just behind The Koran. I understand that biblical style, mixed with common mutations may be the culprit here, so we removed the biblical style as much as possible from The Book of Mormon (ye/thee/thou = you, hast = have….etc), and it didn’t make much difference in the results. We did further tests and even though I am extremely skeptical, it looks like Joseph Smith had access to The First Book of Napoleon. Before we try to figure out the odds of this happening, we have to ask, what are the odds that someone, somewhere would not have had access to a certain combination of biblically styled books (due to interest in the genre) which inspired him to write his own book of the same genre?

    When the inputs to a remix are unknown, the results appear miraculous, but when the inputs are known, it is as explainable as any other book, and in this case, it appears to be more explainable than most of the common books we are testing to establish the baseline.

    Here are the similarities in the first few pages of The Book of Mormon and The First Book of Napoleon:

    condemn not the (writing) … an account … the First Book of Napoleon … upon the face of the earth … it came to pass … the land … their inheritances their gold and silver and … the commandments of the Lord … the foolish imaginations of their hearts … small in stature … Jerusalem … because of the perverse wickedness of the people

    -Parallels in sequence, The First Book of Napoleon (from the first pages)

    condemn not the (writing) … an account … the First Book of Nephi … upon the face of the earth … it came to pass … the land … his inheritance and his gold and his silver and … the commandments of the Lord … the foolish imaginations of his heart … large in stature … Jerusalem … because of the wickedness of the people

    -Parallels in sequence, Book of Mormon / The First Book of Nephi (from the first pages)

    It’s a bit odd, and I’m still not sure if it’s really significant or not, but that’s why we are throwing lots of algorithms and CPU cycles at this problem. It’s kind of blowing my mind, but those are the results of our study so far.

    It’s all a bit clearer on the askreality site which I just updated today.

  5. Thanks for the comment, Chris. I’m glad to hear my work played some role in inspiring you to undertake this study! I think computers offer a wealth of possibilities for historical and textual studies, but I also think there are many problems and limitations for such studies to overcome, so I’m glad to see researchers like yourself working so hard to advance the field. I’ll be watching closely for new developments as your research continues. Good luck!

  6. Thanks Chris, and just between me and you, and the others reading this blog, we have a ton of volunteer researchers, scientists, programmers and geeks surfacing that are helping us launch an even better cloud based system with way more power than the first analysis we did. The new cloud platform will give even the least scientific public complete access to do their own experiments and see what inspired their favorite books. I hope that doesn’t spoil the mystery for those folks that like it that way, but for me… well, I like solving mysteries and I’m happy with whatever the truth happens to be.

  7. Personally, I’d correct that statement to say that “I’m happy with whatever the DATA happens to be.”

    Saying that you’re discovering “truth” seems more than a tad presumptuous.

  8. Pingback: The Book of Mormon and the Late War: Direct Literary Dependence?

  9. N-grams typically need to be scored by pointwise mutual information to see how relevant they are, e.g. “he went to the” vs. more distinct phrases.

  10. I had first heard of this technology when reading the book “The Secret Life of Pronouns” and I remember thinking that it would be cool if they applied those principles to the BoM, and sure enough, someone did.

    The science, as I understand it, is pretty solid. Basically, when we construct sentences, we have an idea we are trying to convey, and we will often use similar key words, but the way those key words are constructed and put together is handled by our subconscious and is very, very difficult to change. The “unimportant” words are often very important because of what they reveal about the subconscious mind beneath them.

  11. I greatly enjoyed the post on Ask Reality and find the discussion there of methodology and so forth fascinating. One thing I’ve thought about this process is that it would seem to be better at catching similarities in prose style than similarities in content. Thoughts?
