Second Assignment
Choose one of the options below and write an essay in response. Your essay should be at least three and a half pages in length
and should not exceed six pages. Use double spacing, a 12-point serif font, and one-inch margins on all sides. Your writing
is expected to be error-free and up to university standards.
For the first and third options, your essay is expected to have a thesis: a main point that the whole essay attempts to support or prove.
The essay component for the second option can be more of a research report. The second option requires, in addition to the
paper submission of the report, submission of one or more digital files. These can be emailed to the instructor or delivered in person.
A large portion of the grade will be an assessment of the written component of this assignment. The hands-on approach to text analysis
here is important and should be done well, but I will mainly be looking for intelligent reflections on what you have done with the more technical
aspects of the assignment, on the problems you encountered with them, and on the opportunities for further work.
The assignment is due on December 4th, but may be handed in up to and including January 15th without penalty. Consult the
course outline for further information about late assignments.
Note that one class meeting, on November 27th, has been set aside as an in-class workshop to help you with the work for this assignment.
Do come to class on that date if you would like one-on-one help with any of the web sites listed here, or with XML, TEI, XSLT, bigrams, or
anything else related to this assignment. You are welcome to come to class that day even if you have not begun the assignment: it might
be a good opportunity to start.
Option 1
Choose one of the following files: Emma or On the Origin of Species. Using the document you choose, explore some basic text analysis
possibilities and results. You will probably want to
focus on one or more of: word count data, word collocation data, and word distribution data. You may want to include tables and/or images (of word
distributions) in your essay. What information about the text can you retrieve? What are the benefits and limitations of digital text analysis on
this document using your approach? Can you arrive at any conclusions regarding the text using your approach? You are free to use a variety of tools,
but I recommend using one or both of: HyperPo and Voyant.
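If you would like to see what the three kinds of data named above amount to, they can be sketched in a few lines of Python. This is only an illustration, not a description of how HyperPo or Voyant work: the function names, the tokenising rule, the window size, and the sample sentence are all assumptions made for the example.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and return its words (letters, with internal apostrophes)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())

def word_counts(tokens):
    """Word count data: how often each word occurs."""
    return Counter(tokens)

def collocates(tokens, keyword, window=2):
    """Collocation data: words found within `window` tokens of each keyword occurrence."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == keyword:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts

def distribution(tokens, keyword, segments=10):
    """Distribution data: keyword occurrences in each of `segments` equal slices of the text."""
    bins = [0] * segments
    n = len(tokens)
    for i, tok in enumerate(tokens):
        if tok == keyword:
            bins[i * segments // n] += 1
    return bins

# Invented sample sentence (an assumption for illustration, not a quotation).
sample = "Emma was happy. Emma was clever, and Emma was rich."
tokens = tokenize(sample)
print(word_counts(tokens).most_common(3))
print(collocates(tokens, "emma", window=1).most_common(3))
print(distribution(tokens, "emma", segments=2))
```

Even this small sketch raises questions worth discussing in your essay: what counts as a word, how wide a collocation window should be, and how many segments a distribution should use.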
Option 2
Prepare a TEI-XML edition of the poem “After Ten Years: He” as published here:
B., A. L. “After Ten Years: He.” The Cornhill Magazine 22 (December 1870): 714.
Note: use only the last part of the whole poem, that is, the last three stanzas. This can be found on Google Books:
books.google.com/books?id=k2UJAAAAQAAJ
Your edition should be well-formed XML and valid TEI. You may use the empty TEI shell.
Your edition should include information for all required elements of TEI and should employ the <lg> and <l> tags exhaustively.
Your edition should also employ three or more instances of each of the following tags: <rhyme>, <w>, <seg>, and <interp> (each with
appropriate attributes). If you wish, you may employ more TEI tags and attributes. You may test your TEI-XML, if you wish, with the
TBE validation service. Your essay will be a discussion of your choices (in tagging and
any other decisions) and the advantages and limitations of transforming the poem into the edition you have produced. If you wish, you may also produce
one or two XSLT files that transform your XML for the purpose of display: include an explanation of these files in your report.
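Before submitting, you may find it useful to check your file locally. The Python sketch below is one possible way to do so, using only the standard library: it confirms that the XML is well-formed and counts how often each tag is used, so that you can see whether the three-instance minimum above has been met. It does not test TEI validity, which needs a validator working against the TEI schema. The function names and the placeholder stanza are assumptions made for the example, and the placeholder lines are not the text of the poem.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# The TEI namespace, in the {uri} form that ElementTree prefixes to tag names.
TEI_NS = "{http://www.tei-c.org/ns/1.0}"

# Minimum number of instances asked for in this option.
REQUIRED = {"rhyme": 3, "w": 3, "seg": 3, "interp": 3}

def count_tei_tags(xml_text):
    """Parse the XML (raising ET.ParseError if it is not well-formed)
    and count elements by tag name, with the TEI namespace removed."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for element in root.iter():
        counts[element.tag.replace(TEI_NS, "", 1)] += 1
    return counts

def missing_minimums(counts, required=REQUIRED):
    """Return the tags that occur fewer times than required, with their current counts."""
    return {tag: counts[tag] for tag, minimum in required.items() if counts[tag] < minimum}

# Placeholder fragment (invented for illustration; not the poem's text).
sample = """<lg xmlns="http://www.tei-c.org/ns/1.0" type="stanza">
  <l n="1">Placeholder first <rhyme label="a">line</rhyme></l>
  <l n="2">Placeholder second <rhyme label="a">line</rhyme></l>
</lg>"""

counts = count_tei_tags(sample)
print(dict(counts))
print(missing_minimums(counts))
```

To check a whole file rather than a string, read the file's contents and pass them to count_tei_tags. A file that passes this check may still be invalid TEI, so the validation service remains the better test.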
Option 3
Analyse the grapheme bigrams of two to four of the following texts: Emma, Anne of Green Gables, Tom Sawyer, and On the Origin of Species.
Use the Bigraphemes tool to assemble your data. Write an essay that characterises the data and attempts to draw some conclusions from it.
What information about
the texts can you retrieve? What are the benefits and limitations of graphemic bigram analysis on these documents using your approach? Can you arrive
at any conclusions regarding the texts using this approach?
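To make the idea of a grapheme bigram concrete, here is a minimal Python sketch. It assumes that a grapheme bigram is a pair of adjacent letters within a single word, with case ignored and with punctuation, digits, and spaces left out. The Bigraphemes tool may define and count bigrams differently, so treat this as an illustration and not as a description of that tool. The function names and the sample phrase are invented for the example.

```python
import re
from collections import Counter

def grapheme_bigrams(text):
    """Count pairs of adjacent letters inside each word (assumed definition:
    lowercase a-z only, no pairs spanning spaces or punctuation)."""
    counts = Counter()
    for word in re.findall(r"[a-z]+", text.lower()):
        for first, second in zip(word, word[1:]):
            counts[first + second] += 1
    return counts

def relative_frequencies(counts):
    """Convert raw counts to proportions so that texts of different lengths can be compared."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {bigram: n / total for bigram, n in counts.items()}

# Invented sample phrase (an assumption for illustration).
counts = grapheme_bigrams("The theme")
print(counts.most_common())
print(relative_frequencies(counts))
```

Relative frequencies are used here because the texts differ greatly in length, and raw counts alone would make the longer text appear to favour every bigram.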