Markup policy and TEI tag usage in the electronic edition of Robert Boyle's Work-diaries
By Charles Littleton
Introduction
This document is divided into two parts. The first sets out how the editors of the Boyle Work-diaries project would like to see the material we have worked on presented on the Web, i.e. its various components and their relations. The second part is a more detailed description of the way in which I have used each of the principal TEI elements in these files. For each element it discusses its function in the file structure, the formatting we would prefer for it, and its attributes and the way they have been used. This last topic is quite detailed, as the attributes contain much data and many formatting instructions.
Web format and layout of the work-diaries
This is summarized in the section on Web presentation found in the separate Editorial Policies document connected to these files. We reproduce it here, but add comments for programmers and Web formatters, which are distinguished by being within parentheses and italicised. The section reads:
Normalized and Diplomatic versions
Readers are offered a 'clean', normalized, and easily readable transcript of an entry which exists in parallel with a 'diplomatic' version which includes all the emendations to which the entry was subject. In the normalized version of the work-diary, deletions and alterations to words in the original text have been omitted, unfamiliar abbreviations expanded and Roman numerals expressed as Arabic numbers. Places where there are deletions in the original (marked with ) are marked in the normalized text by a small red 'd' in brackets, and places where there are missing letters (marked with ) are marked by a red ellipsis in brackets; other words for which there is expanded textual commentary are also in red typeface (these include text marked with , with 'hand', 'place' or 'rend' attributes, and ). Similarly, abbreviations and Roman numerals which have been expanded or modernized are marked in green (marked with and ). These colours indicate that such words or sigla are hypertext links, clicking which will take the reader to the parallel 'diplomatic' version of the text, in which all the textual emendations made to the text are noted. The editorial commentary provided in the diplomatic version is written in italics, within square brackets and in a smaller font than the surrounding authorial text (the data in these editorial notes is held in the attribute values of the relevant elements). Square brackets around words of the size of the surrounding text, in both the normalized and diplomatic versions, indicate that that word is an uncertain reading or is a place where there are missing letters in the original supplied by the editors by context - unless there is a note in the entry's editorial notes indicating that the square brackets are the author's own (marked by and elements).
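The relation between the two versions described above can be illustrated with a minimal sketch in Python. It is not the project's own software. The element names 'del' and 'abbr', the sample sentence, and the wording of the bracketed note are all assumptions made for illustration; only the 'expan' attribute and the bracketed 'd' come from the policy itself. Colours, italics and hyperlinks are left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the element names 'del' and 'abbr' are assumed.
SAMPLE = (
    '<p>The <del>blew</del> blue '
    '<abbr expan="spirit">spt</abbr> of wine</p>'
)


def render(elem, mode):
    """Return the text of elem for mode 'normalized' or 'diplomatic'."""
    parts = [elem.text or ""]
    for child in elem:
        parts.append(render_child(child, mode))
        parts.append(child.tail or "")
    return "".join(parts)


def render_child(child, mode):
    """Render one child element according to the requested version."""
    inner = render(child, mode)
    if child.tag == "del":
        # Normalized: a bracketed 'd' stands in for the deletion.
        # Diplomatic: the deleted content appears in an editorial note
        # (the exact wording of the note is an assumption).
        return "[d]" if mode == "normalized" else f"[deleted: {inner}]"
    if child.tag == "abbr":
        # Normalized: the expansion; diplomatic: the form as written.
        return child.get("expan", inner) if mode == "normalized" else inner
    return inner


root = ET.fromstring(SAMPLE)
print(render(root, "normalized"))
print(render(root, "diplomatic"))
```

In a real stylesheet each of these substitutions would also carry a link to the corresponding place in the other version, which is the feature the policy treats as essential.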
Unclear or supplied text in red in the normalized text indicates that the corresponding word in the diplomatic version will give further information on the causes for its illegibility and will state whether the word is merely unclear or is supplied (contained in the 'reason' attribute of the and elements). The diplomatic and normalized texts are linked both ways and merely clicking on the bracketed and italicised editorial note in the diplomatic text will take the reader back to the same place in the normalized text. Readers should note, though, that links in the diplomatic version are not indicated by different colours, but are in the same black typeface as the surrounding text. (Obviously, it is up to the individual programmer and designer to determine how the sigla indicating deleted or missing words, unclear and supplied readings, expanded abbreviations, etc. are to be represented. We are not wedded to red or green words, etc. However, we do see the linking capability between normalised and diplomatic texts as the key feature of this edition and indeed all the markup policies have been geared primarily for that purpose.)
Other links and searching aids
We have also included other hypertext links for relevant words within the text - particularly names, places, and books referenced - which are linked to a separate biographical and topographical register (which can also be accessed independently). (These are indicated by the element with the value of its 'type' attribute indicating whether it is a biographical (type="person"), bibliographical (type="author") or topographical (type="place") reference; all these are to be linked to the biographical register (Register.xml) by means of links. The relevant link is indicated by the value of the 'key' attribute in , which is the value of the corresponding entry's 'id' value in the register (even though, unfortunately, 'key' is not an attribute of IDREF content)). Boyle was frequently indirect when naming his sources, often just referring to 'An eminent physician whom I know'. Where we have been unable to trace the identities of such mysterious figures, the description in the transcription will not be highlighted. Occasionally we have been able to ascertain who these people are from knowledge of Boyle's life and contacts or from other clues in the text. In these cases, the description is highlighted and the corresponding note will explain how we have determined the actual named identity for Boyle's circumlocution. (Once again, it is up to the individual programmer to determine how to indicate the links of these elements. They should, however, be distinguished in some way from the sigla for the diplomatic transcription discussed above. We also ask that hypertext links, although they may be differently coloured, not be underlined, as is usually done on other web pages. This would confuse the reader as to Boyle's own original underlining practices.)
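Because 'key' is not of type IDREF, an XML validator will not report a key that has no matching 'id' in the register, so a programmer may wish to check the links separately. The following is a minimal sketch of such a check in Python. The element name 'name', the structure of the register, and the sample keys are hypothetical; only the 'type' and 'key' attributes, the register's 'id' values and the file name Register.xml are taken from the description above.

```python
import xml.etree.ElementTree as ET

# Hypothetical register and entry fragments, for illustration only.
REGISTER = (
    '<register>'
    '<entry id="EXAMPLE1"><head>Example Person</head></entry>'
    '</register>'
)
ENTRY = (
    '<p>told me by <name type="person" key="EXAMPLE1">Mr E.</name> at '
    '<name type="place" key="NOWHERE">a town</name></p>'
)


def resolve_links(entry_xml, register_xml):
    """Pair each tagged reference with its register target, if any.

    Returns a list of (text, type, target) tuples; target is None when
    the key has no matching id in the register.
    """
    register = ET.fromstring(register_xml)
    known_ids = {el.get("id") for el in register.iter() if el.get("id")}
    links = []
    for el in ET.fromstring(entry_xml).iter("name"):
        key = el.get("key")
        target = f"Register.xml#{key}" if key in known_ids else None
        links.append(("".join(el.itertext()), el.get("type"), target))
    return links


for text, kind, target in resolve_links(ENTRY, REGISTER):
    print(text, kind, target)
```

References whose target is None could be reported to the editors rather than rendered as links.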
We intend that each entry should also be linked with certain 'keywords' which describe its nature and content, although not necessarily appearing in the entry itself (and therefore failing to produce the entry in a regular word search). We have attached keywords to a few sample entries, but have decided that developing keywords for these entries is a long-term project, due to the variety of topics covered in these notes and the complexity of Boyle's chemical thought (or opacity of his language). We believe that this could be the sort of joint scholarly endeavour which electronic technology is now facilitating; we would like the site to become a place of scholarly and scientific exchange as readers discuss the significance of these entries and the concepts which are most central to them. We hope, then, that readers will assist us in developing the keywords for these entries by contacting us with their ideas for pertinent categories and concepts to incorporate; please email Prof. Michael Hunter at m.hunter@bbk.ac.uk with your views and ideas. (This is obviously a long-term project which we have barely begun. We intend to use the element to indicate keywords. It is not something to worry about at the moment, though.)
Structure of the Web version of the transcription
The work-diary transcription has both an Editorial Introduction and the transcriptions of the entries themselves. The Editorial Introduction provides general information on the work-diary as a whole (all this information is contained in the , as follows; we do ask that readers have access to it somehow when consulting a work-diary): the title of the work-diary (in , in ); a brief description of its content (in ); notes on the page format on which its entries are recorded (in ); its date of composition (as accurately as that can be ascertained) (in ); the hands which contributed to its composition (with a detailed list of which of the numbered entries are ascribed to them) (in , with the name of each hand in the 'scribe' attribute of each subsidiary element, and the entries ascribed to him in the 'character' attribute); the manuscript reference, with Boyle Papers volume and page and folio numbers (in the element of ); the languages used, with details of how many entries are in a particular language (in the element; the name of each language is in the subsidiary elements, and the number of entries ascribed to it in the 'usage' attribute); its length, expressed as the number of entries contained in the work-diary (in ); and general editorial notes and commentary on the work-diary, often detailing problems with its interpretation or transcription (in ; often there will be many of these, each with its own 'id' values to be referred back to by entries in the work-diaries; on some occasions I have even opted to make a subsidiary of points under a larger ). (The above describes all the data relevant to each work-diary in the ; other data contained there is generally repeated for each work-diary, and contains information on the editorial aspects of the work-diary. In particular, the information in apart from itself, is not of direct interest to the reader, but should somehow be accessible so that readers can know who funded the project.
and , if I even decide to include that data, is more for the benefit of programmers.)
The basic unit of each work-diary is the entry, usually a short paragraph (sometimes only a sentence) detailing an experimental account or an anecdote illustrating a natural phenomenon (in the element, the basic structural unit of the work-diary files). In this edition, all entries consist of at least two parts: editorial notes () and the text of the entry itself (). In addition there may be two other sections, if their content is present in the original: a list of marginal notes written at the time of the composition of the entry itself, and thus 'integral' to it ( whose type attribute is marked 'integral'); and a list of retrospective endorsements and notes (see above in the section on Titles and Marginalia) ( whose type attribute is not marked 'integral'). The editorial notes introduce the entry and include: the number of the entry, if assigned to it by the editors (as opposed to authorial numbers, about which see below) (, every entry has a number assigned to it, whether it is editorial or authorial); general notes on features of the entry (); the hand in which the entry is written (derived from the immediately preceding , which are placed within ); an estimate of the date at which the entry was composed, based on evidence of handwriting and dating evidence found elsewhere in the manuscript (derived from either the found in the entry itself, or in the last preceding and the next following); and the full reference to the entry (including Boyle Papers volume and page or folio number) (derived from the ed (the BP volume) and n (the page or folio number) attributes of the last preceding element). The editorial notes are followed by a list of the marginal notes integral to the entry, if there are any.
These are listed under the categories 'date' (where the scribe provided the date of composition or performance of the experiment) (); 'number' (where he provided a number for the entry) (); 'title' (where he provided a brief description of the content of the entry) (); 'reference' (where the bibliographic details of a work referenced are provided) (); and 'note' (for any miscellaneous or stray marginal memoranda that appear to be in the hand and writing medium of the original scribe) (). This section is followed by a list of the retrospective marginal endorsements. Like the integral notes, these include such categories as 'number', 'title', 'date', 'reference' and 'note' (exactly like the authorial integral notes listed above, except without the prefix 'integral/' in the type attribute) and also include the additional categories 'endorsement' (for those marginal notes which indicate with which of Boyle's more formal writings the entry is to be associated) () and 'mark' (for the many ticks, crosses, circles, etc. which appear in the margins) (). Details on the writing media and location of these marginal endorsements are included next to the text of the endorsements in slightly smaller square brackets (but not italicised) (these are in the place, hand and rend attributes to each ; the place attribute is always present, but is usually 'margin', although sometimes there is additional information detailing the exact location of the note in the margin; when the hand attribute is not present, it is because we cannot make a positive identification of the hand; when the rend attribute is not present, the default of 'ink' should be assumed).
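The defaults just described for the details printed beside each marginal endorsement can be sketched as follows in Python. This is an illustration only: the element name 'note', the sample content, the hand identifier 'h1' and the exact wording of the bracketed details are assumptions; the 'place', 'hand' and 'rend' attributes and the default of 'ink' come from the description above.

```python
import xml.etree.ElementTree as ET


def note_details(note):
    """Build the bracketed details shown beside a marginal note."""
    # 'place' is said to be always present; 'margin' is used here only
    # as a fallback (an assumption) in case it is missing.
    place = note.get("place", "margin")
    # No 'hand' attribute means the hand could not be identified,
    # so nothing is reported for it.
    hand = note.get("hand")
    # No 'rend' attribute means the default writing medium, ink.
    rend = note.get("rend", "ink")

    parts = [place]
    if hand:
        parts.append(f"hand {hand}")
    parts.append(rend)
    return "[" + ", ".join(parts) + "]"


pencil_note = ET.fromstring(
    '<note type="title" place="margin" rend="pencil">Of colours</note>'
)
inked_note = ET.fromstring('<note type="mark" place="margin" hand="h1"/>')
print(note_details(pencil_note))
print(note_details(inked_note))
```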
These three sections are followed by the entry text itself. It is important to set out briefly here the principles that have guided us in our transcriptions of the entry texts. This section will not include a discussion of the rationale behind these policies, which can be found in the sources listed above (i.e. in sources listed in this section of the document that appears in the Editorial Policy document; I did not think it was necessary to list them all here).
Editorial and transcription policies
Spelling and Punctuation
Spelling and punctuation have been rendered as they appear in the manuscript and have not been normalized. Where this may lead to confusion an editorial annotation has been supplied to provide a 'modernized' form (using the element, although this is very rare, and we have not decided how it is to be employed; for the time being it should be ignored). We have however kept this to an absolute minimum and note where such instances occur. Certain letters in the original have been transcribed following modern standards. Thus 'ff' is 'F' and long 's' is the modern 's'. For the interchangeable letters 'u' and 'v', the former letter is rendered when a vowel is used in the modern, printed, form of the word and the latter where a consonant would be used. The same rule applies to the equally interchangeable letters 'i' and 'j'. In the manuscripts, quantities are often expressed by Roman numerals, in which the last in a series of 'i's (i.e. '1') is written as a 'j'. In the transcriptions this final 'j' is also transcribed as 'i'. Ampersands, &, are retained throughout and not modernized, as this is still a commonly recognized symbol.
Textual emendations
We consider here four principal types of textual emendation: insertions (); deletions (); replacements (where an inserted word is apparently replacing a corresponding deleted word) (, which contains one and one element); and alterations (where a letter or letters of a word are changed in composition) (). All insertions to the text are retained in the transcription and are surrounded by angled brackets, < and >. If an insertion is in the margin or placed in the line, this is indicated by an italicised editorial note in square brackets immediately following the insertion (in the diplomatic version) (in the place attribute of the element, with value of either 'margin' or 'line'). If there is no editorial note following the insertion (and it is not in red in the normalized version), it should be assumed that the insertion is supralinear, the default status for insertions. Where an insertion replaces a deleted text the insertion is provided within angled brackets and the content of the deleted text given within following italicised editorial brackets (the element, created specifically for this project; always has as children one element for the replacing word and one element for the replaced word). Deleted and altered words and passages are recorded and their content or process of alteration explained within following italicised editorial brackets (the content of deleted passages is the content of the element; the description of the alteration which a word has undergone is provided in the sic attribute of the element, which element surrounds the word in its final, altered form). All the explanatory editorial notes are found in the diplomatic version, with links to and from the normalized version.
Abbreviations
Boyle and his various amanuenses used a host of different abbreviations in their writing. Abbreviations can be categorized under three types: suspension, where the first letters only of a word are provided; superscription, where the first letters are provided and the terminal letter or letters are written in superscript; or brevigraph, where missing letters are indicated by a sign or a mark.
Suspensions are transcribed as they appear in the original text. Occasionally, for more obscure suspensions, the normalized text will contain the expanded form of the word, appearing in green to indicate that there is a link to the original suspension in the diplomatic version (expansions of abbreviations are provided in the expan attribute of , which element surrounds the abbreviated form). However, there are so many of these suspensions, and most of their expansions are so obvious or commonplace, that we have not systematically expanded every single instance, and trust readers will be able to determine most of the expansions themselves. We have not expanded initials used to designate people; instead, where the identity of the person is known, a hypertext link at the initial will take the reader to the corresponding entry in the biographical register (people so identified are tagged by , with the key attribute providing the link to the ID reference in the biographical register). In one other particular case, we have purposefully abstained from expanding suspensions. The recipes in Latin often contain many suspensions, with no indications as to what the proper inflected endings of these words should be (see in particular work-diary X). We have not supplied the endings, because we would have had to make assumptions as to the grammatical role of the words based on very little contextual evidence. More importantly, contemporaries themselves habitually thought of the ingredients and processes found in these recipes in terms of their common abbreviations. The abbreviated form was thought of as the word itself, and Boyle and his colleagues did not necessarily consider these terse words merely as shorthand alternatives to longer words.
Thus we have not attempted to expand the majority of abbreviations in Latin recipes, except for a few specific terms of art, mostly those which are included in the basic lexicons of medical terminology, whose abbreviations are so short, usually comprising a single letter, that it would otherwise be impossible to determine their sense even by context (e.g. 'satis quantum' for s.q., or 'satis vis' for s.v.).
Most superscriptions and brevigraphs have been silently expanded, with no notice being taken in the markup of their original form (occasionally though, during those times when I felt it was important to record everything, I marked such expanded abbreviations with the element, with a description of the abbreviated form in the abbr attribute, often using entities to represent superscripted letters; for the moment, though, the data in the abbr attribute of is to be ignored and the content of the element processed as normal. At a future date we may decide what to do with these elements, but as they were tagged very inconsistently, their scholarly value is diminished and I am only maintaining them as a matter of record). The most common types of abbreviations that appear in the work-diaries, and our methods of expanding them are as follows:
- Thorn ('y') with superscript has been expanded as 'th' with the supplied middle letters and lowered superscript letters. The most common occurrences are: 'the' (ye); 'that' (yt); 'them' (ym); 'their' (yr).
- W with superscript has been expanded as 'w' with the supplied middle letters and lowered superscript letters. The most common occurrences are: 'which' (wch); 'with' (wth); 'what' (wt); 'whether' or 'where' (depending on context) (wr); 'when' (wn); 'where' (wre).
- Superscript r is lowered and the omitted letters supplied. The most common occurrences are: 'our' (or); 'Remember' or 'Receiver' (depending on context) (Rr); 'colour' (color); 'November', et al. (Novemr), etc. Mr, Mrs, Dr, etc. have had their superscripted letters lowered, but the abbreviations 'Mr', 'Mrs', and 'Dr' have been maintained as these abbreviated forms are still in common use ('Monsr' and other non-English titles are expanded, in this case to 'Monsieur').
- Superscript t(s) is lowered and the omitted letters supplied. The most common occurrences are: 'about' (abt); 'spirit(s)' (spt(s)); 'experiment(s)' (expt(s)); 'part(s)' (pt(s)); 'argument(s)' (argut(s)); 'account(s)' (acct(s)); 'great' (grt), etc.
- Superscript d is lowered and the omitted letters supplied. This is a rare construction and is usually used to form past tenses of verbs, e.g. 'answered' (answer:d); 'informed' (informd), etc.
- Superscript p is lowered and the omitted letters supplied. This is a very rare construction and only really occurs with two forms: 'Lordship' (Lo:p) and 'Ladyship' (La:p).
- Tilde or macron denoting an ending in 'm' or 'n' is expanded, e.g. frō for 'from'; menstruū for 'menstruum'. This is frequently used in Latin texts as well for inflected endings with m.
- Tilde over 'on' denoting an ending in 'tion' is expanded, e.g. respiraōn for 'respiration'; exhalaōn for 'exhalation'; transmutaōn for 'transmutation'.
- An underscore or ligature under 'p' is expanded to 'pre', 'pro', 'per' or some other similar form. This is most often used in Latin recipes for the word 'præparatus'. This form can also be done (depending on the amanuensis) with a superscript r, e.g. prtended for 'pretended'.
- In Latin texts, the ligature connected to q to denote the enclitic 'que' is accordingly expanded.
- In Latin texts, a terminal ligature descending below the line of writing is expanded to 'ibus'.
- In Latin texts, a terminal ligature ascending above the word is expanded to 'us'.
- In Latin texts, a tilde over terminal letters is, depending on context, expanded to 'tur' or 'ntur'.
Where it cannot be determined what the expanded form of the abbreviation should be it has been transcribed as it appears in the original, complete with superscript.
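The expansions listed above were made by hand during transcription, but the rules for the commonest forms can be summarized as a lookup table. The following Python sketch is only an illustration of those rules. It assumes that the caller already knows the token is a superscripted abbreviation (the superscript itself is not represented here), and it covers only some of the forms given in the list. The two forms whose expansion depends on context are deliberately left unexpanded.

```python
# Abbreviated form (superscript letters lowered) -> expansion.
# A partial table taken from the list above.
EXPANSIONS = {
    "ye": "the", "yt": "that", "ym": "them", "yr": "their",
    "wch": "which", "wth": "with", "wt": "what", "wn": "when",
    "wre": "where", "abt": "about", "grt": "great",
    "Monsr": "Monsieur",
}

# Forms with more than one possible reading; these need an editor.
AMBIGUOUS = {
    "wr": ("whether", "where"),
    "Rr": ("Remember", "Receiver"),
}

# Titles kept in their abbreviated form, with superscripts lowered.
KEPT = {"Mr", "Mrs", "Dr"}


def expand(form):
    """Return the expansion of a superscript abbreviation.

    Forms that are ambiguous, deliberately kept, or unknown are
    returned unchanged, as the policy transcribes them as they stand.
    """
    if form in AMBIGUOUS or form in KEPT:
        return form
    return EXPANSIONS.get(form, form)


print(expand("wch"))
print(expand("wr"))
```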
Where Boyle or the amanuensis combines Arabic numbers and Latin or English superscripts (e.g. 5th, 2ly, 7ber, 9es, 4er, etc.) the characters are transcribed as they appear in the manuscript, with the superscript maintained (thus the content of the elements in the texts should be superscripted, preferably in a slightly smaller font than the surrounding text). If the meaning is not immediately apparent, an expanded form will be supplied in the normalized version, linked to its corresponding place in the diplomatic version (i.e. the third, fourth and fifth examples above would be given in the normalized versions as 'September', 'nones' and 'quater'; however 5th and 2ly and similar constructions are deemed so common and obvious as not to require expansion) (the expanded versions of such number-superscript forms are of course provided by the expan attribute of the element which surrounds the abbreviation).
Symbols
The many chymical symbols Boyle and his amanuenses use in their work present other problems. As the meaning of the symbols beyond the most common (iron, copper, mercury and silver) may not be known to most readers the symbols have been transliterated into English text and their literal definition placed between curly brackets (in the texts the chemical symbols are represented as entity references, the full definitions of which, i.e. the transliterations within curly brackets, are provided in the entity file BoyleEntities.ent. We have expressed them as entity references to aid future flexibility. If we are ever able to develop a complete set of characters for these symbols we may wish to substitute them for the present transliterations). Throughout, the contemporary terms for substances and processes are used, those similar to the terms Boyle uses in his literal descriptions of these substances. Thus the symbol C.C. is transliterated as {hartshorn}, the term Boyle commonly uses in other cases, and not 'ammonia'; AF is {aqua fortis} and not 'nitric acid', and so on.
Where these symbols appear with terminal superscripts, the curly bracketed transliteration has been maintained, to signify that there is a symbol at this place, followed by the terminal superscript. This may not make immediate sense to the reader, as, for instance, when Boyle's intended meaning of '♂ial' (i.e. 'martial') would be thus rendered as '{iron}ial'. Even more troublesome are the instances where the symbols are used in Latin texts with terminal superscripts providing the proper Latin inflected ending. Thus ♂is (i.e. 'martis') would be rendered as '{iron}is'. In such cases we supply an expanded form of the symbol-superscript construction in the normalized text, which is of course linked to the original form in the diplomatic version (as usual, this expanded form for the normalized version is provided in the expan attribute of the element which contains the symbol-superscript construction; the original construction, complete with raised and smaller superscripted letters, is to go in the diplomatic version). We do consider it important, though, to use the transliterations as place-holders to signify the location of symbols in the diplomatic texts, even if that renders reading them more difficult. Eventually we hope to have access to a complete character set of these symbols whereby we can actually represent them electronically, as well as provide a literal rendering.
Symbols are also used to signify the common measures in recipes - pounds, ounces, drachms and scruples. In the original text these symbols are usually written before the quantity itself, which is usually written as small Roman numerals joined in a cursive style to the unit symbol. The pound, ounce, drachm and scruple symbols are similarly transliterated and the quantity maintained in its position after the measure and written in Roman numerals. Thus the transcription {ounce} iii signifies what would be in modern writing '3 ounces'. In the normalized version the Roman numeral in the original is replaced by its Arabic equivalent (the Arabic equivalent is given in the value attribute of the element which contains the Roman numeral; we have not yet decided definitively whether we want to provide readers with the Arabic numbers or whether we expect them to read the Roman numerals in both the diplomatic and normalized versions), but the transliterated unit of measurement still appears before the quantity. Furthermore, there is also an old sign for ½ that often appears with these measurements - ß, or two long 's's, presumably standing for 'semi'. This will be rendered as ; in the transcriptions.
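The Arabic equivalent of a quantity can be derived mechanically from the Roman numeral. The Python sketch below is an illustration under stated assumptions, not the project's own code: it treats a terminal 'j' as 'i', following the transcription rule for Roman numerals, and it assumes the transliterated unit and the numeral are separated by a single space, as in the example {ounce} iii. The function names are hypothetical.

```python
ROMAN_VALUES = {
    "i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000,
}


def roman_to_int(numeral):
    """Convert a Roman numeral to an integer.

    A final 'j' (the manuscript form of the last 'i') counts as 'i'.
    """
    letters = numeral.lower()
    if letters.endswith("j"):
        letters = letters[:-1] + "i"
    total = 0
    for index, letter in enumerate(letters):
        value = ROMAN_VALUES[letter]
        following = (
            ROMAN_VALUES[letters[index + 1]]
            if index + 1 < len(letters) else 0
        )
        # A smaller value before a larger one is subtracted (e.g. iv).
        total += -value if value < following else value
    return total


def normalize_measure(text):
    """Replace the Roman quantity after a unit with its Arabic value.

    The unit stays in front of the quantity, as in the normalized text.
    """
    unit, numeral = text.rsplit(" ", 1)
    return f"{unit} {roman_to_int(numeral)}"


print(roman_to_int("iij"))
print(normalize_measure("{ounce} iii"))
```

The result of roman_to_int is the kind of number held in the value attribute described above; whether readers are shown it remains an editorial decision.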
The marks - slashes, crosses, lines, stars/asterisks, etc. - that often stand next to entries are also considered 'symbols' that are expressed by transliterations within curly brackets. We have taken this approach rather than try to express them typographically from the existing character sets (see BoyleEntities.ent for a list of these entity references and their definitions). Finally, the symbol for recipe, ℞, is here represented as {Rx}. Even though the symbol does exist in the character sets, its typeface did not seem to mesh very well with the remainder of the text, so we have opted for another straightforward transliteration.
Element List
The following provides a list of the elements most commonly used in the section of the files, with a description of how they are used, their attributes and range of attribute values, and their preferred formatting. Much of this information has already been covered in the commentary to the editorial notes above. Please note that this list does not discuss elements in the
An abbreviation which we want to display as it stands in the diplomatic version while providing an expansion of it in the normalised version. The element itself contains the abbreviated form, usually incorporating a superscript ( ). Also used to provide glosses on forms consisting of combinations of characters (chemical and number symbols) and letters, often in Latin.
Attributes
- expan: contains the expansion of the abbreviation which is to be provided in the normalised version, but linked to its corresponding abbreviation in the diplomatic version
A word or words inserted into the text during Boyle's lifetime (i.e. not editorial annotations by later editors such as Wotton or Miles).
Attributes
- place: where the insertion occurs. Where there is no place attribute the insertion is interlinear, which is the default value, and marked (usually) by a caret. Place usually then has one of two values: 'margin' where the insertion is located in the left-hand margin; and 'line' where the insertion is made in-line (we can determine whether it is an insertion by differences in ink or hand). The data of this attribute should be provided along with the insertion in the diplomatic version.
- hand: the hand in which the insertion is written, where it is different from the hand writing the rest of the entry (which is determined by the identity of the new attribute in the ). The value of the hand attribute should be of type IDREF, referring to one of the hand IDS listed in in the . The data of this attribute should be provided with the insertion in the diplomatic version.
- rend: the writing medium in which the insertion is written, when different from the surrounding text, whose default value is ink. Values are either: 'pencil' or 'red' (i.e. a red crayon or pencil often used in these entries). The data of this attribute should be provided along with the insertion in the diplomatic version.
Words altered during composition of the text (as distinct from words completely struck out and replaced, which are dealt with using and tags, contained within the parent element). Not only does this tag indicate words whose individual letter or letters are altered, usually by overwriting, but it also gives details of individual letters inserted or deleted within words. We have taken this route rather than placing and tags within words, whose formatting could be disconcerting to the reader.
Attributes
- sic: a description of the alteration that has taken place. NB contrary to the TEI Guidelines the value of the sic attribute is not the word as it appeared when unaltered, but is a description of the specific process of alteration, usually taking the form of 'letter A altered from letter B' or some such. Some descriptions can be longer. We have taken this route because the text of the sic attribute is intended to be similar to the footnotes in the editions of Works and Correspondence which provide a narrative of the composition process. Thus the value of the sic attribute should be provided with the final altered form in the diplomatic version.
- resp: the agent responsible for the alteration. By default EVERY is marked resp="author" to indicate that we are recording an alteration made by the original compilers of the entries and are not imposing our own 'corrected' words on the transcription (where we do feel the need for some corrective editorial intervention we transcribe the uncorrected word exactly, but mark it with , although this is very rare). We have not used the resp attribute to mark individual scribes or amanuenses who make the alterations as that would be next to impossible to determine.
Word or words deleted from the text. The content of the deleted passage is to be included with the diplomatic version.
Attributes
- hand: extremely rare, and only in cases where the deleted word appears to be in a different hand from the surrounding text. NB. contrary to the TEI Guidelines, the hand attribute on the very rare occasions where it is used refers to the hand of the deleted word NOT the hand which performed the deletion, which would be next to impossible to determine.
Self-contained unit of the work-diary, such as a section or an entry. Attributes- type: the level of unit of the work-diary. There are two in use in the work-diaries: 'section' and 'entry'. Every work-diary has at least one section, within which all the many entries are contained; some work-diaries have several sections. The 'section' type is used primarily to differentiate between headings used for entire work-diaries or sub-sections of work-diaries and headings used in individual entries. Thus a heading that appears immediately within a section can be formatted differently (i.e. larger) from one appearing immediately within an entry. Sections can be used to navigate within larger work-diaries, where sub-sections exist, but as the numbering continues sequentially across section divisions, too much should not be made of multiple sections where they appear. The 'entry' type is the basic unit of the work-diaries. Each entry is to be presented separately, with the same editorial commentary and formatting for each one. Please see above in the section on 'Structure of the Web version of the transcriptions'.
- n: the entry number, determined by the authorial or editorial number assigned to the entry. If no number note is assigned to the entry, n provides the entry number to be generated in the editorial information. Every entry should have the attribute n; occasionally it is provided for sections, but this has been applied far more inconsistently.
- id: the unique id assigned to the entry, very important in navigating hypertext links. It is formed from a concatenation of 'WD' (for work-diary), the work-diary number (in Roman numerals), a hyphen, and then the entry number (i.e. the value of attribute n). Links between the normalised and diplomatic versions depend on navigating to the uniquely identified entry. ALL entries must have an id; again, this attribute has been far more inconsistently applied, if at all, to sections.
- lang: the language of the entry. The value is an IDREF linked to the language IDs provided in the header of the file. This data can be provided in the editorial commentary, or it can be used for indexing and searching purposes. Not used with sections.
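For programmers generating or checking entry ids, the concatenation rule described above can be sketched as follows. This is a minimal illustration only: the function names `to_roman` and `entry_id` are hypothetical, and the use of upper-case Roman numerals with no further separators is an assumption based on the description, not a statement of how the files were actually produced.

```python
def to_roman(number: int) -> str:
    """Convert a positive integer to an upper-case Roman numeral."""
    if number <= 0:
        raise ValueError("Roman numerals are only defined here for positive integers")
    numerals = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    parts = []
    for value, symbol in numerals:
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


def entry_id(work_diary_number: int, entry_n: str) -> str:
    """Build an id as 'WD' + Roman work-diary number + '-' + entry number (attribute n)."""
    return f"WD{to_roman(work_diary_number)}-{entry_n}"


# Hypothetical example: entry 45 of work-diary 21
print(entry_id(21, "45"))
```

A link from the normalised to the diplomatic version of an entry could then target the same id in the parallel file.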
The expanded form of an abbreviation. Used here largely as a matter of record, as all the abbreviations so tagged would be provided in their expanded form in both the diplomatic and normalised versions, according to our editorial policies on abbreviations. Attributes- abbr: the abbreviated form of the expansion as it originally appears in the manuscript. Nothing is to be done to the data in the abbr attribute. It has been used to record the original form of abbreviations which, according to our stated editorial policies, we expand silently. For this reason it has been used inconsistently and should be ignored for the moment.
Material not transcribed (because illegible or lost). Attributes- reason: the reason it is missing from the transcription. The default value for this is 'illegible', which may or may not appear; other values give more specific reasons. The value of the reason attribute should be attached to the missing section in the diplomatic version.
- extent: the number of characters or lines missing from the transcription. The value is approximate, and expressed as characters, not letters. It is very important that this information be attached to the missing section in the diplomatic version.
- resp: the editor responsible for omitting the section and for estimating the number of words; will almost always be 'CL' (Charles Littleton). This data does not need to appear with the diplomatic version and is maintained more as a matter of record
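As an illustration for stylesheet writers, the handling of these three attributes in the diplomatic version might be sketched like this. It assumes the element is TEI's `gap`; the function name `gap_note` and the exact wording and punctuation of the generated note are hypothetical choices, not the project's prescribed format.

```python
import xml.etree.ElementTree as ET


def gap_note(gap: ET.Element) -> str:
    """Build an editorial note for untranscribed material.

    'reason' falls back to the documented default, 'illegible';
    'extent' is treated as an opaque, approximate value;
    'resp' is kept as a matter of record and deliberately not displayed.
    """
    reason = gap.get("reason", "illegible")
    extent = gap.get("extent")
    parts = [reason]
    if extent:
        parts.append(f"extent approx. {extent}")
    return "[" + ", ".join(parts) + "]"


# Hypothetical sample markup; the attribute values are invented for illustration
sample = ET.fromstring('<gap extent="3 characters" resp="CL"/>')
print(gap_note(sample))
```

The note would then be rendered in italics and a smaller font, like the other editorial commentary in the diplomatic version.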
Change of scribe/ amanuensis in manuscript, signalled by change of handwriting. This tag is used for 'large-scale' changes in hand, i.e. where the new scribe writes at least one complete entry. Brief alterations in hand within entries are considered additions to the text and are marked by (see above). the appears immediately after the of the first entry associated with the scribe, although the same scribe is probably associated with the integral notes which may also appear in the entry and actually appear before the in our arrangement of data. For consistency's sake, though, we have always associated with the entry's Attributes- new: the amanuensis responsible for this section of the work-diary (until the next ). The value of new is an IDREF referring to the IDS of the scribes listed in the section of the . Information on the scribe should be provided for, or at least made available for, each entry in the work-diaries. This can be done by merely referring back to the last previous
- resp: the editor responsible for identifying the hand. Will usually be 'MH' (Michael Hunter) or, somewhat less authoritative, 'CL' (Charles Littleton)
Heading of a work-diary, sub-section within a work-diary, or individual entry. The formatting of depends upon within which parent element it appears. A immediately within is a work-diary heading (or sub-section heading) and should be relatively large and prominent. Within a , the only refers to that individual entry. It should be formatted separately and differently from the entry text (i.e. usually centred), but should not be prominently larger than the entry text Attributes- place: where the heading appears relative to the body of the page. Most often, the value of the place attribute is 'center' (spelled the American way to avoid confusion if programmers directly use the value of place in their HTML formatting, as HTML will only understand 'center' and not 'centre'). Where there is no place attribute, 'center' should be assumed as its default. The only other common value of place is 'left'. This is for headings which appear in the left margin above the entries, usually dates indicating from which date the entries commence. The headings should be formatted in these positions relevant to the entry texts.
- hand: identity of the scribe who composed the heading, if he can be identified (should be linked to hand IDs listed in ). Where there is a value for this attribute, the data should be provided with the heading in the diplomatic version.
- rend: in the work-diaries proper, this indicates the medium used to compose the heading, when not the default, ink. Thus it will be either 'pencil' or 'red' (for a red crayon or pencil frequently used). Where there is a value for this attribute, the data should be provided with the heading in the diplomatic version. However, in the General Introduction and other parts of the 'front' matter, rend usually takes a numerical value, which indicates the relative size of the heading, with '1' being the largest (in other words, following the typology of headings found in HTML: h1, h2, etc.).
- type: a description of the content of the heading. Usually one of 'title', 'date', or 'number', similar to the categories for the type attribute used in notes (indeed it has often been difficult to determine whether text is a heading or a note, and both elements have many features in common). These are perhaps more for indexing or searching purposes, or for matter of record. At the moment it is not envisaged to present the type values of headings to readers.
Text to be rendered as typographically distinct in some way (other than by deletion). Attributes- rend: what makes it distinct. Permitted values: "superscript", "underline", "macron" (i.e. overline), "bold".
Line of verse. Used very rarely as there are few examples of verse in the work-diaries. s should be single-spaced and made distinct in spacing from .
Line break in headings and prose sections.
Line group, i.e. a verse passage or a stanza within a verse passage. Used very rarely as there are few examples of verse in the work-diaries.
Probably the most frequently and variously used element in the work-diaries, with a multitude of functions. Its uses are best described through a discussion of its many attributes Attributes- resp: all s are either of type resp="editor" or resp="author"; it is the fundamental division and each such has to be treated differently. are editorial annotations to the entries. They are to be clearly marked as editorial and be made available to readers before or separately from the entry text. elements contain the various marginalia and notes Boyle and his amanuenses themselves appended to the entries, both at the time of composition and retrospectively. They are considered part of the entry text, but are still to be formatted separately (but not subordinate to) the actual entry text in
. The following attributes will be discussed depending on whether they are in or - type: sets out a standard typology to describe the content of the note. has the following types, both within the of the and within the entries within itself:
- note: an editorial annotation with some comment or clarification of the text; a basic annotation
- number: the number of the entry when assigned by the editors in the absence of an authorial number. Also used where there is confusion in the numbering scheme used by the author which would lead to repetition or confusion in the numbering sequence. The difference between editorial and authorial numbers should always be noted to readers
- date: the date of composition of the entry (or performance of the experiment described) expressed in standard dating format -- day month year. This is to be distinguished both from the dates attached to the entries themselves, which often form the basis for the editorial date but often do not include years, are expressed in a variety of formats, or are inconsistent, and from the dates in the value attribute of the element, which, although, in a standard format, are not reader friendly and are intended more for electronic searching. The is intended to provide the skeleton on which the general chronology of the work-diary is based. Each entry can be placed (ideally) in the time between its last preceding and its succeeding . Thus an entry's position relative to the elements throughout the work-diary should be made available to the readers.
- reference: the bibliographical reference to a work quoted from or mentioned in the work-diary. Usually this function is provided by links to in the Biographical Register (see below), but occasionally for frequently cited works I have maintained this type
- format: only in the of the , this describes the physical state of the work-diary as a whole -- size of paper used and its condition, anomalies in pagination, etc. Its data applies to the entire work-diary and should be made available to readers as part of the general introduction to the work-diary
- length: only in the of the , this provides the number of entries found in the work-diary and information about the numbering system used (if any). General information which should be made available in the general introduction to the work-diary
- person, author, location: only used in the Biographical Register (Register.xml) associated with the files. The biographical and bibliographical annotations in the register are all marked as , and not
. signifies that the text is a biographical annotation linked to a person mentioned in the work-diaries; type="author" indicates that the annotation is linked to a bibliographical reference in the work-diaries; type="location" indicates that the annotation is linked to a reference to an obscure place name found in the work-diaries. the three types are to be formatted the same way and the type attribute is largely used as a matter of record AND for more secure and unique linking with the elements which surround the links in the work-diaries has the following values for the type attribute - number, integral/number: indicates that a number has been assigned to the entry in its margin by a scribe contemporary to Boyle himself, either during the actual composition of the entry (integral/number) or retrospectively (number). This data must be provided with the text of the entry in both normalised and diplomatic versions, but separate from the entry text, its type ('number') clearly indicated, and the distinction between integral and retrospective annotation maintained.
- date, integral/date: indicates that a date has been assigned to the entry in its margin by a scribe contemporary to Boyle himself. The same considerations as with number, integral/number above apply. This data sometimes, but not always, provides the basis for our estimations of date in and is also often surrounded by the element whose value attribute supplies the date in a machine-readable form.
- title, integral/title: indicates that a 'title', i.e. a brief description of the content of the entry, has been assigned to the entry in its margin. The same considerations as with number, integral/number above apply.
- reference, integral/reference: indicates that a reference to a work cited or made mention of in the entry is supplied in the entry's margin. The same considerations as with number, integral/number above apply. Often the text of is enclosed in a element and linked to fuller bibliographical information (supplied by the editors) in the Biographical Register.
- endorsement, integral/endorsement: indicates that there is some sort of notation in the entry margin showing with which of Boyle's published or planned works the entry is to be associated. The same considerations as with number, integral/number above apply. It has often been difficult to distinguish between and is a non-literal mark of some sort (cross, tick, vertical line, etc.) in the entry's margin. Same considerations as with number,integral/number above apply. Almost all marks are retrospective. Marks are expressed as entity references whose definitions are literal descriptions of the mark enclosed within curly brackets.
- note, integral/note: indicates that there is an annotation found in the entry's margin which is not easily classifiable according to the above typology, i.e. a miscellaneous category. Same considerations as with number, integral/number above apply. The most common is 'Tbd', an abbreviation for 'Transcribed' which appears with most entries.
- place: where the note physically appears. This attribute is only used with authorial notes. The value is almost always 'margin', which could almost be taken as the default. That data could be included with the text of each note, but it might become repetitious; it is probably better to group all the authorial notes for a specific entry under a general heading, 'Marginal notes'. However, occasionally, when there are several instances of the same type of note connected to an entry, the place attribute contains more detailed information about where in relation to the entry text body the marginal annotation appears, usually in the form 'at beginning of entry' or 'against line beginning ....'. In this case, the value of the place attribute is expressed as place="margin, at beginning of entry". Either the full text of the attribute can be used, or the substring after 'margin, ' can be reproduced to limit repetition of 'margin'. In any case, any data in the place attribute beyond the default 'margin' should be supplied with the text of the note.
- hand: the identity of the amanuensis who wrote the marginal annotation, if clearly identifiable and different from the scribe responsible for the entry text as a whole (NB, the hand of is always the same as the hand responsible for the
text and the hand attribute is not used in these cases). Thus this attribute is used only with . We have in general been wary in making such identifications on such scrappy pieces of text, but on a few occasions the hand responsible for marginalia is clearly identifiable. Where the attribute is present, its data should be supplied with the text of the . The value of the hand attribute is IDREF and refers back to the hand IDS in the in the . - rend: the medium used to write the annotation. the attribute is used only with . The default value is ink, in which case the attribute is not used; other possible values are 'pencil' or 'red' (for a red crayon or pencil frequently used, particularly in the marginalia). Where this attribute is present its data should be supplied with the text of id: the unique id value of the , to be used for linking purposes. It is primarily used with . In the of the there are frequently s which provide information which applies to several entries in the work-diary. By giving each of these general s an id value we can provide links between it and each relevant individual entry, each of which has an entry-level which contains in it a
[ to the larger, more general . In the Biographical Register, each also has a unique id value (usually the person's name) by which the name marked by in the entry text is linked to the corresponding annotation in the register, using the key attribute of . There is, in addition, one rare use of id with . Occasionally a siglum -- a cross, dagger, asterisk, etc. -- is found in the entry text which indicates that the reader is to find replacement or explanatory text at a marginal annotation marked by a corresponding siglum. In some cases, this marginal note marked with a siglum is used in a number of different entries. In this case such a has an id value by which the place in the entry text where the siglum appears can be linked to it. The id value usually takes the form 'siglum1', 'siglum2', etc.]
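The treatment of the place attribute of marginal notes described above (reproducing only the substring after 'margin, ') can be sketched as below. The function name `place_detail` is hypothetical, and returning nothing for the bare default is one possible reading of the instruction that only data beyond 'margin' need be shown.

```python
from typing import Optional


def place_detail(place: Optional[str]) -> Optional[str]:
    """Return the part of a place value worth showing to readers.

    - absent or plain 'margin': nothing beyond the default, so return None
    - 'margin, <detail>': return only <detail>, to limit repetition of 'margin'
    - anything else: return the value unchanged
    """
    prefix = "margin, "
    if place is None or place.strip() == "margin":
        return None
    if place.startswith(prefix):
        return place[len(prefix):]
    return place


print(place_detail("margin, at beginning of entry"))
```

A stylesheet could then print the detail alongside the note only when the function returns a value, under the general heading 'Marginal notes'.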
Paragraph. The element encloses the text of the entry found in the body of the page. Most entries are only one long, though some have several. A child element of and sibling of for each entry. Attributes- rend: to indicate how the paragraph is rendered in the original. Used very rarely, if at all, as most entries are just written in plain cursive, with no special rendering.
Page break. Its attributes to be used to generate page references for each entry Attributes- ed: the volume of the Boyle Papers, or other manuscript volume, in which the work-diary is found. To be used as the first part of any reference generated.
- n: the page or folio number of the manuscript on which the entry is found. This is given in the form, e.g. 'p. 25' (or 'fol. 25'). To be used as the second part of any reference generated.
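A page reference assembled from the two attributes might look like the following sketch. The function name `page_reference`, the comma separator, and the sample volume label are all assumptions made for illustration; only the order (ed first, n second) comes from the description above.

```python
def page_reference(ed: str, n: str) -> str:
    """Join the volume (attribute ed) and the page or folio (attribute n) into one reference."""
    return f"{ed.strip()}, {n.strip()}"


# Hypothetical attribute values
print(page_reference("BP 27", "p. 25"))
```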
Reference to another text within the document; used to generate links between sections of text within the same file. See below for those elements which are linked to text in a separate file
Replacement of deleted word. This element was included in the DTD specifically for this project. It consists only of one <add> element and one <del> element as children, and indicates that the content of the <add> replaces the content of the <del>, usually in the form of words interlineated above struck-through words. In its editorial apparatus this project is keen to distinguish between replaced (and replacing) words and words merely inserted or deleted (that is to say, not every <add> that appears next to a <del> is necessarily replacing it, and a distinction needs to be made). Our own stylesheets format the content of a <del> within this element differently from a <del> appearing by itself, using the form, coming immediately after the appearance of the inserted words, [replacing ' ' deleted] (whereas the content of a regular <del> is just noted as [' ' deleted]). We ask that programmers incorporate a similar distinction between types of deletions and insertions when devising stylesheets. Attributes- No attributes used with this element.
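The distinction between a replacement and a plain deletion can be sketched as two small formatting helpers. The function names and the sample words are hypothetical; the bracketed wording follows the forms quoted above, and italics and font size are left to the stylesheet.

```python
def format_replacement(inserted: str, deleted: str) -> str:
    """Inserted words followed by a note naming the deleted words they replace."""
    return f"{inserted} [replacing '{deleted}' deleted]"


def format_deletion(deleted: str) -> str:
    """Note for a deletion that stands by itself and is not replaced by anything."""
    return f"['{deleted}' deleted]"


# Hypothetical example words
print(format_replacement("water", "liquor"))
print(format_deletion("very"))
```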
Referring string. Text marked for indexing and/or linking purposes. Used for links to text in a file separate from the file in which the appears. Much of its usefulness derives from its attributes Attributes- key: the value of the key attribute is the id value of the corresponding element, usually in a different file, which should be linked to . Unfortunately, key is of CDATA type not ID type, and thus the validity of the key attribute against the unique id attributes cannot be checked by a parser.
- type: the final typology has not yet been determined; types used at present include 'person', 'authorref' and 'bibliography'. Most importantly, the value of this attribute determines the location of the link which is to be created at this element. Those elements of type="bibliography" should be linked to the items in the Bibliography.xml file whose id attribute matches the value of the element's key attribute; elements with all other types should be linked to the corresponding element (based on the key attribute) in the RegisterNew.xml file. In future we may wish to incorporate links from this element to the Glossary.xml file, but that has not been incorporated yet. At the moment, the main division is between type="bibliography" and every other type value.
- n: the form of the name or word tagged given in an appropriate index form, i.e. surname, personal name, etc. This is meant for future indexing purposes, in case we do want to generate a list of all people who appear in the work-diaries (in which case we would use the type attribute to distinguish people from places, etc.). At the present, though, we do not foresee any formatting for this attribute.
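The rule for choosing the target file of a link from the type and key attributes can be sketched as follows. The function name `link_target`, the 'file#id' fragment syntax, and the sample key values are assumptions for illustration; the two file names and the bibliography/everything-else division are taken from the description above.

```python
def link_target(ref_type: str, key: str) -> str:
    """Return a link target for a referring string.

    type="bibliography" points into Bibliography.xml; every other type
    points into RegisterNew.xml. The key is the id of the target element.
    """
    filename = "Bibliography.xml" if ref_type == "bibliography" else "RegisterNew.xml"
    return f"{filename}#{key}"


# Hypothetical keys
print(link_target("bibliography", "Bacon1627"))
print(link_target("person", "Hooke"))
```

Because key is CDATA rather than an ID reference, a parser will not catch a key with no matching id, so a separate check of keys against the ids in the two target files would be a sensible addition.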
Faulty text: contains the text as it appears on the manuscript. Used very rarely in these files. NO formatting is envisaged for it at the moment. Attributes- corr: what the author meant to put.
Gap in the text attributable to author or scribe (i.e. not a gap due to illegibility, about which see ). Attributes- dim: dimension. Permitted values: "horizontal", "vertical", i.e. whether the space appears within a line (horizontal) or actually is a space of a number of lines (vertical)
- extent: numerical value in units of characters for horizontal space or lines for vertical space (i.e. roughly how many letters or lines could have been fitted in if it hadn't been left blank), or "unclear" if appropriate, e.g. for an unfinished interlineation.
Letters or words supplied by the editor where there is damaged text but sense can still be determined by context. Should appear within square brackets, or with the same marking as or Attributes- reason: a description of why the text is illegible; should appear with the supplied text in the diplomatic transcription
- resp: the editor responsible for supplying the missing letter(s); most frequently 'CL'. No formatting envisaged.
Uncertain or conjectural reading supplied by the editor. Should appear within square brackets, or with the same marking as and Attributes- reason: description of the reason why reading is unclear; to be provided with reading in the diplomatic transcription
- resp: the editor responsible for supplying the reading; most frequently 'CL'. No formatting envisaged.
Written by Charles Littleton, October-November 2001.