Scribal Attribution across Multiple Scripts: A Digitally Aided Approach

Article, open access, peer reviewed


2017; University of Chicago Press; Volume: 92; Issue: S1; Language: English

10.1086/693968

ISSN

2040-8072

Authors

Peter A. Stokes

Topic(s)

Natural Language Processing Techniques

Abstract

Peter A. Stokes, King's College London

It is well known that people at many points in history wrote (and still write) using different scripts, alphabets, and writing systems.1 For most of the Middle Ages, scribes used different scripts for different purposes: the Insular script system included Half uncial and Cursive minuscule; the Gothic system included Textualis Libraria, Textualis Formata, Textualis Currens, and so on.2 This practice is by no means restricted to Latin writing, however: Tibetan scribes used the more formal dbu-can and more cursive dbu-med scripts in the same documents, for example. To consider other alphabets and writing systems, Sicily and the Iberian Peninsula both abound in documents and inscriptions combining Latin, Greek, Arabic, and Hebrew; documents from the Silk Road, such as those found in the Dunhuang caves, include Chinese, Tibetan, Sogdian, Uyghur, Syriac, Hebrew, Greek, and numerous Indic scripts; documents such as Codex Mendoza contain both Latin (Spanish) and Aztec writing systems alongside each other; and, even today, modern so-called foreign-look fonts for the Latin alphabet are designed to imitate Indic, Chinese, or other writing systems.

This phenomenon of multigraphism, namely the use of different scripts, alphabets, or writing systems in the same time and place, has been studied in some detail in its social context, such as Greek and Hebrew and Greek and Egyptian; hieroglyphic and cuneiform scripts from Hittite Anatolia; Hebrew additions to Latin documents, and so on.3 However, as Armando Petrucci has pointed out, current paleographical theories and methods tend to assume that societies are "absolutely monographic"
(monografico assoluto); he also argued persuasively that the difference between this and multigraphism (multigrafismo) is not superficial but that multigraphism requires us to rethink our theories very deeply.4 He has also drawn the important distinction between "relative" multigraphism (multigrafismo relativo), in which two or more different scripts are used for the same alphabet, and "absolute" multigraphism (multigrafismo assoluto) where different alphabets or, presumably, nonalphabetic writing systems were used;5 it seems likely that each requires a different approach. Although probably all paleographers recognize the existence of multigraphism, few have attempted to develop a coherent method or theory for approaching it, particularly in its "absolute" form, because of the significant methodological challenges that it brings.6 A related and perhaps more manageable phenomenon, which has received somewhat more attention, is "digraphism," which has been defined by Teresa De Robertis as the writing of different scripts at approximately the same level of expertise by a single scribe; as she is at pains to emphasize, this is not a case of one script being executed at different levels of formality, but rather of entirely different scripts.7 Even the question of digraphism is challenging, however. As De Robertis has reminded us, most paleographical discussion of scribal attribution assumes that we are trying to distinguish two scribes writing very similar scripts, rather than identifying one scribe writing very different scripts, and this change in perspective raises some fundamental questions about what we should be looking for.8 How can we determine if two specimens of writing, each in different scripts or writing systems, were written by the same scribe? 
The question of scribal attribution is challenging enough even with the same script, but when different scripts are involved the problems multiply.9 When comparing a precissa script with a cursiva, for instance, not only are the strokes and pen movements very different, but so also is the cut of the pen itself, and this in turn further alters the writing very significantly. Similarly, in the Middle Ages Greek and Arabic were typically written with much thinner pens and much less contrast between thick and thin strokes than was the case for Latin, even in the same manuscript, as is clearly visible in (for instance) a twelfth-century trilingual psalter from Palermo, which is now preserved in the British Library;10 trying to establish how many scribes were involved in a manuscript like this is very difficult indeed.11 There have been some important discussions here, but there does still seem to be some way to go before we have a clear method in place.12 One new direction in which we might turn is the digital: recent developments in computer simulation of handwriting might begin to show how an individual's hand changes according to pen cut and direction, for example, but such work in this context looks some way off.13 Similarly, techniques in image processing and machine learning are "script agnostic" in principle and have successfully been applied to writing in entirely different alphabets and writing systems, but to my knowledge they have been applied only to one system at a time and have not yet been effective in identifying a single scribe writing entirely different scripts; and even the former simpler case is not without its difficulties, although rapid progress is being made.14 Instead, this article will propose a different approach, one that attempts to draw on the rigor and precision of terminology and modeling that digital methods bring and to combine these with existing "nondigital" paleographical discussion and theory. 
The result of this discussion will certainly not be an answer to the question of multigraphism (if indeed there is such a thing), or even necessarily of digraphism, but it will offer some theoretical and practical steps towards a framework within which this work can usefully progress.

In terms of method, a useful starting point seems to be that of the DigiPal project and the subsequent work that has built on it.15 A key outcome of this project was a new method for describing handwriting, one which used conceptual modeling and semantic labels that paleographers could understand while still providing sufficient rigor to significantly reduce ambiguity and allow the descriptions to be stored, searched, and communicated effectively by computer.16 The method has been implemented in a generic digital framework, but it does not involve automated detection of letters: rather it relies on users selecting those examples of letters which they consider to be significant, and then applying descriptive labels to those letters in ways that are meaningful for the given script and research questions under consideration. This approach avoids the well-known problems of paleographical terminology by keeping these descriptions customizable according to the given implementation:17 it is up to the researcher(s) to decide what forms and features are described and what terms are used in describing them. For example, in DigiPal itself the allograph "Caroline h" is described in terms of two components, an "arch" and an "ascender"; the "ascender" can have various features, such as "long" or "short," "forked" or "tapering," and so on.
For Models of Authority, however, which is a different instance of the same software applied to twelfth-century Scottish charters, the allograph h is described as having three components: an "arch," a "limb," and a "shaft"—and the "shaft" can have features such as "forked," but also "looped" or "elongated."18 In other words, the framework provides a common "syntax" for the description of handwriting, within which researchers can place their own vocabularies.19 This then allows asking questions such as "Show me selected examples of all letters with ascenders written by scribes from Worcester, and also those from York, and let me see how the labeled stylistic features are similar or different."20 The spirit in which this is done is not that of the computer providing definitive answers to paleographers' questions, but rather of seeking to improve transparency and clarity of communication, evidence, and argument: goals which have been those of the paleographer for more than a century but which have come to the forefront once again with the advent of digital methods.21 The ability to ask such questions and communicate the results has proven extremely effective for the detailed analytical study of handwriting, for instance by revealing close stylistic similarities between scripts from different places, which in turn indicates close cultural contact.22 The flexibility in vocabulary and the absence of any need to train the software has also meant that it can be applied very easily to non-Latin writing systems and even to decoration. A PhD student supervised by the author has used the DigiPal method for Hebrew and decoration, and the method has also been applied successfully to coins and inscriptions.23 The software has also been tested in small-scale "proof of concept" implementations for Chinese, Arabic, and Cuneiform as well as modern draft manuscripts. 
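The descriptive model outlined above (allographs made up of components, components carrying features, and each project supplying its own vocabulary) can be illustrated with a minimal sketch. It assumes a simple in-memory representation: the names `Graph`, `validate`, and `by_place` are hypothetical, the sample annotations are invented, and nothing here reproduces DigiPal's actual schema or API. The vocabularies use only the terms quoted in the text.

```python
from dataclasses import dataclass


@dataclass
class Graph:
    """One letter on the page, selected and labeled by a researcher."""
    allograph: str
    place: str        # where the scribe worked (invented sample data)
    features: dict    # component name -> set of feature labels observed


# Project vocabularies: allograph -> component -> permitted feature labels.
# Only terms quoted in the text are used; an empty set means the text
# lists no features for that component.
DIGIPAL = {
    "Caroline h": {
        "arch": set(),
        "ascender": {"long", "short", "forked", "tapering"},
    },
}
MODELS_OF_AUTHORITY = {
    "h": {
        "arch": set(),
        "limb": set(),
        "shaft": {"forked", "looped", "elongated"},
    },
}


def validate(graph, vocabulary):
    """Raise ValueError if a label falls outside the project's vocabulary."""
    components = vocabulary.get(graph.allograph)
    if components is None:
        raise ValueError(f"unknown allograph: {graph.allograph!r}")
    for component, labels in graph.features.items():
        if component not in components:
            raise ValueError(f"unknown component: {component!r}")
        extra = labels - components[component]
        if extra:
            raise ValueError(f"unknown features for {component!r}: {sorted(extra)}")


def by_place(graphs, component, places):
    """Group graphs that have the given component by the scribe's place."""
    grouped = {place: [] for place in places}
    for graph in graphs:
        if component in graph.features and graph.place in grouped:
            grouped[graph.place].append(graph)
    return grouped


# Invented annotations, for illustration only.
SAMPLE = [
    Graph("Caroline h", "Worcester", {"arch": set(), "ascender": {"long", "tapering"}}),
    Graph("Caroline h", "York", {"arch": set(), "ascender": {"short", "forked"}}),
    Graph("Caroline h", "Canterbury", {"arch": set(), "ascender": {"long", "forked"}}),
]
for graph in SAMPLE:
    validate(graph, DIGIPAL)

# "Show me letters with ascenders by scribes from Worcester and from York."
result = by_place(SAMPLE, "ascender", ["Worcester", "York"])
for place, graphs in result.items():
    print(place, [sorted(g.features["ascender"]) for g in graphs])
```

The shared structure (allograph, component, feature) plays the role of the common "syntax," while the two dictionaries stand in for project-specific vocabularies. The grouped output is meant for side-by-side inspection by a paleographer and does not decide anything on its own.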
To date, the method and framework are being used in fourteen different research projects, and the approach has had a very wide impact on paleography as a whole.24 Furthermore, the use of a common "syntax" in these descriptions provides substantial flexibility in application but still raises the very real possibility of cross comparison: as long as terms in one vocabulary can be mapped more or less directly to terms in another, it follows that comparisons across the two (or more) instances can be applied. A method such as this that appears to allow at least partial comparison between different scripts suggests promise in addressing questions of digraphism and the attribution of different scripts to a single scribe.

In order to develop a method and test its effectiveness, it is always useful to apply it to a corpus, and for this we turn to the manuscripts and particularly charters of late Anglo-Saxon England. These materials are very well studied and the corpus clearly defined; it also constitutes a clear case of "relative" multigraphism and contains numerous identified examples of digraphism.25 Specifically, scribes in this period wrote both Caroline and Vernacular minuscule, the former normally used for writing in Latin and the latter for Old English.26 In addition to these, and rather less studied, are majuscule letters, which were used at the start of sentences (as in modern English) and also in headings, often for personal names, and occasionally in blocks of text and even marginal additions.
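The cross comparison mentioned earlier in this passage, in which terms in one vocabulary are mapped onto terms in another, can be sketched as follows. This is an illustration under stated assumptions and not part of the framework: the equivalences in `SHAFT_TO_ASCENDER` (for instance, that an "elongated" shaft corresponds to a "long" ascender) are hypothetical and are not asserted by the source, and the function names are invented.

```python
# Hypothetical mapping between (component, feature) labels in two project
# vocabularies. The source does not state these equivalences; a researcher
# would have to decide and justify them.
SHAFT_TO_ASCENDER = {
    ("shaft", "forked"): ("ascender", "forked"),
    ("shaft", "elongated"): ("ascender", "long"),
}


def translate(description, mapping):
    """Split labels into those that map to the other vocabulary and the rest."""
    mapped, unmapped = set(), set()
    for label in description:
        if label in mapping:
            mapped.add(mapping[label])
        else:
            unmapped.add(label)
    return mapped, unmapped


def compare(desc_a, desc_b, mapping):
    """Compare description A (after translation) with description B."""
    mapped, unmapped = translate(desc_a, mapping)
    return {
        "shared": mapped & desc_b,
        "only_a": mapped - desc_b,
        "only_b": desc_b - mapped,
        "not_comparable": unmapped,  # no counterpart in the other vocabulary
    }


# Invented descriptions of an h in each project's own terms.
charter_h = {("shaft", "forked"), ("shaft", "looped")}
digipal_h = {("ascender", "forked"), ("ascender", "long")}

report = compare(charter_h, digipal_h, SHAFT_TO_ASCENDER)
for key, labels in report.items():
    print(key, sorted(labels))
```

Keeping unmappable labels in a separate `not_comparable` set makes explicit that the comparison is only partial, which matches the caution in the text that terms need to map "more or less directly" for the comparison to hold.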
Scribes typically used forms close to those of Rustic capitals, although some scribes wrote clearly distinguished Square capitals, Rustic capitals, and Uncial; others produced more "mixed" majuscules, combining forms from two or even all three majuscule scripts with little distinction.27 A significant part of the corpus has already been analyzed in a public instance of the DigiPal framework, and the resulting website includes over 950 images of writing, predominantly in Vernacular minuscule but also including some Caroline script and, incidentally, capital letters (primarily Rustic capitals). Over 62,000 images of letters on the page have been annotated and are available for search, and recent additions to the database include some examples of Caroline and various majuscule letters, particularly when those were clearly written by scribes who also wrote Vernacular minuscule. Indeed, within this corpus the set of diplomas constitutes a particularly rich source for digraphism, since they were normally written in both Caroline minuscule and English Vernacular minuscule, with differentiated script for boundary clauses, and also often with important names of people or places written entirely in Rustic capitals or mixed majuscules. It also seems clear that the vast majority of these documents were written by single scribes: the risk of circularity in argument is always present here, but they have been studied in great detail by many paleographers and historians over many years and, apart from the odd specific exception, the general principle that the documents were normally written by a single scribe has not been challenged. 
Although by no means a complete representation of the surviving corpus (and, of course, the surviving corpus is by no means necessarily representative of what was produced in the eleventh century), this nevertheless constitutes a significant body of evidence from which to begin.

Before proceeding further, however, our consideration of this corpus raises one fundamental question that is pertinent to any discussion of multi- or digraphism. What exactly do we mean by "script"? When do we have two different scripts and when variations of the same script, for instance with different levels of formality or speed of execution? How different must scribal production be before being considered "multigraphic"? Specifically here there is some debate whether Caroline minuscule and English Vernacular minuscule constitute different scripts at all or whether they are simply variant allographs within the same script.28 For anglophone paleography the standard definition of "script" is typically that of Malcolm Parkes: "an alphabet regarded by many contemporary scribes as a suitable model, or, more frequently, a precedent, which each scribe had in his mind's eye when he produced his own rendering of it on the page."29 Script, in other words, is a model that may show variation in execution but which must be shared by a community of scribes. To take this further, Malcolm Parkes's distinction between "essential elements" and "elements of style" is useful here. Parkes defined "essential elements" as "those characteristics of a letter shape, which enable a reader to distinguish one letter from another," and indeed this definition often equates to the "components" in DigiPal's terminology as described above.
In contrast are Parkes's "graphic ideas" and the related "elements of style" that are not as clearly defined but which are very similar to DigiPal's "features" and incorporate those additional forms such as wedges, serifs, and other "elements … [that] embellish details in the components of a letter shape."30 In general, then, whereas the essential elements that make up different letters and indeed allographs are clearly different, the elements of style are normally consistent across allographs in a given script and give that script its coherence. For example, m is made up of three minim strokes, whereas r has a minim and a hook, to use one possible terminology; but the minims in both letters in the same script would normally show the same features of style, for instance a wedge at the top and a foot on the bottom. Granted there may be some variation here depending on factors like the ability of the scribe and the care and speed in execution, but the model behind the writing can still be determined. Returning to our Anglo-Saxon scribes, they consistently used different allographs when writing English and Latin, with the same usage being common across almost all scribes of the period; in other words, for some cases the "components" or "essential elements" used to construct a given letter (or, more precisely, grapheme) were different when writing Caroline compared to English Vernacular minuscule; but that variation was consistent and determined by clear principles that were shared by the community of scribes. Furthermore, some of the "features" or "elements of style" were also different, such as the consistent use of wedges on ascenders when writing the vernacular but tapering tops on ascenders when writing Latin; these differences were again consistent both within and between different texts copied by a given scribe (allowing for occasional examples of hybridity or variation in execution). 
This consistent and surely conscious distinction in both "essential elements" and "elements of style" indicates that the scribes had two different models "in [their] mind's eye" and that these differences were shared by that community; it is therefore argued here that they were writing different (if similar) scripts.31 Similarly, as will be discussed further below, Anglo-Saxon scribes used Rustic or Uncial capital letters after major punctuation and for proper nouns in a way very similar to modern conventions. Here again, the "graphic ideas" applied to the capitals are typically different from those of the minuscule letters, and indeed there (normally) appears to be a conscious difference between Rustic and Uncial forms of the same letter; this therefore implies that the capital letters form a different script (or scripts) from the minuscule one(s). In contrast to this are examples such as those found in the Codex Bezae, Cambridge, University Library, MS Nn.2.41. This contains writing in both Greek and Latin, but the "graphic ideas" applied to the writing in both languages are the same: an ascender in Greek is written in the same way as an ascender in Latin, with the same length, decoration, and so on.32 Allographs that have the same "essential elements" in Greek and Latin, such as Greek omicron and Latin o, are therefore indistinguishable, and this in turn suggests that the scribe's image "in mind's eye" is the same and so therefore is the script.33

Returning to digraphism and questions of attribution, Parkes's distinction between "essential" and "graphic" elements is helpful here too. If we are lucky then we may be able to find forms that are common to both scripts.
The extreme form of this is what Thomas Dewartes has called "graphic mixing" (mixité graphique), namely when scribes do not maintain the distinction between scripts but mix forms from different scripts in the same word or phrase;34 scribes who do this are not necessarily digraphic, since they may only know one (hybrid) script, though this hybridity does suggest that they are working in a multigraphic context. Even skilled digraphic scribes may sometimes "slip up" and introduce forms from other scripts, and these can be important clues for establishing identity because they give clear points of comparison and the analysis is relatively straightforward.35 Often, however, this is not the case, and indeed Dewartes's analysis suggests that at least in his case. One striking example of this is discussed by Teresa De Robertis regarding fourteenth- and fifteenth-century scribes and the range of scripts that they demonstrably used.36 As she has shown, most of the typical points of paleographical comparison do not straightforwardly hold in this context. In Parkes's terms, the essential elements of letters are clearly different, for example between minuscule a and majuscule A, and so comparison at this structural level is not helpful. Similarly, however, the scripts in the Gothic system vary in terms of their graphic ideas since features such as the treatment of loops on ascenders or feet on minims, the angle of the pen, and weight of the script are an essential part of the difference between a Gothic Textualis and Cursiva, and so on. Using Parkes's terminology again, in these examples neither essential nor graphic elements provide points of comparison.
De Robertis has suggested that the internal characteristics of the text may prove helpful here, as well as very specifically idiosyncratic use of allographs; the latter would need to be determined on a case-by-case basis, however, and the former would rarely transfer across language if the scribe is working in a multilingual context. If one cannot find such toeholds, so to speak, then she proposes that one must rely on nonpaleographical information to make such assertions, although she does leave open the door to a paleographical approach by noting that the trained eye can sometimes intuit scribal identity in such cases.37

In the late Anglo-Saxon context, the scripts are relatively similar, as discussed above. Regarding Caroline and English Vernacular minuscule, the essential elements in some letters vary between the two scripts: specifically, in a, d, f, g, h, r, and s. However, in all clear examples of the same Anglo-Saxon scribe writing both scripts, the elements of style and indeed the graphic ideas are identical, with the exception only of the treatment of the tops of ascenders and sometimes of descenders.38 This has several important consequences. One is that scribes seem to have considered the scripts to have had no inherent difference in register, implying in turn that Latin may not necessarily have been perceived to be "higher" than Old English;39 this is unlike the digraphic cases studied by De Robertis, Irene Ceccherini, and Dewartes.40 More relevant to this discussion is that there appear to be no specific difficulties in identifying scribes even when they are writing different scripts in different manuscripts: there are enough points of comparison that the approach outlined above still holds. Rather more challenging is the comparison between majuscule and minuscule writing. Clearly the essential elements are very different in this case: indeed, the differences are such that very few elements are common to both.
Potentially more similar are the graphic ideas, elements of style, such as wedges, which seem to be common at least between some forms of Rustic capitals and Vernacular minuscule in a late Anglo-Saxon context. However, this hypothesis remains to be tested. How common are the graphic ideas across majuscule and minuscule scripts in this context? And can one identify "components" in the DigiPal sense, such as ascenders, shafts, or bowls, which are common across both sets of letter forms? Are scribes consistent in this respect? All these questions are important in identifying what can and cannot be compared in a digraphic context.

Examining the corpus does reveal significant patterns in the formation of letters across scripts. In some cases, the commonality of elements of style seems very clear. Searching the DigiPal database for examples of majuscule and minuscule letters with wedges by the same scribe, for instance, reveals numerous instances where those wedges have identical shapes in all cases, with the same angles and size, and clearly written with the same pen (or cut of pen) held at the same angle. This is particularly striking as the corpus is known for showing a great deal of variation between different specimens of script, even within the same manuscript or document, and so close similarity is normally considered strong evidence for scribal identity.41 A particularly clear example is the scribe of two surviving documents, London, British Library, Stowe Charter 34 and Oxford, Bodleian Library, Eng.hist. a.2, no.
XIII.42 The former is a royal diploma, dated to 997, issued by King Æthelred to Æthelwold, bishop of Crediton, and appears to be an original document but was probably written by a scribe from Crediton perhaps after the events that it records.43 The second document is the vernacular will of Bishop Ælfwold and is datable to 1008×1012.44 The Latin material of the royal diploma was written using the tapered ascenders that were typical of the "purer" forms of Anglo-Caroline minuscule of that date, and the scribe also used triangular wedges for the serifs of ascenders, when writing the vernacular boundary clauses and the vernacular will, and of minims in all scripts. The minuscule scripts have been discussed before, but the majuscule letters have received much less attention.45 The scribe was consistent in his use and production of capitals. The vernacular writ contains only minuscule forms, except for an initial Þ at the start of the document and the occasional enlarged minuscule a at the start of some names. However, as was common for the time, the royal diploma included capitals at the start of sections and for prominent names, such as that of the beneficiary. The scribe normally used Rustic forms for these but employed Uncial ones to highlight changes in section, such as the Latin preface to the witness list, and indeed for the first column of the witnesses themselves. Furthermore, his production of all the letters themselves is very consistent, with the pen held at the same angle, the strokes being formed in the same way, and so on. Indeed, when the DigiPal framework is used to scale the script of the Latin main text and the vernacular boundary clauses, it is striking how similar the forms are of those letters such as m and n, which do not vary between scripts despite the change in pen size, as illustrated in Fig. 
1, below.46 Importantly, however, the Rustic capital letters also show triangular wedges, and once again these are formed in precisely the same way as those of the ascenders and minims in the Vernacular minuscule. This can be demonstrated by examining samples of such graphs from the DigiPal database, as illustrated in Fig. 2, and indeed by overlaying them on one another to emphasize the point still further, as shown in video 1.47

Fig. 1. Sample of results of search in DigiPal framework for graphs by Chaplais's "Scribe 1."49

Fig. 2. Results of DigiPal search for majuscule letters selected from London, British Library, Stowe Ch. 34.52

Another different but comparable example of this is in the work of the scribe known as Eadwig Basan.48 His case is slightly more difficult, because unlike the scribe of the Crediton documents, Eadwig did not use wedges on the ascenders of Vernacular minuscule. Indeed, from about the second quarter of the eleventh century onwards, Anglo-Saxon scribes distinguished their treatment of ascenders and descenders in the two forms of minuscule script: writing in the vernacular typically showed forking at the tops as opposed to tapering forms when writing Latin. Similarly, descenders of p and q were normally straight and often with a small serif at the foot, as opposed to those letters that had ascenders only in the Vernacular allographs (f, r, wynn, and some forms of s), in which case they were typically curved to the left.50 In contrast, Rustic capitals were normally written with the now "old-fashioned" wedges, and Square capitals often with horizontal approach strokes. Despite these differences in elements of style, because Eadwig's writing was so very consistent we can instead find very close parallels in the essential elements.
For example, the strokes of the lower bowls of minuscule b and Rustic B are identical in the cut, angle, and curve of the pen, as becomes evident if one uses the Lightbox feature in the DigiPal framework to manipulate the images and overlay one on top of the other (see Fig. 3 and the associated video 2).51 Similarly, in both Eadwig's writing and that of the Crediton documents, strokes such as the ascenders in minuscule script and the vertical in allographs, such as Rustic capital D, are again identical in angle, width, and so on. Comparisons can also be made between Eadwig's "regular" script and the altered script with which he wrote the boundary clauses of charters: by digitally expanding the smaller script and then overlaying the two, the strokes can again be seen to be almost precisely identical, the difference being in the altered proportions (with ascenders and descenders very much longer in the boundary clauses). Once again this can be compared and contrasted with other annotated graphs in the corpus to demonstrate just how close Eadwig's forms are to one another, and how distinct these are from those by other scribes. The important details here are therefore in the essential elements more than in the elements of style.

Fig. 3. Screenshot of DigiPal Lightbox with selected graphs of Eadwig Basan from London, British Library, MS Stowe Ch. 38. In each pair, the first graph is from charter bounds and the second from the main text.53

In cases like these, then, namely eleventh-century English in which (at least) three different scripts were used, we can find enough similarities between them that we could have identified the scribe relatively easily even if the scripts were entirely separate.
We can link Caroline minuscule to English Vernacular minuscule based on those allographs that are the same in both scripts, and we can link English Vernacular minuscule to Rustic capitals as well as some Caroline minuscule by the comparison of wedges and foot serifs. All three are also comparable in the formation of essential elements in terms of the angles of strokes, the angle and cut of the pen, and so on.

Examples such as these seem promising and suggest a workable and relatively transparent method for identifying the same scribe writing different scripts. The discussion does have some gaps, however. As noted briefly above, both scribes also wrote other scripts beyond the three considered so far. London, British Library, Stowe Ch. 34 includes some capital letters using Uncial form, and Eadwig Basan also wrote Square capitals. These cases are significantly more difficult to identify, however. The Uncial forms were written without any wedges but instead with horizontal approach strokes, and the serifs are of a very different nature. Furthermore, although the pen may have been the same, the letter was drawn more than written: that is, rather than being composed of single strokes, elements such as the back of Uncial E, H, and Þ were produced with deliberately thickened strokes. An exception is Uncial D, which was written with the usual strokes and indeed is closely comparable to that of the round-backed minuscule d used in the Vernacular script. For the other cases, however, it is by no means obvious how such a comparison could be made.

However, the challenge goes further than this. As noted ab
