Page 65 - AC/E Digital Culture Annual Report

that many of these procedures are comparable to automatic image processing (Rosado, 2015).

The ultimate aim is usually to find patterns that help understand literary and artistic creations. But text commentary – close reading – continues to play an important role even when statistical methods are used to analyse texts, because researchers shift their attention from the whole to the detail and from the detail to the whole to check that their ideas about the work are correct and accordingly gain a better understanding of the different layers of meaning, the central themes, the events and the style. Put another way, distant reading and close reading are not mutually exclusive because researchers usually combine both strategies: they first gain an overview and then filter and examine the details for a deep comprehension. They usually complete their analysis with visualisations of information in the form of marginal annotations, parallel texts that are connected in some way (colours, density, contrast between form and substance, arrows) or more abstract structures like maps, trees and graphs (Jänicke, Franzini, Cheema and Scheuermann, 2015).

To sum up, although the volume of data is not comparable to that currently generated by social media, blogs and major companies, in the humanities (and specifically in literary studies) we can only speak of Big Data in connection with the technologies associated with this phenomenon, such as data mining, stylometry or natural language processing.

Data as a human construction

The conversation between the humanities and Big Data does not merely boil down to adopting algorithms for studying large holdings of texts and images quantitatively.
Indeed, digital humanists have played an active part in the debates on the nature of data.

In a context in which data is equated with objective, irrefutable evidence, they have repeatedly argued that data is in fact a human construction; that is, it is conditioned by the time, place, language and ideology of the actors involved in gathering it. For example, the researcher Johanna Drucker (2011) rejects the term “data” – Latin for “that which is given to us” – and uses instead the term “capta”, meaning “that which has been taken or collected”; this critical intervention highlights the partial and incomplete nature of data.

Digital humanists have also stressed the temporality of data – for all data has a date of creation and expiry – and the fallacy of separating data from metadata (that is, data such as title, maker, theme, description, date, format, identifier, source, language, etc.). There is no such thing as second-grade data, as the prefix meta- would suggest; metadata is just as important, selective and partial as data because it is produced by humans (or rather by algorithms designed by human beings). Equally invalid is the distinction – which dates back to Lévi-Strauss’s culinary triangle – between “raw data” and “cooked data” or between “data”, “raw material” and “information”.

Indeed, for researchers like Tom Boellstorff (2013), data is dense, interpretative and contextual, and it is therefore preferable to speak of “thick data”. Paraphrasing the anthropologist Clifford Geertz, data should be regarded as “our own constructions of other people’s constructions” of objects imagined by a particular community.

For example, the Text Encoding Initiative is a non-profit organisation that publishes Recommendations on how to encode humanistic texts with XML markup language so that they

AC/E Digital Culture Annual Report 2017 - Smart culture. Analysis of digital trends

































































































