Page 66 - AC/E Digital Culture Annual Report

are interchangeable and, more or less, standard. It is a participatory organisation in which any researcher can suggest changes or improvements based on their experience to the set of labels defined by the consortium. Up until 2012, however, none of its members had questioned the fact that the label <sex> for describing the sex of a person mentioned in a text complied with standard ISO/IEC 5218:2004 and that the attributes (@value) were given as single-digit codes 1 (male), 2 (female), 9 (not applicable) and 0 (not known).

The situation was re-examined when a female researcher pointed out that this typology was sexist, as it put women in second place with respect to men, and codified patriarchal structures with markup language (Terras, 2013). With this I do not wish to detract from the importance of the TEI, especially in giving shape to the Digital Humanities, but rather to stress that technology, data, algorithms and standards are the product of an interpretation of the world and bear cultural marks. In conclusion, data should not be viewed as absolute truths but be questioned critically.

In defence of theory

In literature on Big Data it is also common to find that theory is discredited. The argument is basically as follows: if we have large amounts of data and effective statistical methods, we do not need theories, models and hypotheses, which need to be proven or refuted with experiments. Put another way, in the era of the Petabyte, scientific method is obsolete (Anderson, 2008). The dismissal of theories and models has not only been given credit in the business world, but it has also been accepted in a few humanistic writings.
Jean-Gabriel Ganascia (2015: 632–33), for example, claims that a theory or previous hypothesis is no longer necessary if we analyse all the existing data as opposed to a sample or small group, as has been done so far.

In contrast to this viewpoint, a considerable number of writings have confirmed the importance of theories, models and hypotheses for research. It should be remembered that our cultural heritage (documents, texts, paintings, images, sounds) is not fully digitised, despite the collective efforts of initiatives like Europeana. According to the latest report issued by the European Commission project ENUMERATE (Nauta and Wietske, 2015), only 23% of European collections have currently been digitised. The survey was answered by some 1,000 European institutions including libraries, museums and archives. These institutions have yet to digitise some 50% of their collections and admit that about 27% of their holdings will not be digitised. These figures highlight the fact that much of our heritage is not accessible on the internet.

Digitisation always involves making a selection based on the resources available to the institution or working group in charge of digitising the documents; but this selection also stems from reasons of ideology and identity. It should not be forgotten that museums, libraries and archives are publicly funded institutions and their role is to preserve and disseminate the cultural heritage of a community (for example, a nation).
In addition, formats, markup languages and algorithms are also part of a particular culture and ideology and go hand in hand with many assumptions that vary depending on the context.

From a humanistic viewpoint, it is thus hard to believe that analysing large amounts of data could render scientific method useless, because we never have all the existing data (one of the vectors of Big Data is the Velocity with which new data is generated), because the data is

BIG DATA IN THE DIGITAL HUMANITIES · ANTONIO ROJAS
Smart culture. Analysis of digital trends