Maximizing Workflow and Metadata
Today’s market is replete with software programs and applications—both public domain and for-profit—used for organizing text, audio, photo and video files. This brief article highlights some of the ways that anthropologists can best use these tools. We first identify the basic steps of creating a productive workflow, based upon your own research needs and resources. We then discuss using the metadata fields—codes—to maximize both depth and breadth of data organization and analysis.
The term “workflow,” derived from the software and publishing industries, describes the steps involved in moving from concept to product. It is the recipe you follow for preparing and working with digital files. Just as there are many perfectly good recipes, so too with workflow practices—and in both cases there is also a logical sequence of necessary steps. As with any recipe, you need to decide: (a) what product you want to produce; (b) the ingredients you will need; and (c) the steps needed to produce your product. Just because there are many ways to organize your workflow, however, does not mean they all work equally well. It makes little sense to bake the cake before mixing the ingredients, right? In the same way, we want to stress that workflow is an important sequence of procedures, designed to realize your desired outcome.
Staying with the recipe analogy, which are the easiest recipes to follow? Those with the fewest and least complicated steps, right? But even recipes with more steps work well once they have become second nature through long-term practice and familiarity. So too with workflow procedures: they tend to be most useful when they are simple and routine. Developing your preferred workflow may take several iterations, but once you figure out your preferred recipe, you will have a reliable process for organizing and working with your data. And since repetition generates familiarity, following the same procedures every time helps ensure the consistency of your desired outcome. You are less likely to forget steps, and if you do make a mistake, it is that much easier to backtrack and identify the source of the error. In any case, your workflow can be as simple or as detailed as you need—but once you find what works for you, stick with it. For us, the key steps of a good workflow are: copying, renaming, selection, image treatment, optimization and backup.
When you sit down to start dealing with your data files, create a destination folder on your computer using a short, relevant name (eg, AAA-2012). Next, copy all the files to your destination folder. Remember, even aesthetically “bad” images/video/audio may be ethnographically significant. Before you do anything else, back up this folder on an external drive or disc. Now you can work with the files on your computer, since you have a safe backup in case of error or emergency. Next, rename your files from their generic file names such as “IMG0001.” Naming your files is as crucial as it is simple; for research purposes, it is the single most important step in the workflow process. Do not make the mistake of naming your images for specific people. Rather, use a consistent system, perhaps some combination of the event, location and date. Modern software allows you to use the “batch” editing function to rename all of your images at once, saving time and ensuring consistency. Now make a copy of your renamed files—this copy is what you want to work with going forward.
Now you can select the images you want to work with, and put those in an appropriately labeled subfolder (Marion uses “select,” Crowder uses “keepers”). This is where you may start to work on image treatment, such as rotating, cropping, red-eye reduction and adjusting color balance. No matter what, make sure you are editing a copy, and rename the file in a way that (a) shows that it is a copy, and (b) designates what image treatments you have used. For instance, you could add “#” to the file names of images that you have cropped and “+” for those you have rotated. Use whatever system makes sense to you, but be consistent. Just to play it safe, why not create a key that you can refer to whenever you are working with your files?
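The marker key described above can even live in your code rather than on paper. Here is a small sketch, using the article’s own “#” (cropped) and “+” (rotated) markers; the dictionary and function name are illustrative, and you would extend the key with whatever markers your own system uses.

```python
from pathlib import Path

# Key for treatment markers (the article's example scheme):
#   "#" = cropped, "+" = rotated
TREATMENT_MARKS = {"cropped": "#", "rotated": "+"}

def mark_treatments(filename, treatments):
    """Return a file name with a marker appended for each treatment
    applied, eg 'AAA-2012-001#+.jpg' for a cropped and rotated copy."""
    path = Path(filename)
    marks = "".join(TREATMENT_MARKS[t] for t in treatments)
    return f"{path.stem}{marks}{path.suffix}"
```

So `mark_treatments("AAA-2012-001.jpg", ["cropped", "rotated"])` yields “AAA-2012-001#+.jpg”, and the dictionary itself serves as the key you can consult whenever you work with your files.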
Next, you want to optimize your files for whatever purpose you intend to use them. Not touching them until later? Leave them alone. Emailing or posting online? Reformat to 72 ppi (pixels per inch), the resolution of most screens; this yields a much smaller file that is easy to share and loads quickly, but will not print well. Publishing or printing the images? You will probably need to work with 300 dpi (dots per inch) files. In any case, save the optimized versions of your files with appropriate designations (eg, “@300”). Finally, back up your work as you go.
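The arithmetic behind those resolution choices is simple: pixel dimensions are print size in inches multiplied by resolution. A quick sketch (the function name is ours) makes the difference between screen and print versions concrete.

```python
def pixels_needed(width_in, height_in, resolution):
    """Pixel dimensions required to output an image at
    width_in x height_in inches at the given resolution
    (dots or pixels per inch)."""
    return (round(width_in * resolution), round(height_in * resolution))

# A 5 x 7 inch print at 300 dpi needs 1500 x 2100 pixels,
# while the same size on a 72 ppi screen needs only 360 x 504,
# which is why screen-optimized files are so much smaller.
```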
Find the workflow recipe that works for you. Adjust the steps to your own research, preferences and software programs. Follow the routine and you will avoid errors and have data you can readily access, analyze and share, now and into the future. Hint: write down keywords (see Figure 1) on a sticky note and place it on your monitor—then you will always know where you are in the process at a glance.
Metadata are codes embedded in all digital files—whether Word documents, photographs, audio files or PowerPoint presentations—that describe particular characteristics of the file itself. Some metadata are generated by your device (eg, computer or camera); other metadata you can edit or provide yourself. For example, a simple Word document embeds: (a) the date and time it was created; (b) the number of revisions that have taken place (and the total editing time); and (c) the numbers of characters, words and pages in the document. These data describe technical characteristics of the file itself and are not easily manipulated. Other metadata include the last computer used to update a file, the author’s name, and possibly the company that holds rights to the software application. These data are provided by the administrator of the computer (eg, you, your university or employer), and their accuracy depends upon how the computer is configured. Both sets of data can be very useful in managing, organizing, searching and cross-referencing files, especially when storing and archiving them.
Anthropologists frequently use codes for organizing and analyzing qualitative and quantitative data. Metadata offer another opportunity to employ similar codes for both organizing and analyzing your content. Right-click (Control-click on a Mac) any desktop file and select Properties. You will find various fields populated with technical information about the file (see Figure 2). You will also find several empty fields you can use, such as those for a title, subject, categories and comments. These open fields allow you to describe the file in your own words, providing levels of specificity and detail inappropriate for inclusion in the more general file names. Here you can place people’s names, places, concepts, events and other important data that relate to the document’s substance or intended use. Adding to and editing these metadata fields means that the information you provide (your designation of the data) gets embedded in the digital file itself, facilitating far more powerful referencing, analysis and retrieval.
Once you begin entering codes, you can use the search function in your operating system (OS) to sort through all of your files. This is key: you do not need any specialized program for searching your data—your OS’s search function will retrieve and list your search results. As you begin to develop codes for your various research projects, you will see the power of accessing and using metadata. For example, you can: (1) summarize an interview in the “comments” section of a Word document; (2) use these same words/descriptors as codes for the audio file of the interview; and then (3) place like descriptors in photo and video files. Now you can cross-reference your text, audio and image files, all while maintaining consistent naming conventions across your various media. This far richer contextualization is a tremendous boon for future recall, review and analysis.
As you work with metadata, it can be highly advantageous to use an established vocabulary, a set of standardized descriptors. As with any other coding scheme, this can be especially useful in situating your work—and putting it in dialog—within professional groups. This is particularly important if your files are used for research by colleagues (who may also run searches on your data). The Dublin Core and the HRAF vocabularies (see the Dublin Core Metadata Initiative [DCMI] and Human Relations Area Files [HRAF] websites) are two particularly robust sets of standardized codes, and their vocabulary sets are shared by many types of researchers. Using one of these sets thus allows for broader comparisons with other researchers’ work, and allows you to more readily share your own data. For instance, you could send audio files of interviews and image files from your fieldsite to research teammates; by sharing a standard set of descriptors within the group, you make it easier to find similar types of data for comparison, contrast and analysis. Similarly, within the metadata fields themselves you could note the timestamps of any key dialog your colleagues should listen to, or provide links to transcriptions in accompanying text documents.
Current systems (both Windows and Mac) write the metadata into a “header” portion of the file itself. This means that (unless deleted) the data are embedded within the file, so when you copy the file onto a flash drive or upload it to the internet, the descriptive data stay with the file. This can be particularly important for establishing copyright of your work (written or multimedia). Conversely, it can be equally important to purge the metadata from a file, so others do not have inappropriate access. To use a common example, if you work in Word you can scrub the metadata from the file via File > Info > Check for Issues > Inspect Document. This simple routine will remove the information placed in the file header by Word, including your name and the creation date, as well as comments you have made using “Track Changes.” Other programs have similar functions that you can use to help protect sensitive or privileged information and data, even as you share and publish your materials.
Organizing Your Digital World
Whatever your software programs, design a workflow that works for you. Likewise, use metadata fields to embed coding in your files. Develop your own workflow and metadata procedures, and then stick to your routine. This will keep your digital data organized for archival, current, and future research needs, as well as facilitate ongoing access and analysis.
Jonathan S Marion and Jerome Crowder are the authors of Visual Research: A Concise Introduction to Thinking Visually and have taught the “Photography for the Field” workshops at the 2008-12 AAA Annual Meetings.
Marion is assistant professor of anthropology at University of Arkansas, Fayetteville, and president-elect of the Society for Visual Anthropology. His research interests include performance, embodiment, gender and identity, as well as visual research ethics, theory and methodology, and he is the author of Ballroom: Culture and Costume in Competitive Dance.
Crowder is assistant professor in the Institute for the Medical Humanities at the University of Texas Medical Branch, Galveston. His research interests include medical anthropology and Latin American studies, and he has done long-term fieldwork in the Bolivian and Peruvian Andes.