End of Term Reflection – Communication!

I’ve known for a while that communication is really important when working in teams, but this term really drove home for me how crucial it is. I spent a good amount of time this term working on prepping and uploading archival images of workhouse documents from London in the late eighteenth and early nineteenth centuries as part of the Virtual Workhouse project (you can see the fruits of our labors here!). Tyler and I were both working on these spreadsheets so there were a lot of moving pieces, with having to keep each other in the loop about our progress, checking in with Austin and Sarah about problems we were running into, and talking with Susannah about unexpected issues we came across (like an index in the front of two of the volumes!).

Trello, which is a new piece of our workflow this year, was very helpful for me. Tyler and I could record what we had done each time we worked on the spreadsheet in a shared place to keep track of our progress. I could tell him that I had finished the titles in Volume 3 but ran into problems with the dates, and he could tell me that he had finished the identifiers in the same volume and fixed the dates. This way we could keep track of what we had finished, so neither of us repeated work the other had already done.

A screenshot of the Omeka CSV Import Plus Plug-in that we used extensively in our work

Another aspect of communication that I found really important was communication with myself (essentially, documentation). Since we were dealing with the spreadsheets of six volumes of a workhouse minute book, there was a lot of data and a lot of images. I could not count on my memory to keep track of things. Even if I noticed that image 84 in Volume 2 had to be discussed, there was very little chance I would remember that. So I had to make sure to write it down for both myself and others and clearly state exactly why it had to be discussed. The same was true of meetings. I met with Susannah to ask about unexpected pages, and then brought that discussion to a meeting with Tyler and Sarah. Without my notes from the meeting with Susannah, I would not have been able to remember what we had talked about and what her suggestions had been. Likewise, I wouldn’t have remembered our decisions from meeting with Tyler and Sarah to implement them on the spreadsheets. While I’ve known for a while about the value of making good notes to communicate with both others and myself, the high volume of data that we have been working with for the workhouse minutes has driven home to me the absolutely critical nature of such documentation.

Failing to Map Historical Maps

Building off of Martha’s previous post, I’m going to discuss some challenges of mapping projects with old maps. Old maps pose challenges to digital projects. In particular, the spatial arrangements of many old maps don’t match modern day maps of the same area. A path made up of geographic coordinates (such as on Google Maps) is not guaranteed to be compatible with old maps. In addition, there is a fine line in many of these maps between maps and city views, especially in many early modern European prints.

One example of these map/city-views that is useful to think with is the woodcut of Rome from the 1493 Nuremberg Chronicle:

A full page spread image of Rome
A woodcut image of Rome from the 1493 Nuremberg Chronicle

This woodcut image clearly shows Rome – in addition to the label (which I will expand a bit on later), there are many recognizable sites – the papal palace, the Castel Sant’Angelo, the Pantheon, and the Colosseum. It is not a simple city view – the geographic relation between these sites shows the general layout of the city. However, trying to plot a path (or even just points) on a georeferenced version of this map is not feasible.

Georeferencing warps the map image so that points on the image fit the corresponding points in the real world. Sometimes this warping can be very extreme, especially for certain kinds of transformations:

The map is warped beyond recognition in the attempt to georeference the map.
Mid-process attempt to georeference Rome

The control points (which link the image to geographic points) on this map do not line up well, since the spatial arrangement of the landmarks in the woodcut does not match their geographic locations. This image shows the difference between the points on the 1493 map and the present-day map, shown as blue lines.

Rome map georeferenced with visible control points that don't line up with the historical map
Rome map and control points
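One way to see why the control points refuse to line up is to fit the simplest kind of transformation, an affine one, to a set of control points and look at what is left over. The sketch below is a minimal pure-Python illustration, not what any particular GIS tool does internally. The function names and the sample coordinates are made up for the example, and it treats coordinates as flat x/y values.

```python
import math

def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= factor * a[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (a[r][3] - sum(a[r][c] * x[c] for c in range(r + 1, 3))) / a[r][r]
    return x

def fit_affine(image_pts, world_pts):
    """Least-squares affine transform from image (x, y) to world (x, y)."""
    rows = [(x, y, 1.0) for x, y in image_pts]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    coeffs = []
    for k in range(2):  # k = 0 fits world x, k = 1 fits world y
        atb = [sum(r[i] * w[k] for r, w in zip(rows, world_pts)) for i in range(3)]
        coeffs.append(solve3(ata, atb))
    return coeffs

def residuals(image_pts, world_pts, coeffs):
    """Distance between each transformed control point and its real-world target."""
    out = []
    for (x, y), (wx, wy) in zip(image_pts, world_pts):
        tx = coeffs[0][0] * x + coeffs[0][1] * y + coeffs[0][2]
        ty = coeffs[1][0] * x + coeffs[1][1] * y + coeffs[1][2]
        out.append(math.hypot(tx - wx, ty - wy))
    return out

# Hypothetical control points: pixel positions on the scanned image.
image_pts = [(0, 0), (10, 0), (0, 10), (10, 10)]
# A map drawn to a consistent scale: every point fits the same transform.
consistent_world = [(100, 200), (120, 200), (100, 220), (120, 220)]
# A map where one landmark is drawn far from its true relative position.
distorted_world = [(100, 200), (120, 200), (100, 220), (150, 260)]

for world in (consistent_world, distorted_world):
    coeffs = fit_affine(image_pts, world)
    print([round(r, 2) for r in residuals(image_pts, world, coeffs)])
```

The residuals play the same role as the blue lines in the image above: near zero when the old map is spatially consistent, and large when a landmark was drawn out of place. More flexible transformations can shrink the residuals, but only by warping the image more.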

Another problem with a map like the Nuremberg Chronicle woodcut is that we don’t know what all the landmarks are. There are many church structures, but only a few are labelled. In addition, there are features that seem to be missing – for example, it is difficult to distinguish Tiber Island, a major landmark, on the woodcut. Furthermore, the scale of the buildings poses problems. It is relatively easy to use the center of the Pantheon as a point on a present-day map, but where on the woodcut image is the “center” of the Pantheon? In all, attempting to georeference and plot points on a map such as the Nuremberg Chronicle woodcut image of Rome is frustrating, inaccurate, and ultimately provides no additional insight. In fact, the extreme warping of the image makes it more difficult to understand the map and the data represented in relation to it.

My attempt to chart a path on the Nuremberg Chronicle map

The closest point is the Colosseum – nothing else lines up very closely at all (the path is supposed to go from the Colosseum to the Pantheon, to one side of the Ponte Sant’Angelo, to the Castel Sant’Angelo, to the Vatican). The result is not illuminating and does not contribute any new knowledge, and in fact doesn’t serve either purpose well; it is difficult to interpret the Nuremberg Chronicle map, and it is almost impossible to know which landmarks are denoted by the path.

The path of points through Rome does not line up with the location of those points on the Nuremberg map
A path through Rome on top of the Rome Nuremberg Chronicle

Can we trust the labels?

In the Nuremberg Chronicle, no, we can’t! Rome is clearly correct – the sites confirm the label. Other cities are like this, including, for instance, Krakow, whose woodcut gives specific labels for part of the city and reflects the city layout. However, for more minor cities, the Nuremberg Chronicle often uses the same woodcut for different cities.

 

In fact, the same woodcut is used for nine images (see below). These images include: Napoli, Perugia, Mantua, Ferrara, Damascus, Bena (I’m not sure what city this refers to), a German province (not sure about this one either), Spain, and Macedonia. While Italian cities may have similar styles, I cannot accept that Napoli, Damascus, Spain and Macedonia literally looked like these woodcuts. Therefore, not only do these images provide difficulties in terms of spatial alignment, but we also cannot always accept them at face value, because they may be no more than a generic representation of a city rather than a visual representation of an actual cityscape.

Sources:

The color woodcut images come from the digitized University of Cambridge Nuremberg Chronicle (CC BY-NC 3.0): (the page numbers are: Bena 80r, Damascus 23v, Ferrara 159r, Macedonia 275r, Mantua 84r, Napoli 42r, Germany? 284v, Perugia 48v, Spain 289v).

The woodcut image of Rome is from the digitized copy in Morse Library, Beloit College. Last accessed 16 October 2017.

RBMS/BSC Latin Place Names File provided help with place names.

To Map or Not to Map?

During this year’s fall DH training, the DHAs got some practice using ArcGIS, an online mapping tool that makes it relatively easy to create your own customized maps in one sitting. This post discusses some of the pros and cons, advantages and pitfalls of mapping data. (Note that by mapping I am referring strictly to the use of geospatial maps, not to the more general application of the term that includes graphing.)

Why use a map? Mapping is fun and exciting, and it’s a relatively easy way to build a data visualization that’s interactive and easily facilitates instantaneous spatial comprehension of the data. For these reasons, people are often quick to jump on the “let’s map it!” train whenever there is spatially relevant data. But it’s important to stop and ask this question first: what will a map add to this project that other data visualizations will not? Sometimes, sparsity or lack of variation in your data should disqualify the map idea.

Take this example from Stanford’s Professor Martin Evans, which maps specific locations in and around London that are referenced in works written by authors from London. There’s an abundance of data in this data set, and the locations are spread all over London – mapping helps us understand the data, so mapping was a good choice. If, however, you were mapping only locations in London referenced by Sylvia Plath, you might think twice about whether the <10 data points clustered in one small location are worth putting on an interactive map.

Once you’ve determined that a map is worth your time, you might next consider what kind of spatial information you want to convey. Is the data represented well by points on a map? Or is there a path or order to these points? How can you visually differentiate between different paths or groups of points (hint: colors)? Try to create a map that accurately visualizes the story you’re trying to tell with your data. In this example, students at the Georgia Institute of Technology recreated the paths taken throughout the day by characters in Mrs. Dalloway. The smooth, continuous paths tell a better story than a series of sequential points would, and the colors make each path stand out from the others. Above all else, mapping should make it easier for your audience to understand your data, so think hard about how you’re transferring your data to your map. And use colors!

Don’t forget that an important part of mapping is the base map itself, not just the points you put on it. Much of the time, simpler will be better – if the story you’re trying to tell has nothing to do with the terrain of the area, don’t clutter your visual with a terrain base map. Humanities scholars are often excited about using historical base maps, which are historical maps that can be georeferenced onto a modern, digital map of the same location by matching specific points between the two maps. One common problem with historical base maps is that many historical maps are not geographically accurate, so georeferencing them can stretch and distort them to an unusable extent. For example, this 1853 map of Maine from the David Rumsey Map Collection is quite geographically accurate, and would work well as a georeferenced historical base map, but this 1935 world map of post office and radio/telephone services from the same collection is highly geographically inaccurate and would have to be significantly distorted to be georeferenced onto a modern 2-dimensional map of the world.

Finally, consider how you will communicate the data for each point or path on your map. Points and paths don’t always speak for themselves, and there will often be metadata or a paragraph of information that necessarily accompanies each data point. How will your user access this information? Is there a key that goes with the map? Do you click on a point to reveal the associated text? Does each point link to more information?

There are many ways to address the above issues and questions, facilitating lots of creativity and flexibility within each project. Above all else, no matter how you approach a mapping project, your map should always give a clear and intuitive answer to the question: what story is this map trying to tell?

Welcome Back!

Last week I arrived early on campus to participate in the fall term DHA training. I didn’t get to take part last year because I was abroad in the fall, so it was a new experience for me. There’s one difference between this year and last year that immediately stands out – since I was able to attend the training this year, I had an opportunity to work with and get to know the other DHAs and new DH interns before the term officially started. This was my favorite part of training, and I’m hoping it’ll get us off to a great start this year. I find it so much easier – and more fun! – to work with others when we’ve already eaten deep-fried food from Jesse James Days by the Cannon River together.

At the end of spring term last year I attended the digital humanities conference that I had spent all of spring term helping to organize. Although I was one of only a few students who attended and felt initially intimidated by the sea of “real adults,” I became increasingly aware throughout the course of the conference that I knew what I was doing. I understood a lot of the jargon, I was able to intelligently contribute to conversations, and, most importantly, I felt like I deserved my place in those conversations. In short, it was really cool. A year ago, I wouldn’t have been able to do that. This year, I’m excited to build on that confidence as I expand my DH toolbox. Not too long from now I’ll have to leave Carleton to join the leagues of “real adults,” and I think some confidence will come in handy.

3 Fantastic WordPress Plugins for Your Online Journal

Say you’re in the middle of publishing your favorite paper to a WordPress website. You are stuck because creating all those footnote links, formatting the pull quotes, and inserting that complicated-looking table seem like daunting tasks. But don’t give up! Here are three plugins that will help you out!

Footnotes

Plugin: Easy Footnotes

It not only lets you add footnotes throughout your WordPress post, but also compiles a corresponding ordered list of the footnotes at the bottom of your article. 

Nice features:

  • Shortcode enabled. A footnote can be inserted as easily as typing [note]Footnote content.[/note] where you want it to be.
  • Automatic numbering. It would be a huge pain to number a few hundred footnotes manually, especially when you realize after entering the first 90 footnotes that you somehow skipped number 7. The good news is that this plugin automatically adds the number of the footnote where the shortcode is entered.
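To make the two features above concrete, here is a rough Python sketch of what shortcode-based automatic numbering involves. It is not the plugin’s actual code (the plugin runs inside WordPress), and the function name and marker format are assumptions for illustration only. The [note]…[/note] shortcode is the one mentioned above.

```python
import re

# Matches the [note]...[/note] shortcode described above.
NOTE_PATTERN = re.compile(r"\[note\](.*?)\[/note\]", re.DOTALL)

def number_footnotes(post):
    """Swap each shortcode for a numbered marker and collect the notes in order."""
    notes = []

    def replace(match):
        notes.append(match.group(1).strip())
        return f"[{len(notes)}]"  # the marker number is simply the note's position

    body = NOTE_PATTERN.sub(replace, post)
    footnote_list = "\n".join(f"{i}. {text}" for i, text in enumerate(notes, start=1))
    return body, footnote_list

post = ("First claim.[note]Source for the first claim.[/note] "
        "Second claim.[note]Source for the second claim.[/note]")
body, footnote_list = number_footnotes(post)
print(body)
print(footnote_list)
```

Because the numbers are derived from the order of the shortcodes, inserting a new footnote in the middle of a post renumbers everything after it automatically.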

Pull Quotes

Plugin: Easy Pull Quotes

The plugin name is pretty self-explanatory. It helps you create pull quotes in WordPress posts. After installing and activating this plugin, you’ll see an “Easy Pull Quotes” tab in your editing toolbar. By clicking it you’ll be directed to a text box where you can enter your quote and choose its alignment. 

Nice features:

  • The pull quotes can be easily shared to Twitter by the end user by clicking the Twitter icon.
  • If you’re familiar with CSS, you can easily create your own pull quote style by manipulating the code. (Here’s how you can add pull quotes to WordPress posts without using any plugins. It’d probably be more time-consuming, but it’ll be fun to learn some HTML and CSS!)

Tables

Plugin: TablePress

After installing this plugin, you’ll be able to generate beautiful tables and embed them in your posts. You can learn more about this plugin from this website, where you’ll find a cool demo too. 

Nice features:

  • Shortcode enabled.
  • Tables can be imported from and exported to Excel and CSV files.
  • With the help of additional JavaScript libraries, it’s possible to allow your readers to sort or filter your tables.

I hope these plugins help! Have fun online publishing!

Learning about NLTK

In the past couple of weeks I’ve been helping to update the curriculum for a fantastic project called DH Bridge. This curriculum includes a one-day programming bootcamp for people with no computer science experience (and particularly those who are also involved in the humanities) to learn some basic Python skills. I’ve had so much fun doing the tutorial along the way because it focuses on text analysis using the Natural Language Toolkit (NLTK), which I wasn’t previously familiar with, but which includes some really cool tools for natural language processing. You can download NLTK for free and use the many Python libraries it has available to do text analysis day and night! Here are a few of the things I learned:

  • NLTK has a built in method for getting word frequencies, and it’ll spit out the n most common words in a text (you decide what n is) along with the number of times that each word appears, in order from most to least frequent. Nothing too complicated – but it’s a great (and very useful) starting place.
  • Want to see the context in which a certain word appears throughout a text? This method takes a single word as a parameter and prints out each instance of that word within its surrounding text. For example, here’s every instance of the word “trial” in Harper Lee’s To Kill a Mockingbird.

This is a great way to get a sense of how a word is being used throughout a text without having to Control+F your way through the whole thing.

  • This one is my favorite because I think it’s so cool. You give it a word and it returns the twenty words that are “most similar” to that word in the text. I haven’t looked too far into how it works, but the method somehow determines which words are most often used in a similar context to the given word. For example, here are the results for the word “trial” in To Kill a Mockingbird.

Some words, like “court” and “newspaper,” are pretty self-explanatory, but we may question why a word like “family” is so closely associated with the word “trial” in this novel.
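To show what the first two tools do under the hood, here is a small plain-Python sketch that uses only the standard library, so it is an illustration of the idea and not NLTK’s own implementation. The function names and the sample sentence are made up. In NLTK itself, the corresponding calls are, as far as I can tell, FreqDist(tokens).most_common(n), Text(tokens).concordance(word), and Text(tokens).similar(word).

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def most_common_words(tokens, n):
    """Return the n most frequent tokens with their counts, most frequent first."""
    return Counter(tokens).most_common(n)

def concordance(tokens, word, width=3):
    """Return each occurrence of `word` with up to `width` tokens of context per side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == word:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            lines.append(" ".join(left + [tok.upper()] + right))
    return lines

# Made-up sample text, not a quotation from any novel.
sample = ("The trial began on Monday. Everyone in town talked about the trial, "
          "and the trial divided the town.")
tokens = tokenize(sample)

print(most_common_words(tokens, 2))
for line in concordance(tokens, "trial"):
    print(line)
```

The “most similar” search builds on the same idea: instead of printing each context, it compares the contexts of different words and ranks the words that share the most.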

Even with these very simple searches, it’s already easy to see the kind of information you can get out of a text that the human eye wouldn’t necessarily be able to see. Yay digital text analysis!

How To Do Your Job When You Don’t Know How To Do Your Job

The cool thing about this job is that I get to constantly be doing new things and jumping into new projects. The flip side to this, however, is that each project is unique and requires very different skills – skills that I (very often? most of the time?) don’t yet have. So this term, I’ve been getting used to the fact that not having a skill to do a certain job doesn’t mean I don’t do the job, it means I get to learn how to do it. The question then often becomes, “Where do I even start to learn how to do X?” The following are some tips and tactics I’ve been working on using when I’m faced with a daunting task that I’ve never done before:

  • Just ask. This seems obvious, but it’s often much easier said than done. People don’t want to risk sounding dumb by asking questions, but 1) people probably won’t actually think you’re dumb, and 2) isn’t it better to ask and learn how to do something correctly than spend all your time doing it wrong?
  • Google it, but be smart about it. Again, this seems obvious, but Google is a gift and a curse. Be wary of bad advice (you wouldn’t cite a Buzzfeed article for an academic paper, so why should you take serious advice from it?), and think hard about the search terms you use (be precise, try a variety of related terms, etc…).
  • Pretend that you know what you’re doing. I love this tactic. Sometimes I know that I don’t know what I’m doing, but I don’t know what I don’t know, so I just start working until I get stuck in order to figure out where the problem is. It’s a really great way to pinpoint exactly what you don’t know.
  • Use sites that were created for these situations, like Lynda.com. If you’re a Carleton student, you already have a subscription! Even if you can’t find a video to explain exactly what you’re supposed to be doing, it can help you to get the hang of the general terminology relating to the task at hand or the basic functionality of a tool you’re learning to use.
  • Look for existing examples. Chances are you’re not the first person to do anything, so it’s a great idea to find examples of best practices and conventions. This is true for pretty much anything, but particularly when you’re doing something totally new.

Of course, the best part about not knowing how to do something is that you get to learn how to do it and then a week later when one of your colleagues doesn’t know how to do the same thing you get to pretend that you’ve known it all along and teach them how to do it! Such is the cycle of life. Remember, everyone’s just trying to fake it ‘til they make it.

All things Bede

As of this moment, I’ve been a member of the Digital Humanities team for seven sometimes challenging, frequently exhilarating and always rewarding weeks. I’ve learned a whole lot – from how to write good documentation to using tools like Omeka and ArcGIS to valuable cultural lessons such as being introduced to the 60’s Batman show.

I spent a good portion of my time this term focusing on the Bede Project, which aims to create an online commentary on the Ecclesiastical History of the English People by the Venerable Bede. Three Carleton professors – Rob Hardy, Austin Mason, and Bill North – are collaborating on this project. Once finished, it is going to be part of Dickinson College Commentaries. I started working on this project over a year ago and since then have grown fond of Bede and his clear if occasionally funky Medieval Latin.

This term I’ve been focusing on two aspects of the project – vocabulary lemmatization and mapping. Lemmatization is a fancy word for mapping every inflected word form to its dictionary form, or lemma. Most of it was done automatically using a lemmatizer script, so what was left for Bard and me to do was to fill in the words for which the lemma for some reason wasn’t found. That would happen either when the inflected form was ambiguous, in which case I went back to Bede’s text to figure out which of the umpteen possible things it meant, or because the word wasn’t found in the word list the lemmatizer pulled data from (that would be true for Medieval Latin vocabulary, names, or words that had an alternative spelling). While slightly monotonous, this is a great refresher for my rusty Latin, and there’s a fun problem-solving aspect to figuring out which of the many possible meanings a word has in any given context.
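The division of labor described above, where a script resolves the easy cases and a person handles the rest, can be sketched in a few lines of Python. This is not the project’s lemmatizer script. The tiny word list and the function name are hypothetical, chosen only to show the two failure cases: a form with several possible lemmas, and a form missing from the word list.

```python
# Hypothetical word list: inflected form -> possible lemmas.
WORD_LIST = {
    "ecclesiam": ["ecclesia"],
    "gentis": ["gens"],
    "regis": ["rex", "rego"],  # genitive of "rex" or a verb form of "rego"
}

def lemmatize(tokens, word_list):
    """Split tokens into automatically resolved lemmas and ones needing a human."""
    resolved, needs_review = {}, {}
    for token in tokens:
        lemmas = word_list.get(token.lower(), [])
        if len(lemmas) == 1:
            resolved[token] = lemmas[0]
        else:
            # Empty list: not in the word list. Several entries: ambiguous form.
            needs_review[token] = lemmas
    return resolved, needs_review

resolved, needs_review = lemmatize(["ecclesiam", "regis", "Verulamium"], WORD_LIST)
print(resolved)
print(needs_review)
```

Everything that lands in needs_review is what still has to be settled by going back to the text.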

The other part of the project I was focusing on is collecting all the data necessary for creating an interactive map of Bede’s England. I created a spreadsheet with a list of all the places mentioned in Bede using Plummer’s index of place names and found coordinates of the corresponding modern places. This process had its own challenges: sometimes it would not be clear where a place mentioned by Bede was located. When that was the case, I engaged in extensive googling and searching through various commentaries on Bede’s text hoping to find a note on the corresponding modern location (sometimes the name mentioned in Bede and the modern English name of the place don’t sound at all alike – for instance, Verulamium is now called St Albans). After that I verified the coordinates of each of the places and added links to Pleiades and/or PastScape for the locations that have an excavated medieval site. The spreadsheet was then uploaded to ArcGIS, resulting in the following map (pretty cool, right?):

The next step I will be focusing on is adding a layer with all the rivers.
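For anyone curious what the place-name spreadsheet described above might look like as a file, here is a minimal Python sketch that checks coordinates and writes a CSV. The column names are my own invention, not the project’s actual headers, the single row is illustrative with approximate coordinates, and the range check only catches obvious typos; it does not confirm that a location is correct.

```python
import csv
import io

# Hypothetical column names for the place spreadsheet.
FIELDS = ["bede_name", "modern_name", "latitude", "longitude"]

# Illustrative row only; the coordinates are approximate.
places = [
    {"bede_name": "Verulamium", "modern_name": "St Albans",
     "latitude": 51.75, "longitude": -0.34},
]

def invalid_rows(rows):
    """Return rows whose coordinates fall outside valid latitude/longitude ranges."""
    return [r for r in rows
            if not (-90 <= r["latitude"] <= 90 and -180 <= r["longitude"] <= 180)]

def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

print(invalid_rows(places))
print(to_csv(places))
```

Keeping latitude and longitude in their own columns is what lets a mapping tool turn each row into a point.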

Organization: Yes, it really works.

Seventh week is beginning (did anyone else just go into fight or flight mode after reading those words?), which means that I’ve now had well over a month to settle in to my first term as a DHA. Initially, I was going to write that I had spent the first several weeks of this job learning the ropes and getting the hang of how it goes (because I have indeed learned a great many things about a great deal of stuff), but then I realized that that’s not really true. More accurately, I’d say that I’ve jumped in headfirst, taking a “sink or swim” approach to this new job, so now seems like a great time to come up for air and do a bit of reflecting.

In short, I think I can confidently say that I have not utterly failed (I joke, I’ve actually done quite well). I owe this success in large part to the people I work with, who are intelligent and always helpful. But there’s another tool that has been key in learning quickly how to tackle a new project: the documentation.

The nature of student jobs and participation in organizations is that the turnover is fast – jobs and organizations are looking for new employees and members every year, so making training efficient can be essential. I’ve participated in some student organizations where it seems like every week we’re saying, “I’m pretty sure so-and-so did a project like this a couple years ago, but then they graduated…do you have any idea how we could get their contact information to see if they’ve still got that information? Or maybe I still have an email about it from freshman year…” Yes, sifting through emails from 2014 is one of the warning signs that something went wrong…

Student turnover can be a logistical nightmare, but this job has been proof that it doesn’t need to be. It’s been so easy for me to access project history, familiarize myself with all the relevant tools and information, and then quickly jump into new projects. Not all of it is perfect, but the effort was made and I am reaping the benefits. If better (or any) documentation is something you think your job or organization could benefit from, here are some tips I’ve gleaned from my experience:

  1. Be specific and consistent when naming documents. “Meeting 3/5/15” is not helpful. “Initial Meeting with Web Developer” is helpful.
  2. Note specifically who has done what. If you need to contact someone about work they’ve done, you don’t want to be guessing between 10 different people.
  3. Take the time to organize. The point of documentation is that it makes everyone’s lives easier down the road. Make everything easy to find using section headers and bulleted lists.
  4. Note things that did and didn’t work. For example, if you’re reflecting on how an annual event went, a note that says “Next year get catering order in 1 week before event” can save planning time and prevent disasters in the future.
  5. Keep everything in one place. This seems obvious, but it can be very easy for things to go missing, especially if there are a lot of people working on one project. For example, having one big Google Drive folder ensures that everyone knows where and how to access everything.

Now go forth and document!

Something to Think about When You Work on DH Remotely

This past winter break I was in China, where some websites and web services are not available. My days were a little bleak without YouTube and Facebook. The worst part was that I had no access to my Carleton Gmail account or Google Drive. This made my task of uploading articles to the Journal of Historians of Netherlandish Art (JHNA) particularly challenging – it was hard to contact my supervisor and the authors, and it was impossible to view the JHNA articles, figures and upload guide since they are all stored on Google Drive. I was glad that I realized this problem before I left campus. I tried to resolve this issue by giving my supervisor an alternative email address and installing Carleton GlobalProtect VPN. However, the VPN didn’t work for the first few weeks of my break. Here’s a rundown of the problems I faced, and how I solved them:

 

(the ones marked with [FAILED] failed and are thus not recommended):

First 2 days of my break

  1. [FAILED] Sat and cried

Week 2

Using my personal email account, 

  1. contacted my supervisor, explained the situation
  2. [FAILED] filed an ITS report, which was automatically rejected because it was not sent from a Carleton account
  3. given that (2) had failed, reached out to all my friends who work for ITS for help

Week 3 – 4

  1. VPN troubleshooting with tremendous help from ITS
  2. [FAILED] Experimented with some random free VPNs found using Baidu (a Chinese search engine), which gave my laptop some virus problems
  3. Switched to Yahoo (relatively more reliable than Baidu) and found a few highly ranked VPNs in the iOS App Store
  4. Tried one of the VPNs, “Betternet VPN”. It’s free, and though slow, it works! *Note: This app can only be found in the American App Store. An American Apple ID is required.
  5. ITS made some backend updates and GlobalProtect started to work for me!

 

To summarize, if you are traveling outside of the United States and are likely to have similar problems, consider doing these in advance:

  1. Install Carleton GlobalProtect VPN (or other trusted VPN software)
  2. Friend and bribe ITS workers
  3. Give people you need to contact an alternative email address that you have access to in any network environment
  4. Save as many things you might need from Google Drive to….your hard drive?…as possible
  5. Download other reliable VPNs to your tablets just in case GlobalProtect fails you

And lastly, when you are on your trip and find that none of the above helps, please be patient and stay positive – there’s always going to be a way out; otherwise, enjoy your days free of digital distractions!