What Exactly Does the Internet Know About You?

I get it – Facebook, Twitter, Google – they own me. They have all my data: the ads I click, the things I search, the pages I visit. The implications of this lack of privacy have been unfolding slowly, and without much sense of urgency, until the recent Cambridge Analytica revelations. Now people are realizing, too late, how valuable their data is.

But here’s the thing – although I understand that the broad implications of this privacy breach are very serious, on a personal level I just find it, quite frankly, a little difficult to care. I’m right on the edge of the generation that was thoroughly indoctrinated by the internet from Day 1. I’m too young to remember dial-up, but old enough to remember when the iPhone came out; one of my earliest memories is my parents getting their first cell phones (my mom had the iconic and beloved Motorola Razr), but I never hung out in AIM chat rooms or had a MySpace. So yes, I vaguely remember a world without internet surveillance, but I came of age in the midst of this new era, so for me it’s just reality; it’s the price you pay for (monetarily) free social media and access to unlimited amounts of information. Anyone younger than me won’t even know life without this surveillance. If nothing else, it’s mildly comforting to know that Google’s got everyone’s data, not just mine.

But not caring is a dangerous pattern to fall into, because it’s fine until it’s not fine. It’s fine when Facebook just knows that I like watching videos about artisanal chocolate making, but it’s not fine when widespread demographic targeting influences a presidential election, which is to say, it’s fine until we notice it. And at that point it’s too late.

The fact is, unless you’re willing to become a recluse or forgo many of the incredible advantages of the internet and mobile technology, there isn’t an enormous amount any of us can do except be careful about what we post, click on, and search for (which you should always be doing anyway). But the one thing you can do is stay educated. In the wake of the Cambridge Analytica scandal, I believe it’s important for everyone to know exactly what information various sources have on you. It won’t stop them from using it, but it may make you aware of how targeted advertising is affecting your online experience.

Most social media outlets allow you to download an archive of the data they have on you; they just often make it very difficult to find. Here’s a guide to how to get some of that data:

To get all the data Facebook has on you…

Go to Settings > in tiny print at the bottom of your settings click “Download a copy of your Facebook data”

To find out how Facebook categorizes you…

Go to Settings > Ads (on the left sidebar navigation) > Your Information > Your Categories

To get all the data Google has on you…

Go to myaccount.google.com > Control Your Content > “Create Archive” > Pick what you want in the archive and click “Next” > Choose file settings and click “Create Archive”

*A note about Google’s data archive: For all the work Google puts into making sure your Google calendar syncs seamlessly with your Google Gmail and your Google Docs are all stored in one big happy Google Drive, Google clearly isn’t invested in making sure your Google archives experience is just as convenient. A lot of important and interesting Google data, like your entire search history, is just tossed into a JSON and handed over to you. What about the vast part of the population that doesn’t know what a JSON is? Doesn’t know how to read a JSON? Doesn’t know how/have the tools to open a JSON on their machine? Google, you could do better.
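For anyone stuck at the "how do I even open this" stage, a few lines of Python are enough to peek inside a JSON file. This is only a minimal sketch: the file name `search_history.json` and the helper `summarize_json` are hypothetical, and the code makes no assumptions about how Google actually lays out its archive. It just reports whatever top-level structure it finds.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Return a short, human-readable description of a JSON file's top level."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        # A JSON object: list its keys so you know what to look at next.
        return f"object with keys: {', '.join(sorted(data))}"
    if isinstance(data, list):
        # A JSON array: report how many records it holds.
        return f"list with {len(data)} items"
    return f"single value of type {type(data).__name__}"


if __name__ == "__main__":
    # Hypothetical file name; point this at a file from your own archive.
    target = Path("search_history.json")
    if target.exists():
        print(summarize_json(target))
```

Once you know whether the file is a list of records or an object with named sections, it is much easier to decide what to read next.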

To find out how Google categorizes you and what ads they think you like…

Go to adssettings.google.com.

To get all the data Twitter has on you…

Go to Settings and Privacy > Your Twitter data (left sidebar navigation) > Scroll all the way to the bottom and click the small print that says “Request Your Data”

To get all the data Snapchat has on you…

(Don’t panic – this doesn’t include every snap you’ve ever sent. It’s mostly account info and statistics, ads you’ve interacted with, and timestamps of every snap you’ve sent, with the actual photos redacted. Oh right, it does include Snaps you’ve recently submitted to Our Story, though. Every. Single. One.)

Go to accounts.snapchat.com > Click “My Data” > Scroll to the bottom and click “Submit Request”

Digital Humanities in the Classroom

For my Anthropological Thought and Theory class midterm, we were assigned to do a visual midterm. What is a visual midterm, you ask? Well, in this particular case, it was a timeline, map and genealogy midterm that aimed to understand the broader arc of the development of anthropology as an academic field through the contexts of time, place and relationships, but in a visual manner. As soon as I read the requirements for this assignment, I thought about utilizing my digital humanities background to create a really cool project. In the end, there were key successes with these projects, as well as fails (some due to my own procrastination, others just completely out of my control), so with this post, I will take you on my journey of using my job as a DHA in the actual classroom.

First, what exactly was the project asking for? The project needed to consist of three visual elements:

1)    a timeline showing the place in history of each of the authors we had read so far in class, their works, and, if possible, their fieldwork;

2)    a map showing where the authors are from, and where they did their fieldwork; and

3)    an intellectual genealogy tracing who studied with whom, who positively influenced whom, and who critiqued whom.

The first tool that came to mind was ArcGIS. My original project idea was to create a map with different layers, each layer indicating some attribute, such as important anthropology schools, field sites, and birthplaces of each anthropologist I was to include. The idea, then, was to create a story map with a timeline component using my previously created map from ArcGIS. I have previously used ArcGIS, so I had some experience and knowledge of how to use it; however, my plan was unsuccessful.

I realized that the project using ArcGIS was going to be more time-consuming than I had anticipated. Before even starting to use ArcGIS, I needed to do extensive research on the anthropologists, including finding pictures that I could use for each person, and adding some other important facts. That in and of itself took some hours of labor, and then inputting them into a coherent set on ArcGIS was going to be even more time-consuming. Thus, I decided to use different digital tools. I think that, had this been a two- or three-person project, perhaps I would have continued with my original plan. But as the sole project team member, I found it was not an impossible task, but rather an impossible task to complete by the deadline (especially given that I had another major assignment due a day after).

One other tool I considered using was Palladio. I thought it was a perfect way to do the genealogy tracing portion of the assignment; however, I quickly realized that it was impossible to figure out how to use. Why, you ask? I think this all comes down to the fact that Palladio does not have a clear set of instructions on how to use it and how to format the data. If you use the sample data, then it works great. But then you are left at a loss as to how to organize your data from scratch.

So having abandoned my two original ideas, I was desperate to find another tool. Then I remembered TimelineJS. With TimelineJS, I could satisfy two of my visual midterm requirements: the genealogy tracing (by grouping the different schools/thoughts and color coding them) and creating a nice, succinct timeline (with pictures!) of every anthropologist I was including in my project. The wonderful thing about TimelineJS is that they provide their own template on Google Sheets for adding data, and it is customizable! So TimelineJS it was! The slightly cumbersome part was just inputting the individual anthropologists’ data into the template; it was somewhat time-consuming because I had a lot of information to input, but not a difficult task by any means. With every project, however, there’s always room for mistakes, such as confusing the pictures of two anthropologists when inputting the information, but it’s nothing that cannot be easily fixed.

Having figured out my timeline problem, I was still left with another aspect of my project unsolved: I needed a map! And thankfully, Google Maps came to the rescue. By this point, I had all of the data in a nicely organized template (thank you TimelineJS!) so it was easier to simply input the data into Google Maps and create my own map story. I still had to do some modifications to the data, but the process was more straightforward. Google Maps was also easier to use, and I was able to add the layers of information I originally wanted.

Doing this project taught me a lot of important lessons about digital humanities work, including the importance of being flexible to change and compromise, because oftentimes tools will not have all the components you need in a project, and other times you simply do not have the time to do the project like you envisioned it.

Link to project: Anthropology Thought and Theory 

Presenting Interdisciplinary Research

This winter term, I double compsed (for any non-Carleton readers: “comps” is the equivalent of a senior thesis or capstone project – it stands for “comprehensive exercise”). For both of my comps, one in computer science and one in English, I was lucky enough to have the opportunity to do digital humanities projects, but this posed a problem when I was required to give a presentation for each project at the end of the term.

For both presentations, my audience was a mix of humanities people, computer science people, and people who lie somewhere in between. How do I give a presentation that accommodates my entire audience? How do I explain the tech to the humanities folks, and contextualize the humanities for the tech folks?

Here are some rules for interdisciplinary presentations that I created for myself while planning my comps presentations:

Either explain jargon or put it in a black box. Combining tools from multiple disciplines is going to cause a vocabulary problem. You can’t say, “I ran text files of each novel through a Python script that used the NLTK’s POS-tagger to tag each word, then iterated over the tagged tuples to count occurrences of different parts of speech,” and expect anyone who’s never coded before to follow. Either take the time to explain what the NLTK’s POS-tagger is, or just say “I used a tool to get the part of speech of every word in the text.” The same goes for humanities lingo – make sure your entire audience clearly understands what close reading or deconstruction is before using those terms to contextualize your results.

Signpost. In an interdisciplinary presentation, it’s not unreasonable to expect that at least part of your audience is going to get lost at some point. Unless you’re going out of your way to explain every STEM concept and humanities context (which would make for a very long, very boring presentation), at some point someone is going to get lost. But that’s ok! Divide your presentation into clearly defined sections, and at the beginning and end of each section, talk about what you’re going to or have just explained, so that everyone can grasp the broader concepts. Even if someone gets lost within a section, with signposting they’ll hopefully be able to jump back in in the next section.

Include something for everyone. If you’re giving an interdisciplinary presentation, it should be truly interdisciplinary! Acknowledge the different subgroups of your audience and make them feel like they are a part of the conversation by including details from each discipline of your project, and not over-explaining as if they weren’t there. This rule almost contradicts my first rule, and the two can be hard to balance. The goal is to find a happy medium for each discipline between including enough interesting detail for the experts and enough explanation for those unfamiliar with the discipline.
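To make the jargon example from the first rule concrete, the part-of-speech counting it describes might look roughly like the sketch below. This is not the actual comps script: the function names are made up for illustration, and the sample sentence is tagged by hand so the counting step runs without NLTK installed. The `tag_text` helper uses NLTK's `word_tokenize` and `pos_tag`, which need the NLTK package and its tokenizer and tagger data downloaded.

```python
from collections import Counter


def count_parts_of_speech(tagged_words):
    """Count how often each part-of-speech tag appears.

    `tagged_words` is an iterable of (word, tag) tuples, which is the
    shape that NLTK's `pos_tag` returns.
    """
    return Counter(tag for _word, tag in tagged_words)


def tag_text(text):
    """Tag raw text with NLTK (requires nltk plus its downloaded data)."""
    import nltk  # imported lazily so the counting step works without NLTK

    return nltk.pos_tag(nltk.word_tokenize(text))


if __name__ == "__main__":
    # Hand-tagged sample (illustrative only), so no NLTK install is needed.
    sample = [("The", "DT"), ("whale", "NN"), ("swam", "VBD"),
              ("away", "RB"), ("from", "IN"), ("the", "DT"), ("ship", "NN")]
    for tag, count in count_parts_of_speech(sample).most_common():
        print(tag, count)
```

Whether you show an audience something like this or just say "I used a tool to get the part of speech of every word" depends on who is in the room.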

Trying and Learning New Things

As this term draws to a close, I’m pausing to consider the work I’ve done. Looking back, this term has been an interesting mix of new tasks and the continuation of previous ones. A small example of this is social media. I’ve been in charge of the DHA Twitter account for a little while, but this was the first time I began to use a tweet scheduler – same task, but a new method. (Side note: I love the tweet scheduler! I can write up tweets once a week and not have to worry about forgetting to send them at the right time!)

My work on Team Workhouse this term is similar. I’ve been involved in Team Workhouse for almost two years now, but this term I took on new tasks in the Workhouse project. The first new task I took on was being a Teaching Assistant (TA) for the History course, Bringing the English Past to Virtual Life (explore the course blog!), and as part of that I both attended class and held office hours for student help. Attending class as a TA and providing in-class help were both totally new experiences for me, and I explored some thoughts about being a TA in a recent blog post that you can read here! This term I also continued working on the Virtual Workhouse Digital Archive Omeka site, where last term and over winter break I did extensive work with the metadata of the collections housed on the site.

A draft mock-up of a possible layout for the Virtual Workhouse Digital Archive Collections page

This term, however, I had a go at wireframing for the site. If you don’t know what wireframing is (which I didn’t before I did it), it is essentially sketching out the basic layout of a website in order to have a concrete idea of what you want it to be before actually working on the website itself. I tried it out using Balsamiq (a wireframing tool) and really enjoyed it! It was fun to not just react to technology but think more purposefully about what the goals of the site were and how to design the layout to best accomplish those goals.

Carleton’s Undergraduate Journal of Humanistic Studies

Something entirely new I’m about to start working on is learning LaTeX. I am now a board editor on Carleton’s Undergraduate Journal of Humanistic Studies and I’m going to be working on the website (which I know how to do) and typesetting the papers chosen for the journal – this uses LaTeX, which I don’t know how to do. I don’t have any experience with LaTeX, but I’m excited to start learning. If there’s anything that I’ve learned from my work as a DHA, it’s that there’s always something new to learn!

Learning How to TA

This term, I’m making my first foray into the world of being an in-class Teaching Assistant (TA). In past terms I’ve worked as an out-of-class TA, holding office hours and offering outside support, but this is my first time actually attending class. This means that there are some new things for me to figure out, but there are also some things that I learned from being a TA last term that still apply.

Ana and I were out of class TAs for a Classics course last term and I learned some important things from that experience. One thing I always try to do now when I’m working with a student is check what they do know. Immersed as we are in the world of metadata, I didn’t think to explain what metadata itself was. But pretty early on we got that question – what exactly is metadata? And once we got that question, it made sense. Metadata was not something they were studying in class, so there was no expectation that they would know what it was. After that, I made sure to check with students what they knew about Omeka and metadata first, so I would know where to start that would be most helpful. Because of course there is also the flip side to this problem – if a student is familiar and comfortable with metadata, there’s no need to explain it. So I always found it most helpful to check first before beginning any explanations, so I could meet the student where they were.

An aspect of being a TA that is absolutely new to me is being in class with the students. On Wednesday there was time in class for students to work on an assignment in pairs. I was a bit shy about going up to the students when they were working, and at first just wandered and waited for someone to ask a question. I realized after a little while that actually approaching the students was more helpful. When I simply wandered past, the students wouldn’t ask any questions, but if I prompted them with a simple, “how’s it going for you?” they frequently would ask me a question. So although I was shy about asking them directly, it was more productive for both of us if I did. I’m still trying to get more comfortable in my new role, but I’m learning some good approaches along the way in order to provide assistance for both the professors and students in the most helpful way.

See some of the work the class has been doing on the blog!

Course Blog for Bringing the English Past to Virtual Life

Digital Humanities (Mini) Job Fair!

One of the struggles, I think, that we as DHAs have is conveying what our work is really about and what exactly constitutes Digital Humanities. A lot of people on campus still don’t know what Digital Humanities is, let alone that we have a department here. Many people are often confused when I say I work as a Digital Humanities Associate, and I always have to give a 30-second elevator pitch about what my work entails. With that in mind, I suggested the idea of having a Digital Humanities social/job fair as a way to expose students on campus to what we do as DHAs. Every year there are new students who are hired, so I think this could be a great way to motivate other students to apply to the jobs that may otherwise go unnoticed, or to at least learn what Digital Humanities is all about!

Tyler, another of our DHAs, and I are now in the process of planning the event. However, instead of just focusing on DHAs, we are also hoping to have students from other digital/tech related jobs at Carleton, such as our very own Digital Scholarship Interns, as well as Academic Technology assistants. We hope that this event will be a mini-job fair and that Carleton students can learn more about our jobs, and perhaps apply to these jobs next year.

End of Term Reflection – Communication!

I’ve known for a while that communication is really important when working in teams, but this term really drove home for me how crucial it is. I spent a good amount of time this term working on prepping and uploading archival images of workhouse documents from London in the late eighteenth and early nineteenth centuries as part of the Virtual Workhouse project (you can see the fruits of our labors here!). Tyler and I were both working on these spreadsheets so there were a lot of moving pieces, with having to keep each other in the loop about our progress, checking in with Austin and Sarah about problems we were running into, and talking with Susannah about unexpected issues we came across (like an index in the front of two of the volumes!).

Trello, which is a new piece of our workflow this year, was very helpful for me. Tyler and I could record what we had done each time working on the spreadsheet in a shared place to keep track of our progress. I could tell him that I had finished the titles in Volume 3 but ran into problems with the dates, and he could tell me that he had finished the identifiers in the same volume and fixed the dates. This way we could keep track of what we had finished so neither of us was doing work the other already had.

A screenshot of the Omeka CSV Import Plus Plug-in that we used extensively in our work

Another aspect of communication that I found really important was communication with myself (essentially, documentation). Since we were dealing with the spreadsheets of six volumes of a workhouse minute book, there was a lot of data and a lot of images. I could not count on my memory to keep track of things. Even if I noticed that image 84 in Volume 2 had to be discussed, there was very little chance I would remember that. So I had to make sure to write it down for both myself and others and clearly state exactly why it had to be discussed. The same was true of meetings. I met with Susannah to ask about unexpected pages, and then brought that discussion to a meeting with Tyler and Sarah. Without my notes from the meeting with Susannah, I would not have been able to remember what we had talked about and what her suggestions had been. Likewise, I wouldn’t have remembered our decisions from meeting with Tyler and Sarah to implement them on the spreadsheets. While I’ve known for a while about the value of making good notes to communicate with both others and myself, the high volume of data that we have been working with for the workhouse minutes has driven home to me the absolutely critical nature of such documentation.

Failing to Map Historical Maps

Building off of Martha’s previous post, I’m going to discuss some challenges of mapping projects with old maps. Old maps pose challenges to digital projects. In particular, the spatial arrangements of many old maps don’t match modern day maps of the same area. A path made up of geographic coordinates (such as on Google Maps) is not guaranteed to be compatible with old maps. In addition, there is a fine line in many of these maps between maps and city views, especially in many early modern European prints.

One example of these map/city-views that is useful to think with is the woodcut of Rome from the 1493 Nuremberg Chronicle.

A full page spread image of Rome
A woodcut image of Rome from the 1493 Nuremberg Chronicle

This woodcut image clearly shows Rome – in addition to the label (which I will expand a bit on later), there are many recognizable sites – the papal palace, the Castel Sant’Angelo, the Pantheon, and the Colosseum. It is not a simple city view – the geographic relation between these sites shows the general layout of the city. However, trying to plot a path (or even just points) on a georeferenced version of this map is not feasible.

Georeferencing warps the map image, trying to get it to fit the points on the image to relevant points in the real world. Sometimes this warping can be very extreme, especially for certain kinds of transformations:

The map is warped beyond recognition in the attempt to georeference the map.
Mid-process attempt to georeference Rome

The control points (which link the image to geographic points) on this map do not line up well, since the spatial arrangement of the map/woodcut image does not line up with the sites’ geographic locations. This image shows the difference between the points on the 1493 map and the present day map, shown as blue lines.

Rome map georeferenced with visible control points that don't line up with the historical map
Rome map and control points
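One way to put a number on how badly control points line up is to fit a transformation to them and measure what is left over. The sketch below assumes the simplest case, an affine transformation fitted by least squares. That choice is an assumption for illustration: this post does not say which transformation was used, and georeferencing tools offer several others. The function names and the sample coordinates are made up.

```python
import math


def solve_3x3(m, v):
    """Solve the 3x3 system m * p = v by Gauss-Jordan elimination with pivoting.

    Fails with ZeroDivisionError if the system is singular, for example
    when all control points lie on one straight line.
    """
    a = [list(row) + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_affine(image_pts, world_pts):
    """Least-squares affine fit from image (x, y) to world (X, Y).

    Returns [(a, b, c), (d, e, f)] with X ~ a*x + b*y + c and
    Y ~ d*x + e*y + f.
    """
    rows = [(x, y, 1.0) for x, y in image_pts]
    # Normal equations: (A^T A) p = A^T t, solved once per output axis.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    coeffs = []
    for axis in range(2):
        atb = [sum(r[i] * w[axis] for r, w in zip(rows, world_pts))
               for i in range(3)]
        coeffs.append(tuple(solve_3x3(ata, atb)))
    return coeffs


def residuals(image_pts, world_pts, coeffs):
    """Distance between each transformed control point and its target."""
    (a, b, c), (d, e, f) = coeffs
    out = []
    for (x, y), (wx, wy) in zip(image_pts, world_pts):
        px, py = a * x + b * y + c, d * x + e * y + f
        out.append(math.hypot(px - wx, py - wy))
    return out


if __name__ == "__main__":
    # Made-up pixel and map coordinates, for illustration only.
    image_pts = [(0, 0), (100, 0), (0, 100), (100, 100)]
    world_pts = [(10, 20), (30, 20), (10, 40), (35, 47)]
    fit = fit_affine(image_pts, world_pts)
    for r in residuals(image_pts, world_pts, fit):
        print(round(r, 2))
```

Large residuals are the numerical counterpart of the blue lines: no single simple transformation can make a drawing like this agree with the ground.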

Another problem with a map like the Nuremberg Chronicle woodcut is that we don’t know what all the landmarks are. There are many church structures, but only a few are labelled. In addition, there are features that seem to be missing – for example, it is difficult to distinguish Tiber Island on the woodcut, which is a major landmark. Furthermore, the scale of the buildings poses problems. It is relatively easy to use the center of the Pantheon as a point on a present day map, but where on the woodcut image is the “center” of the Pantheon? In all, attempting to georeference and plot points on a map such as the Nuremberg Chronicle woodcut image of Rome is frustrating, inaccurate, and ultimately provides no additional insight. In fact, the extreme warping of the image makes it more difficult to understand the map and the data represented in relation to it.

My attempt to chart a path on the Nuremberg Chronicle map

The closest point is the Colosseum – nothing else lines up very closely at all (the path is supposed to go from the Colosseum to the Pantheon, to one side of the Ponte Sant’Angelo, to the Castel Sant’Angelo, to the Vatican). The result is not illuminating and does not contribute any new knowledge, and in fact doesn’t serve either purpose well; it is difficult to interpret the Nuremberg Chronicle map, and it is almost impossible to know which landmarks are denoted by the path.

The path of points through Rome does not line up with the location of those points on the Nuremberg map
A path through Rome on top of the Rome Nuremberg Chronicle

Can we trust the labels?

In the Nuremberg Chronicle, no, we can’t! Rome is clearly correct – the sites confirm the label. Other cities are like this, including, for instance, Krakow, which gives specific labels of a part of the city and reflects the city layout. However, for more minor cities, the Nuremberg Chronicle often uses the same woodcut for different cities.

 

In fact, the same woodcut is used for nine images (see below). These images include: Napoli, Perugia, Mantua, Ferrara, Damascus, Bena (I’m not sure what city this refers to), a German province (not sure about this one either), Spain, and Macedonia. While Italian cities may have similar styles, I cannot accept that Napoli, Damascus, Spain and Macedonia literally looked like these woodcuts. Therefore, not only do these images provide difficulties in terms of spatial alignment, but we also cannot always accept them at face value, because they may be more a generic representation of a city than a visual representation reflecting an actual cityscape.

Sources:

The color woodcut images come from the digitized University of Cambridge Nuremberg Chronicle (CC BY-NC 3.0): (the page numbers are: Bena 80r, Damascus 23v, Ferrara 159r, Macedonia 275r, Mantua 84r, Napoli 42r, Germany? 284v, Perugia 48v, Spain 289v).

The woodcut image of Rome is from the digitized copy in Morse Library, Beloit College. Last accessed 16 October 2017.

RBMS/BSC Latin Place Names File provided help with place names.

To Map or Not to Map?

During this year’s fall DH training, the DHAs got some practice using ArcGIS, an online mapping tool that makes it relatively easy to create your own customized maps in one sitting. This post discusses some of the pros and cons, advantages and pitfalls of mapping data. (Note that by mapping I am referring strictly to the use of geospatial maps, not to the more general application of the term that includes graphing.)

Why use a map? Mapping is fun and exciting, and it’s a relatively easy way to build a data visualization that’s interactive and easily facilitates instantaneous spatial comprehension of the data. For these reasons, people are often quick to jump on the “let’s map it!” train whenever there is spatially relevant data. But it’s important to stop and ask this question first: what will a map add to this project that other data visualizations will not? Sometimes, sparsity or lack of variation in your data should disqualify the map idea.

Take this example from Stanford’s Professor Martin Evans, which maps specific locations in and around London that are referenced in works written by authors from London. There’s an abundant amount of data in this data set, and the locations are spread all over London – mapping helps us understand the data, so mapping was a good choice. If, however, you were mapping only locations in London referenced by Sylvia Plath, you might think twice about whether the <10 data points clustered in one small location are worth putting on an interactive map.

Once you’ve determined that a map is worth your time, you might next consider what kind of spatial information you want to convey. Is the data represented well by points on a map? Or is there a path or order to these points? How can you visually differentiate between different paths or groups of points (hint: colors)? Try to create a map that accurately visualizes the story you’re trying to tell with your data. In this example, students at the Georgia Institute of Technology recreated the paths taken throughout the day by characters in Mrs. Dalloway. The smooth, continuous paths tell a better story than a series of sequential points would, and the colors make each path stand out from the others. Above all else, mapping should make it easier for your audience to understand your data, so think hard about how you’re transferring your data to your map. And use colors!

Don’t forget that an important part of mapping is the base map itself, not just the points you put on it. Much of the time, simpler will be better – if the story you’re trying to tell has nothing to do with the terrain of the area, don’t clutter your visual with a terrain base map. Humanities scholars are often excited about using historical base maps, which are historical maps that can be georeferenced onto a modern, digital map of the same location by matching specific points between the two maps. One common problem with historical base maps is that many historical maps are not geographically accurate, so georeferencing them can stretch and distort them to an unusable extent. For example, this 1853 map of Maine from the David Rumsey Map Collection is quite geographically accurate, and would work well as a georeferenced historical base map, but this 1935 world map of post office and radio/telephone services from the same collection is highly geographically inaccurate and would have to be significantly distorted to be georeferenced onto a modern 2-dimensional map of the world.

Finally, consider how you will communicate the data for each point or path on your map. Points and paths don’t always speak for themselves, and there will often be metadata or a paragraph of information that necessarily accompanies each data point. How will your user access this information? Is there a key that goes with the map? Do you click on a point to reveal the associated text? Does each point link to more information?
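One common way to keep that accompanying information attached to each point is to store it alongside the coordinates, for example as GeoJSON, a format many web mapping tools can read. The sketch below is only an illustration: the helper names, the descriptions, and the property keys `name` and `description` are my own choices, and the coordinates are approximate.

```python
import json


def make_point_feature(name, lon, lat, description):
    """Build a GeoJSON Feature for one point.

    GeoJSON orders coordinates as [longitude, latitude], the reverse of
    the "lat, lon" order many people expect.
    """
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        # Free-form metadata travels with the point in "properties".
        "properties": {"name": name, "description": description},
    }


def make_collection(features):
    """Wrap a set of features in a GeoJSON FeatureCollection."""
    return {"type": "FeatureCollection", "features": list(features)}


if __name__ == "__main__":
    # Approximate coordinates and placeholder text, for illustration only.
    points = [
        make_point_feature("Pantheon", 12.4769, 41.8986, "Text shown on click."),
        make_point_feature("Colosseum", 12.4922, 41.8902, "Text shown on click."),
    ]
    print(json.dumps(make_collection(points), indent=2))
```

How the text in `properties` reaches the user (a pop-up, a side panel, a link) is still a design decision for the map itself.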

There are many ways to address the above issues and questions, facilitating lots of creativity and flexibility within each project. Above all else, no matter how you approach a mapping project, your map should always give a clear and intuitive answer to the question: what story is this map trying to tell?

Welcome Back!

Last week I arrived early on campus to participate in the fall term DHA training. I didn’t get to take part last year because I was abroad in the fall, so it was a new experience for me. There’s one difference between this year and last year that immediately stands out – since I was able to attend the training this year, I had an opportunity to work with and get to know the other DHAs and new DH interns before the term officially started. This was my favorite part of training, and I’m hoping it’ll get us off to a great start this year. I find it so much easier – and more fun! – to work with others when we’ve already eaten deep-fried food from Jesse James Days by the Cannon River together.

At the end of spring term last year I attended the digital humanities conference that I had spent all of spring term helping to organize. Although I was one of only a few students who attended and felt initially intimidated by the sea of “real adults,” I became increasingly aware throughout the course of the conference that I knew what I was doing. I understood a lot of the jargon, I was able to intelligently contribute to conversations, and, most importantly, I felt like I deserved my place in those conversations. In short, it was really cool. A year ago, I wouldn’t have been able to do that. This year, I’m excited to build on that confidence as I expand my DH toolbox. Not too long from now I’ll have to leave Carleton to join the leagues of “real adults,” and I think some confidence will come in handy.