Learning about NLTK

In the past couple of weeks I’ve been helping to update the curriculum for a fantastic project called DH Bridge. This curriculum includes a one-day programming bootcamp for people with no computer science experience (and particularly those who are also involved in the humanities) to learn some basic Python skills. I’ve had so much fun doing the tutorial along the way because it focuses on text analysis using the Natural Language Toolkit (NLTK), which I wasn’t previously familiar with but which includes some really cool tools for natural language processing. NLTK is a free Python library, so you can download it and use its many modules to do text analysis day and night! Here are a few of the things I learned:

  • NLTK has a built-in way to get word frequencies (FreqDist and its most_common method): it’ll spit out the n most common words in a text (you decide what n is) along with the number of times that each word appears, in order from most to least frequent. Nothing too complicated, but it’s a great (and very useful) starting place.
  • Want to see the context in which a certain word appears throughout a text? The concordance method takes a single word as a parameter and prints out each instance of that word within its surrounding text. For example, here’s every instance of the word “trial” in Harper Lee’s To Kill a Mockingbird.

This is a great way to get a sense of how a word is being used throughout a text without having to Control+F your way through the whole thing.
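Here is a minimal sketch of the two tools described so far, word frequencies and concordance. The sample sentence is made up for illustration (it is not from the novel), and the regex tokenizer is an assumption chosen so the example runs without downloading any extra NLTK data; the DH Bridge tutorial may tokenize differently.

```python
import re

import nltk

# Made-up placeholder text; in practice you would read in a full plain-text file.
raw = (
    "The trial began on Monday. Everyone in town came to the trial, "
    "and the newspaper wrote about the trial for weeks."
)

# Simple regex tokenizer (an assumption): lowercase the text and keep runs of letters.
tokens = re.findall(r"[a-z']+", raw.lower())

# Word frequencies: most_common(n) gives (word, count) pairs, most frequent first.
freq = nltk.FreqDist(tokens)
top_two = freq.most_common(2)
print(top_two)

# Concordance: prints each occurrence of the word with its surrounding text.
text = nltk.Text(tokens)
text.concordance("trial", width=60)
```

With a real book you would probably want to filter out very common words like “the” before counting, since they tend to crowd out the interesting ones.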

  • This one, the similar method, is my favorite because I think it’s so cool. You give it a word and it lists the twenty words that are “most similar” to that word in the text. I haven’t looked too far into how it works, but the method somehow determines which words are most often used in a similar context to the given word. For example, here are the results for the word “trial” in To Kill a Mockingbird.

Some words, like “court” and “newspaper,” are pretty self-explanatory, but we may question why a word like “family” is so closely associated with the word “trial” in this novel.
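For the curious, here is a rough sketch of the idea behind “similar context,” written in plain Python rather than with NLTK. It assumes that a word’s context is simply the pair (previous word, next word) and that two words count as similar when they show up in the same contexts. NLTK’s own implementation may differ in its details, and similar_words is a hypothetical helper, not an NLTK function. In NLTK itself the call is just text.similar("trial") on an nltk.Text object.

```python
from collections import Counter, defaultdict


def similar_words(tokens, target, num=20):
    """Rank words by how many (previous word, next word) contexts they share with target."""
    # Map each word to a count of the contexts it appears in.
    contexts = defaultdict(Counter)
    padded = [None] + tokens + [None]  # None marks the start and end of the text
    for prev, word, nxt in zip(padded, padded[1:], padded[2:]):
        contexts[word][(prev, nxt)] += 1

    target_contexts = set(contexts[target])

    # Score every other word by how often it occurs in one of the target's contexts.
    scores = Counter()
    for word, ctx_counts in contexts.items():
        if word == target:
            continue
        shared = sum(n for ctx, n in ctx_counts.items() if ctx in target_contexts)
        if shared:
            scores[word] = shared
    return [word for word, _ in scores.most_common(num)]


# Made-up sample text: "trial" and "hearing" both appear between "the" and "was".
raw = "the trial was long . the hearing was long . the trial was public . the dog ran home ."
tokens = raw.split()
print(similar_words(tokens, "trial"))
```

Seen this way, a surprising result like “family” would just mean that “family” and “trial” tend to sit between the same neighboring words, which is not quite the same thing as being related in meaning.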

Even with these very simple searches, it’s already easy to see the kind of information you can get out of a text that the human eye wouldn’t necessarily be able to see. Yay digital text analysis!

How To Do Your Job When You Don’t Know How To Do Your Job

The cool thing about this job is that I get to constantly be doing new things and jumping into new projects. The flip side to this, however, is that each project is unique and requires very different skills – skills that I (very often? most of the time?) don’t yet have. So this term, I’ve been getting used to the fact that not having a skill to do a certain job doesn’t mean I don’t do the job; it means I get to learn how to do it. The question then often becomes, “Where do I even start to learn how to do X?” The following are some tips and tactics I’ve been working on using when I’m faced with a daunting task that I’ve never done before:

  • Just ask. This seems obvious, but it’s often much easier said than done. People don’t want to risk sounding dumb by asking questions, but 1) people probably won’t actually think you’re dumb, and 2) isn’t it better to ask and learn how to do something correctly than spend all your time doing it wrong?
  • Google it, but be smart about it. Again, this seems obvious, but Google is a gift and a curse. Be wary of bad advice (you wouldn’t cite a Buzzfeed article for an academic paper, so why should you take serious advice from it?), and think hard about the search terms you use (be precise, try a variety of related terms, etc.).
  • Pretend that you know what you’re doing. I love this tactic. Sometimes I know that I don’t know what I’m doing, but I don’t know what I don’t know, so I just start working until I get stuck in order to figure out where the problem is. It’s a really great way to pinpoint exactly what you don’t know.
  • Use sites that were created for these situations, like Lynda.com. If you’re a Carleton student, you already have a subscription! Even if you can’t find a video to explain exactly what you’re supposed to be doing, it can help you to get the hang of the general terminology relating to the task at hand or the basic functionality of a tool you’re learning to use.
  • Look for existing examples. Chances are you’re not the first person to do anything, so it’s a great idea to find examples of best practices and conventions. This is true for pretty much anything, but particularly when you’re doing something totally new.

Of course, the best part about not knowing how to do something is that you get to learn how to do it and then a week later when one of your colleagues doesn’t know how to do the same thing you get to pretend that you’ve known it all along and teach them how to do it! Such is the cycle of life. Remember, everyone’s just trying to fake it ‘til they make it.

Organization: Yes, it really works.

Seventh week is beginning (did anyone else just go into fight-or-flight mode after reading those words?), which means that I’ve now had well over a month to settle into my first term as a DHA. Initially, I was going to write that I had spent the first several weeks of this job learning the ropes and getting the hang of how it goes (because I have indeed learned a great many things about a great deal of stuff), but then I realized that that’s not really true. More accurately, I’d say that I’ve jumped in headfirst, taking a “sink or swim” approach to this new job, so now seems like a great time to come up for air and do a bit of reflecting.

In short, I think I can confidently say that I have not utterly failed (I joke, I’ve actually done quite well). I owe this success in large part to the people I work with, who are intelligent and always helpful. But there’s another tool that has been key in learning quickly how to tackle a new project: the documentation.

The nature of student jobs and participation in organizations is that the turnover is fast – jobs and organizations are looking for new employees and members every year, so making training efficient can be essential. I’ve participated in some student organizations where it seems like every week we’re saying, “I’m pretty sure so-and-so did a project like this a couple years ago, but then they graduated…do you have any idea how we could get their contact information to see if they’ve still got that information? Or maybe I still have an email about it from freshman year…” Yes, sifting through emails from 2014 is one of the warning signs that something went wrong…

Student turnover can be a logistical nightmare, but this job has been proof that it doesn’t need to be. It’s been so easy for me to access project history, familiarize myself with all the relevant tools and information, and then quickly jump into new projects. Not all of it is perfect, but the effort was made and I am reaping the benefits. If better (or any) documentation is something you think your job or organization could benefit from, here are some tips I’ve gleaned from my experience:

  1. Be specific and consistent when naming documents. “Meeting 3/5/15” is not helpful. “Initial Meeting with Web Developer” is helpful.
  2. Note specifically who has done what. If you need to contact someone about work they’ve done, you don’t want to be guessing between 10 different people.
  3. Take the time to organize. The point of documentation is that it makes everyone’s lives easier down the road. Make everything easy to find using section headers and bulleted lists.
  4. Note things that did and didn’t work. For example, if you’re reflecting on how an annual event went, a note that says “Next year get catering order in 1 week before event” can save planning time and prevent disasters in the future.
  5. Keep everything in one place. This seems obvious, but it can be very easy for things to go missing, especially if there are a lot of people working on one project. For example, having one big Google Drive folder ensures that everyone knows where and how to access everything.

Now go forth and document!

Martha Says Hello

Hi! My name is Martha Durrett, and I’m a junior Computer Science and English major at Carleton College. Computer science and English! What? “Well those don’t overlap at all,” you might say. In some ways you’re right…I certainly won’t be counting any of my CS classes for English credits. But in many ways that’s wrong – for example, digital humanities! Could I have found a more fun and engaging way to integrate my two majors?

As I learn more about digital humanities, I’m hoping to continue to break down that distinction between “English major” and “computer science major.” I want to discover fun and engaging new ways to integrate not just English and computer science, but any subject that piques my interest. Who says subjects need to be separate? (Finland doesn’t – check this out if you haven’t heard about Finland’s radical educational reform.)  Throughout the rest of the year, I’ll be thinking about how the digital world is changing expectations about how we’re supposed to learn about and interact with the humanities. If you ask me, the humanities (whatever that hefty term entails) have spent far too long hiding inside of textbooks, and it’s about time we did something new with them!