Scripting Reflection

Why scripting?

As institutions acquire born-digital materials and digitize analog materials at an ever faster pace, archivists often encounter tasks that are very time-consuming or tedious to perform manually. Scripting is one way to do these tasks quickly and accurately, at scale. Archivists have written such scripts themselves; some of them are seasoned developers, others are not.

While you don’t need to be a full-fledged programmer to work in a modern repository, having a basic understanding of what scripting can do can help you to at least imagine ways to better automate and manage repetitive processes.

For this activity, you’ll explore a real-life example of code used by a repository and translate it into pseudocode. Think of pseudocode as plain, conversational language that explains the steps a script takes to accomplish its task.


Steps

1. Select a script

  • In class, you’ll choose one Python* script from Ruth Tillman’s ArchivesSnake GitHub repository to work with. Each script performs a small job using the Python programming language and the ArchivesSnake Library. Try to choose one that is no more than 50 lines.

  • A Python script file will end with .py (e.g., download-subjects.py).


2. Get to know the script

  • Open your chosen script by clicking its name.
  • Read the script from top to bottom. Don’t worry at this point about understanding exactly what it’s doing. This is just a way to get acquainted with its length and any comments made by the author, and to see if anything makes sense at first glance. One of Python’s strengths is that it is written to be more human-readable than other scripting languages. Try to find those things!
  • Ruth conveniently wrote these scripts in “chunks”, in the same way written stories are broken up into paragraphs (with a line break between chunks). Try first to work out what you think each chunk is doing. Note: lines starting with # are known as “comments”. They’re there to explain what the code is doing! Please pay special attention to them. There are also descriptions for some of the scripts on her main ASpaceASnake page that might be helpful (just please note that not all scripts have a description).
  • Once you’ve tried describing what each chunk does, go back, look up any terms or script parts that are unfamiliar, and fill out your pseudocode.
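To make “chunks” and “comments” concrete, here is a minimal sketch of what a chunked, commented Python script can look like. It is not one of Ruth’s scripts: the subject terms and variable names are invented for illustration, and it uses only tools that come with a standard Python install.

```python
# Chunk 1: import the tools the script needs
import csv
import io

# Chunk 2: set up the data to work with
# (a hypothetical list of subject terms, invented for this illustration)
subjects = ["Letters", "Photographs", "Maps"]

# Chunk 3: write each subject term as one row of CSV text
# (kept in memory here rather than saved to a file)
output = io.StringIO()
writer = csv.writer(output)
for subject in subjects:
    writer.writerow([subject])

# Chunk 4: count the rows that were written and report the total
row_count = len(output.getvalue().splitlines())
print(f"Wrote {row_count} rows")
```

Each blank line separates one chunk from the next, and each # line says in plain language what the chunk below it does. Describing a script chunk by chunk in this way is already most of the work of writing pseudocode.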

3. Use scripting resources

You are encouraged to use any resource to help you understand the code:

  • Preferred: Official Python documentation such as the Python Built-in Functions document. (This only covers what comes with a standard Python install. For more specific libraries, you will need to consult other documentation, such as the ArchivesSnake documentation.)
  • Online Communities (e.g., Stack Overflow)
  • Video Tutorials (e.g., YouTube)
  • Generative AI (e.g., chatbots. I don’t exactly encourage this, but generative AI results will inevitably appear even in a basic web browser search.)
  • Cold browser search (e.g., typing “what does [this term/script step] mean?” into a browser and searching for it)

You don’t need to cite your sources. The goal is to understand what the script is doing, not to write a formal paper.


4. Write your pseudocode

  • Follow the structure from this week’s slide deck: code on one side, pseudocode on the other.
  • Example pseudocode: “This line checks if a folder exists. If it doesn’t, it creates a folder.”
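As a rough illustration of how the example pseudocode above could line up with real code, here is a short sketch. The folder name is hypothetical, and the folder is created inside a temporary directory so that running the example does not touch your own files.

```python
import tempfile
from pathlib import Path

# Make a throwaway parent directory so the example stays self-contained
parent = Path(tempfile.mkdtemp())

# Hypothetical folder name, chosen only for this illustration
folder = parent / "exports"

# Pseudocode: "This line checks if a folder exists."
if not folder.exists():
    # Pseudocode: "If it doesn't, it creates a folder."
    folder.mkdir()

# Confirm that the folder is now there
print(folder.is_dir())
```

In a two-column layout, the code would sit on one side and the quoted pseudocode sentences on the other.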

5. If you’re confused, pseudo-note it!

If you’re unsure about any part of the script, it’s okay! Feel free to note it in your pseudocode:

“I think this line initializes a connection to a server, but I’m not entirely sure.”

Uncertainty is part of learning, and you’re not expected to deeply dissect every detail. When we discuss your findings next week, we can go over parts that confused you.

Please note: I will not be grading you on whether or not your pseudocode is correct. Instead, I want to see that you made an effort to figure out, at a high level, what the script was trying to accomplish.


6. Submit your pseudocode

  • Spend 1-2 hours tops on this activity. Don’t overthink it! The point is to warm up to what Python code looks like, and to learn a little more about its syntax and how archivists have used it to do their work.
  • Submit your pseudocode to Brightspace by Sunday night.
  • Use clear formatting to separate code from pseudocode for easy readability. Suggestion: Use a two-column “table” using a word processing or slide deck application.

7. Prepare to share

Be ready to discuss your pseudocode in class next week. Each student will take 2-3 minutes to discuss:

  • What Ruth was trying to accomplish.
  • What they used to figure out its behavior (A web browser search? A #Comment? ArchivesSnake documentation? Asking a student, friend, or colleague?).
  • What was confusing or unclear and why.
  • Any other takeaways you would like to share.