Commit 9a055d5

Merge pull request #22 from ch3080/ch3080-patch-checkout
grammatical/punctuation edits, completed sentence
2 parents 9e39903 + 712d519 commit 9a055d5

1 file changed: +8 -9 lines changed

episodes/03-foundations.md

Lines changed: 8 additions & 9 deletions
@@ -19,17 +19,17 @@ keypoints:

 ## Foundations

-In the last episode we discussed what we each think of as data. We came up with a lot of different ideas of what data looks like and how it can be used. Before we crack on with using the computational tools at our disposal, I want to spend some time on some foundation level stuff - a combination of best practice and generic skills that frame what you'll encounter across Archive Carpentry.
+In the last episode, we discussed what we each think of as data. We came up with a lot of different ideas of what data looks like and how it can be used. Before we crack on with using the computational tools at our disposal, I want to spend some time on some foundation level stuff - a combination of best practice and generic skills that frame what you'll encounter across Archive Carpentry.

 **Trainer Note**: we recommend using this section as an opportunity to discuss foundational skills that you think are relevant.

 ### Data are Collected Through Research

-To summarize the brainstorming session that we had in the last episode, data are information collected through research. As archivists we support research. When we start to think of our collections as data we can start to support new methods of providing access to our data. Data can be manipulated using automated or computational methods allowing us to improve our workflows. When approaching our work with a data-aware mindset we should think of the systems that we are using to do our work.
+To summarize the brainstorming session that we had in the last episode, data are information collected through research. As archivists, we support research. When we start to think of our collections as data, we can start to support new methods of providing access to our data. Data can be manipulated using automated or computational methods, allowing us to improve our workflows. When approaching our work with a data-aware mindset, we should think of the systems that we are using to do our work.

 ### The computer and the systems inside it are stupid

-This does not mean that the computer isn't useful. Given a repetitive task, an enumerative task, or a task that relies on memory, it can produce results faster, more accurately, and less grudgingly than you or I. Rather when I say that you should keep in mind that the computer is stupid, I mean to say that computer only does what you tell it to. If it throws up an error it is often not your fault, rather in most cases the computer has failed to interpret what you mean because it can only work with what it knows (ergo, it is bad at interpreting). This is not to say that the people who told the computer what to tell you when it doesn't know what to do couldn't have done a better job with error messages, for they could. So keep in mind as we go along that if you find an error message frustrating, it isn't the computer's fault that it is giving you an archaic and incomprehensible error message, it is a human person's.
+This does not mean that the computer isn't useful. Given a repetitive task, an enumerative task, or a task that relies on memory, it can produce results faster, more accurately, and less grudgingly than you or I. Rather when I say that you should keep in mind that the computer is stupid, I mean to say that computer only does what you tell it to. If it throws up an error, it is often not your fault; in most cases, the computer has failed to interpret what you mean because it can only work with what it knows (ergo, it is bad at interpreting). This is not to say that the people who told the computer what to tell you when it doesn't know what to do couldn't have done a better job with error messages -- they could. So keep in mind as we go along that if you find an error message frustrating, it isn't the computer's fault that it is giving you an archaic and incomprehensible error message, it is a human person's.

 - **The correct language to learn is the one that works in your local context**. There truly isn't a best language, just languages with different strengths and weaknesses, all of which incorporate the same fundamental principles;
 - **Knowing the structure of the interface that you are using will assist you in learning**. Databases and computer systems can seem opaque. Knowing what data structures they were built to support can help you to troubleshoot
@@ -38,16 +38,15 @@ This does not mean that the computer isn't useful. Given a repetitive task, an e

 ### Beyond the Interface

-Much of the work that you do with data may be completed through a software interface. Your
-archival catalog and excel spreadsheets are interfaces that allow you to view your data more
-easily. The data itself is organized into structures that many of you will be familiar with, but
-is much more text heavy and may not be as simple for humans to read.
+Much of the work that you do with data may be completed through a software interface. Your archival catalog and Excel spreadsheets are interfaces that allow you to view your data more easily. The data itself is organized into structures that many of you will be familiar with, but is much more text-heavy and may not be as simple for humans to read.

 ### Plain text formats are your friend

 Why? Because computers can process them! Structures and formats that may be easier for humans to read often cannot be read by computers.

-If you want computers to be able to process your stuff, try to get in the habit where possible of using platform-agnostic formats such as .txt for notes and .csv or .tsv for tabulated data (the latter pair are just spreadsheet formats, separated by commas and tabs respectively). These plain text formats are preferable to the proprietary formats used as defaults by Microsoft Office because they can be opened by many software packages and have a strong chance of remaining viewable and editable in the future. Most standard office suites include the option to save files in .txt, .csv and .tsv formats, meaning you can continue to work with familiar software and still take appropriate action to make your work accessible. Compared to .doc or .xls, these formats have the additional benefit of containing only machine-readable elements. Whilst using bold, italics, and colouring to signify headings or to make a visual connection between data elements is common practice, these display-orientated annotations are not (easily) machine-readable and hence can neither be queried and searched nor are appropriate for large quantities of information (the rule of thumb is if you can't find it by CTRL+F it isn't machine readable). Preferable are standards that have been
+If you want computers to be able to process your stuff, try to get into the habit of using platform-agnostic formats where possible, such as .txt for notes and .csv or .tsv for tabulated data (the latter pair are just spreadsheet formats, separated by commas and tabs respectively). These plain text formats are preferable to the proprietary formats used as defaults by Microsoft Office because they can be opened by many software packages and have a strong chance of remaining viewable and editable in the future. Most standard office suites include the option to save files in .txt, .csv and .tsv formats, meaning you can continue to work with familiar software and still take appropriate action to make your work accessible. Compared to .doc or .xls, these formats have the additional benefit of containing only machine-readable elements.
+
+Whilst it is common practice to use bold, italics, and colouring to signify headings or to make a visual connection between data elements, these display-orientated annotations are not (easily) machine-readable, and hence can neither be queried and searched nor are appropriate for large quantities of information (the rule of thumb is, if you can't find it by CTRL+F, it isn't machine readable). It is preferable to use standards that signify heading levels, as these standards are not only machine-readable, but also translate easily across web browsers and potential future content migrations.

-In archival practice, standards have been developed in order for computers to understand the methods that we use to describe our collections. ISAD(G) -- General International Standard Archival Description -- has helped archivists to determine how to describe their collections but EAD -- Encoded Archival Description -- has given archivists a standard way of formatting their description.
+In archival practice, standards have been developed in order for computers to understand the methods that we use to describe our collections. ISAD(G) -- General International Standard Archival Description -- has helped archivists to determine how to describe their collections but EAD -- Encoded Archival Description -- has given archivists a standard way to format their description.

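The episode's point about .csv and .tsv being "just spreadsheet formats, separated by commas and tabs respectively" can be illustrated with a minimal Python sketch. This is not part of the commit or the lesson: the catalogue records, column names, and the helper functions `read_table` and `find_rows` are all invented here for illustration, and only the standard library `csv` and `io` modules are assumed.

```python
import csv
import io

# Hypothetical catalogue records, invented for illustration only.
CSV_TEXT = "ref,title,date\nA/1,Minute book,1901\nA/2,Letter book,1910\n"
# The same table as tab-separated text (safe here because no field contains a comma).
TSV_TEXT = CSV_TEXT.replace(",", "\t")


def read_table(text, delimiter):
    """Parse delimited plain text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def find_rows(rows, needle):
    """Return rows in which any field contains the search string, ignoring case."""
    needle = needle.lower()
    return [
        row for row in rows
        if any(needle in value.lower() for value in row.values())
    ]


csv_rows = read_table(CSV_TEXT, ",")
tsv_rows = read_table(TSV_TEXT, "\t")

# A scripted equivalent of the CTRL+F rule of thumb: search every field for a word.
matches = find_rows(csv_rows, "letter")
print(matches)
```

Both strings parse to the same rows, which is the sense in which the two formats differ only in their separator. Bold, italics, or cell colouring would have no place in either string, so nothing a search depends on is lost.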