<!DOCTYPE html>
<!-- START: inst/pkgdown/templates/layout.html --><!-- Generated by pkgdown: do not edit by hand --><html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta charset="utf-8">
<title>Data Intro for Archivists: All in One View</title>
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" type="text/css" href="assets/styles.css">
<script src="assets/scripts.js" type="text/javascript"></script><!-- mathjax --><script type="text/x-mathjax-config">
MathJax.Hub.Config({
config: ["MMLorHTML.js"],
jax: ["input/TeX","input/MathML","output/HTML-CSS","output/NativeMML", "output/PreviewHTML"],
extensions: ["tex2jax.js","mml2jax.js","MathMenu.js","MathZoom.js", "fast-preview.js", "AssistiveMML.js", "a11y/accessibility-menu.js"],
TeX: {
extensions: ["AMSmath.js","AMSsymbols.js","noErrors.js","noUndefined.js"]
},
tex2jax: {
inlineMath: [['\\(', '\\)']],
displayMath: [ ['$$','$$'], ['\\[', '\\]'] ],
processEscapes: true
}
});
</script><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js" integrity="sha256-nvJJv9wWKEm88qvoQl9ekL2J+k/RWIsaSScxxlsrv8k=" crossorigin="anonymous"></script><!-- Responsive Favicon for The Carpentries --><link rel="apple-touch-icon" sizes="180x180" href="apple-touch-icon.png">
<link rel="icon" type="image/png" sizes="32x32" href="favicon-32x32.png">
<link rel="icon" type="image/png" sizes="16x16" href="favicon-16x16.png">
<link rel="manifest" href="site.webmanifest">
<link rel="mask-icon" href="safari-pinned-tab.svg" color="#5bbad5">
<meta name="msapplication-TileColor" content="#da532c">
<meta name="theme-color" content="#ffffff">
</head>
<body>
<header id="top" class="navbar navbar-expand-md navbar-light bg-white top-nav library"><a class="visually-hidden-focusable skip-link" href="#main-content">Skip to main content</a>
<div class="container-fluid top-nav-container">
<div class="col-md-6">
<div class="large-logo">
<img alt="Library Carpentry" src="assets/images/library-logo.svg">
</div>
</div>
<div class="selector-container">
<div class="dropdown">
<button class="btn btn-secondary dropdown-toggle bordered-button" type="button" id="dropdownMenu1" data-bs-toggle="dropdown" aria-expanded="false">
<i aria-hidden="true" class="icon" data-feather="eye"></i> Learner View <i data-feather="chevron-down"></i>
</button>
<ul class="dropdown-menu" aria-labelledby="dropdownMenu1">
<li><button class="dropdown-item" type="button" onclick="window.location.href='instructor/aio.html';">Instructor View</button></li>
</ul>
</div>
</div>
</div>
<hr></header><nav class="navbar navbar-expand-xl navbar-light bg-white bottom-nav library" aria-label="Main Navigation"><div class="container-fluid nav-container">
<button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbarSupportedContent" aria-controls="navbarSupportedContent" aria-expanded="false" aria-label="Toggle navigation">
<span class="navbar-toggler-icon"></span>
<span class="menu-title">Menu</span>
</button>
<div class="nav-logo">
<img class="small-logo" alt="Library Carpentry" src="assets/images/library-logo-sm.svg">
</div>
<div class="lesson-title-md">
Data Intro for Archivists
</div>
<div class="search-icon-sm">
<!-- TODO: do not show until we have search
<i role="img" aria-label="search button" data-feather="search"></i>
-->
</div>
<div class="desktop-nav">
<ul class="navbar-nav me-auto mb-2 mb-lg-0">
<li class="nav-item">
<span class="lesson-title">
Data Intro for Archivists
</span>
</li>
<li class="nav-item">
<a class="nav-link" href="key-points.html">Key Points</a>
</li>
<li class="nav-item">
<a class="nav-link" href="reference.html#glossary">Glossary</a>
</li>
<li class="nav-item">
<a class="nav-link" href="profiles.html">Learner Profiles</a>
</li>
<li class="nav-item dropdown">
<button class="nav-link dropdown-toggle" id="navbarDropdown" data-bs-toggle="dropdown" aria-expanded="false">
More <i data-feather="chevron-down"></i>
</button>
<ul class="dropdown-menu" aria-labelledby="navbarDropdown">
<li><a class="dropdown-item" href="discuss.html">Discussion</a></li>
<li><a class="dropdown-item" href="reference.html">Reference</a></li>
</ul>
</li>
</ul>
</div>
<form class="d-flex col-md-2 search-form">
<fieldset disabled>
<input class="form-control me-2 searchbox" type="search" placeholder="Search" aria-label="Search"><button class="btn btn-outline-success tablet-search-button" type="submit">
<i class="search-icon" data-feather="search" role="img" aria-label="search button"></i>
</button>
</fieldset>
</form>
</div>
<!--/div.container-fluid -->
</nav><div class="col-md-12 mobile-title">
Data Intro for Archivists
</div>
<aside class="col-md-12 lesson-progress"><div style="width: %" class="percentage">
%
</div>
<div class="progress library">
<div class="progress-bar library" role="progressbar" style="width: %" aria-valuenow="" aria-label="Lesson Progress" aria-valuemin="0" aria-valuemax="100">
</div>
</div>
</aside><div class="container">
<div class="row">
<!-- START: inst/pkgdown/templates/navbar.html -->
<div id="sidebar-col" class="col-lg-4">
<div id="sidebar" class="sidebar">
<nav aria-labelledby="flush-headingEleven"><button role="button" aria-label="close menu" alt="close menu" aria-expanded="true" aria-controls="sidebar" class="collapse-toggle">
<i class="search-icon" data-feather="x" role="img"></i>
</button>
<div class="sidebar-inner">
<div class="row mobile-row">
<div class="col">
<div class="sidenav-view-selector">
<div class="accordion accordion-flush" id="accordionFlush9">
<div class="accordion-item">
<h2 class="accordion-header" id="flush-headingNine">
<button class="accordion-button collapsed" id="instructor" type="button" data-bs-toggle="collapse" data-bs-target="#flush-collapseNine" aria-expanded="false" aria-controls="flush-collapseNine">
<i id="eye" aria-hidden="true" class="icon" data-feather="eye"></i> Learner View
</button>
</h2>
<div id="flush-collapseNine" class="accordion-collapse collapse" aria-labelledby="flush-headingNine" data-bs-parent="#accordionFlush2">
<div class="accordion-body">
<a href="instructor/aio.html">Instructor View</a>
</div>
</div>
</div>
<!--/div.accordion-item-->
</div>
<!--/div.accordion-flush-->
</div>
<!--div.sidenav-view-selector -->
</div>
<!--/div.col -->
<hr>
</div>
<!--/div.mobile-row -->
<div class="accordion accordion-flush" id="accordionFlush11">
<div class="accordion-item">
<button id="chapters" class="accordion-button show" type="button" data-bs-toggle="collapse" data-bs-target="#flush-collapseEleven" aria-expanded="false" aria-controls="flush-collapseEleven">
<h2 class="accordion-header chapters" id="flush-headingEleven">
EPISODES
</h2>
</button>
<div id="flush-collapseEleven" class="accordion-collapse show collapse" aria-labelledby="flush-headingEleven" data-bs-parent="#accordionFlush11">
<div class="accordion-body">
<div class="accordion accordion-flush" id="accordionFlush1">
<div class="accordion-item">
<div class="accordion-header" id="flush-heading1">
<a href="index.html">Summary and Setup</a>
</div>
<!--/div.accordion-header-->
</div>
<!--/div.accordion-item-->
</div>
<!--/div.accordion-flush-->
<div class="accordion accordion-flush" id="accordionFlush2">
<div class="accordion-item">
<div class="accordion-header" id="flush-heading2">
<a href="01-introduction.html">1. Introduction to Library Carpentry</a>
</div>
<!--/div.accordion-header-->
</div>
<!--/div.accordion-item-->
</div>
<!--/div.accordion-flush-->
<div class="accordion accordion-flush" id="accordionFlush3">
<div class="accordion-item">
<div class="accordion-header" id="flush-heading3">
<a href="02-think-data.html">2. Don't think you work with data?</a>
</div>
<!--/div.accordion-header-->
</div>
<!--/div.accordion-item-->
</div>
<!--/div.accordion-flush-->
<div class="accordion accordion-flush" id="accordionFlush4">
<div class="accordion-item">
<div class="accordion-header" id="flush-heading4">
<a href="03-foundations.html">3. Foundations</a>
</div>
<!--/div.accordion-header-->
</div>
<!--/div.accordion-item-->
</div>
<!--/div.accordion-flush-->
<div class="accordion accordion-flush" id="accordionFlush5">
<div class="accordion-item">
<div class="accordion-header" id="flush-heading5">
<a href="04-regular-expressions.html">4. Regular Expressions</a>
</div>
<!--/div.accordion-header-->
</div>
<!--/div.accordion-item-->
</div>
<!--/div.accordion-flush-->
<div class="accordion accordion-flush" id="accordionFlush6">
<div class="accordion-item">
<div class="accordion-header" id="flush-heading6">
<a href="05-quiz.html">5. Introduction to Data - Multiple Choice Quiz</a>
</div>
<!--/div.accordion-header-->
</div>
<!--/div.accordion-item-->
</div>
<!--/div.accordion-flush-->
<div class="accordion accordion-flush" id="accordionFlush7">
<div class="accordion-item">
<div class="accordion-header" id="flush-heading7">
<a href="06-quiz-answers.html">6. Introduction to Data - Multiple Choice Quiz (answers)</a>
</div>
<!--/div.accordion-header-->
</div>
<!--/div.accordion-item-->
</div>
<!--/div.accordion-flush-->
</div>
</div>
</div>
<hr class="half-width">
<div class="accordion accordion-flush resources" id="accordionFlush12">
<div class="accordion-item">
<h2 class="accordion-header" id="flush-headingTwelve">
<button class="accordion-button collapsed" id="resources" type="button" data-bs-toggle="collapse" data-bs-target="#flush-collapseTwelve" aria-expanded="false" aria-controls="flush-collapseTwelve">
RESOURCES
</button>
</h2>
<div id="flush-collapseTwelve" class="accordion-collapse collapse" aria-labelledby="flush-headingTwelve" data-bs-parent="#accordionFlush12">
<div class="accordion-body">
<ul>
<li>
<a href="key-points.html">Key Points</a>
</li>
<li>
<a href="reference.html#glossary">Glossary</a>
</li>
<li>
<a href="profiles.html">Learner Profiles</a>
</li>
<li><a href="discuss.html">Discussion</a></li>
<li><a href="reference.html">Reference</a></li>
</ul>
</div>
</div>
</div>
</div>
<hr class="half-width resources">
<a href="aio.html">See all in one page</a>
<hr class="d-none d-sm-block d-md-none">
<div class="d-grid gap-1">
</div>
</div>
<!-- /div.accordion -->
</div>
<!-- /div.sidebar-inner -->
</nav>
</div>
<!-- /div.sidebar -->
</div>
<!-- /div.sidebar-col -->
<!-- END: inst/pkgdown/templates/navbar.html-->
<!-- START: inst/pkgdown/templates/content-extra.html -->
<div class="col-xl-8 col-lg-12 primary-content">
<main id="main-content" class="main-content"><div class="container lesson-content">
<section id="aio-01-introduction"><p>Content from <a href="01-introduction.html">Introduction to Library Carpentry</a></p>
<hr>
<p> Last updated on 2023-04-24 |
<a href="https://github.com/librarycarpentry/lc-data-intro-archives/edit/main/episodes/01-introduction.md" class="external-link">Edit this page <i aria-hidden="true" data-feather="edit"></i></a></p>
<div class="text-end">
<button role="button" aria-pressed="false" tabindex="0" id="expand-code" class="pull-right"> Expand All Solutions <i aria-hidden="true" data-feather="plus"></i></button>
</div>
<div class="overview card">
<h2 class="card-header">Overview</h2>
<div class="row g-0">
<div class="col-md-4">
<div class="card-body">
<div class="inner">
<h3 class="card-title">Questions</h3>
<ul>
<li>What do archivists gain from code?</li>
</ul>
</div>
</div>
</div>
<div class="col-md-8">
<div class="card-body">
<div class="inner bordered">
<h3 class="card-title">Objectives</h3>
<ul>
<li>Explain why software skills are valuable to archivists</li>
<li>Know where to go for help during Library Carpentry</li>
</ul>
</div>
</div>
</div>
</div>
</div>
<section id="overview"><h2 class="section-heading">Overview<a class="anchor" aria-label="anchor" href="#overview"></a>
</h2>
<hr class="half-width">
<div class="section level3">
<h3 id="introduction">Introduction<a class="anchor" aria-label="anchor" href="#introduction"></a>
</h3>
<p>Welcome to Library Carpentry! This series of introductory workshops
on software skills for librarians and archivists started life as an
exploratory programme funded by the Software Sustainability Institute
and supported by <a href="https://software-carpentry.org/" class="external-link">Software
Carpentry</a> and City University London. Thanks also go to the British
Library and the University of Sussex where James Baker, who developed
the workshops, worked when planning and delivering the workshops. The
aim of Library Carpentry is to create a set of tools the community can
manage, support, enrich, and reuse as it sees fit. Periodically during
the sessions we will collect anonymous feedback that will go into
improving the classes and ensuring that they best fit the evolving needs
and requirements of the library and information science community.</p>
<p>The rationale for Library Carpentry is twofold. First, as Andromeda
Yelton argues in her excellent <a href="https://journals.ala.org/ltr/issue/view/506" class="external-link">ALA Library
Technology Report</a> ‘Coding for Librarians: learning by example’, code
is a means for librarians to take control of practice and to empower
themselves and their organisation to meet user needs in flexible ways.
Second, librarians play a crucial role in cultivating world-class
research, and in most research areas today world-class research relies
on the use of software. Librarians with software skills are therefore well
placed to continue cultivating that research.</p>
</div>
<div class="section level3">
<h3 id="where-to-go-for-help">Where to go for help<a class="anchor" aria-label="anchor" href="#where-to-go-for-help"></a>
</h3>
<p>First, identify people on your table who can help: you will all be
working from the same material, so someone around you may have figured
out the point you are stuck at.</p>
<p>Second, there are helpers on hand to help if those around you can’t.
You should all have access to coloured sticky notes: attaching a red
sticky note to your laptop indicates that you need help (it might also
attract the attention of someone around you!). So, please use them.</p>
<p>Third, each part of Library Carpentry may require you to install
software or download data. Breaks are a good time to ask for help.</p>
<p>Fourth, we encourage you to finish up or repeat tasks after class
time: if you run into any problems, please report them on the relevant
GitHub issues page (see the bottom of each lesson page for a link).</p>
<p>Most Library Carpentry lessons will require you to follow along while
your instructor demonstrates a software tool or approach. Sometimes you
will fall behind. If you put your red sticky note up on your computer,
this lets a helper know you need assistance. Your issue may be specific
to your computer. Computers are stupid, can frustrate, and as you all
have different machines it can be tricky to resolve problems. Please be
patient, particularly if your issue is local. Stepping outside and
taking a gulp of fresh air always helps.</p>
<div id="keypoints1" class="callout keypoints">
<div class="callout-square">
<i class="callout-icon" data-feather="key"></i>
</div>
<div class="callout-inner">
<h3 class="callout-title">Keypoints<a class="anchor" aria-label="anchor" href="#keypoints1"></a>
</h3>
<div class="callout-content">
<ul>
<li>Don’t be scared to ask for help</li>
</ul>
</div>
</div>
</div>
</div>
</section></section><section id="aio-02-think-data"><p>Content from <a href="02-think-data.html">Don't think you work with data?</a></p>
<hr>
<p> Last updated on 2023-04-24 |
<a href="https://github.com/librarycarpentry/lc-data-intro-archives/edit/main/episodes/02-think-data.md" class="external-link">Edit this page <i aria-hidden="true" data-feather="edit"></i></a></p>
<div class="text-end">
<button role="button" aria-pressed="false" tabindex="0" id="expand-code" class="pull-right"> Expand All Solutions <i aria-hidden="true" data-feather="plus"></i></button>
</div>
<div class="overview card">
<h2 class="card-header">Overview</h2>
<div class="row g-0">
<div class="col-md-4">
<div class="card-body">
<div class="inner">
<h3 class="card-title">Questions</h3>
<ul>
<li>What sort of data do you work with?</li>
<li>What do you do with it?</li>
<li>What tools do you use to help you?</li>
</ul>
</div>
</div>
</div>
<div class="col-md-8">
<div class="card-body">
<div class="inner bordered">
<h3 class="card-title">Objectives</h3>
<ul>
<li>Recognise that they work with data</li>
<li>Compare what tasks they perform on data and the tools they use</li>
</ul>
</div>
</div>
</div>
</div>
</div>
<section id="dont-think-you-work-with-data-think-again"><h2 class="section-heading">Don’t think you work with data? Think again<a class="anchor" aria-label="anchor" href="#dont-think-you-work-with-data-think-again"></a>
</h2>
<hr class="half-width">
<div class="section level3">
<h3 id="task-1">Task 1<a class="anchor" aria-label="anchor" href="#task-1"></a>
</h3>
<p>This group task is an opportunity for you to think about the sort of
data you have, what you do with it, and what tools you use to do
that.</p>
<ul>
<li>Start by getting into pairs.</li>
<li>Brainstorm all the different sorts of data you work with (examples
might include metadata, catalogue data, legacy data, data output from
DROID etc.)</li>
<li>Your instructor will gather in these ideas and lead a discussion to
establish that we are all talking about roughly the same thing when we
talk about data</li>
<li>Get into groups of 4-6.</li>
<li>Discuss your own data, trying to answer questions including: How
much data do you have? Where is it stored? Who has access to it? How is
it formatted or stored? Can you move it about easily - in and out of
systems? In particular think about the tools you use to help you manage
your data as well as any problems you have with it.</li>
<li>Each group then reports back on two problems they have with their
data.</li>
<li>The instructor will collate these on a whiteboard and facilitate a
discussion about: a) how starting to think in terms of data is a good
first step for what we will be learning, b) what it is we will be
learning, and c) how what we will be learning will help us to solve some
of the problems we are facing.</li>
</ul>
</div>
<div class="section level3">
<h3 id="task-2">Task 2<a class="anchor" aria-label="anchor" href="#task-2"></a>
</h3>
<p>This follow-on task aims to guide learners in thinking about data as
conceptually separate from the systems that produce, store, and preserve
it. It offers an opportunity to think about how data move through
archival systems and the value of archival data outside of those
systems.</p>
<ul>
<li>As a group, consider the types of data you discussed in the previous
task and select one representative example.</li>
<li>Using sticky notes, map the lifecycle of a data point from the
moment of creation to its long-term home or to disposition (long term
transfer, destruction, etc.)</li>
<li>Discuss: How many people or organizations have been custodians of
the data? How many systems has it moved through? Is there a relationship
between the individual(s) creating the data and those who make
preservation or disposition decisions? How does the lifecycle of the
dataset impact documentation, metadata, or the data itself?</li>
<li>Each group attaches their data lifecycle map to the whiteboard</li>
<li>The instructor will lead a discussion about lifecycles of archival
data and highlight the potential value of these data outside of the
systems we typically associate with archival data.</li>
</ul>
<div id="keypoints1" class="callout keypoints">
<div class="callout-square">
<i class="callout-icon" data-feather="key"></i>
</div>
<div class="callout-inner">
<h3 class="callout-title">Keypoints<a class="anchor" aria-label="anchor" href="#keypoints1"></a>
</h3>
<div class="callout-content">
<ul>
<li>We all have data, and it is not enough just to put it into a system
and forget about it</li>
</ul>
</div>
</div>
</div>
</div>
</section></section><section id="aio-03-foundations"><p>Content from <a href="03-foundations.html">Foundations</a></p>
<hr>
<p> Last updated on 2023-04-24 |
<a href="https://github.com/librarycarpentry/lc-data-intro-archives/edit/main/episodes/03-foundations.md" class="external-link">Edit this page <i aria-hidden="true" data-feather="edit"></i></a></p>
<div class="text-end">
<button role="button" aria-pressed="false" tabindex="0" id="expand-code" class="pull-right"> Expand All Solutions <i aria-hidden="true" data-feather="plus"></i></button>
</div>
<div class="overview card">
<h2 class="card-header">Overview</h2>
<div class="row g-0">
<div class="col-md-4">
<div class="card-body">
<div class="inner">
<h3 class="card-title">Questions</h3>
<ul>
<li>What best practice and generic skills underpin your encounters with
data and research?</li>
</ul>
</div>
</div>
</div>
<div class="col-md-8">
<div class="card-body">
<div class="inner bordered">
<h3 class="card-title">Objectives</h3>
<ul>
<li>Identify and use best practice in data structures</li>
<li>Identify and understand a data-driven mindset</li>
</ul>
</div>
</div>
</div>
</div>
</div>
<section id="foundations"><h2 class="section-heading">Foundations<a class="anchor" aria-label="anchor" href="#foundations"></a>
</h2>
<hr class="half-width">
<p>In the last episode, we discussed what we each think of as data. We
came up with a lot of different ideas of what data looks like and how it
can be used. Before we crack on with using the computational tools at
our disposal, I want to spend some time on some foundation-level material:
a combination of best practice and generic skills that frame what you’ll
encounter across Archive Carpentry.</p>
<p><strong>Trainer Note</strong>: we recommend using this section as an
opportunity to discuss foundational skills that you think are
relevant.</p>
<div class="section level3">
<h3 id="data-are-collected-through-research">Data are Collected Through Research<a class="anchor" aria-label="anchor" href="#data-are-collected-through-research"></a>
</h3>
<p>To summarize the brainstorming session that we had in the last
episode, data are information collected through research. As archivists,
we support research. When we start to think of our collections as data,
we can start to support new methods of providing access to our data.
Data can be manipulated using automated or computational methods,
allowing us to improve our workflows. When approaching our work with a
data-aware mindset, we should think about the systems that we use to
do our work.</p>
</div>
<div class="section level3">
<h3 id="the-computer-and-the-systems-inside-it-are-stupid">The computer and the systems inside it are stupid<a class="anchor" aria-label="anchor" href="#the-computer-and-the-systems-inside-it-are-stupid"></a>
</h3>
<p>This does not mean that the computer isn’t useful. Given a repetitive
task, an enumerative task, or a task that relies on memory, it can
produce results faster, more accurately, and less grudgingly than you or
I. Rather, when I say that you should keep in mind that the computer is
stupid, I mean to say that the computer only does what you tell it to. If it
throws up an error, it is often not your fault; in most cases, the
computer has failed to interpret what you mean because it can only work
with what it knows (ergo, it is bad at interpreting). This is not to say
that the people who told the computer what to tell you when it doesn’t
know what to do couldn’t have done a better job with error messages –
they could. So keep in mind as we go along that if you find an error
message frustrating, it isn’t the computer’s fault that it is giving you
an archaic and incomprehensible error message, it is a human
person’s.</p>
<ul>
<li>
<strong>The correct language to learn is the one that works in your
local context</strong>. There truly isn’t a best language, just
languages with different strengths and weaknesses, all of which
incorporate the same fundamental principles;</li>
<li>
<strong>Knowing the structure of the interface that you are using
will assist you in learning</strong>. Databases and computer systems can
seem opaque. Knowing what data structures they were built to support can
help you to troubleshoot</li>
<li>
<strong>Automate to make the time to do something else!</strong>
Taking the time to gather together even the most simple programming
skills can save time to do more interesting stuff! (even if often that
more interesting stuff is learning more programming skills …)</li>
<li>
<strong>Understanding the interface can help you to communicate with
developers and engineers.</strong> Taking the time to gather together
even the most simple programming skills can help you to better
communicate your needs to developers.</li>
</ul>
</div>
<div class="section level3">
<h3 id="beyond-the-interface">Beyond the Interface<a class="anchor" aria-label="anchor" href="#beyond-the-interface"></a>
</h3>
<p>Much of the work that you do with data may be completed through a
software interface. Your archival catalog and Excel spreadsheets are
interfaces that allow you to view your data more easily. The data itself
is organized into structures that many of you will be familiar with, but
is much more text-heavy and may not be as simple for humans to read.</p>
</div>
<div class="section level3">
<h3 id="plain-text-formats-are-your-friend">Plain text formats are your friend<a class="anchor" aria-label="anchor" href="#plain-text-formats-are-your-friend"></a>
</h3>
<p>Why? Because computers can process them! Structures and formats that
may be easier for humans to read often cannot be read by computers.</p>
<p>If you want computers to be able to process your stuff, try to get
into the habit of using platform-agnostic formats where possible, such
as .txt for notes and .csv or .tsv for tabulated data (the latter pair
are just spreadsheet formats, separated by commas and tabs
respectively). These plain text formats are preferable to the
proprietary formats used as defaults by Microsoft Office because they
can be opened by many software packages and have a strong chance of
remaining viewable and editable in the future. Most standard office
suites include the option to save files in .txt, .csv and .tsv formats,
meaning you can continue to work with familiar software and still take
appropriate action to make your work accessible. Compared to .doc or
.xls, these formats have the additional benefit of containing only
machine-readable elements.</p>
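<p>To make the point about plain text concrete, here is a minimal sketch
in Python, assuming only its standard <code>csv</code> module. The
catalogue records, field names, and function names are hypothetical and
invented for illustration; they do not come from any real system. The
sketch writes a small table as comma-separated text and reads it back
again.</p>

```python
import csv
import io

# Hypothetical catalogue records, invented purely for illustration.
records = [
    {"reference": "ABC/1", "title": "Minute book", "year": "1901"},
    {"reference": "ABC/2", "title": "Letters, incoming", "year": "1902"},
]


def to_csv_text(rows):
    """Serialise a list of dictionaries as comma-separated plain text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["reference", "title", "year"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def from_csv_text(text):
    """Read comma-separated plain text back into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


csv_text = to_csv_text(records)
print(csv_text)

# The same values come back out, with no formatting lost along the way.
round_tripped = from_csv_text(csv_text)
print(round_tripped[1]["title"])
```

<p>Note that the title containing a comma is wrapped in quotation marks
in the output, which is how the format keeps that value in a single
column. Any text editor or spreadsheet package should be able to open the
result.</p>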
<p>Whilst it is common practice to use bold, italics, and colouring to
signify headings or to make a visual connection between data elements,
these display-orientated annotations are not (easily) machine-readable,
and hence can be neither queried nor searched, and are not appropriate for
large quantities of information (the rule of thumb is, if you can’t find
it by CTRL+F, it isn’t machine readable). It is preferable to use
standards that signify heading levels, as these standards are not only
machine-readable, but also translate easily across web browsers and
potential future content migrations.</p>
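<p>The difference between visual styling and a machine-readable heading
can be sketched in a few lines of Python. This example assumes
Markdown-style <code>#</code> markers as the heading convention, which is
only one possible choice and is not prescribed by this lesson; the sample
notes and the <code>list_headings</code> function are hypothetical.</p>

```python
# Hypothetical finding-aid notes using Markdown-style "#" heading markers.
notes = """# Smith family papers
## Correspondence
Letters received, 1901-1910.
## Diaries
Three volumes."""


def list_headings(text):
    """Return (level, title) pairs for every line that starts with '#'."""
    headings = []
    for line in text.splitlines():
        if line.startswith("#"):
            marker = line.split(" ", 1)[0]
            # Only treat the line as a heading if the marker is all '#'.
            if set(marker) == {"#"}:
                headings.append((len(marker), line[len(marker):].strip()))
    return headings


for level, title in list_headings(notes):
    print(level, title)
```

<p>Because the heading level is written into the text itself, a program
can recover the outline of the document. Bold or coloured text in a word
processor file carries no equivalent marker that a simple search could
find.</p>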
<p>In archival practice, standards have been developed in order for
computers to understand the methods that we use to describe our
collections. ISAD(G) – General International Standard Archival
Description – has helped archivists to determine how to describe their
collections, while EAD – Encoded Archival Description – has given
archivists a standard way to format their description.</p>
<div id="keypoints1" class="callout keypoints">
<div class="callout-square">
<i class="callout-icon" data-feather="key"></i>
</div>
<div class="callout-inner">
<h3 class="callout-title">Keypoints<a class="anchor" aria-label="anchor" href="#keypoints1"></a>
</h3>
<div class="callout-content">
<ul>
<li>Data are used in research</li>
<li>Archival collections and archival description are data</li>
<li>Data structures should be consistent and predictable</li>
<li>Consider the standards and structures used in your own data</li>
<li>Identify and use computational methods in your work</li>
<li>Identify how standards and structures can be used in research</li>
</ul>
</div>
</div>
</div>
</div>
</section></section><section id="aio-04-regular-expressions"><p>Content from <a href="04-regular-expressions.html">Regular Expressions</a></p>
<hr>
<p> Last updated on 2023-04-24</p>
<div class="overview card">
<h2 class="card-header">Overview</h2>
<div class="row g-0">
<div class="col-md-4">
<div class="card-body">
<div class="inner">
<h3 class="card-title">Questions</h3>
<ul>
<li>How can you imagine using regular expressions in your work?</li>
</ul>
</div>
</div>
</div>
<div class="col-md-8">
<div class="card-body">
<div class="inner bordered">
<h3 class="card-title">Objectives</h3>
<ul>
<li>Use regular expressions in searches</li>
</ul>
</div>
</div>
</div>
</div>
</div>
<section id="regular-expressions"><h2 class="section-heading">Regular Expressions<a class="anchor" aria-label="anchor" href="#regular-expressions"></a>
</h2>
<hr class="half-width">
<p>One of the reasons why I have stressed the value of consistent and
predictable directory and file naming conventions is that working in this
way enables you to use the computer to select files based on the
characteristics of their file name. So, for example, if you have a bunch
of files where the first four digits are the year and you only want to
do something with files from ‘2014’, then you can. Or if you have
‘journal’ somewhere in a filename when you have data about journals, you
can use the computer to select just those files then do something with
them. Equally, using plain text formats means that you can go further
and select files or elements of files based on characteristics of the
data <em>within</em> files.</p>
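<p>As a minimal sketch of this kind of filename-based selection, the Python snippet below uses the standard library's <code>fnmatch</code> module with shell-style wildcards. The filenames are invented for illustration and are not part of the lesson data.</p>

```python
import fnmatch

# Hypothetical filenames, invented for this example.
filenames = [
    "2014-01-31_journal-articles.tsv",
    "2014-02-28_book-loans.csv",
    "2015-01-31_journal-articles.tsv",
    "notes.txt",
]

# Select files whose names start with the year 2014.
from_2014 = fnmatch.filter(filenames, "2014*")

# Select files with 'journal' anywhere in the name.
journals = fnmatch.filter(filenames, "*journal*")

print(from_2014)
print(journals)
```

<p>This only works because the names follow one predictable convention. A file named <code>journals_14.tsv</code> would be missed by the first pattern.</p>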
<p>A powerful means of selecting files or data based on their
characteristics is to use regular expressions, often abbreviated to
regex. A regular expression is a sequence of characters that defines a
search pattern, mainly for use in pattern matching with strings, or
string matching, i.e. “find and replace”-like operations. Regular
expressions are typically surrounded by <code>/</code> characters,
though we will (mostly) ignore those for ease of comprehension. Regular
expressions will let you:</p>
<ul>
<li>Match on types of character (e.g. ‘upper case letters’, ‘digits’,
‘spaces’, etc.)</li>
<li>Match patterns that repeat any number of times</li>
<li>Capture the parts of the original string that match your
pattern</li>
</ul>
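<p>The three capabilities listed above can be sketched with Python's built-in <code>re</code> module. This is one possible illustration rather than the lesson's own method. The filenames and the variable names (<code>year_pattern</code>, <code>years</code>) are assumptions made up for the example.</p>

```python
import re

# Hypothetical filenames, invented for this example.
filenames = [
    "2014-01-31_journal-articles.tsv",
    "2014-02-28_book-loans.csv",
    "2015-01-31_journal-articles.tsv",
    "notes.txt",
]

# \d matches a type of character (a digit), + repeats it one or more times,
# and the parentheses capture the matched digits so they can be reused.
year_pattern = re.compile(r"^(\d+)-")

years = {}
for name in filenames:
    found = year_pattern.search(name)
    if found:
        years[name] = found.group(1)  # the captured leading digits

from_2014 = [name for name, year in years.items() if year == "2014"]
print(from_2014)
```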
<p>As most computational software has regular expression functionality
built in and as many computational tasks in libraries are built around
complex matching, it is a good place for Library Carpentry to start in
earnest.</p>
<p>A very simple use of a regular expression would be to locate the same
word spelled two different ways. For example the regular expression
<code>organi[sz]e</code> matches both “organise” and “organize”.</p>
<p>But it would also match <code>reorganise</code>,
<code>reorganize</code>, <code>organises</code>, <code>organizes</code>,
<code>organised</code>, <code>organized</code>, et cetera, because we’ve
not specified the beginning or end of our string. So there are a number
of special syntax elements that help us be more precise.</p>
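<p>To see this behaviour in practice, here is a short sketch using Python's <code>re</code> module. The word list is made up for the example.</p>

```python
import re

pattern = re.compile(r"organi[sz]e")

# Illustrative words only; "organic" is included as a non-match.
words = ["organise", "organize", "reorganise", "organized", "organic"]

# search() looks for the pattern anywhere in the string, so longer
# words that contain "organise" or "organize" are matched too.
matched = [word for word in words if pattern.search(word)]
print(matched)
```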
<p>The first we’ve seen: square brackets can be used to define a list or
range of characters to be found. So:</p>
<ul>
<li>
<code>[ABC]</code> matches A or B or C</li>
<li>
<code>[A-Z]</code> matches any upper case letter</li>
<li>
<code>[A-Za-z0-9]</code> matches any upper or lower case letter or
any digit (note: this is case-sensitive)</li>
</ul>
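<p>The bracket expressions above can be tried out with a small Python sketch. The sample text is hypothetical, and <code>re.findall</code> is used here simply to list every matching character.</p>

```python
import re

# Hypothetical label text, invented for this example.
text = "Box 12, Folder B"

# Each call returns every single character that the set matches.
abc_hits = re.findall(r"[ABC]", text)
upper_hits = re.findall(r"[A-Z]", text)
alnum_hits = "".join(re.findall(r"[A-Za-z0-9]", text))

print(abc_hits)
print(upper_hits)
print(alnum_hits)
```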
<p>Then there are:</p>
<ul>
<li>
<code>.</code> matches any single character (in most tools, any character except a newline)</li>
<li>
<code>\d</code> matches any single digit</li>
<li>
<code>\w</code> matches any word character, that is, any letter, digit,
or underscore (equivalent to <code>[A-Za-z0-9_]</code>)</li>
<li>
<code>\s</code> matches any space, tab, or newline</li>
<li>
<code>\</code> NB: this is also used to escape the following
character when that character is a special character. So, for example, a
regular expression that found <code>.com</code> would be
<code>\.com</code> because <code>.</code> is a special character that
matches any character.</li>
<li>
<code>^</code> asserts the position at the start of the line. So
what you put after it will only match if they are the first characters
of a line.</li>
<li>
<code>$</code> asserts the position at the end of the line. So what
you put before it will only match if they are the last characters of a
line.</li>
<li>
<code>\b</code> adds a word boundary. Putting this either side of a
word stops the regular expression matching longer variants of words. So:
<ul>
<li>the regular expression <code>foobar</code> will match
<code>foobar</code> and find <code>666foobar</code>,
<code>foobar777</code>, <code>8thfoobar8th</code> et cetera</li>
<li>the regular expression <code>\bfoobar</code> will match
<code>foobar</code> and find <code>foobar777</code>
</li>
<li>the regular expression <code>foobar\b</code> will match
<code>foobar</code> and find <code>666foobar</code>
</li>
<li>the regular expression <code>\bfoobar\b</code> will find
<code>foobar</code>
</li>
</ul>
</li>
</ul>
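<p>The escaping, anchor, and word-boundary rules above can be checked with a short Python sketch. It assumes Python's <code>re</code> module, whose behaviour for these features matches the descriptions in the list. The test strings other than the <code>foobar</code> variants are invented for the example.</p>

```python
import re

# An unescaped . matches any character, so ".com" also matches inside "telecom".
unescaped = bool(re.search(r".com", "telecom"))
escaped = bool(re.search(r"\.com", "telecom"))
escaped_url = bool(re.search(r"\.com", "example.com"))

# ^ and $ tie a pattern to the start and the end of a line.
starts = bool(re.search(r"^organise", "organise the files"))
not_start = bool(re.search(r"^organise", "please organise"))
ends = bool(re.search(r"organise$", "please organise"))

# \b stops the pattern matching inside longer runs of word characters.
candidates = ["foobar", "666foobar", "foobar777", "8thfoobar8th"]
patterns = [r"foobar", r"\bfoobar", r"foobar\b", r"\bfoobar\b"]
results = {
    pattern: [c for c in candidates if re.search(pattern, c)]
    for pattern in patterns
}

for pattern, hits in results.items():
    print(pattern, hits)
```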
<p>So, what is <code>^[Oo]rgani.e\b</code> going to match?</p>
<div id="using-special-characters-in-regular-expression-matches" class="callout challenge">
<div class="callout-square">
<i class="callout-icon" data-feather="zap"></i>
</div>
<div id="using-special-characters-in-regular-expression-matches" class="callout-inner">
<h3 class="callout-title">Using special characters in regular expression
matches<a class="anchor" aria-label="anchor" href="#using-special-characters-in-regular-expression-matches"></a>
</h3>
<div class="callout-content">
<p>Can you guess what the regular expression <code>^[Oo]rgani.e\b</code>
will match?</p>
</div>
</div>
</div>
<div id="accordionSolution1" class="accordion challenge-accordion accordion-flush">
<div class="accordion-item">
<button class="accordion-button solution-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseSolution1" aria-expanded="false" aria-controls="collapseSolution1">
<h4 class="accordion-header" id="headingSolution1">
Show me the solution
</h4>
</button>
<div id="collapseSolution1" class="accordion-collapse collapse" aria-labelledby="headingSolution1" data-bs-parent="#accordionSolution1">
<div class="accordion-body">
<pre><code><span><span class="va">organise</span></span>
<span><span class="va">organize</span></span>
<span><span class="va">Organise</span></span>
<span><span class="va">Organize</span></span>
<span><span class="va">organife</span></span>
<span><span class="va">Organike</span></span></code></pre>
<p>Or, any other string that starts a line, begins with a letter
<code>o</code> in lower or capital case, proceeds with
<code>rgani</code>, has any character in the 7th position, and then has
the letter <code>e</code> at the end of a word.</p>
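<p>One way to check this answer is a small Python sketch with the <code>re</code> module. The candidate strings are assumptions chosen to show both matches and non-matches.</p>

```python
import re

pattern = re.compile(r"^[Oo]rgani.e\b")

# Illustrative candidates: the last three probe the edges of the pattern.
candidates = [
    "organise",
    "Organize",
    "organife",
    "Organike",
    "organise the files",  # matches: the word ends after the e
    "organised",           # no match: the e is followed by another letter
    "reorganise",          # no match: the line does not start with O or o
]

matched = [c for c in candidates if pattern.search(c)]
print(matched)
```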
</div>
</div>
</div>
</div>
<p>Other useful special characters are:</p>
<ul>
<li>
<code>*</code> matches the preceding element zero or more times. For
example, <code>ab*c</code> matches “ac”, “abc”, “abbbc”, etc.</li>
<li>
<code>+</code> matches the preceding element one or more times. For
example, <code>ab+c</code> matches “abc”, “abbbc” but not “ac”.</li>
<li>
<code>?</code> matches when the preceding character appears zero or
one time.</li>
<li>
<code>{VALUE}</code> matches the preceding character the number of
times defined by VALUE; ranges can be specified with the syntax
<code>{VALUE,VALUE}</code>
</li>
<li>
<code>|</code> means or, so <code>cat|dog</code> matches either “cat” or “dog”.</li>
</ul>
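<p>The repetition and alternation characters above can be explored with the following Python sketch. It uses <code>re.fullmatch</code>, which requires the whole string to match, so that the effect of each quantifier is easy to see. The helper name <code>whole_matches</code> and the test strings beyond those in the list are assumptions for illustration.</p>

```python
import re


def whole_matches(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [c for c in candidates if re.fullmatch(pattern, c)]


star = whole_matches(r"ab*c", ["ac", "abc", "abbbc"])
plus = whole_matches(r"ab+c", ["ac", "abc", "abbbc"])
optional = whole_matches(r"colou?r", ["color", "colour", "colouur"])
exact = whole_matches(r"\d{4}", ["201", "2014", "20145"])
ranged = whole_matches(r"\d{2,4}", ["201", "2014", "20145"])
either = whole_matches(r"organise|organize", ["organise", "organize", "organite"])

for name, hits in [("*", star), ("+", plus), ("?", optional),
                   ("{4}", exact), ("{2,4}", ranged), ("|", either)]:
    print(name, hits)
```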
<p>So, what are these going to match?</p>
<div id="oorgani.ew" class="callout challenge">
<div class="callout-square">
<i class="callout-icon" data-feather="zap"></i>
</div>
<div id="oorgani.ew" class="callout-inner">
<h3 class="callout-title">
<code>^[Oo]rgani.e\w*</code><a class="anchor" aria-label="anchor" href="#oorgani.ew"></a>
</h3>
<div class="callout-content">
<p>Can you guess what the regular expression
<code>^[Oo]rgani.e\w*</code> will match?</p>
</div>
</div>
</div>
<div id="accordionSolution2" class="accordion challenge-accordion accordion-flush">
<div class="accordion-item">
<button class="accordion-button solution-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseSolution2" aria-expanded="false" aria-controls="collapseSolution2">
<h4 class="accordion-header" id="headingSolution2">
Show me the solution
</h4>
</button>
<div id="collapseSolution2" class="accordion-collapse collapse" aria-labelledby="headingSolution2" data-bs-parent="#accordionSolution2">
<div class="accordion-body">
<pre><code><span><span class="va">organise</span></span>
<span><span class="va">Organize</span></span>
<span><span class="va">organifer</span></span>
<span><span class="va">Organi2ed111</span></span></code></pre>
<p>Or, any other string that starts a line, begins with a letter
<code>o</code> in lower or capital case, proceeds with
<code>rgani</code>, has any character in the 7th position, follows with
letter <code>e</code> and zero or more characters from the range
<code>[A-Za-z0-9_]</code>.</p>
</div>
</div>
</div>
</div>
<div id="oorgani.ew-1" class="callout challenge">
<div class="callout-square">
<i class="callout-icon" data-feather="zap"></i>
</div>
<div id="oorgani.ew-1" class="callout-inner">
<h3 class="callout-title">
<code>[Oo]rgani.e\w+$</code><a class="anchor" aria-label="anchor" href="#oorgani.ew-1"></a>
</h3>
<div class="callout-content">
<p>Can you guess what the regular expression
<code>[Oo]rgani.e\w+$</code> will match?</p>
</div>
</div>
</div>
<div id="accordionSolution3" class="accordion challenge-accordion accordion-flush">
<div class="accordion-item">
<button class="accordion-button solution-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseSolution3" aria-expanded="false" aria-controls="collapseSolution3">
<h4 class="accordion-header" id="headingSolution3">
Show me the solution
</h4>
</button>
<div id="collapseSolution3" class="accordion-collapse collapse" aria-labelledby="headingSolution3" data-bs-parent="#accordionSolution3">
<div class="accordion-body">
<pre><code><span><span class="va">organiser</span></span>
<span><span class="va">Organized</span></span>
<span><span class="va">organifer</span></span>
<span><span class="va">Organi2ed111</span></span></code></pre>
<p>Or, any other string that ends a line, begins with a letter
<code>o</code> in lower or capital case, proceeds with
<code>rgani</code>, has any character in the 7th position, follows with
letter <code>e</code> and one or more characters from the range
<code>[A-Za-z0-9_]</code>.</p>
</div>
</div>
</div>
</div>
<div id="oorgani.ewb" class="callout challenge">
<div class="callout-square">
<i class="callout-icon" data-feather="zap"></i>
</div>
<div id="oorgani.ewb" class="callout-inner">
<h3 class="callout-title">
<code>^[Oo]rgani.e\w?\b</code><a class="anchor" aria-label="anchor" href="#oorgani.ewb"></a>
</h3>
<div class="callout-content">
<p>Can you guess what the regular expression
<code>^[Oo]rgani.e\w?\b</code> will match?</p>
</div>
</div>
</div>
<div id="accordionSolution4" class="accordion challenge-accordion accordion-flush">
<div class="accordion-item">
<button class="accordion-button solution-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseSolution4" aria-expanded="false" aria-controls="collapseSolution4">
<h4 class="accordion-header" id="headingSolution4">
Show me the solution
</h4>
</button>
<div id="collapseSolution4" class="accordion-collapse collapse" aria-labelledby="headingSolution4" data-bs-parent="#accordionSolution4">
<div class="accordion-body">
<pre><code><span><span class="va">organise</span></span>
<span><span class="va">Organized</span></span>
<span><span class="va">organifer</span></span>
<span><span class="va">Organi2ek</span></span></code></pre>
<p>Or, any other string that starts a line, begins with a letter
<code>o</code> in lower or capital case, proceeds with
<code>rgani</code>, has any character in the 7th position, follows with
letter <code>e</code>, and ends with zero or one characters from the
range <code>[A-Za-z0-9_]</code>.</p>
</div>
</div>
</div>
</div>
<div id="oorgani.ew-2" class="callout challenge">
<div class="callout-square">
<i class="callout-icon" data-feather="zap"></i>
</div>
<div id="oorgani.ew-2" class="callout-inner">
<h3 class="callout-title">
<code>^[Oo]rgani.e\w?$</code><a class="anchor" aria-label="anchor" href="#oorgani.ew-2"></a>
</h3>
<div class="callout-content">
<p>Can you guess what the regular expression
<code>^[Oo]rgani.e\w?$</code> will match?</p>
</div>
</div>
</div>
<div id="accordionSolution5" class="accordion challenge-accordion accordion-flush">
<div class="accordion-item">
<button class="accordion-button solution-button collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#collapseSolution5" aria-expanded="false" aria-controls="collapseSolution5">
<h4 class="accordion-header" id="headingSolution5">
Show me the solution
</h4>
</button>
<div id="collapseSolution5" class="accordion-collapse collapse" aria-labelledby="headingSolution5" data-bs-parent="#accordionSolution5">
<div class="accordion-body">
<pre><code><span><span class="va">organise</span></span>
<span><span class="va">Organized</span></span>
<span><span class="va">organifer</span></span>
<span><span class="va">Organi2ek</span></span></code></pre>
<p>Or, any other string that starts and ends a line, begins with a
letter <code>o</code> in lower or capital case, proceeds with
<code>rgani</code>, has any character in the 7th position, follows with
letter <code>e</code> and zero or one characters from the range
<code>[A-Za-z0-9_]</code>.</p>
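<p>This pattern and the previous one, <code>^[Oo]rgani.e\w?\b</code>, accept the same single words but differ once more text follows on the line. The Python sketch below compares them. The variable names and the candidate strings are assumptions made up for the comparison.</p>

```python
import re

ends_word = re.compile(r"^[Oo]rgani.e\w?\b")  # must end at a word boundary
ends_line = re.compile(r"^[Oo]rgani.e\w?$")   # must end at the end of the line

# Illustrative candidates, including one with extra text after the word.
candidates = ["organise", "Organized", "organised files", "organisers"]

word_hits = [c for c in candidates if ends_word.search(c)]
line_hits = [c for c in candidates if ends_line.search(c)]

print(word_hits)
print(line_hits)
```

<p>Only the word-boundary version accepts <code>organised files</code>, because <code>$</code> requires the line to end straight after the optional extra character.</p>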
</div>
</div>
</div>