Commit 07f81e5

Add guidelines, requirements.txt
1 parent fc05ee4 commit 07f81e5

2 files changed (+23, -2 lines changed)

llm-silver-anno/README.md

+18 -2
@@ -29,7 +29,7 @@ Both `llm_adjudicator.py` and `review_ocr.py` expect input data in CSV format. E
 | textdocument | The OCR content |
 | path | Path to the source video |
 
-#### LLM Adjudicator
+### LLM Adjudicator
 The adjudication environment takes the output of the OCR reviewer annotations, plus:
 
 - silver_standard_annotation
@@ -64,4 +64,20 @@ The adjudicator produces a CSV with all input values, and stores the output in t
 | Field | Description |
 |-------|-------------|
 | adjudicated | Whether the row has been processed |
-| accepted | Whether the annotation was accepted. If false, it should be discarded from the data batch. (Note, this will also be set to true if the user manually corrected the LLM annotation) |
+| accepted | Whether the annotation was accepted. If false, it should be discarded from the data batch. (Note, this will also be set to true if the user manually corrected the LLM annotation) |
+
+
+## Guidelines
+
+### OCR Reviewer:
+
+The OCR Reviewer allows for a few annotation options for each image:
+
+- **Swap scene type** between credit and chyron if the scene has been misclassified by SWT.
+- **Reject OCR** if the OCR results are so poor in quality that they would be useless as input to an RFB model. This is equivalent to sequence-tagging the text as a series of "O"s, i.e. no viable RFB results found.
+- **Delete** if the true scene type is not credit or chyron, or if the OCR results are an edge case that necessitates throwing them out from the batch entirely (this should not happen very often).
+- **Submit** if all needed changes have been made, or if the results were correct initially.
+
+### LLM Adjudicator:
+
+The LLM adjudicator is comparatively simpler, with options to accept or reject the LLM's annotations. The user can also edit the BIO-formatted annotations directly -- after editing, select accept ("👍") to submit the changes. The tags will be automatically parsed to JSON format for real-time preview.
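
The guideline above says that BIO-formatted tags are parsed to JSON for preview, but the diff does not show how. The following is a minimal sketch of one way such parsing could work, not the adjudicator's actual code. The function name `bio_to_spans`, the label names `NAME` and `ROLE`, and the output shape are all assumptions made for illustration.

```python
import json


def bio_to_spans(tokens, tags):
    """Group BIO-tagged tokens into labeled spans (hypothetical helper)."""
    spans = []
    current = None  # span being built: {"label": str, "tokens": [str, ...]}
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new span, closing any open one.
            if current:
                spans.append(current)
            current = {"label": tag[2:], "tokens": [token]}
        elif tag.startswith("I-") and current and current["label"] == tag[2:]:
            # An I- tag with a matching label continues the open span.
            current["tokens"].append(token)
        else:
            # "O", or an I- tag with no matching open span, ends the open span.
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [{"label": s["label"], "text": " ".join(s["tokens"])} for s in spans]


# Hypothetical example input; the label set is an assumption.
tokens = ["Jane", "Doe", "Executive", "Producer"]
tags = ["B-NAME", "I-NAME", "B-ROLE", "I-ROLE"]
print(json.dumps(bio_to_spans(tokens, tags), indent=2))
```

A sequence tagged entirely with "O" yields an empty list here, which matches the guideline's description of **Reject OCR** as equivalent to no viable results.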

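The README diff for `llm-silver-anno/README.md` states that rows whose `accepted` field is false should be discarded from the data batch. As a rough sketch of that filtering step, assuming the output CSV serializes booleans as the strings `True` and `False` (the diff does not confirm this), one could do something like the following. The helper `keep_accepted` and the sample rows are hypothetical.

```python
import csv
import io


def keep_accepted(rows):
    """Return only rows that were both adjudicated and accepted."""
    def is_true(value):
        # Assumption: booleans are stored as "True"/"False" text.
        return str(value).strip().lower() == "true"

    return [r for r in rows if is_true(r["adjudicated"]) and is_true(r["accepted"])]


# Hypothetical sample data; only the column names come from the README.
sample = io.StringIO(
    "textdocument,path,adjudicated,accepted\n"
    "Jane Doe Producer,videos/a.mp4,True,True\n"
    "unreadable text,videos/b.mp4,True,False\n"
    "Host John Smith,videos/c.mp4,False,False\n"
)
rows = list(csv.DictReader(sample))
kept = keep_accepted(rows)
print([r["path"] for r in kept])
```

Rows that have not yet been adjudicated are dropped as well in this sketch; whether to hold them back for later review instead is a design choice the README does not specify.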
llm-silver-anno/requirements.txt

+5
@@ -0,0 +1,5 @@
+opencv_python==4.9.0.80
+pandas==2.2.2
+streamlit==1.33.0
+streamlit_extras==0.4.3
+streamlit_shortcuts==0.1.1
