pat-dissertation
Code for Pat's dissertation.
sorter.rb usage
Options
sorter.rb takes the following options:
option | usage |
---|---|
-f, --file | the name of the input file |
-b, --bin-file | the name of the bin csv file |
-t, --type | the splitting mode to use, either "iat" or "pn" |
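As an illustration of how the three flags above could be read, here is a minimal sketch using Ruby's standard OptionParser. It is not taken from sorter.rb: the helper name parse_options and the hash keys are assumptions made for this example.

```ruby
require "optparse"

# Hypothetical helper (not from sorter.rb): parses the documented flags
# into a hash with the assumed keys :file, :bin_file and :type.
def parse_options(argv)
  options = {}
  parser = OptionParser.new do |opts|
    opts.banner = "Usage: sorter.rb [options]"
    opts.on("-f", "--file FILE", "the name of the input file") do |value|
      options[:file] = value
    end
    opts.on("-b", "--bin-file FILE", "the name of the bin csv file") do |value|
      options[:bin_file] = value
    end
    # Restricting the value to the two documented modes makes OptionParser
    # raise OptionParser::InvalidArgument for anything else.
    opts.on("-t", "--type TYPE", %w[iat pn], 'splitting mode: "iat" or "pn"') do |value|
      options[:type] = value
    end
  end
  parser.parse(argv)
  options
end
```

Calling parse_options(ARGV) at the top of a script would then give access to the values as options[:file], options[:bin_file] and options[:type].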
Output
sorter.rb will generate two files:
- [filename]-out.json
- [filename]-out.csv
Both files contain the same data, one in JSON format and one in CSV format. [filename] is the name of the input file without its extension.
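The naming and dual-format output can be sketched roughly as follows. This is an assumption-laden illustration, not the code in sorter.rb: the helpers output_path and write_outputs are hypothetical, and the rows are assumed to be an array of hashes sharing the same keys, since the README does not describe the actual output columns.

```ruby
require "csv"
require "json"

# Hypothetical helper: derives "[filename]-out.<ext>" next to the input file.
def output_path(input_path, ext)
  base = File.basename(input_path, ".*")
  File.join(File.dirname(input_path), "#{base}-out.#{ext}")
end

# Hypothetical helper: writes the same rows once as JSON and once as CSV.
# `rows` is assumed to be an array of hashes that all share the same keys.
def write_outputs(input_path, rows)
  json_path = output_path(input_path, "json")
  csv_path = output_path(input_path, "csv")

  File.write(json_path, JSON.pretty_generate(rows))

  CSV.open(csv_path, "w") do |csv|
    csv << rows.first.keys unless rows.empty? # header row
    rows.each { |row| csv << row.values }
  end

  [json_path, csv_path]
end
```

Writing both files from one in-memory structure is one simple way to guarantee that they carry the same data.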
Type options
The program has two filtering modes:
iat
This mode grabs all text from the input file between PLOVEOPENING and PLOVECLOSING.
It ignores all text before PLOVEOPENING and after PLOVECLOSING.
It does not support multiple sections of text.
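A minimal sketch of this kind of single-section extraction is shown below. The helper name extract_iat is hypothetical, and two behaviors are assumptions rather than documented facts: the match stops at the first PLOVECLOSING after the opening marker, and surrounding whitespace is trimmed.

```ruby
# Hypothetical helper: returns the text between PLOVEOPENING and the first
# PLOVECLOSING that follows it, or nil if either marker is missing.
def extract_iat(text)
  # /m lets "." match newlines so the section may span several lines;
  # the lazy ".*?" stops at the first closing marker.
  match = text.match(/PLOVEOPENING(.*?)PLOVECLOSING/m)
  match && match[1].strip
end
```

Only one section is returned, which mirrors the stated limitation that multiple sections are not supported.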
pn
This mode grabs each section of text from the input file between Narrative: and Signatures:.
It supports multiple sections from a single input text file.
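Collecting several sections from one file could look roughly like this. The helper name extract_pn is hypothetical, and trimming whitespace around each section is an assumption for this sketch, not documented behavior of sorter.rb.

```ruby
# Hypothetical helper: returns every block of text found between a
# "Narrative:" marker and the next "Signatures:" marker, in file order.
def extract_pn(text)
  # scan yields one single-element array per match because the pattern has
  # one capture group, so flatten before trimming each section.
  text.scan(/Narrative:(.*?)Signatures:/m).flatten.map(&:strip)
end
```

An input with no complete Narrative:/Signatures: pair yields an empty array.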
Example:
./sorter.rb --file tester.txt --bin-file bins.csv --type iat
The above command will run against tester.txt, count strings according to bins.csv, and process the input text in iat mode.
It will create tester-out.json and tester-out.csv containing the output data.