Data source
The dataset was originally made available on the State of Michigan’s website at http://www.michigan.gov/snyder/0,4668,7-277-57577_57657-376716--,00.html, where it was still online as of 3/17/20. It is stored there as PDFs ranging in length from one page to ____ pages. Each PDF is named after a state department (presumably the department responding to the FOIA request that the PDF represents). Some PDFs include the FOIA request itself, but most do not begin with this documentation. The PDFs appear to be roughly grouped by email sender, which suggests that the PDF numbering and the order of pages within each PDF may correspond to a) the FOIA response by department and b) the email sender. Department names and acronyms are listed in Appendix A.
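As a rough illustration of how the department-based naming could be used to organize the files, the sketch below groups PDF file names by department label and orders them by sequence number. The file name pattern (a department label, an optional separator, a number, then `.pdf`) and the example names `DEPTA` and `DEPTB` are assumptions made for illustration only; the actual names on the state’s website may follow a different pattern, in which case `NAME_PATTERN` would need to be adjusted.

```python
import re
from collections import defaultdict

# Hypothetical pattern: department label, optional separator, sequence number, ".pdf".
# This is an assumption; the real file names may differ.
NAME_PATTERN = re.compile(
    r"^(?P<dept>[A-Za-z]+)[ _-]?(?P<num>\d+)\.pdf$", re.IGNORECASE
)


def group_by_department(filenames):
    """Group PDF file names by department label, ordered by sequence number.

    Returns (groups, unmatched): groups maps an upper-cased department label
    to its file names; unmatched lists names that do not fit NAME_PATTERN.
    """
    groups = defaultdict(list)
    unmatched = []
    for name in filenames:
        match = NAME_PATTERN.match(name)
        if match is None:
            unmatched.append(name)
            continue
        dept = match.group("dept").upper()
        groups[dept].append((int(match.group("num")), name))
    # Sort each department's files by their numeric suffix.
    ordered = {dept: [n for _, n in sorted(items)] for dept, items in groups.items()}
    return ordered, unmatched


if __name__ == "__main__":
    # Made-up example names, not actual files from the dataset.
    example = ["depta_2.pdf", "DEPTA_1.pdf", "DEPTB-10.pdf", "notes.txt"]
    groups, unmatched = group_by_department(example)
    for dept, names in groups.items():
        print(dept, names)
    print("unmatched:", unmatched)
```

Names that do not fit the assumed pattern are returned separately rather than dropped, so unexpected naming variants can be reviewed by hand.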
In 2018, a student noticed that archived versions of this website contained additional PDFs. The dataset we are working with includes only the data currently available on the live website, although we have one copy of the additional PDFs (rendered through OCR) in the Google Drive folder. We can add these PDFs to the overall dataset at a later stage. Many of the cached PDFs are scanned copies of handwritten notes or paper documents with handwriting in the margins, so they will be more complex to analyze in any case.
Soon, I will organize the pages into a table of contents or outline below.