Zotero and Plain Text Workflow

#1 · May 5, 2023, 9:49 pm

Quote from dandennison84 on May 5, 2023, 9:49 pm
I'm starting a new topic as we were discussing aspects of workflow in another thread. I would like to explain my research workflow and process and how Zotero fits into it.

Plain Text

Plain Text in an age of templates, word processors, and a variety of tools to make life easier seems counterintuitive at first glance. Plain Text, however, solves two very important problems for a researcher. First, it separates the content of your research from the presentation of your research. This allows more specialized tools to focus on different aspects of your process. Second, it minimizes your dependence on proprietary vendor tools that can lock you in and make it more difficult to change or adapt. Proprietary tools can age over time, making their file formats difficult to translate or use. They make it difficult to get your data back out efficiently for use in another tool. They can make it difficult or impossible to integrate with other tools. The enclosed PDF shows a sample of a single file in 3 different apps, 4 different ways.

No Vendor Lock

I opened up my markdown file in Notepad (I'm on Windows). You can see that there IS structure in markdown and Plain Text. It just uses combinations of symbols to structure your document: "---", "#", "**", "[]()", etc. Those mark things like metadata, headings, bullets, and links. The arrows and lines show various real examples.

Next, I opened the same file in Zettlr, my editor of choice. Note that even though it is plain text, Zettlr still renders it.

Next, I opened the same file in Obsidian, another plain text editor. I show 2 screenshots of it, one in plain mode and one in rendered mode.

So for editing, easy peasy. I am not tied to any vendor product. My raw files work on a number of different editors with different features, "out-of-the-box."
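
As a rough illustration of why that structure is not tied to any one editor, here is a minimal Python sketch that labels each line of a markdown file by the symbols mentioned above. The sample text and the `describe_line` helper are made up for this example (they are not the file from the screenshots), and the rules are deliberately simplified; real markdown parsers handle many more cases.

```python
import re

# Hypothetical sample content, invented for illustration only.
SAMPLE = """---
title: 1880 Census Transcript
---
# Source Analysis

**Informant:** unknown

- [Image](census-1880.jpg)
"""


def describe_line(line: str) -> str:
    """Label a line by the markdown symbol it uses (simplified rules)."""
    if line.strip() == "---":
        # Treated here as a metadata fence; "---" can also be a horizontal rule.
        return "metadata fence"
    if re.match(r"#{1,6} ", line):
        return "heading"
    if re.match(r"[-*] ", line):
        return "bullet"
    if re.search(r"\[[^\]]*\]\([^)]*\)", line):
        return "link"
    if "**" in line:
        return "bold"
    return "text"


labels = [describe_line(line) for line in SAMPLE.splitlines()]
for line, label in zip(SAMPLE.splitlines(), labels):
    print(f"{label:15} | {line}")
```

A couple of dozen lines of standard-library code can already recover the outline, which is the practical meaning of "no vendor lock" here.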

Output to Just About Anything

Next, I attached a PDF of 3 different sets of screenshots. From Zettlr, I click "Share", select the format (one click), and it renders what I asked for, "out-of-the-box." I didn't configure anything.

1. An HTML file. I use this a lot.
2. A PDF.
3. A Microsoft Word doc.

Here are a few references on the philosophy and practice of Plain Text.

Sustainable Authorship in Plain Text using Pandoc and Markdown | Programming Historian

The Plain Text Life: Note Taking, Writing and Life Organization Using Plain Text Files | Mark Koester (markwk.com)

Plain Text, Papers, Pandoc - kieranhealy.org

Research Workflow

My research workflow for genealogy is based around the Genealogy Standards book. The best pictures I've found are on Marc McDermott's Genealogy Explained website: How to Structure Your Research and Genealogical Proof Standard. I am also heavily influenced by archival processes, in particular the principles of order and provenance. What this means in practical terms is that I try to keep only a single copy of a document, process it once, and make the sharing transparent. I want to keep only original documents, in chronological order (in genealogical terms), in formats that will stand the test of time. This informs my file naming, directory, and management processes. See, for example:

Archival Arrangement Principles | Lucidea

Archives 101: An Introduction to Archival Processing - The Experiment Station (phillipscollection.org)

How do archivists organize collections? | Peeling the Past (peelarchivesblog.com)

So my workflow has to support the following aspects of genealogy research.

Research question-based workflow.

A research plan designed to answer the research question that contains a list of sources.

An analysis process that examines, analyzes, and evaluates each source in turn.

An ability to follow the Genealogical Proof Standard (GPS) and articulate a conclusion.

Output some kind of result: update a family tree, produce a report, etc.

Example System

I organize my genealogy research using an Ahnentafel system, like this: My Ahnentafel based filing system | English Ancestors. This keeps all of my files with the person I am researching. Collateral lines have name directories under the particular ancestor. If a line is extensive enough or I'm doing research for someone else, it gets its own separate Ahnentafel directory and lines. I've attached a screenshot. I'm not trying to convince you to use it, but you need to understand it to understand how and why I do what I do with Zotero.
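
For anyone unfamiliar with Ahnentafel numbering, the arithmetic behind it is simple: the subject is number 1, the father of person n is 2n, and the mother is 2n + 1. The Python sketch below only illustrates that numbering; the function names are made up for this example, and it says nothing about how the directories in my screenshot are actually named.

```python
def father(n: int) -> int:
    """Ahnentafel number of person n's father."""
    return 2 * n


def mother(n: int) -> int:
    """Ahnentafel number of person n's mother."""
    return 2 * n + 1


def child(n: int) -> int:
    """Ahnentafel number of the child through whom ancestor n descends."""
    if n < 2:
        raise ValueError("person 1 is the subject and has no child in the chart")
    return n // 2


def generation(n: int) -> int:
    """Generations back from the subject (the subject is generation 0)."""
    return n.bit_length() - 1


def line_to_subject(n: int) -> list[int]:
    """Chain of Ahnentafel numbers from ancestor n down to the subject."""
    path = [n]
    while n > 1:
        n = child(n)
        path.append(n)
    return path


print(line_to_subject(13))  # [13, 6, 3, 1]
print(generation(13))       # 3, i.e. a great-grandparent
```

Because every number encodes its own line of descent, a directory named by Ahnentafel number always tells you where that person sits relative to the subject.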

Inside each directory, I keep almost everything related to that person. I've attached a screenshot of that as well. You can see that if I have originals, they are there. Each *.md file is a Plain Text markdown file like the one I showed earlier. There is one per source document. It has a transcript/abstract/extract, source analysis, link to the image, and EE-style citations. I handle censuses a bit differently in that I transcribe the entire sheet in Excel. I tried it in markdown, but that was too painful. Excel works better.

I also keep any research documents, proof summaries, etc. in here. They are generally filed under the primary research subject.

You will also notice a timeline spreadsheet. I have one in every directory. The timeline is where I keep a TODO list of what I need to do for this person.

Inputs into the System

So how do I collect data and add it to my genealogy workflow? This is where Zotero shines and what I use it for. Anytime I read a source, I first add it to Zotero. If I want to read and take notes first, I use a browser plugin, hypothes.is, to mark up the page and take notes. The point is not the particular tooling I'm using; it is the problem being solved: curating a list of sources and processing them in a central place to avoid the Collector's Fallacy. You know, when you have a bazillion links and never go back to them. So Zotero solves the following problems for me.

Centralized location that provides a standard, open framework for saving every source that is important to me. Particularly, its ability to create a source from a web link or DOI or ISBN or other identifier.

For each source, Zotero lets me add other links and notes under it. This is a key feature for me: I can keep things together that belong together. Now in my case, I don't use Zotero notes that much; I use Plain Text markdown for the reasons described above. BUT, I can link that md file to a source.

Zotero lets me create folders/subfolders, and it lets me virtually add the same source to multiple folders. So the real one is always in My Library, or whatever you've named it, but I can have virtual copies of it elsewhere. In this way, I can mirror my Ahnentafel system in Zotero.

Zotero lets me name links independently of the file they are linked to. For example, an 1880 US Census page has entries for your great-grandfather, great-grandmother, and 5 kids. The same page also lists neighbors who belong to a FAN (friends, associates, and neighbors) group you want to track over 30 years. You can create ONE transcript, ONE abstract, and ONE source analysis, and then simply go into those half-dozen different people and link to the same document, naming the link something logical for the person it is attached to. You don't have to create copies and then worry about changes, and you don't have to stick the file in a central location away from the rest of the documents.

Zotero stores and exports its bibliographic data in a number of open standards, including BibTeX (.bib) and JSON for sources and CSL for formatting. You select what you want to export and in what format, and Zotero will create a file on disk that it will also update as changes are made. Your other software then uses that bib, csl, or other file to read Zotero's source information. In Zettlr, for example, I can create a footnote that references the Zotero Citation Key that you see at the top right of your source in Zotero.

Since Zotero uses an open bib format, it can be used by software such as Pandoc to output your markdown files with citations. All the output examples I've attached were generated with Pandoc. I just point my Zettlr software to it and it works.

All of this works out-of-the-box but provides points where I can modify the output or how things work if needed. For example, I can add stylesheets for HTML or Microsoft Word. I can use LaTeX files to help Pandoc generate nice PDFs, etc.
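
For readers who want to see roughly what happens underneath, here is a minimal Python sketch that builds a Pandoc command for a markdown file with citations. I have not checked the exact command Zettlr runs, so treat this as an assumption about the general shape, not a description of Zettlr itself. The file names (`notes.md`, `sources.bib`, `chicago.csl`) and the helper names are hypothetical; `--citeproc`, `--bibliography`, `--csl`, and `-o` are standard Pandoc options (`--citeproc` needs Pandoc 2.11 or later).

```python
import shutil
import subprocess


def pandoc_command(source, output, bibliography, csl=None):
    """Build the argument list for rendering a markdown file with citations."""
    cmd = [
        "pandoc",
        str(source),
        "--citeproc",                       # resolve [@citekey] references
        f"--bibliography={bibliography}",   # e.g. a .bib file exported from Zotero
        "-o",
        str(output),                        # format is inferred from the extension
    ]
    if csl:
        cmd.append(f"--csl={csl}")          # optional citation style file
    return cmd


def render(source, output, bibliography, csl=None):
    """Run Pandoc if it is installed; raise a clear error otherwise."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    subprocess.run(pandoc_command(source, output, bibliography, csl), check=True)


# Hypothetical file names, shown only to illustrate the command shape.
print(" ".join(pandoc_command("notes.md", "notes.html", "sources.bib")))
```

Changing the output name to `notes.docx` or `notes.pdf` switches the format; PDF output additionally requires a PDF engine such as a LaTeX installation.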



#2 · May 6, 2023, 6:02 pm

Very intriguing, Dan. And I look forward to having time to dig into some of these links. Your knowledge of archival practices could be helpful to us all. Thanks!

#3 · May 6, 2023, 7:32 pm

Thanks. I’m interested in how others are solving these problems with their workflows. How are they using Zotero?
