Developer API reference
Warning
The APIs documented here are not stable and may change from one version to another. This reference is meant for developers, both of papis itself and of any external plugins.
papis.bibtex
A set of utilities for working with BibTeX and BibLaTeX (as described in the manual).
- papis.bibtex.bibtex_standard_types = frozenset({'article', 'book', 'bookinbook', 'booklet', 'collection', 'dataset', 'inbook', 'incollection', 'inproceedings', 'inreference', 'manual', 'misc', 'mvbook', 'mvcollection', 'mvproceedings', 'mvreference', 'online', 'patent', 'periodical', 'proceedings', 'reference', 'report', 'software', 'suppbook', 'suppcollection', 'suppperiodical', 'thesis', 'unpublished'})
Regular BibLaTeX types (Section 2.1.1).
- papis.bibtex.bibtex_type_aliases = {'conference': 'inproceedings', 'electronic': 'online', 'mastersthesis': 'thesis', 'phdthesis': 'thesis', 'techreport': 'report', 'www': 'online'}
BibLaTeX type aliases (Section 2.1.2).
- papis.bibtex.bibtex_non_standard_types = frozenset({'artwork', 'audio', 'bibnote', 'commentary', 'image', 'jurisdiction', 'legal', 'legislation', 'letter', 'movie', 'music', 'performance', 'review', 'standard', 'video'})
Non-standard BibLaTeX types (Section 2.1.3).
- papis.bibtex.biblatex_software_types = frozenset({'codefragment', 'software', 'softwaremodule', 'softwareversion'})
BibLaTeX Software types (Section 2).
- papis.bibtex.bibtex_types = frozenset({'article', 'artwork', 'audio', 'bibnote', 'book', 'bookinbook', 'booklet', 'codefragment', 'collection', 'commentary', 'conference', 'dataset', 'electronic', 'image', 'inbook', 'incollection', 'inproceedings', 'inreference', 'jurisdiction', 'legal', 'legislation', 'letter', 'manual', 'mastersthesis', 'misc', 'movie', 'music', 'mvbook', 'mvcollection', 'mvproceedings', 'mvreference', 'online', 'patent', 'performance', 'periodical', 'phdthesis', 'proceedings', 'reference', 'report', 'review', 'software', 'softwaremodule', 'softwareversion', 'standard', 'suppbook', 'suppcollection', 'suppperiodical', 'techreport', 'thesis', 'unpublished', 'video', 'www'})
A set of known BibLaTeX types (as described in Section 2.1 of the manual). These types are a union of the types above and can be extended with extra-bibtex-types.
- papis.bibtex.bibtex_standard_keys = frozenset({'abstract', 'addendum', 'afterword', 'annotation', 'annotator', 'author', 'authortype', 'bookauthor', 'bookpagination', 'booksubtitle', 'booktitle', 'booktitleaddon', 'chapter', 'commentator', 'date', 'doi', 'edition', 'editor', 'editora', 'editoratype', 'editorb', 'editorbtype', 'editorc', 'editorctype', 'editortype', 'eid', 'entrysubtype', 'eprint', 'eprintclass', 'eprinttype', 'eventdate', 'eventtitle', 'eventtitleaddon', 'file', 'foreword', 'holder', 'howpublished', 'indextitle', 'institution', 'introduction', 'isan', 'isbn', 'ismn', 'isrn', 'issn', 'issue', 'issuesubtitle', 'issuetitle', 'issuetitleaddon', 'iswc', 'journalsubtitle', 'journaltitle', 'journaltitleaddon', 'label', 'language', 'library', 'location', 'mainsubtitle', 'maintitle', 'maintitleaddon', 'month', 'nameaddon', 'note', 'number', 'organization', 'origdate', 'origlanguage', 'origlocation', 'origpublisher', 'origtitle', 'pages', 'pagetotal', 'pagination', 'part', 'publisher', 'pubstate', 'reprinttitle', 'series', 'shortauthor', 'shorteditor', 'shorthand', 'shorthandintro', 'shortjournal', 'shortseries', 'shorttitle', 'subtitle', 'title', 'titleaddon', 'translator', 'url', 'urldate', 'venue', 'version', 'volume', 'volumes', 'year'})
BibLaTeX data fields (Section 2.2.2).
- papis.bibtex.bibtex_key_aliases = {'address': 'location', 'annote': 'annotation', 'archiveprefix': 'eprinttype', 'journal': 'journaltitle', 'key': 'sortkey', 'pdf': 'file', 'primaryclass': 'eprintclass', 'school': 'institution'}
BibLaTeX field aliases (Section 2.2.5).
- papis.bibtex.bibtex_special_keys = frozenset({'crossref', 'entryset', 'execute', 'gender', 'ids', 'indexsorttitle', 'keywords', 'langid', 'langidopts', 'options', 'presort', 'related', 'relatedoptions', 'relatedstring', 'relatedtype', 'sortkey', 'sortname', 'sortshorthand', 'sorttitle', 'sortyear', 'xdata', 'xref'})
Special BibLaTeX fields (Section 2.2.3).
- papis.bibtex.biblatex_software_keys = frozenset({'abstract', 'author', 'date', 'doi', 'editor', 'eprint', 'eprintclass', 'eprinttype', 'file', 'hal_id', 'hal_version', 'institution', 'introducedin', 'license', 'month', 'note', 'organization', 'publisher', 'related', 'relatedstring', 'relatedtype', 'repository', 'subtitle', 'swhid', 'title', 'url', 'urldate', 'version', 'year'})
BibLaTeX software keys (Section 3). Most of these keys are already standard BibLaTeX keys from bibtex_standard_keys.
- papis.bibtex.bibtex_keys = frozenset({'abstract', 'addendum', 'address', 'afterword', 'annotation', 'annotator', 'annote', 'archiveprefix', 'author', 'authortype', 'bookauthor', 'bookpagination', 'booksubtitle', 'booktitle', 'booktitleaddon', 'chapter', 'commentator', 'crossref', 'date', 'doi', 'edition', 'editor', 'editora', 'editoratype', 'editorb', 'editorbtype', 'editorc', 'editorctype', 'editortype', 'eid', 'entryset', 'entrysubtype', 'eprint', 'eprintclass', 'eprinttype', 'eventdate', 'eventtitle', 'eventtitleaddon', 'execute', 'file', 'foreword', 'gender', 'hal_id', 'hal_version', 'holder', 'howpublished', 'ids', 'indexsorttitle', 'indextitle', 'institution', 'introducedin', 'introduction', 'isan', 'isbn', 'ismn', 'isrn', 'issn', 'issue', 'issuesubtitle', 'issuetitle', 'issuetitleaddon', 'iswc', 'journal', 'journalsubtitle', 'journaltitle', 'journaltitleaddon', 'key', 'keywords', 'label', 'langid', 'langidopts', 'language', 'library', 'license', 'location', 'mainsubtitle', 'maintitle', 'maintitleaddon', 'month', 'nameaddon', 'note', 'number', 'options', 'organization', 'origdate', 'origlanguage', 'origlocation', 'origpublisher', 'origtitle', 'pages', 'pagetotal', 'pagination', 'part', 'pdf', 'presort', 'primaryclass', 'publisher', 'pubstate', 'related', 'relatedoptions', 'relatedstring', 'relatedtype', 'repository', 'reprinttitle', 'school', 'series', 'shortauthor', 'shorteditor', 'shorthand', 'shorthandintro', 'shortjournal', 'shortseries', 'shorttitle', 'sortkey', 'sortname', 'sortshorthand', 'sorttitle', 'sortyear', 'subtitle', 'swhid', 'title', 'titleaddon', 'translator', 'url', 'urldate', 'venue', 'version', 'volume', 'volumes', 'xdata', 'xref', 'year'})
A set of known BibLaTeX fields (as described in Section 2.2 of the manual). These fields are a union of the above fields and can be extended with extra-bibtex-keys.
- papis.bibtex.bibtex_type_required_keys = {'article': ({'author'}, {'title'}, {'eprinttype', 'journaltitle'}, {'date', 'year'}), 'book': ({'author'}, {'title'}, {'date', 'year'}), 'booklet': ({'author', 'editor'}, {'title'}, {'date', 'year'}), 'codefragment': ({'url'},), 'collection': ({'editor'}, {'title'}, {'date', 'year'}), 'dataset': ({'author', 'editor'}, {'title'}, {'date', 'year'}), 'inbook': ({'author'}, {'title'}, {'booktitle'}, {'date', 'year'}), 'incollection': ({'author'}, {'title'}, {'editor'}, {'booktitle'}, {'date', 'year'}), 'inproceedings': ({'author'}, {'title'}, {'booktitle'}, {'date', 'year'}), 'manual': ({'author', 'editor'}, {'title'}, {'date', 'year'}), 'misc': ({'author', 'editor'}, {'title'}, {'date', 'year'}), 'online': ({'author', 'editor'}, {'title'}, {'date', 'year'}, {'doi', 'eprint', 'url'}), 'patent': ({'author'}, {'title'}, {'number'}, {'date', 'year'}), 'periodical': ({'editor'}, {'title'}, {'date', 'year'}), 'proceedings': ({'title'}, {'date', 'year'}), 'report': ({'author'}, {'title'}, {'type'}, {'institution'}, {'date', 'year'}), 'software': ({'author', 'editor'}, {'title'}, {'url'}, {'year'}), 'softwaremodule': ({'author'}, {'subtitle'}, {'url'}, {'year'}), 'softwareversion': ({'author', 'editor'}, {'title'}, {'url'}, {'version'}, {'year'}), 'thesis': ({'author'}, {'title'}, {'type'}, {'institution'}, {'date', 'year'}), 'unpublished': ({'author'}, {'title'}, {'date', 'year'}), None: ()}
A mapping of supported BibLaTeX entry types (see bibtex_types) to BibLaTeX fields (see bibtex_keys). Each value is a tuple of disjoint sets. Each set can contain multiple alternative fields, one of which is required for the particular type, e.g. an article may require either a year or a date field.
- papis.bibtex.bibtex_type_required_keys_aliases = {'bookinbook': 'inbook', 'inreference': 'incollection', 'mvbook': 'book', 'mvcollection': 'collection', 'mvproceedings': 'proceedings', 'mvreference': 'collection', 'reference': 'collection', 'suppbook': 'book', 'suppcollection': 'collection', 'suppperiodical': 'periodical'}
A mapping for additional BibLaTeX types that have the same required fields. This mapping can be used to convert types before looking into bibtex_type_required_keys.
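The two mappings above can be combined to check an entry for missing fields. The following is a minimal sketch that does not import papis: it copies a small excerpt of the documented mappings, and the helper name missing_required_keys and the use of a "type" key on the entry are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Small excerpt copied from the mappings documented above.
REQUIRED_KEYS: Dict[str, Tuple[Set[str], ...]] = {
    "article": ({"author"}, {"title"},
                {"eprinttype", "journaltitle"}, {"date", "year"}),
    "book": ({"author"}, {"title"}, {"date", "year"}),
}
REQUIRED_KEYS_ALIASES: Dict[str, str] = {"mvbook": "book", "suppbook": "book"}


def missing_required_keys(entry: Dict[str, str]) -> List[Set[str]]:
    """Return the groups of alternative fields that *entry* does not satisfy.

    Hypothetical helper, not part of papis.
    """
    bib_type = entry.get("type", "")
    # Resolve types that share their requirements with another type.
    bib_type = REQUIRED_KEYS_ALIASES.get(bib_type, bib_type)
    groups = REQUIRED_KEYS.get(bib_type, ())
    # A group is satisfied when at least one of its fields is present.
    return [group for group in groups if group.isdisjoint(entry)]


entry = {"type": "mvbook", "author": "A. Author", "title": "A Title"}
print(missing_required_keys(entry))
```

An empty result means that every group of alternatives has at least one field present in the entry.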
- papis.bibtex.bibtex_type_converter: Dict[str, str] = {'OriginalPaper': 'article', 'annotation': 'misc', 'attachment': 'misc', 'audioRecording': 'audio', 'bill': 'legislation', 'blogPost': 'online', 'bookSection': 'inbook', 'case': 'jurisdiction', 'computerProgram': 'software', 'conferencePaper': 'inproceedings', 'dictionaryEntry': 'misc', 'document': 'article', 'email': 'online', 'encyclopediaArticle': 'article', 'film': 'video', 'forumPost': 'online', 'hearing': 'jurisdiction', 'instantMessage': 'online', 'interview': 'article', 'journal': 'article', 'journalArticle': 'article', 'magazineArticle': 'article', 'manuscript': 'unpublished', 'map': 'misc', 'monograph': 'book', 'newspaperArticle': 'article', 'note': 'misc', 'podcast': 'audio', 'preprint': 'unpublished', 'presentation': 'misc', 'radioBroadcast': 'audio', 'statute': 'jurisdiction', 'tvBroadcast': 'video', 'videoRecording': 'video', 'webpage': 'online'}
A mapping of arbitrary types to BibLaTeX types in bibtex_types. This mapping can be used when translating from other software, e.g. Zotero has custom fields in its schema.
- papis.bibtex.bibtex_key_converter: Dict[str, str] = {'abstractNote': 'abstract', 'conferenceName': 'eventtitle', 'place': 'location', 'proceedingsTitle': 'booktitle', 'publicationTitle': 'journal', 'university': 'school'}
A mapping of arbitrary fields to BibLaTeX fields in bibtex_keys. This mapping can be used when translating from other software.
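As an illustration of how the two converter mappings might be applied to a record coming from other software, here is a minimal sketch. It uses a small excerpt of the documented values instead of importing papis; the function translate_record and the assumption that the record stores its type under a "type" key are hypothetical.

```python
from typing import Any, Dict

# Excerpts copied from the converter mappings documented above.
TYPE_CONVERTER: Dict[str, str] = {
    "journalArticle": "article",
    "conferencePaper": "inproceedings",
    "webpage": "online",
}
KEY_CONVERTER: Dict[str, str] = {
    "abstractNote": "abstract",
    "publicationTitle": "journal",
    "place": "location",
}


def translate_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Rename foreign types and fields; unknown names pass through unchanged."""
    result: Dict[str, Any] = {}
    for key, value in record.items():
        if key == "type":
            # Assumption: the entry type is stored under the "type" key.
            result["type"] = TYPE_CONVERTER.get(value, value)
        else:
            result[KEY_CONVERTER.get(key, key)] = value
    return result


print(translate_record({
    "type": "journalArticle",
    "publicationTitle": "Annalen der Physik",
    "title": "A Title",
}))
```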
- papis.bibtex.bibtex_ignore_keys = frozenset({'file'})
A set of BibLaTeX fields to ignore when exporting from the Papis database. These can be extended with bibtex-ignore-keys.
- papis.bibtex.ref_allowed_characters = '([^a-zA-Z0-9._]+|(?<!\\\\)[._])'
A regex for acceptable characters to use in a reference string. These are used by ref_cleanup() to remove any undesired characters.
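To see what this pattern matches, the sketch below applies it with re.sub and an empty replacement. This is an assumption about how the pattern is used, not the actual ref_cleanup() implementation (which also involves slugify); the helper name strip_disallowed is hypothetical.

```python
import re

# The documented value, written here as a raw string.
REF_ALLOWED_CHARACTERS = r"([^a-zA-Z0-9._]+|(?<!\\)[._])"


def strip_disallowed(ref: str) -> str:
    """Remove every match of the pattern from *ref* (hypothetical helper)."""
    return re.sub(REF_ALLOWED_CHARACTERS, "", ref)


# Whitespace and an unescaped "." are matched by the pattern and removed.
print(strip_disallowed("Einstein 1905.a"))
# A "_" preceded by a backslash is not matched, but the backslash itself is.
print(strip_disallowed(r"a\_b"))
```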
- papis.bibtex.bibtex_verbatim_fields = frozenset({'doi', 'eprint', 'file', 'pdf', 'url', 'urlraw'})
A set of fields that should not be escaped. In general, these will be escaped by the BibTeX engine and should not be modified (e.g. Verbatim fields and URI fields in Section 2.2.1).
- papis.bibtex.exporter(documents: List[Document]) str [source]
Convert documents into a list of BibLaTeX entries
- class papis.bibtex.Importer(**kwargs: Any)[source]
Importer that parses BibTeX files or strings.
Here, uri can either be a BibTeX string, a local BibTeX file or a remote URL (with an HTTP or HTTPS protocol).
- classmethod match(uri: str) Importer | None [source]
Check if the importer can process the given URI.
For example, an importer that supports links from the arXiv can check that the given URI matches using:
re.match(r".*arxiv.org.*", uri)
This can then be used to instantiate and return a corresponding Importer object.
- Parameters:
uri – A URI where the document information should be retrieved from.
- Returns:
An importer instance if the match to the URI is successful or None otherwise.
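The matching pattern described above can be sketched with a standalone class. ArxivLikeImporter is a hypothetical stand-in that does not derive from the real Importer base class; only the shape of match() follows the description.

```python
import re
from typing import Optional


class ArxivLikeImporter:
    """Hypothetical stand-in for an importer; not the papis base class."""

    def __init__(self, uri: str) -> None:
        self.uri = uri

    @classmethod
    def match(cls, uri: str) -> Optional["ArxivLikeImporter"]:
        # Return an instance when the URI looks supported and None otherwise.
        if re.match(r".*arxiv.org.*", uri):
            return cls(uri=uri)
        return None


importer = ArxivLikeImporter.match("https://arxiv.org/abs/1234.5678")
print(importer is not None)
```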
- papis.bibtex.bibtexparser_entry_to_papis(entry: Dict[str, Any]) Dict[str, Any] [source]
Convert the keys of a BibTeX entry parsed by bibtexparser to a papis-compatible format.
- Parameters:
entry – a dictionary with keys parsed by bibtexparser.
- Returns:
a dictionary with keys converted to a papis-compatible format.
- papis.bibtex.bibtex_to_dict(bibtex: str) List[Dict[str, str]] [source]
Convert a BibTeX file (or string) to a list of Papis-compatible dictionaries.
This will convert an entry like
    @article{ref,
        author = { ... },
        title = { ... },
        ...,
    }
to a dictionary such as
    {
        "type": "article",
        "author": "...",
        "title": "...",
        ...
    }
- Parameters:
bibtex – a path to a BibTeX file or a string containing BibTeX formatted data. If it is a file, its contents are passed to BibTexParser.
- Returns:
a list of entries from the BibTeX data in a compatible format.
- papis.bibtex.ref_cleanup(ref: str) str [source]
Function to clean up reference strings so that they are accepted by BibLaTeX.
This uses the ref_allowed_characters to remove any disallowed characters from the given ref. Furthermore, slugify is used to remove unicode characters and ensure consistent use of the underscore _ as a separator.
- Returns:
a reference without any disallowed characters.
- papis.bibtex.create_reference(doc: Dict[str, Any], force: bool = False) str [source]
Try to create a reference for the document doc.
If the document doc does not have a "ref" key, this function attempts to create one, otherwise the existing key is returned. When creating a new reference:
  - the ref-format key is used, if available,
  - the document DOI is used, if available,
  - a string is constructed from the document data (author, title, etc.).
- Parameters:
force – if True, the reference is re-created even if the document already has a "ref" key.
- Returns:
a clean (see ref_cleanup()) reference for the document.
- papis.bibtex.to_bibtex(document: Document, *, indent: int = 2) str [source]
Convert a document to a BibTeX entry containing only valid metadata.
To convert a document, it must have a valid BibTeX type (see bibtex_types) and a valid reference under the "ref" key (see create_reference()). Valid BibTeX keys (see bibtex_keys) are exported, while other keys are ignored (see bibtex_ignore_keys) with the following rules:
  - bibtex-unicode is used to control whether the field values can contain unicode characters.
  - bibtex-journal-key is used to define the field name for the journal.
  - bibtex-export-file is used to also add a "file" field to the BibTeX entry, which can be used by e.g. Zotero to import documents.
- Parameters:
document – a Papis document.
indent – set indentation for the BibTeX fields.
- Returns:
a string containing the document metadata in a BibTeX format.
papis.citations
- papis.citations.Citations
A list of citations for an existing document.
- papis.citations.get_metadata_citations(doc: Document | Dict[str, Any]) List[Dict[str, Any]] [source]
Get the citations in the metadata that contain a DOI.
- papis.citations.fetch_citations(doc: Document) List[Dict[str, Any]] [source]
Retrieve citations for the document.
Citation retrieval is mainly based on querying Crossref metadata based on the DOI of the document. If the document does not have a DOI, this function will fail to retrieve any citations.
- Returns:
a list of citations that have a DOI.
- papis.citations.get_citations_from_database(dois: Sequence[str]) List[Dict[str, Any]] [source]
Look for document DOIs in the database.
- Parameters:
dois – a sequence of DOIs to look for in the current library database.
- Returns:
a sequence of documents from the current library that match the given dois, if any.
- papis.citations.update_and_save_citations_from_database_from_doc(doc: Document) None [source]
Update the citations file of an existing document.
This function will get any existing citations in the document, update them as appropriate and save them back to the citation file.
- papis.citations.update_citations_from_database(citations: List[Dict[str, Any]]) List[Dict[str, Any]] [source]
Update a list of citations with data from the database.
- Parameters:
citations – a list of existing citations to update.
- papis.citations.save_citations(doc: Document, citations: List[Dict[str, Any]]) None [source]
Save the citations to the document’s citation file.
- papis.citations.fetch_and_save_citations(doc: Document) None [source]
Retrieve citations from available sources and save them to the citations file.
- papis.citations.get_citations_file(doc: Document) str | None [source]
Get the document’s citation file path (see citations-file-name).
- Returns:
an absolute path to the citations file for doc.
- papis.citations.has_citations(doc: Document) bool [source]
- Returns:
True if the document has an existing citations file and False otherwise.
- papis.citations.get_citations(doc: Document) List[Dict[str, Any]] [source]
Retrieve citations from the document’s citation file.
- papis.citations.get_cited_by_file(doc: Document) str | None [source]
Get the document’s cited-by file (see cited-by-file-name).
- Returns:
an absolute path to the cited-by file for doc.
- papis.citations.has_cited_by(doc: Document) bool [source]
- Returns:
True if the document has a cited-by file and False otherwise.
- papis.citations.save_cited_by(doc: Document, citations: List[Dict[str, Any]]) None [source]
Save the cited-by list citations to the document’s cited-by file.
- papis.citations.fetch_cited_by_from_database(cit: Dict[str, Any]) List[Dict[str, Any]] [source]
Fetch a list of documents that cite cit from the database.
- Parameters:
cit – a citation to look for in the database.
- Returns:
a list of documents that cite cit.
papis.cli
- papis.cli.bool_flag(*args: Any, **kwargs: Any) Callable[[...], Any] [source]
A wrapper to click.option() that hardcodes a boolean flag option.
- papis.cli.query_argument(**attrs: Any) Callable[[...], Any] [source]
Adds a query argument as a click decorator.
- papis.cli.query_option(**attrs: Any) Callable[[...], Any] [source]
Adds a -q, --query option as a click decorator.
- papis.cli.sort_option(**attrs: Any) Callable[[...], Any] [source]
Adds a --sort and a --reverse option as a click decorator.
- papis.cli.doc_folder_option(**attrs: Any) Callable[[...], Any] [source]
Adds a --doc-folder argument as a click decorator.
- papis.cli.all_option(**attrs: Any) Callable[[...], Any] [source]
Adds a --all option as a click decorator.
- papis.cli.git_option(**attrs: Any) Callable[[...], Any] [source]
Adds a --git option as a click decorator.
- papis.cli.handle_doc_folder_or_query(query: str, doc_folder: str | Tuple[str, ...] | None) List[Document] [source]
Query database for documents.
This handles the query_option() and doc_folder_option() command-line arguments. If a doc_folder is given, then the document at that location is loaded, otherwise the database is queried using query.
- Parameters:
query – a database query string.
doc_folder – existing document folder (see papis.document.from_folder()).
- papis.cli.handle_doc_folder_query_sort(query: str, doc_folder: str | Tuple[str, ...] | None, sort_field: str | None, sort_reverse: bool) List[Document] [source]
Query database for documents.
Similar to handle_doc_folder_or_query(), but also handles the sort_option() arguments. It sorts the resulting documents according to sort_field and sort_reverse.
- Parameters:
sort_field – field by which to sort the resulting documents (see papis.document.sort()).
sort_reverse – if True, the fields are sorted in reverse order.
- papis.cli.handle_doc_folder_query_all_sort(query: str, doc_folder: str | Tuple[str, ...] | None, sort_field: str | None, sort_reverse: bool, _all: bool) List[Document] [source]
Query database for documents.
Similar to handle_doc_folder_query_sort(), but also handles the all_option() argument.
- Parameters:
_all – if False, the user is prompted to pick a subset of documents (see papis.api.pick_doc()).
- papis.cli.bypass(group: Group, command: Command, command_name: str) Callable[[...], Any] [source]
Overwrite existing papis commands.
This function is especially important for developing scripts in papis.
For example, consider augmenting the add command, as seen when using papis add. In this case, we may want to add some additional options or behavior before calling papis.commands.add, but would like to avoid writing it from scratch. This function can then be used as follows to allow this
    import click
    import papis.cli
    import papis.commands.add

    @click.group()
    def main():
        """Your main app"""
        pass

    @papis.cli.bypass(main, papis.commands.add.cli, "add")
    def add(**kwargs):
        # do some logic here...
        # and call the original add command line function by
        papis.commands.add.cli.bypassed(**kwargs)
papis.commands
- class papis.commands.AliasedGroup(name: str | None = None, commands: MutableMapping[str, Command] | Sequence[Command] | None = None, **attrs: Any)[source]
A click.Group that accepts command aliases.
This group command is taken from here and is to be used for groups with aliases. In this case, aliases are defined as prefixes of the command, so for a command named remove, rem is also accepted as long as it is unique.
- format_commands(ctx: Context, formatter: HelpFormatter) None [source]
Overwrite the default formatting.
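The prefix-based aliasing described for AliasedGroup can be sketched without click. The function resolve_alias below is a hypothetical illustration of the "unique prefix" rule only, not the class's actual lookup code.

```python
from typing import List, Optional


def resolve_alias(prefix: str, commands: List[str]) -> Optional[str]:
    """Return the command that *prefix* refers to, or None if ambiguous or unknown."""
    # An exact command name always wins.
    if prefix in commands:
        return prefix
    matches = [name for name in commands if name.startswith(prefix)]
    # Only a unique prefix is accepted as an alias.
    return matches[0] if len(matches) == 1 else None


commands = ["add", "remove", "rename"]
print(resolve_alias("rem", commands))
print(resolve_alias("re", commands))
```

Here "rem" only matches remove, while "re" matches both remove and rename and is therefore rejected.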
- class papis.commands.Script(command_name: str, path: str | None, plugin: Command | None)[source]
A papis command plugin or script.
These plugins are made available through the main papis command-line interface as subcommands.
- plugin: Command | None
A click.Command if the script is registered as an entry point.
- papis.commands.get_external_scripts() Dict[str, Script] [source]
Get a mapping of all external scripts that should be registered with Papis.
An external script is an executable that can be found in the papis.config.get_scripts_folder() folder or in the user’s PATH. External scripts are recognized if they are prefixed with papis-.
- Returns:
a mapping of scripts that have been found.
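A discovery step of this kind could look roughly like the following sketch. It is not the papis implementation: the function find_prefixed_executables, the "first folder wins" rule and the simple execute-bit check are all assumptions made for illustration.

```python
import os
import stat
from typing import Dict, Iterable

EXECUTE_BITS = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def find_prefixed_executables(folders: Iterable[str],
                              prefix: str = "papis-") -> Dict[str, str]:
    """Map command names to paths of executables named ``<prefix><name>``."""
    found: Dict[str, str] = {}
    for folder in folders:
        if not os.path.isdir(folder):
            continue
        for name in sorted(os.listdir(folder)):
            path = os.path.join(folder, name)
            if not name.startswith(prefix) or not os.path.isfile(path):
                continue
            # Simple POSIX-style check: any execute bit set on the file.
            if os.stat(path).st_mode & EXECUTE_BITS:
                # Assumption: the first folder that provides a name wins.
                found.setdefault(name[len(prefix):], path)
    return found


# Search the folders listed in the user's PATH.
scripts = find_prefixed_executables(os.environ.get("PATH", "").split(os.pathsep))
print(sorted(scripts))
```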
- papis.commands.get_scripts() Dict[str, Script] [source]
Get a mapping of commands that should be registered with Papis.
This finds all the commands that are registered as entry points in the namespace "papis.command".
- Returns:
a mapping of scripts that have been found.
- papis.commands.get_all_scripts() Dict[str, Script] [source]
Get a mapping of all commands that should be registered with Papis.
This includes the results from get_external_scripts() and get_scripts(). Entrypoint-based scripts take priority, so if an external script with the same name is found it is silently ignored.
- Returns:
a mapping of scripts that have been found.
papis.config
- papis.config.get_general_settings_name() str [source]
Get the section name of the general settings.
    >>> get_general_settings_name()
    'settings'
- class papis.config.Configuration[source]
A subclass of configparser.ConfigParser with custom defaults.
This class automatically reads the configuration file and imports any required scripts. If no file exists, a default one is created.
Use get_configuration() to instantiate this class instead of calling it directly.
- papis.config.get_default_settings() Dict[str, Dict[str, Any]] [source]
Get the default settings for all non-user variables.
Additional user variables can be registered using
register_default_settings()
and will be included in this dictionary.
- papis.config.register_default_settings(settings_dictionary: Dict[str, Dict[str, Any]]) None [source]
Register configuration settings into the global configuration registry.
Notice that you can define sections or global options. For instance, let us suppose that a script called foobar defines some configuration options. In the script there could be the following defined
    import papis.config

    options = {"foobar": {"command": "open"}}
    papis.config.register_default_settings(options)
which can then be accessed globally through
    papis.config.get("command", section="foobar")
- Parameters:
settings_dictionary – a dictionary of configuration settings, where the first level of keys defines the sections and the second level defines the actual configuration settings.
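The two-level structure of settings_dictionary can be illustrated with a small standalone registry. The names _DEFAULTS, register_defaults and get_default are hypothetical and only mimic the described behavior; they are not the papis internals.

```python
from typing import Any, Dict, Optional

# Hypothetical registry: first level is the section, second level the settings.
_DEFAULTS: Dict[str, Dict[str, Any]] = {}


def register_defaults(settings: Dict[str, Dict[str, Any]]) -> None:
    """Merge *settings* into the registry, section by section."""
    for section, options in settings.items():
        _DEFAULTS.setdefault(section, {}).update(options)


def get_default(key: str, section: str) -> Optional[Any]:
    """Look up a registered default, returning None when it is unknown."""
    return _DEFAULTS.get(section, {}).get(key)


register_defaults({"foobar": {"command": "open"}})
print(get_default("command", section="foobar"))
```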
- papis.config.get_config_home() str [source]
- Returns:
a (platform dependent) base directory relative to which user specific configuration files should be stored.
- papis.config.get_config_folder() str [source]
Get the main configuration folder.
- Returns:
a (platform dependent) folder where the configuration files are stored, e.g. $HOME/.config/papis on POSIX platforms.
- papis.config.get_config_file() str [source]
Get the main configuration file.
- Returns:
the path of the main configuration file, which by default is in get_config_folder(), but can be overwritten using set_config_file().
- papis.config.get_configpy_file() str [source]
Get the main Python configuration file.
This is a file that will get automatically eval()ed if it exists and allows for more dynamic configuration.
- Returns:
the path of the main Python configuration file, which by default is in get_config_folder().
- papis.config.get_scripts_folder() str [source]
- Returns:
the folder where additional scripts are stored, which by default is in get_config_folder().
- papis.config.set(key: str, value: Any, section: str | None = None) None [source]
Set a key in the configuration.
- Parameters:
key – the name of the key to set.
value – the value to set it to, which can be any value understood by the Configuration.
section – the name of the section to set the key in.
- papis.config.general_get(key: str, section: str | None = None, data_type: type | None = None) Any | None [source]
Get the value for a given key in section.
This function is a bit more general than the get from Configuration (see configparser.ConfigParser.get()). In particular it supports
  - Providing the key and section, in which case it will retrieve the key from that section directly.
  - The key has the format <section>-<key> and no section is specified. In this case, the full key is expected to be in the general settings section or a library section.
The priority of the search is given by
  1. The key is retrieved from a library section.
  2. The key is retrieved from the given section, if any.
  3. The key is retrieved from the general section.
- Parameters:
key – a key in the configuration file to retrieve.
section – a section from which to retrieve the key, which defaults to get_general_settings_name().
data_type – the data type that should be expected for the value of the variable.
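One possible reading of the search priority above is sketched below with a plain configparser.ConfigParser. The function lookup, its library parameter and the section names used in the example are assumptions for illustration; the real function also handles data types and defaults, which are left out here.

```python
from configparser import ConfigParser
from typing import List, Optional, Tuple

GENERAL_SECTION = "settings"


def lookup(config: ConfigParser, key: str, library: str,
           section: Optional[str] = None) -> Optional[str]:
    """Search the library section, then the given section, then the general one."""
    # Outside of its own section, the key is spelled "<section>-<key>".
    qualified = f"{section}-{key}" if section is not None else key
    candidates: List[Tuple[str, str]] = [(library, qualified)]
    if section is not None:
        candidates.append((section, key))
    candidates.append((GENERAL_SECTION, qualified))

    for name, option in candidates:
        if config.has_section(name) and config.has_option(name, option):
            return config.get(name, option)
    return None


config = ConfigParser()
config.read_dict({
    "settings": {"foobar-command": "from-general"},
    "foobar": {"command": "from-section"},
    "papers": {"dir": "/tmp/papers", "foobar-command": "from-library"},
})
print(lookup(config, "command", library="papers", section="foobar"))
```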
- papis.config.get(key: str, section: str | None = None) Any | None [source]
Retrieve a general value (can be None) from the configuration file.
- papis.config.getint(key: str, section: str | None = None) int | None [source]
Retrieve an integer value from the configuration file.
    >>> set("something", 42)
    >>> getint("something")
    42
- papis.config.getfloat(key: str, section: str | None = None) float | None [source]
Retrieve a floating point value from the configuration file.
    >>> set("something", 0.42)
    >>> getfloat("something")
    0.42
- papis.config.getboolean(key: str, section: str | None = None) bool | None [source]
Retrieve a boolean value from the configuration file.
    >>> set("add-open", True)
    >>> getboolean("add-open")
    True
- papis.config.getstring(key: str, section: str | None = None) str [source]
Retrieve a string value from the configuration file.
    >>> set("add-open", "hello world")
    >>> getstring("add-open")
    'hello world'
- papis.config.getlist(key: str, section: str | None = None) List[str] [source]
Retrieve a list value from the configuration file.
This function uses eval() to evaluate the string present in the configuration file into a Python list. This can be unsafe if the list contains unknown code.
    >>> set("tags", "['a', 'b', 'c']")
    >>> getlist("tags")
    ['a', 'b', 'c']
- Raises:
SyntaxError – Whenever the parsed syntax is either not a valid python object or not a valid python list.
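For comparison, a list-valued setting can also be parsed without eval(). The sketch below uses ast.literal_eval from the standard library; it is an alternative shown for illustration and not what getlist() is documented to do. The helper name parse_list is hypothetical.

```python
import ast
from typing import List


def parse_list(raw: str) -> List[str]:
    """Parse a string such as "['a', 'b']" into a list of strings."""
    # literal_eval only accepts Python literals, so no code is executed.
    value = ast.literal_eval(raw)
    if not isinstance(value, list):
        raise SyntaxError(f"expected a Python list, got {type(value).__name__}")
    return [str(item) for item in value]


print(parse_list("['a', 'b', 'c']"))
```

Note that ast.literal_eval can raise ValueError for input that is valid Python but not a literal, so callers of such a helper may want to handle both exception types.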
- papis.config.get_configuration() Configuration [source]
Get the configuration object.
If no configuration has been initialized, it initializes one. Only one configuration per process should ever be configured.
- papis.config.merge_configuration_from_path(path: str | None, configuration: Configuration) None [source]
Merge information of a configuration file found in path into configuration.
- Parameters:
path – a path to a configuration file.
configuration – an existing Configuration object.
- papis.config.set_lib_from_name(libname: str) None [source]
Set the current library from a name.
- Parameters:
libname – the name of a library in the configuration file or a path to an existing folder that should be considered a library.
- papis.config.get_lib_from_name(libname: str) Library [source]
Get a library object from a name.
- Parameters:
libname – the name of a library in the configuration file or a path to an existing folder that should be considered a library.
- papis.config.get_lib() Library [source]
Get current library.
If there is no library set before, the default library will be retrieved. If the PAPIS_LIB environment variable is defined, this is the library name (or path) that will be taken as a default.
- papis.config.get_libs_from_config(config: Configuration) List[str] [source]
Get all library names from the given configuration.
In the configuration file, any sections that contain a "dir" or a "dirs" key are considered to be libraries.
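The rule above is easy to reproduce with the standard configparser module. The following sketch is not the papis implementation; the function library_names and the example configuration contents are made up for illustration.

```python
from configparser import ConfigParser
from typing import List


def library_names(config: ConfigParser) -> List[str]:
    """Return the sections that define a "dir" or a "dirs" key."""
    return [
        name for name in config.sections()
        if config.has_option(name, "dir") or config.has_option(name, "dirs")
    ]


config = ConfigParser()
config.read_string("""
[settings]
add-open = True

[papers]
dir = ~/Documents/papers

[books]
dirs = ['~/Documents/books']
""")
print(library_names(config))
```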
- papis.config.reset_configuration() Configuration [source]
Resets the existing configuration and returns a new one without any user settings.
- papis.config.escape_interp(path: str) str [source]
Escape paths added to the configuration file.
By default, the papis.config.Configuration enables string interpolation in the key values (e.g. using key = %(other_key)s-suffix). Any paths added to the configuration should then be escaped so that they do not interfere with the interpolation.
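With the standard configparser interpolation, a literal percent sign is written as "%%". The sketch below shows that kind of escaping on a plain ConfigParser; the helper escape_percent and the example path are hypothetical, and whether escape_interp() does exactly this is an assumption.

```python
from configparser import ConfigParser


def escape_percent(path: str) -> str:
    """Double every "%" so that interpolation leaves the path untouched."""
    return path.replace("%", "%%")


config = ConfigParser()
config.add_section("papers")
# Without escaping, this value would be rejected as invalid interpolation syntax.
config.set("papers", "dir", escape_percent("/data/100%papers"))

# Reading the value back applies interpolation and restores the single "%".
print(config.get("papers", "dir"))
```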
papis.docmatcher
- class papis.docmatcher.ParseResult(search: str, pattern: Pattern[str], doc_key: str | None)[source]
Result from parsing a search string.
For example, a search string such as "author:einstein" will result in
    r = ParseResult(search="einstein", pattern=<...>, doc_key="author")
- pattern: Pattern[str]
A regex pattern constructed from the search using get_regex_from_search().
- class papis.docmatcher.MatcherCallable(*args, **kwargs)[source]
A callable typing.Protocol used to match a document for a given search.
- __call__(document: Document, search: Pattern[str], match_format: str | None = None, doc_key: str | None = None) Any [source]
Match a document’s keys to a given search pattern.
The matcher can decide whether the match_format or the doc_key take priority when matching against the given pattern in search. If possible, doc_key should be given priority as the more specific choice.
- Parameters:
search – a regex pattern to match the query against (see ParseResult.pattern).
match_format – a format string (see papis.format.format()) to match against.
doc_key – a specific key in the document to match against.
- Returns:
None if the match fails and anything else otherwise.
- class papis.docmatcher.DocMatcher[source]
This class implements the mini query language for papis.
The (static) methods should be used as follows:
First, the search string has to be set:
    DocMatcher.set_search(search_string)
Then, the parse method should be called in order to decipher the search_string:
    DocMatcher.parse()
Finally, the DocMatcher is ready to match documents with the input query via:
    DocMatcher.return_if_match(doc)
- parsed_search: ClassVar[List[ParseResult] | None] = None
A parsed version of the search string using parse_query().
- matcher: ClassVar[MatcherCallable | None] = None
A MatcherCallable used to match the document to the parsed_search.
- match_format: ClassVar[str] = ''
A format string (defaulting to match-format) used to match the parsed search results if no document key is present.
- classmethod return_if_match(doc: Document) Document | None [source]
Use DocMatcher.parsed_search to match the doc against the query.
    >>> import papis.document
    >>> from papis.database.cache import match_document
    >>> doc = papis.document.from_data({'title': 'einstein'})
    >>> DocMatcher.set_matcher(match_document)
    >>> result = DocMatcher.parse('einste')
    >>> DocMatcher.return_if_match(doc) is not None
    True
    >>> result = DocMatcher.parse('heisenberg')
    >>> DocMatcher.return_if_match(doc) is not None
    False
    >>> result = DocMatcher.parse('title : ein')
    >>> DocMatcher.return_if_match(doc) is not None
    True
- Parameters:
doc – a Papis document to match against.
- classmethod set_search(search: str) None [source]
Set the search for this instance of the matcher.
>>> DocMatcher.set_search('author:Hummel')
>>> DocMatcher.search
'author:Hummel'
- classmethod set_matcher(matcher: MatcherCallable) None [source]
Set the matcher callable for the search.
>>> from papis.database.cache import match_document
>>> DocMatcher.set_matcher(match_document)
- classmethod parse(search: str | None = None) List[ParseResult] [source]
Parse the main query text.
This method also sets DocMatcher.parsed_search to the resulting parsed query and returns it.
>>> print(DocMatcher.parse('hello author : einstein'))
[['hello'], ['author', 'einstein']]
>>> print(DocMatcher.parse(''))
[]
>>> print(
...     DocMatcher.parse(
...         '"hello world whatever :" tags : \'hello ::::\''))
[['hello world whatever :'], ['tags', 'hello ::::']]
>>> print(DocMatcher.parse('hello'))
[['hello']]
- Parameters:
search – a custom search text string that overrides search.
- Returns:
a parsed query.
- papis.docmatcher.get_regex_from_search(search: str) Pattern[str] [source]
Creates a default regex from a search string.
>>> get_regex_from_search(' ein 192 photon').pattern
'.*ein.*192.*photon.*'
>>> get_regex_from_search('{1234}').pattern
'.*\\{1234\\}.*'
- Parameters:
search – a valid search string.
- Returns:
a regular expression representing the search string, which is properly escaped and allows for multiple spaces.
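The behaviour shown in the doctest above can be approximated with the standard library alone. The following is a minimal sketch, not the papis implementation: the name regex_from_search_sketch is hypothetical, and the case-insensitive flag is an assumption that the documentation does not confirm.

```python
import re


def regex_from_search_sketch(search: str) -> re.Pattern:
    """Hypothetical stand-in for get_regex_from_search (not the papis code)."""
    # Split on any run of whitespace, so repeated spaces are harmless.
    words = search.split()
    # Escape each word so characters such as "{" are matched literally.
    escaped = [re.escape(word) for word in words]
    # Allow arbitrary text before, between and after the words.
    # NOTE: re.IGNORECASE is an assumption made for this sketch.
    return re.compile(".*" + ".*".join(escaped) + ".*", re.IGNORECASE)


print(regex_from_search_sketch(" ein 192 photon").pattern)
print(regex_from_search_sketch("{1234}").pattern)
```

Escaping each word separately keeps the joining ".*" pieces as real regex syntax while the user input stays literal.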
- papis.docmatcher.parse_query(query_string: str) List[ParseResult] [source]
Parse a query string using pyparsing.
The query language implemented by this function for Papis supports strings of the form:
'hello author : Einstein title: "Fancy Title: Part 1" tags'
which will result in
results = [
    ParseResult(search="hello", pattern=<...>, doc_key=None),
    ParseResult(search="Einstein", pattern=<...>, doc_key="author"),
    ParseResult(search="Fancy Title: Part 1", pattern=<...>, doc_key="title"),
    ParseResult(search="tags", pattern=<...>, doc_key=None),
]
We can see there that constructs of the form "key:value", with the colon as a separator, are recognized and parsed into document keys and their search terms. A colon can be escaped by enclosing the term in quotes. Otherwise, each individual word in the search query will give another ParseResult. Each search term can contain additional regex characters.
- Parameters:
query_string – a search string to parse into a structured format.
- Returns:
a list of parsing results for each token in the query string.
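To make the key:value grammar concrete, here is a small sketch that tokenizes a query with the standard re module instead of pyparsing. It is an illustration only: Term and parse_query_sketch are hypothetical names, Term omits the compiled pattern that ParseResult carries, and edge cases may be handled differently by the real parser.

```python
import re
from typing import List, NamedTuple, Optional


class Term(NamedTuple):
    """Hypothetical, simplified stand-in for ParseResult."""
    search: str
    doc_key: Optional[str]


# A token is a quoted string, a lone colon, or a bare word.
_TOKEN = re.compile(
    r"""(?P<dq>"[^"]*")|(?P<sq>'[^']*')|(?P<colon>:)|(?P<word>[^\s:"']+)"""
)


def parse_query_sketch(query: str) -> List[Term]:
    tokens = []
    for m in _TOKEN.finditer(query):
        text = m.group()
        if m.lastgroup in ("dq", "sq"):
            text = text[1:-1]  # drop the surrounding quotes
        tokens.append((m.lastgroup, text))

    terms: List[Term] = []
    i = 0
    while i < len(tokens):
        kind, text = tokens[i]
        if kind == "colon":
            i += 1  # ignore a stray colon
        elif (i + 2 < len(tokens) and tokens[i + 1][0] == "colon"
                and tokens[i + 2][0] != "colon"):
            # "key : value" consumes three tokens.
            terms.append(Term(search=tokens[i + 2][1], doc_key=text))
            i += 3
        else:
            # A plain word or quoted string matches against the whole document.
            terms.append(Term(search=text, doc_key=None))
            i += 1
    return terms


for term in parse_query_sketch('hello author : Einstein title: "Fancy Title: Part 1" tags'):
    print(term)
```

Because quoted strings are consumed as a single token, the colon inside "Fancy Title: Part 1" never reaches the key:value rule.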
papis.document
Module defining the main document type.
- papis.document.DocumentLike
A union of types that can be converted to a document.
- papis.document.EmptyKeyConversion = {'action': None, 'key': None}
A default KeyConversion.
- class papis.document.KeyConversionPair(from_key, rules)[source]
-
- rules: List[KeyConversion]
A list of KeyConversion key mapping rules used to rename and post-process the from_key and its value.
- papis.document.keyconversion_to_data(conversions: Sequence[KeyConversionPair], data: Dict[str, Any], keep_unknown_keys: bool = False) Dict[str, Any] [source]
Function to convert between dictionaries.
This can be used to define a fixed set of translation rules between, e.g., JSON data obtained from a website API and standard papis key names and formatting. The implementation is completely generic.
For example, we have the simple dictionary
data = {"id": "10.1103/physrevb.89.140501"}
which contains the DOI of a document with the wrong key. We can then write the following rules
conversions = [
    KeyConversionPair("id", [
        {"key": "doi", "action": None},
        {"key": "url", "action": lambda x: "https://doi.org/{}".format(x)}
    ])
]
new_data = keyconversion_to_data(conversions, data)
to rename the "id" key to the standard "doi" key used by papis and to add a "url" key built from it. Any number of such rules can be written, depending on the complexity of the incoming data. Note that any errors raised on the application of the action will be silently ignored and the corresponding key will be skipped.
- Parameters:
conversions – a sequence of KeyConversionPairs used to convert the data.
data – a dict to be converted according to conversions.
keep_unknown_keys – if True, unknown keys from data are kept in the resulting dictionary. Otherwise, only keys from conversions are present.
- Returns:
a new dict containing the entries from data converted according to conversions.
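The rules described above (rename a key, optionally apply an action, skip keys whose action fails, optionally keep unknown keys) can be sketched without papis. This is an illustration under stated assumptions, not the library code: convert_sketch is a hypothetical name, and plain (from_key, rules) tuples stand in for KeyConversionPair.

```python
from typing import Any, Dict, List, Sequence, Tuple

Rule = Dict[str, Any]


def convert_sketch(
    conversions: Sequence[Tuple[str, List[Rule]]],
    data: Dict[str, Any],
    keep_unknown_keys: bool = False,
) -> Dict[str, Any]:
    """Hypothetical re-implementation of the documented behaviour."""
    new_data: Dict[str, Any] = {}
    for from_key, rules in conversions:
        if from_key not in data:
            continue
        for rule in rules:
            # A missing or None "key" keeps the original name.
            to_key = rule.get("key") or from_key
            action = rule.get("action")
            try:
                value = data[from_key] if action is None else action(data[from_key])
            except Exception:
                # Errors raised by an action are ignored; the key is skipped.
                continue
            new_data[to_key] = value

    if keep_unknown_keys:
        known = {from_key for from_key, _ in conversions}
        for key, value in data.items():
            if key not in known:
                new_data.setdefault(key, value)
    return new_data


conversions = [
    ("id", [
        {"key": "doi", "action": None},
        {"key": "url", "action": lambda x: "https://doi.org/{}".format(x)},
    ]),
]
print(convert_sketch(conversions, {"id": "10.1103/physrevb.89.140501"}))
```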
- papis.document.author_list_to_author(data: Dict[str, Any], separator: str | None = None, multiple_authors_format: str | None = None) str [source]
Convert a list of authors into a single author string.
This uses the multiple-authors-separator and the multiple-authors-format settings to construct the concatenated authors.
- Parameters:
data – a dict that contains an "author_list" key to be converted into a single author string.
>>> author1 = {"given": "Some", "family": "Author"}
>>> author2 = {"given": "Other", "family": "Author"}
>>> author_list_to_author({"author_list": [author1, author2]})
'Author, Some and Author, Other'
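The result in the example above is consistent with formatting each author and joining the pieces with a separator. A minimal sketch follows; the function name and both default values are assumptions chosen to reproduce the doctest output, not values read from the papis configuration.

```python
from typing import Dict, List


def join_authors_sketch(
    author_list: List[Dict[str, str]],
    separator: str = " and ",                  # assumed default
    fmt: str = "{au[family]}, {au[given]}",    # assumed default
) -> str:
    """Format every author with fmt and join the results with separator."""
    return separator.join(fmt.format(au=author) for author in author_list)


author1 = {"given": "Some", "family": "Author"}
author2 = {"given": "Other", "family": "Author"}
print(join_authors_sketch([author1, author2]))
```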
- papis.document.guess_authors_separator(authors: str) str [source]
Attempt to determine the separator for various non-BibTeX author lists.
- Parameters:
authors – author string to determine the separator for.
- Returns:
a regex that can be used to split the authors string.
For example:
>>> s = "Sanger, F. and Nicklen, S. and Coulson, A. R."
>>> assert guess_authors_separator(s) == "and"
>>> s = "Fabian Sanger and Steven Nicklen and Alexander R. Coulson"
>>> assert guess_authors_separator(s) == "and"
>>> s = "Fabian Sanger, Steven Nicklen, Alexander R. Coulson"
>>> assert guess_authors_separator(s) == ","
>>> s = "Fabian Sanger, and Steven Nicklen, and Alexander R. Coulson"
>>> import re
>>> sep = guess_authors_separator(s)
>>> assert re.match(sep, ", and")
>>> s = "Dagobert Duck and von Beethoven, Ludwig and Ford, Jr., Henry"
>>> assert guess_authors_separator(s) == "and"
>>> s = "Turing, A. M."
>>> assert guess_authors_separator(s) == "and"
- papis.document.split_author_name(author: str) Dict[str, Any] [source]
Split an author name into a given and family name.
This uses bibtexparser.customization.splitname() to correctly split and determine the first and last names of an author in the list. Note that this is just a heuristic and can give incorrect results for certain author names.
- Parameters:
author – a string containing an author name.
- Returns:
a dict with the family and given name of the author.
- papis.document.split_authors_name(authors: str | List[str], separator: str | None = None) List[Dict[str, Any]] [source]
Convert list of authors to a fixed format.
Uses split_author_name() to construct the individual authors and the separator to split the authors in the list.
- Parameters:
authors – a list of author names, where each entry can consist of multiple authors separated by separator.
separator – a separator for entries in authors that contain multiple authors. If None, a separator is guessed using guess_authors_separator().
- class papis.document.DocHtmlEscaped(doc: Document)[source]
Small helper class to escape HTML elements in a document.
>>> DocHtmlEscaped(from_data({"title": '> >< int & "" "'}))['title']
'&gt; &gt;&lt; int &amp; &quot;&quot; &quot;'
- class papis.document.Document(folder: str | None = None, data: Dict[str, Any] | None = None)[source]
An abstract document in a papis library.
This class inherits from a standard dict and implements some additional functionality.
- html_escape
A DocHtmlEscaped instance that can be used to escape keys in the document for use in HTML documents.
- set_folder(folder: str) None [source]
Set the document’s main folder.
This also updates the location of the info file and other attributes. Note, however, that it will not load any data from the given folder even if it contains another info file (see from_folder() for this functionality).
- Parameters:
folder – an absolute path to a new main folder for the document.
- get_main_folder() str | None [source]
- Returns:
the root path in the filesystem where the document is stored, if any.
- get_main_folder_name() str | None [source]
- Returns:
the folder name of the document, i.e. the basename of the path returned by get_main_folder().
- get_info_file() str [source]
- Returns:
path to the info file, which can also be an empty string if no such file has been created.
- get_files() List[str] [source]
Get the files linked to the document.
The files in a document are stored relative to its main folder. If no main folder is set on the document (see set_folder()), then this function will not return any files. To retrieve the relative file paths only, access doc["files"] directly.
- Returns:
a list of absolute file paths in the document’s main folder, if any.
- papis.document.from_data(data: Dict[str, Any]) Document [source]
Construct a Document from a dictionary.
- Parameters:
data – a dictionary to be made into a new document.
- papis.document.from_folder(folder_path: str) Document [source]
Construct a Document from a folder.
- Parameters:
folder_path – absolute path to a valid papis folder.
- papis.document.to_json(document: Document) str [source]
Export the document to JSON.
- Returns:
a JSON string corresponding to all the entries in the document.
- papis.document.to_dict(document: Document) Dict[str, Any] [source]
Convert a document back into a standard dict.
- Returns:
a dict corresponding to all the entries in the document.
- papis.document.dump(document: Document) str [source]
Dump the document into a formatted string.
The format of the string is not fixed and is meant to be used to display the document entries in a consistent way across papis.
- Returns:
a string containing all the entries in the document.
>>> doc = from_data({'title': 'Hello World'})
>>> dump(doc)
'title: Hello World'
- papis.document.delete(document: Document) None [source]
Delete a document from the filesystem.
This function deletes the main folder of the document (recursively), but it does not delete the in-memory version of the document.
- papis.document.describe(document: Document | Dict[str, Any]) str [source]
- Returns:
a string description of the current document using document-description-format.
- papis.document.move(document: Document, path: str) None [source]
Move the document to a new main folder at path.
This supposes that the document exists in the location document.get_main_folder() and will change the folder in the input document as a result.
- Parameters:
path – absolute path where the document should be moved to. This path is expected to not exist yet and will be created by this function.
>>> doc = from_data({'title': 'Hello World'})
>>> doc.set_folder('path/to/folder')
>>> import tempfile; newfolder = tempfile.mkdtemp()
>>> move(doc, newfolder)
Traceback (most recent call last):
...
FileExistsError: There is already...
- papis.document.sort(docs: Sequence[Document], key: str, reverse: bool = False) List[Document] [source]
Sort a list of documents by the given key.
The sort is performed on the key with a priority given to the type of the value. If the key does not exist in the document, this is given the lowest priority and left at the end of the list.
- Parameters:
docs – a sequence of documents.
key – a key in the documents by which to sort.
reverse – if True, the sorting is done in reverse order (descending instead of ascending).
- Returns:
a list of documents sorted by key.
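One way to realise "priority by type, missing keys last" is a composite sort key. The sketch below is an assumption-laden illustration, not the papis implementation: the function name is hypothetical, integer-like values are ranked before other values, and documents without the key stay at the end even when reverse is True.

```python
from typing import Any, Dict, List, Sequence, Tuple


def sort_docs_sketch(
    docs: Sequence[Dict[str, Any]], key: str, reverse: bool = False
) -> List[Dict[str, Any]]:
    def sort_key(doc: Dict[str, Any]) -> Tuple[int, int, str]:
        value = doc[key]
        try:
            # Integer-like values (2001 or "2001") compare numerically.
            return (0, int(value), "")
        except (TypeError, ValueError):
            # Everything else compares as a string, after the integers.
            return (1, 0, str(value))

    present = [doc for doc in docs if key in doc]
    missing = [doc for doc in docs if key not in doc]
    # Documents lacking the key are appended unsorted at the end.
    return sorted(present, key=sort_key, reverse=reverse) + missing


docs = [{"year": 2001}, {"title": "no year"}, {"year": "1999"}]
print(sort_docs_sketch(docs, "year"))
```

The tuple key avoids comparing an int with a str directly, which would raise a TypeError in Python 3.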
- papis.document.new(folder_path: str, data: Dict[str, Any], files: Sequence[str] | None = None) Document [source]
Creates a complete document with data and existing files.
The document is saved to the filesystem at folder_path and all the given files are copied over to the main folder.
- Parameters:
folder_path – a main folder for the document.
data – a dict with keys and values to be used as metadata in the document.
files – a sequence of files to add to the document.
- Raises:
FileExistsError – if folder_path already exists.
papis.downloaders
- class papis.downloaders.Importer(uri: str = '')[source]
Importer that tries to get data and files from implemented downloaders.
This importer simply calls get_info_from_url() on the given URI.
- classmethod match(uri: str) Importer | None [source]
Check if the importer can process the given URI.
For example, an importer that supports links from the arXiv can check that the given URI matches using:
re.match(r".*arxiv.org.*", uri)
This can then be used to instantiate and return a corresponding Importer object.
- Parameters:
uri – A URI where the document information should be retrieved from.
- Returns:
An importer instance if the match to the URI is successful or None otherwise.
- fetch() None [source]
Fetch metadata and files for the given uri.
This method calls Importer.fetch_data() and Importer.fetch_files() to get all the information available for the document. It is recommended to implement the two methods separately, if possible, for maximum flexibility.
The imported data is stored in ctx and it is not queried again on subsequent calls to this function.
- class papis.downloaders.Downloader(uri: str = '', name: str = '', ctx: Context | None = None, expected_document_extension: Sequence[str] | str | None = None, cookies: Dict[str, str] | None = None, priority: int = 1)[source]
A base class for downloader instances implementing common functionality.
In general, downloaders are expected to implement a subset of the methods below, depending on the generality. A simple downloader could implement only get_bibtex_url() and get_document_url().
- expected_document_extension
A single extension or a list of extensions supported by the downloader. The extensions do not contain the leading dot, e.g. ["pdf", "djvu"].
- priority
A priority given to the downloader. This is used when trying to automatically determine a preferred downloader for a given URL.
- session
A requests.Session that is used for all the requests made by the downloader.
- classmethod match(url: str) Downloader | None [source]
Check if the downloader can process the given URL.
For example, a downloader that supports links from the arXiv can check that the given URL matches using:
re.match(r".*arxiv.org.*", url)
This can then be used to instantiate and return a corresponding Downloader object.
- Parameters:
url – A URL where the document information should be retrieved from.
- Returns:
A downloader instance if the match to the URL is successful or None otherwise.
- fetch() None [source]
Fetch metadata and files for the given uri.
This method calls Downloader.fetch_data() and Downloader.fetch_files() to get all the information available for the document. It is recommended to implement the two methods separately, if possible, for maximum flexibility.
The imported data is stored in ctx and it is not queried again on subsequent calls to this function.
- fetch_data() None [source]
Fetch metadata for the given URL.
The imported metadata is stored in ctx. To fetch the metadata, the following steps are followed:
Call get_data() to import any scraped metadata.
Call get_bibtex_data() to import any metadata from BibTeX files available remotely.
Note that each step overwrites the information gathered by the previous one, i.e. the BibTeX data will take priority.
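The priority rule in the steps above amounts to applying dictionary updates in order. The following sketch uses plain dicts and a hypothetical function name; it does not call any downloader and only illustrates why the BibTeX values win on conflicting keys.

```python
from typing import Any, Dict


def merge_metadata_sketch(
    scraped: Dict[str, Any], bibtex: Dict[str, Any]
) -> Dict[str, Any]:
    """Hypothetical illustration of the documented update order."""
    merged: Dict[str, Any] = {}
    merged.update(scraped)  # step 1: scraped metadata
    merged.update(bibtex)   # step 2: BibTeX metadata overwrites on conflict
    return merged


scraped = {"title": "a scraped title", "url": "https://example.org/paper"}
bibtex = {"title": "A Proper Title", "year": 2020}
print(merge_metadata_sketch(scraped, bibtex))
```

Keys that appear only in the scraped data, such as "url" here, survive the merge.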
- fetch_files() None [source]
Fetch files from the given uri.
The imported files are stored in ctx. The file is downloaded with download_document() and stored as a temporary file.
- get_bibtex_url() str | None [source]
- Returns:
a URL to a valid BibTeX file that can be used to extract metadata about the document.
- get_bibtex_data() str | None [source]
Get BibTeX data available at get_bibtex_url(), if any.
- Returns:
a string containing the BibTeX data, which can be parsed.
- download_bibtex() None [source]
Download and store the BibTeX data from get_bibtex_url().
Use get_bibtex_data() to access the metadata from the BibTeX URL.
- get_data() Dict[str, Any] [source]
Retrieve general metadata from the given URL.
This function is meant to be as general as possible and should not contain data imported from BibTeX (use get_bibtex_data() instead). For example, this can be used for web scraping or calling other website APIs to gather metadata about the document.
- get_document_data() bytes | None [source]
Get data for the downloaded file that is given by get_document_url().
- Returns:
the bytes (stored in memory) for the downloaded file.
- get_document_extension() str [source]
- Returns:
a guess for the extension of get_document_data(). This is based on filetype and uses magic file signatures to determine the type. If no guess is valid, an empty string is returned.
- download_document() None [source]
Download and store the file that is given by get_document_url().
Use get_document_data() to access the file binary contents.
- check_document_format() bool [source]
Check if the document downloaded by download_document() has a file type supported by the downloader.
If the downloader has no preferred type, then all files are accepted.
- Returns:
True if the document has a supported file type and False otherwise.
- papis.downloaders.get_available_downloaders() List[Type[Downloader]] [source]
Get all declared downloader classes.
- papis.downloaders.get_matching_downloaders(url: str) List[Downloader] [source]
Get downloaders matching the given url.
- Parameters:
url – a URL to match.
- Returns:
a list of downloaders (sorted by priority).
- papis.downloaders.get_downloader_by_name(name: str) Type[Downloader] [source]
Get a specific downloader by its name.
- Parameters:
name – the name of the downloader. Note that this is the name of the entry point used to define the downloader. In general, this should be the same as its name, but this is not enforced.
- Returns:
a downloader class.
- papis.downloaders.get_info_from_url(url: str, expected_doc_format: str | None = None) Context [source]
Get information directly from the given url.
- Parameters:
url – the URL of a resource.
expected_doc_format – an expected document file type, that is used to override the file type defined by the chosen downloader.
- papis.downloaders.download_document(url: str, expected_document_extension: str | None = None, cookies: Dict[str, Any] | None = None, filename: str | None = None) str | None [source]
Download a document from url and store it in a local file.
An appropriate filename is deduced from the HTTP response in most cases. If this is not possible, a temporary file is created instead. To ensure that the desired filename is chosen, provide the filename argument instead.
- Parameters:
url – the URL of a remote file.
expected_document_extension – an expected file extension. If None, then an extension is guessed from the file contents or from the filename.
filename – a file name for the document, regardless of the given URL and extension.
- Returns:
an absolute path to a local file containing the data from url.
papis.exceptions
This module implements custom exceptions used to make the code more readable.
papis.filetype
- papis.filetype.guess_content_extension(content: bytes) str | None [source]
Guess the extension from (potential) file contents.
This method attempts to look at known file signatures to determine the file type. This is not always possible, as it is hard to determine a unique type.
- Parameters:
content – contents of a file.
- Returns:
an extension string (e.g. “pdf” without the dot) or None if the file type cannot be determined.
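Signature-based detection can be illustrated with a few well-known magic numbers. This is a sketch only: the function names and the short signature table are assumptions for illustration, and the real function may recognise many more formats. The "data" fallback mirrors the default documented for get_document_extension().

```python
from typing import Optional

# A few well-known file signatures (magic numbers) and their extensions.
_SIGNATURES = (
    (b"%PDF-", "pdf"),
    (b"AT&TFORM", "djvu"),
    (b"\x89PNG\r\n\x1a\n", "png"),
)


def guess_extension_sketch(content: bytes) -> Optional[str]:
    """Return an extension without the dot, or None if nothing matches."""
    for magic, extension in _SIGNATURES:
        if content.startswith(magic):
            return extension
    return None


def extension_or_default_sketch(content: bytes, default: str = "data") -> str:
    """Fall back to a generic extension when the type is unknown."""
    return guess_extension_sketch(content) or default


print(guess_extension_sketch(b"%PDF-1.7\n..."))
print(extension_or_default_sketch(b"plain text"))
```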
- papis.filetype.guess_document_extension(document_path: str) str | None [source]
Guess the extension of a given file at document_path.
- Parameters:
document_path – path to an existing file.
- Returns:
an extension string (e.g. “pdf” without the dot) or None if the file type cannot be determined.
- papis.filetype.get_document_extension(document_path: str) str [source]
Get an extension for the file at document_path.
This uses guess_document_extension() and returns a default extension “data” if no specific type can be determined from the file.
- Parameters:
document_path – path to an existing file.
- Returns:
an extension string.
papis.format
- papis.format.FORMATTER_EXTENSION_NAME = 'papis.format'
The entry point name for formatter plugins.
- exception papis.format.InvalidFormatterError[source]
An exception that is thrown when an invalid formatter is selected.
- exception papis.format.FormatFailedError[source]
An exception that is thrown when a format string fails to be interpolated.
This can happen due to lack of data (e.g. missing fields in the document) or invalid format strings (e.g. passed to the wrong formatter).
- class papis.format.Formatter[source]
A generic formatter that works on templated strings using a document.
- format(fmt: str, doc: Document | Dict[str, Any], doc_key: str = '', additional: Dict[str, Any] | None = None, default: str | None = None) str [source]
- Parameters:
fmt – a format string understood by the formatter.
doc – an object convertible to a document.
doc_key – the name of the document in the format string. By default, this falls back to format-doc-name.
default – an optional string to use as a default value if the formatting fails. If no default is given, a FormatFailedError will be raised.
additional – a dict of additional entries to pass to the formatter.
- Returns:
a string with all the replacement fields filled in.
- class papis.format.PythonFormatter[source]
Construct a string using a PEP 3101 (str.format based) format string.
This formatter is named "python" and can be set using the formatter setting in the configuration file. The formatted string has access to the doc variable, that is always a papis.document.Document. A string using this formatter can look like
"{doc[year]} - {doc[author_list][0][family]} - {doc[title]}"
Note, however, that according to PEP 3101 some simple formatting is not possible. For example, the following is not allowed
"{doc[title].lower()}"
and should be replaced with
"{doc[title]!l}"
The following special conversions are implemented: “l” for str.lower(), “u” for str.upper(), “t” for str.title(), “c” for str.capitalize(), “y” that uses slugify (through papis.paths.normalize_path()). Additionally, the following syntax is available to select subsets from a string
"{doc[title]:1.3S}"
which will select the words[1:3] from the title (words are split by single spaces).
- format(fmt: str, doc: Document | Dict[str, Any], doc_key: str = '', additional: Dict[str, Any] | None = None, default: str | None = None) str [source]
- Parameters:
fmt – a format string understood by the formatter.
doc – an object convertible to a document.
doc_key – the name of the document in the format string. By default, this falls back to format-doc-name.
default – an optional string to use as a default value if the formatting fails. If no default is given, a FormatFailedError will be raised.
additional – a dict of additional entries to pass to the formatter.
- Returns:
a string with all the replacement fields filled in.
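Custom conversions such as "!l" and format specs such as ":1.3S" can be added to PEP 3101 formatting by subclassing string.Formatter from the standard library. The class below is a sketch of that technique under stated assumptions: its name is hypothetical, the "y" conversion is left out, and it is not claimed to match the PythonFormatter source.

```python
import string
from typing import Any


class ConversionFormatterSketch(string.Formatter):
    """Hypothetical formatter adding a few conversions and a word slice."""

    def convert_field(self, value: Any, conversion: str) -> Any:
        if conversion == "l":
            return str(value).lower()
        if conversion == "u":
            return str(value).upper()
        if conversion == "t":
            return str(value).title()
        if conversion == "c":
            return str(value).capitalize()
        # Defer to the standard "r", "s" and "a" conversions.
        return super().convert_field(value, conversion)

    def format_field(self, value: Any, format_spec: str) -> str:
        if isinstance(value, str) and format_spec.endswith("S"):
            # "1.3S" selects words[1:3]; words are split on single spaces.
            start, _, stop = format_spec[:-1].partition(".")
            words = value.split(" ")
            return " ".join(words[int(start or 0):int(stop) if stop else None])
        return super().format_field(value, format_spec)


formatter = ConversionFormatterSketch()
doc = {"title": "The Quick Brown Fox", "year": 2020}
print(formatter.format("{doc[year]} - {doc[title]!l}", doc=doc))
print(formatter.format("{doc[title]:1.3S}", doc=doc))
```

The built-in str.format rejects unknown conversion characters, which is why the override lives on a string.Formatter subclass instead.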
- class papis.format.Jinja2Formatter[source]
Construct a string using Jinja2 templates.
This formatter is named "jinja2" and can be set using the formatter setting in the configuration file. The formatted string has access to the doc variable, that is always a papis.document.Document. A string using this formatter can look like
"{{ doc.year }} - {{ doc.author_list[0].family }} - {{ doc.title }}"
This formatter supports the whole range of Jinja2 control structures and filters so more advanced string processing is possible. For example, we can titlecase the title using
"{{ doc.title | title }}"
or give a default value if a key is missing in the document using
"{{ doc.isbn | default('ISBN-NONE', true) }}"
- format(fmt: str, doc: Document | Dict[str, Any], doc_key: str = '', additional: Dict[str, Any] | None = None, default: str | None = None) str [source]
- Parameters:
fmt – a format string understood by the formatter.
doc – an object convertible to a document.
doc_key – the name of the document in the format string. By default, this falls back to format-doc-name.
default – an optional string to use as a default value if the formatting fails. If no default is given, a FormatFailedError will be raised.
additional – a dict of additional entries to pass to the formatter.
- Returns:
a string with all the replacement fields filled in.
- papis.format.get_formatter(name: str | None = None) Formatter [source]
Initialize and return a formatter plugin.
Note that the formatter is cached and all subsequent calls to this function will return the same formatter.
- Parameters:
name – the name of the desired formatter. By default, this uses the value of the formatter setting.
- papis.format.format(fmt: str, doc: Document | Dict[str, Any], doc_key: str = '', additional: Dict[str, Any] | None = None, default: str | None = None) str [source]
Format a string using the selected formatter.
This is the user-facing function that should be called when formatting a string. The formatters should not be called directly.
Arguments match those of Formatter.format().
papis.git
This module serves as a lightweight interface for git-related functions.
- papis.git.add(path: str, resource: str) None [source]
Adds changes to a resource in the path to the git index.
- Parameters:
path – a folder with an existing git repository.
resource – a resource (e.g. info.yaml file) to add to the index.
- papis.git.commit(path: str, message: str) None [source]
Commits changes in the path with a message.
- Parameters:
path – a folder with an existing git repository.
message – a commit message.
- papis.git.mv(from_path: str, to_path: str) None [source]
Renames (moves) the path from_path to to_path.
- Parameters:
from_path – path to be moved (the source).
to_path – destination where from_path is moved. If this is in the same parent directory as from_path, it is a simple rename.
- papis.git.remove(path: str, resource: str, recursive: bool = False, force: bool = True) None [source]
Remove a resource from the git repository at path.
- Parameters:
path – a folder with an existing git repository.
resource – a resource (e.g. info.yaml file) to remove from git.
recursive – if True, the given resource is removed recursively.
force – if True, the removal is forced so any errors (e.g. file does not exist) are silently ignored.
- papis.git.add_and_commit_resource(path: str, resource: str, message: str) None [source]
Adds and commits a single resource.
- Parameters:
path – a folder with an existing git repository.
resource – a resource (e.g. info.yaml file) to add and commit.
message – a commit message.
papis.hooks
- papis.hooks.HOOKS_EXTENSION_FORMAT = 'papis.hook.{name}'
All hooks should be in this namespace, e.g. papis.hook.on_edit_done.
- papis.hooks.CUSTOM_LOCAL_HOOKS: Dict[str, List[Callable[[...], None]]] = {}
A dictionary of hooks added with add(). These can be added in config.py or from other places that do not use the entrypoint framework.
- papis.hooks.run(name: str, *args: Any, **kwargs: Any) None [source]
Run a hook given by its name.
Additional positional and keyword arguments are passed directly to the hook. If it does not support these arguments, the hook will be skipped.
Hooks are run in the following order:
The hooks defined by an entry point.
The hooks defined in CUSTOM_LOCAL_HOOKS.
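The local part of this mechanism can be pictured as a dictionary mapping hook names to lists of callables. The sketch below is hypothetical: LOCAL_HOOKS_SKETCH, add_hook_sketch and run_hooks_sketch are stand-ins, entry-point hooks are not modelled, and using inspect.signature to skip hooks that do not accept the given arguments is an assumption about how "skipped" could be implemented.

```python
import inspect
from typing import Any, Callable, Dict, List

# Hypothetical stand-in for a registry of locally added hooks.
LOCAL_HOOKS_SKETCH: Dict[str, List[Callable[..., None]]] = {}


def add_hook_sketch(name: str, hook: Callable[..., None]) -> None:
    LOCAL_HOOKS_SKETCH.setdefault(name, []).append(hook)


def run_hooks_sketch(name: str, *args: Any, **kwargs: Any) -> None:
    for hook in LOCAL_HOOKS_SKETCH.get(name, []):
        try:
            # Check that the hook accepts these arguments before calling it.
            inspect.signature(hook).bind(*args, **kwargs)
        except TypeError:
            continue  # incompatible signature: skip this hook
        hook(*args, **kwargs)


seen: List[str] = []
add_hook_sketch("on_edit_done", lambda doc: seen.append(f"edited {doc}"))
add_hook_sketch("on_edit_done", lambda: seen.append("takes no arguments"))
run_hooks_sketch("on_edit_done", "my-document")
print(seen)
```

Checking the signature up front, instead of catching a TypeError from the call itself, avoids hiding TypeErrors raised inside a hook body.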
papis.id
- papis.id.compute_an_id(doc: Document, separator: str | None = None) str [source]
Make an id for the input document doc.
This is a non-deterministic function if separator is None (a random value is used). For a given value of separator, the result is deterministic.
- Parameters:
doc – a document for which to generate an id.
separator – a string used to separate the document fields that go into constructing the id.
- Returns:
a (hexadecimal) id for the document that is unique to high probability.
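The deterministic and non-deterministic cases can be illustrated with hashlib. This sketch rests on assumptions that the documentation does not state: the function name is hypothetical, MD5 is an arbitrary choice of digest, and joining the sorted field values is only one possible way to build the hashed string.

```python
import hashlib
import random
import string
from typing import Any, Dict, Optional


def compute_id_sketch(doc: Dict[str, Any], separator: Optional[str] = None) -> str:
    if separator is None:
        # A random separator makes the id differ between calls.
        separator = "".join(random.choice(string.ascii_letters) for _ in range(16))
    # Join the field values in a stable key order (assumed scheme).
    payload = separator.join(str(doc[key]) for key in sorted(doc))
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


doc = {"title": "Hello World", "year": 2020}
print(compute_id_sketch(doc, separator="|") == compute_id_sketch(doc, separator="|"))
```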
papis.importer
- class papis.importer.ImporterT
Invariant TypeVar bound to the Importer class.
alias of TypeVar(‘ImporterT’, bound=Importer)
- papis.importer.cache(meth: Callable[[ImporterT], None]) Callable[[ImporterT], None] [source]
Decorator used to cache Importer methods.
The data is cached in the Importer.ctx of each importer instance. The method meth is only called if the context is empty.
- Parameters:
meth – a method of an Importer.
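The caching rule described above (call the method only while the context is empty) can be sketched with functools.wraps. Everything here is a hypothetical stand-in: ContextSketch and CountingImporterSketch only mimic the idea of an importer holding a ctx attribute, and are not the papis classes.

```python
import functools
from typing import Any, Callable, Dict, List


class ContextSketch:
    """Hypothetical container for imported data and files."""

    def __init__(self) -> None:
        self.data: Dict[str, Any] = {}
        self.files: List[str] = []

    def __bool__(self) -> bool:
        # The context counts as empty until data or files are stored.
        return bool(self.data) or bool(self.files)


def cache_sketch(meth: Callable[[Any], None]) -> Callable[[Any], None]:
    @functools.wraps(meth)
    def wrapper(self: Any) -> None:
        if not self.ctx:  # only fetch while nothing has been cached
            meth(self)
    return wrapper


class CountingImporterSketch:
    def __init__(self) -> None:
        self.ctx = ContextSketch()
        self.calls = 0

    @cache_sketch
    def fetch(self) -> None:
        self.calls += 1
        self.ctx.data = {"title": "Hello World"}


importer = CountingImporterSketch()
importer.fetch()
importer.fetch()  # the second call does not fetch again
print(importer.calls)
```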
- class papis.importer.Importer(uri: str = '', name: str = '', ctx: Context | None = None)[source]
- name
A name given to the importer (that is not necessarily unique).
- uri
The URI (Uniform Resource Identifier) that the importer is to extract data from. This can be a URL, a local or remote file name, an object identifier (e.g. DOI), etc.
- classmethod match(uri: str) Importer | None [source]
Check if the importer can process the given URI.
For example, an importer that supports links from the arXiv can check that the given URI matches using:
re.match(r".*arxiv.org.*", uri)
This can then be used to instantiate and return a corresponding Importer object.
- Parameters:
uri – A URI where the document information should be retrieved from.
- Returns:
An importer instance if the match to the URI is successful or None otherwise.
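A match classmethod along the lines described above might look as follows. The base class here is a minimal hypothetical stand-in so that the example runs without papis, and ArxivImporterSketch is an invented name; a real plugin would subclass papis.importer.Importer instead. The dot in the pattern is escaped in this sketch so that it only matches a literal ".".

```python
import re
from typing import Optional


class ImporterSketch:
    """Minimal hypothetical stand-in for an importer base class."""

    def __init__(self, uri: str = "", name: str = "") -> None:
        self.uri = uri
        self.name = name


class ArxivImporterSketch(ImporterSketch):
    @classmethod
    def match(cls, uri: str) -> Optional["ArxivImporterSketch"]:
        # Return an instance when the URI looks like an arXiv link.
        if re.match(r".*arxiv\.org.*", uri):
            return cls(uri=uri, name="arxiv")
        return None


importer = ArxivImporterSketch.match("https://arxiv.org/abs/1234.5678")
print(importer.uri if importer else "no match")
```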
- classmethod match_data(data: Dict[str, Any]) Importer | None [source]
Check if the importer can process the given metadata.
This method can be used to search for valid URIs inside the data that can then be processed by the importer. For example, if the metadata contains a DOI field, this can be used to import additional information.
- Parameters:
data – A dict with metadata to inspect and match against.
- Returns:
An importer instance if matching metadata is found or None otherwise.
- fetch() None [source]
Fetch metadata and files for the given uri.
This method calls Importer.fetch_data() and Importer.fetch_files() to get all the information available for the document. It is recommended to implement the two methods separately, if possible, for maximum flexibility.
The imported data is stored in ctx and it is not queried again on subsequent calls to this function.
- papis.importer.get_import_mgr() stevedore.extension.ExtensionManager [source]
Retrieve the stevedore.extension.ExtensionManager for importer plugins.
papis.library
papis.logging
- class papis.logging.ColoramaFormatter(log_format: str, full_tb: bool = False)[source]
A custom logging formatter that uses colorama.
- full_tb: bool
A flag to denote whether a full traceback should be displayed when used with logger.info(..., exc_info=ext).
- papis.logging.setup(level: int | str | None = None, color: str | None = None, logfile: str | None = None, verbose: bool | None = None) None [source]
Set up formatting and handlers for the root level Papis logger.
- Parameters:
level – default logging level (see logging). By default, this takes values from the PAPIS_LOG_LEVEL environment variable and falls back to "INFO".
color – flag to control logging colors. It should be one of ("always", "auto", "no"). By default, this takes values from the PAPIS_LOG_COLOR environment variable and falls back to "auto".
logfile – a path for a file in which to write log messages. By default, this takes values from the PAPIS_LOG_FILE environment variable and falls back to None.
verbose – make logger verbose (including debug information) regardless of the level. By default, this takes values from the PAPIS_DEBUG environment variable and falls back to False.
papis.notes
This module controls the notes for every papis document.
- papis.notes.notes_path(doc: Document) str [source]
Get the path to the notes file corresponding to doc.
If the document does not have attached notes, a filename is constructed (using the notes-name setting) in the document’s main folder.
- Returns:
an absolute filename that corresponds to the attached notes for doc (this file does not necessarily exist).
- papis.notes.notes_path_ensured(doc: Document) str [source]
Get the path to the notes file corresponding to doc or create it if it does not exist.
If the notes do not exist, a new file is created using notes_path() and filled with the contents of the template given by the notes-template configuration option.
- Returns:
an absolute filename that corresponds to the attached notes for doc.
papis.paths
- papis.paths.unique_suffixes(chars: str | None = None, skip: int = 0) Iterator[str] [source]
Creates an infinite list of suffixes based on chars.
This creates a generator object capable of iterating over lists to create unique products of increasing cardinality (see here). This is mainly intended to create suffixes for existing strings, e.g. file names, to ensure uniqueness.
- Parameters:
chars – list to iterate over
skip – number of suffixes to skip (negative integers are set to 0).
>>> import string
>>> s = unique_suffixes(string.ascii_lowercase)
>>> next(s)
'a'
>>> s = unique_suffixes(skip=3)
>>> next(s)
'd'
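The "products of increasing cardinality" idea can be illustrated with itertools. The following is a minimal sketch of the technique, not the Papis implementation; unique_suffixes_sketch is a hypothetical name and the lowercase alphabet default is an assumption.

```python
import itertools
import string
from typing import Iterator, Optional


def unique_suffixes_sketch(chars: Optional[str] = None, skip: int = 0) -> Iterator[str]:
    """Yield 'a', 'b', ..., 'z', 'aa', 'ab', ... for the default alphabet."""
    if chars is None:
        # Assumption: lowercase ASCII letters are used when no alphabet is given.
        chars = string.ascii_lowercase

    def generate() -> Iterator[str]:
        # Products of length 1, then 2, then 3, ... so the stream never ends.
        for length in itertools.count(1):
            for letters in itertools.product(chars, repeat=length):
                yield "".join(letters)

    # Negative skip values are treated as 0, as described above.
    return itertools.islice(generate(), max(skip, 0), None)
```

Because the result is a lazy iterator, callers take only as many suffixes as they need, for example with next() or itertools.islice().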
- papis.paths.normalize_path(path: str, *, lowercase: bool | None = None, extra_chars: str | None = None, separator: str | None = None) str [source]
Clean a path to only contain visible ASCII characters.
This function will create ASCII strings that can be safely used as file names or printed to consoles that do not necessarily support full unicode.
- Parameters:
lowercase – if True, the resulting string will always be lowercased (defaults to doc-paths-lowercase).
extra_chars – extra characters that are allowed in the output path besides the default ASCII alphanumeric characters (defaults to doc-paths-extra-chars).
separator – word separator used to replace any non-allowed characters in the path (defaults to doc-paths-word-separator).
- Returns:
a cleaned ASCII string.
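One way to picture this kind of cleaning is the standard-library sketch below. It is an assumption-laden illustration rather than the actual normalize_path() code: the function name, the NFKD transliteration step and the default values are all hypothetical.

```python
import re
import unicodedata


def normalize_path_sketch(
    path: str, *, lowercase: bool = True, extra_chars: str = "", separator: str = "-"
) -> str:
    # Decompose accented characters and drop anything outside ASCII.
    ascii_only = (
        unicodedata.normalize("NFKD", path).encode("ascii", "ignore").decode("ascii")
    )
    # Replace every run of non-allowed characters with the separator.
    allowed = "A-Za-z0-9" + re.escape(extra_chars)
    cleaned = re.sub(f"[^{allowed}]+", lambda _match: separator, ascii_only)
    # Avoid leading or trailing separators in the final name.
    cleaned = cleaned.strip(separator)
    return cleaned.lower() if lowercase else cleaned
```

Characters without an ASCII decomposition are simply dropped in this sketch, which may differ from how Papis handles them.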
- papis.paths.is_relative_to(path: Path | str, other: Path | str) bool [source]
Check if paths are relative to each other.
This is equivalent to pathlib.PurePath.is_relative_to().
- Returns:
True if path is relative to the other path.
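For interpreters where pathlib.PurePath.is_relative_to() is not available, the same check can be approximated with relative_to(). The helper below is a hypothetical sketch, not the Papis source.

```python
from pathlib import PurePath
from typing import Union


def is_relative_to_sketch(path: Union[PurePath, str], other: Union[PurePath, str]) -> bool:
    try:
        # relative_to() raises ValueError when `path` is not inside `other`.
        PurePath(path).relative_to(other)
        return True
    except ValueError:
        return False
```

Note that the comparison works on path components, so "/library" is not considered relative to "/lib" even though the strings share a prefix.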
- papis.paths.symlink(src: Path | str, dst: Path | str) None [source]
Create a symbolic link pointing to src named dst.
This is a simple wrapper around os.symlink() that attempts to give better error messages on different platforms. For example, it offers suggestions for some missing privilege issues.
- Parameters:
src – the existing file that dst points to.
dst – the name of the new symbolic link, pointing to src.
- papis.paths.get_document_file_name(doc: Document | Dict[str, Any], orig_path: Path | str, suffix: str = '', *, file_name_format: str | Literal[False] | None = None, base_name_limit: int = 150) str [source]
Generate a file name based on orig_path for the document doc.
This function will generate a file name for the given file path (that does not necessarily exist) based on the document data. If the document data does not provide the necessary keys for file_name_format, then the original path will be preserved.
The resulting path will have the same extension as orig_path and will be modified by normalize_path(). The extension is determined using get_document_extension().
- Parameters:
orig_path – an input file path
suffix – a suffix to be appended to the end of the new file name.
file_name_format – a format string used to construct a new file name from the document data (see papis.format.format()). This value defaults to add-file-name if not provided.
base_name_limit – a maximum character length of the file name. This is important on operating systems or filesystems that do not support long file names.
- Returns:
a new path based on the document data and the orig_path.
- papis.paths.get_document_folder(doc: Document | Dict[str, Any], dirname: Path | str, *, folder_name_format: str | None = None) str [source]
Generate a folder name for the document at dirname.
This function uses add-folder-name to generate a folder name for the doc at dirname. If no folder can be constructed from the format, then the document’s papis_id is used instead as a subfolder of dirname. The papis_id is guaranteed to be unique.
- Parameters:
doc – the document used to fill in the folder_name_format.
dirname – the base directory in which to generate the document main folder.
folder_name_format – a format to use for the folder name that will be filled in using the given doc. If no format is given, we default to add-folder-name. This format can have additional subfolders.
- Returns:
a folder name for doc with the root at dirname.
- papis.paths.get_document_unique_folder(doc: Document | Dict[str, Any], dirname: Path | str, *, folder_name_format: str | None = None) str [source]
A wrapper around get_document_folder() that ensures that the folder is unique by adding suffixes.
- Returns:
a folder name for doc with the root at dirname that does not yet exist on the filesystem.
- papis.paths.rename_document_files(doc: Document | Dict[str, Any], in_document_paths: Iterable[str], *, file_name_format: str | Literal[False] | None = None, allow_remote: bool = True) List[str] [source]
Rename in_document_paths according to file_name_format and ensure uniqueness.
Uniqueness is required with respect to the files in in_document_paths and those in the doc itself (under the files key). If a repeated file name is found, a suffix is generated using unique_suffixes() and appended to the new file.
- Parameters:
file_name_format – a format string used to construct a new file name from the document data (see papis.format.format()). This value defaults to add-file-name if not provided.
allow_remote – if True, in_document_paths can also be remote URLs, which will be downloaded to local files.
- Returns:
a list of modified file names from in_document_paths that are renamed based on file_name_format and suffixed for uniqueness.
papis.pick
- class papis.pick.Picker[source]
An interface used to select items from a list.
- abstract __call__(items: Sequence[T], header_filter: Callable[[T], str], match_filter: Callable[[T], str], default_index: int = 0) List[T] [source]
- Parameters:
items – a sequence of items from which to pick a subset.
header_filter – (optional) a callable that takes an item from items and returns a string representation shown to the user.
match_filter – (optional) a callable that takes an item from items and returns a string representation that is used when searching or filtering the items.
default_index – (optional) sets the selected item when the picker is first shown to the user.
- Returns:
a subset of items that were picked.
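To make the shape of this interface concrete, here is a self-contained sketch of an abstract picker and a trivial non-interactive implementation. PickerSketch and SubstringPicker are hypothetical stand-ins that do not import Papis, and the str defaults for the two filters are an assumption borrowed from the pick() signature.

```python
from abc import ABC, abstractmethod
from typing import Callable, Generic, List, Sequence, TypeVar

T = TypeVar("T")


class PickerSketch(ABC, Generic[T]):
    """Stand-in for the abstract interface described above."""

    @abstractmethod
    def __call__(
        self,
        items: Sequence[T],
        header_filter: Callable[[T], str] = str,
        match_filter: Callable[[T], str] = str,
        default_index: int = 0,
    ) -> List[T]:
        ...


class SubstringPicker(PickerSketch[T]):
    """Hypothetical picker that keeps items whose match text contains a query."""

    def __init__(self, query: str) -> None:
        self.query = query.lower()

    def __call__(
        self,
        items: Sequence[T],
        header_filter: Callable[[T], str] = str,
        match_filter: Callable[[T], str] = str,
        default_index: int = 0,
    ) -> List[T]:
        # header_filter and default_index only matter for interactive pickers.
        return [item for item in items if self.query in match_filter(item).lower()]
```

A real picker would display header_filter(item) to the user and let them choose; this sketch only shows how match_filter drives the filtering.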
- papis.pick.pick(items: ~typing.Sequence[~papis.pick.T], header_filter: ~typing.Callable[[~papis.pick.T], str] = <class 'str'>, match_filter: ~typing.Callable[[~papis.pick.T], str] = <class 'str'>, default_index: int = 0) List[T] [source]
Load a Picker plugin and select a subset of items.
The arguments to this function match those of Picker.__call__(). The specific picker is chosen through the picktool configuration option.
- Returns:
a subset of items that were picked.
- papis.pick.pick_doc(documents: Sequence[Document]) List[Document] [source]
Pick from a sequence of documents using pick().
This function uses the header-format-file setting or, if not available, the header-format setting to construct a header_filter for the picker. It also uses the configuration setting match-format to construct a match_filter.
- Parameters:
documents – a sequence of documents.
- Returns:
a subset of documents that was picked.
- papis.pick.pick_subfolder_from_lib(lib: str) List[str] [source]
Pick subfolders from all existing subfolders in lib.
Note that this includes document folders in lib as well as nested library folders.
- Parameters:
lib – the name of an existing library to search in.
- Returns:
a subset of the subfolders in the library.
papis.plugin
- papis.plugin.get_extension_manager(namespace: str) ExtensionManager [source]
- Parameters:
namespace – the namespace for the entry points.
- Returns:
an extension manager for the given entry point namespace.
papis.sphinx_ext
A collection of Papis-specific Sphinx extensions.
This can be included directly into the conf.py file as a normal extension, i.e.
extensions = [
...,
"papis.sphinx_ext",
]
It will include a custom CustomClickDirective for documenting papis commands and a PapisConfig directive for documenting Papis configuration values. These are included by default when adding it to the extensions list in your Sphinx configuration.
- class papis.sphinx_ext.CustomClickDirective(name, arguments, options, content, lineno, content_offset, block_text, state, state_machine)[source]
A custom sphinx_click.ClickDirective that removes the automatic title from the generated documentation. Otherwise it can be used in the exact same way, e.g.:
.. click:: papis.commands.add:cli
    :prog: papis add
- class papis.sphinx_ext.PapisConfig(name, arguments, options, content, lineno, content_offset, block_text, state, state_machine)[source]
A directive for describing Papis configuration values.
The directive is given as:
.. papis-config:: config-value-name
and has the following optional arguments.
:section: The section in which the configuration value is given. The section defaults to get_general_settings_name().
:type: The type of the configuration value, e.g. a string or an integer. If not provided, the type of the default value is used.
:default: The default value for the configuration value. If not provided, this is taken from the default Papis settings.
It can be used as:
.. papis-config:: info-file
    :default: info.yml
    :type: str
    :section: settings

    This is the file name for where the document metadata should be stored. It is a relative path in the document's main folder.
In text, these configuration values can be referenced using standard role references, e.g.
The document metadata is found in its :confval:`info-file`.
- papis.sphinx_ext.make_link_resolve(github_project_url: str, revision: str) Callable[[str, Dict[str, Any]], str | None] [source]
Create a function that can be used with sphinx.ext.linkcode.
This can be used in the conf.py file as
linkcode_resolve = make_link_resolve("https://github.com/papis/papis", "main")
- Parameters:
- Parameters:
github_project_url – the URL to a GitHub project to which to link.
revision – the revision to point to, e.g. main.
papis.testing
papis.utils
- class papis.utils.A
Invariant
typing.TypeVar
alias of TypeVar(‘A’)
- class papis.utils.B
Invariant
typing.TypeVar
alias of TypeVar(‘B’)
- papis.utils.get_session() requests.Session [source]
Create a requests.Session for papis.
This session has the expected User-Agent (see user-agent), proxy (see downloader-proxy) and other settings used for papis. It is recommended to use it instead of creating a requests.Session at every call site.
- papis.utils.parmap(f: Callable[[A], B], xs: Iterable[A], np: int | None = None) List[B] [source]
Apply the function f to all elements of xs.
When available, this function uses the multiprocessing module to apply the function in parallel. This can have a noticeable performance impact when the number of elements of xs is large, but can also be slower than a sequential map().
The number of processes can also be controlled using the PAPIS_NP environment variable. Setting this variable to 0 will disable the use of multiprocessing on all platforms.
- Parameters:
f – a callable to apply to a list of elements.
xs – an iterable of elements to apply the function f to.
np – number of processes to use when applying the function f in parallel. This value defaults to PAPIS_NP or os.cpu_count().
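The behaviour described above can be approximated with multiprocessing.Pool. This is a rough sketch under stated assumptions, not the Papis code: parmap_sketch is a hypothetical name, PAPIS_NP is assumed to hold an integer when set, and any value of 0 or less is treated as "run sequentially".

```python
import multiprocessing
import os
from typing import Callable, Iterable, List, Optional, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def parmap_sketch(f: Callable[[A], B], xs: Iterable[A], np: Optional[int] = None) -> List[B]:
    if np is None:
        # Assumption: PAPIS_NP, when set, contains an integer.
        np = int(os.environ.get("PAPIS_NP", os.cpu_count() or 1))

    if np <= 0:
        # Multiprocessing disabled: fall back to a plain sequential map.
        return list(map(f, xs))

    # `f` and the elements of `xs` must be picklable to cross process boundaries.
    with multiprocessing.Pool(np) as pool:
        return pool.map(f, xs)
```

For small inputs, the cost of starting worker processes usually outweighs the gain, which is why the sequential path can be the faster one.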
- papis.utils.run(cmd: Sequence[str], wait: bool = True, env: Dict[str, Any] | None = None, cwd: str | None = None) None [source]
Run a given command with subprocess.
This is a simple wrapper around subprocess.Popen with custom defaults used to call papis commands.
- Parameters:
cmd – a sequence of arguments to run, where the first entry is expected to be the command name and the remaining entries its arguments.
wait – if True wait for the process to finish, otherwise detach the process and return immediately.
env – a mapping that defines additional environment variables for the child process.
cwd – current working directory in which to run the command.
- papis.utils.general_open(file_name: str, key: str, default_opener: str | None = None, wait: bool = True) None [source]
Open a file with a configured open tool (executable).
- Parameters:
file_name – a file path to open.
key – a key in the configuration file to determine the opener used, e.g. opentool.
default_opener – an existing executable that can be used to open the file given by file_name. By default, the opener given by key, if any, or the default papis opener are used.
wait – if True wait for the process to finish, otherwise detach the process and return immediately.
- papis.utils.open_file(file_path: str, wait: bool = True) None [source]
Open file using the configured opentool.
- Parameters:
file_path – a file path to open.
wait – if True wait for the process to finish, otherwise detach the process and return immediately.
- papis.utils.get_folders(folder: str) List[str] [source]
Get all folders with papis documents inside of folder.
This is the main indexing routine. It looks inside folder and crawls the whole directory structure in search of subfolders containing an info file. The name of the file must match the configured info-name.
- Parameters:
folder – root folder to look into.
- Returns:
List of folders containing an info file.
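The crawl described above amounts to a directory walk that keeps every folder holding an info file. The sketch below illustrates the idea with os.walk; get_folders_sketch is a hypothetical name, and the info_name default of "info.yaml" is an assumption standing in for the configured info-name value.

```python
import os
from typing import List


def get_folders_sketch(folder: str, info_name: str = "info.yaml") -> List[str]:
    found = []
    for root, _dirs, files in os.walk(folder):
        # A folder counts as a document folder when it contains the info file.
        if info_name in files:
            found.append(root)
    # Sorting is only for reproducible output in this sketch.
    return sorted(found)
```

The real routine may prune or cache parts of the walk; this sketch visits every subdirectory.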
- papis.utils.locate_document_in_lib(document: Document, library: str | None = None, *, unique_document_keys: List[str] | None = None) Document [source]
Locate a document in a library.
This function falls back to unique-document-keys to determine if the current document matches any document in the library. The first document for which one of the keys in the list matches exactly will be returned.
- Parameters:
library – the name of a valid Papis library.
unique_document_keys – a list of keys to match when locating a document.
- Returns:
a full document as found in the library.
- Raises:
IndexError – No document found in the library.
- papis.utils.locate_document(document: Document, documents: Iterable[Document]) Document | None [source]
Locate a document in a list of documents.
This function uses the unique-document-keys to determine if the current document matches any document in the list. The first document for which a key matches exactly will be returned.
- Parameters:
document – the document to search for.
documents – an iterable of existing documents to match against.
- Returns:
a document from documents which matches the given document or None if no document is found.
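The matching rule can be sketched with plain dictionaries standing in for Document objects. This is an illustration only: locate_document_sketch is a hypothetical name, the keys are passed explicitly instead of being read from unique-document-keys, and "matches exactly" is interpreted as plain equality.

```python
from typing import Any, Dict, Iterable, Optional, Sequence


def locate_document_sketch(
    document: Dict[str, Any],
    documents: Iterable[Dict[str, Any]],
    unique_keys: Sequence[str],
) -> Optional[Dict[str, Any]]:
    for candidate in documents:
        for key in unique_keys:
            value = document.get(key)
            # A key only counts when it is present and equal in both documents.
            if value is not None and candidate.get(key) == value:
                return candidate
    # No candidate shares any of the unique keys.
    return None
```

With unique_keys such as ["doi", "isbn"], two entries that share a DOI are treated as the same document even if their titles differ.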
- papis.utils.folders_to_documents(folders: Iterable[str]) List[Document] [source]
Load a list of documents from their respective folders.
- Parameters:
folders – a list of folder paths to load from.
- Returns:
a list of document objects.
- papis.utils.update_doc_from_data_interactively(document: Document | Dict[str, Any], data: Dict[str, Any], data_name: str) None [source]
Shows a TUI to update the document interactively with fields from data.
- Parameters:
document – a document (or a mapping convertible to a document) which is going to be updated.
data – additional data to select and merge into document.
data_name – an identifier for the data to show in the TUI.
- papis.utils.get_cache_home() str [source]
Get default cache directory.
This will retrieve the cache-dir configuration setting. If not provided, a platform-dependent cache folder is chosen instead.
- Returns:
the absolute path for the cache main folder.
- papis.utils.get_matching_importer_or_downloader(uri: str, download_files: bool | None = None, only_data: bool | None = None) List[Importer] [source]
Gets all the importers and downloaders that match uri.
This function tries to match the URI using match() and extract the data using fetch(). Only importers that fetch the data without issues are returned.
- Parameters:
uri – a URI to match the importers against.
download_files – if True, importers and downloaders also try to download files (PDFs, etc.) instead of just metadata.
- papis.utils.get_matching_importer_by_name(name_and_uris: Iterable[Tuple[str, str]], download_files: bool | None = None, only_data: bool | None = None) List[Importer] [source]
Get importers that match the given URIs.
This function tries to match the URI using match() and extract the data using fetch(). Only importers that fetch the data without issues are returned.
- Parameters:
name_and_uris – a list of (name, uri) pairs of importer names and URIs to match them against.
download_files – if True, importers and downloaders also try to download files (PDFs, etc.) instead of just metadata.
- papis.utils.collect_importer_data(importers: Iterable[Importer], batch: bool = True, use_files: bool | None = None, only_data: bool | None = None) Context [source]
Collect all data from the given importers.
It is assumed that the importers have called the needed fetch methods, so all data has been downloaded and converted. This function is meant to only do the aggregation.
- Parameters:
batch – if True, overwrite data from previous importers, otherwise ask the user to manually merge.
use_files – if True, both metadata and files are collected from the importers.
papis.yaml
- papis.yaml.data_to_yaml(yaml_path: str, data: Dict[str, Any], *, allow_unicode: bool | None = True) None [source]
Save data to yaml_path in the YAML format.
- Parameters:
yaml_path – path to a file.
data – data to write to the file as a YAML document.
- papis.yaml.list_to_path(data: Sequence[Dict[str, Any]], filepath: str, *, allow_unicode: bool | None = True) None [source]
Save a list of dicts to a YAML file.
- Parameters:
data – a sequence of dictionaries to save as YAML documents.
filepath – path to a file.
- papis.yaml.yaml_to_data(yaml_path: str, raise_exception: bool = False) Dict[str, Any] [source]
Read a YAML document from yaml_path.
- Parameters:
yaml_path – path to a file.
raise_exception – if True an exception is raised when loading the data has failed. Otherwise just a log message is emitted.
- Returns:
a dict containing the data from the YAML document.
- Raises:
ValueError – if the document cannot be loaded due to YAML parsing errors.
- papis.yaml.yaml_to_list(yaml_path: str, raise_exception: bool = False) List[Dict[str, Any]] [source]
Read a list of YAML documents.
This is analogous to yaml_to_data(), but uses yaml.load_all to read multiple documents (see PyYAML docs).
- Parameters:
yaml_path – path to a file containing YAML documents.
raise_exception – if True an exception is raised when loading the data has failed. Otherwise just a log message is emitted.
- Returns:
a list of dict objects, one for each YAML document in the file.
- Raises:
ValueError – if the documents cannot be loaded due to YAML parsing errors.
papis.commands.doctor
- papis.commands.doctor.FixFn
Callable for automatic doctor fixers. This callable is constructed by a check and is expected to wrap all the required data, so it takes no arguments.
alias of Callable[[], None]
- papis.commands.doctor.CheckFn
Callable for doctor document checks.
- class papis.commands.doctor.Error(name: str, path: str, payload: str, msg: str, suggestion_cmd: str, fix_action: Callable[[], None] | None, doc: Document | None)[source]
A detailed error returned by a doctor check.
- papis.commands.doctor.register_check(name: str, check: Callable[[Document], List[Error]]) None [source]
Register a new check.
Registered checks are recognized by papis and can be used by users in their configuration files through doctor-default-checks or on the command line through the --checks flag.
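The relationship between a check, the errors it returns and a zero-argument fixer can be sketched without importing Papis. Everything below is a hypothetical stand-in: ErrorSketch mirrors the fields of the Error class listed above, CHECKS and register_check_sketch imitate a registry, and title_check is an invented example check operating on a plain dictionary.

```python
from typing import Any, Callable, Dict, List, NamedTuple, Optional


class ErrorSketch(NamedTuple):
    """Stand-in with the same fields as the Error class above."""
    name: str
    path: str
    payload: str
    msg: str
    suggestion_cmd: str
    fix_action: Optional[Callable[[], None]]
    doc: Optional[Dict[str, Any]]


CheckSketch = Callable[[Dict[str, Any]], List[ErrorSketch]]
CHECKS: Dict[str, CheckSketch] = {}


def register_check_sketch(name: str, check: CheckSketch) -> None:
    # Later registrations under the same name replace earlier ones.
    CHECKS[name] = check


def title_check(doc: Dict[str, Any]) -> List[ErrorSketch]:
    if doc.get("title"):
        return []

    def fix() -> None:
        # The fixer closes over `doc`, so it needs no arguments.
        doc["title"] = "Untitled"

    return [ErrorSketch("title", doc.get("folder", ""), "title",
                        "Document has no title", "", fix, doc)]


register_check_sketch("title", title_check)
```

The closure in title_check shows why a fixer of type Callable[[], None] is enough: the check captures the data it needs at the time the error is created.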
- papis.commands.doctor.files_check(doc: Document) List[Error] [source]
Check whether the files of a document actually exist in the filesystem.
- Returns:
a
list
of errors, one for each file that does not exist.
- papis.commands.doctor.keys_missing_check(doc: Document) List[Error] [source]
Checks whether the keys provided in the configuration option doctor-keys-missing-keys exist in the document and are non-empty.
- Returns:
a
list
of errors, one for each missing key.
- papis.commands.doctor.refs_check(doc: Document) List[Error] [source]
Checks that a ref exists and, if not, tries to create one according to the ref-format configuration option.
- Returns:
an error if the reference does not exist or contains invalid characters (as required by BibTeX).
- papis.commands.doctor.duplicated_keys_check(doc: Document) List[Error] [source]
Check for duplicated keys in the list given by the doctor-duplicated-keys-keys configuration option.
- Returns:
a list of errors, one for each key with a value that already exists in the documents from the current query.
- papis.commands.doctor.duplicated_values_check(doc: Document) List[Error] [source]
Check if the keys given by doctor-duplicated-values-keys contain any duplicate entries. These keys are expected to be lists of items.
- Returns:
a
list
of errors, one for each key with a value that has duplicate entries.
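A duplicate-entry scan of this kind can be sketched as follows. The function name and its return shape (a mapping from key to repeated items) are assumptions made for illustration; the real check returns Error objects and reads its keys from the configuration.

```python
from typing import Any, Dict, List, Sequence


def duplicated_values_sketch(doc: Dict[str, Any], keys: Sequence[str]) -> Dict[str, List[Any]]:
    result: Dict[str, List[Any]] = {}
    for key in keys:
        values = doc.get(key)
        if not isinstance(values, list):
            # Only list-valued keys can contain duplicate entries.
            continue
        seen: List[Any] = []
        duplicates: List[Any] = []
        for item in values:
            # List membership also works for unhashable items such as dicts.
            if item in seen:
                if item not in duplicates:
                    duplicates.append(item)
            else:
                seen.append(item)
        if duplicates:
            result[key] = duplicates
    return result
```

Using lists instead of sets keeps the sketch usable for entries such as author dictionaries, at the cost of quadratic time on long lists.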
- papis.commands.doctor.bibtex_type_check(doc: Document) List[Error] [source]
Check that the document type is compatible with BibTeX or BibLaTeX type descriptors.
- Returns:
an error if the types are not compatible.
- papis.commands.doctor.biblatex_type_alias_check(doc: Document) List[Error] [source]
Check that the BibLaTeX type of the document is not a known alias.
The aliases are described by bibtex_type_aliases.
- Returns:
an error if the type of the document is an alias.
- papis.commands.doctor.biblatex_key_alias_check(doc: Document) List[Error] [source]
Check that no BibLaTeX keys in the document are known aliases.
The aliases are described by bibtex_key_aliases. Note that these keys can also be converted on export to BibLaTeX.
- Returns:
an error for each key of the document that is an alias.
- papis.commands.doctor.biblatex_required_keys_check(doc: Document) List[Error] [source]
Check that required BibLaTeX keys are part of the document based on its type.
The required keys are described by papis.bibtex.bibtex_type_required_keys. Note that most BibLaTeX processors will be quite forgiving if these keys are missing.
- Returns:
an error for each key of the document that is missing.
- papis.commands.doctor.get_key_type_check_keys() Dict[str, type] [source]
Check the doctor-key-type-keys configuration entry for correctness.
The doctor-key-type-keys configuration entry defines a mapping of keys and their expected types. If the desired type is a list, the doctor-key-type-separator setting can be used to split an existing string (and, similarly, if the desired type is a string, it can be used to join a list of items).
- Returns:
A dictionary mapping key names to types.
- papis.commands.doctor.key_type_check(doc: Document) List[Error] [source]
Check document keys have expected types.
- Returns:
a list of errors, one for each key that does not have the expected type (if it exists).
- papis.commands.doctor.html_codes_check(doc: Document) List[Error] [source]
Checks that the keys in the doctor-html-codes-keys configuration option do not contain any HTML codes like &amp; etc.
- Returns:
a
list
of errors, one for each key that contains HTML codes.
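Detecting leftover HTML codes usually comes down to a regular expression for character references. The sketch below is an assumption about how such a scan might look, not the Papis implementation; html_codes_sketch and the exact pattern are hypothetical.

```python
import re
from typing import Any, Dict, List, Sequence

# Named (&amp;), decimal (&#38;) and hexadecimal (&#x26;) character references.
HTML_CODE_RE = re.compile(r"&(?:#\d+|#x[0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def html_codes_sketch(doc: Dict[str, Any], keys: Sequence[str]) -> List[str]:
    flagged = []
    for key in keys:
        value = doc.get(key)
        # Only string values are inspected in this sketch.
        if isinstance(value, str) and HTML_CODE_RE.search(value):
            flagged.append(key)
    return flagged
```

A bare ampersand, as in "AT&T", is not flagged because the pattern requires the terminating semicolon.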
- papis.commands.doctor.html_tags_check(doc: Document) List[Error] [source]
Checks that the keys in the doctor-html-tags-keys configuration option do not contain any HTML tags like <href> etc.
- Returns:
a list of errors, one for each key that contains HTML tags.
- papis.commands.doctor.gather_errors(documents: List[Document], checks: List[str] | None = None) List[Error] [source]
Run all checks over the list of documents.
Only checks registered with register_check() are supported and any unrecognized checks are automatically skipped.
- Parameters:
checks – a list of checks to run over the documents. If not provided, the default doctor-default-checks are used.
- Returns:
a list of all the errors gathered from the documents.
- papis.commands.doctor.fix_errors(doc: Document, checks: List[str] | None = None) None [source]
Fix errors in doc for the given checks.
This function only applies existing auto-fixers to the document. This is not possible for many of the existing checks, but can be used to quickly clean up a document.
- papis.commands.doctor.process_errors(errors: List[Error], fix: bool = False, explain: bool = False, suggest: bool = False, edit: bool = False) None [source]
Process a list of document errors from gather_errors().
- Parameters:
fix – if True, any automatic fixes are applied to the document the error refers to.
explain – if True, a short explanation of the error is shown.
suggest – if True, a short suggestion for manual fixing of the error is shown.
edit – if True, the document is opened for editing.
- papis.commands.doctor.run(doc: Document, checks: List[str] | None = None, fix: bool = True, explain: bool = False, suggest: bool = False, edit: bool = False) None [source]
Runner for papis doctor.
It runs all the checks given by the checks argument that have been registered through register_check(). It then proceeds with processing and fixing each error in turn.