Document
Module defining the main document type.
- class papis.document.KeyConversion
A dict that contains a key and an action. The key contains the name of a key in another dictionary and the action contains a callable that can pre-process the value.
- class papis.document.KeyConversionPair
- foreign_key
A string denoting the foreign key (in the input data).
- list
A list of KeyConversion dictionaries used to rename and post-process the foreign_key and its value.
- class papis.document.KeyConversion(_typename, _fields=None, /, **kwargs)
- class papis.document.KeyConversionPair(foreign_key, list)
- list: List[KeyConversion]
Alias for field number 1
- papis.document.keyconversion_to_data(conversions: Sequence[KeyConversionPair], data: Dict[str, Any], keep_unknown_keys: bool = False) Dict[str, Any] [source]
Function to convert between dictionaries.
This can be used to define a fixed set of translation rules between, e.g., JSON data obtained from a website API and standard papis key names and formatting. The implementation is completely generic.
For example, we have the simple dictionary
data = {"id": "10.1103/physrevb.89.140501"}
which contains the DOI of a document with the wrong key. We can then write the following rules
conversions = [
    KeyConversionPair("id", [
        {"key": "doi", "action": None},
        {"key": "url", "action": lambda x: "https://doi.org/{}".format(x)}
    ])
]
new_data = keyconversion_to_data(conversions, data)
to rename the "id" key to the standard "doi" key used by papis and to add a "url" key built from the same value. Any number of such rules can be written, depending on the complexity of the incoming data. Note that any errors raised on the application of the action will be silently ignored and the corresponding key will be skipped.
- Parameters:
conversions – a sequence of KeyConversionPairs used to convert the data.
data – a dict to be converted according to conversions.
keep_unknown_keys – if True, unknown keys from data are kept in the resulting dictionary. Otherwise, only keys from conversions are present.
- Returns:
a new dict containing the entries from data converted according to conversions.
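The rule application described above can be sketched in plain Python. This is a minimal stand-in written for illustration, not the papis implementation: apply_conversions is a hypothetical name, the "year" entry and the "broken" rule are made up for the demonstration, and treating a missing or None action as the identity is an assumption based on the example rules.

```python
from typing import Any, Dict, List, NamedTuple, Sequence


class KeyConversionPair(NamedTuple):
    # Local stand-in mirroring the documented (foreign_key, list) fields.
    foreign_key: str
    list: List[Dict[str, Any]]


def apply_conversions(conversions: Sequence[KeyConversionPair],
                      data: Dict[str, Any],
                      keep_unknown_keys: bool = False) -> Dict[str, Any]:
    """Hypothetical re-implementation of the documented behaviour."""
    new_data: Dict[str, Any] = {}
    for pair in conversions:
        if pair.foreign_key not in data:
            continue
        for rule in pair.list:
            # Assumption: a missing key keeps the foreign key name and a
            # missing (or None) action leaves the value unchanged.
            key = rule.get("key") or pair.foreign_key
            action = rule.get("action") or (lambda x: x)
            try:
                new_data[key] = action(data[pair.foreign_key])
            except Exception:
                # Errors raised by an action are ignored; the key is skipped.
                continue
    if keep_unknown_keys:
        known = {pair.foreign_key for pair in conversions}
        for key, value in data.items():
            if key not in known:
                new_data[key] = value
    return new_data


# "year" is an extra, made-up entry used to show keep_unknown_keys.
data = {"id": "10.1103/physrevb.89.140501", "year": 2014}
conversions = [
    KeyConversionPair("id", [
        {"key": "doi", "action": None},
        {"key": "url", "action": lambda x: "https://doi.org/{}".format(x)},
        {"key": "broken", "action": lambda x: int(x)},  # raises ValueError
    ])
]
result = apply_conversions(conversions, data)
result_keep = apply_conversions(conversions, data, keep_unknown_keys=True)
```

In this sketch the "broken" rule fails and is dropped without affecting the other keys, and "year" only survives when keep_unknown_keys is True.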
- papis.document.author_list_to_author(data: Dict[str, Any]) str [source]
Convert a list of authors into a single author string.
This uses the multiple-authors-separator and the multiple-authors-format configuration settings (see General settings) to construct the concatenated authors.
- Parameters:
data – a dict that contains an "author_list" key to be converted into a single author string.
>>> authors = [
...     {"given": "Some", "family": "Author"},
...     {"given": "Other", "family": "Author"}]
>>> author_list_to_author({"author_list": authors})
'Author, Some and Author, Other'
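The concatenation can be illustrated without papis. The sketch below is not the library code: join_authors is a hypothetical helper, and the separator and format strings are assumed values picked so that the result matches the example output above; the real values come from the two configuration settings.

```python
from typing import Any, Dict, List

# Assumed stand-ins for multiple-authors-separator and
# multiple-authors-format, chosen to reproduce the documented output.
SEPARATOR = " and "
AUTHOR_FORMAT = "{au[family]}, {au[given]}"


def join_authors(author_list: List[Dict[str, Any]],
                 separator: str = SEPARATOR,
                 fmt: str = AUTHOR_FORMAT) -> str:
    """Format each author with fmt and join the results with separator."""
    return separator.join(fmt.format(au=author) for author in author_list)


authors = [
    {"given": "Some", "family": "Author"},
    {"given": "Other", "family": "Author"},
]
joined = join_authors(authors)
```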
- papis.document.split_authors_name(authors: List[str], separator: str = 'and') List[Dict[str, Any]] [source]
Convert list of authors to a fixed format.
This uses bibtexparser.customization.splitname() to correctly split and determine the first and last names of an author in the list. Note that this is just a heuristic and can give incorrect results for certain author names.
- Parameters:
authors – a list of author names, where each entry can consist of multiple authors separated by separator.
separator – a separator for entries in authors that contain multiple authors.
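To show what splitting on separator involves, here is a deliberately naive sketch. It does not use bibtexparser and is much cruder than splitname(): naive_split_authors is a hypothetical name, the "family"/"given" keys are assumed from the author_list example in this module, and the rule "last word is the family name" is exactly the kind of heuristic that fails for many real names.

```python
import re
from typing import Dict, List


def naive_split_authors(authors: List[str],
                        separator: str = "and") -> List[Dict[str, str]]:
    """Split entries on a whitespace-delimited separator, then guess name parts."""
    # Require surrounding whitespace so names such as "Anderson" stay intact.
    pattern = re.compile(r"\s+{}\s+".format(re.escape(separator)))
    result = []
    for entry in authors:
        for name in pattern.split(entry):
            name = name.strip()
            if not name:
                continue
            if "," in name:
                # "Family, Given" form.
                family, given = [part.strip() for part in name.split(",", 1)]
            else:
                # "Given Family" form: treat the last word as the family name.
                parts = name.split()
                given, family = " ".join(parts[:-1]), parts[-1]
            result.append({"family": family, "given": given})
    return result


names = naive_split_authors(
    ["Albert Einstein and Podolsky, Boris", "Nathan Rosen"])
```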
- class papis.document.DocHtmlEscaped(doc: Document)[source]
Small helper class to escape HTML elements in a document.
>>> DocHtmlEscaped(from_data({"title": '> >< int & "" "'}))['title']
'&gt; &gt;&lt; int &amp; &quot;&quot; &quot;'
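The escaping behaviour can be sketched with the standard library. EscapedView below is a hypothetical class written for illustration; whether DocHtmlEscaped is built on html.escape is an assumption, although the replacements for &, <, > and double quotes are the usual ones.

```python
import html
from typing import Any, Dict


class EscapedView:
    """Hypothetical read-only view returning HTML-escaped values."""

    def __init__(self, data: Dict[str, Any]) -> None:
        self._data = data

    def __getitem__(self, key: str) -> str:
        # html.escape replaces &, < and > and, by default, quote characters.
        return html.escape(str(self._data[key]))


view = EscapedView({"title": '> >< int & "" "'})
escaped = view["title"]
```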
- class papis.document.Document(folder: str | None = None, data: Dict[str, Any] | None = None)[source]
An abstract document in a papis library.
This class inherits from a standard dict and implements some additional functionality.
- html_escape
A DocHtmlEscaped instance that can be used to escape keys in the document for use in HTML documents.
- set_folder(folder: str) None [source]
Set the document’s main folder.
This also updates the location of the info file and other attributes. Note, however, that it will not load any data from the given folder even if it contains another info file (see from_folder() for this functionality).
- Parameters:
folder – an absolute path to a new main folder for the document.
- get_main_folder() str | None [source]
- Returns:
the root path in the filesystem where the document is stored, if any.
- get_main_folder_name() str | None [source]
- Returns:
the folder name of the document, i.e. the basename of the path returned by get_main_folder().
- get_info_file() str [source]
- Returns:
path to the info file, which can also be an empty string if no such file has been created.
- get_files() List[str] [source]
Get the files linked to the document.
The files in a document are stored relative to its main folder. If no main folder is set on the document (see set_folder()), then this function will not return any files. To retrieve the relative file paths only, access doc["files"] directly.
- Returns:
a list of absolute file paths in the document’s main folder, if any.
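The relationship between the relative entries in doc["files"] and the returned absolute paths can be sketched as follows. absolute_files is a hypothetical helper operating on a plain dict and an explicit folder argument; it only mirrors the behaviour described above and is not the method's actual code.

```python
import os
from typing import Any, Dict, List, Optional


def absolute_files(data: Dict[str, Any],
                   main_folder: Optional[str]) -> List[str]:
    """Join each relative file name with the main folder, if one is set."""
    if main_folder is None:
        # Without a main folder there is nothing to resolve against.
        return []
    return [os.path.join(main_folder, name) for name in data.get("files", [])]


doc = {"title": "Hello World", "files": ["paper.pdf", "notes.txt"]}
folder = os.path.join("library", "hello-world")  # made-up folder name

no_folder = absolute_files(doc, None)
with_folder = absolute_files(doc, folder)
```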
- papis.document.from_data(data: Dict[str, Any]) Document [source]
Construct a Document from a dictionary.
- Parameters:
data – a dictionary to be made into a new document.
- papis.document.from_folder(folder_path: str) Document [source]
Construct a Document from a folder.
- Parameters:
folder_path – absolute path to a valid papis folder.
- papis.document.to_json(document: Document) str [source]
Export the document to JSON.
- Returns:
a JSON string corresponding to all the entries in the document.
- papis.document.to_dict(document: Document) Dict[str, Any] [source]
Convert a document back into a standard dict.
- Returns:
a dict corresponding to all the entries in the document.
- papis.document.dump(document: Document) str [source]
Dump the document into a formatted string.
The format of the string is not fixed and is meant to be used to display the document entries in a consistent way across papis.
- Returns:
a string containing all the entries in the document.
>>> doc = from_data({'title': 'Hello World'})
>>> dump(doc)
'title: Hello World'
- papis.document.delete(document: Document) None [source]
Delete a document from the filesystem.
This function deletes the main folder of the document (recursively), but it does not delete the in-memory version of the document.
- papis.document.describe(document: Document | Dict[str, Any]) str [source]
- Returns:
a string description of the current document using document-description-format (see General settings).
- papis.document.move(document: Document, path: str) None [source]
Move the document to a new main folder at path.
This supposes that the document exists in the location document.get_main_folder() and will change the folder in the input document as a result.
- Parameters:
path – absolute path where the document should be moved to. This path is expected to not exist yet and will be created by this function.
>>> doc = from_data({'title': 'Hello World'})
>>> doc.set_folder('path/to/folder')
>>> import tempfile; newfolder = tempfile.mkdtemp()
>>> move(doc, newfolder)
Traceback (most recent call last):
...
Exception: There is already...
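The requirement that the target path must not exist yet can be illustrated with the standard library alone. move_folder is a hypothetical helper, the info.yaml file name is an assumed placeholder for the info file, and raising FileExistsError is this sketch's choice; the example above only shows that some exception is raised.

```python
import os
import shutil
import tempfile


def move_folder(current: str, target: str) -> str:
    """Move a folder to a target path that must not exist yet."""
    if os.path.exists(target):
        raise FileExistsError(
            "There is already a path at '{}'".format(target))
    shutil.move(current, target)
    return target


# Build a throwaway "document folder" in a temporary directory.
root = tempfile.mkdtemp()
source = os.path.join(root, "old")
os.makedirs(source)
with open(os.path.join(source, "info.yaml"), "w", encoding="utf-8") as f:
    f.write("title: Hello World\n")

target = os.path.join(root, "new")
move_folder(source, target)
moved_ok = (os.path.isfile(os.path.join(target, "info.yaml"))
            and not os.path.exists(source))

# Moving onto an existing directory is refused, as in the example above.
try:
    move_folder(target, root)
    refused = False
except FileExistsError:
    refused = True
```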
- papis.document.sort(docs: Sequence[Document], key: str, reverse: bool = False) List[Document] [source]
Sort a list of documents by the given key.
The sort is performed on the key with a priority given to the type of the value. If the key does not exist in the document, this is given the lowest priority and left at the end of the list.
- Parameters:
docs – a sequence of documents.
key – a key in the documents by which to sort.
reverse – if True, the sorting is done in reverse order (descending instead of ascending).
- Returns:
a list of documents sorted by key.
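One way to sort by a key whose values have mixed types, while keeping documents without the key at the end, is sketched below. sort_docs is a hypothetical function, and both the type ranking (numbers before strings before everything else) and the decision to keep missing-key documents last even when reverse is True are assumptions of this sketch, not documented behaviour.

```python
from typing import Any, Dict, List, Sequence, Tuple


def sort_docs(docs: Sequence[Dict[str, Any]],
              key: str,
              reverse: bool = False) -> List[Dict[str, Any]]:
    """Sort by type rank first, then by value; missing keys go last."""

    def sort_key(doc: Dict[str, Any]) -> Tuple[int, Any]:
        value = doc[key]
        if isinstance(value, (int, float)):
            return (0, value)
        if isinstance(value, str):
            return (1, value)
        # Fall back to the string form so unlike types never get compared.
        return (2, str(value))

    present = [doc for doc in docs if key in doc]
    missing = [doc for doc in docs if key not in doc]
    return sorted(present, key=sort_key, reverse=reverse) + missing


docs = [
    {"title": "b", "year": 2001},
    {"title": "a"},
    {"title": "c", "year": "1999"},
    {"title": "d", "year": 1987},
]
ascending = [doc["title"] for doc in sort_docs(docs, "year")]
descending = [doc["title"] for doc in sort_docs(docs, "year", reverse=True)]
```

Because the rank is compared before the value, the integer years and the string year never meet in a direct comparison, which would otherwise raise a TypeError.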
- papis.document.new(folder_path: str, data: Dict[str, Any], files: Sequence[str] | None = None) Document [source]
Creates a complete document with data and existing files.
The document is saved to the filesystem at folder_path and all the given files are copied over to the main folder.
- Parameters:
folder_path – a main folder for the document.
data – a dict with keys and values to be used as metadata in the document.
files – a sequence of files to add to the document.
- Raises:
FileExistsError – if folder_path already exists.
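The steps described for this function (create the folder, copy the files, store the metadata) can be sketched as follows. create_document_folder is a hypothetical helper, and writing the metadata to info.json is purely an assumption made to keep the sketch dependency-free; it does not reflect the info file format papis uses.

```python
import json
import os
import shutil
import tempfile
from typing import Any, Dict, Optional, Sequence


def create_document_folder(folder_path: str,
                           data: Dict[str, Any],
                           files: Optional[Sequence[str]] = None
                           ) -> Dict[str, Any]:
    """Create folder_path, copy files into it and store the metadata."""
    # os.makedirs raises FileExistsError if folder_path already exists.
    os.makedirs(folder_path)
    stored = dict(data)
    stored["files"] = []
    for path in files or []:
        shutil.copy(path, folder_path)
        # Only the name relative to the main folder is recorded.
        stored["files"].append(os.path.basename(path))
    # Assumption: JSON stands in for the real info file format.
    info_file = os.path.join(folder_path, "info.json")
    with open(info_file, "w", encoding="utf-8") as f:
        json.dump(stored, f)
    return stored


root = tempfile.mkdtemp()
pdf = os.path.join(root, "paper.pdf")
with open(pdf, "w", encoding="utf-8") as f:
    f.write("placeholder content")

folder = os.path.join(root, "library", "hello-world")
created = create_document_folder(folder, {"title": "Hello World"}, [pdf])

# A second call with the same folder fails, matching the Raises entry above.
try:
    create_document_folder(folder, {"title": "Hello World"})
    raised = False
except FileExistsError:
    raised = True
```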