Command-line reference¶
Interact with Crossref data via a Lightning database.
usage: crossref-lmdb [-h] [--debug] {create,update} ...
Positional Arguments¶
- command
Possible choices: create, update
Named Arguments¶
- --debug
Print error tracebacks.
Default:
False
Sub-commands¶
create¶
Create a Lightning database from Crossref public data.
crossref-lmdb create [-h] --public-data-dir PUBLIC_DATA_DIR --db-dir DB_DIR
[--start-from-file-num START_FROM_FILE_NUM]
[--commit-frequency COMMIT_FREQUENCY]
[--compression-level {-1,0,1,2,3,4,5,6,7,8,9}]
[--filter-path FILTER_PATH]
[--show-progress | --no-show-progress]
[--max-db-size-gb MAX_DB_SIZE_GB]
Named Arguments¶
- --public-data-dir
Path to the Crossref public data directory.
- --db-dir
Path to the directory to write the database files.
- --start-from-file-num
Begin processing from this file number in the public data archive.
Default:
0
- --commit-frequency
Number of items to process between commits to the database.
Default:
20000
- --compression-level
Possible choices: -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Compression level to use for metadata: 0 disables compression, -1 selects the default level (6), and 1 to 9 set the level explicitly, where 1 compresses the least and 9 the most.
Default:
-1
- --filter-path
Path to a Python module file containing a function for filtering DOIs. This function must be called filter_func and accept one parameter, which provides a dict-like interface to the item metadata. The function must return False if the item is to be filtered out and True otherwise.
- --show-progress, --no-show-progress
Enable or disable the progress bar.
Default:
True
- --max-db-size-gb
Maximum size that the database can grow to, in GB. Note that the default is smaller on Windows (2 GB), because Windows pre-allocates the space.
Default:
2000
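As an illustration of the --filter-path option described above, here is a minimal sketch of a filter module. The file name my_filter.py, the WANTED_TYPES set, and the choice to filter on the Crossref "type" field are assumptions made for this example; the only requirements stated in this reference are the function name filter_func, its single parameter, and its boolean return value. The sketch also assumes that a missing key raises KeyError, as with a plain dict.

```python
# Contents of a hypothetical file, e.g. my_filter.py, passed via --filter-path.

# Assumption: items carry the Crossref "type" field (e.g. "journal-article").
WANTED_TYPES = {"journal-article"}


def filter_func(item) -> bool:
    """Return True to keep the item, False to filter it out."""
    try:
        item_type = item["type"]
    except KeyError:
        # Assumption: a missing key raises KeyError, as with a plain dict.
        return False
    return item_type in WANTED_TYPES
```

With such a file in place, the path would be supplied as --filter-path my_filter.py alongside the required --public-data-dir and --db-dir arguments.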
update¶
Update a Lightning database with new data from the web API.
crossref-lmdb update [-h] [--commit-frequency COMMIT_FREQUENCY]
[--compression-level {-1,0,1,2,3,4,5,6,7,8,9}]
[--filter-path FILTER_PATH]
[--show-progress | --no-show-progress]
[--max-db-size-gb MAX_DB_SIZE_GB] --db-dir DB_DIR
--email-address EMAIL_ADDRESS [--from-date FROM_DATE]
[--filter-arg FILTER_ARG]
Named Arguments¶
- --commit-frequency
Number of items to process between commits to the database.
Default:
20000
- --compression-level
Possible choices: -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Compression level to use for metadata: 0 disables compression, -1 selects the default level (6), and 1 to 9 set the level explicitly, where 1 compresses the least and 9 the most.
Default:
-1
- --filter-path
Path to a Python module file containing a function for filtering DOIs. This function must be called filter_func and accept one parameter, which provides a dict-like interface to the item metadata. The function must return False if the item is to be filtered out and True otherwise.
- --show-progress, --no-show-progress
Enable or disable the progress bar.
Default:
True
- --max-db-size-gb
Maximum size that the database can grow to, in GB. Note that the default is smaller on Windows (2 GB), because Windows pre-allocates the space.
Default:
2000
- --db-dir
Path to the directory containing the LMDB database files.
- --email-address
Email address to provide to the Crossref web API, which allows requests to use the polite pool.
- --from-date
A date from which to search for updated records, specified in YYYY[-MM[-DD]] format (i.e., month and day are optional).
- --filter-arg
A Crossref web API filter string for restricting DOIs.
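When scripting the update command, it can help to check the --from-date value before invoking the tool. The following is a sketch only: the helper names is_valid_from_date and build_update_args are hypothetical, and the pattern assumes two-digit months and days, which the YYYY[-MM[-DD]] notation suggests but does not spell out. It checks the shape of the string, not whether the date exists on the calendar.

```python
import re

# Matches YYYY, YYYY-MM, or YYYY-MM-DD (month and day optional).
FROM_DATE_PATTERN = re.compile(r"\d{4}(-\d{2}(-\d{2})?)?")


def is_valid_from_date(value: str) -> bool:
    """Check the shape of a --from-date value (not calendar validity)."""
    return FROM_DATE_PATTERN.fullmatch(value) is not None


def build_update_args(db_dir: str, email_address: str, from_date=None) -> list:
    """Assemble an argument list for `crossref-lmdb update` (hypothetical helper)."""
    args = [
        "crossref-lmdb",
        "update",
        "--db-dir", db_dir,
        "--email-address", email_address,
    ]
    if from_date is not None:
        if not is_valid_from_date(from_date):
            raise ValueError(f"expected YYYY[-MM[-DD]], got {from_date!r}")
        args += ["--from-date", from_date]
    return args
```

The returned list can be passed to subprocess.run; building it as a list rather than a single string avoids shell quoting issues with paths and email addresses.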