Deployment¶
Deploying the librarian is a relatively simple task thanks to containerisation. We currently strongly recommend that librarian instances are deployed using Docker, rather than running on ‘bare metal’. You will need at least two containers.
A container running the librarian application
A container running a SQLAlchemy-compatible database (we recommend PostgreSQL)
There are several additional containers that you may wish to consider:
A reverse-proxy container (e.g. Nginx) to handle SSL termination.
A container running a caching layer (e.g. Redis) to improve performance (rarely used).
A container running Grafana to display dashboards.
The following sections will guide you through the process of deploying the librarian using Docker.
The Server Config¶
The server configuration is a JSON file that contains all the configuration information required to run the librarian. This should be placed in a directory accessible to the librarian; we recommend running this as either a config map or a bind mount so that it can be easily changed.
The server configuration file should be pointed to as the LIBRARIAN_CONFIG_PATH
environment variable. All of the below configurations can be modified using
environment variables with LIBRARIAN_SERVER_ as a prefix.
The configuration variables are as follows:
name: The name of the server. This is used in the logs and in the inter-librarian transfers, so should be unique within your librarian setup. When other librarians register this as a potential downstream or upstream node, the name should be the same as this.debug: Whether to run the server in debug mode. This should be set tofalsein production. With debug on, the server will make available API docs and other potentially sensitive information.encryption_key_file: The path to the file containing the encryption key. This should be a Fernet key and is used for encrypting and decrypting the data stored about other librarians in the database.Database settings:
database_driver: The database driver to use. This should be one of the SQLAlchemy-supported database drivers. We recommendpostgresql+psycopg.database_user: The username to use to connect to the database.database_password: The password used to connect to the database.database_host: The hostname of the database.database_port: The port of the database.database_name: The name of the database.alembic_config_path: The path to the alembic configuration file. This is used to run database migrations.alembic_path: The path to alembic (defaults toalembic), in case it is for some reason outside of the PATH.
log_level: The log level to use. This should be one of the Python logging levels. We recommendINFOfor production.displayed_site_name: A pretty name for the site, this is used in user interfaces like the slack hooks.displayed_site_description: A description of the site, this is used in user interfaces like the slack hooks.host: The host to bind to. This should be0.0.0.0.port: The port to bind to.add_stores: A JSON object containing the information about stores added to the ilbrarian. See stores.max_search_results: The maximum number of search results to return to clients. Not respected for administative users.maximal_upload_size_bytes: The maximum size of a file that can be uploaded to the librarian. This is in bytes. By default, this is 50 GB.Slack integration:
slack_webhook_enable: Whether to use the slack hook.slack_webhook_url_file: The path to the file contianing theURL of the Slack webhook to send messages to.
slack_webhook_post_error_severity: Which error severities to post to slack. By default, all errors are sent.slack_webhook_post_error_category: Which error categories to post to slack. By default, all errors are sent.
In addition to this server configuration, you will need an appropriate background configuration for this server. More information can be found on the background page.
It is recommended that you store your secrets files in the same directory as the configuration files for ease of use.
The Database¶
We strongly recommend using a PostgreSQL database for the librarian, and the librarian is tested in a production setting using postgres 16. You will need to export the port of the database so that it is visible to the librarian server.
An example configuration for a PostgreSQL database is as follows (docker compose):
services:
librarian-database:
image: postgres:16
restart: always
container_name: "librarian-database"
volumes:
- type: "bind"
source: "/librarian/database"
target: "/var/lib/postgresql/data"
environment:
POSTGRES_USER: "librarian"
POSTGRES_PASSWORD: "hello-world-password"
POSTGRES_DB: "librarian"
expose:
- 5432
networks:
- "librarian-network"
healthcheck:
test: ["CMD-SHELL", "pg_isready"]
interval: 10s
timeout: 10s
retries: 5
networks:
librarian-network:
Note that you will need to set up your own volume or bind mount and backups for
this container. Here we use the example of a bind mount to a directory called
/librarian/database.
The Librarian¶
The librarian itself is a relatively simple container, coming with a docker file in this repository that simply installs the librarian and the binary psycopg driver.
An example configuration for the librarian is as follows (docker compose):
librarian-server:
hostname: "librarian-docker"
restart: always
build:
context: "."
dockerfile: "Dockerfile"
container_name: "librarian-server"
ports:
- 21109:21109
stdin_open: true
tty: true
volumes:
- type: "bind"
source: "/storage/mainstore"
target: "/storage/mainstore"
- type: "bind"
source: "/sneakerA"
target: "/sneakerA"
bind:
propagation: "rslave"
- type: "bind"
source: "/users/me/site-librarian-configs/"
target: "/librarian-configs"
environment:
- LIBRARIAN_CONFIG_PATH=/librarian-configs/server_config.json
- LIBRARIAN_SERVER_DATABASE_USER=librarian
- LIBRARIAN_SERVER_DATABASE_PASSWORD=hello-world-password
- LIBRARIAN_SERVER_ALEMBIC_CONFIG_PATH=/librarian-configs
- LIBRARIAN_BACKGROUND_CONFIG=/librarian-configs/background_config.json
- LIBRARIAN_SERVER_SLACK_WEBHOOK_URL_FILE=/librarian-configs/SLACK_KEY
- LIBRARIAN_SERVER_ENCRYPTION_KEY_FILE=/librarian-configs/FERNET_KEY
depends_on:
librarian-database:
condition: service_healthy
networks:
- "librarian-network"
networks:
librarian-network:
A key thing to note here is that the storage locations have the same path inside and outside of the container. This is because the librarian uses the file system to store files, and so the paths must be the same.
By combining these files, alongside the correct configuration, you can deploy
with a simple docker compose up -d.
Post-Setup¶
After the initial docker deployment of the librarian, it will not work. You will need to log into the container and run the initial database migration; this will create the necessary tables in the database, as well as some initial rows (e.g. the intiial admin user).
You can find the location of your running docker service with docker ps
and log in with docker exec -it $CONTAINER_ID /bin/bash. From here, you should
run the initial setup script:
librarian-server-setup --initial-user=$INITIAL_ADMIN_USER_NAME \
--initial-password=$INITIAL_ADMIN_PASSWORD
Which sets up the initial administrator user and the stores in the system. To run
a database migration, instead of the initial setup, you can run --migrate as
the argument here which will run the alembic migrations.
After this, the librarian should be up and running and you can access it at the specified port using your API client.