Set up WebDAV as a source for Mayan EDMS

I recently switched to a document management system (DMS) to organize my documents. A deeply nested folder structure on my Nextcloud was no longer an option; I wanted a system with full-text search. I quickly settled on Mayan EDMS, probably the most popular open-source DMS and easily deployable with Docker. Let’s run a quick test, I thought:

version: "3.7"

services:
  app:
    image: mayanedms/mayanedms:3.3.7
    env_file:
      - .env
    depends_on:
      - postgresql
      - redis
    ports:
      - "8089:8000"
    restart: always
    volumes:
      - /mnt/data/mayan/media:/var/lib/mayan

  postgresql:
    image: postgres:9.6-alpine
    env_file:
      - .env
    restart: always
    volumes:
      - /mnt/data/mayan/postgres:/var/lib/postgresql/data

  redis:
    image: redis:5-alpine
    command:
      - redis-server
      - --databases
      - "2"
      - --maxmemory-policy
      - allkeys-lru
      - --save
      - ""
    restart: always

Easy enough, everything worked right out of the box. One of my main concerns was being able to upload files quickly and, above all, seamlessly for later processing. Currently, when I scan a document with a dedicated Android app (Scanbot, to be precise), the generated PDF is auto-uploaded to my Nextcloud, where I categorize it later. I wanted to keep this workflow, as my Nextcloud was up and running anyway. Easy enough: Mayan supports a watch folder where you can drop documents, which are then processed automatically. However, there is no built-in support for WebDAV, and I didn’t want to add an FTP server just for document upload. Enter davfs2, a nice little package that mounts WebDAV drives into the filesystem. A Docker image for it was readily available.

However, the Docker bind mount needs to be mounted with rshared, and I wasn’t able to bind two Docker containers to the same shared directory. So there was no luck using this image separately. Instead, I merged it with the Mayan image using the following Dockerfile:

FROM mayanedms/mayanedms:3.3.7

ENV WEBDRIVE_URL=
ENV WEBDRIVE_USERNAME=
ENV WEBDRIVE_PASSWORD=
ENV WEBDRIVE_MOUNT=/mnt/webdrive

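# Install davfs2. Update and install run in one layer so the package index
# is never stale, and the index is removed afterwards to keep the image small.
RUN apt-get update \
 && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends davfs2 \
 && rm -rf /var/lib/apt/lists/*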

COPY *.sh /usr/local/bin/

ENTRYPOINT ["run.sh"]

CMD ["run_all"]
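
For reference, starting a container from this image could look roughly like the sketch below. The tag mayan-webdav and the function name are made-up placeholders, not part of the original setup, and the .env file, the volumes and the PostgreSQL and Redis services from the compose file are left out. The function only prints the command so it can be reviewed before running it. The two flags that matter are --device /dev/fuse and --cap-add SYS_ADMIN, which the davfs2 mount inside the container relies on.

```shell
#!/bin/bash
# Print (do not run) a docker command for the merged image.
# "mayan-webdav" is a hypothetical tag, e.g. from: docker build -t mayan-webdav .
# $1 = WebDAV URL, $2 = WebDAV username.
# WEBDRIVE_PASSWORD is passed without a value, so docker takes it from
# the calling shell's environment instead of the command line.
mayan_webdav_cmd() {
    printf '%s ' docker run -d \
        --device /dev/fuse \
        --cap-add SYS_ADMIN \
        -e "WEBDRIVE_URL=$1" \
        -e "WEBDRIVE_USERNAME=$2" \
        -e WEBDRIVE_PASSWORD \
        -p 8089:8000 \
        mayan-webdav
    printf '\n'
}
```

Calling mayan_webdav_cmd https://path.to.dav/mount myuser prints the full command line, which can then be copied into a shell where WEBDRIVE_PASSWORD is exported.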

run.sh is as follows:

#!/bin/bash

# Check variables and defaults
if [ -z "${WEBDRIVE_URL}" ]; then
	echo "No URL specified!"
	exit 1
fi
if [ -z "${WEBDRIVE_USERNAME}" ]; then
	echo "No username specified, is this on purpose?"
fi
if [ -z "${WEBDRIVE_PASSWORD}" ]; then
	echo "No password specified, is this on purpose?"
fi

# Create secrets file and forget about the password once done (this will have
# proper effects when the PASSWORD_FILE-version of the setting is used)
echo "$WEBDRIVE_MOUNT $WEBDRIVE_USERNAME $WEBDRIVE_PASSWORD" >> /etc/davfs2/secrets
unset WEBDRIVE_PASSWORD

# Add davfs2 options out of all the environment variables starting with DAVFS2_
# at the end of the configuration file. Nothing is done to check that these are
# valid davfs2 options, use at your own risk.
if [ -n "$(env | grep "DAVFS2_")" ]; then
	echo "" >> /etc/davfs2/davfs2.conf
	echo "[$WEBDRIVE_MOUNT]" >> /etc/davfs2/davfs2.conf
	for VAR in $(env); do
		if [ -n "$(echo "$VAR" | grep -E '^DAVFS2_')" ]; then
			OPT_NAME=$(echo "$VAR" | sed -r "s/DAVFS2_([^=]*)=.*/\1/g" | tr '[:upper:]' '[:lower:]')
			VAR_FULL_NAME=$(echo "$VAR" | sed -r "s/([^=]*)=.*/\1/g")
			VAL="${!VAR_FULL_NAME}" # bash indirect expansion, avoids eval
			echo "$OPT_NAME $VAL" >> /etc/davfs2/davfs2.conf
		fi
	done
fi

# Create destination directory if it does not exist.
if [ ! -d "$WEBDRIVE_MOUNT" ]; then
	mkdir -p "$WEBDRIVE_MOUNT"
fi

# Mount and verify that something is present. davfs2 always creates a lost+found
# sub-directory, so we can use the presence of some file/dir as a marker to
# detect that mounting was a success. Execute the command on success.
mount -t davfs "$WEBDRIVE_URL" "$WEBDRIVE_MOUNT" -o uid=1000,gid=1000,dir_mode=755,file_mode=755
if [ -n "$(ls -1A "$WEBDRIVE_MOUNT")" ]; then
	echo "Mounted $WEBDRIVE_URL onto $WEBDRIVE_MOUNT"
else
	echo "Nothing found in $WEBDRIVE_MOUNT, giving up!"
	exit 5
fi

/usr/local/bin/entrypoint.sh "${@}" # entrypoint.sh is the entrypoint of mayanedms/mayanedms
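
To make the DAVFS2_ handling in the script more tangible, here is a minimal standalone sketch of the same translation. The function name davfs2_options_from_env is my own and not part of the image. It reads the environment line by line instead of word by word, so values containing spaces survive, although values containing newlines are still not handled. Like the original loop, it does not check that the result is a valid davfs2 option.

```shell
#!/bin/bash
# Print one "option value" line for every DAVFS2_* environment variable.
# The prefix is stripped and the rest is lowercased, so
# DAVFS2_USE_LOCKS=0 becomes "use_locks 0".
davfs2_options_from_env() {
    env | while IFS='=' read -r name value; do
        case "$name" in
            DAVFS2_*)
                opt=$(printf '%s' "${name#DAVFS2_}" | tr '[:upper:]' '[:lower:]')
                printf '%s %s\n' "$opt" "$value"
                ;;
        esac
    done
}
```

With DAVFS2_USE_LOCKS=0 exported, for example, the function prints the line use_locks 0, which is what run.sh appends below the [$WEBDRIVE_MOUNT] section header in /etc/davfs2/davfs2.conf.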

Everything works great, but did I mention the special permissions this Docker image needs? Long story short: to mount WebDAV, we need to expose /dev/fuse and add CAP_SYS_ADMIN, which frankly is too much for me. I want to be able to move my Docker containers easily from one server to another, but here I decided to accept one layer of abstraction less. Instead of making my life hard and giving a whole container special permissions, I settled on installing davfs2 directly on the host and just exposing the mounted directory to Mayan. Mounting it automatically via fstab is relatively straightforward:

https://path.to.dav/mount /var/www/mayan/watch davfs rw,user,uid=1000,gid=1000,noauto,x-systemd.automount,file_mode=0664,dir_mode=2775 0 0

As you can see, the folder is mounted with uid and gid 1000, so the Mayan container is able to access it. With the options noauto,x-systemd.automount, the folder is mounted only when it is first accessed. As Nextcloud is hosted on the same server, the share cannot be mounted before the Nextcloud application has started and accepts incoming connections. Deferring the mount until the first access ensures it happens much later than that, as nothing touches the folder during boot.
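
One thing to keep in mind when checking this setup: with x-systemd.automount, systemd keeps an autofs placeholder mounted on the directory even before the WebDAV share itself is mounted, so a plain "is something mounted here?" check can be misleading. The following is a small sketch of a check that ignores that placeholder. The function name and the optional second argument (an alternative mounts file, used only for testing) are my own additions.

```shell
#!/bin/bash
# Succeed only if a real filesystem is mounted on the given directory,
# ignoring the autofs placeholder created for x-systemd.automount.
# $1 = mount point, $2 = mounts table to read (defaults to /proc/mounts).
is_really_mounted() {
    awk -v dir="$1" '
        $2 == dir && $3 != "autofs" { found = 1 }
        END { exit !found }
    ' "${2:-/proc/mounts}"
}

# Possible usage on the host (the first access triggers the automount):
#   ls /var/www/mayan/watch > /dev/null
#   is_really_mounted /var/www/mayan/watch && echo "WebDAV share is mounted"
```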

The mounted folder can easily be bound to the Docker container:

version: "3.7"

services:
  app:
    image: mayanedms/mayanedms:3.3.7
    env_file:
      - .env
    depends_on:
      - postgresql
      - redis
    ports:
      - "8089:8000"
    restart: always
    volumes:
      - /mnt/data/mayan/media:/var/lib/mayan
      - ./watch:/scanned_files:rshared

  postgresql:
    image: postgres:9.6-alpine
    env_file:
      - .env
    restart: always
    volumes:
      - /mnt/data/mayan/postgres:/var/lib/postgresql/data

  redis:
    image: redis:5-alpine
    command:
      - redis-server
      - --databases
      - "2"
      - --maxmemory-policy
      - allkeys-lru
      - --save
      - ""
    restart: always

I am happy enough with my setup now!