Released: May 19, 2021
This release is of special importance for the project as it also marks the 10 year anniversary of the first release.
{.align-center}
Looking back, is incredible to see how something I started back as a side project to help out the government agency I used to work at, has grown in to something used by people around the world.
It has become a community. It has become a serious product in markets historically dominated by very big companies. It has become a full time job. It has become a gateway to working with a great team of professionals. I has become the channel by which I've gotten to meet and interact with incredible people I would have never met otherwise.
There is a saying that goes: "Find something you love to do, and you'll never have to work a day in your life.". I have found that something in my life, and I have all of you to thank.
-- Roberto Rosario
The big ticket item for this version is the new page composition feature. Historically, Mayan EDMS has equated an uploaded file to a document version as a one to one relationship, with the same relationship applying to the document pages. This is due to the origins and main purpose of the project, which was to be a static scanned document repository.
As the user base and industries in which Mayan EDMS is being used expands, this basic philosophy had to be expanded too, but in such a way that it did not affected previous uses.
The result is what we call the Document Page Composition API. This new feature allows for document pages to be disabled, enabled, appended, removed, or even reordered. All of this is achieved in a non destructive way (the original uploaded file is never changed in any way), all of the changes happen only at the application level.
We accomplished this by making two fundamental changes. The first is the creating a new object type, the document file object. The second was to allow for the creation of any number of document versions for a document. Now when a new file is upload for a document, by default a new document file object is created and a corresponding document version is created that points to the new document file. This makes the document versions behave like "views" to the document files contained in a document.
When a new file is uploaded, the user has the choice to create a new version that matches the page layout of the new file, or to create a new version that includes the pages of the previous file as well as the pages of the new file. The file is uploaded and processed, users can also go to the document version and change its page mapping and ordering.
Important
As with every upgrade, make a backup of your database and document storage. More so as this release is a major one and the first of a series of big paradigm changes for the project. Make sure to validate your data after the upgrade and report any issues.
Important
The default versions of PostgreSQL and MySQL were also updated. These database managers might require dedicated steps to convert your existing database from the current version to the new one. This should be done before upgrading the Mayan EDMS version.
A tool to audit the ACLs in the system was added. This tools is called "Global ACLs" and is found under the "Tools" menu. It will display and allow modification to all the ACLs defined in the system regardless of their origin object or view.
The API is now versioned with a top level entry. Since Mayan uses semantic versioning and API changes only occur on major versions, the API version will match the major version of the release. For this release and for every release of the series 4.x the API version will be "v4".
Support was added for API result sorting. Like the user interface, only indexed database fields support. The OPTION verb can be used to determine which fields are sortable.
API endpoints were added for the Assets model. Cached image generation was also added for faster assets usage.
An asset detail view with image preview was added.
Previously, when displaying images, only one error condition was represented regardless of the underlying cause. This was done using the paper icon with the red cross. The converter was updated to allow app to specify their own error conditions and the corresponding visual indicator. This is now used when generating document images and allows displaying different icons and popup messages when there are no active document versions or the active document version has no pages defined.
Due to the addition of the Document Page Composition API, certain operations might now only be performed on document files while other might only be performed on document versions. These are as follows:
Function Object
Download Document files
Export to PDF (new) Document versions
OCR Document versions
File metadata Document files
Parsing Document files
Transformations, decorations, and Both redactions
Transformations applied to a document file page will be reflected on any document version page. However transformations applied a document version page will not be reflected on the source document version file.
This new feature also changes how we things about document versions and file uploads ordering. These are not longer lineal. Any document version or document file may be deleted at any time.
The caches of the file pages and version pages images are now two separate storages. This allows keeping the document files in one location such as object storage and the document version pages in a faster or local storage for a combination of low cost long term storage for files and volatile fast storage for image previews.
The three storages are:
- DOCUMENTS_FILE_STORAGE_BACKEND - Stores the uploaded document files.
- DOCUMENTS_FILE_PAGE_IMAGE_CACHE_STORAGE_BACKEND - Stores the preview images of the uploaded document files.
- DOCUMENTS_VERSION_PAGE_IMAGE_CACHE_STORAGE_BACKEND - Stores the preview images of the document version pages.
Setting migration have been included to transition existing values but make sure to double check you settings before and after the upgrade.
Previously, when downloading a document, the actual filename of the download was the document label. Now, when a document file is uploaded the actual filename of the file is retained and used as the download filename including its original extensions if any. This solves downloading and display issues with operating systems that rely on file extensions to identify the type of a file.
With the separation of files and versions, other features are now possible. Support was added to grant access to individual versions of a document via ACLs.
Favorite documents now track the date and time of selection to ensure deterministic ordering when deleting the oldest favorites. This also allows displaying the favorite documents from most recently favorited to least recently favorited. The favorite documents feature also received its own API.
Some aspects of the recently added and accessed documents were renamed to make it more clear. As a result the setting DOCUMENTS_RECENT_ACCESS_COUNT was renamed to DOCUMENTS_RECENTLY_ACCESSED_COUNT, and DOCUMENTS_RECENT_ADDED_COUNT to DOCUMENTS_RECENTLY_CREATED_COUNT. Configuration file migrations and migration tests were added. Environment and supervisor settings need to be manually updated.
A new document workflow action was added to change the type of a document.
Add a third document filename generator was added that uses an UUID plus the original filename of the uploaded file. This generator has the advantage of producing unique filename while also preserving the original filename for reference.
The duplicated documents code was moved to its own new app. Along with this, support was added for duplication backends. These backends allow adding more methods of determining what is to be considered a duplicated document. This allows for duplication logic that might be domain specific.
An dedicated API endpoint was added for the duplicates app.
The MySQL Docker image was updated from version 5.7 to 8.0, the PostreSQL image from version 10.14 to 10.15, and the Redis image from version 5.0 to 6.0.
The default version of PostgreSQL will be incremented in each subsequent minor release to catch up to the target version of 12.0.
A database backup and restore is needed when moving from one version of PostgreSQL to another (https://www.postgresql.org/docs/10/backup-dump.html).
Make a backup of your database container before performing the Mayan EDMS upgrade. Then upgrade the database container image, and finally restore the database into the new container. Only then perform the Mayan EDMS upgrade.
The Docker image tagging layout has been updated to differentiate between released versions and series. Series have the 's' prefix and versions have the 'v' prefix. The version 4.0 image will have the tags v4.0 and s4. The version 4.1 image will have the tags v4.1 and s4. This change will allow users to have their deployments pinned to a specific version or to a specific series instead of relying on the "latest" tag.
The Docker Compose file was updated to use profiles for the extra containers (https://docs.docker.com/compose/profiles/). Profiles allow running extra containers with none or minimal modification of the Docker Compose file.
The Docker Compose file was also updated to use extension fields to remove duplicated markup (https://docs.docker.com/compose/compose-file/compose-file-v3/#extension-fields).
An entra container was added to mount and index and expose it to the host operating system.
A container and markup for Traefik was added. This makes for easier deployment of SSL using a single Docker Compose file.
A sample .env file was added to pass environment variables to the Docker Compose file YAML parser (https://docs.docker.com/compose/compose-file/compose-file-v3/#environment).
Support for an env_file was also added to allow passing environment variables to the running containers (https://docs.docker.com/compose/compose-file/compose-file-v3/#env_file).
These changes require Docker Compose version 1.28 or later.
A new tool was added to hold generated files for download. This tool allows performing document version exports in the background. Users will received a message when their download is ready.
To improve auditing, support was added to export and download the global events list, the object events list, and user events list. The event list is exported as a CSV file.
In addition to the existing tool to purge caches, it is now possible to purge specific document files or document version caches. This can be achieved with the new action link and requires the cache purge permission for the object.
Event tracking was added when purging a cache partition.
Many stability, scaling, and performance updates were added to the file caching system.
Cache collisions were solved for many scenarios. This allows controlled access to the cached content even under heavy load situation where many processes might be trying to read or modify cached content.
Failed cache pruning attempts are now automatically retried if the cause was lock contention.
Added detection of excessive cache pruning when cache size is too small for the workload.
The cache eviction logic was improve to use a LFU (Least frequently used) and LRU (Least recently used) combination algorithm. This ensures that only the least usable content is evicted from the cache during pruning.
The caches are now only pruned during startup if their maximum size has changed. This reduces the startup or restart time in installations that are under heavy use.
The format of the file_metadata_value_of helper was changed. The driver and metadata entry are now separated by a double underscore instead of a single underscore. This allows supporting drivers and entries that might contain an underscore themselves.
Templates using this file metadata helper need to be manually updated.
The key creation and expiration fields were converted to date and time fields. Events for key creation and key download were for added. Support for key event subscription was added.
Support for the "Reply To" field was added when sending documents via email and for the mailing workflow actions.
A new app was added to allow sending messages to users. At the moment this app is only used by the downloads app to let users know their download is ready. This is a first stage app and we plan to expand the use of this app in subsequent versions.
A new small app was added to handle details that are specific to an organizations installation of Mayan EDMS. The app is called organizations and holds two settings. The setting ORGANIZATIONS_INSTALLATION_URL allows setting the URL of the installation. This URL is used when composing the document link URL when sending document links via email. It is also used to create the download link for exported document versions.
If this setting is not specified, Mayan EDMS will try to obtain the URL scheme, host, and port from the HTTP request itself. When running Mayan EDMS behind proxy servers or load balancers this might not return accurate results and will require an explicit value for the setting.
The setting ORGANIZATIONS_URL_BASE_PATH replaces the previous setting named COMMON_URL_BASE_PATH used when hosting Mayan EDMS in a path other than the root path ('/') of the web server.
The document app and OCR app migrations dependency was inverted. This makes the OCR migration dependent on the documents app migration instead of the other way around. This allows for completely disabling the OCR app if desired.
Backend support was added for scoped and nested searches. Scoped searches allow using the search app for very complex search patterns. The scopes behave like a binary tree where each operator collapses the term pair into a new runtime scope.
The mathematical representation of search scopes would be as follows:
result of scope X = ((Scope 0) Operator (Scope 1)) Operator ((Scope 2) Operator (Scope 3))
Each scope pair is resolved in reverse as a new runtime scope until the resulting scope requested by the user becomes available.
result of scope 30 = ((Scope 0) Operator (Scope 1) = Scope 10) Operator ((Scope 2) Operator (Scope 3) = Scope 20) = Scope 30
becomes:
result of scope 30 = (Scope 10) Operator (Scope 20) = Scope 30
result of scope 30 = Scope 30
An actual application of the scopes searches is to search for documents with a multiple metadata type and metadata value combinations.
The actual query string would be as follows:
/search/results/documents.DocumentSearchResult/?__0_metadata__metadata_type__name=building&__0_metadata__value=100&__operator_0_1=AND_10&__1_metadata__metadata_type__name=zone&__1_metadata__value=2&__result=10
This means: search for documents with a metadata value of 100 for the metadata type of "building" and that also have a value of 2 for the metadata type of "zone".
Which can also be represented as:
(metadata_type__name=building AND metadata__value=100) AND (metadata_type__name=zone AND metadata__value=2)
When not specified, the search terms are assigned a scope of 0, which preserves the existing search behavior.
This feature was a late addition and the user interface part continues in development. However the backend section is feature complete and accessible via the normal search views and the search API.
This feature is search backend independent.
Support was added to refresh the database representation of a signatures. This is useful for upgrades when a new signature field is added. Refreshing the signature will allow existing signatures to display proper values for these new fields.
The key attributes corresponding to a signature are now displayed as part of the signature serializer in the API.
The API action to upload a detached signature is now its own API endpoint.
Support for document signature events and signature events subscriptions was added.
It is now possible to use staging folders as sources for new document file uploads.
The sources app now has its own menu. This allows switching from different sources easier when uploading new documents or new document files. Conditional source link highlighting was added to better identify which source is currently active.
Support was added to subscribing to user impersonation events.
It is now possible to enable impersonation permission for individual users.
Impersonating users can now be also performed from the user list view, user the "Setup" menu.
A new sub menu was added. We call this menu "Related" as is meant to allow for quicker navigation by displaying object that are related to the one being displayed. For example, when viewing a list of document types, the related menu will show a links to the metadata types setup view.
Another menu that was added to improve navigation was the "Return" menu. Return links was are added to this menu for object contained by other objects such as the case of document files and document version. When navigating a document file or document version a link will be displayed that will quickly return the user to the parent document. These links are decorated with a left oriented chevron for easier identification.
Support was added for collapsing the options of the menus "list facet" and "object" when in list view mode. This behavior is controlled with the new settings: COMMON_COLLAPSE_LIST_MENU_LIST_FACET and COMMON_COLLAPSE_LIST_MENU_OBJECT. Both default to False to preserve the existing behavior.
The URL query key used for sorting results was updated to match the key used for sorting in the API. The new sort key is "_ordering".
The workflows app and API were refactored. The permissions needed to transition a workflow instance were updated. The workflow instance transition permission is now needed for the document and for either the transition or the workflow template. This dual requirement matches the same logic used in other apps such as the metadata and tags apps.
The way workflows are transitioned via the API was improved. The same model methods are now called and transitioning a workflow via the API behaves in the exact manner as transitioning an API via the user interface.
An API endpoint named [workflow-instance-log-entry-detail]{.pre} was added to allow inspection of the individual workflow log events.
The workflow API now allows passing extra data when transitioning a workflow, and the context of the workflow instance is also available.
The state and transition options when using the API are also limited to valid ones as it is done when using the user interface.
The workflow instance creation is now a background task. This speeds up launching new workflows and the initial processing of new documents.
GitLab issue #733 Feature: Draft versions of documents
GitLab issue #777 feature request: cleanup scanned document
GitLab issue #853 Remove parent class and self reference from super() calls
GitLab issue #859 Convert DocumentTypeFilteredSelectForm to use generic FilteredSelectionForm
GitLab issue #864 Feature Request: allow REPLY-TO field in mailing profile
GitLab issue #865 Possible bug: Adding a new version of a document via scanner
GitLab issue #867 Rename document comment model "comment" field to "text"
GitLab issue #921 Feature Request: Version information
GitLab issue #929 COMMON_EXTRA_APPS env variable not doing anything
GitLab issue #939 Using db as lock manager, "name" column in table "lock_manager_lock" too small
GitLab issue #941 Include time of document signature
[:forum-topic:`5085`]{#problematic-1 .problematic} Rest API: Detailed list of signatures
::: {#system-message-1 .system-message} System Message: ERROR/3 (<string>, line 766); backlink
Unknown interpreted text role "forum-topic". :::