Mayan EDMS version 3.2 released

Thu, Aug 1, 2019

Version 3.2.6 has been out for some time now and with no bugs reported we can declare series 3.2 as stable.

The team would like to send a special thank you to the Berkeley County (SC) Government and its personnel, and to David Kornahrens Jr, Chief Information Officer and Assistant to Supervisor, for their big support.

Click here to read more about this collaboration and how you too can be a part of Mayan’s future.

User interface

This series adds a new contextual navigation bar. This bar is a hybrid between a list menu bar and a sidebar menu. Icons in this menu will show in the object list view and when the object is the main object in the template. This reduces mouse travel and clicks, as most views of the object now remain “open” in the sidebar instead of being hidden inside the “Actions” drop-down.

The “Actions” drop-down will now divide the available action links depending on the menu that defines them (“action”, “secondary”) and the object that they act upon. This is useful on views that can display more than one object at a time such as the setup views. For example, during the workflow setup view the action links will be split between actions for the workflow, for the state, and for the transition. This makes setup navigation faster and requires less mental effort as the user no longer needs to remember which link affects which view object.

Entries as separated by menu and object

Another user interface change that landed on this version is the ability to sort lists of objects. Fields that correspond to a database field will have their column heading displayed as an HTML link. Clicking on a sortable column heading link will sort the list by the values of that column. Clicking the heading again will invert the sort order. A small arrow icon will show the sort order.

Only columns that correspond to a database field can be sorted.

Support for icon composition was added. This allows for a more unified visual language.

Multiple icons can be composed into new ones.

New apps

A new app named “File metadata” was added. This app allows extracting file information. The app includes a driver by default to extract EXIF field information. These fields include camera information for photos, authors for office documents and PDFs, and other information that can now be used to search or index documents. This app was previously a separate app called Mayan-EXIF (https://gitlab.com/mayan-edms/exif). The app was generalized and is now part of the core group of apps.

Sample file metadata output for a PDF file

The autoadmin app, which is in charge of creating the initial admin user after the installation, has been included in the core. This app is made by the same author as Mayan EDMS and was at one time part of the core apps.

Another app was added to handle all dependencies. Previously, the code to handle JavaScript dependencies and license text collection was contained in the common app. This new app, called “dependencies”, now handles both tasks. In addition, it provides checks for binary dependencies. This app’s main view will show which dependencies are not being recognized, making it faster and easier to debug installation issues. The app supports Python, JavaScript, web assets, and binary dependencies.

Dependencies can be checked from the web interface.

Dependencies can also be checked from the command line.

The JavaScript library download and installation code was updated to provide faster hash verification using block hashing. During tests this change cut verification time to just 28% of the previous time. Additionally, the app can detect existing installations of some dependency types and skip them for even faster upgrade and startup times.

Installed dependencies are now skipped.

An app called “platform” was added with the purpose of generating platform configuration files. For this release two templates were added: a template for supervisor for direct deployments and another supervisor template for the Docker image. Generating the configuration files for platform software saves users from manually copying and pasting such files from the documentation’s installation chapter, which was a source of many installation issues.

Platform configuration files can now be generated by Mayan EDMS itself.

Deprecations

The only feature deprecated on the user facing side is the convertdb management command. This command was added to allow existing installations using SQLite as the database manager to convert their database to one of the recommended database managers.

After many reports, the consensus reached was that this functionality is not meant to be provided in the project. Software projects have little or no control of the aspects upon which they rely. Framework, environment, platform, OS, databases are such examples.

Database conversion is a task best suited for operations oriented software and experienced database users. For these reasons the database conversion command has been deprecated and will be removed in a future version.

Docker image

The Debian Linux base operating system layer used for the Docker image was updated to version 9.8.

Optimizations in the way the image is built reduced the final size from 1.25 GB to 1.09 GB.

Support was added for setting the Docker container user’s UID and GUID. This is accomplished by the two new environment variables called MAYAN_UID and MAYAN_GUID.

The Docker image now includes all static media files like JavaScript libraries, web assets, and themes. This makes the initial start and restarts of the image faster as only the database migration is executed.

Events system

More system events are now recorded. Some of these are:

  • Document link mailing.
  • User creation, modification, log in, and log out.
  • Group creation and modification.
  • Role creation and modification.
  • Index creation and modification.
  • ACL creation and modification.
  • Workflow creation and modification.
  • Smart link creation and modification.

A new link was added under the User menu to show all the events of the currently logged-in user.

Incompatible changes

Existing config.yml files need to be updated manually. The prefix mayan.apps. must be added to any reference to an app. For example, update this:

LOCK_MANAGER_BACKEND: lock_manager.backends.file_lock.FileLock

to this:

LOCK_MANAGER_BACKEND: mayan.apps.lock_manager.backends.file_lock.FileLock

and this:

OCR_BACKEND: ocr.backends.pyocr.PyOCR

to this:

OCR_BACKEND: mayan.apps.ocr.backends.pyocr.PyOCR

Check the supervisord logs at /var/log/supervisor for additional errors from missed updates in the form:

ImportError: No module named ocr.backends.pyocr
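
As a rough aid for the manual update described above, the following Python sketch adds the mayan.apps. prefix to "KEY: value" lines whose value starts with a known app name. The APP_NAMES tuple and the add_app_prefix helper are hypothetical names chosen for illustration and are not part of Mayan EDMS. The sketch only handles simple one-line entries, so review the result before replacing a real config.yml.

```python
# Hypothetical list of app module names; extend it to match the apps
# referenced in your own config.yml.
APP_NAMES = ("lock_manager", "ocr")
PREFIX = "mayan.apps."


def add_app_prefix(line, app_names=APP_NAMES):
    """Return the line with 'mayan.apps.' added to its dotted path value.

    Lines that are not simple 'KEY: value' pairs, or whose value does not
    start with a known app name, are returned unchanged.
    """
    key, separator, value = line.partition(": ")
    if not separator:
        return line
    value = value.strip()
    for name in app_names:
        if value.startswith(name + "."):
            return key + separator + PREFIX + value
    return line


old_line = "OCR_BACKEND: ocr.backends.pyocr.PyOCR"
new_line = add_app_prefix(old_line)
print(new_line)
```

Because values that already start with mayan.apps. do not match any app name, running the helper twice leaves already updated lines untouched.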

The static file collection and compression command collectstatic is superseded by the new preparestatic command. Both work the same way, but preparestatic has a default blacklist to avoid collecting files from tests, development setup, and demos.

All the Microsoft Internet Explorer browser specific HTML markup was removed.

Mailing

Mailing profiles were updated to allow specifying the email sender address.

Due to the change of using the entire dotted path of apps, the backend of existing mailing profiles will be invalid and must be re-created. Instead of showing an error, invalid mailer backends will be replaced by a placeholder NullMailer class.

The document link URL when mailed is now composed of the COMMON_PROJECT_URL setting value plus the document’s URL instead of the Django Site domain. This was the only use of Django’s Site app, which was therefore removed as a dependency.

MERCs

Two new Mayan EDMS Requests for Comments (MERCs) were approved during version 4.0 development and applied to this release too.

MERC 5 now requires all callables to use explicit keyword arguments. This MERC in effect makes positional arguments obsolete. These are only retained for Python modules and callables that don’t support named or keyword arguments.
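
To illustrate the calling style MERC 5 asks for, here is a minimal sketch. The build_label function and its parameters are made up for this example and do not come from the Mayan EDMS code base.

```python
def build_label(name, version):
    """Return a display label made of a project name and a version."""
    return "{} {}".format(name, version)


# Positional call: correct only while the argument order is remembered.
positional_label = build_label("Mayan EDMS", "3.2")

# Explicit keyword call, the style MERC 5 requires: each value is named,
# so the call stays correct even if the parameter order changes later.
keyword_label = build_label(name="Mayan EDMS", version="3.2")

print(keyword_label)
```

Both calls return the same value here; the keyword form is simply easier to read and more robust to signature changes.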

MERC 6 introduces a security and privacy policy. This policy is a preemptive information disclosure reduction. This means that code and views in general will disclose less information when the user doesn’t have the required access for an object, view, or action. Instead of displaying an “Access denied” or “Forbidden” error, a “Not found” or 404 error will be raised. This way the user will not have any information about the existence of a resource for which access has not been granted. To keep the API compatible for this minor release, only views were updated according to MERC 6.

If you are developing a third party app, update your view tests checking the permission-missing behavior to expect a 404 error and not a 403 one.
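
The behavior change can be sketched in plain Python. The view_status function below is a hypothetical stand-in for a view, not Mayan EDMS or Django code; it only shows that a missing object and a missing permission now produce the same response.

```python
HTTP_OK = 200
HTTP_NOT_FOUND = 404


def view_status(object_exists, user_has_access):
    """Return the status code a MERC 6 style view would answer with.

    A missing object and a missing permission are indistinguishable:
    both yield 404, so the response does not reveal that the object exists.
    """
    if not object_exists or not user_has_access:
        return HTTP_NOT_FOUND
    return HTTP_OK


# A test that used to expect 403 for a user without access should now
# expect 404 instead.
status_without_access = view_status(object_exists=True, user_has_access=False)
print(status_without_access)
```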

Memory usage

The code audit performed during the development of version 4.0 revealed many areas where optimizations were possible. All the backward compatible optimizations were backported to this version. These are:

  • Block reading for document hashing instead of loading the entire document’s file into memory.
  • A temporary file is now used for mime type detection instead of reading the entire file into memory or just reading the first bytes of the file.
  • The Converter class is now initialized only when needed. This allows more effective garbage collection.
  • Use of file-like objects instead of buffers.
  • The change to file-like objects allowed the use of Python’s copyfileobj in several places.
  • The language list was converted into a function instead of being used as a list in all instances.
  • Load only one language in the document properties form.
  • Improved ACL system which moves computation of the access control to the database instead of doing the filtering using Python code in program memory.
  • Use of context managers for all creation of file-like objects.
  • Extensive use of temporary files for office document conversion instead of relying on easier to use but more wasteful memory buffers.

As a result, the memory footprint and CPU usage were lowered substantially. Memory usage was lowered to 700MB of RAM under full load. This is great news for all users but of special importance for restricted environments like low tier virtual hosts, container deployments, and single board computers like the Odroids or the Raspberry Pi.
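
Several of the optimizations listed above (file-like objects, copyfileobj, context managers, and temporary files) can be combined in a few lines of standard library Python. This is a minimal sketch of the general technique, not the actual Mayan EDMS code; the spool_to_temporary_file helper and the sample content are assumptions made for the example.

```python
import io
import shutil
import tempfile


def spool_to_temporary_file(source, block_size=65535):
    """Copy a file-like object into a temporary file block by block.

    Only one block of block_size bytes is held in memory at a time. The
    caller is responsible for closing the returned file, ideally with a
    context manager.
    """
    temporary = tempfile.TemporaryFile()
    shutil.copyfileobj(source, temporary, length=block_size)
    temporary.seek(0)
    return temporary


# Context managers ensure both file-like objects are closed, and the
# temporary file deleted, even if an error occurs while processing.
with io.BytesIO(b"example document content") as source:
    with spool_to_temporary_file(source=source) as temporary:
        copied = temporary.read()

print(len(copied))
```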

Python 3

Long awaited Python 3 support is here. To ensure a smooth transition, only the Mayan EDMS Python package will be released with support for both Python 2.7 and 3. For the next release, the Docker image will be converted to work on Python 3. Finally, with the release of the next major version (version 4.0), Python 3 will be the only Python version supported. This version of Mayan EDMS, as well as future versions of the same series (3.x), will be the last to support Python 2.7.

To summarize:

  • Version 3.2: Python 2.7 & Python 3, Docker with Python 2.7.
  • Version 3.3 and up: Python 2.7 & Python 3, Docker with Python 3.
  • Version 4.0: Python 3 only, Docker with Python 3.

Reliability

Database transaction handling was added in more places to ensure data integrity even in extreme situations where database operations get interrupted.

Removals

Django Suit was removed from requirements.

Support for generating document images in base 64 format was also removed.

The MIMETYPE_FILE_READ_SIZE setting was removed. The new method of using temporary files to determine the MIME type of a document makes this setting obsolete.

The search time elapsed calculation was removed.

Settings

The HOME_VIEW setting was defined without a namespace and as a top level setting. This configuration is reserved for native Django settings, so the HOME_VIEW setting is now namespaced to the COMMON app where it is defined. The setting name therefore changes from HOME_VIEW to COMMON_HOME_VIEW.

More Django settings were exposed and can now be modified from the user interface or via environment variables:

  • AUTH_PASSWORD_VALIDATORS
  • DEFAULT_FROM_EMAIL
  • EMAIL_TIMEOUT
  • INTERNAL_IPS
  • LANGUAGES
  • LANGUAGE_CODE
  • LOGOUT_REDIRECT_URL
  • STATIC_URL
  • STATICFILES_STORAGE
  • TIME_ZONE
  • WSGI_APPLICATION

The DOCUMENTS_HASH_BLOCK_SIZE setting has a new default value of 65535. This means that new documents will be read and processed in blocks of 65,535 bytes (about 64 KiB) to determine their SHA256 hash instead of being read entirely into memory.
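
Block hashing of this kind can be sketched with Python’s hashlib. This is an illustration of the technique under the block size mentioned above, not the Mayan EDMS implementation; the sha256_in_blocks helper is a hypothetical name.

```python
import hashlib
import io

# Same value as the default of DOCUMENTS_HASH_BLOCK_SIZE mentioned above.
BLOCK_SIZE = 65535


def sha256_in_blocks(file_object, block_size=BLOCK_SIZE):
    """Return the SHA256 hex digest of a file-like object.

    The file is read block_size bytes at a time, so memory use stays
    bounded regardless of the size of the document.
    """
    hasher = hashlib.sha256()
    # iter() with a sentinel keeps calling read() until it returns b"".
    for block in iter(lambda: file_object.read(block_size), b""):
        hasher.update(block)
    return hasher.hexdigest()


data = b"x" * 200000  # Larger than one block, so several reads are needed.
digest = sha256_in_blocks(file_object=io.BytesIO(data))
print(digest == hashlib.sha256(data).hexdigest())
```

The digest is identical to hashing the whole content at once; only the peak memory use differs.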

Search results are now handled as a queryset. Querysets are “lazy” and not evaluated until accessed. This means a queryset can represent a vast number of documents without consuming the memory that would be required to hold all the document instances as a list. This change makes the memory limiting setting SEARCH_LIMIT obsolete, so it was removed.
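
The benefit of lazy evaluation can be shown with a plain Python generator. This is only an analogy for how a lazy result set avoids holding every item in memory; it is not how Django querysets are implemented, and iter_results is a made-up function.

```python
import itertools


def iter_results(total):
    """Yield hypothetical result identifiers one at a time.

    Nothing is computed until the caller asks for the next item, so a huge
    result set costs no memory up front.
    """
    for number in range(total):
        yield "document-{}".format(number)


# Represents a billion results without building a billion-item list.
results = iter_results(total=10**9)

# Only the 40 items of the first page are ever created.
first_page = list(itertools.islice(results, 40))
print(len(first_page))
```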

The default value for the recently added, recently accessed, and favorite documents settings was increased from 40 to 400. Using the default pagination size of 40 documents per page, this means a total of 10 pages of documents for each one of these views instead of just one page.

The setting COMMON_TEMPORARY_DIRECTORY was moved to the storage app. The setting is now called STORAGE_TEMPORARY_DIRECTORY.

These are the biggest features added, but there are many other small improvements and additions in many other areas. The release notes detailing all of these can be found here: https://docs.mayan-edms.com/releases/3.2.html