The PKM (Persistent Knowledge Monitor) is a living repository for every software project. It relies on a database management system (MongoDB) that stores the different representations of a project (source code, design models, specifications, requirements, etc.) as well as the links between them in a traceability matrix. The PKM uses the local storage of the server to synchronize with external Git repositories. The PKM can be queried and enriched by the actors involved in the project, in order to maintain consistency and keep the most up-to-date and precise information about the project.
This document, which presents the architecture of the PKM server (i.e. the backend) and the format of the different sorts of knowledge that need to be maintained together with the code in its different forms (source, binary, byte code), is intended for:
- A user or a system administrator willing to install the PKM,
- A developer willing to maintain and extend the PKM source code.
This document does not cover the interactions of clients with the PKM, which are the subject of the sibling document titled “Open source client-side software”.
The source code of the PKM is available at https://gitlab.ow2.org/decoder/pkm-api.
The PKM is open source software subject to the licensing terms below:
Copyright © 2020-2021 Capgemini Group, Commissariat à l’énergie atomique et aux énergies alternatives, OW2, Sysgo AG, Technikon, Tree Technology, Universitat Politècnica de València.
The PKM server is licensed under GNU Affero General Public License version 3.
The SDKs for the clients of PKM and parsers are licensed under Apache License version 2.0.
The parsers and tools are licensed under their respective licenses.
1.3 Choosing a database
The DECODER project partners looked for a suitable implementation of the PKM by examining several open-source, document-oriented NoSQL databases, among them OrientDB and CouchDB. The currently preferred choice is MongoDB 3.4. Indeed, the data used in the project is structured into complex JSON documents, grouped into collections. Documents are linked together by means of internal fields such as file names, function names or identifiers. The PKM has been tested with MongoDB 3.4 and later (up to version 4.4 at the time of writing this document). Note that after MongoDB 3.4, the MongoDB license changed from the AGPL (GNU Affero General Public License) open source license to the SSPL (Server Side Public License). The purpose of the new license is to prohibit the sale of raw MongoDB storage. Although the OSI (Open Source Initiative) does not consider the new license to be open source, the purpose of the PKM is not to sell raw MongoDB storage, so this change should not affect the business use and merchantability of the PKM.
However, the partners have had to deal with some technical limitations of MongoDB:
- Database name: on Linux, a database name must not contain '"' (double quote) or '$' (dollar) characters. This has affected the naming convention of PKM projects because a PKM project is a MongoDB database.
- Document size: MongoDB serializes JSON documents as BSON (a binary representation of JSON) documents. A BSON document stored in a MongoDB collection is limited to 16 MiB. Because of that limitation, the data schemas were adapted so that the PKM server can split incoming data, at the PKM API boundary, into smaller chunks suitable for storage in the database. As a side effect, splitting the data has also improved the performance of searching for document elements when servicing artefact queries.
- In-memory sorting: memory usage when sorting documents in memory is limited to 100 MiB, so the PKM server cannot rely on MongoDB for sorting huge documents in memory. Although MongoDB can use the disk for such expensive operations starting with version 4.4, the PKM does not rely on this feature, in order to remain compatible with earlier versions of MongoDB.
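
The first two limitations above can be illustrated with a minimal Python sketch. It is not the PKM server code: the function names, the chunk document layout (`key`, `chunk`, `elements`) and the default budget are hypothetical, chosen only for illustration. The size of an element is approximated by the length of its compact JSON encoding, which is not the exact BSON size, so the budget deliberately leaves headroom below the 16 MiB limit.

```python
import json
from typing import Any, Iterator, List

# Per-document BSON limit in MongoDB (16 MiB), as stated in the text.
MAX_BSON_BYTES = 16 * 1024 * 1024

# Characters named in the text as forbidden in a database name on Linux.
FORBIDDEN_DB_NAME_CHARS = frozenset('"$')


def is_valid_project_name(name: str) -> bool:
    """Return True if `name` is non-empty and free of the forbidden characters."""
    return bool(name) and not (set(name) & FORBIDDEN_DB_NAME_CHARS)


def approx_size(value: Any) -> int:
    """Approximate a value's stored size by its compact JSON length in bytes.

    Assumption: this only estimates the BSON size; it is not exact.
    """
    return len(json.dumps(value, separators=(",", ":")).encode("utf-8"))


def split_into_chunks(
    key: str,
    elements: List[Any],
    budget: int = MAX_BSON_BYTES // 2,  # hypothetical headroom below the limit
) -> Iterator[dict]:
    """Yield chunk documents whose elements together fit within `budget` bytes."""
    batch: List[Any] = []
    batch_size = 0
    index = 0
    for element in elements:
        size = approx_size(element)
        if size > budget:
            raise ValueError("a single element exceeds the chunk budget")
        # Close the current chunk before it would overflow the budget.
        if batch and batch_size + size > budget:
            yield {"key": key, "chunk": index, "elements": batch}
            batch, batch_size, index = [], 0, index + 1
        batch.append(element)
        batch_size += size
    if batch:
        yield {"key": key, "chunk": index, "elements": batch}
```

Each chunk carries the same `key` (for instance a file name) and a running `chunk` index, so a reader can fetch all chunks for a key and concatenate their `elements` in index order to recover the original data. Smaller chunks also mean that a query for a single element touches less data, which matches the performance side effect noted above.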
1.4 Overall organization of the document