8 Parsers

This chapter presents the parsers that not only provide support for parsing programming languages to the PKM, but also support for parsing UML, .docx documentation file, and debugging information in ELF executable binary files. The parsers can run under the control of Process Engine from the DECODER EU Project tool-chain, which provides the parsers with an extra query parameter {invocationID} identifying the invocation.

8.1 Source code

8.1.1 C

Parse C source codes (REST API 🔐, see FramaCApi.frama_c in SDK):

POST /frama_c/{dbName}?asynchronous=…&invocationID=…
{
    "source_file_paths" : [ "main.c" ],
    "mode":
    {
        "parser": true
    }
}

This parses the C source codes specified in the request body of the project with given name {dbName} using Frama-C and produces a log (see Section 9.9) and some new artefacts related to source code: C source code ASTs, comments and annotations (see Section 9.2). The PKM project is also populated with the system header files needed for the compilation. The operation returns in the response body a Frama-C job which client can poll when in asynchronous mode (i.e. asynchronous=true) using the property id of the job as a job identifier. When in asynchronous mode, the job is enqueued in a job queue. The default behavior is to run jobs synchronously. When the {invocationID} query parameter is defined, the job implicitely runs synchronously.

Run an abstract interpretation/symbolic execution on a C source codes (REST API 🔐, see FramaCApi.frama_c in SDK):

POST /frama_c/{dbName}?asynchronous=…&invocationID=…
{
    "source_file_paths" : [ "main.c" ],
    "mode":
    {
        "eva": true
    },
    "options":
    {
        "eva":
        {
            "main": "main"
        }
    }
}

This runs an abstract interpretation/symbolic execution on a C source codes using Frama-C and produces a log with an analysis report (see Section 9.9). The PKM project is also populated with the system header files needed for the analysis. The format of the analysis report is SARIF.

Run a program proof using Hoare logic on a C source codes (REST API 🔐, see FramaCApi.frama_c in SDK):

POST /frama_c/{dbName}?asynchronous=…&invocationID=…
{
    "source_file_paths" : [ "main.c" ],
    "mode":
    {
        "wp": true
    },
    "options":
    {
        "wp":
        {
            "wp_fct": [ "init" ]
        }
    }

}

This runs a program proof of C annotations (ASCL) using Frama-C and produces a log with an analysis report (see Section 9.9). The PKM project is also populated with the system header files needed for the analysis. The format of the analysis report is SARIF.

Poll a Frama-C job (REST API 🔐, see FramaCApi.getJob in SDK):

GET /frama_c/jobs/{jobId}

When a Frama-C job is running asynchronously, this operation allows polling the Frama-C job with the given identifier {jobId} until job completion. The job is dequeued when client polls a completed job (either finished or failed).

Parse C source codes (CLI):

$ run_frama_c --user=garfield --db=myproject --parser main.c

This parses C source codes and produces a log (see Section 9.9) and some new artefacts related to source code: C source code ASTs, comments and annotations (see Section 9.2). The PKM project is also populated with the system header files needed for the compilation.

Run an abstract interpretation/symbolic execution on a C source codes (CLI):

$ run_frama_c --user=garfield --db=myproject --eva --main=main main.c

This runs an abstract interpretation/symbolic execution on a C source codes using Frama-C and produces a log with an analysis report (see Section 9.9). The PKM project is also populated with the system header files needed for the analysis. The format of the analysis report is SARIF.

Run a program proof using Hoare logic on a C source codes (CLI):

$ run_frama_c --user=garfield --db=myproject --wp --wp-fct=init --wp-fct=process main.c

This runs a program proof of C annotations (ASCL) using Frama-C and produces a log with an analysis report (see Section 9.9). The PKM project is also populated with the system header files needed for the analysis. The format of the analysis report is SARIF.

8.1.2 C++

Parse C++ source codes (REST API 🔐, see FramaClangApi.frama_clang in SDK):

POST /frama_clang/{dbName}?asynchronous=…&invocationID=…
{
    "source_file_paths": [ "OpenCV/modules/core/src/matrix_sparse.cpp" ]
}

This parses the C source codes specified in the request body of the project with given name {dbName} using Frama-Clang and produces a log (see Section 9.9) and some new artefacts related to source code: C++ source code ASTs, comments and annotations (see Section 9.2). The PKM project is also populated with the system header files needed for the compilation. The operation returns in the response body a Frama-Clang job which client can poll when in asynchronous mode (i.e. asynchronous=true) using the property id of the job as a job identifier. When in asynchronous mode, the job is enqueued in a job queue. The default behavior is to run jobs synchronously. When the {invocationID} query parameter is defined, the job implicitely runs synchronously.

Poll a Frama-Clang job (REST API 🔐, see FramaClangApi.getJob in SDK):

GET /frama_clang/jobs/{jobId}

When a Frama-Clang job is running asynchronously, this operation allows polling the Frama-Clang job with the given identifier {jobId} until job completion. The job is dequeued when client polls a completed job (either finished or failed).

Parse C++ source codes (CLI):

$ run_frama_clang --user=garfield --db=OpenCV 'OpenCV/modules/core/src/matrix_sparse.cpp'

This parses C++ source codes using Frama-Clang and produces a log (see Section 9.9) and some new artefacts related to source code: C++ source code ASTs, comments and annotations (see Section 9.2). The PKM project is also populated with the system header files needed for the compilation.

8.1.3 Java

Parse Java source codes (REST API 🔐):

GET /decoder/javaASTGenerator/{projectName}/{sourceFilename}&invocationID=…

This parses Java source codes using Java parser and produces a log (see Section 9.9) and some new artefacts related to source code: Java source code ASTs, comments and annotations (see Section 9.2).

8.2 Documentation

Poll an ASFM job (REST API 🔐, see AsfmApi.getJob in SDK):

GET /asfm/jobs/{jobId}

When an ASFM job is running asynchronously, this operation allows polling the ASFM job with the given identifier {jobId} until job completion. The job is dequeued when client polls a completed job (either finished or failed).

8.2.1 Microsoft Office .docx

Convert a .docx file to an ASFM (REST API 🔐, see AsfmApi.doc_to_asfm in SDK):

POST /asfm/doc_to_asfm/{dbName}?asynchronous=…&invocationID=…
{
    "file_path": "doc.docx"
}

This parses the .docx file specified in the request body of the project with given name {dbName} using doc_to_asfm and produces a log (see Section 9.9) and an ASFM artefacts related to the .docx file (see Section 9.4). The operation returns in the response body an ASFM job which client can poll when in asynchronous mode (i.e. asynchronous=true) using the property id of the job as a job identifier. When in asynchronous mode, the job is enqueued in a job queue. The default behavior is to run jobs synchronously. When the {invocationID} query parameter is defined, the job implicitely runs synchronously.

Convert an ASFM to a .docx file (REST API 🔐, see AsfmApi.asfm_to_doc in SDK):

POST /asfm/asfm_to_doc/{dbName}?asynchronous=…&invocationID=…
{
    "docName": "User's Manual"
}

This generates a .docx from ASFM with the given name in the request body of the project with given name {dbName} and produces a log (see Section 9.9) and a .docx file (see Section 9.1). If present, the sourceFile property of the ASFM is the filename of the generated .docx file, otherwise the name property of the ASFM with an appended '.docx' suffix is the filename of the generated .docx file. The operation returns in the response body an ASFM job which client can poll when in asynchronous mode (i.e. asynchronous=true) using the property id of the job as a job identifier. When in asynchronous mode, the job is enqueued in a job queue. The default behavior is to run jobs synchronously. When the {invocationID} query parameter is defined, the job implicitely runs synchronously.

Convert a .docx file to an ASFM (CLI):

$ run_doc_to_asfm --user=garfield --db=myproject doc.docx

This parses the .docx file using doc_to_asfm and produces a log (see Section 9.9) and an ASFM artefact (see Section 9.4).

Convert an ASFM to a .docx file (CLI):

$ run_asfm_to_doc --user=garfield --db=myproject "User's Manual"

This produces a log (see Section 9.9) and a .docx file (see Sections 9.1 and 9.1.3).

8.2.2 Code to ASFM

Generate an ASFM from the source ASTs (REST API 🔐, see AsfmApi.code_to_asfm in SDK):

POST /asfm/code_to_asfm/{dbName}?asynchronous=…&invocationID=…

This generates an ASFM from the source code ASTs (currently only for C++) of the project with given name {dbName} and produces a log (see Section 9.9). The operation returns in the response body an ASFM job which client can poll when in asynchronous mode (i.e. asynchronous=true) using the property id of the job as a job identifier. When in asynchronous mode, the job is enqueued in a job queue. The default behavior is to run jobs synchronously. When the {invocationID} query parameter is defined, the job implicitely runs synchronously.

8.3 UML

8.3.1 Classes

Parse UML class diagram of an UML2 XMI file (REST API 🔐):

GET /decoder/classModelToJson/{projectName}/{UMLXMIFilename}&invocationID=…

This parses the file with the given name {UMLXMIFilename} of project with given name {projectName} using ClassModelXMI2JSON and produces a log (see Section 9.9) and a UML class diagram (see Section 9.3).

8.3.2 State Machines

Parse UML stage machine of an UML2 XMI file (REST API 🔐):

GET /decoder/smModelToJson/{projectName}/{UMLXMIFilename}&invocationID=…

This parses the file with the given name {UMLXMIFilename} of project with given name {projectName} using SMModelXMI2JSON and produces a log (see Section 9.9) and a UML state machine (see Section 9.3).

8.4 Executable binaries

8.4.1 DWARF debugging information

Parse DWARF debugging information of an ELF executable binary file (REST API 🔐, see ExcavatorApi.excavator):

POST /excavator/{dbName}&invocationID=…
{
    "binary" : "vmlinux",
    "sources" :
    [
        "drivers/net/ethernet/intel/e1000e/80003es2lan.c",
        "drivers/net/ethernet/intel/e1000e/82571.c",
        "drivers/net/ethernet/intel/e1000e/ethtool.c",
        "drivers/net/ethernet/intel/e1000e/ich8lan.c",
        "drivers/net/ethernet/intel/e1000e/mac.c",
        "drivers/net/ethernet/intel/e1000e/manage.c",
        "drivers/net/ethernet/intel/e1000e/netdev.c",
        "drivers/net/ethernet/intel/e1000e/nvm.c",
        "drivers/net/ethernet/intel/e1000e/param.c",
        "drivers/net/ethernet/intel/e1000e/phy.c",
        "drivers/net/ethernet/intel/e1000e/ptp.c",
        …
    ],
    "suppress_types" :
    [
        "__builtin_*"
    ],
    "suppress_functions" :
    [
        "__compiletime_assert_*",
        "__builtin_*",
        "snprintf",
        "sprintf",
        "sscanf",
        "strcspn",
        "strncasecmp",
        "strncat",
        "vsnprintf",
        "strspn",
        "bcmp",
        "fabs",
        "strlen",
        "strncmp",
        "strncpy",
        "memcmp",
        "memmove",
        "strchr",
        "memset",
        "strrchr",
        "memchr",
        "abort"
    ]
}

This parses the file of project with given name {dbName} which is specified in property binary of request body, then processes compilation units, using UNISIM Excavator, and produces a log (see Section 9.9) and equivalent (interfaces and code skeletons only) compilable C source code files (see Section 9.1). The operation returns in the response body an Excavator job which client can poll when in asynchronous mode (i.e. asynchronous=true) using the property id of the job as a job identifier. When in asynchronous mode, the job is enqueued in a job queue. The default behavior is to run jobs synchronously. When the {invocationID} query parameter is defined, the job implicitely runs synchronously.

Poll an Excavator job (REST API 🔐, see ExcavatorApi.getJob in SDK):

GET /excavator/jobs/{jobId}

When an Excavator job is running asynchronously, this operation allows polling the Excavator job with the given identifier {jobId} until job completion. The job is dequeued when client polls a completed job (either finished or failed).