
Graph-Based Network Forensics: Data Handling


The Data Handling Module handles operations that directly use the Dgraph graph database. It can be used in one of three modes.

The first mode is the indexing mode. It takes the ".schema" and ".rdf" files generated by the Transformation Module as input and indexes them with the Dgraph bulk loader (the "Bulk command" in the argument descriptions), which works together with Dgraph Zero.

The second, Dgraph Zero mode, runs Dgraph Zero and handles its arguments.

The third, Dgraph Alpha mode, loads the indexed data from a local directory into the database.

At least one Dgraph Zero and one Dgraph Alpha node are needed to run the Dgraph graph database. Dgraph Alpha hosts and serves the data. Dgraph Zero controls the nodes in the Dgraph cluster. It moves data between different Dgraph Alpha instances based on data volume.

The folder dgraph-plugins contains a custom tokenizer plugin, cidr-plugin.go, that is used during the indexing stage. It makes it possible to write queries that match the IP address field by CIDR range instead of matching the address string with regular expressions.
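
To illustrate why CIDR-aware matching differs from string matching, the following minimal sketch tests whether an IPv4 address falls inside a CIDR range using integer masks. It is not the code of cidr-plugin.go; the function names ip_to_int and in_cidr are hypothetical and exist only for this illustration, and the sketch assumes dotted-quad addresses without leading zeros.

```shell
#!/bin/sh
# Illustration only: hypothetical helpers, not part of cidr-plugin.go.

# Convert a dotted-quad IPv4 address (no leading zeros) to an integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed (exit status 0) if address $1 lies inside CIDR range $2.
in_cidr() {
    addr=$1
    net=${2%/*}        # network part before the slash
    bits=${2#*/}       # prefix length after the slash
    if [ "$bits" -eq 0 ]; then
        mask=0
    else
        mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
    fi
    # Compare the network bits of the address and of the range.
    [ $(( $(ip_to_int "$addr") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}
```

For example, `in_cidr 10.10.0.1 10.1.0.0/16` fails, although a naive regular expression such as `^10\.1` would match the string "10.10.0.1".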

Requirements

To install the module, clone the repository using the following command:

$ git clone https://gitlab.ics.muni.cz/granef/dgraph-handler.git

Use the following command to build the Docker image:

$ docker build --tag=granef/dgraph-handling .

Usage

The Docker container can be run either separately with command-line arguments or as part of the Granef toolkit, with the arguments set in the granef.yml configuration file.

The following arguments can be set:

a) For all modes:

| Short argument | Long argument | Description | Default | Required |
| --- | --- | --- | --- | --- |
| | --module_mode | Module mode; one must be chosen (possible values: indexing, zero, alpha) | | T |
| -i | --input | Input data directory path | /data/ | F |
| -o | --output | Output data directory path | granef-indexing | F |
| -m | --mounted | Mounted data directory path | /data/ | F |
| -f | --force | Force overwriting files in the output directory | | F |
| -l | --log | Log level (possible values: DEBUG, INFO, WARNING, ERROR, CRITICAL) | INFO | F |

b) For the indexing mode:

| Short argument | Long argument | Description | Default |
| --- | --- | --- | --- |
| -ms | --map_shards | Bulk command "--map_shards" argument value | 2 |
| -rs | --reduce_shards | Bulk command "--reduce_shards" argument value | 1 |
| -z | --zero | Bulk command "--zero" argument value | localhost:5080 |
| -c | --custom_tokenizers | Bulk command "--custom_tokenizers" argument value | cidr-plugin.so |
| -b | --bulk_args | Other bulk command arguments, in the format "--arg value" | |

c) For the Dgraph Zero mode:

| Short argument | Long argument | Description | Default |
| --- | --- | --- | --- |
| -my | --my | Zero command "--my" argument value | localhost:5080 |
| -ba | --bindall | Zero command "--bindall" argument value | True |

d) For the Dgraph Alpha mode:

| Short argument | Long argument | Description | Default |
| --- | --- | --- | --- |
| -z | --zero | Alpha command "--zero" argument value | localhost:5080 |
| -my | --my | Alpha command "--my" argument value | localhost:7080 |
| -c | --custom_tokenizers | Alpha command "--custom_tokenizers" argument value | cidr-plugin.so |
| -ap | --alpha_p | Alpha command "-p" argument value (data directory specifier) | p/ |
| -of | --alpha_o | Alpha command "-o" argument value (port offset for another Alpha) | 0 |
| -aa | --alpha_args | Other alpha command arguments, in the format "--arg value" | |

Use the following command to start the indexing:

$ docker run --rm -v <LOCAL_DIR>:/data/ granef/dgraph-handling -i <INPUT_DIR_PATH> -o <OUTPUT_DIR_PATH> indexing -b <BULK_ARGS> -ms 2 -rs 1 -z localhost:5080 -c cidr-plugin.so
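
Before starting the indexing, it can help to confirm that the input directory contains the files the indexing mode expects. A minimal sketch, assuming the Transformation Module output sits directly in that directory with the plain .schema and .rdf extensions; the function name has_indexing_input is hypothetical and not part of the module:

```shell
#!/bin/sh
# Hypothetical pre-flight check, not part of the Data Handling Module.
# Succeeds if directory $1 holds at least one *.schema and one *.rdf file.
has_indexing_input() {
    dir=$1
    for ext in schema rdf; do
        found=no
        for f in "$dir"/*."$ext"; do
            # An unmatched glob stays literal, so test that the file exists.
            if [ -f "$f" ]; then
                found=yes
                break
            fi
        done
        if [ "$found" = no ]; then
            echo "no .$ext file found in $dir" >&2
            return 1
        fi
    done
    return 0
}
```

The check can be chained in front of the indexing command, for example `has_indexing_input /path/to/input && docker run ...`, so that the container is started only when both file types are present.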

Use the following command to create the internal Docker network (needed only once):

$ docker network create --driver bridge granef
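
Because the network has to be created only once, a start-up script can check for it first. A minimal sketch using the standard `docker network inspect` and `docker network create` subcommands; the function name ensure_network is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: create the bridge network $1 only if it is missing.
ensure_network() {
    name=$1
    # "docker network inspect" exits with a non-zero status
    # when the network does not exist.
    if docker network inspect "$name" >/dev/null 2>&1; then
        echo "network $name already exists"
    else
        docker network create --driver bridge "$name"
    fi
}
```

Calling `ensure_network granef` is then safe to repeat on every start.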

Use the following command to start Dgraph Zero (available only from localhost):

$ docker run --rm --name zero --network granef granef/dgraph-handling zero -my=zero:5080 --bindall=True

Use the following command to start Dgraph Alpha (available only from localhost on port 8080):

$ docker run --rm --name alpha --network granef -p 127.0.0.1:8080:8080 -p 127.0.0.1:9080:9080 -v <LOCAL_DIR>:/data/ granef/dgraph-handling -i /data/ alpha -my=localhost:7080 -z zero:5080 -ap p/

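Use the following command to start a second Dgraph Alpha. The container name must differ from that of the first Alpha, and the port offset of 1 shifts the ports to 7081, 8081, and 9081:

$ docker run --rm --name alpha2 --network granef -p 127.0.0.1:8081:8081 -p 127.0.0.1:9081:9081 -v <LOCAL_DIR>:/data/ granef/dgraph-handling -i /data/ alpha -my=localhost:7081 -z zero:5080 -ap p/ -of 1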
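
The port arithmetic for further Alpha instances can be sketched as a small helper that prints, but does not run, the corresponding command. It assumes that the port offset is added to the default Alpha ports 7080, 8080, and 9080 and that every container needs a unique name; the function name alpha_command and the naming scheme alpha, alpha2, alpha3 are hypothetical, and all other arguments simply mirror the commands in this README.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the docker command for a
# Dgraph Alpha with a given port offset, mirroring this README's arguments.
# $1 = port offset (0 for the first Alpha), $2 = local data directory.
alpha_command() {
    offset=$1
    local_dir=$2
    http_port=$((8080 + offset))
    grpc_port=$((9080 + offset))
    internal_port=$((7080 + offset))
    # Container names must be unique: "alpha", "alpha2", "alpha3", ...
    if [ "$offset" -eq 0 ]; then
        name=alpha
    else
        name="alpha$((offset + 1))"
    fi
    cmd="docker run --rm --name $name --network granef"
    cmd="$cmd -p 127.0.0.1:$http_port:$http_port -p 127.0.0.1:$grpc_port:$grpc_port"
    cmd="$cmd -v $local_dir:/data/ granef/dgraph-handling -i /data/ alpha"
    cmd="$cmd -my=localhost:$internal_port -z zero:5080 -ap p/"
    # The first Alpha is started without -of; later ones pass the offset.
    if [ "$offset" -gt 0 ]; then
        cmd="$cmd -of $offset"
    fi
    echo "$cmd"
}
```

For example, `alpha_command 1 /srv/granef-data` prints the command for a second Alpha that publishes ports 8081 and 9081 on localhost.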