Graph-Based Network Forensics: Data Handling
The Data Handling Module handles operations that directly use the Dgraph graph database. It can be used in one of three modes.
The first mode is the indexing mode. It takes the ".schema" and ".rdf" files generated by the Transformation Module as input and indexes them with the Dgraph bulk command, which requires a running Dgraph Zero.
The second, Dgraph Zero mode, runs Dgraph Zero and handles its arguments.
The third, Dgraph Alpha mode, loads the indexed data from a local directory into the database.
At least one Dgraph Zero and one Dgraph Alpha node are needed to run the Dgraph graph database. Dgraph Alpha hosts and serves the data. Dgraph Zero controls the nodes in the Dgraph cluster. It moves data between different Dgraph Alpha instances based on data volume.
The folder dgraph-plugins contains a custom plugin, cidr-plugin.go, that is used during the indexing stage. It makes it possible to write queries that search the IP address field by CIDR range instead of matching the address as a string with regular expressions.
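The difference between CIDR-aware matching and string matching can be illustrated with Python's standard `ipaddress` module. This is only a sketch of the idea and does not reproduce the logic of cidr-plugin.go; the function names and sample addresses are made up for illustration.

```python
import ipaddress
import re


def in_cidr(address: str, cidr: str) -> bool:
    """Return True if the IP address belongs to the given CIDR range."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr)


def matches_regex(address: str, pattern: str) -> bool:
    """Naive string-based alternative: match the textual form of the address."""
    return re.match(pattern, address) is not None


# 10.0.0.0/23 spans 10.0.0.0 to 10.0.1.255, which a single textual
# prefix such as "10.0.0." cannot express.
print(in_cidr("10.0.1.200", "10.0.0.0/23"))          # True
print(in_cidr("10.0.2.1", "10.0.0.0/23"))            # False
print(matches_regex("10.0.1.200", r"^10\.0\.0\."))   # False
```

A range such as /23 does not align with the dotted-decimal text of the address, which is why a range-aware index is more convenient than a regular expression here.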
Requirements
- Dgraph
- Docker
- Python3
- Python3 packages in requirements.txt
To install the module, clone the repository using the following command:
$ git clone https://gitlab.ics.muni.cz/granef/dgraph-handler.git
Use the following command to build the Docker container:
$ docker build --tag=granef/dgraph-handling .
Usage
The Docker container can be either run separately with command line arguments or as part of the Granef toolkit with arguments set in the granef.yml configuration file.
The following arguments can be set:
a) For all modes:
| Short argument | Long argument | Description | Default | Required |
|---|---|---|---|---|
| | --module_mode | Module mode; exactly one has to be chosen (possible values: indexing, zero, alpha) | | T |
| -i | --input | Input data directory path | /data/ | F |
| -o | --output | Output data directory path | granef-indexing | F |
| -m | --mounted | Mounted data directory path | /data/ | F |
| -f | --force | Force overwriting of files in the output directory | F | |
| -l | --log | Log level (possible values: DEBUG, INFO, WARNING, ERROR, CRITICAL) | INFO | F |
b) For the indexing mode:
| Short argument | Long argument | Description | Default |
|---|---|---|---|
| -ms | --map_shards | Bulk command "--map_shards" argument value | 2 |
| -rs | --reduce_shards | Bulk command "--reduce_shards" argument value | 1 |
| -z | --zero | Bulk command "--zero" argument value | localhost:5080 |
| -c | --custom_tokenizers | Bulk command "--custom_tokenizers" argument value | cidr-plugin.so |
| -b | --bulk_args | Additional bulk command arguments in the format "--arg value" | |
c) For the Dgraph Zero mode:
| Short argument | Long argument | Description | Default |
|---|---|---|---|
| -my | --my | Zero command "--my" argument value | localhost:5080 |
| -ba | --bindall | Zero command "--bindall" argument value | True |
d) For the Dgraph Alpha mode:
| Short argument | Long argument | Description | Default |
|---|---|---|---|
| -z | --zero | Alpha command "--zero" argument value | localhost:5080 |
| -my | --my | Alpha command "--my" argument value | localhost:7080 |
| -c | --custom_tokenizers | Alpha command "--custom_tokenizers" argument value | cidr-plugin.so |
| -ap | --alpha_p | Alpha command "-p" argument value (data directory specifier) | p/ |
| -of | --alpha_o | Alpha command "-o" argument value (port offset for another Alpha) | 0 |
| -aa | --alpha_args | Additional alpha command arguments in the format "--arg value" | |
Use the following command to start the indexing:
$ docker run --rm -v <LOCAL_DIR>:/data/ granef/dgraph-handling -i <INPUT_DIR_PATH> -o <OUTPUT_DIR_PATH> indexing -b <BULK_ARGS> -ms 2 -rs 1 -z localhost:5080 -c cidr-plugin.so
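When the indexing step is launched from a script rather than typed by hand, the same command can be assembled as an argument list. The following is a minimal sketch: the helper name `indexing_command` and the sample paths are hypothetical, and the flags and their values are copied from the command above.

```python
import shlex
from typing import List, Optional


def indexing_command(local_dir: str, input_dir: str, output_dir: str,
                     bulk_args: Optional[str] = None) -> List[str]:
    """Build the docker invocation for the indexing mode as an argv list."""
    cmd = [
        "docker", "run", "--rm",
        "-v", f"{local_dir}:/data/",          # mount the local data directory
        "granef/dgraph-handling",
        "-i", input_dir, "-o", output_dir,
        "indexing",
    ]
    if bulk_args:
        # Extra bulk command arguments, passed as one "--arg value" string.
        cmd += ["-b", bulk_args]
    cmd += ["-ms", "2", "-rs", "1", "-z", "localhost:5080", "-c", "cidr-plugin.so"]
    return cmd


# Hypothetical paths, shown only to illustrate the resulting command line.
print(shlex.join(indexing_command("/srv/capture", "/data/", "granef-indexing")))
```

The list can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with paths that contain spaces.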
Use the following command to create the internal Docker network (needed only once):
$ docker network create --driver bridge granef
Use the following command to start Dgraph Zero (available only from localhost):
$ docker run --rm --name zero --network granef granef/dgraph-handling zero -my=zero:5080 --bindall=True
Use the following command to start Dgraph Alpha (available only from localhost on port 8080):
$ docker run --rm --name alpha --network granef -p 127.0.0.1:8080:8080 -p 127.0.0.1:9080:9080 -v <LOCAL_DIR>:/data/ granef/dgraph-handling -i /data/ alpha -my=localhost:7080 -z zero:5080 -ap p/
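Once the container is running, the published HTTP port can be probed to check that Dgraph Alpha is up. The sketch below assumes the port mapping from the command above and uses the `/health` endpoint of the Dgraph Alpha HTTP interface; the function names are hypothetical. Only the HTTP status is inspected, because the response body differs between Dgraph versions.

```python
import urllib.error
import urllib.request


def health_url(host: str = "127.0.0.1", port: int = 8080) -> str:
    """URL of the Alpha health endpoint on the published HTTP port."""
    return f"http://{host}:{port}/health"


def alpha_is_healthy(host: str = "127.0.0.1", port: int = 8080,
                     timeout: float = 5.0) -> bool:
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url(host, port), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx answer: treat as not ready.
        return False
```

Calling `alpha_is_healthy()` in a short retry loop is one way to wait for the database before sending queries.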
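Use the following command to start a second Dgraph Alpha. Docker does not allow two containers with the same name, so the container needs a name different from the first one (alpha2 is used here as an example):
$ docker run --rm --name alpha2 --network granef -p 127.0.0.1:8081:8081 -p 127.0.0.1:9081:9081 -v <LOCAL_DIR>:/data/ granef/dgraph-handling -i /data/ alpha -my=localhost:7081 -z zero:5080 -ap p/ -of 1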
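The port numbers in the command for the second Alpha follow from the port offset: Dgraph Alpha adds the offset to each of its default ports (7080, 8080, and 9080). A small sketch of that arithmetic, with hypothetical names:

```python
# Default Dgraph Alpha ports before any offset is applied.
ALPHA_BASE_PORTS = {"internal_grpc": 7080, "http": 8080, "external_grpc": 9080}


def alpha_ports(offset: int = 0) -> dict:
    """Ports used by an Alpha started with the given port offset (-of)."""
    return {name: port + offset for name, port in ALPHA_BASE_PORTS.items()}


print(alpha_ports(1))
# {'internal_grpc': 7081, 'http': 8081, 'external_grpc': 9081}
```

With an offset of 1, the values match the `-my=localhost:7081` argument and the published ports 8081 and 9081 used above.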