Commit baf3049c authored by Andrea Chimenti

Complete filebeat documentation

parent 606d5de4
# Filebeat
Filebeat is a lightweight tool used primarily to collect logs on client machines and ship them to Logstash. Its advantages are simple configuration and guaranteed (at-least-once) delivery of messages even after an outage of the network, Logstash, and so on. Filebeat runs as a service under systemd. A more detailed description of how Filebeat works is available in the [vendor documentation](https://www.elastic.co/guide/en/beats/filebeat/8.6/how-filebeat-works.html#_how_does_filebeat_keep_the_state_of_files). In short: it remembers its position in each file and tracks incremental changes.
Collection is configured by listing the paths to log files in `filebeat.yml`, in the `filebeat.inputs` section, using inputs of type `filestream` (see [filestream input](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-filestream.html)). Parsing is handled by Logstash. Each input should be labeled with a `service` field, which is used to select the parsing filters in Logstash and to build the index name. Tags can also be added to describe the input more precisely. We recommend not omitting these fields: without them the data cannot be identified on the Logstash side, and the storage turns into a data swamp.
## Installation
Two configuration files are included in the repository:
- `filebeat.yml`: Our template for simplified deployment. It documents every field we use.
- `filebeat.reference.yml`: The reference file from the official documentation. It contains every setting along with a description.
### Steps
1. Install the package via DPKG, RPM, APT, or YUM (prefer APT and YUM). See step 1 in the [documentation](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-installation-configuration.html#installation).
2. Hold Beats upgrades, e.g. with `sudo apt-mark hold filebeat`. Clients should be upgraded together with the other ELK components.
3. Set up the configuration file `/etc/filebeat/filebeat.yml`. We recommend starting from our template.
   1. The `paths` item of each entry in `filebeat.inputs` determines which files are watched.
   2. Set `fields.group` to the name of the working group, i.e. who collects the data. It is used for identification.
   3. Set `fields.os` to `linux` or `windows`. (By default Filebeat does not send information about the OS, and it can also be deployed on Windows.)
   4. Change the path to the certificate if needed.
4. Upload the certificate to the server/workstation; the template expects the name `http_ca.crt`.
5. Verify that port 5044 is open and not blocked by a firewall or similar (use the port set in `output.logstash.hosts` if it differs).
6. Run `sudo systemctl enable filebeat` so that the Filebeat service starts automatically after a system reboot. To start it once, run `sudo systemctl start filebeat`.
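
The port check in step 5 can be scripted. The following is a minimal sketch, not part of the original instructions: the helper name `port_open` is hypothetical, and it assumes `bash` and the coreutils `timeout` command are available on the client. It only tests that a TCP connection can be opened; it does not verify TLS or that Logstash accepts the data.

```shell
# Hypothetical helper: return 0 if a TCP connection to HOST:PORT
# can be opened within 3 seconds, non-zero otherwise.
port_open() {
    host=$1
    port=$2
    # bash's /dev/tcp pseudo-device opens a TCP socket; run it in a child
    # bash so the function also works when sourced from plain sh.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

For example, `port_open log.ucn.muni.cz 5045 && echo reachable` checks the host and port used in the template's `output.logstash.hosts`; substitute the values from your own configuration.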
**Note on log rotation**: Please check that the `copytruncate` strategy is not used for log rotation. See the note in the [documentation](https://www.elastic.co/guide/en/beats/filebeat/current/file-log-rotation.html).
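
One way to look for `copytruncate` is to search the logrotate configuration. This is a sketch under assumptions: the helper name `find_copytruncate` is hypothetical, the default paths `/etc/logrotate.conf` and `/etc/logrotate.d` are the usual logrotate locations and may differ on your system, and a `grep` with recursive search (`-r`) is assumed. Logs rotated by other tools are not covered.

```shell
# Hypothetical helper: print the logrotate configuration files that
# enable copytruncate. Usage: find_copytruncate [PATH...]
find_copytruncate() {
    if [ "$#" -eq 0 ]; then
        # Assumed default locations of the logrotate configuration.
        set -- /etc/logrotate.conf /etc/logrotate.d
    fi
    # -r recurses into directories, -l prints only the names of matching files.
    # The anchored pattern skips commented-out lines and "nocopytruncate".
    grep -rlE '^[[:space:]]*copytruncate' "$@" 2>/dev/null
}
```

Calling `find_copytruncate` without arguments lists the offending files; a non-zero exit status with no output means no active `copytruncate` directive was found in the searched paths.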
### Troubleshooting
The following commands check that the configuration file is valid and that the configured output is reachable:
```text
sudo filebeat test config
sudo filebeat test output
```
###################### Filebeat Configuration Example #########################
# This file is an example configuration file highlighting only the most common
# options. The filebeat.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html
# For more available modules and options, please see the filebeat.reference.yml sample
# configuration file.
# ============================== Filebeat inputs ===============================
filebeat.inputs:
# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input-specific configurations.
# filestream is an input for collecting log messages from files.
- type: filestream
  # Unique ID among all inputs, an ID is required.
  id: my-filestream-id
  # Change to true to enable this input configuration.
  enabled: false
  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /var/log/*.log
    #- c:\programdata\elasticsearch\logs\*
  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  # Line filtering happens after the parsers pipeline. If you would like to filter lines
  # before parsers, use include_message parser.
  #exclude_lines: ['^DBG']
  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  # Line filtering happens after the parsers pipeline. If you would like to filter lines
  # before parsers, use include_message parser.
  #include_lines: ['^ERR', '^WARN']
  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  #prospector.scanner.exclude_files: ['.gz$']
  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1
# ============================== Filebeat modules ==============================
filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml
  # Set to true to enable config reloading
  reload.enabled: false
  # Period on which files under path should be checked for changes
  #reload.period: 10s
# ======================= Elasticsearch template setting =======================
setup.template.settings:
  index.number_of_shards: 1
  #index.codec: best_compression
  #_source.enabled: false
# ================================== General ===================================
# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:
# The tags of the shipper are included in their field with each
# transaction published.
#tags: ["service-X", "web-tier"]
# Optional fields that you can specify to add additional information to the
# output.
#fields:
# env: staging
# ================================= Dashboards =================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here or by using the `setup` command.
#setup.dashboards.enabled: false
# The URL from where to download the dashboard archive. By default, this URL
# has a value that is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:
# =================================== Kibana ===================================
# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:
  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify an additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  #host: "localhost:5601"
  # Kibana Space ID
  # ID of the Kibana Space into which the dashboards should be loaded. By default,
  # the Default Space will be used.
  #space.id:
# =============================== Elastic Cloud ================================
# These settings simplify using Filebeat with the Elastic Cloud (https://cloud.elastic.co/).
# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
#cloud.id:
# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
#cloud.auth:
# ================================== Outputs ===================================
# Configure what output to use when sending the data collected by the beat.
# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["localhost:9200"]
  # Protocol - either `http` (default) or `https`.
  #protocol: "https"
  # Authentication credentials - either API key or username/password.
  #api_key: "id:api_key"
  #username: "elastic"
  #password: "changeme"
# ------------------------------ Logstash Output -------------------------------
#output.logstash:
  # The Logstash hosts
  #hosts: ["localhost:5044"]
  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]
  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"
  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"
# ================================= Processors =================================
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~
# ================================== Logging ===================================
# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
#logging.level: debug
# At debug level, you can selectively enable logging only for some components.
# To enable all selectors, use ["*"]. Examples of other selectors are "beat",
# "publisher", "service".
#logging.selectors: ["*"]
# ============================= X-Pack Monitoring ==============================
# Filebeat can export internal metrics to a central Elasticsearch monitoring
# cluster. This requires xpack monitoring to be enabled in Elasticsearch. The
# reporting is disabled by default.
# Set to true to enable the monitoring reporter.
#monitoring.enabled: false
# Sets the UUID of the Elasticsearch cluster under which monitoring data for this
# Filebeat instance will appear in the Stack Monitoring UI. If output.elasticsearch
# is enabled, the UUID is derived from the Elasticsearch cluster referenced by output.elasticsearch.
#monitoring.cluster_uuid:
# Uncomment to send the metrics to Elasticsearch. Most settings from the
# Elasticsearch outputs are accepted here as well.
# Note that the settings should point to your Elasticsearch *monitoring* cluster.
# Any setting that is not set is automatically inherited from the Elasticsearch
# output configuration, so if you have the Elasticsearch output configured such
# that it is pointing to your Elasticsearch monitoring cluster, you can simply
# uncomment the following line.
#monitoring.elasticsearch:
# ============================== Instrumentation ===============================
# Instrumentation support for the filebeat.
#instrumentation:
    # Set to true to enable instrumentation of filebeat.
    #enabled: false
    # Environment in which filebeat is running on (eg: staging, production, etc.)
    #environment: ""
    # APM Server hosts to report instrumentation results to.
    #hosts:
    #  - http://localhost:8200
    # API Key for the APM Server(s).
    # If api_key is set then secret_token will be ignored.
    #api_key:
    # Secret token for the APM Server(s).
    #secret_token:
# ================================= Migration ==================================
# This allows to enable 6.7 migration aliases
#migration.6_to_7.enabled: true
filebeat.inputs:
  # EXAMPLE: collecting syslog messages
  - type: filestream
    id: fs_syslog
    paths:
      - /var/log/syslog*
    fields:
      service: "syslog"
    prospector.scanner.exclude_files: ['\.gz$', '\.zst$']
    parsers:
      - multiline:
          type: pattern
          pattern: "^[[:space:]]+"
          match: after
    processors:
      - add_locale: ~
  # TEMPLATE: how to define your own inputs
  - type: filestream
    id: fs_custom_logs
    # set the paths to the files you want to harvest, you can use * to match anything
    paths:
      - /var/log/file_you_want_to_log.log
      - /var/log/my_folder/*.log
    # exclude compressed files, needed only if you use the * wildcard in the paths above
    prospector.scanner.exclude_files: ['\.gz$', '\.zst$']
    # use these settings if your file contains multiline messages (continuation lines start with whitespace), otherwise you can remove them
    parsers:
      - multiline:
          type: pattern
          pattern: "^[[:space:]]+"
          match: after
    # add information about the timezone
    processors:
      - add_locale: ~
    # the service name is used for parsing data in Logstash and becomes part of the index name
    # consult with the administrator
    fields:
      service: "service_name"
    # optional additional tags that further specify the input
    tags: ["tag1", "tag2"]
# OTHER SETTINGS:
# you can change the level of logging of the filebeat service, default is info
logging:
  level: info
# if you have a lot of data, you should consider uploading only the last N hours.
# this setting is important when first running beats
ignore_older: 24h
# change the group name and os for better data sorting
fields:
  group: "name_of_my_group"
  os: "linux" # or "windows"
output.logstash:
  hosts: ["log.ucn.muni.cz:5045"]
  ssl.enabled: true
  ssl.certificate_authorities: ["/etc/filebeat/http_ca.crt"]
-----BEGIN CERTIFICATE-----
DUMMY_CERT
-----END CERTIFICATE-----
@@ -26,7 +26,6 @@ Winlogbeat is an open-source tool developed by Elastic, which ...
![List of Windows services with Winlogbeat](../../img/services-windows.png)
### Bulk installation
Simultaneous installation on multiple devices can be done through the SCCM console.
......
@@ -3,9 +3,11 @@
## Basic information
### Project name
Creation of a data lake for storing operational data of Masaryk University
### Investigators
- Ing. Jindřich Zechmeister (MU)
- RNDr. Daniel Tovarňák, Ph.D. (MU)
- Mgr. Martin Kotlík (MU)
@@ -13,16 +15,18 @@ Creation of a data lake for storing operational data of Masaryk...
- Bc. Andrea Chimenti (VUT)
### Period
29\. 6\. 2022 - 29\. 6\. 2023
### URL
<https://fondrozvoje.cesnet.cz/projekt.aspx?ID=690>
## Repository contents
This repository contains the project's output artifacts. The individual technologies are organized into folders with the following structure:
```text
├── Beats
│   ├── Filebeat
│   └── Winlogbeat
......