EFK Part 2: Elasticsearch
Now that we have Fluent Bit running, we need to forward its logs to Elasticsearch. Elasticsearch stores the structured logs in indices, letting us search them with full-text queries, which is ideal for log analytics.
How to configure Elasticsearch
Elasticsearch is quite easy to set up when security features are disabled, which is what we'll do for this initial setup. I deployed it using the following Kubernetes resources:
- A StatefulSet
- A Headless Service
- A PersistentVolumeClaim (PVC)
Security settings in Elasticsearch
Before we begin, a quick word on security: disabling it is not recommended for production, but I find it simplifies the initial setup.
If xpack.security.enabled is set to false, security features are disabled, which is not recommended. It also affects all Kibana instances that connect to this Elasticsearch instance; you do not need to disable security features in those kibana.yml files. For more information about disabling security features in specific Kibana instances, see the Kibana security settings documentation.
The same is true for Fluent Bit: with security enabled, it would also need certificates for HTTPS. We will handle that in the last part of the series.
1. StatefulSet
Here I deployed a single-node Elasticsearch with xpack.security.enabled set to false. With security disabled we can log in without a username and password, and we don't have to worry about setting up transport TLS and HTTPS at this point.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch
spec:
  serviceName: elasticsearch
  replicas: 1
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch:9.1.4
          env:
            - name: discovery.type
              value: single-node
            - name: ES_JAVA_OPTS
              value: "-Xms512m -Xmx512m"
            - name: xpack.security.enabled
              value: "false"
            - name: xpack.monitoring.collection.enabled
              value: "true"
          ports:
            - containerPort: 9200
            - containerPort: 9300
          volumeMounts:
            - name: data
              mountPath: /usr/share/elasticsearch/data
          resources:
            requests:
              memory: "2Gi"
              cpu: "500m"
            limits:
              memory: "2Gi"
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: elasticsearch-pvc
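If you apply this manifest by hand rather than through FluxCD (which we'll use later in this post), you can wait for the pod to become ready like this, assuming the monitoring namespace used in the rest of the series:
kubectl rollout status statefulset/elasticsearch -n monitoring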
2. Service
Next, we create a headless service to give our StatefulSet pod a stable network identity.
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
spec:
  selector:
    app: elasticsearch
  ports:
    - port: 9200
      name: http
      targetPort: 9200
    - port: 9300
      name: transport
      targetPort: 9300
  clusterIP: None
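Because clusterIP is None, DNS resolves straight to the pod, and each StatefulSet pod gets a stable per-pod record. As a quick sketch, assuming the monitoring namespace used later in this post, you could reach it from any pod in the cluster at:
curl http://elasticsearch-0.elasticsearch.monitoring.svc.cluster.local:9200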
3. PersistentVolumeClaim (PVC)
Finally, we define a PVC to provide persistent storage for our log data.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: elasticsearch-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
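Note that since no storageClassName is specified, this PVC uses your cluster's default StorageClass. Once deployed, you can check that it binds successfully:
kubectl get pvc elasticsearch-pvc -n monitoring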
4. Verification
We use FluxCD to manage our Kubernetes manifests, and these Elasticsearch resources are deployed automatically to the monitoring namespace.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: elasticsearch-kustomization
  namespace: default
spec:
  interval: 5m
  targetNamespace: monitoring
  sourceRef:
    kind: GitRepository
    name: elasticsearch-repo
  path: "./apps/overlays/rudolf/elasticsearch"
  prune: true
  wait: true
  timeout: 1m
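If you have the flux CLI installed, you can also watch the reconciliation from the command line:
flux get kustomizations elasticsearch-kustomization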
Once Flux has reconciled these resources, you can verify the cluster's health directly through the Elasticsearch API.
First, open a shell inside the elasticsearch-0 pod, which is running in the monitoring namespace:
kubectl exec -it elasticsearch-0 -n monitoring -- /bin/bash
Then query the /_cluster/health endpoint to check the status:
curl -s -X GET "http://localhost:9200/_cluster/health?pretty"
You should see a response like this:
{
  "cluster_name" : "docker-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 104,
  "active_shards" : 104,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "unassigned_primary_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
Understanding Cluster Health:
- Green: All shards are allocated.
- Yellow: All primary shards are allocated, but replica shards are not.
- Red: At least one primary shard is not allocated.
It is normal to see a Yellow status with a single-node setup like mine: replica shards cannot be assigned because there is only one node. To solve this and turn the cluster status green, you can set the number of replicas to 0 for all indices.
You can update all the existing indices from inside the Elasticsearch pod like this:
curl -s -X PUT "http://localhost:9200/_all/_settings" -H 'Content-Type: application/json' -d'
{
  "index": {
    "number_of_replicas": 0
  }
}'
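To confirm the change took effect, you can list the indices with the _cat API; the health column should now show green for every index:
curl -s "http://localhost:9200/_cat/indices?v&h=health,status,index,pri,rep"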
5. Create an Index Lifecycle Management Policy
Elasticsearch stores data in indices, and they tend to grow as data accumulates over time. To manage the size and number of indices created daily, we can configure an Index Lifecycle Management (ILM) policy.
For my homelab purposes I wanted the following:
- Roll over to a new index when the current one reaches 1 GB
- Roll over daily regardless of size
- Delete indices older than 7 days
We can set up ILM as follows:
- Create a lifecycle policy
- Create an index template to apply the lifecycle policy
- Create an initial managed index and alias
Create a lifecycle policy
As before, open a shell inside the elasticsearch-0 pod in the monitoring namespace:
kubectl exec -it elasticsearch-0 -n monitoring -- /bin/bash
Use the Create or update lifecycle policy API to add an ILM policy to the Elasticsearch cluster:
curl -X PUT "http://localhost:9200/_ilm/policy/fluentbit-ilm-policy" \
  -H "Content-Type: application/json" \
  -d '
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_age": "1d",
            "max_size": "1gb"
          }
        }
      },
      "delete": {
        "min_age": "7d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
'
Explanation:
- hot: Roll over after 1 day or 1 GB
- delete: Remove indices older than 7 days
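You can verify that the policy was stored by reading it back with the Get lifecycle policy API:
curl -s "http://localhost:9200/_ilm/policy/fluentbit-ilm-policy?pretty"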
Create an index template to apply the lifecycle policy
Use the Create or update index template API to add an index template to a cluster and apply the lifecycle policy to indices matching the template:
curl -X PUT "http://localhost:9200/_index_template/fluentbit-template" \
  -H "Content-Type: application/json" \
  -d '
{
  "index_patterns": ["fluentbit-*"],
  "priority": 500,
  "template": {
    "settings": {
      "index.lifecycle.name": "fluentbit-ilm-policy",
      "index.lifecycle.rollover_alias": "fluentbit",
      "number_of_shards": 1,
      "number_of_replicas": 0
    },
    "mappings": {
      "dynamic_templates": [
        {
          "strings_as_keywords": {
            "match_mapping_type": "string",
            "mapping": {
              "type": "keyword"
            }
          }
        }
      ]
    }
  }
}
'
Explanation:
- Converts all strings to keyword automatically
- Applies the ILM policy to all fluentbit-* indices
- Single shard + no replicas
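As a sanity check, you can read the template back and confirm the settings and mappings were applied:
curl -s "http://localhost:9200/_index_template/fluentbit-template?pretty"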
Create an initial managed index and alias
Use the Create index API to create the initial managed index.
curl -X PUT "http://localhost:9200/fluentbit-000001" \
  -H "Content-Type: application/json" \
  -d '
{
  "aliases": {
    "fluentbit": {
      "is_write_index": true
    }
  }
}
'
Explanation:
- fluentbit is the alias
- fluentbit-000001 is the concrete first index
- This is required for ILM rollover to work correctly
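Once the write index exists, you can ask ILM how it is managing it with the Explain lifecycle API; the response should reference fluentbit-ilm-policy and show the index in the hot phase:
curl -s "http://localhost:9200/fluentbit-000001/_ilm/explain?pretty"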
6. Wrapping up
We've successfully deployed a single-node Elasticsearch cluster in our homelab using Kubernetes StatefulSets, persistent storage, and FluxCD. While we've disabled security for simplicity in this initial setup, our logs are now being persistently stored in Elasticsearch indices, ready to be analyzed.
In the next part of this series, we'll deploy Kibana and connect it to our Elasticsearch cluster: the front end that completes the EFK logging stack.