NFS and Elasticsearch: A Storage Disaster for Data but a Lifesaver for Snapshots
When designing an Elasticsearch architecture, choosing the right storage is crucial. While NFS might seem like a convenient and flexible option, it comes with several pitfalls when used for hosting live Elasticsearch data (hot, warm, cold, and frozen nodes). However, NFS proves to be an excellent choice for storing snapshots and searchable snapshots. Here’s why.
Why NFS is a Bad Choice for Elasticsearch Data
1. Poor Performance and High Latency
NFS is a network protocol, introducing additional latency compared to local storage (NVMe, SSD, or HDD). Elasticsearch is highly sensitive to disk latency, as it performs numerous real-time I/O operations, such as:
Writing and updating indices
Searching and aggregating
Replica and segment rebalancing across nodes
The latency introduced by NFS can severely degrade cluster performance, slowing down both queries and indexing operations — an issue that comes up regularly on the Elastic discussion forums.
2. Locking and Concurrency Issues
Elasticsearch relies on advanced locking mechanisms to ensure data consistency. However, NFS does not handle file system locks well, especially when multiple nodes access the same segments simultaneously. This can lead to:
Index corruption
Lock errors on shards
Failures during shard rebalancing or recovery
3. Weak Consistency and Crash Recovery Problems
NFS does not provide the same level of consistency and persistence guarantees as local file systems like XFS or ext4. In the event of a node crash or loss of connection to the NFS server, Elasticsearch might end up in an inconsistent state, resulting in hard-to-diagnose errors.
4. Scalability Bottlenecks
Elasticsearch is designed to scale by distributing the load across multiple nodes, each with its own local storage. Using NFS as shared storage for multiple nodes introduces contention, becoming a bottleneck that limits the cluster’s ability to scale efficiently.
Why NFS is Perfect for Snapshots (and Searchable Snapshots)
1. Snapshots: Reliable and Scalable Backups
Elasticsearch snapshots are point-in-time backups of indices, used for disaster recovery or data migration. In this case, NFS is an excellent choice because:
Snapshots are largely sequential write operations, so they do not depend on low-latency disks
NFS shares are easy to expand, letting you retain large numbers of snapshots without impacting cluster performance
Snapshot and restore operations run in the background, so NFS network latency does not affect Elasticsearch's real-time operations
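As a minimal sketch of how this looks in practice (the repository name, mount path, and snapshot name below are hypothetical), you would whitelist the NFS mount point in elasticsearch.yml on every node, then register it as a shared file system ("fs") repository and take a snapshot:

```shell
# elasticsearch.yml on every node: whitelist the NFS mount point
# path.repo: ["/mnt/es-snapshots"]

# Register the NFS mount as a shared file system ("fs") snapshot repository
curl -X PUT "localhost:9200/_snapshot/my_nfs_repo" \
  -H 'Content-Type: application/json' -d '
{
  "type": "fs",
  "settings": {
    "location": "/mnt/es-snapshots"
  }
}'

# Take a point-in-time snapshot of selected indices; it runs in the background
curl -X PUT "localhost:9200/_snapshot/my_nfs_repo/nightly-snap-1?wait_for_completion=false" \
  -H 'Content-Type: application/json' -d '
{
  "indices": "logs-*"
}'
```

Note that the NFS share must be mounted at the same path on every master and data node, otherwise repository verification will fail.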
2. Searchable Snapshots: Cost Optimization and Efficient Archiving
For the cold and frozen tiers, Elasticsearch introduces searchable snapshots, allowing searches to be performed directly on snapshots without restoring them first. Here, NFS provides significant advantages:
Searchable snapshots are accessed in read-only mode, avoiding the locking and consistency issues of using NFS for live data
Local storage savings: searchable snapshots eliminate the need to keep full local copies of indices, reducing storage costs
On-demand access: searchable snapshot data is fetched only when queried, making NFS latency acceptable for infrequently accessed data
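As an illustration (the repository, snapshot, and index names are again hypothetical), an index stored in an NFS-backed repository can be mounted as a frozen-tier searchable snapshot with the _mount API:

```shell
# Mount an index from the snapshot as a frozen-tier searchable snapshot.
# storage=shared_cache keeps only a local cache of recently read data;
# the authoritative copy stays in the NFS repository.
curl -X POST "localhost:9200/_snapshot/my_nfs_repo/nightly-snap-1/_mount?storage=shared_cache" \
  -H 'Content-Type: application/json' -d '
{
  "index": "logs-2024.01",
  "renamed_index": "logs-2024.01-frozen"
}'
```

Once mounted, the index is queried like any other index; reads only hit the NFS repository on local cache misses, which is why the higher latency is tolerable here.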
Want to know more about snapshots? Check out the snapshot and restore section of the Elastic documentation.
Using NFS as primary storage for Elasticsearch is highly discouraged due to latency, locking, consistency, and scalability issues. However, NFS is an excellent solution for managing snapshots and searchable snapshots, providing a reliable backup strategy and efficient long-term data management in cold and frozen tiers.
If you’re deciding where and how to use NFS in your Elasticsearch cluster, use it for snapshots—but never for live indices!
Author
Matteo Cipolletta
I'm an IT professional with a strong knowledge of Security Information and Event Management solutions.
I have proven experience in multiple Enterprise contexts with managing, designing, and administering Security Information and Event Management (SIEM) solutions (including log source management, parsing, alerting and data visualizations), its related processes and on-premises and cloud architectures, as well as implementing Use Cases and Correlation Rules to enable SOC teams to detect and respond to cyber threats.