Elastic{ON} 2016: What's Next
That’s a wrap on SignalFx’s first visit to Elastic{ON}. We spent three days giving demos on monitoring modern infrastructure, talking to customers and prospects about alerting and getting higher performance from the Elasticsearch stack, and, of course, giving away snazzy socks.
We heard tons of compelling use cases from Elasticsearch practitioners in the wild. Like how the Mayo Clinic enriches its event data prior to storage and indexing to make it easier for physicians to match symptoms against past interventions and determine the best course of action with what-if scenarios. Cisco’s Talos security analytics team uses Elasticsearch for pattern matching on persistent threat behavior to identify and take down hackers like the SSHPsychos (a.k.a. Group 93). In a complementary use case, FireEye evolved its Elasticsearch deployment’s storage, speed, and performance to support customers’ more complex queries and surface more advanced security threats. And companies like HotelTonight and Eventbrite use Elasticsearch to build better discovery and recommendation engines for an improved customer experience.
Along with these business cases, we loved having the chance to discuss the operations that underlie Elasticsearch availability and growth, and how it comes together with the rest of the services used to build today’s apps. We know from our own experience the important role Elasticsearch plays in modern infrastructure, providing easy integration and setup with a great API. We’ve also found that scaling, alerting on, and capacity planning for Elasticsearch can sometimes be challenging.
Based on our conversations with hundreds of Elasticsearch users in the SignalFx booth last week, we’ve learned that pre-built dashboards and an automatic cluster-level readout of performance and resource availability—days left of disk space, query latency, top clusters by index growth—are key to getting a fast start towards running Elasticsearch in production. From there, you can dive deeper to explore your implementation by index, shard, or node, even proactively determining when and how to reshard with zero downtime (hint: add a “generation” concept to documents, as sketched below).
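Since that hint is terse, here is a minimal sketch of one way the idea can play out, written in Python with the requests library against Elasticsearch’s REST API. The index names (events_v1, events_v2), the events alias, the shard counts, and the generation field are all hypothetical, and the backfill mechanics will depend on your cluster and client; this just shows the overall shape, not our exact implementation.

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster
OLD, NEW, ALIAS = "events_v1", "events_v2", "events"  # hypothetical names

# 1. Create the new index with the target shard count.
requests.put(f"{ES}/{NEW}", json={
    "settings": {"number_of_shards": 24, "number_of_replicas": 1}
})

# 2. Write new documents to NEW tagged with a bumped generation, while a
#    backfill job copies the OLD documents (generation 1) across.
#    The generation field is our illustrative take on the hint above.
doc = {"message": "checkout latency spike", "generation": 2}
requests.post(f"{ES}/{NEW}/event", json=doc)

# 3. Once the backfill catches up, swap the read alias in one atomic
#    action so queries cut over to the resharded index with no gap.
requests.post(f"{ES}/_aliases", json={
    "actions": [
        {"remove": {"index": OLD, "alias": ALIAS}},
        {"add": {"index": NEW, "alias": ALIAS}},
    ]
})
```

Because the alias swap is a single atomic action, readers never see a window where neither index answers queries, which is what makes the cutover zero-downtime.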
We also heard loud and clear that intelligent alerting is high on most people’s wish lists. Alerting on Elasticsearch at the node level can be painful, especially when you’re dealing with a service sitting on cloud infrastructure. With SignalFx, we’ve been able to eliminate noise by alerting exactly once, for example, on cluster health that not only passes a critical threshold, but also meets a duration condition beyond the typical self-recovery period (instead of creating an alert for every node that reports yellow or red).
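To make the duration condition concrete, here is a minimal polling sketch in Python against Elasticsearch’s _cluster/health endpoint. This is not how SignalFx’s streaming analytics are implemented; it only illustrates the logic. The five-minute grace period, the polling interval, and the print-based alert action are all assumptions.

```python
import time
import requests

ES = "http://localhost:9200"   # assumed local cluster
GRACE_SECONDS = 300            # assumed self-recovery window: 5 minutes
POLL_SECONDS = 30              # assumed polling interval

unhealthy_since = None
alerted = False

while True:
    # _cluster/health reports overall status: green, yellow, or red.
    status = requests.get(f"{ES}/_cluster/health").json()["status"]
    if status == "green":
        unhealthy_since, alerted = None, False  # reset once the cluster recovers
    else:
        unhealthy_since = unhealthy_since or time.monotonic()
        sustained = time.monotonic() - unhealthy_since
        # Fire exactly once, and only after the condition has outlasted
        # the window in which clusters typically recover on their own.
        if sustained > GRACE_SECONDS and not alerted:
            print(f"ALERT: cluster {status} for {sustained:.0f}s")
            alerted = True
    time.sleep(POLL_SECONDS)
```

The key design choice is the single alerted flag: the alert fires once per incident and re-arms only after the cluster returns to green, rather than firing for every node that flaps yellow or red.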
At SignalFx, every engineer or team that writes a service also operates that service—running upgrades, instrumenting code, monitoring and alerting, establishing SLOs, performing maintenance, and so on. The challenges we face are common to anyone who runs Elasticsearch at any scale. Mahdi Ben Hamida, who oversees the search and metadata persistence layers of SignalFx, has been sharing his insights and experiences monitoring Elasticsearch both at Elastic{ON} and in a recent series of blog posts.
Join Mahdi for a live webinar this Thursday, February 25, at 10am PST to hear his lessons learned from scaling Elasticsearch from six shards to 24 (plus replicas) across 72 machines holding many hundreds of millions of documents.
Read the original post on the SignalFx blog.