Creating analytics dashboards is fairly simple these days, using either Grafana or the ELK Stack.

But the difficult part is making the data speak for us and taking remedial actions!

Usually it's very important to close the feedback loop with the same system that generates the events.

The sustainability of business processes, systems, applications and infrastructure often depends on how quickly one can detect abnormalities and threshold violations, and take action in an automated manner.

Let’s call this concept C(A-cube): Continuous Automated Actionable Analytics!


Continuous Actionable Analytics Process:

  • Define all the business policies (metrics to be measured, violation levels, thresholds, score values) in MongoDB or Couchbase
  • Process the Events
    • Extract the events from the incoming streams (Ad-hoc)
      • Logstash -> Spark-Elasticsearch is the best choice
      • Logstash (or a Kafka client) -> Kafka -> Spark is another choice if multiple sources are pushing data
      • Logstash -> Elasticsearch Percolator is a great option for computing threshold violations quickly
    • Expose an API to dynamically trigger a query (Scheduled)
      • For historical analysis or scheduled event queries, periodically fire ES / Spark queries
  • Execute Policies
    • The business policies should be fetched from the NoSQL store based on the tenant key and composite identifier
    • Accordingly, construct an Elasticsearch query payload or a Spark RDD query
    • As soon as violations are caught, notify end users and persist the violations in both Elasticsearch and the NoSQL store
    • Analyze the textual data in the event stream (fuzzy search, nearest terms, NLP entity extraction)
  • Generate Notifications
    • Notifications should have a flexible JSON structure (templates retrieved from the NoSQL store) and should be able to communicate with other systems and users via notification channels
    • Record error conditions in a KnowledgeBase (NoSQL document)
    • Note that the ES store is used for historical analysis through the Kibana dashboard
  • Monitor Notifications
    • Look up violation info from the NoSQL store in the context of the current event
    • If the number of violations breaches the threshold (as defined in the policy), find the remediation actions to be executed
    • Detect Anomalies
    • Retrieve the recommendations, suggested fixes and other remedial messages from the KnowledgeBase
    • To keep the KnowledgeBase updated, send messages to developers or business owners so that missing resolutions or remedial activities can be fed back into it
  • Take Actions
    • For system issues (e.g. server / memory issues), restart the process; for process issues, kill or disable jobs (if they fail all compliance checks); for business events, generate remedial events
    • Escalation and wider alerts
    • Historically, it has been observed that taking remedial actions reduces the possibility of future violations.
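
To make the loop a little more concrete, here is a minimal Python sketch of the "Execute Policies", "Monitor Notifications" and "Take Actions" steps. Everything named here is an assumption for illustration only: the `Policy` fields, the `tenant_key` and `cpu_pct` event fields, the `restart_process` action name and the 15-minute window are all hypothetical, and the in-memory list merely stands in for the NoSQL store. The query payload uses the standard Elasticsearch bool / term / range query shape, but it is only built here, not sent to a cluster. The sketch also treats "breaches threshold" as "the violation count reaches `max_violations`", which is one possible reading.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Policy:
    """Hypothetical business policy, as it might be stored per tenant in NoSQL."""
    tenant_key: str
    metric: str            # name of the event field to measure
    threshold: float       # an event above this value counts as a violation
    max_violations: int    # violation count at which remediation is triggered
    action: str            # name of the remedial action to run


def build_es_query(policy: Policy, window: str = "now-15m") -> dict:
    """Build an Elasticsearch-style query payload for events violating the policy."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"tenant_key": policy.tenant_key}},
                    {"range": {policy.metric: {"gt": policy.threshold}}},
                    {"range": {"@timestamp": {"gte": window}}},
                ]
            }
        }
    }


def is_violation(policy: Policy, event: dict) -> bool:
    """Check a single streamed event against the policy threshold."""
    return (
        event.get("tenant_key") == policy.tenant_key
        and event.get(policy.metric, 0) > policy.threshold
    )


class ViolationMonitor:
    """Tracks violations for one policy and decides when to act."""

    def __init__(self, policy: Policy) -> None:
        self.policy = policy
        self.violations: list = []  # stand-in for the NoSQL / ES violation store

    def process(self, event: dict) -> Optional[str]:
        """Record a violating event; return the action name once the count is reached."""
        if not is_violation(self.policy, event):
            return None
        self.violations.append(event)
        if len(self.violations) >= self.policy.max_violations:
            return self.policy.action
        return None


if __name__ == "__main__":
    policy = Policy("acme", "cpu_pct", threshold=90.0, max_violations=2,
                    action="restart_process")
    monitor = ViolationMonitor(policy)
    for event in ({"tenant_key": "acme", "cpu_pct": 95},
                  {"tenant_key": "acme", "cpu_pct": 97}):
        action = monitor.process(event)
        if action:
            print("remedial action:", action)
```

In a real deployment the returned action name would be handed to whatever executes restarts, job kills or remedial events, and the violation list would be persisted rather than kept in memory.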