Skip to content

Latest commit

 

History

History
228 lines (206 loc) · 8.76 KB

File metadata and controls

228 lines (206 loc) · 8.76 KB

Monitoring

Azure Monitor

  • Centralized ways of getting insights from application to infrastructure
  • You can diagnose, trace and debug issues
  • Uses ML to detect anomolies and reveal hidden patterns
  • Track how customers interact with the application
  • Components: Alerts, metrics, action groups, monitoring & reporting, dashbaord, logs
  • High level view
    • Collects data from: Application, operating system, resources, subcription, tenant
    • Populates stores: Metrics & logs
    • Perform functions:
      • Insights: Application, container, VM, monitoring solutions
      • Visualize: Dashboards, views, Power BI, Workbooks
      • Analyze: Metrics Explorer, Log Analytics
      • Respond: Alerts, Autoscale
      • Integrate: Event Hubs, Logic Apps, Ingest & Export APIs

Alerts

  • Notifies when important conditions are found in the monitoring data
  • Flow of alerts:
    • Alert Rule [Target Resource (Signal) -> Criteria (Logic Test)] ->
      • Action Group (Actions to do)
      • Monitor condition (Alert State)
  • Alert rules have single of each properties:
    • Target resource
      • Scope & signals for alerting.
      • E.g. VM
    • Signal
      • Emitted by target resource
      • Can be metrics, activity log, application insights and log.
    • Criteria
      • Combination of signal and logic applied on target resources.
      • E.g. less than X CPU usage.
    • Logic
      • User-defined logic to verify that signal is within expected range/values.
      • E.g. less than 30% CPU usage.
    • Alert name
    • Alert description
    • Severity
      • Alert once the criteria specified in the alert.
      • Can range from 0 to 4.
    • Action
      • Specific action taken when the alert is fired.
  • You can alert on:
    • Metric values
    • Log search queries
    • Health of underlying Azure platform
    • More..
  • State of alerts:
    • New: Created or fired
    • Acknowledged: Issue is reviewed.
    • Closed: Issue has been resolved.
      • Can be reopened by changing its state.
    • User changes state from New.

Log types

  • Diagnostic Logs
    • Non-compute resources: Resource metrics
    • Compute resources: Guest OS (e.g. syslog for Linux, event logs for Windows)
    • Azure Monitoring Agents
      • Azure Diagnostics Extension (cloud only)
        • Windows Server and Linux
        • useful for basic resource-level monitoring
        • Deployed automatically to VM when you enable it.
        • Boot diagnostics (serial console)
      • Log Analytics Agent (hybrid solution)
        • Can collect logs from Azure & on-prem systems to same namespace.
  • Application Logs
    • Trace event streams
    • Programmed in application itself.
    • Application Insights
      • Instrumentation tool
      • HTTP requests
      • Dependency Calls (to e.g. SQL, external services, background services)
  • Activity Logs
    • Azure infrastructure logs
    • E.g.
      • Who created VM?
      • Who configured this VNet?
      • Traffic stream from NSG?
    • Can be sent to: Log Analytics, Event Hubs, Azure Storage

NSG (Network Security Group) Flow logging

  • Flow logs handled by NSG's.
  • Plot using
    • In-built Azure plotting tool Network Watcher
    • Power BI

Azure Cost Management

  • Can be eached through Cost analysis blade of desired scope.
  • In Cost analysis you can filter by Tags.
  • Cost Management shows organizational cost and usage patterns with advanced analytics
  • Reports show your internal and external costs for usage and Azure Marketplace charges
  • You can automate periodically export of your costs
    • 💡 You can also see daily usage data in Azure Account Center -> Billing history -> Current period -> Download usage
  • Data is consumed by other Azure resources
  • Predictive analytics are also available.

Metrics

  • Collected one-minute frequency
  • Uniquely identified in a namespace.
  • 💡 Stored for 93 days
    • Collected in Azure metrics database (time series database)
    • 💡 Copy to Log Analytics for long term storage
  • Holds value properties: Time, Type, Resource, Value, Multiple Dimensions
  • Value:
    • Health of application: can help to identify route cause.
    • Valuable when combined with other metrics.
  • Sources of metrics:
    • Platform metrics
      • Each resource provides
      • Visibility into health and performance
    • Application metrics
      • Generated by application insights
      • Detect performance issues & track trends
    • Custom metrics
      • ❗ Must be created in same region as the resource that has the metrics
  • Use-cases: Metrics explorer, Metric Alert Rule, Auto Scale, Route & Stream, Archive, Access

Third party tools

  • ITSM
    • IT as a Service
    • Helps to design, plan, deliver, operate, and control information technology (IT) services
    • Azure ITSM Connector:
      • Bi-directional connection layer between and your ITSM tool(s)
      • Use cases:
        • Create ITSM work items based on Azure alerts.
        • Sync ITSM incident/change request data to Azure.
  • SIEM
    • Security information and event management
    • Example: Splunk (there's an open source add-on to send to Event Hubs)

Action groups

  • Name: Unique identifier
  • Action type
    • Voice call or SMS
      • ❗ Up to 10 SMS / voice call actions in an action group.
      • ❗ No more than 1 SMS / Voice call every 5 minutes.
    • Webhook
      • ❗ Up to 10 Webhook call actions in an action group.
      • It'll retry 2 times: first after 10, then 100 seconds.
    • Logic App
      • ❗ Up to 10 logic app actions in an action group.
    • Automation runbook
      • ❗ Up to 10 Runbook actions in an action group.
    • Azure Function
    • ITSM
      • ❗ Up to 10 ITSM actions in an action group.
    • Email
      • ❗ Up to 1000 e-mail actions in an action group.
      • ❗ No more than 100 emails in an hour.
    • Push notification
      • Azure App Push
      • ❗ Up to 10 Azure app actions in an action group.
  • Details: corrospending phone number, email address, webhook URI, or ITSM conenction details.

Monitoring and reporting on spend

  • Two ways to understand Azure bill:
    1. Compare usage and costs (invoice) with usage file
      • Detailed usage CSV file shows charges & daily usage in billing period
        • Download:
          1. Sign into the Azure account Center as the Account Administrator
          2. Select the subscription for which you want the invoice and usage information
          3. Select billing history -> Download usage
        • Select billing history
    2. Compare the usage and costs with Azure portal
      • Subscription -> Cost analysis -> Filter by Timespan
  • See estimated costs: Subscription -> Usage and estimated costs

Log Analytics

  • Old: OMS, new: Embedded in Azure Monitor as Logs.
  • It's a dataware house for telemetry
    • It converts any schema to a table schema that allows you to query.
      • Uses KQL (pipe-based) language to query.
  • All monitoring roads lead t o Azure Log Analytics
    • There's always an integaration from an logging Azure component to Log Analytics.
  • You can download agents in Workspace -> Connect
    • Agents do not require VPN
    • System Center Operations Manager can send data to Log Analytics from cloud/on-prem servers.
  • Azure Data Explorer -> Query language is used & viewed
  • Alert rule
    • Based on each query that run on regular intervals, results are evaulated to trigger an alert.
    • Target: Specific Azure resource
    • Criteria: Specific logic to trigger an action
      • Log Alerts describes where signal is custom query based on Log Analytics
    • Action: Call to send a notification
    • Set-up in Log Analytics -> Alerts
  • Export: Excel, PowerBI
  • Application Insights data is used in a different partition in Log Analytics.
    • E.g. requests, traces, usages
    • Allows you to cross application queries
  • Function: Queries can be saved as functions to be used within another query.
  • Requires log analytics workspace

Create performance baselines

  • Baseline
    • Configuration management term
    • Signifies an agreed-upon description of product attributes, per unit time, which serves as a basis for defining change.
    • 💡 It's not only recommended but mandatory for team to develop a baseline.
      • Gather diagnostics for long enough time.
        • Capture all peaks and values over ordinary usage.
        • Enable streams and create baseline
      • Even analyze those and agree upon which performance ranges are acceptable to define SLA's.
      • Helps to isolate pr oblem
  • Baselining in Azure
    1. Continous monitoring
    2. Normal operational parameters
    3. Alerts on deviations
    4. Take proactive corrective actions
  • Baselines actions
    • Enable diagnostics monitoring and telemetry, e.g.:
      • Azure IaaS resources
      • Azure App Service apps
    • Creating performance baselines
      • Analyze diagnostics output
      • Plot metrics