- Centralized ways of getting insights from application to infrastructure
- You can diagnose, trace and debug issues
- Uses ML to detect anomalies and reveal hidden patterns
- Track how customers interact with the application
- Components: Alerts, metrics, action groups, monitoring & reporting, dashboard, logs
- High level view
- Collects data from: Application, operating system, resources, subscription, tenant
- Populates stores: Metrics & logs
- Perform functions:
- Insights: Application, container, VM, monitoring solutions
- Visualize: Dashboards, views, Power BI, Workbooks
- Analyze: Metrics Explorer, Log Analytics
- Respond: Alerts, Autoscale
- Integrate: Event Hubs, Logic Apps, Ingest & Export APIs
- Notifies when important conditions are found in the monitoring data
- Flow of alerts:
- Alert Rule [Target Resource (Signal) -> Criteria (Logic Test)] ->
- Action Group (Actions to do)
- Monitor condition (Alert State)
- Each alert rule has exactly one of each of the following properties:
- Target resource
- Scope & signals for alerting.
- E.g. VM
- Signal
- Emitted by target resource
- Can be a metric, an activity log, an Application Insights signal, or a log.
- Criteria
- Combination of signal and logic applied on target resources.
- E.g. less than X CPU usage.
- Logic
- User-defined logic to verify that signal is within expected range/values.
- E.g. less than 30% CPU usage.
- Alert name
- Alert description
- Severity
- Severity assigned to the alert once the criteria specified in the alert rule are met.
- Can range from 0 to 4.
- Action
- Specific action taken when the alert is fired.
- You can alert on:
- Metric values
- Log search queries
- Health of underlying Azure platform
- More..
- State of alerts:
- New: Created or fired
- Acknowledged: Issue is reviewed.
- Closed: Issue has been resolved.
- Can be reopened by changing its state.
- User changes state from New.
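The alert flow above (rule -> action group -> alert state) can be modeled in a few lines. This is a minimal sketch, not the Azure SDK: the class names `AlertRule`, `Alert` and `AlertState`, the resource name `vm-demo`, and the signal name are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class AlertState(Enum):
    NEW = "New"                    # created or fired
    ACKNOWLEDGED = "Acknowledged"  # issue is reviewed
    CLOSED = "Closed"              # issue has been resolved


@dataclass
class AlertRule:
    name: str
    description: str
    target_resource: str            # hypothetical resource name, e.g. a VM
    signal: str                     # emitted by the target resource
    logic: Callable[[float], bool]  # returns True when the criteria are met
    severity: int                   # 0 to 4
    action_group: List[Callable[[str], None]] = field(default_factory=list)

    def __post_init__(self):
        if not 0 <= self.severity <= 4:
            raise ValueError("severity must be between 0 and 4")

    def evaluate(self, value):
        """Fire the action group and return a New alert if the criteria are met."""
        if not self.logic(value):
            return None
        message = (f"[Sev {self.severity}] {self.name}: "
                   f"{self.signal}={value} on {self.target_resource}")
        for action in self.action_group:
            action(message)
        return Alert(rule=self)


@dataclass
class Alert:
    rule: AlertRule
    state: AlertState = AlertState.NEW

    def set_state(self, new_state):
        # The user changes the state; a closed alert can be reopened the same way.
        self.state = new_state


sent = []  # stands in for a notification channel
rule = AlertRule(
    name="low-cpu",
    description="CPU usage dropped below 30%",
    target_resource="vm-demo",
    signal="Percentage CPU",
    logic=lambda v: v < 30,  # criteria from the notes: less than 30% CPU usage
    severity=3,
    action_group=[sent.append],
)
alert = rule.evaluate(12.5)
quiet = rule.evaluate(80.0)
```

Here `alert` starts in the `New` state and one message lands in `sent`, while `quiet` is `None` because the criteria were not met.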
- Diagnostic Logs
- Non-compute resources: Resource metrics
- Compute resources: Guest OS (e.g. syslog for Linux, event logs for Windows)
- Azure Monitoring Agents
- Azure Diagnostics Extension (cloud only)
- Windows Server and Linux
- Useful for basic resource-level monitoring
- Deployed automatically to VM when you enable it.
- Boot diagnostics (serial console)
- Log Analytics Agent (hybrid solution)
- Can collect logs from Azure and on-prem systems into the same workspace.
- Application Logs
- Trace event streams
- Programmed in application itself.
- Application Insights
- Instrumentation tool
- HTTP requests
- Dependency Calls (to e.g. SQL, external services, background services)
- Activity Logs
- Azure infrastructure logs
- E.g.
- Who created VM?
- Who configured this VNet?
- Traffic stream from NSG?
- Can be sent to: Log Analytics, Event Hubs, Azure Storage
- Flow logs are handled by NSGs.
- Plot using
- Built-in Azure tool: Network Watcher
- Power BI
- Can be reached through the Cost analysis blade of the desired scope.
- In Cost analysis you can filter by Tags.
- Cost Management shows organizational cost and usage patterns with advanced analytics
- Reports show your internal and external costs for usage and Azure Marketplace charges
- You can automate periodic export of your cost data
- 💡 You can also see daily usage data in Azure Account Center -> Billing history -> Current period -> Download usage
- Data is consumed by other Azure resources
- Predictive analytics are also available.
- Collected at one-minute frequency
- Uniquely identified in a namespace.
- 💡 Stored for 93 days
- Collected in Azure metrics database (time series database)
- 💡 Copy to Log Analytics for long term storage
- Holds value properties: Time, Type, Resource, Value, Multiple Dimensions
- Value:
- Health of application: can help to identify root cause.
- Valuable when combined with other metrics.
- Sources of metrics:
- Platform metrics
- Each resource provides
- Visibility into health and performance
- Application metrics
- Generated by application insights
- Detect performance issues & track trends
- Custom metrics
- ❗ Must be created in same region as the resource that has the metrics
- Use-cases: Metrics explorer, Metric Alert Rule, Auto Scale, Route & Stream, Archive, Access
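The metric properties listed above (time, type, resource, value, dimensions) and the 93-day retention can be pictured as a small record type. This is an illustrative sketch only: `MetricPoint` and `is_retained` are hypothetical names, not part of any Azure library, and the sample values are made up.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Dict

RETENTION = timedelta(days=93)  # retention period stated in the notes


@dataclass(frozen=True)
class MetricPoint:
    time: datetime
    type: str        # metric name, uniquely identified within its namespace
    namespace: str
    resource: str
    value: float
    dimensions: Dict[str, str] = field(default_factory=dict)


def is_retained(point, now):
    """True while the point is still inside the 93-day retention window."""
    return now - point.time <= RETENTION


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
recent = MetricPoint(
    time=now - timedelta(days=10),
    type="Percentage CPU",
    namespace="demo/namespace",   # hypothetical namespace
    resource="vm-demo",           # hypothetical resource
    value=42.0,
    dimensions={"instance": "0"},
)
old = MetricPoint(
    time=now - timedelta(days=100),
    type="Percentage CPU",
    namespace="demo/namespace",
    resource="vm-demo",
    value=17.0,
)
```

A point like `old` would need to have been copied to Log Analytics beforehand to remain queryable.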
- ITSM
- IT Service Management
- Helps to design, plan, deliver, operate, and control information technology (IT) services
- Azure ITSM Connector:
- Bi-directional connection layer between Azure and your ITSM tool(s)
- Use cases:
- Create ITSM work items based on Azure alerts.
- Sync ITSM incident/change request data to Azure.
- SIEM
- Security information and event management
- Example: Splunk (there's an open source add-on to send to Event Hubs)
- Name: Unique identifier
- Action type
- Voice call or SMS
- ❗ Up to 10 SMS / voice call actions in an action group.
- ❗ No more than 1 SMS / Voice call every 5 minutes.
- Webhook
- ❗ Up to 10 Webhook call actions in an action group.
- Retries up to 2 times: first after 10 seconds, then after 100 seconds.
- Logic App
- ❗ Up to 10 logic app actions in an action group.
- Automation runbook
- ❗ Up to 10 Runbook actions in an action group.
- Azure Function
- ITSM
- ❗ Up to 10 ITSM actions in an action group.
- Email
- ❗ Up to 1000 e-mail actions in an action group.
- ❗ No more than 100 emails in an hour.
- Push notification
- Azure App Push
- ❗ Up to 10 Azure app actions in an action group.
- Details: corresponding phone number, email address, webhook URI, or ITSM connection details.
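The per-type limits above lend themselves to a simple validation check. The sketch below is an assumption-laden illustration, not Azure's own validation: the action type keys, the `validate_action_group` helper, and the choice to count SMS and voice calls separately are all hypothetical. Azure Function is omitted from `LIMITS` because the notes give no limit for it.

```python
from collections import Counter

# Limits per action group, taken from the notes above.
LIMITS = {
    "sms": 10,
    "voice": 10,
    "webhook": 10,
    "logic_app": 10,
    "runbook": 10,
    "itsm": 10,
    "email": 1000,
    "azure_app_push": 10,
}

# Webhook retry delays in seconds: first retry after 10, second after 100.
WEBHOOK_RETRY_DELAYS = (10, 100)


def validate_action_group(name, actions):
    """Return a list of violations for (action_type, detail) pairs."""
    violations = []
    if not name:
        violations.append("action group needs a unique name")
    counts = Counter(action_type for action_type, _ in actions)
    for action_type, n in counts.items():
        limit = LIMITS.get(action_type)
        if limit is not None and n > limit:
            violations.append(
                f"{action_type}: {n} actions exceeds the limit of {limit}"
            )
    return violations


ok_group = [("email", "ops@example.com"), ("webhook", "https://example.com/hook")]
too_many_hooks = [("webhook", f"https://example.com/hook/{i}") for i in range(11)]

ok_result = validate_action_group("ops-team", ok_group)
bad_result = validate_action_group("ops-team", too_many_hooks)
```

With these inputs `ok_result` is empty and `bad_result` contains a single webhook violation.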
- Two ways to understand Azure bill:
- Compare usage and costs (invoice) with usage file
- Detailed usage CSV file shows charges & daily usage in billing period
- Download:
- Sign in to the Azure Account Center as the Account Administrator
- Select the subscription for which you want the invoice and usage information
- Select billing history -> Download usage
- Compare the usage and costs with Azure portal
- Subscription -> Cost analysis -> Filter by Timespan
- See estimated costs: Subscription -> Usage and estimated costs
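Comparing the invoice with the usage file comes down to summing the daily charges in the CSV and checking the total against the invoiced amount. The sketch below assumes hypothetical column names (`Date`, `Service`, `Cost`) and made-up figures; the real usage file has its own schema, so the column arguments would need adjusting.

```python
import csv
import io
from collections import defaultdict


def daily_totals(usage_csv_text, date_col="Date", cost_col="Cost"):
    """Sum charges per day from usage CSV text (column names are assumptions)."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(usage_csv_text)):
        totals[row[date_col]] += float(row[cost_col])
    return dict(totals)


def matches_invoice(totals, invoice_total, tolerance=0.01):
    """True when the summed daily usage agrees with the invoice total."""
    return abs(sum(totals.values()) - invoice_total) <= tolerance


# Made-up sample data standing in for a downloaded usage file.
sample = """Date,Service,Cost
2024-01-01,VM,1.50
2024-01-01,Storage,0.25
2024-01-02,VM,1.50
"""

totals = daily_totals(sample)
agrees = matches_invoice(totals, invoice_total=3.25)
```

A mismatch between the two totals points at the days worth inspecting in Cost analysis.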
- Old: OMS, new: Embedded in Azure Monitor as Logs.
- It's a data warehouse for telemetry
- It converts any schema to a table schema that allows you to query.
- Uses KQL (pipe-based) language to query.
- All monitoring roads lead to Azure Log Analytics
- There's always an integration from a logging Azure component to Log Analytics.
- You can download agents in Workspace -> Connect
- Agents do not require VPN
- System Center Operations Manager can send data to Log Analytics from cloud/on-prem servers.
- Azure Data Explorer: the same query language (KQL) is used there to write queries and view results.
- Alert rule
- Each query runs at regular intervals, and its results are evaluated to decide whether to trigger an alert.
- Target: Specific Azure resource
- Criteria: Specific logic to trigger an action
- Log alerts are alerts whose signal is a custom query run against Log Analytics.
- Action: Call to send a notification
- Set-up in Log Analytics -> Alerts
- Export: Excel, PowerBI
- Application Insights data is stored in a separate partition in Log Analytics.
- E.g. requests, traces, usages
- Allows you to run cross-application queries
- Function: Queries can be saved as functions to be used within another query.
- Requires log analytics workspace
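To make the pipe-based query idea and the log alert evaluation concrete, the sketch below mimics both over in-memory rows. It is plain Python, not KQL and not a Log Analytics client: `where`, `count`, `pipe` and `log_alert_fires` are hypothetical helpers, and the rows are invented. The pipeline is roughly analogous in shape to a KQL query such as `requests | where resultCode == "500" | count`.

```python
from functools import reduce


def where(predicate):
    """Stage that keeps only the rows matching the predicate."""
    return lambda rows: [r for r in rows if predicate(r)]


def count():
    """Stage that collapses the rows into a single row holding their count."""
    return lambda rows: [{"Count": len(rows)}]


def pipe(rows, *stages):
    """Feed the rows through each stage in order, like a pipe-based query."""
    return reduce(lambda acc, stage: stage(acc), stages, rows)


# A saved "function": a reusable list of stages used inside other queries.
failed_requests = [where(lambda r: r["resultCode"] == "500")]


def log_alert_fires(rows, stages, threshold):
    """Evaluate the query result against the alert criteria."""
    result = pipe(rows, *stages, count())
    return result[0]["Count"] > threshold


# Invented rows standing in for a requests table.
requests_table = [
    {"name": "GET /", "resultCode": "200"},
    {"name": "GET /api", "resultCode": "500"},
    {"name": "POST /api", "resultCode": "500"},
]

failed_count = pipe(requests_table, *failed_requests, count())
fires = log_alert_fires(requests_table, failed_requests, threshold=1)
```

In a real log alert the evaluation would repeat at the configured interval, and a `True` result would trigger the action group.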
- Baseline
- Configuration management term
- Signifies an agreed-upon description of product attributes, per unit time, which serves as a basis for defining change.
- 💡 Developing a baseline is not just recommended but mandatory for the team.
- Gather diagnostics for long enough time.
- Capture all peaks and values over ordinary usage.
- Enable streams and create baseline
- Analyze the results and agree on which performance ranges are acceptable in order to define SLAs.
- Helps to isolate the problem
- Baselining in Azure
- Continuous monitoring
- Normal operational parameters
- Alerts on deviations
- Take proactive corrective actions
- Baseline actions
- Enable diagnostics monitoring and telemetry, e.g.:
- Azure IaaS resources
- Azure App Service apps
- Creating performance baselines
- Analyze diagnostics output
- Plot metrics
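One way to turn gathered diagnostics into a baseline is to derive an acceptable range from historical samples and flag anything outside it. The sketch below assumes a mean plus or minus three standard deviations band; that rule, the helper names, and the sample CPU values are illustrative choices, not something Azure prescribes.

```python
from statistics import mean, stdev


def build_baseline(samples, k=3.0):
    """Return (low, high) as mean +/- k standard deviations of the samples."""
    mu = mean(samples)
    sigma = stdev(samples)
    return (mu - k * sigma, mu + k * sigma)


def deviations(values, baseline):
    """Return the values that fall outside the agreed baseline range."""
    low, high = baseline
    return [v for v in values if v < low or v > high]


# Invented CPU percentages gathered during ordinary usage.
history = [40, 42, 38, 41, 39, 40, 43, 37]
baseline = build_baseline(history)

# New readings to check against the baseline.
outliers = deviations([41, 95, 36, 20], baseline)
```

Readings returned in `outliers` are the candidates for alerts on deviations and for proactive corrective action.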