Dish Monitoring Enhancements

What we heard from you, and what we plan to do about it

Phase 1

The Issue

Currently, LRF signal monitoring and equipment statuses are not tracked in one centralized place.

As a result, when issues arise, team members have to make phone calls and send emails to other teams to get in sync about which equipment is having issues.

Additionally, the current LRF map data visualization (Dish Maps) does not show the current LRF operational status or more granular LRF system statuses.

The Solution

We will gather channel statuses from the VL Pro Evertz system, LRF encoder statuses via the NMX system, LRF router statuses via the Solar Winds system, and the remaining LRF equipment statuses via the Miranda LCS system using SNMP requests.

We will feed LRF data into Dish Maps to give a bird's-eye view of LRF statuses, changing each LRF's color to indicate its operational status.

When an LRF is clicked within Dish Maps, we will show all available equipment statuses within that LRF to enable quick troubleshooting.
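
As a rough illustration of how statuses gathered from several systems could roll up into one color per LRF, here is a minimal Python sketch. It assumes every status record already carries an LRF identifier (still an open question for this project), and the `Severity` levels, the color names, and all class and function names are hypothetical placeholders rather than anything defined by the source systems.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    # Assumed ordering: a higher value means a worse status.
    OK = 0
    WARNING = 1
    CRITICAL = 2


# Hypothetical color mapping for the map view.
COLOR_BY_SEVERITY = {
    Severity.OK: "green",
    Severity.WARNING: "yellow",
    Severity.CRITICAL: "red",
}


@dataclass(frozen=True)
class EquipmentStatus:
    lrf_id: str         # LRF site the equipment belongs to (assumed available)
    source: str         # e.g. "VL Pro", "NMX", "Solar Winds", "LCS"
    equipment: str      # equipment or channel name
    severity: Severity


def lrf_colors(statuses):
    """Return {lrf_id: color}, using the worst severity seen for each LRF."""
    worst = {}
    for status in statuses:
        current = worst.get(status.lrf_id, Severity.OK)
        worst[status.lrf_id] = max(current, status.severity)
    return {lrf_id: COLOR_BY_SEVERITY[sev] for lrf_id, sev in worst.items()}


statuses = [
    EquipmentStatus("LRF-A", "NMX", "encoder-1", Severity.OK),
    EquipmentStatus("LRF-A", "Solar Winds", "router-1", Severity.CRITICAL),
    EquipmentStatus("LRF-B", "LCS", "mux-1", Severity.WARNING),
]
colors = lrf_colors(statuses)
```

Taking the worst severity per LRF keeps the map view conservative: one failing device is enough to flag the site, and the per-equipment detail remains available in the click-through view.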

The Issue

For each LRF, we need to show executive-level data points: DMAs affected, services affected, and call volume generated.

This information cannot be visible to all team members.

We should show this executive-level information by default to Dish Maps users who are authorized to view it, so they don't have to sift through equipment statuses before seeing the data relevant to them.

The Solution

When an executive-level user clicks on an LRF, they will by default see a view with the data points that are useful to them, rather than the equipment statuses.

We will work with the Tech Ops team to gather call volume information programmatically.

We will work with Randy and Glen to programmatically gather information about which DMAs are related to a particular LRF site, integrating with their system once the source LRF logic has been built into Elvis.
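
One way the default view could be chosen is sketched below in Python. The `"executive"` role name, the field names, and the shape of the LRF record are all assumptions made for illustration; the actual authorization model has not been decided.

```python
from dataclasses import dataclass, field

# Hypothetical field names for the executive-level data points.
EXECUTIVE_FIELDS = ("dmas_affected", "services_affected", "call_volume")


@dataclass
class User:
    name: str
    roles: set = field(default_factory=set)


def default_lrf_view(user, lrf_data):
    """Pick the view shown when a user clicks an LRF.

    Executive-level fields are only copied into the response for users
    holding the (assumed) "executive" role.
    """
    if "executive" in user.roles:
        data = {key: lrf_data[key] for key in EXECUTIVE_FIELDS}
        return {"view": "executive", "data": data}
    return {
        "view": "equipment",
        "data": {"equipment_statuses": lrf_data["equipment_statuses"]},
    }


lrf_data = {
    "dmas_affected": ["DMA-1", "DMA-2"],
    "services_affected": 14,
    "call_volume": 320,
    "equipment_statuses": {"encoder-1": "operational"},
}
exec_view = default_lrf_view(User("exec", {"executive"}), lrf_data)
tech_view = default_lrf_view(User("tech"), lrf_data)
```

Filtering on the server side, rather than hiding fields in the UI, means the restricted data never reaches a browser that is not authorized to see it.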

Dish Maps LRF View

Dish Maps LRF View after clicking an LRF (executive-level data points will be shown here by default for executive-level users)

The Issue

NMX servers can be taken down and new ones stood up at any time without other teams' knowledge.

Currently, the server information is not stored anywhere that code can access. As a result, code that depends on NMX has to be updated manually whenever a previously available NMX server turns out to be unavailable or has been updated to handle different data than before.

Also, the compression team currently has no way of knowing how NMX failure alerts relate to particular services (physical or virtual equipment) or how those services were affected by a particular failure captured in NMX.

The Solution

We will build a page into the new monitoring UI for the compression team to keep their NMX local server information up to date in our database. This will ensure that our monitoring system, as well as other engineering teams, can programmatically determine which server to connect to for particular information.

We will allow manual entry of NMX changes in the new monitoring UI, as well as CSV import/export to make updating the NMX server info easier.

For NMX alerts captured by our system, clicking "View" on a particular alert will display which physical/virtual equipment is affected by that alert and show the status of that equipment.
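
The CSV import/export for NMX server information could look roughly like the Python sketch below, which uses only the standard `csv` module. The column names (`server_name`, `host`, `data_handled`) and the sample host name are hypothetical; the real schema would be agreed with the compression team.

```python
import csv
import io

# Hypothetical columns for the NMX server registry.
FIELDS = ["server_name", "host", "data_handled"]


def import_servers(csv_text):
    """Parse CSV text into {server_name: record}, validating the header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = set(FIELDS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    servers = {}
    for row in reader:
        record = {name: row[name].strip() for name in FIELDS}
        servers[record["server_name"]] = record
    return servers


def export_servers(servers):
    """Serialize the registry back to CSV text, sorted by server name."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for name in sorted(servers):
        writer.writerow(servers[name])
    return out.getvalue()


def server_for(servers, data_kind):
    """Return the host that currently handles a kind of data, or None."""
    for record in servers.values():
        if record["data_handled"] == data_kind:
            return record["host"]
    return None


csv_text = "server_name,host,data_handled\nnmx-1,nmx-1.example,encoder_status\n"
servers = import_servers(csv_text)
host = server_for(servers, "encoder_status")
round_trip = export_servers(servers)
```

With a lookup like `server_for`, code that depends on NMX asks the registry which server to use instead of hard-coding a host, so a server swap only requires a registry update.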

The Issue

LCS equipment details (for example OIDs and MIBs) can change over time without any way to programmatically keep track of those changes.

As a result, code that reads equipment statuses from the LCS system has to be updated manually whenever issues caused by changed equipment IDs or locations are detected.

Also, Mark Painter's team uses LCS data but cannot see what LCS looks like across all DMAs, which would help with their troubleshooting.

The Solution

We will build a page into the new monitoring UI where Mark Painter's team can keep LCS equipment information up to date. CSV import/export will be available to make entering and editing information easier.

This will ensure our monitoring API and other engineering teams can programmatically update LCS equipment metadata and gather information from the correct locations without manually updating their code.

We will also build a dashboard for Mark Painter's team to view information that is useful to them, such as how LCS looks across all DMAs.

The Issue

Alerts and data from multiple monitoring systems should be visible across teams so that every team has visibility into the relevant systems for streamlined troubleshooting.

Email notifications should be configurable so that users can choose which alerts, from which systems, trigger notifications to particular teams or team members.

The Solution

We will build a monitoring UI that shows all of the alert data we gather.

Within the new monitoring UI, there will also be a page where notifications can be configured. Users will be able to select the source system, select an alert type, add keyword search terms, and then select which team(s)/team member(s) receive an email notification when that type of alert comes through.
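
A notification rule of this kind could be modeled as in the Python sketch below. The class names, the sample alert type, and the recipient addresses are hypothetical, and the matching semantics (every keyword must appear in the alert message, case-insensitively, and an empty keyword list matches everything) are an assumption rather than a decided behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    system: str       # e.g. "NMX", "LCS", "Solar Winds", "VL Pro"
    alert_type: str
    message: str


@dataclass(frozen=True)
class NotificationRule:
    system: str
    alert_type: str
    keywords: tuple    # search terms; assumed to be AND-ed together
    recipients: tuple  # team or team-member email addresses

    def matches(self, alert):
        if alert.system != self.system or alert.alert_type != self.alert_type:
            return False
        text = alert.message.lower()
        return all(keyword.lower() in text for keyword in self.keywords)


def recipients_for(alert, rules):
    """Collect recipients of every matching rule, de-duplicated, in order."""
    recipients = []
    for rule in rules:
        if rule.matches(alert):
            for recipient in rule.recipients:
                if recipient not in recipients:
                    recipients.append(recipient)
    return recipients


rules = [
    NotificationRule("NMX", "failure", ("encoder",), ("compression@example.com",)),
    NotificationRule("NMX", "failure", (), ("oncall@example.com", "compression@example.com")),
    NotificationRule("LCS", "failure", (), ("lcs@example.com",)),
]
alert = Alert("NMX", "failure", "Encoder 12 lost input signal")
to_notify = recipients_for(alert, rules)
```

De-duplicating recipients avoids sending the same person several emails when more than one rule matches a single alert.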

Monitoring UI Alerts Table

Monitoring UI Alerts Table View Alert

Monitoring UI Notifications Table

Monitoring UI Create Notification

The Issue

The Solution

Since only some alerts are to be treated as critical for LRFs, admins will need the ability to determine which alerts, from each system we gather data from, will cause an LRF in Dish Maps to turn red.

Within the new monitoring UI, it will be possible to mark particular alerts as critical, so that when that type of alert arrives, its corresponding LRF turns red.

Monitoring UI Map Config Table

Monitoring UI Map Config Table Create Rule

The Issue

After a critical issue is handled, team members will need to be able to view all known issues and mark issues as resolved.

Relevant team members will need to be notified once LRF issues are resolved.

The Solution

Within the new monitoring UI, all issues will be viewable, and there will be an option to select an issue and mark it as resolved.

The new monitoring server will also need to keep track of critical statuses, auto-detect when system failures have been resolved, and automatically mark issues as resolved when possible.

When an issue is found, the server will monitor the affected equipment or channel. Once its status comes back as operational, the server will automatically mark the issue as resolved.

When an LRF-related issue is manually or automatically resolved, our server will send an HTTP request to the Dish Maps API to inform it that the LRF no longer needs to show red.
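
The resolution flow could be sketched in Python as follows. The endpoint URL, the JSON payload, and the `IssueTracker` class are hypothetical, since the Dish Maps API contract is not described here. The sketch also assumes the request is sent only once an LRF has no remaining open issues, which is one possible reading of "no longer needs to show red".

```python
import json
import urllib.request

# Hypothetical endpoint; the real Dish Maps API URL and payload are not known.
DISH_MAPS_CLEAR_URL = "https://dish-maps.example/api/lrf-status"


def build_clear_request(lrf_id):
    """Build (but do not send) the HTTP request that clears an LRF's red state."""
    body = json.dumps({"lrf_id": lrf_id, "status": "operational"}).encode("utf-8")
    return urllib.request.Request(
        DISH_MAPS_CLEAR_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


class IssueTracker:
    def __init__(self, notify):
        self.open_issues = {}  # (lrf_id, equipment) -> description
        self.notify = notify   # callable that sends a prepared request

    def report(self, lrf_id, equipment, description):
        self.open_issues[(lrf_id, equipment)] = description

    def resolve(self, lrf_id, equipment):
        """Manual or automatic resolution of a single issue."""
        if (lrf_id, equipment) not in self.open_issues:
            return
        del self.open_issues[(lrf_id, equipment)]
        # Assumption: only clear the map once no issue remains for the LRF.
        if not any(open_lrf == lrf_id for open_lrf, _ in self.open_issues):
            self.notify(build_clear_request(lrf_id))

    def on_status(self, lrf_id, equipment, status):
        """Auto-resolve when monitored equipment reports operational again."""
        if status == "operational":
            self.resolve(lrf_id, equipment)


sent = []
tracker = IssueTracker(notify=sent.append)
tracker.report("LRF-A", "encoder-1", "encoder down")
tracker.report("LRF-A", "router-1", "router unreachable")
tracker.on_status("LRF-A", "encoder-1", "operational")  # auto-resolved
sent_after_first = len(sent)
tracker.resolve("LRF-A", "router-1")                    # manually resolved
```

Passing the sender in as `notify` keeps the tracker testable without network access; in a deployment it could be `urllib.request.urlopen` or a wrapper that adds retries and authentication.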

Architectural Diagram

Phase 2 Ideas

Phase 2 Feature Ideas

Slating automation and slating scheduler UI + API development (API should be usable by other Dish teams to programmatically schedule changes as needed).

Connect to additional systems within Dish that would be useful to bring into the new centralized monitoring UI.

Build dashboards (TOC Dashboards included) within the new monitoring UI using the gathered data.

Create auto-ticket generation and assignment functionality when issues are detected.

Build IP video QA automation software

Add AI/Machine Learning data processing

Expand monitoring efforts to include Core (additional NMX API license would be required)

Automate entries into Elvis

Features with pending questions

Feature: Ability to provide API keys for other Dish systems to utilize our data collection API
Question: Should we architect and document our API with this in mind?

Feature: Network/Deployment details
Question: The database for Solar Winds is deployed in one network (Broadcast Network), while NMX, VL Pro, LCS, etc. are deployed in another Dish network (Corporate Network). Where should we deploy our UI/Server in order to ensure we have connectivity to all systems?

Feature: Correlating gathered data to a particular LRF
Question: Does the data from NMX, LCS, Solar Winds, and VL Pro all have LRF location data associated with it, so that we can properly display issues on Dish Maps relating to a particular LRF?

By Akyuna Akish
