As more workloads move to the cloud and access networks become more open, ensuring security and stability becomes a more demanding task. The attack surface expands, giving attackers more gaps to exploit. At the same time, maintaining network health is more important than ever for today’s organizations, with digital operations now the backbone of a successful business.
To improve the security of their networks, enterprises need to have comprehensive insight into all network activity. By implementing pervasive network visibility systems, they can mitigate risk and improve overall data security.
The Anatomy of a Multi-Tenant Network
Hybrid, cross-cloud infrastructures now in common use present a new, challenging facet for managing network security. Across businesses of all sizes, many workloads now run on public cloud or virtualized environments with a third-party provider.
In these environments, multiple tenants share the same infrastructure and network resources, requiring new, advanced methods for managing network security. Compared to dedicated environments, multi-tenant infrastructures carry a higher risk of breach or failure, which makes enhanced threat detection and monitoring a critical security function. Without these in place, the entire infrastructure is at risk, and even a minor issue can affect all clients that share common infrastructure.
Infrastructure providers, therefore, need to be able to extract detailed network activity data and provide timely insight to each tenant. Identifying who moves what through the network is difficult under normal circumstances, and the petabytes of data transferred within multi-tenant networks make the task even more difficult. This can result in lengthy “dwell times,” i.e., the time between a compromise and a breach identification, which can have a severe impact on a tenant’s business.
If the network or infrastructure vendor has instrumented its network for detailed traffic visibility, it will be in a much better position to provide accurate information in real time. Both the vendor and the client business then have more resources to help them mitigate risk, speed incident response, and keep their infrastructure secure.
Why Network Visibility Matters
The advancement of security and network technologies has taken network visibility to an entirely new level. Cutting-edge systems can provide a comprehensive overview of all network activity to create a foundation for the defense against different types of threats.
By gaining real-time visibility into their networks, organizations can detect unusual activity before it causes damage or financial loss. Moreover, they can respond promptly in case of a breach and gather relevant data for post-incident forensics. More specifically, implementing network visibility solutions and systems can help on three significant levels.
1. Proactive detection of abuse or potentially malicious activity
Proactive detection is an essential step toward securing network traffic. Rather than extracting intelligence from incident reports after the fact, proactive monitoring continuously gathers data that can help prevent the incident in the first place. Tools designed for proactive monitoring create baselines of network traffic and automatically alert on deviations from historical activity, making it easier to detect different types of malicious behavior.
With this type of intelligence, network engineers can prevent a greater number of malicious intrusions and breaches. Unusual network activity is reported in real time, which makes it easier to stop before it has an impact. Moreover, this kind of intelligence minimizes the impact in the case of an actual breach.
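The baseline-and-deviation idea described above can be illustrated with a minimal sketch. It assumes traffic volume is sampled at fixed intervals and uses a simple rolling mean and standard deviation; the function name, window size, and threshold are hypothetical choices for illustration and do not describe how any particular monitoring product builds its baselines.

```python
from statistics import mean, stdev


def find_deviations(samples, window=12, threshold=3.0):
    """Return indices of samples that deviate from a rolling baseline.

    samples: traffic volumes (for example, Mbps) taken at fixed intervals.
    window and threshold are illustrative values, not recommendations.
    """
    alerts = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        baseline = mean(history)
        spread = stdev(history)
        if spread == 0:
            # Flat history: treat any change at all as a deviation.
            if samples[i] != baseline:
                alerts.append(i)
            continue
        z_score = (samples[i] - baseline) / spread
        if abs(z_score) > threshold:
            alerts.append(i)
    return alerts


# Hypothetical readings with one obvious spike at index 13.
traffic_mbps = [100, 102, 98, 101, 99, 103, 100, 97, 102, 101, 99, 100, 101, 450, 100]
print(find_deviations(traffic_mbps))  # prints [13]
```

A production system would typically account for daily and weekly seasonality rather than a single short window, but the principle of comparing current traffic against learned history is the same.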
2. Real-time investigation (incident response)
If a breach or equipment failure occurs, real-time investigation is critical to minimizing downtime. However, with network attacks becoming rapidly more sophisticated, fast response time is becoming harder to achieve without a comprehensive network visibility solution.
The problem with real-time investigation is that some complex attacks, such as blended DDoS or highly specialized malware, may not be easy to detect. In these cases, the time span between the initial compromise and detection is prolonged, which not only degrades the user experience but also damages the company in the long term. Even though there has been an overall improvement in companies’ ability to minimize dwell time over the last couple of years, an alarming number of organizations still fail to detect a breach in a reasonable timeframe.
According to this year’s SANS Incident Response report, the ability of organizations to identify and respond to a breach within 24 hours increased significantly last year – from 40% of respondents in 2016 to 50% in 2017. On the other hand, about 15% of organizations surveyed need months to identify a breach, while about 5% take more than seven months.
The extent to which this can be damaging is perhaps best illustrated by some recent newsworthy breaches. In last year’s Equifax breach, it took months for the company to identify the compromise. The result was almost 143 million affected accounts. More recently, security analysts discovered that personal information of 198 million US voters had been publicly accessible for ten years via an Amazon S3 bucket belonging to the data firm Deep Root Analytics.
These examples both indicate that companies are still taking too long to identify vulnerabilities and protect their data. While the SANS report figures show a significant improvement in the way companies handle potential security breaches, these real-world examples warn of the dangers of not being able to do so.
3. Post-incident network traffic forensics
To provide an effective post-incident response, network engineering teams need access to detailed, relevant data from across the entire network. They need to be able to do a granular investigation that encompasses more than simply identifying the target resources affected by an attack. They need information on any unusual activity that may have preceded the incident in order to identify entry points, trace lateral movement, and inform potential solutions for an improved security posture going forward.
A Case for Intelligent Network Visibility
The complexity of network monitoring grows in parallel to the network itself. The bigger the network, the more difficult it becomes to track all activities on it. At phoenixNAP, we faced a similar issue as we opened new data centers worldwide and as our customer base grew.
As a provider of custom bare metal, cloud, colocation, and advanced managed services solutions, phoenixNAP is focused on ensuring the security and reliability of its network. Our global network expands at a rapid pace, which is why we needed a comprehensive network monitoring tool.
Kentik enabled us to more effectively monitor and manage our global network and provide our clients with a higher degree of stability. By moving from an appliance-based tool to Kentik’s scalable network traffic intelligence platform, Kentik Detect®, we enhanced our ability to detect deviant network behavior and provide more comprehensive incident response. Below are some of the practical ways in which Kentik enabled more effective network monitoring and analysis at phoenixNAP:
1. Traffic engineering and load balancing across locations and ISPs
PhoenixNAP uses Kentik to analyze traffic patterns for its clients. We take a number of factors into consideration, such as destination ASNs, AS paths, next hop, outgoing location, and ISP, to plan and execute traffic path changes and traffic engineering. This allows us to scale bandwidth and improve network performance characteristics like latency and loss as our clients’ businesses grow.
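As a rough illustration of this kind of analysis, the sketch below sums traffic volume by destination ASN and ISP to show where most bytes are going. The flow records, field names, and ISP labels are invented for the example (the ASNs come from the range reserved for documentation), and the function is a hypothetical helper, not part of any Kentik interface.

```python
from collections import defaultdict

# Hypothetical flow records; field names are assumptions for illustration.
flows = [
    {"dst_asn": 64496, "isp": "ISP-A", "bytes": 4_000_000},
    {"dst_asn": 64496, "isp": "ISP-A", "bytes": 1_000_000},
    {"dst_asn": 64497, "isp": "ISP-B", "bytes": 3_000_000},
    {"dst_asn": 64496, "isp": "ISP-B", "bytes": 500_000},
    {"dst_asn": 64498, "isp": "ISP-A", "bytes": 2_000_000},
]


def top_destinations(flow_records, key_fields=("dst_asn", "isp"), limit=3):
    """Sum bytes per key (destination ASN and ISP by default), largest first."""
    totals = defaultdict(int)
    for record in flow_records:
        key = tuple(record[field] for field in key_fields)
        totals[key] += record["bytes"]
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


for (asn, isp), total_bytes in top_destinations(flows):
    print(f"AS{asn} via {isp}: {total_bytes} bytes")
```

Grouping on other fields, such as next hop or outgoing location, only requires a different `key_fields` tuple, provided the records carry those fields.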
2. Proactive alerting of traffic dips and potential outages
PhoenixNAP uses Kentik to track traffic usage per client. Kentik keeps a historical baseline for each client’s traffic, which we use both for historical traffic comparison and for automatic detection of any loss or disruption in traffic. Alerts from Kentik automatically open tickets with our engineering teams to investigate any incidents and provide an early warning of network-wide problems such as path changes. This enables us to be proactive and start remediating immediately while also providing timely notifications to our clients about ongoing situations.
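A simplified version of this workflow might look like the sketch below. It assumes each client has a list of past readings for a comparable time period and flags a dip when current traffic falls below half of the median; the 50% ratio, the data, and the `open_ticket` placeholder are all hypothetical and stand in for whatever baselining and ticketing integration is actually in use.

```python
from statistics import median


def detect_traffic_dips(history, current, min_ratio=0.5):
    """Return {client: (baseline, current)} for clients whose traffic dropped.

    history: client -> past Mbps readings for a comparable period.
    current: client -> latest Mbps reading (a missing client counts as 0).
    """
    dips = {}
    for client, past in history.items():
        if not past:
            continue  # no baseline yet for this client
        baseline = median(past)
        now = current.get(client, 0.0)
        if baseline > 0 and now < baseline * min_ratio:
            dips[client] = (baseline, now)
    return dips


def open_ticket(client, baseline, now):
    # Placeholder for a real ticketing-system call (an assumption here).
    return f"[TICKET] {client}: traffic at {now:.0f} Mbps vs. baseline {baseline:.0f} Mbps"


history = {
    "client-a": [200, 210, 190, 205],
    "client-b": [50, 55, 45, 52],
    "client-c": [800, 790, 810, 805],
}
current = {"client-a": 198, "client-b": 12, "client-c": 795}

for client, (baseline, now) in detect_traffic_dips(history, current).items():
    print(open_ticket(client, baseline, now))
```

The median is used here instead of the mean so that a single past outage or spike does not skew the baseline.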
3. Real-time interface capacity alerts
Kentik also monitors router and switch ports for utilization. Alarms are generated when a port reaches a utilization percentage that indicates imminent congestion and customer impact. This gives the engineering teams valuable information and provides lead time to augment infrastructure capacity to support our continuous growth.
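To make the utilization check concrete, the following sketch derives a utilization percentage from two byte-counter readings and compares it against a threshold. The 80% threshold, port names, and record fields are assumptions for illustration, and counter wrap-around is deliberately ignored to keep the example short.

```python
def utilization_percent(bytes_start, bytes_end, interval_seconds, link_bps):
    """Average utilization over the interval, as a percentage of link speed."""
    bits_transferred = (bytes_end - bytes_start) * 8
    return 100.0 * bits_transferred / (interval_seconds * link_bps)


def capacity_alerts(ports, threshold_percent=80.0):
    """Return (port name, utilization %) for ports at or above the threshold."""
    alerts = []
    for port in ports:
        pct = utilization_percent(
            port["bytes_start"],
            port["bytes_end"],
            port["interval_seconds"],
            port["link_bps"],
        )
        if pct >= threshold_percent:
            alerts.append((port["name"], round(pct, 1)))
    return alerts


# Hypothetical 10 Gbps ports sampled over a 5-minute interval.
ports = [
    {"name": "edge1:xe-0/0/1", "bytes_start": 0, "bytes_end": 318_750_000_000,
     "interval_seconds": 300, "link_bps": 10_000_000_000},
    {"name": "edge1:xe-0/0/2", "bytes_start": 0, "bytes_end": 150_000_000_000,
     "interval_seconds": 300, "link_bps": 10_000_000_000},
]

print(capacity_alerts(ports))  # prints [('edge1:xe-0/0/1', 85.0)]
```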
PhoenixNAP executive teams also rely on data from Kentik to determine drivers of growth and inform decision making about targeted network upgrades.
4. Integration with multiple communication channels
Out of the box, Kentik integrates with email systems, Slack, PagerDuty, and logging systems. This gives technical teams the ability to receive notifications on a variety of channels rather than having to dig through a complex UI to get details about ongoing incidents. It also improves collaboration among teams and avoids the need to constantly relay incident status information between the NOC and the engineering teams.
5. Help clients with compromised hosts
The recent memcached amplification attack method resulted in a record-breaking 1.7 Tbps DDoS attack. Using Kentik, phoenixNAP set up alert policies and automated notifications to provide early warning if client systems became involved in attack activity. If any malicious activity of this kind was detected, the traffic would first be filtered to prevent impact to the network. Afterward, the affected client would be notified so that the affected systems could be patched. Kentik helped us improve overall security, preserve valuable bandwidth, and keep the network highly available for all phoenixNAP clients.
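One way to picture this kind of alert policy is the sketch below, which flags client hosts that emit a large volume of UDP traffic from source port 11211, the default memcached port and the pattern a host shows when it is being abused as an amplifier. The flow records, field names, byte threshold, and addresses (taken from documentation ranges) are hypothetical and do not reflect the actual policies configured at phoenixNAP.

```python
from collections import defaultdict

MEMCACHED_PORT = 11211  # default memcached port


def find_memcached_reflectors(flows, client_hosts, min_bytes=1_000_000):
    """Return {client IP: bytes sent} for hosts that look like amplifiers.

    A host is flagged when its UDP traffic with source port 11211 reaches
    min_bytes within the observed flows. min_bytes is an illustrative value.
    """
    totals = defaultdict(int)
    for flow in flows:
        if (
            flow["protocol"] == "udp"
            and flow["src_port"] == MEMCACHED_PORT
            and flow["src_ip"] in client_hosts
        ):
            totals[flow["src_ip"]] += flow["bytes"]
    return {ip: sent for ip, sent in totals.items() if sent >= min_bytes}


# Hypothetical client addresses and flow records.
client_hosts = {"192.0.2.10", "192.0.2.20"}
flows = [
    {"protocol": "udp", "src_ip": "192.0.2.10", "src_port": 11211, "bytes": 900_000},
    {"protocol": "udp", "src_ip": "192.0.2.10", "src_port": 11211, "bytes": 700_000},
    {"protocol": "tcp", "src_ip": "192.0.2.20", "src_port": 443, "bytes": 5_000_000},
    {"protocol": "udp", "src_ip": "192.0.2.20", "src_port": 11211, "bytes": 2_000},
    {"protocol": "udp", "src_ip": "203.0.113.5", "src_port": 11211, "bytes": 9_000_000},
]

print(find_memcached_reflectors(flows, client_hosts))  # prints {'192.0.2.10': 1600000}
```

Hosts returned by a check like this would be candidates for filtering first and client notification afterward, in the order described above.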
For more details on how we used Kentik’s solutions to better serve our clients, read our case study. You can also hear more about our Kentik use cases during a live joint webinar on Tuesday, June 5, 2018. Sign up for the webinar.