Evaluating Network Performance: A Content Server Perspective
I recently defended my school thesis, and it was a remarkable experience. I’d love to write about what I learned while writing my thesis paper. This is a long article, so unless you are enthusiastic about this niche, I’d strongly suggest you skip it.
What You’ll Learn
How enterprises leverage Internet Exchange Points with their content servers
Why cloud providers partner with Internet Exchange Points
How to grab thousands of Paris-traceroute measurements of a content server using a shell script and save them to a .txt file
What an IP geolocator is
How to convert this data into an Excel spreadsheet
How to determine the minimum latency to the content server
How to determine the distance to the content server
Checking if your Internet Exchange Point (IXP) is along the path to the content server
What Can a Thesis Paper Like this Help You Achieve
Prerequisites
To get the most out of this article, you will need the following:
Ubuntu Linux installed either in Windows WSL or as a VM on macOS
An understanding of Python
A foundational understanding of computer networking
An understanding of how a shell script works
The Research Questions I Tried to Answer
What proportion of the fifty most visited websites in my country host their content within and outside the country?
What kind of relationship exists between content location and observed latency?
To what extent is ISP A peering at the IXP for communication within the country?
Introduction
In my previous article [1], I explained what an Internet Exchange Point (IXP) is and how I worked out the Python script I needed for grabbing traceroute data from content servers. I suggest reading that first if you haven’t. In summary, an IXP is a meeting point where networks interconnect: large content providers like Google and Facebook, content delivery network (CDN) providers like Akamai and Fastly, and other large enterprises like Netflix that serve media content.
How Enterprises Leverage Internet Exchange Points with their Content Servers
In my thesis, I focused on web page content hosted on the server: think of HTML, CSS, and JavaScript files.
A Packet Tracer simulated Web Server with HTTP contents added to it.
But I got curious about how companies with large media libraries serve their content, especially those that don’t use content delivery network (CDN) providers like Akamai and Fastly. This wasn’t part of my thesis, but I found it helpful. I looked at Netflix and realized that they host their content in an IXP data center.
How Netflix Optimizes Content Popularity Using a CDN [2]
How Netflix Optimizes Content Popularity Using a CDN II [3]
From the above, we can see that Netflix first hosts its content at the IXP data center that is geographically closest to its users. It also partners with the ISPs in that region, so the ISP effectively acts as a content delivery network (CDN).
All of this is made possible by Netflix’s Open Connect CDN, which combines local servers (called Open Connect Appliances, or OCAs) with backbone infrastructure (ISPs or IXPs).
Why Cloud Providers Partner with IXPs
During my research, I also noticed that a cloud provider can partner with an IXP.
The cloud provider may have some sort of partnership with the IXP and may provide resources such as servers or storage to it. The cloud provider may also provide services to members of the IXP, such as hosting their websites. [4]
How to Grab Thousands of Paris-traceroute Data of a Content Server using a Shell Script
File: Shell.sh
The above script does the following:
Records the date and time the traceroute capture started.
Captures each traceroute and appends it to a text file.
Records the date and time the capture ended.
If you are capturing data for several websites, simply repeat the command for each website in your shell script.
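The three steps above can also be sketched in Python. This is a minimal sketch, not the Shell.sh script itself: the function names, the timestamp wording, and the output file layout are my own assumptions, and it assumes the paris-traceroute binary is on the PATH and is invoked with only a destination argument.

```python
import subprocess
from datetime import datetime
from typing import Callable, Iterable


def run_paris_traceroute(host: str) -> str:
    """Run paris-traceroute once and return whatever it printed."""
    result = subprocess.run(
        ["paris-traceroute", host],
        capture_output=True,
        text=True,
        check=False,  # keep going even if one destination fails
    )
    return result.stdout


def capture(
    hosts: Iterable[str],
    runs_per_host: int,
    out_path: str,
    tracer: Callable[[str], str] = run_paris_traceroute,
) -> None:
    """Append timestamped traceroute output for every host to out_path."""
    with open(out_path, "a", encoding="utf-8") as out:
        # Hypothetical marker lines; the original script's wording is unknown.
        out.write(f"Capture started: {datetime.now().isoformat()}\n")
        for host in hosts:
            for _ in range(runs_per_host):
                out.write(tracer(host))
        out.write(f"Capture ended: {datetime.now().isoformat()}\n")
```

Calling capture(["example.com"], 50, "traceroutes.txt") would append fifty runs for one site. The tracer parameter exists only so the file-writing logic can be exercised without network access or root privileges.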
To execute the script do the following:
Open a WSL terminal
Change into the directory that contains the shell script
Then run the command below
sudo ./filename.sh
Lastly, I used a Python script to execute the shell script.
File: paris-traceroute-to-shell.py
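For readers without access to that file, launching a shell script from Python can look roughly like the sketch below. It is an illustration under assumptions, not the contents of paris-traceroute-to-shell.py: run_shell_script and its use_sudo flag are hypothetical names, and the script is started through bash so it does not need the executable bit set.

```python
import subprocess


def run_shell_script(script_path: str, use_sudo: bool = False) -> str:
    """Run a shell script with bash and return its standard output."""
    command = ["bash", script_path]
    if use_sudo:
        # Raw-socket tools such as traceroute variants often need root.
        command = ["sudo"] + command
    completed = subprocess.run(
        command,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the script fails
    )
    return completed.stdout
```

With check=True, a failing script raises an exception instead of silently producing an empty capture.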
Understanding What is an IP Geolocator
Now the next step in the thesis was to determine the location where the content server is situated. This is possible using an IP Geolocator. It is a tool used to determine the geographical location of an IP address.
For accuracy, it was best to use two different IP geolocators and compare their results. So, I decided to opt for these two:
A number of details can be obtained from an IP geolocator. In this example I used a private IP address, so it makes sense that no details were shown.
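Private addresses have no public location, which is why a geolocator returns nothing for them. A quick way to filter them out before a lookup is Python's standard ipaddress module; the helper name below is my own.

```python
import ipaddress


def is_publicly_routable(address: str) -> bool:
    """True only for addresses that can appear on the public internet."""
    return ipaddress.ip_address(address).is_global


# A home-router address versus a well-known public resolver.
print(is_publicly_routable("192.168.1.1"))
print(is_publicly_routable("8.8.8.8"))
```

The first hops of a traceroute from a home network are usually private addresses like this, so skipping them saves API calls.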
How to Convert this Data into an Excel Spreadsheet
Since the IP addresses along the traceroute paths were in the thousands, I needed to use the IP geolocators’ APIs and write a Python script that automatically reads each IP address from the text file, sends it to the IP geolocator, and then writes the results to an Excel spreadsheet for further analysis.
File: python_IPGeolocator.py
File: Python_IPWhois.py
These scripts helped me answer one of my research questions:
What proportion of the forty-nine most visited websites in my country host their content within and outside the country?
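The flow described in this section (read the IP addresses from the text file, look each one up, export the results) can be sketched as follows. This is not the author's script: the lookup argument stands in for whichever geolocator API is used, the column names are assumptions, and the output is a CSV file, which Excel opens directly, so that the sketch needs only the standard library.

```python
import csv
import ipaddress
import re
from typing import Callable, Dict, Iterable, List

# Loose IPv4 matcher; candidates are validated with ipaddress below.
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_public_ips(traceroute_text: str) -> List[str]:
    """Return unique, publicly routable IPv4 addresses in first-seen order."""
    seen: List[str] = []
    for candidate in IPV4_PATTERN.findall(traceroute_text):
        try:
            address = ipaddress.ip_address(candidate)
        except ValueError:
            continue  # matched the pattern but is not a valid address
        if address.is_global and candidate not in seen:
            seen.append(candidate)
    return seen


def write_locations(
    ips: Iterable[str],
    lookup: Callable[[str], Dict[str, object]],
    out_path: str,
) -> None:
    """Write one row per IP using the fields returned by lookup(ip)."""
    fields = ["ip", "country", "city", "latitude", "longitude"]  # assumed columns
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fields, extrasaction="ignore")
        writer.writeheader()
        for ip in ips:
            row = {"ip": ip}
            row.update(lookup(ip))
            writer.writerow(row)
```

In practice, lookup would wrap an HTTP request to the chosen geolocator; keeping it as a parameter makes it easy to run the same addresses through two providers and compare.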
How to Determine the Minimum Latency to the Content Server
It is necessary to determine the minimum latency to these content servers in order to ensure that data is transferred quickly and reliably. It also helped answer two of the research questions I had in mind:
What kind of relationship exists between distance and observed latency?
To what extent is ISP A peering at the IXPN for communication within Nigeria?
Note that I gathered this data over 20 days. Specifically, I ran fifty (50) traceroutes to the server per day.
So how did I determine the minimum latency for each website?
For each traceroute, I took the last hop and selected the minimum latency among its 3 probes. So, if they were 20ms, 10ms, and 5ms, I chose 5ms.
Then, among the minimum latencies of all 50 traceroutes in one day, I selected the minimum.
Lastly, I selected the minimum latency for the website across the 20 days.
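The three-level minimum above is easy to express in code. The sketch below assumes each last-hop line reports its probe times as numbers followed by "ms" (for example "20ms 10ms 5ms" or "7.5 ms"); the function names and the dictionary keyed by day are my own choices. Probes that timed out and show as "*" are simply not matched, and a line with no replies at all raises an error.

```python
import re
from typing import Dict, List

# A number (integer or decimal) followed by optional spaces and "ms".
RTT_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*ms")


def last_hop_min(last_hop_line: str) -> float:
    """Smallest RTT among the probes reported on one last-hop line."""
    rtts = [float(value) for value in RTT_PATTERN.findall(last_hop_line)]
    if not rtts:
        raise ValueError("no RTT values found on this line")
    return min(rtts)


def daily_min(last_hop_lines: List[str]) -> float:
    """Minimum over all traceroutes captured in one day."""
    return min(last_hop_min(line) for line in last_hop_lines)


def overall_min(days: Dict[str, List[str]]) -> float:
    """Minimum over every day of the measurement period."""
    return min(daily_min(lines) for lines in days.values())


print(last_hop_min("12  203.0.113.7  20ms 10ms 5ms"))  # 5.0
```

Taking the minimum at each level filters out queuing delay and transient congestion, leaving a value close to the propagation delay of the path.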
How to Determine the Distance to the Content Server
To do this, I used the Haversine formula, which determines the great-circle distance between two points on a sphere given their longitudes and latitudes. These coordinates were obtained from the IP geolocators, and I just had to add them to the Python script below. Specifically, I took the IP address associated with the domain name of the content server, got its location, and then computed the distance from my PC to it.
File: distanceToServer.py
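For reference, here is a minimal standalone version of the Haversine calculation. It is a sketch, not the distanceToServer.py file itself; it assumes a mean Earth radius of 6371 km and coordinates given in decimal degrees.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value)


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    delta_phi = phi2 - phi1
    delta_lambda = math.radians(lon2 - lon1)
    # Haversine of the central angle between the two points.
    a = (
        math.sin(delta_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(delta_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

The result is only as accurate as the geolocator's coordinates, and it measures the straight path over the globe, not the route the packets actually take, so it is a lower bound on the real path length.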
Checking if your Internet Exchange Point (IXP) is along the Path to the Content Server
To achieve this, I simply did a BGP Looking Glass lookup using Hurricane Electric, a Tier 1 ISP. Specifically, I searched for the IP address associated with my country's IXP. Then I compared it with the traceroute data I gathered to see whether it was along the path.
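Once the IXP's address range is known, the comparison against the traceroute hops can be automated. The sketch below assumes the range is available as one or more CIDR prefixes; the 192.0.2.0/24 prefix in the example is a documentation placeholder, not the real prefix of any IXP, and the function name is hypothetical.

```python
import ipaddress
from typing import Iterable, List


def hops_in_prefixes(hop_ips: Iterable[str], prefixes: Iterable[str]) -> List[str]:
    """Return the hops whose address falls inside any of the given prefixes."""
    networks = [ipaddress.ip_network(prefix) for prefix in prefixes]
    matches: List[str] = []
    for hop in hop_ips:
        address = ipaddress.ip_address(hop)
        if any(address in network for network in networks):
            matches.append(hop)
    return matches


# Placeholder prefix and hops, for illustration only.
path = ["10.0.0.1", "192.0.2.15", "8.8.8.8"]
print(hops_in_prefixes(path, ["192.0.2.0/24"]))  # ['192.0.2.15']
```

An empty result means that no hop in that traceroute used an address from the IXP's peering LAN, which suggests the traffic did not cross the exchange.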
What Can a Thesis Paper Like this Help You Achieve
It can help one gain a better understanding of how to measure, analyze, and improve the performance of a content server. For example, it can help identify the best location to host a new server for customers, based on the current internet architecture in that region.
A paper on this can also help one understand how to use performance metrics such as latency, throughput, and packet loss to determine the quality of service.
Additionally, this paper can help one gain an understanding of how to use various network technologies (e.g. BGP) to improve the performance of a content server.
References
[1] Uneze, C. (2020, June 28). A look into my internet measurement [Blog post]. Retrieved from https://charlesuneze.substack.com/p/a-look-into-my-internet-measurement
[2] Netflix. (2020, May 13). Content popularity for Open Connect [Blog post]. Retrieved from https://netflixtechblog.com/content-popularity-for-open-connect-b86d56f613b9
[3] Netflix. (2018, November). Open Connect briefing paper [PDF]. Retrieved from https://openconnect.netflix.com/Open-Connect-Briefing-Paper.pdf
[4] Interxion. (n.d.). AWS Direct Connect. Retrieved from https://www.interxion.com/why-interxion/colocate-with-the-clouds/aws-direct-connect/
[5] Network Charles. (n.d.). Evaluating Network Performance: A Content Server Perspective [GitHub repository]. Retrieved from https://github.com/network-charles/Evaluating-Network-Performance-A-Content-Server-Perspective