#60 - IP Geolocation using Google Cloud Load Balancers

In this episode, we will be checking out an IP Geolocation Beta feature using Google Cloud Load Balancers. Using this feature, you can have the Load Balancer inject user-defined HTTP headers, filled in with geolocation data about the client, into each request.

Before we dive into the technical aspects of this episode, let me show you what the end result looks like. Back in episode 56, we deployed this example site using Kubernetes, then we configured a simple deployment pipeline in episode 58. In this episode, we are adding a little footer down here that guesses where the user request is coming from based off their IP address. You can see here, it says Oak Bay, which is pretty close to where I live. Also, just a heads up, this is pretty specific to Google Cloud but could be reproduced elsewhere with similar technologies (we will chat about that at the end of this episode).

You can try this out yourself by going to sysadmindemo.com and looking at the footer. Then, if you click the debug link here, you will get the full debug headers showing where I think this request is coming from. So, you can see, I think it is coming from Oak Bay, BC, Canada. This is just a few kilometers from my house. Pretty cool.

Alright, let's chat about why I added this. Well, in an upcoming episode, I wanted to build sort of an example interactive game where you answer questions and either get them right or wrong. I wanted to create sort of a leaderboard where we would group all answers by the user's country.

Why use IP Geolocation?

  • Customer Service (issues with language, local custom, slow site, etc)
  • Set default language when accessing a website
  • Analytics (should we market more in a particular area / market growth)
  • Fraud detection
  • Gaming - Matchmaking
  • Assisting with address lookup
  • Maps - Setting the default viewport

But, there are tons of things you can do with this type of data and I just wanted to walk through a couple of them. Lots of websites use this type of thing behind the scenes without you even knowing. They often log it when you enter support tickets, user feedback, etc. Lots of times this can help determine if, say, users from a particular country or region are having a hard time using your site. Maybe you need to translate it for them, or maybe it is super slow in their country. This could act as a good signal that you need to add additional infrastructure there or something. Lots of people use this type of data for analytics to see if their product is growing in a particular country, or to direct marketing efforts, etc. Some people use this for fraud detection, but I suspect most of this is happening through machine learning now, where country is likely just one of many features used. When you are playing multiplayer games, lots of companies will try to use this type of geolocation data in their matchmaking service, to match you up with people who are close to you and speak the same language. Also, tons of address lookup, closest store lookup, and other mapping-type tools use this behind the scenes to assist your searches.

Alright, so let's walk through how this works. By the way, in episodes 56 and 58 we walked through this architecture in detail. So, we have a user up here, making a connection from their IP address that passes through our load balancer, and that gets passed to our web server.

Typically, you will just see some type of X-Forwarded-For header that contains the user's IP address. We can ask the load balancer to look up the user's IP geolocation data and inject additional headers into this request.
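As a rough sketch of what reading that header might look like in a script on the web server, here is a small shell function that pulls the client IP out of an X-Forwarded-For value. It assumes the load balancer appends the client IP followed by its own IP, so the client is the second-to-last entry. The function name client_ip_from_xff and the addresses are made up for illustration.

```shell
#!/bin/sh
# Print the client IP from an X-Forwarded-For value.
# Assumption: the load balancer appends "<client-ip>,<load-balancer-ip>",
# so the client is the second-to-last entry. Anything before that may have
# been supplied by the client and should not be trusted.
client_ip_from_xff() {
  printf '%s\n' "$1" | awk -F',' '
    NF >= 2 {
      ip = $(NF - 1)
      gsub(/^[ \t]+|[ \t]+$/, "", ip)   # trim surrounding whitespace
      print ip
    }
  '
}

# Example with documentation-range addresses (not real traffic).
client_ip_from_xff "203.0.113.7, 198.51.100.1"
```

If the value has fewer than two entries, the function prints nothing, which a caller can treat as "unknown".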

These additional headers will then be passed down to our web server where we can do things with them. In a nutshell that is basically how the https://sysadmindemo.com/ page works.

So, let's set this up. There is this new Beta feature called User-defined request headers where we can ask the load balancer to inject custom headers as requests are passing through it.

There is an example down here, where we ask to have this X-Client-Geo-Location header added with the user's region and city filled in. Then, here is what that might look like, with say the US as the country and Mountain View as the city.
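To make that format concrete, here is a minimal sketch of splitting a combined value like that back into its parts. It assumes the header was defined as region, then a comma, then city, and it splits on the first comma only. The function name split_geo_location is hypothetical.

```shell
#!/bin/sh
# Split a combined "region,city" header value on the first comma.
# A value with no comma is treated as a region with an empty city.
# Note: plain sh has no local variables, so these names are global.
split_geo_location() {
  value=$1
  case $value in
    *,*) region=${value%%,*}; city=${value#*,} ;;
    *)   region=$value; city= ;;
  esac
  printf 'region=%s\ncity=%s\n' "$region" "$city"
}

split_geo_location "US,Mountain View"
```

Splitting on the first comma keeps any spaces in the city name intact, which matters for names like Mountain View.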

Then, there is this table that shows all the available keywords you can use. So, you can add things like estimated request round trip time, client country, client state or province, client city, the city lat long points, and then a bunch of TLS metadata.

So, this is the demo website again where we were testing out Kubernetes Pod Load Balancing. In this version of the site I have not added this feature yet. The one I showed earlier was the end result of all this (through the magic of video editing). So, we can click down here, into the debug messages where I am dumping out all the headers, and we do not have any geolocation headers yet. We only have the user's IP address, which I have blurred out here, along with a bit of debug information about the host and binary that is running this site.

Alright, so let's look at how we get these headers from here, into this site here. In the next tab, I have the https://sysadmindemo.com/ site's load balancer configuration open. You can see I am listening on ports 80 and 443 and have an SSL certificate added. There are no user-defined headers added right now.

If we go back to the documentation page, and scroll down a bit, you can see some commands here for adding the user-defined headers. So, I have copied these into a text editor, and used all the values from that table above. Let me show you.

So, this is the command we are going to run that will modify the https://sysadmindemo.com/ load balancer and inject these additional headers on each request. So, this will give us the estimated request round trip time, country, state or province, city, and the city lat and long. Then we will grab the TLS metadata too. Finally, we will create a header that combines a few of these values into a single line. Likely, you would not add all of these like this, but would just select the few you were interested in and put them into a combined header or something. Then you can just parse these using scripts on your web server.

Cool, so let's jump back to the console and run this command. So, I am going to use the cloud shell for this, but you could totally use the gcloud command from your machine too. You can use this command here to get a listing of all the Load Balancers in your project.

gcloud beta compute backend-services list

I already know the one I want, and put that into the command we are going to paste here. You can see I am requesting these headers be updated on a specific load balancer here, by saying gcloud beta compute backend-services update, and then the load balancer ID. Then, let's run it, and that's it.

gcloud beta compute backend-services update k8s-be-31711--1dc04f318f303884 \
  --custom-request-header 'X-Client-RRT:{client_rtt_msec}' \
  --custom-request-header 'X-Client-Region:{client_region}' \
  --custom-request-header 'X-Client-Region-Subdivision:{client_region_subdivision}' \
  --custom-request-header 'X-Client-City:{client_city}' \
  --custom-request-header 'X-Client-Lat-Long:{client_city_lat_long}' \
  --custom-request-header 'X-Client-TLS-Host:{tls_sni_hostname}' \
  --custom-request-header 'X-Client-TLS-Version:{tls_version}' \
  --custom-request-header 'X-Client-TLS-Cipher:{tls_cipher_suite}' \
  --custom-request-header 'X-Client-Geo-Location:{client_region},{client_city}' \
  --global
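If you only want a handful of these headers, one option is to generate the command from a short list. The sketch below only prints a command in the same shape as the one above and does not run anything. The function name build_update_command and the backend name my-backend are placeholders, not something from the documentation.

```shell
#!/bin/sh
# Print (do not run) a backend-services update command that sets
# only the headers passed as arguments, one flag per header.
build_update_command() {
  backend=$1
  shift
  printf 'gcloud beta compute backend-services update %s' "$backend"
  for header in "$@"; do
    # Emit a line continuation, then one --custom-request-header flag.
    printf " \\\\\n  --custom-request-header '%s'" "$header"
  done
  printf ' \\\n  --global\n'
}

# "my-backend" is a placeholder; use a name from the list command above.
build_update_command my-backend \
  'X-Client-Region:{client_region}' \
  'X-Client-City:{client_city}'
```

Printing the command first gives you a chance to review it before pasting it into the cloud shell.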

Let's jump back to the https://sysadmindemo.com/debug page and refresh a few times. This takes about 5-10 minutes to fully deploy. So, I am just going to pause the recording and come back in a few minutes here. Alright, we are back, and you can see we now have the added headers here. Let me just increase the size a little to make it easier to see. You can see we have all the headers now. Pretty cool, right? We have the city, state, country, round trip time, etc. So, now on the web server side we could parse these out and hopefully use these values to improve the user's experience. Maybe by selecting the default language, assisting with address lookups, or in my case building a game leaderboard based off what country the user is connecting from. Then, on the currently deployed version of this site I added the footer down here.

X-Client-City: Oak Bay
X-Client-Geo-Location: CA,Oak Bay
X-Client-Lat-Long: 48.426481,-123.314127
X-Client-Region: CA
X-Client-Region-Subdivision: CABC
X-Client-Rrt: 22
X-Client-Tls-Cipher: c02f
X-Client-Tls-Host: sysadmindemo.com
X-Client-Tls-Version: TLSv1.2
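Here is a minimal sketch of parsing a dump like this in a script. It assumes one "Name: value" pair per line, as on the debug page, and matches header names case-insensitively, since the server shows them with different capitalization than the command used. The function name header_value is hypothetical, and the sample values are copied from the output above.

```shell
#!/bin/sh
# Print the value of one header from "Name: value" lines on stdin.
# Header names are compared case-insensitively; the first match wins.
header_value() {
  awk -v want="$1" '
    BEGIN { want = tolower(want) }
    {
      sep = index($0, ":")
      if (sep == 0) next                      # skip lines without a colon
      name = tolower(substr($0, 1, sep - 1))
      if (name == want) {
        value = substr($0, sep + 1)
        sub(/^[ \t]+/, "", value)             # drop whitespace after the colon
        print value
        exit
      }
    }
  '
}

# Sample lines copied from the debug output above.
headers='X-Client-City: Oak Bay
X-Client-Lat-Long: 48.426481,-123.314127
X-Client-Region: CA'

# Look up the lat/long header, then split it on the comma.
lat_long=$(printf '%s\n' "$headers" | header_value "x-client-lat-long")
latitude=${lat_long%%,*}
longitude=${lat_long#*,}
echo "latitude=$latitude longitude=$longitude"
```

The same function could pull out X-Client-Region for something like grouping a leaderboard by country.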

So, how can you do this without using Google Cloud? Well, traditionally people have been using MaxMind. Honestly, this product probably goes back at least 15 years. I remember using this a long time ago for some simple fraud detection. You could use this and there is a whole slew of integrations you could use too. It does cost a few bucks though. Let me find the pricing. Here you go.

Google’s version is $0.75 per million requests. So, likely pretty reasonable for many smaller sites. I have also added a link in the episode notes below where you can do something similar on Cloudflare if you are using that.

Alright, that is it for this episode. Thanks for watching and I will see you next week. Bye.

Metadata
  • Published
    2019-03-13
  • Duration
    9 minutes