Che Hodgins // Musings on Web Development

Tag Archive for ‘geolocation’

Advanced Geolocation

I write this in honor of the Firefox web browser. I still remember when it was first released on November 9th, 2004, and gave me hope for a better, nicer, non-IE world. Today, Firefox 3.5 is released. Building upon nearly 5 years of success, Mozilla has continued innovating, and I thank them for making the web a better place.

Feel free to skip to the demo.

I previously wrote about using IP-based Geolocation. Although this method is widely used, the downsides are obvious: inaccurate results, proxies, false positives, and a lack of privacy control for the end user.

The Future of Geolocation

The new generation of browsers is implementing the Geolocation API specification, which gives the browser the job of figuring out where you are. There are positive and negative points to this. Firstly, the position of the user can be more accurate. In IP-based Geolocation, the only data available is the IP address, whereas the browser has access to much more precise data such as WiFi networks and GPS devices (iPhone!). Secondly, privacy settings. The browser should be able to ask the user whether they will allow such information to be shared, and ideally even at what level of accuracy. This is possible because the feature is implemented in the browser itself. One negative point is something we are all familiar with: cross-browser compatibility. Different implementations in different web browsers will make developers miserable, but hey, that’s what standards are for, right? :)

Browser Support

As of today a few web browsers support geolocation. Here’s the status of the mainstream browsers:

  • Firefox: Available in version 3.5, released today.
  • iPhone Safari: Available in OS3.
  • IE: Experimental in version 8.
  • Opera: Available in nightly builds since March 2009.
  • Chrome: Available through the Google Gears API.
  • Safari: Unknown.
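
Since support is split between the W3C API and Gears, a page may want to probe for both. Here is a minimal sketch of that probing, not a definitive implementation. It assumes the W3C object is exposed as `navigator.geolocation` and that Gears, when installed, provides `google.gears.factory.create('beta.geolocation')`. The function name `pickGeolocationProvider` and its return shape are made up for illustration, and the two objects are passed in as arguments so the function can be tried outside a browser.

```javascript
// Hypothetical helper: decide which geolocation API a page can use.
// `nav` stands in for window.navigator, `gears` for window.google.gears.
function pickGeolocationProvider(nav, gears) {
    // Prefer the W3C Geolocation API when the browser exposes it.
    if (nav && nav.geolocation) {
        return { name: 'w3c', api: nav.geolocation };
    }
    // Otherwise fall back to Gears (assumption: the module is 'beta.geolocation').
    if (gears && gears.factory) {
        return { name: 'gears', api: gears.factory.create('beta.geolocation') };
    }
    // Neither API is available.
    return null;
}
```

In a page this could be called as `pickGeolocationProvider(window.navigator, window.google && window.google.gears)`, and a `null` result treated the same way as the "Browser not supported" branch.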

Here is how to request a user's location (the call to navigator.geolocation.getCurrentPosition is where the goods are):

function knownLocation(position) {

    var latitude, longitude;
    if (position.coords) { // iPhone
        latitude = position.coords.latitude;
        longitude = position.coords.longitude;
    } else { // Firefox
        latitude = position.latitude;
        longitude = position.longitude;
    }

    var div = document.getElementById('geo');
    div.innerHTML = div.innerHTML + "Latitude: " + latitude + "<br/>Longitude: " + longitude;
}

function unknownLocation() {
    var div = document.getElementById('geo');
    div.innerHTML = div.innerHTML + "Unknown Location";
}

window.onload = function() {
    var div = document.getElementById('geo');
    div.innerHTML = div.innerHTML + "Browser: " + navigator.appName + " (" + navigator.appVersion + ")<br/>";

    if (navigator.geolocation) {
        navigator.geolocation.getCurrentPosition(knownLocation, unknownLocation);
    } else {
        div.innerHTML = div.innerHTML + "Browser not supported";
    }
};

This then prompts the user for their approval:

Geolocation in Firefox 3.5

Similarly, on the iPhone:

Conclusion

This is really cool. With geolocation implemented in the browser, great precision can be achieved. For example, on the iPhone the GPS is used, so visiting the demo page from my living room gives different coordinates than visiting from my kitchen. I can imagine many useful applications of this. Another thing I love is that I can deny my location to individual sites, an option I will absolutely use.
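
Because users can deny the request, the error callback is worth more than a generic "Unknown Location". The sketch below is one way to tell the cases apart, assuming the error object passed to the callback carries the numeric `code` defined by the W3C Geolocation API (1 for permission denied, 2 for position unavailable, 3 for timeout). The helper name `describeGeoError` and the message texts are hypothetical.

```javascript
// Error codes defined by the W3C Geolocation API for the error callback.
var GEO_ERROR_MESSAGES = {
    1: 'Permission denied: the user chose not to share a location.',
    2: 'Position unavailable: the browser could not determine a location.',
    3: 'Timeout: no location was obtained in time.'
};

// Hypothetical helper: turn the error object into a readable message.
// Anything without a recognised code falls through to a generic message.
function describeGeoError(error) {
    var code = error && error.code;
    return GEO_ERROR_MESSAGES[code] || 'Unknown location error.';
}
```

It could be used from the second callback of `getCurrentPosition`, for example by appending `describeGeoError(error)` to the `geo` div instead of the fixed "Unknown Location" text.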


Free and Fast Geolocation in PHP

Geo* (as I call them) are the web technologies that provide a link between online content and Earth’s geography. Examples include Geocoding (finding latitude/longitude based on street addresses), Geotagging (tagging media with latitude/longitude coordinates), and Geolocation (finding the latitude/longitude of a computer).

Geolocation is a particularly cool technique because it allows you to estimate a person’s geographic location, and thus to provide a custom-tailored experience on your website, among other things. This can be as annoying as it is useful. There are several methods of Geolocation, some as simple as asking users where they are located. This article focuses on adding IP-based Geolocation to your PHP website for free, all the while keeping it fast.

Problems

If IP addresses are to be used to determine a person's physical location, then a few possible problems come to mind:

  • How accurate is the mapping between an IP address and a geographical location?
    • From maxmind.com’s Geolocation service: “99.8% accurate on a country level, 90% accurate on a state level, and 83% accurate for the US within a 25 mile radius.” From some research, the matching appears to be done using either the address of the ISP that owns the IP [link], or data bought from websites that ask for their users' locations [link].
  • What about users behind proxies?
    • Some Geolocation databases flag the IPs of potential anonymous proxy servers.
    • Most proxy servers send X-Forwarded-For and Client-IP headers that you can use.

This is not perfect, but in many cases the approximate geographical location of a user can be inferred.
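
The proxy headers mentioned above can be sketched as follows. This is an illustration under stated assumptions rather than a complete solution: `headers` is assumed to be a plain object with lowercased header names (the shape Node's http module provides), the helper names `looksLikeIPv4` and `clientIpFromHeaders` are hypothetical, and only IPv4 is handled. Both headers are supplied by the client side of the connection and can be forged, so the result is a hint, not proof.

```javascript
// Loose IPv4 check: four dot-separated groups of 1-3 digits, each 0-255.
function looksLikeIPv4(value) {
    var parts = String(value).trim().split('.');
    if (parts.length !== 4) { return false; }
    return parts.every(function (part) {
        return /^\d{1,3}$/.test(part) && Number(part) <= 255;
    });
}

// Hypothetical helper: choose the address to hand to the geolocation lookup.
// `remoteAddress` is the address of the directly connected peer.
function clientIpFromHeaders(headers, remoteAddress) {
    var forwarded = headers['x-forwarded-for'];
    if (forwarded) {
        // The header may hold a comma-separated chain of addresses;
        // the first entry is the client as reported by the first proxy.
        var first = forwarded.split(',')[0].trim();
        if (looksLikeIPv4(first)) { return first; }
    }
    var clientIp = headers['client-ip'];
    if (clientIp && looksLikeIPv4(clientIp)) { return clientIp.trim(); }
    // No usable proxy header: fall back to the connection's own address.
    return remoteAddress;
}
```

Checking X-Forwarded-For before Client-IP is a design choice of this sketch, not something the source prescribes.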

Demo time

This demo will use the free Geolocation database provided by Maxmind.com. I believe this is the ideal choice for normal (i.e. not Facebook) websites for several reasons:

  • It is free (there is a paid version with higher accuracy)
  • It is fast. They report up to 1 million queries per second on 1 machine.
  • It is extensible. The database can be upgraded to the paid version by just replacing the binary.
  • They like developers. They provide implementations in over 10 different programming languages, with benchmarks.
  • Their website is full of valuable information. They provide benchmarks, an explanation of how they collect their data, and more. I haven’t seen this with any other IP Geolocation services.

There are two options for us PHP developers: the pure PHP library or a PECL package built on the C library. For reasons that will be discussed below, the PECL package will be used. If you do not want to use a PECL package or are on a hosted server, then you can download the pure PHP classes here.

First, the GeoIP C library must be downloaded (link) and installed. Note that this can be installed on Windows as well. No special options are needed to install it:

mbpro:GeoIP-1.4.6 chehodgins$ sudo ./configure
mbpro:GeoIP-1.4.6 chehodgins$ sudo make
mbpro:GeoIP-1.4.6 chehodgins$ sudo make install



Then the PECL package can be installed:

mbpro:~ chehodgins$ sudo pecl install geoip
downloading geoip-1.0.7.tar ...
...
You should add "extension=geoip.so" to php.ini



Next, add this extension to php.ini (i.e. extension=geoip.so), restart Apache, and check out phpinfo():

GeoIP in phpinfo()




The final step before writing code is to download the actual database. It is updated monthly, so remember to stay up to date. The directory that should contain the file is OS dependent, so create a quick PHP script to see where the directory is:

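<?php
// Show every warning so the missing-database message is visible
ini_set('display_errors', true);
error_reporting(E_ALL | E_STRICT);
// Look up a sample IP; without the database file this raises a warning
$result = geoip_record_by_name('72.30.81.165');
var_dump($result);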



Gives us:

Determine the binary directory


Now save the binary to the directory mentioned in the PHP warning, reload your script, and the warning should disappear. Let’s try again with some more code:

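<?php
ini_set('display_errors', true);
error_reporting(E_ALL | E_STRICT);

// List every function the geoip extension provides
$functions = get_extension_funcs('geoip');
var_dump($functions);

// Look up the full record for a sample IP address
$result = geoip_record_by_name('72.30.81.165');
var_dump($result);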



Gives:

array
0 => string 'geoip_database_info' (length=19)
1 => string 'geoip_country_code_by_name' (length=26)
2 => string 'geoip_country_code3_by_name' (length=27)
3 => string 'geoip_country_name_by_name' (length=26)
4 => string 'geoip_continent_code_by_name' (length=28)
5 => string 'geoip_org_by_name' (length=17)
6 => string 'geoip_record_by_name' (length=20)
7 => string 'geoip_id_by_name' (length=16)
8 => string 'geoip_region_by_name' (length=20)
9 => string 'geoip_isp_by_name' (length=17)
10 => string 'geoip_db_avail' (length=14)
11 => string 'geoip_db_get_all_info' (length=21)
12 => string 'geoip_db_filename' (length=17)
13 => string 'geoip_region_name_by_code' (length=25)
14 => string 'geoip_time_zone_by_country_and_region' (length=37)

array
'continent_code' => string 'NA' (length=2)
'country_code' => string 'US' (length=2)
'country_code3' => string 'USA' (length=3)
'country_name' => string 'United States' (length=13)
'region' => string 'CA' (length=2)
'city' => string 'Sunnyvale' (length=9)
'postal_code' => string '94089' (length=5)
'latitude' => float 37.4249000549
'longitude' => float -122.007400513
'dma_code' => int 807
'area_code' => int 408



With only an IP address we can easily get the country, postal code, longitude and latitude, and even the area code of the user.
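
As one illustration of what the latitude and longitude can be used for, the sketch below picks the closest entry from a list of places, for example to suggest a nearby office or store. It is written in JavaScript to match the browser example in the previous post and assumes the record has been handed to the page with the same `latitude` and `longitude` field names as the output above. The helper names and the 6371 km mean Earth radius are assumptions of this sketch, and the haversine formula treats the Earth as a sphere.

```javascript
var EARTH_RADIUS_KM = 6371; // mean Earth radius, an approximation

function toRadians(degrees) {
    return degrees * Math.PI / 180;
}

// Great-circle distance between two latitude/longitude pairs (haversine formula).
function distanceKm(lat1, lon1, lat2, lon2) {
    var dLat = toRadians(lat2 - lat1);
    var dLon = toRadians(lon2 - lon1);
    var a = Math.sin(dLat / 2) * Math.sin(dLat / 2) +
            Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) *
            Math.sin(dLon / 2) * Math.sin(dLon / 2);
    return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
}

// Hypothetical helper: return the entry of `places` closest to the record,
// or null when the list is empty.
function nearestPlace(record, places) {
    var best = null;
    var bestDistance = Infinity;
    places.forEach(function (place) {
        var d = distanceKm(record.latitude, record.longitude,
                           place.latitude, place.longitude);
        if (d < bestDistance) {
            best = place;
            bestDistance = d;
        }
    });
    return best;
}
```

Given the accuracy figures quoted earlier, a result like this is best treated as a default the user can override rather than a fact.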

Performance

I initially thought that the PECL version would outperform the pure PHP version by a small percentage. I was wrong. The PECL version was much faster. Here are some informal benchmarks.

Library       Iterations   Total     Avg per request   Notes
PECL GeoIP    10,000       0.7s      0.07ms
Pure PHP      10,000       49.2s     4.92ms
PECL GeoIP    1            0.08ms    0.08ms            Typical real world usage
Pure PHP      1            2.4ms     2.4ms             Typical real world usage

As a validation of my results I benchmarked the pure PHP library being used in a web application and had comparable results to my benchmarks (5.9ms per IP lookup versus the 2.4ms above).
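
The benchmark script itself is not shown, so the following is only a sketch of how such an informal measurement can be taken: run a lookup a fixed number of times, then divide the total time by the iteration count. It is in JavaScript for consistency with the other examples here. `timeLookups` and `nowMs` are hypothetical names, and `lookup` stands for whatever function performs the IP lookup.

```javascript
// Millisecond clock: performance.now() where available, Date.now() otherwise.
function nowMs() {
    return (typeof performance !== 'undefined' && typeof performance.now === 'function')
        ? performance.now()
        : Date.now();
}

// Hypothetical harness: call `lookup` `iterations` times and report timings.
function timeLookups(lookup, iterations) {
    var start = nowMs();
    for (var i = 0; i < iterations; i++) {
        lookup('72.30.81.165'); // same sample IP as in the examples above
    }
    var totalMs = nowMs() - start;
    return {
        iterations: iterations,
        totalMs: totalMs,
        avgMs: totalMs / iterations // average time per lookup
    };
}
```

A single-iteration run includes one-off costs such as opening the database, which may be why the per-request figures for one lookup differ from the 10,000-iteration averages.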

Conclusion

Because of the ease of implementation, the low cost, and the minimal performance overhead, there is much to be gained by adding IP Geolocation to your web application. The PECL package is the ideal configuration because it provides a faster experience with less code to maintain. The pure PHP library is nonetheless still relatively fast and thus still worth it. This is still far from a perfect solution. False positives can occur, anonymous proxies mess everything up, and IP addresses are constantly changing. Also, what about users who simply do not want to share their location? There are privacy issues. This is currently a hot topic: the W3C Geolocation API is being actively worked on, and Mozilla, Opera and others are making efforts to improve location awareness on the web, something I am looking forward to.

More reading:

Interesting Geolocation presentation
GeoIP functions in the PHP manual
Cool Geo* stuff at Y!
