UP | HOME

Class 15: matplotlib - graphing

Table of Contents

Introduction

This class is about starting to plot data! Matplotlib is a super powerful plotting system. I am definitely not an expert. Check out the range of examples:

http://matplotlib.sourceforge.net/

Setup

I have pre-parsed the data for you.

mkdir -p ~/class/15
cd ~/class/15
wget http://vislab-ccom.unh.edu/~schwehr/rt/examples/2011-10-11.gga.dat.bz2

There is no need to uncompress the data!

bzcat 2011-10-11.gga.dat.bz2 | head
# x y z quality satellites hdop
-70.9395833333 43.1354166667 35.7 2 9 1.1
-70.9395766667 43.135415 36.1 2 9 1.1
-70.93957 43.1354133333 36.5 2 9 1.1
-70.9395666667 43.1354133333 37.0 2 9 1.1
-70.9395633333 43.1354133333 37.4 2 9 1.1
-70.9395633333 43.1354133333 37.8 2 9 1.1
-70.9395616667 43.1354133333 38.3 2 9 1.1
-70.9395616667 43.135415 38.7 2 9 1.1
-70.93956 43.1354133333 39.1 2 9 1.1

Loading the data

Start ipython with pylab loaded

ipython --pylab

We can load the data using numpy's loadtxt function. It knows how to handle gzip and bzip2 compressed data!

loadtxt?
# q to quit out of the pager

data = loadtxt('2011-10-11.gga.dat.bz2')

data

# The results
array([[-70.93958333,  43.13541667,  35.7       ,   2.        ,
          9.        ,   1.1       ],
       [-70.93957667,  43.135415  ,  36.1       ,   2.        ,
          9.        ,   1.1       ],
       [-70.93957   ,  43.13541333,  36.5       ,   2.        ,
          9.        ,   1.1       ],
       ..., 
       [-70.93955167,  43.13545667,  43.        ,   2.        ,
          8.        ,   1.2       ],
       [-70.93955167,  43.13545667,  43.3       ,   2.        ,
          8.        ,   1.2       ],
       [-70.93955167,  43.13545833,  43.5       ,   2.        ,
          8.        ,   1.2       ]])

That's not really what we want. We would like an array for each variable. There is a "unpack" option for loadtxt that will let us assign each of the 6 columns separately.

x,y,z,quality,satellites,hdop = np.loadtxt('2011-10-11.dat', unpack=True)

average(x)
# -70.939601490675187

average(y)
#: 43.135434976022353

min(x)
# -70.939716666699994

max(x)
# -70.939486666700006

Simple plots

That's nice, but can we plot those?

plot(x)

Ugly units, but it does give us a sense of what is going on. It looks pretty random. How about latitude?

plot(y)

Uh oh! Having both on the same plot is not what we want. Time to learn how to manage "figures".

cla() # wipe what is in the "current" figure
figure(1)
plot(x)
figure(2) # make a new figure
plot(y)
figure(3)
plot(x,y)

Can we lable the hair ball?

You bet!

title('GPS wander for 1 day')
xlabel('Longitude')
ylabel('Latitude')

We can also put a dot at each point:

plot(x,y,'ro')  # Yuck!  too many points

Try something nicer - the lines with the average of x and y marked:

cla()
plot(x,y)
annotate('Center', xy = (average(x),average(y)))
plot(average(x),average(y), 'ro')

Using pyproj to calculate distances and directions

pyproj can reproject data, but it also has functions for calculating distances and bearings on great circles. We can use this to plot the direction and distance of the GPS wander during a day in the xy plane. Let's give it a try with two points:

import pyproj

geod = pyproj.Geod(ellps='WGS84')

geod.inv(-70.93957666666667, 43.135415, -70.939585, 43.135403333333336)
# (-152.38532556888342, 27.614668733409243, 1.464169686122718)

The results are the direction from pt 1 to pt 2, the reverse direction, and the distance in meters.

Let's make a script that will use pyproj to give us two lists: the direction and the distance. We will use the average of x and y as our reference point. Call is "wander.py":

import pyproj

def wander_list():

    geod = pyproj.Geod(ellps='WGS84')

    x,y,z,quality,satellites,hdop = np.loadtxt('2011-10-11.dat', unpack=True)

    x_ave = np.average(x)
    y_ave = np.average(y)

    delta_m = []
    delta_dir = []
    for i in range(len(x)):
        d = geod.inv(x_ave,y_ave, x[i], y[i])
        delta_dir.append(d[0])
        delta_m.append(d[2])

    return delta_dir, delta_m

What????

ValueError: undefined inverse geodesic (may be an antipodal point)

If the values are too close together, it can't calculate a direction. We have not talked about exceptions up to this point. We now need to learn a little about them. In python, when things go wrong, well designed code will often "raise" an exception. It is saying "hey! this is what is wrong". The hope is that some of the code above that problem will know what to do. If no part of the program "catches" the exception and handles it, the program has to stop. Here is a little example to use in ipython:

try:
    raise Exception
    # Never get to here
except:
    # Do what is necessary to get things back on track
    print 'Opps!'

Modify the code in your for loop to look like this:

try:
    d = geod.inv(x_ave,y_ave, x[i],y[i])
    delta_dir.append(d[0])
    delta_m.append(d[2])
except:
    delta_dir.append(0) # just pick north
    delta_m.append(0) # No distance

Now it should work!

reload wander
dir, m = wander.wander_list('2011-10-11.gga.dat.bz2')

cla()
plot dir

Errr… dir runs -180 to +180. Let's make that 0..360:

dir360 = [ d if d>0 else d+360 for d in dir ]

Didn't get that last line? I'll explain it more in a video.

Fancier plotting - histograms and subplots

It would be nice to be able to make a figure with multiple plots and maybe have a histogram. Let's try making a histogram first.

cla()
hist(m)
hist(m, bins=30)
hist(m, bins=100)

Which is a reminder that you are warned that histograms are not a stable concept. How they look depends heavily on the number of bins.

Warning: matplotlib is designed to follow how matlab does plotting. That means that subplots start counting from 1, not 0. Bummer.

cla()
subplot (411)
plot(dir360)
plot(x)
subplot (412)
plot(y)
subplot (413)
plot(m)
subplot (414)
hist(m, bins=200)

Author: Kurt Schwehr

Date: <2011-10-20 Thu>

HTML generated by org-mode 7.4 in emacs 23