the little bicycle that could

a quick introduction

pronto is giving bicycles an opportunity like never before - a chance to see all of what seattle has to offer! 

 

  • 500 bikes and 54 stations throughout the city
  • cycle-sharing means the only thing that defines where a bike can go is where the bike is now! 
  • a fee for rides longer than 30 minutes means that bikes get to check in all over the city! 

 

presentation notes: it gets a little math-y here and there. to see the details, click the up and down arrows when you see them! 

meet pronto pete - the little bicycle that could

pete is a pronto bike in seattle.  he has a step-through frame, an adjustable seat post, and a wanderlust for the emerald city! 

pete wants to go everywhere, but there are 3 things he's most excited to do: 

  • see the views from 605' up at the space needle  
  • meet a shark at the seattle aquarium
  • take a quick ride to bc on the victoria clipper (do bikes need a passport?)

a random bike

let's say that any time pete goes for a ride, he is equally likely to go anywhere in the city.

 

  • mathematically, pete is going on a random walk (or in this case, a random bike) 
  • since there are 53 other places for pete to go, the probability that he picks any particular one is 1/53

 

what could pete do with this knowledge? 

  • calculate the average amount of pedaling he has to do in order to see his favorite locations, or even to see the whole city! 

how long will it take pete to go to the aquarium, then the space needle, then the pier if he starts off at the frye museum?

on average, it's going to take pete 53 rides to get to the aquarium from the frye.

if pete wants to go to the space needle next, it'll take an average of 53 more rides to get there. that's 106 total! 

now to finish his trip at the pier, it'll take an average of 53 more rides, for a total of 159 rides just to get to three places! 

what if pete wants to see the whole city and he doesn't mind what order he sees the sights? on average, it'll take him about 248 trips to get around town!

applying the geometric distribution

  • with pete starting at the frye museum, every ride either reaches the aquarium or it doesn't. 
  • each ride is an independent trial with the same probability of success, because we're assuming a uniform random walk.
    • so each trial is a bernoulli trial with parameter p = 1/53

we can use the random variable X to represent the number of bernoulli trials necessary to reach the aquarium, so X follows a geometric distribution with parameter p

thus the expected number of rides necessary to reach the aquarium is 53

the expectation of the three trips is, by linearity, equal to the sum of the three expected values (53 each), or 159

E[X]=\frac{1}{p}=\frac{1}{(1/53)}=53
E[X_1+X_2+X_3]=E[X_1]+E[X_2]+E[X_3]
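the two expectations above can be checked with a quick simulation. this is a minimal python sketch, not the deck's own code: the function names, the seed, and the number of trials are assumptions made for illustration.

```python
import random

def rides_until_arrival(p, rng):
    """Count rides until the first success of a Bernoulli(p) trial."""
    rides = 1
    while rng.random() >= p:  # a ride "fails" with probability 1 - p
        rides += 1
    return rides

def average_rides(p, legs, trials, seed=0):
    """Average total rides over `trials` trips made of `legs` consecutive legs."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    total = 0
    for _ in range(trials):
        total += sum(rides_until_arrival(p, rng) for _ in range(legs))
    return total / trials

p = 1 / 53
one_leg = average_rides(p, legs=1, trials=20000)     # should land near 53
three_legs = average_rides(p, legs=3, trials=20000)  # should land near 159
print(one_leg, three_legs)
```

the averages wobble from run to run, but they should settle near 1/p for one leg and 3/p for three.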

the coupon collector's problem

  • it certainly doesn't sound applicable, but the expected number of trials necessary to visit every node in a uniform random walk is equivalent to the coupon collector's problem. 

  • we know the expected number of steps is the sum of the expected waits for each new station, so for x stations the expected number of steps is 

E[x]=xh_x\text{ where } h_x \text{ is the harmonic number at }x.

  • but with some asymptotic analysis of the harmonic numbers, we can approximate the total number of steps necessary to collect all coupons (that is, visit every station, regardless of order). 

E[x]=x\ln(x)+\gamma x+\frac{1}{2}+O(1/x)
\gamma \approx 0.57722
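to see how close the asymptotic form is to the exact harmonic sum, here is a small python sketch. the function names are made up for illustration, and plugging in 54 stations is an assumption about which count the deck used. both values come out at roughly 247, in line with the figure of about 248 trips quoted earlier.

```python
import math

EULER_GAMMA = 0.5772156649015329  # the constant written as gamma in the slides

def harmonic(n):
    """n-th harmonic number: 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))

def expected_collect_all(n):
    """Exact coupon collector expectation, n * h_n."""
    return n * harmonic(n)

def asymptotic_collect_all(n):
    """Approximation n ln(n) + gamma n + 1/2, dropping the O(1/n) remainder."""
    return n * math.log(n) + EULER_GAMMA * n + 0.5

stations = 54  # assumption: every pronto station counts as a coupon
exact = expected_collect_all(stations)
approx = asymptotic_collect_all(stations)
print(round(exact, 1), round(approx, 1))
```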

but wait, pronto rider, pete's trip depends on you!

pronto pete doesn't travel at random...he goes where you want to go! 

 

instead of having a probability of 1/53 of going from one node to another, we have to include the probability of a rider wanting to take that particular ride.

  • it's not very likely a rider will take a ride for one block
  • it's not very likely a rider will take a ride across the city without checking in somewhere 
  • a rider might borrow a bike just to ride and return to the same place
  • but ... the next place he goes still only depends on where he is at the moment! 

enter math!

we make a matrix that shows all of the "transition" probabilities based on the past year of pronto rentals. that is, the probability of going from one rental station to the next. 

 

a city map demonstrating the probabilities when starting at 3rd and broad is shown to the right.

an interactive version of this transition matrix is available at

http://bit.ly/intmcproba

let's start with pete back at the museum

how long will it take pete to get to the aquarium now? 

 

probability of going directly to the aquarium: 0.0069

 

for every 1,000 rides pete takes from the museum, he ends up at the aquarium about 7 times!

 

the average number of rides it'll take pete to visit the sharks: 32 

reaching a particular state

for a markov chain, an absorbing state j is one that you never leave once entered. that is, the jth entry of the jth row of the transition matrix is equal to 1. we redefine our markov chain to make the desired state absorbing, so once we've reached the station we stop calculating! 

 

call X_i the number of steps necessary to reach the desired station from the ith station.

then we can calculate the expected value of X_i using the following formula:

X_i-\sum_{j=1}^n p_{i,j}X_j=1

reaching a particular state

let X be the vector representing the expected number of steps necessary to reach the absorbing state

then we can factor X and we have

(I-P)X=1

where 1 is the vector of ones. by removing the absorbing state (its row and column) from the matrix we now have a solution for the X vector

this gives the expected number of steps to j from any other station 
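here is a minimal python sketch of that calculation on a tiny example. the 3-station transition matrix is made up for illustration (it is not pronto data), and the helper names are assumptions. it uses plain gauss-jordan elimination so that nothing beyond the standard library is needed.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def expected_steps_to(p, target):
    """Expected number of steps to reach `target` from every other state."""
    others = [i for i in range(len(p)) if i != target]
    # build I - P with the target's row and column removed
    a = [[(1.0 if i == j else 0.0) - p[i][j] for j in others] for i in others]
    x = solve_linear(a, [1.0] * len(others))
    return dict(zip(others, x))

# hypothetical 3-station transition matrix (each row sums to 1)
P = [
    [0.2, 0.5, 0.3],
    [0.4, 0.4, 0.2],
    [0.1, 0.6, 0.3],
]
steps = expected_steps_to(P, target=2)
print(steps)  # expected rides to station 2 from stations 0 and 1
```

as a sanity check, a matrix where every other station is equally likely should give back 1/p, as in the random-bike case.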

pete doesn't care what order - he wants to see things

how long will it take to travel to three locations if it doesn't matter what order pete visits them in? 

 

this is a much more difficult thing for pete to calculate. imagine pronto only has 3 stops, the frye museum, the aquarium, and pier 69. 

 

if we start from pier 69, the trip is so unlikely that pete will probably be biking forever before he sees all three

simplifying formal language theory

an alphabet is a finite set of symbols

a word is a finite sequence of symbols over an alphabet

the kleene closure of an alphabet is the set of all words of any finite length over it, including the empty word. there are infinitely many such words

\text{let } L \text{ be an alphabet defined by } L=\{a,b,c\}
\text{then } W=aabcab \text{ is a word over }L.
\text{and the Kleene closure over } L \text{ is:}
L^*=\big\{\varepsilon, a, b, c, aa, ab, ac, ba, bb, bc, ca, cb, cc, aba, abc,...\big\}
\text{where } \varepsilon \text{ is the empty word}
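the closure itself is infinite, but its shortest words are easy to list. this is a minimal python sketch; the function name and the length cutoff are assumptions for illustration.

```python
from itertools import product

def kleene_prefix(alphabet, max_length):
    """All words over `alphabet` with length 0..max_length, shortest first."""
    words = []
    for length in range(max_length + 1):
        for letters in product(alphabet, repeat=length):
            words.append("".join(letters))  # length 0 gives the empty word ""
    return words

words = kleene_prefix("abc", 2)
print(words)
```

with three symbols there are 1 + 3 + 9 = 13 words of length two or less, and the count triples with every extra letter allowed.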

a language theoretic approach to solving the problem

While the problem of reaching a vertex is solvable using the geometric distribution when restricted to two vertices, the problem becomes significantly more complex when considering the case of three vertices. 

 

Consider the structure shown above with the probabilities as defined. 

making an alphabet

identifying all possible paths becomes an interesting task as the number of vertices increases, as bicycles can return to the same location or backtrack along a path. 

 

treat each path that starts at pier 69, visits the frye museum and the seattle aquarium, and terminates once all three have been visited as a word. 

 

the alphabet for the words is 

\{a_1,a_2,a_3,f_1,f_2,f_3,p_1,p_2,p_3\}

forming the language

then there are 4 possible languages starting at the pier and visiting both the aquarium and the museum

(p_1^*p_2f_1^*f_3)^*p_1^*p_3
(p_1^*p_2f_1^*f_3)^*p_1^*p_2f_1^*f_2
(p_1^*p_3a_1^*a_3)^*p_1^*p_2
(p_1^*p_3a_1^*a_3)^*p_1^*p_3a_1^*a_2

by assigning probabilities to each transition and including infinite sums for the kleene closures, we can calculate the expectation

E[x]=\sum_{x=1}^\infty xP(x)

but with these probabilities, the series is divergent

extending this trip to everywhere! 

you'd think the aquarium, the frye museum, and a trip to british columbia would tire pete out - especially with the infinite trips - but he's ready to take a class at uw, take a selfie at the first starbucks, and really see what the city has to offer him. 

 

but calculations here are too big...we run into a lot of infinities...

i think i can (simulate this)

we simulated more than 10000 bikes and recorded the stations each had visited after 

  • 1000 rides 
  • 10000 rides 

to see how many of them got to see all the stations

 

for the 1000-ride simulations, pete made it to all of the stations 0.056% of the time. that's a little more than once every 2000000 rides! 

 

for the 10000-ride simulations, pete made it to all of the stations 0.51% of the time. 

Using R to Simulate Bike Behavior

  • simulations performed in RStudio version 0.99.467 via R version 3.2.0 - 'full of ingredients'
    • begin with A, a vector of j zeros, P, a j x j transition matrix, and N, a maximum number of rides. 
    • choose an initial state i and set the ride counter n to 0. 
      • replace the ith entry in A with a 1.
    • randomly select a next step k based on the row vector of probabilities for the current state, and add 1 to n. 
      • replace the kth entry in A with a 1. 
    • calculate the value of S, the sum of values in A. 
      • if S is equal to 54 (every station visited) then break. else, if n<N, repeat from the random selection
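the steps above can be sketched in a few lines. the original simulations were run in R; this is a python sketch under stated assumptions, not the code behind the numbers in the deck. the 4-station matrix is made up for illustration and the function names are hypothetical.

```python
import random

def simulate_bike(p, start, max_rides, rng):
    """Simulate one bike; return (rides taken, whether every station was seen)."""
    stations = len(p)
    visited = [0] * stations  # the vector A from the steps above
    visited[start] = 1
    current = start
    rides = 0                 # the ride counter n
    while sum(visited) < stations and rides < max_rides:
        # pick the next station using the current station's row of P
        current = rng.choices(range(stations), weights=p[current])[0]
        visited[current] = 1
        rides += 1
    return rides, sum(visited) == stations

def fraction_seeing_everything(p, start, max_rides, bikes, seed=0):
    """Share of simulated bikes that visit every station within max_rides."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatable runs
    hits = sum(simulate_bike(p, start, max_rides, rng)[1] for _ in range(bikes))
    return hits / bikes

# hypothetical 4-station transition matrix (each row sums to 1), not pronto data
P = [
    [0.1, 0.4, 0.3, 0.2],
    [0.3, 0.1, 0.4, 0.2],
    [0.2, 0.3, 0.1, 0.4],
    [0.4, 0.2, 0.3, 0.1],
]
share = fraction_seeing_everything(P, start=0, max_rides=200, bikes=500)
print(share)
```

with only four well-connected stations nearly every bike sees everything. with 54 stations and real rental probabilities, the share drops to the tiny percentages reported above.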

i thought i could

this is the shortest simulation that was built. this exact trip is so unlikely to happen again that pete doesn't even want to think about it!

 

what a trip!

pronto pete is so jealous!

enough of the data wrangling! in order for pete to see the sights, he needs YOU to go out and ride! 

thanks for reading!