The Journey of the Unofficial Brunei Times Archive

Website - wget

Text

wget --mirror --convert-links \
  --adjust-extension \
  --page-requisites \
  --no-parent http://example.org

ePaper

How to get the files?

Live demo (fingers crossed): http://digital.nstp.com.my/

Been down this road before

Download all the things!

mkdir 161001
wget -c "$URL/getZoom.jsp?id=160101bruneitimes&file=Zoom-1.jpg" -O 1.jpg
wget -c "$URL/getZoom.jsp?id=160101bruneitimes&file=Zoom-2.jpg" -O 2.jpg
wget -c "$URL/getZoom.jsp?id=160101bruneitimes&file=Zoom-3.jpg" -O 3.jpg
wget -c "$URL/getZoom.jsp?id=160101bruneitimes&file=Zoom-4.jpg" -O 4.jpg
wget -c "$URL/getZoom.jsp?id=160101bruneitimes&file=Zoom-5.jpg" -O 5.jpg
wget -c "$URL/getZoom.jsp?id=160101bruneitimes&file=Zoom-6.jpg" -O 6.jpg
...
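The repeated commands above lend themselves to a loop. The following is only a sketch: `page_url`, `fetch_issue`, and the `DRY_RUN` switch are hypothetical names, the query layout is copied from the commands above, and the page count is assumed to be known in advance (the slides do not say how it was found).

```shell
# Hypothetical helper: build the URL for one page of one issue.
# Arguments: base URL, issue id (e.g. 160101bruneitimes), page number.
page_url() {
    printf '%s/getZoom.jsp?id=%s&file=Zoom-%s.jpg\n' "$1" "$2" "$3"
}

# Download pages 1..N of an issue into a directory.
# Arguments: base URL, issue id, number of pages (assumed known), target dir.
# Set DRY_RUN=1 to print the wget commands instead of running them.
fetch_issue() {
    base=$1 issue=$2 pages=$3 dir=$4
    if [ "${DRY_RUN:-0}" != 1 ]; then
        mkdir -p "$dir" || return 1
    fi
    page=1
    while [ "$page" -le "$pages" ]; do
        url=$(page_url "$base" "$issue" "$page")
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "wget -c \"$url\" -O \"$dir/$page.jpg\""
        else
            # -c resumes a partial download if the connection dropped earlier
            wget -c "$url" -O "$dir/$page.jpg" || return 1
        fi
        page=$((page + 1))
    done
}
```

For example, `DRY_RUN=1 fetch_issue "$URL" 160101bruneitimes 6 160101` would print six wget commands without touching the network.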

zip -r 2016_10.zip 1610*

Offload a backup!

Bandwidth + Quota

telegram-cli

send_document <peer> <file>

If the peer name contains a space, replace it with an underscore (_)
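When scripting the upload, the space-to-underscore rule can be applied automatically. A minimal sketch, where `peer_name` is a hypothetical helper and the chat name is made up:

```shell
# Hypothetical helper: turn a display name into the form used as <peer>.
peer_name() {
    printf '%s\n' "$1" | tr ' ' '_'
}

# Compose the command from the slide for a made-up chat name.
echo "send_document $(peer_name 'My Backup Chat') 2016_10.zip"
```

The echoed line can then be typed or piped into a running telegram-cli session.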

./telegram-cli --enable-msg-id

history <peer>

load_document <msg-id>

Where to host?

Google Photos!

Nothing official

We love big red warnings

Still have a chance!

We have upload code!

    "require": {
        "google/apiclient": "^2.0"
    }

...

// Get a service account key from https://console.developers.google.com/ 
putenv('GOOGLE_APPLICATION_CREDENTIALS=./auth.json');

// email address of the user to upload data with
$user_to_impersonate = "EMAIL@DOMAIN.COM"; 

Google Photos!

But.....

Wrong EXIF

Back to the docs!

😞

Time to share!

Back to the drawing board

Right the EXIF wrongs

exiftool '-datetimeoriginal=2016:01:18 12:00:00' \
   -overwrite_original_in_place test.jpg
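To stamp a whole issue at once, the date can be derived from the folder name. This sketch assumes folders are named YYMMDD (as in the download step), that every issue is from the 2000s, and that noon is an acceptable placeholder time, as in the command above. `exif_date` and `stamp_issue` are hypothetical names.

```shell
# Convert a YYMMDD folder name (e.g. 160118) into an EXIF timestamp.
# Assumes the century is 20xx; the time of day is a fixed placeholder.
exif_date() {
    d=$1
    case $d in
        [0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) echo "expected YYMMDD, got: $d" >&2; return 1 ;;
    esac
    yy=${d%????}     # first two characters
    rest=${d#??}     # last four characters (MMDD)
    mm=${rest%??}
    dd=${rest#??}
    printf '20%s:%s:%s 12:00:00\n' "$yy" "$mm" "$dd"
}

# Stamp every JPEG in one issue folder with that folder's date.
stamp_issue() {
    dir=$1
    stamp=$(exif_date "$(basename "$dir")") || return 1
    exiftool "-datetimeoriginal=$stamp" -overwrite_original_in_place "$dir"/*.jpg
}
```

Calling `stamp_issue 160118` would then apply the same exiftool flags as above to every page in that folder.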

Oldie but a goodie

API Docs with Libraries!

Thank you good API

More solutions more problems

API drops: only 1 or 2 months uploaded per session

<!-- status code : 502 -->
<!-- Server Connection Closed -->
<!-- host machine: r11.ycpi.tw1.yahoo.net -->
<!-- timestamp: 1481898655.000 -->
<!-- url: https://api.flickr.com/services/upload/ -->
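One way to live with intermittent 502s is to wrap each upload in a retry with a growing pause. This is a generic sketch, not what the talk used: `retry` is a hypothetical helper, and it assumes the upload command exits non-zero when the API drops the connection.

```shell
# Run a command, retrying on failure with a doubling pause between attempts.
# Usage: retry MAX_ATTEMPTS INITIAL_DELAY_SECONDS command [args...]
retry() {
    max=$1 delay=$2
    shift 2
    attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
        delay=$((delay * 2))
    done
}
```

For example, `retry 5 30 ./upload_month.sh 2016_10` would try up to five times, waiting 30, 60, 120, and 240 seconds between attempts. The script name is made up for illustration.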

More solutions more problems

Out-of-order uploads (drag and drop...)

Internet to the rescue

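  # Reorder the account's photosets in descending title order.
  def order_photosets_by_title
    # Compare b to a for a descending sort; a missing title counts as "".
    ordered_list = load_photoset_list.sort do |a, b|
      (b[:title] || "") <=> (a[:title] || "")
    end
    # flickr.photosets.orderSets expects a comma-separated list of photoset IDs.
    ordered_ids = ordered_list.map { |set| set[:id] }.join(',')
    @flickr.send_request('flickr.photosets.orderSets',
                         { :photoset_ids => ordered_ids }, :post)
  end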

Thank you again good API

To the public!

Thank you yet again good API

Cover view creation!

33MB but... 1125 requests!

css.spritegen.com => 8 requests

Hosting part deux

Lessons

  • A good API is good
  • Server side command line stuff is great
  • Research end to end first!
  • A good API is great
  • Old tech is still good tech
  • Google API servers are fast and robust
  • Flickr's API servers are not so...
  • A good API is fantastic

The Journey of the unofficial Brunei Times Archive

By Timothy Lim
