
Getting turn notifications on your Garmin Instinct courses

It turns out the Garmin Instinct can notify you when a turn in your course is coming up.

It took considerable fiddling to work out how to make this happen consistently, and Garmin doesn’t provide any instructions, so I figured I’d explain the process here.

TL;DR: Use a third-party service to create the course, export it as a TCX file and then import it into Garmin Connect.

If you create the course with Garmin Connect, these turn notifications don’t show up. Instead, you need to create the course in another service and then export it as a TCX file. If you don’t want the notifications (they can be annoying on trails with lots of switchbacks, or if you already know the area well), then use a GPX file instead.

TCX export is available from Strava (free version), AllTrails (paid version), and RideWithGPS (paid version).

The final step is to import the file into Garmin Connect. You can do this either by opening the TCX file on your phone and selecting the Garmin Connect app, or by importing it in the web version.

Then, to actually get the course onto your watch, select it in the app and choose “Send to Device”.

It’s a bit of a confusing process to learn, but having these turn directions is really useful when running or biking in a new location. One thing I haven’t worked out is whether there is any way to silence these notifications from your watch. Sometimes if the trail is winding they can get annoying. Let me know if you’ve got an answer to that!


Big data for the little guy

Building a serverless data warehouse on AWS with Firehose, Glue and Athena

Motivation

I work at a company with fewer than 10 people on the engineering team. Everyone wears multiple hats and is responsible for their piece of the product end to end. There are no BAs, QAs, DBAs or DevOps engineers. There is no fence to throw anything over.

This presented a challenge when we determined that the aggregate data we were getting from Google Analytics/Firebase was not sufficient and that we needed a new system to track user engagement. It needed to collect large volumes of data from all of our products and expose it for both ad-hoc querying and BI reporting – a data warehouse.

Since we use AWS, the obvious approach would have been Redshift. However, it’s expensive, and realistically the data we needed to store was going to be on the order of gigabytes per year. Certainly not the terabytes or petabytes Redshift boasts of being able to handle. This was late 2018, so Athena and Glue had been released, but not Lake Formation. I was vaguely aware that these services existed, but hadn’t really worked out what they might be useful for. To be fair, the AWS FAQs don’t exactly clarify things:

Q: What can I do with Amazon Athena?

Amazon Athena helps you analyze data stored in Amazon S3. You can use Athena to run ad-hoc queries using ANSI SQL, without the need to aggregate or load the data into Athena. Amazon Athena can process unstructured, semi-structured, and structured data sets. Examples include CSV, JSON, Avro or columnar data formats such as Apache Parquet and Apache ORC…

https://aws.amazon.com/athena/faqs/?nc=sn&loc=6

Not the most motivating bit of sales copy.

Two things helped it all fall into place for me:

  1. Reading the book Designing Data-Intensive Applications. This gives a contemporary overview of the database ecosystem and underlying technologies. It helped me understand the Parquet file format (a compressed columnar alternative to CSV/JSON), that Athena is really just AWS hosting the open source Presto query engine for you, and that the AWS Glue Data Catalog is essentially a managed Apache Hive metastore.
  2. Reading the engineering blog posts of a few big tech companies (Netflix, Airbnb, Uber). It turns out even these big players are not using commercial proprietary data warehouse systems, but are instead composing their own from similar sets of open source technologies for streaming, storage and querying.

Implementation

The answers were all there in the AWS docs if you dug hard enough, but it took a while to work out how to combine the various services. We got there eventually and now have a cheap, low-maintenance and simple ‘data warehouse for the little guy’. The moving parts of our solution are:

  1. A simple custom API that accepts requests from our web applications and mobile apps, adds some additional data (a timestamp and a unique UUID) and sends this JSON on to AWS Firehose as a base64-encoded record (a sketch of this call follows this list).
  2. An AWS Kinesis Firehose delivery stream which is configured to:
    1. Match the JSON against a specific Glue schema
    2. Convert the record to Apache Parquet format
    3. Buffer 15 minutes’ worth of events and then write them all to a specific S3 bucket in a year/month/day/hour folder structure.
  3. AWS Glue configured with a database, table and columns to match the format of events being sent. Additionally, a Glue crawler is configured to run several times a day to discover the new folders being added to the S3 bucket and to update the table with this partition metadata.
  4. AWS Athena to query the database as defined in Glue.
  5. External SQL query tools and Microsoft Power BI also query Athena using the ODBC driver.
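
To make step 1 concrete, here is a minimal sketch of the ingestion call, assuming boto3 and a hypothetical delivery stream named ‘events-stream’ (our real API does the same enrichment, just with more fields):

```python
# A minimal sketch of step 1. The stream name and event fields here
# are illustrative, not our exact implementation.
import json
import time
import uuid

import boto3

firehose = boto3.client("firehose")

def track_event(event_type: str, payload: dict) -> None:
    """Enrich a raw event and hand it to the Firehose delivery stream."""
    record = {
        "id": str(uuid.uuid4()),        # unique ID added server-side
        "timestamp": int(time.time()),  # epoch seconds
        "event_type": event_type,
        **payload,
    }
    # boto3 base64-encodes the Data blob on the wire; Firehose then matches
    # the JSON against the Glue schema, converts it to Parquet and buffers
    # it before writing to S3. The trailing newline keeps records separable.
    firehose.put_record(
        DeliveryStreamName="events-stream",  # hypothetical name
        Record={"Data": (json.dumps(record) + "\n").encode("utf-8")},
    )

track_event("page_view", {"user_id": "u-123", "path": "/pricing"})
```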

At the time of writing, we have 6 GB of Parquet files stored in S3 using this system. That’s ~130 million events. The monthly costs to run this are wonderfully low:

  • S3 storage – $0.80
  • Athena – $0.30 (scales with the amount of data your queries scan; see the query sketch after this list)
  • Glue – $0.50
  • Kinesis Firehose – $0.50
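
Since Athena bills by data scanned, the main lever for keeping that line item low is filtering on the partition columns the crawler maintains. Here is a sketch of a partition-pruned query, assuming boto3 and hypothetical database, table and results-bucket names (in practice the partition column names depend on how the crawler is configured):

```python
# A sketch of a partition-pruned Athena query via boto3. Database, table,
# partition column names and the results bucket are hypothetical.
import time

import boto3

athena = boto3.client("athena")

# Filtering on the partition columns means Athena only scans the Parquet
# files under the matching year/month S3 folders, not the whole table.
QUERY = """
SELECT event_type, count(*) AS events
FROM analytics.events
WHERE year = '2019' AND month = '06'
GROUP BY event_type
"""

query_id = athena.start_query_execution(
    QueryString=QUERY,
    QueryExecutionContext={"Database": "analytics"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)["QueryExecutionId"]

# Athena is asynchronous: poll until the query reaches a terminal state.
while True:
    state = athena.get_query_execution(QueryExecutionId=query_id)[
        "QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if state != "SUCCEEDED":
    raise RuntimeError(f"Query finished in state {state}")

# The first row of the result set is the header, so skip it.
rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
for row in rows[1:]:
    print([col.get("VarCharValue") for col in row["Data"]])
```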

Gotchas

  • Modifying the schema in Glue (e.g. adding a new column) is not automatically picked up by Firehose. You need to modify the stream config to use the new schema version.
  • The Glue crawlers have a number of configuration options. It turns out the settings you want, if you wish to stay in full control of the schema and not have deleted columns reappear, are (these map to the crawler’s SchemaChangePolicy; see the sketch after this list):
    • Schema updates in the data store: Ignore the change and don’t update the table in the data catalog.
    • Object deletion in the data store: Ignore the change and don’t update the table in the data catalog.
  • I read that Parquet files perform best when they are ~1 GB in size. Ours are far smaller than that, but to get as close as possible we have the buffer settings in our Firehose stream at the maximum. One consequence is that when testing new events you have to remember to wait for the buffer to flush and run the Glue crawler before trying to query the data.
  • A big one: Athena is a read-only service. If you want to modify or delete data, you need to work with the underlying files in S3. Our data is append-only so this hasn’t been much of an issue, but it is definitely something to consider.
  • There is a one-to-one relationship between a Firehose stream and an AWS Glue table.
  • At some point you or an upstream event producer will send through bad data.
    • Firehose emits a FailedConversion.Records CloudWatch metric. It’s useful to put some monitoring around this so you can react quickly (see the alarm sketch after this list).
    • If an event does not match the configured schema, Firehose will put it in an S3 folder named ‘format-conversion-failed’. The raw data will be there in base64-encoded form.
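
On the crawler point, those two console options correspond to ‘LOG’ in the API’s SchemaChangePolicy. A minimal sketch, assuming boto3 and a hypothetical crawler name:

```python
# A sketch of pinning down the crawler's schema-change behaviour. 'LOG'
# corresponds to the console's 'Ignore the change and don't update the
# table' options; the crawler name is a hypothetical placeholder.
import boto3

glue = boto3.client("glue")

glue.update_crawler(
    Name="events-crawler",  # hypothetical name
    SchemaChangePolicy={
        "UpdateBehavior": "LOG",  # don't apply schema changes from the data
        "DeleteBehavior": "LOG",  # don't remove columns deleted upstream
    },
)
```

And for the bad-data point, one way to get alerted quickly is a CloudWatch alarm on that metric. Again a sketch, reusing the hypothetical stream name and a placeholder SNS topic ARN:

```python
# A sketch of an alarm on failed Firehose format conversions.
import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_alarm(
    AlarmName="firehose-failed-conversions",
    Namespace="AWS/Firehose",
    MetricName="FailedConversion.Records",
    Dimensions=[{"Name": "DeliveryStreamName", "Value": "events-stream"}],
    Statistic="Sum",
    Period=300,                    # evaluate in 5-minute windows
    EvaluationPeriods=1,
    Threshold=0,
    ComparisonOperator="GreaterThanThreshold",  # any failed record trips it
    TreatMissingData="notBreaching",            # no data means no failures
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:data-alerts"],
)
```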

Future plans

Overall, this system is working really well for us. A few things we’re considering for the future:

  • Investigating the new options in Lake Formation and Glue for ingesting changes directly from RDS databases. This would open up options for transitioning to more of a ‘data lake’ with data coming from multiple places.
  • Experimenting with reading and analyzing data with Databricks/Spark as an alternative to Athena – might be good for more involved queries.
  • Looking at using more complicated data structures such as struct and array columns.
  • Adding a downstream process to query the system and aggregate certain events into a format better suited for quick reporting.

Grand Canyon/Zion Trip

I’ve just returned from a short, but very enjoyable trip down to the Grand Canyon and Zion National Parks.

A few thoughts for anyone organising something similar…

Camping

In the Grand Canyon I stayed at the huge NPS Mather Campground, which was nice enough. However, it sells out quickly, and for those who don’t want to plan so far ahead, or would rather not spend the $$$, I’ve since discovered that there is dispersed camping available in Kaibab National Forest, just outside the park.

Food

On the drive from the Grand Canyon to Zion, Kanab is a good place to stop for food. I would particularly recommend the Kanab Creek Bakery:
http://www.kanabcreekbakery.com/.


Yellowstone Logistics

Here’s a quick summary of how I organised my six day trip to Yellowstone.

  1. Flew Vancouver -> Salt Lake City. Picked up rental car, bought gas + groceries and drove up to near the West entrance of Yellowstone. Slept in the back of the rental off a forest service road.
  2. Entered via West Yellowstone and drove anti-clockwise around the loop, sorting out my backcountry permits in Grant Village. I then continued around the loop to the Cascade Lake 4K5 Trailhead just north of Canyon Village and hiked into my campsite at Wolf Lake.
  3. Hiked back out, checked out the canyon and then drove to Mammoth Hot Springs via Tower Junction. After walking around the hot springs I drove down to the Biscuit Basin parking lot, which was the trailhead for my hike to that night’s campsite at Firehole Falls.
  4. Hiked out and visited a bunch more geysers and hot springs in the Old Faithful area. Continued driving to the DeLacy Creek parking lot where I started walking to my next campsite ‘Bluff Top’ on the shores of Shoshone Lake.
  5. Hiked out and drove out the South Yellowstone gate and through Grand Teton National Park. Had an early dinner in Jackson before continuing south to a free campsite I found on Wikicamps, next to the Salt River – Whitetail Lane Recreation Area, just outside Afton.
  6. Drove down to Park City and did a few hours of mountain biking before driving over the hill to Salt Lake City and flying home.

4 states visited – Utah, Idaho, Montana (briefly), Wyoming