This blog post is part of my Advanced Jekyll series. Last week I wrote about using Custom Jekyll Plugins with GitHub Pages, which I highly recommend reading when you get a chance.

In many cases, when you’re building websites in Jekyll, those websites become the single source of truth for that information for the outside world (and possibly internally). For example, a careers website would contain all job listings, an events website would have all the events and their state, etc. You get the idea. These are the kind of sites you’d build with Jekyll.

Wouldn’t it be easy (and lots of fun) to expose that data via an endpoint? Surely that’s not possible with Jekyll because static site generators just aren’t capable of that, right?

Well, it is very possible and super simple! And I’m going to show you how you can build an API in Jekyll. 🙂

Create our Jekyll Project

For this tutorial, I am going to use my shell (I use iTerm 2 on macOS) to create a Jekyll website for a “fake” monthly meetup which will list all of our meetups in two formats: on the homepage in HTML and on a separate JSON endpoint.

Run the commands below (everything after the $). If your shell output looks like the listing below, you’re good to head to the next section! For your reference, I use Ruby 2.2.4 for this tutorial.

$ gem install jekyll -v 3.4.0
$ jekyll new meetups
$ cd meetups/
$ ls -a
.            .gitignore   Gemfile      _config.yml  about.md
..           .sass-cache  Gemfile.lock _posts       index.md

Let’s Store Our Data

The website for our meetups isn’t helpful if it doesn’t have information on the meetups themselves. In Jekyll, we can harness a powerful feature for storing meetup details: Jekyll Datafiles.

Jekyll Datafiles helps you store data in a central place (in CSV, JSON or YAML) and access said data easily inside of your Jekyll project. Datafiles live in the _data folder, which we’ll need to create (as it isn’t created by default).

Once we have files created inside of the _data folder, we can start up our Jekyll server and access them through the site.data variable in Liquid.

In this example, I’m going to create a meetup entry for London April 2017.

# _data/london-meetup-april-2017.yml

name: "Random Meetup April 2017"
description: "Come join us for some snacks and chat about random stuff."
city: "London, UK"
datetime: 2017-04-04 20:00:00
venue: "TBD"
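To get a feel for what ends up in site.data, here is a rough Ruby sketch that parses the same file with Ruby's standard YAML library. It is an approximation and not Jekyll's actual loader: the site_data hash is my stand-in for site.data, and I've left the datetime field out so the snippet behaves the same across Ruby versions.

```ruby
require "yaml"

# The contents of _data/london-meetup-april-2017.yml, minus the datetime
# field (timestamps need extra options with some YAML loaders).
yaml_text = <<-YAML
name: "Random Meetup April 2017"
description: "Come join us for some snacks and chat about random stuff."
city: "London, UK"
venue: "TBD"
YAML

meetup = YAML.safe_load(yaml_text)

# Assumption: a stand-in for site.data, keyed by the filename without
# its extension, which is how the entry is addressed from Liquid.
site_data = { "london-meetup-april-2017" => meetup }

puts site_data["london-meetup-april-2017"]["city"]
```

The takeaway is that each data file becomes a plain hash, stored under a key derived from its filename.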

Now we need to list all of our meetups in index.md in the root of our project:

---
layout: home
---

# Random Monthly Meetup

Every month, we host a meetup talking about random stuff. We'll provide some snacks and other goodies. Just come by and say hi!

All of our meetups are below:

{% for entry in site.data %}
  {% assign meetup = entry[1] %}
  <h1>{{ meetup.name }} {{ meetup.city }}</h1>
  <p>{{ meetup.description }}</p>
  <p>When: {{ meetup.datetime | date_to_long_string }}</p>
  <p>Location: {{ meetup.venue }}</p>
  <hr />
{% endfor %}

Pretty cool. Now if we create any other files in our _data folder they will automatically show up on our homepage if they are formatted similarly to our _data/london-meetup-april-2017.yml file.
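If the entry[1] in that loop looks odd, it helps to remember that site.data is a hash, and looping over a hash gives you key/value pairs. Ruby behaves the same way, so here is a small sketch of the idea. The site_data hash is a hand-written stand-in for site.data, not something Jekyll exposes under that name.

```ruby
# Assumption: a tiny stand-in for site.data with a single data file.
site_data = {
  "london-meetup-april-2017" => {
    "name" => "Random Meetup April 2017",
    "city" => "London, UK"
  }
}

# With a single block argument, each entry arrives as a two-element
# [key, value] array, much like `entry` in the Liquid loop above.
titles = []
site_data.each do |entry|
  filename = entry[0] # the data file's name without the extension
  meetup   = entry[1] # the hash parsed from the file
  titles << "#{meetup['name']} #{meetup['city']} (#{filename})"
end

puts titles
```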

However, what if in the future we want to store other data in our meetup website, like a list of all the talks that were given? That’s not possible right now, because our homepage loops over everything in _data and treats every file as a meetup. So we’ll need to restructure our data to serve that purpose.

Structuring our data

With Jekyll Datafiles, you can store your data in any number of folders/files in the _data folder. So, in essence, we can store our meetups in _data/meetups and problem solved, right?

However, I’d go a step further and create more nesting for our meetup data. So, we can have a folder called meetups/, then one for the year, then one for the month, and keep all relevant meetups in that particular folder. Sounds good.

$ mkdir _data/meetups
$ mkdir _data/meetups/2017
$ mkdir _data/meetups/2017/04
$ mv _data/london-meetup-april-2017.yml _data/meetups/2017/04/london-meetup-april-2017.yml

Great! However, in this project, we don’t actually care what the file is called. This hierarchy is just for our benefit; Jekyll is fine with it either way. Now, it makes no sense to repeat ourselves, so let’s cut the month (april), the year (2017) and the word meetup from our .yml filename.

$ cd _data/meetups/2017/04
$ mv london-meetup-april-2017.yml london.yml

Now that folder structure is looking a lot more fun to work with. However, index.md will no longer show any of our meetups. This is because site.data will now be a multi-dimensional hash (a hash inside of a hash) based on our folder structure. You can see what I mean later in the article when we try to create our API endpoint.

This means we’ll need to rename site.data to site.data.meetups.2017.04 in our index.md, which seems quite odd, right? It would only show all of our meetups in April.
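To make the "hash inside of a hash" idea concrete, here is a small Ruby sketch of how a path under _data could map to nested keys. Both helpers (data_keys and nest) are hypothetical names I made up for illustration, and this is a simplification, not how Jekyll builds site.data internally.

```ruby
# Hypothetical helper: turn a path relative to _data into the list of
# keys used to reach that file's contents.
def data_keys(relative_path)
  relative_path.sub(/\.(ya?ml|json|csv)\z/, "").split("/")
end

# Hypothetical helper: wrap a value in one hash per key, outermost first.
def nest(keys, value)
  keys.reverse.inject(value) { |inner, key| { key => inner } }
end

keys = data_keys("meetups/2017/04/london.yml")
site_data = nest(keys, { "city" => "London, UK" })

p keys
puts site_data["meetups"]["2017"]["04"]["london"]["city"]
```

Note that the year and month keys are strings ("2017" and "04"), since they come from folder names.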

So, we’ll need to write multiple nested for-loop statements in our index.md in order to show all of our meetups, not just those specific to a year or month. Let’s do that below:

---
layout: home
---

# Random Monthly Meetup

Every month, we host a meetup talking about random stuff. We'll provide some snacks and other goodies. Just come by and say hi!

All of our meetups are below:

{% for year in site.data.meetups %}
  {% assign months = year[1] %}
  {% for month in months %}
    {% assign meetups = month[1] %}
    {% for meetup_data in meetups %}
      {% assign meetup = meetup_data[1] %}
      <h1>{{ meetup.name }} {{ meetup.city }}</h1>
      <p>{{ meetup.description }}</p>
      <p>When: {{ meetup.datetime | date_to_long_string }}</p>
      <p>Location: {{ meetup.venue }}</p>
      <hr />
    {% endfor %}
  {% endfor %}
{% endfor %}

The above code runs a for-loop for each folder structure we have. As you can see, we have three folders in _data (which are meetups, :year and :month) and we have a for-loop statement for each level of hierarchy. That’s great. However, this is really ugly code. We need to simplify this logic, which I’ll do later.
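For comparison, here is roughly the same traversal written in plain Ruby. The meetups_by_date hash is a hand-written stand-in for site.data.meetups, and the Paris entry is made up purely so there is more than one month to walk through.

```ruby
# Assumption: a hand-written stand-in for site.data.meetups. The second
# entry is invented for illustration only.
meetups_by_date = {
  "2017" => {
    "04" => {
      "london" => { "name" => "Random Meetup April 2017", "city" => "London, UK" }
    },
    "05" => {
      "paris" => { "name" => "Random Meetup May 2017", "city" => "Paris, FR" }
    }
  }
}

headings = []
meetups_by_date.each_value do |months|  # year level
  months.each_value do |meetups|        # month level
    meetups.each_value do |meetup|      # one hash per data file
      headings << "#{meetup['name']} #{meetup['city']}"
    end
  end
end

puts headings
```

The shape of the code is tied to the depth of the folders: add another level of nesting and you need another loop.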

Creating our API endpoint

So, let’s go ahead and create a REST API endpoint that will expose all of our meetups in a lovely JSON format. Let’s go ahead and create a meetups.json file in our root folder.

---
---

{{ site.data.meetups | jsonify }}

That’s it, we have a JSON endpoint that lives at /meetups.json in our site!

What’s great is that Jekyll will expose Liquid templating language to any file in your project as long as it has the YAML Front Matter (the two lines of --- at the top, basically).

So, what does that exactly output? See below.

{
  "2017": {
    "04": {
      "london": {
        "name": "Random Meetup April 2017",
        "description": "Come join us for some snacks and chat about random stuff.",
        "city": "London, UK",
        "datetime": "2017-04-04 20:00:00",
        "venue": "TBD"
      }
    }
  }
}

That’s fairly nice, right? This could be published right now and it’ll work just fine. But it’s time to add some polish to make it look 💯!
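One reason to polish: with the nested output, a client has to know the year and month keys up front, or dig through every level to find the meetups. Here is a sketch of what consuming that payload might look like in Ruby. The JSON is pasted in as a string (a trimmed copy of the output above) so it runs without a network request.

```ruby
require "json"

# A trimmed copy of the body served at /meetups.json.
body = <<-JSON
{
  "2017": {
    "04": {
      "london": {
        "name": "Random Meetup April 2017",
        "city": "London, UK",
        "venue": "TBD"
      }
    }
  }
}
JSON

payload = JSON.parse(body)

# Reaching one meetup means spelling out every level of the hierarchy.
london = payload.fetch("2017").fetch("04").fetch("london")
puts london["name"]

# Listing every meetup key means peeling off the year and month layers.
slugs = payload.values.flat_map(&:values).flat_map(&:keys)
p slugs
```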

Writing A Jekyll Plugin To Polish Our API

So, here, we’ll want to look at creating a Jekyll Plugin. I’ll go more into Plugins in future blog posts, but for now, I’m going to just create a Liquid Filter. You can read more on plugins here.

We’ll need multiple custom filters for a consistent, well structured API:

  • Having all of our events as one array instead of a nested hash.
  • Being able to filter what columns are accessible in our API
  • Being able to wrap our API in an easy-to-follow format

So, I’ve gone ahead and written a custom Jekyll filter called ApiFilter, saved in _plugins/filters/api_filter.rb, that implements these rules:

# _plugins/filters/api_filter.rb

# Date.today (used in wrap_with_key) lives in Ruby's date library.
require "date"

module Jekyll
  module Filters
    module ApiFilter
      # Takes a multi-dimensional hash and converts it into
      # a flattened array of hashes.
      def flatten_hash(input)
        all_values = input.to_a.flatten
        hash_values = all_values.select { |value| value.is_a?(Hash) }
        most_nested_values = []

        if hash_values.count > 0
          hash_values.each do |hash_value|
            most_nested_values << flatten_hash(hash_value)
          end

          most_nested_values.flatten
        else
          return input
        end
      end

      # Takes an array of hashes and keeps only the
      # columns that have been explicitly approved.
      def filter_fields(input, fields)
        downcased_fields = fields
          .split(",")
          .map { |field| field.strip.downcase }

        input.map do |entry|
          entry.select do |key, value|
            downcased_fields.include?(key.downcase)
          end
        end
      end

      # Wraps the input in a hash under the given key,
      # along with a status and a timestamp.
      def wrap_with_key(input, key)
        {
          key => input,
          :status => "OK",
          :last_updated => Date.today
        }
      end
    end
  end
end

Liquid::Template.register_filter(Jekyll::Filters::ApiFilter)

Now, if we’re running a Jekyll server, we’ll need to restart that server for our plugin to work. Jekyll is smart enough to know that we’ve added a plugin and will autoload it upon running $ jekyll serve.

Let’s apply these filters to our meetups.json file in our Jekyll project:

---
---

{{ site.data.meetups |
   flatten_hash |
   filter_fields: 'name,description,city,datetime' |
   wrap_with_key: 'meetups' |
   jsonify }}

And the JSON output over at /meetups.json on our site will now be:

{
  "meetups": [{
    "name": "Random Meetup April 2017",
    "description": "Come join us for some snacks and chat about random stuff.",
    "city": "London, UK",
    "datetime": "2017-04-04 20:00:00"
  }],
  "last_updated": "2017-01-30",
  "status": "OK"
}

Pretty cool, right? Our data looks different and a lot more predictable now:

  • It is a flat array that is wrapped in the meetups key
  • We chose not to show venue which is no longer exposed
  • It shows us when the file was last_updated which is helpful
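If you want to poke at this chain without starting Jekyll, here is a sketch you can run with plain Ruby. ApiFilterSketch is a condensed restatement of the three filters that I wrote for this purpose, not the plugin file itself; in particular, its flatten_hash is a simpler variant that only looks at hash values and always returns an array.

```ruby
require "date"
require "json"

# Condensed restatement of the three filters, for running outside Jekyll.
module ApiFilterSketch
  module_function

  # Collects the innermost hashes (those with no hash values) into an array.
  def flatten_hash(input)
    nested = input.values.select { |value| value.is_a?(Hash) }
    return [input] if nested.empty?
    nested.flat_map { |value| flatten_hash(value) }
  end

  # Keeps only the approved, comma-separated field names in each entry.
  def filter_fields(input, fields)
    allowed = fields.split(",").map { |field| field.strip.downcase }
    input.map { |entry| entry.select { |key, _| allowed.include?(key.downcase) } }
  end

  # Wraps the input under the given key with a status and a date.
  def wrap_with_key(input, key)
    { key => input, "status" => "OK", "last_updated" => Date.today.to_s }
  end
end

data = { "2017" => { "04" => { "london" => {
  "name"  => "Random Meetup April 2017",
  "city"  => "London, UK",
  "venue" => "TBD"
} } } }

# Liquid pipes read left to right; here each step feeds the next one.
flat     = ApiFilterSketch.flatten_hash(data)
filtered = ApiFilterSketch.filter_fields(flat, "name, city")
payload  = ApiFilterSketch.wrap_with_key(filtered, "meetups")

puts JSON.pretty_generate(payload)
```

Each Liquid pipe simply passes the previous result in as the first argument of the next filter, which is why the three methods compose so cleanly.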

P.S. We can now go back to our index.md and make that look pretty:

---
layout: home
---

# Random Monthly Meetup

Every month, we host a meetup talking about random stuff. We'll provide some snacks and other goodies. Just come by and say hi!

All of our meetups are below:

{% assign meetups = site.data.meetups | flatten_hash %}
{% for meetup in meetups %}
<h1>{{ meetup.name }} {{ meetup.city }}</h1>
<p>{{ meetup.description }}</p>
<p>When: {{ meetup.datetime | date_to_long_string }}</p>
<p>Location: {{ meetup.venue }}</p>
<hr />
{% endfor %}

The power of code, right? I love it - things are just so simple!

Renaming the API endpoint

I’m just about ready to publish my API endpoint, but I’m not particularly happy about the location of the endpoint. It’s on the root of my website and it’s called meetups.json. I’d like for it to be in an api/ folder.

Technically, I could just create a folder called api/ in my Jekyll project and move my meetups.json into that folder. That’ll work just fine. But there’s an easier way if we just add the following line to the top of our meetups.json:

---
permalink: "api/meetups.json"
---

...

Amazing. We now have an automatically generated API, in a structure we like and at a URL endpoint that we like. There’s a lot more we can do here, but this is a great starting point.

Conclusion

I would love to see more companies use the power of Jekyll across their websites. Like many other developers, I love playing around with data and I’d recommend looking into building JSON endpoints even in your static sites. It’ll probably help your own developers a bunch, too.

Have any questions? Feel free to reach out - I would love to hear from you! 🙃