Serverless site analytics with Clojure nbb and AWS

Since I started this blog, I have been missing some simple analytics. I don't want a cookie banner and, if possible, I don't want to pay, so I took some time to build a homemade solution. It was also a good opportunity to try out @borkdude's new creation, nbb (a Clojure interpreter on Node.js), which is a good fit for easy Clojure-based AWS Lambda functions with no compilation step.
See this blog post for more details on using nbb on AWS Lambda: https://blog.michielborkent.nl/aws-lambda-nbb.html
You can see (and freely use) the final code here: https://github.com/cyppan/simple-site-analytics. The AWS infrastructure is managed as code (well, YAML actually) through the Serverless Framework.
The features I need:
- Track the views per day
- Keep specific counters per "utm_source" (passed as a query parameter when sharing my blog URLs, e.g. "twitter" or "slack")
- Show the top URLs (even if I have only three for now :) )
The AWS components used are:
- A DynamoDB table, SiteStatistics, used to store the view counters per day and URL.
- A Lambda which increments the view counters.
- A Lambda which returns an HTML page showing some statistics about the last seven days.
- Two API Gateway HTTP endpoints proxying to the lambdas (POST /track and GET /dashboard).
Part one: the tracker
Each time a user views a URL, a fetch request calls the /track endpoint with the canonical URL and the utm_source, if any.
The following JavaScript snippet can be added to the website pages:
<script type="text/javascript">
  fetch('https://xxxxxxx.execute-api.eu-west-3.amazonaws.com/track', {
    method: 'post',
    mode: 'cors',
    headers: {"Content-Type": "application/json"},
    body: JSON.stringify({
      url: document.querySelector("link[rel='canonical']").getAttribute("href"),
      utm_source: new URLSearchParams(window.location.search).get("utm_source")
    })
  });
</script>
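The snippet above throws on a page that has no canonical link, because querySelector returns null there. Below is a minimal sketch of a more defensive payload builder. The name `buildTrackPayload` and its arguments are hypothetical (they are not part of the original code), and falling back to the page URL without its query string is only an assumption about the desired behaviour.

```javascript
// Hypothetical helper: builds the JSON body sent to /track.
// canonicalHref: href of <link rel="canonical">, or null when the tag is missing.
// pageHref: the full page URL (window.location.href in the browser).
function buildTrackPayload(canonicalHref, pageHref) {
  const page = new URL(pageHref);
  // Assumption: without a canonical link, use origin + path (no query string, no hash).
  const url = canonicalHref || page.origin + page.pathname;
  // get() returns null when the parameter is absent.
  const utmSource = page.searchParams.get("utm_source");
  return { url: url, utm_source: utmSource };
}

// Browser usage (not executed here):
// const link = document.querySelector("link[rel='canonical']");
// const payload = buildTrackPayload(link && link.getAttribute("href"), window.location.href);
// fetch(trackEndpoint, { method: "post", body: JSON.stringify(payload), ... });
```

Keeping the builder free of any DOM access also makes it easy to check in Node.js before pasting it into a page.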
The corresponding "track" lambda uses the Node.js library "@aws-sdk/client-dynamodb" to create a DynamoDB client and call the "increment-views" function:
(defn increment-views [day url utm-source]
  (.send @dynamo-client
         (dynamo/UpdateItemCommand.
          (clj->js {:TableName "SiteStatistics"
                    :Key {:day {:S day}
                          :url {:S url}}
                    :UpdateExpression (str "ADD #views :increment"
                                           (when (seq utm-source)
                                             (str ", views_" utm-source " :increment")))
                    :ExpressionAttributeNames {"#views" "views"}
                    :ExpressionAttributeValues {":increment" {:N "1"}}
                    :ReturnValues "ALL_NEW"}))))
The design of the SiteStatistics table is pretty simple: the composite key is (day, url) and the columns are the counters (views, views_twitter, views_slack, ...). It suits the read pattern I need (fetch all the URL counters for the last N days).
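Since utm_source comes straight from the query string and ends up inside an attribute name (views_<source>), a value containing a dash or a space would make the update expression invalid. Here is a hedged sketch of one way to guard against that, written in plain JavaScript like the tracker snippet. `buildIncrementParams` is a hypothetical helper, and the allowed character set is an assumption rather than something taken from the repository.

```javascript
// Hypothetical helper mirroring increment-views: it returns the parameters
// for an UpdateItemCommand, validating utm_source and passing the counter
// name through ExpressionAttributeNames instead of concatenating it raw.
function buildIncrementParams(day, url, utmSource) {
  const names = { "#views": "views" };
  let expression = "ADD #views :increment";
  // Assumption: only short lowercase alphanumeric/underscore sources get a counter.
  if (typeof utmSource === "string" && /^[a-z0-9_]{1,32}$/.test(utmSource)) {
    names["#source"] = "views_" + utmSource;
    expression += ", #source :increment";
  }
  return {
    TableName: "SiteStatistics",
    Key: { day: { S: day }, url: { S: url } },
    UpdateExpression: expression,
    ExpressionAttributeNames: names,
    ExpressionAttributeValues: { ":increment": { N: "1" } },
    ReturnValues: "ALL_NEW",
  };
}
```

The "#source" placeholder is only declared when it is used, because DynamoDB rejects attribute names that are declared but never referenced in the expression.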
Part two: the dashboard
The other lambda, "dashboard", is meant to be opened in the browser and displays an HTML view of the statistics. There is a bit more code involved in order to generate the view. For the styling I use Bulma CSS, which is a real time saver.
The first thing the lambda does is fetch all the items from the SiteStatistics DynamoDB table for the last 7 days, named stat-rows in the code, e.g. [{:views 4 :views_slack 2 :day "2022-02-01" :url "https://url.com/page"} ,,,]
(defn fetch-last-7-days-statistics
  "returns [{day url views}]"
  []
  (p/let [items (js/Promise.all
                 (for [day (last-7-days)]
                   (p/let [resp (.send @dynamo-client
                                       (dynamo/QueryCommand.
                                        (clj->js {:TableName "SiteStatistics"
                                                  :KeyConditionExpression "#day = :day"
                                                  :ExpressionAttributeNames {"#day" "day"}
                                                  :ExpressionAttributeValues {":day" {:S day}}})))
                           resp (js->clj resp :keywordize-keys true)
                           items (->> (:Items resp)
                                      (map -parse-dynamo-item))]
                     (or (seq items) [{:day day :views 0}]))))]
    (into [] cat items)))
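The function above relies on two helpers that are not shown here, last-7-days and -parse-dynamo-item. As an illustration only, this is roughly what they might do, sketched in plain JavaScript. Both function names below are hypothetical, and the UTC "YYYY-MM-DD" day format is an assumption based on the example row above.

```javascript
// Hypothetical stand-in for last-7-days: the 7 most recent UTC days,
// oldest first, formatted as "YYYY-MM-DD".
function lastSevenDays(now) {
  const days = [];
  for (let offset = 6; offset >= 0; offset--) {
    const d = new Date(now.getTime() - offset * 24 * 60 * 60 * 1000);
    days.push(d.toISOString().slice(0, 10));
  }
  return days;
}

// Hypothetical stand-in for -parse-dynamo-item: unwraps DynamoDB's typed
// attribute values ({S: "..."} for strings, {N: "..."} for numbers).
function parseDynamoItem(item) {
  const row = {};
  for (const [name, value] of Object.entries(item)) {
    if (value.N !== undefined) row[name] = Number(value.N);
    else if (value.S !== undefined) row[name] = value.S;
  }
  return row;
}
```

DynamoDB returns numbers as strings inside the N wrapper, which is why the counters need an explicit conversion before they can be summed.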
The page contains three sections.
Section 1: the counter tiles

(defn counter-cards [stat-rows]
  (let [views (reduce + 0 (map :views stat-rows))
        views-slack (reduce + 0 (map :views_slack stat-rows))
        views-twitter (reduce + 0 (map :views_twitter stat-rows))]
    [:nav.level.is-mobile
     [:div.level-item.has-text-centered
      [:div
       [:p.heading "Total views"]
       [:p.title views]]]
     [:div.level-item.has-text-centered
      [:div
       [:p.heading "views from Slack"]
       [:p.title views-slack]]]
     [:div.level-item.has-text-centered
      [:div
       [:p.heading "views from Twitter"]
       [:p.title views-twitter]]]]))
Section 2: the views bar chart

For this one I generate a Vega-Lite specification and use vega-embed to render it:
(defn views-bar-chart [stat-rows]
  (let [data (->> stat-rows
                  (group-by :day)
                  (map (fn [[day rows]]
                         {:day day
                          :views (reduce + 0 (map :views rows))}))
                  (sort-by :day <))
        spec (clj->js {:$schema "https://vega.github.io/schema/vega-lite/v5.json"
                       :data {:values data}
                       :mark {:type "bar"}
                       :width "container"
                       :height 300
                       :encoding {:x {:field "day"
                                      :type "nominal"
                                      :axis {:labelAngle -45}}
                                  :y {:field "views"
                                      :type "quantitative"}}})
        id (str "div-" (.toString (crypto/randomBytes 16) "hex"))
        raw (str "<div id=\"" id "\" style=\"width:100%;height:300px\"></div>"
                 "<script type=\"text/javascript\">"
                 "vegaEmbed ('#" id "', JSON.parse('" (js/JSON.stringify spec) "'));"
                 "</script>")]
    [:div {:dangerouslySetInnerHTML {:__html raw}}]))
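Wrapping the JSON in a single-quoted string for JSON.parse works fine for days and numbers, but it would break if a value ever contained a quote or a backslash. Here is a sketch of an alternative way to build the same markup, in plain JavaScript. `vegaEmbedSnippet` is a hypothetical name, and the sketch assumes the id is already safe to put in an attribute (as the random hex id above is).

```javascript
// Hypothetical helper: returns the <div> + <script> markup for one chart.
// JSON text is a valid JavaScript expression in modern engines (ES2019+),
// so it can be inlined directly without JSON.parse. "<" is escaped so that
// a "</script>" inside the data cannot close the script tag early.
function vegaEmbedSnippet(id, spec) {
  const json = JSON.stringify(spec).replace(/</g, "\\u003c");
  return (
    '<div id="' + id + '" style="width:100%;height:300px"></div>' +
    '<script type="text/javascript">' +
    "vegaEmbed('#" + id + "', " + json + ");" +
    "</script>"
  );
}
```

The escaped "\u003c" sequences are decoded back to "<" by the JavaScript parser, so the chart receives the original values.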
Section 3: the top URLs table

(defn top-urls-table [stat-rows]
  (let [top-urls (->> stat-rows
                      (filter :url)
                      (group-by :url)
                      (map (fn [[url rows]]
                             {:url url
                              :views (reduce + 0 (map :views rows))
                              :views_slack (reduce + 0 (map :views_slack rows))
                              :views_twitter (reduce + 0 (map :views_twitter rows))}))
                      (sort-by :views >))]
    [:table.table.is-fullwidth.is-hoverable.is-striped
     [:thead>tr
      [:th "Rank"]
      [:th "URL"]
      [:th "Views"]
      [:th "Slack"]
      [:th "Twitter"]]
     [:tbody
      (for [[i {:keys [url views views_slack views_twitter]}] (map-indexed vector top-urls)]
        [:tr
         [:th {:style {:width "20px"}} (inc i)]
         [:td [:a {:href url} url]]
         [:td {:style {:width "20px"}} views]
         [:td {:style {:width "20px"}} views_slack]
         [:td {:style {:width "20px"}} views_twitter]])]]))
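For readers less used to threading macros, the aggregation at the top of top-urls-table (filter, group by URL, sum the counters, sort by views) can be sketched in plain JavaScript as follows. `topUrls` is a hypothetical name, and treating a missing counter as 0 is an assumption about how the rows are parsed.

```javascript
// Hypothetical helper: aggregates stat rows per URL and ranks them by views.
function topUrls(statRows) {
  const totals = new Map();
  for (const row of statRows) {
    if (!row.url) continue; // placeholder rows for empty days have no url
    const t = totals.get(row.url) ||
      { url: row.url, views: 0, views_slack: 0, views_twitter: 0 };
    // Assumption: a counter that is absent on a row counts as 0.
    t.views += row.views || 0;
    t.views_slack += row.views_slack || 0;
    t.views_twitter += row.views_twitter || 0;
    totals.set(row.url, t);
  }
  // Highest view count first.
  return [...totals.values()].sort((a, b) => b.views - a.views);
}
```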
That's it! The whole page is generated like this:
(wrap-template
 [:<>
  [:div.box
   (counter-cards stat-rows)
   (views-bar-chart stat-rows)]
  [:div.box
   [:h1.title.is-3 "Top URLs"]
   (top-urls-table stat-rows)]])
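The body of wrap-template is not shown here. As a rough idea of the last step, this is a minimal JavaScript sketch of how rendered markup could be turned into a response for API Gateway. `htmlResponse` is a hypothetical name, and the real template presumably also loads the Bulma stylesheet and the Vega scripts, which are left out of this sketch.

```javascript
// Hypothetical helper: wraps rendered markup in a full HTML document and
// returns it in the response shape a proxied Lambda hands back to API Gateway.
function htmlResponse(bodyHtml) {
  const page =
    '<!DOCTYPE html><html><head><meta charset="utf-8">' +
    "<title>Site analytics</title></head>" +
    "<body>" + bodyHtml + "</body></html>";
  return {
    statusCode: 200,
    headers: { "Content-Type": "text/html; charset=utf-8" },
    body: page,
  };
}
```

Setting the Content-Type header explicitly makes the browser render the dashboard as a page instead of showing the markup as text.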
The final file tree is just:
handlers
├── dashboard.cljs
└── track.cljs
package.json
index.mjs
serverless.yml
package-lock.json
You can find the whole code in the repository linked above, with more information in the README about costs, CORS, and how to develop locally.