<li><a href="#api_get">/get/(user), /get/(user)/(start record)-(end record) - get records for a user</a></li>
<li><a href="#api_info">/info/(user) - Get information about a user</a></li>
<li><a href="#api_tag">/tag/(#|H|@)(tagname) - Retrieve records containing tags</a></li>
+ <li><a href="#api_subscribe">/subscribe/(user) - Subscribe to a user's updates</a></li>
+ <li><a href="#api_unsubscribe">/unsubscribe/(user) - Unsubscribe from a user's updates</a></li>
+ <li><a href="#api_feed">/feed - Get updates for subscribed users</a></li>
+ <li><a href="#api_feedinfo">/feedinfo, /feedinfo/(user) - Get subscription status</a></li>
</ul>
</li>
<li><a href="#design">Design</a>
<li><a href="#motivation">Motivation</a></li>
<li><a href="#web_app_stack">Web App Stack</a></li>
<li><a href="#database">Database</a></li>
+ <li><a href="#subscriptions">Subscriptions</a></li>
<li><a href="#problems">Problems and Future Work</a></li>
</ul>
</li>
<h3><a name="configuring">Configuring</a></h3>
-<p>I know I'm gonna get shit for not using an autoconf-based system, but
-I really didn't want to spend time figuring it out. You should edit
-libs.mk and put in the paths where you can find headers and libraries
-for the above requirements.
+<p>There is now an experimental autoconf build system. If you run
+<code>add-autoconf</code>, it'll do the magic and create a
+<code>configure</code> script that'll do the familiar things. If I ever
+get around to distributing source packages, you should find that this
+has already been done.
+
+<p>If you'd rather stick with the manual system, you should edit libs.mk
+and put in the paths where you can find headers and libraries for the
+above requirements.
<p>Also, further apologies to BSD folks — I've probably committed
several unconscious Linux-isms. It would not surprise me if the
or maybe a 502 Bad Gateway if you have it behind another web server.
<p>All usernames must be 32 characters or less. Usernames must contain
-only the ASCII characters 0-9, A-Z, a-z, underscore (_), period (.),
-hyphen (-), single quote ('), and space ( ). Passwords can be at most
-64 bytes, and have no limits on characters (but beware: if you have a
-null in the middle, it will stop checking there because I use
-<code>strncmp(3)</code> to compare).
+only the ASCII characters 0-9, A-Z, a-z, underscore (_), and hyphen (-).
+Passwords can be at most 64 bytes, and have no limits on characters (but
+beware: if you have a null in the middle, it will stop checking there
+because I use <code>strncmp(3)</code> to compare).
<p>Tags must be 64 characters or less, and can contain only the ASCII
-characters 0-9, A-Z, a-z, hyphen (-), and underscore (_).
+characters 0-9, A-Z, a-z, underscore (_), and hyphen (-).
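+
+<p>The limits above can be checked on the client side before a request is
+ever sent. This is only a rough sketch in Python, not the server's actual
+code: the function names are made up for illustration, and it assumes
+that empty usernames and tags are not allowed, which the rules above
+don't actually say. <code>compared_bytes</code> shows the
+<code>strncmp(3)</code> caveat by returning the part of a password that
+such a comparison would look at.

```python
import re

# Character sets and length limits as stated above. The minimum length of 1
# is an assumption; only the maximums are documented.
USERNAME_RE = re.compile(r"[0-9A-Za-z_-]{1,32}")
TAG_RE = re.compile(r"[0-9A-Za-z_-]{1,64}")
MAX_PASSWORD_BYTES = 64


def valid_username(name: str) -> bool:
    """True if name uses only the allowed ASCII characters and is 1-32 long."""
    return USERNAME_RE.fullmatch(name) is not None


def valid_tag(tag: str) -> bool:
    """True if tag uses only the allowed ASCII characters and is 1-64 long."""
    return TAG_RE.fullmatch(tag) is not None


def compared_bytes(password: bytes) -> bytes:
    """Portion of a password that a strncmp(3)-style comparison examines.

    The comparison stops at the first NUL byte, so anything after it is
    ignored.
    """
    return password[:MAX_PASSWORD_BYTES].split(b"\x00", 1)[0]
```

+<p>For example, <code>valid_username("bob smith")</code> is false now
+that spaces are no longer allowed, and
+<code>compared_bytes(b"abc\x00def")</code> returns only
+<code>b"abc"</code>.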
<h3><a name="api_create">/create</a> - create a new user</h3>
<p>There is currently no support for getting more than 50 tags, but /tag
will probably mutate to work like /get.
+<h3><a name="api_subscribe">/subscribe/(user)</a> - Subscribe to a
+user's updates</h3>
+
+<p>POST to /subscribe/(user) with a <code>username</code> parameter and
+an auth cookie, where (user) is the user whose updates you wish to
+subscribe to. The server will respond with JSON failure if the auth
+cookie is bad or if the user doesn't exist. The server will respond
+with JSON success after the subscription is successfully registered.
+
+<h3><a name="api_unsubscribe">/unsubscribe/(user)</a> - Unsubscribe from
+a user's updates</h3>
+
+<p>Identical to /subscribe, but removes the subscription.
+
+<h3><a name="api_feed">/feed</a> - Get updates for subscribed users</h3>
+
+<p>POST to /feed, with a <code>username</code> parameter and an auth
+cookie. The server will respond with a JSON list of the last 50 updates
+from all subscribed users, in reverse chronological order. Fetching
+/feed resets the new message count returned from /feedinfo.
+
+<p>NOTE: subscription notifications are only stored while a subscription
+is active. Records posted before you subscribe or after you unsubscribe
+will not show up in /feed.
+
+<h3><a name="api_feedinfo">/feedinfo, /feedinfo/(user)</a> - Get subscription
+status for a user</h3>
+
+<p>POST to /feedinfo with a <code>username</code> parameter and an auth
+cookie to get general information about your subscribed feeds.
+Currently, this only tells you how many new records there are since the
+last time /feed was fetched. The server will respond with a JSON
+object:
+
+<pre>
+{"new":3}
+</pre>
+
+<p>POST to /feedinfo/(user) with a <code>username</code> parameter and
+an auth cookie, where (user) is a user whose subscription status you are
+interested in. The server will respond with a simple JSON object:
+
+<pre>
+{"subscribed":true}
+</pre>
+
+<p>The value of "subscribed" will be either true or false depending on
+the subscription status.
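+
+<p>As a rough illustration of the subscription calls, here is a Python
+sketch that builds (but does not send) the POST requests and parses the
+two /feedinfo responses shown above. Several things here are assumptions
+and not part of the API description: the server address, the name of the
+auth cookie, and the use of a form-encoded body for the
+<code>username</code> parameter.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8080"  # hypothetical server address
AUTH_COOKIE_NAME = "auth"           # hypothetical; the cookie name is not given above


def build_request(path, username, auth_token):
    """Build (but do not send) a POST with a username parameter and auth cookie."""
    # Assumes the parameter is sent as an ordinary form-encoded body.
    body = urllib.parse.urlencode({"username": username}).encode("ascii")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        method="POST",
        headers={"Cookie": f"{AUTH_COOKIE_NAME}={auth_token}"},
    )


def subscribe_request(target, username, auth_token):
    """Request asking to subscribe `username` to `target`'s updates."""
    return build_request("/subscribe/" + urllib.parse.quote(target, safe=""),
                         username, auth_token)


def feedinfo_request(username, auth_token, target=None):
    """Request for the general feed status, or the status for one target user."""
    path = "/feedinfo"
    if target is not None:
        path += "/" + urllib.parse.quote(target, safe="")
    return build_request(path, username, auth_token)


def new_count(response_text):
    """Number of new records from a /feedinfo response such as {"new":3}."""
    return json.loads(response_text)["new"]


def is_subscribed(response_text):
    """Subscription flag from a /feedinfo/(user) response."""
    return json.loads(response_text)["subscribed"]
```

+<p>A request built this way could be sent with
+<code>urllib.request.urlopen</code>, and the response body passed to
+<code>new_count</code> or <code>is_subscribed</code>.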
+
<h2><a name="design">Design</a></h2>
<h3><a name="motivation">Motivation</a></h3>
<p>I was impressed by <a
href="http://www.varnish-cache.org/">varnish</a>'s design, so I decided
early in the design process that I'd try out mmaped I/O. Each user in
-Blërg has their own database, which consists of one or more data and
-index files, and a metadata file. When a database is opened, only the
-metadata is actually read (currently a single 64-bit integer keeping
-track of the last record id). The data and index files are memory
+Blërg has their own database, which consists of a metadata file and one
+or more data and index files. The data and index files are memory
mapped, which hopefully makes things more efficient by letting the OS
-handle when to read from disk. The index files are preallocated because
-I believe it's more efficient than writing to it 40 bytes at a time as
-records are added. The database's limits are reasonable:
+handle when to read from disk (or maybe not &mdash; I haven't benchmarked
+it). The index files are preallocated because I believe that's more
+efficient than writing to them 40 bytes at a time as records are added.
+The database's limits are reasonable:
<table class="statistics">
<tr><td>maximum record size</td><td>65535 bytes</td></tr>
and totally unreliable in a crash. But that's the way you want it,
right? :]
+<h3><a name="subscriptions">Subscriptions</a></h3>
+
+<p>When I first started thinking about the idea of subscriptions, I
+immediately came up with the naïve solution: keep a list of the users
+each user is subscribed to, then when you want to get updates, iterate
+over the list and find the last entries for each user. And that would
+work, but it's kind of costly in terms of disk I/O. I have to visit
+each user in the list, retrieve their last few entries, and store them
+somewhere else to be sorted later. And worse, that computation has to
+be done every time a user checks their feed. As the number of users and
+subscriptions grows, that will become a problem.
+
+<p>So instead, I thought about it the other way around. Instead of doing
+all the work when the request is received, Blërg tries to do as much as
+possible by "pushing" updates to subscribed users. You can think of it
+kind of like a mail system. When a user posts new content, a
+notification is "sent" out to each of that user's subscribers. Later,
+when the subscribers want to see what's new, they simply check their
+mailbox. Checking your mailbox is usually a lot more efficient than
+going around and checking everyone's records yourself, even with the
+overhead of the "mailman."
+
+<p>The "mailbox" is a subscription index, which is identical to a tag
+index, but is a per-user construct. When a user posts a new record, a
+subscription index record is written for every subscriber. It's about
+as much I/O as the naïve version above, but the important difference is
+that it's only done once per post, not every time someone checks their
+feed. Retrieving records for accounts
+you're subscribed to is then as simple as reading your subscription
+index and reading the associated records. This is hopefully less I/O
+than the naïve version, since you're reading, at most, as many accounts
+as you have records in the last N entries of your subscription index,
+instead of all of them. And as an added bonus, since subscription index
+records are added as posts are created, the subscription index is
+automatically sorted by time! To support this "mail" architecture, we
+also keep a list of subscribers and subscrib...ees in each account.
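+
+<p>To make the "mailbox" idea concrete, here is a small in-memory model
+of it in Python. It is only a sketch of the approach described above, not
+Blërg's on-disk format or its actual code: the class and method names are
+made up, dictionaries stand in for the index files, and only the
+subscriber list is kept (the reverse list is left out for brevity).

```python
from collections import defaultdict


class Feeds:
    """In-memory model of the push ("mailbox") approach to subscriptions."""

    def __init__(self):
        self.records = defaultdict(list)     # author -> list of record contents
        self.subscribers = defaultdict(set)  # author -> users subscribed to them
        self.mailbox = defaultdict(list)     # user -> [(author, record number), ...]
        self.unread = defaultdict(int)       # user -> entries added since last feed()

    def subscribe(self, user, author):
        self.subscribers[author].add(user)

    def unsubscribe(self, user, author):
        self.subscribers[author].discard(user)

    def post(self, author, content):
        self.records[author].append(content)
        record_number = len(self.records[author]) - 1
        # The "mailman": one index entry per current subscriber, written once.
        for user in self.subscribers[author]:
            self.mailbox[user].append((author, record_number))
            self.unread[user] += 1

    def feed(self, user, limit=50):
        # Entries were appended in posting order, so the tail holds the newest.
        latest = reversed(self.mailbox[user][-limit:])
        self.unread[user] = 0
        return [(author, self.records[author][n]) for author, n in latest]

    def feedinfo(self, user):
        return {"new": self.unread[user]}
```

+<p>In this model, a record posted before <code>subscribe</code> or after
+<code>unsubscribe</code> never reaches the subscriber's mailbox, and
+<code>feed</code> returns the newest entries first without any sorting.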
+
<h3><a name="problems">Problems, Caveats, and Future Work</a></h3>
<p>Blërg probably doesn't actually work like Twitter because I've never