diff --git a/www/doc/index.html b/www/doc/index.html
index 9a85061..9dc8c07 100644
--- a/www/doc/index.html
+++ b/www/doc/index.html
@@ -36,6 +36,10 @@ C.
  • /get/(user), /get/(user)/(start record)-(end record) - get records for a user
  • /info/(user) - Get information about a user
  • /tag/(#|H|@)(tagname) - Retrieve records containing tags
+ • /subscribe/(user) - Subscribe to a user's updates
+ • /unsubscribe/(user) - Unsubscribe from a user's updates
+ • /feed - Get updates for subscribed users
+ • /feedinfo, /feedinfo/(user) - Get subscription status
  • Design

@@ -43,6 +47,7 @@
  • Motivation
  • Web App Stack
  • Database
+ • Subscriptions
  • Problems and Future Work
@@ -172,14 +177,13 @@
 For the HTTP backend, you'll get nothing (since it will have crashed),
 or maybe a 502 Bad Gateway if you have it behind another web server.

 All usernames must be 32 characters or less.  Usernames must contain
-only the ASCII characters 0-9, A-Z, a-z, underscore (_), period (.),
-hyphen (-), single quote ('), and space ( ).  Passwords can be at most
-64 bytes, and have no limits on characters (but beware: if you have a
-null in the middle, it will stop checking there because I use
-strncmp(3) to compare).
+only the ASCII characters 0-9, A-Z, a-z, underscore (_), and hyphen (-).
+Passwords can be at most 64 bytes, and have no limits on characters (but
+beware: if you have a null in the middle, it will stop checking there
+because I use strncmp(3) to compare).

 Tags must be 64 characters or less, and can contain only the ASCII
-characters 0-9, A-Z, a-z, hyphen (-), and underscore (_).
+characters 0-9, A-Z, a-z, underscore (_), and hyphen (-).
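Since the username and tag alphabets above are identical under the amended rules, the checks are easy to state in code. A minimal sketch in C (the function names are illustrative, not Blërg's internals; rejecting empty names is an assumption the text doesn't spell out):

    #include <string.h>

    /* The shared alphabet: 0-9, A-Z, a-z, underscore, hyphen. */
    static int valid_char(char c)
    {
        return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
               (c >= 'a' && c <= 'z') || c == '_' || c == '-';
    }

    static int valid_name(const char *s, size_t max)
    {
        size_t len = strlen(s);
        if (len == 0 || len > max)   /* empty-name rejection is assumed */
            return 0;
        for (size_t i = 0; i < len; i++)
            if (!valid_char(s[i]))
                return 0;
        return 1;
    }

    int valid_username(const char *name) { return valid_name(name, 32); }
    int valid_tag(const char *tag)       { return valid_name(tag, 64); }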

    /create - create a new user

    @@ -282,6 +286,55 @@ extra author field, like so:

There is currently no support for getting more than 50 tags, but /tag will probably mutate to work like /get.

/subscribe/(user) - Subscribe to a user's updates


POST to /subscribe/(user) with a username parameter and an auth cookie, where (user) is the user whose updates you wish to subscribe to. The server will respond with JSON failure if the auth cookie is bad or if the user doesn't exist. The server will respond with JSON success after the subscription is successfully registered.
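As a concrete sketch, here is what that request might look like with libcurl. The host, the cookie name "auth", and the usernames are placeholders for illustration; the real cookie value is whatever /login handed back:

    #include <stdio.h>
    #include <curl/curl.h>

    /* Sketch: subscribe "alice" to bob's updates.  Placeholders only. */
    int main(void)
    {
        curl_global_init(CURL_GLOBAL_DEFAULT);
        CURL *curl = curl_easy_init();
        if (!curl)
            return 1;

        curl_easy_setopt(curl, CURLOPT_URL, "http://blerg.example/subscribe/bob");
        curl_easy_setopt(curl, CURLOPT_POSTFIELDS, "username=alice");
        curl_easy_setopt(curl, CURLOPT_COOKIE, "auth=SESSION-TOKEN");

        CURLcode res = curl_easy_perform(curl);  /* JSON reply goes to stdout */
        if (res != CURLE_OK)
            fprintf(stderr, "subscribe failed: %s\n", curl_easy_strerror(res));

        curl_easy_cleanup(curl);
        curl_global_cleanup();
        return res != CURLE_OK;
    }

An /unsubscribe/(user) call (next section) is identical except for the path.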

/unsubscribe/(user) - Unsubscribe from a user's updates


Identical to /subscribe, but removes the subscription.

    /feed - Get updates for subscribed users


POST to /feed, with a username parameter and an auth cookie. The server will respond with a JSON list of the last 50 updates from all subscribed users, in reverse chronological order. Fetching /feed resets the new message count returned from /feedinfo.

NOTE: subscription notifications are only stored while subscriptions are active. Any records inserted before or after a subscription is active will not show up in /feed.
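A sketch of fetching the feed, under the same placeholder host and cookie as the subscribe example. With no write callback set, libcurl prints the response body (the JSON list of records) to stdout:

    #include <curl/curl.h>

    /* Sketch: fetch alice's feed.  Placeholders only. */
    int main(void)
    {
        curl_global_init(CURL_GLOBAL_DEFAULT);
        CURL *curl = curl_easy_init();
        if (!curl)
            return 1;

        curl_easy_setopt(curl, CURLOPT_URL, "http://blerg.example/feed");
        curl_easy_setopt(curl, CURLOPT_POSTFIELDS, "username=alice");
        curl_easy_setopt(curl, CURLOPT_COOKIE, "auth=SESSION-TOKEN");

        CURLcode res = curl_easy_perform(curl);  /* also resets the new count */
        curl_easy_cleanup(curl);
        curl_global_cleanup();
        return res != CURLE_OK;
    }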

/feedinfo, /feedinfo/(user) - Get subscription status for a user


POST to /feedinfo with a username parameter and an auth cookie to get general information about your subscribed feeds. Currently, this only tells you how many new records there are since the last time /feed was fetched. The server will respond with a JSON object:

{"new":3}

POST to /feedinfo/(user) with a username parameter and an auth cookie, where (user) is a user whose subscription status you are interested in. The server will respond with a simple JSON object:

{"subscribed":true}

The value of "subscribed" will be either true or false depending on the subscription status.
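Since all four subscription endpoints are cookie-authenticated POSTs, a single helper covers them. A sketch, again with placeholder host, cookie, and usernames:

    #include <curl/curl.h>

    /* Sketch: one helper for the cookie-authenticated POST endpoints
     * above.  Placeholders only. */
    static int blerg_post(const char *url, const char *cookie,
                          const char *fields)
    {
        CURL *curl = curl_easy_init();
        if (!curl)
            return -1;
        curl_easy_setopt(curl, CURLOPT_URL, url);
        curl_easy_setopt(curl, CURLOPT_COOKIE, cookie);
        curl_easy_setopt(curl, CURLOPT_POSTFIELDS, fields);
        CURLcode res = curl_easy_perform(curl);  /* reply goes to stdout */
        curl_easy_cleanup(curl);
        return res == CURLE_OK ? 0 : -1;
    }

    int main(void)
    {
        curl_global_init(CURL_GLOBAL_DEFAULT);
        const char *cookie = "auth=SESSION-TOKEN";
        /* Prints something like {"new":3} */
        blerg_post("http://blerg.example/feedinfo", cookie, "username=alice");
        /* Prints something like {"subscribed":true} */
        blerg_post("http://blerg.example/feedinfo/bob", cookie, "username=alice");
        curl_global_cleanup();
        return 0;
    }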

    Design

    Motivation

    @@ -362,14 +415,13 @@ until after I wrote Blërg. :)

 I was impressed by varnish's design, so I decided early in
 the design process that I'd try out mmapped I/O.  Each user in
-Blërg has their own database, which consists of one or more data and
-index files, and a metadata file.  When a database is opened, only the
-metadata is actually read (currently a single 64-bit integer keeping
-track of the last record id).  The data and index files are memory
+Blërg has their own database, which consists of a metadata file, and one
+or more data and index files.  The data and index files are memory
 mapped, which hopefully makes things more efficient by letting the OS
-handle when to read from disk.  The index files are preallocated because
-I believe it's more efficient than writing to it 40 bytes at a time as
-records are added.  The database's limits are reasonable:
+handle when to read from disk (or maybe not &mdash; I haven't benchmarked
+it).  The index files are preallocated because I believe it's more
+efficient than writing to it 40 bytes at a time as records are added.
+The database's limits are reasonable:

@@ -427,6 +479,42 @@
 disk before returning success.  This should make Blërg extremely fast,
 and totally unreliable in a crash.  But that's the way you want it,
 right? :]
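For illustration, a minimal sketch of the mmap-and-preallocate pattern described above. The struct layout, preallocation size, path, and function name are assumptions, not Blërg's actual on-disk format; only the 40-byte index record size comes from the text:

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Illustrative layout: each index record is 40 bytes, per the text. */
    struct index_record {
        unsigned char bytes[40];
    };

    #define INDEX_PREALLOC 4096   /* assumed: records preallocated per file */

    static struct index_record *open_index(const char *path)
    {
        size_t len = sizeof(struct index_record) * INDEX_PREALLOC;
        int fd = open(path, O_RDWR | O_CREAT, 0600);
        if (fd < 0)
            return NULL;

        /* Preallocate the whole file up front, so records are written
         * through the mapping instead of appended 40 bytes at a time. */
        if (ftruncate(fd, (off_t)len) < 0) {
            close(fd);
            return NULL;
        }

        void *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        close(fd);   /* the mapping remains valid after close(2) */
        return map == MAP_FAILED ? NULL : (struct index_record *)map;
    }

    int main(void)
    {
        struct index_record *idx = open_index("/tmp/example.idx");
        return idx == NULL;
    }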

    Subscriptions


When I first started thinking about the idea of subscriptions, I immediately came up with the naïve solution: keep a list of users to which users are subscribed, then when you want to get updates, iterate over the list and find the last entries for each user. And that would work, but it's kind of costly in terms of disk I/O. I have to visit each user in the list, retrieve their last few entries, and store them somewhere else to be sorted later. And worse, that computation has to be done every time a user checks their feed. As the number of users and subscriptions grows, that will become a problem.

So instead, I thought about it the other way around. Instead of doing all the work when the request is received, Blërg tries to do as much as possible by "pushing" updates to subscribed users. You can think of it kind of like a mail system. When a user posts new content, a notification is "sent" out to each of that user's subscribers. Later, when the subscribers want to see what's new, they simply check their mailbox. Checking your mailbox is usually a lot more efficient than going around and checking everyone's records yourself, even with the overhead of the "mailman."

The "mailbox" is a subscription index, which is identical to a tag index, but is a per-user construct. When a user posts a new record, a subscription index record is written for every subscriber. It's a similar amount of I/O as the naïve version above, but the important difference is that it's only done once. Retrieving records for accounts you're subscribed to is then as simple as reading your subscription index and reading the associated records. This is hopefully less I/O than the naïve version, since you're reading, at most, as many accounts as you have records in the last N entries of your subscription index, instead of all of them. And as an added bonus, since subscription index records are added as posts are created, the subscription index is automatically sorted by time! To support this "mail" architecture, we also keep a list of subscribers and subscrib...ees in each account.
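A sketch of that fan-out in C. The types and the subscription_index_append helper are illustrative stand-ins for Blërg's internals, stubbed here so the example runs:

    #include <stdio.h>

    /* Assumed helper: append (author, record id) to one subscriber's
     * subscription index.  In Blërg this would write a 40-byte index
     * record to the subscriber's mmapped index; stubbed as a print. */
    static void subscription_index_append(const char *subscriber,
                                          const char *author,
                                          unsigned long record)
    {
        printf("notify %s: %s posted record %lu\n", subscriber, author, record);
    }

    struct account {
        const char *name;
        const char **subscribers;   /* accounts subscribed to this one */
        int n_subscribers;
    };

    /* On every new post, push one notification per subscriber.  The
     * fan-out cost is paid once, at write time; reading a feed is then
     * a sequential scan of the reader's own subscription index. */
    static void notify_subscribers(const struct account *author,
                                   unsigned long record)
    {
        for (int i = 0; i < author->n_subscribers; i++)
            subscription_index_append(author->subscribers[i],
                                      author->name, record);
    }

    int main(void)
    {
        const char *subs[] = { "alice", "carol" };
        struct account bob = { "bob", subs, 2 };
        notify_subscribers(&bob, 42);
        return 0;
    }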

    Problems, Caveats, and Future Work

    Blërg probably doesn't actually work like Twitter because I've never

maximum record size: 65535 bytes