Archive for June, 2010
by mithrandi on Jun.18, 2010
People new to event-driven programming in general, and event loops in particular, often seem to have a hard time figuring out how to run their code when it needs to be run. For example, the event loop in Twisted is called the reactor (or perhaps more accurately, the event loop is part of the reactor). The reactor can be started and stopped, but the catch is that the current implementation only allows the reactor to be started once per process. Once you’ve stopped it, it can’t be started again. This is really just an unfortunate consequence of two historical design artifacts: the reactor is a global singleton, so you can only have one, and it doesn’t manage process-global state well enough to be able to start again once it has stopped. While it would be nice for those two issues to be fixed, the reality is that the usual answer to “how do I stop and start the reactor more than once?” is “you don’t need to, and you don’t want to.”
The idea of an event loop is, roughly, “don’t call us, we’ll call you”. When an event occurs, the event loop takes care of dispatching that event to the relevant code in order to handle the event. While there may be some layers of library / framework between your application code and the low-level event handler, ultimately it’ll usually be some part of your application code that gets invoked in response to the event. This means that instead of writing your program in the style of “do A, then do B, then do C”, you write it in a way that looks more like “when X happens, do A; when Y happens, do B; when Z happens, do C”.
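The difference between the two styles can be sketched with a toy dispatcher. This is plain Python, not Twisted’s actual API; the event names and handlers are invented for illustration, but the shape of the code is the same: register handlers up front, then let the loop call you.

```python
# A toy illustration of the event-driven style; a real event loop
# (like Twisted's reactor) dispatches I/O and timer events, but the
# structure is the same: register handlers, then let the loop call you.
handlers = {}

def on(event_name):
    """Register the decorated function as the handler for an event."""
    def register(fn):
        handlers[event_name] = fn
        return fn
    return register

@on("X")
def do_a():
    return "did A"

@on("Y")
def do_b():
    return "did B"

def dispatch(event_name):
    # The loop's job: when an event occurs, look up and invoke
    # the code registered to handle it.
    return handlers[event_name]()
```

Note that nothing here decides *when* `do_a` runs; the dispatcher does, which is exactly the “don’t call us, we’ll call you” inversion.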
So, how do you kick things off? If you fire up an event loop without doing anything else, generally no events will ever occur / be received, and so your program will just run forever doing absolutely nothing. Thus, in the startup phase of your program before you start the event loop, you will need to perform some operations that will cause events to be received later on. For example, an HTTP server would probably want to open a socket and start listening on port 80; alternatively, you might have some time-based event, along the lines of “run this code every 30 seconds”. The startup phase is not the only time that you can set your event sources up; you can also do this in any of your application code that handles other events that occur. For example, your web server might open a connection to a database server when it receives a request for a certain URL.
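In Twisted terms, a minimal sketch of such a startup phase might look like the following. The port number, resource, and 30-second interval are arbitrary choices for illustration (binding port 80 itself would require elevated privileges):

```python
from twisted.internet import reactor, task
from twisted.web.resource import Resource
from twisted.web.server import Site

class Hello(Resource):
    isLeaf = True

    def render_GET(self, request):
        return b"hello"

def every_30_seconds():
    print("periodic maintenance")

# Startup phase: set up the event sources *before* running the loop.
reactor.listenTCP(8080, Site(Hello()))        # network event source
task.LoopingCall(every_30_seconds).start(30)  # time-based event source

# Hand control to the reactor; from here on, our code only runs
# when the reactor dispatches an event to it.
reactor.run()
```

If you deleted the two setup lines and just called `reactor.run()`, the program would sit there forever doing nothing, which is the failure mode described above.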
That’s really all there is to it: figure out which event sources you need to set up initially before you run the event loop, and figure out what other events will occur during the lifetime of your application that would require setting up additional event sources, whatever those sources might be (making connections, starting timers, completion of a prior operation, etc.). While this post may be slightly Twisted-specific, the concepts are pretty general: any decent event loop will necessarily allow both cases (initial setup, and additional operations while the event loop is running). The key is to think of your application as code run in response to events, not just a linear series of operations.
by mithrandi on Jun.07, 2010
As you may have already surmised, Mantissa is designed to host more than one application in the same deployment. Due to the way Mantissa builds on Axiom, it is important to first understand the data model, before starting to understand how the components of your application plug into Mantissa.
A Mantissa deployment, or “node”, consists of a top-level Axiom store called the site store, and various substores contained therein. The substores fall into two categories: user stores and application stores. The site store primarily contains IService powerups for the various services running in the node (think HTTP server, SMTP server, etc.), and the substore items for the user / app stores. It also contains the IRealm implementation for authentication (using twisted.cred, naturally), along with account / login method information for each user account, as well as other configuration data such as API keys (think PayPal API, or similar).
The user stores are where your actual application data will be stored in a typical Mantissa application. Each user has their own Axiom user store (SQLite database), where data for that user is stored. The account / login method items for that particular user are also duplicated in the user store, to allow user stores to be transferred between site stores. Items in different user stores should not interact with each other, except via the interstore messaging API. This forms the core of Mantissa’s scaling model; while each individual user’s data must all be located on the same node, different users may have their data stored on different nodes, allowing you to spread your users across as many nodes as you need to in order to achieve the desired performance.
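As a concrete (and hypothetical) sketch, the application data in a user store is just a set of Axiom items. Assuming Axiom’s Item/attribute API, a “note” item for a user might look like this; the item name and attributes are invented, and in a real deployment the store would be the user’s substore, opened for you by Mantissa rather than created directly:

```python
from axiom.item import Item
from axiom.attributes import text
from axiom.store import Store

class Note(Item):
    # typeName and schemaVersion identify this item type in the store.
    typeName = 'myapp_note'
    schemaVersion = 1

    title = text(allowNone=False)
    body = text()

# Illustration only: a standalone store backed by an SQLite database.
store = Store('/tmp/example-user.axiom')
note = Note(store=store, title=u'hello', body=u'stored in SQLite')
```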
Application stores are a special kind of user store; internally, they are treated as a user store with a username corresponding to the name of the Offering (offerings will be discussed later), and a domain of None. Each application installed on a Mantissa node will have an application store; exactly what to put in there isn’t entirely clear; the best description I can come up with is “application-global data”.
Mantissa has a system called “sharing”, whereby access to items in one user’s store can be granted to other users, or even anonymous public access granted. In the case of “websharing” (which works over HTTP), the items are accessed using a URL that specifies which user’s data is being accessed, as well as the identifier of the particular item. For example, the URL might look like http://myapp.com/mithrandi/timeline or http://mithrandi.myapp.com/timeline (“mithrandi” is the username, “timeline” is the share ID).
For a user-centric application (like Google Docs or Dropbox), this works pretty well; if the data in your application is primarily of global scope, then this may not be a good fit at all. There are some ways around this: firstly, not every “user” needs to be an actual live user. Thus, you can divide your data across “artificial” user stores, and share it to the users that need to have access to it, retaining Mantissa’s scalability in the process. For “global” data access or operations, you can send a message to each user store to perform a certain query or operation, and then collate the results; this model of “send the computation to the data” is similar in some ways to MapReduce.
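Ignoring the asynchronous messaging layer entirely, the “send the computation to the data” pattern reduces to a fan-out step followed by a collate step. Here is a toy, synchronous sketch; the store and query representations are invented for illustration, and the real thing would send the query via the interstore messaging API and collate the results asynchronously:

```python
# Toy model: each "user store" is just a list of records, and a
# "query" is a function run inside each store.
def global_query(user_stores, query):
    collated = []
    for store in user_stores:
        # "Send the computation to the data": the query runs against
        # each store individually...
        collated.extend(query(store))
    # ...and the caller collates the per-store results.
    return sorted(collated)

stores = [
    [{"user": "alice", "score": 3}],
    [{"user": "bob", "score": 7}, {"user": "carol", "score": 5}],
]
top_scores = global_query(
    stores, lambda s: [r["score"] for r in s if r["score"] > 4])
```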
However, if your application truly does not fit into the model, then one possibility is to store the application data in a completely separate service, using Mantissa only for configuration and authorization. The fact that Mantissa stores certain data in Axiom does not prevent you from making queries to an external PostgreSQL database, or Cassandra database, or web service, or whatever else you can come up with; it just means that you derive no specific benefit from Mantissa’s database functionality.
Another serious caveat is that while the conceptual model and APIs for Mantissa scalability are mostly in place, the actual implementation of scaling is not there at all. There is currently no networked implementation of the interstore messaging API, and there are no tools for managing a cluster of Mantissa nodes, for distributing or migrating user stores, for backup/replication of user stores, and so on. The foundation for building these tools and components exists, but at the moment, you’ll need to build them yourself (or pay someone to do it for you) if you need them. Thus, if you are looking for “Facebook size” scalability straight out of the box, Mantissa is probably not for you. On the other hand, depending on your needs, building the tools you need as you grow your application may not pose a significant challenge, so if you’re not afraid of writing a little code, don’t be scared off right away by this. You could also avoid relying on Mantissa for scalability by storing your application data in another datastore that already meets your scalability needs.
Next up: Offerings, or “How do I expose my application to Mantissa?”.
by mithrandi on Jun.07, 2010
“Application server” is a somewhat fuzzy term that seems to mean different things to different people; but in essence, Mantissa serves as a layer that joins your actual application code to the various ways (different protocols, for example) that your application may be accessed. Or, to put it another way, when writing a Mantissa application, the deployment target is Mantissa, not “a web server”, even if your application is only going to be accessed from the web. That’s not to say that Mantissa completely abstracts away the details of the protocol, making it impossible to control how your application is presented; rather, it provides common ground to mediate that interaction. It provides mechanisms for the common portions of functionality, such as access control, data storage, authentication and authorization, and also makes provision for defining standard interfaces through which your application is exposed to the various protocols you may wish to make use of.
This series is not going to be of the “build your own wiki in 60 seconds” kind. While there are some quirks and rough patches in Mantissa that may make it less than ideal for hacking an application together really quickly, that is probably still doable; but unfortunately, I’m not the right person to be writing that sort of article. What I hope to do, instead, is provide a more in-depth introduction that allows you to understand and embrace (or reject!) the design philosophy of Mantissa for longer-term projects, while noting incomplete areas and other potential pitfalls to avoid along the way.
This is also not going to be an Axiom, Nevow, Python, or Twisted tutorial; while I will touch on aspects of those components as they relate to Mantissa, you will need to look elsewhere for further detail.
That’s it for now, except to thank Jeremy in passing for nagging me about this until I got around to doing it (or starting it, at least).