Build artifacts with pex

2018-05-19 08:01

Comments

For Continuous Integration in my Python application (as opposed to library) projects, I generally want to run my tests as well as build a Docker image, making use of the same artifacts and environment for both: I want to test what I'm actually deploying, rather than something merely similar. Previously this meant doing pip install twice; once into a virtualenv to run the tests, then again in the Docker image build. Sharing the pip cache between these steps speeds things up a lot, but this still ends up using quite a bit of time on network roundtrips etc.

Now that pex works with pypy, I have developed a slightly better workflow for this. Briefly speaking, Pex is a tool for assembling a Python application into a single runnable file that embeds the dependencies of the application; at runtime, the dependencies are ziploaded or extracted to a temporary location as necessary to run the application.

The workflow:

Build a pex file.
Run the tests against the pex file.
Copy the pex into the Docker image.

This is similar to what I would do with a language like Go or Haskell that would produce a single executable as part of the build process.

Adventures in deployment, redux

2018-01-19 21:52

Comments

Just a brief update: I've moved on from the hackish deployment strategy described in an older post; we now have Drone for CI building Docker images, pushing them to a registry, and deploying them on Rancher. You can see an example of the Drone pipeline to do this in our GitHub repos.

This is all I'm going to say on the topic, because that's all there really is to it. If you're using Kubernetes, I strongly suggest you look into Weave Flux for deployment; this is eventually where I want to end up, but migrating to Kubernetes is hard for us at the moment.

Running away again

2016-11-10 23:12

me
personal

[I originally posted this on Facebook; I've disabled comments here because I don't think I can cope with them over here on my blog, but if you really need to comment you can find me on Facebook or elsewhere]

I've run away several times in my life.

The very first time was as a teenager; I don't even recall exactly what it was that triggered it, I think it was anxiety over a compulsory school camp that I didn't want to attend. I didn't have a plan, I didn't even have any real intention to actually run away for real, I just felt like I was trapped and needed to escape / lash out somehow. "Running away" amounted to declaring my intent, leaving the house, hiding somewhere nearby while everyone frantically searched for me, and eventually returning a few hours later once I had calmed down a bit. Of course, at the time, I wasn't really capable of fully understanding the situation; I didn't understand the destructive mental paths that lead to being unable to cope, I didn't really care how much emotional stress I was causing to others, and I didn't realise that had I been able to communicate better about how I was feeling, I could probably have dealt with things much better. In short: I was wrong to run away this time. Then again, I was just a teenager, so I think I can be forgiven for failing at those things.

Later on in life, running away took on a more sophisticated form. Once I gained sufficient control over my life (both in the legal and the financial sense), I took steps to cut certain people out of it. I have few regrets about the process (which was staggered, not something I did all at once); some relationships and interactions are simply not healthy or beneficial for the parties involved, and not having to deal with the constant stress of those interactions definitely improved my mental health and my life in general by quite a bit. In short: I think I was right to run away this time. But in the end, it was still running away; I couldn't cope with the situation, so I chose not to cope with it.

And now…recent events surrounding the 2016 US elections are slowly producing that rising sense of panic and inability to cope again. The elections themselves are not the trigger; rather, it is the conversations they have sparked which has lead me to realise certain things about relatives, acquaintainces, and friends. As most of you probably know, I'm no stranger to intellectual debate / conflict; this isn't about that. I can quite happily handle interactions with someone I disagree with on politics, economics, or what have you; most of the time, I'm happy to discuss / argue the matter for hours, unpacking our disagreement into differences in assumptions vs. differences in reasoning, and ending with a better understanding of both sides even if my mind remains unchanged.

But this is different: this is a conflict over whether or not we should care about other people. Caring about other people is something I learned as an incontrovertible truth from childhood, not a selective privilege to be extended to only some and not others. I now realise that many around me do not feel the same way; they feel that some people are not deserving of their care, for whatever reason, and…I don't know how to deal with this.

I feel like running away this time would be a mistake, that I need to stay and engage with these people, attempt to change their minds; but I have no idea how to do this, and I fear the poisoning of either my own mind, or my relationships, as a result of this. I fear that to maintain these relationships as-is, with no condemnation, will result in sending the message that I support something I do not. I fear that such condemnation would inevitably lead to the destruction of those relationships anyway, accomplishing nothing good in the end. I fear that by running away, I am losing the opportunity to do something about all of this. I feel guilty because of my fear; there are others who do not even have the option of running away, others who are the direct target of the uncaring and the hatred, rather than mere bystanders who can leave at any other time. How much worse must their fear be than mine?

And so here I am, trapped in a seemingly endless mental loop, afraid to run and afraid not to run.

Progress on txacme

2016-08-15 17:40

Comments

The initial release of txacme was perhaps a quieter affair than I was hoping for, but the response has been enthusiastic from the few people that did take notice. Work has been progressing on the next release (which will be 0.9.1, I'm not at all ready to pronounce the API fully baked yet), which will have a number of minor improvements such as better documentation, but the headline feature is probably dns-01 challenge support.

There is currently a PR up with a libcloud-based implementation that works [1]; I also have plans for a twistd plugin to run a standalone issuing service, but I'm not yet sure if this will make it into 0.9.1.

Implementing a dns-01 challenge responder revealed some deficiencies in the IResponder interface that made it necessary to expand it; this is exactly the sort of API change that I was expecting to crop up, and one of the reasons why I'm holding off on a 1.0.0 release for the time being. My guess is that I'll make it to at least 0.9.5 or thereabouts before 1.0.0 comes out.

Footnotes

[1]	Well... mostly. libcloud does synchronous network I/O so it needs to be invoked in a thread pool, and some of the DNS drivers are buggy (eg. the Route 53 driver does not format TXT records how Route 53 expects them), but I have a passing integration test using Rackspace Cloud DNS!

WordPress to Nikola

2016-07-24 17:35

Comments

As you may or may not have noticed, my blog looks substantially different now. This is because I have migrated from WordPress to Nikola. Using the WordPress importer worked out reasonably well, although I needed to fix up the code blocks myself. Going forward I'm just going to use the standard ReST format for writing posts.

I'm also hosting the site on S3 with CloudFront in front, rather than a dinky Linode, so the site should be a lot faster now. However, there's probably some stuff broken that I didn't notice; please yell if you see anything.

Adventures in deployment with Propellor, Docker, and Fabric

2015-06-01 22:27

Comments

Preface

After playing around with Docker a bit, I decided that it would make an ideal deployment platform for my work services (previously we were using some ad-hoc isolation using unix users and not much else). While Docker’s security is…suspect…compared to a complete virtualization solution (see Xen), I’m not so much worried about complete isolation between my services, as things like easy resource limits and imaging. You can build this yourself out of cgroups, chroot, etc. but in the end you’re just reinventing the Docker wheel, so I went with Docker instead.

However, Docker by itself is not a complete solution. You still need some way to configure the Docker host, and you also need to build Docker images, so I added Propellor (which I recently discovered) and Fabric to the mix.

Propellor

Propellor is a configuration management system (in the sense of Puppet, Chef, Salt, et al.) written in Haskell, where your configuration itself is Haskell code. For someone coming from a systems administration background, the flexibility and breadth offered by a real programming language like Haskell may be quite daunting, but as a programmer, I find it far more straightforward to just write code that does what I want, extracting common pieces into functions and so on. Our previous incarnation of things used Puppet for configuration management, but it always felt very awkward to work with; another problem is that Puppet was introduced after a bunch of the infrastructure was in place, meaning a lot of things were not actually managed by Puppet because somebody forgot. Propellor was used to configure a new server from scratch, ensuring that nothing was done ad-hoc, and while I won’t go into too much detail about Propellor, I am liking it a lot so far.

The role of Propellor in the new order is to configure things to provide the expected platform. This includes installing Docker, installing admin user accounts, SSH keys, groups, and so on.

Docker

The Docker workflow I adopted is based on the one described by Glyph. I would strongly recommend you go read his excellent post for the long explanation, but the short version is that instead of building a single container image, you instead build three: A “build” container used to produce the built artifacts from your sources (eg. Python wheels, Java/Clojure JARs), a “run” container which is built by installing the artifacts produced by running the “build” container, and thus does not need to contain your toolchain and -dev packages (keeping the size down), and a “base” container which contains the things shared by the “build” and “run” containers, allowing for even more efficiency of disk usage.

While I can’t show the Docker bits for our proprietary codebases, you can see the bits for one of our free software codebases, including instructions for building and running the images. The relative simplicity of the .docker files is no accident; rather than trying to shoehorn any complex build processes into the Docker image build, all of the heavy lifting is done by standard build and install tools (in the case of Documint: apt/dpkg, pip, and setuptools). Following this principal will save you a lot of pain and tears.

Fabric

The steps outlined for building the Docker images are relatively straightforward, but copy/pasting shell command lines from a README into a terminal is still not a great experience. In addition, our developers are typically working from internet connections where downloading multi-megabyte Docker images / packages / etc. is a somewhat tedious experience, and uploading the resulting images is ten times worse (literally ten times worse; my connection at home is 10M down / 1M up ADSL, for example). Rather than doing this locally, this should instead run on one of our servers which has much better connectivity and a stable / well-defined platform configuration (thanks to Propellor). So now the process would be “copy/paste shell command lines from a README into an ssh session” — no thanks. (For comparison, our current deployment processes use some ad-hoc shell scripts lying around on the relevant servers; a bit better than copy/pasting into an ssh session, but not by much.)

At this point, froztbyte reminded me of Fabric (which I knew about previously, but hadn’t thoughto f in this context). So instead I wrote some fairly simple Fabric tasks to automate the process of building new containers, and also deploying them. For final production use, I will probably be setting up a scheduled task that automatically deploys from our “prod” branch (much like our current workflow does), but for testing purposes, we want a deploy to happen whenever somebody merges something into our testing release branch (eventually I’d like to deploy test environments on demand for separate branches, but this poses some challenges which are outside of the scope of this blog post). I could build some automated deployment system triggered by webhooks from BitBucket (where our private source code is hosted), but since everyone with access to merge things into that branch also has direct SSH access to our servers, Fabric was the easiest solution; no need to add another pile of moving parts to the system.

My Fabric tasks look like this (censored slightly to remove hostnames):

 1 @hosts('my.uat.host')
 2 def build_uat_documint():
 3     with settings(warn_only=True):
 4         if run('test -d /srv/build/documint').failed:
 5             run('git clone --quiet -- https://github.com/fusionapp/documint.git /srv/build/documint')
 6     with cd('/srv/build/documint'):
 7         run('git pull --quiet')
 8         run('docker build --tag=fusionapp/documint-base --file=docker/base.docker .')
 9         run('docker build --tag=fusionapp/documint-build --file=docker/build.docker .')
10         run('docker run --rm --tty --interactive --volume=&quot;/srv/build/documint:/application&quot; --volume=&quot;/srv/build/documint/wheelhouse:/wheelhouse&quot; fusionapp/documint-build')
11         run('cp /srv/build/clj-neon/src/target/uberjar/clj-neon-*-standalone.jar bin/clj-neon.jar')
12         run('docker build --tag=fusionapp/documint --file=docker/run.docker .')
13 
14 
15 @hosts('my.uat.host')
16 def deploy_uat_documint():
17     with settings(warn_only=True):
18         run('docker stop --time=30 documint')
19         run('docker rm --volumes --force documint')
20     run('docker run --detach --restart=always --name=documint --publish=8750:8750 fusionapp/documint')

Developers can now deploy a new version of Documint (for example) by simply running fab build_uat_documint deploy_uat_documint. Incidentally, the unit tests are run during the container build (from the .docker file), so deploying a busted code version by accident shouldn’t happen.

Axiom benchmark results on PyPy 2.5.0

2015-03-07 13:22

Comments

This is a followup to a post I made about 1.5 years ago, benchmarking Axiom on PyPy 2.1.0. Not too much has changed in Axiom since then (we fixed two nasty bugs that mainly affected PyPy, but I don’t expect those changes to have had much impact on performance), but PyPy (now at 2.5.0) has had plenty of work done on it since then, so let’s see what that means for Axiom performance!

Unlike my previous post, I’m basically just going to show the results here without much commentary:

Graph of Axiom performance

A few notes:

I didn’t redo the old benchmark results, but the hardware/software I ran the benchmarks on is not significantly different, so I think the results are still valid as far as broad comparisons go (as you can see, the CPython results match fairly closely).
The benchmark harness I’m using now is improved over the last time, using some statistical techniques to determine how long to run the benchmark, rather than relying on some hardcoded values to achieve JIT warmup and performance stability. Still could use some work (eg. outputting a kernel density estimate / error bars, rather than just a single mean time value).
There is one new benchmark relative to the last time, powerup-loading; PyPy really shines here, cutting out a ton of overhead. There’s still room for a few more benchmarks of critical functions such as actually running and loading query results (as opposed to just constructing query objects).
The branch I used to run these benchmarks is available on Github.
The horizontal axis is cut off at 1.0 so you can’t actually see how store-opening lines up, but the raw data shows that PyPy 2.1.0 took about 53% longer on this benchmark, whil PyPy 2.5.0 only takes about 2% longer.

BitBucket migration

2013-09-28 22:59

Comments

At work, we are currently using Launchpad for project hosting of our proprietary codebase. Launchpad charges $250/year/project for hosting of proprietary projects which is a little steep, and Launchpad/bzr has been falling behind the alternatives in terms of tooling / development / support, so when our Launchpad subscription came up for renewal at the beginning of the month, this caused our somewhat vague plans to switch to something else to crystalize.

I initially assumed that Github/git would be the obvious way to go, but after looking into BitBucket/hg, I was pleasantly surprised to discover that things like hosted CI were available there too. Nobody on our team is much of a git enthusiast to begin with, so using hg seemed like a far more attractive option. This meant figuring out how to do two things: 1) migrate all of our existing bugs, and 2) migrate our existing bzr branches to hg. The former proved to be relatively straightforward: Jonathan wrote a script that used launchpadlib to access the Launchpad API, retrieve the bug data + metadata, and write it out in the BitBucket import format (more on this in another post, or on Jonathan's blog, depending on if I can convince him to write it up or not).

The bzr to hg conversion turned out to be a little more complex. A simple "hg convert" of our trunk branch worked surprisingly well; the trunk history converted correctly (at least as far as I could tell), but more (pleasantly) surprisingly, the branches which were merged into trunk were also reconstructed by the conversion, along with the merges. The conversion relies on the bzr branch nick; this works somewhat like hg branches (it's associated with a commit at the time that you commit), but as bzr does not place as much importance on this label as hg, it is more likely to be wrong by accident (by default the branch nick is just taken from the last component of the path to the branch you are committing in, I believe, and in our case I suspect nobody has ever set the branch nick manually). Among other things, this resulted in 4 different branch names for "trunk", as well as some other oddities like misspelled feature branch names.

(As an aside, I'd like to mention that hg log has far more utility than bzr log, mostly due to the "revsets" feature. Almost all of the inspection I did while debugging the conversion was done on the converted repo using hg, not on the original bzr repo, simply because it was far easier to get the information that way.)

A "branchmap" file solved the problem with differing branches; mapping the different names for "trunk" to "default" made the revision history graph look a lot more reasonable than when I originally did the conversion. I also switched to using --datesort for the conversion at this point; the documentation warns that this may produce a much larger repository than --branchsort (the default), but in my case, the size difference was trivial. I suspect this may only apply in scenarios with back-and-forth merges between long-lived branches, rather than the short-lived topic branches that form the majority of our workflow. I also created an "authormap" file at this point to reconcile differing author identities over the history of our repository. The bzr author identity is a full name/email (eg. "Tristan Seligmann <mithrandi@mithrandi.net>", but again, there were various historical oddities here; BitBucket also has the ability to map author identities to BitBucket users, but I decided normalizing during the conversion was a good idea anyway.

The biggest problem I had to deal with (although this was actually one of the first problems I noticed) was that all of these merged branches were still open. Mercurial has the concept of "open" and "closed" branches, with closed branches being hidden by default in most places. A "closed" branch is simply one whose head revision is marked as closing the branch; which of course, none of my branches had, due to being converted from bzr which does not have an equivalent concept. Committing a closing revision to each branch was simple enough to script, but that only lead to more difficulties: 1) a gigantic pile of noise revisions in the history, and 2) a bunch of dangling heads as the new "close" revision was not part of the merge to trunk. Scripting a merge of all of the dangling heads would have produced even more noise, so I looked for a different solution.

Eventually I ran across a patch on the Mercurial mailing list; unfortunately the thread in which it was posted never went anywhere, but the patch still worked. What this patch allowed me to do was after the initial conversion, run another hg-to-hg conversion in which I marked the last branch revision before the merge to trunk as closing the branch. The full conversion process now looked something like this:

hg convert --datesort --branchmap branchmap --authormap authormap Fusion Fusion-hg-unspliced
cd Fusion-hg-unspliced
hg log --template "{node} close\n" -r "head() and not branch(default)" > ../splicemap
cd ..
PYTHONPATH=$HOME/hg-patched python $HOME/hg-patched/hg convert --splicemap splicemap Fusion-hg-unspliced Fusion-hg

This was good enough for a trunk conversion, but what about open branches that aren't yet merged into trunk? We could have held off until we were able to merge all of these branches, but that seemed like a lot of work (although we did merge as many outstanding branches as possible). Fortunately hg convert can operate in an incremental way; during the conversion, the mapping from source revs to destination revs is stored in dest/.hg/shamap; the only wrinkle was my two stage-conversion process. What I needed was a way to map the original bzr revisions to the hg revisions in the second repository. In order to accomplish this, I wrote a small Python script to merge the two mappings:

import sys revmap2 = {} with open(sys.argv[2]) as fMap2: for line in fMap2: a, b = line.rstrip('\n').split(' ') revmap2[a] = b with open(sys.argv[1]) as fMap1: for line in fMap1: a, b = line.rstrip('\n').split(' ') print '%s %s' % (a, revmap2[b])

With the help of this script, I could now convert other branches:

# Only take the bzr revisions
grep '@' Fusion-hg-unspliced/.hg/shamap > shamap
python mergemaps.py shamap Fusion-hg/.hg/shamap > shamap-spliced
mv shamap-spliced Fusion-hg/.hg/shamap

# Now let's convert a branch
hg convert --branchmap branchmap --authormap authormap Fusion-some-branch Fusion-hg

In summary, the process, while hardly trivial, worked out a lot better than I had initially expected.

EDIT: I forgot to mention in the original draft: We first started thinking about moving away from Launchpad at the beginning of September, and completed the migration in the last week, so the entire process took us less than a month of part-time discussion / work.

Axiom benchmark results on PyPy

2013-08-04 05:01

Comments

EDIT: Updated version now available.

EDIT: Fixed the issue with the store-opening benchmark

Axiom conveniently includes a few microbenchmarks; I thought I’d use them to give an idea of the speed increase made possible by running Axiom on PyPy. In order to do this, however, I’m going to have to modify the benchmarks a little. To understand why this is necessary, one has to understand how PyPy achieves the speed it does: namely, through the use of JIT (Just-In-Time) compilation techniques. In short, these techniques mean that PyPy is compiling code during the execution of a program; it does this “just in time” to run the code (or actually, if I understand correctly, in some cases only after the code has been run). This means that when a PyPy program has just started up, there is a lot of performance overhead in the form of the time taken up by JIT compilation running, as well as time taken up by code being interpreted slowly because it has not yet been compiled. While this performance hit is quite significant for command-line tools and other short-lived programs, many applications making use of Axiom are long-lived server processes; for these, any startup overhead is mostly unimportant, the performance that interests us is the performance achieved once the startup cost has already been paid. The Axiom microbenchmarks mostly take the form of performing a certain operation N times, recording the time taken, then dividing that time by N to get an average time per single operation. I have made two modifications to the microbenchmarks in order to demonstrate the performance on PyPy; first, I have increased the value of “N”; second, I have modified the benchmarks to run the entire benchmark twice, throwing away the results from the first run and only reporting the second run. This serves to exclude startup/”warmup” costs from the benchmark.

All of the results below are from my desktop machine running Debian unstable on amd64, CPython 2.7.5, and PyPy 2.1.0 on a Core i7-2600K running at 3.40GHz. I tried to keep the system mostly quiet during benchmarking, but I did have a web browser and other typical desktop applications running at the same time. Here’s a graph of the results; see the rest of the post for the details, especially regarding the store-opening benchmark (which is actually slower on PyPy).

[graph removed, see the new post instead]

To get an example of how much of a difference this makes, let’s take a look at the first benchmark I’m going to run, item-creation 15. This benchmark constructs an Item type with 15 integer attributes, then runs 10 transactions where each transaction creates 1000 items of that type. In its initial form, the results look like this:

1 mithrandi@lorien&gt; python item-creation 15
2 0.000164939785004
3 mithrandi@lorien&gt; pypy item-creation 15
4 0.000301389718056

That’s about 165µs per item creation on CPython, and 301µs on PyPy, nearly 83% slower; not exactly what we were hoping for. If I increase the length of the outer loop (number of transactions) from 10 to 1000, and introduce the double benchmark run, the results look a lot more encouraging:

1 mithrandi@lorien&gt; python item-creation 15
2 0.000159110188484
3 mithrandi@lorien&gt; pypy item-creation 15
4 8.7410929203e-05

That’s about 159µs per item creation on CPython, and only 87µs on PyPy; that’s a 45% speed increase. The PyPy speed-up is welcome, but it’s also interesting to note that CPython benefits slightly from the changes to the benchmark. I don’t have any immediate explanation for why this might be, but the difference is only about 3%, so it doesn’t matter too much.

The second benchmark is inmemory-setting. This benchmark constructs 10,000 items with 5 inmemory attributes (actually, the number of attributes is hardcoded, due to a limitation in the benchmark code), and then times how long it takes to set all 5 attributes to new values on each of the 10,000 items. I decreased the number of items to 1000, wrapped a loop around the attribute setting to repeat it 1000 times, and introduced the double benchmark run:

1 mithrandi@lorien&gt; python inmemory-setting
2 4.86490821838e-07
3 mithrandi@lorien&gt; pypy inmemory-setting
4 1.28742599487e-07

That’s 486ns to set an attribute on CPython, and 129ns on PyPy, for a 74% speed increase. Note that this benchmark is extremely sensitive to small fluctuations since the operation being measured is such a fast one, so the results can vary a fair amount between benchmarks run. For interest’s sake, I repeated the benchmark except with a normal Python class substituted for Item, in order to compare the overhead of setting an inmemory attribute as compared with normal Python attribute access. The result was 61ns to set an attribute on CPython (making an inmemory attribute about 700% slower), and 2ns on PyPy (inmemory is 5700% slower). The speed difference on PyPy is more to do with how fast setting a normal attribute is on PyPy, than to do with Axiom being slow.

The third benchmark is integer-setting. This benchmark is similar to inmemory-setting except that it uses integer attributes instead of inmemory attributes. I performed the same modifications, except with an outer loop of 100 iterations:

1 mithrandi@lorien&gt; python integer-setting
2 1.23480038643e-05
3 mithrandi@lorien&gt; pypy integer-setting
4 3.80326986313e-06

That’s 12.3µs to set an attribute on CPython, and 3.8µs on PyPy, a 69% speed increase.

The fourth benchmark is item-loading 15. This benchmark creates 10,000 items with 15 integer attributes each, then times how long it takes to load an item from the database. On CPython, the items are deallocated and removed from the item cache immediately thanks to refcounting, but on PyPy a gc.collect() after creating the items is necessary to force them to be garbage collected. In addition, I increased the number of items to 100,000 and introduced the double benchmark run:

1 mithrandi@lorien&gt; python item-loading 15
2 9.09668397903e-05
3 mithrandi@lorien&gt; pypy item-loading 15
4 5.70205903053e-05

That’s 90µs to load an item on CPython, and 57µs on PyPy, for a modest 37% speed increase.

The fifth benchmark is multiquery-creation 5 15. This benchmark constructs (but does not run) an Axiom query involving 5 different types, each with 15 attributes (such a query requires Axiom to construct SQL that mentions each item table, and each column in those tables) 10,000 times. I increased the number of queries constructed to 100,000 and introduced the double benchmark run:

1 mithrandi@lorien&gt; python multiquery-creation 5 15
2 5.5426299572e-05
3 mithrandi@lorien&gt; pypy multiquery-creation 5 15
4 7.98981904984e-06

55µs to construct a query on CPython; 8µs on PyPy; 86% speed increase.

The sixth benchmark is query-creation 15. This benchmark is the same as multiquery-creation, except for queries involving only a single item type. I increased the number of queries constructed to 1,000,000 and introduced the double benchmark run:

1 mithrandi@lorien&gt; python query-creation 15
2 1.548528409e-05
3 mithrandi@lorien&gt; pypy query-creation 15
4 1.56546807289e-06

15.5µs to construct a query on CPython; 1.6µs on PyPy; 90% speed increase.

The final benchmark is store-opening 20 15. This benchmark simply times how long it takes to open a store containing 20 different item types, each with 15 attributes (opening a store requires Axiom to load the schema from the database, among other things). I increased the number of iterations from 100 to 10,000; due to a bug in Axiom, the benchmark will run out of file descriptors partway, so I had to work around this. I also introduced the double benchmark run:

1 mithrandi@lorien&gt; python store-opening 20 15
2 0.00140788140297
3 mithrandi@lorien&gt; pypy store-opening 20 15
4 0.00202187280655

1.41ms to open a store on CPython; 2.02ms on PyPy; 44% slowdown. I’m not sure what the cause of the slowdown is.

A bzr branch containing all of my modifications is available at lp:~mithrandi/divmod.org/pypy-benchmarking.

Divmod / PyPy status update

2013-08-04 02:18

Comments

Just a quick status update:

Epsilon test suite passes on PyPy.
Nevow test suite passes on PyPy.
Axiom (trunk) test suite has two failures on PyPy, fixed by this branch (which just deletes the tests); I don’t expect this to affect any application code.
Mantissa test suite mostly fails. This is due to modules that indirectly import xmantissa.terminal which imports PyCrypto directly or indirectly (via twisted.conch) — PyCrypto does not build on PyPy.
I haven’t looked at Combinator; the test suite has a whole bunch of test failures on CPython due to a change in Subversion, and there’s no real reason to run Combinator with PyPy (I would expect it to be slower than on CPython).
I haven’t looked at Quotient or Imaginary yet, as they depend on Mantissa.
I haven’t looked at Hyperbola, Prime, Reverend, or Sine — I doubt anyone cares about these (I don’t even know what Prime does, and it doesn’t have a test suite).

The next thing I’m going to work on is making the Mantissa dependency on PyCrypto optional; while having an SSH server is nice, there is plenty of functionality in Mantissa that does not depend on or interact with the SSH server in any way, so it’ll still be useful for most applications. With any luck, once this is fixed, the entire test suite will pass; it’s hard to predict given how little of the test suite is currently runnable.