Yearning for Nosetests

I have begun writing unit tests using the Python unittest module. This is not ideal, because nosetests is the unit testing framework built into Pylons, and it does fancy things like making the mocking of internal app attributes much more seamless, or even possible.

Currently I have done the following:

  1. Determined that testing whether /self-serve/{branch}/test_builders successfully calls selfserve.new_build_for_builder is not necessary, since that routing is for Pylons to handle and isn't really affected by the new functionality needing the unit tests in the first place.
  2. I have been debating the best way to test that selfserve.new_build_for_builder adds an entry to the mq. On the one hand, if I just had nosetests working, I could simply mock the function that grabs from the mq, then call the entry function and check that the mq entry is correct. However, with nosetests not working, I seem to be left with few choices, each with their own problems:

    1. I could mock the function that does the mq entry, and then simply check that the object being passed to carrot.messaging.Publisher.send() has all the information necessary for the desired functionality (a rough sketch of this approach appears just after this list). The problem with this is that mq.py is initialized at buildapi startup, and at that time carrot.messaging.Publisher is initialized with various config info. The more I dig into this, the more it appears that this is not a rabbit hole I should continue to explore.
    2. I could create a new user to access the mq (through RabbitMQ) and then, with buildapi and the mq started up, use urllib to send a custom request to buildapi, watch the mq for the entry, grab it and verify it. This approach is seriously not ideal. For one, it requires manual setup from a user before the test can run, so it isn't easily portable. And two, it doesn't isolate just the function we want tested, and it leaves the door open to errors in functions unrelated to our test.
  3. I've consolidated the following two unit tests into a new one, stated simply as "selfserve.new_build_for_builder requests an entry that is complete and accurate":

    1. selfserve.new_build_for_builder adds an entry to the mq
    2. selfserve.new_build_for_builder's mq Entry is complete and accurate
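
For the record, here is a rough sketch of what option 1 above would look like with plain unittest plus the standalone mock package (Python 2.7 here). publish_new_build and the payload keys are stand-ins I made up for illustration; the real entry point is selfserve.new_build_for_builder, and the real Publisher is created at app startup in mq.py. It also assumes carrot is importable, since mock.patch has to resolve carrot.messaging.Publisher.

    import unittest
    import mock  # standalone 'mock' package on Python 2.7

    def publish_new_build(branch, builder_name, revision):
        # Stand-in for selfserve.new_build_for_builder, purely for
        # illustration. The real code creates its Publisher once at app
        # startup in mq.py with config values, which is exactly what makes
        # this approach awkward.
        from carrot.messaging import Publisher
        publisher = Publisher(connection=None, exchange='buildapi.control')
        publisher.send({'branch': branch,
                        'builder_name': builder_name,
                        'revision': revision})

    class TestMqEntry(unittest.TestCase):
        @mock.patch('carrot.messaging.Publisher')
        def test_mq_entry_is_complete(self, MockPublisher):
            publish_new_build('try', 'some-builder', 'abc123')
            # Check the payload that was handed to Publisher.send()
            payload = MockPublisher.return_value.send.call_args[0][0]
            self.assertEqual(payload['branch'], 'try')
            self.assertEqual(payload['builder_name'], 'some-builder')
            self.assertEqual(payload['revision'], 'abc123')

    if __name__ == '__main__':
        unittest.main()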

So to recap, my revised list of unit tests to complete are:

  1. selfserve.new_build_for_builder requests an entry that is complete and accurate
  2. selfserve-agent.do_new_build_for_builder is called and sees all info from selfserve.new_build_for_builder's mq entry
  3. selfserve-agent.do_new_build_for_builder enters info into database correctly

Things not to test for:

  1. /self-serve/{branch}/test_builders successfully calls selfserve.new_build_for_builder

I just sent an email to catlee to ask if he had any guidance to offer on nosetests, given that 3 years ago he seemed to have success with test_builds.py.

Making Progress

With my last post I made sure that my development environment was finally in working order and I could begin developing the patch needed for bug 793989. So, I've delved back into the partial patch that catlee has already written, made sense of exactly what is happening with all the pieces, and how they relate to a different sort of buildrequest entry, namely selfserve.rebuild_build. I now have a pretty solid idea of what a complete/accurate set of schedulerdb entries should look like for the new functionality required by this bug. Now I am taking the time to actually develop tests *before* writing the patch. This is a bit of a new thing for me, but I can definitely see the major advantages to testing first.

As far as I can tell, there are 5 main things I need to test for:

  1. /self-serve/{branch}/test_builders successfully calls selfserve.new_build_for_builder
  2. selfserve.new_build_for_builder adds an entry to the mq
  3. selfserve.new_build_for_builder's mq Entry is complete and accurate
  4. mq.do_new_build_for_builder is called and sees all info from selfserve.new_build_for_builder's mq entry
  5. mq.do_new_build_for_builder enters info into database correctly

A few questions are still floating around though. Pylons is set up to run with nosetests, which is really nice because you can load a partial WSGI app and then mess with its internals to test everything as it would be in a real app. However, I have never been able to get this to work successfully. So the question currently is: how long should I spend trying to figure out nosetests? If I decide to forgo nosetests, I can easily use unittest as I have for other unit tests before.
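
For context, this is the kind of nose-driven functional test a stock Pylons project is scaffolded with, and the sort of thing I have not been able to get running. The sketch assumes buildapi follows the standard Pylons layout, where buildapi/tests/__init__.py defines a TestController that loads test.ini and wraps the app in a TestApp; the URL and assertion are just illustrative.

    from buildapi.tests import TestController

    class TestSelfserveController(TestController):

        def test_branch_page_loads(self):
            # self.app wraps the full WSGI app, so this exercises routing,
            # the controller and app_globals together. TestApp raises on
            # error statuses by default, so reaching the assertion at all
            # means the whole stack responded.
            response = self.app.get('/self-serve/try')
            assert 'try' in response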

The plan is to contact catlee to ask if he has any additional knowledge pertaining to running test_builds.py with nosetests, given that he is the one who wrote it 3 years ago. If not, I am going to continue writing the tests with unittest. Unittest should allow me to do most, if not all, of the testing I need to do; it just needs a little more finessing to get it just right.

I am going to start on the unittest version of the tests before hearing back from catlee, since they are easily portable to nosetests and I won't be stuck waiting to get this done.

Selfserve-Agent.py

So after the success of getting RabbitMQ up and running, there was at least one more thing to solve before buildapi was truly entering buildrequests into the schedulerdb. Once I had RabbitMQ up and running, it seemed as though buildapi was able to submit a new buildrequest, but it became apparent that while buildapi was connected to the mq, something was missing on the other side of said mq to grab and execute the db entries called for by the queue entries. This is the list of changes that had to be made since the last post:

  • Had to start up selfserve-agent.py as a standalone process alongside the buildapi server
  • In order to successfully start up selfserve-agent.py, a new config was necessary. (catlee had me check out the puppet manifests to understand how selfserve-agent.py should be configured)
  • Add "carrot.vhost = /" under "carrot.hostname = localhost" int he config.ini for buildapi
  • Start up buildapi and then start up selfserve-agent.py
  • To start selfserve-agent.py you have to run it with the wait flag so that it stays in a loop waiting for input: python buildapi/scripts/selfserve-agent.py -w -v
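
For reference, after these changes the carrot section of config.ini at the root of buildapi looks roughly like this on my machine (the hostname, user and password values are the local ones from the RabbitMQ post below):

    carrot.hostname = localhost
    carrot.vhost = /
    carrot.userid = buildapi
    carrot.password = buildapi
    carrot.exchange = buildapi.control
    carrot.consumer.queue = buildapi-web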

Once these additions were made to my local buildapi instance, I was able to verify via Sequel Pro that when I hit "rebuild" on a given build, it was indeed added to the buildapi-web queue, then grabbed from the queue by selfserve-agent.py and added to the schedulerdb correctly. Finally, the todo now is:

  • Update the wiki doc on Setting up a Local Virtualenv for BuildAPI with the newly found instructions
  • Clarify the need from the OP on bug 793989
  • Begin writing up unittests to test for proper entry of new buildrequests into the schedulerdb
  • Write up the needed logic to enter a single buildrequest
  • Review the logic
  • Lather, Rinse, Repeat

RabbitMQ Deux: SUCCESS!

I spoke with catlee today to see if he could send over a copy of the scripts that he used to set up buildapi as a user on RabbitMQ, and he did. Coop warned that there may be some finicky issues that are environment-specific to my Mac (i.e. paths, etc.). Indeed, when I attempted to run the script with the RabbitMQ server off, I got the error "Error: unable to connect to node rabbit@localhost: nodedown". Then, when I turned the server on, I got the error "Error: {noproc,{gen_server2,call,[worker_pool,next_free,infinity]}}". Obviously something was not quite right, so I did some more looking around. I found that RabbitMQ comes with a set of plugins that are disabled by default; once I enabled the management plugin, I could go into the web app, add buildapi as a user, change some config options on buildapi, and BAM! It magically began accepting entries into the db.

Here is the step by step I used to get RabbitMQ up and running and working with buildapi on Mac OSX.

  1. If MacPorts is not already installed, then go here.
  2. Once you've ensured that MacPorts is installed you can install RabbitMQ: sudo port install rabbitmq-server

    • The instructions for this can be found here
  3. Once RabbitMQ is installed, you need to add buildapi as a user. To do that via the web UI, first enable the rabbitmq_management plugin: rabbitmq-plugins enable rabbitmq_management

    • The instructions for this can be found here
  4. Then restart RabbitMQ: sudo /opt/local/etc/LaunchDaemons/org.macports.rabbitmq-server/rabbitmq-server.wrapper restart
  5. Now go to http://localhost:15672/ and use the username/password combo of guest/guest
  6. Once in, go to 'Admin'
  7. Select the 'Add a user' option and enter the following

    • Username: buildapi
    • Password: buildapi
    • Tags: administrator
  8. Now submit the new user by selecting 'Add user'
  9. Once you have added 'buildapi' as a new user, you will see it listed under the 'All users' section above
  10. Select 'buildapi' and a window for permissions will come up
  11. Make sure that the permissions are set to the following

    • Virtual Host: /
    • Configure regexp: .*
    • Write regexp: .*
    • Read regexp: .*
  12. Now submit these permissions by selecting 'Set Permission'
  13. Once you have done this, the only thing left is to adjust the config.ini file at the root of buildapi to include the following lines

    • carrot.hostname = localhost
    • carrot.userid = buildapi
    • carrot.password = buildapi
    • carrot.exchange = buildapi.control
    • carrot.consumer.queue = buildapi-web
  14. Once you have made sure that the previous lines were added to your config.ini file in buildapi, then start up buildapi
  15. Go to http://localhost:15672/#/connections and a connection with the username 'buildapi' should be listed and the state should be 'running'
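
As a quick sanity check outside of buildapi, you can also ask the management plugin's HTTP API who you are authenticated as. This little snippet is not part of buildapi, just something to poke the API with the same buildapi/buildapi credentials; /api/whoami simply echoes back the authenticated user:

    import base64
    import urllib2

    request = urllib2.Request('http://localhost:15672/api/whoami')
    request.add_header('Authorization',
                       'Basic ' + base64.b64encode('buildapi:buildapi'))
    # Prints something like {"name":"buildapi","tags":"administrator"}
    print urllib2.urlopen(request).read()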

And that's that! I attempted to click 'rebuild' again from a branch page like try and it worked! The database entry was successful!

Now that I have been able to get this mq issue figured out with the help of catlee and coop (thanks guys!), I will move on to the following:

  • Update the wiki doc on Setting up a Local Virtualenv for BuildAPI with the newly found instructions on getting RabbitMQ installed on Mac.
  • Begin writing up unittests to test for proper entry of new buildrequests into the schedulerdb
  • Write up the needed logic to enter a single buildrequest
  • Review the logic
  • Lather, Rinse, Repeat

RabbitMQ

I received an email back from coop today and it sounds like he was getting similar exceptions to mine when trying to submit a build. He said that catlee helped him to install and integrate RabbitMQ with buildapi, and he was then able to submit builds. Based on that, I am installing and integrating RabbitMQ into buildapi. I have hit a little snag integrating on a Mac, since the original script from catlee is for Linux, but I should be able to get more info on that in the morning when the EST folks are back online. Additionally, coop expanded the buildapi setup docs on the wiki with info on RabbitMQ and setting up the databases, so this should prove useful for me as well!

Rock < Me < Hardplace

Bug 793989: It's been a few days since my last update, but here is the gist. I am still chasing the issue I mentioned before. It doesn't look like I am able to run any controller function that ends up calling g.mq.* (where g is app_globals), because g.mq is returning NoneType. It appears as though buildapi.lib.mq is never actually added to app_globals, or if it is, I cannot seem to find it… How is this set up in the production version of buildapi? For instance, I am assuming that when an 'authorized' user enters a valid revision into the box at the bottom of https://secure.pub.build.mozilla.org/buildapi/self-serve/try where it says "Create new dep builds on try revision", it'll successfully kick off that functionality. In my instance, this simply fails with "AttributeError: 'NoneType' object has no attribute 'newBuildAtRevision'". I have played with pdb a bit to try and unearth something, but it seems to me that there is simply some configuration missing in my local instance that is present in the production environment. I am throwing these questions out to coop to see if he has run into this issue before.

Bug 931580: So, in the meantime, I am back to working on bug 931580.

Add-On Idea: Additionally, I threw an idea around to some devs about making an add-on for Firefox that takes your hg-related email (the one you always use to make checkins on hg) and will look for, track/log and alert you when a checkin you have made has completed all builds/tests, and whether it passed or failed (some issue other than all greens). This add-on would make use of the buildapi extension that I already built this summer, which returns JSON telling whether a checkin has finished all builds/tests and whether it has passed them all or failed (again, something other than all greens)… that extension relates to bug 900318.

Taking a swing at database entry with Pylons

Catlee answered the secondary questions I had concerning bug 793989 and clarified a bunch of things. For phase 1, I am going to implement the functionality at /self-serve/{branch}/builders/{buildername} that simply allows a user to construct their own POST message complete with JSON arguments for changes and properties. A properly structured call to this URL should correctly enter a new buildrequest into the schedulerdb, which buildbot could then grab to kick off the new build. In order to test this, I am going to write up a unit test that checks the schedulerdb for a proper entry… this test should already succeed upon submission of a retrigger. Note: the tables that I need to enter data into are (via catlee):

buildrequests
buildsets
sourcestamps
sourcestamp_changes
changes
change_files
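
As a starting point for that unit test, here is a rough sketch of the check I have in mind. The table and column names follow buildbot's schedulerdb schema as I understand it, and the connection URL and revision are placeholders, so treat the details as assumptions rather than the finished test:

    from sqlalchemy import create_engine, text

    # Placeholder connection URL for my local schedulerdb
    engine = create_engine('mysql://buildapi:buildapi@localhost/schedulerdb')

    # Walk from buildrequests back to the sourcestamp for a given revision
    # to confirm the new buildrequest actually landed in the right tables.
    query = text("""
        SELECT br.id, br.buildername, ss.branch, ss.revision
          FROM buildrequests br
          JOIN buildsets bs ON br.buildsetid = bs.id
          JOIN sourcestamps ss ON bs.sourcestampid = ss.id
         WHERE ss.revision = :rev
         ORDER BY br.id DESC
    """)

    for row in engine.execute(query, rev='abc123'):  # placeholder revision
        print row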

I started poking around with the existing rebuild/retrigger functionality that already exists in self-serve, and I have hit an issue. When I attempt to hit the 'rebuild' button from a branch page such as /self-serve/try, on an existing build/test, I get a server error 500 and the traceback below (btw, I disabled the who = self._require_auth() line by instead setting who = "Me!" for the time being):

Error - : 'NoneType' object has no attribute 'rebuildBuild'
URL: http://127.0.0.1:5000/self-serve/try/build
File '/Users/jzeller/buildapi-test/lib/python2.7/site-packages/weberror/errormiddleware.py', line 162 in __call__
  app_iter = self.application(environ, sr_checker)
File '/Users/jzeller/buildapi-test/lib/python2.7/site-packages/beaker/middleware.py', line 152 in __call__
  return self.wrap_app(environ, session_start_response)
File '/Users/jzeller/buildapi-test/lib/python2.7/site-packages/routes/middleware.py', line 131 in __call__
  response = self.app(environ, start_response)
File '/Users/jzeller/buildapi-test/lib/python2.7/site-packages/pylons/wsgiapp.py', line 107 in __call__
  response = self.dispatch(controller, environ, start_response)
File '/Users/jzeller/buildapi-test/lib/python2.7/site-packages/pylons/wsgiapp.py', line 312 in dispatch
  return controller(environ, start_response)
File '/Users/jzeller/buildapi-test/buildapi/buildapi/lib/base.py', line 20 in __call__
  return WSGIController.__call__(self, environ, start_response)
File '/Users/jzeller/buildapi-test/lib/python2.7/site-packages/pylons/controllers/core.py', line 211 in __call__
  response = self._dispatch_call()
File '/Users/jzeller/buildapi-test/lib/python2.7/site-packages/pylons/controllers/core.py', line 162 in _dispatch_call
  response = self._inspect_call(func)
File '/Users/jzeller/buildapi-test/lib/python2.7/site-packages/pylons/controllers/core.py', line 105 in _inspect_call
  result = self._perform_call(func, args)
File '/Users/jzeller/buildapi-test/lib/python2.7/site-packages/pylons/controllers/core.py', line 57 in _perform_call
  return func(**args)
File '/Users/jzeller/buildapi-test/buildapi/buildapi/controllers/selfserve.py', line 414 in rebuild_build
  retval = g.mq.rebuildBuild(who, build_id, priority)
AttributeError: 'NoneType' object has no attribute 'rebuildBuild'

Upon further investigation of this error, it turns out that g references app_globals, which is imported from pylons. Within app_globals should be mq (located at /buildapi/lib/mq.py), and as you can see the function rebuildBuild does exist there. It looks like g, i.e. app_globals, is of type <class 'paste.registry.StackedObjectProxy'>, which should be right, but g.mq is indeed of type 'NoneType'… I do not know what gives here. This function is already included in the source code, and I am under the impression that it works fine in production, so why is it not working here? More digging is necessary to solve this.
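
For what it's worth, this is how I have been poking at it: a pdb breakpoint dropped into the controller right before the failing call (the placement is my own for debugging, not part of the existing code), which is where the two observations above come from.

    # in buildapi/controllers/selfserve.py, just before the g.mq call
    import pdb; pdb.set_trace()
    # (Pdb) type(g)
    # <class 'paste.registry.StackedObjectProxy'>
    # (Pdb) g.mq
    # None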

My next steps here are to discover the cause of this NoneType error, resolve it, and then run a rebuild/retrigger and manually check that the database entries for that build now exist. Once I have done that, I am going to build a unit test that runs a similar check on the database to verify that a new build has indeed been added to the schedulerdb, into the proper tables, with valid info. Once this is complete, I am going to complete the functions necessary to insert a new single buildrequest into the schedulerdb, and use the unittests to verify that the buildrequest entry is properly constructed for buildbot to be able to grab it and start up builds/tests.

Waiting for answers, breaking into Bug 931580

I am still waiting for some answers from catlee pertaining to how to properly construct a buildrequest and what tables to fill in and where.

In the meantime, I have started breaking into bug 931580. I am looking through the schedulerdb and statusdb schemas, as well as some existing models in buildapi, to determine how to query for, build and respond with a JSON list of all slaves organized by master/build_id, given a branch and revision.

Lots of new information pertaining to Bug 793989

So catlee was able to get back to me yesterday… super fast!

The information that I was able to gain from this pertains to the partial patch that catlee already wrote for the changes needed to buildapi for bug 793989.

  1. Selection of a builder_name for the new end point /self-serve/{branch}/builders/{builder_name} is going to be up to the UI. Catlee imagines a simple text string in a form and some JS to take that and alter the POST url, rather than including it as form data. Most likely the builders we're interested in will have run at some point, perhaps just not per push. Catlee thinks the builder names could come from TBPL or some auto-bisect tool that's yet to be written.
  2. The JSON for properties and changes in the request POST is not coming from statusdb.properties and statusdb.changes, it's all submitted by the client.
  3. Not sure where the JSON that gets added to the request POST is going to be generated (a toy example of such a request appears after this list).

    • There are a few things we need to create the buildrequest properly:

      • branch, revision: these go into sourcestamps and changes
      • files that have changed: these go into change_files. We're currently using change files to record the URLs of build and tests for the test jobs.
      • properties: used by builders like l10n to determine which locale to repack
    • All of those end up in different tables. We could look for them in different POST parameters.
  4. A very very simple HTML UI would be made for this, and most often this API would be used from other tools like TBPL or the auto-bisect tool.
  5. do_new_build_for_builder in buildapi/scripts/selfserve-agent.py handles inserting the new build request into the schedulerdb, at which point the buildbot masters will find the buildrequest and start it on a slave.
  6. In relation to the comments in selfserve.py

    • Line 508: # TODO: Make sure that the 'fake' branches for sourcestamps are obeyed?

      • He still doesn't remember what 'fake' refers to.
    • Line 527: # TODO: What do we do with change branches? If they're set to something real, then this will trigger schedulers. Can we use a fake value?

      • This piece ties into how buildbot scheduling works. Because we need to have files associated with the changes so we have a place to record the build and test URLs [1], we need to insert new changes into the DB. Changes need a branch.
    • Line 538: # TODO: What value do we choose for branch? If we choose a real value, like "mozilla-central", or "mozilla-central-opt-linux", then we risk triggering the regular schedulers and having a full suite of test runs scheduled instead of just the builder we're interested in.

      • I was thinking that probably ${branch}-selfserve would work.
    • Line 538: # TODO: invalidate cache for branch

      • There are similar comments on the other methods here. buildapi maintains a cache of pending/running/finished jobs per branch and revision. The comment says that we should explicitly invalidate that cache so that subsequent requests will see the new pending job. As it stands, you need to wait for the cache to expire (which is like 60 seconds I think). Not a major issue, and certainly doesn't need to be dealt with as part of your work here.
  7. In relation to the comments in selfserve-agent.py

    • Line 3.32 : # TODO: Attach files sourcestamps -> sourcestamp_changes -> changes -> change_files but new changes may trigger work by other schedulers…

      • This ties into your questions above about the data coming into the API, and also the choice of branch to use for the changes we're inserting.
    • Line 3.35 : # TODO: accept change objects here instead of constructing from files

      • I had started writing this code by passing in just a list of files that would get associated with a new change object. I wondered if it would make more sense to have the client fully specify the change object rather than restricting the API to only deal with change files.
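
To make points 2 and 3 above more concrete, here is a toy example of what a client-side request to the new endpoint might look like. The endpoint only exists in catlee's partial patch so far, and every parameter name and value here is my own guess for illustration, not the final API:

    import json
    import urllib
    import urllib2

    # Hypothetical POST to the new endpoint; parameter names are guesses.
    params = urllib.urlencode({
        'revision': 'abc123def456',
        'properties': json.dumps({'locale': 'de'}),  # e.g. which locale to repack
        'files': json.dumps([
            'http://example.com/firefox.tar.bz2',   # build/test URLs end up
            'http://example.com/tests.zip',         # recorded as change_files
        ]),
    })
    url = 'http://localhost:5000/self-serve/try/builders/some-builder-name'
    # urllib2 issues a POST when a data argument is supplied
    print urllib2.urlopen(url, params).read()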

Additionally, I was able to confirm/clarify these assumptions:

  1. At the bottom of a typical revision page on BuildAPI and at the bottom of a branch page, there are 3 boxes which allow you to retrigger a set of builds for dep, PGO and nightly. Ideally, we want to be able to (re)trigger a job on any builder without launching the entire suite of builds/tests that normally happen. The UI could use some cleanup here so that if you're on the per-revision page, there's just a button for new dep/PGO/nightly build with no revision field.
  2. Pulling a list of builds from TryChooser isn't a good idea, we need to find a better way.
  3. Taking in the computed syntax from TryChooser Syntax Builder for retriggering a build/test is not a good idea. Some things that aren't immediately available are the URLs for the build and test packages.
  4. For the first pass we should focus on making the API able to trigger jobs, and leave some of the more complicated UI elements until later (e.g. which buildername, which change files, properties, etc.). We could also change our logic for finding the build/test urls.
  5. Pylons is a pretty madass MVC

More to come!