Triggering of Arbitrary builds/tests is now possible!

Bug 793989 requests that buildapi's self-serve be able to trigger arbitrary builds/tests on a given push, and this is now possible! If you are interested in triggering an arbitrary build/test, this simple Python script is a good place to start.

This new buildapi REST functionality takes a POST request with the parameters properties and files, as a dictionary and a list, respectively. The REST URL is built as https://secure.pub.build.mozilla.org/buildapi/self-serve/{branch}/builders/{buildername}/{revision} where {branch} is a valid branch, {revision} is an existing revision on that branch, and {buildername} is the arbitrary build/test that you wish to trigger. Examples of appropriate buildernames are "Ubuntu VM 12.04 try opt test mochitest-1" or "Linux x86-64 try build"; any buildername that shows up in the builder column on buildapi should work. Please file a bug if you find any exceptions!

As a specific example, if you are launching the test "Ubuntu VM 12.04 x64 try opt test jetpack" after having run the build "Linux x86-64 try build", then you need to supply the files list parameter as ["http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/cviecco@mozilla.com-b5458592c1f3/try-linux64-debug/firefox-30.0a1.en-US.linux-x86_64.tar.bz2", "http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/cviecco@mozilla.com-b5458592c1f3/try-linux64-debug/firefox-30.0a1.en-US.linux-x86_64.tests.zip"].
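
A minimal sketch of how such a request might be assembled in Python. It only builds the URL and the form payload and sends nothing. Two things here are assumptions rather than facts from this post: that the buildername needs percent-encoding in the URL path, and that properties and files travel as JSON-encoded form fields. The helper name build_trigger_request is hypothetical, and the revision is simply borrowed from the example file URLs above.

```python
import json
from urllib.parse import quote

BASE = "https://secure.pub.build.mozilla.org/buildapi/self-serve"


def build_trigger_request(branch, buildername, revision, properties=None, files=None):
    """Return (url, payload) for a POST that requests one build/test.

    Hypothetical helper: JSON-encoding the two fields is an assumption.
    """
    # Buildernames contain spaces, so each path segment is percent-encoded.
    url = "{}/{}/builders/{}/{}".format(
        BASE,
        quote(branch, safe=""),
        quote(buildername, safe=""),
        quote(revision, safe=""),
    )
    payload = {
        "properties": json.dumps(properties or {}),  # a dictionary
        "files": json.dumps(files or []),            # a list
    }
    return url, payload


url, payload = build_trigger_request("try", "Linux x86-64 try build", "b5458592c1f3")
print(url)
print(payload)
```

Sending it would then be a matter of POSTing the payload to the URL with whatever HTTP client and credentials you normally use for buildapi.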

Once you have submitted an arbitrary build/test request, you can see the status of that build/test on BuildAPI in multiple locations. To see which masters have grabbed the build/test, use the following URL: https://secure.pub.build.mozilla.org/buildapi/revision/{branch}-selfserve/{revision}, where {branch} is the branch you submitted your request on and {revision} is the revision you submitted on that branch.

NOTE: Currently TBPL is not showing the pending/running status of builds/tests started with this new BuildAPI functionality because of some issues outlined in Bug 981825. However, once a build/test that was triggered with this new functionality is finished, it shows up on TBPL just as you would expect any other build/test to.

Docs, MySQL Dumps, Buildbot Recap

It's been a few days since my last update, so here it is!

  1. I finished writing up the docs addition for setting up a local buildbot instance
  2. Ran a manual table-by-table mysql dump on my local snapshots of statusdb and schedulerdb, which successfully shrank their collective size from 108gb to 14.5gb, or by 86.6%! So these snapshots are a lot easier to deal with. They currently only show the build data from July 8th 02:00:00 GMT to July 16th ~12:00:00 GMT, which is plenty for my current dev environment setup.
  3. Adjusted the master.cfg file for build-master in Buildbot to use the mysql schedulerdb on my local system, rather than the default sqlite:///state.sqlite

    • Changed line 143 of master.cfg from c['db_url'] = "sqlite:///state.sqlite" to c['db_url'] = "mysql://buildbot:buildbot@localhost/temp_schedulerdb"

      • NOTE: The regex does not like a mysql user that has no password. Originally, I had specified root with an empty password, i.e. "mysql://root:@localhost/temp_schedulerdb", so I made a new user named buildbot, password buildbot.
    • Ran pip install MySQL-python==1.2.3 from the buildbot virtualenv… found this via pip freeze from my already existing buildapi virtualenv
    • Added export PATH=$PATH:/usr/local/mysql/bin to the buildbot virtualenv bin/activate file
  4. Tested that the buildapi/buildbot dev setup works end-to-end. It is still unclear whether buildbot is actually picking up the new build requests and marking them pending, or where exactly to look for these pending builds in buildbot; I need to check with Armen on this. What I have found is that a new build_request instantly shows up in the pending jobs for today, under the same branch it was requested on. For example, I rebuild a job from July 10th, go back to view today's jobs at a link like http://127.0.0.1:5000/self-serve/try, and it appears in the pending jobs section. I tested with buildbot shut down and it still worked, so I am still unclear how to check whether buildbot likes or dislikes a build_request.
  5. Edited/Modified the manual unit tests for the patch
  6. Started writing the patch for bug 793989

    1. Took some time to get reacclimated to the exact issue and which db tables need to be filled
    2. Unclear on a couple of points, also going to clarify with coop/armen on Monday
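
To make the note about passwordless MySQL users in the db_url concrete, here is a small sketch. The pattern below is purely illustrative and is NOT the regex Buildbot actually uses; it only shows how a pattern that demands at least one character between the colon and the @ accepts the buildbot:buildbot URL and rejects the root: one. The name parse_db_url is hypothetical.

```python
import re

# Illustrative pattern only; an assumption, not Buildbot's real regex.
# The password group uses "+", so an empty password cannot match.
DB_URL = re.compile(
    r"^mysql://(?P<user>[^:@/]+):(?P<password>[^@/]+)@(?P<host>[^/]+)/(?P<database>.+)$"
)


def parse_db_url(url):
    """Return the URL's parts as a dict, or None if the pattern rejects it."""
    match = DB_URL.match(url)
    return match.groupdict() if match else None


with_password = parse_db_url("mysql://buildbot:buildbot@localhost/temp_schedulerdb")
without_password = parse_db_url("mysql://root:@localhost/temp_schedulerdb")
print(with_password)
print(without_password)
```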

More next week! Have a great weekend!

Buildbot is up and running!

Buildbot is now up and running on my local Mac! :)

Picking up from where I left off in my previous post, this is what was necessary to make this happen:

  1. Firstly, attempting to run 'make checkconfig' from ~/Buildbot/buildbot/ was completely misguided. I was under the impression that the master/ directory that I was looking for was ~/Buildbot/buildbot/master/. However, this was incorrect as the true master directory I needed was ~/Buildbot/build-master/.
  2. cd BASEDIR/build-master/
  3. Modify line 2 of Makefile from BUILDBOT=$(PWD)/bin/buildbot to be BUILDBOT=$(PWD)/../bin/buildbot
  4. make checkconfig

    • This fails with the Traceback:
    • (Buildbot)localhost:build-master jzeller$ make checkconfig
      cd master && /Users/jzeller/Buildbot/build-master/../bin/buildbot  checkconfig
      Traceback (most recent call last):
        File "/Users/jzeller/Buildbot/lib/python2.6/site-packages/buildbot-0.8.2-py2.6.egg/buildbot/scripts/runner.py", line 1042, in doCheckConfig
          ConfigLoader(configFileName=configFileName)
        File "/Users/jzeller/Buildbot/lib/python2.6/site-packages/buildbot-0.8.2-py2.6.egg/buildbot/scripts/checkconfig.py", line 31, in __init__
          self.loadConfig(configFile, check_synchronously_only=True)
        File "/Users/jzeller/Buildbot/lib/python2.6/site-packages/buildbot-0.8.2-py2.6.egg/buildbot/master.py", line 652, in loadConfig
          exec f in localDict
        File "/Users/jzeller/Buildbot/build-master/master.cfg", line 129, in <module>
          execfile(releaseConfigFile, releaseBranchConfig, releaseBranchConfig)
      IOError: [Errno 2] No such file or directory: 'release-firefox-mozilla-1.9.2.py'
      make: *** [checkconfig] Error 1

  5. Modify line 11 of master_config.json so that 'release_branches' contains an empty list. It did contain ["mozilla-1.9.2",  "mozilla-beta"] and they are unnecessary for my local setup.

  6. Again: make checkconfig

    • SUCCESS: Config file is good!

  7. make start

    • It will appear as if start has failed, but fear not! The buildbot process is launched as a daemon, and the launcher waits for the buildmaster to signal that configuration is complete. Mozilla's configuration files are way too complicated to be processed in under 10 seconds, so the wait times out and you see the message below:

    • (Buildbot)localhost:build-master jzeller$ make start
      cd master && /Users/jzeller/Buildbot/build-master/../bin/buildbot  start $PWD
      Following twistd.log until startup finished..
      2014-01-22 14:47:30-0800 [-] Log opened.
      2014-01-22 14:47:30-0800 [-] twistd 12.0.0 (/Users/jzeller/Buildbot/bin/python 2.6.7) starting up.
      2014-01-22 14:47:30-0800 [-] reactor class: twisted.internet.selectreactor.SelectReactor.
      2014-01-22 14:47:30-0800 [-] monkeypatch_twisted_cbLogin applied
      2014-01-22 14:47:30-0800 [-] Creating BuildMaster — buildbot.version: 0.8.2
      2014-01-22 14:47:30-0800 [-] loading configuration from /Users/jzeller/Buildbot/build-master/master.cfg
      2014-01-22 14:47:30-0800 [-] unable to import dnotify, so Maildir will use polling instead
      2014-01-22 14:47:30-0800 [-] nextAWSSlave: start
      2014-01-22 14:47:30-0800 [-] nextAWSSlave: start
      2014-01-22 14:47:37-0800 [-] nextAWSSlave: start
      2014-01-22 14:47:37-0800 [-] nextAWSSlave: start

      The buildmaster took more than 10 seconds to start, so we were unable to
      confirm that it started correctly. Please 'tail twistd.log' and look for a
      line that says 'configuration update complete' to verify correct startup.

      make: *** [start] Error 1

  8. To double-check that the configuration was successful, just follow the directions and type: less twistd.log | grep "configuration update complete". As long as you see it pop up, you're good!

  9. Go to http://localhost:8501/

  10. If you see a page with "Welcome to the Buildbot for the Firefox project!" then you have been successful!

  11. You can now check out what buildbot has pending by going to http://localhost:8501/waterfall
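
The log check from step 8 can also be scripted. This is a minimal sketch: startup_confirmed is a hypothetical helper, and the demo writes a throwaway file standing in for twistd.log (the second sample line, including its timestamp, is invented for the demo).

```python
import os
import tempfile

MARKER = "configuration update complete"


def startup_confirmed(log_path, marker=MARKER):
    """Return True if any line of the log contains the marker text."""
    with open(log_path, "r", errors="replace") as log:
        return any(marker in line for line in log)


# Throwaway file standing in for twistd.log; the second line is made up.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write("2014-01-22 14:47:30-0800 [-] Log opened.\n")
    tmp.write("2014-01-22 14:48:02-0800 [-] configuration update complete\n")
    demo_path = tmp.name

found = startup_confirmed(demo_path)
os.remove(demo_path)
print(found)
```

Against a real master you would pass the path to its twistd.log instead of the demo file.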

Now buildbot is all set up! Hooray!

The next steps are:

  1. Update all the documentation to reflect this new found path!
  2. Update and upload the scripts that can make this process much easier, a buildbot-on-laptop if you will
  3. Run a database dump on my personal schedulerdb and statusdb that represents a smaller subset of builds/tests… Ideally something small enough to send around to others.

    • Send this to Armen for his BuildAPI setup
  4. Replace my current schedulerdb and statusdb with these smaller subsets to claim back more harddrive space – currently 100gb+!
  5. Backup new schedulerdb and statusdb
  6. Link up buildbot to the proper database
  7. Test that running an existing command from BuildAPI has the expected results in buildbot by looking at the http://localhost:8501/waterfall UI
  8. Finish writing up all of the manual unit tests that need to be used for testing Bug 793989's patch
  9. Modify BuildAPI to add the new functionality for a Bug 793989 patch
  10. Test/Review the patch!

Here's to shipping this patch!

Setting Up a Development Master

Setting up buildbot on my local Mac has proved to be more difficult than I first realized. hwine has helped me out by modifying and sharing a script called new-project.sh that helps to set up a new buildbot project. Combining that with the create-staging-master script that can be found on the How To Setup a Personal Development Master wiki page, I think buildbot will be all set up in my local dev environment pretty soon! However, I have run into some issues, which I already emailed hwine about, but I will rephrase them here.

Here is how I set things up, complete with how I fixed a few issues:
  1. Downloaded new-project.sh and create-staging-master to ~/
  2. ./new-project.sh Buildbot
  3. cd Buildbot; source bin/activate
  4. cd buildbot-configs; ./test-masters.sh

    • All tests passed
  5. cd ~
  6. Modify create-staging-master so that "my $basedir = '/builds/buildbot';" becomes "my $basedir = '/Users';" on line 39
  7. sudo perl create-staging-master --username=jzeller --master-kind=build --config-only

    • This creates a new directory at ~/build1 which contains config.json

      • More on this below, but I think the config.json file needs to be at ~/Buildbot/buildbot instead. However, when you run the script with the parameter --master-dir set to either Buildbot/ or Buildbot/buildbot, it complains that the directory already exists.
  8. Now that I want to see if it all works, I am trying to follow the Does It Work? section in the How To Setup a Personal Development Master wiki.
  9. cd $BASEDIR/master; ln -sf ../buildbot-configs/mozilla/universal_master_sqlite.cfg; cd -

    • Since $BASEDIR is not set by the 2 scripts you sent over, and the create-master.sh script in the Create a Build Master section sets it to /builds/buildbot/$USERNAME/build1, I go ahead and set it to BASEDIR=/Users/jzeller/Buildbot/buildbot
  10. Again: cd $BASEDIR/master; ln -sf ../buildbot-configs/mozilla/universal_master_sqlite.cfg; cd -

    • Successful… tried with BASEDIR=/Users/jzeller/Buildbot/ and it failed because there was no master directory in ~/Buildbot
  11. make checkconfig

    • When attempting from ~/Buildbot/buildbot, this fails with: make: *** No rule to make target `checkconfig'.  Stop.
    • After speaking with aki (check scrollback on mozbuild), it seems as though the Makefile I had in ~/Buildbot/buildbot was just dev utils
    • aki pointed me to the Makefile.master in buildbot-configs and I cp'd it to ~/Buildbot/buildbot and modified BUILDBOT=$(PWD)/bin/buildbot to be BUILDBOT=$(PWD)/../bin/buildbot
  12. Again: make checkconfig

    • Fails in a different way:

      • (Buildbot)localhost:buildbot jzeller$ make checkconfig
        cd master && /Users/jzeller/Buildbot/buildbot/../bin/buildbot checkconfig
        Traceback (most recent call last):
          File "/Users/jzeller/Buildbot/lib/python2.6/site-packages/buildbot-0.8.2-py2.6.egg/buildbot/scripts/runner.py", line 1042, in doCheckConfig
            ConfigLoader(configFileName=configFileName)
          File "/Users/jzeller/Buildbot/lib/python2.6/site-packages/buildbot-0.8.2-py2.6.egg/buildbot/scripts/checkconfig.py", line 18, in __init__
            copy(configFileName, tempdir)
          File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/shutil.py", line 84, in copy
            copyfile(src, dst)
          File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/shutil.py", line 50, in copyfile
            with open(src, 'rb') as fsrc:
        IOError: [Errno 2] No such file or directory: '/Users/jzeller/Buildbot/buildbot/master/master.cfg'
        make: *** [checkconfig] Error 1
    • I realized that the master.cfg found in ~/Buildbot/buildbot may be a faulty symlink, so I modified the command from step 10 and reran it with an absolute path: cd $BASEDIR/master; ln -sf /Users/jzeller/Buildbot/buildbot-configs/mozilla/universal_master_sqlite.cfg; cd -

      • This gets the same Traceback as above
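
One way to investigate a missing master.cfg like the one in the traceback above is to list the symlinks in the master directory and check whether each one resolves. The sketch below does that in a scratch directory rather than a real checkout; describe_links is a hypothetical helper. Two general facts about ln are relevant: when ln -sf is given only a target, the link is named after the target's basename, and a relative target is resolved from the directory that holds the link. Whether either of these is the actual cause here is an assumption, not something confirmed above.

```python
import os
import tempfile


def describe_links(directory):
    """Map each symlink name in `directory` to (target, resolves?)."""
    report = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.islink(path):
            # os.path.exists follows the link, so it is False when dangling.
            report[name] = (os.readlink(path), os.path.exists(path))
    return report


# Recreate the shape of the situation in a scratch directory (POSIX only).
with tempfile.TemporaryDirectory() as base:
    master = os.path.join(base, "master")
    os.mkdir(master)
    # Same effect as running `ln -sf ../buildbot-configs/...cfg` inside master/.
    os.symlink(
        "../buildbot-configs/mozilla/universal_master_sqlite.cfg",
        os.path.join(master, "universal_master_sqlite.cfg"),
    )
    report = describe_links(master)
    has_master_cfg = os.path.exists(os.path.join(master, "master.cfg"))

print(report)
print(has_master_cfg)
```

In the scratch directory the link exists but does not resolve, and nothing named master.cfg is created, which is the kind of state that would produce the same IOError.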

I am stuck at the make checkconfig step here, SO CLOSE. Here are a few other questions off the bat:

  • config.json surely goes somewhere within the Buildbot base directory, but I am not sure where would be appropriate. When I set the --master-dir parameter to either Buildbot/ or Buildbot/buildbot, the script complains that the directory already exists.
  • aki is saying that I may have an easier time recreating the master, rather than modifying configs. However, I am not sure how that relates to using these 2 scripts, new-project.sh and create-staging-master, in tandem.

So for the moment this is where I am stuck, and I will be attempting to tackle this more on Friday.

Yearning for Nosetests

I have begun writing unit tests using the Python module unittest. This is not ideal, because nosetests is the unit testing framework that Pylons is built to work with, and it does fancy things like making the mocking of internal app attributes much more seamless, or even possible.

Currently I have done the following:

  1. Determined that testing to make sure that /self-serve/{branch}/test_builders successfully calls selfserve.new_build_for_builder is not necessary, since this functionality is for Pylons to handle and isn't really affected by the new functionality needing the unit tests in the first place.
  2. I have been debating what the best way to test that selfserve.new_build_for_builder adds an entry to the mq is. On the one hand, if I just had nosetests working, I could simply mock up the function that grabs from the mq, then I could call the entry function and see that the mq entry is correct. However, if nosetests is not working, I seem to be left with few choices, each with their own problems:

    1. I could mock the function that does the mq entry, and then simply check that the object being passed to carrot.messaging.Publisher.send() has all the correct information necessary for the desired functionality. The problem with this is that mq.py is initialized at the start up of buildapi, and at that time carrot.messaging.Publisher is initialized with various config info. The more I dig into this, the more it appears that this is not a rabbithole I should continue to explore.
    2. I could create a new user to access the mq (through RabbitMQ) and then with buildapi and the mq started up, I could use urllib to send a custom request to buildapi, and then watch the mq for the entry, grab it and then verify it. This approach is seriously not ideal. For one, it requires manual setup from a user before being able to run the test, and so it's not easily portable. And two, it's not isolating just the function we want tested, and leaves the door open to other errors in functions unrelated to our test.
  3. I've consolidated the following two unit tests into a new one, which is simply stated as "selfserve.new_build_for_builder requests an entry that is complete and accurate"

    1. selfserve.new_build_for_builder adds an entry to the mq
    2. selfserve.new_build_for_builder's mq Entry is complete and accurate
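
The first option above (replace the publisher with a mock and inspect what gets sent) can be sketched with plain unittest. Everything named here is a stand-in: this new_build_for_builder is a hypothetical function, NOT buildapi's real one, the message fields are invented for illustration, and the publisher is passed in as an argument to sidestep the start-up initialization problem described above. It also uses unittest.mock from modern Python 3, which is an assumption about the environment.

```python
import unittest
from unittest import mock


def new_build_for_builder(publisher, who, branch, revision, builder_name):
    """Hypothetical stand-in: build a message and hand it to the publisher."""
    message = {
        "action": "new_build_for_builder",
        "who": who,
        "body": {
            "branch": branch,
            "revision": revision,
            "builder_name": builder_name,
        },
    }
    publisher.send(message)
    return message


class NewBuildForBuilderTest(unittest.TestCase):
    def test_message_is_complete_and_accurate(self):
        publisher = mock.Mock()  # stands in for the real mq publisher
        new_build_for_builder(
            publisher, "jzeller", "try", "abc123", "Linux x86-64 try build"
        )
        publisher.send.assert_called_once()
        (sent,), _kwargs = publisher.send.call_args
        self.assertEqual(sent["action"], "new_build_for_builder")
        self.assertEqual(sent["who"], "jzeller")
        self.assertEqual(
            sent["body"],
            {
                "branch": "try",
                "revision": "abc123",
                "builder_name": "Linux x86-64 try build",
            },
        )


suite = unittest.defaultTestLoader.loadTestsFromTestCase(NewBuildForBuilderTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Nothing touches a real queue, so the test isolates only the message-building step.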

So to recap, my revised list of unit tests to complete are:

  1. selfserve.new_build_for_builder requests an entry that is complete and accurate
  2. selfserve-agent.do_new_build_for_builder is called and sees all info from selfserve.new_build_for_builder's mq entry
  3. selfserve-agent.do_new_build_for_builder enters info into database correctly

Things not to test for:

  1. /self-serve/{branch}/test_builders successfully calls selfserve.new_build_for_builder

I just sent an email to catlee to ask if he had any guidance to offer on nosetests, given that 3 years ago he seemed to have success with test_builds.py

Making Progress

With my last post I made sure that my development environment was finally in working order and I could begin developing the patch needed for bug 793989. So, I've delved back into the partial patch that catlee has already written, made sense of exactly what is happening with all the pieces, and how they relate to a different sort of buildrequest entry, namely selfserve.rebuild_build. I now have a pretty solid idea of what a complete/accurate set of schedulerdb entries should look like for the new functionality required by this bug. Now I am taking the time to actually develop tests *before* writing the patch. This is a bit of a new thing for me, but I can definitely see the major advantages to testing first.

As far as I can tell, there are 5 main things I need to test for:

  1. /self-serve/{branch}/test_builders successfully calls selfserve.new_build_for_builder
  2. selfserve.new_build_for_builder adds an entry to the mq
  3. selfserve.new_build_for_builder's mq Entry is complete and accurate
  4. mq.do_new_build_for_builder is called and sees all info from selfserve.new_build_for_builder's mq entry
  5. mq.do_new_build_for_builder enters info into database correctly

A few questions are still floating around though. Pylons is set up to run with nosetests, which is really nice because you can load a partial WSGI app and then mess with its internals to test everything as it would be in a real app. However, I have never been able to get this to work successfully. So the question currently is: how long should I spend trying to figure out nosetests? If I decide to forgo nosetests, I can easily use unittest as I have for other unit tests before.

The plan is to contact catlee to ask if he has any additional knowledge pertaining to running test_builds.py with nosetests, given that he is the one who wrote it 3 years ago. If not, I am going to continue making the unit tests with unittest. unittest should allow me to do most, if not all, of the testing I need to do; it just needs a little more finessing to get it just right.

I am going to start on the unittest version of tests before hearing back from catlee, since they are easily portable to nosetests and I won't be waiting to get it done.

Selfserve-Agent.py

So after the successes with getting RabbitMQ up and running, there was at least one more thing to be solved before buildapi was really truly entering buildrequests into the schedulerdb. Once I had rabbitmq up and running, it seemed as though buildapi was able to submit a new buildrequest, but in reality it became apparent that while buildapi was connected to the mq, there was something missing from the other side of said mq, to grab and execute the db entries called upon by the queue entries. This is a list of the changes that had to be made from the last post till now:

  • Had to start up selfserve-agent.py as a standalone process up next to the buildapi server
  • In order to successfully start up selfserve-agent.py, a new config was necessary. (catlee had me check out the puppet manifests to understand how selfserve-agent.py should be configured)
  • Add "carrot.vhost = /" under "carrot.hostname = localhost" in the config.ini for buildapi
  • Start up buildapi and then start up selfserve-agent.py
  • To start selfserve-agent.py you have to run it with the wait command to allow it to stay in a loop waiting for input: python buildapi/scripts/selfserve-agent.py -w -v
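
The split described above (buildapi only enqueues, and a separate agent process turns queue entries into database rows) can be pictured with a toy model. This is not buildapi code: an in-process queue.Queue stands in for RabbitMQ, a list stands in for schedulerdb, and both function names and the message fields are made up.

```python
import queue


def web_request_rebuild(mq, build_id):
    """Toy 'buildapi' side: enqueue a message and return immediately."""
    mq.put({"action": "rebuild", "build_id": build_id})


def agent_drain(mq, db_rows):
    """Toy 'selfserve-agent' side: consume messages and write rows."""
    handled = 0
    while True:
        try:
            message = mq.get_nowait()
        except queue.Empty:
            return handled
        db_rows.append(("buildrequest", message["action"], message["build_id"]))
        handled += 1


mq = queue.Queue()
db_rows = []

web_request_rebuild(mq, 42)
rows_before_agent = list(db_rows)  # still empty: nobody has consumed yet

handled = agent_drain(mq, db_rows)
print(rows_before_agent, handled, db_rows)
```

With only the web side running, the request is accepted but no row ever appears, which matches the symptom of buildapi seeming to work while nothing reached the db.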

Once these additions were made to my local buildapi instance, I was able to verify via SequelPro that once I hit "rebuild" on a given build, it was indeed added to the buildapi-web queue, then grabbed from the queue by selfserve-agent.py and added to the schedulerdb correctly. Finally, the todo list now is:

  • Update the wiki doc on Setting up a Local Virtualenv for BuildAPI with the new found instructions
  • Clarify the need from the OP on bug 793989
  • Begin writing up unit tests to test for proper entry of new buildrequests into the schedulerdb
  • Write up the needed logic to enter a single buildrequest
  • Review the logic
  • Lather, Rinse, Repeat

RabbitMQ Deux: SUCCESS!

I spoke with catlee today to see if he could send over a copy of the scripts that he used to set up buildapi as a user on rabbitmq, and he did. Coop warned that there may be some finicky issues that are environment-specific to my Mac (i.e. paths, etc). Indeed, when I attempted to run the script with the RabbitMQ server off, I got the error "Error: unable to connect to node rabbit@localhost: nodedown". Then, when I turned the server on, I got the error "Error: {noproc,{gen_server2,call,[worker_pool,next_free,infinity]}}". Obviously something was not quite right, so I did some more looking around. I found that RabbitMQ has a set of plugins that it comes with and they are disabled by default. Once I enabled those, I could go into the web app, add buildapi as a user, and change some config options on buildapi, and BAM! It magically began accepting entries into the db.

Here is the step by step I used to get RabbitMQ up and running and working with buildapi on Mac OSX.

  1. If MacPorts is not already installed, then go here.
  2. Once you've ensured that MacPorts is installed you can install RabbitMQ: sudo port install rabbitmq-server

    • The instructions for this can be found here
  3. Once RabbitMQ is installed, you need to add buildapi as a user. Enable the rabbitmq_management plugin: rabbitmq-plugins enable rabbitmq_management

    • The instructions for this can be found here
  4. Then restart RabbitMQ: sudo /opt/local/etc/LaunchDaemons/org.macports.rabbitmq-server/rabbitmq-server.wrapper restart
  5. Now go to http://localhost:15672/ and use the username/password combo of guest/guest
  6. Once in, go to 'Admin'
  7. Select the 'Add a user' option and enter the following

    • Username: buildapi
    • Password: buildapi
    • Tags: administrator
  8. Now submit the new user by selecting 'Add user'
  9. Once you have added 'buildapi' as a new user, you will see it listed under the 'All users' section above
  10. Select 'buildapi' and a window for permissions will come up
  11. Make sure that the permissions are set to the following

    • Virtual Host: /
    • Configure regexp: .*
    • Write regexp: .*
    • Read regexp: .*
  12. Now submit these permissions by selecting 'Set Permission'
  13. Once you have done this, the only thing left is to adjust the config.ini file at the root of buildapi to include the following lines

    • carrot.hostname = localhost
    • carrot.userid = buildapi
    • carrot.password = buildapi
    • carrot.exchange = buildapi.control
    • carrot.consumer.queue = buildapi-web
  14. Once you have made sure that the previous lines were added to your config.ini file in buildapi, then start up buildapi
  15. Go to http://localhost:15672/#/connections and a connection with the username 'buildapi' should be listed and the state should be 'running'
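
The config.ini lines from step 13 are easy to mistype, so a small checker can help. This is a sketch under assumptions: carrot_problems and read_settings are hypothetical helpers, and the parsing is a plain key = value scan that ignores section headers, since the section these keys live in is not stated here.

```python
# Expected values, copied from step 13 above.
REQUIRED = {
    "carrot.hostname": "localhost",
    "carrot.userid": "buildapi",
    "carrot.password": "buildapi",
    "carrot.exchange": "buildapi.control",
    "carrot.consumer.queue": "buildapi-web",
}


def read_settings(text):
    """Collect 'key = value' pairs, skipping comments and [section] lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";", "[")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def carrot_problems(text):
    """Return the required keys that are missing or have another value."""
    settings = read_settings(text)
    return sorted(k for k, v in REQUIRED.items() if settings.get(k) != v)


# Sample config text with one required line left out on purpose.
sample = """
carrot.hostname = localhost
carrot.userid = buildapi
carrot.password = buildapi
carrot.exchange = buildapi.control
"""
problems = carrot_problems(sample)
print(problems)
```

An empty list means all five settings are present with the values from step 13.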

And that's that! I attempted to click 'rebuild' again from a branch page like try and it worked! The database entry was successful!

Now that I have been able to get this mq issue figured out with the help of catlee and coop, thanks guys!, I will now move onto the following:

  • Update the wiki doc on Setting up a Local Virtualenv for BuildAPI with the new found instructions on getting RabbitMQ installed on Mac.
  • Begin writing up unit tests to test for proper entry of new buildrequests into the schedulerdb
  • Write up the needed logic to enter a single buildrequest
  • Review the logic
  • Lather, Rinse, Repeat

RabbitMQ

I received an email back from coop today and it sounds like he was getting exceptions similar to mine when trying to submit a build. He said that catlee helped him to install and integrate RabbitMQ with buildapi and he was then able to submit builds. Based on that, I am installing and integrating RabbitMQ into buildapi. I have hit a little snag in integrating on a Mac, since the original script from catlee is for Linux, but I should be able to get more info on that in the morning when the EST folks are back online. Additionally, coop expanded the buildapi setup docs on the wiki with info on RabbitMQ and setting up the databases, so this should prove useful for me as well!

Rock < Me < Hardplace

Bug 793989: It's been a few days since my last update, but here is the gist. I am still chasing the issue I mentioned before. It doesn't look like I am able to run any controller function that ends up calling g.mq.* (where g is app_globals), because g.mq is returning NoneType. It appears as though buildapi.lib.mq is never actually added to app_globals, or if it is, I cannot seem to find it… How is this set up in the production version of buildapi? For instance, I am assuming that when an 'authorized' user enters a valid revision into the box at the bottom of https://secure.pub.build.mozilla.org/buildapi/self-serve/try where it says "Create new dep builds on try revision", it'll successfully kick off that functionality. In my instance, this simply fails with "AttributeError: 'NoneType' object has no attribute 'newBuildAtRevision'". I have played with pdb a bit to try and unearth something, but it seems to me that there is simply a configuration of some sort missing in my local instance that is present in the production environment. I am throwing out these questions to coop to see if he has run into this issue before.
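
The failure mode is easy to reproduce in isolation. In this toy sketch, AppGlobals is a stand-in for Pylons' app_globals and is not buildapi's real class, create_dep_builds is a hypothetical controller helper, and the arguments passed to newBuildAtRevision are placeholders because its real signature is not shown here. The guard is one possible way to make the missing configuration obvious.

```python
class AppGlobals:
    """Toy stand-in for app_globals; mq stays None when nothing sets it."""

    def __init__(self, mq=None):
        self.mq = mq


def create_dep_builds(g, branch, revision):
    """Hypothetical controller helper that fails loudly if mq is missing."""
    if g.mq is None:
        raise RuntimeError("g.mq is not configured; check the mq settings")
    return g.mq.newBuildAtRevision(branch, revision)


g = AppGlobals()

# Unguarded call: same error text as the one quoted above.
try:
    g.mq.newBuildAtRevision("try", "abc123")
except AttributeError as exc:
    raw_error = str(exc)

# Guarded call: the message points at configuration instead.
try:
    create_dep_builds(g, "try", "abc123")
except RuntimeError as exc:
    guarded_error = str(exc)

print(raw_error)
print(guarded_error)
```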

Bug 931580: So, in the meantime, I am back to working on bug 931580.

Add-On Idea: Additionally, I threw an idea around to some devs about making an add-on for Firefox that takes your hg-related email (the one you always use to make checkins on hg) and then looks for, tracks/logs, and alerts you when a checkin you have made has completed all builds/tests, and whether it passed or failed (some issue other than all greens). This add-on would make use of the buildapi extension that I already built this summer, which returns JSON telling whether a checkin has finished all builds/tests and whether it has passed them all or failed (again, something other than all greens). That extension relates to bug 900318.