Progress on shipit bug 826753

Last Monday I scoped the steps for Bug 826753, and spent the week prototyping everything to a functional state. I have now created a number of bugs to help with the steps needed to ship this functionality. Here are the steps as I have scoped them:

  1. Bug 1032975 - Add new table(s) to shipit database
  2. Bug 1032978 - Add a standalone process that listens to pulse for release related buildbot messages
  3. Bug 1032985 - Add REST API entry point to shipit that allows shipit-agent to enter release data into shipit database
  4. Bug 826753 - release automation should update ship it at certain points. Add the following endpoints to the app

    • /status – Lists all available releases that have status data, in JSON format, and can be queried with parameters (e.g. /status?var1=1&var2=10). Analogous to /releases
    • /statuses.html – Pretty GUI view of the info shown in /status. Analogous to /releases.html
    • /status/<release-name>

      • GET: Lists release status info in JSON. Analogous to /releases/<release-name>
      • POST: covered in Bug 1032985
    • /status/<release-name>.html – Pretty GUI view of the info shown in /status/<release-name> GET, possibly including a visual timeline of the 6 steps: Tagging completed, All builds/repacks completed, Updates on betatest, Ready for releasetest, Ready for release and Postrelease. (status steps taken from Bug 826753 comment 0)
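To make the six steps concrete, here is a rough sketch of what the JSON returned for a single release could look like. The field names (`name`, `steps`, `completed`) and the helper `build_status` are placeholders for illustration only, not the actual shipit schema.

```python
import json

# The six status steps, in order, as listed in Bug 826753 comment 0.
STATUS_STEPS = [
    "Tagging completed",
    "All builds/repacks completed",
    "Updates on betatest",
    "Ready for releasetest",
    "Ready for release",
    "Postrelease",
]


def build_status(release_name, completed_steps):
    """Return a JSON-serializable status dict for one release (hypothetical layout)."""
    done = set(completed_steps)
    unknown = done - set(STATUS_STEPS)
    if unknown:
        raise ValueError("unknown status steps: %s" % sorted(unknown))
    return {
        "name": release_name,
        # Keep the steps ordered so a GUI could render them as a timeline.
        "steps": [{"step": s, "completed": s in done} for s in STATUS_STEPS],
    }


if __name__ == "__main__":
    status = build_status("Firefox-31.b05-build1", ["Tagging completed"])
    print(json.dumps(status, indent=2))
```

Keeping the steps as an ordered list rather than a dict means the HTML view can draw the timeline without knowing the step order itself.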

 

Currently I have a set of outstanding questions, which I will summarize here by the bug they are posted in.

  • Bug 1032975

    • Should I add 2 tables to the shipit database, or just 1? (release_data and release_status, or just release_status)
    • Is the proposed schema for release_status adequate? Does it need to be more explicit and less abstract? Does it need more fields?
  • Bug 1032978

    • What should I use other than .netrc to authenticate when sending a POST request to the shipit entry point at /status?
  • Bug 1032985

    • I'm using the routing_key field of pulse.m.o messages to find buildbot release related messages. Specifically I am filtering for *release*, but there are multiple prefixes that show up, including build.release-*, unittest.release-*, talos.release-*, etc. Should I be keeping or ignoring these messages (i.e. are they useful to me in determining the 6 status steps mentioned in Bug 826753 comment 0)?
    • I am currently using the pulse.m.o message fields product + version + build_number to assemble the name to store release data with (e.g. Firefox-31.b05-build1). This is necessary because releases are stored in the shipit database with names of this format as their primary key. The problem is that many pulse.m.o messages have None in product and version. Additionally, some products deviate from what I expected (I have seen "mobile", "firefox", and "xulrunner", when I thought I should just be seeing "firefox"; I could be mistaken about what I should expect). Should I ignore the messages that hold None in product/version? Is there some other field or set of fields that I should be filtering on?
    • Specifically what sort of messages pertain to each of the 6 steps: Tagging completed, All builds/repacks completed, Updates on betatest, Ready for releasetest, Ready for release and Postrelease? (status steps taken from Bug 826753 comment 0)
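As a rough illustration of the name assembly described above, here is a minimal sketch. It assumes each message has already been reduced to a dict with `product`, `version` and `build_number` keys (the real pulse.m.o payload layout may differ), and that capitalizing the product is enough to match the shipit naming. Both are assumptions, and `release_name` is a hypothetical helper.

```python
def release_name(message):
    """Build a 'Product-version-buildN' name, or return None if fields are missing.

    `message` is assumed to be a flat dict; this is not the actual
    pulse.m.o message structure.
    """
    product = message.get("product")
    version = message.get("version")
    build_number = message.get("build_number")
    if product is None or version is None or build_number is None:
        # Not enough information to tie the message to a shipit release.
        return None
    return "%s-%s-build%s" % (product.capitalize(), version, build_number)


if __name__ == "__main__":
    print(release_name({"product": "firefox", "version": "31.b05", "build_number": 1}))
    print(release_name({"product": None, "version": None, "build_number": 1}))
```

Returning None instead of raising lets the caller decide whether to drop such messages or to try another field.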

Useful things at this stage:

  • 640 collected pulse.m.o messages relating to Firefox-31.b05-build1 release, stored in a database with each message inhabiting a single row

    • Should contain most, if not all, buildbot release messages from the Firefox-31.b05-build1 release, but it's possible that I missed some
    • To load and use it, simply run "cat releasedata.sql | sqlite releasedata.sqlite", then use the SQLite Manager Firefox Add-On to sort and browse the database and get a feel for the sorts of messages I am working through.
  • I also have 22,000+ pulse.m.o messages from the period of time that the Firefox-31.b05-build1 release was running, which I simply scraped for messages where routing_key matches "%release%"
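For anyone who prefers a script to the add-on, the same "%release%" filter can be sketched with Python's built-in sqlite3 module. The table name `messages`, its columns, and the sample routing keys below are made up for the example; the real dump may be laid out differently.

```python
import sqlite3


def release_messages(conn):
    """Return routing keys containing 'release' (table/column names are assumed)."""
    cur = conn.execute(
        "SELECT routing_key FROM messages "
        "WHERE routing_key LIKE ? ORDER BY routing_key",
        ("%release%",),
    )
    return [row[0] for row in cur]


if __name__ == "__main__":
    # In-memory stand-in for the real database file.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE messages (routing_key TEXT, payload TEXT)")
    conn.executemany(
        "INSERT INTO messages VALUES (?, ?)",
        [
            ("build.release-mozilla-beta-linux_build.1.finished", "{}"),
            ("unittest.mozilla-central.started", "{}"),
        ],
    )
    print(release_messages(conn))
```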

If you have anything at all to add, please reply to the questions I have posted in the bugs that they pertain to :)

More to come on this!

Scoping Bug 826753 – release automation should update ship it at certain points

I spent Monday scoping Bug 826753 - release automation should update ship it at certain points, and I have come up with a set of clear deliverables. This list could still be modified going forward, but here is the rough layout.

I learned:

  • In addition to sending emails to release@, Buildbot publishes updates on the progress of releases to pulse.m.o
  • Message format for Buildbot updates is found on pulse.m.o
  • I must look for build.release-*.finished in the ['payload']['buildername'] part of the pulse.m.o messages
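The build.release-*.finished pattern can be checked with a shell-style glob. A minimal sketch, assuming the string to test has already been pulled out of the message; the sample keys in the demo are invented.

```python
from fnmatch import fnmatchcase

# Pattern taken from the notes above.
FINISHED_RELEASE_PATTERN = "build.release-*.finished"


def is_finished_release_build(key):
    """Return True if `key` looks like a finished release build message."""
    # fnmatchcase is case-sensitive on every platform, unlike fnmatch.
    return fnmatchcase(key, FINISHED_RELEASE_PATTERN)


if __name__ == "__main__":
    for key in (
        "build.release-mozilla-beta-firefox_tag.12.finished",  # made-up example
        "unittest.release-mozilla-beta.finished",              # made-up example
    ):
        print(key, is_finished_release_build(key))
```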

Am doing:

  • Setting up a logger on dev-master1 to scrape all the messages from pulse.m.o overnight, in order to catch all messages pertaining to the just-launched 31.0b4 release, which will give me a live, current sample to test with

Deliverables (no particular order):

  • https://ship-it.mozilla.org/

    • /status (AND /status?var1=1&var2=10)

      • Lists all available releases with status in JSON format and can be queried with parameters
    • /status.html

      • pretty gui view of the info shown in /status
    • /status/<release-name>

      • GET: dumps release status info in JSON
      • POST: updates a new table in update.db with new status info about <release-name>
    • /status/<release-name>.html

      • pretty gui view of the info shown in /status/<release-name> GET and will likely include a nice timeline with the A, B, C steps referred to in comment 3 of bug 826753
  • Long running standalone script that listens to pulse.m.o for updates about releases and then uses the /status/<release-name> REST API entry point to update a new table in update.db with new status info about <release-name>
  • New status table in update.db
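One possible shape for the new status table, sketched with SQLite for convenience. The table name `release_status` comes from the open questions; the columns, and the helpers `record_step` and `steps_for`, are assumptions for illustration and not a settled schema.

```python
import sqlite3

# Hypothetical layout: one row per (release, step) event.
SCHEMA = """
CREATE TABLE IF NOT EXISTS release_status (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    release_name TEXT NOT NULL,   -- e.g. Firefox-31.b05-build1
    step         TEXT NOT NULL,   -- one of the status steps
    recorded_at  TEXT NOT NULL    -- ISO 8601 timestamp
)
"""


def record_step(conn, release_name, step, recorded_at):
    """Insert one status event, creating the table on first use."""
    conn.execute(SCHEMA)
    conn.execute(
        "INSERT INTO release_status (release_name, step, recorded_at) "
        "VALUES (?, ?, ?)",
        (release_name, step, recorded_at),
    )
    conn.commit()


def steps_for(conn, release_name):
    """Return the recorded steps for a release, oldest first."""
    cur = conn.execute(
        "SELECT step FROM release_status WHERE release_name = ? ORDER BY id",
        (release_name,),
    )
    return [row[0] for row in cur]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    record_step(conn, "Firefox-31.b05-build1", "Tagging completed", "2014-07-01T12:00:00")
    print(steps_for(conn, "Firefox-31.b05-build1"))
```

An append-only event table like this keeps the history of a release, at the cost of a query to find its latest step.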

More tomorrow (Tuesday)!

Manual Testing of Arbitrary Builds

When a new selfserve-agent change is pushed to production, it's necessary to verify functionality with some manual testing. Here are some steps for basic testing:

  1. If there is no new try job to test with, submit one (see ReleaseEngineering/TryServer):

     

    • hg clone http://hg.mozilla.org/mozilla-central
    • cd mozilla-central
    • echo "THING" >> README.txt
    • hg qnew test-patch
    • hg qref --message "try: -b o -p linux64 -u none -t none"
    • hg push -f ssh://hg.mozilla.org/try/
  2. In my case you can see the try job running here: https://tbpl.mozilla.org/?tree=Try&rev=3a5e6ca198d8

     

    • If the push is successful it'll give you your own link
  3. Submit a blank arbitrary job request to https://secure.pub.build.mozilla.org/buildapi/self-serve/try/builders/Linux x86-64 try build/3a5e6ca198d8 using trigger_arbitrary_job.py
  4. python trigger_arbitrary_job.py --buildername "Linux x86-64 try build" --branch try --rev 3a5e6ca198d8

     

    • Leaving --file out so that files = []
  5. See running job here https://secure.pub.build.mozilla.org/buildapi/revision/try/3a5e6ca198d8
  6. Check for pending job at https://secure.pub.build.mozilla.org/buildapi/self-serve/try/rev/3a5e6ca198d8
  7. Also check https://tbpl.mozilla.org/?tree=Try&rev=3a5e6ca198d8
  8. Check the buildbot status by finding the appropriate master on the buildapi page https://secure.pub.build.mozilla.org/buildapi/revision/try/3a5e6ca198d8
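Since the same branch, builder name and revision recur in most of the URLs above, a small helper can generate them for a different push. This is a sketch written for Python 3; the function names are invented, and the URL layouts are simply copied from the steps above. Note that the builder name contains spaces, so it is percent-encoded.

```python
from urllib.parse import quote

BUILDAPI = "https://secure.pub.build.mozilla.org/buildapi"


def self_serve_builder_url(branch, buildername, rev):
    """URL used for the arbitrary job request (step 3)."""
    return "%s/self-serve/%s/builders/%s/%s" % (
        BUILDAPI, branch, quote(buildername), rev)


def revision_url(branch, rev):
    """URL listing running jobs for a revision (steps 5 and 8)."""
    return "%s/revision/%s/%s" % (BUILDAPI, branch, rev)


def pending_url(branch, rev):
    """URL listing pending jobs for a revision (step 6)."""
    return "%s/self-serve/%s/rev/%s" % (BUILDAPI, branch, rev)


if __name__ == "__main__":
    rev = "3a5e6ca198d8"
    print(self_serve_builder_url("try", "Linux x86-64 try build", rev))
    print(revision_url("try", rev))
    print(pending_url("try", rev))
```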

Deployed BuildAPI bug fix, L2 Access, Tupperware

A bunch of new stuff!

New Things

New Bugs

What's next?

  • Test Arbitrary Builds for Bug 1009565 – Triggering arbitrary jobs gets branch wrong
  • Make initial commit to hg.m.o/build/tupperware
  • Troubleshoot buildbot web interface on localhost:8501 
  • Make multiple Vagrantfiles to choose from based on required setup needs
  • Publish docker images to new Docker Index for Mozilla repo 
  • Create a wiki doc for Tupperware  
  • Create mysql-app that can load database schemas

That is all for now!

BuildAPI, Buildbot, RabbitMQ and MySQL containers are all up! Some testing left…

BuildAPI, Buildbot, RabbitMQ and MySQL containers are all up now! To run them, pull http://hg.mozilla.org/users/jozeller_mozilla.com/vagrant-docker-setup and run 'vagrant up' from the vagrant-docker-setup/ directory.

The vagrant up command will take several minutes to run the first time because it needs to pull the docker images from the Docker Index at docker.io. More to come tomorrow on this. NOTE: Buildbot seems to be running, but I have not been able to test *full* functionality just yet. However, the buildapi-app, rabbitmq-app and orchardup/mysql containers run together just fine.

To view

  • BuildAPI: localhost:8888
  • RabbitMQ: localhost:15672
  • Buildbot: localhost:8501 – NOT YET
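A quick way to see which of these ports is actually answering is a plain TCP connect. This is a minimal sketch using only the standard library; it only tells you that something is listening, not that the web interface behind it works.

```python
import socket

# Ports taken from the list above; the service names are just labels.
SERVICES = {"BuildAPI": 8888, "RabbitMQ": 15672, "Buildbot": 8501}


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    for name, port in sorted(SERVICES.items()):
        state = "up" if port_is_open("localhost", port) else "not reachable"
        print("%-8s localhost:%d %s" % (name, port, state))
```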

Keep checking back!

New

  • Added specific app users to mysql with passwords
  • Added version row with value 6 to schedulerdb
  • Showed that an added job from buildapi will show up in mysql on buildbot
  • The malformed url error was caused by the fact that the URL was not importing the environment variable
  • Once the env var was imported, I was still getting the malformed url, but this time it was because I had not created a password for the user. I remember when I was setting up my local buildbot instance that I ran into this same problem. There is a regex that is checking to see that the url is not malformed and it does not take kindly to the absence of passwords, regardless of the fact that mysql is okay with not having a password for a user at all.
  • Uploaded images for johnlzeller/rabbitmq, johnlzeller/buildapi and johnlzeller/buildbot to Docker Index
  • Verified that entire setup can be run in Vagrant
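To illustrate the password problem described above, here is a sketch of how a strict URL check can reject a URL that MySQL itself would accept. The pattern below is NOT buildbot's actual regex; it is a hypothetical one that, like the behaviour observed, insists on both a user and a password.

```python
import re

# Hypothetical strict pattern: user AND password are both mandatory.
# This is not buildbot's real regex, only an illustration of the failure mode.
STRICT_DB_URL = re.compile(
    r"^mysql://(?P<user>[^:@/]+):(?P<password>[^@/]+)@"
    r"(?P<host>[^:/]+)(?::(?P<port>\d+))?/(?P<database>[^?]+)$"
)


def parse_db_url(url):
    """Return the URL components, or raise ValueError for anything else."""
    match = STRICT_DB_URL.match(url)
    if match is None:
        raise ValueError("Malformed url")
    return match.groupdict()


if __name__ == "__main__":
    # Placeholder credentials; a URL with a password parses fine.
    print(parse_db_url("mysql://buildbot:secret@localhost/schedulerdb"))
    try:
        # No password: rejected by the strict pattern.
        parse_db_url("mysql://buildbot@localhost/schedulerdb")
    except ValueError as exc:
        print("rejected:", exc)
```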

What's next?

  • Create a repo on hg.mozilla.org/build for holding the Vagrantfile and the Dockerfiles for the images, then populate it with those files
  • Troubleshoot why the buildbot web interface is not showing up on localhost:8501
  • Publish setup to blog

After initial release

  • One of two things should happen:

     

    1. Have mysql-app setup to load its own schemas and users
    2. Have individual apps only load schemas and users if they do not already exist… this ensures persistence of the databases
  • Look into using the VOLUME docker command to set up an easy way to share a host directory for editing purposes. The goal here is to make it easy to make changes to the running dev setup and to test them. Currently, the docker setup just runs the tip of each repository for buildapi and buildbot
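The second option, loading schemas only when they are missing, can be sketched as an idempotent setup step. SQLite stands in for MySQL here to keep the example self-contained. The single-column `version` table and the default of 6 mirror what is described in these notes, but the exact column layout is an assumption.

```python
import sqlite3


def load_schema(conn, version=6):
    """Create the version table and seed it, but only if that is still needed."""
    # IF NOT EXISTS makes a second run a no-op instead of an error.
    conn.execute("CREATE TABLE IF NOT EXISTS version (version INTEGER NOT NULL)")
    (count,) = conn.execute("SELECT COUNT(*) FROM version").fetchone()
    if count == 0:
        # Seed the row only once so existing data is never overwritten.
        conn.execute("INSERT INTO version (version) VALUES (?)", (version,))
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_schema(conn)
    load_schema(conn)  # safe to call again
    print(conn.execute("SELECT version FROM version").fetchall())
```

Because the function checks before it writes, each app could call it on startup without clobbering a database that another container already populated.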

Questions

  • Why/how does schedulerdb.version get populated with a version number int like 6? Buildbot-app was failing because there was no row in version. I just added 6 into it, since that is what my local schedulerdb dump had, but is there a more appropriate way to do this? Does this check need to be changed? The assert can be found on line 35 of /usr/local/lib/python2.7/dist-packages/buildbot-0.8.2_hg_f6d9311d9246_production_0.8-py2.7.egg/buildbot/db/schema/manager.py

BuildAPI-app, RabbitMQ-app and orchardup/mysql are working correctly

BuildAPI-app, RabbitMQ-app and orchardup/mysql are working correctly. This post is a short update on working through the What's next list from the previous post. Here is the updated list:

What's next?

The next steps are these:

  • Resolve exceptions.ValueError in buildbot-app
  • Resolve sqlalchemy.exc.OperationalError in buildapi-app
  • Link rabbitmq, mysql, and buildapi and test that everything works
  • Link mysql, and buildbot and test that everything works
  • Link rabbitmq, mysql, buildapi AND buildbot and test that the whole package works
  • See if there is a good way to load statusdb and schedulerdb schemas into mysql in a mysql-app setup built on the orchardup/mysql image. This would prevent the redundancy of loading schemas in buildapi-app and buildbot-app

Linking of docker containers and further issues with buildbot-app

All docker containers now exist, and one of the only things left to do is get all the containers playing nice with one another.

MySQL-app

I set out to break out mysql into its own docker container and made good progress, but before proceeding further with debugging some setup problems, I checked whether anyone was opposed to using another mysql docker container as a foundation for our own. There are hundreds of mysql docker containers out there, so it seemed silly to duplicate work if unnecessary. No one had objections, so I went ahead and picked out a mysql docker container to use. I chose orchardup/mysql from the Docker Index because it is pretty barebones and because it adds the ability to set environment variables in the container at runtime to do things like set up your own usernames, passwords, databases, etc.

After a while of trying to modify the run scripts that the orchardup/mysql image uses to launch the mysql server, I decided to back down for the time being. I was attempting to use orchardup/mysql as a base for our own mysql-app, so that I could then have our app do the additional loading of statusdb and schedulerdb schemas. This proved to be a pain, and so rather than fight it further, I went with the redundant option of having buildapi-app and buildbot-app each individually load the schemas they needed into the database, regardless of whether the schema already existed. I am not happy with this as a permanent solution for this development setup, but it should work well for our initial setup.

This also means that vagrant will now simply need to pull the orchardup/mysql image, run it, forward ports, and link it with the other container apps, making this the lightest setup.

I modified buildbot-app and buildapi-app to use the newly created environment variables for the mysql app when connecting and using the databases (they appear upon running the docker containers when linking).

Buildbot-app

When I went to test buildbot-app, I ran into an exceptions.ValueError: Malformed url

(Buildbot)root@96fbd42254f3:/# /start_buildbot.sh
mysql: option '-h' requires an argument
cd master && buildbot start $PWD
Following twistd.log until startup finished..
/usr/local/lib/python2.7/dist-packages/buildbot-0.8.2_hg_f6d9311d9246_production_0.8-py2.7.egg/buildbot/scripts/logwatcher.py:52: PotentialZombieWarning: spawnProcess called, but the SIGCHLD handler is not installed. This probably means you have not yet called reactor.run, or called reactor.run(installSignalHandler=0). You will probably never see this process finish, and it may become a zombie process.
  env=os.environ,
2014-05-21 05:35:52+0000 [-] Log opened.
2014-05-21 05:35:52+0000 [-] twistd 9.0.0 (/usr/bin/python2.7 2.7.3) starting up.
2014-05-21 05:35:52+0000 [-] reactor class: twisted.internet.selectreactor.SelectReactor.
2014-05-21 05:35:52+0000 [-] monkeypatch_twisted_cbLogin applied
2014-05-21 05:35:52+0000 [-] Creating BuildMaster -- buildbot.version: 0.8.2-hg-f6d9311d9246-production-0.8
2014-05-21 05:35:52+0000 [-] loading configuration from /Buildbot/build-master/master.cfg
2014-05-21 05:35:52+0000 [-] unable to import dnotify, so Maildir will use polling instead
2014-05-21 05:35:52+0000 [-] JacuzziAllocator 44938192: created
2014-05-21 05:35:52+0000 [-] nextAWSSlave: start
2014-05-21 05:35:52+0000 [-] nextAWSSlave: start
2014-05-21 05:35:54+0000 [-] JacuzziAllocator 37763792: created
2014-05-21 05:35:54+0000 [-] nextAWSSlave: start
2014-05-21 05:35:54+0000 [-] nextAWSSlave: start
2014-05-21 05:35:59+0000 [-] finished loading config file
2014-05-21 05:36:01+0000 [-] BuildMaster listening on port tcp:9000
2014-05-21 05:36:01+0000 [-] configuration update started
2014-05-21 05:36:01+0000 [-] configuration update failed
2014-05-21 05:36:01+0000 [-] Unhandled Error
    Traceback (most recent call last):
      File "/usr/local/lib/python2.7/dist-packages/buildbot-0.8.2_hg_f6d9311d9246_production_0.8-py2.7.egg/buildbot/master.py", line 628, in loadTheConfigFile
        d = self.loadConfig(f)
      File "/usr/local/lib/python2.7/dist-packages/buildbot-0.8.2_hg_f6d9311d9246_production_0.8-py2.7.egg/buildbot/master.py", line 933, in loadConfig
        d.addCallback(lambda res:
      File "/usr/local/lib/python2.7/dist-packages/Twisted-9.0.0-py2.7-linux-x86_64.egg/twisted/internet/defer.py", line 190, in addCallback
        callbackKeywords=kw)
      File "/usr/local/lib/python2.7/dist-packages/Twisted-9.0.0-py2.7-linux-x86_64.egg/twisted/internet/defer.py", line 181, in addCallbacks
        self._runCallbacks()
    --- <exception caught here> ---
      File "/usr/local/lib/python2.7/dist-packages/Twisted-9.0.0-py2.7-linux-x86_64.egg/twisted/internet/defer.py", line 323, in _runCallbacks
        self.result = callback(self.result, *args, **kw)
      File "/usr/local/lib/python2.7/dist-packages/buildbot-0.8.2_hg_f6d9311d9246_production_0.8-py2.7.egg/buildbot/master.py", line 934, in <lambda>
        self.loadConfig_Database(db_url, db_poll_interval))
      File "/usr/local/lib/python2.7/dist-packages/buildbot-0.8.2_hg_f6d9311d9246_production_0.8-py2.7.egg/buildbot/master.py", line 1055, in loadConfig_Database
        db_spec = DBSpec.from_url(db_url, self.basedir)
      File "/usr/local/lib/python2.7/dist-packages/buildbot-0.8.2_hg_f6d9311d9246_production_0.8-py2.7.egg/buildbot/db/dbspec.py", line 175, in from_url
        raise ValueError("Malformed url")
    exceptions.ValueError: Malformed url
    

The buildmaster took more than 10 seconds to start, so we were unable to
confirm that it started correctly. Please 'tail twistd.log' and look for a
line that says 'configuration update complete' to verify correct startup.

make: *** [start] Error 1

It's possible this has to do with the mysql setup, as I possibly didn't link things up fully. More testing is necessary for tomorrow.

Buildapi-app

To run rabbitmq, mysql, and buildapi all linked together, run these commands in sequence

  • docker run -d -p 5672:5672 -p 15672:15672 -p 4369:4369 -name rabbitmq rabbitmq-app
  • docker run -d -p 3306:3306 -name=mysql orchardup/mysql
  • docker run -t -i -p 8888:8888 -link rabbitmq:mq -link mysql:sql -name buildapi buildapi-app /bin/bash

This will drop you into a bash shell session in buildapi-app

When I attempt to run /start_selfserve_buildapi.sh I receive the following error:

root@e141d055c1c7:/# ./start_selfserve_buildapi.sh
Starting subprocess with file monitor
Running reloading file monitor
2014-05-21 06:37:13,352 Kombu connection revived
2014-05-21 06:37:13,353 Connected to amqp://selfserveagent@172.17.0.2:5672//
Traceback (most recent call last):
  File "/usr/local/bin/paster", line 9, in <module>
    load_entry_point('PasteScript==1.7.3', 'console_scripts', 'paster')()
  File "/usr/local/lib/python2.7/dist-packages/paste/script/command.py", line 84, in run
    invoke(command, command_name, options, args[1:])
  File "/usr/local/lib/python2.7/dist-packages/paste/script/command.py", line 123, in invoke
    exit_code = runner.run(args)
  File "/usr/local/lib/python2.7/dist-packages/paste/script/command.py", line 218, in run
    result = self.command()
  File "/usr/local/lib/python2.7/dist-packages/paste/script/serve.py", line 276, in command
    relative_to=base, global_conf=vars)
  File "/usr/local/lib/python2.7/dist-packages/paste/script/serve.py", line 313, in loadapp
    **kw)
  File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 247, in loadapp
    return loadobj(APP, uri, name=name, **kw)
  File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 272, in loadobj
    return context.create()
  File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 710, in create
    return self.object_type.invoke(self)
  File "/usr/local/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 146, in invoke
    return fix_call(context.object, context.global_conf, **context.local_conf)
  File "/usr/local/lib/python2.7/dist-packages/paste/deploy/util.py", line 56, in fix_call
    val = callable(*args, **kw)
  File "/buildapi/buildapi/config/middleware.py", line 55, in make_app
    config = load_environment(global_conf, app_conf)
  File "/buildapi/buildapi/config/environment.py", line 66, in load_environment
    init_scheduler_model(scheduler_engine)
  File "/buildapi/buildapi/model/__init__.py", line 7, in init_scheduler_model
    scheduler_db_meta.reflect(bind=engine)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/schema.py", line 2342, in reflect
    conn = bind.contextual_connect()
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 2284, in contextual_connect
    self.pool.connect(),
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 209, in connect
    return _ConnectionFairy(self).checkout()
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 370, in __init__
    rec = self._connection_record = pool._do_get()
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 696, in _do_get
    con = self._create_connection()
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 174, in _create_connection
    return _ConnectionRecord(self)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 255, in __init__
    self.connection = self.__connect()
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 315, in __connect
    connection = self.__pool._creator()
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/strategies.py", line 80, in connect
    return dialect.connect(*cargs, **cparams)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/default.py", line 275, in connect
    return self.dbapi.connect(*cargs, **cparams)
  File "/usr/local/lib/python2.7/dist-packages/MySQLdb/__init__.py", line 81, in Connect
    return Connection(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/MySQLdb/connections.py", line 187, in __init__
    super(Connection, self).__init__(*args, **kwargs2)
sqlalchemy.exc.OperationalError: (OperationalError) (2005, "Unknown MySQL server host 'SQL_PORT_3306_TCP_ADDR' (0)") None None

Again, this looks like a linking issue and more testing is necessary.
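The error message contains the literal variable name SQL_PORT_3306_TCP_ADDR as the host, which suggests the name was never replaced by its value. Here is a minimal sketch of building the database URL from the link variables and failing early when they are absent. The user, password and database values are placeholders, and `mysql_url_from_link` is a hypothetical helper; the variable names follow the pattern docker sets for a link aliased as `sql` on port 3306.

```python
import os


def mysql_url_from_link(user, password, database, alias="SQL", environ=None):
    """Assemble a MySQL URL from docker link environment variables."""
    environ = os.environ if environ is None else environ
    prefix = "%s_PORT_3306_TCP" % alias.upper()
    try:
        host = environ[prefix + "_ADDR"]
        port = environ[prefix + "_PORT"]
    except KeyError as exc:
        # Fail here rather than passing an unexpanded name on to MySQL.
        raise RuntimeError("missing docker link variable: %s" % exc.args[0])
    return "mysql://%s:%s@%s:%s/%s" % (user, password, host, port, database)


if __name__ == "__main__":
    # Fake environment standing in for what a linked container would see.
    fake_env = {
        "SQL_PORT_3306_TCP_ADDR": "172.17.0.3",
        "SQL_PORT_3306_TCP_PORT": "3306",
    }
    print(mysql_url_from_link("buildapi", "secret", "schedulerdb", environ=fake_env))
```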

What's next?

The next steps are these:

  • Resolve exceptions.ValueError in buildbot-app
  • Resolve sqlalchemy.exc.OperationalError in buildapi-app
  • Link rabbitmq, mysql, and buildapi and test that everything works
  • Link mysql, and buildbot and test that everything works
  • Link rabbitmq, mysql, buildapi AND buildbot and test that the whole package works
  • See if there is a good way to load statusdb and schedulerdb schemas into mysql in a mysql-app setup built on the orchardup/mysql image. This would prevent the redundancy of loading schemas in buildapi-app and buildbot-app

Things I found useful here

  • docker logs <container id>
  • vboxmanage modifyvm boot2docker-vm --natpf1 delete http
  • vboxmanage modifyvm boot2docker-vm --natpf1 "http,tcp,127.0.0.1,8888,,8888"

Things to look into

  • VOLUME docker command
  • Renaming apps as mozilla/buildbot-dev or mozilla/buildbot-dev
  • Setting multiple natpf's for boot2docker testing

Buildbot-app issues resolved, next installing mysql

I was able to resolve the issues I was having previously with buildbot-app thanks to some help from nthomas and aki. The changes that they suggested and some that I discovered that solved things were as follows:

  • apt-get install -y python-openssl
  • Added a .bashrc with:

     

    • source /Buildbot/bin/activate
    • export PYTHONPATH=/Buildbot:/Buildbot/tools/lib/python
  • Switched to the production-0.8 branch of buildbot
  • Fixed issues with the configuration of buildbotcustom and tools being added to the path
  • ln -s /Buildbot/buildbotcustom /Buildbot/lib/python2.7/site-packages/buildbotcustom
  • Removed ["mozilla-1.9.2", "mozilla-beta"] from release_branches in master_config.json
  • Remove PYTHONPATH export from buildbot-configs/Makefile.master

     

    • This was 'export PYTHONPATH=""', so every time I ran make anything it reset my PYTHONPATH, negating anything I had added to it

After all that, I was able to get buildbot-app running, but wasn't able to verify whether it was actually up and ready to be used. Still left to do is to break out mysql into its own docker container and then link buildbot up to the mysql container.

Lingering issue:

When running ./test-masters.sh in /Buildbot/buildbot-configs/, 2 solitary tests on 2 masters fail, but the logs show nothing at all.

(Buildbot)root@177c6fc1687c:/Buildbot/buildbot-configs# ./test-masters.sh
Checking 22 masters…
bm01-tests1-linux32 bm51-tests1-linux64 bm69-tests1-windows bm70-build1 bm75-try1 bm81-build_scheduler bm81-tests_scheduler bm88-tests1-tegra bm89-tests1-panda bm103-tests1-linux bm106-tests1-macosx bm113-tests1-linux64 bm01-tests1-linux32-universal bm51-tests1-linux64-universal bm69-tests1-windows-universal bm70-build1-universal bm75-try1-universal bm88-tests1-tegra-universal bm89-tests1-panda-universal bm103-tests1-linux-universal bm106-tests1-macosx-universal bm113-tests1-linux64-universal
INFO  – creating "bm89-tests1-panda" master
INFO  – created  "bm89-tests1-panda" master, running checkconfig
INFO  – TEST-PASS checkconfig OK for bm89-tests1-panda
INFO  – TEST-SUMMARY: 22 tested, 0 failed
INFO  – creating "bm51-tests1-linux64" master
INFO  – created  "bm51-tests1-linux64" master, running checkconfig
ERROR – TEST-FAIL bm51-tests1-linux64 failed to run checkconfig
INFO  – log for "bm51-tests1-linux64" is "/Buildbot/buildbot-configs/test-output/bm51-tests1-linux64-0yGBNj-checkconfig.log"
INFO  – TEST-SUMMARY: 22 tested, 1 failed
INFO  – FAILED-MASTER bm51-tests1-linux64, log: 'test-output/bm51-tests1-linux64-0yGBNj-checkconfig.log', dir: 'test-output/bm51-tests1-linux64-0yGBNj'
INFO  – creating "bm81-build_scheduler" master
INFO  – created  "bm81-build_scheduler" master, running checkconfig
INFO  – TEST-PASS checkconfig OK for bm81-build_scheduler
INFO  – TEST-SUMMARY: 22 tested, 0 failed
INFO  – creating "bm70-build1-universal" master
INFO  – created  "bm70-build1-universal" master, running checkconfig
INFO  – TEST-PASS checkconfig OK for bm70-build1-universal
INFO  – TEST-SUMMARY: 22 tested, 0 failed
INFO  – creating "bm75-try1" master
INFO  – created  "bm75-try1" master, running checkconfig
INFO  – TEST-PASS checkconfig OK for bm75-try1
INFO  – TEST-SUMMARY: 22 tested, 0 failed
INFO  – creating "bm75-try1-universal" master
INFO  – created  "bm75-try1-universal" master, running checkconfig
INFO  – TEST-PASS checkconfig OK for bm75-try1-universal
INFO  – TEST-SUMMARY: 22 tested, 0 failed
INFO  – creating "bm70-build1" master
INFO  – created  "bm70-build1" master, running checkconfig
INFO  – TEST-PASS checkconfig OK for bm70-build1
INFO  – TEST-SUMMARY: 22 tested, 0 failed
INFO  – creating "bm103-tests1-linux-universal" master
INFO  – created  "bm103-tests1-linux-universal" master, running checkconfig
INFO  – TEST-PASS checkconfig OK for bm103-tests1-linux-universal
INFO  – TEST-SUMMARY: 22 tested, 0 failed
INFO  – creating "bm106-tests1-macosx" master
INFO  – created  "bm106-tests1-macosx" master, running checkconfig
INFO  – TEST-PASS checkconfig OK for bm106-tests1-macosx
INFO  – TEST-SUMMARY: 22 tested, 0 failed
INFO  – creating "bm69-tests1-windows" master
INFO  – created  "bm69-tests1-windows" master, running checkconfig
INFO  – TEST-PASS checkconfig OK for bm69-tests1-windows
INFO  – TEST-SUMMARY: 22 tested, 0 failed
INFO  – creating "bm106-tests1-macosx-universal" master
INFO  – created  "bm106-tests1-macosx-universal" master, running checkconfig
INFO  – TEST-PASS checkconfig OK for bm106-tests1-macosx-universal
INFO  – TEST-SUMMARY: 22 tested, 0 failed
INFO  – creating "bm88-tests1-tegra" master
INFO  – created  "bm88-tests1-tegra" master, running checkconfig
INFO  – TEST-PASS checkconfig OK for bm88-tests1-tegra
INFO  – TEST-SUMMARY: 22 tested, 0 failed
INFO  – creating "bm89-tests1-panda-universal" master
INFO  – created  "bm89-tests1-panda-universal" master, running checkconfig
INFO  – TEST-PASS checkconfig OK for bm89-tests1-panda-universal
INFO  – TEST-SUMMARY: 22 tested, 0 failed
INFO  – creating "bm81-tests_scheduler" master
INFO  – created  "bm81-tests_scheduler" master, running checkconfig
ERROR – TEST-FAIL bm81-tests_scheduler failed to run checkconfig
INFO  – log for "bm81-tests_scheduler" is "/Buildbot/buildbot-configs/test-output/bm81-tests_scheduler-0DxsKt-checkconfig.log"
INFO  – TEST-SUMMARY: 22 tested, 1 failed
INFO  – FAILED-MASTER bm81-tests_scheduler, log: 'test-output/bm81-tests_scheduler-0DxsKt-checkconfig.log', dir: 'test-output/bm81-tests_scheduler-0DxsKt'
INFO  – creating "bm113-tests1-linux64-universal" master
INFO  – created  "bm113-tests1-linux64-universal" master, running checkconfig
INFO  – TEST-PASS checkconfig OK for bm113-tests1-linux64-universal
INFO  – TEST-SUMMARY: 22 tested, 0 failed
INFO  – creating "bm103-tests1-linux" master
INFO  – created  "bm103-tests1-linux" master, running checkconfig
INFO  – TEST-PASS checkconfig OK for bm103-tests1-linux
INFO  – TEST-SUMMARY: 22 tested, 0 failed
INFO  – creating "bm88-tests1-tegra-universal" master
INFO  – created  "bm88-tests1-tegra-universal" master, running checkconfig
INFO  – TEST-PASS checkconfig OK for bm88-tests1-tegra-universal
INFO  – TEST-SUMMARY: 22 tested, 0 failed
INFO  – creating "bm69-tests1-windows-universal" master
INFO  – created  "bm69-tests1-windows-universal" master, running checkconfig
INFO  – TEST-PASS checkconfig OK for bm69-tests1-windows-universal
INFO  – TEST-SUMMARY: 22 tested, 0 failed
INFO  – creating "bm51-tests1-linux64-universal" master
INFO  – created  "bm51-tests1-linux64-universal" master, running checkconfig
INFO  – TEST-PASS checkconfig OK for bm51-tests1-linux64-universal
INFO  – TEST-SUMMARY: 22 tested, 0 failed
INFO  – creating "bm01-tests1-linux32-universal" master
INFO  – created  "bm01-tests1-linux32-universal" master, running checkconfig
INFO  – TEST-PASS checkconfig OK for bm01-tests1-linux32-universal
INFO  – TEST-SUMMARY: 22 tested, 0 failed
INFO  – creating "bm01-tests1-linux32" master
INFO  – created  "bm01-tests1-linux32" master, running checkconfig
INFO  – TEST-PASS checkconfig OK for bm01-tests1-linux32
INFO  – TEST-SUMMARY: 22 tested, 0 failed
INFO  – creating "bm113-tests1-linux64" master
INFO  – created  "bm113-tests1-linux64" master, running checkconfig
INFO  – TEST-PASS checkconfig OK for bm113-tests1-linux64
INFO  – TEST-SUMMARY: 22 tested, 0 failed
*** 2 master tests failed ***
Failed masters:
  bm51-tests1-linux64
  bm81-tests_scheduler
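With this much output it is easy to lose the failures in the noise. A small sketch that pulls the failed master names out of a saved copy of the output, assuming the TEST-FAIL line format shown above; `failed_masters` is a hypothetical helper.

```python
import re

# Matches lines such as:
#   ERROR - TEST-FAIL bm51-tests1-linux64 failed to run checkconfig
FAIL_RE = re.compile(r"TEST-FAIL (\S+) failed to run checkconfig")


def failed_masters(log_text):
    """Return the failed master names in order of first appearance."""
    seen = []
    for match in FAIL_RE.finditer(log_text):
        name = match.group(1)
        if name not in seen:
            seen.append(name)
    return seen


if __name__ == "__main__":
    sample = (
        "INFO  - TEST-PASS checkconfig OK for bm75-try1\n"
        "ERROR - TEST-FAIL bm51-tests1-linux64 failed to run checkconfig\n"
        "ERROR - TEST-FAIL bm81-tests_scheduler failed to run checkconfig\n"
    )
    print(failed_masters(sample))
```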

ImportError: No module named buildbotcustom.misc

In my previous post, I walked through the setup that I have so far for buildbot in a docker container and the issues I am having with it. It turns out that it all seems to revolve around an ImportError for a module named buildbotcustom.misc. I added the following to the Dockerfile based on suggestions:

RUN export PYTHONPATH=/Buildbot:/Buildbot/tools/lib/python
RUN ln -s /Buildbot/buildbotcustom /Buildbot/lib/python2.7/site-packages/buildbotcustom

When I run ./test-masters.sh in /Buildbot/buildbot-configs, it fails all 20 masters. When I look at *checkconfig.log in the test-output/ directory, each and every master is failing checkconfig with this error, which makes sense given that this is the same error given when running make checkconfigs in /Buildbot/build-master/. Seems like a simple path issue, but I am unsure what the next step should be here.
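One way to narrow down a path issue like this is to ask Python directly whether the module can be found, and to print the search path it is using. The sketch below is written for Python 3 (`importlib.util.find_spec`), whereas the container here runs Python 2.7, so treat it as an illustration of the check rather than something to paste in as-is.

```python
import importlib.util
import sys


def can_import(module_name):
    """Return True if `module_name` can be located on the current sys.path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec raises if a parent package (e.g. buildbotcustom) is missing.
        return False


if __name__ == "__main__":
    name = "buildbotcustom.misc"
    print("%s importable: %s" % (name, can_import(name)))
    for entry in sys.path:
        print("  sys.path entry:", entry)
```

Running this inside the same shell and virtualenv that test-masters.sh uses matters, since the point is to see the path that process actually gets.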

The Dockerfile and other files necessary for the container in question here can be found here: http://hg.mozilla.org/users/jozeller_mozilla.com/vagrant-docker-setup/file/b2c0600df541/buildbot-app

To run this container just pull the directory above and run:

docker build -t buildbot-app .; docker run -i -t buildbot-app /bin/bash;

Then the container will launch, assuming you already have docker set up on your system. Don't forget to activate the virtualenv in /Buildbot/

Setting up the Buildbot-app docker container

Now that the rabbitmq-app and buildapi-app containers are set up and running well in docker, it is time to get the buildbot container running. Some things to keep in mind:

  • Most of the wiki info available for setting up buildbot is for Linux systems as far as I can tell, which is a plus since setting up on Mac was a bit of a hassle last time.
  • MySQL may need to be broken out into its own container so that buildbot can play nice with the other containers and write to the dbs. This is possible through linking, but if you only wanted a buildbot container without buildapi, that would sort of defeat the purpose of having containerized apps in the first place. So this is likely what will happen next, which shouldn't be too bad.

I began building the buildbot-app Dockerfile, which can be found in my vagrant-docker-setup repo. To build this I have been referencing the fresh_bb_setup script in the buildbot-related braindump repo. Setup went pretty well, and the Dockerfile seems able to set up the buildbot instance. However, when I run test-masters.sh from buildbot-configs/, all 20 master tests fail.

(Buildbot)root@d85be8f020ed:/Buildbot/buildbot-configs# ./test-masters.sh
Checking 20 masters…
bm01-tests1-linux32 bm51-tests1-linux64 bm69-tests1-windows bm70-build1 bm75-try1 bm81-build_scheduler bm81-tests_scheduler bm88-tests1-tegra bm89-tests1-panda bm103-tests1-linux bm106-tests1-macosx bm01-tests1-linux32-universal bm51-tests1-linux64-universal bm69-tests1-windows-universal bm70-build1-universal bm75-try1-universal bm88-tests1-tegra-universal bm89-tests1-panda-universal bm103-tests1-linux-universal bm106-tests1-macosx-universal
INFO  – creating "bm69-tests1-windows" master
INFO  – created  "bm69-tests1-windows" master, running checkconfig
ERROR – TEST-FAIL bm69-tests1-windows failed to run checkconfig
INFO  – log for "bm69-tests1-windows" is "/Buildbot/buildbot-configs/test-output/bm69-tests1-windows-x50Zzq-checkconfig.log"
INFO  – TEST-SUMMARY: 20 tested, 1 failed
INFO  – FAILED-MASTER bm69-tests1-windows, log: 'test-output/bm69-tests1-windows-x50Zzq-checkconfig.log', dir: 'test-output/bm69-tests1-windows-x50Zzq'
INFO  – creating "bm88-tests1-tegra" master
INFO  – created  "bm88-tests1-tegra" master, running checkconfig
ERROR – TEST-FAIL bm88-tests1-tegra failed to run checkconfig
INFO  – log for "bm88-tests1-tegra" is "/Buildbot/buildbot-configs/test-output/bm88-tests1-tegra-sFKRvD-checkconfig.log"
INFO  – TEST-SUMMARY: 20 tested, 1 failed
INFO  – FAILED-MASTER bm88-tests1-tegra, log: 'test-output/bm88-tests1-tegra-sFKRvD-checkconfig.log', dir: 'test-output/bm88-tests1-tegra-sFKRvD'
INFO  – creating "bm103-tests1-linux-universal" master
INFO  – created  "bm103-tests1-linux-universal" master, running checkconfig
ERROR – TEST-FAIL bm103-tests1-linux-universal failed to run checkconfig
…this continues on for the other masters…
INFO  – log for "bm106-tests1-macosx" is "/Buildbot/buildbot-configs/test-output/bm106-tests1-macosx-mo1eFI-checkconfig.log"
INFO  – TEST-SUMMARY: 20 tested, 1 failed
INFO  – FAILED-MASTER bm106-tests1-macosx, log: 'test-output/bm106-tests1-macosx-mo1eFI-checkconfig.log', dir: 'test-output/bm106-tests1-macosx-mo1eFI'
*** 20 master tests failed ***
Failed masters:
  bm69-tests1-windows
  bm88-tests1-tegra
  bm103-tests1-linux-universal
  bm81-tests_scheduler
  bm81-build_scheduler
  bm51-tests1-linux64
  bm89-tests1-panda
  bm103-tests1-linux
  bm106-tests1-macosx-universal
  bm89-tests1-panda-universal
  bm51-tests1-linux64-universal
  bm70-build1-universal
  bm70-build1
  bm01-tests1-linux32
  bm01-tests1-linux32-universal
  bm75-try1
  bm75-try1-universal
  bm69-tests1-windows-universal
  bm88-tests1-tegra-universal
  bm106-tests1-macosx
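
Since every failing master writes its own log under test-output/ (see the "log for" lines above), a quick way to tell whether all 20 failures share one cause is to tally the final line of each log. This is only a sketch: the function name summarize_checkconfig_logs is made up, and it assumes the logs are plain text files matching *checkconfig.log whose last non-empty line is the exception message, as is the case for a Python traceback.

```python
from __future__ import print_function

import glob
import os
from collections import Counter


def summarize_checkconfig_logs(log_dir):
    """Count the last non-empty line of every *checkconfig.log in log_dir."""
    counts = Counter()
    for path in sorted(glob.glob(os.path.join(log_dir, "*checkconfig.log"))):
        with open(path) as handle:
            lines = [line.strip() for line in handle if line.strip()]
        # A Python traceback ends with the exception line, so the last
        # non-empty line is usually the most informative one.
        counts[lines[-1] if lines else "<empty log>"] += 1
    return counts


if __name__ == "__main__":
    # Path taken from the output above; adjust if your checkout differs.
    summary = summarize_checkconfig_logs(
        "/Buildbot/buildbot-configs/test-output")
    for message, count in summary.most_common():
        print(count, message)
```

A single message with a count equal to the number of failed masters points at one shared cause rather than many separate configuration problems.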

Additionally, when I attempt to run 'make checkconfig' from build-master, I get an error stating that the module buildbotcustom.misc does not exist.

root@31034fae6dba:/Buildbot/build-master# make checkconfig
cd master && buildbot checkconfig
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/buildbot-0.8.2_hg_6ac947fa5721_default-py2.7.egg/buildbot/scripts/runner.py", line 1042, in doCheckConfig
    ConfigLoader(configFileName=configFileName)
  File "/usr/local/lib/python2.7/dist-packages/buildbot-0.8.2_hg_6ac947fa5721_default-py2.7.egg/buildbot/scripts/checkconfig.py", line 31, in __init__
    self.loadConfig(configFile, check_synchronously_only=True)
  File "/usr/local/lib/python2.7/dist-packages/buildbot-0.8.2_hg_6ac947fa5721_default-py2.7.egg/buildbot/master.py", line 652, in loadConfig
    exec f in localDict
  File "/Buildbot/build-master/master.cfg", line 8, in <module>
    import buildbotcustom.misc
ImportError: No module named buildbotcustom.misc
make: *** [checkconfig] Error 1

Next steps with setting up the buildbot-app docker container are:

  • Discover why all 20 masters are failing when I run test-masters.sh

    • Have ./test-masters.sh succeed
  • Figure out why the buildbotcustom.misc module is not being found, what it is in the first place and how to either install it or add it to the path properly

    • Have make checkconfig succeed in build-master, try-master, and test-master
  • Run the 'buildbot start' command and have it start successfully!
  • Expose the proper ports for this buildbot-app docker container and forward those ports when launching the container so that I can test that buildbot is up and running and that I can see the web app
  • Set up the mysql-app docker container and link it successfully with the buildapi-app, rabbitmq-app and buildbot-app docker containers
  • Test that the whole set of docker containers is interconnected and running correctly by running commands similar to those that work in my own local development setup
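
For the port-forwarding step, a small TCP check is enough to tell whether anything is listening before opening a browser. This is a sketch under assumptions: the helper port_is_open is made up, and port 8010 is only a placeholder for whichever port the buildbot web status ends up on in master.cfg and is published with docker run -p.

```python
from __future__ import print_function

import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        return False
    conn.close()
    return True


if __name__ == "__main__":
    # Placeholder host and port: replace with the address and the
    # forwarded port actually used for the buildbot-app container.
    print("buildbot web port open:", port_is_open("localhost", 8010))
```

With boot2docker the containers run inside a VM, so the host to test is likely the VM's address rather than localhost.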

Note: To run this container do the following:

  1. boot2docker up
  2. hg clone http://hg.mozilla.org/users/jozeller_mozilla.com/vagrant-docker-setup
  3. cd vagrant-docker-setup/buildbot-app
  4. docker build -t buildbot-app .
  5. docker run -i -t buildbot-app /bin/bash

This will get you into a shell session in the buildbot-app docker container, where you can recreate the same issues that I describe above, as of revision b2c0600df541.