Fixes#602.
I replaced the SIGKILL with a SIGINT, and then catch SIGINT with a
handler. This handler calls cancelSubs if necessary, and can later be
edited to perform other clean-up operations, too. I thought about, in
this same commit, changing the SIGTERM in
maybe_restart_mirroring_script to a SIGINT, but after tracing out the
code paths, I realized that isn't necessary. (The SIGTERM is
necessarily performed on a process that has not subscribed to any
zephyr classes, so cancelSubs is unnecessary. If we do think that we
may want to add additional clean-up operations in the future, though,
then it might be worth investigating changing this SIGTERM.)
(imported from commit 692b295be6cb40b0e4ec2ca0bc58c58056ed9bd9)
Bots are not part of what we distribute, so put them in the repo root.
We also updated some of the bots to use relative path names.
(imported from commit 0471d863450712fd0cdb651f39f32e9041df52ba)
My previous commit (fbdc092029bbafea716e27fbb99fec58a6f24392)
incorrectly specified that you must have a version of python-requests
greater than 0.12.1, when it should be a >= relation since 0.12.1 is
sufficient.
(imported from commit 9f716af6dfe0ce17d982fc22d507f144e9543bec)
Previously, if users of our code put the API folder in their pyshared
they would have to import it as "humbug.humbug". By moving Humbug's API
into a directory named "humbug" and moving the API into __init__, you
can just "import humbug".
(imported from commit 1d2654ae57f8ecbbfe76559de267ec4889708ee8)
The old-style api.humbug is deprecated, so we rewrite the path for things
to work correctly.
(imported from commit 360282b70fab1ad9d9d517157fc4fe06f9acd6a2)
This was causing about 10% of the time, personals being forwarded by
tabbott/extra@ATHENA.MIT.EDU to get this error:
zwrite: Field too long for buffer while sending notice to
tabbott/extra@ATHENA.MIT.EDU
We don't fully understand the cause of the problem, but this fixes it.
(imported from commit 22c39a1061cde9ad6973ef07aee7227623a3d47d)
Sometimes the very first message we send isn't received by
python-zephyr; this could be because of our growing a 17-message queue
of zephyrs to service, so let's not do that.
(imported from commit 281bf1807442b6335b05c803b1a47e0a162bef4e)
This ensures that Humbug receives the verbatim text of the Nagios
body, rather than a slightly modified version. Tested by running
manually.
(imported from commit 7e0eea0b230fd8c5552860acfb7372099598f036)
The refactoring that we did a couple weeks ago to make the zephyr
mirror script restart itself automatically (by splitting it into
zephyr_mirror.py and zephyr_mirror_backend.py) had a poor interaction
with our code for killing old zephyr_mirror processes (to prevent
double-mirroring). If you manually ran two copies of the outer
mirroring script (zephyr_mirror.py), then the second one would on
startup kill the first one's zephyr_mirror_backend.py children (for
being duplicate zephyr_mirror_backend.py processes that would result
in double-mirroring). However, importantly, it did not kill the first
mirroring script's zephyr_mirror.py parent process, so the first
mirroring script would then proceed to startup up new children. The
process then repeats, with the two scripts' roles reversed.
This issue didn't affect the sharded mirroring case, where I had been
doing the testing of the kill code with that refactoring, because we
don't have a version of the outer zephyr_mirror.py loop for that
situation (a consequence of it being hard to restart the threads
properly with the run_parallel API that we're using to spawn all the
children).
(imported from commit d4886ac77312a6b0ebd0d612f6fb084970bf23a2)
This is probably a lot more annoying to use, in that e.g. there are
separate log files for the two directions of the mirror, but we
haven't used these logs for much, so whatever.
(imported from commit d3bc407d90099214d242526c01cd3d3cd9d9d9bd)
Previously all that we logged was the sender, which turns out to be a
constant for humbug=>zephyr mirroring.
(imported from commit 527a3ac4b9b815a2b4d6b63c3ad92d9d5ad99a6e)