
Network Canary Down

Anandkumar Patel edited this page Apr 25, 2016 · 11 revisions

Network canary investigation

  1. Look in Loggly for the "network canary failed" message (link to search). That log message lists the IP addresses that were unreachable and the dockerHost the test was run on.

  2. In mongo, get the list of containers we are supposed to ping (replace ORG_ID with the GitHub ID of the affected org):

db.instances.find({ 
  'container.inspect.State.Running': true,
  'owner.github': ORG_ID, 
}, {
  'network.hostIp': 1,
  'name': 1,
  'container.dockerHost': 1
})

Ensure that:

  • we did not ping something we are not supposed to ping
  • the dockerhost still exists
  • weave ps on the server shows an IP for that container
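The last check can be scripted. The following is a minimal sketch, not part of the original runbook: has_weave_ip is a hypothetical helper, and it assumes each line of weave ps output has the form "container-id mac ip/cidr ...". Verify that against the weave version on the host before relying on it.

```shell
# Hypothetical helper: reads `weave ps` output on stdin and succeeds if the
# given container ID has at least one address listed.
# Assumed line format: <container-id> <mac> <ip/cidr> [<ip/cidr> ...]
has_weave_ip() {
  awk -v id="$1" '
    # Match short or full IDs by prefix; require a third field (the address).
    id != "" && NF >= 3 && (index($1, id) == 1 || index(id, $1) == 1) { found = 1 }
    END { exit !found }
  '
}

# Usage on the dockerhost (CONTAINER_ID is a placeholder):
#   weave ps | has_weave_ip "$CONTAINER_ID" && echo OK || echo "no weave ip"
```

A container that appears in weave ps with a MAC but no address fails the check, the same as a container that is missing entirely.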
  3. Look at the weave logs for errors (run this on the dockerhost of the unreachable instance and on the targetDockerHost you got from Loggly):
docker logs weave 2>&1 | grep -q -m1 "no such device" && echo BAD || echo OK

If you see BAD, weave is hosed on that host.
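Because the same check has to run on two hosts, it can be wrapped in a small loop. This is only a sketch: it assumes you have SSH access to both dockerhosts, and the function names and the RUN_ON_HOST variable are hypothetical, not existing tooling.

```shell
# Reads weave logs on stdin and prints BAD if the known error appears, else OK.
weave_log_status() {
  if grep -q -m1 "no such device"; then echo BAD; else echo OK; fi
}

# Runs the log check on every host passed as an argument.
# RUN_ON_HOST is a hypothetical override for the remote runner (default: ssh).
check_weave_hosts() {
  for host in "$@"; do
    printf '%s ' "$host"
    ${RUN_ON_HOST:-ssh} "$host" 'docker logs weave 2>&1' | weave_log_status
  done
}

# Usage (host names are placeholders):
#   check_weave_hosts "$UNREACHABLE_DOCKERHOST" "$TARGET_DOCKERHOST"
```

Each output line pairs a host with BAD or OK, which makes it clear which side of the connection needs the fix.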

  4. Kill the weave container to fix it and ensure it comes back up (if it does not, check sauron):
docker kill weave
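To confirm that weave came back up, you can poll the container state instead of checking by hand. This is a rough sketch: the helper names, retry count and delay are arbitrary assumptions, and it relies only on docker inspect with a Go template on .State.Running.

```shell
# Succeeds if a container named "weave" exists and is running.
weave_running() {
  [ "$(docker inspect -f '{{.State.Running}}' weave 2>/dev/null)" = "true" ]
}

# wait_for TRIES DELAY CMD...: retries CMD until it succeeds or TRIES runs out.
wait_for() {
  _tries=$1; _delay=$2; shift 2
  _i=0
  while [ "$_i" -lt "$_tries" ]; do
    if "$@"; then return 0; fi
    _i=$((_i + 1))
    sleep "$_delay"
  done
  return 1
}

# Usage on the dockerhost:
#   docker kill weave
#   wait_for 30 2 weave_running || echo "weave did not come back, check sauron"
```

Once weave is running again, rerun the log check from the previous step to confirm the error is gone.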
