Support #1534

Print out the job output, error and log for failed BGQ autobuild jobs and relaunch the jobs because of typical failures

Added by Nitin Bhat about 2 years ago. Updated almost 2 years ago.

Status:
Closed
Priority:
High
Assignee:
Category:
-
Target version:
Start date:
04/24/2017
Due date:
% Done:
0%

Description

This was discussed in the Core meeting. Having the job output, error, and log would help determine the failure reason, since pami/pamilrts jobs are failing intermittently with no apparent cause.
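
As a rough illustration of the request, the sketch below gathers the three files for a failed job into one printable report. It is not the autobuild code: the function name `collect_job_files` and the default suffixes `.output`, `.error`, and `.cobaltlog` are assumptions made for this example, and the real scheduler may name its files differently.

```python
import os

# Assumed file suffixes for this sketch; adjust to whatever the scheduler writes.
DEFAULT_SUFFIXES = (".output", ".error", ".cobaltlog")


def collect_job_files(job_prefix, suffixes=DEFAULT_SUFFIXES):
    """Return one printable report with the contents of each job file.

    job_prefix is the path shared by the job's files, without the suffix.
    A missing file is reported rather than raising, so one absent file
    does not hide the other two.
    """
    sections = []
    for suffix in suffixes:
        path = job_prefix + suffix
        header = "===== " + path + " ====="
        if os.path.exists(path):
            # errors="replace" keeps the dump going if the log has stray bytes.
            with open(path, "r", errors="replace") as handle:
                body = handle.read()
        else:
            body = "(file not found)"
        sections.append(header + "\n" + body)
    return "\n".join(sections)
```

A build script could call `print(collect_job_files(prefix))` only when the job's exit code is nonzero, so successful runs stay quiet.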

History

#1 Updated by Nitin Bhat almost 2 years ago

  • Priority changed from Normal to High
  • Subject changed from Print out the job output, error and log for failed autobuild jobs launched during pami/pamilrts nightly builds to Print out the job output, error and log for failed BGQ autobuild jobs

#2 Updated by Nitin Bhat almost 2 years ago

  • Subject changed from Print out the job output, error and log for failed BGQ autobuild jobs to Print out the job output, error and log for failed BGQ autobuild jobs and relaunch the jobs because of typical failures

Spurious build failures occur when a job boots too many times or does not start after boot. In these cases the job is terminated, which causes the entire build to fail. Carefully identify these system failure cases and relaunch the jobs wherever applicable.
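
One possible shape for the relaunch logic is sketched below: a failed job is relaunched only when its log matches a known system failure, and only up to a fixed number of attempts. The log patterns, the function names, and the limit of three attempts are all hypothetical placeholders; the actual messages would have to be taken from real scheduler logs.

```python
import re

# Hypothetical log patterns for the two system failures described above.
# Replace them with the exact messages seen in real scheduler logs.
TRANSIENT_PATTERNS = (
    re.compile(r"boot(ed)? too many times", re.IGNORECASE),
    re.compile(r"did not start after boot", re.IGNORECASE),
)


def is_transient_failure(log_text, patterns=TRANSIENT_PATTERNS):
    """True if the log text matches any known system failure pattern."""
    return any(pattern.search(log_text) for pattern in patterns)


def run_with_relaunch(launch_job, max_attempts=3):
    """Run a job, relaunching it only after a recognized system failure.

    launch_job is a callable returning (exit_code, log_text).
    Returns (exit_code, attempts) for the last attempt made.
    """
    attempts = 0
    while True:
        attempts += 1
        exit_code, log_text = launch_job()
        if exit_code == 0:
            return exit_code, attempts
        # Genuine test failures are reported at once; only system
        # failures are retried, and only up to max_attempts.
        if attempts >= max_attempts or not is_transient_failure(log_text):
            return exit_code, attempts
```

Matching on explicit patterns keeps real test failures from being masked by retries, and the attempt limit prevents a persistently broken system from looping forever.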

#3 Updated by Nitin Bhat almost 2 years ago

  • Status changed from New to In Progress

Fix to print out the error, output and scheduler log: https://charm.cs.illinois.edu/gerrit/#/c/3083/

#4 Updated by Phil Miller almost 2 years ago

  • Target version set to 6.8.1

#5 Updated by Phil Miller almost 2 years ago

  • Status changed from In Progress to Closed
