Bug #1954

Verbs init code hangs when it tries connecting to an inactive device

Added by Nitin Bhat 3 months ago. Updated 2 months ago.

Status: Merged
Priority: Normal
Assignee:
Category: Machine Layers
Target version:
Start date: 08/06/2018
Due date:
% Done: 0%


Description

This issue was encountered when Yong Qin from Mellanox tried running a Charm++ application on their cluster, which had multiple HCA cards. The verbs init code should use ibstatus to find the active devices and try to connect only to those. Currently, it tries to connect to the first device returned by ibv_devices. If that first device is inactive, the program hangs.

The fix should be to connect only to active devices. If all the devices are inactive, an appropriate error message should be displayed.
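
As a rough illustration of the proposed behavior (not the merged patch), the selection logic could look something like the sketch below. The `hca_info` struct, both function names, and the error text are hypothetical. In real verbs code the active flag would presumably be derived from `ibv_query_port()` reporting `IBV_PORT_ACTIVE` for a port, rather than being passed in as shown here.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for what the machine layer knows about each HCA.
 * Assumption: in real code, has_active_port would be filled in by querying
 * each port (e.g. ibv_query_port() and checking for IBV_PORT_ACTIVE). */
struct hca_info {
    const char *name;
    int has_active_port;
};

/* Return the index of the first device with an active port, or -1 if none.
 * This replaces "always take device 0", which hangs when device 0 is down. */
static int pick_active_device(const struct hca_info *devs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (devs[i].has_active_port)
            return (int)i;
    }
    return -1;
}

/* Store the chosen device name in *chosen and return 0, or print an error
 * and return -1 when every device is inactive, so the caller can abort
 * instead of hanging. */
static int select_device_or_report(const struct hca_info *devs, size_t n,
                                   const char **chosen)
{
    int idx = pick_active_device(devs, n);
    if (idx < 0) {
        fprintf(stderr,
                "verbs init: no active InfiniBand device found "
                "(%zu device(s) checked)\n", n);
        return -1;
    }
    *chosen = devs[idx].name;
    return 0;
}
```

With a device list such as `{"mlx5_0", inactive}, {"mlx5_1", active}` (made-up names), the sketch would skip the first entry and select `mlx5_1`. With only inactive entries it reports the error and returns -1 rather than attempting a connection.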

History

#1 Updated by Evan Ramos 2 months ago

  • Assignee set to Evan Ramos

#2 Updated by Evan Ramos 2 months ago

  • Status changed from New to Implemented

#3 Updated by Evan Ramos 2 months ago

  • Target version set to 6.9.0
  • Status changed from Implemented to Merged
