Feature #1237

Feature #1234: Avoid sender-side copy for large contiguous messages. API for charm and converse layers

One-sided sender-side implementation for GNI layer

Added by Vipul Harsh almost 3 years ago. Updated about 2 years ago.

Status:
Merged
Priority:
High
Assignee:
Category:
-
Target version:
Start date:
09/28/2016
Due date:
% Done:

0%

History

#1 Updated by Sam White almost 3 years ago

  • Status changed from New to In Progress

#2 Updated by Phil Miller over 2 years ago

  • Target version set to 6.8.0-beta1
  • Status changed from In Progress to Implemented

https://charm.cs.illinois.edu/gerrit/1908
Works for sizes that aren't multiples of 4, but not for base addresses that are not aligned to multiples of 4.

There's a design for lifting that limitation (using GNI PUT, which isn't constrained like GNI GET), but it can come later.

#3 Updated by Phil Miller over 2 years ago

  • Target version changed from 6.8.0-beta1 to 6.8.0

Since there are still several things to fix up here, and we want to get the beta out the door, we're deferring this to the main release.

#4 Updated by Phil Miller over 2 years ago

  • Status changed from Implemented to In Progress

#5 Updated by Sam White about 2 years ago

  • Priority changed from Normal to High
  • Tags set to #rdma, #machine-layers

It looks like Nitin has been doing work on this too, so we might need to coordinate.

#6 Updated by Nitin Bhat about 2 years ago

  • Assignee changed from Justin Miron to Nitin Bhat

Implementing PUT calls from sender to receiver for non-aligned memory addresses, since RDMA GET expects 4-byte-aligned addresses and length.
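
The GET-versus-PUT decision described above could be sketched roughly as follows. This is only an illustration: `choose_transfer`, `is_aligned4`, and the `xfer_kind` enum are hypothetical names, not identifiers from the GNI machine layer, and the sketch assumes that any one unaligned value (source address, destination address, or length) is enough to fall back to PUT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration; not taken from the GNI machine layer. */
#define RDMA_GET_ALIGN 4u

typedef enum { XFER_GET, XFER_PUT } xfer_kind;

/* Returns nonzero when v is a multiple of RDMA_GET_ALIGN (a power of two). */
static int is_aligned4(uintptr_t v) {
    return (v & (RDMA_GET_ALIGN - 1u)) == 0;
}

/* Assumption: GET is usable only when both addresses and the length are
 * 4-byte aligned; anything else falls back to a sender-initiated PUT. */
static xfer_kind choose_transfer(const void *src, const void *dst, size_t len) {
    assert(src != NULL && dst != NULL);
    if (is_aligned4((uintptr_t)src) &&
        is_aligned4((uintptr_t)dst) &&
        is_aligned4((uintptr_t)len)) {
        return XFER_GET;
    }
    return XFER_PUT;
}
```

With 4-byte-aligned buffers and a length of 8, this sketch selects `XFER_GET`; offsetting either pointer by one byte, or using a length of 7, selects `XFER_PUT`.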

#7 Updated by Nitin Bhat about 2 years ago

  • Status changed from In Progress to Implemented

Feature: https://charm.cs.illinois.edu/gerrit/#/c/1908/
- Buffered short messages when the sender runs out of "credits". The buffered messages get sent periodically
- Implemented PUT for unaligned cases (when source address, destination address and the length are not 4-byte aligned)
- Deregistered memory on completion of RDMA requests
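
As a rough illustration of the credit-based buffering in the first item, the sketch below queues short messages once credits are exhausted and drains them from a periodic flush call. All names (`credit_sender`, `sender_submit`, `sender_flush`, `PENDING_CAP`) are hypothetical, the actual send is replaced by a counter, and FIFO ordering and the way credits are returned are assumptions rather than details taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define PENDING_CAP 8  /* hypothetical capacity of the pending queue */

typedef struct {
    int credits;               /* sends currently allowed */
    int pending[PENDING_CAP];  /* buffered message ids (ring buffer) */
    size_t head, count;        /* ring-buffer read position and fill level */
    int sent;                  /* messages handed to the network so far */
} credit_sender;

static void sender_init(credit_sender *s, int credits) {
    assert(s != NULL && credits >= 0);
    s->credits = credits;
    s->head = 0;
    s->count = 0;
    s->sent = 0;
}

/* Stand-in for the real short-message send: consumes one credit. */
static void do_send(credit_sender *s, int msg_id) {
    (void)msg_id;
    s->credits--;
    s->sent++;
}

/* Returns 1 if sent immediately, 0 if buffered, -1 if the buffer is full. */
static int sender_submit(credit_sender *s, int msg_id) {
    /* Send directly only when nothing is queued, so ordering is preserved. */
    if (s->credits > 0 && s->count == 0) {
        do_send(s, msg_id);
        return 1;
    }
    if (s->count == PENDING_CAP) return -1;
    s->pending[(s->head + s->count) % PENDING_CAP] = msg_id;
    s->count++;
    return 0;
}

/* Assumed to be called when the receiver hands credits back. */
static void sender_add_credits(credit_sender *s, int n) {
    s->credits += n;
}

/* Called periodically: drains buffered messages in FIFO order while
 * credits remain. Returns the number of messages flushed. */
static int sender_flush(credit_sender *s) {
    int flushed = 0;
    while (s->count > 0 && s->credits > 0) {
        do_send(s, s->pending[s->head]);
        s->head = (s->head + 1) % PENDING_CAP;
        s->count--;
        flushed++;
    }
    return flushed;
}
```

A caller would invoke `sender_submit` for each short message and call `sender_flush` from whatever periodic progress loop the layer already runs.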

To do:
- Improve performance by optimizing registration (probably in some other patch as it is relevant for verbs as well)

#8 Updated by Phil Miller about 2 years ago

  • Status changed from Implemented to Merged

#9 Updated by Phil Miller about 2 years ago

  • Status changed from Merged to In Progress

Looks like there's a hang in tests/util since merging this. Hopefully it's something simple to fix, since the higher-level code was tested to be working. Or, more hopefully, it was a spurious failure.

#10 Updated by Phil Miller about 2 years ago

  • Status changed from In Progress to Merged

The hang did not reproduce in extended testing. Re-closing.

#11 Updated by Sam White about 2 years ago

Related patches merged:

Allocate machine specific info along with the charm message: https://charm.cs.illinois.edu/gerrit/#/c/2758/

Store ack message data before releasing it: https://charm.cs.illinois.edu/gerrit/#/c/2743/

Free smsgs used in RDMA for gni-crayxe builds: https://charm.cs.illinois.edu/gerrit/#/c/2768/
