fabtests: New fabtest fi_flood to test over subscription of resources #10427
base: main
nikhilnanal
commented
Sep 30, 2024
8e087e8 to e21b1f5
@@ -41,6 +41,7 @@ bin_PROGRAMS = \
	functional/fi_rdm_stress \
	functional/fi_multi_recv \
	functional/fi_bw \
	functional/fi_flood \
This test is just adding a new mode to the bw test - I would just replace/rename the bw test and add the new testing mode inside. No need to create a whole new test
FWIW, AWS CI has a flood_peer test that reuses fi_bw: https://github.com/ofiwg/libfabric/blob/main/fabtests/pytest/efa/test_flood_peer.py#L6
fabtests/common/shared.c
Outdated
@@ -3270,6 +3270,7 @@ void show_perf(char *name, size_t tsize, int iters, struct timespec *start,
	printf("%8.2fs%10.2f%11.2f%11.2f\n",
		elapsed / 1000000.0, bytes / (1.0 * elapsed),
		usec_per_xfer, 1.0/usec_per_xfer);
	printf("-----------------------------------------\n");
Remove random prints throughout this PR (there are a handful)
fabtests/functional/flood.c
Outdated
@@ -0,0 +1,319 @@
/*
 * Copyright (c) 2019 Intel Corporation. All rights reserved.
Remove year
fabtests/functional/flood.c
Outdated
return ret;

if (opts.machr)
	show_perf_mr(opts.transfer_size, opts.window_size, &start, &end, 1,
Remove the performance reporting since this is a functional test and has a hardcoded sleep to force unexpected messages. Replace with a PASS/FAIL print
fabtests/functional/flood.c
Outdated
if (ret)
	return ret;

ret = ft_tx(ep, remote_fi_addr, 1, &tx_ctx);
See the recently added option that does this: FT_OPT_NO_PRE_POSTED_RX
e21b1f5 to 471aba1
fabtests/functional/flood.c
Outdated
static void mr_close(struct ft_context *ctx_arr, int window_size)
{
	for (int i = 0; i < window_size; i++)
	{
drop brackets
fabtests/functional/flood.c
Outdated
	return ret;
}
static void mr_close(struct ft_context *ctx_arr, int window_size)
Rename to something that describes what's happening a bit more - this makes it sound like it's closing a single MR
fabtests/functional/flood.c
Outdated
}
static void mr_close(struct ft_context *ctx_arr, int window_size)
{
	for (int i = 0; i < window_size; i++)
Declare variables outside of for loop
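The three style comments on this helper (no braces around a single-statement loop body, a name that says every MR in the window is closed, and the loop counter declared at the top of the function) could be combined roughly as follows. `struct demo_ctx`, `demo_close_mr` and `close_window_mrs` are hypothetical stand-ins so that the sketch compiles on its own; the real code would presumably keep `struct ft_context` and `FT_CLOSE_FID`.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct ft_context: only an MR slot is modeled. */
struct demo_ctx {
	int *mr;	/* non-NULL while the "registration" is open */
};

/* Hypothetical stand-in for FT_CLOSE_FID(): clear the slot. */
void demo_close_mr(struct demo_ctx *ctx)
{
	ctx->mr = NULL;
}

/* Close the MR of every context in the window (illustrative name). */
void close_window_mrs(struct demo_ctx *ctx_arr, int window_size)
{
	int i;	/* declared at the top of the function, not in the for statement */

	for (i = 0; i < window_size; i++)
		demo_close_mr(&ctx_arr[i]);	/* single statement, so no braces */
}
```

Calling `close_window_mrs(arr, n)` on an array of `n` contexts leaves every `mr` field set to NULL.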
fabtests/functional/flood.c
Outdated
mr_close(tx_ctx_arr, opts.window_size);
mr_close(rx_ctx_arr, opts.window_size);

printf("sequential memory registration:\n");
Make your test prints consistent - capitalize first word, remove new line, and then print pass or fail in your out
printf("%s\n", ret ? "FAIL" : "PASS");
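A minimal sketch of that print style, assuming two hypothetical helpers (`result_str` and `report_result` are illustrative names, not fabtests functions). The test name is capitalized and carries no newline of its own, and the PASS or FAIL result follows on the same line:

```c
#include <stdio.h>

/* Hypothetical helper: map a return code to the result string. */
const char *result_str(int ret)
{
	return ret ? "FAIL" : "PASS";
}

/* Hypothetical helper: print "<Test name>: PASS" or "<Test name>: FAIL"
 * on a single line and hand the return code back to the caller.
 */
int report_result(const char *test_name, int ret)
{
	printf("%s: %s\n", test_name, result_str(ret));
	return ret;
}
```

For example, `report_result("Sequential memory registration", ret)` prints one line per test mode and passes `ret` through unchanged, so the caller can still return it.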
fabtests/functional/flood.c
Outdated
printf("sequential memory registration:\n");
ft_start();
if (opts.dst_addr) {
	for (int i = 0; i < opts.window_size; i++) {
Declare outside
fabtests/functional/flood.c
Outdated
if (ret)
	return ret;

ft_post_tx_buf(ep, remote_fi_addr, opts.transfer_size,
Does this return something?
always returns 0
ft_post_tx_buf calls the macro FT_POST which can return an error
https://github.com/ofiwg/libfabric/blob/main/fabtests/common/shared.c#L2172
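A sketch of the error check being asked for, under the assumption that the post call returns 0 on success and a nonzero value on failure. `demo_post_tx` is a hypothetical stand-in for `ft_post_tx_buf` (its `fail_at` parameter exists only to simulate a failure, and the value -1 is arbitrary), and `post_window` is an illustrative name:

```c
/* Hypothetical stand-in for ft_post_tx_buf(): fails at one chosen index. */
int demo_post_tx(int index, int fail_at)
{
	return index == fail_at ? -1 : 0;
}

/* Post window_size sends and stop at the first error instead of
 * ignoring the return value. Pass fail_at = -1 to simulate no failure.
 */
int post_window(int window_size, int fail_at)
{
	int i, ret;

	for (i = 0; i < window_size; i++) {
		ret = demo_post_tx(i, fail_at);
		if (ret)
			return ret;
	}

	return 0;
}
```

The same pattern would apply to the receive side: capture the return value of the post call and propagate it rather than discarding it.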
fabtests/functional/flood.c
Outdated
if (!opts.dst_addr)
	sleep(sleep_time);

ft_start();
Drop performance print and also timers
Timeout failure fi_eq_sread(): common/shared.c:1169, ret=-4 (Interrupted system call)
cb5cb65 to 23587c8
bot:aws:retest
@@ -652,6 +657,7 @@ dummy_man_pages = \
	man/man1/fi_getinfo_test.1 \
	man/man1/fi_mr_test.1 \
	man/man1/fi_bw.1 \
	man/man1/fi_flood.1 \
In the runfabtests/exclude changes, it looks like you intend to replace bw with flood (which I agree with) but here you're adding a new test instead of renaming/adding to bw
fabtests/functional/flood.c
Outdated
#include <shared.h>

int sleep_time = 0;
Declare as static
@@ -362,6 +363,10 @@ functional_fi_bw_SOURCES = \
	functional/bw.c
functional_fi_bw_LDADD = libfabtests.la

functional_fi_flood_SOURCES = \
Also update the Windows build files: fabtests.vcxproj and fabtests.vcxproj.filters
fabtests/functional/flood.c
Outdated
		FT_CLOSE_FID(tx_ctx_arr[i].mr);
	}
}
else {
} else {
{
	int ret, i;

	/* Receive side delay is used in order to let the sender
Format block comments like so:
/* This is a block comment
 * that spans multiple lines
 */
fabtests/functional/flood.c
Outdated
opts.options |= FT_OPT_ALLOC_MULT_MR;
opts.options |= FT_OPT_NO_PRE_POSTED_RX;
Move these to the top of the function to align with other fabtests
fi_bw -e msg

# fi_bw fails by hanging
# fi_flood fails by runfabtest timeout only on the CI.
Extra spaces here. I'm also tempted to keep the two excludes separate, since they exist for different reasons and are better documented separately
fabtests/functional/flood.c
Outdated
if (ret)
	goto err;

ft_post_rx_buf(ep, opts.transfer_size,
Check for returned error
1. MR cache-based registration tests register and send in batch and sequential modes while flooding the cache beyond its maximum size.
2. Test receipt of unexpected messages by overwhelming the receiver.

Signed-off-by: nikhil nanal <nikhil.nanal@intel.com>
b39a5c7 to ac31c17