-
Notifications
You must be signed in to change notification settings - Fork 232
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[web] fix rest-test issues #593
base: main
Are you sure you want to change the base?
Conversation
@simonlingoogle @wgtdkp any comments? |
1e2ef96
to
a43127a
Compare
Codecov Report
@@ Coverage Diff @@
## main #593 +/- ##
==========================================
- Coverage 71.12% 71.11% -0.01%
==========================================
Files 78 78
Lines 5240 5235 -5
==========================================
- Hits 3727 3723 -4
+ Misses 1513 1512 -1
Continue to review full report at Codecov.
|
tests/rest/test_rest.py
Outdated
try: | ||
response = urllib.request.urlopen(urllib.request.Request(url)) | ||
body = response.read() | ||
data = json.loads(body) | ||
result[index] = data | ||
|
||
|
||
except urllib.error.HTTPError as e: | ||
print(e.code) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is raise exception not good?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, why catching the exception?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I expect to just record the error here then raise one exception in main thread(on top of all subprocess), so for each subprocess, its job is either to get data(success) or to print error log(fail).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then raise one exception in main thread
@tttttangTH I didn't follow how this is implemented.
Let me ask a question: Will the CI fail if a urllib.error.HTTPError
exception is raised in this method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, CI will fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, CI will fail.
@tttttangTH Could you explain how it works?
I don't see how it can make the CI fail, you are just printing a message...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an array result
which is used to record the response of each thread. if one thread gets HTTP error, the value in result
will still be None
, we will check this in main thread.
src/rest/resource.cpp
Outdated
@@ -518,31 +513,22 @@ void Resource::UpdateDiag(std::string aKey, std::vector<otNetworkDiagTlv> &aDiag | |||
|
|||
void Resource::Diagnostic(const Request &aRequest, Response &aResponse) const | |||
{ | |||
otbrError error = OTBR_ERROR_NONE; | |||
otThreadSetReceiveDiagnosticGetCallback(mInstance, &Resource::DiagnosticResponseHandler, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why setting callback for each Diagnostic
is necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, If factoryreset
is called, the callback may be bind to another function. We should ensure that each time diagnostic
API in resource
is called, the response message can be collected by our resource
handler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After factoryreset
, Resource::Init
will run again and sets up the callback ? Is the original code causing real issues?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resource::Init
will only be run once when we start otbr-agent, then factoryreset
will set the callback to the default(seems a handler for CLI), I am not sure whether there is another approach that set otThreadSetReceiveDiagnosticGetCallback
to another function. do you think we do this each time before diagnostics
API is called is acceptable? or is there an approach that we could detect whether factoryreset
is called?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resource::Init
will only be run once when we start otbr-agent, thenfactoryreset
will set the callback to the default(seems a handler for CLI), I am not sure whether there is another approach that setotThreadSetReceiveDiagnosticGetCallback
to another function. do you think we do this each time beforediagnostics
API is called is acceptable? or is there an approach that we could detect whetherfactoryreset
is called?
I think factoryreset
re-exec the whole process, so the process restarts as if it was just launched normally. We should not blame factoryreset
for any issue. If multiple components of ot-br-posix is using otThreadSetReceiveDiagnosticGetCallback
, maybe they should coordinate with each other.
Should we use EventEmitter
so that ot-br-posix calls otThreadSetReceiveDiagnosticGetCallback
just once and let other components subscribe to the corresponding event using EventEmitter::On
? @gjc13 @tttttangTH
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, will try to do like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
tests/rest/test_rest.py
Outdated
try: | ||
response = urllib.request.urlopen(urllib.request.Request(url)) | ||
body = response.read() | ||
data = json.loads(body) | ||
result[index] = data | ||
|
||
|
||
except urllib.error.HTTPError as e: | ||
print(e.code) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, why catching the exception?
if (error == OTBR_ERROR_NONE) | ||
{ | ||
aResponse.SetStartTime(steady_clock::now()); | ||
aResponse.SetCallback(); | ||
} | ||
else | ||
{ | ||
ErrorHandler(aResponse, HttpStatusCode::kStatusInternalServerError); | ||
} | ||
aResponse.SetStartTime(steady_clock::now()); | ||
aResponse.SetCallback(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now, what happens if we fail in this function? Simply ignore the error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found that It may fail when node is detached or no buffer for message.
So I think one solution is to define more HTTP status code according to these newly added error code, I think I could do it after #532 is merged.
Another solution is like this, we ignore all errors(more simple, but still reasonable).
For one thing, it could address the no buffer
problem - although we fail in this API call, if we have diagnostic information(received in 4s but due to another call) left in resource
, this information is still valid and could be sent as response.
For another thing, we could just view empty response as "some issues happen, we can't get anything now".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simply ignore the error?
@tttttangTH Is this true for your changes? I didn't see why we cannot handle the possible failures in this PR. It looks to me that keeping
else
{
ErrorHandler(aResponse, HttpStatusCode::kStatusInternalServerError);
}
should just work. Why did you remove this error handler?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because if we call otThreadSendDiagnosticGet
too many times within a specific period, we will get a no buffer
error for this call( for example, when we send 10 requests for diagnostics
concurrently , the no buffer
error usually occurs).
But actually I prefer not viewing it as an error for HTTP response because the reason for calling otThreadSendDiagnosticGet
each time when the server received a request for diagnostics
is to update the information we maintained(we have a hash table to record all the diagnostic information we received within 4 seconds). If we have a 'no buffer' error, it means we have call the API many times recently so information in the hash table is almost the latest.
I know the right way to deal with this problem is to make an exception for no buffer
rather than remove the error handler. I expect to do this after #532 because I think I may need to rewrite error handler of all RESTful API according to the modifications in #532 . So here I just ignore it currently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is no buffer error
not good for such situation? I think no buffer error
was exactly what happened and should be reflected by the HTTP response.
Maybe we can return 507 Insufficient Storage
for no buffer error
.
But if the test is sending too much requests and is expecting no buffer error
to happen sometimes, the test client can conclude with success even if there are some no buffer errors.
Thoughts? @tttttangTH @wgtdkp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tttttangTH You should not just remove the error handler. You can at least add a log for this error.
I know the right way to deal with this problem is to make an exception for no buffer rather than remove the error handler. I expect to do this after #532 because I think I may need to rewrite error handler of all RESTful API according to the modifications in #532 .
I don't see why we need to wait for #532. Are there any REST feature that depends on #532?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have't found any other feature depend on #532 , but it seems we won't catch no buffer
at otThreadSetReceiveDiagnosticGetCallback
before #532 is applied.
I noticed no buffer
before when when I wrote this API, at that time, no buffer
was just an Info
, and I have chosen to ignore it at that time. But It‘s thoughtless for me to just remove the error handler here.
Thanks for the fix. I merged this PR to #532 and it did pass all the checks. |
ErrorHandler(aResponse, HttpStatusCode::kStatusInternalServerError); | ||
} | ||
aResponse.SetStartTime(steady_clock::now()); | ||
aResponse.SetCallback(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't need to address it in this PR: the function name SetCallback
is bad... I would expect this function accepts a callback function as an argument but it is actually a flag indicates whether we have/enable callback. Can we rename to SetHasCallback
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, will do it in #537
Labelled as P2 since #532 depends on this. |
After #532 is applied, several issues occur.
|
@tttttangTH I think we are wrong about the principle of testing. To my understanding, you are just removing the error check in REST tests to get #532 passed. But the purpose of tests is to find out bugs. We are not sure if #532 has bugs or not, so it could be #532 that causes existing testcases to fail. So I think you should not change the tests to make PRs happy. Otherwise, why bother have those tests? Before figuring out why #532 fails the tests, we should hold this PR. |
Ok, from my side, if we put aside the |
@tttttangTH OpenThread persists data across restarts. It is expected that OpenThread will continue its function after resatrting. |
Yes. It's not an issue of otbr-rest. But we can still use this PR to enhance otbr-rest, e.g. configure test timeout. |
ok, will update this PR according to our discussion. |
@tttttangTH I think it would be better to close this PR and create a new PR with clearer purpose. |
Will do! |
This PR will be closed and will be divided into several PRs with clear purpose in which they aim to:
|
@tttttangTH We need to make sure #537 always work and mergeable. |
👌! |
c4f0ba5
to
598be52
Compare
This PR mainly fixes some issues on current tests of OTBR-REST, and makes some preparations for #532 .
ReceiveDiagnosticGetCallback
toresource
each time whendiagnostics API
is called(consistent with rest-test modifications and more logistic)