Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Populate error_msg for all action result messages (#4341) #4460

Open
wants to merge 44 commits into
base: main
Choose a base branch
from

Conversation

aosmw
Copy link
Contributor

@aosmw aosmw commented Jun 22, 2024


Basic Info

Info Please fill out this column
Ticket(s) this addresses (#4341)
Primary OS tested on Ubuntu
Robotic platform tested on none
Does this PR contain AI generated software? No

Description of contribution in a few bullet points

First pass at populating error_msg for all action result messages.

Description of documentation updates required from your changes

Adds some new error codes to FollowWaypoints + FollowGPSWaypoints

  • NO_WAYPOINTS_GIVEN=602
  • STOP_ON_MISSED_WAYPOINT=603

Future work that may be required in bullet points

For Maintainers:

  • Check that any new parameters added are updated in docs.nav2.org
  • Check that any significant change is added to the migration guide
  • Check that any new features OR changes to existing behaviors are reflected in the tuning guide
  • Check that any new functions have Doxygen added
  • Check that any new features have test coverage
  • Check that any new plugins is added to the plugins page
  • If BT Node, Additionally: add to BT's XML index of nodes for groot, BT package's readme table, and BT library lists

@aosmw aosmw marked this pull request as draft June 22, 2024 12:11
@aosmw aosmw force-pushed the feature/mw/4341_error_msg_2 branch 2 times, most recently from d3be8c2 to da5e15a Compare June 22, 2024 12:31
@SteveMacenski
Copy link
Member

Let me know if you have any questions - but seems like we're covering that in the issue thread instead

@aosmw aosmw force-pushed the feature/mw/4341_error_msg_2 branch 2 times, most recently from 756982f to cdccbdd Compare June 26, 2024 07:39
Copy link

codecov bot commented Jun 26, 2024

Codecov Report

Attention: Patch coverage is 64.64646% with 105 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...avigator/src/navigators/navigate_through_poses.cpp 30.76% 18 Missing ⚠️
...2_bt_navigator/src/navigators/navigate_to_pose.cpp 29.16% 17 Missing ⚠️
nav2_controller/src/controller_server.cpp 35.00% 13 Missing ⚠️
nav2_waypoint_follower/src/waypoint_follower.cpp 40.90% 13 Missing ⚠️
nav2_behaviors/plugins/spin.cpp 40.00% 9 Missing ⚠️
...clude/nav2_behavior_tree/bt_action_server_impl.hpp 75.00% 7 Missing ⚠️
nav2_behaviors/plugins/assisted_teleop.cpp 33.33% 6 Missing ⚠️
...nclude/nav2_behaviors/plugins/drive_on_heading.hpp 66.66% 3 Missing ⚠️
nav2_behaviors/plugins/back_up.cpp 57.14% 3 Missing ⚠️
...e/plugins/action/navigate_through_poses_action.cpp 33.33% 2 Missing ⚠️
... and 10 more
Files with missing lines Coverage Δ
...ee/include/nav2_behavior_tree/bt_action_server.hpp 83.33% <ø> (ø)
...ior_tree/plugins/action/assisted_teleop_action.hpp 100.00% <100.00%> (ø)
...v2_behavior_tree/plugins/action/back_up_action.hpp 100.00% <100.00%> (ø)
...ugins/action/compute_path_through_poses_action.hpp 100.00% <100.00%> (ø)
...ree/plugins/action/compute_path_to_pose_action.hpp 100.00% <100.00%> (ø)
...or_tree/plugins/action/drive_on_heading_action.hpp 100.00% <100.00%> (ø)
...ehavior_tree/plugins/action/follow_path_action.hpp 100.00% <100.00%> (ø)
...e/plugins/action/navigate_through_poses_action.hpp 100.00% <100.00%> (ø)
...or_tree/plugins/action/navigate_to_pose_action.hpp 100.00% <100.00%> (ø)
...ehavior_tree/plugins/action/smooth_path_action.hpp 100.00% <100.00%> (ø)
... and 32 more

... and 2 files with indirect coverage changes

@aosmw aosmw force-pushed the feature/mw/4341_error_msg_2 branch 2 times, most recently from bd49b55 to 54f0402 Compare June 27, 2024 04:51
Copy link
Member

@SteveMacenski SteveMacenski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay here - it is still draft so it was lower on my priority list while getting Jazzy out. It seems generally complete, but there seems to be a couple of things missing. When might this be ready for a full review?

See below for an initial review of things that jumped out

@aosmw
Copy link
Contributor Author

aosmw commented Jul 10, 2024

I can rip through these fixups today.
Its still on a feature branch for me, ie not yet tested in the wild, hence still in draft.

@aosmw aosmw force-pushed the feature/mw/4341_error_msg_2 branch 2 times, most recently from 2fdbaea to 19c5b10 Compare July 10, 2024 06:36
Copy link
Contributor

mergify bot commented Jul 10, 2024

@aosmw, your PR has failed to build. Please check CI outputs and resolve issues.
You may need to rebase or pull in main due to API changes (or your contribution genuinely fails).

@SteveMacenski
Copy link
Member

Builds seem to fail from your additions of the error messages:


In file included from /opt/overlay_ws/src/navigation2/nav2_docking/opennav_docking_bt/src/undock_robot.cpp:18:
/opt/overlay_ws/src/navigation2/nav2_docking/opennav_docking_bt/include/opennav_docking_bt/undock_robot.hpp: In static member function ‘static BT::PortsList opennav_docking_bt::UndockRobotAction::providedPorts()’:
/opt/overlay_ws/src/navigation2/nav2_docking/opennav_docking_bt/include/opennav_docking_bt/undock_robot.hpp:93:24: error: ‘error_msg’ was not declared in this scope; did you mean ‘error_t’?
   93 |         BT::OutputPort<error_msg>(
      |                        ^~~~~~~~~
      |                        error_t
/opt/overlay_ws/src/navigation2/nav2_docking/opennav_docking_bt/include/opennav_docking_bt/undock_robot.hpp:93:34: error: no matching function for call to ‘OutputPort<<expression error> >(const char [10], const char [14])’
   93 |         BT::OutputPort<error_msg>(
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~^
   94 |           "error_msg", "Error message"),
      |           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /opt/ros/rolling/include/behaviortree_cpp/tree_node.h:21,
                 from /opt/ros/rolling/include/behaviortree_cpp/leaf_node.h:17,
                 from /opt/ros/rolling/include/behaviortree_cpp/action_node.h:22,
                 from /opt/overlay_ws/install/nav2_behavior_tree/include/nav2_behavior_tree/bt_action_node.hpp:22,
                 from /opt/overlay_ws/src/navigation2/nav2_docking/opennav_docking_bt/include/opennav_docking_bt/undock_robot.hpp:24:
/opt/ros/rolling/include/behaviortree_cpp/basic_types.h:482:1: note: candidate: ‘template<class T> std::pair<std::__cxx11::basic_string<char>, BT::PortInfo> BT::OutputPort(StringView, StringView)’
  482 | OutputPort(StringView name, StringView description = {})
      | ^~~~~~~~~~
/opt/ros/rolling/include/behaviortree_cpp/basic_types.h:482:1: note:   template argument deduction/substitution failed:
/opt/overlay_ws/src/navigation2/nav2_docking/opennav_docking_bt/include/opennav_docking_bt/undock_robot.hpp:93:34: error: template argument 1 is invalid
   93 |         BT::OutputPort<error_msg>(
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~^
   94 |           "error_msg", "Error message"),
      |           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/ros/rolling/include/behaviortree_cpp/basic_types.h:569:55: note: candidate: ‘template<class T> std::pair<std::__cxx11::basic_string<char>, BT::PortInfo> BT::OutputPort(StringView, StringView, StringView)’
  569 | [[nodiscard]] inline std::pair<std::string, PortInfo> OutputPort(StringView name,
      |                                                       ^~~~~~~~~~
/opt/ros/rolling/include/behaviortree_cpp/basic_types.h:569:55: note:   candidate expects 3 arguments, 2 provided
/opt/overlay_ws/src/navigation2/nav2_docking/opennav_docking_bt/include/opennav_docking_bt/undock_robot.hpp:82:30: error: cannot convert ‘<brace-enclosed initializer list>’ to ‘BT::PortsList’ {aka ‘std::unordered_map<std::__cxx11::basic_string<char>, BT::PortInfo>’}
   82 |     return providedBasicPorts(
      |            ~~~~~~~~~~~~~~~~~~^
   83 |       {
      |       ~                       
   84 |         BT::InputPort<std::string>(
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~
   85 |           "dock_type", "The dock plugin type, if not previous instance used for docking"),
      |           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   86 |         BT::InputPort<float>(
      |         ~~~~~~~~~~~~~~~~~~~~~ 
   87 |           "max_undocking_time", 30.0, "Maximum time to get back to the staging pose"),
      |           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   88 | 
      |                               
   89 |         BT::OutputPort<ActionResult::_success_type>(
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   90 |           "success", "If the action was successful"),
      |           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   91 |         BT::OutputPort<ActionResult::_error_code_type>(
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   92 |           "error_code_id", "Error code"),
      |           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   93 |         BT::OutputPort<error_msg>(
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~
   94 |           "error_msg", "Error message"),
      |           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   95 |       });
      |       ~~                      
/opt/overlay_ws/install/nav2_behavior_tree/include/nav2_behavior_tree/bt_action_node.hpp:118:57: note:   initializing argument 1 of ‘static BT::PortsList nav2_behavior_tree::BtActionNode<ActionT>::providedBasicPorts(BT::PortsList) [with ActionT = nav2_msgs::action::UndockRobot; BT::PortsList = std::unordered_map<std::__cxx11::basic_string<char>, BT::PortInfo>]’
  118 |   static BT::PortsList providedBasicPorts(BT::PortsList addition)
      |                                           ~~~~~~~~~~~~~~^~~~~~~~
gmake[2]: *** [CMakeFiles/opennav_undock_action_bt_node.dir/build.make:76: CMakeFiles/opennav_undock_action_bt_node.dir/src/undock_robot.cpp.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:202: CMakeFiles/opennav_undock_action_bt_node.dir/all] Error 2
gmake: *** [Makefile:146: all] Error 2

Its still on a feature branch for me, ie not yet tested in the wild, hence still in draft.

Sounds good! In that case, I can give it a full review, I got through about ~half of it yesterday and I can do the other half right now.

@aosmw aosmw force-pushed the feature/mw/4341_error_msg_2 branch 2 times, most recently from 5ace62c to 135bbb5 Compare July 12, 2024 00:04
@aosmw aosmw force-pushed the feature/mw/4341_error_msg_2 branch 2 times, most recently from 5a1e7aa to 0461cfc Compare August 2, 2024 05:04
@aosmw
Copy link
Contributor Author

aosmw commented Aug 2, 2024

Status Report: I am still dog fooding this in my own backport to humble. I am successfully propagating useful error messages to our navigator action clients.

Tip for any zenoh users (zenoh-bridge-ros2dds 0.11.0-stable), default actions timeout after 5 seconds and you won't get a result, here is a snippet of zenoh config that may help.

        queries_timeout: {                                                                                                                                                                                                                                                                                               
          transient_local_subscribers: 3.0,                                                                                                                                                                                                                                                                              
          services: [                                                                                                                                                                                                                                                                                                    
            ".*=3.0",                                                                                                                                                                                                                                                                                                    
          ],                                                                                                                                                                                                                                                                                                             
          actions: {                                                                                                                                                                                                                                                                                                     
            send_goal: 1.0,                                                                                                                                                                                                                                                                                              
            cancel_goal: 1.0,                                                                                                                                                                                                                                                                                            
            get_result: [                                                                                                                                                                                                                                                                                                
              ".*/navigate_to_pose=3600",                                                                                                                                                                                                                                                                                
              ".*/navigate_through_poses=3600",                                                                                                                                                                                                                                                                          
              ".*/navigate_complete_coverage=3600",                                                                                                                                                                                                                                                                      
              ".*/follow_waypoints=3600",                                                                                                                                                                                                                                                                                
              ".*/follow_path=3600",                                                                                                                                                                                                                                                                                     
              ".*/compute_path_to_pose=30",                                                                                                                                                                                                                                                                              
              ".*/compute_path_through_poses=30",                                                                                                                                                                                                                                                                        
              ".*/compute_coverage_path=30",                                                                                                                                                                                                                                                                             
              ".*/smooth_path=15",                                                                                                                                                                                                                                                                                       
              ".*/backup=15",                                                                                                                                                                                                                                                                                            
              ".*/assisted_teleop=3600",                                                                                                                                                                                                                                                                                 
            ]                                                                                                                                                                                                                                                                                                            
          }  

I still need to hunt down the exception handling triggering [bt_navigator]: Action server failed while executing action callback: "basic_string: construction from null is not valid" showing up in test_waypoint_follower

    [tester_node-18] [INFO] [1722577025.528186847] [nav2_waypoint_tester]: Sending goal request...
    [waypoint_follower-13] [INFO] [1722577025.528650856] [waypoint_follower]: Received follow waypoint request with 1 waypoints.
    [bt_navigator-12] [INFO] [1722577025.529077965] [bt_navigator]: Begin navigating from current location (1.37, -0.48) to (100.00, 100.00)
    [planner_server-10] [WARN] [1722577025.529770042] [planner_server]: GridBasedplugin failed to plan from (1.37, -0.48) to (100.00, 100.00): "Goal Coordinates of(100.000000, 100.000000) was outside bounds"
    [planner_server-10] [WARN] [1722577025.529823778] [planner_server]: [compute_path_to_pose] [ActionServer] Aborting handle.
    [tester_node-18] [INFO] [1722577025.533000946] [nav2_waypoint_tester]: Goal accepted
    [tester_node-18] [INFO] [1722577025.533496865] [nav2_waypoint_tester]: Waiting for 'follow_waypoints' action to complete
    [tester_node-18] [INFO] [1722577025.536710890] [nav2_waypoint_tester]: Received amcl_pose
    [bt_navigator-12] [ERROR] [1722577025.559278904] [bt_navigator]: Goal failed
    [bt_navigator-12] [WARN] [1722577025.559374968] [bt_navigator]: [navigate_to_pose] [ActionServer] Aborting handle.
    [bt_navigator-12] [ERROR] [1722577025.559574641] [bt_navigator]: Action server failed while executing action callback: "basic_string: construction from null is not valid"
    [waypoint_follower-13] [INFO] [1722577025.978807964] [waypoint_follower]: Failed to process waypoint 0, moving to next.
    [waypoint_follower-13] [INFO] [1722577025.978852044] [waypoint_follower]: Completed all 1 waypoints requested.
    [tester_node-18] [INFO] [1722577025.981444559] [nav2_waypoint_tester]: Goal failed to process all waypoints, missed 1 wps.
    [tester_node-18] Traceback (most recent call last):
    [tester_node-18]   File "/opt/overlay_ws/src/navigation2/nav2_system_tests/src/waypoint_follower/tester.py", line 278, in <module>
    [tester_node-18]     main()
    [tester_node-18]   File "/opt/overlay_ws/src/navigation2/nav2_system_tests/src/waypoint_follower/tester.py", line 241, in main
    [tester_node-18]     test.action_result.missed_waypoints[0].error_code
    ```

@SteveMacenski
Copy link
Member

"basic_string: construction from null is not valid"

That's a classic decoding issue in the BT. I'm guessing you have the wrong type set for the input/output port. All the blackboard and port objects gets stringified to store, so its having a problem with that process. I'd look into the types and make sure that (A) they're types that have conversion functions implemented (or basic types) and (B) you're both setting the port correctly and setting its value to the right type in the BT node. I think we ran into an issue with a couple being wrong, so maybe you just missed one more.

Exciting update! Let me know how the dogfooding goes!

@aosmw aosmw force-pushed the feature/mw/4341_error_msg_2 branch from 0461cfc to bcb0638 Compare August 10, 2024 11:33
@aosmw
Copy link
Contributor Author

aosmw commented Aug 10, 2024

Finally found the "basic_string: construction from null is not valid" - details of how in the commit message bcb0638

@SteveMacenski
Copy link
Member

SteveMacenski commented Aug 15, 2024

Woof, I see that error, that's an annoying one to find. I'm glad you got that unblocked - copy paste errors can be the worst!

Let me know if I'm blocking you :-)

@aosmw
Copy link
Contributor Author

aosmw commented Aug 20, 2024

You are not blocking me. I am STILL hunting down what I think is a cancellation that gets reported by the navigator action server as an abort. The result error_code = 0 and error_msg is empty.

…ion#4341)

This was tough to spot.

The nav2_system_tests waypoint follower provided a reproducable test

Found out how to run just the waypoint_follower test
colcon test --packages-select nav2_system_tests --ctest-args -N --event-handler console_direct+ | grep waypoint_follower

changed src/navigation2/nav2_bringup/launch/navigation_launch.py
to start nav2_bt_navigator Node with
+                prefix=['gnome-terminal -- gdb -ex run --args'],
+                respawn=False,

Trying to use gdb to catch where the exception was thrown is
too slow and the test gives up on itself.

I found the relevant throw std::logic_error on
my system by writing tiny program and breaking on std::logic_error

int main() {
    const char *a = nullptr;
    try {
        std::string b(a);
    } catch (const std::logic_error &ex) {
        std::cout << "logic_error:" << ex.what() << std::endl;
    } catch (const std::exception &ex) {
        std::cout << "some other type:" << ex.what() << std::endl;
    }
}

Turns out it get thrown here on my system
/usr/include/c++/11/bits/basic_string.tcc:212

Created ~/.gdbinit
set breakpoint pending on
break /usr/include/c++/11/bits/basic_string.tcc:212

This specific breakpoint on a line works fast enough
for the integration test to throw the
'basic_string::_M_construct null not valid'
and the silly initialisation mistake to be found.

Signed-off-by: Mike Wake <[email protected]>
Use SFINAE to log get_error_details_if_availble() in ActionT::Result

Ideally all Action messages would have error_code and error_msg
fields in their Result.  BUT the tests use standard messages that
do not and will not.  i.e. test_msgs/action/Fibonacci.action

Some Action messages only have error_code and not error_msg

Signed-off-by: Mike Wake <[email protected]>
Key point is only use the internal_error_(code|msg)_
if the action result error_code is 0.

Signed-off-by: Mike Wake <[email protected]>
…on#4341)

Clear output path for ComputePathThroughPoses similar to
ComputePathToPose

Explicitly note NOT to clear error_code_id and error_msg
behavior tree output ports.  Otherwise we cannot read them
when the navigator reports errors.

Signed-off-by: Mike Wake <[email protected]>
…s-navigation#4341)

Make it consistent with NavigateThroughPoses.
This was introduced when changing logic to remove
internal exception.

Signed-off-by: Mike Wake <[email protected]>
…ion#4341)

Allow of reason for missed waypoint as error_msg.

Signed-off-by: Mike Wake <[email protected]>
Include error_msg in reason for missed waypoint

Signed-off-by: Mike Wake <[email protected]>
…#4341)

It was a red herring in search for error_msg clearing bug on abort

Signed-off-by: Mike Wake <[email protected]>
Copy link
Contributor

mergify bot commented Sep 15, 2024

This pull request is in conflict. Could you fix it @aosmw?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants