jamesljlster/hourglass
Hourglass: An Online Learning Robot Tracking System

Introduction

Traditionally, solving tracking problems with fuzzy or PID control systems requires a lot of time spent on parameter tuning. To reduce the time and effort spent on control parameter optimization, we built an online learning system based on LSTM (Long Short-Term Memory) and RL (Reinforcement Learning).

Project Dependencies

Open source:

Personal projects (used as git submodules):

  • args: Summarizing program arguments.
  • CSV_DataProc: Reading CSV format files.
  • laneft: Control feature processing.
  • lstm: Long Short-Term Memory (LSTM) library.
  • ModConfig: Reading configuration files.
  • SimplePID: A very simple PID controller.
  • tcpmgr: Managing TCP connections.
  • Wheel: Robot wheel control module.
  • Wheel_Server: Robot wheel control server.

Feature extraction

To keep the robot tracking the lane, we use a camera to capture a lane image, then apply the following processing to obtain a control offset:

  • Canny edge detection.
  • Recursive neighbor search for line generation.
  • Control offset calculation.
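The last two steps can be sketched as follows. This is a minimal pure-Python illustration, assuming the lane appears as a connected run of edge pixels; `group_line` and `control_offset` are illustrative names, not the actual laneft API.

```python
def group_line(edge_points, start, visited=None, radius=1):
    """Recursively collect edge points neighboring each other (line generation)."""
    if visited is None:
        visited = set()
    visited.add(start)
    line = [start]
    for p in edge_points:
        if (p not in visited
                and abs(p[0] - start[0]) <= radius
                and abs(p[1] - start[1]) <= radius):
            line.extend(group_line(edge_points, p, visited, radius))
    return line

def control_offset(line, image_width):
    """Signed, normalized horizontal distance from line center to image center."""
    xs = [x for x, _ in line]
    lane_center = sum(xs) / len(xs)
    return (lane_center - image_width / 2) / (image_width / 2)
```

A positive offset means the detected lane lies to the right of the image center, so the controller should steer right to re-center it.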

Initial state building

The initial LSTM control model is trained on a dataset collected with a simple PID controller (slow but stable).
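For reference, the data-collecting controller can be sketched as a textbook PID loop. The gains below are placeholders, not the tuned values used by the SimplePID submodule.

```python
class SimplePIDSketch:
    """Minimal PID controller sketch for collecting the initial dataset."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, error, dt=1.0):
        """Return a control output from the current offset error."""
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

Each `(offset, control output)` pair produced while the PID controller drives the robot becomes one training instance for the initial LSTM model.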

Video demo:

Reinforcement learning

In the reinforcement learning task, the robot keeps running on the lane to collect new control data and renew it through the RL rule. The renewed data are then sent to the training server. After 2500 instances (a configurable argument) have been sent, the robot asks the training server for a new LSTM control model.
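The loop above can be sketched as follows. `fetch_offset`, `drive`, `rl_update`, `send_instance`, and `request_model` stand in for the robot's real TCP-backed routines; they are assumptions for illustration, not the project's API.

```python
UPDATE_THRESHOLD = 2500  # configurable in the real system

def online_learning_loop(controller, io, steps):
    """Drive, stream renewed instances to the trainer, refresh the model."""
    sent = 0
    for _ in range(steps):
        offset = io.fetch_offset()           # control offset from feature extraction
        action = controller.predict(offset)  # current LSTM control model
        reward = io.drive(action)            # apply control, observe outcome
        instance = io.rl_update(offset, action, reward)  # renew data via RL rule
        io.send_instance(instance)           # stream instance to the training server
        sent += 1
        if sent >= UPDATE_THRESHOLD:
            controller = io.request_model()  # pull a freshly trained LSTM model
            sent = 0
    return controller
```

The robot never stops to train: learning happens on the server while the robot keeps driving, and only the finished model crosses the network back.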

Video demo:

Final state

After reinforcement learning, the improvement in control performance is noticeable.

Video demo:

Video demo (backup):

System Architecture

The system can be separated into three individual parts communicating over TCP connections. The online learning feature is built from two of these parts: the robot and the LSTM trainer.
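One simple way the parts could frame their TCP traffic is length-prefixed messages. The actual tcpmgr wire format is not documented here, so the sketch below is an assumption, not the project's protocol.

```python
import struct

def pack_message(payload: bytes) -> bytes:
    """Prefix a payload with its 4-byte big-endian length."""
    return struct.pack(">I", len(payload)) + payload

def unpack_messages(stream: bytes):
    """Split a received byte stream back into framed payloads."""
    messages, pos = [], 0
    while pos + 4 <= len(stream):
        (length,) = struct.unpack_from(">I", stream, pos)
        pos += 4
        messages.append(stream[pos:pos + length])
        pos += length
    return messages
```

Length prefixes let the receiver recover message boundaries even when TCP delivers several instances in one read, which matters when the robot streams thousands of training instances to the server.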

LSTM trainer architecture:

Robot architecture:

Monitor architecture:

Experimental Result

The experiment is run with the rule below:

After 51 control model updates, the average running speed increased to 60% and the control offset saturated around 0.2. The average control speed logs of the initial and final models are shown on the right side of the figure.
