Safety and efficiency are two crucial factors for human-robot collaboration. It is challenging to ensure human safety without sacrificing task efficiency. In this letter, we present a reinforcement learning (RL)-based method with a hazard estimator to balance these two factors. Our method has two phases. In the training phase, an RL control policy and a hazard estimator are trained; in the testing phase, a guiding goal is dynamically selected along a given task path to balance human avoidance and task execution. The proposed method is compared with three previous methods (another RL-based method, a reactive method, and a motion planner) in both simulated and real-world experiments. Results show that our method can 1) enable a robot to follow a demonstrated (reference) path if the human stays far from the robot; and 2) apply responsive online motion adaptation to balance human avoidance and task efficiency if the human moves closer to the robot. In addition, the dynamic goal selection method is easy to use, effectively increases the success rate, and provides a better trade-off between safety and efficiency.
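
To make the testing-phase idea concrete, the following is a minimal illustrative sketch of dynamic goal selection along a task path. It is not the implementation described in this letter: the selection rule (advance the goal through consecutive waypoints whose estimated hazard stays below a threshold), the names `select_guiding_goal` and `distance_hazard`, the `threshold` and `lookahead` parameters, and the distance-based stand-in for the trained hazard estimator are all assumptions made for illustration.

```python
import math
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]


def distance_hazard(human: Point) -> Callable[[Point], float]:
    """Hypothetical stand-in for a learned hazard estimator.

    Returns a function mapping a waypoint to a hazard value in (0, 1]
    that grows as the waypoint gets closer to the human.
    """
    def hazard(waypoint: Point) -> float:
        return 1.0 / (1.0 + math.dist(waypoint, human))
    return hazard


def select_guiding_goal(
    path: Sequence[Point],
    current_index: int,
    hazard: Callable[[Point], float],
    threshold: float = 0.4,   # assumed hazard limit, not from the letter
    lookahead: int = 4,       # assumed maximum number of waypoints to advance
) -> int:
    """Return the index of the guiding goal on the task path.

    Starting from the current waypoint, the goal is advanced while each
    successive waypoint has hazard below `threshold`. If the very next
    waypoint is hazardous, the current index is returned (hold position).
    """
    last = min(current_index + lookahead, len(path) - 1)
    goal = current_index
    for i in range(current_index + 1, last + 1):
        if hazard(path[i]) >= threshold:
            break  # do not place the goal at or beyond a hazardous waypoint
        goal = i
    return goal


if __name__ == "__main__":
    path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
    # Human far away: the goal moves to the end of the lookahead window.
    print(select_guiding_goal(path, 0, distance_hazard((100.0, 100.0))))
    # Human standing at the end of the path: the goal stops short of it.
    print(select_guiding_goal(path, 0, distance_hazard((4.0, 0.0))))
```

Under these assumptions, the goal tracks the reference path when the human is far away and retreats toward the robot's current waypoint as the human approaches, which mirrors the two behaviors reported in the results. In the actual method, the hazard values would come from the trained hazard estimator and the selected goal would be passed to the RL control policy.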