A single sensor rarely yields an accurate pose estimate for a device. This paper presents a low-cost and accurate localization approach for an autonomous mobile robot that fuses vision with a six-degrees-of-freedom (6-DoF) inertial sensor comprising a 3-axis accelerometer and a 3-axis gyroscope. For vision, a monocular object detection pipeline integrates the speeded-up robust features (SURF) and random sample consensus (RANSAC) algorithms to recognize a sample object in several captured images. In contrast to conventional methods that depend on point tracking, RANSAC iteratively estimates the parameters of a mathematical model from a set of captured data that contains outliers. Together, SURF and RANSAC improve accuracy because the Hessian-matrix-based detector finds interest points (features) under different viewing conditions. This approach is proposed because of its simple implementation, low cost, and improved accuracy. An extended Kalman filter (EKF) fuses the data from the inertial sensors and the camera to estimate the position and orientation of the mobile robot. All of these sensors were mounted on the mobile robot to obtain an accurate localization. An indoor experiment was carried out to validate the method and evaluate its performance. Experimental results show that the proposed method is computationally fast, reliable, and robust, and can be considered for practical applications. The experimental results were verified against ground truth data using root mean square errors (RMSEs).
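The iterative estimation that RANSAC performs can be illustrated with a minimal sketch. The example below is not the pipeline used in this work: it assumes a simple 2-D line model in place of the image-matching model, and the function names (`fit_line`, `ransac_line`), the iteration count, and the inlier threshold are all hypothetical choices made for illustration.

```python
import random

def fit_line(p, q):
    """Return (slope, intercept) of the line through two points, or None if vertical."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        return None
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1

def ransac_line(points, n_iters=200, threshold=0.5, seed=0):
    """Estimate a line y = a*x + b from 2-D points that contain outliers."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    best_model, best_inliers = None, []
    for _ in range(n_iters):
        # 1. Fit a candidate model to a minimal random sample (two points).
        model = fit_line(*rng.sample(points, 2))
        if model is None:
            continue
        a, b = model
        # 2. Collect the points whose vertical residual is within the threshold.
        inliers = [(x, y) for x, y in points if abs(y - (a * x + b)) <= threshold]
        # 3. Keep the candidate supported by the most inliers so far.
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers

# Illustrative data (assumed): ten points on y = 2x + 1 plus three gross outliers.
points = [(float(x), 2.0 * x + 1.0) for x in range(10)]
points += [(2.0, 30.0), (5.0, -20.0), (8.0, 50.0)]
model, inliers = ransac_line(points)
```

In this toy setting the three outliers never gather enough support to win, so the returned model is determined by the consistent points only. A feature-matching application would replace the line model with a geometric transformation between images and the vertical residual with a reprojection error.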