In virtual, augmented, and mixed reality applications, real-time acquisition of user and object poses is a prerequisite for building a highly immersive virtual environment. As virtual reality technology develops, users demand a growing range of motion in virtual environments. They are no longer content with limited movement within the confined space of a single room; instead, they seek to roam and interact in larger environments. Most tracking systems used by popular AR/VR devices today are designed for room-scale or smaller tracking volumes. When a larger tracking range is required, these systems either accumulate greater drift error or require additional hardware to be installed to cover the larger area (e.g., Lighthouse), which incurs high hardware costs and a complex configuration process and makes them unsuitable for general and personal use. To address this, we propose an optical positioning and tracking system that achieves accurate 3D camera tracking by deploying a small number of infrared LED markers on the ceiling or floor. The proposed system builds its landmark pattern from the most basic dot and line elements. Unlike in traditional marker-based systems, an individual dot carries no information by itself; it is identified only after it has been combined with an adjacent line into a basic graphic element. The straight line segments add redundancy to the basic graphic elements, so that a pattern can still be recognized when some of its dots are occluded. By designing the encoding principle of the marker patterns, employing a retrieval method for repeated layout features, and implementing the corresponding point matching algorithms, we achieve fast and accurate decoding of the landmark images. Experiments show that the system achieves millimeter-level position accuracy.
In robustness experiments, the proposed method maintains high recognition accuracy even under challenging conditions such as large inclination angles and marker point occlusion. These measurements show the potential of our system to cope with more extreme situations. We also measured the processing time of the system: the average latency of our method is 4.34 ms, which indicates that using a sparse layout of graphic elements and simplifying marker point decoding effectively reduces computation time. The resulting tracking system is low-cost, easily scalable, and resilient to occlusion, and it meets the demand for real-time tracking and positioning over areas on the order of 100 square meters.
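As a rough illustration of how dots might be combined with an adjacent line into basic graphic elements, the following minimal sketch attaches each detected dot to its nearest detected line segment in the image. The nearest-segment rule, the distance threshold `max_dist`, and the function names are assumptions made for illustration only; the text above does not specify the association rule that the system actually uses.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from 2D point p to the segment with endpoints a and b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: treat it as a single point.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment's line and clamp to the segment itself.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def group_dots_by_segment(dots, segments, max_dist):
    """Attach each dot to its nearest segment if within max_dist (assumed rule).

    Returns (groups, unassigned): groups maps a segment index to the list of
    dot indices attached to it; unassigned lists dots with no nearby segment.
    """
    groups = {i: [] for i in range(len(segments))}
    unassigned = []
    for j, dot in enumerate(dots):
        best_i, best_d = None, max_dist
        for i, (a, b) in enumerate(segments):
            d = point_segment_distance(dot, a, b)
            if d <= best_d:
                best_i, best_d = i, d
        if best_i is None:
            unassigned.append(j)
        else:
            groups[best_i].append(j)
    return groups, unassigned


# Hypothetical pixel coordinates, not data from the experiments.
segments = [((0, 0), (10, 0)), ((0, 20), (10, 20))]
dots = [(5, 2), (3, 18), (50, 50)]
groups, unassigned = group_dots_by_segment(dots, segments, max_dist=5.0)
```

In this sketch a dot with no line segment nearby stays unassigned and therefore carries no identity, which mirrors the statement that individual dots are identified only as part of a graphic element.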