# Detect the first node of the cycle in a singly-linked list

If a singly-linked data structure contains a cycle, determine the first node that participates in the cycle. you may use only a constant number of pointers into the list (and no other variables). The running time of your algorithm should be linear in the number of nodes in the data structure. You are given the pointer to the first node of the list.

(Java Code on github at the bottom of the post. )

My thoughts and solutions:

This is the third time I encountered this question and once again, failed. Well I remembered how to write the code, but I couldn’t figure out the complete reasoning, which made the code kind of useless. Memorizing the code is never my intention but it seems that my brain disagrees…The following is a conversation:

Q: What is the unknown? What are the data? Any conditions or constraints?

A: The unknown is the where the cycle begins. The data is a pointer to the first node of the list. The constraints are that we can only use constant numbers of pointers and the running time should be linear.

Q: Have you seen a similar question before? If yes, can we use its result or method here?

A: Yes, a similar question is to detect if a cycle exists. Hmm, we used a fast runner and slow runner approach. I guess it’s a common strategy to use when dealing with cycles. So we can still consider using it here.

Q: Describe the approach above.

A: We have two pointers F and S, with S moving at rate 1 and F moving at rate 2.

Q: OK, can we introduce a symbol for our unknown? Quantify it?

A: Sure, suppose that node is k steps away from the first node of the list.

Q: Now suppose S moved k steps, where are F?

A: In that case, S is at that starting node of the cycle while F is k steps into the cycle because F moved 2k steps when S moved k steps.

Q: Relatively speaking, what’s the location of F according to S and vice-versa?

A: Then S is k steps behind F and F is CYCLE_SIZE – k steps behind S.

Q: What if k is bigger than CYCLE_SIZE?

A: Hmm, then we can replace k with K = k % CYCLE_SIZE. Whether k or K steps into the cycle points to the same node.

Q: Good, now rephrase the answer to the relative location question.

A: OK, S is K steps behind F and F is CYCLE_SIZE – K steps behind S.

Q: If F and S keep moving, they will meet somewhere in the cycle. When and where will they meet?

A: Aha, suddenly it looks like a junior high physics question…F will catch up S at a rate of 1. So after CYCLE_SIZE – K unit times, they will meet. Since at the beginning S is at entrance of the cycle, after CYCLE_SIZE – K time, S is CYCLE_SIZE – K steps into the cycle. Hmm so when F and S meet, they are simply K steps behind the entrance of the cycle.

Q: Great. Now who else is K steps behind the entrance of the cycle besides F and S?

A: Eh, no one?

Q: OK look at our data. Did we use all the data?

A: Eh the data is the first node of the list. Hmm, but it’s k steps from the cycle…

Q: What’s the relationship between k and K?

A: Oh right, K = k % CYCLE_SIZE. In the cycle, K and k are interchangeable. So F, S and the first node of the list are all k steps behind the cycle! If now we start from the first node of the list with a pointer T and keep move, the place where S and T meet is our answer!

Q: You got it.

Summary:

• Locate the unknown. Symbolize and quantify it!
• When dealing with loops or cycles, think about multiple pointers with different moving rates.
• In this catch and run problem, ask both when and where.
• In a cycle, K (= k % CYCLE_SIZE) and k steps into the cycle point to the same node.
• To find a location, imagine how you get there and when you get there, what are other things like? (when S is at the entrance of the cycle, where is F?)

Code on github: