When will the Hadoop NN recognize a failure?
I’ve been trying to make the NameNode (NN) recognize that it is inconsistent and turn to the Secondary NameNode (SNN) for data.
Scenario 1: Will the NN recognize that its data is not up to date? (No)
1. Start with the NN on server A, the SNN on B, and a DN on C.
2. Enter 3 files (testFileX.test).
3. Wait for an SNN image to be written (usually happens after 5 minutes).
4. Kill the NN, SNN, and DN.
5. Start the NN on B with -importCheckpoint, start the SNN on B, start the DN on C.
6. Enter 3 new files (failbackFileX.test).
7. stop-dfs.sh
8. Restart the original structure (NN on A, SNN on B, DN on C).
9. Which files does Hadoop recognize? Only the testFileX.test files (the first ones).
Result: the Hadoop NN doesn’t recognize that its data is not up to date. (A command-level sketch of these steps follows below.)
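For concreteness, here is roughly what those steps look like as commands, assuming a Hadoop 1.x cluster driven from server A with the stock start-dfs.sh / stop-dfs.sh and hadoop-daemon.sh scripts, dfs.name.dir at /usr/apps/hadoop/name (the path from the error in Scenario 2), and empty files created with hadoop fs -touchz; the file names, hosts, and the way the daemons were killed are assumptions, not a transcript of the actual run.

```bash
# Step 1: normal cluster layout, NN on A, SNN on B, DN on C
# (as configured in conf/masters and conf/slaves).
start-dfs.sh                                          # run on A

# Step 2: create the first batch of (empty) files.
hadoop fs -touchz /testFile1.test /testFile2.test /testFile3.test

# Step 3: wait for the SNN on B to write a checkpoint. The interval is governed
# by fs.checkpoint.period; "5 minutes" suggests it was lowered from the 3600 s default.

# Step 4: kill the NN, SNN and DN.
stop-dfs.sh                                           # run on A

# Step 5: bring the NN up on B from the SNN checkpoint, plus the SNN on B and the DN on C.
hadoop-daemon.sh start namenode -importCheckpoint     # on B (needs an empty dfs.name.dir there)
hadoop-daemon.sh start secondarynamenode              # on B
hadoop-daemon.sh start datanode                       # on C

# Step 6: create the second batch of files against the temporary NN on B.
hadoop fs -touchz /failbackFile1.test /failbackFile2.test /failbackFile3.test

# Step 7: stop everything.
stop-dfs.sh

# Step 8: restart the original structure (NN on A, SNN on B, DN on C).
start-dfs.sh                                          # run on A

# Step 9: check which files the NN on A knows about.
hadoop fs -ls /
```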
Scenario 2: Force it to load with importCheckpoint
Same as Scenario 1 up to and including step 7 (stop-dfs.sh), then:
- hadoop namenode -importCheckpoint:
… NameNode already contains an image in /usr/apps/hadoop/name …
That didn’t work either.
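This matches how -importCheckpoint is documented for Hadoop 1.x: the NN refuses to import while dfs.name.dir already holds a legal image, and only loads the checkpoint from fs.checkpoint.dir once that directory is empty. Below is a minimal sketch of what it expects, assuming dfs.name.dir is /usr/apps/hadoop/name (from the error above) and that fs.checkpoint.dir on A already contains a checkpoint taken from the SNN on B:

```bash
# On server A, with the NN stopped. The key point is that dfs.name.dir must be
# empty before -importCheckpoint will do anything; the paths are assumptions.
mv /usr/apps/hadoop/name /usr/apps/hadoop/name.bak    # move the stale image aside
mkdir -p /usr/apps/hadoop/name

# With dfs.name.dir empty, the NN populates it from fs.checkpoint.dir instead of
# complaining that it "already contains an image".
hadoop namenode -importCheckpoint
```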
Scenario 3: Corrupt the VERSION file:
Hadoop recognizes that the VERSION file is corrupted, but simply fails; it makes no attempt to reach the SNN. Doesn’t work.
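For reference, here is a rough sketch of how this corruption can be reproduced; the current/VERSION path under /usr/apps/hadoop/name follows the standard Hadoop 1.x storage layout, and the way the file is damaged here is an assumption:

```bash
# On server A, with the cluster stopped: damage the storage VERSION file.
echo "garbage" >> /usr/apps/hadoop/name/current/VERSION

# On the next start the NN detects the inconsistency and aborts; it never asks
# the SNN on B for a good copy.
start-dfs.sh
```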
Scenario 4: Corrupt the fsimage file:
The NN fails to start with:
ERROR fs.FSNamesystem (FSNamesystem.java: …
No SNN involvement.
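Again for reference, a sketch of how the fsimage can be damaged to reproduce this; the path and the corruption method are assumptions:

```bash
# On server A, with the cluster stopped: overwrite a few bytes inside fsimage
# without truncating the file.
printf 'garbage' | dd of=/usr/apps/hadoop/name/current/fsimage bs=1 seek=16 conv=notrunc

# On restart the NN logs an ERROR from fs.FSNamesystem and shuts down; it makes
# no attempt to fetch a good image from the SNN.
start-dfs.sh
```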
Conclusion:
I have no idea when Hadoop decides to turn to the SNN; it seems it should have done so in any of the above scenarios, but it doesn’t.