Massive open online courses (“MOOCs”) are large-scale courses offered to participants around the world. Because enrollment can reach tens or even hundreds of thousands of people, it is difficult for the instructor and/or teaching assistants to identify whether any students have plagiarized content in their homework or assignments. Many assignments involve writing essays or developing computer programs to solve a particular problem. With the increasing popularity of distance education programs, manually sifting through such a large number of documents to detect plagiarism is a cumbersome process.