Two documents are often equivalent in content, whether logically or statistically, yet differ in how they are physically represented. Because a digital signature is computed over the exact representation, such documents would otherwise produce different signatures. Canonicalization solves this problem: it converts each document to a single standard (canonical) form, so that two equivalent documents yield the same digital signature and can be identified as matching.
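As a rough illustration of the idea, the sketch below assumes the documents are JSON, which the text above does not specify. The `canonicalize` and `digest` helpers are hypothetical names, and a SHA-256 digest stands in for a real digital signature, which would additionally sign that digest with a private key. The canonical form used here (sorted keys, no optional whitespace) is one simple choice and not a formal standard.

```python
import hashlib
import json


def canonicalize(text: str) -> bytes:
    """Parse JSON text and re-serialize it in one fixed form.

    Assumption: sorted keys and compact separators are "canonical enough"
    for this illustration; a production system would follow a published
    canonicalization scheme instead.
    """
    obj = json.loads(text)
    canonical = json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return canonical.encode("utf-8")


def digest(data: bytes) -> str:
    """SHA-256 hex digest, used here as a stand-in for a signature."""
    return hashlib.sha256(data).hexdigest()


# Same content, different key order and whitespace.
doc_a = '{"title": "Report", "pages": 3}'
doc_b = '{\n  "pages": 3,\n  "title": "Report"\n}'

# Digests over the raw text differ, even though the content is equivalent.
raw_match = digest(doc_a.encode("utf-8")) == digest(doc_b.encode("utf-8"))

# Digests over the canonical form agree.
canonical_match = digest(canonicalize(doc_a)) == digest(canonicalize(doc_b))

print(raw_match, canonical_match)
```

The same pattern applies to other formats: only the `canonicalize` step changes, while the comparison of digests stays the same.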