PROBLEM TO BE SOLVED : To provide a technique that makes it unnecessary to prepare a dictionary of words to be anonymized, and that properly anonymizes a character string even when the combination of a word and the peripheral notation containing it is rare.
SOLUTION : Each item of text data containing a character string is classified into one of a plurality of types according to classification conditions. The words contained in the text data classified into the same type (name gathering data) are extracted. From these, every word combination made up of one or more of the extracted words is extracted for which the number of name gathering data items containing all the words of the combination is equal to or greater than a threshold. Finally, among the words in the character string of each item of text data, those that match at least some of the extracted words but do not match any word making up an extracted word combination are anonymized.
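The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: words are obtained by splitting on whitespace, the classification condition is a caller-supplied function, combinations are limited to at most two words, and anonymization is done by masking with asterisks. The names `frequent_words`, `anonymize`, `classify`, `max_size` and `mask` are hypothetical and do not come from the abstract.

```python
from collections import Counter, defaultdict
from itertools import combinations


def frequent_words(records, threshold, max_size=2):
    """Collect words belonging to a combination found in >= threshold records.

    `records` is a list of word lists, all classified into the same type.
    `max_size` caps the combination size (an assumption, to keep this small).
    """
    support = Counter()
    for words in records:
        unique = sorted(set(words))  # count each word once per record
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                support[combo] += 1
    keep = set()
    for combo, count in support.items():
        if count >= threshold:
            keep.update(combo)
    return keep


def anonymize(texts, classify, threshold, mask="*"):
    """Mask every word that is not part of a frequent combination of its type."""
    # Step 1: classify each text into a type (assumed whitespace tokenization).
    groups = defaultdict(list)
    for text in texts:
        groups[classify(text)].append(text.split())
    # Step 2: per type, find the words that occur in frequent combinations.
    keep = {kind: frequent_words(recs, threshold) for kind, recs in groups.items()}
    # Step 3: mask the remaining (rare) words in each text.
    result = []
    for text in texts:
        allowed = keep[classify(text)]
        masked = [w if w in allowed else mask * len(w) for w in text.split()]
        result.append(" ".join(masked))
    return result


if __name__ == "__main__":
    samples = [
        "patient Tanaka has fever",
        "patient Suzuki has fever",
        "patient Sato has cough",
    ]
    # A single hypothetical type; real classification conditions would differ.
    for line in anonymize(samples, classify=lambda t: "medical", threshold=2):
        print(line)
```

In this sketch no dictionary of sensitive words is needed: words that recur across records of the same type are kept, while words that appear in fewer than `threshold` records, such as personal names, are masked.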
COPYRIGHT : (C)2009, JPO&INPIT