sos.cleaner.parsers — Parser Interface
- class sos.cleaner.parsers.SoSCleanerParser(config={}, skip_cleaning_files=[])
Bases: object
Parsers are used to build objects that take a line as input, parse it for a particular pattern (e.g. IP addresses), and then make any necessary substitutions by referencing the SoSMap() associated with the parser.
Ideally, a new parser subclass only needs to set the class-level attributes to be fully functional.
- Parameters:
conf_file (str) – The configuration file to read from
- Variables:
name (str) – The parser name, used in logging errors
regex_patterns (list) – A list of regex patterns to iterate over for every line processed
mapping (SoSMap()) – Used by the parser to store and obfuscate matches
map_file_key (str) – The key in the map_file to read when loading previous obfuscation matches
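As an illustration of the "class-level attributes only" idea, here is a minimal, self-contained sketch. It does not import sos: `SketchMap`, `SketchParser`, and `SketchMacParser` are hypothetical stand-ins, and the `obfuscatedN` naming scheme and the MAC address pattern are assumptions made for this example, not the behaviour of the real classes.

```python
import re


class SketchMap:
    """Hypothetical stand-in for SoSMap(): stores match -> obfuscated value."""

    def __init__(self):
        self.dataset = {}

    def get(self, item):
        # Reuse an existing obfuscation, or create a new one (assumed scheme).
        if item not in self.dataset:
            self.dataset[item] = f"obfuscated{len(self.dataset) + 1}"
        return self.dataset[item]


class SketchParser:
    """Hypothetical stand-in for the parser base class; not the sos code."""

    name = 'Undefined Parser'
    map_file_key = 'unset'
    regex_patterns = []

    def __init__(self):
        self.mapping = SketchMap()
        # Compile the class-level patterns once, at construction time.
        self._compiled = [re.compile(p, re.I) for p in self.regex_patterns]

    def parse_line(self, line):
        for pattern in self._compiled:
            # dict.fromkeys de-duplicates matches while keeping their order.
            matches = dict.fromkeys(m.group(0) for m in pattern.finditer(line))
            for match in matches:
                line = line.replace(match, self.mapping.get(match))
        return line

    def get_map_contents(self):
        return self.mapping.dataset


class SketchMacParser(SketchParser):
    """Sets only class-level attributes; all behaviour is inherited."""

    name = 'Sketch MAC Parser'
    map_file_key = 'sketch_mac_map'
    regex_patterns = [r'(?:[0-9a-f]{2}:){5}[0-9a-f]{2}']


parser = SketchMacParser()
cleaned = parser.parse_line("link/ether 52:54:00:ab:cd:ef mtu 1500")
print(cleaned)
print(parser.get_map_contents())
```

The subclass body contains no methods at all, which is the property the description above aims for: the base class supplies the parsing loop and the subclass supplies only the data that drives it.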
- compile_regexes = True
- generate_item_regexes()
Generate regexes once for items the parser searches for repeatedly, so that they do not need to be regenerated for every file and/or line we process.
Not used by all parsers.
- get_map_contents()
Get the contents of the mapping used by the parser
- Returns:
All matches and their obfuscated counterparts
- Return type:
dict
- map_file_key = 'unset'
- name = 'Undefined Parser'
- parse_line(line)
This will be called for every line in every file we process, so that every parser has a chance to scrub everything.
This first tries to identify needed obfuscations for items we have already encountered (only if the parser uses compiled regexes) and makes those substitutions early on. The line is then parsed again to look for new matches.
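The two-pass flow described above can be sketched roughly as follows. This is an assumption-laden illustration, not the sos implementation: `known`, `item_regexes`, `new_pattern`, and `parse_line_sketch` are hypothetical names, and the hostname pattern and `obfuscatedN` scheme are invented for the example.

```python
import re

# Hypothetical existing map of already-encountered items.
known = {"host1.example.com": "obfuscated1"}
# Pass-1 regexes: one pre-built pattern per already-known item.
item_regexes = {k: re.compile(re.escape(k), re.I) for k in known}
# Pass-2 pattern: finds candidates not seen before (assumed pattern).
new_pattern = re.compile(r'\b[a-z0-9-]+\.example\.com\b', re.I)


def parse_line_sketch(line):
    # Pass 1: substitute items we have already encountered.
    for item, regex in item_regexes.items():
        line = regex.sub(known[item], line)
    # Pass 2: parse the line again, looking for new matches.
    for match in dict.fromkeys(new_pattern.findall(line)):
        if match not in known:
            known[match] = f"obfuscated{len(known) + 1}"
            # Pre-build a regex so later lines hit this item in pass 1.
            item_regexes[match] = re.compile(re.escape(match), re.I)
        line = line.replace(match, known[match])
    return line


first = parse_line_sketch("HOST1.example.com talked to host2.example.com")
second = parse_line_sketch("host2.example.com again")
print(first)
print(second)
```

In this sketch, an item discovered in the second pass of one line is handled by the first pass on every later line, so the broader pattern search only has to deal with items that are genuinely new.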
- parse_string_for_keys(string_data)
Parse a given string for instances of any obfuscated items, without applying the normal regex comparisons first. This is mainly used to obfuscate filenames that have, for example, hostnames in them.
Rather than trying to regex-match the string_data, this just uses the built-in substring checks against known obfuscated keys.
- Parameters:
string_data (str) – The line to be parsed
- Returns:
The obfuscated line
- Return type:
str
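A substring-based pass of this kind might look like the sketch below. The function name `parse_string_for_keys_sketch`, the `dataset` argument, and the sample mapping are hypothetical. Processing longer keys first is a design choice of this example, made so that a short key cannot rewrite part of a longer one; it is not something the description above states.

```python
def parse_string_for_keys_sketch(string_data, dataset):
    """Replace known keys in string_data using plain substring checks."""
    # Longest keys first (an assumption of this sketch, see lead-in).
    for key in sorted(dataset, key=len, reverse=True):
        if key in string_data:
            string_data = string_data.replace(key, dataset[key])
    return string_data


# Hypothetical mapping of original items to their obfuscated values.
sample_map = {
    "node1": "host0",
    "node1.example.com": "host0.obfuscateddomain0.com",
}
filename = "sosreport-node1.example.com-2024.tar.xz"
cleaned_name = parse_string_for_keys_sketch(filename, sample_map)
print(cleaned_name)
```

No regular expressions are involved here, which suits inputs such as filenames where the surrounding text does not resemble the log lines the regex patterns were written for.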
- parser_skip_files = []
- regex_patterns = []
- skip_cleaning_files = []
- skip_line_patterns = []