Skip to content

The Problem Parser

Henri Lotze edited this page Mar 21, 2021 · 3 revisions

The parser enforces a few methods that need to be implemented. We give a short overview over the expected behavior of an implementation.

decode(self, raw_input: bytes) -> any:

This is the first method of the parser that will be called by the match. The raw_input is a byte-encoded string, which contains the output of the generator or solver. The job of the parser.decode method is to convert the string into a format which can be further processed by the split_into_instance_and_solution method. For most problems, a sensible functionality for the parser.decode method is splitting the input string into lines at its newlines and possibly already splitting up the elements of each line into a list or tuple. The goal is to preprocess the input in a way such that the parser.split_into_instance_and_solution can do its job without having to further preprocess the strings.

An example could look like the following

b'e 1 2\ne 3 2\ns set1 1\ns set1 3\ns set2 2\na set 2\ns foo bar\ne foo bar'

is converted to

[('e', '1', '2'), ('e', '3', '2'), ('s', 'set1', '1'), ('s', 'set1', '3'), ('s', 'set2', '2'), ('a', 'set', '2'), ('s' 'foo', 'bar'), ('e', 'foo', 'bar')]

split_into_instance_and_solution(self, raw_input: any) -> Tuple[any, any]

The parser.split_into_instance_and_solution method accepts whatever format is output by the parser.decode method and outputs a tuple, with the first element containing the instance line(s) and the second element containing the solution line(s). All lines that are not instance or solution lines as defined in the problem are discarded.

An example could look like the following

[('e', '1', '2'), ('e', '3', '2'), ('s', 'set1', '1'), ('s', 'set1', '3'), ('s', 'set2', '2'), ('a', 'set', '2'), ('s' 'foo', 'bar'), ('e', 'foo', 'bar')]

is converted to

([('e', '1', '2'), ('e', '3', '2'), ('e', 'foo', 'bar')], [('s', 'set1', '1'), ('s', 'set1', '3'), ('s', 'set2', '2'), ('s' 'foo', 'bar')])

parse_instance(self, raw_instance: any, instance_size: int) -> any

The parser.parse_instance method is one of the two main functions of the parser. Its job is to discard any syntactically wrong lines from a given instance and returning a cleaned up instance. Removing lines may make the instance semantically invalid already, but this should not be checked at this point already, but by the verifier later. While each problem has its own syntax, common checks for an instance are the following:

  • Are there lines with too many entries?
  • Are there lines with too few entries?
  • Are the lines compatible with the given instance_size?
  • Are there duplicate lines?
  • Are there lines not following the wanted format, e.g. replacing numbers by letters?

An example could look like the following

[('e', '1', '2'), ('e', '3', '2'), ('e', 'foo', 'bar')]

is converted to

[('e', '1', '2'), ('e', '3', '2')]

parse_solution(self, raw_solution: any, instance_size: int) -> any

Similarly to the parser.parse_instance method, the parser.parse_solution method is responsible for discarding syntactically wrong lines from a given solution. The common checks for the solution are similar to those of the parser.parse_instance method.

An example could look like the following

[('s', 'set1', '1'), ('s', 'set1', '3'), ('s', 'set2', '2'), ('s' 'foo', 'bar')]

is converted to

[('s', 'set1', '1'), ('s', 'set1', '3'), ('s', 'set2', '2')]

encode(self, input: any) -> bytes:

The parser.encode method converts an instance back to a bytes object that can be passed to a solver (and meets the problem specifications on the expected format). It is fed the output of the parser.parse_instance method, so make sure that these methods are compatible!

An example could look like the following

[('e', '1', '2'), ('e', '3', '2')]

is converted to

b'e 1 2\ne 3 2'