Next Previous Contents

4. Filters

4.1 Filters

You can find sources of the filter language in filter/ directory. File filter/config.Y contains filter grammar and basically translates the source from user into a tree of f_inst structures. These trees are later interpreted using code in filter/filter.c.

A filter is represented by a tree of f_inst structures, later translated into lists called f_line. All the instructions are defined and documented in filter/f-inst.c definition file.

Filters use a f_val structure for their data. Each f_val contains type and value (types are constants prefixed with T_). Look into filter/data.h for more information and appropriate calls.


Function

enum filter_return interpret (struct filter_state * fs, const struct f_line * line, uint argc, const struct f_val * argv, struct f_val * val)

Arguments

struct filter_state * fs

filter state

const struct f_line * line

-- undescribed --

uint argc

-- undescribed --

const struct f_val * argv

-- undescribed --

struct f_val * val

-- undescribed --

Description

Interpret given tree of filter instructions. This is core function of filter system and does all the hard work.

Each instruction has 4 fields

code (which is instruction code), aux (which is extension to instruction code, typically type), arg1 and arg2 - arguments. Depending on instruction, arguments are either integers, or pointers to instruction trees. Common instructions like +, that have two expressions as arguments use TWOARGS macro to get both of them evaluated.


Function

enum filter_return f_run (const struct filter * filter, struct rte ** rte, struct linpool * tmp_pool, int flags) -- run a filter for a route

Arguments

const struct filter * filter

filter to run

struct rte ** rte

route being filtered, may be modified

struct linpool * tmp_pool

all filter allocations go from this pool

int flags

flags

Description

If filter needs to modify the route, there are several posibilities. rte might be read-only (with REF_COW flag), in that case rw copy is obtained by rte_cow() and rte is replaced. If rte is originally rw, it may be directly modified (and it is never copied).

The returned rte may reuse the (possibly cached, cloned) rta, or (if rta was modified) contains a modified uncached rta, which uses parts allocated from tmp_pool and parts shared from original rta. There is one exception - if rte is rw but contains a cached rta and that is modified, rta in returned rte is also cached.

Ownership of cached rtas is consistent with rte, i.e. if a new rte is returned, it has its own clone of cached rta (and cached rta of read-only source rte is intact), if rte is modified in place, old cached rta is possibly freed.


Function

enum filter_return f_eval_rte (const struct f_line * expr, struct rte ** rte, struct linpool * tmp_pool, uint argc, const struct f_val * argv, struct f_val * pres) -- run a filter line for an uncached route

Arguments

const struct f_line * expr

filter line to run

struct rte ** rte

route being filtered, may be modified

struct linpool * tmp_pool

all filter allocations go from this pool

uint argc

-- undescribed --

const struct f_val * argv

-- undescribed --

struct f_val * pres

-- undescribed --

Description

This specific filter entry point runs the given filter line (which must not have any arguments) on the given route.

The route MUST NOT have REF_COW set and its attributes MUST NOT be cached by rta_lookup().


Function

int filter_same (const struct filter * new, const struct filter * old) -- compare two filters

Arguments

const struct filter * new

first filter to be compared

const struct filter * old

second filter to be compared

Description

Returns 1 in case filters are same, otherwise 0. If there are underlying bugs, it will rather say 0 on same filters than say 1 on different.


Function

void filter_commit (struct config * new, struct config * old) -- do filter comparisons on all the named functions and filters

Arguments

struct config * new

-- undescribed --

struct config * old

-- undescribed --


Function

struct f_tree * build_tree (struct f_tree * from)

Arguments

struct f_tree * from

degenerated tree (linked by tree->left) to be transformed into form suitable for find_tree()

Description

Transforms degenerated tree into balanced tree.


Function

int same_tree (const struct f_tree * t1, const struct f_tree * t2)

Arguments

const struct f_tree * t1

first tree to be compared

const struct f_tree * t2

second one

Description

Compares two trees and returns 1 if they are same

4.2 Trie for prefix sets

We use a (compressed) trie to represent prefix sets. Every node in the trie represents one prefix (addr/plen) and plen also indicates the index of bits in the address that are used to branch at the node. Note that such prefix is not necessary a member of the prefix set, it is just a canonical prefix associated with a node. Prefix lengths of nodes are aligned to multiples of TRIE_STEP (4) and there is 16-way branching in each node. Therefore, we say that a node is associated with a range of prefix lengths (plen .. plen + TRIE_STEP - 1).

The prefix set is not just a set of prefixes, it is defined by a set of prefix patterns. Each prefix pattern consists of ppaddr/pplen and two integers: low and high. The tested prefix paddr/plen matches that pattern if the first MIN(plen, pplen) bits of paddr and ppaddr are the same and low <= plen <= high.

There are two ways to represent accepted prefixes for a node. First, there is a bitmask local, which represents independently all 15 prefixes that extend the canonical prefix of the node and are within a range of prefix lengths associated with the node. E.g., for node 10.0.0.0/8 they are 10.0.0.0/8, 10.0.0.0/9, 10.128.0.0/9, .. 10.224.0.0/11. This order (first by length, then lexicographically) is used for indexing the bitmask local, starting at position 1. I.e., index is 2^(plen - base) + offset within the same length, see function trie_local_mask6() for details.

Second, we use a bitmask accept to represent accepted prefix lengths at a node. The bit is set means that all prefixes of given length that are either subprefixes or superprefixes of the canonical prefix are accepted. As there are 33 prefix lengths (0..32 for IPv4), but there is just one prefix of zero length in the whole trie so we have zero flag in f_trie (indicating whether the trie accepts prefix 0.0.0.0/0) as a special case, and accept bitmask represents accepted prefix lengths from 1 to 32.

One complication is handling of prefix patterns with unaligned prefix length. When such pattern is to be added, we add a primary node above (with rounded down prefix length nlen) and a set of secondary nodes below (with rounded up prefix lengths slen). Accepted prefix lengths of the original prefix pattern are then represented in different places based on their lengths. For prefixes shorter than nlen, it is accept bitmask of the primary node, for prefixes between nlen and slen - 1 it is local bitmask of the primary node, and for prefixes longer of equal slen it is accept bitmasks of secondary nodes.

There are two cases in prefix matching - a match when the length of the prefix is smaller that the length of the prefix pattern, (plen < pplen) and otherwise. The second case is simple - we just walk through the trie and look at every visited node whether that prefix accepts our prefix length (plen). The first case is tricky - we do not want to examine every descendant of a final node, so (when we create the trie) we have to propagate that information from nodes to their ascendants.

There are two kinds of propagations - propagation from child's accept bitmask to parent's accept bitmask, and propagation from child's accept bitmask to parent's local bitmask. The first kind is simple - as all superprefixes of a parent are also all superprefixes of appropriate length of a child, then we can just add (by bitwise or) a child accept mask masked by parent prefix length mask to the parent accept mask. This handles prefixes shorter than node plen.

The second kind of propagation is necessary to handle superprefixes of a child that are represented by parent local mask - that are in the range of prefix lengths associated with the parent. For each accepted (by child accept mask) prefix length from that range, we need to set appropriate bit in local mask. See function trie_amask_to_local() for details.

There are four cases when we walk through a trie:

- we are in NULL - we are out of path (prefixes are inconsistent) - we are in the wanted (final) node (node length == plen) - we are beyond the end of path (node length > plen) - we are still on path and keep walking (node length < plen)

The walking code in trie_match_net() is structured according to these cases.

Iteration over prefixes in a trie can be done using TRIE_WALK() macro, or directly using trie_walk_init() and trie_walk_next() functions. The second approach allows suspending the iteration and continuing in it later. Prefixes are enumerated in the usual lexicographic order and may be restricted to a subset of the trie (all subnets of a specified prefix).

Note that the trie walk does not reliably enumerate `implicit' prefixes defined by low and high fields in prefix patterns, it is supposed to be used on tries constructed from `explicit' prefixes (low == plen == high in call to trie_add_prefix()).

The trie walk has three basic state variables stored in the struct f_trie_walk_state -- the current node in stack[stack_pos], accept_length for iteration over inter-node prefixes (non-branching prefixes on compressed path between the current node and its parent node, stored in the bitmap accept of the current node) and local_pos for iteration over intra-node prefixes (stored in the bitmap local).

The trie also supports longest-prefix-match query by trie_match_longest_ip4() and it can be extended to iteration over all covering prefixes for a given prefix (from longest to shortest) using TRIE_WALK_TO_ROOT_IP4() macro. There are also IPv6 versions (for practical reasons, these functions and macros are separate for IPv4 and IPv6). There is the same limitation to enumeration of `implicit' prefixes like with the previous TRIE_WALK() macro.


Function

struct f_trie * f_new_trie (linpool * lp, uint data_size) -- allocates and returns a new empty trie

Arguments

linpool * lp

linear pool to allocate items from

uint data_size

user data attached to node


Function

void * trie_add_prefix (struct f_trie * t, const net_addr * net, uint l, uint h)

Arguments

struct f_trie * t

trie to add to

const net_addr * net

IP network prefix

uint l

prefix lower bound

uint h

prefix upper bound

Description

Adds prefix (prefix pattern) n to trie t. l and h are lower and upper bounds on accepted prefix lengths, both inclusive. 0 <= l, h <= 32 (128 for IPv6).

Returns a pointer to the allocated node. The function can return a pointer to an existing node if px and plen are the same. If px/plen == 0/0 (or ::/0), a pointer to the root node is returned. Returns NULL when called with mismatched IPv4/IPv6 net type.


Function

int trie_match_net (const struct f_trie * t, const net_addr * n)

Arguments

const struct f_trie * t

trie

const net_addr * n

net address

Description

Tries to find a matching net in the trie such that prefix n matches that prefix pattern. Returns 1 if there is such prefix pattern in the trie.


Function

int trie_match_longest_ip4 (const struct f_trie * t, const net_addr_ip4 * net, net_addr_ip4 * dst, ip4_addr * found0)

Arguments

const struct f_trie * t

trie

const net_addr_ip4 * net

net address

net_addr_ip4 * dst

return value

ip4_addr * found0

optional returned bitmask of found nodes

Description

Perform longest prefix match for the address net and return the resulting prefix in the buffer dst. The bitmask found0 is used to report lengths of prefixes on the path from the root to the resulting prefix. E.g., if there is also a /20 shorter matching prefix, then 20-th bit is set in found0. This can be used to enumerate all matching prefixes for the network net using function trie_match_next_longest_ip4() or macro TRIE_WALK_TO_ROOT_IP4().

This function assumes IPv4 trie, there is also an IPv6 variant. The net argument is typed as net_addr_ip4, but would accept any IPv4-based net_addr, like net4_prefix(). Anyway, returned dst is always net_addr_ip4.

Result

1 if a matching prefix was found, 0 if not.


Function

int trie_match_longest_ip6 (const struct f_trie * t, const net_addr_ip6 * net, net_addr_ip6 * dst, ip6_addr * found0)

Arguments

const struct f_trie * t

trie

const net_addr_ip6 * net

net address

net_addr_ip6 * dst

return value

ip6_addr * found0

optional returned bitmask of found nodes

Description

Perform longest prefix match for the address net and return the resulting prefix in the buffer dst. The bitmask found0 is used to report lengths of prefixes on the path from the root to the resulting prefix. E.g., if there is also a /20 shorter matching prefix, then 20-th bit is set in found0. This can be used to enumerate all matching prefixes for the network net using function trie_match_next_longest_ip6() or macro TRIE_WALK_TO_ROOT_IP6().

This function assumes IPv6 trie, there is also an IPv4 variant. The net argument is typed as net_addr_ip6, but would accept any IPv6-based net_addr, like net6_prefix(). Anyway, returned dst is always net_addr_ip6.

Result

1 if a matching prefix was found, 0 if not.


Function

void trie_walk_init (struct f_trie_walk_state * s, const struct f_trie * t, const net_addr * net)

Arguments

struct f_trie_walk_state * s

walk state

const struct f_trie * t

trie

const net_addr * net

optional subnet for walk

Description

Initialize walk state for subsequent walk through nodes of the trie t by trie_walk_next(). The argument net allows to restrict walk to given subnet, otherwise full walk over all nodes is used. This is done by finding node at or below net and starting position in it.


Function

int trie_walk_next (struct f_trie_walk_state * s, net_addr * net)

Arguments

struct f_trie_walk_state * s

walk state

net_addr * net

return value

Description

Find the next prefix in the trie walk and return it in the buffer net. Prefixes are walked in the usual lexicographic order and may be restricted to a subset of the trie during walk setup by trie_walk_init(). Note that the trie walk does not iterate reliably over 'implicit' prefixes defined by low and high fields in prefix patterns, it is supposed to be used on tries constructed from 'explicit' prefixes (low == plen == high in call to trie_add_prefix()).

Result

1 if the next prefix was found, 0 for the end of walk.


Function

int trie_same (const struct f_trie * t1, const struct f_trie * t2)

Arguments

const struct f_trie * t1

first trie to be compared

const struct f_trie * t2

second one

Description

Compares two tries and returns 1 if they are same


Function

void trie_format (const struct f_trie * t, buffer * buf)

Arguments

const struct f_trie * t

trie to be formatted

buffer * buf

destination buffer

Description

Prints the trie to the supplied buffer.


Next Previous Contents