Sunday, July 5, 2015

Implications

This blog is a continuation of the FCA blog. I actually mused over this in Summer 2004, but wanted to recap for myself and share some source code.

The article I read back then was Endliche Hüllensysteme und ihre Implikationenbasen (Finite hull systems and their implications). I summarize this article here, since I didn't find an english version online.

In short, a context is constructed, where rows are subsets of attributes ''A'', i.e. elements of the power set, and columns are implications between subsets and the incidence is given by respects ( ), H = {2A, 2A × 2A, ⊢} . This context is simplified, by applying the Armstrong rules, to form a non-redundant, complete, reduced context, named B . Then an algorithm is presented, which takes the intents of some context and recursively constructs the respected implications using B and afterwards uses the 3rd Armstrong rule to find an easy set of implications, i.e. not necessarily minimal, but easier than the Duquenne-Guiges-Basis.

Definitions

  • U → V : an implication is a pair (U, V) , U ⊆ M and V ⊆ M U is premiss and V conclusion.
  • WU → V : A set W respects (U, V) , if UWorV ⊆ W . One also says: (U, V) holds for W .
  • If V ⊆ U the implication is called trivial.

An intelligent system repeatedly gets sets of elements (attributes) as input (objects) and associates these elements. Implications are the result of many such sets.

  • A concept lattice C respects (U, V) , iff U’ ⊆ V .
  • (U, V) holds in C , iff V ⊆ U’’

Implications can be composed to give new implications. The minimal set of implications that produce all other implications is called an implication basis.

A minimal implication basis is given by the Duquenne-Guiges-Basis (stem basis), which consists of all pseudo-contents:

Let X ↦ X’’ be a closure operation, then P is pseudo-closed if

  • P  ≠  P’’
  • Q ⊂ P  ⇒  Q’’ ⊆ P

There cannot be a implication basis with less implications, than the Duquenne-Guiges-Basis, but one can choose implications that are easier (with low |U||V| , U → V with V ≠ ∅ and UV = ∅ ).

Construction of H

H = {2A, 2A × 2A, ⊢} is a context and the according concepts C(H) = {(O, A)} are subsets (= objects of H ) that respect a set of implications (= attributes H ).

The basic implications ( ⇒  ) respected by this context C ( ) are given by the Armstrong Axioms:

 ⇒ {U → U}  (A1) {U → V}  ⇒ {UW → U}  (A2) {U → V, VW → Z}  ⇒ {UW → Z}  (A3)

Some derived ones are ( ≡  is  ⇒  in both directions):

{U → V}  ≡ {U → UV}  (A4) {U → V, U → W}  ≡ {U → VW}  (A5) {U → V}  ≡ {W → V|U ⊆ W}  (A6) {U → V, UVW → Z}  ≡ {U → V, UW → Z}  (A7)

Note

Notation: UV means UV and Ui means U∪{i}

H can be constructed inductively. The sequence of the attributes is of no importance when constructing the power set.

Note

In [1] {⋆} named the new attribute in the step n and {⋆⋆} in the step n + 1 . Here I name them i and j and becomes a wild card meaning whatever element in a previous step.

In the following i means a new attribute. Because of A4:{X → Y}  ≡ {X → XY} only implications with sets U ⊆ V are included, therefore Ui → V is not there.

  • Ho = 1 , because ∅⊢∅ → ∅ .
  • Lo = 0 , because ∅⊬∅ → {i} .
Hn + 1 =  ⊢  U → V U → Vi Ui → Vi X Hn Ln 1 Xi Hn Hn Hn
  • (2, ⋆) implications hold with Xi = X∪{i} , because X respects U → V at location (1, 1) .
  • The L in (1, 2) is discussed below
  • The 1 in (1, 3) is because UiX makes the implication hold.

Ln + 1 is part of Hn + 2 , which adds another element j .

Ln + 1 =  ⊢  U → Vy U → Vij Ui → Vij X Ln Ln 1 Xi Ln Ln Ln
  • The columns of Ln + 1 are those of Hn + 1 augmented by j at the conlusion side.
  • The rows are the same as in Hn + 1 .
  • As i , so is j not in X and we can distribute L to all but (1, 3) , where UiX makes it hold.

Construction of B

Because of A5 ({U → V, U → W}  ≡ {U → VW} ) we can reduce H to implications of the shape U → U∪{i} , abbreviated U → i (thus Ui → j means U∪{i} → U∪{i, j} .

In the following names the extra element on the right of a previous recursion step, whichever that was.

B is in analogy to H horizontally subdivided by three event:

  • no new element i , ie some other of a previous step is on the right.
  • with i on right side
  • with i on both sides: the extra element on the right is any of a previous step.
B1 =  0 1       A1 =  0 1 0 0
Bn + 1 =  ⊢  U → ⋆  U → i Ui → ⋆ X Bn An 1 Xi Bn 1 Bn
  • (1, 3) : UiX , so it holds.
  • (2, 2) : XiU → i . If UX , so is UiXi , ie it holds; same for  ⊆  .
  • (1, 1) ⇒ (2, 1), (2, 3) : XU → ⋆ ≡ XiU → ⋆ ≡ XiUi → ⋆
  • (1, 2) : see below

An + 1 is part of Bn + 2 , which adds another element j on the right side. The new element of step n (i ) can be on the left side or not, hence the horizontal partitioning.

An + 1 =  ⊢  U → j Ui → j X An 1 Xi An An

The implications (columns) in B are reduced. The rows are reduced apart from the last line, which is always full.

Application to Find Implications for a Context

H , L , B , A ( = X ) are used to find the implication basis and the intents.

We recursively construct the composed context KX defined by X(K(g)) , g is object of K .

Every attribute of K(g) = g gets a position. This position says whether the attribute is there or not there - column after the if in the following formulas. So every g is coded by a binary number. This is λ:2M↦{0, 1}n in [1].

The first two lines for (g[n])Hn show the according implication coding. This is κ:(UV)↦{0, 1, 2}n with 2 = i ∈ U , 1 = i ∈ V and UV = ∅ in [1]. So every binary attribute coding becomes a ternary implication coding.

For n = 1

(g[1])H1 =  1, 0, 1 1, 1, 1    (g[1])L1 =  0, 0, 1  if g1 = 0 0, 0, 0  if g1 = 1

For n > 1

(g[n])Hn =  0   1   2 U → V U → Vi Ui → Vi (g[n − 1])Hn − 1  (g[n − 1])Ln − 1  13n − 1  if gn = 0 (g[n − 1])Hn − 1  (g[n − 1])Hn − 1  (g[n − 1])Hn − 1  if gn = 1
(g[n])Ln =  (g[n − 1])Ln − 1  (g[n − 1])Ln − 1  13n − 1  if gn = 0 (g[n − 1])Ln − 1  (g[n − 1])Ln − 1  (g[n − 1])Ln − 1  if gn = 1

The recursive construct with coding 1 = x ∈ V < 2 = x ∈ U shows that the column position, interpreted as ternary number, codes the implication. If the incidence at the column position is 1, then g respects this implication: {g’}Hn is indicator function for the respected implications.

The coding will also place a larger V before a smaller one when going from right to left. This way one can filter out implications with already encountered same left side.

The UV will be closed. If there is no implication for a left side, this is a closed intent, too.

Note

Python Implementation

Python's arbitrary precission integers is used for coding. An attribute's position is the bit position of the integer. Every object is thus an integer. A context is a list of integers.

In here higher bits are more left; in the [1] they are more right; the horizontal order is inverted.

def UV_H(Hg,gw):
    """
    Return implications UV respecting g.
    gw = g width
    Hg = H(g), g is the binary coding of the attribute set
    UV = all non-trivial (!V⊂U) implications U->V with UuV closed; in ternary coding (1=V,2=U)
    K = all closed sets
    """
    lefts = set()
    K = set()
    UV = set()
    p = Hwidth(gw)
    pp = 2**p
    while p:
        pp = pp>>1
        p = p-1
        if Hg&pp:
            y = istr(p,3,gw)
            yy = y.replace('1','0')
            is02 = y.find('1') == -1 #y∈{0,2}^n
            if is02:
                K.add(y)
            elif yy not in lefts:
                lefts.add(yy)
                UV.add(y)
    return (UV,K)

Now we do the analogue procedure with the B context. In B the implications are of the shape U → U∪{x} , so we need to specify U and the one binary position of x. This implication coding can be derived from the column position recursively (see below).

For n = 1 :

(g[1])B1 =  0 1    (g[1])A1 =  0 1  if g1 = 0 0 0  if g1 = 1

For n > 1 :

(g[n])Bn =  U → ⋆  U → i Ui → ⋆ (g[n − 1])Bn − 1  (g[n − 1])An − 1  1(n − 1)2n − 2  if gn = 0 (g[n − 1])Bn − 1  12n − 1  (g[n − 1])Bn − 1  if gn = 1
(g[n])An =  U → j Ui → j (g[n − 1])An − 1  12n − 1  if gn = 0 (g[n − 1])An − 1  (g[n − 1])An − 1  if gn = 1

The following Python returns the implication coding for a column t .

Awidth = lambda n: 2**n
Bwidth = lambda n:n*2**(n-1)
def A012(t,i):
    if i<0:
        return ""
    nA = Awidth(i)
    if t < nA:
        return "0"+A012(t,i-1)
    else:
        return "2"+A012(t-nA,i-1)
def B012(t,i):
    """
    Constructs ternary implication coding (0=not there, 2=U, 1=V)
    t is B column position
    i = |M|-1 to 0
    """
    if not i:
        return "1"
    nA = Awidth(i)
    nB = Bwidth(i)
    nBB = nB + nA
    if t < nB:
        return "0"+B012(t,i-1)
    elif t < nBB:
        return "1"+A012(t-nB,i-1)
    else:
        return "2"+B012(t-nBB,i-1)

The positions of the 1's in the B row for a given g yield the implications. Here the python source code:

def UV_B(Bg,gw):
    """
    returns the implications UV based on B
    Bg = B(g), g∈2^M
    gw = |M|, M is the set of all attributes
    """
    UV = []
    p = Bwidth(gw)
    pp = 2**p
    while p:
        pp = pp>>1
        p = p-1
        if Bg&pp:
            uv = B012(p,gw-1)
            UV.append(uv)
    return UV

To get the implications that hold in a context K , i.e. for intent 1 and intent 2 and ..., we &-combine (multiply) the B rows for intent 1 and intent 2 and ...

Note

See pyfca for the full and possibly updated source code.

Constructing a Basis

The implications that result via B can be pruned by

  • first finding the minimal antichain
  • then applying the 3rd Armstrong

The minimal Antichain

The 1-position (j) in the ternary coding is called significant component. Lj are all implications with the same significant component.

The implications U → j constructed in this way via KB are ordered lexicographically based on 0 ≤ 1 ≤ 2 .

They are also ordered with the product order, which is a partial order, and it depicts the containment hierarchy of the U side. We want to find the smallest U of every chain in Lj . This minLj is an antichain and so is L = ∪minLj .

The UV_B above changed to return this minimal antichain:

def v_Us_B(Bg,gw):
    """
    returns the implications {v:Us} based on B
    v is the significant component
    Bg = B(g), g∈2^M
    gw = |M|, M is the set of all attributes
    """
    v_Us = defaultdict(list)
    p = Bwidth(gw)
    pp = 2**p
    while p:
        pp = pp>>1
        p = p-1
        if Bg&pp:
            uv = B012(p,gw-1)
            #lets find minima regarding product order
            #{v:[Umin1,Umin2,...]}
            v = uv.find('1')#v=significant
            u = uv[:v]+'0'+uv[v+1:]
            u = int(u.replace('2','1'),2)
            Umin_s = self[gw-v-1]#bit position from right
            it = [i for i,U in enumerate(Umin_s) if U&u==u]
            for i in reversed(it):
                del Umin_s[i]
            else:
                Umin_s.append(u)
    return v_Us

The Basis

In the construction of the minimal antichain the 2nd Armstrong rule has been used, but there is still redundancy due to the 3rd Armstrong rule {U → V, VW → Z}  ⇒ {UW → Z} .

A component-wise operation is created to represent the 3rd Armstrong rule for the 0,1,2-coding

  • UVW ⇒ UW : 0○0 = 0 and 2○0 = 0○2 = 2○2 = 2 This is a binary or operation.
  • VVW ⇒ UW , no V in UW : 1○2 = 2○1 = 0 .
  • Z ⇒ Z , VZ = ∅ : 1○0 = 0○1 = 1
  • LiLj = {xy|x ∈ Li, y ∈ Lj, i ≠ j}

Note that it is required that, with X → x and Y → y , x ∈ Y xor y ∈ X to make this the 3rd Armstrong rule.

This is implemented in python as

#u1,u2 code the attribute sets via the bits of an integer
#v1,v2 are bit indices
#u1->v1, vv1 = 2**v1
#u2->v2, vv2 = 2**v2
if vv2&u1 and not vv1&u2:
    return (v1,[(u1|u2)&~vv2])
elif vv1&u2 and not vv2&u1:
    return (v2,[(u1|u2)&~vv1])

The properties of are:

  • xy = yx
  • x ≠ xy ≠ y
  • If x ≤ ny , then xy ≤ ny ,  ≤ n is the product order
  • If xy ∈ Li , then x ∈ Li or y ∈ Li (note: Li not minLi )
  • If there exits xy = z ∈ L , then there exist ,  ∈ L with  = z

The objective is to find Y ⊆ L , of which the repeated application of produces L : Y = L .

Y is defined as

  • Y2 = YY
  • Yn + 1 = Yn∪{z ∈ L|z = xy, x ∈ Yn, y ∈ YYn} This means that of Yn○(YnY) only elements in L are added at each step. But the algorithm below can cope with Yn getting bigger than L.
  • Y = ∪n ≥ 2Yn
  • Y = Y∪○Y

One starts with Y0 = L (LL) , i.e. all those elements of L that are not results of an application of the 3rd Armstrong rule. As long as Yn ≠ L , we add z ∈ L Yn to Yn . Yn∪{z} cannot generate elements of Y0 , but it can generate elements of Zn = Yn Y0 . Therefore we do Yn + 1 = Y0∪(Zn (○(Ynz)))∪z , i.e. we also remove elements, but only those that can be generated together with z , hence Yn ⊂ Yn + 1 .

Note

The formula Yn + 1 = Y0∪(Zn (○(Ynz)))∪{z} in [1] lets you think, that only the current z can generate itself. But actually other z ∈ Zn could generate themselves. Zn (○(Ynz)) could remove elements other than z that are needed for the generation of elements in Yn , thus making Yn¬ ⊂ Yn + 1 . Still the latter is postulated in [1]. There is a comment, though, that states, one should use disjunct union. This is interpreted as: don't remove elements of Zn , where the current z is not involved in the generation.

Here ist the python code for that:

def basis_minomega(self):
    L = self
    Y = L - (L*L)
    while True:
        Ybar = Y + ~Y
        take = L - Ybar
        if not len(take):
            return Y
        else:
            z = next(take.flatten())
            Yzgen = Y**z #generate with z involved
            Y = (Y - Yzgen) + z

Note

This algorithm can generate a basis that is smaller regarding ω(UV) = |U||V| , but not necessarily in count compared to the Duquenne-Guiges-Basis.

[1](1, 2, 3, 4, 5, 6) Endliche Hüllensysteme und ihre Implikationenbasen by Roman König.