Thursday, August 18, 2016

NI DAQmx

Data Acquisition with the NI USB-6212

There is a C API.

Help for the C API is installed in:

C:\Program Files (x86)\National Instruments\NI-DAQ\Docs\cdaqmx.chm

But the usage is best learned via the examples in:

C:\Users\public\Documents\National Instruments\NI-DAQ\Examples\DAQmx ANSI C

I worked with the PyDAQmx python wrapping. The wrapper's function names are the C ones without the DAQmx prefix. So the same help as for the ANSI C functions can be used. With this python wrapper the tests are a good starting point.

Here are some tests of mine.

Write to analogue out and read back in at the same time

Physically connect ao0 with ai0. The code writes a signal to ao0 and reads it back in via ai0. Dev2 is the device as shown in NI MAX (Measurement & Automation Explorer).

import ctypes
import numpy as np
import time
import matplotlib.pyplot as plt
from PyDAQmx import *

DEV = "Dev2"
f_s = 1000

tAI = Task()
tAI.CreateAIVoltageChan(DEV+"/ai0","Voltage",DAQmx_Val_RSE,-10,10,DAQmx_Val_Volts,None)
tAI.CfgSampClkTiming("",f_s,DAQmx_Val_Rising,DAQmx_Val_FiniteSamps,f_s)

written = int32()
dataO = np.array([sin(24.0*v/f_s)*f_s/v/24.0 for v in range(1,f_s+1)])
tAO = Task()
tAO.CreateAOVoltageChan(DEV+"/ao0","",-10.0,10.0,DAQmx_Val_Volts,None)
tAO.CfgSampClkTiming("",f_s,DAQmx_Val_Rising,DAQmx_Val_ContSamps,f_s)
tAO.WriteAnalogF64(f_s,0,10.0,DAQmx_Val_GroupByChannel,dataO,None,None)
tAO.StartTask()

dataI = np.zeros(f_s)
dataI = np.zeros(f_s)
readI = int32()
tAI.ReadAnalogF64(f_s,10.0,DAQmx_Val_GroupByChannel,dataI,f_s,byref(readI),None)
tAI.StartTask()
time.sleep(1)
tAI.StopTask()
tAI.ClearTask()

tAO.StopTask()
tAO.ClearTask()

plt.plot(dataI)
plt.show()

Reading more channels at the same time

I tried to create two tasks, each with an input channel. When starting them I got the error:

The specified resource is reserved.

Making the two channels part of the same task works.

import ctypes
import numpy as np
import time
import matplotlib.pyplot as plt
from PyDAQmx import *

DEV = "Dev2"
f_s = 1000
t = 1#s
V_MAX = 10
SHUNT = 100
I_MAX = 0.1
timeout = 10.0
Nch = 2

tsk = Task()
tsk.CreateAIVoltageChan(DEV+"/ai0","Voltage",DAQmx_Val_RSE,-V_MAX,V_MAX,DAQmx_Val_Volts,None)
# Taking the previous or the next line away makes the commented lines below work
tsk.CreateAICurrentChan(DEV+"/ai1","Current",DAQmx_Val_Cfg_Default,-I_MAX,I_MAX,DAQmx_Val_Amps,DAQmx_Val_Default,SHUNT,None) #

Ns = t*f_s
tsk.CfgSampClkTiming("",f_s,DAQmx_Val_Rising,DAQmx_Val_FiniteSamps,Ns)

# #A third output channel on a separate task does not work (1)
# written = int32()
# dataAO = np.array([sin(24.0*v/f_s)*f_s/v/24.0 for v in range(1,f_s+1)])
# tAO = Task()
# tAO.CreateAOVoltageChan(DEV+"/ao0","",-10.0,10.0,DAQmx_Val_Volts,None)
# tAO.CfgSampClkTiming("",f_s,DAQmx_Val_Rising,DAQmx_Val_ContSamps,f_s)
# tAO.WriteAnalogF64(f_s,0,10.0,DAQmx_Val_GroupByChannel,dataAO,None,None)
# tAO.StartTask()

read = int32()
data = np.zeros(Nch*Ns)
tsk.ReadAnalogF64(Ns,timeout,DAQmx_Val_GroupByChannel,data,Nch*Ns,byref(read),None)
tsk.StartTask()
time.sleep(1)

# #A third output channel on a separate task does not work (2)
#tAO.StopTask()
#tAO.ClearTask()

tsk.StopTask()

data.shape = (Ns,Nch)
plt.plot(data)
plt.show()

Put StartTask() after WriteAnalogF64(). ReadAnalogF64() will work properly also without calling StartTask(), but by calling it one keeps control of starting and stopping.

DAQmx_Val_GroupByChannel makes blocks by channel in the buffer. data.shape = (Ns,Nch) takes account of that.

I used the above samples (for t > 1) to verify that a frequency is there in a rather noise signal.

def has_frequency(data, f, f_s):
"""data is sampled over a multiple of the period, e.g. t = int(10/f+1)

The code uses heuristic parameters for a special context.
"""
    data = data.transpose()[0]
    N = data.size
    ssf = sps.medfilt(data,29) #filter first
    amplitudes = np.abs(np.fft.rfft(ssf))*2/N #then do an FFT
    f_idx = int(f*N/f_s)
    allthere = []
    #rectangular signal
    #check that in several modes the same frequency is present
    for m in range(1,4):
        Nwindow = 9
        rng = np.arange(-Nwindow,Nwindow,dtype=np.int)
        amps = amplitudes[m*f_idx+rng]
        imax_in_rng = m*f_idx + rng[np.argmax(amps)]
        there = abs(m*f_idx - imax_in_rng) <= 1 #allow for f tolerance
        allthere.append(there)
    return all(allthere)

Writing and reading from digital lines

StartTask() is not needed, as it is called automatically.

class digital_in_out:
    def __init__(self):
        self.tDO = Task()
        self.DO_names = "FootEnable,HandAbort,HandStart,DI,RC1,RC2,RC3,RC4".split(',')
        self.tDO.CreateDOChan(DEV+"/port0/line0:7",self.DO_names,DAQmx_Val_ChanForAllLines)
        self.do = np.array([0]*8,np.uint8)
        self.tDI = Task()
        self.DI_names = "DO1,DO2,Stim,Ready,Leadoff,Current,Power".split(',')
        self.tDI.CreateDIChan(DEV+"/port2/line0:6",,DAQmx_Val_ChanForAllLines)
        self.di = np.array([0]*7,np.uint8)
    def __end__(self):
        self.tDO.StopTask()
        self.tDO.ClearTask()
        self.tDI.StopTask()
        self.tDI.ClearTask()
    def set(self, what):
        for w in what:
            self.do[self.DO_names.index(w)] = 1
        self.tDO.WriteDigitalLines(1,1,10.0,DAQmx_Val_GroupByChannel,self.do,None,None);
    def clear(self, what):
        for w in what:
            self.do[self.DO_names.index(w)] = 0
        self.tDO.WriteDigitalLines(1,1,10.0,DAQmx_Val_GroupByChannel,self.do,None,None);
    def press(self, what):
        self.set(what)
        time.sleep(0.2)
        self.clear(what)
    def read(self, what):
        readDI = int32()
        sampleDI = int32()
        self.tDI.ReadDigitalLines(1,10.0,DAQmx_Val_GroupByChannel,self.di,len(self.DI_names),byref(sampleDI),byref(readDI),None)
        return int(self.di[self.DI_names.index(what)])

dio = digital_in_out()
dio.set("RC1 RC2".split())
dio.press(["HandStart"])
dio.read("Stim")
#...

Sunday, May 8, 2016

GnuPG

I recently decided to use GNU Privacy Guard (gnupg, gpg2). This blog summarizes the steps taken.

To avoid exposing keys to risk of interception asymmetric encryption was adopted, in which everybody can know your public key. It is needed to encrypt messages for only you to read. While the private key never needs to be communicated. It is used to decrypt messages meant for you, and to sign messages to assure others you are the sender. A message can be any data, any file, binary or text.

gpg2 -e <file>                   # encrypt for yourself; creates .gpg file
gpg2 -e <file> -r <receiver> # encrypt for someone else
gpg2 -d <file>.gpg               # decript

gpg2 -s <file>                   # sign a file
gpg2 -s -e <file> -r <receiver>  # sign and ecript a file for someone
gpg2 -b <file>                   # create separte signature .asc file
gpg2 -v <file>.asc               # verify authenticity of message

To get to use gpg2 like that you need to create a key pair for you.

Note

A password you will use for some time, but a gpg key pair should accompany you for a long, long time. It is like the key to a flat. If you loose it you loose access to your encrypted data or access to internet services you use, but you also loose the trust associated with your public key.

Opt therefore for a longer and stronger key length (currently 4096 bits) and adopt Debian practice to only use subkeys, which get automatically signed by the primary key (selfsigning), i.e. inherit the trust of the primary key.

To set you up:

generate a primary key (automatically generates an encrypting subkey)
generate a signing subkey
backup the keys
remove the master private key
import the subkeys on other computers
let (the public part of) the primary key be signed

The steps in more detail:

generate a primary key
```
gpg2 --full-gen-key
```
allows to specify 4096 as key length, which is wanted by debian. gpg2 --gen-key uses the current default of 2048 bits.

It takes some time to gain entropy needed for the actual creation of the key.
```
gpg2 -k
```
prints:
```
/home/roland/.gnupg/pubring.gpg
-------------------------------
pub   rsa4096/43B86228 2016-04-09
uid         [ultimate] Roland Puntaier (1971) <roland.puntaier@gmail.com>
sub   rsa4096/89E58C2B 2016-04-09
```
This
```
gpg2 -k --with-colons
```
also prints the usage flags: sc (sign certification) for the primary key, and e (encryption) for the sub key:
```
tru::1:1460238398:0:3:1:5
pub:u:4096:1:BC9A552643B86228:1460237000:::u:::scESC:::::::
uid:u::::1460237000::67D66F8E5A03CB7F1D67822B723617535D8A3CC9::Roland Puntaier (1971) <roland.puntaier@gmail.com>:::::::::
sub:u:4096:1:39DCC42189E58C2B:1460237000::::::e::::::
```
− k stands for --list-keys. It will list also the key IDs of all your contacts, of whom you only have the public key. There is also a --list-secret-keys that lists the key IDs of the keys for which you have the private part, too.

There was one subkey for encryption automatically generated, but we want to do without the primary key for the daily use. Therefore we need to create a signing subkey.

generate a signing subkey

gpg2 --edit-key 43B86228

prints:

gpg (GnuPG) 2.1.10; Copyright (C) 2015 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Secret key is available.

sec  rsa4096/43B86228
     created: 2016-04-09  expires: never       usage: SC
     trust: ultimate      validity: ultimate
ssb  rsa4096/89E58C2B
     created: 2016-04-09  expires: never       usage: E
[ultimate] (1). Roland Puntaier (1971) <roland.puntaier@gmail.com>

At the prompt wite:

addkey

and choose a signing only RSA 4096 key. Then save and quit:

save
quit

Note

This signing subkey cannot be used to sign other keys, because a subkey is expected to be revoked and replaced after some time.

backup the keys
1. Back up the primary key separately.
```
gpg2 --armor --output my-secret-key.asc  --export-secret-keys roland.puntaier@gmail.com
```
  You will need it to replace the current subkeys some time in the future, in a few years, or to sign other keys (see further down, and subkeys).
2. Back up the subkeys without the primary key.
```
gpg2 --armor --output my-secret-subkeys.asc --export-secret-subkeys roland.puntaier@gmail.com
```
  Note
  
  gpg2 command line parameters containing secret denote both the private and the public part of keys you have created yourself. There is no need to additionally use the non-secret command line parameter. gpg2 output and documentation seems to use private and secret interchangingly.
  
  This backup can be imported on other devices.
```
gpg2 --import /mnt/usb/my-secret-subkeys.asc
```
  There is no need to backup the whole .gnupg folder. You will revoke your subkeys at some time in the future, and would need to update the backup. gpg also continues to develop: file names in the .gnupg folder might change, but the import feature will still work.
remove the master private key

The files storing the private key is ~\.gnupg\private-keys-v1.d\<keygrip>.key, where keygrip is shown with --with-keygrip.
```
gpg2 -k --with-keygrip
```
Remove this file, or alternatively:
```
gpg-connect-agent "DELETE_KEY <keygrip>" /bye
```
Or yet another way:
```
gpg2 --delete-secret-key roland.puntaier@gmail.com
gpg2 --import my-secret-subkeys.asc
```
let the (public part of) your primary key be signed

The public key is identified by an ID and or by a fingerprint. The latter you need to provide in order to let your public key signed. The signer will know you in person or will verify your identity via some official document.

If your public key does not have trust yet you can give the fingerprint to a person on request, e.g. by telephone.

The command
```
gpg2 --fingerprint
```
shows:
```
/home/roland/.gnupg/pubring.kbx
-------------------------------
pub   rsa4096/43B86228 2016-04-09
      Key fingerprint = F85C B631 327E D8DC D50C  9FD3 BC9A 5526 43B8 6228
uid         [ unknown] Roland Puntaier (1971) <roland.puntaier@gmail.com>
sub   rsa4096/89E58C2B 2016-04-09
sub   rsa4096/0DF70FCE 2016-04-10
```
At a signing party participants will verify each other's fingerprint and identity, the latter by inspecting an official document, like a passport or driving license.

The organizer will probably have provided a link to upload your --fingerprint output. He will print out a list for everyone. The list contains all the fingerprints. In addition bring along a printout of your own and read from this one and let the others check against the list entry.

Back at home you sign the keys:

First you need to get the public keys:
```
gpg2 --recv-keys <(last 8 or 16 digits of)fingerprint> #keey needs to be on key server
#or
gpg2 --import <text file with public key> #e.g. sent via email
```
Then use the backed up primary key via or as in this script modified from here.
```
KEYFILE=/mnt/usb1/roland/my-secret-key.asc
tmpdir=$(mktemp -d --tmpdir gpg.XXXXXX)
gpg2 --homedir $tmpdir --import $KEYFILE

gpg2 --homedir $tmpdir --keyring ${GNUPGHOME:-$HOME/.gnupg}/pubring.kbx $@

[ -f $tmpdir/S.gpg-agent ] && gpg-connect-agent --homedir $tmpdir KILLAGENT /bye
rm -rf $tmpdir
```
Next upload the signed keys to the keyserver
```
gpg2 --send-key <keyid>
```
and/or send them via email to the participant of the signing party.

Monday, December 14, 2015

Waf build system

I've blogged about scons in January of this year. This autumn I gave waf another try, because I wanted to get rid of the python2 dependency, and I found out that waf supported almost everything I needed. I missed support for WiX Toolset and I wanted waf to handle localization satellite assemblies. To integrate these tools myself was a good exercise to get aqcuainted with the waf sources, which is also the purpose of this blog.

After skimming through the wafbook I dived into the sources to get an idea how it works. When trying it, I had troubles with the file system. As with scons it is not easy to move files around arbitrarily. It was, though, easier to solve with waf than with scons. Altogether the wscript's turned out cleaner than the SConscript's.

My Test Project

The test project is similar to that of scons:

C code is generated from a SimpleTemplate file (stpl) using bottle.
A python wrapper is created using cffi
A C# wrapper is created
which is used in a C# based GUI

I've added it to the waf playground folder and created a pull request, which was accepted. It's the stpl_c_py_cs_satellite_wix folder.

With waf you can create one waf script file containing all of waf. You can then check this waf script into your source code control system. So the build system becomes part of your sources, i.e. you don't depend so much on the further development of the waf project or whether your pull requests are accepted or not.

python waf-light --tools=resx,satellite_assembly,wix

The test project builds with the generated waf script on a normal cmd prompt. Developer Command Prompt for VS2015 initializes the console for x86, while waf defaults to x86_64, leading to a clash of x86 libcmt with x86_64 code. So let waf do the initialization.

MSys2 shell on Windows works, too. With scons there were problems, because scons uses the the shell to run the compilers and csc cannot handle file names with forward slashes.

On Linux the project compiles as well, with Mono covering the .NET part. To run the test there you need to help ld find libfuni.so via environment variables:

LD_LIBRARY_PATH=$PWD/../build/api/: PATH=$PATH:$LD_LIBRARY_PATH waf configure build test --stubs

Files

Let's start with files:

├── build
│   ├── api
│   └── gui
└── src
    ├── api
    │   └── wscript_build
    ├── build.py
    ├── gui
    │   └── wscript_build
    └── wscript

You will run waf from the src folder.

waf configure build

In the wscript, files can be given as strings or nodes (Nodes.py).

Node (Node.py):

abspath relpath path_from bld_dir bldpath srcpath change_ext chmod delete
find_dir find_node search_node find_or_declare find_resource get_bld
get_bld_sig get_src suffix height is_bld is_child_of is_src listdir make_node
mkdir mkfile read write ant_glob ant_iter

parent children ctx

String paths can be relative to wscript(_build) in the source tree. They are converted to nodes using TaskGen.to_nodes(). It uses find_resource(), and this first uses find_node() to look in the source tree and then search_node() for generated files in the build tree. find_node() finds the file only, if it exists on the file system. search_node() finds declared-only nodes, too.

Note

If you have a file as part of a compiler option, then this file must be relative to build, because that's where the compilers will run.

Generated files don't exist on disk until a later phase, when an actual compiler was run. If you wanted to have a generated file in the source tree, how do you do? ctx.path.get_src().find_or_declare() will not declare a node in the source tree, make_node() will, but find_node() will not find it, so all dependencies will fail. The work around is to actually create a temporary empty file on disk, so find_node() can find it. You can use make_node().write('') to create an empty file during the wscript execution.

from waflib.Node import Node
root =Node('',None)
toppath = os.path.expanduser('~/tmp/waftest')
top = root.make_node(toppath)
top.abspath()
nod = top.make_node('notondisk')
assert not os.path.exists(nod.abspath())
#but find_node will not find it either
assert None == top.find_node('notondisk')
#search_node will find it
assert top.search_node('notondisk')
nod.write('')
assert os.path.exists(nod.abspath())
assert top.find_node('notondisk')

Contexts

In wscript you have a function per waf command line command, like configure(), build(), ... or yourownone(). You specify their sequence of execution in the command line: waf configure build yourownone. If no command line command is given, the default is build.

yourownone will get a basic context

Context (Context.py):

cmd_and_log end_msg exec_command execute fatal finalize load
load_special_tools msg post_recurse pre_recurse recurse start_msg to_log

cmd variant

But the major commands (build, install, ..) will have their special context (created via the module function create_context()).

These contexts have special functions that help you define the task_gen object, which then creates the task object in its post() method.

load() is for loading (the command from) tools, recurse() for doing the command in the wscript of subfolders.

The options, configure, build, clean, install commands get these contexts as parameters:

OptionsContext (Options.py):

add_option add_option_group execute get_option_group jobs parse_args

ConfigurationContext (Configure.py):

err_handler eval_rules execute get_env init_dirs load post_recurse prepare_env set_env setenv store

BuildContext (Build.py):

add_group add_manual_dependency add_post_fun add_pre_fun add_to_group compile
declare_chain execute execute_build get_all_task_gen get_build_iterator
get_env get_group get_group_idx get_group_name get_targets get_tasks_group
get_tgen_by_name get_variant_dir hash_env_vars init_dirs install_as
install_files launch_node load_envs post_build post_group pre_build
progress_line restore rule set_env set_group setup store symlink_as total

env variant_dir (depending on) variant

CleanContext (Build.py):
```
clean execute
```

InstallContext (Build.py):

copy_fun do_install do_link install_as install_files run_task_now symlink_as

BuildContext is of course the important one.

It has a variant-specific environment (env) of type ConfigSet (ConfigSet.py):

append_unique append_value prepend_value derive detach get_flat get_merged_dict keys revert stash load store update

The build context's important functions, like program, shlib, stlib, objects will be provided by the tools loaded, and in this fashion

@conf
def program(bld, *k, **kw):
    set_features(kw, 'program')
    return bld(*k, **kw)

The last line is BuildContext::__call__(), which therefore is the important starting point. BuildContext::__call__() takes either a rule parameter or a features parameter.

Rules

rule can either be

a command like touch tst.txt, but normally with macros that are expanded in compile_fun() (Task.py), and executed via python subprocess, by default and luckily without using shell.
or a python function

Features

The __call__() parameters will be saved in an intermediate class task_gen (TaskGen.py). This can be equipped with additional functions by tools, when they are loaded.

Tools can change the task_gen class by adding methods before or after existing methods, especially process_rule and process_source. But with more tools this can get chaotic. Therefore these functions are associated with a string (= feature). You activate these function by providing the features parameter in the ctx.__call__() method, instead of the rule parameter.

Execution

Here a very shortened sequence of running waf for the important build command, i.e. BuildContext.execute():

We start in Scripting.py:

waf_entry_point
    run_commands
        parse_options
        run_command
            ctx = create_context(cmd_name)
            ctx.execute()
                recurse
                    exec/call
                execute_build
                    compile
                        get_build_iterator
                            post_group
                                tskgen.post
                                    process_rule
                                        create_task
                                    process_source
                                        create_task
                        Parallel.start

Via ctx.execute() either configure's execute or build's execute, and so on, is run.
ctx.recurse will run wscript.build()/wscript_build, which will create and put the task_gen objects into the current group. bld.add_group() in wscript makes a new group. Groups run sequentially.
execute_build{compile{get_build_iterator{post_group()}}} will call the task_gen object's post() method, which creates the tasks, which are returned in the iterator and run by a Parallel (Runner.py) object. post_group() takes into account the --targets= command line parameter.
The task_gen object's post() will sort process_rule process_source and the other methods added to task_gen via tools, using before_method, after_method and features. Use waf --zones=task_gen to see.
process_rule will create a task (create_task(name)), if a rule was specified.
else process_source will find the rule via tast_gen.mappings, which is filled via the extensions() (TaskGen.py) decorator. The decorated function must then call create_task('taskname'). In this case you must somewhere have class taskname(Task):..., or you use task_factory() (TaskGen.py), as is done in in the ctx.__call__() scenario and via declare_chain() (TaskGen.py). task_factory() uses Python's type() function to dynamically create the class derived from Task.

Variants

You use ctx.setenv('somevariant') in wscript's configure to define the environment for more variants. To do the variant via command line you need to subclass the according context and give it a name to use in the command line.

From the wafbook:

from waflib.Build import BuildContext
class debug(BuildContext):
        cmd = 'debug'
        variant = 'debug'

But how is that associated with the wscript's build() function? ctx.recurse() uses ctx.fun, not ctx.cmd, and that is still build.

Write a Tool

When you write a tool, you can either

inject methods into task_gen:

@feature('a_tool')
@after_method('process_source')
def somefun(tskgen):...

use the extension decorator:
```
@extension('.rb')
def process(self, node):
    return self.create_task('run_ruby', node)

class run_ruby(Task.Task):
    run_str = '${RUBY} ${RBFLAGS} -I ${SRC[0].parent.abspath()} ${SRC}'
```
A manually derived task has normally a run_str or run() overridden. You can also provide scan for implicit dependency scanning and vars to influence tsk.signature() (calculated per task and depending on all that determines whether this task must be executed again).

use declare_chain():

TaskGen.declare_chain(name = 'erlc',
    rule      = '${ERLC} ${ERLC_FLAGS} ${SRC[0].abspath()} -o ${TGT[0].name}',
    ext_in    = '.erl',
    ext_out   = '.beam')

The tool is loaded as a module, which means its body is executed. The build setup is normally done in the body itself. If you have a command function in the tool module, normally options and/or configure, it will be executed, when load() is called. In this cases call ctx.load() for each of these command functions.

Summary

Here the selection of the waf source code vocabulary:

Node
    find_node()
    find_resource()
    find_or_declare()
    make_node()
    mkdir()
    write()

    ctx

Context
    setenv()
    load()
    add_group()
    __call__()
    program()
    shlib()
    stlib()

    cmd_and_log()
    exec_command()

    cmd env variant

ConfigSet
    append_value()
    append_unique()
    prepend_value()

task_gen
    post()
        process_rule()
        process_source()
extension
declare_chain
before_method
after_method

Task
    run()
    run_str
    vars
    scan()

Sunday, July 5, 2015

Implications

This blog is a continuation of the FCA blog. I actually mused over this in Summer 2004, but wanted to recap for myself and share some source code.

The article I read back then was Endliche Hüllensysteme und ihre Implikationenbasen (Finite hull systems and their implications). I summarize this article here, since I didn't find an english version online.

In short, a context is constructed, where rows are subsets of attributes ''A'', i.e. elements of the power set, and columns are implications between subsets and the incidence is given by respects (⊢ ), H = {2^A, 2^A × 2^A, ⊢} . This context is simplified, by applying the Armstrong rules, to form a non-redundant, complete, reduced context, named B . Then an algorithm is presented, which takes the intents of some context and recursively constructs the respected implications using B and afterwards uses the 3rd Armstrong rule to find an easy set of implications, i.e. not necessarily minimal, but easier than the Duquenne-Guiges-Basis.

Definitions

U → V : an implication is a pair (U, V) , U ⊆ M and V ⊆ M U is premiss and V conclusion.
W⊢U → V : A set W respects (U, V) , if U⊈W or V ⊆ W . One also says: (U, V) holds for W .
If V ⊆ U the implication is called trivial.

An intelligent system repeatedly gets sets of elements (attributes) as input (objects) and associates these elements. Implications are the result of many such sets.

A concept lattice C respects (U, V) , iff U’ ⊆ V’ .
(U, V) holds in C , iff V ⊆ U’’

Implications can be composed to give new implications. The minimal set of implications that produce all other implications is called an implication basis.

A minimal implication basis is given by the Duquenne-Guiges-Basis (stem basis), which consists of all pseudo-contents:

Let X ↦ X’’ be a closure operation, then P is pseudo-closed if

P ≠ P’’
Q ⊂ P ⇒ Q’’ ⊆ P

There cannot be a implication basis with less implications, than the Duquenne-Guiges-Basis, but one can choose implications that are easier (with low ∑|U||V| , U → V with V ≠ ∅ and U∩V = ∅ ).

Construction of H

H = {2^A, 2^A × 2^A, ⊢} is a context and the according concepts C(H) = {(O, A)} are subsets (= objects of H ) that respect a set of implications (= attributes H ).

The basic implications ( ⇒ ) respected by this context C (⊨ ) are given by the Armstrong Axioms:

∅ ⇒ {U → U} (A1) {U → V} ⇒ {UW → U} (A2) {U → V, VW → Z} ⇒ {UW → Z} (A3)

Some derived ones are ( ≡ is ⇒ in both directions):

{U → V} ≡ {U → UV} (A4) {U → V, U → W} ≡ {U → VW} (A5) {U → V} ≡ {W → V|U ⊆ W} (A6) {U → V, UVW → Z} ≡ {U → V, UW → Z} (A7)

Note

Notation: UV means U∪V and Ui means U∪{i}

H can be constructed inductively. The sequence of the attributes is of no importance when constructing the power set.

Note

In [1] {⋆} named the new attribute in the step n and {⋆⋆} in the step n + 1 . Here I name them i and j and ⋆ becomes a wild card meaning whatever element in a previous step.

In the following i means a new attribute. Because of A4:{X → Y} ≡ {X → XY} only implications with sets U ⊆ V are included, therefore Ui → V is not there.

H_o = 1 , because ∅⊢∅ → ∅ .
L_o = 0 , because ∅⊬∅ → {i} .

H_n + 1 = ⊢ U → V U → Vi Ui → Vi X H_n L_n 1 Xi H_n H_n H_n

(2, ⋆) implications hold with Xi = X∪{i} , because X respects U → V at location (1, 1) .
The L in (1, 2) is discussed below
The 1 in (1, 3) is because Ui⊈X makes the implication hold.

L_n + 1 is part of H_n + 2 , which adds another element j .

L_n + 1 = ⊢ U → Vy U → Vij Ui → Vij X L_n L_n 1 Xi L_n L_n L_n

The columns of L_n + 1 are those of H_n + 1 augmented by j at the conlusion side.
The rows are the same as in H_n + 1 .
As i , so is j not in X and we can distribute L to all but (1, 3) , where Ui⊈X makes it hold.

Construction of B

Because of A5 ({U → V, U → W} ≡ {U → VW} ) we can reduce H to implications of the shape U → U∪{i} , abbreviated U → i (thus Ui → j means U∪{i} → U∪{i, j} .

In the following ⋆ names the extra element on the right of a previous recursion step, whichever that was.

B is in analogy to H horizontally subdivided by three event:

no new element i , ie some other ⋆ of a previous step is on the right.
with i on right side
with i on both sides: the extra element on the right is any ⋆ of a previous step.

B₁ = 0 1 A₁ = 0 1 0 0

B_n + 1 = ⊢ U → ⋆ U → i Ui → ⋆ X B_n A_n 1 Xi B_n 1 B_n

(1, 3) : Ui⊈X , so it holds.
(2, 2) : Xi⊢U → i . If U⊈X , so is Ui⊈Xi , ie it holds; same for ⊆ .
(1, 1) ⇒ (2, 1), (2, 3) : X⊢U → ⋆ ≡ Xi⊢U → ⋆ ≡ Xi⊢Ui → ⋆
(1, 2) : see below

A_n + 1 is part of B_n + 2 , which adds another element j on the right side. The new element of step n (i ) can be on the left side or not, hence the horizontal partitioning.

A_n + 1 = ⊢ U → j Ui → j X A_n 1 Xi A_n A_n

The implications (columns) in B are reduced. The rows are reduced apart from the last line, which is always full.

Application to Find Implications for a Context

H , L , B , A ( = X ) are used to find the implication basis and the intents.

We recursively construct the composed context K○X defined by X(K(g)) , g is object of K .

Every attribute of K(g) = g’ gets a position. This position says whether the attribute is there or not there - column after the if in the following formulas. So every g’ is coded by a binary number. This is λ:2^M↦{0, 1}ⁿ in [1].

The first two lines for (g_[n])^H_n show the according implication coding. This is κ:(U↦V)↦{0, 1, 2}ⁿ with 2 = i ∈ U , 1 = i ∈ V and U∩V = ∅ in [1]. So every binary attribute coding becomes a ternary implication coding.

For n = 1

(g_[1])^H₁ = 1, 0, 1 1, 1, 1 (g_[1])^L₁ = 0, 0, 1 if g₁ = 0 0, 0, 0 if g₁ = 1

For n > 1

(g_[n])^H_n = 0 1 2 U → V U → Vi Ui → Vi (g_{[n − 1]})^{H_n − 1} (g_{[n − 1]})^{L_n − 1} 1^{3^n − 1} if g_n = 0 (g_{[n − 1]})^{H_n − 1} (g_{[n − 1]})^{H_n − 1} (g_{[n − 1]})^{H_n − 1} if g_n = 1

(g_[n])^L_n = (g_{[n − 1]})^{L_n − 1} (g_{[n − 1]})^{L_n − 1} 1^{3^n − 1} if g_n = 0 (g_{[n − 1]})^{L_n − 1} (g_{[n − 1]})^{L_n − 1} (g_{[n − 1]})^{L_n − 1} if g_n = 1

The recursive construct with coding 1 = x ∈ V < 2 = x ∈ U shows that the column position, interpreted as ternary number, codes the implication. If the incidence at the column position is 1, then g’ respects this implication: {g’}^H_n is indicator function for the respected implications.

The coding will also place a larger V before a smaller one when going from right to left. This way one can filter out implications with already encountered same left side.

The U∪V will be closed. If there is no implication for a left side, this is a closed intent, too.

Note

Python Implementation

Python's arbitrary precission integers is used for coding. An attribute's position is the bit position of the integer. Every object is thus an integer. A context is a list of integers.

In here higher bits are more left; in the [1] they are more right; the horizontal order is inverted.

def UV_H(Hg,gw):
    """
    Return implications UV respecting g.
    gw = g width
    Hg = H(g), g is the binary coding of the attribute set
    UV = all non-trivial (!V⊂U) implications U->V with UuV closed; in ternary coding (1=V,2=U)
    K = all closed sets
    """
    lefts = set()
    K = set()
    UV = set()
    p = Hwidth(gw)
    pp = 2**p
    while p:
        pp = pp>>1
        p = p-1
        if Hg&pp:
            y = istr(p,3,gw)
            yy = y.replace('1','0')
            is02 = y.find('1') == -1 #y∈{0,2}^n
            if is02:
                K.add(y)
            elif yy not in lefts:
                lefts.add(yy)
                UV.add(y)
    return (UV,K)

Now we do the analogue procedure with the B context. In B the implications are of the shape U → U∪{x} , so we need to specify U and the one binary position of x. This implication coding can be derived from the column position recursively (see below).

For n = 1 :

(g_[1])^B₁ = 0 1 (g_[1])^A₁ = 0 1 if g₁ = 0 0 0 if g₁ = 1

For n > 1 :

(g_[n])^B_n = U → ⋆ U → i Ui → ⋆ (g_{[n − 1]})^{B_n − 1} (g_{[n − 1]})^{A_n − 1} 1^{(n − 1)2^n − 2} if g_n = 0 (g_{[n − 1]})^{B_n − 1} 1^{2^n − 1} (g_{[n − 1]})^{B_n − 1} if g_n = 1

(g_[n])^A_n = U → j Ui → j (g_{[n − 1]})^{A_n − 1} 1^{2^n − 1} if g_n = 0 (g_{[n − 1]})^{A_n − 1} (g_{[n − 1]})^{A_n − 1} if g_n = 1

The following Python returns the implication coding for a column t .

Awidth = lambda n: 2**n
Bwidth = lambda n:n*2**(n-1)
def A012(t,i):
    if i<0:
        return ""
    nA = Awidth(i)
    if t < nA:
        return "0"+A012(t,i-1)
    else:
        return "2"+A012(t-nA,i-1)
def B012(t,i):
    """
    Constructs ternary implication coding (0=not there, 2=U, 1=V)
    t is B column position
    i = |M|-1 to 0
    """
    if not i:
        return "1"
    nA = Awidth(i)
    nB = Bwidth(i)
    nBB = nB + nA
    if t < nB:
        return "0"+B012(t,i-1)
    elif t < nBB:
        return "1"+A012(t-nB,i-1)
    else:
        return "2"+B012(t-nBB,i-1)

The positions of the 1's in the B row for a given g yield the implications. Here the python source code:

def UV_B(Bg,gw):
    """
    returns the implications UV based on B
    Bg = B(g), g∈2^M
    gw = |M|, M is the set of all attributes
    """
    UV = []
    p = Bwidth(gw)
    pp = 2**p
    while p:
        pp = pp>>1
        p = p-1
        if Bg&pp:
            uv = B012(p,gw-1)
            UV.append(uv)
    return UV

To get the implications that hold in a context K , i.e. for intent 1 and intent 2 and ..., we &-combine (multiply) the B rows for intent 1 and intent 2 and ...

Note

See pyfca for the full and possibly updated source code.

Constructing a Basis

The implications that result via B can be pruned by

first finding the minimal antichain
then applying the 3rd Armstrong

The minimal Antichain

The 1-position (j) in the ternary coding is called significant component. L_j are all implications with the same significant component.

The implications U → j constructed in this way via K○B are ordered lexicographically based on 0 ≤ 1 ≤ 2 .

They are also ordered with the product order, which is a partial order, and it depicts the containment hierarchy of the U side. We want to find the smallest U of every chain in L_j . This minL_j is an antichain and so is L = ∪minL_j .

The UV_B above changed to return this minimal antichain:

def v_Us_B(Bg,gw):
    """
    returns the implications {v:Us} based on B
    v is the significant component
    Bg = B(g), g∈2^M
    gw = |M|, M is the set of all attributes
    """
    v_Us = defaultdict(list)
    p = Bwidth(gw)
    pp = 2**p
    while p:
        pp = pp>>1
        p = p-1
        if Bg&pp:
            uv = B012(p,gw-1)
            #lets find minima regarding product order
            #{v:[Umin1,Umin2,...]}
            v = uv.find('1')#v=significant
            u = uv[:v]+'0'+uv[v+1:]
            u = int(u.replace('2','1'),2)
            Umin_s = self[gw-v-1]#bit position from right
            it = [i for i,U in enumerate(Umin_s) if U&u==u]
            for i in reversed(it):
                del Umin_s[i]
            else:
                Umin_s.append(u)
    return v_Us

The Basis

In the construction of the minimal antichain the 2nd Armstrong rule has been used, but there is still redundancy due to the 3rd Armstrong rule {U → V, VW → Z} ⇒ {UW → Z} .

A component-wise operation is created to represent the 3rd Armstrong rule for the 0,1,2-coding

U○VW ⇒ UW : 0○0 = 0 and 2○0 = 0○2 = 2○2 = 2 This is a binary or operation.
V○VW ⇒ UW , no V in UW : 1○2 = 2○1 = 0 .
Z ⇒ Z , V∩Z = ∅ : 1○0 = 0○1 = 1
L_i○L_j = {x○y|x ∈ L_i, y ∈ L_j, i ≠ j}

Note that it is required that, with X → x and Y → y , x ∈ Y xor y ∈ X to make this the 3rd Armstrong rule.

This is implemented in python as

#u1,u2 code the attribute sets via the bits of an integer
#v1,v2 are bit indices
#u1->v1, vv1 = 2**v1
#u2->v2, vv2 = 2**v2
if vv2&u1 and not vv1&u2:
    return (v1,[(u1|u2)&~vv2])
elif vv1&u2 and not vv2&u1:
    return (v2,[(u1|u2)&~vv1])

The properties of ○ are:

x○y = y○x
x ≠ x○y ≠ y
If x ≤ _ny , then x○y ≤ _ny , ≤ _n is the product order
If x○y ∈ L_i , then x ∈ L_i or y ∈ L_i (note: L_i not minL_i )
If there exits x○y = z ∈ L , then there exist x̃, ỹ ∈ L with x̃○ỹ = z

The objective is to find Y ⊆ L , of which the repeated application of ○ produces L : Y = L .

Y is defined as

Y² = Y○Y
Y^n + 1 = Yⁿ∪{z ∈ L|z = x○y, x ∈ Yⁿ, y ∈ Y∪Yⁿ} This means that of Yⁿ○(Yⁿ∪Y) only elements in L are added at each step. But the algorithm below can cope with Yⁿ getting bigger than L.
○Y = ∪_n ≥ 2Yⁿ
Y = Y∪○Y

One starts with Y₀ = L (L○L) , i.e. all those elements of L that are not results of an application of the 3rd Armstrong rule. As long as Y_n ≠ L , we add z ∈ L Y_n to Y_n . Y_n∪{z} cannot generate elements of Y₀ , but it can generate elements of Z_n = Y_n Y₀ . Therefore we do Y_n + 1 = Y₀∪(Z_n (○(Y_n∪z)))∪z , i.e. we also remove elements, but only those that can be generated together with z , hence Y_n ⊂ Y_n + 1 .

Note

The formula Y_n + 1 = Y₀∪(Z_n (○(Y_n∪z)))∪{z} in [1] lets you think, that only the current z can generate itself. But actually other z ∈ Z_n could generate themselves. Z_n (○(Y_n∪z)) could remove elements other than z that are needed for the generation of elements in Y_n , thus making Y_n¬ ⊂ Y_n + 1 . Still the latter is postulated in [1]. There is a comment, though, that states, one should use disjunct union. This is interpreted as: don't remove elements of Z_n , where the current z is not involved in the generation.

Here ist the python code for that:

def basis_minomega(self):
    L = self
    Y = L - (L*L)
    while True:
        Ybar = Y + ~Y
        take = L - Ybar
        if not len(take):
            return Y
        else:
            z = next(take.flatten())
            Yzgen = Y**z #generate with z involved
            Y = (Y - Yzgen) + z

Note

This algorithm can generate a basis that is smaller regarding ω(UV) = ∑|U||V| , but not necessarily in count compared to the Duquenne-Guiges-Basis.

[1]	(1, 2, 3, 4, 5, 6) Endliche Hüllensysteme und ihre Implikationenbasen by Roman König.