February 22, 2008

New technology to prevent terrorist attacks

I found this text, on technology to analyse email traffic to prevent terrorist
attacks, on a number of news sites.
The point here is that they want to use the technology to identify potential insider
threats to the enterprise.

The question is: will this work?

Email analysis is just like other data analysis: it is data mining, but preceded by a
preparation phase that converts unstructured information (text) into structured
information.
There is ample evidence that data mining and text mining work and can generate a lot
of otherwise inaccessible information.
But as in other targeting analysis, the search for terrorists or other “bad” people
is binary: the data mining analysis tags you as good or bad. The problem with
this tagging is that in the end you get four categories: 1) the ‘goods’ that
really are good, 2) the ‘goods’ that really are bad (the false negatives), 3) the
‘bads’ that really are bad and 4) the ‘bads’ that really are good (the false positives).
It is clear that categories 2) and 4) are the problematic ones.
Tagging a bad person as good leaves the possibility of a malevolent attack. You miss
him, just as you would miss him if you did no analysis whatsoever.
More problematic is tagging a good person as bad. What are we going to do with
that person? Punish him in advance? Put him on a blacklist? Either way it will
have an impact on his privacy, on his quality of life, etc.
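
To make the four categories concrete, here is a minimal sketch that tallies them for a handful of people. The function name `tally` and the six-person sample are made up for illustration; they do not come from the technology discussed above.

```python
from collections import Counter

def tally(actual_bad, tagged_bad):
    """Count the four outcomes for two parallel lists of booleans."""
    names = {
        (False, False): "good tagged good (true negative)",
        (True, False): "bad tagged good (false negative)",
        (True, True): "bad tagged bad (true positive)",
        (False, True): "good tagged bad (false positive)",
    }
    return Counter(names[(a, t)] for a, t in zip(actual_bad, tagged_bad))

# Made-up sample: six people, only the last one is really bad.
actual = [False, False, False, False, False, True]
tagged = [False, True, False, False, True, False]

counts = tally(actual, tagged)
for name, n in counts.items():
    print(f"{name}: {n}")
```

In this toy sample the one bad person slips through while two good persons get tagged, which is exactly the combination of categories 2) and 4).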

And even more problematic is that, with so few really bad persons around (and this
is the problem of mining so-called sparse events), you will tag far more good
persons as bad than there really are bad persons.
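
A small calculation shows how this happens. All numbers below are assumptions chosen for illustration (one million mailboxes, ten real insiders, a detector that catches 99% of them and wrongly flags 1% of everyone else); none of them come from the product in question.

```python
def tagged_bad_breakdown(population, bad_count, hit_rate, false_alarm_rate):
    """Return (true positives, false positives, share of tagged people who are really bad)."""
    good_count = population - bad_count
    true_pos = bad_count * hit_rate            # bad persons correctly tagged as bad
    false_pos = good_count * false_alarm_rate  # good persons wrongly tagged as bad
    share_really_bad = true_pos / (true_pos + false_pos)
    return true_pos, false_pos, share_really_bad

# Hypothetical inputs, not measured values.
tp, fp, share = tagged_bad_breakdown(1_000_000, 10, 0.99, 0.01)
print(f"really bad and tagged bad: {tp:.1f}")
print(f"good but tagged bad:       {fp:.1f}")
print(f"share of tagged people who are really bad: {share:.4%}")
```

Even with a detector this good, the assumed 1% false alarm rate applied to almost a million good persons gives roughly a thousand wrongly tagged persons for every correctly tagged one.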
You can play with the threshold in order to get far fewer false positives. But if
you set the threshold high enough that the number of false positives becomes
acceptable, it is almost certain that the number of true positives will be very, very
small (close to negligible) AND that the number of false negatives, i.e. the
missed bad persons, will be so high that the whole analysis becomes worthless.
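
The threshold trade-off can be sketched with a toy model. I assume here that every person gets a suspicion score, that scores of good persons follow a standard normal distribution and that scores of bad persons follow the same distribution shifted up by `separation`. The model, the population figures and the names `tail` and `sweep` are all assumptions for illustration; a real scoring system may behave differently.

```python
import math

def tail(x):
    """P(Z > x) for a standard normal variable Z."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def sweep(good_count, bad_count, separation, thresholds):
    """For each threshold, return (threshold, false positives, true positives, missed)."""
    rows = []
    for t in thresholds:
        false_pos = good_count * tail(t)              # good persons scoring above t
        true_pos = bad_count * tail(t - separation)   # bad persons scoring above t
        false_neg = bad_count - true_pos              # bad persons scoring below t
        rows.append((t, false_pos, true_pos, false_neg))
    return rows

# Hypothetical population: 999,990 good persons, 10 bad ones, scores 2 units apart.
rows = sweep(999_990, 10, 2.0, [1.0, 2.0, 3.0, 4.0, 5.0])
for t, fp, tp, fn in rows:
    print(f"threshold {t:.1f}: false positives {fp:10.1f}, "
          f"true positives {tp:5.2f}, missed {fn:5.2f}")
```

With these made-up parameters, the expected number of false positives only drops below one person at a threshold where nearly all ten bad persons are missed. A scoring method that separates the two groups much better would shift the picture, but with sparse events it has to be very good indeed.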

So yes, there may be a technology and yes, it might work, but as far as I am
concerned, in real life it is probably going to be useless.

