**I never liked prior probabilities, nor classification.**

This is probably a bit weird coming from someone who likes mining data with decision trees. But let me explain.

For me, classification (in data mining) means that you decide, with some sophisticated algorithm, to which category an observation belongs. “Is someone a terrorist or not?”

The algorithm calculates the appropriate categories in three steps.

– First, the real data mining is done: based on some discriminating, independent variables, a model is trained/calculated/derived.

– Second, observations are fed to that model, which calculates their probabilities of belonging to the different categories. “Jim has a probability of 0.85 of being a terrorist.”

– Third, decide to which category the observation belongs.
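The three steps above can be sketched in a few lines. This is a toy illustration only: the hand-made scoring rule stands in for a real trained model (a decision tree, logistic regression, …), and all names are mine, not from any particular library.

```python
# Toy sketch of the classification pipeline: train, score, decide.

def train(records):
    """Step 1: 'train' a model from (inputs, category) records.
    Here the 'model' is simply equal weights over the binary inputs --
    a stand-in for a real decision tree or regression."""
    n = len(records[0][0])
    return [1.0 / n] * n

def predict_proba(weights, x):
    """Step 2: turn an observation's inputs into a probability-like
    score in [0, 1]."""
    return sum(w * xi for w, xi in zip(weights, x))

def classify(p, threshold=0.5):
    """Step 3: a hard category decision, by thresholding the score."""
    return "category A" if p >= threshold else "category B"
```

For example, with four binary inputs the weights become 0.25 each, so an observation with three inputs set scores 0.75 and lands in category A under the default threshold.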

Prior probabilities can be used in steps 2 and 3. If you do not use prior probabilities but you did some over- or undersampling, the model will etc… etc… Sigh! For a simple data miner like me it becomes too complicated, too artificial.
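For completeness, here is roughly what that correction looks like. This is the standard odds-based adjustment for a model trained after undersampling the majority (negative) class; the parameter names are mine, not from the post.

```python
def correct_prior(p_model, beta):
    """Convert a model probability back to the original class
    distribution, when the model was trained on data where only a
    fraction `beta` of the negatives was kept (undersampling).

    The model's odds are inflated by 1/beta, so we deflate them:
        p_true = beta * p / (beta * p + 1 - p)
    With beta = 1 (no sampling) the probability is unchanged."""
    return beta * p_model / (beta * p_model + 1.0 - p_model)
```

With beta = 1 the score passes through untouched; with heavy undersampling (say beta = 0.1) a model score of 0.5 deflates to about 0.09 on the original population.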

If you want to know more about it anyway, you should look at some definitions/explanations : [I], [II], [III], [IV]

An easier, more straightforward, and more trustworthy way is the following:

– Execute step 1 to obtain your model.

– Feed an unseen real-life data sample to the model to calculate the probabilities.

– Take the same unseen real-life data sample, for which the real categories are known, attach the calculated probabilities to the observations, and derive the real probabilities. You can do this simply by sorting the observations from high to low probability and calculating the frequency of each category in bins of, for example, 1% of your observations. Now you have a means of transforming the probabilities calculated by the model into real probabilities.
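That sort-and-bin calibration step can be written down directly. A minimal sketch, assuming the holdout sample comes as (model probability, true category) pairs with the target category coded as 1; the function name and bin layout are my own choices:

```python
def calibrate_by_bins(scored, n_bins=100):
    """Turn model scores into observed ('real') probabilities.

    scored : list of (model_probability, true_category) pairs from an
             unseen sample with known outcomes, target coded as 1.
    Sort by model probability (high to low), cut into equal-size bins
    (n_bins=100 gives the post's 1% bins), and return, per bin, the
    score range and the observed frequency of the target category."""
    ranked = sorted(scored, key=lambda t: t[0], reverse=True)
    size = max(1, len(ranked) // n_bins)
    table = []
    for i in range(0, len(ranked), size):
        bin_ = ranked[i:i + size]
        lo = min(p for p, _ in bin_)
        hi = max(p for p, _ in bin_)
        freq = sum(1 for _, cat in bin_ if cat == 1) / len(bin_)
        table.append((lo, hi, freq))
    return table
```

The returned table is the lookup you need: a new observation whose model score falls in a given range gets that bin's observed frequency as its real probability.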

You still have no categories! Right. But why would you need them?

Example: commercial targeting. You build a model to optimize your target groups because you want to know who has a high probability of buying your product. What would your categories be? (Potential) buyers and (potential) non-buyers? That is nonsense! Models are not perfect. Even the 1 percent of observations with the highest probabilities of belonging to category A will contain a number of observations from the other categories. The only thing you calculate is the probability, not the real category.

Optimizing your target group means finding a balance between 1) the number of people in the group, 2) the cost of contacting these people, and 3) the expected return.
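One hedged way to make that balance concrete: rank prospects by their calibrated purchase probability and grow the target group for as long as each extra contact adds expected profit. The cost and return figures below are hypothetical, and the function is my own sketch, not a method from the post.

```python
def best_target_size(probs, cost_per_contact, return_per_buyer):
    """Find the target-group size that maximizes expected profit.

    probs            : calibrated purchase probabilities, one per prospect
    cost_per_contact : cost of contacting one person (hypothetical)
    return_per_buyer : revenue if a contacted person buys (hypothetical)

    Contacts are ranked best-first; each contributes
    p * return_per_buyer - cost_per_contact to the expected profit."""
    ranked = sorted(probs, reverse=True)
    best_n, best_profit, cum = 0, 0.0, 0.0
    for n, p in enumerate(ranked, start=1):
        cum += p * return_per_buyer - cost_per_contact
        if cum > best_profit:
            best_n, best_profit = n, cum
    return best_n, best_profit
```

Note that no buyer/non-buyer categories appear anywhere: the cutoff falls out of the probabilities and the economics, which is exactly the post's point.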

The probabilities are what they call “soft metrics”. This is a fairly new term for what has long been known as fuzzy logic. It is as if you only wanted to distinguish black and white in a world of gray scales. It is as if you did not know the temperature, but you knew it was warm.

(Wikipedia has no entry for “soft metrics”!)

Some definitions :

“Soft metrics evaluate the things that aren’t apparent but may help predict a company’s future: are there heavy hitters on the board of directors? Has the management team succeeded before?” (answers.com)

“An approach to decision making based on soft metrics could allow problems to be solved where no definitive “yes-no” answer is possible” (via @ayoubsciences)

In our current social-internet world, for marketing it means “sentiment metrics”: engagement, conversations, buzz, interactions, word of mouth, awareness, and brand, as outcomes of marketing campaigns. (Hard metrics: sales figures, number of new customers…)

If you want to read more about soft metrics :

Computational Models of Group Dynamics for National and International Security Applications (Mihaela Quirk)

Marketing metrics : the hard and the soft

How Soft Metrics Can Make Hard Choices Easier

*Other posts you might enjoy reading :*

Data mining for marketing campaigns : interpretation of lift

How many inputs do data miners need?

Oversampling or undersampling ?

data mining with decision trees : what they never tell you

The top-10 data mining mistakes

Good enough / data quality

International Journal of Intelligent Defence Support Systems, Volume 2, Number 4 / 2009, Pages 335–349

Soft metrics: What are they and what use are they for the Intelligence Community?

Mihaela D. Quirk, Decision Applications Division, Los Alamos National Laboratory, Los Alamos, NM 87544, USA

Abstract:

Modern decision making challenges the human capacity to reason in an environment of uncertainty, imprecision, and incompleteness of information. Atop the uncertainty, ranking in the presence of multiple criteria, multiple agents, and heterogeneous sources of information is often the main task to accomplish by the analysts in the Intelligence Community (IC). Soft metrics are attributes of decision criteria that cannot be expressed numerically. These metrics are at the core of a computational engine that is perception-based, with computational ‘atoms’ expressed in natural language (NL). The soft metrics approach as a basis for natural language-based computing contributes to fast analyses and an efficient use of human resources in contemporary decision making such as: intelligence data analysis, the Global Strike (target pairing), risk analysis, threat assessment, strategic interactions, conflict analysis, and the strategic deterrence assessment.

Keywords:

soft metrics, decision analysis, intelligence community, fuzzy logic, decision criteria, natural language, intelligence data analysis, target pairing, risk analysis, threat assessment, strategic interactions, conflict analysis, strategic deterrence assessment, information processing, information quality, fuzzy sets, decision ranking

By: mihaela quirk on August 30, 2011 at 7:08 am