Posted by: zyxo | February 21, 2010

Data mining at a higher level

Data mining at a higher level ?
What is this higher level ? And what was the “normal” level ?

The classic commercial data mining approach, the “normal” level, goes as follows :

  • take your historical data
  • identify the customers who did something (e.g. bought product xyz) and the ones who did not.
  • make a model to distinguish between the do-ers and the non-do-ers.
  • use that model to calculate for each of your identified customers the probability that they will do that something.
  • contact those with the highest probability in some marketing campaign
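The five steps can be sketched as follows; the customers and the scoring rule are made up, standing in for a real trained model:

```python
# Toy sketch of the "normal" campaign flow: score each customer with a
# purchase-probability model, then contact the highest-scoring ones.
# The scoring function below is a hypothetical stand-in for a real
# classifier trained on historical data.

def purchase_probability(customer):
    """Stand-in for a trained model: returns P(buy product xyz)."""
    score = 0.05
    if customer["bought_before"]:
        score += 0.30
    if customer["recent_visits"] > 3:
        score += 0.15
    return score

customers = [
    {"id": 1, "bought_before": True,  "recent_visits": 5},
    {"id": 2, "bought_before": False, "recent_visits": 1},
    {"id": 3, "bought_before": True,  "recent_visits": 0},
]

# Score everyone, then keep the top of the list for the campaign.
scored = sorted(customers, key=purchase_probability, reverse=True)
campaign_target = [c["id"] for c in scored[:2]]
print(campaign_target)  # the two most promising customers: [1, 3]
```

In practice the model would of course come from a real training step (logistic regression, decision tree, …) rather than a hand-written rule.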

You can find more on this “normal” approach in my previous posts :
data mining for marketing campaigns : interpretation of lift
Mining highly imbalanced data sets with logistic regressions
How many inputs do data miners need ?
Classification, Prior Probabilities and Soft Metrics
data mining with decision trees : what they never tell you

But there are better ways, that need a higher level of data mining :

  • estimation of the $ purchase amount
  • net lift of purchase probability
  • net lift of $ purchase amount
  • the above combined
  • optimal pricing modeling

Estimation of the $ purchase amount

Illustration of linear regression on a data set.
Image via Wikipedia

This is a straightforward one. Instead of only modeling the probability of purchase, you can also model the expected purchase amount. A simple multiple linear regression can probably do the trick (but there are possibilities with decision trees also; I will write about that later). When selecting the people to target in your marketing campaign you simply multiply the estimated absolute purchase probability by the estimated purchase amount. That way you will contact the people who you expect to shop heavily in your stores.
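A toy version of that multiplication, with invented probabilities and amounts from two hypothetical models:

```python
# Combine two model outputs per customer: estimated purchase probability
# and estimated purchase amount in $. Ranking by their product targets
# the expected heavy shoppers. All numbers below are made up.

customers = {
    "Alice": {"p_buy": 0.10, "est_amount": 200.0},
    "Bob":   {"p_buy": 0.40, "est_amount": 30.0},
    "Carol": {"p_buy": 0.25, "est_amount": 120.0},
}

expected_revenue = {
    name: c["p_buy"] * c["est_amount"] for name, c in customers.items()
}
# Alice: 20.0, Bob: 12.0, Carol: 30.0 -> contact Carol and Alice first
ranking = sorted(expected_revenue, key=expected_revenue.get, reverse=True)
print(ranking)  # ['Carol', 'Alice', 'Bob']
```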

Net lift of purchase probability

This involves a lot more effort. It is something you have to do in three big steps :

  1. First step : organize a marketing campaign where you contact (e-mail, snail mail, personalized banner) a sufficiently large random target group of your customers. Be sure to keep an equally large control group, equally randomly chosen.
  2. Second step (the modeling step) : you need to make two “normal level” models for calculating the purchase probability : for the first one, you only use the targeted people (I call it the targeted model), for the second one, you only use the non-targeted people (the control model).
  3. Third step (the scoring step) : for each customer in your database you calculate the purchase probability twice : once with each of the two new models. Since the models are different, the two calculated probabilities for each single customer will differ. What you need is the customers where the calculated probability from the targeted model is a lot higher than the calculated probability from the control model. It means that as a result of the e-mail or the banner the purchasing probability went up. What you really model this way is the campaign effect. It gives you a means of selecting those people where your campaign will have the biggest effect on their purchasing behavior.
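A minimal sketch of the scoring step, with invented probabilities standing in for the two trained models:

```python
# Two-model ("net lift") scoring: each customer gets a probability from
# the targeted model and one from the control model; the difference
# estimates the campaign effect. The scores below are hypothetical.

scores = {
    # (p from targeted model, p from control model)
    "sure thing":  (0.90, 0.88),  # buys with or without the campaign
    "persuadable": (0.60, 0.20),  # the contact makes the difference
    "lost cause":  (0.05, 0.04),  # will not buy either way
}

net_lift = {name: p_t - p_c for name, (p_t, p_c) in scores.items()}
best = max(net_lift, key=net_lift.get)
print(best)  # persuadable
```

Note that the “sure thing” customer scores highest in a normal-level model, but the net lift correctly points at the persuadable one.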

Net lift of $ purchase amount

This goes the same way as modeling the above. Only here you model the campaign impact on the $ purchase amount of your customers.

The above combined

Obviously this is just … the above combined, where you try to maximize
( net increase of purchase probability × net increase of purchase amount ) – contact cost, summed over all the customers in your campaign target group.

That leaves us with one more : optimal pricing modeling

price versus probability to buy

Optimal price chart

This one goes about the same way as the net lift modeling of the purchase probability. There we calculated the probability differences between the target group and the control group. In optimal pricing modeling we also need a first campaign to get the data, and then we calculate several models. Each model is calculated for a different target group. The difference between the target groups is, well, the price. OK, we need a business where we can easily ask different prices from different customers for the same service or item. Say for target group A we ask 5$ for the item, in target group B we ask 5.5$ and in target group C we ask 6$.
After this test campaign we calculate three models : the 5$ model, the 5.5$ model and the 6$ model.
With these models we calculate the three probabilities to purchase for each customer. The price we ask a customer for our item should be the one where this price times his calculated purchase probability is the highest. If John has a probability of 7% of purchasing at 5$, a probability of 6.7% of purchasing at 5.5$ and a probability of 5.5% of purchasing at 6$, you get respectively 0.35, 0.3685 and 0.33. So you should try to sell your item to John at a price of 5.5$, because John will pay you on average 5.5$ × 6.7% = 0.3685$.
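The John example above, as a few lines of code:

```python
# Pick the price that maximizes price x purchase probability
# (expected revenue per contact), using the numbers from the post.

probabilities = {5.0: 0.07, 5.5: 0.067, 6.0: 0.055}
expected = {price: price * p for price, p in probabilities.items()}
# 5.0 -> 0.35, 5.5 -> 0.3685, 6.0 -> 0.33
best_price = max(expected, key=expected.get)
print(best_price)  # 5.5
```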

Reblog this post [with Zemanta]
Posted by: zyxo | February 12, 2010

Data Mining Models ? 10 reasons for not using them !

  1. Modeling is difficult :
    Wrong ! You are a marketer, so you do not have to do the modeling yourself.
    Find or hire a data-loving whizzkid with a solid analysis background. He/she will do the hard work for you
  2. Models are too expensive :
    Wrong ! Wrong for two reasons :
    1. The tools are actually free (download R or Weka from the internet)
    2. The model pays itself if you use it in one or multiple campaigns
  3. Models are black boxes :
    Right ! But that doesn’t mean they are not useful !
    Do you know exactly how your PC works ? Your TV set ? Your car ?
  4. I am not sure the models will work :
    Wrong ! Just ask the whizzkid to test the model with real-life data that he did not use for modelling. He can make you a nice chart showing that the top selected customers do much better than the others.
  5. It is impossible to explain the models to the sales people :
    Right ! Because the models are black boxes.
    Wrong ! Ask your whizzkid to give you the three most significant variables of the model and communicate these to your sales people.
    (Not so black-box after all !)
  6. Selections based on the model are too heterogeneous / targets do not match our campaign materials (banners, ads …) :
    Wrong ! OK, sometimes heterogeneous perhaps, but who said you have to communicate to all of them in the same way ? All you have to do is segment the selection with a communication goal in mind (talk to your whizzkid)
  7. The selection will become too complicated :
    Wrong ! The selection becomes simpler ! Only use the model score as the selection criterion (besides other “administrative criteria”).
  8. I did it for years without data mining models. Why change ? :
    So Sad ! You need to change, otherwise the competition will leapfrog you !
  9. The expected returns based on the model data are way too low. I want to get more out of my campaign ! : Unrealistic ! Ask your whizzkid to compare the expected returns from the model with those from your own selection criteria. Then let the data decide !
    A decent data mining model always outperforms man-made selection rules.
  10. My job is to spend the campaign budget. I do not care about the results (nobody else does anyway) :
    So sad ! You make life easy for your competitors !

You can also view this list as a free to download presentation

Reblog this post [with Zemanta]
Posted by: zyxo | February 7, 2010

Classification, Prior Probabilities and Soft Metrics

I never liked prior probabilities, nor classification.
This is probably a bit weird for someone who likes mining data with decision trees. But I will explain.

For me classification (in data mining) means that you decide with some sophisticated algorithm to which category an observation belongs. “Is someone a terrorist or not ?”

The algorithm calculates the appropriate categories in three steps.

– First the real data mining is done : based on some discriminating, independent variables a model is trained/calculated/derived.
– The second step is to feed observations to that model for which it will calculate probabilities to belong to the different categories. “Jim has a probability of 0.85 for being a terrorist”.
– Third step : decide to which category the observation belongs.

Prior probabilities can be used in steps 2 and 3. If you do not use prior probabilities but you did some over- or undersampling, the model will etc … etc… Sigh ! For a simple data miner like me it becomes too complicated, too artificial.
If you want to know more about it anyway, you should look at some definitions/explanations : [I], [II], [III], [IV]

An easier, more straightforward and more trustworthy way is the following :
– Execute step 1 to obtain your model.
– Use an unseen real life data sample and feed it to the model to calculate the probabilities.
– Use the same unseen real life data sample with known real categories, add the calculated probabilities to the observations and calculate the real probabilities. You can do this simply by sorting the observations from high to low probability and calculating the frequency of each category in bins of, for example, 1% of your observations. Now you have a means of transforming the probabilities calculated by the model into real probabilities.
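The sorting-and-binning step can be sketched like this, with made-up scores and outcomes:

```python
# Calibration sketch: sort scored observations from high to low model
# score, cut them into bins, and use each bin's observed buy rate as
# the "real" probability. Scores and outcomes below are invented.

scores_and_outcomes = [  # (model score, actually bought?)
    (0.95, 1), (0.90, 1), (0.85, 0), (0.80, 1),
    (0.60, 0), (0.55, 1), (0.50, 0), (0.45, 0),
    (0.30, 0), (0.25, 0), (0.20, 1), (0.10, 0),
]
scores_and_outcomes.sort(reverse=True)  # high scores first

bin_size = 4
real_rates = []
for i in range(0, len(scores_and_outcomes), bin_size):
    bin_ = scores_and_outcomes[i:i + bin_size]
    real_rates.append(sum(bought for _, bought in bin_) / len(bin_))

print(real_rates)  # [0.75, 0.25, 0.25]
```

The resulting per-bin rates are the lookup table that turns a raw model score into a real probability.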

You still have no categories ! Right. But why would you need them ?
Example : commercial targeting. You make a model to optimize your target groups because you want to know who has a high probability to buy your product. What would be your categories ? (potential) Buyers and (potential) non-buyers ? This is nonsense ! Models are not perfect. Even the 1 percent of observations with the highest probabilities of belonging to category A will contain a number of observations of the other categories. The only thing you calculate is the probability, not the real category.
Optimizing your target group means finding a balance between 1) the number of people in the group, 2) the cost to contact these people and 3) the expected return.

The probabilities are what they call “soft metrics”. This is sort of a new term for what has long been known as fuzzy logic. It is as if you only want to distinguish black and white in a world of gray scales. It is as if you do not know the temperature, but you know it’s warm.

(pl)Logika rozmyta - temperatura (en)Fuzzy log...
Image via Wikipedia

! Wikipedia has no item for “soft metrics” !

Some definitions :

“Soft metrics evaluate the things that aren’t apparent but may help predict a company’s future: are there heavy hitters on the board of directors? Has the management team succeeded before?” (answers.com)

“An approach to decision making based on soft metrics could allow problems to be solved where no definitive “yes-no” answer is possible” (via @ayoubsciences)

In our current social internet world for marketing it means “sentiment metrics” : engagement, conversations, buzz, interactions, word of mouth, awareness and brand as outcomes of marketing campaigns. (hard metrics : sales figures, number of new customers …).

If you want to read more about soft metrics :
Computational Models of Group Dynamics for National and International Security Applications (Mihaela Quirk)
Marketing metrics : the hard and the soft
How Soft Metrics Can Make Hard Choices Easier

Other posts you might enjoy reading :
Data mining for marketing campaigns : interpretation of lift
How many inputs do data miners need ?
Oversampling or undersampling ?
data mining with decision trees : what they never tell you
The top-10 data mining mistakes
Good enough / data quality

Reblog this post [with Zemanta]
Posted by: zyxo | February 6, 2010

Are Tennis Point Counts Unfair ?

Venus Williams plays Vera Dushevina on the ope...
Image via Wikipedia

Ever wondered about this silly scoring system in tennis ? Me too. I suppose we do not think about the same thing : whether they count 15-30-40 or 1-2-3 does not really make a difference. My concerns here are about the composite system:

  • a game ends at 40+ (or 3+ if you count 1-2-3) if the difference between the two players is 2 points
  • a set ends at 6 if the difference between the two players is 2 games
  • if at 6 the difference is only one game, an extra game is played. This results in either 7-5 or 6-6. In the latter case a tie-break is played. In a tie-break they simply count points up to 7, but there must be a difference of 2. So a tie-break is just a special sort of game.
  • In the last set of the grand slam tournaments tie-breaks are not played. Instead they go on with the games until there is a difference of 2 games.

Now the question : is this a good system ?

But first another question : who wins the match ? Obvious answer : the best of the two players.
OK, but if he is the best, why doesn’t he always win all the sets ?
Or, if he is the best, why doesn’t he always win all the games ?
Or, if he is the best, why doesn’t he always win all the points ?

The answer : variability, noise, variance, standard deviations, standard error, luck, chance …

So let’s rephrase the above.
Who wins the match ? The one who is on average the best in the match (is this true ?)
Who wins the set ? The one who is on average the best in the set (is this true ?)
Who wins the game ? The one who is on average the best in the game (is this true ?)
Who wins the point ? The one who is the best in that point (this is true !)

And : is this a good system ?

Let us look at a game.

a tennis game

tennis quality of the 7 points of a single tennis game

This first chart shows the tennis quality of players A and B during each of the 7 points of a game. Four times player A was the better one and scored. Only three times was player B the better one. Result : at 40-30 player A wins the final point and with it the game. Was he really the best ?
Let us calculate the average quality of the 7 points : A= 5.86 B=5.93. So actually B was the better player, but A was more lucky. Conclusion : in each game there is a portion of luck.
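The same arithmetic in code, with hypothetical per-point quality numbers chosen to reproduce the averages above:

```python
# Invented per-point quality numbers for a 7-point game, picked so
# that the averages match the post (A = 5.86, B = 5.93): A takes 4 of
# the 7 points and the game, yet B is on average the better player.

quality_a = [6, 6, 6, 6, 5, 5, 7]
quality_b = [5, 5, 5, 7, 8, 7, 4.5]

points_a = sum(a > b for a, b in zip(quality_a, quality_b))
avg_a = sum(quality_a) / 7   # 5.86
avg_b = sum(quality_b) / 7   # 5.93

print(points_a, round(avg_a, 2), round(avg_b, 2))  # 4 5.86 5.93
```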

What do we do in statistics ? We repeat the experiments a number of times to “average out” this luck. This is exactly what happens in a tennis set. After 6 (6-0) to 13 (7-6) games we can expect that both players are lucky a number of times and it is only the tennis quality that determines the outcome.
However, consider following set :

Player B wins with 5 – 7, but scores in total three points less than player A. Obviously he was not the best player, but nevertheless he won the set.

And a similar discordance between player quality and match outcome may exist.
Look at the match between Stepanek and Carlovic in the first round of the Australian Open 2010.

Stepanek lost his match, although he was the better player, winning 26 games against only 24 for Carlovic.

It is clear : with the current counting system it is not always the one who played the best, who wins the match.
If we would want this to be the case, the counting would be very simple : just keep on counting point per point, up to a maximal number. If chance plays a role in any point, it would be like tossing a coin : the more you toss, the closer you come to 50% heads or tails.
As men play on average 230.5 points in a match, I suppose 150 points would be a good target.

Why do they never use a system like that ? Sports is not only fun for the ones who play it, it is also entertainment for the supporters. The Romans already knew that : Bread & Circuses ! And when do we like a game or a movie ? When tension builds up towards the end and the final outcome is a surprise. This means that in a sports game chance, luck, surprise must be possible until the end. Games and sets are meant to start the chances all over again. Whether you lose a set 6 – 0 or 6 – 7, the next set still begins at 0 – 0, meaning that a slight difference in your favor during the second set can undo the huge inequality in the first.

Wait ! they do ! In soccer they do, in basketball they do. In waterpolo they do, in handball they do … They just count the number of points.
But in soccer the number of goals is normally so low (1-2, 3-0 are common soccer results) that we can consider most matches for a great deal a “chance game”, especially when an erroneous decision of the referee can result in a penalty kick, with a large probability of scoring.

Basketball is something different. Its scoring system is just based on averaging. No sets, no games, just one match with each point adding up to the total score and with a large number of points at the end of the match. So it is obviously the fairest system possible.

Reblog this post [with Zemanta]
Posted by: zyxo | January 31, 2010

Link list for January 2010

My links for January. Enjoy browsing !

Lifeless prions capable of evolutionary change
Why isn’t the milky way crawling with life (S.Hawking)
100 Job Search Tips From Fortune 500 Recruiters
What is engagement and how do we measure it ?
Three questions an executive should ask for the new year (and more)
When someone googles you, what do you want to happen ?
Angels and Demon’s anti-matter
The state of technology in 2010
Dolphins should be treated as ‘non-human persons’
Future predictions of top scientists
Executives : the data is in your hands !
25-point website usability checklist
Google’s 10 toughest rivals
The tragedy of anti-data leadership and dataphobia
top-10 science/tech stories of the decade
Why most sales forecasts suck…and how Monte Carlo simulations can make them better
Google Blog – Helping computers understand language
top-15 chemical additives in your food
What if a jury could decide if you are guilty by reading your mind?
hyper-heuristic decision tree induction
humans were once an endangered species
scientists develop walking robot maid
How to dual-boot Vista with Ubuntu
Avatar technology could bring back Clint Eastwood at 35 years
You won’t find consciousness in the brain
Emergence of a global brain – will it happen
11 ways to think outside the box
random rules for idea worth spreading

Posted by: zyxo | January 25, 2010

Will rich people become a different species ?

This Wall street journal blog post wonders if rich people will become another species, because they can afford all the new medical and technical remedies and life enhancements.

darth vader

It made me wonder.

They have to stay the same species.

If rich people want to profit from donor organs, they better stay the same species, otherwise they will face tissue incompatibility problems.

It is difficult to form a new species.

Speciation, or the splitting of an existing species into two (or more) new ones, is the result of genetic barriers. When different populations of the same species are genetically isolated, they can evolve into different species. This genetic isolation is the result of some sort of geographical isolation.
Human populations are far from isolated. Globalisation proves the opposite.
Is it possible that rich people can isolate themselves sufficiently to form a new species ?
We all know some examples of isolated human populations : Australian aboriginals, North-American indians. In all those centuries they evolved into a different kind of people, but still remained people of the one human species. Imagine how difficult it would be in our modern world for human speciation to occur.

It is more likely that “species” will lose its meaning.

With that I mean that, at least for humans and perhaps the most intelligent other mammals, the species concept will possibly disappear.
Sounds a bit weird ?
Imagine what the current technological evolution will mean in the future : messing with genes, bionics. A lot of people already have false teeth, hips, knees, heart valves, cochlear implants. Experiments are going on with gene therapy, chip implants in animal and human brains, artificial arms & legs.
On the other hand we fabricate robots that behave (up to a limit) as human beings.
This all means that the border between biological and artificial life is thinning.
Add to that in vitro fertilisation, cloning and similar technologies, and we are heading for a world where we will no longer need to make love in order to have children.

Where were we ? OK : we will make humans (still humans ?) in the laboratory, and will be able to choose among a lot of genetic enhancement options. During the lives of our kids we’ll still be able to add all sorts of technological enhancements.
Can you imagine the diversity of humans at the end? It will be beyond anything earth ever saw.

At the same time rich people will have the same options for their pets. All amazing sorts of dogs, cats, snakes, rats, horses, which finally will sort of blend, so that some dogcats will look like rabbitmonkeys …

There is a saying that dog-owners come to look like their dog. In the end people with plenty of money will have the possibility to really give their dog their own face, or the face of a beloved one who passed away. With the necessary brain implants, this animal (?) will have normal human intelligence.

Do you still know where the species is?

Enjoyed this post ? Then you might be interested in the following :
– Web 5.0: The telepathic web
– Robotic insects or cyber-insects ?
– Self reassembling Robot
– Human brain copy protection by AnyMind Inc.
– Humans 2.0

Reblog this post [with Zemanta]

How many customers does the grocer around the corner have ?
How many customers do you have on your website ?
How many cars are there that use a particular crossroad ?
How many birds of a particular species live in a determined patch of forest ?

How many ? Is there a way of finding out ?
(Aside from the question of whether it is meaningful, I find it an intriguing one.)

A first step is simple : Unique visitors in a given period of time.

– You just watch the grocery for an afternoon and count all people that enter. Make sure you do not count the returning customer who forgot the sugar twice !
– Get the unique visitor number of your website from Google Analytics or whatever web analytics tool you use.
– Get a mojito from the bar at the corner and write down all license numbers you see, for one or two hours. Afterwards eliminate all returning ones.
– take a walk that covers the entire forest patch and count each individual of that particular bird species.

Simple, is it not ?
But did that get you the actual TOTAL number ? NO

What about the loyal grocery customers that came yesterday and will return tomorrow ? You missed them.
Not everyone visits your website all the time.
Some people leave their car at home and take a walk … to the forest where not every bird will show itself or will be singing.

Realise you only got a fraction of the number.

In biology they use something like capture-recapture.
1. First step : capture some birds and put a ring on one of their feet (identify customers, drop a cookie when they visit your website, write down the license numbers of passing cars).
2. Second step : capture some birds again and count the number with and without a ring (count returning customers, count returning vs. first-time customers/cars).
3. Do the simple math : the ringed fraction in the second catch estimates the ringed fraction in the whole population, so total number = number ringed / ringed fraction.
If the first time you captured 100 birds, and the second time you captured the same number and 25 of them were already ringed, you can say that in your forest 1/4 of the birds are ringed, so the forest contains a total of 400 birds.
Idem for your website : if 50 % of your visitors are returning ones, you may say that you have twice as many visitors as cookies dropped.
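The simple math above is the classic capture-recapture (Lincoln-Petersen) estimate; the bird example as a few lines of code:

```python
# Capture-recapture (Lincoln-Petersen) estimate of total population:
# total ~= marked_first_time * caught_second_time / recaptured.

marked = 100      # ringed in the first catch
caught = 100      # size of the second catch
recaptured = 25   # already ringed among the second catch

total_population = marked * caught / recaptured
print(total_population)  # 400.0
```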

These figures are correct … if we accept some assumptions that we cannot accept :

– Birds that are captured once become much more shy. They will be under-captured the second time.
– Not all people have the same activity on the internet, or on the road. They do not all have the same probability of showing up. And some delete their cookies !
– The first day perhaps there was a tourist or two in the grocery who will never return !
– Migration : some come, some go …
– With the same effect as migration : births, deaths.

Perhaps there is more to learn when we take the figures for a number of periods in succession, like, say, capturing birds or monitoring the number of sign-ons on your website for two weeks in a row.

Two things we can learn :
– an estimation of the total population
– the total number of identified individuals (ringed birds, website visitors who signed up)

We can assume that what we get should lie between two extremes :
1) No migration/births-deaths
This first extreme should show us the actual, stable situation.

The chart shows two lines : the highly fluctuating line is the percentage of identified individuals, day per day, whereas the more stable line shows the cumulated data. These cumulated figures tend towards the real % of identified individuals in the population. Here we see that 24% carries an ID. By simply using the proportion we can easily calculate the total number of individuals.
OK, for birds in a forest it is simple, but for website visitors it is a bit more complicated. Not every visitor with a username/password for your website will sign on each time he visits, but let us assume this is the case anyway, for now.

The second chart shows the theoretical cumulative proportion of ID’d individuals. If each day 24% is ID’d, after 14 days nearly 100% of them will be captured, seen or have visited your site.
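A quick check of that claim, assuming each individual independently has the same 24% chance of being seen on any given day:

```python
# If a fraction p of the population is ID'd each day, the chance that
# an individual has been seen at least once after n days is 1-(1-p)^n.
# With p = 0.24, after 14 days almost everyone has been captured.

p = 0.24
seen_after = [1 - (1 - p) ** n for n in range(1, 15)]
print(round(seen_after[-1], 3))  # ~0.979 after 14 days
```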

2) only migration/one-day flies
What about this second extreme ? It really means that you do not have any returning visitors, or that each bird you capture is some migrant passing through.
So you never see ID’d or returning individuals. Not much information to show. Only the average number per day is interesting.
Although …
Let us take our second graph and add the line that corresponds with our second extreme :

The straight line shows the total number after the 1st, second, third, etc. day. Each day about the same number of individuals come along, but each day these are new ones, so they simply add up.

In real life you will find something in between the two lines (the yellow dots in the following chart). The closer your real-life line is to the curved one, the more stable your population of birds or website visitors. The closer to the straight line, the more volatile your population.

“Something between the two lines” actually means that you deal with two populations : your loyal returning customers (or your sedentary birds in the forest) and your one-time customers (or birds accidentally passing through).
Considering this, it should be possible to reconstruct your actual, intermediate line by combining the lines of these two populations … if only you knew them.
Fortunately there is something like Excel, OpenOffice Calc, or whatever spreadsheet you may use. Instead of finding some complicated equation I made something simple and played a bit with the numbers to come up with the following chart :

The blue squares are the “actual” observations, the red diamonds represent the theoretical line for the “sedentary” population and the blue triangles show the one-pass-and-never-return population.
The latter two sum up neatly to match the actual observations.

What are the columns in my spreadsheet ?
1. day number (1, 2, 3, 4 …), is shown on the horizontal axis
2. volatile population (straight line) = one number, let’s call it A, multiplied by the day number from column 1
3. stable population (“theoretical” curve). This is more complicated. The first cell is a fraction (let’s say 25%) of the total number we will have seen on, for example, the 14th day. The second cell equals the first + the same fraction (25%) of the rest of the total number, and so on.
4. the actual data = the cumulative number of individuals observed the first day, the first two days, the first three days etc.

Now you just have to tweak (play with) the three numbers : the one for the volatile population and the two (total number + fraction) for the stable population until the two populations sum up to (nearly) exactly the same data as column 4.
OK, this tweaking is not very scientific. You could do the necessary programming to obtain the desired result automatically or, if you are good at math, derive some equation to reach your goal more efficiently. In the end the result will be the same.
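A sketch of what the tweaking amounts to, written as a brute-force grid search; all numbers are invented for illustration:

```python
# Brute-force version of the spreadsheet tweaking: generate "actual"
# cumulative counts from a volatile plus a stable population, then
# grid-search the three numbers (A, total N, fraction f) that
# reproduce them. All figures here are made up.

DAYS = 14

def volatile(a, day):          # column 2: A * day
    return a * day

def stable(total, frac, day):  # column 3: total * (1 - (1-frac)**day)
    return total * (1 - (1 - frac) ** day)

# "Actual" observations (column 4), built from A=10, N=200, f=0.25:
actual = [volatile(10, d) + stable(200, 0.25, d) for d in range(1, DAYS + 1)]

best, best_err = None, float("inf")
for a in range(5, 16):
    for total in range(100, 301, 20):
        for frac in (0.15, 0.20, 0.25, 0.30):
            err = sum(
                (volatile(a, d) + stable(total, frac, d) - actual[d - 1]) ** 2
                for d in range(1, DAYS + 1)
            )
            if err < best_err:
                best, best_err = (a, total, frac), err

print(best)  # (10, 200, 0.25): the generating numbers are recovered
```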

Did you enjoy this post ?  Then you might like the following :

Are you a good data miner ?
Men are more accurate than women … or lousy statistics ?
Good enough/data quality

Reblog this post [with Zemanta]
Posted by: zyxo | January 8, 2010

Does Avatar show the future?

2154 is the year when an Avatar fell in love with a cute Pandoran aboriginal. Well, I like her too 🙂

A blog post by Seth Grimes made me think. Seth has some problems with what we see in 2154 : mostly contradictions and anachronisms !

Here I borrow his ideas, add mine and some comments.

Anachronisms.

Remember we are talking 150 years from now. If we compare our current world and state of technology with that of 150 years ago, a lot has changed : especially electronics has made the big difference. We are also very much aware that this technological evolution is continuously speeding up (remember Moore’s law : it is exponential).

So, according to Avatar’s creators, what will humans use 150 years from now ?
– a stupid wheelchair
– napalm bombs
– manned air ships and ground battle machines
– very short range missiles
– helicopters with propellers
– mechanical man-machine interfaces
– ground troops

Contradictions

– Humans are able to control the Avatar consciousness from a distance. They must have something like mind transmission. And yet they need physical contact to handle their equipment !
– Pandorans talk to each other, have misunderstandings, quarrels and everything we humans have. And yet they are able to communicate by means of a direct connection to their animals and even their trees. Why not connect directly with each other and form one connected mind ?

What should it really look like in 2154 ?

– As far as I am concerned they should have something like anti-gravitation, for example for someone with amputated legs, for airplanes and the like.
– The same anti-gravitation device should be able to pull the giant tree out of the ground (with sufficient soil around its roots) and plant it back somewhere else, so no need for napalm bombs and all that archaic shit.
– Already now we humans are able to read minds, albeit in a very, very basic way and with huge machines (fMRI and the like). The next steps are brain-machine interfaces, wirelessly connected to the internet, which will make technology-enabled telepathy possible. So everything will be mind-controlled, no actual soldiers needed on the battlefield.
– By then, Artificial Intelligence will have created the so-called singularity, a super-human intelligence. This fellow will sure find better ways of getting to “unobtainium” than bombing some tree.

Did you enjoy this post ?  Then you might like the following :

Web 5.0 : the telepathic web
Humans 2.0 ?
Psychons: elementary particles of the mind.
The human cyborg
Human evolution : the future of men

Reblog this post [with Zemanta]
Posted by: zyxo | January 1, 2010

Link list for December 2009

Enjoy browsing :

Dutch PhD student produces anti-noise to combat noise
tattoos in advertising
machine allows people to type with their minds
co-processors for the human mind
Eureqa calculates scientific laws
is social media worth your time ?
amazing pictures !
highcharts
why we do not care about information overload
Can “nice girls” negotiate ?
Google analytics illegal ! (say German regulators)
The idea that will make twitter more profitable than google
What matters now : 60 important words beautifully explained by 60 important people
coin tosses can be easily rigged
Brain cells reach a decision by computing probabilities
I am a God …
Ben Goertzel and the US military on the ethics of battlebots

Human enhancement : bioliberation versus biothreat
Scientists are drowning in data !
the three sexy skills of data geeks

text data quality (seth grimes) and Manya Mayes
10 things that make humans special
life before computers were invented
Shopping list of cyber criminals
5 ways humans could become obsolete
Golden ratio’s for beauty of female faces

3D fractals
SEO tools
How close are we to colonizing space ?
18 giga pixel photo of Prague
The known universe
Every 20 minutes we lose an animal species
Machine translate thought to speech in real time
Will uploaded minds in machines be alive ?
Resources for newcomers to R
Ecosystems on the run due to climate change
free services to schedule your tweets
Trending twitter topics of 2009
Data, not design is king in the age of google
Kissing the frog : a mathematicians’ guide to mating

Wind dispersal of dandelion seeds.
Image via Wikipedia

As a result of global warming, our ecosystems are running away at a speed of 420 meters per year.   That is about 500 human paces a year, roughly 1.5 per day.

What does that mean ?

At first sight this is not a big deal.  What is 420 meters ?  This is only 42 kilometers in 100 years ! The animals and plants just have to follow !
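The arithmetic is easy to check ; assuming a human pace of about 0.8 meters (my assumption), it comes out at roughly 1.4 paces a day, in line with the rough figure above :

```python
# Rough arithmetic behind the figures above (assumed pace length: 0.8 m).
SHIFT_M_PER_YEAR = 420

km_per_century = SHIFT_M_PER_YEAR * 100 / 1000   # 42.0 km per 100 years
paces_per_year = SHIFT_M_PER_YEAR / 0.8          # ~525 paces a year
paces_per_day = paces_per_year / 365             # ~1.4 paces a day

print(km_per_century, round(paces_per_year), round(paces_per_day, 1))
```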

Wait !

The animals, perhaps they can follow : they can walk, fly, crawl, dig …

What about the plants ? If they are lucky enough to have mechanisms like seed dispersal by wind or by birds (seeds in fruits), they should manage.

No, Wait !

What if there are obstacles ?  Like seas, streams, mountains ?  Then neither animals nor plants can follow their ecosystem, because the ecosystem will cease to exist.  An ecosystem that follows its preferred temperature and hits the seashore will simply drown in the sea.  Have you ever seen a forest crossing a sea ?

What is the alternative ?

Adapt or die !
Species that follow their ecosystem do so because at the trailing edge the environment becomes a bit too difficult to live in.  But when their ecosystem hits the sea, all that will stay behind is this unfavorable environment.  The only solution is to adapt to this new environment.  Which means : evolution, as quickly as possible, before the environment becomes too harsh.

So I predict that we will witness accelerated evolution going on all over the world.

Did you enjoy this post ?  Then you should read the following :
Human evolution : amazingly fast
The direction of evolution : speed matters
Evolution can occur in less than 10 years
The chicken or the egg ?

Posted by: zyxo | December 29, 2009

10 Predictions for 2010 to 2020

Instrumental record of global average temperatures
Image via Wikipedia

What is to come ?

1) nano-stuff.  The potential is huge and the technology is developing fast.  Examples : A nano-window that washes itself or Tracking new cancer-killing particles with MRI

2) Artificial Intelligence.  While the data-mining hype of the 80s was a failure because of computer processing limits, a new wind blows through AI that wants to create a Superhuman Intelligence, a so-called Singularity.

3) Mind-machine communication.  This is still very basic, but one success after another is published.  Example : people type with thoughts alone.  But this means more than mind-machine communication : if you add a second mind on the other side of the machine, you have mind-mind communication : technology-enabled telepathy !

4) electric cars : not really future any more.  With the greener mindset of a lot of people and the investments in wind and solar energy, it is just a matter of years before everyone will buy an electric one.  Mine must have photovoltaic panels on its roof to recharge the battery while I am shopping.

5) photovoltaic windows : photovoltaic panels are ugly when you put them on your roof.  Photovoltaic windows look just like other windows, so why should you not use them instead of normal ones ?  But before the technology becomes really mature, we can take photovoltaics into account when designing our buildings instead of putting the panels on our roofs afterwards.

6) wireless database-driven / data mining medicine (not just doctors with gut feelings) : do you know of cases where a doctor was wrong for months before the patient or his/her family decided to go to another doctor, who saw in the blink of an eye (or after some blood tests) what the real problem was ?  Databases and medical software to assist the doctor now exist.  So he should see you and, guided by his software, ask the right questions, eliminating all impossible diseases, to come up either with the right one or with extra tests to perform in order to detect the real problem.  Exit medicine men !

7) movies without movie-stars (Avatar squared) : movie stars out of work !  1. Select the people you want to be the basis for your stars.  2. Film them in their real life.  3. Load them into your computers.  4. Feed the computers with the scenario/script.  5. Select the looks/feels/characteristics of your stars.  6. Describe the decors.  7. Run the software.  8. Evaluate the result and if necessary go back to step 4 or 5.  9. Do some editing.  10. Ship the movie.

8 ) the semantic web (Web 3.0).  It will become possible to tell your computer (smartphone, whatever is connected to the internet) in your own language what you want.  It will respond based not on the words, but on their meaning (in context).

9) accelerated evolution : global warming changes the environment for a lot of species much faster than usual.  They will either follow their preferred ecosystem as it moves around, or, if they encounter a serious obstacle and cannot move further, they will evolve lightning-fast to the new conditions.

10) robots : real humanlike robots like the Japanese Asimo will stay too expensive for a normal human being.  But a lot of military applications are possible, so expect things like cockroaches offering inspiration for running robots, or flying insects and robots going to war !

I’m not the only one to make predictions about the future :

the futurist
Institute for emerging ethics & technology
Lain Dale’s diary
Wall street pit
ReadWriteWeb
ZDnet
Darwin Central
True/Slant
The Security Blog

Did you like this post ?  Then you might enjoy the following :

Human brain copy protection by Anymind inc.
Job interview or brain scan ?
Adam and Eve : Robot scientists
New laws of robotics
Web 5.0 : Computer telepathy ?

Posted by: zyxo | December 13, 2009

Statistics on 250 twitter tools

Do you know which are the most popular twitter tools ?

Curious as I was : how could you find out such a thing ? Organize a poll ? I fear I do not have enough readers or followers on twitter to end up with sufficient data.

But I had two options left :

  1. count the number of times a twitter tool appears in lists of twitter tools. You know, there are lots and lots of lists of twitter tools on the internet. A tool that appears in every list must be very popular, at least for twitter tools listers.
  2. get the number of hits google returns you when you search for the twitter tool


I decided to use the second method.
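For the record, the first method could be sketched in a few lines of Python ; the tool names and lists below are made up purely for the illustration :

```python
from collections import Counter

# Hypothetical illustration of method 1: count how often each tool
# appears across several lists of twitter tools. The names and lists
# here are invented for the example.
lists_of_tools = [
    ["tweetdeck", "twitpic", "hootsuite"],
    ["tweetdeck", "twitterfeed"],
    ["tweetdeck", "twitpic"],
]

popularity = Counter(tool for tools in lists_of_tools for tool in tools)
print(popularity.most_common(2))  # tweetdeck appears in all three lists
```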

But first I needed the names of “all” the twitter tools. So I started to get them from the various twitter tools lists. Soon I saw that this could be an exercise that goes on forever !
Neither my patience nor my time are endless, so I decided to stop after 15 lists and 250 twitter tools. Feel free to continue the exercise !

First of all here are the 15 lists :

Mashable
BashBosh
techcruising
Rssapplied
dailyseoblog
Brian Solis
techcrunch
The twitter toolbox
online marketeer
99 Essential Twitter Tools And Applications
Top twitter tools
Top twitter tools for business
My Top 10 Free Twitter Tools (and 3 Honorable Mentions)
47 Awesome Twitter Tools You Should be Using
Twittermania: 140+ More Twitter Tools!

( At the end of this post, I give the remaining URLs of the twitter tools lists that I did not use. )

And now the results.
But first there are two remarks to make.

  1. Most searches were straightforward because the tool has some typical “twitter-like” name, so you – and google – cannot confuse it with some other already existing concept. Example : splitweet.
    But there were a lot of more ambiguous names, like “hellotxt” or “glue”. In those cases I used either the website name of the tool (getglue.com) or added “twitter” to the search term.
  2. I know the numbers that google returns are mere … “google numbers”. This means we do not know exactly what is behind them, unless we browse for example all 126.000.000 hits for bit.ly, which is a bit too much for me. I also noted that google never uses more than 3 meaningful digits ; the rest are zeros. So these numbers are not very precise. But at least they give some overall picture, which is interesting to see but, I am aware, does not have very much real meaning or value. Say it is just for fun and curiosity.
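What “no more than 3 meaningful digits” means can be sketched as follows (my own illustration ; whether google truncates or rounds the remaining digits is unknown, truncation is assumed here) :

```python
import math

def to_3_significant_digits(n: int) -> int:
    """Keep the first 3 digits of a positive integer and zero the rest,
    the way the hit counts below look (e.g. 126,123,456 -> 126,000,000).
    Truncation is assumed; google might round instead."""
    if n < 1000:
        return n  # three digits or fewer: nothing to zero out
    digits = int(math.floor(math.log10(n))) + 1
    factor = 10 ** (digits - 3)
    return (n // factor) * factor

print(to_3_significant_digits(126123456))  # 126000000
```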

And here is the list of 250 twitter tools and their number of google search results : enjoy !
Sorry that I did not hyperlink them all, just too much work !

1 twittersearch 402000000
2 friendorfollow 208000000
3 bit.ly 126000000
4 seesmic 15900000
5 twitter karma 15000000
6 twittangle 13070000
7 ping.fm 11900000
8 cotweet 9520000
9 twittercounter 8080000
10 tinyurl 7760000
11 brightkite 7470000
12 timer 6780000
13 hootsuite 6050000
14 wefollow 5330000
15 twitthis 4800000
16 headup 4230000
17 tweetmix 4200000
18 toro for twitter 4050000
19 twitterfeed 4030000
20 diigo 3730000
21 loopt 3550000
22 twitter grader 2800000
23 twitpic 2680000
24 tweetdeck 2170000
25 tweetmeme 1800000
25 twitterrific 1750000
27 splitweet 1040000
28 quitter 1020000
29 cheaptweet 841000
30 strawpoil 767000
31 hellotxt 765000
32 twitalyzer 669000
33 snipurl 654000
34 twitterfriends 650000
35 magpie 588000
36 twiggit 581000
37 hashtags 539000
38 twitwall 539000
39 tweepsearch 537000
40 emailtwitter 522000
41 Doesfollow 517000
42 twhirl 490000
43 twitgraph 476000
44 rememberthemilk 461000
45 twitxr 457000
46 hoopla 451000
47 tweetcloud 436000
48 retweetist 425000
49 destroytwitter 404000
50 twibs 397000
51 glue 393000
52 tweetree 375000
53 twellow 353000
54 twitt twoo 349000
55 spaz 326000
56 tweet2tweet 320000
57 twitterberry 291000
58 tinytwitter 290000
59 summize 284000
60 snitter 278000
61 twitterfon 271000
62 twitscoop 267000
63 retweetrank 261000
64 twitterfall 259000
65 favrd 256000
66 microplaza 253000
67 outwit 250000
68 digsby 247000
69 twitterreply 244000
70 twist 236000
71 flaptor twitter search 227000
72 twitter search firefox 222000
73 jott 219000
74 twitbin 218000
75 twittelator 211000
76 twitterkeys 189000
77 twitterlocal 188000
78 twitterbadge 179000
79 twitterholic 178000
80 powertwitter 176000
81 twibble 174000
82 twinbox 166000
83 tweetvisor 163000
84 easytweets 159000
85 tweetrank 157000
86 bubbletweet 154000
87 backtweets 150000
88 huitter 148000
89 tweetstats 145000
90 Itweet 142000
91 tweetbeep 142000
92 twitterfox 142000
93 tweetlinks 135000
94 slandr 131000
95 twitterless 126000
96 vakow 124000
97 twittervision 122000
98 twitdir 118000
99 twitzer 116000
100 twtpoll 114000
101 twitterfone 113000
102 twitter2go 111000
103 twittermail 110000
104 tweetvolume 108000
105 twitdom 103000
106 mrtweet 96300
107 twittytunes 93200
108 tweetlater 92900
109 peoplebrowsr 90100
110 twinfluence 88100
111 twitternotes 87500
112 twideoo 87000
113 mr milestone 86100
114 cursebird 86000
115 WP twitter tools 85700
116 tweetgrid 82600
117 tweetr 82000
118 twoogle 81000
119 twitterbar 80700
120 snaptweet 72700
121 tweetscan 71300
122 twitteroo 68700
123 hahlo 68500
124 tweetburner 68200
125 twuffer 66800
126 twittercal 65100
127 twittonary 65100
128 twitter updater 63400
129 tweetchat 62900
130 twitter100 62800
131 tweetake 60800
132 socialtoo 60600
133 nearbytweets 56900
134 monitter 56600
135 tweepler 55900
136 twtvite 55200
137 twilert 55100
138 tapulous 52300
139 tweetwire 50900
140 feedalizr 50700
141 secrettweet 49800
142 twitterhawk 48400
143 twitturly 48400
144 Xpenser 48400
145 grouptweet 48300
146 tweetcube 48000
147 tweet this 47900
148 tweetwheel 46700
149 linkbunch 46400
150 twiddict 46200
151 twittertise 44900
152 tweetsum 43500
153 twitstat 43200
154 followcost 43100
155 twitter sharts 41900
156 tweetrush 40300
157 untweeps 38700
158 twtqpon 37900
159 tweetsuite 35600
160 citytweets 35500
161 twistori 35100
162 twitpay 34800
163 twitterpatterns 33200
164 tweepular 32300
165 gps twit 31800
166 twonvert 30900
167 Matt 30500
168 livetwitting 30400
169 twitseeker 30000
170 twittergallery 30000
171 twitoria 29100
172 quotably 28400
173 mymilemarker 28100
174 tweetchannel 27800
175 twistory 27800
176 tweet pro 27400
177 twitzu 27000
178 justtweetit 26500
179 twitterIM 26100
180 gridjit 25700
181 twittercamp 25100
182 twubble 24800
183 socialwhois 24700
184 twittereyes 23500
185 twtrfrnd 23400
186 twittad 23200
187 twittersnooze 21900
188 brabblr 21400
189 twalala 20800
190 whostalkin 19700
191 twitsay 19500
192 twittearth 19400
193 istwitterdown 19300
194 twtcard 19200
195 pockettweets 19000
196 toptweet 18900
197 twerpscan 18600
198 nozbe 17500
199 twemes 17000
200 autopostr 16300
201 twixxer 16200
202 twitterdigest 16100
203 whoshouldifollow 15900
204 twithire 15800
205 madtwitter 15300
206 tweetwhatyouspend 15000
207 xefer 14500
208 twittertroll 13800
209 twitterlights 13000
210 twitority 12200
211 twitterfriends network browser 12100
212 feedtweeter 10100
213 tweetwasters 9870
214 tweetie for iphone 9680
215 twitrans 9520
216 twiffid 9200
217 mytweeple 9130
218 twitresponse 9110
219 itweet2 9020
220 microrevie 8970
221 twitterratio 8960
222 Nest.Unclutterer 8730
223 twply 7730
224 itwtr 6730
225 twitspy 6320
226 postica 6260
227 twitrand 5660
228 twinkle 5520
229 dreamtweet 4870
230 whatsyourtweetworth 4190
231 mycleenr 4090
232 WP twitip Id 3780
233 twitslikeme 3760
234 alphatwitter 3510
235 plodt 3180
236 readmytweets 3150
237 twittords 3020
238 vacatweet 3010
239 twenglish 2850
240 acamin 2420
241 tweetpad 2330
242 twitexplorer 2120
243 twi8r 1880
244 twitgeistr 1880
245 whofollowswhom 1660
246 tweeterology 1410
247 twitblocker 1150
248 twitalks 1040
249 twitterscan 39

 
Oops ! either I come one too short, or I made some mistake in my numbering …

And here are the URLs of the other twitter tools lists I did not use :

http://www.sociableblog.com/2009/03/18/100-twitter-tools-to-help-you-achieve-all-your-goals/
http://net.tutsplus.com/articles/10-awesome-ways-to-integrate-twitter-with-your-website/
http://www.folkd.com/go/top+10+twitter+tools

Top 10 Most Useful Practical Twitter Tools for The Twitter Professionals


http://www.1stwebdesigner.com/development/27-twitter-tools-to-help-you-find-and-manage-followers/
http://www.quickonlinetips.com/archives/2007/04/10-best-twitter-tools-for-wordpress-blogs/
http://www.seoptimise.com/blog/2009/10/30-twitter-tools-for-business.html
http://tendou86.blogspot.com/2009/01/top-10-twitter-tools.html
http://www.hellogiri.com/top-10-most-useful-twitter-tools-list-for-pc-mobiles-and-blogs/

Top 25 twitter tools for WordPress


http://www.newmediabytes.com/2008/01/18/best-twitter-tools-resources-and-clients-guide/
http://savethemedia.com/2009/02/17/top-twitter-tools-for-journalists/

Top 100 Most Influential Twitter Tools


http://www.google.be/search?hl=nl&client=firefox-a&rls=com.ubuntu:nl:official&q=TOP+10+LIST+OF+twitter+tools&start=30&sa=N
http://www.c4lpt.co.uk/recommended/
http://www.squidoo.com/twitterapps?utm_campaign=direct-discovery&utm_medium=sidebar&utm_source=pkmcr
http://pelfusion.com/tools/30-twitter-tools-for-managing-followers/

http://www.seodubai.org/2009/01/16/list-of-twitter-tools-that-you-must-have/
http://brendanhughes.ie/2009/06/21/top-10-twitter-tools-for-business/
http://www.smbceo.com/2009/03/25/top-27-twitter-applications/
http://www.smmguru.com/2008/10/22/the-master-list-of-twitter-tools-and-apps

Top 10 Twitter Tools For Musicians


http://www.socialmediatoday.com/SMC/80437
http://www.blogcatalog.com/topic/list+of+twitter+tools/
http://www.scgpr.com/wordpress/?p=492
http://www.socialmedialists.com/wiki/index.php?title=Twitter_Tools
http://www.twitadder.info/
http://www.thedailyanchor.com/2009/02/17/85-twitter-tools/
http://www.techtreak.com/downloads/10-awesome-twitter-tools-as-wordpress-plugins/
http://steve-wakefield.com/2009/10/my-top-10-twitter-tools-and-then-some/
http://www.thinktechno.com/2009/05/31/top-10-twitter-tools/
http://www.brandsamongmany.com/2009/03/09/the-ultimate-list-of-twitter-tools/
http://www.webuildyourblog.com/1289/increase-twitter-top-10-twitter-tools/
http://www.networkworld.com/slideshows/2008/060208-top-twitter-tools.html
http://www.warriorforum.com/blogs/dsmpublishing/8167-top-10-twitter-tools-everyone-should-own-their-online-business.html
http://www.girlopinion.com/2009/06/07/top-10-twitter-tools/
http://www.twitip.com/10-more-must-have-twitter-tools/
http://gnoted.com/100-twitter-tools-ultimate-power-collection/
http://freelancefolder.com/15-useful-twitter-tools-for-web-workers/

Twitter Tools & Resources For Jumpstarting Your Twitter Experience


http://www.placona.co.uk/blog/post.cfm/my-top-favourite-twitter-tools

If you enjoyed this post, then you might also be interested in the following :
top 10 lists of twitter tools
A bunch of tools for twitter
A second bunch of tools for twitter
Micro Email = twitmail

 

Posted by: zyxo | December 6, 2009

Top 10 lists of twitter tools

A Twitter profile
Image via Wikipedia

Twitter started in March of 2006 as a very simple service to connect people by sending short messages of max. 140 characters.
Who could have imagined at that time not only that twitter would become so popular, but that, thanks to their API, the number of twitter tools and services would explode the way it did ?

On the internet we find a wealth of lists of twitter tools (I wrote two of them myself). As the evolution rocks off the charts and I wanted to assemble a new list, I figured it would be interesting to make a meta-list : a list of lists of twitter tools.

We can find all sorts of lists of twitter tools. I figured the lists I wanted to list had to have something in common, so the list would make some sense.

It has become a list of top-10 lists :

1 BashBosh : Top 10 Tools for Twitter Freaks
2 Techcruising : Top 10 Twitter tools for a power user
3 Rssapplied : Top Ten Twitter Tools
4 Dailyseoblog : 10 twitter tools to effectively manage your followers
5 The twitter toolbox : Top 10 Tools For Your WordPress Blog
7 Itpro : Top 10 Twitter tools for business
8 Dooleyonline : My Top 10 Free Twitter Tools (and 3 Honorable Mentions)
9 Tutsplus : 10 Awesome Ways to Integrate Twitter With Your Website
10 Atniz : Top 10 Twitter Tool
11 Quickonlinetips : 10 Best Twitter Tools, Plugins, Widgets for WordPress Blogs
12 Tendou86 : Top 10 Twitter Tools
13 Hellogiri : Top 10 Most useful Twitter Tools list for PC, mobiles and blogs
14 Top10 Twitter Tools : Twitter Tools Top 10
15 Brendanhughes : Top 10 Twitter Tools for Business
16 Hypebot : Top 10 Twitter Tools For Musicians
17 Techtreak : 10 Awesome Twitter Tools as WordPress Plugins
18 Steve-wakefield : My Top 10 Twitter Tools… and then some!
19 Thinktechno : Top 10 Twitter Tools
20 Webuildyourblog : Increase your Twitter following with these top 10 Twitter Tools
21 Warriorforum : Top 10 Twitter Tools That Everyone Should Own For Their Online Business
22 Girlopinion : Top 10 Twitter Tools
23 Twitip : 10 MORE Must Have Twitter Tools


If you enjoyed this post, then you might also be interested in the following :
A bunch of tools for twitter
A second bunch of tools for twitter
Micro Email = twitmail

 

Posted by: zyxo | November 30, 2009

Link list for november 2009

Enjoy browsing :

Douglas Hofstadters: musing on the singularity
a clever way of searching
how to really browse without a trace
you should follow me on twitter
The new era of inbound marketing
the twitter song
Top 10 Most useful Web Developers tools for Firefox
elevators to space
Sharing small snippets of information about your daily life could be generated automatically
who will edit your life ?
A Fractal Perspective on Enterprise 2.0 Adoption
10 things about google that you might not know
test your science knowledge with science cheerleaders (fun)
you think your child is smart ?
what is the meaning of “organism” ?
how ants make their nest
periodic table of marketing elements
Bill Bryson’s Notes from a Large Hadron Collider
The Über-Connected Organization: A Mandate for 2010
Is neighbor’s Wi-Fi signal free for me to use?
dark chocolate helps ease emotional stress
Why doesn’t linux need defragmenting ?
Are solar cells warming up the earth ?
bounce rates
graphedge
six insane laws we will need in the future
explore your twitter friends and hashtags with mentionmaps
Top Ten list of excuses not to engage in co-creation
how to achieve something
how heavy is the internet ?
Intel wants a chip implant in your brain
in the brain, se7en is a magic number
We perform best when no one tells us what to do

Enjoyed this post ? Then you might be interested in the following :
link list for october 2009
link list for september 2009
link list for august 2009
link list for june 2009
link list for may 2009

Posted by: zyxo | November 23, 2009

Thoughts on Traffic Jams

Traffic Jam in Delhi
Image via Wikipedia

I am sure everybody knows the feeling when you get stuck in a traffic jam. No need to say this is becoming a huge problem.
Why are there traffic jams ? Is it possible to prevent them ?

What is a traffic jam ?
Very simply put : you experience a traffic jam, when there is no space in front of you to move on. We all love an empty road ahead. But you do not really need an empty road in front of you. When the driver in front of you nicely drives on, he is constantly making the necessary space so that you can move on too.
So there are two factors : i) there is a car in front of you and ii) it is not moving.
(“Ants have no traffic jams !” Are they more intelligent ?)

How much space do you need ?
This is not so simple. It depends on your speed. You want enough space to have the time to stop when the one in front of you stops. Hence you only move on when there is more space in front of you than the minimum you feel safe with. What you really want is not space, but time. A good -conservative- rule of thumb is 4 seconds, or 2 crocodiles (just say : “one crocodile, two crocodiles”).

To put it the opposite way : When is there no traffic jam ?
First everybody must be moving, and second there has to be enough time between the cars.

How to prevent traffic jams?
Since there are two factors in play : space and speed (space/time) we can play with both.

i) The first is space : it is obvious that lowering the number of cars on a given road at a given time will be a good thing, making more room per car. So you need to prevent (some) people from taking their car, for example by enhancing public transportation or by making it more expensive to drive a car (taxes).

ii) The second one is less intuitive : a general remedy to traffic jams is limiting the speed. Why ?
My first reaction is : this makes no sense at all ! Whether at high speed or low speed, if you always keep 4 seconds between two cars, then either way a car passes every 4 seconds. So at a lower speed the road cannot “transport” more cars per time-unit.
However, there is another consequence of driving slower : the space you need in front of you diminishes : 4 seconds at 70 km/hour means that you need 77.8 meters, but at 120 km/hour you need 133.3 meters. So the effect of speed limitation is that the road can contain a lot more cars : 12.8 per kilometer at 70 km/hour, compared to only 7.5 per kilometer at 120 km/hour.
So either lowering the number of cars or limiting the speed leads to the same consequence : it prevents saturation of the roads. However, from the moment on that the road is saturated, the same traffic jam misery will start again.
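A quick back-of-the-envelope sketch of the figures above (the exact value at 70 km/hour is 12.86 cars per kilometer, truncated to 12.8 in the text) :

```python
# Check the figures above: distance covered in a 4-second gap,
# and how many cars fit on a kilometer of road at each speed.
GAP_S = 4

def gap_m(speed_kmh):
    """Metres travelled in 4 seconds at the given speed."""
    return speed_kmh / 3.6 * GAP_S

def cars_per_km(speed_kmh):
    return 1000 / gap_m(speed_kmh)

for v in (70, 120):
    print(f"{v} km/h: gap {gap_m(v):.1f} m, {cars_per_km(v):.1f} cars/km")
# Flow is the same at either speed: one car every 4 s = 900 cars per hour per lane.
```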

iii) A third solution would be to lower the distances between the cars without changing the speed. Sure, there would be a safety problem, unless everybody becomes an extremely alert driver (like the formula 1 people). A (future ?) solution is electronics. We can easily imagine a device with sensors to automatically keep a minimum distance. Instead of our automatic cruise control, we could switch to automatic distance control : this already exists ! I remember there has been an experiment like this with trucks, with a driver only in the first one, while the other trucks simply and automatically followed everything the first one did, just a few meters apart from one another. Here is a more recent article on a similar subject.
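As a toy illustration of such automatic distance control (my own sketch, not the actual truck experiment), here is a follower that adjusts its speed with a simple proportional rule until the time gap to the car in front settles at the 4-second target :

```python
# Toy model of "automatic distance control": the follower measures the
# time gap to the leader and corrects its own speed proportionally.
TARGET_GAP_S = 4.0
DT = 0.5          # simulation time step, seconds
GAIN = 0.5        # how aggressively the follower corrects its speed

def simulate(steps=400, leader_speed=20.0):
    """Leader cruises at 20 m/s; follower starts 200 m behind at 25 m/s."""
    leader_pos, follower_pos, follower_speed = 200.0, 0.0, 25.0
    for _ in range(steps):
        leader_pos += leader_speed * DT
        follower_pos += follower_speed * DT
        gap_s = (leader_pos - follower_pos) / max(follower_speed, 0.1)
        # speed up when the time gap is too large, brake when too small
        follower_speed += GAIN * (gap_s - TARGET_GAP_S) * DT
    return follower_speed, (leader_pos - follower_pos) / follower_speed

final_speed, final_gap_s = simulate()
```

A real system would of course use better sensors and a more sophisticated controller, but the principle is the same : measure the gap, correct the speed.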

And what about the “mystery of traffic jams” or “phantom traffic jams” ?
This is not really a mystery or a phantom, it’s just the result of a saturation of the road and the behaviour of the drivers.

Anyway : the best way not to get stuck in traffic jams is to stay at home !

Foltergeräte (torture devices)
Image via Wikipedia

Ben Goertzel tweeted the following 3 tweets today :

  • Option A: you are tortured (with no permanent damage) and then the memory of the torture is erased.
  • Option B: you are not tortured and then a false memory of torture is programmed into your brain.
  • Which do you choose, A or B?

No funny thoughts, rather one of those choices you really prefer never to have to make. But if YOU had to choose, which one would it be ? A or B ? Let me know please !

His first 9 responses were : A : 7, B : 3
My own response : B (no actual pain), but afterwards I would have myself hypnotized to remove the awful memories ! 🙂

Makes me wonder : after the fact, what is real ? The memories you have seem to be real, but if there is a way to put memories there without having experienced the real situation, then for you those memories correspond to the real situation.
I am sure there are ways to put memories in someone’s head ! A tough interrogation may result in the subject actually believing he was there, that he saw this or that, or that he actually did it ! (see these articles : (1) (2) (3))

Posted by: zyxo | November 1, 2009

Link list for october 2009

The Google Technology Stack
Make your web data disappear with vanish
Scientists develop nasal spray that improves memory
where are all the robots ?
computer program sketches faces of criminals
University of Southampton scientists develop computer telepathy(youtube)
Talk of Ben Goertzel on the singularity summit 2009
minimizing complexity in user interfaces
how to demo twitter
did dragons exist ?
did dragons exist (II)

New ‘consumer-intelligence’ technology will compile detailed profiles

the origin of new genes
The Number of Parallel Universes
The Past 5,000 Years Mark a New Epoch in Human Evolution
evolution in a bottle
nearby future view of real artificial intelligence
is the Higgs boson sabotaging its own discovery ?
if Matrix was programmed on windows XP
head or tails : not 50-50
Neuroimaging Of Brain Shows Who Spoke To A Person And What Was Said
The weirdest clouds you’ll ever see !
IBM’s twitter strategy
a robot skiing downhill
5 mind-blowing webstats you should know
what will the web look like in 5 years ?
muscle-based PC-interface

Enjoyed this post ? Then you might be interested in the following :
link list for september 2009
link list for august 2009
link list for june 2009
link list for may 2009

Posted by: zyxo | October 30, 2009

Where is your soul ?

Where is your soul located ?

(my working synonyms of soul : self, consciousness, spirit, identity).

First, and obvious answer : in your head.
According to Douglas Hofstadter in “I Am a Strange Loop”, this is not entirely true.

Explanation :
1. what is my soul ? A whole bunch of patterns in my brain (linked, hierarchical “thoughts”, patterns representing concepts). One of these patterns is special, because it groups everything that relates to “me”.
2. not every brain pattern that relates to me is in my own head. A whole lot is in the heads of my friends and my family, although not as vast as the pattern in my own head.
If the sum of everything that relates to me is my soul, then I am distributed over the heads of a lot of people.

Does this sound a bit crazy ?
After all, I have only one mind, and everything about me that is in the mind of somebody else is not “me” but what that other person thinks and knows about me.
So that is what I thought before I read the book.

But let us do a hypothetical experiment. (Douglas Hofstadter describes some experiments like that in his book, but this one here is my own).

Imagine one brain to start with, with twice the number of neurons of a normal brain.
Imagine we can manipulate physically each neuron as we like.
Imagine we take at random every second neuron and put it in a second (empty) head. When we finish, half of the neurons will be where they originally were, namely in the first head. The other half will be in the second head.
Imagine we left the original neuron-neuron connections intact, meaning that we replaced every “broken” connection by an artificial equivalent wireless connection.

The result :
Physically (or rather “locally”) we have two brains, each in its own head. Let us call them Adam and Eve.
Functionally, they are still the same original superbrain, because all neurons and connections are unchanged. In fact, we now have one brain with two bodies. What would this be like ? I assume that brain will control the two bodies, just like you and I control our two hands. Consequently there will be only one “me” (named AdamEve).

Now assume that some of the wireless connections are broken, or of lousy capacity, so that only part of the info is passed on from Adam’s neurons to Eve’s neurons and vice versa.
This means that all thoughts, concepts etc. formed only by Adam’s neurons will be stronger and clearer in Adam’s head than in Eve’s, and vice versa.
Result : the “shared” identity AdamEve will be weaker. At the same time two separate identities will probably develop : Adam and Eve.

Now suppose all wireless connections are replaced by words, sounds, expressions, gestures, emails, writings or whatever people in our real world use to communicate.
Result : the shared identity is very, very weak whereas the separated identities are very strong.
We all know such shared identities : a married couple, a football team, an army, a religion

But this is not what Hofstadter writes ! Instead of talking about shared identities, he speaks of pieces of identities that are scattered over the minds of many people. Or, if we only consider two people : the two separated identities live in the two heads.

Let us recapitulate :
if there is no connection, there are two completely separated identities.
If the “between-people” connection is as strong as the “intra-people” connection (as in my split-brain thought experiment), then according to Hofstadter we have two separate identities, living equally strongly in both heads. According to me, we have only one shared identity and no separate identities.
If the “between-people” connection is weaker than the “intra-people” connection, then according to Hofstadter we have two separated identities, each living in two heads, but the one living in its own head is stronger than the one living in the other’s head. According to me, we have two separated identities plus one weaker, shared identity.

Enjoyed this post ? Then you might be interested in the following :
– Web 5.0: The telepathic web
– Robotic insects or cyber-insects ?
Psychons : elementary particles of the mind
– Human brain copy protection by AnyMind Inc.
– Humans 2.0

Posted by: zyxo | October 26, 2009

Making hidden patterns visible

Data mining and other forms of analytics have one primary goal : making invisible patterns visible. The information is in the data, but it is invisible to you if you are not a “Homo analyticus”.
Some examples :

If this seems somewhat mysterious to you, there is a simple way to make it all visible. Although it is not really data mining, it is a somewhat funny way of showing what it means to see “hidden patterns made visible”.

Posted by: zyxo | October 19, 2009

Human evolution : amazingly fast !

Who said that humans stopped evolving because their technical means took over from natural selection ?

blue eyes

Apparently this is not true !

It seems that, on the contrary, since humans started spreading over the entire world some 50,000 years ago, their evolution has shifted to a higher gear.

Some examples of evolution that took place the last 50,000 years :

  • human skull structures of various ethnic groups evolved in different directions (remember, we all came out of Africa, looking more or less alike)
  • people living in the Tibetan mountains have a special gene which increases the oxygen level in the blood by 10%
  • Scandinavian people have blue eyes : no blue human eye existed before the last 10,000 years
  • sub-Saharan Africans have already developed 25 genes protecting them against malaria, a disease that is itself only 35,000 years old
  • a gene that enables people to digest lactose (a milk sugar) is 8,000 years old and only came into existence after people began to keep cows
  • and many others

Why this speeding up ?

As I wrote earlier, in order to have evolution you need (i) diversity, (ii) (re)production and (iii) selection by the environment.
All three have been present during the last millennia, so there is no reason why the human race should not still be evolving.

But there is something remarkable in our recent history : we came out of Africa and conquered the whole planet which means that :

  • the number of people grew considerably, and consequently so did our diversity
  • we started to live in environments very different not only from the African plains but also from each other, from the ice sheets of Greenland to the Sahara desert to modern Manhattan

So it is no wonder that this incredible shift in environments and density caused an incredible speeding up of our evolution. At least 7% of our genome has mutated recently (in the last 40,000 years).

Did you enjoy this post ? Then you might be interested in the following :
top-10 lists on evolution
The pope believes in evolution
Human evolution : the future of men
Evolution towards Intelligent Design
The end of evolution

Posted by: zyxo | October 7, 2009

Web 5.0 : computer telepathy ?

“Telepathy on the Horizon: New Interface Allows Brain-to-Brain Communication”



Is that so ?

I thought not.

First of all, if you did not read the article or watch the video yet, now is a good time to do so : (article, video)

What did they do ?

They connected brain A (a person who was thinking ‘lift left arm, lift right arm …’ to represent zeros and ones) to an EEG transmitter and then to a PC (PC1). PC1 was connected via the internet to another PC (PC2), which interpreted the transmitted brain patterns as ‘on’ or ‘off’ signals and used them to flash a light. The second person (brain B) saw the light; he was also connected to an EEG transmitter and then to a PC (PC3), which interpreted brain B's patterns to reproduce the original zeros and ones.

OK, good technology, but definitely not telepathy.

Because telepathy is “transferring knowledge (understood information) from one person's brain to another person's brain without using the normal means” (gestures, speech, writing … to send, and our five senses to receive). In the experiment the second person was not even aware of the information. He only saw the light flashing.
In his setup, Dr. Christopher James at the University of Southampton used only one direction of communication : “exporting” a meaningful pattern from the brain. He did this twice, once on each side of the communication.
These one-directional brain-computer interfaces have been around for several years now.

Real telepathy.

For real telepathy you should also be able to do it the other way around : put information back into someone’s brain without using this person’s senses. And that’s the tricky part. I am not aware of any experiment that managed to do such a thing.

Enjoyed this post ? Then you might be interested in the following :
– Web 5.0: The telepathic web
– Robotic insects or cyber-insects ?
– Self reassembling Robot
– Human brain copy protection by AnyMind Inc.
– Humans 2.0

Posted by: zyxo | October 2, 2009

Link list for September 2009

Here is my link list for September 2009.
Enjoy reading (but don't forget to take a look at my own writings too :-))

Evolution of Darwin’s “on the origin of species”
Eleven carbon removal projects to stop global warming
Sustainable fertilizer : urine and wood ash, to grow big tomatoes.
Cities are organized like human brains
10 Awesome Websites That Help You Discover the Best Web Apps
the computer/lawyer
Men Losing Their Minds Over Women
Human brain could be replicated in 10 years
Memories exist, even when forgotten
The Hottest Tweets on Twaxed.com
Viral video about social media revolution
did you know 4.0
Wall Street’s Math Wizards Forgot a Few Variables
Scoring with Social Media: 6 Tips for Using Analytics
The most important writing lesson :”Nobody wants to read your shit”
cracking the brain’s numerical code
Schrodinger’s cat experiment proposed
First online under water observatory
Website analysis : internal search site analysis
computer ants to fight computer worms
information is beautiful

Posted by: zyxo | September 29, 2009

Data Mining : What is a good lift ?

… ever modeled the lift of a targeting model ?

In a previous post, “Data mining for marketing campaigns : interpretation of lift”, I discussed the factors that influence the lift of a targeting model. Apart from the quality of the model, the lift is theoretically also influenced by
– the natural return = normal percentage of buyers among your customers during a specific period
– the size of your selection in % of the customer base
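As a reminder of the quantity being discussed (a generic sketch with made-up numbers, not figures from my models) : lift is simply the response rate in your selection divided by the natural return over the whole customer base.

```python
def lift(responders_in_selection, selection_size,
         responders_total, customer_base):
    """Lift of a targeting model : response rate in the selection
    divided by the natural return over the full customer base."""
    selection_rate = responders_in_selection / selection_size
    natural_return = responders_total / customer_base
    return selection_rate / natural_return

# A 10,000-customer base with a 2% natural return; the top-10%
# selection captures 60 of the 200 responders.
print(lift(60, 1_000, 200, 10_000))  # → 3.0
```

A lift of 3 means the targeted group buys three times as often as a random selection of the same size would.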

As a reaction to my post, Tim Manss, in his post I'll show you mine if you show me yours…, proposed to exchange lift figures in order to have something of a benchmark for checking the quality of targeting models.
It is indeed not easy to get these figures, because everybody wants to keep his or her secrets … well, secret.

So I decided to give away at least some info about the lift of my targeting models by calculating a model predicting their lift.

Here is what I did :

I took the lift figures of my models (a few dozen of them) together with the natural return and 4 different selection sizes : 10%, 5%, 1% and 0.5%.
And with this simple dataset I calculated a linear regression (I actually used the logarithms of these data).

What came out of it ?

– There was of course a lot of noise : R-squared = 0.45, which means that more than half of the variance is unexplained. It also means that different targets have different predictability.
– the natural return showed no statistically significant effect
– so the only relevant predictor is the selection size.

Here is the equation and the corresponding chart (lift on the vertical axis, selection size on the horizontal axis) :

ln(lift) = 3.06291 – 0.4829 * ln(selection_size)
Lift (vertical) as a function of the selection size (horizontal)
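The fit itself is easy to reproduce. Here is a sketch with numpy, using hypothetical (selection size, lift) pairs as stand-ins for the real campaign figures behind the post, plus a prediction from the fitted equation above :

```python
import numpy as np

# Hypothetical (selection size in %, lift) pairs -- stand-ins for
# the real model figures, for illustration only.
selection_size = np.array([10.0, 5.0, 1.0, 0.5, 10.0, 5.0, 1.0, 0.5])
lift = np.array([3.2, 4.6, 10.5, 15.0, 2.5, 3.8, 8.9, 12.1])

# Fit ln(lift) = a + b * ln(selection_size) by ordinary least squares.
X = np.column_stack([np.ones_like(selection_size), np.log(selection_size)])
a, b = np.linalg.lstsq(X, np.log(lift), rcond=None)[0]

# Smaller selections give higher lift, so the slope b comes out negative.
# Predict the lift of a 2% selection with the equation from the post :
predicted = np.exp(3.06291 - 0.4829 * np.log(2.0))
print(round(predicted, 1))
```

With the post's coefficients, a 2% selection is predicted to have a lift of roughly 15.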

So, I showed mine… what about yours ? 🙂

Other posts you might enjoy reading :
data mining with decision trees : what they never tell you
The top-10 data mining mistakes
Good enough / data quality
Data mining for marketing campaigns : interpretation of lift
Are you a good data miner ?

Two interesting articles of Gregory Piatetsky-Shapiro (KDnuggets) on lift modeling :
Measuring lift quality in database marketing
Estimating campaign benefits and modeling lift

Posted by: zyxo | September 19, 2009

Are you a good data miner ?

Tough question. What is a good data miner ?

One way of finding out is to look at the job descriptions, for example this one : Credit Suisse Data Miner Job Description

M. George distinguished five areas of expertise necessary to be a good data miner :

  1. techniques : to be able to do it
  2. analytics : to be able to decide what and how to do it
  3. business : to understand your customers
  4. communication : make your findings clear to others
  5. project management : manage everything and everyone from start to end

But all that still remains a bit abstract.
In what follows I will try to be more concrete.


Let us start with the data.

You have to be a bit of a detective just to find your data : find the people who know where the data is, find out how you can access it, find out who can give you access rights, and find out the key variables needed to join the various tables into one flat file …

Then you have to be a programmer to put all that info to use : SQL, SAS, BI tools, R, whatever it takes, not only to get your raw data but also to make it usable : what to do with missing values ? which derived variables will you calculate ? etc.
A lot of technical skills needed.

But there is not only the data, there is also the problem to solve. So you need to be an analyst.

As an analyst you have to make decisions about doing the things right and doing the right things :

  • take a step backwards, know where to start, where to stop
  • question everything : always ask yourself where you are wrong, not good enough, too complicated, not efficient enough, …
  • question everything : when they ask for numbers, ask them to explain their problem and how these numbers will solve it. Propose better, cheaper, nicer solutions …

And now comes the fun part : you have to be a number cruncher
You love data, charts, statistics (not the theory, but what you can do with it). You love to explain to people why something happens, to show them relationships between numbers, the conclusions that you derive from your numbers …
You know the data mining techniques and the statistical techniques : what you can and cannot do with them, their advantages and drawbacks, how to interpret the results, and how to present them in an understandable way (remember : the others are stupid and lazy, so you have to make it simple and easy).

Unfortunately there is also the business (profits, costs, ROI …)
They expect you to deliver usable results in a short time. An accountant must deliver numbers that are correct, a data miner is lucky : nothing has to be absolutely correct. When it is good enough, deliver ! (Think “Microsoft software quality” !).
They sometimes say : a data mining model is never finished, the data miner just stops working on it. This is very true, so keep it in mind and know when to stop and deliver !

Of course every data mining project is, well … a project. So you have to be a project manager too.
As a project is by definition something with a start and an end, you should have somewhere a description (accepted by all involved parties) of WHEN THE PROJECT CAN BE CONSIDERED FINISHED. This description is the only thing you need, because it has to contain all the conditions that have to be fulfilled (goals, deliverables, quality metrics …).

What helps you deliver more quickly is to stick to the following rule : do the same thing twice, but never a third time. This means that for anything you will have to do more than twice, you should find a way to get it done automatically : write a program, download a program, write an Excel macro, anything.
This means you also have to be a bit of a software engineer !
This automation/industrialisation holds for anything : data extraction, modelling, model result reporting, monitoring of your model quality, monitoring of the data quality, etc.

And last but not least : you have to be a learner.
Never think you know it all : always look for new ways, read articles, go to symposia, find out how others do it, and look for ways to deliver as much quantity and quality as possible without working too much 🙂

Other posts you might enjoy reading :
Oversampling or undersampling ?
data mining with decision trees : what they never tell you
The top-10 data mining mistakes
Good enough / data quality
Data mining for marketing campaigns : interpretation of lift

Posted by: zyxo | September 6, 2009

Data mining : use a gel to obtain ROC and LIFT

… or do they talk about another ROC and another LIFT ?

(image : roclift)

Other posts you might enjoy reading :
Data mining for marketing campaigns : interpretation of lift
Howmany inputs do data miners need ?
Oversampling or undersampling ?
data mining with decision trees : what they never tell you
The top-10 data mining mistakes

