Brian McGinty’s Web Logs: Static Web pages That Now Grow, Evolve, and Defend Against Attack is an interesting read.
“You’re working up your crew psychology report?”
-Dave Bowman to HAL the computer, realizing that HAL has been surreptitiously quizzing him. 2001: A Space Odyssey, 1968
In an age where a software component that can substitute for human labor instantly prices the human out of the labor market, it would be surprising if no one were trying to write such components. User profiles and usage patterns can be analyzed by “collaborative filters,” which tailor the presentation and selection of stories to individual users. Presently, news stories tend to be treated as discrete memes, but nothing prevents stories from being rewritten on the fly. This is hardly a new idea. As a student I was assigned to see how the same news was covered in different magazines. The story was about the election of Lech Walesa in Poland. The Canadian weekly newsmagazine Maclean’s wrote about the election results and the jubilation of the Polish people. The American weekly newsmagazine Newsweek wrote about the election results and the humiliation of the Communist Party.
Servers can already deliver different pages to search engines using a technique called “Agent Name Delivery.” This is how it works. Nearly every HTTP request carries a User-Agent field that typically gives the name of your browser. Search spiders, however, report their own name in this field. So a server can provide a spider with, for example, summary information about the page, which is what it is looking for anyway. On the other hand, when you send a request yourself, the server can read the IP address the request came from, “deduce” the sort of information you might be interested in when you hit the home page, and direct you to a dynamically created home page.
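To make the mechanism concrete, here is a minimal sketch in Python of how a server might choose what to send. It assumes the server has already pulled the User-Agent header and the client address out of the request. The function name, the list of spider tokens, the address prefix, and the page labels are all hypothetical, chosen only for illustration.

```python
# Illustrative fragments of spider names; real crawlers report many
# different strings, so this list is an assumption, not a registry.
KNOWN_SPIDER_TOKENS = ("googlebot", "bingbot", "slurp")

# Hypothetical rule: visitors from this documentation-only address
# range (192.0.2.0/24) get a tailored home page.
TAILORED_PREFIX = "192.0.2."


def choose_page(user_agent: str, remote_ip: str) -> str:
    """Return a label for the page variant to serve for this request."""
    agent = (user_agent or "").lower()

    # A spider names itself in the User-Agent field, so give it the summary.
    if any(token in agent for token in KNOWN_SPIDER_TOKENS):
        return "summary"

    # Otherwise guess at the visitor's interests from the client address.
    if remote_ip.startswith(TAILORED_PREFIX):
        return "home-tailored"

    return "home-default"
```

Calling `choose_page("Googlebot/2.1", "203.0.113.9")` yields the summary variant, while an ordinary browser string from `192.0.2.44` yields the tailored home page. A real deployment would need far more careful detection, since the User-Agent field is supplied by the client and can say anything.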
The use of software to build an “accurate user model” can draw on many sources, many of them far removed from your own computer. Older techniques such as matching keywords are now quaint. Keyword matching often fails due to synonymy (many ways of referring to the same concept) and polysemy (a word can have many meanings). Software using latent semantic indexing, term frequency rating, and Bayesian modeling of relevance distribution will spread instantly through the network if successful. As will the generation of software after that, and the generation of software after that.
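The two failure modes of keyword matching are easy to see in a few lines of Python. This is only a toy sketch: the documents, their names, and the `keyword_match` helper are invented for illustration, and it shows the problem rather than any of the newer techniques that address it.

```python
def keyword_match(query: str, document: str) -> bool:
    """Return True if every query word appears verbatim in the document."""
    doc_words = set(document.lower().split())
    return all(word in doc_words for word in query.lower().split())


# Three made-up documents standing in for indexed pages.
docs = {
    "auto_review": "the new automobile handles well on wet roads",
    "river_guide": "we fished from the bank of the river at dawn",
    "finance_news": "the bank raised its interest rate on loans",
}

# Synonymy: a search for "car" misses the page about an "automobile".
car_hits = sorted(name for name, text in docs.items()
                  if keyword_match("car", text))

# Polysemy: a search for "bank" returns both the river and the lender.
bank_hits = sorted(name for name, text in docs.items()
                   if keyword_match("bank", text))
```

Here `car_hits` comes back empty even though one page is plainly about cars, and `bank_hits` contains both `finance_news` and `river_guide` even though the searcher almost certainly meant only one of them.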