HOWTO: Information Privacy

There is a general rule I like to keep in mind when writing code: I don't need or want the user to give me any information that the code doesn't need in order to work. That is to say, something like a username and password is /sometimes/ needed for identification and must be remembered by both a system and its users. Information like a credit card number, home address, name, or even email address is nonessential to most applications... though some may think otherwise.

But why bring this up? Well, information is simply not being handled this way. As a user, why must I put trust into a company on the Internet, an essentially anonymous group, with any personal information?

Well, there's the case. Now how can information be handled in any other way? Take the simplest example, an email address. It's understandable that a website wishes to store the address as a way to contact you if you forget your password. It is the standard anyhow. But there are obvious ways that the server end could misuse even this simple information, such as selling it to spam lists.

As I see it, there are three ways to fight this: legally, non-legally, or by making misuse impossible. The legal route is the current method - a privacy policy. At present, it is ineffective, as nobody reads it and it is usually subject to change at any time. The "non-legal" method covers the case where information can be misused, but the means of controlling this is not a legal one. One example is "the market": if the server side misuses information, make that public and the service will lose popularity. This is unreliable... many people simply do not care about their information until something bad happens (SEE: actions on social networking sites affecting job opportunities OR complaints about Google caching images and articles from sites that intended they only be publicly accessible for a short period of time). Another example of the non-legal method is an outright illegal, threat-based one. If information is misused, then a band of superhackers will summon their botnet and do a distributed denial of service attack. There are obvious problems here also.

But the third option is to withhold information from services, making it impossible to misuse it. Imagine there were a password retrieval service whose only function is to be sent lost passwords and the sites associated with them. Spamming such a site is simply ineffective. If a password is forgotten, it is sent to this service, and you can retrieve it from the service.

There *is* another way, which is a mixture of the different privacy methods. Imagine a new email address is created for each account you create on different websites. If one account begins getting spam, you know which website to blame. This changes the game a bit (yes, spam can be seen as a game between the buyer (spammer) and seller of information): the social (and possibly legal) consequences of selling an email address are drastically increased. The spammer's payoff is cut drastically as well, since a single-purpose email account can simply be shut off.
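The one-address-per-site idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `AliasBook` class, its method names, and the `mail.example` domain are made up for the example, and a real setup would also need a mail server that delivers every alias to one inbox.

```python
import secrets


class AliasBook:
    """Tracks one throwaway email alias per website (hypothetical helper)."""

    def __init__(self, domain):
        self.domain = domain
        self.site_by_alias = {}  # alias -> the only site that was given it

    def create_alias(self, site):
        # A random local part, so one alias reveals nothing about the others.
        alias = f"{secrets.token_hex(8)}@{self.domain}"
        self.site_by_alias[alias] = site
        return alias

    def blame(self, alias):
        """Return the site that was given this alias (None if unknown or cut)."""
        return self.site_by_alias.get(alias)

    def revoke(self, alias):
        """Cut a single-purpose address; mail sent to it is no longer accepted."""
        self.site_by_alias.pop(alias, None)

    def accepts(self, alias):
        return alias in self.site_by_alias


book = AliasBook("mail.example")
shop_alias = book.create_alias("shop.example")
forum_alias = book.create_alias("forum.example")

# Spam shows up at shop_alias: we know who leaked it, and we cut it off.
leaker = book.blame(shop_alias)
book.revoke(shop_alias)
```

After the `revoke` call the forum address keeps working, which is the whole point: cutting one alias costs the user nothing and costs the spammer the address he paid for.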

But it's ridiculous to have so many email addresses. Now imagine each email address in the previous example corresponds to a public/private key pair. Also imagine an email protocol in which every message must be encrypted to a specific recipient. In fact, a user need not have an identifiable address. This example has the same dynamics as the previous example, but with much less hassle and the benefits of increased security.

This is similar to a concept I had for an anonymous chat like Omegle, but with a persistent buddy list. A unique key pair could be created for each new buddy. There would be no way for Andrew's friend Beth to give Carl Andrew's screen name without either informing Andrew (so he can create a new key pair and get Carl's) or giving Carl access to her account (where Carl uses Beth's key to communicate with Andrew through Andrew's existing public key).

Indeed, there should also be a way to pay with a credit card without giving card information to the selling website. Once again, it's simple. The bank creates a unique, one-time-use transaction ID when requested to by the buyer. The buyer, using his private key, signs the cost value and the transaction ID (with one signature) and sends it to the bank. The seller does the same, sending its own signature over the ID and cost to the bank as well. The bank has prior knowledge of the card number (since it issued the card in the first place) and knows the buyer's public key. It cross-checks the cost of the sale and approves/denies the transaction accordingly (and based on the other standard factors). No credit card information is spread.
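Here is a rough Python sketch of that exchange, with one big assumption stated up front: the standard library has no public-key signatures, so `sign` below uses an HMAC with a secret shared between the bank and each party as a stand-in. In the scheme described above the bank would hold only public keys. The `Bank` class and all of its method names are hypothetical.

```python
import hashlib
import hmac
import secrets


def sign(key: bytes, txn_id: str, cost_cents: int) -> str:
    """Stand-in 'signature': one HMAC over the transaction ID and the cost."""
    message = f"{txn_id}|{cost_cents}".encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


class Bank:
    def __init__(self):
        self.keys = {}     # party -> key (assumption: shared secret, not a public key)
        self.pending = {}  # unused transaction ID -> buyer who requested it

    def register(self, name):
        key = secrets.token_bytes(32)
        self.keys[name] = key
        return key

    def new_transaction(self, buyer):
        txn_id = secrets.token_hex(16)
        self.pending[txn_id] = buyer
        return txn_id

    def settle(self, txn_id, seller, buyer_claim, seller_claim):
        """Each claim is (cost_cents, signature). Approve only if both agree."""
        # One-time use: the ID is consumed here, whether or not we approve.
        buyer = self.pending.pop(txn_id, None)
        if buyer is None or seller not in self.keys:
            return False
        buyer_cost, buyer_sig = buyer_claim
        seller_cost, seller_sig = seller_claim
        buyer_ok = hmac.compare_digest(
            buyer_sig, sign(self.keys[buyer], txn_id, buyer_cost))
        seller_ok = hmac.compare_digest(
            seller_sig, sign(self.keys[seller], txn_id, seller_cost))
        return buyer_ok and seller_ok and buyer_cost == seller_cost


bank = Bank()
buyer_key = bank.register("buyer")
seller_key = bank.register("seller")

txn = bank.new_transaction("buyer")  # the buyer passes this ID on to the seller
approved = bank.settle(
    txn, "seller",
    (1999, sign(buyer_key, txn, 1999)),
    (1999, sign(seller_key, txn, 1999)),
)
```

Notice that the card number never appears anywhere in the code. The seller only ever sees a transaction ID that is useless once it has been settled.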

Finally, shipping. A home address can act the same as an email address. A home can have many address IDs associated with it; each sender knows only one ID, which should be unique to that sender. The post office scans the number, its database does a lookup, and the package is sent to the correct location. The only downside here is that human-readable addresses would not be used as much. The upside is that they could obviously still be used.
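The post office side amounts to a lookup table. A minimal sketch, assuming a hypothetical `PostOffice` class and a made-up ID format (twelve hex characters):

```python
import secrets


class PostOffice:
    """Maps opaque address IDs to physical addresses (hypothetical sketch)."""

    def __init__(self):
        self.home_by_id = {}    # address ID -> physical address
        self.sender_by_id = {}  # address ID -> the one sender it was issued to

    def issue_id(self, home, sender):
        address_id = secrets.token_hex(6).upper()
        self.home_by_id[address_id] = home
        self.sender_by_id[address_id] = sender
        return address_id

    def route(self, address_id):
        """What the scanner does: look up where the package really goes."""
        return self.home_by_id.get(address_id)


office = PostOffice()
home = "12 Example Lane"
id_for_bookstore = office.issue_id(home, "bookstore")
id_for_pharmacy = office.issue_id(home, "pharmacy")

# Two different IDs, one destination; neither sender learns the street address.
destination = office.route(id_for_bookstore)
```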

I have a real problem with the people who think they know what is best for my information. Internet2 is a prime example. I do not trust any university to store and share my information appropriately. An organization which serves to profit will not act in my best interest unless forced to do so. This is fact. This is why laws change to keep the market intact. But I do not trust laws, and I do not want to deal with or rely on them when I am wronged.

The game of privacy should be in the people's favor. Privacy should not be treated as an afterthought.

-- Alex

2010.01.08 - last edit >> Fri, 08 Jan 2010 13:06:29 -0500