Business Ethics and Data Mining

University of Washington, 12/7/14

Overview of Modern Data Mining

Data mining is a core component of any successful company today. If you want to make smart business decisions, you need to rely on hard facts and statistics instead of the (often flawed) assumptions of the past. Companies that use customer data are able to adjust their decisions based on real market trends, and customers then have better access to products they want. However, as people’s lives shift online, it is often unclear where our personal data ends up. Popular tech giants such as Apple, Google, and Facebook use the personal data we provide them and sell it to advertisers, who can then target potential customers more effectively. Internet law is still in its infancy and struggles to keep pace, as the Web grows and matures much faster than the legal system. This has allowed companies (such as those mentioned above) to provide all kinds of data to whoever pays for it, without fear of legal repercussions. This clearly raises privacy issues, and it should come as no surprise that people are concerned about their personal information floating around cyberspace.

As data-centered business strategies continue to develop and businesses continue to exploit these developments for profit, it is necessary now more than ever to pass tight legislation and promote transparency within companies to protect customer privacy and safety. 


The phrase is mentioned all too often, but just what exactly is data mining? To put it simply, data mining is the process of analyzing large masses of data and interpreting the information they provide in order to make better business decisions and to accomplish goals such as increasing revenue, cutting costs, or both. After a company accumulates large amounts of data, software or entire computers are dedicated to processing it to find meaningful trends that can give the business an advantage over its competitors. In recent years, however, the uses of data mining have become more complex. As online consumers demand more personalized experiences, their personal information is now being used to make web browsing more relevant to them. For example, if you use Google to search for a certain type of jacket, Google will use your previous searches, along with information such as your age, gender, and location, to match you with the results that its algorithms predict will be best for you.

But it doesn’t stop there: the next time you search for clothing, the results returned to you will be slightly more tailored to your budget, and the items you see will reflect what people similar to you have purchased or looked for. Soon, Google (and many other companies, including Yahoo and Facebook) realized that this would be a great tool for advertising as well. If the ads that you see on a website are simply whichever goods or services the website wants to post, then the average user will hardly give them a second glance, let alone consider a purchase. However, if the website recognizes your computer and knows your preferences, then the ads can instantly be custom-tailored to your tastes. That means that instead of random banners for products you have no interest in, you see products that you would actually consider buying. This makes the sales process better not only for customers, who are more likely to find things they might want, but also for companies, who can now promote their products to the people who actually want them. In theory, it’s a mutually beneficial system.
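To make the idea of custom-tailored ads concrete, here is a minimal sketch in Python. It is only an illustration and not how Google or any other company actually selects ads: the function name rank_ads, the interest tags, and the sample ads are all hypothetical, and real systems use far more signals than a simple tag overlap.

```python
def rank_ads(user_interests, ads):
    """Order ads by how many of the user's interest tags each one shares."""
    def overlap(ad):
        return len(user_interests & ad["tags"])

    # Keep only ads with at least one shared tag, best match first.
    relevant = [ad for ad in ads if overlap(ad) > 0]
    return sorted(relevant, key=overlap, reverse=True)


# Hypothetical profile built from past searches (illustrative only).
user_interests = {"jackets", "hiking", "budget"}

# Hypothetical ad inventory, each ad labeled with topic tags.
ads = [
    {"name": "Luxury watches", "tags": {"watches", "luxury"}},
    {"name": "Discount rain jackets", "tags": {"jackets", "budget"}},
    {"name": "Trail boots", "tags": {"hiking"}},
]

ranked_names = [ad["name"] for ad in rank_ads(user_interests, ads)]
print(ranked_names)
```

In this toy example, the ad with no connection to the user's interests is dropped entirely, which mirrors the contrast drawn above between random banners and ads for products a user would actually consider.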





The continued free use of these websites is not without its caveats. Yes, you can use services like Google, Facebook, or iPhone apps and never have to worry about paying for them. That does not mean, however, that you escape providing valuable information in exchange for the convenience. Tech companies argue that this data collection is essential to business, and enables them to provide services that would be impossible otherwise. Not only do consumers get more accurate search results and more personalized advertising, but they can also have more intimate connections with family, friends, and acquaintances when companies have some information on them. Take Rapleaf, a company that compiles profiles of Internet users, as an example. CEO Auren Hoffman says, “I don’t like people tracking my location, but I want to know: ‘what are some nearby Italian restaurants that my friends have liked’” (Menn 2012).

This system shares personal data more liberally, but allows users to connect with friends in new ways. Spend some time delving into the Terms & Conditions and any company will essentially tell you that your data is about to become their data too. It’s worth bearing in mind that this is not a one-sided exchange. It is a perfectly legitimate business model, so long as customers are never put at risk by the actions of the company in question, or at the very least are fully informed about how their data is going to be used. To look at it from a consequentialist standpoint, the actions of the company are permissible if, and only if, the customers are able to make an informed decision regarding their use of the product and are not put at risk, financially or otherwise, through their use of it.


Example Case

This is where Google faces problems. Although they have some of the best security and privacy settings online today (many people have qualms about providing personal information online due to the risk of it being hacked and retrieved), they have come under fire in recent years for not being up front about how customer data is actually being used. When it comes to selling data, aside from special protection for medical records, credit report information, and a few other narrow categories, almost anything is fair game. Google, among others, has faced legal threats and backlash from users after violating their own published privacy policies to varying extents.

The responses of companies under fire typically follow the same route, one that has allowed them to continue without much new legislation. They over-reach and get caught, then promise to improve. In the event that public outrage is greater than normal (on a level that would likely encourage new legislation), companies in the industry present new forms of self-regulation to counteract it. These measures include the publication of privacy policies or additions to their Terms & Conditions, neither of which is often read by users. This cycle has repeated countless times with all manner of companies involved in Big Data usage, and it leaves the majority of web surfers wandering blindly, often completely oblivious to the information they’re giving up in order to use their favorite sites.



Unfortunately, there are not many foreseeable alternatives to such use of personal data. Companies like these make a large portion of their money from it, and otherwise gain virtually no revenue from their customers. What this means for the common user is that if users contributed nothing to the company in exchange for the service, these websites and applications would have to charge a fee for use. The fee would be what you pay for your own privacy and security. However, if companies like Google or Facebook suddenly began charging for their services, they would quickly lose customers. Additionally, technology tends to follow the law of inertia. Once a trend is up and rolling, it is difficult if not impossible to reverse it. Information technology has become such an integral and commonplace aspect of business today that the demand for data would likely shut down any resistance.

While Google is known for their cutting-edge security, the majority of companies that handle personal (and financial) information are not as advanced. Although it is not currently realistic, the riskiest aspects of the issue could be eliminated with higher security standards and better data encryption. With universally better security and strict control of information flow, users could go online safely, without having to worry about their data falling into the wrong hands (such as those with malicious intentions).



Given how quickly technology changes, it is difficult for the legal system to keep up, so it becomes necessary to create a more long-term solution. The root of the problem lies in two core issues: 1) customers are not informed enough about how their data is being used before they consent to its usage, and 2) customer data is leaked by some companies, sometimes knowingly and illegally, sometimes through security flaws or carelessness. A series of complex solutions (legislative and otherwise) would be required to fully solve these two issues, so given an incomplete legal knowledge, only simplified solutions for the legislative aspect can be presented here.

When it comes to the first issue of customers not being informed enough, it’s clear that the methods used by tech companies to convey information are incredibly ineffective. For years there has been a running joke amongst software and web users that no one actually reads the Terms & Conditions given by companies. The legal documents are lengthy and often hard to decipher. One of the first major changes that needs to be made is for these documents to be presented again in an alternative form where users can clearly see any stipulations that exist regarding their use of the service. Anything that isn’t the universal standard should be clearly shown so that users are fully aware of where their information might go, along with any other relevant information. While it’s easy for a company to simply change their standard practice, without a legally binding reason to follow it, there is nothing stopping a company from secretly breaking their own rules.


The next issue that needs to be dealt with is more challenging. As mentioned under the “Alternatives” section, security flaws are difficult to address, as they often stem from a lack of adequate talent or resources dedicated toward ensuring the security of the company’s data. It would be unrealistic today to impose some kind of security standard on companies, partly because hackers are constantly coming up with new methods and tricks, and partly because many companies lack the resources required to create such sophisticated security systems. Luckily, Google has reached the point where their systems are so advanced that they offer a job and a $1 million reward to anyone who can hack their systems. So as far as the two major issues go, this is essentially a non-issue for Google. For other companies though, a more realistic option would be to require some kind of data storage that doesn’t leave sensitive information in places where hackers can gain access to it. Beyond that, and possibly incentivizing companies to improve their security systems, there are not many legal options available to improve personal and financial security online.

Outside of legislative solutions, there are clearly changes Google can make to increase their transparency and inform their users better. Google is not only one of the most commonly used web services in the world, but the company also prides itself on its ability to connect users with information. In fact, their mission statement reads “Google’s mission is to organize the world’s information and make it universally accessible and useful” (Google 2014). Obviously, this refers to information beyond their company, but in order to truly embody that mission, they should make their own information more accessible and useful to the world – or more specifically, they should ensure that their customers are fully aware of what they are getting themselves into before making the decision to provide personal information to the company. This begins by translating their massive privacy policy documents into more consumable bites that can be quickly understood.

To be fair, Google has started doing this in some respects. In recent years, their privacy policy updates have included brief summaries on their site for those who are curious. While some of us will spend a few minutes reading through these, most people do not have the patience or interest, and will not take the time needed. This poses an entirely new question: does the responsibility to keep the customer informed of a company’s practices rest entirely on the shoulders of the company? At what point is it the user’s fault for blindly agreeing to a set of conditions that might affect them negatively down the road? Google has actually had to publicly clarify their practices when misinformed customers made accusations. When their advertising practices became common, people realized that their Gmail accounts were not entirely private, and they accused Google of reading their emails. Obviously this would be an invasion of privacy, so Google had to clarify on their FAQ that “they do not ‘read’ your email per se. For use in targeted advertising on their other sites, and if your email is not encrypted, software (not a person) does scan your mail and compile keywords for advertising. For example, if the software looks at 100 emails and identifies the word ‘Doritos’ or ‘camping’ 50 times, they will use that data for advertising on their other sites” (Google 2014).
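The kind of automated keyword scan described in that FAQ can be illustrated with a short Python sketch. This is an assumption-laden toy, not Google's actual software: the watch list AD_KEYWORDS, the function count_ad_keywords, and the sample emails are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical watch list of advertising keywords (illustrative only).
AD_KEYWORDS = {"doritos", "camping"}


def count_ad_keywords(emails, keywords=AD_KEYWORDS):
    """Count how many times each watched keyword appears across the emails."""
    counts = Counter()
    for body in emails:
        # Lowercase and split into words so "Doritos" and "doritos" match.
        words = re.findall(r"[a-z']+", body.lower())
        counts.update(word for word in words if word in keywords)
    return counts


# Made-up sample messages standing in for a user's mailbox.
sample_emails = [
    "Going camping this weekend, bring Doritos!",
    "Camping gear is on sale.",
    "Lunch on Friday?",
]

keyword_counts = count_ad_keywords(sample_emails)
print(keyword_counts)
```

Note that the sketch only tallies words from a fixed list and never stores or displays the messages themselves, which is the distinction the FAQ draws between software compiling keywords and a person reading your mail.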



Data mining brings up all kinds of issues, particularly for the uninformed. Without knowing how companies are using your data, it’s easy to be concerned about your personal privacy and financial security. In an ideal situation, a business can use your data to give you more relevant results, media, and ads, and in return can sell that information to trustworthy companies who can better advertise to you as an individual – and all the while, you’re aware of how the process is working. However, the greater public is only now beginning to realize how extensive these databases are, and how much of their personal information is out there. On one hand, people might object to this information being so public, but on the other hand, we’re beginning to reach an age where personal privacy is a sacrifice we make in exchange for better access to information. When you are not paying for the service you use, you are essentially the product being sold. If you are unwilling to give up something such as data, your alternative is to pay cash (and lose what is for some a convenience: relevant advertisements). As long as companies involved in data mining make clear how they intend to use the data, and keep those promises, there should not be any more ethical issues regarding data mining.




Chengji. Data Mining. Digital image. Data Mining and Knowledge Discovery. N.p., n.d. Web. Nov. 2014.

"Company – Google." Company – Google. N.p., 2014. Web. Nov. 2014.

Frand, Jason. "Data Mining." Data Mining: What Is Data Mining? UCLA, n.d. Web. Nov. 2014.

Menn, Joseph. "Online Privacy Fears Stoked By Google, Twitter, Facebook Data Collection Arms Race." The Huffington Post, 19 Feb. 2012. Web. Nov. 2014.

Stein, Joel. "Data Mining: How Companies Now Know Everything About You." TIME. TIME Magazine, Mar. 2011. Web. Nov. 2014.