Gathering Business Data? Be Careful, Mom is Watching – A Comment on Data Scraping and the Compulife Case

“Faced with trying to regulate our own personal conduct, we have to be content with suggestive questions, such as ‘would you be comfortable with this appearing in the front page news tomorrow morning?’ or… ‘what would your mother think if she were looking over your shoulder right now?’”

When people say that “data is the new oil,” they’re talking about new ways of creating wealth. No matter what business you’re in, success today depends on learning everything you can about your customers and competitors. And with so much information sloshing around the internet, every industry – from restaurants to manufacturers to sports teams – is busy extracting insights from “big data” analysis.

But, like drilling for oil, prospecting for data sometimes gets your hands dirty. Recently, a court ruled that a startup company providing life insurance quotes to consumers had created its database – the engine of its business – by taking data from an existing company (Compulife) that had built its own from scratch. The new company didn’t break in and steal the whole thing. Instead, it used robotic software to “scrape” the information from Compulife’s website, by pretending to be a member of the public – actually by pretending to be 43 million members of the public, which is how many rate quotes it was able to extract in only four days.

Having pumped out all that data, they were able to understand the competitor’s system and replicate it. When hauled into court, they shrugged their shoulders and pointed out that the source website was open to the public and they were just gathering what was readily available. Surely, they argued, this couldn’t be trade secret misappropriation because the information wasn’t secret. Not so fast, said the court. Compulife expected that real individual people, not swarms of automated “bots,” would be using their website. The data, it concluded, had been acquired by “improper means.”

Peter Toren, a fellow trade secret practitioner, recently penned a two-part article lamenting this decision. While I very much respect Peter’s views, on this one I firmly believe he was wrong and the court was right.

Whether or not information can be gathered from the internet this way is obviously important. But the issue is not so much about bots and data as it is about your Mom.

Stay with me here, you’ll see what I mean.

From Tents to Bots

Back in 1970, the DuPont company was building a new chemical plant. If a competitor could get into the building site and examine the layout, it could understand important aspects of DuPont’s secret processes. So, DuPont erected a fence around the perimeter, with guards and no-trespassing signs. One day the construction manager noticed a plane making multiple passes at an altitude low enough to read the registration number. It turned out that a rival company had hired the pilot to fly over the site and take pictures.

Faced with a lawsuit, the competitor claimed that the construction was in “plain view,” and that it had broken no laws. The judge wasn’t impressed. DuPont shouldn’t have to erect a tent over the worksite to prevent what he called “a school-boy’s trick.” This should be no surprise, he explained, because “our ethos has never given moral sanction to piracy” and the “marketplace should not deviate far from our mores.”

Four years later, the U.S. Supreme Court relied on the DuPont case in describing why we enforce trade secret rights. It said that the “maintenance of standards of commercial ethics and the encouragement of invention” are the twin policy pillars of trade secret law, reflecting the “necessity of good faith and honest, fair dealing” in business.

Five years after that, the first version of the Uniform Trade Secrets Act was published, and it defined misappropriation as including acquisition of information by “improper means.” The identical standard applies under the more recent federal law, the Defend Trade Secrets Act. And both of those statutes say that “improper means” “includes theft, bribery, misrepresentation, breach or inducement of a breach of a duty to maintain secrecy, or espionage through electronic or other means.”

In much of the IP world, we love bright lines and sharp edges. For example, to attack a patented invention for lack of novelty, it’s enough to find an academic paper covered with dust in an obscure library. Publication is sudden death. Predictability is highly valued.

Perhaps that’s why some IP lawyers find trade secret laws to be uncomfortable, because they are so, well – flexible. Perhaps this is why my friend Peter misread the Uniform Trade Secrets Act (UTSA) and Defend Trade Secrets Act (DTSA) as restricting “improper means” to a closed set of behaviors, rather than providing a list of examples, which the official comments to the UTSA describe as “a partial listing.” Perhaps that’s why he claimed that the Compulife case was the “first appellate decision in more than 50 years that has relied upon” the DuPont case, when the Supreme Court had leaned on it so firmly back in 1974.

Adding Bricks to the Edifice

Trade secret laws in the U.S. grow from our common law tradition, in which judges wrestling with novel arguments end up adding bricks to the edifice of principles. The foundation of it all, as the Supreme Court said, is the idea that business behavior should be ethical. And as we all know, ethics is highly contextual and situational. Faced with trying to regulate our own personal conduct, we have to be content with suggestive questions, such as “would you be comfortable with this appearing in the front page news tomorrow morning?” or – this is my favorite, and what I promised you earlier – “what would your mother think if she were looking over your shoulder right now?”

It’s not just the idea of “improper means” that imposes flexibility on trade secret law. Other key concepts are similarly driven by context. For example, we require that the trade secret holder have exercised “reasonable efforts” to maintain control over information it claims as a trade secret. We disallow protection for information that is “readily ascertainable,” but only when it can be ascertained “by proper means.” And we approve of reverse engineering (taking something apart to discover how it works), except when the thing was acquired unfairly.

None of this should be particularly troubling in the abstract, since we all (or the vast majority of us) want to be ethical actors. But the law keeps us on our toes with its ambiguity. Saving space to condemn creative thieves means that we risk getting in trouble if we go too close to the line, such as it is. This risk is made more complex by changing context. Today, DuPont would be out of luck trying to keep its construction site private, what with Google Earth and other satellite imagery.

Indeed, with rapid advances in technology we regularly introduce not only useful innovations to serve society, but also tools that can be used to capture another’s competitive advantage. The public-facing website resting on a large database gives us a good example of the conundrum. How do we balance the rights of those who want to make useful information available in limited ways against those who claim the right to use what can be found in plain sight?

Maintaining Competitive Advantage

As I’ve already explained, from the legal perspective, I think that the court in the Compulife case got it right, because what the startup did seemed unfair and improper. But how do we translate this modern version of the DuPont case into some guidelines for handling information in an age of ubiquitous data? What can owners of collections of useful data do in order to keep control of their competitive advantage?

First, where the commercial relationship is business to business, rely on carefully drafted contracts to limit the risk that the other party may misuse the information to which they’ve been given access.

Second, in a more public-facing environment, use not only restrictive EULAs (end user license agreements) but also technical measures to make data extraction difficult, at least where this is possible without degrading the usefulness of the product or service being offered.

Third, make it obvious to any user that you don’t want your data misused. Provide warnings that are impossible to miss, like the “no trespassing” sign hanging on the fence. If this ever turns into a legal fight, the court will likely be impressed by evidence that the defendant must have known he was stepping over a line.
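
For readers who want a concrete picture of what a “technical measure” might look like, here is a minimal sketch in Python of a per-visitor rate limit. It is an illustration only, not anything Compulife or the court described: the class name, the cap of 30 quotes per minute and the idea of keying on a visitor identifier are all assumptions made for the example. The point is simply that 43 million quotes in four days works out to roughly 124 requests every second, a pace no human visitor approaches, so even a generous cap separates people from bots.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical helper: allow at most `max_requests` per client
    within any rolling window of `window_seconds`."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behavior can be tested
        self._hits = defaultdict(deque)  # client id -> recent request times

    def allow(self, client_id):
        """Return True if this request is within the cap, else False."""
        now = self.clock()
        hits = self._hits[client_id]
        # Forget requests that have aged out of the rolling window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the cap: refuse, delay, or challenge
        hits.append(now)
        return True


# Assumed cap for illustration: 30 quotes per visitor per minute.
limiter = SlidingWindowLimiter(max_requests=30, window_seconds=60)

# The scraping pace described in the article, in requests per second.
SCRAPE_RATE_PER_SECOND = 43_000_000 / (4 * 24 * 60 * 60)
```

A cap like this is no substitute for the contracts and warnings described above. A determined scraper can spread its requests across many identities, which is one more reason to make the “no trespassing” sign impossible to miss.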

And what about those of you who are looking for creative ways to gather data? Whatever you’re thinking of doing, know that Mom is watching.




Join the Discussion

6 comments so far.

  • Raymond Van Dyke
    September 23, 2020 08:03 pm

    The above being said, I am not happy with the 11th Circuit’s handling of this case, and I fully agree that the issues of exactly what is a trade secret here and what are the boundaries of improper means were inadequately addressed in the opinion, creating the confusion!

  • Anon
    September 22, 2020 08:36 pm

    Mr. Van Dyke,

    And what exactly was the subverting of code?

    IF there was any actual subversion, you MAY have had a point.

    As it is, neither exists.

  • Raymond Van Dyke
    September 22, 2020 04:43 pm

    Jim: a great exegesis of the foundations of trade secret law – from the guru of trade secret law! I was thinking of the DuPont trade secret case recently. Flying over the site to subvert protections is akin to subverting the code to improperly obtain data. Great analysis.

  • Anon
    September 21, 2020 04:28 pm

    The “clearly at one bit at a time” is a truly meaningless distinction.

    If they are willing to put it into the public, they have NO right to constrain what they put into the public – no matter the “trickle rate.”

    My Mom would tell them to F off (paraphrasing, of course).

  • Jim Pooley
    September 21, 2020 02:30 pm

    Thanks for the comment. The problem with viewing this as involving data that the company “was willing to part with” is that they clearly were willing to part with only one bit at a time, in response to a human inquiry. The robotic tool used by the defendants faked that human inquiry 43 million times, allowing it in effect to glean the substance of the database otherwise hidden behind the website. Compulife certainly was not willing to part with that, but it’s what the defendant was able to do. As the Supreme Court pointed out in Kewanee v. Bicron, an essential pillar of trade secret law is the respect for “standards of commercial ethics” and the “improper means” element is the way that courts enforce that notion. Just because a restaurant sets up an all-you-can-eat buffet, that doesn’t justify, in any moral or legal sense, a competitor driving up and carting off all the food for the cost of one meal. If we’re going to encourage innovation, even very modest innovation of the sort that Compulife engaged in, we need to provide some protection for a sensible business model to implement it.

  • Anon
    September 21, 2020 11:55 am

    I really do not think that the “Mom is watching” is pertinent for the particular case of using bots to merely more quickly and efficiently gather what the company was willing to part with.

    I know that my Mom – a paragon of virtue that I would hold up to ANY Mom – would say that nothing untoward was done, and that the company PUTTING OUT THE INFORMATION had only itself to blame, as they had total control of what they let go.