We evaluated millions of patents, and forward citations were consistently the biggest predictor of high-value patents. In our last article we discussed why forward citations are relevant, and the importance of remaining patent term. Now we’d like to consider the remaining three factors we use to rank patents, and why they can help eliminate less useful patents quickly and efficiently.
Independent Claim Count (Adjusted by Means Claims)
We hypothesized that paying for additional claims (three are included in the basic filing fee) would be highly correlated with value. Our analysis focused on looking at claim counts for four primary sets of patents: (i) a set of all issued patents from 2005-2014, (ii) a set of litigated patents from the same period, (iii) a set of patents from the brokered market that were sold from 2009-2014, and (iv) the representative patents from brokered patent packages.
As predicted, having more than three claims was highly correlated with the probability of the patent being litigated, sold, or listed as the representative patent for a sales package, i.e. the most important patent in the package.
We again modeled this ranking factor by comparing the prevalence of each claim count in the litigated patents (set ii) with its prevalence in the larger set of US issued patents (set i).
However, we know that the number of independent claims alone is an insufficient measure if, for example, all of the independent claims are written as means-plus-function claims (35 USC §112(f)). At least in the United States, given the present case law, such claims generally have less value for our clients.
We analyzed the prevalence of means claims in our data sets (sets i-iv discussed above) and then developed an adjustment factor for the claim count rank based on the number of means claims. By analyzing the different data sets, we arrived at an adjustment in which a means claim generally has 1/10th the value of a non-means claim. We did, however, provide an exception: if there were at least 5 independent non-means claims, no adjustment was made to the claims rank.
We then back-tested this ranking on approximately 5000 randomly selected patents with issue dates from 2005-2014 and examined the distribution of the new ranking factor. Notably, this ranking factor lowers the rank of only ~12-13% of patents.
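The adjustment described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' actual formula: the function name and parameters are hypothetical, and it assumes that "no adjustment" under the 5-claim exception means every independent claim, means or not, counts at full value.

```python
def adjusted_independent_claims(non_means: int, means: int,
                                means_weight: float = 0.1,
                                exemption_threshold: int = 5) -> float:
    """Return an independent-claim count adjusted for means claims.

    Assumption: a means claim counts as `means_weight` (1/10) of a
    non-means claim, unless the patent already has at least
    `exemption_threshold` (5) independent non-means claims, in which
    case all independent claims count in full.
    """
    if non_means < 0 or means < 0:
        raise ValueError("claim counts must be non-negative")
    if non_means >= exemption_threshold:
        # Exception from the text: no adjustment is applied.
        return float(non_means + means)
    return non_means + means_weight * means


# Two ordinary independent claims plus three means claims.
print(adjusted_independent_claims(2, 3))
# Five non-means claims trigger the exception, so nothing is discounted.
print(adjusted_independent_claims(5, 2))
```

How the adjusted count is then mapped onto a rank is not specified here, so the sketch stops at the count itself.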
Claim 1 Word Count
Historically, our ranking heuristic viewed claim 1 word count as one of the more significant ranking factors and put a heavy emphasis on shorter claims. However, when we analyzed the multiple data sets (sets i-iv discussed above), there was no significant variation between any of the sets that are proxies for higher value (litigated, sold, representative patent) and the baseline set of all patents.
Instead, we now realize that claim 1 word count is better used as a filter that removes applications with extreme word counts from consideration. We used the data from litigated patents (set ii) as a guide in removing extreme claim 1 word counts.
Thus, the new ranking factor heavily down-ranks patents whose claim 1 is shorter than 25 words or longer than about 250 words. We identified a range of 63-163 words as the sweet spot for the length of claim 1 in litigated patents. (Note: in a future version of the ranking system we might evaluate the shortest independent claim.)
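One way to picture a factor with these thresholds is a piecewise function. The sketch below is only an assumed shape: the text gives the cutoffs (25 and about 250 words) and the sweet spot (63-163 words), but not the curve between them, so the linear ramps, the 0-to-1 scale, and the function name are all illustrative choices.

```python
LOW_CUTOFF, SWEET_LOW, SWEET_HIGH, HIGH_CUTOFF = 25, 63, 163, 250


def claim1_word_count_score(words: int) -> float:
    """Score claim 1 length on an assumed 0.0-1.0 scale.

    Assumptions (not from the article): lengths outside the cutoffs
    score 0.0, the sweet spot scores 1.0, and the score ramps
    linearly between a cutoff and the nearer edge of the sweet spot.
    """
    if words < LOW_CUTOFF or words > HIGH_CUTOFF:
        return 0.0
    if SWEET_LOW <= words <= SWEET_HIGH:
        return 1.0
    if words < SWEET_LOW:
        return (words - LOW_CUTOFF) / (SWEET_LOW - LOW_CUTOFF)
    return (HIGH_CUTOFF - words) / (HIGH_CUTOFF - SWEET_HIGH)


for n in (10, 44, 100, 300):
    print(n, claim1_word_count_score(n))
```

A hard zero outside the cutoffs is stricter than "heavily down-ranks"; a small non-zero floor would fit the description equally well.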
Family Size and International Filings
Does family size matter when looking for the better patents? Intuitively, family size and diversity of international filings should be good indicators of value. We hypothesized that, as with independent claim count, the investment needed to produce a larger patent family and file international patents would correspond to greater value. However, we found the impact was less significant than even the word count of claim 1: only a 10% contribution to the overall weighting.
Our new ranking system provides a maximum of 10 points for family size and international filing size:
- Up to 5 points for family size, scaled linearly over a family size of 0 to 12 (a family with more than 12 INPADOC publications is treated as having 12 publications)
- Multiply the family size rank by:
  - 2 if there is an issued EP, JP, or CN patent
  - 5 if there is a published EP, JP, or CN patent
  - 25 if there is a PCT publication and it is <2.75 years from priority
  - 25 if <1.75 years from priority (adjust for risk of no data)
  - 1 otherwise
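The scoring rules in this list can be sketched as follows. The multiplier values are copied verbatim from the list; everything else is an assumption made for illustration: the `FamilyInfo` fields, the choice to apply the first matching condition in list order, and clamping the result to the stated 10-point maximum.

```python
from dataclasses import dataclass

MAX_FAMILY_SIZE = 12   # larger families are treated as 12 publications
MAX_POINTS = 10.0      # stated maximum for this factor

# Multipliers copied verbatim from the list; the keys are hypothetical names.
MULTIPLIERS = {
    "issued_ep_jp_cn": 2,
    "published_ep_jp_cn": 5,
    "pct_under_2_75_years": 25,
    "under_1_75_years": 25,
    "otherwise": 1,
}


@dataclass
class FamilyInfo:
    inpadoc_publications: int
    has_issued_ep_jp_cn: bool = False
    has_published_ep_jp_cn: bool = False
    has_pct_publication: bool = False
    years_from_priority: float = 99.0


def family_multiplier(info: FamilyInfo) -> float:
    """Pick the multiplier; assumes the first matching rule wins."""
    if info.has_issued_ep_jp_cn:
        return MULTIPLIERS["issued_ep_jp_cn"]
    if info.has_published_ep_jp_cn:
        return MULTIPLIERS["published_ep_jp_cn"]
    if info.has_pct_publication and info.years_from_priority < 2.75:
        return MULTIPLIERS["pct_under_2_75_years"]
    if info.years_from_priority < 1.75:
        return MULTIPLIERS["under_1_75_years"]
    return MULTIPLIERS["otherwise"]


def family_score(info: FamilyInfo) -> float:
    """Up to 5 base points scaled linearly, times the multiplier, capped."""
    size = max(0, min(info.inpadoc_publications, MAX_FAMILY_SIZE))
    base = 5.0 * size / MAX_FAMILY_SIZE
    return min(MAX_POINTS, base * family_multiplier(info))


print(family_score(FamilyInfo(12, has_issued_ep_jp_cn=True)))
print(family_score(FamilyInfo(6)))
```

With the full 5 base points, only multipliers of 2 or less stay within the 10-point maximum on their own, which is why the sketch applies the cap as a final step.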
Let’s begin by making it clear that these metrics needed to be combined based on weighting factors to create a balanced total score. While doing this, there were two major considerations. A properly weighted system should create a large ranking spread between interesting and uninteresting patents, but it should also use a mix of the metrics in order to give a more rounded perspective.
We limited the weighting factor for each metric to between 10% and 60%. We then repeatedly ranked sets of random patents and known valuable sets with more than 400 different weighting factor possibilities. By comparing the possibilities that had the largest spread between the median patent ranks of each set, we were able to see trends. We averaged the top 10 weighting factor possibilities to get our baseline factors, and then adjusted these slightly upon a manual review.
We then tested the system against smaller sets of patents that we had previously reviewed. The automated ranking system consistently ranked the focus patents of each set highly. This confirmed that the automated ranks would allow us to quickly identify the patents most likely to be useful and to eliminate a number of less interesting patents.
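A search of this kind can be sketched as a small grid search. The code below is an illustration under stated assumptions, not the procedure actually used: it assumes five factors, each already scored on a 0-to-1 scale, a 10% grid step, and the difference in median total score as a stand-in for the spread in median ranks. All function names are hypothetical.

```python
from itertools import product
from statistics import median


def weight_grid(n_factors, lo=10, hi=60, step=10):
    """Yield weight tuples (in percent) that sum to 100 within [lo, hi]."""
    levels = range(lo, hi + 1, step)
    for combo in product(levels, repeat=n_factors):
        if sum(combo) == 100:
            yield combo


def total_score(factor_scores, weights):
    """Weighted sum of one patent's per-factor scores."""
    return sum(s * w / 100 for s, w in zip(factor_scores, weights))


def median_spread(valuable, baseline, weights):
    """Median score of the valuable set minus that of the baseline set."""
    return (median(total_score(p, weights) for p in valuable)
            - median(total_score(p, weights) for p in baseline))


def baseline_weights(valuable, baseline, n_factors, top_n=10):
    """Average the top_n weightings with the largest median spread."""
    ranked = sorted(weight_grid(n_factors),
                    key=lambda w: median_spread(valuable, baseline, w),
                    reverse=True)
    top = ranked[:top_n]
    return [sum(w[i] for w in top) / len(top) for i in range(n_factors)]


# Toy data only: the two sets differ on the first factor alone.
valuable = [(1.0, 0.5, 0.5, 0.5, 0.5), (0.9, 0.5, 0.5, 0.5, 0.5)]
baseline = [(0.1, 0.5, 0.5, 0.5, 0.5), (0.0, 0.5, 0.5, 0.5, 0.5)]
print(baseline_weights(valuable, baseline, n_factors=5))
```

A 10% step over five factors produces 126 candidate weightings, so reproducing a search over more than 400 possibilities would need a finer step or a different sampling scheme. The averaged weights would still be subject to the manual adjustment described above.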
We set out to use the USPTO data on issued US patents (formerly hosted on Google Books but now directly hosted by the USPTO at https://data.uspto.gov/uspto.html) to refine our ranking system to provide a fully transparent, data-based ranking that can intuitively be explained to clients.
We successfully built a parser for the USPTO XML data set, using it to analyze the characteristics of US patents (issuing from 2005-2014) and compare different subsets of that data. This included leveraging our unique database of over $7B worth of brokered patents, allowing us to quickly highlight those of most interest to our buying clients.
The following table summarizes our ranking factors with Excel-like formulas:
Join the Discussion
5 comments so far.
angry dude, March 31, 2016 09:49 am
This is not a theory but a fact of life.
They will not file their own patents building or improving upon a blocking patent they willfully infringe in their products (otherwise they would have to cite that patent in their application)
They simply pretend that patent does not exist
If that patent is on some necessary component for their technology they will cite any other alternative way to implement that component or simply will not include any description of that component in their patent applications for integrated products
When eventually sued for patent infringement they’ll make round eyes and tell you they’ve never heard of you or your stinking patent
This is how the system works from my own experience – I talked to many litigation lawyers: infringers will never admit any knowledge of the patent they willfully infringe – therefore, no forward citation
And yes, most patents from Apple, MS and such aren’t worth the paper they are written on: the watering down of the US patent system was intentional and was going on for many years
TJM, March 31, 2016 04:50 am
The problem with this theory is that any patent they obtain would be easily invalidated, to the point that the patents aren’t worth the paper they’re printed on. Moreover, they could never enforce the patents because they would likely face serious sanctions. Most companies don’t want to throw away money prosecuting worthless patents that are too risky to enforce. I’m not saying that *no* company engages in these types of tactics, just that based on my own experience this is not the norm.
angry dude, March 29, 2016 11:21 pm
What I mean is this: suppose you are a large technology aggregator-infringer like Apple
Apple uses great many patented technologies in their products (many of them without license from rightful patent owners) and also files their own patent applications in huge quantities
In their own patent filings they make sure they don’t cite any outside patents they (willfully) infringe in their products
Other tech aggregators do the same
It’s a very simple concept
This applies to patents on components (e.g. speech/video compression algos) not to patents on entire products as a whole (if there is such animal today – any high-tech gadget is covered by at least a thousand patent claims from many different patents)
TJM, March 29, 2016 09:01 pm
Could you please explain what you mean by, “Infringers don’t cite patents they don’t like?” In what capacity do infringers “cite” anything? You don’t have to file a patent application to be an infringer; in fact, usually the opposite is true. Also, there is value in addressing the most relevant art during prosecution rather than allowing it to create a cloud over the patent’s validity by not citing it, so I think that most practitioners are inclined to cite the most relevant art they know about.
angry dude, March 27, 2016 08:24 pm
This analysis is total BS
Patent values have dropped at least 5 times in the last 5 years, making it economically foolish to file for patents in the first place (e.g. average sale price of a solid tech patent was reduced from say 500K to 100K whereas the prosecution and maintenance costs went up significantly to e.g. 30-40K)
Forward citation means nothing – infringers don’t cite patents they don’t like, they try to avoid them at all costs – only examiners do, if they come up in their stupid text searches
I can go on and on – I’ve been through all of this – The Ocean Tomo’s valuations all the way to the demise of ICAP Patent Brokerage reduced to a shop for behind a curtain secret corporate transactions (for laughable amounts of money)
The recent Google’s patent palooza paid 250K at most for whatever patents they deemed “highly valuable” to them
Patent system is dead