Data: Mann and Underweiser found 266 cases, all appealed from district courts, in which the Federal Circuit reached final validity decisions on 366 patents (40.2% of which were found valid). For each patent, they collected information such as the number of claims and references and the "alignment" between the claims and the specification (using a content-analysis algorithm to determine whether the specification mentions important terms from the claims). To control for time and technology type, each patent was compared with other patents issued within six months in the same technology class. Finally, they collected variables from the prosecution histories, such as the number of classes searched, the original number of claims in the application, information from any Information Disclosure Statements (IDS) that were filed, time in examination, the number of rejections and amendments prior to allowance, and the existence of foreign counterpart applications.
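To make the cohort comparison concrete, here is a minimal sketch of one way it could work. It is an assumption on my part, not the authors' actual procedure: I treat "compared with" as a z-score against same-class patents issued within roughly six months, and the `Patent` fields, the 183-day window, and the function names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean, pstdev


@dataclass
class Patent:
    number: str
    tech_class: str
    issued: date
    num_claims: int


def cohort(target, patents, window_days=183):
    """Other patents in the same class issued within ~six months of the target."""
    return [
        p for p in patents
        if p.number != target.number
        and p.tech_class == target.tech_class
        and abs((p.issued - target.issued).days) <= window_days
    ]


def standardized_claims(target, patents):
    """Z-score of the target's claim count relative to its cohort (assumed method)."""
    values = [p.num_claims for p in cohort(target, patents)]
    if len(values) < 2:
        return None  # too few peers to standardize against
    sd = pstdev(values)
    if sd == 0:
        return None  # all peers identical; z-score undefined
    return (target.num_claims - mean(values)) / sd


litigated = Patent("X", "A", date(2000, 1, 1), 30)
others = [
    Patent("P1", "A", date(2000, 3, 1), 10),
    Patent("P2", "A", date(1999, 10, 1), 20),
    Patent("P3", "B", date(2000, 1, 15), 50),  # different class, excluded
    Patent("P4", "A", date(2002, 1, 1), 99),   # outside the window, excluded
]
z = standardized_claims(litigated, others)
```

The same pattern would apply to any other per-patent count, such as the number of foreign references.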
Results: The authors seem to have thought carefully about their empirical strategy, including issues of collinearity and the influence of non-quantitative factors. They found a number of variables that seemed unrelated to validity, such as the number of inventors and the total number of claims. The following variables were correlated with validity, with significance at the 10% level or better (see Table 3 for more details):
- Positively correlated with validity: number of technology classes, number of IDS filings and references in those filings, number of rejections based on unlisted references, standardized number of foreign references.
- Negatively correlated with validity: existence of counterpart patents in the EPO and JPO, number of examiner-added U.S. references, post-examiner actions (continuations, BPAI decisions, or Rule 312 amendments), "distance" between words in specification and claims (based on content analysis), time since application date.
Implications: The authors discuss a few implications, but their most interesting proposal follows from the predictive power of variables that can be determined before a patent leaves the PTO. Given that predictive power, they argue that "the optimal system for maximizing the likely validity of issued patents is one in which both applicants and the PTO have incentives to cooperate in developing and improving applications of high quality," rather than one in which patent applicants have an "entitlement perspective" and the PTO has "only limited and indirect incentives to improve the quality of applications." Here are some of the authors' specific recommendations:
- Improved drafting: PTO software could measure textual "alignment" between the specification and claims, and poorly aligned applications could be rejected under § 112, charged higher fees, or put on a slower track.
- Claims: Reforms should not be based on number of claims (because this is uncorrelated with validity).
- Applicant search: Thorough prior art searches by applicants should be encouraged.
- Continuations: The data support efforts to rein in continuation practice.
- BPAI: It seems "perverse" that patents issued after BPAI review are about 50% more likely to be invalid; "further inquiry into the reliability of the Board's determinations is appropriate."
- Amendments: Because patents subject to Rule 312 amendments are much less likely to be valid, "it might be appropriate to scrutinize patents subject to those amendments much more carefully."
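For a rough sense of what software measuring textual "alignment" might compute, here is a toy sketch. It is not the authors' content-analysis algorithm, which this summary does not describe in detail. It simply assumes alignment is the share of distinctive claim terms that also appear in the specification, and the stopword list and function names are my own inventions.

```python
import re

# Hypothetical stopword list: common English words plus claim boilerplate.
STOPWORDS = {"the", "and", "for", "with", "said", "wherein", "comprising", "having"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def alignment_score(claims, specification):
    """Fraction of distinctive claim terms that the specification mentions."""
    claim_terms = {
        t for t in tokenize(claims)
        if t not in STOPWORDS and len(t) > 2
    }
    if not claim_terms:
        return None  # nothing to compare
    spec_terms = set(tokenize(specification))
    return len(claim_terms & spec_terms) / len(claim_terms)


claims = "A widget comprising a flange and a sprocket."
spec = "The widget includes a flange mounted on the housing."
score = alignment_score(claims, spec)  # "sprocket" never appears in the spec
```

Under this toy measure, a low score plays the role of a large "distance" between the specification and the claims. A real tool would presumably need stemming, synonym handling, and some weighting of which terms count as important.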
I'm glossing over a number of details here, but the paper is relatively short and very easy to read, so for more details (as well as the authors' speculations on why they observed the correlations they did), I recommend downloading the paper. Or if you have a specific question, feel free to put it in the comments, and I'll see if I can answer it.