IN RE: Modafinil Antitrust Litigation Mylan Laboratories, Inc.; Mylan Pharmaceuticals Inc.; Ranbaxy Laboratories, Ltd; Ranbaxy Pharmaceuticals, Inc., Appellants
“The class action is an ingenious device for economizing on the expense of litigation and enabling small claims to be litigated. The two points are closely related. If every small claim had to be litigated separately, the vindication of small claims would be rare. The fixed costs of litigation make it impossible.” Thorogood v. Sears, Roebuck and Co., 547 F.3d 742, 744 (7th Cir. 2008). But not every group of plaintiffs should be granted class action status, because “[t]he class action is an ‘exception to the usual rule that litigation is conducted by and on behalf of the individual named parties only.’ ” Wal–Mart Stores, Inc. v. Dukes, 564 U.S. 338, 348, 131 S.Ct. 2541, 180 L.Ed.2d 374 (2011) (quoting Califano v. Yamasaki, 442 U.S. 682, 700-01, 99 S.Ct. 2545, 61 L.Ed.2d 176 (1979)).
When thinking of a class action brought under Rule 23(b)(3), we typically think of a large aggregation of individuals (hundreds or even thousands), each with small claims. This case is quite different from that. Here, we are faced with a putative class of twenty-two large and sophisticated corporations, most of which have multi-million dollar claims, who wish to take advantage of the class action device. While we do not foreclose the possibility of class status in this case, or where the putative class is of similar composition, Plaintiffs have not met their burden of showing that the numerosity requirement of Rule 23(a)(1) has been satisfied. We now provide a framework for district courts to apply when conducting their numerosity analyses, and we will remand to the District Court to allow such an analysis in this case.
A. Regulatory Framework
The 1984 Drug Price Competition and Patent Term Restoration Act (the “Hatch–Waxman Act”), 98 Stat. 1585, as amended, provides a regulatory framework designed in part to (1) ensure that only rigorously tested drugs are marketed, (2) incentivize drug manufacturers to invest in new research and development, and (3) encourage generic entry into the marketplace. The Hatch–Waxman Act requires a drug manufacturer wishing to market a new brand-name drug to first submit a New Drug Application (“NDA”) to the federal Food and Drug Administration (“FDA”), and then undergo a long, complex, and costly testing process. See 21 U.S.C. § 355(b)(1) (requiring, among other things, “full reports of investigations” into safety and effectiveness; “a full list of the articles used as components”; and a “full description” of how the drug is manufactured, processed, and packed); see also F.T.C. v. Actavis, Inc., –––U.S. ––––, 133 S.Ct. 2223, 2228–29, 186 L.Ed.2d 343 (2013) (describing the statutory framework). If this process is successful, the FDA will grant the drug manufacturer approval to market the brand-name drug. After this approval, a generic manufacturer can obtain similar approval by submitting an Abbreviated New Drug Application (“ANDA”) that “shows that the generic drug has the same active ingredients as, and is biologically equivalent to, the brand-name drug.” Caraco Pharm. Labs., Ltd. v. Novo Nordisk A/S, ––– U.S. ––––, 132 S.Ct. 1670, 1676, 182 L.Ed.2d 678 (2012) (citing 21 U.S.C. §§ 355(j)(2)(A)(ii), (iv)). This way, a generic manufacturer is not required to undergo the same costly approval procedures to develop a drug that has already satisfied the FDA. Actavis, 133 S.Ct. at 2228 (“The Hatch–Waxman process, by allowing the generic to piggy-back on the pioneer's approval efforts, ‘speed[s] the introduction of low-cost generic drugs to market,’ thereby furthering drug competition.” (quoting Caraco, 132 S.Ct. at 1676)).
The FDA will not give final approval to produce a generic version of a drug that is entitled to non-patent exclusivity under the Hatch–Waxman Act, and it “cannot authorize a generic drug that would infringe a patent.” Caraco, 132 S.Ct. at 1676. Thus, among other things, an ANDA's approval will depend on “the scope and duration of the patents covering the brand-name drug.” Id. Brand manufacturers are required to include the patent number and expiration date of the patent that covers the drug or that covers a method of using that drug in their NDAs, which are then published by the FDA in the Orange Book, more formally known as the Approved Drug Products with Therapeutic Equivalence Evaluations. Id. (citing 21 U.S.C. § 355(b)(1) and 21 C.F.R. §§ 314.53(c)(2)(ii)(P)(3), (3) (2011)). Once a patent has been listed in the Orange Book, the generic manufacturer is free to file an ANDA if it can certify that its proposed generic drug will not actually violate the brand manufacturer's patents. Id. Under 21 U.S.C. § 355(j)(2)(A)(vii), there are four ways in which a generic manufacturer can make this certification:
(I) that such patent information has not been filed,
(II) that such patent has expired,
(III) of the date on which such patent will expire, or
(IV) that such patent is invalid or will not be infringed by the manufacture, use, or sale of the new drug for which the application is submitted.
An ANDA with a paragraph IV certification may only be filed after the expiration of the fourth year of the New Chemical Entity (“NCE”) five-year exclusivity period.1 21 U.S.C. § 355(j)(5)(E)(ii). The “ ‘paragraph IV’ route[ ] automatically counts as patent infringement.” Actavis, 133 S.Ct. at 2228 (citing 35 U.S.C. § 271(e)(2)(A)). As a result, this often “means provoking litigation” instituted by the brand manufacturer. Caraco, 132 S.Ct. at 1677.
If the brand manufacturer initiates a patent infringement suit, the FDA must withhold approval of the generic for at least 30 months while the parties litigate the validity or infringement of the patent. Actavis, 133 S.Ct. at 2228 (citing 21 U.S.C. § 355(j)(5)(B)(iii)). If the suit has concluded at the end of this 30-month period, then the FDA will follow the outcome of the litigation. Id. However, if the litigation is still proceeding, the FDA may give its approval to the generic drug manufacturer to begin marketing a generic version of the drug. Id. The generic manufacturer then has the option to “launch at risk,” meaning that if the ongoing court proceeding ultimately determines that the patent was valid and infringed, the generic firm will be liable for lost profits despite the FDA's approval. C. Scott Hemphill, Paying for Delay: Pharmaceutical Patent Settlement as a Regulatory Design Problem, 81 N.Y.U. L. Rev. 1553, 1609 (2006).
In order to incentivize a generic drug manufacturer to challenge weak patents, the Hatch–Waxman Act provides that the first generic manufacturer to file a paragraph IV certification will enjoy a 180-day exclusivity period. 21 U.S.C. § 355(j)(5)(B)(iv). This means that during this exclusivity period, “no other generic can compete with the brand-name drug,” Actavis, 133 S.Ct. at 2229, an opportunity that can be “ ‘worth several hundred million dollars,’ ” to the first-filer, id. (quoting Hemphill, supra, at 1579).2 It is during this generic exclusivity period that the “vast majority of potential profits for a generic drug manufacturer materialize.” Id. (internal quotation marks omitted). That is because once the exclusivity period has expired other generic manufacturers are free to enter the market, bringing the price down to competitive levels. Importantly, this 180-day exclusivity period belongs only to the first generic manufacturer to file; if the first-filer forfeits its exclusivity rights, no other generic manufacturer is entitled to it. Id. (citing 21 U.S.C. § 355(j)(5)(D)).
In April 1997, the United States Patent and Trademark Office issued U.S. Patent No. 5,618,845 (“the ′845 patent”) to Cephalon, Inc. (“Cephalon”), a pharmaceutical company. The ′845 patent claimed a specific particle-size distribution of modafinil, a wakefulness-promoting agent used to treat narcolepsy and other sleep disorders, and Cephalon later applied for a reissue of the patent, resulting in the issuance of U.S. Reissue Patent No. 37,516 (“the ′516 patent”) in January 2002. Thus, Cephalon's use of modafinil was protected by a patent until October 6, 2014, to be later extended until April 6, 2015.
In December 1998, the FDA approved Cephalon's NDA for the brand-name drug Provigil and granted it NCE exclusivity. This five-year period of exclusivity was extended until December 24, 2005, due to Provigil's status as an orphan drug.3 In March 2006, Cephalon obtained pediatric exclusivity, which added six months of exclusivity. 21 U.S.C. § 355a(c). Thus, in the absence of the ′516 patent, Cephalon's exclusivity period for modafinil would have ended on June 24, 2006.
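By way of illustration only, the date arithmetic in the preceding paragraph can be checked with a short sketch. It relies solely on the dates stated above; the helper name `add_months` is hypothetical, and treating the pediatric extension as six calendar months is an assumption about how the period is counted.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    Hypothetical helper: the day of month is clamped to the last
    day of the target month (e.g. Aug 31 + 6 months -> Feb 28/29).
    """
    total = start.year * 12 + (start.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


# End of the orphan-drug-extended exclusivity period, as stated in the text.
orphan_exclusivity_end = date(2005, 12, 24)

# Pediatric exclusivity adds six months (assumed to be calendar months).
pediatric_exclusivity_end = add_months(orphan_exclusivity_end, 6)

print(pediatric_exclusivity_end.isoformat())
```

Under those assumptions the computed date agrees with the June 24, 2006 date given above.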
On December 24, 2002, the first day that an ANDA for modafinil could be filed, four generic drug manufacturers—Teva Pharmaceutical Industries, Ltd. and Teva Pharmaceuticals, USA, Inc. (collectively “Teva”); Ranbaxy Laboratories, Ltd. and Ranbaxy Pharmaceuticals, Inc. (collectively “Ranbaxy”); Mylan Pharmaceuticals, Inc. and Mylan Inc. (collectively “Mylan”); and Barr Laboratories, Inc. (“Barr”)—each independently filed an ANDA with paragraph IV certifications seeking to sell generic modafinil products. Due to FDA guidance promulgated after the paragraph IV certifications in this case were filed, all four generic manufacturers were treated as being the first filer, and thus all four would have shared in the 180-day exclusivity period, making it less valuable to each individual generic manufacturer. See Guidance for Industry on 180-Day Exclusivity when Multiple Abbreviated New Drug Applications are Submitted on the Same Day, 68 Fed. Reg. 45252, 45255 (Aug. 1, 2003).
Because the filing of the paragraph IV certification “automatically counts as patent infringement,” Actavis, 133 S.Ct. at 2228 (citing 35 U.S.C. § 271(e)(2)(A)), Cephalon sued the four generic manufacturers for patent infringement in the District of New Jersey on March 28, 2003. While motions for summary judgment were pending, Cephalon entered into what are known as “reverse-payment settlements”4 with each of the four generic manufacturers. First, Cephalon settled with Teva on December 9, 2005. This agreement ended the patent litigation between Cephalon and Teva, and as a result Teva was granted a license to sell modafinil in October 2012, which was before the expiration of Cephalon's patent but several years later than Teva could have entered the market if it had launched its generic “at-risk.” In exchange for its agreement to settle, Teva was paid millions of dollars to stay out of the market via royalty agreements, supply agreements, and other contractual provisions. Importantly, the only term of the deal that was publicized was what is known as the “contingent launch provision.” This provision allowed Teva to enter the generic modafinil market if any other company entered the market for any reason.
Almost two weeks later, on December 22, 2005, Ranbaxy entered into a similar reverse-payment settlement agreement with Cephalon on slightly less favorable terms, but also with a contingent launch provision. Again, the contingent launch provision was publicized via press release. Two weeks after the Ranbaxy settlement, on January 9, 2006, Mylan entered into a similar agreement—on less favorable terms than Ranbaxy—but also with a publicized contingent launch provision. The final remaining paragraph IV filer, Barr, settled on the least favorable terms on February 1, 2006. It too had a contingent launch provision, which was publicized as well. Because no subsequent paragraph IV filer would be entitled to the 180-day exclusivity period, there was no incentive for another generic manufacturer to unilaterally bear the litigation expenses for the reward that it would have to share with any other generic manufacturer who wanted to enter the market. See 21 U.S.C. § 355(j)(5)(D).5
The Direct Purchaser Plaintiff (“DPP”) putative class, appellees in this case, filed suit on April 27, 2006, alleging a global conspiracy involving Cephalon and all four generic defendants under 15 U.S.C. § 1; four separate conspiracies between Cephalon and each generic defendant under the same statute; and a monopolization claim against Cephalon under 15 U.S.C. § 2. The DPP class is made up of wholesalers who purchased Provigil directly from Cephalon.6
The District Court, with the full support of the parties, ordered that motions regarding class certification were not to be filed until after fact and expert discovery and the motions for summary judgment had been filed. Thus, the DPP class did not file its motion for class certification until May 12, 2014, after more than eight years of litigation. Approximately one month later, on June 23, 2014, the District Court granted summary judgment in favor of all of the defendants on the DPP class' global antitrust conspiracy claim. Over the next 13 months, several letter motions were filed and hearings were held on the class certification issue, and the District Court certified the DPP class on July 27, 2015. During this period, Cephalon, Teva, and Barr settled with the DPP class for $512 million on April 17, 2015. A settlement class, which has the exact same composition as the putative DPP class at issue here, was also certified on July 27, 2015, and the settlement itself was approved by the District Court on October 15, 2015. Thus, the only defendants remaining at the time of the DPP certification decision being appealed were Ranbaxy and Mylan (collectively “Defendants”).
Defendants challenge two aspects of the District Court's class certification decision—numerosity and predominance. Thus, even though other issues were contested at the District Court level, we focus only on the District Court's numerosity analysis under Rule 23(a)(1) and its predominance analysis under Rule 23(b)(3).
Plaintiffs argued before the District Court that the class was comprised of twenty-two members. Defendants challenged the inclusion of four of these members. Thus, the District Court began its numerosity analysis by determining the proper class size because “relevant precedent makes significant distinctions between classes containing more than twenty class members and those containing twenty or fewer.” King Drug Co. of Florence, Inc. v. Cephalon, Inc., 309 F.R.D. 195, 204 (E.D. Pa. 2015). Defendants challenged two class members' inclusion solely for numerosity purposes because they were partial assignees of two other class members. They argued that counting the partial assignees would essentially allow the DPP class to “double dip” and artificially inflate the class size. Defendants next challenged the inclusion of a class member that ceased operations before generic entry actually occurred, arguing that there was no way to know whether it would have actually purchased generic modafinil. Lastly, Defendants challenged the class status of a class member that only purchased branded Provigil from Cephalon after generic modafinil had already entered the market. Defendants argued that there was no overcharge as a result of this. The District Court rejected all of Defendants' challenges to the class size. Id. at 204–06.
The District Court next considered whether joinder of these twenty-two class members was impracticable such that class certification was appropriate under Rule 23(a)(1). While the District Court acknowledged that a class of twenty-two members was small compared to most class actions, the District Court found persuasive several district court cases in the reverse-payment settlement context with similarly-situated classes where the numerosity requirement was found to be satisfied. Id. at 204 (collecting cases).
In analyzing whether joinder was impracticable, the District Court examined five factors: “(1) judicial economy, (2) geographic dispersion, (3) financial resources of class members, (4) the claimant's ability to institute individual suits, and (5) requests for injunctive relief that could affect future class members.” Id. at 203–04 (quoting In re Wellbutrin XL Antitrust Litig., No. 08–2431, 2011 WL 3563385, at *3 (E.D. Pa. Aug. 11, 2011)). The District Court placed great weight on the judicial economy factor, with particular emphasis on the late stage of the litigation. Specifically, the Court stated: “Considering the extensive history of this litigation and the exhaustive discovery that has been conducted, . . . judicial economy is best served by trying this case as a class action. Joinder of the absent class members would likely require additional rounds of discovery, which would only further delay a trial date.” Id. at 206–07.
Relatedly, the District Court also expressed the concern that if the class was not certified at this late date, unnamed class members would bring individual suits in other jurisdictions instead of seeking to be joined in the suit before it. Id. at 207 (“Further, if cases were brought within other jurisdictions, additional discovery is certainly a possibility, and separate trials could result in inconsistent verdicts.”). The other primary factor that the District Court found to weigh in favor of numerosity was the geographic dispersion of the class members, who were spread out over thirteen states and Puerto Rico. Id.
On the other hand, the Court noted that some factors weighed against class certification. First, the class members' vast financial resources weighed against certification, as each was a sophisticated corporation. Id. The District Court also looked to their incentive to bring individual claims, stating that the class members' ability to bring individual suits generally weighed against certification, but equivocating somewhat because the six class members with claims below $1 million “likely do not have the same incentive to engage in costly antitrust litigation on their own.” Id. It is not clear what weight was ultimately placed on the parties' financial incentive to bring suit, and the District Court appeared to treat this as either a neutral factor or one that weighed in favor of Defendants. Ultimately, the District Court held that the requirements of Rule 23(a)(1) were satisfied and the class was sufficiently numerous such that joinder was impracticable. Id.
The District Court next addressed Defendants' predominance argument that, after the District Court's grant of summary judgment on the global conspiracy claim, common issues of law and fact did not predominate over individualized inquiries under Rule 23(b)(3). Id. at 209. Defendants argued that the damages model of Plaintiffs' expert, Dr. Leitzinger, no longer matched Plaintiffs' theory of liability because it did not isolate the harm caused by each individual reverse-payment settlement. Defendants claimed that this mismatch is analogous to the problem at issue in the Supreme Court's decision in Comcast Corp. v. Behrend, ––– U.S. ––––, 133 S.Ct. 1426, 185 L.Ed.2d 515 (2013). Related to this argument, Defendants relied upon the doctrine of antitrust standing to support the view that, in the absence of a global conspiracy, each class member would have to show which agreement harmed him, and that this would necessarily be an individualized inquiry. The District Court rejected these arguments, concluding that Plaintiffs had antitrust standing and that the doctrine of joint and several liability was appropriate. Thus, it concluded that Comcast was not controlling because Dr. Leitzinger's damages model “matches Plaintiffs' remaining theory of liability and impact.” Id. at 214.
“The class action is an exception to the usual rule that litigation is conducted by and on behalf of the individual named parties only.” Wal–Mart, 564 U.S. at 348, 131 S.Ct. 2541 (internal quotation marks omitted). In order to justify this exception to the rule, “every putative class action must satisfy the four requirements of Rule 23(a) and the requirements of either Rule 23(b)(1), (2), or (3).” Marcus v. BMW of N.A., LLC, 687 F.3d 583, 590 (3d Cir. 2012). In order to satisfy Rule 23(a), a plaintiff must show:
(1) the class must be “so numerous that joinder of all members is impracticable” (numerosity); (2) there must be “questions of law or fact common to the class” (commonality); (3) “the claims or defenses of the representative parties” must be “typical of the claims or defenses of the class” (typicality); and (4) the named plaintiffs must “fairly and adequately protect the interests of the class” (adequacy of representation, or simply adequacy).
In re Cmty. Bank of N. Va., 622 F.3d 275, 291 (3d Cir. 2010) (quoting Fed. R. Civ. P. 23). Rule 23(b)(3), which is the basis for certification here, “requires that (i) common questions of law or fact predominate (predominance), and (ii) the class action is the superior method for adjudication (superiority).” Marcus, 687 F.3d at 591 (quoting In re Cmty. Bank of N. Va., 622 F.3d at 291). “The party seeking certification bears the burden of establishing each element of Rule 23 by a preponderance of the evidence.” Id.
We have held that “the decision to certify a class calls for findings by the court, not merely a ‘threshold showing’ by a party, that each requirement of Rule 23 is met,” and that “[f]actual determinations supporting Rule 23 findings must be made by a preponderance of the evidence.” In re Hydrogen Peroxide Antitrust Litig., 552 F.3d 305, 307 (3d Cir. 2008). In addition, a court “must resolve all factual or legal disputes relevant to class certification, even if they overlap with the merits—including disputes touching on elements of the cause of action.” Id. Class certification will thus be “proper only ‘if the trial court is satisfied, after a rigorous analysis, that the prerequisites’ of Rule 23 are met.” Id. (quoting Gen. Tel. Co. of Sw. v. Falcon, 457 U.S. 147, 161, 102 S.Ct. 2364, 72 L.Ed.2d 740 (1982)); see also Newton v. Merrill Lynch, Pierce, Fenner & Smith, Inc., 259 F.3d 154, 166 (3d Cir. 2001) (“A class certification decision requires a thorough examination of the factual and legal allegations.”).
Thus, while a district court “possesses broad discretion to control proceedings and frame issues for consideration under Rule 23,” such discretion “does not soften the rule [that] a class may not be certified without a finding that each Rule 23 requirement is met.” Hydrogen Peroxide, 552 F.3d at 310. This is particularly true because, acknowledging the practicalities of class litigation, we have said that class certification “is often the defining moment in class actions (for it may sound the ‘death knell’ of the litigation on the part of plaintiffs, or create unwarranted pressure to settle nonmeritorious claims on the part of defendants).” Newton, 259 F.3d at 162.
“We review a class certification order for abuse of discretion, which occurs if the district court's decision rests upon a clearly erroneous finding of fact, an errant conclusion of law or an improper application of law to fact.” Hydrogen Peroxide, 552 F.3d at 312 (internal quotation marks omitted). Although Defendants raise the issue of predominance first, the requirements of Rule 23(a) are “threshold requirements,” Amchem Prods., Inc. v. Windsor, 521 U.S. 591, 613, 117 S.Ct. 2231, 138 L.Ed.2d 689 (1997), and we therefore address them first.
Rule 23(a)(1) sets forth what is commonly known as the numerosity requirement. The text is, however, conspicuously devoid of any numerical minimum required for class certification. Instead, the rule simply states that the numerosity requirement is satisfied when “the class is so numerous that joinder of all members is impracticable.” Fed. R. Civ. P. 23(a)(1). “Impracticable does not mean impossible,” Robidoux v. Celani, 987 F.2d 931, 935 (2d Cir. 1993), and refers rather to the difficulties of achieving joinder. This calls for an inherently fact-based analysis that requires a district court judge to “take into account the context of the particular case,” thereby providing district courts considerable discretion in making numerosity determinations. Pa. Pub. Sch. Emps. Ret. Sys. v. Morgan Stanley & Co., 772 F.3d 111, 120 (2d Cir. 2014). A district court abuses that discretion, however, when it considers issues that have no place in the numerosity requirement. Hydrogen Peroxide, 552 F.3d at 312. In this case, the District Court abused its discretion by improperly emphasizing the late stage of the proceeding and by not considering the ability of individual class members to pursue their cases through the use of joinder.7
While “[n]o minimum number of plaintiffs is required to maintain a suit as a class action,” our Court has said that “generally if the named plaintiff demonstrates that the potential number of plaintiffs exceeds 40, the first prong of Rule 23(a) has been met.” Stewart v. Abraham, 275 F.3d 220, 226–27 (3d Cir. 2001); see also Robidoux, 987 F.2d at 936 (“[T]he difficulty in joining as few as 40 putative class members should raise a presumption that joinder is impracticable.”). At the other end of the spectrum, the Supreme Court has stated in dicta that a class of fifteen was “too small to meet the numerosity requirement.” Gen. Tel. Co. of the Nw., Inc. v. EEOC, 446 U.S. 318, 331, 100 S.Ct. 1698, 64 L.Ed.2d 319 (1980). Leading treatises have collected cases and recognized the general rule that “[a] class of 20 or fewer is usually insufficiently numerous . . . [a] class of 41 or more is usually sufficiently numerous. . . . [while] [c]lasses with between 21 and 40 members are given varying treatment. These midsized classes may or may not meet the numerosity requirement depending on the circumstances of each particular case.” 5 James Wm. Moore, et al., Moore's Federal Practice § 23.22; see also 5 William B. Rubenstein, Newberg on Class Actions § 3:12 (“As a general guideline . . . a class that encompasses fewer than 20 members will likely not be certified absent other indications of impracticability of joinder, while a class of 40 or more members raises a presumption of impracticability of joinder based on numbers alone.” (internal footnotes omitted)); Cox v. Am. Cast Iron Pipe Co., 784 F.2d 1546, 1553 (11th Cir. 1986) (citing Moore favorably).
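By way of illustration only, the guideposts quoted above from Moore's Federal Practice can be restated as a simple lookup. The function name and the returned labels are hypothetical, and the bands are no more than a starting point: they create no bright-line rule, and impracticability still turns on the circumstances of each case.

```python
def treatise_numerosity_band(class_size: int) -> str:
    """Map a class size to the rough band described in Moore's § 23.22.

    Hypothetical helper for illustration; the labels paraphrase the
    treatise language and do not decide whether joinder is impracticable.
    """
    if class_size < 1:
        raise ValueError("class_size must be a positive integer")
    if class_size <= 20:
        return "usually insufficiently numerous"
    if class_size <= 40:
        return "varying treatment"
    return "usually sufficiently numerous"


# The putative class certified below had twenty-two members.
print(treatise_numerosity_band(22))
```

A twenty-two member class falls in the middle band, which is why the size of the class matters but does not resolve the inquiry.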
At this point, we need not specify a “floor” at which a putative class will fail to satisfy the numerosity requirement. Instead, we simply note that the number of class members is the starting point of our numerosity analysis. Although district courts are always under an obligation to ensure that joinder is impracticable, their inquiry into impracticability should be particularly rigorous when the putative class consists of fewer than forty members. Because the District Court certified a class of twenty-two members, which is only slightly above the twenty-member floor suggested by the leading treatises, we first address Defendants' challenge to the size of the putative class concerning the partial assignment of some claims. After determining that the class is comprised of twenty-two members, we scrutinize the District Court's numerosity reasoning in this case. Because the District Court erred in its analysis of the two most important factors applicable here, we see no need to examine the other factors and will remand for the District Court to again engage in a numerosity inquiry consistent with the reasoning in this opinion.
1. The Size of the Class
The District Court rejected Defendants' argument that two class members should not be included in the class for numerosity purposes because they were partial assignees of two other class members. If Defendants were correct, the class would be comprised of only twenty class members, not twenty-two. On the other hand, for the first time on appeal, Plaintiffs argue that they have uncovered three more assignees of claims, and that the class consists of twenty-five members.
Defendants appear to have abandoned their partial assignment argument on appeal, arguing in one sentence that “four of the 22 potential class members were improperly included in the class.” Appellant Br. at 48.8 Defendants make no reference to case law and rely simply on cursory citations to the record. We could, for good reason, deem these arguments abandoned and waived on appeal. Kost v. Kozakiewicz, 1 F.3d 176, 182 (3d Cir. 1993). However, because we are remanding the numerosity issue to the District Court, we think it appropriate to consider this issue pertaining to the size of the class because the partial assignability issue impacts whether the three additional class members should be included in the class on remand. See Bagot v. Ashcroft, 398 F.3d 252, 256 (3d Cir. 2005) (“This Court has discretionary power to address issues that have been waived.”).
Initially, Defendants' partial assignment argument makes intuitive sense. Why should Plaintiffs be able to take one claim and turn it into two for numerosity purposes? How is this not a form of “double dipping”? Nevertheless, no matter how intuitively appealing this argument may be, it lacks legal support. The text of Rule 23(a)(1) says nothing about the number of claims; instead, it refers to the number of class members. Fed. R. Civ. P. 23(a)(1) (requiring an inquiry into whether “the class is so numerous that joinder of all members is impracticable” (emphasis added)).
Moreover, as the District Court recognized, there is persuasive circuit precedent establishing that partial assignees are appropriately considered to be members of a class. In In re Fine Paper Litigation, 632 F.2d 1081, 1089 (3d Cir. 1980), the state of Washington was the recipient of partial assignments of antitrust claims. It sought to be excluded from the settlement class, and the district court held, among other reasons for denying the right to opt out, that “the state's assertion of the assigned claims would result in an impermissible fragmentation of the . . . causes of action.” Id. We reversed and “reject[ed] the defendant's position that the partial assignments improperly fragment the claim.” Id. at 1090. We looked to section 156 of the Restatement of Contracts for guidance and concluded that “[a]n assignment of a fractional part of a single and entire right against an obligor is operative as if the part had been a separate right.” Id. at 1091; Restatement (First) of Contracts § 156 (“An assignment of either a fractional part of a single and entire right against an obligor . . . is operative as to that part or amount to the same extent and in the same manner as if the part had been a separate right.”).9
At the same time, we held that when the “collective right to the entire claim” is split, “the partial assignee may not maintain the original suit” unless the obligor has consented in order to protect the “right[ ] of the obligor to be free of successive and repeated suits growing out of the same basic facts.” In re Fine Paper Litig., 632 F.2d at 1091. When the obligor does not consent to these separate suits, then these rights are protected by the use of the joinder rules or the class action mechanism. Id. Thus, the state of Washington could be made a party that, unlike other class members, did “not have the right to opt out.” Id. In our case, Defendants are really seeking the opposite of what we said was permissible in Fine Paper Litigation: they want us to say that these two partial assignees must proceed independent of the class.
While Fine Paper Litigation did not address numerosity, we consider its reasoning instructive. Crucially, we held there that a partial assignment “is operative as if the part had been a separate right.” Id. at 1091. Moreover, Fine Paper Litigation envisioned the class action mechanism as a proper tool for partial assignees to participate in the lawsuit, albeit with fewer individual rights than other claimants. We agree with the District Court that, unless there is evidence that the class plaintiffs are seeking to artificially inflate the number of claimants, partial assignees may properly be treated as class members. On remand, the District Court will need to consider whether the three new assignees that Plaintiffs first mention on appeal should be considered as class members.10 Thus, at this point, we assume that the class consists of twenty-two members.
2. Impracticability of Joinder
In Marcus, we recognized the three core purposes of the numerosity requirement:
First, it ensures judicial economy. It does so by freeing federal courts from the onerous rule of compulsory joinder inherited from the English Courts of Chancery and the law of equity. Courts no longer have to conduct a single, administratively burdensome action with all interested parties compelled to join and be present. The impracticability of joinder, or numerosity, requirement also promotes judicial economy by sparing courts the burden of having to decide numerous, sufficiently similar individual actions seriatim. As for its second objective, Rule 23(a)(1) creates greater access to judicial relief, particularly for those persons with claims that would be uneconomical to litigate individually. Finally, the rule prevents putative class representatives and their counsel, when joinder can be easily accomplished, from unnecessarily depriving members of a small class of their right to a day in court to adjudicate their own claims.
687 F.3d at 594–95 (internal citations omitted). However, in Marcus, we had no need to provide a list of factors that should be considered in the numerosity analysis, because it was “[m]ere speculation” that anyone other than the named plaintiff was a class member. Id. at 596–97.
We have not had occasion to list relevant factors that are appropriate for district court judges to consider when determining whether joinder would be impracticable. We do so now. This non-exhaustive list includes: judicial economy, the claimants' ability and motivation to litigate as joined plaintiffs, the financial resources of class members, the geographic dispersion of class members, the ability to identify future claimants, and whether the claims are for injunctive relief or for damages. See 5 Moore's Federal Practice § 23.22; 5 Newberg on Class Actions § 3.12 (“These factors include: judicial economy arising from avoidance of a multiplicity of actions, geographic dispersion of class members, size of individual claims, financial resources of class members, and the ability of claimants to institute individual suits.”); Pa. Pub. Sch. Emps. Ret. Sys., 772 F.3d at 120 (“However, the numerosity inquiry is not strictly mathematical but must take into account the context of the particular case, in particular whether a class is superior to joinder based on other relevant factors including: (i) judicial economy, (ii) geographic dispersion, (iii) the financial resources of class members, (iv) their ability to sue separately, and (v) requests for injunctive relief that would involve future class members.” (citing Robidoux, 987 F.2d at 936)).
These factors are only relevant to a binary choice at the certification stage: a class action versus joinder of all interested parties. At this point, we do not consider the possibility that plaintiffs may bring individual suits. After all, the text of Rule 23(a)(1) refers to whether “the class is so numerous that joinder of all members is impracticable,”11 not whether the class is so numerous that failing to certify presents the risk of many separate lawsuits.
While all factors are relevant, we note at the outset that not all are created equal. Instead, both judicial economy and the ability to litigate as joined parties are of primary importance. As we have held, judicial economy is one of the purposes behind Rule 23(a)(1) and class actions in general. Marcus, 687 F.3d at 594. The same is true of ensuring that small-value claims have a mechanism by which they can be economically litigated. Id.; Deposit Guaranty Nat'l Bank, Jackson, Miss. v. Roper, 445 U.S. 326, 339, 100 S.Ct. 1166, 63 L.Ed.2d 427 (1980) (“Where it is not economically feasible to obtain relief within the traditional framework of a multiplicity of small individual suits for damages, aggrieved persons may be without any effective redress unless they may employ the class action device.”). If we were to say that judicial economy and the ability of class members to bring their own suits as named parties weighed in favor of class certification, how could the other factors outweigh these considerations even though the core purposes of a class action were being advanced?12 In this case, the District Court's judicial economy analysis was incorrect, as it improperly placed great weight on the late stage of the proceeding. Additionally, the District Court did not fully explore the ability of class members to join as plaintiffs.
a. Judicial Economy
Judicial economy, a primary factor frequently cited, looks to the administrative burden that multiple or aggregate claims place upon the courts. Marcus, 687 F.3d at 594 (stating that the numerosity requirement “also promotes judicial economy by sparing the courts the burden of having to decide numerous, sufficiently similar individual actions seriatim”); id. (“Courts no longer have to conduct a single, administratively burdensome action with all interested parties compelled to join and be present.”). This factor takes into account any efficiency considerations regarding the joinder of all interested parties that the district court deems relevant, including the number of parties and the nature of the action. See 5 Moore's Federal Practice § 23.22 (instructing a court to consider “the actual, practical difficulties of joining all of the potential class members” by inquiring whether joinder “would be expensive, time-consuming, and logistically unfeasible”). In analyzing judicial economy, we focus on whether the class action mechanism is substantially more efficient than joinder of all parties.
Here, the District Court “conclude[d] that judicial economy [was] best served by trying this case as a class action.” King Drug Co., 309 F.R.D. at 206. It made this decision by looking to “the extensive history of the litigation and the exhaustive discovery that ha[d] been conducted.” Id. It expressed concern that further discovery would delay the case even more, or that unnamed class members would opt to file suit elsewhere, resulting in other civil actions with additional discovery and the potential for inconsistent verdicts. Id. at 206–07 (“Joinder of the absent class members would likely require additional rounds of discovery, which would only further delay a trial date. Further, if cases were brought within other jurisdictions, additional discovery is certainly a possibility.”).13 While these predictions may come true, the late stage of litigation is not by itself an appropriate consideration to take into account as part of a numerosity analysis.14
In complex cases such as this antitrust suit, the class certification decision is often delayed until after years of fact and expert discovery have been conducted and dispositive motions have been litigated. See Hydrogen Peroxide, 552 F.3d at 324 (“But even with some limits on discovery and the extent of the hearing, the district judge must receive enough evidence, by affidavits, documents, or testimony, to be satisfied that each Rule 23 requirement has been met.” (quoting In re Initial Pub. Offerings Sec. Litig., 471 F.3d 24, 41 (2d Cir. 2006))). Courts routinely refuse to certify classes based on the need to conduct further discovery before being able to properly rule on a class certification motion. See In re New Motor Vehicles Canadian Export Antitrust Litig., 522 F.3d 6, 26–27 (1st Cir. 2008) (noting that the district court erred in preliminarily certifying the class because of the “novelty and complexity of the theories advanced and the gaps in the evidence proffered”); Valley Drug Co. v. Geneva Pharm., Inc., 350 F.3d 1181, 1192 (11th Cir. 2003) (“[T]he record needed to decide [the class certification] issue remains incomplete because the district court improperly denied Abbott's request to conduct so-called ‘downstream discovery.’ ”). However, such decisions do not prejudice a plaintiff; the class certification motion is not denied, but only deferred until after further discovery is conducted.
Conversely, a rule that would allow courts to consider the late stage of litigation and the sunk costs already incurred in their numerosity analyses would place a thumb on the scale in favor of a numerosity finding for no reason other than the fact that the complex nature of a case resulted in the class certification decision being deferred for years. Our view is consistent with the 2003 amendments to Rule 23(c)(1)(A), which state that the class certification decision should be made “[a]t an early practicable time after a person sues or is sued as a class representative,” as opposed to the previous rule, which said that the decision should be made “as soon as practicable after commencement of an action.” See Fed. R. Civ. P. 23 advisory committee's notes to 2003 amendment (noting that the “as soon as practicable” designation does not “capture[ ] the many valid reasons that may justify deferring the initial certification decision”). We have recognized that the rule was modified in order to discourage “premature certification determinations.” Richardson v. Bledsoe, ––– F.3d ––––, ––––, 2016 WL 3854216, at *4 (3d Cir. July 15, 2016) (quoting Weiss v. Regal Collections, 385 F.3d 337, 347 (3d Cir. 2004)).15
As the Advisory Committee noted, there are “many valid reasons that may justify deferring the initial certification decision,” including the need to conduct discovery, a determination of what issues would be presented at trial, and the defendant's desire to “win dismissal or summary judgment as to the individual plaintiffs without certification and without binding the class that might have been certified.” Fed. R. Civ. P. 23 advisory committee's notes to 2003 amendment. Thus, while Rule 23(c)(1)(A) now encourages further discovery so that all of the information and evidence relevant to certification is before a district judge before she makes the certification decision, the District Court's analysis here would seem to consider any lengthy period following the filing of a putative class action as weighing in favor of finding numerosity. This cannot be right. Judicial economy does not permit consideration of the sunk costs from past discovery and litigation, or the need to conduct further discovery if the class is not certified.16
Moreover, while the District Court expressed concern that “[j]oinder of the absent class members would likely require additional rounds of discovery,” King Drug, 309 F.R.D. at 206, this does not mean that the litigation would have to begin anew for the unnamed class members. If the members all opted to join the case as individual plaintiffs, the District Court could, in its discretion, limit discovery where it “is unreasonably cumulative or duplicative, or can be obtained from some other source that is more convenient, less burdensome, or less expensive.” Fed. R. Civ. P. 26(b)(2)(C)(i). At this point, Defendants have not shown what further discovery they are entitled to; they only claim that they are entitled to further discovery as a matter of due process.17 In addition, as a class, Plaintiffs have been using the same experts. It is not clear that there would be a need for that to change merely because Plaintiffs would be joined as individual parties instead of moving forward as a class.
On remand, when considering the judicial economy factor of the numerosity analysis, the District Court should not take into account the sunk costs of the litigation or the need to further delay trial were the class not to be certified.18 In other words, without considering the late stage of the litigation, it should determine whether a class action would have been a substantially more efficient mechanism of litigating this suit than joinder of all parties. This primarily involves considerations of docket control, taking into account practicalities as simple as that of every attorney making an appearance on the record. At the same time, the District Court is free to rely on its superior understanding of how the case has proceeded to date for the purpose of determining whether the class mechanism would have actually been a substantially more efficient use of judicial resources than joinder of the parties at the onset of the litigation.
b. Ability and Motivation to be Joined as Plaintiffs
The second purpose behind the numerosity requirement is to further the broader class action goal of providing those with small claims reasonable access to a judicial forum for the resolution of those claims. Thus, the ability and motivation of Plaintiffs to pursue their litigation via joinder is the second factor upon which we focus. See Marcus, 687 F.3d at 594 (stating that the numerosity requirement “creates greater access to judicial relief, particularly for those persons with claims that would be uneconomical to litigate individually”).19
This primarily 20 involves an examination of the stakes at issue for the individual claims and the complexity of the litigation, which will typically correlate with the costs of pursuing these claims. Though joinder is certainly more economical for most plaintiffs than pursuing the case alone, it is often still uneconomical for an individual with a negative value claim to join a lawsuit.21 After all, each plaintiff may need to hire his own counsel to protect his individual interests—although total litigation costs would still likely be lower due to joint litigation agreements. Similarly, each plaintiff would be subject to discovery, whereas the defendants would have to show a greater need for discovery from unnamed plaintiffs in a class action.22 See Clark v. Universal Builders, Inc., 501 F.2d 324, 340–41 (7th Cir. 1974) (placing the burden on defendants to show the need for discovery from unnamed class members to ensure that the discovery is not requested “as a tactic to take undue advantage of the class members or as a stratagem to reduce the number of claimants” (internal quotation marks omitted)).
The District Court did not properly consider this factor, as it focused instead on whether the individual plaintiffs could have brought their own, individual suits.23 However, the numerosity rule does not envision the alternative of individual suits; it considers only the alternative of joinder. Here, the class members, based on the record before us, appear likely to have the ability and incentive to bring suit as joined parties, thus preventing the alleged wrongdoers from escaping liability.24 In fact, three class members, none of whom are named plaintiffs, each have claims estimated at over $1 billion—even before the trebling of damages. These three make up over 97% of the total value of the class claims, and can hardly be considered as candidates who need the aggregative advantages of the class device. While this factor could weigh in favor of class status if the remaining class members had very small claims, that is simply not the case here. Thirteen of the other nineteen class members have claims that are greater than $1 million, the value that the two parties seem to agree is the appropriate figure at which point bringing one's own suit becomes economical. On the other hand, there are only six class members with claims below $1 million each. While it may be uneconomical for these claims to be pursued in individual litigation, there has been no showing that it would be uneconomical for these six class members to be individually joined as parties in a traditional lawsuit. On remand, the District Court should consider this issue. Even if it were uneconomical for some or all of these six individual plaintiffs to join the suit, the District Court must still determine whether, considering all the other relevant factors, class status—which is “an exception to the usual rule that litigation is conducted by and on behalf of the individual named parties only,” Wal–Mart Stores, 564 U.S. at 348, 131 S.Ct. 2541—is appropriate here.
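The composition of the putative class described above can be tallied in a short sketch. The counts and the $1 million threshold come from the record as summarized in this section; the tier labels and the `summarize` helper are hypothetical names introduced only for illustration, and no individual claim values are assumed because the text does not state them.

```python
# Illustrative tally of the putative class by claim size.
# Counts are taken from the opinion's description; the dictionary keys and
# the helper function are hypothetical names, not part of the record.
CLASS_TIERS = {
    "over_1_billion": 3,    # three unnamed members, over 97% of total claim value
    "over_1_million": 13,   # thirteen of the other nineteen members
    "under_1_million": 6,   # the remaining six members
}


def summarize(tiers):
    """Return headline counts relevant to the ability-to-join factor."""
    total = sum(tiers.values())
    above = tiers["over_1_billion"] + tiers["over_1_million"]
    below = tiers["under_1_million"]
    return {
        "total_members": total,
        "members_above_1_million": above,
        "members_below_1_million": below,
        "share_below_1_million": below / total,
    }


summary = summarize(CLASS_TIERS)
print(summary)
```

On these counts, sixteen of the twenty-two members hold claims above the figure at which the parties seem to agree an individual suit becomes economical, which is why the open question on remand is confined to whether joinder would be uneconomical for the remaining six.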
The District Court abused its discretion in analyzing the two most important numerosity factors when it considered the late stage of the litigation as relevant to the judicial economy factor and failed to properly consider the ability and motivation of the plaintiffs to proceed as joined, as opposed to individual, parties. We therefore remand for the District Court to conduct a rigorous numerosity analysis for this class of twenty-two (or twenty-five) members. In conducting this rigorous analysis, factors that the District Court may consider include the financial resources of the class members, the geographic dispersion of the class members, the ability to identify future claimants, together with the fact that these claims are for damages, and not injunctive relief.
Contrary to the dissent's assertion, we are not “erecting roadblocks that do not exist.” Although the dissent suggests that Defendants have not yet shown why joinder is practicable, that suggestion is beside the point. The burden is on Plaintiffs to show why joinder is impracticable. Marcus, 687 F.3d at 591 (“The party seeking certification bears the burden of establishing each element of Rule 23”—including the numerosity requirement—“by a preponderance of the evidence.”); id. at 595 (“Critically, numerosity—like all Rule 23 requirements—must be proven by a preponderance of the evidence.”). Moreover, the dissent would have Defendants' inability to articulate an argument against finding numerosity obviate a district court's obligation to conduct “a rigorous analysis” and determine “that the prerequisites of Rule 23(a) have been satisfied.” Wal–Mart, 564 U.S. at 351, 131 S.Ct. 2541 (internal quotation marks omitted); Hydrogen Peroxide, 552 F.3d at 310 (“[A] class may not be certified without a finding that each Rule 23 requirement is met.”).
Finally, the dissent makes the extravagant claim that “nothing about [this case] cries out for anything but class treatment.” Yet this is not the typical class action where hundreds or thousands of claims are aggregated in order to ensure that the wrongdoer is held accountable and that small claims are vindicated. See Thorogood, 547 F.3d at 744. Putting aside the small number of class members in this case, the judges in the majority have never seen a class action where three class members, each with billions of dollars at stake and close to 100% of the total value of class claims between them, have been allowed to sit on the sidelines as unnamed class members. Plaintiffs must satisfy their burden of showing why we should allow this unique putative class to take advantage of this “exception to the usual rule that litigation is conducted by and on behalf of the individual named parties only.” Wal–Mart, 564 U.S. at 348, 131 S.Ct. 2541 (internal quotation marks omitted). At this point, they have failed to meet that burden, and any suggestion that this is a run-of-the-mill class action ignores the facts of this case.25
Although we remand for the District Court to reconsider its numerosity analysis, we also see a need to address Defendants' predominance argument. This argument makes selective use of language from the Supreme Court's recent decision in Comcast Corp. v. Behrend, ––– U.S. ––––, 133 S.Ct. 1426, 185 L.Ed.2d 515 (2013). The interpretation of Comcast advanced by Defendants is overly broad and simplistic, and, if the class were to meet the numerosity requirement on remand, the predominance argument advanced by Defendants is untenable.
Under Rule 23(b)(3), “questions of law or fact common to class members [must] predominate over any questions affecting only individual members.”26 This “inquiry tests whether proposed classes are sufficiently cohesive to warrant adjudication by representation.” Amchem Prods., 521 U.S. at 623, 117 S.Ct. 2231. “If anything, Rule 23(b)(3)'s predominance criterion is even more demanding than Rule 23(a),” Comcast, 133 S.Ct. at 1432, as it is “[f]ramed for situations in which ‘class-action treatment is not as clearly called for’ as it is in Rule 23(b)(1) and (b)(2) situations.” Amchem Prods., 521 U.S. at 615, 117 S.Ct. 2231 (quoting Fed. R. Civ. P. 23 advisory committee's notes to 1966 amendment). This “inquiry is especially dependent upon the merits of a plaintiff's claim, since the nature of the evidence that will suffice to resolve a question determines whether the question is common or individual.” In re Constar Int'l Inc. Sec. Litig., 585 F.3d 774, 780 (3d Cir. 2009) (internal quotation marks omitted). The Supreme Court has noted that “[a]n individual question is one where members of a proposed class will need to present evidence that varies from member to member, while a common question is one where the same evidence will suffice for each member to make a prima facie showing [or] the issue is susceptible to generalized, class-wide proof.” Tyson Foods, Inc. v. Bouaphakeo, –––U.S. ––––, 136 S.Ct. 1036, 1045, 194 L.Ed.2d 124 (2016) (internal quotation marks omitted).
The predominance requirement applies to damages as well, because the efficiencies of the class action mechanism would be negated if “[q]uestions of individual damage calculations . . . overwhelm questions common to the class.” Comcast, 133 S.Ct. at 1433. This does not mean, however, that damages must be “susceptible of measurement across the entire class for purposes of Rule 23(b)(3).” Neale v. Volvo Cars of N.A., LLC, 794 F.3d 353, 374 (3d Cir. 2015) (internal quotation marks omitted).
Defendants contend that Plaintiffs cannot satisfy the predominance requirement of Rule 23(b)(3). They make two interrelated arguments: (1) Plaintiffs' theory of liability runs afoul of Comcast because, after the grant of summary judgment on the global conspiracy claim, Plaintiffs' damages model no longer corresponds to their remaining theory of liability that there were four independent Section 1 conspiracies; and (2) predominance cannot be demonstrated because Plaintiffs' remaining theory of liability must isolate the harm that each individual reverse-payment settlement agreement caused each individual class member under the doctrine of antitrust standing.27
1. Comcast Argument
Comcast was an antitrust suit brought by a class of Comcast subscribers. The plaintiffs initially had four theories of antitrust impact: (1) “Comcast's clustering made it profitable for Comcast to withhold local sports programming from its competitors”; (2) “Comcast's activities reduced the level of competition from ‘overbuilders' ”; (3) “Comcast reduced the level of ‘benchmark’ competition on which cable customers rely to compare prices”; and (4) “clustering increased Comcast's bargaining power relative to content providers.” 133 S.Ct. at 1430–31. Their damages model “did not isolate damages resulting from any one theory of antitrust impact,” id. at 1431, and simply “assumed the validity of all four theories of antitrust impact,” id. at 1434. The district court limited its certification order to the overbuilding theory because it was the only antitrust theory capable of classwide proof, but found the predominance requirement to be satisfied even though the damages model was not altered to reflect the only theory of harm remaining. Id. at 1431. A divided panel of our Court affirmed, Behrend v. Comcast Corp., 655 F.3d 182 (3d Cir. 2011), with Judge Jordan writing separately to say that he “would vacate the certification order to the extent it provides for a single class as to proof of damages,” id. at 209 (Jordan, J., concurring in the judgment in part and dissenting in part), because the model of plaintiffs' expert “no longer fits Plaintiffs' sole theory of antitrust impact, and, instead, produces damages calculations that are not the certain result of the wrong,” id. at 217 (internal quotation marks omitted).
The Supreme Court reversed, holding that while the damages model does not need to be exact, “a model purporting to serve as evidence of damages in [a] class action must measure only those damages attributable to that theory. If the model does not even attempt to do that, it cannot possibly establish that damages are susceptible of measurement across the entire class for purposes of Rule 23(b)(3).” Comcast, 133 S.Ct. at 1433. Because the plaintiffs' damages model reflected injury from all four alleged antitrust violations, and because only the overbuilding theory of harm remained, the damages model was unable to “bridge the differences between supra-competitive prices in general and supra-competitive prices attributable to the deterrence of overbuilding.” Id. at 1435. The Supreme Court explained “[p]rices whose level above what an expert deems ‘competitive’ has been caused by factors unrelated to an accepted theory of antitrust harm are not ‘anticompetitive’ in any sense relevant here.” Id.
In the case before us, Plaintiffs' expert, Dr. Leitzinger, created a damages model that calculated the savings to the class if generic entry had occurred earlier. He noted the prices and overcharges actually paid by the class members and compared them to but-for worlds that included the launch of anywhere between one and five generic competitors. Crucially, this model did not allocate damages amongst the five original defendants (Cephalon and the four generic manufacturers), attribute a certain amount of harm from each individual reverse-payment settlement, or identify which class members were harmed by which reverse-payment settlement. In Defendants' view, because only individual conspiracies remain, any damages model must reflect the harm caused by each individual conspiracy to each individual class member, and the use of the same damages model that envisioned a global conspiracy “does not even attempt,” id. at 1433, to correspond to this remaining theory of liability.28
However, Plaintiffs' theory of liability is not that each individual agreement caused an individual harm, such that a new damages model would be required under Comcast. Instead, their theory of liability is that each individual agreement contributed to the market-wide harm, and that all five original defendants are jointly and severally liable 29 for this harm as concurrent tortfeasors. This theory may ultimately be proven wrong, but it does match Plaintiffs' damages theory. Defendants next try to argue that Plaintiffs' theory of liability must isolate the harm from each individual agreement, and that any reliance on joint and several liability conflicts with the requirements of antitrust standing.
2. Antitrust Impact
Defendants argue that Plaintiffs are attempting to circumvent the doctrine of antitrust standing by asserting the theory of joint and several liability. In essence, they are arguing that the joint and several theory of liability is not “plausible in theory,” Hydrogen Peroxide, 552 F.3d at 325, because under the doctrine of antitrust standing Plaintiffs must show how each individual agreement harmed each individual class member.
In an antitrust class action, “impact often is critically important for the purpose of evaluating Rule 23(b)(3)'s predominance requirement because it is an element of the claim that may call for individual, as opposed to common, proof.” Id. at 311. A district court must thus undertake a “rigorous assessment of the available evidence and the method or methods by which plaintiffs propose to use the evidence to prove impact at trial.” Id. at 312. The class should only be certified “if such impact is plausible in theory [and] it is also susceptible to proof at trial through available evidence common to the class.” Id. at 325. This inquiry often involves an overlap into the merits. Id. at 324.
Defendants argue that, in the absence of the global conspiracy claim, Plaintiffs must prove which class members suffered an injury under a specific bilateral agreement. They state that under the doctrine of antitrust standing, a class member who would have purchased generic modafinil from Ranbaxy cannot hold Mylan liable; a class member who would have purchased generic modafinil from Mylan cannot hold Ranbaxy liable; and a class member who would have purchased generic modafinil from Teva cannot hold either Ranbaxy or Mylan liable. If correct, such individualized inquiries would defeat predominance, and Plaintiffs' joint and several liability theory would not be plausible. We agree with the general proposition that an antitrust plaintiff cannot defeat the doctrine of antitrust standing by resort to common-law tort principles untethered to antitrust law. But Defendants' objection is misplaced in this case because the common law principle of joint and several liability is being invoked by Plaintiffs for the proper purpose of establishing antitrust impact and therefore antitrust standing.
The doctrine of antitrust standing requires a plaintiff to “prove more than injury causally linked to an illegal presence in the market.” Brunswick Corp. v. Pueblo Bowl–O–Mat, Inc., 429 U.S. 477, 489, 97 S.Ct. 690, 50 L.Ed.2d 701 (1977).30 This inquiry instead looks to whether the plaintiff suffered an antitrust injury, i.e., an “injury of the type the antitrust laws were intended to prevent and that flows from that which makes defendants' acts unlawful.” Id. In Associated General Contractors of California, Inc. v. California State Council of Carpenters (AGC), 459 U.S. 519, 537–38, 103 S.Ct. 897, 74 L.Ed.2d 723 (1983), the Supreme Court stated that many factors go into this determination. We have condensed these factors into a multi-part test:
(1) the causal connection between the antitrust violation and the harm to the plaintiff and the intent by the defendant to cause that harm, with neither factor alone conferring standing; (2) whether the plaintiff's alleged injury is of the type for which the antitrust laws were intended to provide redress; (3) the directness of the injury, which addresses the concerns that liberal application of standing principles might produce speculative claims; (4) the existence of more direct victims of the alleged antitrust violations; and (5) the potential for duplicative recovery or complex apportionment of damages.
Ethypharm S.A. France v. Abbott Labs., 707 F.3d 223, 232–33 (3d Cir. 2013) (quoting In re Lower Lake Erie Iron Ore Antitrust Litig., 998 F.2d 1144, 1165–66 (3d Cir. 1993)). We have said that the “directness of injury” is “the focal point by which the remainder of the AGC factors are guided.” Lower Lake Erie, 998 F.2d at 1166 n.19 (citing Holmes v. Sec. Investor Prot. Corp., 503 U.S. 258, 269, 112 S.Ct. 1311, 117 L.Ed.2d 532 (1992)).
Defendants rely solely on a pre-AGC case of ours, Mid–West Paper Products Co. v. Continental Group, Inc., 596 F.2d 573 (3d Cir. 1979), which concerned the “all-important[ ] directness factor,” in support of their position that Plaintiffs lack antitrust standing to bring claims against generic manufacturers from whom they would not have purchased. Lower Lake Erie, 998 F.2d at 1167–68 (conducting an analysis of the AGC factors and discussing Mid–West Paper in the directness of injury section).31 In Mid–West Paper, the plaintiff claimed that it “suffered as a direct purchaser of consumer bags from competitors of the defendants, who allegedly were able to charge artificially inflated prices as a consequence of defendants' price-fixing.” 596 F.2d at 580 (footnote omitted). We held that the plaintiff, who was not a customer of any member of the conspiracy, lacked antitrust standing to sue the conspiracy members even though it paid higher prices as a result of the conspiracy. In other words, the customer of a competitor of conspiracy members was not “one whose protection is the fundamental purpose of the antitrust laws.” Id. at 583 (internal quotation marks omitted).
In reaching this conclusion in Mid–West Paper, we took several factors into consideration. First, we noted that it would be “almost impossible, and at the very least unwieldy” to calculate the harm to the plaintiff from the conspiracy, because so many variables went into the competitor's price calculation irrespective of the existence of the monopoly. Id. at 584. The value of any harm caused by the anticompetitive conduct would be speculative and “would transform this antitrust litigation into the sort of complex economic proceeding” that the direct-purchaser rule 32 was adopted in part to prevent. Id. at 585. In addition, the defendants were “not in a direct or immediate relationship” to the plaintiff, and they gained no advantage from the plaintiff's injury. Id. at 583. Moreover, there was another group of victims who were more likely to sue the conspiracy members—those who purchased directly from them—and one of the purposes of the antitrust standing doctrine is to “compensate[ ] those victims who are most likely to assume the mantle of private attorneys general for the injuries that they suffered.” Id. at 585. For that reason, we “concentrate[ ] the entire award in the hands of the direct purchasers in all but unusual circumstances and thereby giv[e] them an incentive to sue.” Id. If we were to allow the customer of a competitor to sue for treble damages when the “causal link to defendants' activities is [so] tenuous,” it would “subject antitrust violators to potentially ruinous liabilities, well in excess of their illegally-earned profits, because . . . [violators] would be held accountable for higher prices that arguably ensued in the entire industry.” Id. at 586.
As is clear from the above description, Defendants' argument that Mid–West Paper means that a customer of a non-defendant cannot have antitrust standing is an oversimplification. Mid–West Paper reached its result because it wanted to ensure that only those who are most directly harmed by the anticompetitive conduct can sue to remedy the antitrust violation. When, as in Mid–West Paper, the anticompetitive conduct is price-fixing, the only customers who will have antitrust standing are the direct customers of the conspiracy members. The case before us is not about price-fixing. It is, instead, a case about market exclusion, as it concerns conduct that prevents a competitive market from forming at all.33 In such a scenario all market customers should have antitrust standing to sue those engaged in the allegedly anticompetitive conduct because all suffer equally from the foreclosure of choice. See AGC, 459 U.S. at 538, 103 S.Ct. 897 (“[T]he Sherman Act was enacted to assure customers the benefits of price competition, and our prior cases have emphasized the central interest in protecting the economic freedom of participants in the relevant market.”).
In fact, in Lower Lake Erie, we addressed market exclusion in the market for the unloading of iron ore from ships. Traditionally, iron ore was shipped across the Great Lakes, unloaded at railroad-owned docks onto a railroad, and then transported to the steel mills. Lower Lake Erie, 998 F.2d at 1153–55. Large cranes called “huletts” were affixed to the docks and were needed to unload the iron ore from the ships, and, because the non-railroad-owned docks were not equipped with huletts, they “were not competitors for this segment of the ore business.” Id. at 1153. A new, less expensive technology was developed that would allow the iron ore to be unloaded without the use of huletts, and thus open the transshipment market to non-railroad-owned docks. Id. The railroad companies suppressed this new technology by threatening non-railroad-owned docks with higher rates, among other measures. Id.
The issue of antitrust standing arose when the steel companies sued the railroad companies for higher rates paid to the vessel companies. Id. at 1167. We held that this injury was sufficiently direct, despite the railroad company's reliance on Mid–West Paper. Specifically, we noted that even though the steel companies paid higher rates than they otherwise would have to several ore transportation companies—both defendants and non-defendants—“it was unquestionably the steel companies who bore the brunt of the increased costs attributed to the railroad's agreement to thwart development of the less expensive technology.” Id. at 1168; id. (“The steel companies were the sole customers of the industry involved in the transshipment of ore; indeed, the industry existed for them.”). Although there were other victims of the harm, such as the vessel companies, the dock companies, and trucking companies, this did “not diminish the directness of the steel companies' injury.” Id. at 1168–69.
Unlike in Mid–West Paper, where there was a market for consumer bags and we knew who was buying from whom, there was no market in this case due to Defendants' allegedly anticompetitive conduct in delaying the availability of generic modafinil. Just as the railroad docks and their older, more expensive technology were the steel companies' only choice in Lower Lake Erie, Cephalon's brand-name version of modafinil—Provigil—was the only option available to the DPP class. All other options were prevented from entering the market by the allegedly anticompetitive conduct of the railroad companies and the drug manufacturers, respectively.
Under Plaintiffs' theory, each of the four generic manufacturers allegedly entered into separate anticompetitive arrangements with Cephalon.34 If any one of them had refused to enter into this arrangement, there would have been no antitrust injury for anyone, as the market would have worked as envisioned by the Hatch–Waxman Act: there would have been between one and five generic manufacturers competing with the brand-name modafinil during the 180-day exclusivity period, after which there would have been a fully competitive market. However, because all four entered into these reverse-payment settlement agreements and prevented a competitive market from forming, each contributed to the market-wide harm, and each can be held jointly and severally liable for such harm. This is not the sum of four separate individual harms emanating from each agreement; instead, it is a harm that all four agreements work jointly to produce, even if there was no conspiracy between the generic manufacturers. The class member who would have purchased from Teva is harmed by the Ranbaxy and Mylan agreements to the same extent that a Ranbaxy or Mylan customer would be. Thus, any class member would have antitrust standing to sue any or all of the four generic companies individually. There is no need to pursue an individualized inquiry into the harm caused by each agreement, and “questions of law or fact common to class members predominate over any questions affecting only individual members.” Fed. R. Civ. P. 23(b)(3). Defendants' attempt to dictate Plaintiffs' theory of liability based on the doctrine of antitrust standing should fail.
For the reasons stated above, we will vacate the District Court's class certification order, and we will remand to the District Court for further consideration of whether joinder of all class members is impracticable.
Today, the Majority concludes that the able District Court judge abused his discretion by purportedly focusing on a consideration that we have never—indeed, by my research, no court has ever—stated it should not consider. How can that be? Furthermore, how can it be that the Majority mischaracterizes the late stage of the proceedings as being the focus of Judge Goldberg's ruling when his reasoning actually focuses on the considerations that our case law dictates it should? Also, how can it be that in analyzing judicial economy district courts are prohibited from considering the stage of the proceedings? I am perplexed. I am similarly perplexed as to why the Majority is directing the District Court on remand to figure out whether joinder is practicable when the appellants have failed to make that case themselves. I therefore respectfully dissent from Part III.A of the Majority's opinion.
The District Court Correctly Applied Rule 23(a)(1)
The text of Rule 23(a)(1) provides the standard by which a district court determines if a putative class is numerous enough to be certified—the district court must determine if “joinder of all members is impracticable.” See also Newberg on Class Actions § 3:11 (5th ed.) (“[Rule 23(a)(1)'s] core requirement is that joinder be impracticable.”). Because the focus of Rule 23(a)(1) is on the practicability of joinder, it is well-established that “[t]he numerosity requirement requires examination of the specific facts of each case and imposes no absolute limitations.” Gen. Tel. Co. of the Nw. v. EEOC, 446 U.S. 318, 330, 100 S.Ct. 1698, 64 L.Ed.2d 319 (1980); see also Stewart v. Abraham, 275 F.3d 220, 226 (3d Cir. 2001) (“No minimum number of plaintiffs is required to maintain a suit as a class action . . . .”); Newberg on Class Actions § 3:11 (“Numerousness—the presence of many class members—provides an obvious situation in which joinder may be impracticable, but it is not the only such situation; thus, Rule 23(a)(1)'s analysis may, in specific circumstances, focus on other factors as well.”). In examining the “specific facts of each case,” Gen. Tel. Co., 446 U.S. at 330, 100 S.Ct. 1698, we have instructed courts to bear in mind the underpinnings of the numerosity requirement. The first of these is “judicial economy by sparing courts the burden of having to decide numerous, sufficiently similar individual actions seriatim.” Marcus v. BMW of N. Am., LLC, 687 F.3d 583, 594 (3d Cir. 2012). The second is “greater access to judicial relief, particularly for those persons with claims that would be uneconomical to litigate individually.” Id.1
Here, the District Court, after making a careful finding that the putative class consisted of 22 members, had to make a close call. Our cases have recognized that “[w]hile there are exceptions, numbers under twenty-one have generally been held to be too few.” Weiss v. York Hosp., 745 F.2d 786, 808 n.35 (3d Cir. 1984) (quoting 3B J. Moore, Moore's Federal Practice ¶ 23.05, at 23–150 (2d ed. 1982)). This recognition, however, does not stem from any mechanical numerical requirement, but rather from an understanding that the considerations bearing on the practicability of joinder are less likely to be met in classes with fewer than 21 members. See Newberg on Class Actions § 3:11 (“Numerousness—the presence of many class members—provides an obvious situation in which joinder may be impracticable, but it is not the only such situation; thus, Rule 23(a)(1)'s analysis may, in specific circumstances, focus on other factors as well.”). Recognizing the closeness of the issue, the District Court did exactly as we have instructed it to do and looked to factors bearing on the objectives we cited in Marcus.2 Indeed, the District Court examined the very factors that the Majority embraces: judicial economy, the geographic dispersion of class members, the claimants' ability and motivation to litigate as joined plaintiffs, and the financial resources of class members.
The District Court first considered whether judicial economy weighs in favor of finding joinder to be impracticable:
Considering the extensive history of this litigation and the exhaustive discovery that has been conducted, I conclude that judicial economy is best served by trying this case as a class action. Joinder of the absent class members would likely require additional rounds of discovery, which would only further delay a trial date. Further, if cases were brought within other jurisdictions, additional discovery is certainly a possibility, and separate trials could result in inconsistent verdicts.
JA-021. The reasons cited for finding judicial economy to favor certification of the class are entirely appropriate. “Judicial economy” means “[e]fficiency in the operation of the courts and the judicial system; esp., the efficient management of litigation so as to minimize duplication of effort and to avoid wasting the judiciary's time and resources.” Judicial Economy, Black's Law Dictionary (9th ed. 2009). The District Court here noted that the additional discovery and potential separate trials would further delay litigation that was already near its end stages, wasting the judiciary's time and resources and requiring the duplication of efforts.
The District Court next considered the geographic dispersion of the class members, a factor widely recognized as vital in determining whether joinder is practicable. See, e.g., Pa. Pub. Sch. Emps. Ret. Sys. v. Morgan Stanley & Co., 772 F.3d 111, 120 (2d Cir. 2014) (listing “geographic dispersion” as a factor in the numerosity analysis); Newberg on Class Actions § 3:12 (same); 7A Charles Alan Wright & Arthur R. Miller, Federal Practice & Procedure § 1762 (3d ed.) (same); 1 Moore's Federal Practice § 23.22 (Matthew Bender 3d ed.) (same). The District Court found this factor to weigh heavily in favor of finding joinder impracticable, noting that the “class members are spread out over thirteen states and Puerto Rico.” JA-021. This, the District Court found, “would certainly present challenges to Plaintiffs in attempting to coordinate the litigation if all class members were joined, particularly if additional discovery was required.” JA-021. Again, this finding is certainly reasonable, is supported by the record, and is in accordance with our instructions as to the relevant considerations in the numerosity calculation.
Against these concerns about judicial economy and, in particular, geographic dispersion, the District Court considered that many of the class members were sophisticated corporations with strong incentives to bring their own lawsuits. But this was not true for every class member, as six had claims below $1 million which might not be enough incentive “to engage in costly antitrust litigation on their own.” JA-022. This wholly appropriate consideration bears upon the objective to provide “greater access to judicial relief, particularly for those persons with claims that would be uneconomical to litigate individually.” Marcus, 687 F.3d at 594.
Ultimately, the District Court found the factors favoring the plaintiffs' position (judicial economy and geographic dispersion) to be more compelling than the factor favoring the defendants' position (the financial resources of the class members). In short, the District Court considered the policies we outlined in Marcus and made a thoughtful determination in a close case. Our abuse-of-discretion standard compels us to affirm that thoughtful determination. Moreover, our clearly erroneous standard compels us to not disturb the factual findings on which it was based.
The District Court Did Not Err in Considering the Stage of the Proceedings
The Majority, however, concludes that the District Court erred in its analysis of the “judicial economy” factor by taking into consideration the stage of the proceedings. As an initial note, I do not read the District Court's analysis as turning solely upon a consideration of the late stage of the proceedings. Rather, the District Court examined many factors, most notably the additional discovery and judicial resources that would have to be expended were the cases to be litigated outside of the class action mechanism, regardless of how far advanced the classwide proceedings were.
At any rate, it is appropriate—indeed, necessary—for a district court to consider the stage of the proceedings when examining whether judicial economy favors class litigation or individual litigation. In considering judicial economy, a district judge must predict how the options before him will play out. This prediction becomes nonsensical, however, if the district judge cannot take into consideration the amount of effort already expended. If you want to determine whether the path you are following is the most economical, is it not important to consider how far along that path you have already traveled?3
Unsurprisingly, then, courts widely—if not universally—recognize that it is appropriate for courts to consider the stage of the proceedings when weighing judicial economy. See, e.g., Carnegie–Mellon Univ. v. Cohill, 484 U.S. 343, 350, 108 S.Ct. 614, 98 L.Ed.2d 720 (1988) (“[A] federal court should consider and weigh in each case, and at every stage of the litigation, the values of judicial economy, convenience, fairness, and comity in order to decide whether to exercise jurisdiction over a case brought in that court involving pendent state-law claims.” (emphasis added)); Zambelli Fireworks Mfg. Co. v. Wood, 592 F.3d 412, 420–21 (3d Cir. 2010) (“However, considerations of efficiency, fairness, and judicial economy weigh against a wholesale dismissal of the action at this stage.” (emphasis added)); Parker & Parsley Petroleum Co. v. Dresser Indus., 972 F.2d 580, 587 (5th Cir. 1992) (“At the stage of the proceedings when the motion was filed, judicial economy would have been better served by dismissal.” (emphasis added)); Park S. Hotel Corp. v. N.Y. Hotel Trades Council, 851 F.2d 578, 582 (2d Cir. 1988) (“[J]udicial economy would not be served by remanding the case at this late stage for arbitration, which almost certainly would be followed by further judicial proceedings.” (emphasis added)); United States v. Timmons, 672 F.2d 1373, 1380 (11th Cir. 1982) (“The district judge appropriately considered that joinder would not serve the interests of judicial economy in view of the late stage of the proceedings . . . .” (emphasis added)).
The Majority asserts, however, without citation to any authority, that a district court cannot consider “the fact that the complex nature of a case resulted in the class certification decision being deferred for years,” see Majority Op. ––––, or “the need to further delay trial were the class not to be certified,” see Majority Op. ––––, or even “the need to conduct further discovery if the class is not certified,” see Majority Op. ––––. But these are precisely the type of practicalities—how long it would take, how complex it would be, how expensive it would be—that help determine the practicability of joinder. The Majority's directive as to what a district court should consider turns the issue into an exercise in abstraction.4 If a district judge cannot consider the practicalities of cost and time, then judicial economy will be poorly served indeed, and one of the core purposes of the class action mechanism—to “save[ ] the resources of both the courts and the parties by permitting an issue potentially affecting every [class member] to be litigated in an economical fashion under Rule 23”—will be undercut. See Califano v. Yamasaki, 442 U.S. 682, 701, 99 S.Ct. 2545, 61 L.Ed.2d 176 (1979).
The District Court Properly Considered the Ability of Plaintiffs to Litigate via Joinder
The Majority also characterizes the District Court's ruling as focusing not on the ability of the plaintiffs to litigate via joinder but “focus[ing] instead on whether the individual plaintiffs could have brought their own, individual suits.” See Majority Op. ––––. I disagree. To the contrary, the focus of the District Court's opinion is on joinder throughout.5 See, e.g., JA-020-22 (“Joinder of the absent class members would likely require additional rounds of discovery, which would only further delay a trial date.”); (“The considerable geographic dispersion of the parties would certainly present challenges to plaintiffs in attempting to coordinate the litigation if all class members were joined, particularly if additional discovery was required.” (emphasis added)); (“Accordingly, Plaintiffs have demonstrated by a preponderance of the evidence that the parties are sufficiently numerous so as to make joinder impracticable.”).
The Majority makes its own contrary finding, surmising that the class members “appear likely to have the ability and incentive to bring suit as joined parties.” See Majority Op. ––––.6 It directs the District Court on remand to consider whether it would be “uneconomical” for the six smaller class members to be joined. It thus instructs the District Court to make the case for joinder—a case the defendants failed to support themselves. The defendants offered little argument (let alone evidence) before the District Court that, notwithstanding the vast geographic dispersion of the plaintiffs, surely joinder would be practicable. Indeed, at oral argument, counsel for Ranbaxy was asked whether joinder was impracticable and responded, “I don't know.” Oral Arg. at 9:30-10:00.7 The Majority is erecting roadblocks that do not exist.
Moreover, if one were to speculate as to the likelihood of these plaintiffs, many competitors in a relatively small market, agreeing to come together—for surely they couldn't be forced to do so—the speculation would be to the contrary. Experience would dictate that many obvious practical reasons stand in the way of joinder: desire to have one's self and own law firm control the litigation, choice of favorable forum, familiarity with the local jurisdiction's laws and procedures, fear of being dragged into settlement, and concerns about the costs of litigating in a far-flung locale. Further, the larger plaintiffs could clearly afford to go their own way. Even in cases that come before the Judicial Panel on Multidistrict Litigation for joinder, where the issue involves only pre-trial proceedings,8 plaintiffs invariably raise reasons for opposing joinder.9 How can we possibly assume that the plaintiffs would have the “ability and incentive” to bring suit as joined parties? If the defendants had supported such a notion, that would be another matter. But the Majority asks the District Court to make its own record as to practicability. That is not our role, nor the role of the District Court.
* * *
Lastly, I am struck by the inescapable fact that this case has proceeded as a class action for years and nothing about it cries out for anything but class treatment.10 One has only to read the Majority's analysis of the real issues before the Court to conclude that it is unimaginable that this case should be torn apart at this late date and sent to the far corners of the United States to start over again as separate actions before several judges, each deciding anew the identical issues facing each plaintiff's claims. It should not be remanded at this late date.
This should not happen because Judge Goldberg has ably managed this case for a decade and properly considered every factor we have ever held to be relevant in determining whether a class is so numerous that joinder would be impracticable. He has not abused his discretion in so doing or made clearly erroneous findings of fact. I would therefore affirm the judgment of the District Court in its entirety. Accordingly, I respectfully dissent from Part III.A of the Majority's opinion.
1. This exclusivity period is granted via the Hatch–Waxman Act, and has nothing to do with whether the drug is covered by a patent.
2. It is a common practice for a brand manufacturer to market its own generic version of the drug when generic entry occurs. Unlike an ANDA filer, the brand manufacturer is not barred from entering the generic market during the 180-day exclusivity period to which the first paragraph IV filer is entitled. See Teva Pharm. Indus. Ltd. v. Crawford, 410 F.3d 51, 54 (D.C. Cir. 2005) (holding that 21 U.S.C. § 355(j)(5)(B)(iv) did not prevent the filer of the original NDA from launching its own generic during the 180-day exclusivity period).
3. An orphan drug is used to treat a rare disease or ailment. Because pharmaceutical companies may lack the financial incentive to develop such drugs, the Orphan Drug Act provides the brand manufacturer with a seven-year period of non-patent exclusivity. See 21 U.S.C. § 360cc(a).
4. In a reverse-payment settlement, “a party with no claim for damages (something that is usually true of a paragraph IV litigation defendant) walks away with money simply so that it will stay away from the patentee's market.” F.T.C. v. Actavis, Inc., ––– U.S. ––––, 133 S.Ct. 2223, 2233, 186 L.Ed.2d 343 (2013). Such agreements are subject to antitrust scrutiny under the “rule of reason” inquiry because a reverse payment, “where large and unjustified, can bring with it the risk of significant anticompetitive effects.” Id. at 2237.
5. Generic manufacturer Apotex Inc. nonetheless filed a declaratory judgment action in the Eastern District of Pennsylvania in June 2006 alleging non-infringement, invalidity, and unenforceability of the ′516 patent. Apotex Inc. v. Cephalon, Inc., No. 2:06–cv–2768, 2011 WL 6090696, at *1 (E.D. Pa. Nov. 7, 2011). The District Court held that the patent was invalid and unenforceable on November 7, 2011, a ruling which was upheld on appeal. Apotex Inc. v. Cephalon, Inc., 500 Fed.Appx. 959 (Fed. Cir. 2013) (per curiam). The District Court, in a separate opinion, also held that Apotex would not infringe the ′516 patent. Apotex, Inc. v. Cephalon, Inc., No. 2:06–cv–2768, 2012 WL 1080148, at *1 (E.D. Pa. Mar. 28, 2012).
6. Other parties challenging the reverse-payment settlement agreements are a putative class of end-payors, generic competitor Apotex Inc., several retail plaintiffs, and the F.T.C., which originally filed suit in the District Court for the District of Columbia before its case was transferred to Judge Goldberg's docket. The F.T.C. sued only Cephalon. Teva purchased Cephalon on October 14, 2011, and on May 18, 2015, the F.T.C. settled with Teva for $1.2 billion. The remaining suits are consolidated for purposes of liability, and they have all been stayed pending our ruling on the DPP class certification issue.
7. Despite this conclusion, we recognize the thoughtful work of the District Court, which was diligently done even though there is a paucity of precedent on the numerosity issue.
8. Defendants raised two other challenges to the size of the class before the District Court and in a cursory manner on appeal. They argue (1) that named plaintiff King Drug Company of Florence, Inc. (“King Drug”) should not be included in the class because it went out of business before generic modafinil entered the market in 2012, and thus there is no way of knowing if it would have even purchased generic modafinil, and (2) that Drogueria Betances should not be included in the class because all of its brand modafinil purchases were made after generic entry. We see no need to question the inclusion of these two class members. King Drug presented testimony showing that it would have purchased generic modafinil instead of Provigil if it had been on the market. Similarly, the experts of both parties agreed that it takes several months before prices fall to competitive levels after generic entry. Because Drogueria Betances made its brand modafinil purchases only one month after generic entry, it is conceivable that it paid an overcharge.
9. Nearly identical language is found in the Second Restatement of Contracts, which was pending approval at the time of In re Fine Paper Litigation. Restatement (Second) of Contracts § 326 (“[A]n assignment of a part of a right . . . is operative as to that part to the same extent and in the same manner as if the part had been a separate right.”).
10. Although normally “Rule 23(a)(1) does not require a plaintiff to offer direct evidence of the exact number and identities of the class members,” Marcus v. BMW of N. Am., LLC, 687 F.3d 583, 596 (3d Cir. 2012), when the number of class members is so small that any deviation may impact the district court's numerosity analysis, plaintiffs must provide evidence of each class member's identity or risk having that member not counted. The declaration of the settlement administrator that there are three more class members is not enough in this case, where we are at the low end of what is deemed to be a sufficient number of class members. This is particularly true where all assignees—partial or otherwise—are large corporations whose identity is easily ascertainable. On remand, Plaintiffs will need to provide more evidence concerning these three potential class members if they wish to have them counted for numerosity purposes.
11. The superiority analysis required under Rule 23(b)(3) similarly calls for an inquiry into judicial economy and places great weight on whether the individual members can bring their own claims. However, superiority, unlike numerosity, considers alternatives to class actions other than joinder. See Fed. R. Civ. P. 23(b)(3) (requiring an inquiry into whether “a class action is superior to other available methods for fairly and efficiently adjudicating the controversy”); In re Warfarin Sodium Antitrust Litig., 391 F.3d 516, 533–34 (3d Cir. 2004) (“The superiority requirement ‘asks the court to balance, in terms of fairness and efficiency, the merits of a class action against those of alternative available methods of adjudication.’ ” (quoting In re Prudential Ins. Co. Am. Sales Practice Litig. Agent Actions, 148 F.3d 283, 316 (3d Cir. 1998))); id. at 534 (finding superiority to be satisfied because “there are a potentially large number of class members in this matter . . . [and] each consumer has a very small claim in relation to the cost of prosecuting a lawsuit. Thus, from the consumers' standpoint, a class action facilitates spreading of the litigation costs among the numerous injured parties and encourages private enforcement of the statutes.”). Numerosity, of course, is a prerequisite to all class actions, while a finding of superiority is necessary only in a (b)(3) suit.
12. The third purpose behind class actions mentioned in Marcus, the due process concern of protecting the ability of individual members to bring their own claims, Marcus, 687 F.3d at 594–95, is not present in Rule 23(b)(3) actions where members have the right to opt out of the class and where the identity of all class members is ascertainable such that there will be no difficulties in ensuring that they receive notice of the representative action. See Fed. R. Civ. P. 23(c)(2)(B)(v).
13. The dissent does not “read the District Court's analysis [of the judicial economy factor] as turning upon a consideration of the late stage of the proceeding.” However, this analysis consisted of three sentences in a single paragraph, each of which focused on the late stage of the proceeding. King Drug Co. of Florence, Inc. v. Cephalon, Inc., 309 F.R.D. 195, 206–07 (E.D. Pa. 2015). We also note that the District Court's entire numerosity section spanned three pages, one of which is nothing more than a summary of the parties' arguments.
14. The dissent cites several cases that it claims “recognize that it is appropriate for courts to consider the stage of the proceedings when weighing judicial economy.” None of these cases are class actions, though, which we again emphasize are the “exception to the usual rule that litigation is conducted by and on behalf of the individual named parties only.” Wal–Mart Stores, Inc. v. Dukes, 564 U.S. 338, 348, 131 S.Ct. 2541, 180 L.Ed.2d 374 (2011) (quoting Califano v. Yamasaki, 442 U.S. 682, 700–01, 99 S.Ct. 2545, 61 L.Ed.2d 176 (1979)).
15. Weiss was abrogated on other grounds by Campbell–Ewald Co. v. Gomez, ––– U.S. ––––, 136 S.Ct. 663, 193 L.Ed.2d 571 (2016). However, its reasoning concerning the impropriety of “premature certification decisions” was reaffirmed in Richardson, ––– F.3d at ––––, 2016 WL 3854216, at *4.
16. The District Court also considered the effects on judicial economy if individual suits in separate jurisdictions would be filed absent class certification. However, the text of Rule 23(a)(1) envisions only two scenarios: joinder of all class members or a class action. Fed. R. Civ. P. 23(a)(1) (inquiring whether “joinder of all members is impracticable”). The possibility of individual suits filed in separate jurisdictions is not a consideration that a district court should entertain in deciding numerosity vel non.
17. In the District Court, Defendants never asked for discovery from unnamed class members. Defendants claim that a request for discovery of unnamed class members would have been futile because such discovery is highly circumscribed. However, the citations that they provide in support of this view make clear that this is merely a heightened standard, and if they could show a need for discovery from unnamed class members the District Court would allow it. See 5 Moore's Federal Practice § 33.20 (“Reasonable discovery . . . should be permitted from unnamed class members when the special circumstances of the case justify it.”); Clark v. Universal Builders, Inc., 501 F.2d 324, 341 (7th Cir. 1974) (“The taking of depositions of absent class members is—as is true of written interrogatories—appropriate in special circumstances.”).
18. The dissent suggests that we proclaim this rule “without any citation to authority.” Of course, our dissenting colleague fails to provide any citation to authority to support a contrary rule. In fact, the only authorities that we can find to support the dissent's position are the District Court's opinion in this case, and another district court opinion from the Eastern District of Pennsylvania upon which the District Court here relied, In re Wellbutrin XL Antitrust Litig., No. 08–2431, 2011 WL 3563385, at *3 (E.D. Pa. Aug. 11, 2011). This is a matter of first impression for any court of appeals. Indeed, our Court has never even identified the factors that a district court should consider in its numerosity analysis despite the dissent's assertion that the District Court in this case “properly considered every factor we have ever held to be relevant” in this analysis. We cannot abdicate our responsibility to conduct a de novo review of legal issues. See In re Hydrogen Peroxide Antitrust Litig., 552 F.3d 305, 312 (3d Cir. 2008) (recognizing that although class certification decisions are reviewed for abuse of discretion, “[w]hether an incorrect legal standard has been used is an issue of law to be reviewed de novo” (internal quotation marks omitted)).
19. We read Marcus's language about the ability “to litigate individually,” Marcus, 687 F.3d at 594, to refer to each plaintiff appearing on the record as a joined party, and not whether each individual plaintiff can litigate his or her own claim as the sole plaintiff. While the latter concern is certainly a policy justification for the class device generally, as we emphasize, Rule 23(a)(1) requires only the binary choice between class actions and joinder of all parties.
20. Other considerations may be relevant to a district court in determining class members' ability and motivation to be joined as named plaintiffs. For example, the District Court here recognized that a fear of retaliation may hinder the ability and motivation of a party to appear as a named plaintiff. In this case, the District Court noted that there was no proof of any fear of retaliation, and we do not disturb that factual finding on appeal. King Drug Co. of Florence, Inc. v. Cephalon, Inc., 309 F.R.D. 195, 207 (E.D. Pa. 2015).
21. A negative value claim is a “claim[ ] that could not be brought on an individual basis because the transaction costs of bringing an individual action exceed the potential relief.” In re Baby Prods. Antitrust Litig., 708 F.3d 163, 179 (3d Cir. 2013).
22. While discovery from joined parties is not subject to the heightened standard that applies to discovery from unnamed class members, even in a non-class action a district court has the discretion to limit unnecessary discovery pursuant to Rule 26(b)(2)(C).
23. The dissent contends that we misread the District Court's analysis, and argues that “the focus of the District Court's opinion is on joinder throughout.” Yet every reference to joinder that the dissent cites comes from portions of the District Court opinion that were not about the ability of the plaintiffs to litigate via joinder. Instead, these references are: “[j]oinder of the absent class members would likely require additional rounds of discovery,” which appears in the judicial economy section; “[t]he considerable geographic dispersion of the parties would certainly present challenges to plaintiffs in attempting to coordinate the litigation if all class members were joined,” which obviously is in the geographic dispersion section; and “Plaintiffs have demonstrated by a preponderance of the evidence that the parties are sufficiently numerous so as to make joinder impracticable,” which is in the conclusion of the numerosity analysis. Even a cursory look at the section on the ability and incentive of the class members to litigate reveals that the District Court was focused on the alternative of individual suits, not on joinder. See King Drug, 309 F.R.D. at 207 (“Two factors that may weigh against Plaintiffs are the financial resources of the class members and the parties' abilities to bring individual suits.”) (emphasis added); id. (“These prospective class members likely do not have the same incentive to engage in costly antitrust litigation on their own.”) (emphasis added). To the extent that the District Court did properly consider the alternative of joinder, as the dissent contends, on remand the District Court has the opportunity to more clearly state this when it conducts its rigorous numerosity analysis. At this point, the references to “individual suits” and “on their own” prominently stand out when surrounded by the references to “joinder” in the other sections.
24. Most of the dissent's possible reasons why the class members would not be likely to join as named plaintiffs—“desire to have one's self and own law firm control the litigation, choice of favorable forum, familiarity with the local jurisdictions laws and procedures, [and] fear of being dragged into settlement”—are equally applicable to the decision of whether to opt out of the class. See Phillips Petroleum Co. v. Shutts, 472 U.S. 797, 813, 105 S.Ct. 2965, 86 L.Ed.2d 628 (1985) (discussing the importance of allowing opt outs because if a “plaintiff's claim is sufficiently large or important that he wishes to litigate it on his own, he will likely have retained an attorney or have thought about filing suit, and should be fully capable of exercising his right to ‘opt out’ ”). Moreover, these reasons do not show why joinder is “impracticable”; they simply show that joinder may not be the preferred method of proceeding with the case. If a plaintiff wants to proceed individually, it has that choice. The plaintiff does not need to join the suit—just as it need not remain a member of a certified class—if it wants to control its own litigation, choose a more favorable forum, select a jurisdiction whose laws and procedures it is familiar with, or avoid being dragged into a settlement. Cf. In re Diet Drugs Prods. Liability Litig., 369 F.3d 293, 308 (3d Cir. 2004) (“By waiving an initial opt-out, the class member surrenders what may be valuable rights, in return for countervailing benefits.”).
25. The dissent makes the argument that if the class were not certified, then several individual judges would have to address what it terms “the real issues before the Court.” Yet the only other issue before the Court is the Comcast predominance issue. If the class were not certified because of a failure to satisfy the numerosity requirement, there would be no Comcast argument, as predominance is a question that arises only in the class action context. Additionally, the fact that there is a “key issue” that the parties seek to litigate does not justify class status.
26. Rule 23(b)(3) also states that “a class action [must be] superior to other available methods for fairly and efficiently adjudicating the controversy.” This second requirement—superiority—is not at issue in this appeal.
27. Plaintiffs argue that we should exercise our pendent appellate jurisdiction and review the District Court's grant of summary judgment on the global antitrust conspiracy claim because reversal on this claim would moot the Comcast issue. The use of the pendent appellate jurisdiction doctrine “is an exercise of discretion by a Court of Appeals and should be used sparingly.” United States v. Spears, 859 F.2d 284, 287 (3d Cir. 1988). If we were to reverse on the Comcast issue, we would deem it prudent to examine the global antitrust conspiracy claim. However, because we would affirm on predominance grounds, we do not deem the class certification order and the summary judgment order to be so “inextricably intertwined” that the exercise of our pendent appellate jurisdiction would be appropriate. CTF Hotel Holdings, Inc. v. Marriott Int'l, Inc., 381 F.3d 131, 136 (3d Cir. 2004) (internal quotation marks omitted). Accordingly, we express no view on the merits of Plaintiffs' global antitrust conspiracy claim.
28. Defendants have not challenged the substance of Dr. Leitzinger's methodology.
29. Under the doctrine of joint and several liability, “[i]f the tortious conduct of each of two or more persons is a legal cause of harm that cannot be apportioned, each is subject to liability for the entire harm, irrespective of whether their conduct is concurring or consecutive.” Restatement (Second) of Torts § 879 (1979); United States v. Alcan Aluminum Corp., 964 F.2d 252, 268 (3d Cir. 1992) (applying the doctrine of joint and several liability to an environmental statute when the harm was indivisible amongst the tortfeasors). The Third Restatement of Torts provides no guidance. Restatement (Third) of Torts: Apportionment of Liability § 17 (stating that “the law of the applicable jurisdiction determines whether” concurrent tortfeasors “are jointly and severally liable”).
30. Antitrust standing, unlike Article III standing, is not a jurisdictional requirement. Associated Gen. Contractors of Cal., Inc. v. Cal. State Council of Carpenters, 459 U.S. 519, 535 n.31, 103 S.Ct. 897, 74 L.Ed.2d 723 (1983) (“Harm to the antitrust plaintiff is sufficient to satisfy the constitutional standing requirement of injury in fact, but the court must make a further determination whether the plaintiff is a proper party to bring a private antitrust action.”); Ethypharm S.A. France v. Abbott Labs., 707 F.3d 223, 232 (3d Cir. 2013) (describing antitrust standing as a prudential limitation that “does not affect the subject matter jurisdiction of the court, as Article III standing does”).
31. Because the parties only dispute the relevance of Mid–West Paper and the “directness” factor, and we reject Defendants' understanding of Mid–West Paper, we will not analyze the other factors of antitrust standing.
32. The direct-purchaser rule states that only immediate customers of a supplier have antitrust standing to sue for damages as customers even if the direct purchaser passes the entirety of the higher price down the supply chain. Illinois Brick Co. v. Illinois, 431 U.S. 720, 746, 97 S.Ct. 2061, 52 L.Ed.2d 707 (1977).
33. Preventing a market from forming differs from an attempt to suppress competition in an established market. See Blue Shield of Va. v. McCready, 457 U.S. 465, 483, 102 S.Ct. 2540, 73 L.Ed.2d 149 (1982) (stating that, in a conspiracy to suppress competition in the psychotherapy market by restricting access to psychologists (as opposed to psychiatrists), customers of the psychologists would only be indirectly injured).
34. Defendants argue that either Teva—as the first company to settle with Cephalon—or Barr—as the last to do so—caused all of the injury, and that Mylan or Ranbaxy cannot be held liable. While we have delved deep into the merits in order to opine on the predominance question, this argument by Defendants is inappropriate at the class certification stage. It has nothing to do with whether common questions of law and fact predominate, and instead goes to the issue of liability. See Tyson Foods, Inc. v. Bouaphakeo, ––– U.S. ––––, 136 S.Ct. 1036, 1047, 194 L.Ed.2d 124 (2016) (“When, as here, ‘the concern about the proposed class is not that it exhibits some fatal dissimilarity but, rather, a fatal similarity—[an alleged] failure of proof as to an element of the plaintiffs' cause of action—courts should engage that question as a matter of summary judgment, not class certification.’ ” (quoting Richard A. Nagareda, Class Certification in the Age of Aggregate Proof, 84 N.Y.U. L. Rev. 97, 107 (2009))).
1. As the Majority notes, the third underpinning, to “prevent [ ] putative class representatives and their counsel, when joinder can be easily accomplished, from unnecessarily depriving members of a small class of their right to a day in court to adjudicate their own claims,” id., is not relevant to 23(b)(3) actions. See Majority Op. –––– n.12.
2. The Majority asserts that the factors we should consider have not been previously set forth. But Marcus's recitation of the policies that animate numerosity provides a helpful standard from which the factors to consider are readily discernible.
3. The Majority's references to “sunk costs” are inapt. See Majority Op. ––––, ––––, ––––. Sunk costs are costs that have already been incurred and cannot be recovered. See Verizon Commc'ns, Inc. v. FCC, 535 U.S. 467, 499, 122 S.Ct. 1646, 152 L.Ed.2d 701 (2002) (“ ‘Sunk costs' are unrecoverable past costs . . . .” (emphasis added)). The District Court's analysis did not consider sunk costs, but rather the relative costs, going forward, of joinder and class litigation. To determine the relative costs, going forward, of joinder and class litigation, one needs to know how much remains to be done under either alternative.
4. The Majority instructs district courts to consider whether, in a hypothetical world, joinder would have been more efficient than the class mechanism. Cf. Majority Op. –––– (“In other words, without considering the late stage of the litigation, it should determine whether a class action would have been a substantially more efficient mechanism of litigating this suit than joinder of all parties. . . . At the same time, the District Court is free to rely on its superior understanding of how the case has proceeded to date for the purpose of determining whether the class mechanism would have actually been a substantially more efficient use of judicial resources than joinder of the parties at the onset of the litigation.”).
5. The Majority, citing references to “individual suits” in the District Court's opinion, posits that “[e]ven a cursory look at the section on the ability and incentive of the class members to litigate reveals that the District Court was focused on the alternative of individual suits, not on joinder.” See Majority Op. –––– n.23. But the two concepts are not exclusive of each other, as the Majority itself recognizes in footnote 19, when it “read[s] Marcus's language about the ability ‘to litigate individually,’ to refer to each plaintiff appearing on the record as a joined party, and not whether each individual plaintiff can litigate his or her own claim as the sole plaintiff.” See Majority Op. –––– n.19 (citation omitted). Litigation not pursued on a classwide basis is individual litigation, even if pursued via joinder, and the parties joined in a proceeding remain responsible for the individual litigation of their claims. See 7 Wright & Miller, supra, § 1652 (“Consequently, rights that are separate and distinct under the governing law are not transformed into joint rights when plaintiffs join under Rule 20 in a federal court action; each plaintiff's right of action remains distinct, as if it had been brought separately.”).
6. This assertion stems from the defendants' argument that “[e]ach of the 16 absent class members has the ability and the financial incentive to file its own claim.” See Appellants' Br. 45. The Majority adopts this speculation as fact, but it is mere argument and speculation.
7. An audio recording of the oral argument is available online at http://www2.ca3.uscourts.gov/oralargument/audio/15-3475InReModafinil.mp3
8. This case would not be appropriate for an MDL as it is ready for trial and pre-trial proceedings are largely completed.
9. See, e.g., In re: Sci. Drilling Int'l, Inc., FLSA Litig., 24 F.Supp.3d 1364, 1364–65 (J.P.M.L. 2014) (“Plaintiffs oppose centralization as unnecessary, stating that they recognize the overlap in the actions and they already have agreed to coordinate pretrial proceedings to avoid duplicative discovery and inconsistent rulings.”); In re: Standard & Poor's Rating Agency Litig., 949 F.Supp.2d 1360, 1361 (J.P.M.L. 2013) (“Plaintiffs oppose centralization and argue, inter alia, that the Panel has never centralized litigation of this type, that transfer to a distant forum will inconvenience the states, and that transfer is unnecessary in light of the historic cooperation among state attorneys general.”); In re Le–Nature's, Inc., Commercial Litig., 609 F.Supp.2d 1372, 1373–74 (J.P.M.L. 2009) (“Plaintiffs opposed to centralization argue, inter alia, that (1) the allegations pertaining to the bottling actions make up only a minimal part of the Trustee's case; (2) all active parties to the bottling actions have admitted that Le–Nature's perpetrated a systematic fraudulent scheme and, therefore, a large portion of the allegations set forth in the Trustee's action is insignificant to the bottling actions; (3) the bottling actions are straightforward fraud cases that can readily be handled by their respective district courts; and (4) discovery can be coordinated in the bottling actions without centralization.”).
10. The Majority contends that this is no “run-of-the-mill class action” given the top-heavy distribution of the claims among the class members. See Majority Op. ––––. But whether the class looks like other classes is not controlling as to whether the requirements of Rule 23 have been met. Rule 23 was “designed to allow an exception to the usual rule that litigation is conducted by and on behalf of the individual named parties only.” Califano, 442 U.S. at 700–01, 99 S.Ct. 2545 (emphasis added). The plaintiff's burden under Rule 23 is merely to demonstrate compliance by a preponderance of the evidence—not to establish “proof beyond any doubt.” Reyes v. Netdeposit, LLC, 802 F.3d 469, 485 (3d Cir. 2015). Here, given the evidence the plaintiffs have adduced regarding, inter alia, the impracticability of joinder and the predominance of common questions of law and fact, and given the paucity of contrary evidence adduced by the defendants, I reiterate that nothing about this case cries out for anything but class treatment.
SMITH, Circuit Judge.