The PEOPLE of the State of New York, Plaintiff, v. H.K., Defendant.
In People v. Wakefield, 175 A.D.3d 158, 107 N.Y.S.3d 487 [3d Dept.], lv denied, 34 N.Y.3d 1083, 116 N.Y.S.3d 158, 139 N.E.3d 816 [2019], the Court considered whether the prosecution's introduction of testimony concerning analysis of deoxyribonucleic acid (“DNA”) evidence conducted using the TrueAllele Casework System (hereinafter, “TrueAllele”), a software program, violated the defendant's right to confront his accusers. The Court found that “TrueAllele, by running at the source code's direction, compared DNA found at the crime scene to that of defendant's DNA and generated the report containing the likelihood ratios, which, in effect, implicated defendant in the murder.” (id. at 168, 107 N.Y.S.3d 487). Applying the primary purpose test outlined in People v. Pealer, 20 N.Y.3d 447, 962 N.Y.S.2d 592, 985 N.E.2d 903, cert denied, 571 U.S. 846, 134 S.Ct. 105, 187 L.Ed.2d 77 [2013], the Court determined that the TrueAllele report was testimonial. (Wakefield at 168-9, 107 N.Y.S.3d 487). Yet, despite the fact that the report was generated through “a synergy and distributed cognition continuum between human and machine”, the Court did not find the distribution allotted to the program to be sufficient to “transform the source code into a declarant.” (id. at 169, 107 N.Y.S.3d 487 [internal citations omitted] ). In Wakefield, Mark Perlin, “the founder, chief scientist and chief executive officer of Cybergenetics,” the company which developed and marketed TrueAllele, testified at trial. (id. at 161, 107 N.Y.S.3d 487). Perlin specified which functions were performed by a human analyst and which were done by the program. Since Perlin, “the declarant in the epistemological, existential and legal sense”, testified at trial, the Court held that the defendant's right to confrontation was not violated. (id. at 169-70, 107 N.Y.S.3d 487).
In this case, the defendant stands charged with two counts of Forcible touching (Penal Law (PL) 130.52(1)), one count of Endangering the welfare of a child (PL 260.10(1)), two counts of Sexual abuse in the third degree (PL 130.55) and two counts of Harassment in the second degree (PL 240.26(1)). It is alleged that on October 28, 2018, the defendant sexually assaulted two teenagers by, inter alia, entering their bedroom and touching them on their breasts and vaginas. The complainants reported the incident immediately to their mother and other family members who confronted the defendant and contacted the police. The defendant, a tenant of the complainants' grandmother, made statements to the family and the police acknowledging interacting with the girls but denying that any sexual contact took place. The complainants were transported to a hospital where they were examined by a sexual assault nurse examiner and a sexual assault evidence collection kit was prepared that included swabs taken of relevant areas of both complainants. The defendant gave a consensual DNA sample and was later ordered to provide an additional sample for comparison. DNA samples were also taken from both complainants.
Relevant findings from the analysis conducted by the Office of the Chief Medical Examiner (OCME) include 1 :
1. Dried secretion swab from “left breast” of (complainant 1) contained a DNA mixture that is “approximately 12.2 quadrillion (1.22 × 10^16) times more probable if the sample originated from H.K., (complainant 1), and one unknown person than if it originated from (complainant 1) and two unknown persons. Therefore, this supports that H.K. is included as a contributor to this sample.”
2. Dried secretion swab from “right breast” of (complainant 1) contained a DNA mixture that is “approximately 4.81 quadrillion (4.81 × 10^15) times more probable if the sample originated from H.K., (complainant 1), and one unknown person than if it originated from (complainant 1) and two unknown persons. Therefore, this supports that H.K. is included as a contributor to this sample.”
3. “A likelihood ratio was calculated for the comparison of H.K. to the DNA mixture found on the dried secretion swab from ‘left breast’ (of complainant 2). H.K. is excluded as a contributor to this sample.”
Additionally, male DNA was recovered from other swabbed areas of complainant 2, but in insufficient concentrations to permit DNA typing.
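For readers unfamiliar with the statistic, the figures quoted in the first two findings are likelihood ratios. A minimal worked form follows; the symbols E, H_1 and H_2 are introduced here for illustration only and do not appear in the OCME report.

```latex
% E   : the DNA mixture observed on the swab
% H_1 : the mixture came from H.K., complainant 1, and one unknown person
% H_2 : the mixture came from complainant 1 and two unknown persons
\[
  \mathrm{LR} \;=\; \frac{P(E \mid H_1)}{P(E \mid H_2)}
\]
% Finding 1 (left breast):  12.2 quadrillion = 12.2 \times 10^{15}
\[
  \mathrm{LR}_{\text{left}} \approx 1.22 \times 10^{16}
\]
% Finding 2 (right breast): 4.81 quadrillion = 4.81 \times 10^{15}
\[
  \mathrm{LR}_{\text{right}} \approx 4.81 \times 10^{15}
\]
```

A ratio above 1 means the observed mixture is more probable under H_1 than under H_2, which is the sense in which the report states that each result "supports that H.K. is included as a contributor."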
The People have sought to admit the first two findings at trial through the testimony of OCME Criminalist Level II Alison Eychner. The defense objected, arguing that in order to calculate the probability ratios described above, the criminalist used a software program called STRmix that is similar, but not identical, to TrueAllele. Citing Wakefield, they sought preclusion of the criminalist's testimony on this topic as violative of the Confrontation Clause. In order to resolve the question of whether such testimony would run afoul of the principles set forth in Wakefield, the Court ordered a hearing.
The issue at the hearing, as agreed to by both parties, was whether the testimony of the criminalist provides a sufficient opportunity for cross-examination to satisfy the Confrontation Clause as defined by Crawford v. Washington, 541 U.S. 36, 124 S.Ct. 1354, 158 L.Ed.2d 177 (2004) and its progeny.
Findings of Fact
The People called Tiffany Vasquez, a Criminalist IV and assistant technical leader in the Department of Forensic Biology at the New York City OCME. The Court finds her to be a credible witness. The defense did not present any witnesses.
Ms. Vasquez has worked at the OCME for 15 years. Criminalist IV is the highest-level criminalist designation. Her responsibilities include supervising Criminalists I, II and III, reviewing case files and technical data, and triaging evidence. As an assistant technical leader, Ms. Vasquez reviews validation data for new technologies, assists in developing standard operating procedures, assists analysts with technical issues with their casework and assists in training analysts.2 Ms. Vasquez holds a bachelor's degree in molecular and cell biology from the University of California and a master's degree in forensic science from the University of Illinois at Chicago.
Ms. Vasquez assists with training on STRmix by giving lectures and assisting in exercises. She also conducts oral examinations for the analysts that are completing their training. She assists analysts who report to her and other supervisees while they are being trained and with their first cases. She has conducted or reviewed thousands of statistical analyses.
There are four stages of DNA testing. These are: (1) extraction, in which the sample is heated in order to release the DNA and isolate it; (2) quantitation, where the amount of DNA is determined; (3) amplification, where the DNA is copied; and (4) detection, where a genetic analyzer is used to determine whether there is any usable DNA in the sample.
Once the detection phase is completed, the analyst enters the raw data into a software program called Genemarker. The analyst uses Genemarker to perform an evaluation of the data to account for any artifacts of the process or other “background noise” that would interfere with the results. (I:11: 14)3 Then, “they will look at the sample and the case as a whole to determine what samples can be interpreted or compared.” (I:13: 4-6)
When this process is complete, a criminalist is assigned to the case and goes through each sample presented to see if it can be compared to another sample or interpreted. Each sample can contain DNA from a single source or a mixture from two or three people.4
The criminalist will look at any reference samples provided by a victim to see if that can give them information about the sample they are comparing. If it is a single-source sample, they may be able to assign a genotype or DNA profile then.
If the sample contains a mixture of two or three people, the analyst will employ STRmix in determining probabilities. STRmix is “a probabilistic genotyping software program, which means that it's an assistive tool that helps DNA analysts with their interpretation of data.” (I:15: 11-14)5 . Ms. Vasquez further explained:
“For a mixture, sometimes if you have a lot of DNA from one person in the mixture and very little DNA from another person in the mixture, it can also be straightforward to determine the genotype of that major contributor who contributed more DNA or the minor contributor that contributed less DNA. But many mixtures are less straightforward, and they have similar amounts from both people in the mixture or three people. So, an analyst can determine about how much DNA came from each person and possible genotypes for each of the individual people, but STRmix can assist them in assigning some probability to each of those genotypes that are possible in that mixture. That is the probabilistic portion of the software it is assigning probabilities to each of the genotypes.” (I:16: 3-11).
After the analyst designates whether it is a two-person or three-person mixture, “STRmix gives an estimate about how much DNA came from each individual, and possible genotypes for each contributor and assigns a probability for each of those genotypes.” (I:17: 17-20). The analyst then compares those results with what they would expect to see from the sample:
“So, are those genotypes possibilities that it came up with reasonable based on the DNA results that they are interpreting themselves? Do the mixture proportions meet their expectation of what they can calculate as a mixture proportion for that particular sample? And they're trained in all of that interpretation before they are trained in STRmix. They are trained in how to determine mixture proportion; they are trained how to assign genotypes to individual contributors within a mixture. They are just making sure STRmix conforms to their expectations.” (I:18: 3-12)
If the results produced by STRmix differ from the interpretations of the analyst, the analyst will re-examine the inputs to determine if there were mistakes, such as an incorrect number of contributors or a missed artifact from processing. The analyst will then rerun the data based on the corrected parameters.
When comparing DNA results to a known reference sample, STRmix generates a likelihood ratio using biological and mathematical modeling. Thus, when it is looking at different loci, “it assigns a probability for each genotype combination.” (I:42: 15-16). “It compares millions of possibilities, looks at how well they fit the data based on its own biological modeling and then generates an individual statistic for every locus.” (I:45: 12-16). OCME analysts then “are looking at the STRmix outputs and comparing it to their interpretation of the data to see that it aligns with their interpretation.” (II:34: 5-7).
An analyst does not examine each of the millions of proposals or guesses the program makes, but “they are looking at [the] summary table that shows what happened over time.” (II:34: 13-15). “If given the proposal, they could calculate why it's accepted or rejected on a particular guess.” (II:34: 23-24).
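The accept-or-reject step described in this testimony can be illustrated with a generic random-walk Metropolis sampler. This is only a sketch under stated assumptions and is not STRmix code: the scoring function `log_target`, the step size, the seed and every name below are hypothetical stand-ins chosen for illustration.

```python
import math
import random


def log_target(x: float) -> float:
    """Hypothetical stand-in for how well a proposal fits the data:
    the log-density of a standard normal, up to a constant."""
    return -0.5 * x * x


def accept(current: float, proposal: float, u: float) -> bool:
    """Metropolis rule: accept when u < min(1, target(proposal) / target(current))."""
    log_ratio = log_target(proposal) - log_target(current)
    return log_ratio >= 0 or u < math.exp(log_ratio)


def run_chain(steps: int, step_sd: float = 1.0, seed: int = 0):
    """Random-walk Metropolis chain; returns the samples and the acceptance count."""
    rng = random.Random(seed)
    x = 0.0
    samples = []
    accepted = 0
    for _ in range(steps):
        # Propose a nearby value, then accept or reject it.
        proposal = x + rng.gauss(0.0, step_sd)
        if accept(x, proposal, rng.random()):
            x = proposal
            accepted += 1
        # A rejected proposal leaves the chain where it was.
        samples.append(x)
    return samples, accepted


samples, accepted = run_chain(20_000)
acceptance_rate = accepted / len(samples)
sample_mean = sum(samples) / len(samples)
```

In this toy setting, any single decision can be checked by hand from the proposal, the current value and the random draw, while summary figures such as `acceptance_rate` describe the run as a whole without inspecting each step.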
When asked if the analyst would be able to perform these tasks without using the STRmix software, Ms. Vasquez explained that they would. The calculations would not be as exact, but they would be “in the ballpark.” (I:16: 18-19). On cross-examination, she further explained, “I don't know that they will be able to do it in exactly the way that STRmix does it, but they can still recognize the output of STRmix and see that it matches their interpretation or expectation.” (I:46: 2-5). Due to the increasing sensitivity of DNA testing and the number of locations that analysts consider, “(t)o just go through all of that manual interpretation in a reasonable amount of time will be extremely challenging for an analyst to do. We wouldn't get any cases done.” (I:16: 21-24). If an analyst were given an unlimited amount of time, they could produce the same output as STRmix. STRmix is not an “expert system” that functions “without any human intervention by the analyst.” (II:52: 4-16).
Once they are confident in the results, the analyst will then report their interpretations of the sample in accordance with OCME standard operating procedures. “So STRmix is assisting them with those interpretations, but in the end, they are putting their interpretation into their reports based on a combination of what they are looking at in that DNA data along with the information they obtained from STRmix.” (I:20: 4-8).
On cross-examination, Ms. Vasquez testified that there are various issues that the STRmix program models for, including stutter, drop-in, drop-out, random walk standard deviation, effective sample size thinning, degradation, saturation burn-in accepts, highest posterior density iterations and duplicate runs. She testified that changes to most of these factors could affect the eventual likelihood ratio produced. On redirect, she testified that in order to control for these effects, OCME provides assumptions and parameters to the software. While the analyst sets the number of contributors, “[m]ost of the other settings that are inputted into the software were determined through internal validation [by OCME]”. (II:50: 21-22).
In terms of training on STRmix, criminalists are first trained on interpretation of DNA data, followed by three days of training on the theories behind the STRmix software followed by exercises that demonstrate how STRmix works. The training includes “how STRmix uses the assumptions and parameters.” (II:51: 11-13). They also do hands-on exercises with known samples. Finally, they complete oral and written examinations.
Ms. Vasquez has some limited familiarity with TrueAllele based on having seen presentations on the product, watched webinars and read papers about it. When asked about similarities and differences between the programs, she replied:
“So from what I know of TrueAllele it is similar to STRmix in that it uses some of the same computing processes to look at DNA data, and it generates probabilities and it generates what is called a likelihood ratio for a comparison as a statistic. One of the ways it is different is that analysis step where an analyst intervenes, applies a threshold to separate signal from noise and unlabels artifacts. With TrueAllele, that is not necessary the raw data from the genetic analyzer can be put directly into TrueAllele, and it will make some number of contributor determinations for you, as well as differentiate between artifacts and true peaks.” (I:21: 20-22: 7).
In order to assist in explaining the role STRmix plays in performing an analysis, Ms. Vasquez drew an analogy to Google Maps. A driver could use a map to get to their destination, calculate the mileage and make an approximation about travel time. With Google Maps, the driver could see all that information mapped out in front of them, but they would still have to check that they are going to the correct place and verify that they had gotten there.
Finally, Ms. Vasquez testified that the analyst does not need or use the source code for the program to come to their conclusions.
Conclusions of Law
Both the Sixth Amendment of the federal Constitution and Article 1, Section 6, of the New York Constitution provide that every defendant has the right to confront witnesses against them. The factfinder at trial may not consider “testimonial statements of a witness who did not appear at trial unless he was unavailable to testify, and the defendant had a prior opportunity for cross-examination.” (Crawford v. Washington, 541 U.S. 36, 53-54, 124 S.Ct. 1354, 158 L.Ed.2d 177 [2004]). There is no “forensic evidence” exception--the prosecution may not introduce a testimonial report prepared by an analyst unless they present a witness capable of testifying to its truth. (Melendez-Diaz v. Massachusetts, 557 U.S. 305, 129 S.Ct. 2527, 174 L.Ed.2d 314 [2009]).
The US Supreme Court held that an affidavit from a state laboratory analyst that stated that a particular substance was tested and found to contain cocaine, was “functionally identical to live, in-court testimony, doing ‘precisely what a witness does on direct examination.’ ” (id. at 310-1, 129 S.Ct. 2527, citing Davis v. Washington, 547 U.S. 813, 830, 126 S.Ct. 2266, 165 L.Ed.2d 224 [2006]). Similarly, an affidavit prepared by an analyst who reported the result of a gas chromatograph machine was considered testimonial since the affidavit also certified that the machine was in proper working order, that the sample entered was the correct one and that nothing that occurred during testing affected the integrity of the test. (Bullcoming v. New Mexico, 564 U.S. 647, 131 S.Ct. 2705, 180 L.Ed.2d 610 [2011]).
The New York Court of Appeals has identified “two factors that are ‘especially important’ in resolving whether to designate a statement as testimonial—‘first, whether the statement was prepared in a manner resembling ex parte examination and second, whether the statement accused the defendant of criminal wrongdoing.’ ” (People v. Pealer, 20 N.Y.3d 447, 453, 962 N.Y.S.2d 592, 985 N.E.2d 903 [2013], quoting People v. Rawlins, 10 N.Y.3d 136, 156, 855 N.Y.S.2d 20, 884 N.E.2d 1019 [2008]).
Raw data describing a DNA profile without linking it to the accused was found not to be testimonial as it “shed no light on the guilt of the accused in the absence of an expert's opinion that the results genetically match a known sample.” (Rawlins at 159, 855 N.Y.S.2d 20, 884 N.E.2d 1019). An autopsy report that was redacted to eliminate any opinions by the medical examiner was held to be non-testimonial because it was “very largely a contemporaneous, objective account of observable facts”. (People v. Freycinet, 11 N.Y.3d 38, 42, 862 N.Y.S.2d 450, 892 N.E.2d 843 [2008]). Breathalyzer calibration and related maintenance records are not testimonial as the purpose of these tasks was “to ensure the reliability of [the] machines—not to secure evidence for use in any particular criminal proceeding.” (Pealer at 455, 962 N.Y.S.2d 592, 985 N.E.2d 903).
The testimony of a single criminalist may not, under certain circumstances, be sufficient to satisfy the confrontation clause where the testimony concerns conclusions drawn by other criminalists involved in interpreting DNA evidence. In People v. John, 27 N.Y.3d 294, 33 N.Y.S.3d 88, 52 N.E.3d 1114 [2016], multiple OCME analysts participated in performing DNA testing on a sample from a gun the defendant was accused of possessing. Their final report found that “[t]he combination of the DNA alleles found in the sample would be expected to be found in approximately ‘1 in greater than 1 trillion people.’ ” (id. at 298, 33 N.Y.S.3d 88, 52 N.E.3d 1114). To make such a calculation, “[e]xperienced analysts convert numeric identifiers into a DNA profile using machine-generated raw data analyzed by a software program and the analyst's independent manual examination which involves an editing process.” (id., citing John M. Butler, Fundamentals of Forensic DNA Typing at 213 [2010]). The results were compared to a sample of the defendant's DNA in a “table of numbers resembling a box score” and “the series of numbers were identical.” (id. at 299, 52 N.E.3d 1114). In that case, at trial, a Criminalist II testified as to the processes and findings of the OCME and laid the foundation for admission of the various analysts' reports as business records. The witness had not performed any of the tests on either sample herself, nor did she observe or supervise any of the tests. The reports were held to be testimonial. Yet, the prosecution “did not produce the analyst who generated the DNA profile from either the gun or the exemplar in this case.” (id. at 309, 52 N.E.3d 1114). The Court found that “these critical analysts were effectively insulated from cross-examination” and that the testifying witness provided “nothing more than surrogate testimony to prove a required fact.” (id.).
Recently, the Court of Appeals reiterated that “when confronted with testimonial DNA evidence at trial, a defendant is entitled to cross-examine ‘an analyst who witnessed, performed or supervised the generation of defendant's DNA profile, or who used his or her independent analysis on the raw data.’ ” (People v. Tsintzelis, 2020 N.Y. Slip Op. 02026, 35 N.Y.3d 925, 146 N.E.3d 1160 [March 24, 2020] quoting John at 315, 52 N.E.3d 1114; see also People v. Austin, 30 N.Y.3d 98, 64 N.Y.S.3d 650, 86 N.E.3d 542 [2017]).
In Wakefield, the Court considered whether the testifying analyst improperly served as a conduit for the analysis and conclusions of a software program, rather than an actual analyst. The concern about the implications for the right to confrontation was heightened in Wakefield since the source code for the program, TrueAllele, is proprietary and was not disclosed to the defense.
Perlin, the developer, “explained that TrueAllele is what is known as an ‘expert system,’ describing how, beyond the calculations made, the program is designed to have a certain degree of artificial intelligence in order to make additional inferences as more information becomes available.” (Wakefield at 167, 107 N.Y.S.3d 487). The Court recognized that due process issues can arise when decisions are made by a software program, rather than by, or at the direction of, the analyst. “Given the exponential growth of technologies such as artificial intelligence, to embrace the future we must assess, and perhaps reassess, the constitutional requirements of due process that arise where law and modern science collide.” (id. at 165-166, 107 N.Y.S.3d 487, citing, e.g. Christian Chessman, A “Source” of Error: Computer Code, Criminal Defendants, and the Constitution, 105 Cal L Rev 179 [2017]; Katherine Kwong, The Algorithm Says You Did It: The Use of Black Box Algorithms to Analyze Complex DNA Evidence, 31 Harv JL & Tech 275 [2017]; Andrea Roth, Machine Testimony, 126 Yale LJ 1972 [2017]; Edward J. Imwinkelried, Computer Source Code: A Source of the Growing Controversy Over the Reliability of Automated Forensic Techniques, 66 DePaul L Rev 97 [2016]).
These concerns were alleviated in Wakefield by the fact that Perlin himself testified, both at the Frye hearing and at trial. The Court reviewed the many functions that the analyst performs in directing, setting parameters and reviewing the results of the program. It also considered Perlin's testimony “as to genetic science, the TrueAllele program and the formulation of the TrueAllele report through the computer processors and algorithms, including the MCMC algorithm.”6 (id. at 169, 107 N.Y.S.3d 487 [citations omitted] ). Thus, any Confrontation Clause concerns were met because the defendant had the opportunity to confront his “true accuser.” (id. at 170, 107 N.Y.S.3d 487).
While the Court found that the report produced by TrueAllele was testimonial, it did not find the source code to be a declarant, explaining:
“This is not to say that an artificial intelligence-type system could never be a declarant, nor is there little doubt that the report and likelihood ratios at issue were derived through distributed cognition between technology and humans (see Itiel E. Dror & Jennifer L. Mnookin, The Use of Technology in Human Expert Domains: Challenges and Risks Arising from the Use of Automated Fingerprint Identification Systems in Forensic Science, 9 Law, Probability & Risk 47, 48-49 [2010]). Indeed, similar to many expert reports, the testimonial aspects of the TrueAllele report are formulated through a synergy and distributed cognition continuum between human and machine (see Itiel E. Dror & Jennifer L. Mnookin, The Use of Technology in Human Expert Domains: Challenges and Risks Arising from the Use of Automated Fingerprint Identification Systems in Forensic Science, 9 Law, Probability & Risk at 48), but this fact alone does not tip the scale so far as to transform the source code into a declarant.” (id. at 169, 107 N.Y.S.3d 487).
In this case, the result reached using the STRmix software forcefully advances the assertion that the defendant had illegal contact with the minor complainant. On the other hand, the calculations prepared using the STRmix software are not equivalent to the types of affidavits and other testimonial substitutes that have resulted in findings of Confrontation Clause violations.
Here, STRmix was used as a tool to assist the analyst in her interpretation of the data. It was not working independently. The analyst designated whether the program should treat the sample as a two-person or three-person mixture. STRmix then performed an analysis, using known mathematical and biological models, to give an estimate about the quantity of DNA from each individual and possible genotypes. It then assigned probabilities to each of the possible genotypes. The analyst compared those results with her expectations in order to see if the data made sense based on her training and experience. If the data did not conform with her expectations, the analyst would re-examine the inputs to make sure there were no errors. If there were any, she would correct them and re-run the data.
The program is a tool that performs these analyses much faster than an analyst could. If, however, the analyst were given an unlimited amount of time, she could produce the same output as STRmix. In essence, the software is acting as a highly sophisticated calculator. In contrast to TrueAllele, STRmix is not an “expert system” that relies on artificial intelligence. In this way, STRmix is more akin to a program like Genemarker, which was used in this case to remove artifacts and “background noise” and was not the subject of an objection.
Under these circumstances, the analyst who utilized STRmix can be meaningfully cross-examined. She has been trained on the underlying principles of biology and biological and mathematical modeling as well as the operation of the STRmix software and its underlying principles. She personally inputted the raw data used by the program and designated whether it was a two- or three-person mix. She has knowledge about the assumptions and parameters that have been set by OCME to direct the program's efforts. She can examine particular results produced by the program and assess their accuracy. The results are not the product of “artificial intelligence” for which the analyst does not have responsibility. The analyst is the declarant.
Additionally, unlike in Wakefield, the defense here has not argued that the source code is not available for their inspection. (see https://www.strmix.com/assets/Uploads/Defence-Access-to-STRmix-April-2016.pdf [accessed April 16, 2020] ).
The Court finds that the defendant's right to confrontation will be properly preserved. The analyst will not be serving as a mere conduit for the conclusions of witnesses who will not be called to testify. Rather, the analyst will be testifying as one who performed the analyses at issue, using STRmix as an assistive tool.
Accordingly, the motion to preclude the testimony of Criminalist Alison Eychner is denied.
This constitutes the Decision and Order of the Court.
FOOTNOTES
1. Names of the complainants are omitted from this decision pursuant to Section 50-B of the Civil Rights Law. The defendant's name has been abbreviated since the case has been resolved with a non-criminal disposition and is otherwise sealed, pursuant to CPL 160.55(1).
2. Throughout the hearing, the witness and the parties used the terms “analyst” and “criminalist” interchangeably.
3. References to the testimony refer to the day (I or II), page, and line(s) in the transcript.
4. OCME does not interpret samples determined to be from four or more individuals.
5. DNA test results interpreted using STRmix have been found admissible since the software was unanimously recommended for use by the DNA Subcommittee of the New York State Commission on Forensic Science and found to be generally accepted within the relevant scientific community. (People v. Bullard-Daniel, 54 Misc. 3d 177, 42 N.Y.S.3d 714 [Niagara County Court, 2016]).
6. The MCMC algorithm “is used to solve high dimension calculus problems that would be impossible or impractical without a computer so as to identify all possibilities, not just the maximum possibility”. (Wakefield at 167, 107 N.Y.S.3d 487, citing Ben Shaver, A Zero-Math Introduction to Markov Chain Monte Carlo Methods, Towards Data Science, available at https://towardsdatascience.com/a-zero-math-introduction-to-markov-chain-monte-carlo-methods-dcba889e0c50).
Laurence E. Busching, J.
Docket No: 2018BX036525
Decided: May 15, 2020
Court: Criminal Court, City of New York.