### Overview of the study

The present study aimed to develop an individual identification method that uses TVFs as biological fingerprints, based on CT data containing ante-mortem and post-mortem TVF data. Following the flow of events from discovery of a corpse to individual identification (Fig. 4), we prepared an ante-mortem database and the corresponding post-mortem data to assess the accuracy of individual identification.

### CT images

The prepared CT data comprised 620 cases with ante-mortem data only and 82 cases with corresponding ante-mortem and post-mortem data. The accuracy of this individual identification method was assessed by TVF-based matching, using the Euclidean distance or the modified Hausdorff distance, of the 82 cases (56 males, 26 females) against a total of 702 ante-mortem cases. The imaging conditions for these CT data were: tube voltage, 120–140 kV; tube current, 30–725 mA; and slice thickness, 0.6–3 mm. Most of the CT data included all of the thoracic vertebrae, but the CT data of 147 of the 702 cases did not include some of the thoracic vertebrae (Fig. 5a). CT data for 12 of the 82 cases for which both ante-mortem and post-mortem data existed did not include some of the thoracic vertebrae (Fig. 5b). All ante-mortem data for the 620 cases were obtained from the Lung Image Database Consortium image collection^{23}. The 82 cases for which ante-mortem and post-mortem images were available were scanned at Niigata City General Hospital (Niigata, Japan), and the ethics committee of this hospital approved their use in this study (ref. acceptance no.: 16-003, decision date: 4/13/2016, title: Development of an individual identification method based on the comparison of ante-mortem/post-mortem radiographic images). Since these images were obtained from deceased persons, the ethics committee considered that informed consent was not necessary. All procedures performed in studies involving human participants were conducted in accordance with the ethical standards of the institutional and/or national research committee and the Declaration of Helsinki of 1964 and subsequent amendments or comparable ethical standards.

### TVF measurement method

The shortest diameter (mm) of the TVF in each direction (height, width and depth) was measured as a biological fingerprint for individual identification (Fig. 6). Width and depth were defined as line segments passing through the center of the height segment, and height was defined as a line segment passing through the center of the width and depth segments. Aquarius NET (TeraRecon Inc., NC, US) was used as the image display system to measure TVF. A flowchart of the TVF measurement method is shown in Fig. 7. The TVF measurement method was as follows. (1) The image thickness was fixed at 1 mm using the maximum intensity projection method to facilitate the selection of the most lateral part of the thoracic vertebrae on the CT data. The window width and level were set to the predefined bone conditions in Aquarius NET (window width, 616; window level, 258). This made it possible to clearly visualize the outline of the thoracic vertebrae. (2) Each vertebra to be measured was oriented so that it was not tilted relative to the horizontal direction in the image. This was necessary because the thoracic vertebrae are oriented differently along the physiological curvature. (3) The vertebra was displayed in the anterior-to-posterior direction and the width and height were measured. The width was measured parallel to the horizontal direction and the height parallel to the vertical direction. (4) The vertebra was displayed in the left-to-right direction and the depth was measured parallel to the horizontal direction. (5) TVF measurements were performed on all thoracic vertebrae that could be measured.
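The display settings in step (1) can be illustrated with a minimal sketch. It assumes that the window values are in CT units and that the window spans level ± width/2, which is the conventional mapping but is not stated in the text; the functions `apply_window` and `mip_slab` and the number of slices per slab are hypothetical and are not part of Aquarius NET.

```python
import numpy as np

def apply_window(ct_values, width=616.0, level=258.0):
    """Map CT values to display intensities in [0, 1] for a given window.

    Assumption: the window covers level - width/2 to level + width/2.
    """
    lower = level - width / 2.0   # -50 for the bone preset quoted in step (1)
    upper = level + width / 2.0   # 566 for the same preset
    return np.clip((ct_values - lower) / (upper - lower), 0.0, 1.0)

def mip_slab(volume, start, n_slices):
    """Maximum intensity projection over consecutive slices along axis 0.

    n_slices is a hypothetical parameter: the number of slices that make up
    a 1 mm slab depends on the slice spacing of the scan.
    """
    return volume[start:start + n_slices].max(axis=0)

# Values below the window are shown as black (0), values above as white (1).
example = apply_window(np.array([-1000.0, -50.0, 258.0, 566.0, 1000.0]))
```

With this mapping, a value equal to the window level is displayed at mid-gray, so cortical bone stands out against soft tissue.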

### Individual identification method using Euclidean distance or modified Hausdorff distance

The shortest diameters in height, width and depth of the thoracic vertebrae of the corpse (post-mortem data) obtained using the TVF measurement method were recorded. The shortest diameters in height, width and depth of the thoracic vertebrae of the ante-mortem data were obtained using the same TVF measurement method, and an ante-mortem database was compiled. The Euclidean distance or the modified Hausdorff distance was calculated as the distance between two points, one from the ante-mortem data and one from the post-mortem data, in the three-dimensional TVF feature space (height, width and depth). The ante-mortem database (ante-mortem data set) was called U. The distance *TH*_{i}(p_{j} ∈ U, q) between the “j”th recorded ante-mortem data, p_{j}, and the post-mortem data, q, was expressed using a formula. The equation of this method using the Euclidean distance is the following (1):

$${TH}_{i}\left({p}_{j}\in U,q\right)=\sqrt{{\left({q}_{H}-{p}_{jH}\right)}^{2}+{\left({q}_{W}-{p}_{jW}\right)}^{2}+{\left({q}_{D}-{p}_{jD}\right)}^{2}}$$

(1)

where *p*_{jH} represents the measured height of the ante-mortem data, *p*_{jW} represents the measured width of the ante-mortem data, and *p*_{jD} represents the measured depth of the ante-mortem data. In the same way, *q*_{H}, *q*_{W} and *q*_{D} represent the height, width and depth measurements of the post-mortem data, respectively. The equation of this method using the Hausdorff distance is the following (2):

$${TH}_{i}\left({p}_{j}\in U,q\right)=\max\left\{\min\left\{\mathrm{distance}\left(\left[{p}_{jH},{p}_{jW},{p}_{jD}\right],\left[{q}_{H},{q}_{W},{q}_{D}\right]\right)\right\}\right\}$$

(2)

In this study, a modified Hausdorff distance, which is an improvement of the Hausdorff distance, was used^{24}. *TH*_{i}(p_{j} ∈ U, q) represents the distance for the “i”th of the n thoracic vertebrae measured in the “j”th recorded ante-mortem data, p_{j}, and the post-mortem data, q, where the maximum value of n is the total number of thoracic vertebrae (12). Thus, the total Euclidean distance or modified Hausdorff distance *SUMTH*_{i}(p_{j} ∈ U, q) when the number of thoracic vertebrae measured in the “j”th recorded ante-mortem data is n can be expressed as follows:

$${SUMTH}_{i}\left({p}_{j}\in U,q\right)=\sum_{i=1}^{n}{TH}_{i}\left({p}_{j}\in U,q\right)$$

(3)

Thus, the ante-mortem data with the smallest value among all *SUMTH*_{i}(p_{j} ∈ U, q) is considered to belong to the same person. However, the smaller the number of vertebrae used in the calculation, the smaller the total Euclidean distance or modified Hausdorff distance tends to be, because fewer non-negative terms are summed. Individual identification may therefore fail in some cases, depending on the number of vertebrae remaining in the cadaver. For this reason, we divided *SUMTH*_{i}(p_{j} ∈ U, q) by the number of vertebrae measured, n, in the “j”th recorded ante-mortem data to allow individual identification regardless of the number of vertebrae available for measurement, as follows:

$$E(m)={\mathrm{RankAsc}}_{m}\left[\frac{{SUMTH}_{i}\left({p}_{j}\in U,q\right)}{n}\right]$$

(4)

where E is the set of Euclidean distances or modified Hausdorff distances between the “j”th recorded ante-mortem data, p_{j}, and the post-mortem data, q, averaged over the number of vertebrae used in the analysis. RankAsc_{m} represents a function that reorders the ante-mortem data set U in ascending order of the Euclidean distance or modified Hausdorff distance between the ante-mortem data, p_{j}, and the post-mortem data, q. Thus, E(m) represents the ante-mortem data, p_{j}, with the “m”th smallest Euclidean distance or modified Hausdorff distance. In other words, this function sorts the candidates for the same person in order of similarity of their biological fingerprints to the post-mortem data, q. In the present study, m took values from 1 to 10, so that 10 candidates for individual identification were selected.
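The ranking defined by Eqs. (1), (3) and (4) can be sketched for the Euclidean case as follows. This is only an illustration under stated assumptions: each case is stored as a 12 × 3 array of (height, width, depth) in mm with NaN rows for vertebrae that could not be measured, n is taken as the number of vertebrae measured in both records, and the function names and case identifiers are hypothetical.

```python
import numpy as np

def mean_distance(ante, post):
    """Average per-vertebra Euclidean distance between two cases.

    ante, post: arrays of shape (12, 3) holding (height, width, depth) in mm,
    with NaN rows for vertebrae that were not measured (an assumed layout).
    """
    shared = ~np.isnan(ante).any(axis=1) & ~np.isnan(post).any(axis=1)
    n = int(shared.sum())
    if n == 0:
        return float("inf")  # no vertebra in common, so no comparison is possible
    per_vertebra = np.linalg.norm(post[shared] - ante[shared], axis=1)  # Eq. (1)
    return float(per_vertebra.sum() / n)  # Eq. (3) divided by n, as in Eq. (4)

def rank_candidates(database, post, top=10):
    """Return up to `top` (case_id, distance) pairs in ascending order of distance."""
    scored = [(case_id, mean_distance(ante, post)) for case_id, ante in database.items()]
    scored.sort(key=lambda item: item[1])
    return scored[:top]

# Toy data: three ante-mortem cases and one cadaver with the last two vertebrae missing.
base = np.full((12, 3), 15.0)
database = {"A": base, "B": base + 1.0, "C": base + 3.0}
post = base + 0.9
post[10:] = np.nan
ranked = rank_candidates(database, post, top=2)
```

In this toy example case "B" is ranked first because its measurements differ from the post-mortem data by only 0.1 mm in each direction. Averaging over n keeps the score comparable even though only 10 vertebrae contribute.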

### Method evaluation

A success rate was defined to evaluate our identification method. Cases in which the Euclidean distance between the ante-mortem and post-mortem data of the same person ranked within the top 10 in ascending order were considered a “success”, and the success rate was defined as the percentage of such cases among the 82 cases with post-mortem data. The top 10 was chosen as the criterion because, if this method can reduce the number of candidates to 10 or fewer, individual identification can be ensured by combining it with other methods. In addition, the Euclidean distances were ranked in ascending order and the average value at each rank up to the 10th was examined. The average Euclidean distances of neighboring ranks were also tested for significant differences using the *t*-test. Statistical analysis was performed using R version 4.2.0 [R Core Team (2022)], which is a language and environment for statistical computing (R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/).
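The success rate described above can be computed as in the following minimal sketch. It assumes that, for each post-mortem case, the distance to every ante-mortem candidate is already available as a dictionary; the function names and the example identifiers are hypothetical.

```python
def rank_of_true_match(distances, true_id):
    """1-based rank of the true identity when candidates are sorted by ascending distance.

    distances: dict mapping candidate case_id -> distance to the post-mortem case.
    """
    ordered = sorted(distances, key=distances.get)
    return ordered.index(true_id) + 1

def success_rate(true_ranks, top=10):
    """Percentage of post-mortem cases whose true match ranks within the top `top`."""
    hits = sum(1 for rank in true_ranks if rank <= top)
    return 100.0 * hits / len(true_ranks)

# Toy example: the true match "A" has the second smallest distance.
example_rank = rank_of_true_match({"A": 0.5, "B": 0.2, "C": 0.9}, "A")
# Four hypothetical cases, one of which falls outside the top 10.
example_rate = success_rate([1, 3, 12, 10], top=10)
```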

In addition, a threshold was defined for the Euclidean distance and the modified Hausdorff distance in this study. Cases whose distance was below the threshold were considered to be the same person. In other words, we investigated whether the individual identification method using the Euclidean distance or the modified Hausdorff distance could be used as a one-to-one verification method. The false acceptance rate (FAR), false rejection rate (FRR) and equal error rate (EER) were calculated in this investigation. The Euclidean distance and the modified Hausdorff distance were normalized so that the two identification methods could be compared. The true-positive fraction and the false-positive fraction, from which a receiver operating characteristic (ROC) curve is drawn, were calculated from all Euclidean distances and modified Hausdorff distances and the individual identification results. The area under the ROC curve is called the AUC. The larger the AUC, the better the individual identification capability of the method is considered to be.
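As an illustration of these verification measures, the sketch below computes FAR and FRR at a threshold and approximates the EER and AUC by sweeping the threshold over all observed distances. It assumes that distances have already been split into same-person ("genuine") and different-person ("impostor") pairs and that a pair is accepted when its distance is at or below the threshold; the normalization step is not shown because its form is not specified here, and all names are hypothetical.

```python
import numpy as np

def far_frr(genuine, impostor, threshold):
    """FAR and FRR when a pair is accepted if its distance is <= threshold."""
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    far = float(np.mean(impostor <= threshold))  # different persons wrongly accepted
    frr = float(np.mean(genuine > threshold))    # same person wrongly rejected
    return far, frr

def eer_and_auc(genuine, impostor):
    """Approximate the EER and compute the AUC from the observed distances."""
    thresholds = np.unique(np.concatenate([genuine, impostor]))  # sorted ascending
    rates = np.array([far_frr(genuine, impostor, t) for t in thresholds])
    far, frr = rates[:, 0], rates[:, 1]
    # EER: operating point where FAR and FRR are closest to each other.
    k = int(np.argmin(np.abs(far - frr)))
    eer = (far[k] + frr[k]) / 2.0
    # ROC curve: false-positive fraction (FAR) against true-positive fraction (1 - FRR).
    fpf = np.concatenate([[0.0], far, [1.0]])
    tpf = np.concatenate([[0.0], 1.0 - frr, [1.0]])
    auc = float(np.sum(np.diff(fpf) * (tpf[1:] + tpf[:-1]) / 2.0))  # trapezoidal rule
    return float(eer), auc

# Toy data with one overlapping pair of distances.
eer, auc = eer_and_auc([0.1, 0.4], [0.3, 0.6])
```

Because only observed distances are used as thresholds, the EER returned here is an approximation taken at the threshold where FAR and FRR are closest; interpolating between thresholds would be a reasonable refinement.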

### Ethical approval and consent to participate

This work has not been previously published in part or in its entirety. All procedures performed in studies involving human participants were conducted in accordance with the ethical standards of the institutional and/or national research committee and the Declaration of Helsinki of 1964 and subsequent amendments or comparable ethical standards.